Cottontail DB: An Open Source Database System for Multimedia Retrieval and Analysis

Authors
Ralph Gasser, Luca Rossetto, Silvan Heller, Heiko Schuldt
Type
In Proceedings
Date
2020/10
Appears in
Proceedings of the 28th ACM International Conference on Multimedia (ACM MM 2020) - Open Source Competition
Location
Seattle, WA, USA (held virtually)
Publisher
ACM
Pages
4465–4468
Abstract

Multimedia retrieval and analysis are two important areas in “Big data” research. They have in common that they work with feature vectors as proxies for the media objects themselves. Together with metadata such as textual descriptions or numbers, these vectors describe a media object in its entirety, and must therefore be considered jointly for both storage and retrieval. In this paper we introduce Cottontail DB, an open source database management system that integrates support for scalar and vector attributes in a unified data and query model that allows for both Boolean retrieval and nearest neighbour search. We demonstrate that Cottontail DB scales well to large collection sizes and vector dimensions and provide insights into how it proved to be a valuable tool in various use cases ranging from the analysis of MRI data to realizing retrieval solutions in the cultural heritage domain.
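
To illustrate what considering scalar and vector attributes jointly can mean, the following is a minimal sketch of a query that combines a Boolean predicate on metadata with a nearest neighbour search on feature vectors. It is written in plain Python and does not use or reproduce Cottontail DB's actual API or query model. The names MediaRecord and knn_with_filter, the brute-force scan, the Euclidean distance, and the sample data are all assumptions made for illustration.

```python
import heapq
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class MediaRecord:
    """Hypothetical record holding scalar and vector attributes side by side."""
    object_id: str
    description: str          # scalar attribute (text)
    year: int                 # scalar attribute (number)
    feature: Sequence[float]  # vector attribute (feature vector)


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_with_filter(
    records: List[MediaRecord],
    query: Sequence[float],
    k: int,
    predicate: Callable[[MediaRecord], bool],
) -> List[Tuple[float, str]]:
    """Return the k records closest to `query` among those matching `predicate`.

    This is a brute-force scan, chosen for clarity: the Boolean predicate is
    applied first, then the surviving records are ranked by distance.
    """
    candidates = (
        (euclidean(r.feature, query), r.object_id)
        for r in records
        if predicate(r)
    )
    return heapq.nsmallest(k, candidates)


# Made-up sample data, for illustration only.
records = [
    MediaRecord("a", "mri scan", 2018, [0.0, 0.0]),
    MediaRecord("b", "painting", 2019, [1.0, 0.0]),
    MediaRecord("c", "mri scan", 2020, [0.0, 2.0]),
    MediaRecord("d", "mri scan", 2020, [3.0, 4.0]),
]

# Boolean retrieval (text match AND numeric comparison) combined with
# nearest neighbour search on the vector attribute.
result = knn_with_filter(
    records,
    query=[0.0, 0.0],
    k=2,
    predicate=lambda r: "mri" in r.description and r.year >= 2020,
)
print(result)
```

In this sketch, record "a" is the closest vector to the query but is excluded by the Boolean predicate, which is why the two kinds of attributes have to be evaluated within the same query rather than in separate systems.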