ADAMpro: Database Support for Big Multimedia Retrieval

Ivan Giangreco and Heiko Schuldt
Appears in: Datenbank-Spektrum, Schwerpunktthema "Big Data & IR"

Supporting retrieval tasks within large multimedia collections is challenging not only because of the sheer size of the data but also because of the complexity of the data and their associated metadata. Applications that deal with big multimedia collections need to manage the volume of data and to search within these data effectively and efficiently. When providing similarity search, a multimedia retrieval system has to consider the actual multimedia content, the corresponding structured metadata (e.g., content author, creation date) and, for similarity queries, the extracted low-level features stored as densely populated high-dimensional feature vectors. In this paper, we present ADAMpro, a combined database and information retrieval system that is particularly tailored to big multimedia collections. ADAMpro follows a modular architecture for storing structured metadata as well as the extracted feature vectors, and it provides various index structures, namely Locality-Sensitive Hashing, Spectral Hashing, and the VA-File, for fast retrieval in similarity search. Since similarity queries are often long-running, ADAMpro supports progressive queries that provide the user with streaming result lists by returning (possibly imprecise) results as soon as they become available. We report the results of an evaluation of ADAMpro on several collection sizes of up to 50 million entries and on feature vectors with different numbers of dimensions.
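To make the idea of a progressive query more concrete, the following is a minimal sketch, not ADAMpro's actual API or implementation. It assumes a simple random-hyperplane variant of Locality-Sensitive Hashing as the index and an exact linear scan as the fallback. All names (`make_planes`, `sign_hash`, `build_buckets`, `progressive_knn`) are hypothetical and chosen only for illustration. The generator first yields a possibly imprecise result list from the index and then the exact list once the full scan has finished.

```python
import heapq
import math
import random
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple

Vector = Sequence[float]
Hit = Tuple[float, int]  # (distance to the query, vector id)


def euclidean(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def make_planes(dim: int, bits: int, seed: int = 0) -> List[List[float]]:
    # Assumption: Gaussian random hyperplanes, one per hash bit.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def sign_hash(v: Vector, planes: List[List[float]]) -> Tuple[bool, ...]:
    # One bit per hyperplane: on which side of the plane does v lie?
    return tuple(sum(p * x for p, x in zip(plane, v)) >= 0.0 for plane in planes)


def build_buckets(
    data: Sequence[Vector], planes: List[List[float]]
) -> Dict[Tuple[bool, ...], List[int]]:
    # Index construction: group vector ids by their hash code.
    buckets: Dict[Tuple[bool, ...], List[int]] = defaultdict(list)
    for i, v in enumerate(data):
        buckets[sign_hash(v, planes)].append(i)
    return buckets


def progressive_knn(
    query: Vector,
    data: Sequence[Vector],
    buckets: Dict[Tuple[bool, ...], List[int]],
    planes: List[List[float]],
    k: int,
) -> Iterator[Tuple[str, List[Hit]]]:
    # Stage 1: fast, possibly imprecise. Only vectors in the query's bucket
    # are compared, so true neighbours in other buckets may be missed.
    candidates = buckets.get(sign_hash(query, planes), [])
    yield "approximate", heapq.nsmallest(
        k, ((euclidean(query, data[i]), i) for i in candidates)
    )
    # Stage 2: slow, exact. A full scan over every stored vector.
    yield "exact", heapq.nsmallest(
        k, ((euclidean(query, v), i) for i, v in enumerate(data))
    )


if __name__ == "__main__":
    rng = random.Random(42)
    data = [[rng.random() for _ in range(16)] for _ in range(1000)]
    planes = make_planes(dim=16, bits=4)
    buckets = build_buckets(data, planes)
    # The caller can display each stage as soon as it arrives.
    for stage, hits in progressive_knn(data[0], data, buckets, planes, k=5):
        print(stage, [i for _, i in hits])
```

Because the function is a generator, a client can show the approximate list immediately and replace it when the exact list arrives. A real system would run the stages concurrently and over distributed storage; this sketch runs them sequentially for clarity.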