Multimodal Multimedia Retrieval with vitrivr
The steady growth of multimedia collections, in both size and heterogeneity, necessitates systems that can jointly handle several types of media as well as large volumes of data. This is especially true when it comes to satisfying a particular information need, i.e., retrieving a particular object of interest from a large collection. Nevertheless, existing multimedia management and retrieval systems are mostly organized in silos and treat different media types separately. Hence, they offer only limited support for accessing objects across these silos.
In this paper, we present vitrivr, a general-purpose content-based multimedia retrieval stack. In addition to the keyword search provided by most media management systems, vitrivr also exploits the content of the objects themselves to facilitate different types of similarity search. This can be done within and, most importantly, across different media types, giving rise to new, interesting use cases. To the best of our knowledge, the full vitrivr stack is unique in that it seamlessly integrates support for four different types of media, namely images, audio, videos, and 3D models.