Deep Learning-based Concept Detection in vitrivr

Luca Rossetto, Mahnaz Amiri Parian, Ralph Gasser, Ivan Giangreco, Silvan Heller and Heiko Schuldt
In Proceedings
Appears in: Proceedings of the 25th International Conference on MultiMedia Modeling, Thessaloniki, Greece

This paper presents the most recent additions to the vitrivr retrieval stack, which will be put to the test in the context of the 2019 Video Browser Showdown (VBS). The vitrivr stack has been extended by approaches for detecting, localizing, or describing concepts and actions in video scenes using various convolutional neural networks. Leveraging those additions, we have added support for searching the video collection based on semantic sketches. Furthermore, vitrivr offers new types of labels for text-based retrieval. In the same vein, we have also improved upon vitrivr's pre-existing capabilities for extracting text from video through scene text recognition. Moreover, the user interface has received a major overhaul so as to make it more accessible to novice users, especially for query formulation and result exploration.
