Using Self-Organizing Maps to Explore and Query Multimedia Collections (Bachelor Thesis, Ongoing)


Maurizio Pasquinelli


With the rapid spread of video recording devices and the resulting abundance of digital video, finding a particular video sequence in ever-growing collections is a major research challenge. The Video Browser Showdown (VBS) is a yearly competition in which research retrieval systems are tested on their ability to retrieve content based on either a textual description or a viewing of the target video sequence. vitrivr is an open-source system for content-based indexing and retrieval of multimedia data. It has been a fixture in the Video Browser Showdown for the past years, winning in 2017 and 2019.

While vitrivr excels at content-based retrieval, it offers only limited functionality for exploring a collection. Additionally, tag-based retrieval, as currently used by many systems at the VBS, has two disadvantages: the user has to know about and use appropriate tags, and the ML algorithm has to have detected the tag in question; otherwise, no match is found.

In this project, inspired by the winning system at VBS 2020 (SOM-Hunter), the goal is to extend vitrivr to enable content-based exploration of a multimedia collection. The user should be able to guide this exploration using relevance feedback. The feasibility of the implemented prototype shall be evaluated in a VBS-style competition.

In particular, the project will involve the following tasks & activities:

· Implement a self-organizing map (SOM) in both the backend and the frontend (Cineast / vitrivr-ng) to visualize a multimedia collection. The choice of feature vector should be configurable. As a starting point, the feature can be based on visual similarity, but vectors for content similarity should also be considered.

· The user should be able to select relevant and non-relevant items. This relevance feedback should trigger a recalculation of the SOM and update the set of displayed items, possibly including items not shown previously.

· The new visualization should be able to submit items to the VBS server.

· If time permits, a detailed investigation into the different possible feature vectors should be carried out.
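
To make the first two tasks more concrete, the following is a minimal sketch in Python of a self-organizing map with a relevance feedback step. It is not the Cineast / vitrivr-ng implementation. All names (train_som, representatives, apply_feedback, temperature) are hypothetical, and the re-weighting rule is a simple heuristic assumed for illustration, not the method used by SOM-Hunter or vitrivr. Feature vectors are assumed to be plain lists of floats, and every item is assumed to start with a positive weight.

```python
import math
import random


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def best_matching_unit(grid, x):
    """Return (row, col) of the prototype closest to x."""
    cells = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
    return min(cells, key=lambda rc: squared_distance(grid[rc[0]][rc[1]], x))


def train_som(vectors, rows, cols, weights=None, epochs=20, lr0=0.5, seed=0):
    """Train a rows x cols SOM; items are sampled proportionally to weights."""
    rng = random.Random(seed)
    n = len(vectors)
    weights = weights if weights is not None else [1.0] * n
    sigma0 = max(rows, cols) / 2.0
    # Initialise every prototype with a copy of a randomly chosen item.
    grid = [[list(rng.choice(vectors)) for _ in range(cols)] for _ in range(rows)]
    total_steps = epochs * n
    step = 0
    for _ in range(epochs):
        for idx in rng.choices(range(n), weights=weights, k=n):
            x = vectors[idx]
            decay = 1.0 - step / total_steps
            lr = lr0 * decay                  # learning rate shrinks linearly
            sigma = max(sigma0 * decay, 0.5)  # neighbourhood radius shrinks too
            br, bc = best_matching_unit(grid, x)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighbourhood around the best matching unit.
                    h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                    node = grid[r][c]
                    for k in range(len(x)):
                        node[k] += lr * h * (x[k] - node[k])
            step += 1
    return grid


def representatives(grid, vectors):
    """For every cell, the index of the item closest to its prototype."""
    return [
        [min(range(len(vectors)), key=lambda i: squared_distance(vectors[i], node))
         for node in row]
        for row in grid
    ]


def apply_feedback(vectors, weights, liked, disliked, temperature=1.0):
    """Heuristic (assumed) re-weighting: boost items near liked ones,
    damp items near disliked ones, then normalise to sum 1."""
    updated = []
    for i, v in enumerate(vectors):
        w = weights[i]
        for j in liked:
            w *= math.exp(-squared_distance(v, vectors[j]) / temperature)
        for j in disliked:
            w *= 1.0 - math.exp(-squared_distance(v, vectors[j]) / temperature)
        updated.append(max(w, 1e-12))  # floor keeps the sampling weights positive
    total = sum(updated)
    return [w / total for w in updated]
```

In this sketch, one feedback round would call apply_feedback with the indices of the selected items and pass the returned weights back into train_som, so that the recalculated map concentrates on the region of feature space the user marked as relevant. The item indices returned by representatives are what a frontend would display in the grid cells.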

For the evaluation, an in-house test run of the Video Browser Showdown should be carried out. This requires setting up at least 6 independent machines so that at least 6 teams of two people each can participate. At least two teams should use vitrivr “as-is” with all functionality enabled, at least one team should use vitrivr with tag-based retrieval only, and at least two teams should use the newly developed interface.

As part of the bachelor’s thesis, the candidate is expected to contribute to the vitrivr stack by fixing small bugs as they are encountered.


Start / End Dates

2020/03/09 - 2020/07/08


Research Topics