Scene Text Recognition in Images and Video with Cineast (Bachelor Thesis, Finished)


Renato Farruggio


Cineast is a versatile multimedia retrieval system that allows users to browse a multimedia collection or search for a specific entry, such as a video. Among many other features, Cineast provides query-by-scene-text functionality, whereby users can search for an image or video that contains visible text. However, this functionality currently relies on an external API that offers only a limited number of uses per month. With this thesis, we aimed to integrate state-of-the-art scene text recognition that does not rely on such external APIs into Cineast.

The evaluation shows that the models we used do not perform as well as desired. However, we have created the foundation for a reusable and framework-agnostic facility that allows other trained machine learning models, which may perform better, to be imported into Cineast in the future. Furthermore, given suitable models, this facility could also be used for a range of other machine learning tasks for feature extraction, including object detection, speech recognition, or image classification.

Start / End Dates

2021/04/06 - 2021/08/05


Research Topics