Flexible Information Retrieval Evaluation

Authors
Loris Sauter
Type
PhD Thesis
Date
2024/2
Appears in
PhD Thesis, Department of Mathematics and Computer Science
Location
University of Basel, Switzerland
Abstract

The widespread adoption of smartphones and social media applications has fuelled the exponential growth of globally available data, especially multimedia.
This has led to the rise of more complex retrieval abilities for multimedia and, more recently, cross-modal deep learning tools, which in turn has prompted the suggestion that textual queries are universally effective.
With the multi-modal query capabilities of contemporary multimedia retrieval systems, the potential area for query formulation has significantly broadened.
Meanwhile, extensive international evaluation campaigns for information retrieval, which have been adapting to the constantly changing requirements of the field for decades, also play a crucial role.
However, there has been limited research on how to conduct an information retrieval evaluation based on a formal model that spans evaluation definition, execution, and post-hoc analysis of evaluation measures.

This doctoral thesis aims to address this research gap by presenting multiple contributions to information retrieval evaluation.
We focus on general-purpose information retrieval evaluation and draw on methodologies from both classical, text-based retrieval and multimedia retrieval.
Our proposed flexible model facilitates a broad range of information retrieval evaluations with attention given to defining and implementing state-of-the-art information retrieval evaluation campaigns.
The proposed model and concepts have been incorporated into DRES, our open-source reference implementation, which is currently being utilized in two international competition-style evaluation campaigns, the Video Browser Showdown and Lifelog Search Challenge.
We evaluate our reference implementation from a systems perspective, detailing its flexible application in the aforementioned evaluation campaigns and the associated implications for their analyses.

Our implemented model has been found to be sufficiently flexible to accommodate cutting-edge evaluation concepts, including Known-Item Search, Ad-hoc Search, and Question and Answer, for video and lifelog retrieval.
