Multimodal Pose Retrieval (Master's Thesis, Finished)

Author

Description

The objective of this Master's thesis is to design, implement, and evaluate a network that takes an image together with a text as its query, where the text expresses the desired pose modification of the target person in the image. The system should exploit the textual information to alter the representation of the input image so that the pose with the desired modification is retrieved.

The project will extract the language expression and attach it to the pose, applying the desired modification while preserving the rest of the body pose. Furthermore, because the composed features are compared with the features in the underlying database, both must reside in the same embedding space. The textual data should therefore be used to transform the reference pose so that its embedding moves closer to the target feature in that space.
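
The retrieval idea described above can be illustrated with a minimal sketch. It is not the network developed in the thesis: the additive composition, the function names (compose, retrieve), and the hand-written 3-dimensional embeddings are all assumptions made for illustration. The sketch only shows why the composed query and the database features must live in the same space, namely so that a similarity measure such as cosine similarity can rank them.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def compose(pose_emb, text_emb, weight=1.0):
    """Hypothetical composition: shift the pose embedding by the text embedding.

    A learned network would replace this simple addition; it is an assumption.
    """
    shifted = [p + weight * t for p, t in zip(pose_emb, text_emb)]
    return l2_normalize(shifted)


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def retrieve(query_emb, database, k=1):
    """Return the names of the k database entries most similar to the query."""
    ranked = sorted(
        database.items(),
        key=lambda item: cosine(query_emb, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Toy embeddings, invented for this example only.
database = {
    "arms_down": [1.0, 0.0, 0.0],
    "arms_raised": [0.7, 0.7, 0.0],
    "sitting": [0.0, 0.0, 1.0],
}
reference_pose = [1.0, 0.0, 0.0]   # embedding of the query image (arms down)
text_modifier = [0.0, 1.0, 0.0]    # embedding of a text such as "raise the arms"

query = compose(reference_pose, text_modifier)
top = retrieve(query, database, k=1)
print(top)
```

In this toy setup the text shifts the reference embedding toward the "arms_raised" entry, which then ranks first, whereas the unmodified reference pose would retrieve "arms_down".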

Start / End Dates

2021/11/29 - 2022/05/28

Supervisors

Research Topics

Multimedia Information Retrieval