Search Target Inference

Visual search target inference comprises methods that predict the object a person is looking for from eye-tracking data. A person intends to find an object in a visual scene, and we predict that object from their fixation behavior. Knowing the search target can improve intelligent user interaction.

Introducing the Bag of Deep Visual Words Encoding for Search Target Inference

In our work [Stauden et al., 2018], we implement a new feature encoding for search target inference, the Bag of Deep Visual Words, which uses a pre-trained convolutional neural network (CNN). It builds on a recent approach from the literature that uses the Bag of Visual Words encoding, which is common in computer vision applications. We evaluate our method on a gold-standard dataset. The results show that the new feature encoding outperforms the baseline from the literature, in particular when fixations on the target are excluded. We presented this work at the 41st German Conference on Artificial Intelligence.
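
To illustrate the general idea behind a bag-of-words encoding, the following is a minimal sketch and not the implementation from the paper. It assumes that each fixation has already been turned into a feature vector (for example, by passing an image patch around the fixation through a pre-trained CNN) and that a codebook of "visual words" has already been learned (for example, by clustering such vectors). Both assumptions, the function names `nearest_word` and `encode_bag_of_words`, and the toy two-dimensional data are hypothetical and chosen only for illustration.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def nearest_word(feature, codebook):
    """Return the index of the codeword closest to `feature`."""
    return min(range(len(codebook)), key=lambda i: dist(feature, codebook[i]))


def encode_bag_of_words(patch_features, codebook):
    """Encode a sequence of fixation features as a normalized histogram.

    Each feature votes for its nearest codeword; the counts are divided
    by the number of features so that sequences of different length
    become comparable.
    """
    histogram = [0.0] * len(codebook)
    for feature in patch_features:
        histogram[nearest_word(feature, codebook)] += 1.0
    total = sum(histogram)
    return [count / total for count in histogram] if total else histogram


# Toy data (hypothetical): three codewords and four fixation features.
# Real CNN features would have hundreds or thousands of dimensions.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
patch_features = [[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.7, 5.3]]

encoding = encode_bag_of_words(patch_features, codebook)
print(encoding)  # [0.25, 0.5, 0.25]
```

The resulting fixed-length histogram could then serve as input to a standard classifier that predicts the search target. Which classifier, CNN layer, and codebook size to use is left open here.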

Reference

Contact

Michael Barz and Sven Stauden