Finding a Thesis Topic

Students who are interested in writing a bachelor’s or master’s thesis should begin by considering possible topics (cf. hot topics for thesis projects on this page) or proposing their own (cf. introduction to IML). Good research questions often originate in scientific papers on the research topics of the IML lab. Be on the lookout for new data sources that might provide new insights into a specific IML research topic.

Your Advisor and Your Committee

In order to write a bachelor’s or master’s thesis, you must find a member of the IML lab who is willing to be your thesis advisor. Together with your advisor, you propose your thesis topic to Prof. Sonntag, who acts as the first reviewer in your committee.

How Long Should it Be? How Long Does it Take?

A bachelor’s thesis is generally 30-60 pages, not including the bibliography. A master’s thesis is generally 60-80 pages, not including the bibliography. However, the length will vary according to the topic and the method of analysis, so the appropriate length will be determined by you, your advisor, and your committee. Students generally write a master’s thesis over two semesters and a bachelor’s thesis over one semester.

Procedure and Formal Requirements

You must maintain continuous enrollment at Oldenburg University or at Saarland University while working on the bachelor’s or master’s thesis. If you are planning to conduct interviews, surveys, or other research involving human subjects, you must obtain prior approval from DFKI.

Here you can find some example theses.

Hot Topics for Thesis Projects

Explainable Medical Decision

You will implement recent computer vision approaches such as Transfer Learning, Graph Neural Networks, or Semi-Supervised Learning to solve important medical decision problems such as breast cancer detection, diagnosis of abnormalities in chest X-ray or CT images, or problems in related medical domains. The goal is to achieve state-of-the-art performance while making the proposed method explainable to end users, which improves the system’s reliability.

Nguyen, Duy MH, et al. “An Attention Mechanism using Multiple Knowledge Sources for COVID-19 Detection from CT Images.” AAAI 2021, Workshop: Trustworthy AI for Healthcare.

Soberanis-Mukul, Roger D., Nassir Navab, and Shadi Albarqouni. “An Uncertainty-Driven GCN Refinement Strategy for Organ Segmentation.” arXiv preprint arXiv:2012.03352 (2020).

Contact: Duy Nguyen

Theoretical Machine Learning for Medical Applications

In this topic, we will investigate important theoretical machine learning problems that have a high impact on several medical applications. These include, but are not limited to, optimization formulations that efficiently incorporate user feedback to boost the performance of trained models beyond what the available training data allows (active learning), the benefits of transfer learning strategies when dealing with scarce data in medical problems, and training algorithms that adapt to highly imbalanced data distributions.
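As a small illustration of the imbalanced-data issue mentioned above, the following sketch computes inverse-frequency class weights, one common baseline for reweighting a training loss. It is not a method prescribed by this topic; the function name and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, proportional to 1 / class frequency.

    Weights are scaled so that a perfectly balanced dataset yields 1.0 for
    every class: weight = n_samples / (n_classes * class_count).
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical toy labels: 9 "healthy" scans and 1 "abnormal" scan.
toy_labels = ["healthy"] * 9 + ["abnormal"]
weights = inverse_frequency_weights(toy_labels)
```

With such weights, errors on the rare class count more in a weighted loss than errors on the frequent class. Whether this simple reweighting is adequate for a given medical dataset is exactly the kind of question the topic leaves open.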

Wilder, Bryan, Eric Horvitz, and Ece Kamar. “Learning to complement humans.” arXiv preprint arXiv:2005.00582 (2020).

De, Abir, et al. “Classification Under Human Assistance.” AAAI (2021).

Yao, Huaxiu, et al. “Hierarchically structured meta-learning.” International Conference on Machine Learning. PMLR, 2019.

Contact: Duy Nguyen

Creating a dataset of natural explanatory conversations (about cooking*) [Master]

Requirements: Programming in Python, ideally experience with processing video and audio data

Project description: The aim is to create an annotated dataset of human-to-human dialogue in YouTube cooking videos* that can serve as a resource for training ML models to generate conversational explanations of the cooking process. This involves the identification of videos with multiple speakers, speaker diarization (partitioning audio and/or transcript according to speaker identity), identification of conversational interaction between the speakers, and investigating whether these interactions qualify as ‘conversational explanations’ of the video content.

Contact: Mareike Hartmann

Relevant literature:

Speaker diarization:
Potential videos:
Background on ‘conversational explanations’ from an XAI perspective: (Sec. 5) Note that in this project, we focus on ‘explaining’ the video content rather than model predictions.

*We focus on the process of cooking as there is some related ongoing work at DFKI, but other instructional scenarios are possible.

Towards improving Image Captioning systems [Master]

Requirements: Programming in Python, PyTorch, basic understanding of Deep Learning, ideally some project work on DL / CV / NLP

Project description: The student will experiment with Image Captioning, specifically by testing existing architectures on different datasets. An error analysis can then be conducted to find out how the system can be improved.

Contact: Aliki Anagnostopoulou

Relevant literature:

Explainable Machine Translation [Master]

Requirements: Programming in Python, PyTorch (or TensorFlow)

Project description: The aim of the project is to investigate how explainable NMT methods are. For example, attention weights from the Transformer architecture can be used as alignments; however, it is not straightforward to determine which weights should be used.
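To make the alignment idea concrete, here is a minimal sketch that reads a word alignment off an attention matrix by taking, for each target token, the source position with the highest weight. The two attention heads, the sentence pair, and the function names are made-up illustrations, not output of a real NMT model. Averaging over heads is just one possible choice, assumed here for the example.

```python
def alignments_from_attention(attention):
    """Map each target position to the source position with the highest weight.

    attention[t][s] is the weight that target token t places on source token s.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in attention]


def average_heads(heads):
    """Average several attention matrices (one per head) element-wise."""
    n_heads = len(heads)
    n_tgt, n_src = len(heads[0]), len(heads[0][0])
    return [
        [sum(head[t][s] for head in heads) / n_heads for s in range(n_src)]
        for t in range(n_tgt)
    ]


src = ["guten", "Morgen"]
tgt = ["good", "morning"]

# Hypothetical attention weights of two heads (rows: target, columns: source).
head_a = [[0.9, 0.1], [0.2, 0.8]]
head_b = [[0.4, 0.6], [0.3, 0.7]]

from_head_b = alignments_from_attention(head_b)
from_average = alignments_from_attention(average_heads([head_a, head_b]))
aligned_pairs = [(tgt[t], src[s]) for t, s in enumerate(from_average)]
```

In this toy case, head_b alone aligns both target tokens to "Morgen", while the averaged heads give the expected one-to-one alignment. This illustrates why the choice of weights matters.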

Contact: Aliki Anagnostopoulou

Relevant literature:

Interactive Relation Extraction from clinical documents [Master]

Requirements: Programming in Python, PyTorch (or TensorFlow)

Project description: The aim of the project is to investigate active learning strategies applied to relation extraction from clinical documents when using deep learning models.
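One widely used active learning strategy is uncertainty sampling: the model's predictions on unlabeled examples are ranked by entropy, and the most uncertain examples are sent to the annotator first. The sketch below shows this selection step only. The document IDs, the predicted probabilities, and the function names are hypothetical, and the project is not limited to this strategy.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, k):
    """Return the k document IDs whose predictions have the highest entropy."""
    ranked = sorted(
        pool_probs,
        key=lambda doc_id: entropy(pool_probs[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical model outputs over three relation classes for unlabeled documents.
toy_pool = {
    "doc_1": [0.98, 0.01, 0.01],
    "doc_2": [0.40, 0.35, 0.25],
    "doc_3": [0.70, 0.20, 0.10],
}
to_annotate = select_most_uncertain(toy_pool, k=2)
```

In a full active learning loop, the selected documents would be labeled, added to the training set, and the model retrained before the next selection round.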

Contact: Siting Liang

Relevant literature: