Hybrid Human-Machine Translation Services

Crowdsourcing has recently been used to automate complex tasks where computational systems alone fail. In this project, we investigate how humans can effectively contribute to automating natural language translation. The envisioned goal is a hybrid machine translation service that incrementally adapts machine translation models to new domains, using human computation to make machine translation more competitive. To this end, we investigate efficient ways to perform domain adaptation of neural machine translation systems using crowd-generated input data.
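
The hybrid service described above can be pictured as a routing loop: use the machine translation when it looks trustworthy, otherwise ask the crowd and keep the human translation as adaptation data. The following is only a minimal sketch under stated assumptions, not the project's implementation. The class `HybridTranslator`, the callbacks `machine_translate`, `crowd_translate` and `adapt`, the confidence threshold, the batch size, and the English-German toy data are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class HybridTranslator:
    """Illustrative router between an MT model and a crowd (all names hypothetical)."""

    # Returns (translation, confidence in [0, 1]); assumed interface, not a real API.
    machine_translate: Callable[[str], Tuple[str, float]]
    # Returns a human translation; stands in for a crowdsourcing task.
    crowd_translate: Callable[[str], str]
    # Receives (source, human translation) pairs; stands in for incremental adaptation.
    adapt: Callable[[List[Tuple[str, str]]], None]
    confidence_threshold: float = 0.8  # assumed value
    batch_size: int = 2                # assumed value
    pending: List[Tuple[str, str]] = field(default_factory=list)

    def translate(self, source: str) -> Tuple[str, str]:
        """Return (translation, route), where route is 'machine' or 'crowd'."""
        candidate, confidence = self.machine_translate(source)
        if confidence >= self.confidence_threshold:
            return candidate, "machine"
        # Low confidence: fall back to the crowd and keep the pair for adaptation.
        human = self.crowd_translate(source)
        self.pending.append((source, human))
        if len(self.pending) >= self.batch_size:
            self.adapt(list(self.pending))
            self.pending.clear()
        return human, "crowd"


def demo():
    """Run the router with toy stand-ins for the model and the crowd."""
    adapted_batches = []
    known = {"hello": "hallo"}  # toy "model": a lookup table

    def mt(text):
        if text in known:
            return known[text], 0.95
        return text, 0.1  # unknown input: echo it with low confidence

    def crowd(text):
        return {"invoice": "Rechnung", "refund": "Erstattung"}[text]

    def adapt(pairs):
        adapted_batches.append(pairs)
        known.update(dict(pairs))  # toy "adaptation": extend the lookup table

    service = HybridTranslator(mt, crowd, adapt)
    results = [service.translate(t) for t in ["hello", "invoice", "refund", "invoice"]]
    return results, adapted_batches
```

In this toy run, the second request for "invoice" is served by the machine route because the adaptation step has already absorbed the crowd's translation. A real system would replace the lookup table with a neural model and the threshold with a quality estimate.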

Poster at Collective Intelligence 2018

In this work, we investigated (1) whether a paid crowd recruited from the community of a multilingual website can translate coherent content from English into their mother tongue; and (2) in which cases state-of-the-art machine translation models can compete with human translations, so that automation can reduce task completion times and costs.

References

Michael Barz, Tim Polzehl, Daniel Sonntag: Towards Hybrid Human-Machine Translation Services. EasyChair Preprint no. 333, 2018.
Michael Barz, Neslihan Büyükdemircioglu, Rikhu Prasad Surya, Tim Polzehl, Daniel Sonntag: Device-Type Influence in Crowd-based Natural Language Translation Tasks (short paper). In: Aroyo, Lora; Dumitrache, Anca; Paritosh, Praveen; Quinn, Alexander; Welty, Chris; Checco, Alessandro; Demartini, Gianluca; Gadiraju, Ujwal; Sarasua, Cristina (Eds.): Proceedings of the 1st Workshop on Subjectivity, Ambiguity and Disagreement in Crowdsourcing, and Short Paper Proceedings of the 1st Workshop on Disentangling the Relation Between Crowdsourcing and Bias Management (SAD 2018 and CrowdBias 2018), pp. 93-97, CEUR-WS.org, 2018.
Marimuthu Kalimuthu, Michael Barz, Daniel Sonntag: Incremental Domain Adaptation for Neural Machine Translation in Low-Resource Settings. In: Proceedings of the Fourth Arabic Natural Language Processing Workshop, pp. 1-10, Association for Computational Linguistics, 2019.

Contact

Michael Barz and Marimuthu Kalimuthu


Published by Michael Barz on April 12, 2018