2017 |
Inproceedings |
Barz, Michael; Poller, Peter; Sonntag, Daniel Evaluating Remote and Head-worn Eye Trackers in Multi-modal Speech-based HRI Inproceedings Mutlu, Bilge; Tscheligi, Manfred; Weiss, Astrid; Young, James E (Ed.): Companion of the 2017 ACM/IEEE International Conference on Human-Robot Interaction, pp. 79-80, ACM, 2017. @inproceedings{8991, title = {Evaluating Remote and Head-worn Eye Trackers in Multi-modal Speech-based HRI}, author = {Michael Barz and Peter Poller and Daniel Sonntag}, editor = {Bilge Mutlu and Manfred Tscheligi and Astrid Weiss and James E Young}, url = {https://www.dfki.de/fileadmin/user_upload/import/8991_2017_Evaluating_Remote_and_Head-worn_Eye_Trackers_in_Multi-modal_Speech-based_HRI.pdf http://doi.acm.org/10.1145/3029798.3038367}, year = {2017}, date = {2017-03-01}, booktitle = {Companion of the 2017 ACM/IEEE International Conference on Human-Robot Interaction}, pages = {79-80}, publisher = {ACM}, abstract = {Gaze is known to be a dominant modality for conveying spatial information, and it has been used for grounding in human-robot dialogues. In this work, we present the prototype of a gaze-supported multi-modal dialogue system that enhances two core tasks in human-robot collaboration: 1) our robot is able to learn new objects and their location from user instructions involving gaze, and 2) it can instruct the user to move objects and passively track this movement by interpreting the user's gaze. We performed a user study to investigate the impact of different eye trackers on user performance. In particular, we compare a head-worn device and an RGB-based remote eye tracker. Our results show that the head-mounted eye tracker outperforms the remote device in terms of task completion time and the required number of utterances due to its higher precision.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } Gaze is known to be a dominant modality for conveying spatial information, and it has been used for grounding in human-robot dialogues. In this work, we present the prototype of a gaze-supported multi-modal dialogue system that enhances two core tasks in human-robot collaboration: 1) our robot is able to learn new objects and their location from user instructions involving gaze, and 2) it can instruct the user to move objects and passively track this movement by interpreting the user's gaze. We performed a user study to investigate the impact of different eye trackers on user performance. In particular, we compare a head-worn device and an RGB-based remote eye tracker. Our results show that the head-mounted eye tracker outperforms the remote device in terms of task completion time and the required number of utterances due to its higher precision. |
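The gaze-based grounding step described in this paper (and in the demo entry below) can be pictured with a minimal sketch; the object map, pixel coordinates, distance threshold, and dialogue phrasing are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical sketch of grounding a deictic utterance ("this object")
# against the gaze signal: pick the learned object closest to the gaze point.
import math

known_objects = {"cup": (120, 340), "book": (510, 220)}  # learned name -> (x, y)

def ground_referent(gaze_xy, max_dist=80.0):
    """Return the object the gaze most plausibly refers to, or None."""
    best_name, best_dist = None, float("inf")
    for name, (x, y) in known_objects.items():
        d = math.hypot(gaze_xy[0] - x, gaze_xy[1] - y)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist <= max_dist else None

def handle_utterance(text, gaze_xy):
    if "this" in text.lower():                  # deictic expression
        referent = ground_referent(gaze_xy)
        if referent is None:
            return "Which object do you mean?"  # costs an extra utterance
        return f"Okay, learned: {referent}."
    return "Sorry, I did not understand."

print(handle_utterance("Remember this object", (118, 352)))  # -> cup
```

Under such a scheme, a more precise tracker shrinks the effective gaze error and triggers fewer clarification turns, which is consistent with the reported difference in task completion time and utterance counts between the head-worn and remote devices.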
Barz, Michael; Poller, Peter; Sonntag, Daniel Evaluating Remote and Head-worn Eye Trackers in Multi-modal Speech-based HRI (Demo) Inproceedings Mutlu, Bilge; Tscheligi, Manfred; Weiss, Astrid; Young, James E (Ed.): Companion of the 2017 ACM/IEEE International Conference on Human-Robot Interaction, ACM, 2017. @inproceedings{8992, title = {Evaluating Remote and Head-worn Eye Trackers in Multi-modal Speech-based HRI (Demo)}, author = {Michael Barz and Peter Poller and Daniel Sonntag}, editor = {Bilge Mutlu and Manfred Tscheligi and Astrid Weiss and James E Young}, url = {https://www.dfki.de/fileadmin/user_upload/import/8992_2017_Evaluating_Remote_and_Head-worn_Eye_Trackers_in_Multi-modal_Speech-based_HRI_(Demo)_.pdf}, doi = {https://doi.org/10.1145/3029798.3036665}, year = {2017}, date = {2017-01-01}, booktitle = {Companion of the 2017 ACM/IEEE International Conference on Human-Robot Interaction}, publisher = {ACM}, abstract = {Gaze is known to be a dominant modality for conveying spatial information, and it has been used for grounding in human-robot dialogues. In this work, we present the prototype of a gaze-supported multi-modal dialogue system that enhances two core tasks in human-robot collaboration: 1) our robot is able to learn new objects and their location from user instructions involving gaze, and 2) it can instruct the user to move objects and passively track this movement by interpreting the user's gaze. We performed a user study to investigate the impact of different eye trackers on user performance. In particular, we compare a head-worn device and an RGB-based remote eye tracker. Our results show that the head-mounted eye tracker outperforms the remote device in terms of task completion time and the required number of utterances due to its higher precision.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } Gaze is known to be a dominant modality for conveying spatial information, and it has been used for grounding in human-robot dialogues. In this work, we present the prototype of a gaze-supported multi-modal dialogue system that enhances two core tasks in human-robot collaboration: 1) our robot is able to learn new objects and their location from user instructions involving gaze, and 2) it can instruct the user to move objects and passively track this movement by interpreting the user's gaze. We performed a user study to investigate the impact of different eye trackers on user performance. In particular, we compare a head-worn device and an RGB-based remote eye tracker. Our results show that the head-mounted eye tracker outperforms the remote device in terms of task completion time and the required number of utterances due to its higher precision. |
Prange, Alexander; Barz, Michael; Sonntag, Daniel Speech-based Medical Decision Support in VR using a Deep Neural Network (Demonstration) Inproceedings Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI-17, pp. 5241-5242, IJCAI, 2017. @inproceedings{9198, title = {Speech-based Medical Decision Support in VR using a Deep Neural Network (Demonstration)}, author = {Alexander Prange and Michael Barz and Daniel Sonntag}, url = {https://www.dfki.de/fileadmin/user_upload/import/9198_2017_Speech-based_Medical_Decision_Support_in_VR_using_a_Deep_Neural_Network.pdf}, doi = {https://doi.org/10.24963/ijcai.2017/777}, year = {2017}, date = {2017-01-01}, booktitle = {Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI-17}, pages = {5241-5242}, publisher = {IJCAI}, abstract = {We present a speech dialogue system that facilitates medical decision support for doctors in a virtual reality (VR) application. The therapy prediction is based on a recurrent neural network model that incorporates the examination history of patients. A central supervised patient database provides input to our predictive model and allows us, first, to add new examination reports by a pen-based mobile application on-the-fly, and second, to get therapy prediction results in real-time. This demo includes a visualisation of patient records, radiology image data, and the therapy prediction results in VR.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } We present a speech dialogue system that facilitates medical decision support for doctors in a virtual reality (VR) application. The therapy prediction is based on a recurrent neural network model that incorporates the examination history of patients. A central supervised patient database provides input to our predictive model and allows us, first, to add new examination reports by a pen-based mobile application on-the-fly, and second, to get therapy prediction results in real-time. This demo includes a visualisation of patient records, radiology image data, and the therapy prediction results in VR. |
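For readers who want a concrete picture of the therapy prediction described above, here is a minimal sketch of a recurrent model over a patient's examination history; the feature size, number of therapy classes, and PyTorch as framework are assumptions for illustration, not details from the paper:

```python
# Toy recurrent therapy predictor: a sequence of examination feature
# vectors is summarized by an LSTM, whose final hidden state is mapped
# to logits over possible therapy options. All dimensions are invented.
import torch
import torch.nn as nn

class TherapyPredictor(nn.Module):
    def __init__(self, n_features=32, hidden=64, n_therapies=5):
        super().__init__()
        self.rnn = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_therapies)

    def forward(self, exams):                 # exams: (batch, time, features)
        _, (h_n, _) = self.rnn(exams)         # final hidden state = history summary
        return self.head(h_n[-1])             # logits over therapy options

model = TherapyPredictor()
history = torch.randn(1, 7, 32)               # e.g., 7 examination reports
print(torch.softmax(model(history), dim=-1))  # therapy probabilities
```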
Sonntag, Daniel; Profitlich, Hans-Jürgen Integrated Decision Support by Combining Textual Information Extraction, Facetted Search and Information Visualisation Inproceedings 2017 IEEE 30th International Symposium on Computer-Based Medical Systems (CBMS), pp. 95-100, IEEE, 2017. @inproceedings{11492, title = {Integrated Decision Support by Combining Textual Information Extraction, Facetted Search and Information Visualisation}, author = {Daniel Sonntag and Hans-Jürgen Profitlich}, url = {https://ieeexplore.ieee.org/document/8104164}, year = {2017}, date = {2017-01-01}, booktitle = {2017 IEEE 30th International Symposium on Computer-Based Medical Systems (CBMS)}, pages = {95-100}, publisher = {IEEE}, abstract = {This work focusses on our integration steps of complex and partly unstructured medical data into a clinical research database with subsequent decision support. Our main application is an integrated facetted search tool, followed by information visualisation based on automatic information extraction results from textual documents. We describe the details of our technical architecture (open-source tools), to be replicated at other universities, research institutes, or hospitals. Our exemplary use case is nephrology, where we try to answer questions about the temporal characteristics of sequences and gain significant insight from the data for cohort selection. We report on this case study, illustrating how the application can be used by a clinician and which questions can be answered.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } This work focusses on our integration steps of complex and partly unstructured medical data into a clinical research database with subsequent decision support. Our main application is an integrated facetted search tool, followed by information visualisation based on automatic information extraction results from textual documents. We describe the details of our technical architecture (open-source tools), to be replicated at other universities, research institutes, or hospitals. Our exemplary use case is nephrology, where we try to answer questions about the temporal characteristics of sequences and gain significant insight from the data for cohort selection. We report on this case study, illustrating how the application can be used by a clinician and which questions can be answered. |
Miscellaneous |
Sonntag, Daniel Interakt - A Multimodal Multisensory Interactive Cognitive Assessment Tool Miscellaneous 2017. @misc{11489, title = {Interakt - A Multimodal Multisensory Interactive Cognitive Assessment Tool}, author = {Daniel Sonntag}, url = {https://www.dfki.de/fileadmin/user_upload/import/11489_1709.01796.pdf http://arxiv.org/abs/1709.01796}, year = {2017}, date = {2017-01-01}, volume = {abs/1709.01796}, pages = {4}, abstract = {Cognitive assistance may be valuable in applications for doctors and therapists that reduce costs and improve quality in healthcare systems. Use cases and scenarios include the assessment of dementia. In this paper, we present our approach to the (semi-)automatic assessment of dementia.}, keywords = {}, pubstate = {published}, tppubtype = {misc} } Cognitive assistance may be valuable in applications for doctors and therapists that reduce costs and improve quality in healthcare systems. Use cases and scenarios include the assessment of dementia. In this paper, we present our approach to the (semi-)automatic assessment of dementia. |
2016 |
Journal Articles |
Sonntag, Daniel; Tresp, Volker; Zillner, Sonja; Cavallaro, Alexander; Hammon, Matthias; Reis, André; Fasching, Peter A; Sedlmayr, Martin; Ganslandt, Thomas; Prokosch, Hans-Ulrich; Budde, Klemens; Schmidt, Danilo; Hinrichs, Carl; Wittenberg, Thomas; Daumke, Philipp; Oppelt, Patricia G The Clinical Data Intelligence Project - A smart data initiative Journal Article Informatik Spektrum, 39 , pp. 290-300, 2016. @article{11490, title = {The Clinical Data Intelligence Project - A smart data initiative}, author = {Daniel Sonntag and Volker Tresp and Sonja Zillner and Alexander Cavallaro and Matthias Hammon and André Reis and Peter A Fasching and Martin Sedlmayr and Thomas Ganslandt and Hans-Ulrich Prokosch and Klemens Budde and Danilo Schmidt and Carl Hinrichs and Thomas Wittenberg and Philipp Daumke and Patricia G Oppelt}, doi = {https://doi.org/10.1007/s00287-015-0913-x}, year = {2016}, date = {2016-01-01}, journal = {Informatik Spektrum}, volume = {39}, pages = {290-300}, publisher = {Springer}, abstract = {This article is about a new project that combines clinical data intelligence and smart data. It provides an introduction to the “Klinische Datenintelligenz” (KDI) project, which is funded by the Federal Ministry for Economic Affairs and Energy (BMWi); we transfer research and development (R&D) results of the analysis of data generated in the clinical routine of specific medical domains. We present the project structure and goals, how patient care should be improved, and the joint efforts of data and knowledge engineering, information extraction (from textual and other unstructured data), statistical machine learning, decision support, and their integration into special use cases moving towards individualised medicine. In particular, we describe some details of our medical use cases and cooperation with two major German university hospitals.}, keywords = {}, pubstate = {published}, tppubtype = {article} } This article is about a new project that combines clinical data intelligence and smart data. It provides an introduction to the “Klinische Datenintelligenz” (KDI) project, which is funded by the Federal Ministry for Economic Affairs and Energy (BMWi); we transfer research and development (R&D) results of the analysis of data generated in the clinical routine of specific medical domains. We present the project structure and goals, how patient care should be improved, and the joint efforts of data and knowledge engineering, information extraction (from textual and other unstructured data), statistical machine learning, decision support, and their integration into special use cases moving towards individualised medicine. In particular, we describe some details of our medical use cases and cooperation with two major German university hospitals. |
Inproceedings |
Moniri, Mehdi; Luxenburger, Andreas; Schuffert, Winfried; Sonntag, Daniel Real-Time 3D Peripheral View Analysis Inproceedings Reiners, Dirk; Iwai, Daisuke; Steinicke, Frank (Ed.): Proceedings of the International Conference on Artificial Reality and Telexistence and Eurographics Symposium on Virtual Environments 2016, pp. 37-44, The Eurographics Association, 2016. @inproceedings{8850, title = {Real-Time 3D Peripheral View Analysis}, author = {Mehdi Moniri and Andreas Luxenburger and Winfried Schuffert and Daniel Sonntag}, editor = {Dirk Reiners and Daisuke Iwai and Frank Steinicke}, url = {https://www.dfki.de/fileadmin/user_upload/import/8850_2016_Real-Time_3D_Peripheral_View_Analysis.pdf}, year = {2016}, date = {2016-12-01}, booktitle = {Proceedings of the International Conference on Artificial Reality and Telexistence and Eurographics Symposium on Virtual Environments 2016}, pages = {37-44}, publisher = {The Eurographics Association}, abstract = {Human peripheral vision suffers from several limitations that differ among various regions of the visual field. Since these limitations result in natural visual impairments, many interesting intelligent user interfaces based on eye tracking could benefit from peripheral view calculations that aim to compensate for events occurring outside the very center of gaze. We present a general peripheral view calculation model which extends previous work on attention-based user interfaces that use eye gaze. An intuitive, two dimensional visibility measure based on the concept of solid angle is developed for determining to which extent an object of interest observed by a user intersects with each region of the underlying visual field model. The results are weighted considering the visual acuity in each visual field region to determine the total visibility of the object. We exemplify the proposed model in a virtual reality car simulation application incorporating a head-mounted display with integrated eye tracking functionality. In this context, we provide a quantitative evaluation in terms of a runtime analysis of the different steps of our approach. We also provide several example applications including an interactive web application which visualizes the concepts and calculations presented in this paper.}, howpublished = {ICAT-EGVE2016}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } Human peripheral vision suffers from several limitations that differ among various regions of the visual field. Since these limitations result in natural visual impairments, many interesting intelligent user interfaces based on eye tracking could benefit from peripheral view calculations that aim to compensate for events occurring outside the very center of gaze. We present a general peripheral view calculation model which extends previous work on attention-based user interfaces that use eye gaze. An intuitive, two dimensional visibility measure based on the concept of solid angle is developed for determining to which extent an object of interest observed by a user intersects with each region of the underlying visual field model. The results are weighted considering the visual acuity in each visual field region to determine the total visibility of the object. We exemplify the proposed model in a virtual reality car simulation application incorporating a head-mounted display with integrated eye tracking functionality. In this context, we provide a quantitative evaluation in terms of a runtime analysis of the different steps of our approach. 
We also provide several example applications including an interactive web application which visualizes the concepts and calculations presented in this paper. |
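A simplified version of the acuity-weighted visibility idea can be written down directly; the region boundaries and acuity weights below are illustrative stand-ins for the paper's visual field model, and the full solid-angle intersection is reduced here to the angular eccentricity of the object's center:

```python
# Sketch: weight an object's visibility by the acuity of the visual
# field region that its angular eccentricity from the gaze direction
# falls into. Region bounds (degrees) and weights are assumed values.
import numpy as np

REGIONS = [(2.0, 1.0), (8.0, 0.6), (30.0, 0.25), (60.0, 0.08)]

def eccentricity_deg(gaze_dir, obj_dir):
    """Angle between the gaze direction and the direction to the object."""
    g = np.asarray(gaze_dir, float); g /= np.linalg.norm(g)
    o = np.asarray(obj_dir, float); o /= np.linalg.norm(o)
    return np.degrees(np.arccos(np.clip(np.dot(g, o), -1.0, 1.0)))

def visibility(gaze_dir, obj_dir):
    ecc = eccentricity_deg(gaze_dir, obj_dir)
    for max_ecc, weight in REGIONS:
        if ecc <= max_ecc:
            return weight
    return 0.0  # outside the modelled visual field

print(visibility([0, 0, 1], [0.1, 0, 1]))  # ~5.7 deg off-axis -> 0.6
```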
Barz, Michael; Sonntag, Daniel Gaze-guided Object Classification Using Deep Neural Networks for Attention-based Computing Inproceedings Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct, pp. 253-256, ACM, 2016. @inproceedings{8936, title = {Gaze-guided Object Classification Using Deep Neural Networks for Attention-based Computing}, author = {Michael Barz and Daniel Sonntag}, url = {https://www.dfki.de/fileadmin/user_upload/import/8936_2016_Gaze-guided_object_classification_using_deep_neural_networks_for_attention-based_computing.pdf}, doi = {https://doi.org/10.1145/2968219.2971389}, year = {2016}, date = {2016-09-01}, booktitle = {Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct}, pages = {253-256}, publisher = {ACM}, abstract = {Recent advances in eye tracking technologies opened the way to design novel attention-based user interfaces. This is promising for pro-active and assistive technologies for cyber-physical systems in the domains of, e.g., healthcare and industry 4.0. Prior approaches to recognize a user's attention are usually limited to the raw gaze signal or sensors in instrumented environments. We propose a system that (1) incorporates the gaze signal and the egocentric camera of the eye tracker to identify the objects the user focuses on; (2) employs object classification based on deep learning which we recompiled for our purposes on a GPU-based image classification server; (3) detects whether the user actually draws attention to that object; and (4) combines these modules for constructing episodic memories of egocentric events in real-time.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } Recent advances in eye tracking technologies opened the way to design novel attention-based user interfaces. This is promising for pro-active and assistive technologies for cyber-physical systems in the domains of, e.g., healthcare and industry 4.0. Prior approaches to recognize a user's attention are usually limited to the raw gaze signal or sensors in instrumented environments. We propose a system that (1) incorporates the gaze signal and the egocentric camera of the eye tracker to identify the objects the user focuses on; (2) employs object classification based on deep learning which we recompiled for our purposes on a GPU-based image classification server; (3) detects whether the user actually draws attention to that object; and (4) combines these modules for constructing episodic memories of egocentric events in real-time. |
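The core of step (1) and (2) — cropping the egocentric frame around the gaze point and classifying the crop — can be sketched as follows; the pretrained ResNet stands in for the paper's GPU-based classification server, and the crop size and preprocessing are assumptions:

```python
# Sketch: classify the object at the gaze point by cropping the eye
# tracker's scene-camera frame around the fixation and running a CNN.
import torch
from PIL import Image
from torchvision import models, transforms

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
prep = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

def classify_at_gaze(frame: Image.Image, gaze_xy, crop=160):
    """Return the top class index for the patch around the gaze point."""
    x, y = gaze_xy
    patch = frame.crop((x - crop // 2, y - crop // 2,
                        x + crop // 2, y + crop // 2))
    with torch.no_grad():
        logits = model(prep(patch).unsqueeze(0))
    return int(logits.argmax())
```

An attention filter, e.g. a minimum fixation duration as in step (3), would then decide whether the classified object is stored as an episodic memory event.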
Prange, Alexander; Sonntag, Daniel Digital PI-RADS: Smartphone Sketches for Instant Knowledge Acquisition in Prostate Cancer Detection Inproceedings 2016 IEEE 29th International Symposium on Computer-Based Medical Systems (CBMS), pp. 13-18, IEEE Xplore, 2016. @inproceedings{11234, title = {Digital PI-RADS: Smartphone Sketches for Instant Knowledge Acquisition in Prostate Cancer Detection}, author = {Alexander Prange and Daniel Sonntag}, year = {2016}, date = {2016-06-01}, booktitle = {2016 IEEE 29th International Symposium on Computer-Based Medical Systems (CBMS)}, pages = {13-18}, publisher = {IEEE Xplore}, abstract = {In order to improve reporting practices for the detection of prostate cancer, we present an application that allows urologists to create structured reports by using a digital pen on a smartphone. In this domain, printed documents cannot be easily replaced by computer systems because they contain free-form sketches and textual annotations, and the acceptance of traditional PC reporting tools is rather low among the doctors. Our approach provides an instant knowledge acquisition system by automatically interpreting the written strokes, texts, and sketches. We have incorporated this structured reporting system for MRI of the prostate (PI-RADS). Our system imposes only minimal overhead on traditional form-filling processes and provides for a direct, ontology-based structuring of the user input for semantic search and retrieval applications.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } In order to improve reporting practices for the detection of prostate cancer, we present an application that allows urologists to create structured reports by using a digital pen on a smartphone. In this domain, printed documents cannot be easily replaced by computer systems because they contain free-form sketches and textual annotations, and the acceptance of traditional PC reporting tools is rather low among the doctors. Our approach provides an instant knowledge acquisition system by automatically interpreting the written strokes, texts, and sketches. We have incorporated this structured reporting system for MRI of the prostate (PI-RADS). Our system imposes only minimal overhead on traditional form-filling processes and provides for a direct, ontology-based structuring of the user input for semantic search and retrieval applications. |
Barz, Michael; Daiber, Florian; Bulling, Andreas Prediction of Gaze Estimation Error for Error-Aware Gaze-Based Interfaces Inproceedings Proceedings of the Symposium on Eye Tracking Research and Applications, ACM, 2016. @inproceedings{8242, title = {Prediction of Gaze Estimation Error for Error-Aware Gaze-Based Interfaces}, author = {Michael Barz and Florian Daiber and Andreas Bulling}, year = {2016}, date = {2016-01-01}, booktitle = {Proceedings of the Symposium on Eye Tracking Research and Applications}, publisher = {ACM}, abstract = {Gaze estimation error is inherent in head-mounted eye trackers and seriously impacts performance, usability, and user experience of gaze-based interfaces. Particularly in mobile settings, this error varies constantly as users move in front and look at different parts of a display. We envision a new class of gaze-based interfaces that are aware of the gaze estimation error and adapt to it in real time. As a first step towards this vision we introduce an error model that is able to predict the gaze estimation error. Our method covers major building blocks of mobile gaze estimation, specifically mapping of pupil positions to scene camera coordinates, marker-based display detection, and mapping of gaze from scene camera to on-screen coordinates. We develop our model through a series of principled measurements of a state-of-the-art head-mounted eye tracker.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } Gaze estimation error is inherent in head-mounted eye trackers and seriously impacts performance, usability, and user experience of gaze-based interfaces. Particularly in mobile settings, this error varies constantly as users move in front and look at different parts of a display. We envision a new class of gaze-based interfaces that are aware of the gaze estimation error and adapt to it in real time. As a first step towards this vision we introduce an error model that is able to predict the gaze estimation error. Our method covers major building blocks of mobile gaze estimation, specifically mapping of pupil positions to scene camera coordinates, marker-based display detection, and mapping of gaze from scene camera to on-screen coordinates. We develop our model through a series of principled measurements of a state-of-the-art head-mounted eye tracker. |
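One plausible way to act on such an error model — an assumption here, not the paper's exact method — is to interpolate errors measured at known screen positions and let the interface adapt target sizes to the predicted error:

```python
# Sketch: predict gaze estimation error at an arbitrary normalized
# screen position by interpolating a sparse grid of measured errors.
import numpy as np
from scipy.interpolate import griddata

points = np.array([[0.1, 0.1], [0.9, 0.1], [0.5, 0.5],
                   [0.1, 0.9], [0.9, 0.9]])        # measured positions
errors = np.array([18.0, 25.0, 12.0, 22.0, 30.0])  # error in pixels

def predicted_error(screen_xy):
    """Interpolated error estimate; an error-aware UI could grow a
    button's hit area to at least this radius."""
    return float(griddata(points, errors, [screen_xy], method="linear")[0])

print(predicted_error((0.5, 0.3)))
```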
Lessel, Pascal; Altmeyer, Maximilian; Kerber, Frederic; Barz, Michael; Leidinger, Cornelius; Krüger, Antonio WaterCoaster: A Device to Encourage People in a Playful Fashion to Reach Their Daily Water Intake Level Inproceedings Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems, pp. 1813-1820, ACM, 2016. @inproceedings{8342, title = {WaterCoaster: A Device to Encourage People in a Playful Fashion to Reach Their Daily Water Intake Level}, author = {Pascal Lessel and Maximilian Altmeyer and Frederic Kerber and Michael Barz and Cornelius Leidinger and Antonio Krüger}, url = {https://umtl.cs.uni-saarland.de/paper_preprints/paper_watercoaster_lessel.pdf http://doi.acm.org/10.1145/2851581.2892498}, year = {2016}, date = {2016-01-01}, booktitle = {Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems}, pages = {1813-1820}, publisher = {ACM}, abstract = {In this paper, we present WaterCoaster, a mobile device and a mobile application to motivate people to drink beverages more often and more regularly. The WaterCoaster measures the amount drunk and reminds the user to consume more, if necessary. The app is designed as a game in which the user needs to take care of a virtual character living in a fish tank; the tank's water level drops if the user does not consume beverages in a healthy way. We report results of a three-week pilot study (N=17) suggesting that our approach is appreciated and subjectively influences participants. Based on the results, we look forward to evaluating the system in a long-term study in the next iteration.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } In this paper, we present WaterCoaster, a mobile device and a mobile application to motivate people to drink beverages more often and more regularly. The WaterCoaster measures the amount drunk and reminds the user to consume more, if necessary. The app is designed as a game in which the user needs to take care of a virtual character living in a fish tank; the tank's water level drops if the user does not consume beverages in a healthy way. We report results of a three-week pilot study (N=17) suggesting that our approach is appreciated and subjectively influences participants. Based on the results, we look forward to evaluating the system in a long-term study in the next iteration. |
Barz, Michael; Moniri, Mehdi; Weber, Markus; Sonntag, Daniel Multimodal multisensor activity annotation tool Inproceedings Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct, pp. 17-20, ACM, 2016. @inproceedings{8770, title = {Multimodal multisensor activity annotation tool}, author = {Michael Barz and Mehdi Moniri and Markus Weber and Daniel Sonntag}, url = {https://www.dfki.de/fileadmin/user_upload/import/8770_2016_Multimodal_multisensor_activity_annotation_tool.pdf http://dl.acm.org/citation.cfm?id=2971459}, year = {2016}, date = {2016-01-01}, booktitle = {Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct}, pages = {17-20}, publisher = {ACM}, abstract = {In this paper we describe a multimodal-multisensor annotation tool for physiological computing; for example mobile gesture-based interaction devices or health monitoring devices can be connected. It should be used as an expert authoring tool to annotate multiple video-based sensor streams for domain-specific activities. Resulting datasets can be used as supervised datasets for new machine learning tasks. Our tool provides connectors to commercially available sensor systems (e.g., Intel RealSense F200 3D camera, Leap Motion, and Myo) and a graphical user interface for annotation.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } In this paper we describe a multimodal-multisensor annotation tool for physiological computing; for example mobile gesture-based interaction devices or health monitoring devices can be connected. It should be used as an expert authoring tool to annotate multiple video-based sensor streams for domain-specific activities. Resulting datasets can be used as supervised datasets for new machine learning tasks. Our tool provides connectors to commercially available sensor systems (e.g., Intel RealSense F200 3D camera, Leap Motion, and Myo) and a graphical user interface for annotation. |
Luxenburger, Andreas; Prange, Alexander; Moniri, Mehdi; Sonntag, Daniel MedicalVR: Towards Medical Remote Collaboration Using Virtual Reality Inproceedings Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct, pp. 321-324, ACM, 2016. @inproceedings{8771, title = {MedicalVR: Towards Medical Remote Collaboration Using Virtual Reality}, author = {Andreas Luxenburger and Alexander Prange and Mehdi Moniri and Daniel Sonntag}, url = {http://doi.acm.org/10.1145/2968219.2971392}, year = {2016}, date = {2016-01-01}, booktitle = {Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct}, pages = {321-324}, publisher = {ACM}, abstract = {We present a virtual reality framework and assistive tool for design practices for new medical environments, grounded on human visual perception, attention and action. This includes an interactive visualization of shared electronic patient records, previously acquired with a remote tablet device, in a virtual environment incorporating hand tracking, eye tracking and a vision-based peripheral view monitoring. The goal is to influence medical environments' affordances, especially for e-health and m-health applications as well as user experience and design conception for tele-medicine.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } We present a virtual reality framework and assistive tool for design practices for new medical environments, grounded on human visual perception, attention and action. This includes an interactive visualization of shared electronic patient records, previously acquired with a remote tablet device, in a virtual environment incorporating hand tracking, eye tracking and a vision-based peripheral view monitoring. The goal is to influence medical environments' affordances, especially for e-health and m-health applications as well as user experience and design conception for tele-medicine. |
Moniri, Mehdi; Luxenburger, Andreas; Sonntag, Daniel Peripheral View Calculation in Virtual Reality Applications Inproceedings Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct, pp. 333-336, ACM, 2016. @inproceedings{8769, title = {Peripheral View Calculation in Virtual Reality Applications}, author = {Mehdi Moniri and Andreas Luxenburger and Daniel Sonntag}, url = {http://doi.acm.org/10.1145/2968219.2971391}, year = {2016}, date = {2016-01-01}, booktitle = {Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct}, pages = {333-336}, publisher = {ACM}, abstract = {We present an application based on a general peripheral view calculation model which extends previous work on attention-based user interfaces that use eye gaze. An intuitive, two dimensional visibility measure based on the concept of solid angle is developed. We determine to which extent an object of interest, observed by a user, intersects with each region of the underlying visual field model. The results are weighted (thereby considering the visual acuity in each visual field) to determine the total visibility of the object. As a proof of concept, we exemplify the proposed model in a virtual reality application which incorporates a head-mounted display with integrated eye tracking functionality. In this context, we implement several proactive system behaviors including contextual information presentation with an adaptive level of detail and attention guidance; the latter is implemented by detecting visual acuity limitations or attention drifts.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } We present an application based on a general peripheral view calculation model which extends previous work on attention-based user interfaces that use eye gaze. An intuitive, two dimensional visibility measure based on the concept of solid angle is developed. We determine to which extent an object of interest, observed by a user, intersects with each region of the underlying visual field model. The results are weighted (thereby considering the visual acuity in each visual field) to determine the total visibility of the object. As a proof of concept, we exemplify the proposed model in a virtual reality application which incorporates a head-mounted display with integrated eye tracking functionality. In this context, we implement several proactive system behaviors including contextual information presentation with an adaptive level of detail and attention guidance; the latter is implemented by detecting visual acuity limitations or attention drifts. |
Luxenburger, Andreas; Sonntag, Daniel Immersive Virtual Reality Games for Persuasion Inproceedings Meschtscherjakov, Alexander; Ruyter, Boris De; Fuchsberger, Verena; Murer, Martin; Tscheligi, Manfred (Ed.): Proceedings of the 11th International Conference on Persuasive Technology 2016: Adjunct, pp. 110-111, o.A., 2016. @inproceedings{8930, title = {Immersive Virtual Reality Games for Persuasion}, author = {Andreas Luxenburger and Daniel Sonntag}, editor = {Alexander Meschtscherjakov and Boris De Ruyter and Verena Fuchsberger and Martin Murer and Manfred Tscheligi}, url = {https://www.dfki.de/fileadmin/user_upload/import/8930_DC_Persuasive'16_Luxenburger.pdf}, year = {2016}, date = {2016-01-01}, booktitle = {Proceedings of the 11th International Conference on Persuasive Technology 2016: Adjunct}, pages = {110-111}, publisher = {o.A.}, abstract = {Virtual reality (VR) can create stunning and memorable experiences and has been used in many different areas such as entertainment and simulation. Immersion states a key factor in this context. Fusing immersive virtual environments with persuasive technology (PT) in a game setting paves the way for creating interactive platforms aiming at user-oriented behavioral change. This work outlines important aspects for designing an immersive VR game platform for persuasion. Our future research aims at investigating how and to which extent recent advances in intelligent user interfaces (IUIs) can benefit immersion and persuasion. In particular, this includes how interactions with a persuasive VR game platform are influenced by contextual or individual conditions and how associated designs can be adapted to target audiences in specific, like medical or educational contexts.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } Virtual reality (VR) can create stunning and memorable experiences and has been used in many different areas such as entertainment and simulation. Immersion states a key factor in this context. Fusing immersive virtual environments with persuasive technology (PT) in a game setting paves the way for creating interactive platforms aiming at user-oriented behavioral change. This work outlines important aspects for designing an immersive VR game platform for persuasion. Our future research aims at investigating how and to which extent recent advances in intelligent user interfaces (IUIs) can benefit immersion and persuasion. In particular, this includes how interactions with a persuasive VR game platform are influenced by contextual or individual conditions and how associated designs can be adapted to target audiences in specific, like medical or educational contexts. |
2015 |
Inproceedings |
Orlosky, Jason; Weber, Markus; Gu, Yecheng; Sonntag, Daniel; Sosnovsky, Sergey An Interactive Pedestrian Environment Simulator for Cognitive Monitoring and Evaluation Inproceedings Proceedings of the 20th International Conference on Intelligent User Interfaces Companion, pp. 57-60, ACM, 2015. @inproceedings{7676, title = {An Interactive Pedestrian Environment Simulator for Cognitive Monitoring and Evaluation}, author = {Jason Orlosky and Markus Weber and Yecheng Gu and Daniel Sonntag and Sergey Sosnovsky}, url = {https://www.dfki.de/fileadmin/user_upload/import/7676_2015_An_Interactive_Pedestrian_Environment_Simulator_for_Cognitive_Monitoring_and_Evaluation.pdf}, year = {2015}, date = {2015-01-01}, booktitle = {Proceedings of the 20th International Conference on Intelligent User Interfaces Companion}, pages = {57-60}, publisher = {ACM}, abstract = {Recent advances in virtual and augmented reality have led to the development of a number of simulations for different applications. In particular, simulations for monitoring, evaluation, training, and education have started to emerge for the consumer market due to the availability and affordability of immersive display technology. In this work, we introduce a virtual reality environment that provides an immersive traffic simulation designed to observe behavior and monitor relevant skills and abilities of pedestrians who may be at risk, such as elderly persons with cognitive impairments. The system provides basic reactive functionality, such as display of navigation instructions and notifications of dangerous obstacles during navigation tasks. Methods for interaction using hand and arm gestures are also implemented to allow users to explore the environment in a more natural manner.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } Recent advances in virtual and augmented reality have led to the development of a number of simulations for different applications. In particular, simulations for monitoring, evaluation, training, and education have started to emerge for the consumer market due to the availability and affordability of immersive display technology. In this work, we introduce a virtual reality environment that provides an immersive traffic simulation designed to observe behavior and monitor relevant skills and abilities of pedestrians who may be at risk, such as elderly persons with cognitive impairments. The system provides basic reactive functionality, such as display of navigation instructions and notifications of dangerous obstacles during navigation tasks. Methods for interaction using hand and arm gestures are also implemented to allow users to explore the environment in a more natural manner. |
Prange, Alexander; Sonntag, Daniel Easy Deployment of Spoken Dialogue Technology on Smartwatches for Mental Healthcare Inproceedings Pervasive Computing Paradigms for Mental Health - 5th International Conference, MindCare 2015, Milan, Italy, September 24-25, 2015, Revised Selected Papers, pp. 150-156, Springer, 2015. @inproceedings{11231, title = {Easy Deployment of Spoken Dialogue Technology on Smartwatches for Mental Healthcare}, author = {Alexander Prange and Daniel Sonntag}, doi = {https://doi.org/10.1007/978-3-319-32270-4_15}, year = {2015}, date = {2015-01-01}, booktitle = {Pervasive Computing Paradigms for Mental Health - 5th International Conference, MindCare 2015, Milan, Italy, September 24-25, 2015, Revised Selected Papers}, volume = {604}, pages = {150-156}, publisher = {Springer}, abstract = {Smartwatches are becoming increasingly sophisticated and popular as several major smartphone manufacturers, including Apple, have released their new models recently. We believe that these devices can serve as smart objects for people suffering from mental disorders such as memory loss. In this paper, we describe how to utilise smartwatches to create intelligent user interfaces that can be used to provide cognitive assistance in daily life situations of dementia patients. By using automatic speech recognisers and text-to-speech synthesis, we create a dialogue application that allows patients to interact through natural language. We compare several available libraries for Android and show an example of integrating a smartwatch application into an existing healthcare infrastructure.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } Smartwatches are becoming increasingly sophisticated and popular as several major smartphone manufacturers, including Apple, have released their new models recently. We believe that these devices can serve as smart objects for people suffering from mental disorders such as memory loss. In this paper, we describe how to utilise smartwatches to create intelligent user interfaces that can be used to provide cognitive assistance in daily life situations of dementia patients. By using automatic speech recognisers and text-to-speech synthesis, we create a dialogue application that allows patients to interact through natural language. We compare several available libraries for Android and show an example of integrating a smartwatch application into an existing healthcare infrastructure. |
Prange, Alexander; Sandrala, Indra Praveen; Weber, Markus; Sonntag, Daniel Robot Companions and Smartpens for Improved Social Communication of Dementia Patients Inproceedings Proceedings of the 20th International Conference on Intelligent User Interfaces Companion, Association for Computing Machinery, 2015. @inproceedings{11232, title = {Robot Companions and Smartpens for Improved Social Communication of Dementia Patients}, author = {Alexander Prange and Indra Praveen Sandrala and Markus Weber and Daniel Sonntag}, doi = {https://doi.org/10.1145/2732158.2732174}, year = {2015}, date = {2015-01-01}, booktitle = {Proceedings of the 20th International Conference on Intelligent User Interfaces Companion}, publisher = {Association for Computing Machinery}, abstract = {In this demo paper we describe how a digital pen and a humanoid robot companion can improve the social communication of a dementia patient. We propose the use of NAO, a humanoid robot, as a companion to the dementia patient in order to continuously monitor his or her activities and provide cognitive assistance in daily life situations. For example, patients can communicate with NAO through natural language by the speech dialogue functionality we integrated. Most importantly, to improve communication, i.e., sending digital messages (texting, emails), we propose the usage of a smartpen, where the patients write messages on normal paper with an invisible dot pattern to initiate hand-writing and sketch recognition in real-time. The smartpen application is embedded into the human-robot speech dialogue.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } In this demo paper we describe how a digital pen and a humanoid robot companion can improve the social communication of a dementia patient. We propose the use of NAO, a humanoid robot, as a companion to the dementia patient in order to continuously monitor his or her activities and provide cognitive assistance in daily life situations. For example, patients can communicate with NAO through natural language by the speech dialogue functionality we integrated. Most importantly, to improve communication, i.e., sending digital messages (texting, emails), we propose the usage of a smartpen, where the patients write messages on normal paper with an invisible dot pattern to initiate hand-writing and sketch recognition in real-time. The smartpen application is embedded into the human-robot speech dialogue. |
Prange, Alexander; Toyama, Takumi; Sonntag, Daniel Towards Gaze and Gesture Based Human-Robot Interaction for Dementia Patients Inproceedings 2015 AAAI Fall Symposia, Arlington, Virginia, USA, November 12-14, 2015, pp. 111-113, AAAI Press, 2015. @inproceedings{11233, title = {Towards Gaze and Gesture Based Human-Robot Interaction for Dementia Patients}, author = {Alexander Prange and Takumi Toyama and Daniel Sonntag}, url = {http://www.aaai.org/ocs/index.php/FSS/FSS15/paper/view/11696}, year = {2015}, date = {2015-01-01}, booktitle = {2015 AAAI Fall Symposia, Arlington, Virginia, USA, November 12-14, 2015}, pages = {111-113}, publisher = {AAAI Press}, abstract = {Gaze and gestures are important modalities in human-human interactions and hence important to human-robot interaction. We describe how to use human gaze and robot pointing gestures to disambiguate and extend a human-robot speech dialogue developed for aiding people suffering from dementia.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } Gaze and gestures are important modalities in human-human interactions and hence important to human-robot interaction. We describe how to use human gaze and robot pointing gestures to disambiguate and extend a human-robot speech dialogue developed for aiding people suffering from dementia. |
Technical Reports |
Barz, Michael; Bulling, Andreas; Daiber, Florian Computational Modelling and Prediction of Gaze Estimation Error for Head-mounted Eye Trackers Technical Report DFKI , 2015. @techreport{7619, title = {Computational Modelling and Prediction of Gaze Estimation Error for Head-mounted Eye Trackers}, author = {Michael Barz and Andreas Bulling and Florian Daiber}, url = {https://www.dfki.de/fileadmin/user_upload/import/7619_gazequality.pdf}, year = {2015}, date = {2015-01-01}, volume = {1}, institution = {DFKI}, abstract = {Gaze estimation error is inherent in head-mounted eye trackers and seriously impacts performance, usability, and user experience of gaze-based interfaces. Particularly in mobile settings, this error varies constantly as users move in front and look at different parts of a display. We envision a new class of gaze-based interfaces that are aware of the gaze estimation error and adapt to it in real time. As a first step towards this vision we introduce an error model that is able to predict the gaze estimation error. Our method covers major building blocks of mobile gaze estimation, specifically mapping of pupil positions to scene camera coordinates, marker-based display detection, and mapping of gaze from scene camera to on-screen coordinates. We develop our model through a series of principled measurements of a state-of-the-art head-mounted eye tracker.}, keywords = {}, pubstate = {published}, tppubtype = {techreport} } Gaze estimation error is inherent in head-mounted eye trackers and seriously impacts performance, usability, and user experience of gaze-based interfaces. Particularly in mobile settings, this error varies constantly as users move in front and look at different parts of a display. We envision a new class of gaze-based interfaces that are aware of the gaze estimation error and adapt to it in real time. As a first step towards this vision we introduce an error model that is able to predict the gaze estimation error. Our method covers major building blocks of mobile gaze estimation, specifically mapping of pupil positions to scene camera coordinates, marker-based display detection, and mapping of gaze from scene camera to on-screen coordinates. We develop our model through a series of principled measurements of a state-of-the-art head-mounted eye tracker. |
Sonntag, Daniel ISMAR 2015 Tutorial on Intelligent User Interfaces Technical Report DFKI , 2015. @techreport{8134, title = {ISMAR 2015 Tutorial on Intelligent User Interfaces}, author = {Daniel Sonntag}, url = {https://www.dfki.de/fileadmin/user_upload/import/8134_ISMAR-2015-IUI-TUTORIAL.pdf}, year = {2015}, date = {2015-01-01}, volume = {1}, institution = {DFKI}, abstract = {IUIs aim to incorporate intelligent automated capabilities in human-computer interaction, where the net impact is a human-computer interaction that improves performance or usability in critical ways. It also involves designing and implementing an artificial intelligence (AI) component that effectively leverages human skills and capabilities, so that human performance with an application excels. IUIs embody capabilities that have traditionally been associated more strongly with humans than with computers: how to perceive, interpret, learn, use language, reason, plan, and decide.}, keywords = {}, pubstate = {published}, tppubtype = {techreport} } IUIs aim to incorporate intelligent automated capabilities in human-computer interaction, where the net impact is a human-computer interaction that improves performance or usability in critical ways. It also involves designing and implementing an artificial intelligence (AI) component that effectively leverages human skills and capabilities, so that human performance with an application excels. IUIs embody capabilities that have traditionally been associated more strongly with humans than with computers: how to perceive, interpret, learn, use language, reason, plan, and decide. |
2014 |
Incollections |
Sonntag, Daniel; Porta, Daniel Intelligent Semantic Mediation, Knowledge Acquisition and User Interaction Incollection Grallert, Hans-Joachim; Weiss, Stefan; Friedrich, Hermann; Widenka, Thomas; Wahlster, Wolfgang (Ed.): Towards the Internet of Services: The THESEUS Research Program, pp. 179-189, Springer, 2014. @incollection{7752, title = {Intelligent Semantic Mediation, Knowledge Acquisition and User Interaction}, author = {Daniel Sonntag and Daniel Porta}, editor = {Hans-Joachim Grallert and Stefan Weiss and Hermann Friedrich and Thomas Widenka and Wolfgang Wahlster}, url = {http://link.springer.com/chapter/10.1007/978-3-319-06755-1_14}, year = {2014}, date = {2014-01-01}, booktitle = {Towards the Internet of Services: The THESEUS Research Program}, pages = {179-189}, publisher = {Springer}, abstract = {The design and implementation of combined mobile and touchscreen-based multimodal Web 3.0 interfaces should include new approaches to intelligent semantic mediation, knowledge acquisition and user interaction when dealing with a semantic-based digitalization of mostly unstructured textual or image-based source information. In this article, we propose a semantic-based model for those three tasks. The technical components rely on semantic web data structures in order to, first, transcend the traditional keyboard and mouse interaction metaphors, and second, provide the representation structures for more complex, collaborative interaction scenarios that may combine mobile with terminal-based interaction to accommodate the growing need to store, organize, and retrieve all these data. Interactive knowledge acquisition plays a major role in increasing the quality of automatic annotations as well as the usability of different intelligent user interfaces to control, correct, and add annotations to unstructured text and image sources. Examples are provided in the context of the Medico and Texo use cases.}, keywords = {}, pubstate = {published}, tppubtype = {incollection} } The design and implementation of combined mobile and touchscreen-based multimodal Web 3.0 interfaces should include new approaches to intelligent semantic mediation, knowledge acquisition and user interaction when dealing with a semantic-based digitalization of mostly unstructured textual or image-based source information. In this article, we propose a semantic-based model for those three tasks. The technical components rely on semantic web data structures in order to, first, transcend the traditional keyboard and mouse interaction metaphors, and second, provide the representation structures for more complex, collaborative interaction scenarios that may combine mobile with terminal-based interaction to accommodate the growing need to store, organize, and retrieve all these data. Interactive knowledge acquisition plays a major role in increasing the quality of automatic annotations as well as the usability of different intelligent user interfaces to control, correct, and add annotations to unstructured text and image sources. Examples are provided in the context of the Medico and Texo use cases. |
Inproceedings |
Toyama, Takumi; Sonntag, Daniel; Matsuda, Takahiro; Dengel, Andreas; Iwamura, Masakazu; Kise, Koichi A Mixed Reality Head-Mounted Text Translation System Using Eye Gaze Input Inproceedings Proceedings of the 2014 international conference on Intelligent user interfaces, pp. 329-334, ACM, 2014. @inproceedings{7409, title = {A Mixed Reality Head-Mounted Text Translation System Using Eye Gaze Input}, author = {Takumi Toyama and Daniel Sonntag and Takahiro Matsuda and Andreas Dengel and Masakazu Iwamura and Koichi Kise}, year = {2014}, date = {2014-01-01}, booktitle = {Proceedings of the 2014 international conference on Intelligent user interfaces}, pages = {329-334}, publisher = {ACM}, abstract = {Efficient text recognition has recently been a challenge for augmented reality systems. In this paper, we propose a system with the ability to provide translations to the user in real-time. We use eye gaze for more intuitive and efficient input for ubiquitous text reading and translation in head mounted displays (HMDs). The eyes can be used to indicate regions of interest in text documents and activate optical-character-recognition (OCR) and translation functions. Visual feedback and navigation help in the interaction process, and text snippets with Japanese-to-English translations are presented in a see-through HMD. We focus on travelers who go to Japan and need to read signs, and propose two different gaze gestures for activating the OCR text reading and translation function. We evaluate which type of gesture suits our OCR scenario best. We also show that our gaze-based OCR method on the extracted gaze regions provides faster access times to information than traditional OCR approaches. Other benefits are that visual feedback on the extracted text region can be given in real-time, that the Japanese-to-English translation can be presented in real-time, and that augmentations in this synchronized and calibrated mixed-reality HMD application are presented at exact locations in the augmented user view, allowing for dynamic text translation management in head-up display systems.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } Efficient text recognition has recently been a challenge for augmented reality systems. In this paper, we propose a system with the ability to provide translations to the user in real-time. We use eye gaze for more intuitive and efficient input for ubiquitous text reading and translation in head mounted displays (HMDs). The eyes can be used to indicate regions of interest in text documents and activate optical-character-recognition (OCR) and translation functions. Visual feedback and navigation help in the interaction process, and text snippets with Japanese-to-English translations are presented in a see-through HMD. We focus on travelers who go to Japan and need to read signs, and propose two different gaze gestures for activating the OCR text reading and translation function. We evaluate which type of gesture suits our OCR scenario best. We also show that our gaze-based OCR method on the extracted gaze regions provides faster access times to information than traditional OCR approaches. 
Other benefits are that visual feedback on the extracted text region can be given in real-time, that the Japanese-to-English translation can be presented in real-time, and that augmentations in this synchronized and calibrated mixed-reality HMD application are presented at exact locations in the augmented user view, allowing for dynamic text translation management in head-up display systems. |
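The activation path described above can be sketched end-to-end; pytesseract is one possible OCR backend (assuming a Tesseract install with Japanese language data), and translate() is a hypothetical stand-in for whatever MT service the HMD system calls:

```python
# Sketch: a gaze gesture selects a region, OCR extracts Japanese text,
# and a placeholder translation step produces the overlay for the HMD.
from PIL import Image
import pytesseract

def gaze_region(frame, fixation_xy, size=200):
    x, y = fixation_xy
    return frame.crop((x - size // 2, y - size // 2,
                       x + size // 2, y + size // 2))

def translate(text_ja):
    # Hypothetical stand-in for a Japanese-to-English MT backend.
    return f"<EN translation of: {text_ja.strip()}>"

def on_gaze_gesture(frame: Image.Image, fixation_xy):
    """Called when the gaze-gesture detector fires on a text region."""
    snippet = pytesseract.image_to_string(gaze_region(frame, fixation_xy),
                                          lang="jpn")
    return translate(snippet)  # rendered into the see-through HMD
```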
Orlosky, Jason; Toyama, Takumi; Sonntag, Daniel; Sárkány, András; Lőrincz, András On-body Multi-input Indoor Localization for Dynamic Emergency Scenarios: Fusion of Magnetic Tracking and Optical Character Recognition with Mixed-reality Display Inproceedings Proceedings of the 2014 International Conference on Pervasive Computing and Communications Workshops, pp. 320-325, IEEE, 2014. @inproceedings{7410, title = {On-body multi-input indoor localization for dynamic emergency scenarios: fusion of magnetic tracking and optical character recognition with mixed-reality display}, author = {Jason Orlosky and Takumi Toyama and Daniel Sonntag and András Sárkány and András Lőrincz}, url = {https://www.dfki.de/fileadmin/user_upload/import/7410_2014_On-body_multi-input_indoor_localization_for_dynamic_emergency_scenarios-_Fusion_of_magnetic_tracking_and_optical_character_recognition_with_mixed-reality_display.pdf}, year = {2014}, date = {2014-01-01}, booktitle = {Proceedings of the 2014 International Conference on Pervasive Computing and Communications Workshops}, pages = {320-325}, publisher = {IEEE}, abstract = {Indoor navigation in emergency scenarios poses a challenge to evacuation and emergency support, especially for injured or physically encumbered individuals. Navigation systems must be lightweight, easy to use, and provide robust localization and accurate navigation instructions in adverse conditions. To address this challenge, we combine magnetic location tracking with an optical character recognition (OCR) and eye gaze based method to recognize door plates and position-related text to provide more robust localization. In contrast to typical wireless or sensor based tracking, our fused system can be used in low-lighting, smoke, and areas without power or wireless connectivity. Eye gaze tracking is also used to improve time to localization and accuracy of the OCR algorithm. Once localized, navigation instructions are transmitted directly into the user's immediate field of view via head mounted display (HMD). Additionally, setting up the system is simple and can be done with minimal calibration, requiring only a walk-through of the environment and numerical annotation of a 2D area map. We conduct an evaluation for the magnetic and OCR systems individually to evaluate feasibility for use in the fused framework.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } Indoor navigation in emergency scenarios poses a challenge to evacuation and emergency support, especially for injured or physically encumbered individuals. Navigation systems must be lightweight, easy to use, and provide robust localization and accurate navigation instructions in adverse conditions. To address this challenge, we combine magnetic location tracking with an optical character recognition (OCR) and eye gaze based method to recognize door plates and position-related text to provide more robust localization. In contrast to typical wireless or sensor based tracking, our fused system can be used in low-lighting, smoke, and areas without power or wireless connectivity. Eye gaze tracking is also used to improve time to localization and accuracy of the OCR algorithm. Once localized, navigation instructions are transmitted directly into the user's immediate field of view via head mounted display (HMD). Additionally, setting up the system is simple and can be done with minimal calibration, requiring only a walk-through of the environment and numerical annotation of a 2D area map. We conduct an evaluation for the magnetic and OCR systems individually to evaluate feasibility for use in the fused framework. |
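A minimal fusion rule conveys the idea: dead-reckon with the magnetic estimate and, whenever OCR recognizes an annotated door plate, blend toward that plate's known map position. The plate map and blend weight are invented for illustration:

```python
# Sketch: correct a drifting magnetic position estimate with an OCR
# landmark fix from a numerically annotated 2D area map.
door_plates = {"217": (12.0, 3.5), "218": (15.0, 3.5)}  # room label -> (x, y)

def fuse(magnetic_xy, ocr_label=None, trust_ocr=0.8):
    """Return the corrected position for the HMD navigation display."""
    if ocr_label not in door_plates:
        return magnetic_xy                       # no fix: keep dead reckoning
    px, py = door_plates[ocr_label]
    mx, my = magnetic_xy
    return (trust_ocr * px + (1 - trust_ocr) * mx,
            trust_ocr * py + (1 - trust_ocr) * my)

print(fuse((11.2, 4.1), ocr_label="217"))        # snaps toward room 217
```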
Palotai, Zsolt; Láng, Miklós; Sárkány, András; Tősér, Zoltán; Sonntag, Daniel; Toyama, Takumi; Lőrincz, András LabelMovie: a Semi-supervised Machine Annotation Tool with Quality Assurance and Crowd-sourcing Options for Videos Inproceedings Proceedings of the 12th International Workshop on Content-Based Multimedia Indexing, IEEE, 2014. @inproceedings{7411, title = {LabelMovie: a Semi-supervised Machine Annotation Tool with Quality Assurance and Crowd-sourcing Options for Videos}, author = {Zsolt Palotai and Miklós Láng and András Sárkány and Zoltán Tősér and Daniel Sonntag and Takumi Toyama and András Lőrincz}, url = {https://www.dfki.de/fileadmin/user_upload/import/7411_2014_LabelMovie-_Semi-supervised_Machine_Annotation_Tool_with_Quality_Assurance_and_Crowd-sourcing_Options_for_Videos.pdf}, year = {2014}, date = {2014-01-01}, booktitle = {Proceedings of the 12th International Workshop on Content-Based Multimedia Indexing}, publisher = {IEEE}, abstract = {For multiple reasons, the automatic annotation of video recordings is challenging: first, the amount of database video instances to be annotated is huge, second, tedious manual labelling sessions are required, third, the multimodal annotation needs exact information of space, time, and context, fourth, the different labelling opportunities (e.g., for the case of affects) require special agreements between annotators, and so forth. Crowdsourcing with quality assurance by experts may come to the rescue here. We have developed a special tool where individual experts can annotate videos over the Internet, their work can be joined and filtered, the annotated material can be evaluated by machine learning methods, and automated annotation starts according to a predefined confidence level. Qualitative manual labelling instances by humans, the seeds, assure that relatively small samples of manual annotations can effectively bootstrap the machine annotation procedure. The annotation tool features special visualization methods for crowd-sourced users not familiar with machine learning methods and, in turn, ignites the bootstrapping process.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } For multiple reasons, the automatic annotation of video recordings is challenging: first, the amount of database video instances to be annotated is huge, second, tedious manual labelling sessions are required, third, the multimodal annotation needs exact information of space, time, and context, fourth, the different labelling opportunities (e.g., for the case of affects) require special agreements between annotators, and so forth. Crowdsourcing with quality assurance by experts may come to the rescue here. We have developed a special tool where individual experts can annotate videos over the Internet, their work can be joined and filtered, the annotated material can be evaluated by machine learning methods, and automated annotation starts according to a predefined confidence level. Qualitative manual labelling instances by humans, the seeds, assure that relatively small samples of manual annotations can effectively bootstrap the machine annotation procedure. The annotation tool features special visualization methods for crowd-sourced users not familiar with machine learning methods and, in turn, ignites the bootstrapping process. |
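The confidence-thresholded bootstrapping described above can be sketched in a few lines. This is a toy, margin-based variant on invented data, not the tool's actual learner: points the classifier is confident about are auto-labelled, the rest stay in the human annotation queue.

def centroid(points):
    return tuple(sum(xs) / len(xs) for xs in zip(*points))

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def self_train_round(seeds, unlabeled, threshold=0.7):
    # seeds: {feature_tuple: label}; returns (auto_labels, still_for_humans)
    groups = {}
    for x, y in seeds.items():
        groups.setdefault(y, []).append(x)
    centroids = {y: centroid(xs) for y, xs in groups.items()}
    auto, queue = {}, []
    for x in unlabeled:
        ranked = sorted((dist(x, c), y) for y, c in centroids.items())
        (d1, best), (d2, _) = ranked[0], ranked[1]
        confidence = 1.0 - d1 / (d1 + d2 + 1e-9)   # crude margin-based score
        if confidence >= threshold:
            auto[x] = best                         # machine annotation kicks in
        else:
            queue.append(x)                        # stays with the annotators
    return auto, queue

seeds = {(0.0, 0.1): "happy", (1.0, 0.9): "sad"}
print(self_train_round(seeds, [(0.1, 0.0), (0.5, 0.5)]))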
Orlosky, Jason; Toyama, Takumi; Sonntag, Daniel; Kiyokawa, Kiyoshi Using Eye-gaze and Visualization To Augment Memory: A Framework for Improving Context Recognition and Recall Inproceedings Proceedings of the 16th International Conference on Human-Computer Interaction, pp. 282-291, LNCS Springer, 2014. @inproceedings{7412, title = {Using Eye-gaze and Visualization To Augment Memory: A Framework for Improving Context Recognition and Recall}, author = {Jason Orlosky and Takumi Toyama and Daniel Sonntag and Kiyoshi Kiyokawa}, url = {https://www.dfki.de/fileadmin/user_upload/import/7412_2014_Using_Eye-Gaze_and_Visualization_to_Augment_Memory_.pdf}, year = {2014}, date = {2014-01-01}, booktitle = {Proceedings of the 16th International Conference on Human-Computer Interaction}, pages = {282-291}, publisher = {LNCS Springer}, abstract = {In our everyday lives, bits of important information are lost due to the fact that our brain fails to convert a large portion of short term memory into long term memory. In this paper, we propose a framework that uses an eye-tracking interface to store pieces of forgotten information and present them back to the user later with an integrated head mounted display (HMD). This process occurs in three main steps, including context recognition, data storage, and augmented reality (AR) display. We demonstrate the system’s ability to recall information with the example of a lost book page by detecting when the user reads the book again and intelligently presenting the last read position back to the user. Two short user evaluations show that the system can recall book pages within 40 milliseconds, and that the position where a user left off can be calculated with approximately 0.5 centimeter accuracy.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } In our everyday lives, bits of important information are lost due to the fact that our brain fails to convert a large portion of short term memory into long term memory. In this paper, we propose a framework that uses an eye-tracking interface to store pieces of forgotten information and present them back to the user later with an integrated head mounted display (HMD). This process occurs in three main steps, including context recognition, data storage, and augmented reality (AR) display. We demonstrate the system’s ability to recall information with the example of a lost book page by detecting when the user reads the book again and intelligently presenting the last read position back to the user. Two short user evaluations show that the system can recall book pages within 40 milliseconds, and that the position where a user left off can be calculated with approximately 0.5 centimeter accuracy. |
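The store-and-recall loop behind this framework can be illustrated as follows; the recognizer callbacks, identifiers, and values are hypothetical stand-ins for whatever the real context recognition delivers.

import time

memory = {}   # document id -> (page, gaze position in cm, timestamp)

def on_reading_detected(doc_id, page, gaze_pos_cm):
    # overwrite the stored context each time the user is seen reading
    memory[doc_id] = (page, gaze_pos_cm, time.time())

def on_document_recognized(doc_id):
    # called when the book is detected again; returns the HMD overlay text
    if doc_id not in memory:
        return None
    page, pos, _ = memory[doc_id]
    return "Resume at page %d, position %s" % (page, str(pos))

on_reading_detected("book-42", 137, (3.5, 12.0))
print(on_document_recognized("book-42"))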
Toyama, Takumi; Orlosky, Jason; Sonntag, Daniel; Kiyokawa, Kiyoshi A natural interface for multi-focal plane head mounted displays using 3D gaze Inproceedings Proceedings of the 2014 International Working Conference on Advanced Visual Interfaces, pp. 25-32, ACM, 2014. @inproceedings{7413, title = {A natural interface for multi-focal plane head mounted displays using 3D gaze}, author = {Takumi Toyama and Jason Orlosky and Daniel Sonntag and Kiyoshi Kiyokawa}, year = {2014}, date = {2014-01-01}, booktitle = {Proceedings of the 2014 International Working Conference on Advanced Visual Interfaces}, pages = {25-32}, publisher = {ACM}, abstract = {In mobile augmented reality (AR), it is important to develop interfaces for wearable displays that not only reduce distraction, but that can be used quickly and in a natural manner. In this paper, we propose a focal-plane based interaction approach with several advantages over traditional methods designed for head mounted displays (HMDs) with only one focal plane. Using a novel prototype that combines a monoscopic multi-focal plane HMD and eye tracker, we facilitate interaction with virtual elements such as text or buttons by measuring eye convergence on objects at different depths. This can prevent virtual information from being unnecessarily overlaid onto real world objects that are at a different range, but in the same line of sight. We then use our prototype in a series of experiments testing the feasibility of interaction. Despite only being presented with monocular depth cues, users have the ability to correctly select virtual icons in near, mid, and far planes in 98.6% of cases.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } In mobile augmented reality (AR), it is important to develop interfaces for wearable displays that not only reduce distraction, but that can be used quickly and in a natural manner. In this paper, we propose a focal-plane based interaction approach with several advantages over traditional methods designed for head mounted displays (HMDs) with only one focal plane. Using a novel prototype that combines a monoscopic multi-focal plane HMD and eye tracker, we facilitate interaction with virtual elements such as text or buttons by measuring eye convergence on objects at different depths. This can prevent virtual information from being unnecessarily overlaid onto real world objects that are at a different range, but in the same line of sight. We then use our prototype in a series of experiments testing the feasibility of interaction. Despite only being presented with monocular depth cues, users have the ability to correctly select virtual icons in near, mid, and far planes in 98.6% of cases. |
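A small geometric sketch of the underlying idea, depth from eye convergence followed by nearest-focal-plane selection; the interpupillary distance and plane distances are assumed values, not the prototype's.

import math

IPD_M = 0.063                                     # interpupillary distance (assumed)
PLANES_M = {"near": 0.5, "mid": 1.5, "far": 4.0}  # focal plane distances (assumed)

def depth_from_vergence(vergence_deg):
    # fixation depth from the angle between the two gaze rays
    return (IPD_M / 2) / math.tan(math.radians(vergence_deg) / 2)

def select_plane(vergence_deg):
    d = depth_from_vergence(vergence_deg)
    return min(PLANES_M, key=lambda p: abs(PLANES_M[p] - d))

print(select_plane(7.2))   # ~0.5 m fixation depth -> "near"
print(select_plane(0.9))   # ~4 m fixation depth -> "far"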
2013 |
Incollections |
Sonntag, Daniel; Zillner, Sonja; Schulz, Christian Husodo; Toyama, Takumi; Weber, Markus Towards Medical Cyber-Physical Systems: Multimodal Augmented Reality for Doctors and Knowledge Discovery about Patients Incollection Marcus, Aaron (Ed.): Design, User Experience, and Usability. User Experience in Novel Technological Environments, pp. 401-410, Springer, 2013. @incollection{7165, title = {Towards Medical Cyber-Physical Systems: Multimodal Augmented Reality for Doctors and Knowledge Discovery about Patients}, author = {Daniel Sonntag and Sonja Zillner and Christian Husodo Schulz and Takumi Toyama and Markus Weber}, editor = {Aaron Marcus}, url = {https://www.dfki.de/fileadmin/user_upload/import/7165_2013_Vision-Based_Location-Awareness_in_Augmented_Reality_Applications.pdf}, year = {2013}, date = {2013-01-01}, booktitle = {Design, User Experience, and Usability. User Experience in Novel Technological Environments}, pages = {401-410}, publisher = {Springer}, abstract = {In the medical domain, which is becoming more and more digital, every improvement in efficiency and effectiveness really counts. Doctors must be able to retrieve data easily and provide their input in the most convenient way. With new technologies towards medical cyber-physical systems, such as networked head-mounted displays (HMDs) and eye trackers, new interaction opportunities arise. With our medical demo in the context of a cancer screening programme, we are combining active speech based input, passive/active eye tracker user input, and HMD output (all devices are on-body and hands-free) in a convenient way for both the patient and the doctor.}, keywords = {}, pubstate = {published}, tppubtype = {incollection} } In the medical domain, which is becoming more and more digital, every improvement in efficiency and effectiveness really counts. Doctors must be able to retrieve data easily and provide their input in the most convenient way. With new technologies towards medical cyber-physical systems, such as networked head-mounted displays (HMDs) and eye trackers, new interaction opportunities arise. With our medical demo in the context of a cancer screening programme, we are combining active speech based input, passive/active eye tracker user input, and HMD output (all devices are on-body and hands-free) in a convenient way for both the patient and the doctor. |
Inproceedings |
Toyama, Takumi; Sonntag, Daniel; Weber, Markus; Schulz, Christian Husodo Gaze-based Online Face Learning and Recognition in Augmented Reality Inproceedings Proceedings of the IUI 2013 Workshop on Interactive Machine Learning, ACM, 2013. @inproceedings{6780, title = {Gaze-based Online Face Learning and Recognition in Augmented Reality}, author = {Takumi Toyama and Daniel Sonntag and Markus Weber and Christian Husodo Schulz}, url = {https://www.dfki.de/fileadmin/user_upload/import/6780_iui2012-ml-workshop2.pdf}, year = {2013}, date = {2013-03-01}, booktitle = {Proceedings of the IUI 2013 Workshop on Interactive Machine Learning}, publisher = {ACM}, abstract = {We propose a new online face learning and recognition approach using user gaze and augmented displays. User gaze is used to select a face in focus in a scene image whereupon visual feedback and information about the detected person is presented in a head mounted display. Our specific medical application leverages the doctor's capabilities of recalling the specific patient context.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } We propose a new online face learning and recognition approach using user gaze and augmented displays. User gaze is used to select a face in focus in a scene image whereupon visual feedback and information about the detected person is presented in a head mounted display. Our specific medical application leverages the doctor’s capabilities of recalling the specific patient context. |
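The gaze-to-face selection and online enrolment loop might look roughly like this; face detection and embeddings are mocked with toy tuples (a real system would use a computer vision library), and all names are invented.

gallery = {}   # person name -> face embedding (toy 2-D tuples here)

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def face_at_gaze(detections, gaze_xy):
    # detections: [(bbox=(x0, y0, x1, y1), embedding)]; pick the fixated face
    gx, gy = gaze_xy
    for (x0, y0, x1, y1), emb in detections:
        if x0 <= gx <= x1 and y0 <= gy <= y1:
            return emb
    return None

def recognize(emb, max_dist=0.5):
    if not gallery:
        return None
    name = min(gallery, key=lambda n: dist(gallery[n], emb))
    return name if dist(gallery[name], emb) <= max_dist else None

def enrol(name, emb):
    # online "learn this face" step, e.g. triggered from the dialogue system
    gallery[name] = emb

face = face_at_gaze([((100, 50, 180, 150), (0.2, 0.7))], (120, 90))
enrol("patient-07", face)
print(recognize((0.25, 0.68)))   # -> "patient-07"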
Sonntag, Daniel; Toyama, Takumi On-body IE: A Head-Mounted Multimodal Augmented Reality System for Learning and Recalling Faces Inproceedings 9th International Conference on Intelligent Environments, IEEE, 2013. @inproceedings{7039, title = {On-body IE: A Head-Mounted Multimodal Augmented Reality System for Learning and Recalling Faces}, author = {Daniel Sonntag and Takumi Toyama}, year = {2013}, date = {2013-01-01}, booktitle = {9th International Conference on Intelligent Environments}, publisher = {IEEE}, abstract = {We present a new augmented reality (AR) system for knowledge-intensive location-based expert work. The multi-modal interaction system combines multiple on-body input and output devices: a speech-based dialogue system, a head-mounted augmented reality display (HMD), and a head-mounted eye-tracker. The interaction devices have been selected to augment and improve the expert work in a specific medical application context which shows its potential. In the sensitive domain of examining patients in a cancer screening program we try to combine several active user input devices in the most convenient way for both the patient and the doctor. The resulting multimodal AR is an on-body intelligent environment (IE) and has the potential to yield higher performance outcomes and provides a direct data acquisition control mechanism. It leverages the doctor's capabilities of recalling the specific patient context by a virtual, context-based patient-specific "external brain" for the doctor which can remember patient faces and adapts the virtual augmentation according to the specific patient observation and finding context. In addition, patient data can be displayed on the HMD -- triggered by voice or object/patient recognition. The learned (patient) faces and immovable objects (e.g., a big medical device) define the environmental clues to make the context-dependent recognition model part of the IE to achieve specific goals for the doctors in the hospital routine.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } We present a new augmented reality (AR) system for knowledge-intensive location-based expert work. The multi-modal interaction system combines multiple on-body input and output devices: a speech-based dialogue system, a head-mounted augmented reality display (HMD), and a head-mounted eye-tracker. The interaction devices have been selected to augment and improve the expert work in a specific medical application context which shows its potential. In the sensitive domain of examining patients in a cancer screening program we try to combine several active user input devices in the most convenient way for both the patient and the doctor. The resulting multimodal AR is an on-body intelligent environment (IE) and has the potential to yield higher performance outcomes and provides a direct data acquisition control mechanism. It leverages the doctor's capabilities of recalling the specific patient context by a virtual, context-based patient-specific "external brain" for the doctor which can remember patient faces and adapts the virtual augmentation according to the specific patient observation and finding context. In addition, patient data can be displayed on the HMD -- triggered by voice or object/patient recognition. The learned (patient) faces and immovable objects (e.g., a big medical device) define the environmental clues to make the context-dependent recognition model part of the IE to achieve specific goals for the doctors in the hospital routine. |
Sonntag, Daniel; Toyama, Takumi Vision-Based Location-Awareness in Augmented Reality Applications Inproceedings 3rd Workshop on Location Awareness for Mixed and Dual Reality (LAMDa’13), ACM Press, 2013. @inproceedings{7164, title = {Vision-Based Location-Awareness in Augmented Reality Applications}, author = {Daniel Sonntag and Takumi Toyama}, url = {https://www.dfki.de/fileadmin/user_upload/import/7164_2013_Vision-Based_Location-Awareness_in_Augmented_Reality_Applications.pdf}, year = {2013}, date = {2013-01-01}, booktitle = {3rd Workshop on Location Awareness for Mixed and Dual Reality (LAMDa’13)}, publisher = {ACM Press}, abstract = {We present an integral HCI approach that incorporates eye-gaze for location-awareness in real-time. A new augmented reality (AR) system for knowledge-intensive location-based work combines multiple on-body input and output devices: a speech-based dialogue system, a head-mounted AR display (HMD), and a head-mounted eye-tracker. The interaction devices have been selected to augment and improve the navigation on a hospital's premises (outdoors and indoors, figure 1) which shows its potential. We focus on the eye-tracker interaction which provides cues for location-awareness.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } We present an integral HCI approach that incorporates eye-gaze for location-awareness in real-time. A new augmented reality (AR) system for knowledge-intensive location-based work combines multiple on-body input and output devices: a speech-based dialogue system, a head-mounted AR display (HMD), and a head-mounted eye-tracker. The interaction devices have been selected to augment and improve the navigation on a hospital’s premises (outdoors and indoors, figure 1) which shows its potential. We focus on the eye-tracker interaction which provides cues for location-awareness. |
Weber, Markus; Schulz, Christian Husodo; Sonntag, Daniel; Toyama, Takumi Digital Pens as Smart Objects in Multimodal Medical Application Frameworks Inproceedings Proceedings of the Second Workshop on Interacting with Smart Objects, in conjunction with IUI’13, ACM Press, 2013. @inproceedings{7166, title = {Digital Pens as Smart Objects in Multimodal Medical Application Frameworks}, author = {Markus Weber and Christian Husodo Schulz and Daniel Sonntag and Takumi Toyama}, year = {2013}, date = {2013-01-01}, booktitle = {Proceedings of the Second Workshop on Interacting with Smart Objects, in conjunction with IUI’13}, publisher = {ACM Press}, abstract = {In this paper, we present a novel mobile interaction system which combines a pen-based interface with a head-mounted display (HMD) for clinical radiology reports in the field of mammography. We consider a digital pen as an anthropocentric smart object, one that allows for a physical, tangible and embodied interaction to enhance data input in a mobile on-body HMD environment. Our system provides an intuitive way for a radiologist to write a structured report with a special pen on normal paper and receive real-time feedback using HMD technology. We will focus on the combination of new interaction possibilities with smart digital pens in this multimodal scenario due to a new real-time visualisation possibility.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } In this paper, we present a novel mobile interaction system which combines a pen-based interface with a head-mounted display (HMD) for clinical radiology reports in the field of mammography. We consider a digital pen as an anthropocentric smart object, one that allows for a physical, tangible and embodied interaction to enhance data input in a mobile on-body HMD environment. Our system provides an intuitive way for a radiologist to write a structured report with a special pen on normal paper and receive real-time feedback using HMD technology. We will focus on the combination of new interaction possibilities with smart digital pens in this multimodal scenario due to a new real-time visualisation possibility. |
Schulz, Christian Husodo; Sonntag, Daniel; Weber, Markus; Toyama, Takumi Multimodal Interaction Strategies in a Multi-Device Environment using Natural Speech Inproceedings Proceedings of the Companion Publication of the 2013 International Conference on Intelligent User Interfaces Companion, ACM Press, 2013. @inproceedings{7167, title = {Multimodal Interaction Strategies in a Multi-Device Environment using Natural Speech}, author = {Christian Husodo Schulz and Daniel Sonntag and Markus Weber and Takumi Toyama}, year = {2013}, date = {2013-01-01}, booktitle = {Proceedings of the Companion Publication of the 2013 International Conference on Intelligent User Interfaces Companion}, publisher = {ACM Press}, abstract = {In this paper we present an intelligent user interface which combines a speech-based interface with several other input modalities. The integration of multiple devices into a working environment should, for example, provide greater flexibility in the daily routine of medical experts. To this end, we will introduce a medical cyber-physical system that demonstrates the use of a bidirectional connection between a speech-based interface and a head-mounted see-through display. We will show examples of how we can exploit multiple input modalities and thus increase the usability of a speech-based interaction system.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } In this paper we present an intelligent user interface which combines a speech-based interface with several other input modalities. The integration of multiple devices into a working environment should, for example, provide greater flexibility in the daily routine of medical experts. To this end, we will introduce a medical cyber-physical system that demonstrates the use of a bidirectional connection between a speech-based interface and a head-mounted see-through display. We will show examples of how we can exploit multiple input modalities and thus increase the usability of a speech-based interaction system. |
2012 |
Inproceedings |
Sonntag, Daniel; Schulz, Christian Husodo; Reuschling, Christian; Galarraga, Luis RadSpeech, a mobile dialogue system for radiologists Inproceedings Proceedings of the International Conference on Intelligent User Interfaces (IUI), pp. 317-318, ACM, 2012. @inproceedings{6198, title = {RadSpeech, a mobile dialogue system for radiologists}, author = {Daniel Sonntag and Christian Husodo Schulz and Christian Reuschling and Luis Galarraga}, url = {https://www.dfki.de/fileadmin/user_upload/import/6198_iui-2012-preprint.pdf}, year = {2012}, date = {2012-01-01}, booktitle = {Proceedings of the International Conference on Intelligent User Interfaces (IUI)}, pages = {317-318}, publisher = {ACM}, abstract = {With RadSpeech, we aim to build the next generation of intelligent, scalable, and user-friendly semantic search interfaces for the medical imaging domain, based on semantic technologies. Ontology-based knowledge representation is used not only for the image contents, but also for the complex natural language understanding and dialogue management process. This demo shows a speech-based annotation system for radiology images and focuses on a new and effective way to annotate medical image regions with a specific, structured medical diagnosis while using speech and pointing gestures on the go.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } With RadSpeech, we aim to build the next generation of intelligent, scalable, and user-friendly semantic search interfaces for the medical imaging domain, based on semantic technologies. Ontology-based knowledge representation is used not only for the image contents, but also for the complex natural language understanding and dialogue management process. This demo shows a speech-based annotation system for radiology images and focuses on a new and effective way to annotate medical image regions with a specific, structured medical diagnosis while using speech and pointing gestures on the go. |
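One plausible way to fuse a spoken annotation with a pointing gesture is timestamp alignment, sketched below; the event formats and the time window are assumptions for illustration, not RadSpeech's actual fusion logic.

def fuse(speech_events, gestures, window_s=2.0):
    # speech_events: [(t, annotation)], gestures: [(t, image_region)]
    fused = []
    for ts, annotation in speech_events:
        near = [(abs(ts - tg), region) for tg, region in gestures
                if abs(ts - tg) <= window_s]
        if near:
            fused.append((min(near)[1], annotation))  # closest-in-time gesture wins
    return fused

speech = [(12.4, "enlarged lymph node")]          # hypothetical recognized utterance
pointing = [(11.9, (120, 80, 40)), (30.0, (10, 10, 5))]
print(fuse(speech, pointing))   # [((120, 80, 40), 'enlarged lymph node')]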
2011 |
Inproceedings |
Sonntag, Daniel; Liwicki, Marcus; Weber, Markus Digital Pen in Mammography Patient Forms Inproceedings Proceedings of the 13th International Conference on Multimodal Interfaces, pp. 303-306, ACM, 2011. @inproceedings{5786, title = {Digital Pen in Mammography Patient Forms}, author = {Daniel Sonntag and Marcus Liwicki and Markus Weber}, url = {https://www.dfki.de/fileadmin/user_upload/import/5786_2011_Digital_Pen_in_Mammography_Patient_Forms.pdf http://dl.acm.org/citation.cfm?id=2070537}, year = {2011}, date = {2011-11-01}, booktitle = {Proceedings of the 13th International Conference on Multimodal Interfaces}, pages = {303-306}, publisher = {ACM}, abstract = {We present a digital pen based interface for clinical radiology reports in the field of mammography. It is of utmost importance in future radiology practices that the radiology reports be uniform, comprehensive, and easily managed. This means that reports must be "readable" to humans and machines alike. In order to improve reporting practices in mammography, we allow the radiologist to write structured reports with a special pen on paper with an invisible dot pattern. Handwriting software takes care of the interpretation of the written report, which is transferred into an ontological representation. In addition, a gesture recogniser allows radiologists to encircle predefined annotation suggestions, which turns out to be the most beneficial feature. The radiologist can (1) provide the image and image region annotations mapped to an FMA, RadLex, or ICD10 code, (2) provide free text entries, and (3) correct/select annotations while using multiple gestures on the forms and sketch regions. The resulting, automatically generated PDF report is then stored in a semantic backend system for further use and contains all transcribed annotations as well as all free form sketches.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } We present a digital pen based interface for clinical radiology reports in the field of mammography. It is of utmost importance in future radiology practices that the radiology reports be uniform, comprehensive, and easily managed. This means that reports must be "readable" to humans and machines alike. In order to improve reporting practices in mammography, we allow the radiologist to write structured reports with a special pen on paper with an invisible dot pattern. Handwriting software takes care of the interpretation of the written report, which is transferred into an ontological representation. In addition, a gesture recogniser allows radiologists to encircle predefined annotation suggestions, which turns out to be the most beneficial feature. The radiologist can (1) provide the image and image region annotations mapped to an FMA, RadLex, or ICD10 code, (2) provide free text entries, and (3) correct/select annotations while using multiple gestures on the forms and sketch regions. The resulting, automatically generated PDF report is then stored in a semantic backend system for further use and contains all transcribed annotations as well as all free form sketches. |
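The encircling-gesture step can be approximated by testing whether a closed pen stroke encloses a printed suggestion's bounding box; the form layout and annotation codes below are invented placeholders, not the real forms or RadLex identifiers.

suggestions = {
    "mass":          ((10, 10, 40, 20), "RID-0001"),   # placeholder code
    "calcification": ((10, 30, 60, 40), "RID-0002"),   # placeholder code
}   # printed label -> (bounding box on the form, annotation code)

def is_closed(stroke, tol=5.0):
    # a stroke whose endpoints (nearly) meet counts as an encircling gesture
    (x0, y0), (xn, yn) = stroke[0], stroke[-1]
    return ((x0 - xn) ** 2 + (y0 - yn) ** 2) ** 0.5 <= tol

def encircled_code(stroke):
    # a closed stroke that encloses a suggestion box selects that annotation
    if not is_closed(stroke):
        return None
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    for label, ((x0, y0, x1, y1), code) in suggestions.items():
        if min(xs) <= x0 and max(xs) >= x1 and min(ys) <= y0 and max(ys) >= y1:
            return label, code
    return None

circle = [(8, 8), (42, 8), (42, 22), (8, 22), (8, 9)]
print(encircled_code(circle))   # ('mass', 'RID-0001')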
Sonntag, Daniel; Schulz, Christian Husodo Monitoring and Explaining Reasoning Processes in a Dialogue System’s Input Interpretation Step Inproceedings Proceedings of the International Workshop on Explanation-aware Computing at IJCAI, IJCAI, 2011. @inproceedings{6196, title = {Monitoring and Explaining Reasoning Processes in a Dialogue System's Input Interpretation Step}, author = {Daniel Sonntag and Christian Husodo Schulz}, url = {https://www.dfki.de/fileadmin/user_upload/import/6196_Exact2011_final.pdf}, year = {2011}, date = {2011-07-01}, booktitle = {Proceedings of the International Workshop on Explanation-aware Computing at IJCAI}, publisher = {IJCAI}, abstract = {We implemented a generic speech-based dialogue shell that can be configured for and applied to domain-specific dialogue applications. A toolbox for ontology-based dialogue engineering provides a technical solution for the two challenges of engineering domain extensions for new question and answer possibilities and debugging functional modules. In this paper, we address the process of debugging and maintaining rule-based input interpretation modules. While supporting a rapid implementation cycle until the dialogue system works robustly for a new domain (e.g., the dialogue-based retrieval of medical images), production rules for input interpretation have to be monitored, configured, and maintained. We implemented a special graphical user interface to monitor and explain reasoning processes for the input interpretation phase of multimodal dialogue systems. A particular challenge was the presentation of the software system's ontology-based interaction rules in a way that they were accessible to and editable for humans for maintenance, and, at the same time, allowed a real-time monitoring of their application in the running dialogue system.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } We implemented a generic speech-based dialogue shell that can be configured for and applied to domain-specific dialogue applications. A toolbox for ontology-based dialogue engineering provides a technical solution for the two challenges of engineering domain extensions for new question and answer possibilities and debugging functional modules. In this paper, we address the process of debugging and maintaining rule-based input interpretation modules. While supporting a rapid implementation cycle until the dialogue system works robustly for a new domain (e.g., the dialogue-based retrieval of medical images), production rules for input interpretation have to be monitored, configured, and maintained. We implemented a special graphical user interface to monitor and explain reasoning processes for the input interpretation phase of multimodal dialogue systems. A particular challenge was the presentation of the software system's ontology-based interaction rules in a way that they were accessible to and editable for humans for maintenance, and, at the same time, allowed a real-time monitoring of their application in the running dialogue system. |
Sonntag, Daniel; Liwicki, Marcus; Weber, Markus Interactive Paper for Radiology Findings Inproceedings Proceedings of the 16th International Conference on Intelligent User Interfaces, pp. 459-460, ACM, 2011. @inproceedings{5785, title = {Interactive Paper for Radiology Findings}, author = {Daniel Sonntag and Marcus Liwicki and Markus Weber}, year = {2011}, date = {2011-01-01}, booktitle = {Proceedings of the 16th International Conference on Intelligent User Interfaces}, pages = {459-460}, publisher = {ACM}, abstract = {This paper presents a pen-based interface for clinical radiologists. It is of utmost importance in future radiology practices that the radiology reports be uniform, comprehensive, and easily managed. This means that reports must be "readable" to humans and machines alike. In order to improve reporting practices, we allow the radiologist to write structured reports with a special pen on normal paper. Handwriting recognition and interpretation software takes care of the interpretation of the written report, which is transferred into an ontological representation. The resulting report is then stored in a semantic backend system for further use. We will focus on the pen-based interface and new interaction possibilities with gestures in this scenario.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } This paper presents a pen-based interface for clinical radiologists. It is of utmost importance in future radiology practices that the radiology reports be uniform, comprehensive, and easily managed. This means that reports must be "readable" to humans and machines alike. In order to improve reporting practices, we allow the radiologist to write structured reports with a special pen on normal paper. Handwriting recognition and interpretation software takes care of the interpretation of the written report, which is transferred into an ontological representation. The resulting report is then stored in a semantic backend system for further use. We will focus on the pen-based interface and new interaction possibilities with gestures in this scenario. |
2010 |
Books |
Sonntag, Daniel Ontologies and Adaptivity in Dialogue for Question Answering Book AKA and IOS Press, 2010. @book{4994, title = {Ontologies and Adaptivity in Dialogue for Question Answering}, author = {Daniel Sonntag}, url = {http://www.dfki.de/~sonntag/Ontologies_and_Adaptivity_in_Dialogue_for_Question_Answering.html}, year = {2010}, date = {2010-01-01}, volume = {4}, pages = {410}, publisher = {AKA and IOS Press}, abstract = {Question answering (QA) has become one of the fastest growing topics in computational linguistics and information access. To advance research in the area of dialogue-based question answering, we propose a combination of methods from different scientific fields (i.e., information retrieval, dialogue systems, semantic web and machine learning). This book sheds light on adaptable dialogue-based question answering. We demonstrate the technical and computational feasibility of the proposed ideas, the introspective methods in particular, by beginning with an extensive introduction to the dialogical problem domain which motivates the technical implementation. The ideas have been carried out in a mature natural language processing (NLP) system, the SmartWeb dialogue system, which was developed between 2004 and 2007 by partners from academia and industry. We have attempted to make this book a self-contained text and provide an extra section on the interdisciplinary scientific background. The target audience for this book comprises researchers and students interested in the application potential of semantic technologies for difficult AI tasks such as working dialogue and QA systems.}, keywords = {}, pubstate = {published}, tppubtype = {book} } Question answering (QA) has become one of the fastest growing topics in computational linguistics and information access. To advance research in the area of dialogue-based question answering, we propose a combination of methods from different scientific fields (i.e., information retrieval, dialogue systems, semantic web and machine learning). This book sheds light on adaptable dialogue-based question answering. We demonstrate the technical and computational feasibility of the proposed ideas, the introspective methods in particular, by beginning with an extensive introduction to the dialogical problem domain which motivates the technical implementation. The ideas have been carried out in a mature natural language processing (NLP) system, the SmartWeb dialogue system, which was developed between 2004 and 2007 by partners from academia and industry. We have attempted to make this book a self-contained text and provide an extra section on the interdisciplinary scientific background. The target audience for this book comprises researchers and students interested in the application potential of semantic technologies for difficult AI tasks such as working dialogue and QA systems. |
Book Chapters |
Sonntag, Daniel; Wennerberg, Pinar; Buitelaar, Paul; Zillner, Sonja Pillars of Ontology Treatment in the Medical Domain Book Chapter Cases on Semantic Interoperability for Information Systems Integration: Practices and Applications, pp. 162-186, Information Science Reference, 2010. @inbook{4256, title = {Pillars of Ontology Treatment in the Medical Domain}, author = {Daniel Sonntag and Pinar Wennerberg and Paul Buitelaar and Sonja Zillner}, url = {https://www.dfki.de/fileadmin/user_upload/import/4256_2010_Pillars_of_Ontology_Treatment_in_the_Medical_Domain_.pdf http://www.igi-global.com/bookstore/Chapter.aspx?TitleId=38043}, year = {2010}, date = {2010-01-01}, booktitle = {Cases on Semantic Interoperability for Information Systems Integration: Practices and Applications}, pages = {162-186}, publisher = {Information Science Reference}, abstract = {In this chapter the authors describe the three pillars of ontology treatment in the medical domain in a comprehensive case study within the large-scale THESEUS MEDICO project. MEDICO addresses the need for advanced semantic technologies in medical image and patient data search. The objective is to enable a seamless integration of medical images and different user applications by providing direct access to image semantics. Semantic image retrieval should provide the basis for help in clinical decision support and computer aided diagnosis. During the course of lymphoma diagnosis and continual treatment, image data is produced several times using different image modalities. After semantic annotation, the images need to be integrated with medical (textual) data repositories and ontologies. They build upon the three pillars of knowledge engineering, ontology mediation and alignment, and ontology population and learning to achieve the objectives of the MEDICO project.}, keywords = {}, pubstate = {published}, tppubtype = {inbook} } In this chapter the authors describe the three pillars of ontology treatment in the medical domain in a comprehensive case study within the large-scale THESEUS MEDICO project. MEDICO addresses the need for advanced semantic technologies in medical image and patient data search. The objective is to enable a seamless integration of medical images and different user applications by providing direct access to image semantics. Semantic image retrieval should provide the basis for help in clinical decision support and computer aided diagnosis. During the course of lymphoma diagnosis and continual treatment, image data is produced several times using different image modalities. After semantic annotation, the images need to be integrated with medical (textual) data repositories and ontologies. They build upon the three pillars of knowledge engineering, ontology mediation and alignment, and ontology population and learning to achieve the objectives of the MEDICO project. |
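A first lexical pass of the kind of ontology alignment mentioned here, normalized label matching between FMA and RadLex term lists, can be sketched as follows; the labels and identifiers shown are illustrative examples, not a verified mapping.

fma    = {"Lymph node": "FMA:5034", "Liver": "FMA:7197"}   # example labels/IDs
radlex = {"lymph node": "RID13296", "liver": "RID58"}      # example labels/IDs

def norm(label):
    # lowercase and collapse whitespace before comparing labels
    return " ".join(label.lower().split())

alignment = [(fid, rid)
             for flabel, fid in fma.items()
             for rlabel, rid in radlex.items()
             if norm(flabel) == norm(rlabel)]
print(alignment)   # candidate FMA <-> RadLex correspondences for expert review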
Inproceedings |
Sonntag, Daniel; Möller, Manuel Prototyping Semantic Dialogue Systems for Radiologists Inproceedings Proceedings of the 6th International Conference on Intelligent Environments, AAAI, 2010. @inproceedings{4922, title = {Prototyping Semantic Dialogue Systems for Radiologists}, author = {Daniel Sonntag and Manuel Möller}, url = {https://www.dfki.de/fileadmin/user_upload/import/4922_2010_Prototyping_Semantic_Dialogue_Systems_for_Radiologists.pdf}, year = {2010}, date = {2010-07-01}, booktitle = {Proceedings of the 6th International Conference on Intelligent Environments}, publisher = {AAAI}, abstract = {In the future, speech-based semantic image retrieval and annotation of medical images should provide the basis for help in clinical decision support and computer aided diagnosis. We will present a semantic dialogue system installation for radiologists and describe today's clinical workflow and interaction requirements. The focus is on the interaction design and implementation of our prototype system for patient image search and image annotation while using a speech-based dialogue shell and a big touchscreen in the radiology environment. Ontology modeling provides the backbone for knowledge representation in the dialogue shell and the specific medical application domain.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } In the future, speech-based semantic image retrieval and annotation of medical images should provide the basis for help in clinical decision support and computer aided diagnosis. We will present a semantic dialogue system installation for radiologists and describe today's clinical workflow and interaction requirements. The focus is on the interaction design and implementation of our prototype system for patient image search and image annotation while using a speech-based dialogue shell and a big touchscreen in the radiology environment. Ontology modeling provides the backbone for knowledge representation in the dialogue shell and the specific medical application domain. |
Sonntag, Daniel; Möller, Manuel A Multimodal Dialogue Mashup for Medical Image Semantics Inproceedings Proceedings of the International Conference on Intelligent User Interfaces, o.A., 2010. @inproceedings{4691, title = {A Multimodal Dialogue Mashup for Medical Image Semantics}, author = {Daniel Sonntag and Manuel Möller}, url = {https://www.dfki.de/fileadmin/user_upload/import/4691_2010_A_MULTIMODAL_DIALOGUE_MASHUP_FOR_MEDICAL_IMAGE_SEMANTICS.pdf http://portal.acm.org/citation.cfm?doid=1719970.1720036}, year = {2010}, date = {2010-02-01}, booktitle = {Proceedings of the International Conference on Intelligent User Interfaces}, publisher = {o.A.}, abstract = {This paper presents a multimodal dialogue mashup where different users are involved in the use of different user interfaces for the annotation and retrieval of medical images. Our solution is a mashup that integrates a multimodal interface for speech-based annotation of medical images and dialogue-based image retrieval with a semantic image annotation tool for manual annotations on a desktop computer. A remote RDF repository connects the annotation and querying task into a common framework and serves as the semantic backend system for the advanced multimodal dialogue a radiologist can use.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } This paper presents a multimodal dialogue mashup where different users are involved in the use of different user interfaces for the annotation and retrieval of medical images. Our solution is a mashup that integrates a multimodal interface for speech-based annotation of medical images and dialogue-based image retrieval with a semantic image annotation tool for manual annotations on a desktop computer. A remote RDF repository connects the annotation and querying task into a common framework and serves as the semantic backend system for the advanced multimodal dialogue a radiologist can use. |
Sonntag, Daniel; Kiesel, Malte Linked Data Integration for Semantic Dialogue and Backend Access Inproceedings Proceedings of the AAAI Spring Symposium "Linked Data meets Artificial Intelligence", o.A., 2010. @inproceedings{4742, title = {Linked Data Integration for Semantic Dialogue and Backend Access}, author = {Daniel Sonntag and Malte Kiesel}, url = {https://www.dfki.de/fileadmin/user_upload/import/4742_2010_Linked_Data_Integration_for_Semantic_Dialogue_and_Backend_Access.pdf http://www.foaf-project.org/events/linkedai}, year = {2010}, date = {2010-01-01}, booktitle = {Proceedings of the AAAI Spring Symposium "Linked Data meets Artificial Intelligence"}, publisher = {o.A.}, abstract = {A dialogue system for answering user questions in natural speech presents one of the main achievements of contemporary interaction-based AI technology. Modern dialogue frameworks function as middleware between the interface component and the backend where the answers to the user questions are stored in heterogeneous formats. We implemented an interface to linked data sources as part of a complex natural language understanding and semantic retrieval process, thereby integrating the querying and answering task into a common framework. The semantic backend system integrates multiple linked data sources to allow for an advanced multimodal question answering (QA) dialogue.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } A dialogue system for answering user questions in natural speech presents one of the main achievements of contemporary interaction-based AI technology. Modern dialogue frameworks function as middleware between the interface component and the backend where the answers to the user questions are stored in heterogeneous formats. We implemented an interface to linked data sources as part of a complex natural language understanding and semantic retrieval process, thereby integrating the querying and answering task into a common framework. The semantic backend system integrates multiple linked data sources to allow for an advanced multimodal question answering (QA) dialogue. |
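As a minimal illustration of querying one linked data source over SPARQL, the sketch below uses DBpedia's public endpoint as a stand-in for the medical sources integrated in the paper; the query and resource are examples only.

import json
import urllib.parse
import urllib.request

QUERY = """
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Lymphoma> rdfs:label ?label .
  FILTER (lang(?label) = "en")
}
"""

# ask the endpoint for JSON-formatted SPARQL results
url = "https://dbpedia.org/sparql?" + urllib.parse.urlencode(
    {"query": QUERY, "format": "application/sparql-results+json"})
with urllib.request.urlopen(url) as resp:
    rows = json.load(resp)["results"]["bindings"]
print([row["label"]["value"] for row in rows])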
Sonntag, Daniel; Porta, Daniel; Setz, Jochen HTTP/REST-based Meta Web Services in Mobile Application Frameworks Inproceedings Proceedings of the 4th International Conference on Mobile Ubiquitous Computing, Systems, Services and Technologies, XPS, 2010. @inproceedings{4864, title = {HTTP/REST-based Meta Web Services in Mobile Application Frameworks}, author = {Daniel Sonntag and Daniel Porta and Jochen Setz}, year = {2010}, date = {2010-01-01}, booktitle = {Proceedings of the 4th International Conference on Mobile Ubiquitous Computing, Systems, Services and Technologies}, publisher = {XPS}, abstract = {This paper describes how a multimodal dialogue application framework can be used to implement specific mobile applications and dynamic HTTP-based REST services. REST services are already publicly available and provide useful location-based information for the user on the go. We use a distributed, ontology-based dialogue system architecture where every major component can be run on a different host, thereby increasing the scalability of the overall system with a mobile user interface. The dialogue system provides customised access to the Google Maps Local Search and two REST services provided by GeoNames (i.e., the findNearbyWikipedia search and the findNearbyWeather search).}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } This paper describes how a multimodal dialogue application framework can be used to implement specific mobile applications and dynamic HTTP-based REST services. REST services are already publicly available and provide useful location-based information for the user on the go. We use a distributed, ontology-based dialogue system architecture where every major component can be run on a different host, thereby increasing the scalability of the overall system with a mobile user interface. The dialogue system provides customised access to the Google Maps Local Search and two REST services provided by GeoNames (i.e., the findNearbyWikipedia search and the findNearbyWeather search). |
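The two GeoNames services named in the abstract are plain HTTP/REST endpoints, so a minimal client is only a few lines; the coordinates are arbitrary, and the "demo" account is heavily rate-limited and should be replaced with a registered username.

import json
import urllib.parse
import urllib.request

def geonames(service, **params):
    # GeoNames JSON services share one URL pattern: /<service>?<query params>
    params["username"] = "demo"   # replace with your own registered account
    url = "http://api.geonames.org/%s?%s" % (service, urllib.parse.urlencode(params))
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

wiki = geonames("findNearbyWikipediaJSON", lat=49.25, lng=7.04)
weather = geonames("findNearByWeatherJSON", lat=49.25, lng=7.04)
print((wiki.get("geonames") or [{}])[0].get("title"))
print(weather.get("weatherObservation", {}).get("temperature"))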
Sonntag, Daniel; Wennerberg, Pinar; Zillner, Sonja Applications of an Ontology Engineering Methodology Accessing Linked Data for Medical Image Retrieval Inproceedings Proceedings of the AAAI Spring Symposium "Linked Data meets Artificial Intelligence", Stanford University, 2010. @inproceedings{5010, title = {Applications of an Ontology Engineering Methodology Accessing Linked Data for Medical Image Retrieval}, author = {Daniel Sonntag and Pinar Wennerberg and Sonja Zillner}, url = {https://www.dfki.de/fileadmin/user_upload/import/5010_2010_Applications_of_an_Ontology_Engineering_Methodology_Accessing_Linked_Data_for_Medical_Image_Retrieval_.pdf http://www.aaai.org/ocs/index.php/SSS/SSS10/paper/view/1117}, year = {2010}, date = {2010-01-01}, booktitle = {Proceedings of the AAAI Spring Symposium "Linked Data meets Artificial Intelligence"}, publisher = {Stanford University}, abstract = {This paper examines first ideas on the applicability of Linked Data, in particular a subset of the Linked Open Drug Data (LODD), to connect radiology, human anatomy, and drug information for improved medical image annotation and subsequent search. One outcome of our ontology engineering methodology is the alignment between radiology-related OWL ontologies (FMA and RadLex). These can be used to provide new connections in the medicine-related linked data cloud. A use case scenario is provided that demonstrates the benefits of the approach by enabling the radiologist to query and explore related data, e.g., medical images and drugs. The diagnosis is on a special type of cancer (lymphoma).}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } This paper examines first ideas on the applicability of Linked Data, in particular a subset of the Linked Open Drug Data (LODD), to connect radiology, human anatomy, and drug information for improved medical image annotation and subsequent search. One outcome of our ontology engineering methodology is the alignment between radiology-related OWL ontologies (FMA and RadLex). These can be used to provide new connections in the medicine-related linked data cloud. A use case scenario is provided that demonstrates the benefits of the approach by enabling the radiologist to query and explore related data, e.g., medical images and drugs. The diagnosis is on a special type of cancer (lymphoma). |
Sonntag, Daniel; Sacaleanu, Bogdan Speech Grammars for Textual Entailment Patterns in Multimodal Question Answering Inproceedings Proceedings of the Seventh Conference on International Language Resources and Evaluation, European Language Resources Association (ELRA), 2010. @inproceedings{5013, title = {Speech Grammars for Textual Entailment Patterns in Multimodal Question Answering}, author = {Daniel Sonntag and Bogdan Sacaleanu}, url = {https://www.dfki.de/fileadmin/user_upload/import/5013_2010_Speech_Grammars_for_Textual_Entailment_Patterns_in_Multimodal_Question_Answering_.pdf http://www.lrec-conf.org/proceedings/lrec2010/summaries/911.html}, year = {2010}, date = {2010-01-01}, booktitle = {Proceedings of the Seventh Conference on International Language Resources and Evaluation}, publisher = {European Language Resources Association (ELRA)}, abstract = {Over the last several years, speech-based question answering (QA) has become very popular in contrast to pure search engine based approaches on a desktop. Open-domain QA systems are now much more powerful and precise, and they can be used in speech applications. Speech-based question answering systems often rely on predefined grammars for speech understanding. In order to improve the coverage of such complex AI systems, we reused speech patterns used to generate textual entailment patterns. These can make multimodal question understanding more robust. We exemplify this in the context of a domain-specific dialogue scenario. As a result, written text input components (e.g., in a textual input field) can deal with more flexible input according to the derived textual entailment patterns. A multimodal QA dialogue spanning over several domains of interest, i.e., personal address book entries, questions about the music domain and politicians and other celebrities, demonstrates how the textual input mode can be used in a multimodal dialogue shell.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } Over the last several years, speech-based question answering (QA) has become very popular in contrast to pure search engine based approaches on a desktop. Open-domain QA systems are now much more powerful and precise, and they can be used in speech applications. Speech-based question answering systems often rely on predefined grammars for speech understanding. In order to improve the coverage of such complex AI systems, we reused speech patterns used to generate textual entailment patterns. These can make multimodal question understanding more robust. We exemplify this in the context of a domain-specific dialogue scenario. As a result, written text input components (e.g., in a textual input field) can deal with more flexible input according to the derived textual entailment patterns. A multimodal QA dialogue spanning over several domains of interest, i.e., personal address book entries, questions about the music domain and politicians and other celebrities, demonstrates how the textual input mode can be used in a multimodal dialogue shell. |
Sonntag, Daniel; Reithinger, Norbert; Herzog, Gerd; Becker, Tilman A Discourse and Dialogue Infrastructure for Industrial Dissemination Inproceedings Lee, Gary Geunbae; Mariani, Joseph; Minker, Wolfgang; Nakamura, Satoshi (Ed.): Spoken Dialogue Systems for Ambient Environments - IWSDS 2010: Proceedings of the Second International Workshop on Spoken Dialogue Systems, pp. 132-143, Springer, 2010. @inproceedings{5014, title = {A Discourse and Dialogue Infrastructure for Industrial Dissemination}, author = {Daniel Sonntag and Norbert Reithinger and Gerd Herzog and Tilman Becker}, editor = {Gary Geunbae Lee and Joseph Mariani and Wolfgang Minker and Satoshi Nakamura}, url = {https://www.dfki.de/fileadmin/user_upload/import/5014_2010_A_Discourse_and_Dialogue_Infrastructure_for_Industrial_Dissemination_.pdf http://www.springerlink.com/content/5149m52mt5378316/}, year = {2010}, date = {2010-01-01}, booktitle = {Spoken Dialogue Systems for Ambient Environments - IWSDS 2010: Proceedings of the Second International Workshop on Spoken Dialogue Systems}, volume = {6392}, pages = {132-143}, publisher = {Springer}, abstract = {We think that modern speech dialogue systems need a prior usability analysis to identify the requirements for industrial applications. In addition, work from the area of the Semantic Web should be integrated. These requirements can then be met by multimodal semantic processing, semantic navigation, interactive semantic mediation, user adaptation/personalisation, interactive service composition, and semantic output representation which we will explain in this paper. We will also describe the discourse and dialogue infrastructure these components develop and provide two examples of disseminated industrial prototypes.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } We think that modern speech dialogue systems need a prior usability analysis to identify the requirements for industrial applications. In addition, work from the area of the Semantic Web should be integrated. These requirements can then be met by multimodal semantic processing, semantic navigation, interactive semantic mediation, user adaptation/personalisation, interactive service composition, and semantic output representation which we will explain in this paper. We will also describe the discourse and dialogue infrastructure these components develop and provide two examples of disseminated industrial prototypes. |
Technical Reports |
Sonntag, Daniel; Weihrauch, Colette; Jacobs, Oliver; Porta, Daniel THESEUS CTC-WP4 Usability Guidelines for Use Case Applications Technical Report Bundesministerium für Wirtschaft und Technologie, 2010. @techreport{4788, title = {THESEUS CTC-WP4 Usability Guidelines for Use Case Applications}, author = {Daniel Sonntag and Colette Weihrauch and Oliver Jacobs and Daniel Porta}, url = {https://www.dfki.de/fileadmin/user_upload/import/4788_100430_CTC-WP4-1_Theseus_Usability-Guidelines_Final.pdf}, year = {2010}, date = {2010-04-01}, volume = {V 1.5}, institution = {Bundesministerium für Wirtschaft und Technologie}, abstract = {Usability Guidelines for Use Case Applications serves as an introduction to the general topic of usability, i.e., how user-friendly and efficient a THESEUS prototype is. In these guidelines, we emphasize the importance of usability testing, particularly during the development of a given THESEUS prototype. We discuss the many advantages of testing prototypes and products in terms of costs, product quality, and customer satisfaction. Usability testing can improve development productivity through more efficient design and fewer code revisions. It can help to eliminate over-design by emphasizing the functionality required to meet the needs of real users. Design problems can be detected earlier in the development process, saving both time and money. In these Guidelines we provide a brief overview of testing options, ranging from a cognitive walkthrough to interviews to eye tracking. Different techniques are used at different stages of a product's development. While many techniques can be applied, no single technique alone can ensure the usability of prototypes. Usability is a process with iterative steps, meaning the cycle is repeated but in a cumulative fashion, similar to software development. In order to test, a prototype must be available, and we devote some time in the Guidelines to an overview of different tools and ways to build the necessary prototypes. We also describe some options such as paper prototyping, prototypes from Visio, PowerPoint, HTML, Flash and others, and working prototypes (Java, C++, etc.) before addressing the actual tests. Before any testing is conducted, the purpose of the test should be clarified. This will have considerable impact on the kind of testing to be done. A test plan should also be written before the start of the test; it should consider several different aspects including, for instance, the duration of the test, where it will take place, or who the experimenter will be. A pilot test is also recommended to avoid misunderstandings and other problems during the actual test. In this context, the Guidelines also discuss other important aspects such as budget, room set-up, time, and limitations of the experimenter and test subjects themselves. To provide an overview of some of the projects THESEUS is concerned with in the context of usability, we supply explicit recommendations that result in proposed scenarios for use cases in the Guidelines. The THESEUS program consists of six use cases: ALEXANDRIA, CONTENTUS, MEDICO, ORDO, PROCESSUS, and TEXO. In order to come up with the different testing scenarios, each of which has specific design and testing recommendations, we first extracted some substantial information from the different use cases in different user settings: we distinguished between those who will use the system, where they will use the system, and what they will do with the system. 
After considering the results, we determined that the THESEUS program works with seven different scenarios. We provide a decision tree that leads to specific recommendations for designing and testing with prototypes for each of the different scenarios and user settings. General recommendations concerning various input methods, the design, and the testing itself have also been included in the Guidelines. Following that, we emphasize what we find important for the design and testing of each of the seven testing scenarios. We address, for instance, the appropriate input method (keyboard, mouse, speech, etc.), according to the type of test subject (e.g., administrator or mobile user), as well as which prototype could be used for the usability test. We will also challenge the usability of traditional usability guidelines. Oftentimes, guideline descriptions and explanations are unsatisfactory, remaining vague and ambiguous in explanation. The Guidelines close with an extensive list of recommended further information sources.}, keywords = {}, pubstate = {published}, tppubtype = {techreport} } Usability Guidelines for Use Case Applications serves as an introduction to the general topic of usability, i.e., how user-friendly and efficient a THESEUS prototype is. In these guidelines, we emphasize the importance of usability testing, particularly during the development of a given THESEUS prototype. We discuss the many advantages of testing prototypes and products in terms of costs, product quality, and customer satisfaction. Usability testing can improve development productivity through more efficient design and fewer code revisions. It can help to eliminate over-design by emphasizing the functionality required to meet the needs of real users. Design problems can be detected earlier in the development process, saving both time and money. In these Guidelines we provide a brief overview of testing options, ranging from a cognitive walkthrough to interviews to eye tracking. Different techniques are used at different stages of a product's development. While many techniques can be applied, no single technique alone can ensure the usability of prototypes. Usability is a process with iterative steps, meaning the cycle is repeated but in a cumulative fashion, similar to software development. In order to test, a prototype must be available, and we devote some time in the Guidelines to an overview of different tools and ways to build the necessary prototypes. We also describe some options such as paper prototyping, prototypes from Visio, PowerPoint, HTML, Flash and others, and working prototypes (Java, C++, etc.) before addressing the actual tests. Before any testing is conducted, the purpose of the test should be clarified. This will have considerable impact on the kind of testing to be done. A test plan should also be written before the start of the test; it should consider several different aspects including, for instance, the duration of the test, where it will take place, or who the experimenter will be. A pilot test is also recommended to avoid misunderstandings and other problems during the actual test. In this context, the Guidelines also discuss other important aspects such as budget, room set-up, time, and limitations of the experimenter and test subjects themselves. To provide an overview of some of the projects THESEUS is concerned with in the context of usability, we supply explicit recommendations that result in proposed scenarios for use cases in the Guidelines. 
The THESEUS program consists of six use cases: ALEXANDRIA, CONTENTUS, MEDICO, ORDO, PROCESSUS, and TEXO. In order to come up with the different testing scenarios, each of which has specific design and testing recommendations, we first extracted some substantial information from the different use cases in different user settings: we distinguished between those who will use the system, where they will use the system, and what they will do with the system. After considering the results, we determined that the THESEUS program works with seven different scenarios. We provide a decision tree that leads to specific recommendations for designing and testing with prototypes for each of the different scenarios and user settings. General recommendations concerning various input methods, the design, and the testing itself have also been included in the Guidelines. Following that, we emphasize what we find important for the design and testing of each of the seven testing scenarios. We address, for instance, the appropriate input method (keyboard, mouse, speech, etc.), according to the type of test subject (e.g., administrator or mobile user), as well as which prototype could be used for the usability test. We will also challenge the usability of traditional usability guidelines. Oftentimes, guideline descriptions and explanations are unsatisfactory, remaining vague and ambiguous in explanation. The Guidelines close with an extensive list of recommended further information sources. |
Sonntag, Daniel A Methodology for Emergent Software Technical Report German Research Center for AI (DFKI), Stuhlsatzenhausweg 3, 66123 Saarbrücken, 2010. @techreport{4993, title = {A Methodology for Emergent Software}, author = {Daniel Sonntag}, url = {https://www.dfki.de/fileadmin/user_upload/import/4993_emergent.pdf}, year = {2010}, date = {2010-01-01}, volume = {2}, pages = {10}, address = {Stuhlsatzenhausweg 3, 66123 Saarbrücken}, institution = {German Research Center for AI (DFKI)}, abstract = {We present a methodology for software components that suggests adaptations to specific conditions and situations in which the components are used. The emergent software should be able to function better in a specific situation. For this purpose, we survey its background in metacognition and introspection, develop an augmented data mining cycle, and invent an introspective mechanism, a methodology for emergent software. This report is based on emergent software implementations in adaptive information systems (Sonntag, 2010).}, keywords = {}, pubstate = {published}, tppubtype = {techreport} } We present a methodology for software components that suggests adaptations to specific conditions and situations in which the components are used. The emergent software should be able to function better in a specific situation. For this purpose, we survey its background in metacognition and introspection, develop an augmented data mining cycle, and invent an introspective mechanism, a methodology for emergent software. This report is based on emergent software implementations in adaptive information systems (Sonntag, 2010). |
2009 |
Journal Articles |
Sonntag, Daniel; Zapatrin, Roman R Macrodynamics of users' behavior in Information Retrieval Journal Article Computing Research Repository eprint Journal, ArXiv online, pp. 1-15, 2009. @article{5005, title = {Macrodynamics of users' behavior in Information Retrieval}, author = {Daniel Sonntag and Roman R Zapatrin}, url = {http://arxiv.org/abs/0905.2501}, year = {2009}, date = {2009-05-01}, journal = {Computing Research Repository eprint Journal}, volume = {ArXiv online}, pages = {1-15}, publisher = {ArXiv.org}, abstract = {We present a method to geometrize massive data sets from search engine query logs. For this purpose, a macrodynamic-like quantitative model of the Information Retrieval (IR) process is developed, whose paradigm is inspired by basic constructions of Einstein's general relativity theory in which all IR objects are uniformly placed in a common Room. The Room has a structure similar to Einsteinian spacetime, namely that of a smooth manifold. Documents and queries are treated as matter objects and sources of material fields. Relevance, the central notion of IR, becomes a dynamical issue controlled by both gravitation (or, more precisely, motion in a curved spacetime) and forces originating from the interactions of matter fields. The spatio-temporal description ascribes dynamics to any document or query, thus providing a uniform description for documents of both initially static and dynamical nature. Within the IR context, the techniques presented are based on two ideas. The first is the placement of all objects participating in IR into a common continuous space. The second idea is the `objectivization' of the IR process; instead of expressing users' wishes, we consider the overall IR as an objective physical process, representing the IR process in terms of motion in a given external-fields configuration. Various semantic environments are treated as various IR universes.}, keywords = {}, pubstate = {published}, tppubtype = {article} } We present a method to geometrize massive data sets from search engine query logs. For this purpose, a macrodynamic-like quantitative model of the Information Retrieval (IR) process is developed, whose paradigm is inspired by basic constructions of Einstein's general relativity theory in which all IR objects are uniformly placed in a common Room. The Room has a structure similar to Einsteinian spacetime, namely that of a smooth manifold. Documents and queries are treated as matter objects and sources of material fields. Relevance, the central notion of IR, becomes a dynamical issue controlled by both gravitation (or, more precisely, motion in a curved spacetime) and forces originating from the interactions of matter fields. The spatio-temporal description ascribes dynamics to any document or query, thus providing a uniform description for documents of both initially static and dynamical nature. Within the IR context, the techniques presented are based on two ideas. The first is the placement of all objects participating in IR into a common continuous space. The second idea is the `objectivization' of the IR process; instead of expressing users' wishes, we consider the overall IR as an objective physical process, representing the IR process in terms of motion in a given external-fields configuration. Various semantic environments are treated as various IR universes. |
Sonntag, Daniel; Wennerberg, Pinar; Buitelaar, Paul; Zillner, Sonja Pillars of Ontology Treatment in the Medical Domain Journal Article Journal of Cases on Information Technology, 11, pp. 47-73, 2009. @article{5007, title = {Pillars of Ontology Treatment in the Medical Domain}, author = {Daniel Sonntag and Pinar Wennerberg and Paul Buitelaar and Sonja Zillner}, editor = {Mehdi Khosrow-Pour}, url = {https://www.dfki.de/fileadmin/user_upload/import/5007_2009_Pillars_of_Ontology_Treatment_in_the_Medical_Domain.pdf http://www.igi-global.com/journals/details.asp?ID=202&mode=toc&volume=Journal+of+Cases+on+Information+Technology%2C+Vol.+11%2C+Issue+4}, year = {2009}, date = {2009-01-01}, journal = {Journal of Cases on Information Technology}, volume = {11}, pages = {47-73}, publisher = {IGI Global}, abstract = {In this chapter, the authors describe the three pillars of ontology treatment in the medical domain in a comprehensive case study within the large-scale THESEUS MEDICO project. MEDICO addresses the need for advanced semantic technologies in medical image and patient data search. The objective is to enable a seamless integration of medical images and different user applications by providing direct access to image semantics. Semantic image retrieval should provide the basis for help in clinical decision support and computer-aided diagnosis. During the course of lymphoma diagnosis and continual treatment, image data is produced several times using different image modalities. After semantic annotation, the images need to be integrated with medical (textual) data repositories and ontologies. The authors build upon the three pillars of knowledge engineering, ontology mediation and alignment, and ontology population and learning to achieve the objectives of the MEDICO project.}, keywords = {}, pubstate = {published}, tppubtype = {article} } In this chapter, the authors describe the three pillars of ontology treatment in the medical domain in a comprehensive case study within the large-scale THESEUS MEDICO project. MEDICO addresses the need for advanced semantic technologies in medical image and patient data search. The objective is to enable a seamless integration of medical images and different user applications by providing direct access to image semantics. Semantic image retrieval should provide the basis for help in clinical decision support and computer-aided diagnosis. During the course of lymphoma diagnosis and continual treatment, image data is produced several times using different image modalities. After semantic annotation, the images need to be integrated with medical (textual) data repositories and ontologies. The authors build upon the three pillars of knowledge engineering, ontology mediation and alignment, and ontology population and learning to achieve the objectives of the MEDICO project. |