2015
Inproceedings
Kadir, Md Abdul; Chowdhury, Md Belayet; Rashid, Jaber AL; Shakil, Shifur Rahman; Rhaman, Md Khalilur: An autonomous industrial robot for loading and unloading goods. Inproceedings: 2015 International Conference on Informatics, Electronics & Vision (ICIEV), pp. 1-6, IEEE, 2015. DOI: 10.1109/ICIEV.2015.7333984
Abstract: In industries, manual loading and unloading of heavy loads is an important task that is difficult, time-consuming, and risky for humans. This paper illustrates the mechanical design of an industrial automated robot built around an Ackermann steering mechanism and a differential mechanism. Ackermann steering lets the two front wheels turn left and right without leaving the track. The differential is mounted on the two rear wheels, and a DC motor with its controller drives the robot. The autonomous robot starts from a position where goods are loaded onto it, follows a white line drawn on a black surface, and unloads the goods by itself after reaching the destination. A digital line-following sensor mounted at the front of the robot detects the path by emitting and receiving signals, allowing the robot to follow the pre-defined track through left and right turns while carrying goods from the starting position to the destination. The main objective, loading and unloading heavy goods, is achieved by two large linear actuators that produce the torque and force required to safely unload loads of up to 150 kg sideways to the ground. In addition, the robot is able to avoid collisions with obstacles in its way. The main focus of this paper is an industrial robot with moderate speed and good efficiency that loads and unloads goods within a short time and eases the burden on human workers.
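The paper itself gives no code; as a rough illustration of the control loop the abstract describes (a digital line sensor steering a differential drive along the white line, with a stop on detected obstacles), here is a minimal Python sketch. The sensor and motor functions are hypothetical stand-ins for the robot's hardware interface, and the gains and thresholds are made-up values.

```python
# Illustrative sketch (not from the paper): proportional line-following control for a
# differential-drive robot with a digital line sensor and an obstacle distance sensor.
# The three *_stub functions stand in for real hardware I/O and just return demo values.

BASE_SPEED = 0.4            # fraction of full motor speed (assumed)
TURN_GAIN = 0.6             # proportional steering gain (assumed)
OBSTACLE_THRESHOLD_CM = 30  # stop if an obstacle is closer than this (assumed)

def read_line_sensor_stub():
    # 8 digital cells; 1 = white line detected under that cell
    return [0, 0, 0, 1, 1, 0, 0, 0]

def read_distance_cm_stub():
    return 120.0  # no obstacle in this demo

def set_motor_speeds_stub(left, right):
    print(f"left={left:+.2f}  right={right:+.2f}")

def line_error(sensor_bits):
    """Signed offset of the detected line from the sensor centre (-3.5 .. +3.5)."""
    positions = [i - 3.5 for i, bit in enumerate(sensor_bits) if bit]
    return sum(positions) / len(positions) if positions else 0.0

def control_step():
    if read_distance_cm_stub() < OBSTACLE_THRESHOLD_CM:
        set_motor_speeds_stub(0.0, 0.0)              # collision avoidance: stop
        return
    error = line_error(read_line_sensor_stub())      # > 0 means the line is to the right
    correction = TURN_GAIN * error / 3.5
    set_motor_speeds_stub(BASE_SPEED + correction, BASE_SPEED - correction)

for _ in range(3):   # a few demo control steps (a real robot would loop at ~50 Hz)
    control_step()
```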
Technical Reports
Barz, Michael; Bulling, Andreas; Daiber, Florian: Computational Modelling and Prediction of Gaze Estimation Error for Head-mounted Eye Trackers. Technical Report, DFKI, 2015. URL: https://www.dfki.de/fileadmin/user_upload/import/7619_gazequality.pdf
Abstract: Gaze estimation error is inherent in head-mounted eye trackers and seriously impacts performance, usability, and user experience of gaze-based interfaces. Particularly in mobile settings, this error varies constantly as users move in front of and look at different parts of a display. We envision a new class of gaze-based interfaces that are aware of the gaze estimation error and adapt to it in real time. As a first step towards this vision we introduce an error model that is able to predict the gaze estimation error. Our method covers the major building blocks of mobile gaze estimation, specifically the mapping of pupil positions to scene camera coordinates, marker-based display detection, and the mapping of gaze from the scene camera to on-screen coordinates. We develop our model through a series of principled measurements of a state-of-the-art head-mounted eye tracker.
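The report covers several mapping steps; purely as an illustration of the last one (scene camera to on-screen coordinates after marker-based display detection), the sketch below warps a gaze point through a homography estimated from four detected display corners. The corner coordinates are invented, and this is not the authors' error model, only the kind of mapping whose error such a model would predict.

```python
# Illustrative sketch (not the report's model): map a gaze point given in scene-camera
# pixels onto display coordinates via a homography estimated from four detected corners.
import numpy as np
import cv2

# Hypothetical detections: the display's corner positions in the scene camera image (pixels)...
display_corners_cam = np.array([[212, 140], [1045, 118], [1070, 702], [230, 730]], dtype=np.float32)
# ...and the corresponding corners in display coordinates (here an assumed 1920x1080 screen).
display_corners_screen = np.array([[0, 0], [1920, 0], [1920, 1080], [0, 1080]], dtype=np.float32)

# Estimate the scene-camera -> screen homography from the corner correspondences.
H, _ = cv2.findHomography(display_corners_cam, display_corners_screen)

def gaze_to_screen(gaze_cam_xy):
    """Project a gaze point from scene-camera pixels to on-screen pixel coordinates."""
    pt = np.array([[gaze_cam_xy]], dtype=np.float32)   # shape (1, 1, 2) as cv2 expects
    return cv2.perspectiveTransform(pt, H)[0, 0]

print(gaze_to_screen((640, 420)))   # the gaze point expressed in screen coordinates
```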
Sonntag, Daniel: ISMAR 2015 Tutorial on Intelligent User Interfaces. Technical Report, DFKI, 2015. URL: https://www.dfki.de/fileadmin/user_upload/import/8134_ISMAR-2015-IUI-TUTORIAL.pdf
Abstract: IUIs aim to incorporate intelligent automated capabilities in human-computer interaction, where the net impact is a human-computer interaction that improves performance or usability in critical ways. It also involves designing and implementing an artificial intelligence (AI) component that effectively leverages human skills and capabilities, so that human performance with an application excels. IUIs embody capabilities that have traditionally been associated more strongly with humans than with computers: how to perceive, interpret, learn, use language, reason, plan, and decide.
2014
Incollections
Sonntag, Daniel; Porta, Daniel: Intelligent Semantic Mediation, Knowledge Acquisition and User Interaction. Incollection: Grallert, Hans-Joachim; Weiss, Stefan; Friedrich, Hermann; Widenka, Thomas; Wahlster, Wolfgang (Eds.): Towards the Internet of Services: The THESEUS Research Program, pp. 179-189, Springer, 2014. URL: http://link.springer.com/chapter/10.1007/978-3-319-06755-1_14
Abstract: The design and implementation of combined mobile and touchscreen-based multimodal Web 3.0 interfaces should include new approaches to intelligent semantic mediation, knowledge acquisition and user interaction when dealing with a semantic-based digitalization of mostly unstructured textual or image-based source information. In this article, we propose a semantic-based model for those three tasks. The technical components rely on semantic web data structures in order to, first, transcend the traditional keyboard and mouse interaction metaphors, and second, provide the representation structures for more complex, collaborative interaction scenarios that may combine mobile with terminal-based interaction to accommodate the growing need to store, organize, and retrieve all these data. Interactive knowledge acquisition plays a major role in increasing the quality of automatic annotations as well as the usability of different intelligent user interfaces to control, correct, and add annotations to unstructured text and image sources. Examples are provided in the context of the Medico and Texo use cases.
Inproceedings
Orlosky, Jason; Toyama, Takumi; Sonntag, Daniel; Sárkány, András; Lőrincz, András: On-body multi-input indoor localization for dynamic emergency scenarios: fusion of magnetic tracking and optical character recognition with mixed-reality display. Inproceedings: Proceedings of the 2014 International Conference on Pervasive Computing and Communications Workshops, pp. 320-325, IEEE, 2014. URL: https://www.dfki.de/fileadmin/user_upload/import/7410_2014_On-body_multi-input_indoor_localization_for_dynamic_emergency_scenarios-_Fusion_of_magnetic_tracking_and_optical_character_recognition_with_mixed-reality_display.pdf
Abstract: Indoor navigation in emergency scenarios poses a challenge to evacuation and emergency support, especially for injured or physically encumbered individuals. Navigation systems must be lightweight, easy to use, and provide robust localization and accurate navigation instructions in adverse conditions. To address this challenge, we combine magnetic location tracking with an optical character recognition (OCR) and eye gaze based method to recognize door plates and position-related text to provide more robust localization. In contrast to typical wireless or sensor based tracking, our fused system can be used in low lighting, smoke, and areas without power or wireless connectivity. Eye gaze tracking is also used to improve time to localization and accuracy of the OCR algorithm. Once localized, navigation instructions are transmitted directly into the user's immediate field of view via a head mounted display (HMD). Additionally, setting up the system is simple and can be done with minimal calibration, requiring only a walk-through of the environment and numerical annotation of a 2D area map. We conduct an evaluation of the magnetic and OCR systems individually to evaluate feasibility for use in the fused framework.
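The fusion idea, magnetic dead-reckoning corrected by an absolute fix whenever OCR recognises a door plate at a known map position, can be caricatured in a few lines. The toy sketch below is my own simplification, not the paper's algorithm, and the door-plate map and sensor readings are invented.

```python
# Toy illustration (not the paper's algorithm): keep a 2D position estimate from magnetic
# tracking, and snap it to a known landmark whenever OCR recognises a door-plate label.
DOOR_PLATES = {"R 1.07": (4.0, 12.5), "R 1.12": (9.0, 12.5)}   # label -> (x, y) on the 2D map

def update_position(position, magnetic_delta, ocr_label=None):
    """One localisation step: integrate the magnetic displacement, then correct with OCR."""
    x, y = position
    dx, dy = magnetic_delta
    x, y = x + dx, y + dy                 # dead-reckoning from magnetic tracking
    if ocr_label in DOOR_PLATES:          # absolute fix from a recognised door plate
        x, y = DOOR_PLATES[ocr_label]
    return (x, y)

pos = (0.0, 0.0)
pos = update_position(pos, (0.4, 0.1))                       # magnetic step only
pos = update_position(pos, (0.4, 0.1), ocr_label="R 1.07")   # OCR fix overrides drift
print(pos)   # -> (4.0, 12.5)
```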
Palotai, Zsolt; Láng, Miklós; Sárkány, András; Tősér, Zoltán; Sonntag, Daniel; Toyama, Takumi; Lőrincz, András: LabelMovie: a Semi-supervised Machine Annotation Tool with Quality Assurance and Crowd-sourcing Options for Videos. Inproceedings: Proceedings of the 12th International Workshop on Content-Based Multimedia Indexing, IEEE, 2014. URL: https://www.dfki.de/fileadmin/user_upload/import/7411_2014_LabelMovie-_Semi-supervised_Machine_Annotation_Tool_with_Quality_Assurance_and_Crowd-sourcing_Options_for_Videos.pdf
Abstract: For multiple reasons, the automatic annotation of video recordings is challenging: first, the amount of database video instances to be annotated is huge; second, tedious manual labelling sessions are required; third, the multimodal annotation needs exact information of space, time, and context; fourth, the different labelling opportunities (e.g., for the case of affects) require special agreements between annotators; and so forth. Crowdsourcing with quality assurance by experts may come to the rescue here. We have developed a special tool where individual experts can annotate videos over the Internet, their work can be joined and filtered, the annotated material can be evaluated by machine learning methods, and automated annotation starts according to a predefined confidence level. Qualitative manual labelling instances by humans, the seeds, ensure that relatively small samples of manual annotations can effectively bootstrap the machine annotation procedure. The annotation tool features special visualization methods for crowd-sourced users not familiar with machine learning methods and, in turn, ignites the bootstrapping process.
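The bootstrapping loop sketched in the abstract (manual seed labels, a model trained on them, and automatic annotation once a predefined confidence level is reached) could look roughly like the following self-training sketch. The synthetic features, classifier choice (scikit-learn logistic regression) and the 0.9 threshold are illustrative assumptions, not LabelMovie's actual pipeline.

```python
# Illustrative sketch of confidence-thresholded self-training (not LabelMovie's implementation).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 8))              # stand-in for per-clip video features
labels = np.full(200, -1)                         # -1 = not yet annotated
labels[:20] = (features[:20, 0] > 0).astype(int)  # 20 manual "seed" annotations

CONFIDENCE = 0.9   # assumed predefined confidence level for automatic annotation
for _ in range(5):                                # a few bootstrapping rounds
    seed_mask = labels != -1
    if seed_mask.all():                           # everything annotated already
        break
    model = LogisticRegression().fit(features[seed_mask], labels[seed_mask])
    proba = model.predict_proba(features[~seed_mask])
    confident = proba.max(axis=1) >= CONFIDENCE   # only auto-label confident predictions
    idx = np.flatnonzero(~seed_mask)[confident]
    labels[idx] = proba[confident].argmax(axis=1)

print(f"{(labels != -1).sum()} of 200 clips annotated (seeds + automatic)")
```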
Orlosky, Jason; Toyama, Takumi; Sonntag, Daniel; Kiyokawa, Kiyoshi: Using Eye-gaze and Visualization To Augment Memory: A Framework for Improving Context Recognition and Recall. Inproceedings: Proceedings of the 16th International Conference on Human-Computer Interaction, pp. 282-291, LNCS Springer, 2014. URL: https://www.dfki.de/fileadmin/user_upload/import/7412_2014_Using_Eye-Gaze_and_Visualization_to_Augment_Memory_.pdf
Abstract: In our everyday lives, bits of important information are lost due to the fact that our brain fails to convert a large portion of short term memory into long term memory. In this paper, we propose a framework that uses an eye-tracking interface to store pieces of forgotten information and present them back to the user later with an integrated head mounted display (HMD). This process occurs in three main steps, including context recognition, data storage, and augmented reality (AR) display. We demonstrate the system's ability to recall information with the example of a lost book page by detecting when the user reads the book again and intelligently presenting the last read position back to the user. Two short user evaluations show that the system can recall book pages within 40 milliseconds, and that the position where a user left off can be calculated with approximately 0.5 centimeter accuracy.
Toyama, Takumi; Sonntag, Daniel; Matsuda, Takahiro; Dengel, Andreas; Iwamura, Masakazu; Kise, Koichi: A Mixed Reality Head-Mounted Text Translation System Using Eye Gaze Input. Inproceedings: Proceedings of the 2014 International Conference on Intelligent User Interfaces, pp. 329-334, ACM, 2014.
Abstract: Efficient text recognition has recently been a challenge for augmented reality systems. In this paper, we propose a system with the ability to provide translations to the user in real time. We use eye gaze as a more intuitive and efficient input for ubiquitous text reading and translation in head mounted displays (HMDs). The eyes can be used to indicate regions of interest in text documents and to activate optical character recognition (OCR) and translation functions. Visual feedback and navigation help in the interaction process, and text snippets with their translations from Japanese to English are presented in a see-through HMD. We focus on travelers who go to Japan and need to read signs, and propose two different gaze gestures for activating the OCR text reading and translation function. We evaluate which type of gesture suits our OCR scenario best. We also show that our gaze-based OCR method on the extracted gaze regions provides faster access to information than traditional OCR approaches. Other benefits are that visual feedback on the extracted text region and the Japanese-to-English translation can be given in real time, and that the augmentations of the synchronized and calibrated HMD in this mixed reality application are presented at exact locations in the augmented user view, allowing for dynamic text translation management in head-up display systems.
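The core idea, running OCR only on a region around the gaze point rather than on the whole scene frame, can be sketched as follows. The crop size is an arbitrary assumption, pytesseract merely stands in for the OCR engine used in the paper, and the translation step is left as a stub.

```python
# Illustrative sketch (not the paper's system): crop a region around the current gaze point
# in the scene-camera frame and run OCR on it; translation is left as a stub.
from PIL import Image
import pytesseract   # stand-in OCR engine (needs the tesseract binary and 'jpn' language data)

CROP_HALF_SIZE = 120   # assumed half-width/height of the gaze region in pixels

def recognise_at_gaze(frame, gaze_xy):
    """Run OCR only on the region the user is looking at, instead of the full frame."""
    x, y = gaze_xy
    box = (max(x - CROP_HALF_SIZE, 0), max(y - CROP_HALF_SIZE, 0),
           x + CROP_HALF_SIZE, y + CROP_HALF_SIZE)
    region = frame.crop(box)
    return pytesseract.image_to_string(region, lang="jpn")   # Japanese source text

def translate_stub(japanese_text):
    # Placeholder: a real system would call a Japanese->English translation service here.
    return f"[EN translation of: {japanese_text.strip()}]"

frame = Image.new("RGB", (1280, 720), "white")    # stand-in for a scene-camera frame
print(translate_stub(recognise_at_gaze(frame, (640, 360))))
```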
Toyama, Takumi; Orlosky, Jason; Sonntag, Daniel; Kiyokawa, Kiyoshi: A natural interface for multi-focal plane head mounted displays using 3D gaze. Inproceedings: Proceedings of the 2014 International Working Conference on Advanced Visual Interfaces, pp. 25-32, ACM, 2014.
Abstract: In mobile augmented reality (AR), it is important to develop interfaces for wearable displays that not only reduce distraction, but that can be used quickly and in a natural manner. In this paper, we propose a focal-plane based interaction approach with several advantages over traditional methods designed for head mounted displays (HMDs) with only one focal plane. Using a novel prototype that combines a monoscopic multi-focal plane HMD and eye tracker, we facilitate interaction with virtual elements such as text or buttons by measuring eye convergence on objects at different depths. This can prevent virtual information from being unnecessarily overlaid onto real world objects that are at a different range, but in the same line of sight. We then use our prototype in a series of experiments testing the feasibility of interaction. Despite only being presented with monocular depth cues, users have the ability to correctly select virtual icons in near, mid, and far planes in 98.6% of cases.
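The selection mechanism relies on eye convergence; as a rough illustration (not the prototype's calibration or algorithm), the sketch below estimates gaze depth from a simple symmetric vergence model and snaps it to the nearest of three assumed focal-plane distances.

```python
# Illustrative sketch (not the prototype's method): estimate gaze depth from binocular
# vergence and map it to the nearest of three focal planes. All constants are assumptions.
import math

IPD_M = 0.063                                             # assumed inter-pupillary distance (m)
FOCAL_PLANES_M = {"near": 0.3, "mid": 1.0, "far": 3.0}    # assumed plane distances (m)

def gaze_depth(left_yaw_deg, right_yaw_deg):
    """Depth at which the two gaze rays converge (symmetric-vergence model).

    Yaw is in degrees, positive = eye rotated to the user's right, so converging eyes
    have left_yaw > right_yaw.
    """
    vergence = math.radians(left_yaw_deg - right_yaw_deg)   # convergence angle
    if vergence <= 0:
        return float("inf")                                  # parallel or diverging gaze
    return (IPD_M / 2) / math.tan(vergence / 2)

def selected_plane(left_yaw_deg, right_yaw_deg):
    depth = gaze_depth(left_yaw_deg, right_yaw_deg)
    return min(FOCAL_PLANES_M, key=lambda name: abs(FOCAL_PLANES_M[name] - depth))

print(selected_plane(3.0, -3.0))   # strongly converged eyes -> "near"
print(selected_plane(0.4, -0.4))   # nearly parallel eyes -> "far"
```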
2013
Incollections
Sonntag, Daniel; Zillner, Sonja; Schulz, Christian Husodo; Toyama, Takumi; Weber, Markus: Towards Medical Cyber-Physical Systems: Multimodal Augmented Reality for Doctors and Knowledge Discovery about Patients. Incollection: Marcus, Aaron (Ed.): Design, User Experience, and Usability. User Experience in Novel Technological Environments, pp. 401-410, Springer, 2013. URL: https://www.dfki.de/fileadmin/user_upload/import/7165_2013_Vision-Based_Location-Awareness_in_Augmented_Reality_Applications.pdf
Abstract: In the medical domain, which is becoming more and more digital, every improvement in efficiency and effectiveness really counts. Doctors must be able to retrieve data easily and provide their input in the most convenient way. With new technologies towards medical cyber-physical systems, such as networked head-mounted displays (HMDs) and eye trackers, new interaction opportunities arise. With our medical demo in the context of a cancer screening programme, we combine active speech based input, passive/active eye tracker user input, and HMD output (all devices are on-body and hands-free) in a convenient way for both the patient and the doctor.
Inproceedings
Toyama, Takumi; Sonntag, Daniel; Weber, Markus; Schulz, Christian Husodo: Gaze-based Online Face Learning and Recognition in Augmented Reality. Inproceedings: Proceedings of the IUI 2013 Workshop on Interactive Machine Learning, ACM, 2013. URL: https://www.dfki.de/fileadmin/user_upload/import/6780_iui2012-ml-workshop2.pdf
Abstract: We propose a new online face learning and recognition approach using user gaze and augmented displays. User gaze is used to select a face in focus in a scene image, whereupon visual feedback and information about the detected person is presented in a head mounted display. Our specific medical application leverages the doctor's capabilities of recalling the specific patient context.
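The gaze-based selection step, picking the face in focus among the faces detected in the scene image, reduces to a small geometric test. The sketch below is illustrative only; the face boxes and gaze point are invented, and face detection and recognition themselves are outside the sketch.

```python
# Illustrative sketch (not the paper's system): pick the detected face the user is gazing at.
def face_in_focus(face_boxes, gaze_xy):
    """Return the face box whose centre is closest to the gaze point, if the gaze lies inside it."""
    gx, gy = gaze_xy
    inside = [(x, y, w, h) for (x, y, w, h) in face_boxes
              if x <= gx <= x + w and y <= gy <= y + h]
    if not inside:
        return None                      # the user is not looking at any detected face
    return min(inside, key=lambda b: (b[0] + b[2] / 2 - gx) ** 2 + (b[1] + b[3] / 2 - gy) ** 2)

# Hypothetical detections in a scene-camera frame: (x, y, width, height) per face.
faces = [(100, 80, 60, 60), (400, 120, 80, 80)]
print(face_in_focus(faces, gaze_xy=(430, 150)))   # -> (400, 120, 80, 80)
```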
Sonntag, Daniel; Toyama, Takumi: On-body IE: A Head-Mounted Multimodal Augmented Reality System for Learning and Recalling Faces. Inproceedings: 9th International Conference on Intelligent Environments, IEEE, 2013.
Abstract: We present a new augmented reality (AR) system for knowledge-intensive location-based expert work. The multi-modal interaction system combines multiple on-body input and output devices: a speech-based dialogue system, a head-mounted augmented reality display (HMD), and a head-mounted eye-tracker. The interaction devices have been selected to augment and improve the expert work in a specific medical application context which shows its potential. In the sensitive domain of examining patients in a cancer screening program, we try to combine several active user input devices in the most convenient way for both the patient and the doctor. The resulting multimodal AR is an on-body intelligent environment (IE), has the potential to yield higher performance outcomes, and provides a direct data acquisition control mechanism. It leverages the doctor's capabilities of recalling the specific patient context by means of a virtual, context-based, patient-specific "external brain" for the doctor, which can remember patient faces and adapts the virtual augmentation according to the specific patient observation and finding context. In addition, patient data can be displayed on the HMD, triggered by voice or object/patient recognition. The learned (patient) faces and immovable objects (e.g., a big medical device) define the environmental clues that make the context-dependent recognition model part of the IE to achieve specific goals for the doctors in the hospital routine.
Sonntag, Daniel; Toyama, Takumi: Vision-Based Location-Awareness in Augmented Reality Applications. Inproceedings: 3rd Workshop on Location Awareness for Mixed and Dual Reality (LAMDa’13), ACM Press, 2013. URL: https://www.dfki.de/fileadmin/user_upload/import/7164_2013_Vision-Based_Location-Awareness_in_Augmented_Reality_Applications.pdf
Abstract: We present an integral HCI approach that incorporates eye-gaze for location-awareness in real-time. A new augmented reality (AR) system for knowledge-intensive location-based work combines multiple on-body input and output devices: a speech-based dialogue system, a head-mounted AR display (HMD), and a head-mounted eye-tracker. The interaction devices have been selected to augment and improve the navigation on a hospital’s premises (outdoors and indoors, figure 1) which shows its potential. We focus on the eye-tracker interaction which provides cues for location-awareness.
Weber, Markus; Schulz, Christian Husodo; Sonntag, Daniel; Toyama, Takumi: Digital Pens as Smart Objects in Multimodal Medical Application Frameworks. Inproceedings: Proceedings of the Second Workshop on Interacting with Smart Objects, in conjunction with IUI’13, ACM Press, 2013.
Abstract: In this paper, we present a novel mobile interaction system which combines a pen-based interface with a head-mounted display (HMD) for clinical radiology reports in the field of mammography. We consider a digital pen an anthropocentric smart object, one that allows for a physical, tangible and embodied interaction to enhance data input in a mobile on-body HMD environment. Our system provides an intuitive way for a radiologist to write a structured report with a special pen on normal paper and receive real-time feedback using HMD technology. We focus on the combination of new interaction possibilities with smart digital pens in this multimodal scenario, enabled by a new real-time visualisation possibility.
Schulz, Christian Husodo; Sonntag, Daniel; Weber, Markus; Toyama, Takumi: Multimodal Interaction Strategies in a Multi-Device Environment using Natural Speech. Inproceedings: Proceedings of the Companion Publication of the 2013 International Conference on Intelligent User Interfaces Companion, ACM Press, 2013.
Abstract: In this paper we present an intelligent user interface which combines a speech-based interface with several other input modalities. The integration of multiple devices into a working environment should provide greater flexibility to the daily routine of medical experts for example. To this end, we will introduce a medical cyber-physical system that demonstrates the use of a bidirectional connection between a speech-based interface and a head-mounted see-through display. We will show examples of how we can exploit multiple input modalities and thus increase the usability of a speech-based interaction system.
2012
Inproceedings
Sonntag, Daniel; Schulz, Christian Husodo; Reuschling, Christian; Galarraga, Luis: RadSpeech, a mobile dialogue system for radiologists. Inproceedings: Proceedings of the International Conference on Intelligent User Interfaces (IUI), pp. 317-318, ACM, 2012. URL: https://www.dfki.de/fileadmin/user_upload/import/6198_iui-2012-preprint.pdf
Abstract: With RadSpeech, we aim to build the next generation of intelligent, scalable, and user-friendly semantic search interfaces for the medical imaging domain, based on semantic technologies. Ontology-based knowledge representation is used not only for the image contents, but also for the complex natural language understanding and dialogue management process. This demo shows a speech-based annotation system for radiology images and focuses on a new and effective way to annotate medical image regions with a specific, structured medical diagnosis while using speech and pointing gestures on the go.
2011
Inproceedings
Sonntag, Daniel; Liwicki, Marcus; Weber, Markus: Digital Pen in Mammography Patient Forms. Inproceedings: Proceedings of the 13th International Conference on Multimodal Interfaces, pp. 303-306, ACM, 2011. URL: https://www.dfki.de/fileadmin/user_upload/import/5786_2011_Digital_Pen_in_Mammography_Patient_Forms.pdf
Abstract: We present a digital pen based interface for clinical radiology reports in the field of mammography. It is of utmost importance in future radiology practices that radiology reports be uniform, comprehensive, and easily managed. This means that reports must be "readable" to humans and machines alike. In order to improve reporting practices in mammography, we allow the radiologist to write structured reports with a special pen on paper with an invisible dot pattern. Handwriting recognition software takes care of the interpretation of the written report, which is transferred into an ontological representation. In addition, a gesture recogniser allows radiologists to encircle predefined annotation suggestions, which turns out to be the most beneficial feature. The radiologist can (1) provide image and image region annotations mapped to an FMA, RadLex, or ICD-10 code, (2) provide free text entries, and (3) correct/select annotations while using multiple gestures on the forms and sketch regions. The resulting, automatically generated PDF report is then stored in a semantic backend system for further use and contains all transcribed annotations as well as all free-form sketches.
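One step of this workflow, turning an encircling gesture on the printed form into a structured code for the report, can be pictured as a lookup from the stroke's bounding box to the form region it covers. The regions and codes below are placeholders for illustration, not the project's actual form layout or terminology mapping.

```python
# Illustrative sketch (not the project's implementation): map an encircle gesture on the
# printed form to the structured code behind the encircled annotation suggestion.
# The form regions and codes are placeholders for the real terminology mapping.
FORM_REGIONS = {
    # region name: ((x0, y0, x1, y1) in the form's coordinate space, (terminology, code))
    "finding_mass":        ((120, 300, 260, 330), ("RadLex", "RID-placeholder")),
    "diagnosis_carcinoma": ((120, 340, 260, 370), ("ICD-10", "C50.9")),
}

def gesture_to_annotation(stroke_bbox):
    """Return the structured code whose form region overlaps the encircling stroke's bounding box."""
    sx0, sy0, sx1, sy1 = stroke_bbox
    for name, ((x0, y0, x1, y1), code) in FORM_REGIONS.items():
        if sx0 <= x1 and sx1 >= x0 and sy0 <= y1 and sy1 >= y0:   # rectangles overlap
            return name, code
    return None

print(gesture_to_annotation((115, 335, 270, 380)))   # -> ('diagnosis_carcinoma', ('ICD-10', 'C50.9'))
```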
Sonntag, Daniel; Schulz, Christian Husodo: Monitoring and Explaining Reasoning Processes in a Dialogue System’s Input Interpretation Step. Inproceedings: Proceedings of the International Workshop on Explanation-aware Computing at IJCAI, IJCAI, 2011. URL: https://www.dfki.de/fileadmin/user_upload/import/6196_Exact2011_final.pdf
Abstract: We implemented a generic speech-based dialogue shell that can be configured for and applied to domain-specific dialogue applications. A toolbox for ontology-based dialogue engineering provides a technical solution for the two challenges of engineering domain extensions for new question and answer possibilities and debugging functional modules. In this paper, we address the process of debugging and maintaining rule-based input interpretation modules. While supporting a rapid implementation cycle until the dialogue system works robustly for a new domain (e.g., the dialogue-based retrieval of medical images), production rules for input interpretation have to be monitored, configured, and maintained. We implemented a special graphical user interface to monitor and explain reasoning processes for the input interpretation phase of multimodal dialogue systems. A particular challenge was presenting the software system's ontology-based interaction rules in a way that made them accessible to and editable by humans for maintenance and, at the same time, allowed real-time monitoring of their application in the running dialogue system.
Sonntag, Daniel; Liwicki, Marcus; Weber, Markus: Interactive Paper for Radiology Findings. Inproceedings: Proceedings of the 16th International Conference on Intelligent User Interfaces, pp. 459-460, ACM, 2011.
Abstract: This paper presents a pen-based interface for clinical radiologists. It is of utmost importance in future radiology practices that radiology reports be uniform, comprehensive, and easily managed. This means that reports must be "readable" to humans and machines alike. In order to improve reporting practices, we allow the radiologist to write structured reports with a special pen on normal paper. Handwriting recognition and interpretation software takes care of the interpretation of the written report, which is transferred into an ontological representation. The resulting report is then stored in a semantic backend system for further use. We focus on the pen-based interface and new interaction possibilities with gestures in this scenario.
2010
Books
Sonntag, Daniel: Ontologies and Adaptivity in Dialogue for Question Answering. Book. AKA and IOS Press, 2010, 410 pages. URL: http://www.dfki.de/~sonntag/Ontologies_and_Adaptivity_in_Dialogue_for_Question_Answering.html
Abstract: Question answering (QA) has become one of the fastest growing topics in computational linguistics and information access. To advance research in the area of dialogue-based question answering, we propose a combination of methods from different scientific fields (i.e., information retrieval, dialogue systems, semantic web and machine learning). This book sheds light on adaptable dialogue-based question answering. We demonstrate the technical and computational feasibility of the proposed ideas, the introspective methods in particular, by beginning with an extensive introduction to the dialogical problem domain which motivates the technical implementation. The ideas have been carried out in a mature natural language processing (NLP) system, the SmartWeb dialogue system, which was developed between 2004 and 2007 by partners from academia and industry. We have attempted to make this book a self-contained text and provide an extra section on the interdisciplinary scientific background. The target audience for this book comprises researchers and students interested in the application potential of semantic technologies for difficult AI tasks such as working dialogue and QA systems.
Book Chapters
Sonntag, Daniel; Wennerberg, Pinar; Buitelaar, Paul; Zillner, Sonja: Pillars of Ontology Treatment in the Medical Domain. Book Chapter: Cases on Semantic Interoperability for Information Systems Integration: Practices and Applications, pp. 162-186, Information Science Reference, 2010. URL: https://www.dfki.de/fileadmin/user_upload/import/4256_2010_Pillars_of_Ontology_Treatment_in_the_Medical_Domain_.pdf
Abstract: In this chapter the authors describe the three pillars of ontology treatment in the medical domain in a comprehensive case study within the large-scale THESEUS MEDICO project. MEDICO addresses the need for advanced semantic technologies in medical image and patient data search. The objective is to enable a seamless integration of medical images and different user applications by providing direct access to image semantics. Semantic image retrieval should provide the basis for help in clinical decision support and computer aided diagnosis. During the course of lymphoma diagnosis and continual treatment, image data is produced several times using different image modalities. After semantic annotation, the images need to be integrated with medical (textual) data repositories and ontologies. The authors build upon the three pillars of knowledge engineering, ontology mediation and alignment, and ontology population and learning to achieve the objectives of the MEDICO project.
Inproceedings
Sonntag, Daniel; Möller, Manuel: Prototyping Semantic Dialogue Systems for Radiologists. Inproceedings: Proceedings of the 6th International Conference on Intelligent Environments, AAAI, 2010. URL: https://www.dfki.de/fileadmin/user_upload/import/4922_2010_Prototyping_Semantic_Dialogue_Systems_for_Radiologists.pdf
Abstract: In the future, speech-based semantic image retrieval and annotation of medical images should provide the basis for help in clinical decision support and computer aided diagnosis. We will present a semantic dialogue system installation for radiologists and describe today's clinical workflow and interaction requirements. The focus is on the interaction design and implementation of our prototype system for patient image search and image annotation while using a speech-based dialogue shell and a big touchscreen in the radiology environment. Ontology modeling provides the backbone for knowledge representation in the dialogue shell and the specific medical application domain.
Sonntag, Daniel; Möller, Manuel A Multimodal Dialogue Mashup for Medical Image Semantics Inproceedings Proceedings of the International Conference on Intelligent User Interfaces, o.A., 2010. @inproceedings{4691, title = {A Multimodal Dialogue Mashup for Medical Image Semantics}, author = {Daniel Sonntag and Manuel Möller}, url = {https://www.dfki.de/fileadmin/user_upload/import/4691_2010_A_MULTIMODAL_DIALOGUE_MASHUP_FOR_MEDICAL_IMAGE_SEMANTICS.pdf http://portal.acm.org/citation.cfm?doid=1719970.1720036}, year = {2010}, date = {2010-02-01}, booktitle = {Proceedings of the International Conference on Intelligent User Interfaces}, publisher = {o.A.}, abstract = {This paper presents a multimodal dialogue mashup where different users are involved in the use of different user interfaces for the annotation and retrieval of medical images. Our solution is a mashup that integrates a multimodal interface for speech-based annotation of medical images and dialogue-based image retrieval with a semantic image annotation tool for manual annotations on a desktop computer. A remote RDF repository connects the annotation and querying task into a common framework and serves as the semantic backend system for the advanced multimodal dialogue a radiologist can use.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } This paper presents a multimodal dialogue mashup where different users are involved in the use of different user interfaces for the annotation and retrieval of medical images. Our solution is a mashup that integrates a multimodal interface for speech-based annotation of medical images and dialogue-based image retrieval with a semantic image annotation tool for manual annotations on a desktop computer. A remote RDF repository connects the annotation and querying task into a common framework and serves as the semantic backend system for the advanced multimodal dialogue a radiologist can use. |
Sonntag, Daniel; Kiesel, Malte Linked Data Integration for Semantic Dialogue and Backend Access Inproceedings Proceedings of the AAAI Spring Symposium "Linked Data meets Artificial Intelligence", o.A., 2010. @inproceedings{4742, title = {Linked Data Integration for Semantic Dialogue and Backend Access}, author = {Daniel Sonntag and Malte Kiesel}, url = {https://www.dfki.de/fileadmin/user_upload/import/4742_2010_Linked_Data_Integration_for_Semantic_Dialogue_and_Backend_Access.pdf http://www.foaf-project.org/events/linkedai}, year = {2010}, date = {2010-01-01}, booktitle = {Proceedings of the AAAI Spring Symposium "Linked Data meets Artificial Intelligence"}, publisher = {o.A.}, abstract = {A dialogue system for answering user questions in natural speech presents one of the main achievements of contemporary interaction-based AI technology. Modern dialogue frameworks function as middleware between the interface component and the backend where the answers to the user questions are stored in heterogeneous formats. We implemented an interface to linked data sources as part of a complex natural language understanding and semantic retrieval process, thereby integrating the querying and answering task into a common framework. The semantic backend system integrates multiple linked data sources to allow for an advanced multimodal question answering (QA) dialogue.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } A dialogue system for answering user questions in natural speech presents one of the main achievements of contemporary interaction-based AI technology. Modern dialogue frameworks function as middleware between the interface component and the backend where the answers to the user questions are stored in heterogeneous formats. We implemented an interface to linked data sources as part of a complex natural language understanding and semantic retrieval process, thereby integrating the querying and answering task into a common framework. The semantic backend system integrates multiple linked data sources to allow for an advanced multimodal question answering (QA) dialogue. |
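Editorial note: the abstract above describes integrating linked data sources into the dialogue backend but gives no implementation detail. As a minimal, purely illustrative sketch of such backend access, the following Python snippet queries one public linked data endpoint (DBpedia) with SPARQLWrapper; the endpoint, resource, and property are assumptions for illustration and are not taken from the paper.

# Minimal sketch (assumption): querying a single linked data source over SPARQL,
# the kind of backend access a semantic dialogue shell might perform.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")  # public endpoint, illustration only
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?abstract WHERE {
      <http://dbpedia.org/resource/Lymphoma> dbo:abstract ?abstract .
      FILTER (lang(?abstract) = "en")
    }
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["abstract"]["value"][:120])  # print the first characters of the answer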
Sonntag, Daniel; Porta, Daniel; Setz, Jochen HTTP/REST-based Meta Web Services in Mobile Application Frameworks Inproceedings Proceedings of the 4th International Conference on Mobile Ubiquitous Computing, Systems, Services and Technologies, XPS, 2010. @inproceedings{4864, title = {HTTP/REST-based Meta Web Services in Mobile Application Frameworks}, author = {Daniel Sonntag and Daniel Porta and Jochen Setz}, year = {2010}, date = {2010-01-01}, booktitle = {Proceedings of the 4th International Conference on Mobile Ubiquitous Computing, Systems, Services and Technologies}, publisher = {XPS}, abstract = {This paper describes how a multimodal dialogue application framework can be used to implement specific mobile applications and dynamic HTTP-based REST services. REST services are already publicly available and provide useful location-based information for the user on the go. We use a distributed, ontology-based dialogue system architecture where every major component can be run on a different host, thereby increasing the scalability of the overall system with a mobile user interface. The dialogue system provides customised access to the Google Maps Local Search and two REST services provided by GeoNames (i.e., the findNearbyWikipedia search and the findNearbyWeather search).}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } This paper describes how a multimodal dialogue application framework can be used to implement specific mobile applications and dynamic HTTP-based REST services. REST services are already publicly available and provide useful location-based information for the user on the go. We use a distributed, ontology-based dialogue system architecture where every major component can be run on a different host, thereby increasing the scalability of the overall system with a mobile user interface. The dialogue system provides customised access to the Google Maps Local Search and two REST services provided by GeoNames (i.e., the findNearbyWikipedia search and the findNearbyWeather search). |
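Editorial note: the two GeoNames services named in the abstract are public REST web services. A minimal Python sketch of calling them directly is given below for illustration; the exact endpoint and parameter names follow the public GeoNames web-service documentation and should be verified there, and the coordinates and the "demo" account are placeholder assumptions, not material from the paper.

# Minimal sketch (assumption): calling the two GeoNames REST services mentioned in the abstract.
import requests

BASE = "http://api.geonames.org"
params = {"lat": 49.25, "lng": 7.04, "username": "demo"}  # demo account and sample coordinates, illustration only

wiki = requests.get(f"{BASE}/findNearbyWikipediaJSON", params=params, timeout=10).json()
weather = requests.get(f"{BASE}/findNearByWeatherJSON", params=params, timeout=10).json()

for entry in wiki.get("geonames", [])[:3]:
    print(entry.get("title"), "-", entry.get("summary", "")[:80])   # nearby Wikipedia articles
print(weather.get("weatherObservation", {}).get("clouds"))          # nearby weather observation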
Sonntag, Daniel; Wennerberg, Pinar; Zillner, Sonja Applications of an Ontology Engineering Methodology Accessing Linked Data for Medical Image Retrieval Inproceedings Proceedings of the AAAI Spring Symposium "Linked Data meets Artificial Intelligence", Stanford University, 2010. @inproceedings{5010, title = {Applications of an Ontology Engineering Methodology Accessing Linked Data for Medical Image Retrieval}, author = {Daniel Sonntag and Pinar Wennerberg and Sonja Zillner}, url = {https://www.dfki.de/fileadmin/user_upload/import/5010_2010_Applications_of_an_Ontology_Engineering_Methodology_Accessing_Linked_Data_for_Medical_Image_Retrieval_.pdf http://www.aaai.org/ocs/index.php/SSS/SSS10/paper/view/1117}, year = {2010}, date = {2010-01-01}, booktitle = {Proceedings of the AAAI Spring Symposium "Linked Data meets Artificial Intelligence"}, publisher = {Stanford University}, abstract = {This paper examines first ideas on the applicability of Linked Data, in particular a subset of the Linked Open Drug Data (LODD), to connect radiology, human anatomy, and drug information for improved medical image annotation and subsequent search. One outcome of our ontology engineering methodology is the alignment between radiology-related OWL ontologies (FMA and RadLex). These can be used to provide new connections in the medicine-related linked data cloud. A use case scenario is provided that demonstrates the benefits of the approach by enabling the radiologist to query and explore related data, e.g., medical images and drugs. The diagnosis is on a special type of cancer (lymphoma).}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } This paper examines first ideas on the applicability of Linked Data, in particular a subset of the Linked Open Drug Data (LODD), to connect radiology, human anatomy, and drug information for improved medical image annotation and subsequent search. One outcome of our ontology engineering methodology is the alignment between radiology-related OWL ontologies (FMA and RadLex). These can be used to provide new connections in the medicine-related linked data cloud. A use case scenario is provided that demonstrates the benefits of the approach by enabling the radiologist to query and explore related data, e.g., medical images and drugs. The diagnosis is on a special type of cancer (lymphoma). |
Sonntag, Daniel; Sacaleanu, Bogdan Speech Grammars for Textual Entailment Patterns in Multimodal Question Answering Inproceedings Proceedings of the Seventh Conference on International Language Resources and Evaluation, European Language Resources Association (ELRA), 2010. @inproceedings{5013, title = {Speech Grammars for Textual Entailment Patterns in Multimodal Question Answering}, author = {Daniel Sonntag and Bogdan Sacaleanu}, url = {https://www.dfki.de/fileadmin/user_upload/import/5013_2010_Speech_Grammars_for_Textual_Entailment_Patterns_in_Multimodal_Question_Answering_.pdf http://www.lrec-conf.org/proceedings/lrec2010/summaries/911.html}, year = {2010}, date = {2010-01-01}, booktitle = {Proceedings of the Seventh Conference on International Language Resources and Evaluation}, publisher = {European Language Resources Association (ELRA)}, abstract = {Over the last several years, speech-based question answering (QA) has become very popular in contrast to pure search engine based approaches on a desktop. Open-domain QA systems are now much more powerful and precise, and they can be used in speech applications. Speech-based question answering systems often rely on predefined grammars for speech understanding. In order to improve the coverage of such complex AI systems, we reused speech patterns used to generate textual entailment patterns. These can make multimodal question understanding more robust. We exemplify this in the context of a domain-specific dialogue scenario. As a result, written text input components (e.g., in a textual input field) can deal with more flexible input according to the derived textual entailment patterns. A multimodal QA dialogue spanning over several domains of interest, i.e., personal address book entries, questions about the music domain and politicians and other celebrities, demonstrates how the textual input mode can be used in a multimodal dialogue shell.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } Over the last several years, speech-based question answering (QA) has become very popular in contrast to pure search engine based approaches on a desktop. Open-domain QA systems are now much more powerful and precise, and they can be used in speech applications. Speech-based question answering systems often rely on predefined grammars for speech understanding. In order to improve the coverage of such complex AI systems, we reused speech patterns used to generate textual entailment patterns. These can make multimodal question understanding more robust. We exemplify this in the context of a domain-specific dialogue scenario. As a result, written text input components (e.g., in a textual input field) can deal with more flexible input according to the derived textual entailment patterns. A multimodal QA dialogue spanning over several domains of interest, i.e., personal address book entries, questions about the music domain and politicians and other celebrities, demonstrates how the textual input mode can be used in a multimodal dialogue shell. |
Sonntag, Daniel; Reithinger, Norbert; Herzog, Gerd; Becker, Tilman A Discourse and Dialogue Infrastructure for Industrial Dissemination Inproceedings Lee, Gary Geunbae; Mariani, Joseph; Minker, Wolfgang; Nakamura, Satoshi (Ed.): Spoken Dialogue Systems for Ambient Environments - IWSDS 2010: Proceedings of the Second International Workshop on Spoken Dialogue Systems, pp. 132-143, Springer, 2010. @inproceedings{5014, title = {A Discourse and Dialogue Infrastructure for Industrial Dissemination}, author = {Daniel Sonntag and Norbert Reithinger and Gerd Herzog and Tilman Becker}, editor = {Gary Geunbae Lee and Joseph Mariani and Wolfgang Minker and Satoshi Nakamura}, url = {https://www.dfki.de/fileadmin/user_upload/import/5014_2010_A_Discourse_and_Dialogue_Infrastructure_for_Industrial_Dissemination_.pdf http://www.springerlink.com/content/5149m52mt5378316/}, year = {2010}, date = {2010-01-01}, booktitle = {Spoken Dialogue Systems for Ambient Environments - IWSDS 2010: Proceedings of the Second International Workshop on Spoken Dialogue Systems}, volume = {6392}, pages = {132-143}, publisher = {Springer}, abstract = {We think that modern speech dialogue systems need a prior usability analysis to identify the requirements for industrial applications. In addition, work from the area of the Semantic Web should be integrated. These requirements can then be met by multimodal semantic processing, semantic navigation, interactive semantic mediation, user adaptation/personalisation, interactive service composition, and semantic output representation which we will explain in this paper. We will also describe the discourse and dialogue infrastructure these components develop and provide two examples of disseminated industrial prototypes.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } We think that modern speech dialogue systems need a prior usability analysis to identify the requirements for industrial applications. In addition, work from the area of the Semantic Web should be integrated. These requirements can then be met by multimodal semantic processing, semantic navigation, interactive semantic mediation, user adaptation/personalisation, interactive service composition, and semantic output representation which we will explain in this paper. We will also describe the discourse and dialogue infrastructure these components develop and provide two examples of disseminated industrial prototypes. |
Technical Reports |
Sonntag, Daniel; Weihrauch, Colette; Jacobs, Oliver; Porta, Daniel THESEUS CTC-WP4 Usability Guidelines for Use Case Applications Technical Report Bundesministerium für Wirtschaft und Technologie , 2010. @techreport{4788, title = {THESEUS CTC-WP4 Usability Guidelines for Use Case Applications}, author = {Daniel Sonntag and Colette Weihrauch and Oliver Jacobs and Daniel Porta}, url = {https://www.dfki.de/fileadmin/user_upload/import/4788_100430_CTC-WP4-1_Theseus_Usability-Guidelines_Final.pdf}, year = {2010}, date = {2010-04-01}, volume = {V 1.5}, institution = {Bundesministerium für Wirtschaft und Technologie}, abstract = {Usability Guidelines for Use Case Applications serves as an introduction to the general topic of usability, i.e., how user-friendly and efficient a THESEUS prototype is. In these guidelines, we emphasize the importance of usability testing, particularly during the development of a given THESEUS prototype. We discuss the many advantages of testing prototypes and products in terms of costs, product quality, and customer satisfaction. Usability testing can improve development productivity through more efficient design and fewer code revisions. It can help to eliminate over-design by emphasizing the functionality required to meet the needs of real users. Design problems can be detected earlier in the development process, saving both time and money. In these Guidelines we provide a brief overview of testing options, ranging from a cognitive walkthrough to interviews to eye tracking. Different techniques are used at different stages of a product's development. While many techniques can be applied, no single technique alone can ensure the usability of prototypes. Usability is a process with iterative steps, meaning the cycle is repeated but in a cumulative fashion, similar to software development. In order to test, a prototype must be available and we devote some time in the Guidelines to an overview of different tools and ways to build the necessary prototypes. We also describe some options such as paper prototyping, prototypes from Visio, PowerPoint, HTML, Flash and others, and working prototypes (Java, C++, etc.) before addressing the actual tests. Before any testing is conducted, the purpose of the test should be clarified. This will have considerable impact on the kind of testing to be done. A test plan should also be written before the start of the test which considers several different aspects including, for instance, the duration of the test, where it will take place, or who the experimenter will be. A pilot test is also recommended to avoid misunderstandings and other problems during the actual test. In this context, the Guidelines also discuss other important aspects such as budget, room set-up, time, and limitations of the experimenter and test subjects themselves. To provide an overview of some of the projects THESEUS is concerned with in the context of usability, we supply explicit recommendations that result in proposed scenarios for use cases in the Guidelines. The THESEUS program consists of six use cases: ALEXANDRIA, CONTENTUS, MEDICO, ORDO, PROCESSUS, and TEXO. In order to come up with the different testing scenarios, each of which has specific design and testing recommendations, we first extracted some substantial information from the different use cases in different user settings: we discerned between those who will use the system, where they will use the system, and what they will do with the system. 
After considering the results, we determined that the THESEUS program works with seven different scenarios. We provide a decision tree that leads to specific recommendations for designing and testing with prototypes for each of the different scenarios and user settings. General recommendations concerning various input methods, the design, and the testing itself have also been included in the Guidelines. Following that, we emphasize what we find important for the design and testing of each of the seven testing scenarios. We address, for instance, the appropriate input method (keyboard, mouse, speech, etc.), according to the type of test subject (e.g., administrator or mobile user), or also which prototype could be used for the usability test. We will also challenge the usability of traditional usability guidelines. Oftentimes, guideline descriptions and explanations are unsatisfactory, remaining vague and ambiguous in explanation. The Guidelines close with an extensive list of recommended further information sources.}, keywords = {}, pubstate = {published}, tppubtype = {techreport} } Usability Guidelines for Use Case Applications serves as an introduction to the general topic of usability, i.e., how user-friendly and efficient a THESEUS prototype is. In these guidelines, we emphasize the importance of usability testing, particularly during the development of a given THESEUS prototype. We discuss the many advantages of testing prototypes and products in terms of costs, product quality, and customer satisfaction. Usability testing can improve development productivity through more efficient design and fewer code revisions. It can help to eliminate over-design by emphasizing the functionality required to meet the needs of real users. Design problems can be detected earlier in the development process, saving both time and money. In these Guidelines we provide a brief overview of testing options, ranging from a cognitive walkthrough to interviews to eye tracking. Different techniques are used at different stages of a product's development. While many techniques can be applied, no single technique alone can ensure the usability of prototypes. Usability is a process with iterative steps, meaning the cycle is repeated but in a cumulative fashion, similar to software development. In order to test, a prototype must be available and we devote some time in the Guidelines to an overview of different tools and ways to build the necessary prototypes. We also describe some options such as paper prototyping, prototypes from Visio, PowerPoint, HTML, Flash and others, and working prototypes (Java, C++, etc.) before addressing the actual tests. Before any testing is conducted, the purpose of the test should be clarified. This will have considerable impact on the kind of testing to be done. A test plan should also be written before the start of the test which considers several different aspects including, for instance, the duration of the test, where it will take place, or who the experimenter will be. A pilot test is also recommended to avoid misunderstandings and other problems during the actual test. In this context, the Guidelines also discuss other important aspects such as budget, room set-up, time, and limitations of the experimenter and test subjects themselves. To provide an overview of some of the projects THESEUS is concerned with in the context of usability, we supply explicit recommendations that result in proposed scenarios for use cases in the Guidelines. 
The THESEUS program consists of six use cases: ALEXANDRIA, CONTENTUS, MEDICO, ORDO, PROCESSUS, and TEXO. In order to come up with the different testing scenarios, each of which has specific design and testing recommendations, we first extracted some substantial information from the different use cases in different user settings: we discerned between those who will use the system, where they will use the system, and what they will do with the system. After considering the results, we determined that the THESEUS program works with seven different scenarios. We provide a decision tree that leads to specific recommendations for designing and testing with prototypes for each of the different scenarios and user settings. General recommendations concerning various input methods, the design, and the testing itself have also been included in the Guidelines. Following that, we emphasize what we find important for the design and testing of each of the seven testing scenarios. We address, for instance, the appropriate input method (keyboard, mouse, speech, etc.), according to the type of test subject (e.g., administrator or mobile user), or also which prototype could be used for the usability test. We will also challenge the usability of traditional usability guidelines. Oftentimes, guideline descriptions and explanations are unsatisfactory, remaining vague and ambiguous in explanation. The Guidelines close with an extensive list of recommended further information sources. |
Sonntag, Daniel A Methodology for Emergent Software Technical Report German Research Center for AI (DFKI), Stuhlsatzenhausweg 3, 66123 Saarbrücken, 2010. @techreport{4993, title = {A Methodology for Emergent Software}, author = {Daniel Sonntag}, url = {https://www.dfki.de/fileadmin/user_upload/import/4993_emergent.pdf}, year = {2010}, date = {2010-01-01}, volume = {2}, pages = {10}, address = {Stuhlsatzenhausweg 3, 66123 Saarbrücken}, institution = {German Research Center for AI (DFKI)}, abstract = {We present a methodology for software components that suggests adaptations to specific conditions and situations in which the components are used. The emergent software should be able to better function in a specific situation. For this purpose, we survey its background in metacognition and introspection, develop an augmented data mining cycle, and invent an introspective mechanism, a methodology for emergent software. This report is based on emergent software implementations in adaptive information systems (Sonntag, 2010).}, keywords = {}, pubstate = {published}, tppubtype = {techreport} } We present a methodology for software components that suggests adaptations to specific conditions and situations in which the components are used. The emergent software should be able to better function in a specific situation. For this purpose, we survey its background in metacognition and introspection, develop an augmented data mining cycle, and invent an introspective mechanism, a methodology for emergent software. This report is based on emergent software implementations in adaptive information systems (Sonntag, 2010). |
2009 |
Journal Articles |
Sonntag, Daniel; Zapatrin, Roman R Macrodynamics of users' behavior in Information Retrieval Journal Article Computing Research Repository eprint Journal, ArXiv online , pp. 1-15, 2009. @article{5005, title = {Macrodynamics of users' behavior in Information Retrieval}, author = {Daniel Sonntag and Roman R Zapatrin}, url = {http://arxiv.org/abs/0905.2501}, year = {2009}, date = {2009-05-01}, journal = {Computing Research Repository eprint Journal}, volume = {ArXiv online}, pages = {1-15}, publisher = {ArXiv.org}, abstract = {We present a method to geometrize massive data sets from search engines query logs. For this purpose, a macrodynamic-like quantitative model of the Information Retrieval (IR) process is developed, whose paradigm is inspired by basic constructions of Einstein's general relativity theory in which all IR objects are uniformly placed in a common Room. The Room has a structure similar to Einsteinian spacetime, namely that of a smooth manifold. Documents and queries are treated as matter objects and sources of material fields. Relevance, the central notion of IR, becomes a dynamical issue controlled by both gravitation (or, more precisely, as the motion in a curved spacetime) and forces originating from the interactions of matter fields. The spatio-temporal description ascribes dynamics to any document or query, thus providing a uniform description for documents of both initially static and dynamical nature. Within the IR context, the techniques presented are based on two ideas. The first is the placement of all objects participating in IR into a common continuous space. The second idea is the `objectivization' of the IR process; instead of expressing users' wishes, we consider the overall IR as an objective physical process, representing the IR process in terms of motion in a given external-fields configuration. Various semantic environments are treated as various IR universes.}, keywords = {}, pubstate = {published}, tppubtype = {article} } We present a method to geometrize massive data sets from search engines query logs. For this purpose, a macrodynamic-like quantitative model of the Information Retrieval (IR) process is developed, whose paradigm is inspired by basic constructions of Einstein's general relativity theory in which all IR objects are uniformly placed in a common Room. The Room has a structure similar to Einsteinian spacetime, namely that of a smooth manifold. Documents and queries are treated as matter objects and sources of material fields. Relevance, the central notion of IR, becomes a dynamical issue controlled by both gravitation (or, more precisely, as the motion in a curved spacetime) and forces originating from the interactions of matter fields. The spatio-temporal description ascribes dynamics to any document or query, thus providing a uniform description for documents of both initially static and dynamical nature. Within the IR context, the techniques presented are based on two ideas. The first is the placement of all objects participating in IR into a common continuous space. The second idea is the `objectivization' of the IR process; instead of expressing users' wishes, we consider the overall IR as an objective physical process, representing the IR process in terms of motion in a given external-fields configuration. Various semantic environments are treated as various IR universes. |
Sonntag, Daniel; Wennerberg, Pinar; Buitelaar, Paul; Zillner, Sonja Pillars of Ontology Treatment in the Medical Domain Journal Article Journal of Cases on Information Technology, 11 , pp. 47-73, 2009. @article{5007, title = {Pillars of Ontology Treatment in the Medical Domain}, author = {Daniel Sonntag and Pinar Wennerberg and Paul Buitelaar and Sonja Zillner}, editor = {Mehdi Khosrow-Pour}, url = {https://www.dfki.de/fileadmin/user_upload/import/5007_2009_Pillars_of_Ontology_Treatment_in_the_Medical_Domain.pdf http://www.igi-global.com/journals/details.asp?ID=202&mode=toc&volume=Journal+of+Cases+on+Information+Technology%2C+Vol.+11%2C+Issue+4}, year = {2009}, date = {2009-01-01}, journal = {Journal of Cases on Information Technology}, volume = {11}, pages = {47-73}, publisher = {IGI Global}, abstract = {In this chapter the authors describe the three pillars of ontology treatment in the medical domain in a comprehensive case study within the large-scale THESEUS MEDICO project. MEDICO addresses the need for advanced semantic technologies in medical image and patient data search. The objective is to enable a seamless integration of medical images and different user applications by providing direct access to image semantics. Semantic image retrieval should provide the basis for the help in clinical decision support and computer aided diagnosis. During the course of lymphoma diagnosis and continual treatment, image data is produced several times using different image modalities. After semantic annotation, the images need to be integrated with medical (textual) data repositories and ontologies. They build upon the three pillars of knowledge engineering, ontology mediation and alignment, and ontology population and learning to achieve the objectives of the MEDICO project.}, keywords = {}, pubstate = {published}, tppubtype = {article} } In this chapter the authors describe the three pillars of ontology treatment in the medical domain in a comprehensive case study within the large-scale THESEUS MEDICO project. MEDICO addresses the need for advanced semantic technologies in medical image and patient data search. The objective is to enable a seamless integration of medical images and different user applications by providing direct access to image semantics. Semantic image retrieval should provide the basis for the help in clinical decision support and computer aided diagnosis. During the course of lymphoma diagnosis and continual treatment, image data is produced several times using different image modalities. After semantic annotation, the images need to be integrated with medical (textual) data repositories and ontologies. They build upon the three pillars of knowledge engineering, ontology mediation and alignment, and ontology population and learning to achieve the objectives of the MEDICO project. |
Inproceedings |
Sonntag, Daniel; Möller, Manuel Unifying Semantic Annotation and Querying in Biomedical Images Repositories Inproceedings Proceedings of the International Conference on Knowledge Management and Information Sharing, INSTICC Press, 2009. @inproceedings{5008, title = {Unifying Semantic Annotation and Querying in Biomedical Images Repositories}, author = {Daniel Sonntag and Manuel Möller}, url = {https://www.dfki.de/fileadmin/user_upload/import/5008_sonntagmoeller.pdf http://dblp.uni-trier.de/rec/bibtex/conf/ic3k/SonntagM09}, year = {2009}, date = {2009-11-01}, booktitle = {Proceedings of the International Conference on Knowledge Management and Information Sharing}, publisher = {INSTICC Press}, abstract = {In the medical domain, semantic image retrieval should provide the basis for the help in decision support and computer aided diagnosis. But knowledge engineers cannot easily acquire the necessary medical knowledge about the image contents. Based on their semantics, we present a set of techniques for annotating images and querying image data sets. The unification of semantic annotation (using a GUI) and querying (using natural dialogue) in biomedical image repositories is based on a unified view of the knowledge acquisition process. We use a central RDF repository to capture both medical domain knowledge as well as image annotations and understand medical knowledge engineering as an interactive process between the knowledge engineer and the clinician. Our system also supports the interactive process between the dialogue engineer and the clinician.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } In the medical domain, semantic image retrieval should provide the basis for the help in decision support and computer aided diagnosis. But knowledge engineers cannot easily acquire the necessary medical knowledge about the image contents. Based on their semantics, we present a set of techniques for annotating images and querying image data sets. The unification of semantic annotation (using a GUI) and querying (using natural dialogue) in biomedical image repositories is based on a unified view of the knowledge acquisition process. We use a central RDF repository to capture both medical domain knowledge as well as image annotations and understand medical knowledge engineering as an interactive process between the knowledge engineer and the clinician. Our system also supports the interactive process between the dialogue engineer and the clinician. |
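Editorial note: the abstract above describes a central RDF repository that captures both domain knowledge and image annotations and serves annotation and querying alike. As a minimal illustration of that unification, the following Python sketch keeps one rdflib graph for both tasks; the namespace, property names, and sample annotation are invented for illustration and do not come from the paper.

# Minimal sketch (assumption): one RDF graph holding an image annotation and the query over it.
from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.org/medico#")   # hypothetical namespace
g = Graph()
g.add((EX.image42, EX.showsFinding, Literal("enlarged lymph node")))   # annotation step
g.add((EX.image42, EX.modality, Literal("CT")))

results = g.query("""
    PREFIX ex: <http://example.org/medico#>
    SELECT ?img WHERE { ?img ex:showsFinding ?f . FILTER regex(?f, "lymph") }
""")                                                                    # querying step, same store
for (img,) in results:
    print(img)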
Porta, Daniel; Sonntag, Daniel; Neßelrath, Robert A Multimodal Mobile B2B Dialogue Interface on the iPhone Inproceedings Proceedings of the 4th Workshop on Speech in Mobile and Pervasive Environments, o.A., 2009. @inproceedings{4177, title = {A Multimodal Mobile B2B Dialogue Interface on the iPhone}, author = {Daniel Porta and Daniel Sonntag and Robert Neßelrath}, url = {https://www.dfki.de/fileadmin/user_upload/import/4177_texo_short_simpe.pdf}, year = {2009}, date = {2009-01-01}, booktitle = {Proceedings of the 4th Workshop on Speech in Mobile and Pervasive Environments}, publisher = {o.A.}, abstract = {In this paper, we describe a mobile Business-to-Business (B2B) interaction system. The mobile device supports users in accessing a service platform. A multimodal dialogue system allows a business expert to intuitively search and browse for services in a real-world production pipeline. We implemented a distributed client-server dialogue application for natural language speech input and speech output generation. On the mobile device, we implemented a multimodal client application which comprises of a GUI for touch gestures and a three-dimensional visualization. The client is linked to an ontology-based dialogue platform and fully leverages the device`s interaction capabilities in order to provide intuitive access to the service platform while on the go.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } In this paper, we describe a mobile Business-to-Business (B2B) interaction system. The mobile device supports users in accessing a service platform. A multimodal dialogue system allows a business expert to intuitively search and browse for services in a real-world production pipeline. We implemented a distributed client-server dialogue application for natural language speech input and speech output generation. On the mobile device, we implemented a multimodal client application which comprises of a GUI for touch gestures and a three-dimensional visualization. The client is linked to an ontology-based dialogue platform and fully leverages the device`s interaction capabilities in order to provide intuitive access to the service platform while on the go. |
Porta, Daniel; Sonntag, Daniel; Neßelrath, Robert New Business to Business Interaction: Shake your iPhone and speak to it. Inproceedings Proceedings of the 11th International Conference on Human-Computer Interaction with Mobile Devices and Services, ACM, 2009. @inproceedings{4175, title = {New Business to Business Interaction: Shake your iPhone and speak to it.}, author = {Daniel Porta and Daniel Sonntag and Robert Neßelrath}, url = {https://www.dfki.de/fileadmin/user_upload/import/4175_2009_New_Business_to_Business_Interaction-_Shake_your_iPhone_and_speak_to_it..pdf http://portal.acm.org/ft_gateway.cfm?id=1613931&type=pdf&coll=GUIDE&dl=GUIDE&CFID=78134429&CFTOKEN=84457144}, year = {2009}, date = {2009-01-01}, booktitle = {Proceedings of the 11th International Conference on Human-Computer Interaction with Mobile Devices and Services}, publisher = {ACM}, abstract = {We present a new multimodal interaction sequence for a mobile multimodal Business-to-Business interaction system. A mobile client application on the iPhone supports users in accessing an online service marketplace and allows business experts to intuitively search and browse for services using natural language speech and gestures while on the go. For this purpose, we utilize an ontology-based multimodal dialogue platform as well as an integrated trainable gesture recognizer.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } We present a new multimodal interaction sequence for a mobile multimodal Business-to-Business interaction system. A mobile client application on the iPhone supports users in accessing an online service marketplace and allows business experts to intuitively search and browse for services using natural language speech and gestures while on the go. For this purpose, we utilize an ontology-based multimodal dialogue platform as well as an integrated trainable gesture recognizer. |
Sonntag, Daniel; Deru, Matthieu; Bergweiler, Simon Design and Implementation of Combined Mobile and Touchscreen-Based Multimodal Web 3.0 Interfaces Inproceedings Proceedings of the International Conference on Artificial Intelligence (ICAI), CSREA Press, 2009. @inproceedings{4255, title = {Design and Implementation of Combined Mobile and Touchscreen-Based Multimodal Web 3.0 Interfaces}, author = {Daniel Sonntag and Matthieu Deru and Simon Bergweiler}, url = {https://www.dfki.de/fileadmin/user_upload/import/4255_icai-09.pdf}, year = {2009}, date = {2009-01-01}, booktitle = {Proceedings of the International Conference on Artificial Intelligence (ICAI)}, publisher = {CSREA Press}, abstract = {We describe a Web 3.0 interaction system where the mobile user scenario is combined with a touchscreenbased collaborative terminal. Multiple users should be able to easily organize their information/knowledge space (which is ontology-based) and share information with others. We implemented a MP3 and video player interface for the physical iPod Touch (or iPhone) and the corresponding virtual touchscreen workbench. The Web 3.0 access allows us to organize and retrieve multimedia material from online repositories such as YouTube and LastFM.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } We describe a Web 3.0 interaction system where the mobile user scenario is combined with a touchscreenbased collaborative terminal. Multiple users should be able to easily organize their information/knowledge space (which is ontology-based) and share information with others. We implemented a MP3 and video player interface for the physical iPod Touch (or iPhone) and the corresponding virtual touchscreen workbench. The Web 3.0 access allows us to organize and retrieve multimedia material from online repositories such as YouTube and LastFM. |
Sonntag, Daniel Introspection and Adaptable Model Integration for Dialogue-based Question Answering Inproceedings Proceedings of the Twenty-first International Joint Conferences on Artificial Intelligence (IJCAI), Online, 2009. @inproceedings{4257, title = {Introspection and Adaptable Model Integration for Dialogue-based Question Answering}, author = {Daniel Sonntag}, url = {https://www.dfki.de/fileadmin/user_upload/import/4257_ijcai09_no_asso.pdf}, year = {2009}, date = {2009-01-01}, booktitle = {Proceedings of the Twenty-first International Joint Conferences on Artificial Intelligence (IJCAI)}, publisher = {Online}, abstract = {Dialogue-based Question Answering (QA) is a highly complex task that brings together a QA system including various natural language processing components (i.e., components for question classification, information extraction, and retrieval) with dialogue systems for effective and natural communication. The dialogue-based access is difficult to establish when the QA system in use is complex and combines many different answer services with different quality and access characteristics. For example, some questions are processed by opendomain QA services with a broad coverage. Others should be processed by using a domain-specific instance ontology for more reliable answers. Different answer services may change their characteristics over time and the dialogue reaction models have to be updated according to that. To solve this problem, we developed introspective methods to integrate adaptable models of the answer services. We evaluated the impact of the learned models on the dialogue performance, i.e., whether the adaptable models can be used for a more convenient dialogue formulation process. We show significant effectiveness improvements in the resulting dialogues when using the machine learning (ML) models. Examples are provided in the context of the generation of system-initiative feedback to user questions and answers, as provided by heterogeneous information services.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } Dialogue-based Question Answering (QA) is a highly complex task that brings together a QA system including various natural language processing components (i.e., components for question classification, information extraction, and retrieval) with dialogue systems for effective and natural communication. The dialogue-based access is difficult to establish when the QA system in use is complex and combines many different answer services with different quality and access characteristics. For example, some questions are processed by opendomain QA services with a broad coverage. Others should be processed by using a domain-specific instance ontology for more reliable answers. Different answer services may change their characteristics over time and the dialogue reaction models have to be updated according to that. To solve this problem, we developed introspective methods to integrate adaptable models of the answer services. We evaluated the impact of the learned models on the dialogue performance, i.e., whether the adaptable models can be used for a more convenient dialogue formulation process. We show significant effectiveness improvements in the resulting dialogues when using the machine learning (ML) models. Examples are provided in the context of the generation of system-initiative feedback to user questions and answers, as provided by heterogeneous information services. |
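Editorial note: the abstract above describes learning adaptable models of heterogeneous answer services to steer dialogue feedback. The sketch below is only a stand-in for that idea: a toy reliability classifier over invented features, with scikit-learn substituting for whatever ML machinery the paper actually used.

# Minimal sketch (assumption): a toy reliability model for answer services; features and data are invented.
from sklearn.linear_model import LogisticRegression

# features per answer: [service_id, answer_confidence, response_time_in_seconds]
X = [[0, 0.9, 1.2], [0, 0.4, 3.5], [1, 0.8, 0.8], [1, 0.3, 4.0], [0, 0.7, 1.0], [1, 0.2, 5.1]]
y = [1, 0, 1, 0, 1, 0]   # 1 = answer was accepted by the user, 0 = rejected

model = LogisticRegression().fit(X, y)

# Decide whether to present a new answer directly or ask the user for confirmation first.
p_reliable = model.predict_proba([[1, 0.6, 1.5]])[0][1]
print("present directly" if p_reliable > 0.5 else "ask for confirmation")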
Sonntag, Daniel; Sonnenberg, Gerhard; Neßelrath, Robert; Herzog, Gerd Supporting a Rapid Dialogue Engineering Process Inproceedings Proceedings of the First International Workshop On Spoken Dialogue Systems Technology, o.A., 2009. @inproceedings{4673, title = {Supporting a Rapid Dialogue Engineering Process}, author = {Daniel Sonntag and Gerhard Sonnenberg and Robert Neßelrath and Gerd Herzog}, url = {https://www.dfki.de/fileadmin/user_upload/import/4673_IWSDS2009.pdf}, year = {2009}, date = {2009-01-01}, booktitle = {Proceedings of the First International Workshop On Spoken Dialogue Systems Technology}, publisher = {o.A.}, abstract = {We implemented a generic dialogue shell that can be configured for and applied to domain-specific dialogue applications. A toolbox for ontology-based dialogue engineering provides a technical solution for the two challenges of engineering ontological domain extensions and debugging functional modules. We support a rapid implementation cycle until the dialogue systems works robustly for a new domain, e.g., the dialogue-based retrieval of medical images.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } We implemented a generic dialogue shell that can be configured for and applied to domain-specific dialogue applications. A toolbox for ontology-based dialogue engineering provides a technical solution for the two challenges of engineering ontological domain extensions and debugging functional modules. We support a rapid implementation cycle until the dialogue systems works robustly for a new domain, e.g., the dialogue-based retrieval of medical images. |
Sonntag, Daniel On Intuitive Dialogue-based Communication and Instinctive Dialogue Initiative Inproceedings Instinctive Computing, International Workshop, online, 2009. @inproceedings{5006, title = {On Intuitive Dialogue-based Communication and Instinctive Dialogue Initiative}, author = {Daniel Sonntag}, url = {https://www.dfki.de/fileadmin/user_upload/import/5006_instinctiveDialogue.pdf}, year = {2009}, date = {2009-01-01}, booktitle = {Instinctive Computing, International Workshop}, publisher = {online}, abstract = {Maxims of conversation and the resulting (multimodal) constraints may be very much related to instinctive computing. At least, one could argue that cognitive instincts and (meta)cognitive dialogue strategies use the same class of actual sensory input. In my model, however, the dialogue partner's (instinctive?) competence arises from adaptable models he learns from the environment. For example, some information resources are more reliable than others, some people always or never tell the truth, which affects the dialogue action models---over time.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } Maxims of conversation and the resulting (multimodal) constraints may be very much related to instinctive computing. At least, one could argue that cognitive instincts and (meta)cognitive dialogue strategies use the same class of actual sensory input. In my model, however, the dialogue partner's (instinctive?) competence arises from adaptable models he learns from the environment. For example, some information resources are more reliable than others, some people always or never tell the truth, which affects the dialogue action models---over time. |
2008 |
Inproceedings |
Dividino, Renata; Romanelli, Massimo; Sonntag, Daniel Semiotic-based Ontology Evaluation Tool S-OntoEval Inproceedings Proceedings of the Sixth International Conference on Language Resources and Evaluation, ELRA, 2008. @inproceedings{3766, title = {Semiotic-based Ontology Evaluation Tool S-OntoEval}, author = {Renata Dividino and Massimo Romanelli and Daniel Sonntag}, url = {https://www.dfki.de/fileadmin/user_upload/import/3766_SOntoEvalposter.pdf}, year = {2008}, date = {2008-01-01}, booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation}, publisher = {ELRA}, abstract = {The objective of the Semiotic-based Ontology Evaluation Tool (S-OntoEval) is to evaluate and propose improvements to a given ontological model. The evaluation aims at assessing the quality of the ontology by drawing upon semiotic theory (Stamper et al., 2000), taking several metrics into consideration for assessing the syntactic, semantic, and pragmatic aspects of ontology quality. We consider an ontology to be a semiotic object and we identify three main types of semiotic ontology evaluation levels: the structural level, assessing the ontology syntax and formal semantics; the functional level, assessing the ontology cognitive semantics and; the usability-related level, assessing the ontology pragmatics. The Ontology Evaluation Tool implements metrics for each semiotic ontology level: on the structural level by making use of reasoner such as the RACER System (Haarselv and Möller, 2001) and Pellet (Parsia and Sirin, 2004) to check the logical consistency of our ontological model (TBoxes and ABoxes) and graph-theory measures such as Depth; on the functional level by making use of a task-based evaluation approach which measures the quality of the ontology based on the adequacy of the ontological model for a specific task; and on the usability-profiling level by applying a quantitative analysis of the amount of annotation. Other metrics can be easily integrated and added to the respective evaluation level. In this work, the Ontology Evaluation Tool is used to test and evaluate the SWIntO Ontology of the SmartWeb project.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } The objective of the Semiotic-based Ontology Evaluation Tool (S-OntoEval) is to evaluate and propose improvements to a given ontological model. The evaluation aims at assessing the quality of the ontology by drawing upon semiotic theory (Stamper et al., 2000), taking several metrics into consideration for assessing the syntactic, semantic, and pragmatic aspects of ontology quality. We consider an ontology to be a semiotic object and we identify three main types of semiotic ontology evaluation levels: the structural level, assessing the ontology syntax and formal semantics; the functional level, assessing the ontology cognitive semantics and; the usability-related level, assessing the ontology pragmatics. 
The Ontology Evaluation Tool implements metrics for each semiotic ontology level: on the structural level by making use of reasoner such as the RACER System (Haarselv and Möller, 2001) and Pellet (Parsia and Sirin, 2004) to check the logical consistency of our ontological model (TBoxes and ABoxes) and graph-theory measures such as Depth; on the functional level by making use of a task-based evaluation approach which measures the quality of the ontology based on the adequacy of the ontological model for a specific task; and on the usability-profiling level by applying a quantitative analysis of the amount of annotation. Other metrics can be easily integrated and added to the respective evaluation level. In this work, the Ontology Evaluation Tool is used to test and evaluate the SWIntO Ontology of the SmartWeb project. |
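Editorial note: among the structural-level metrics, the abstract above names graph-theory measures such as Depth. A minimal Python sketch of that measure over a toy class hierarchy is given below for illustration; the taxonomy is invented and this is not the S-OntoEval implementation.

# Minimal sketch (assumption): computing taxonomy depth over a toy child -> parent hierarchy.
subclass_of = {          # toy taxonomy, for illustration only
    "Lymphoma": "Cancer",
    "Cancer": "Disease",
    "Disease": "Thing",
    "CT-Image": "Image",
    "Image": "Thing",
}

def depth(cls: str) -> int:
    """Number of subclass edges from cls up to the root."""
    d = 0
    while cls in subclass_of:
        cls = subclass_of[cls]
        d += 1
    return d

max_depth = max(depth(c) for c in subclass_of)   # depth of the deepest class
print(max_depth)   # 3 for this toy hierarchy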
2007 |
Journal Articles |
Oberle, Daniel; Ankolekar, Anupriya; Hitzler, Pascal; Cimiano, Philipp; Sintek, Michael; Kiesel, Malte; Mougouie, Babak; Baumann, Stephan; Vembu, Shankar; Romanelli, Massimo; Buitelaar, Paul; Engel, Ralf; Sonntag, Daniel; Reithinger, Norbert; Loos, Berenike; Zorn, Hans-Peter; Micelli, Vanessa; Porzel, Robert; Schmidt, Christian; Weiten, Moritz; Burkhardt, Felix; Zhou, Jianshen DOLCE ergo SUMO: On Foundational and Domain Models in SWIntO (SmartWeb Integrated Ontology) Journal Article Journal of Web Semantics: Science, Services and Agents on the World Wide Web, 5 , pp. 156-174, 2007. @article{2596, title = {DOLCE ergo SUMO: On Foundational and Domain Models in SWIntO (SmartWeb Integrated Ontology)}, author = {Daniel Oberle and Anupriya Ankolekar and Pascal Hitzler and Philipp Cimiano and Michael Sintek and Malte Kiesel and Babak Mougouie and Stephan Baumann and Shankar Vembu and Massimo Romanelli and Paul Buitelaar and Ralf Engel and Daniel Sonntag and Norbert Reithinger and Berenike Loos and Hans-Peter Zorn and Vanessa Micelli and Robert Porzel and Christian Schmidt and Moritz Weiten and Felix Burkhardt and Jianshen Zhou}, url = {https://www.dfki.de/fileadmin/user_upload/import/2596_resubmission-JWS-D-06-00012-1.pdf}, year = {2007}, date = {2007-01-01}, journal = {Journal of Web Semantics: Science, Services and Agents on the World Wide Web}, volume = {5}, pages = {156-174}, publisher = {o.A.}, abstract = {Increased availability of mobile computing, such as personal digital assistants (PDAs), creates the potential for constant and intelligent access to up-to-date, integrated and detailed information from the Web, regardless of one's actual geographical position. Intelligent question-answering requires the representation of knowledge from various domains, such as the navigational and discourse context of the user, potential user questions, the information provided by Web services and so on, for example in the form of ontologies. Within the context of the SmartWeb project, we have developed a number of domain-specific ontologies that are relevant for mobile and intelligent user interfaces to open-domain question-answering and information services on the Web. To integrate the various domain-specific ontologies, we have developed a foundational ontology, the SmartSUMO ontology, on the basis of the DOLCE and SUMO ontologies. This allows us to combine all the developed ontologies into a single SmartWeb Integrated Ontology (SWIntO) having a common modeling basis with conceptual clarity and the provision of ontology design patterns for modeling consistency. In this paper, we present SWIntO, describe the design choices we made in its construction, illustrate the use of the ontology through a number of applications, and discuss some of the lessons learned from our experiences.}, keywords = {}, pubstate = {published}, tppubtype = {article} } Increased availability of mobile computing, such as personal digital assistants (PDAs), creates the potential for constant and intelligent access to up-to-date, integrated and detailed information from the Web, regardless of one's actual geographical position. Intelligent question-answering requires the representation of knowledge from various domains, such as the navigational and discourse context of the user, potential user questions, the information provided by Web services and so on, for example in the form of ontologies. 
Within the context of the SmartWeb project, we have developed a number of domain-specific ontologies that are relevant for mobile and intelligent user interfaces to open-domain question-answering and information services on the Web. To integrate the various domain-specific ontologies, we have developed a foundational ontology, the SmartSUMO ontology, on the basis of the DOLCE and SUMO ontologies. This allows us to combine all the developed ontologies into a single SmartWeb Integrated Ontology (SWIntO) having a common modeling basis with conceptual clarity and the provision of ontology design patterns for modeling consistency. In this paper, we present SWIntO, describe the design choices we made in its construction, illustrate the use of the ontology through a number of applications, and discuss some of the lessons learned from our experiences. |
Book Chapters |
Sonntag, Daniel; Engel, Ralf; Herzog, Gerd; Pfalzgraf, Alexander; Pfleger, Norbert; Romanelli, Massimo; Reithinger, Norbert SmartWeb Handheld - Multimodal Interaction with Ontological Knowledge Bases and Semantic Web Services (extended version) Book Chapter Huang, Thomas; Nijholt, Anton; Pantic, Maja; Pentland, Alex (Ed.): Artificial Intelligence for Human Computing, 4451, pp. 272-295, Springer, 2007. @inbook{2586, title = {SmartWeb Handheld - Multimodal Interaction with Ontological Knowledge Bases and Semantic Web Services (extended version)}, author = {Daniel Sonntag and Ralf Engel and Gerd Herzog and Alexander Pfalzgraf and Norbert Pfleger and Massimo Romanelli and Norbert Reithinger}, editor = {Thomas Huang and Anton Nijholt and Maja Pantic and Alex Pentland}, url = {https://www.dfki.de/fileadmin/user_upload/import/2586_fulltext.pdf}, year = {2007}, date = {2007-07-01}, booktitle = {Artificial Intelligence for Human Computing}, volume = {4451}, pages = {272-295}, publisher = {Springer}, abstract = {SmartWeb aims to provide intuitive multimodal access to a rich selection of Web-based information services. We report on the current prototype with a smartphone client interface to the Semantic Web. An advanced ontology-based representation of facts and media structures serves as the central description for rich media content. Underlying content is accessed through conventional web service middleware to connect the ontological knowledge base and an intelligent web service composition module for external web services, which is able to translate between ordinary XML-based data structures and explicit semantic representations for user queries and system responses. The presentation module renders the media content and the results generated from the services and provides a detailed description of the content and its layout to the fusion module. The user is then able to employ multiple modalities, like speech and gestures, to interact with the presented multimedia material in a multimodal way.}, keywords = {}, pubstate = {published}, tppubtype = {inbook} } SmartWeb aims to provide intuitive multimodal access to a rich selection of Web-based information services. We report on the current prototype with a smartphone client interface to the Semantic Web. An advanced ontology-based representation of facts and media structures serves as the central description for rich media content. Underlying content is accessed through conventional web service middleware to connect the ontological knowledge base and an intelligent web service composition module for external web services, which is able to translate between ordinary XML-based data structures and explicit semantic representations for user queries and system responses. The presentation module renders the media content and the results generated from the services and provides a detailed description of the content and its layout to the fusion module. The user is then able to employ multiple modalities, like speech and gestures, to interact with the presented multimedia material in a multimodal way. |
Inproceedings |
Sonntag, Daniel; Heim, Philipp A Constraint-Based Graph Visualisation Architecture for Mobile Semantic Web Interfaces Inproceedings Falciendo, B; Spagnuolo, M; Avrithis, Y; Kompatsiaris, I; Buitelaar, Paul (Ed.): Semantic Multimedia. Proceedings of the 2nd International Conference on Semantic and Digital Media Technologies (SAMT-2007), December 5-7, Genoa, Italy, pp. 158-171, Springer, 2007. @inproceedings{2598, title = {A Constraint-Based Graph Visualisation Architecture for Mobile Semantic Web Interfaces}, author = {Daniel Sonntag and Philipp Heim}, editor = {B Falciendo and M Spagnuolo and Y Avrithis and I Kompatsiaris and Paul Buitelaar}, url = {https://www.dfki.de/fileadmin/user_upload/import/2598_sonntag.pdf}, year = {2007}, date = {2007-12-01}, booktitle = {Semantic Multimedia. Proceedings of the 2nd International Conference on Semantic and Digital Media Technologies (SAMT-2007), December 5-7, Genoa, Italy}, volume = {4816}, pages = {158-171}, publisher = {Springer}, abstract = {Multimodal and dialogue-based mobile interfaces to the Semantic Web offer access to complex knowledge and information structures. We explore more fine-grained co-ordination of multimodal presentations in mobile environments by graph visualisations and navigation in ontological RDF result structures and multimedia archives. Semantic Navigation employs integrated ontology structures and leverages graphical user interface activity for dialogical interaction on mobile devices. Hence information visualisation benefits from the Semantic Web. Constraint-based programming helps to find optimised multimedia graph visualisations. We report on the constraint-formulisation process to optimise the visualisation of semantic-based information on small devices and its integration in a distributed dialogue system.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } Multimodal and dialogue-based mobile interfaces to the Semantic Web offer access to complex knowledge and information structures. We explore more fine-grained co-ordination of multimodal presentations in mobile environments by graph visualisations and navigation in ontological RDF result structures and multimedia archives. Semantic Navigation employs integrated ontology structures and leverages graphical user interface activity for dialogical interaction on mobile devices. Hence information visualisation benefits from the Semantic Web. Constraint-based programming helps to find optimised multimedia graph visualisations. We report on the constraint-formulisation process to optimise the visualisation of semantic-based information on small devices and its integration in a distributed dialogue system. |
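Editorial note: the abstract above reports a constraint-formulisation process for placing graph visualisations on small devices. The Python sketch below is only a toy illustration of constraint-based placement using the python-constraint library; the nodes, slot grid, and adjacency rule are invented and do not reproduce the paper's formulation.

# Minimal sketch (assumption): assigning graph nodes to screen slots under simple constraints.
from constraint import Problem, AllDifferentConstraint

nodes = ["query", "result", "image"]      # graph nodes to place
slots = [0, 1, 2, 3]                      # four positions on a small display

problem = Problem()
problem.addVariables(nodes, slots)
problem.addConstraint(AllDifferentConstraint())            # one node per slot
problem.addConstraint(lambda q, r: abs(q - r) == 1,        # connected nodes stay adjacent
                      ("query", "result"))

print(problem.getSolution())   # e.g. {'query': 0, 'result': 1, 'image': 2}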
Sonntag, Daniel; Heim, Philipp Semantic Graph Visualisation for Mobile Semantic Web Interfaces Inproceedings Hertzberg, J; Beetz, M; Englert, R (Ed.): Proceedings of the 30th Annual German Conference on Artificial Intelligence (KI 2007), September 10-13, Osnabrück, Germany, pp. 506-509, Springer, 2007. @inproceedings{2588, title = {Semantic Graph Visualisation for Mobile Semantic Web Interfaces}, author = {Daniel Sonntag and Philipp Heim}, editor = {J Hertzberg and M Beetz and R Englert}, url = {https://www.dfki.de/fileadmin/user_upload/import/2588_2007_POSTER_Semantic_Graph_Visualisation_for_Mobile_Semantic_Web_Interfaces.pdf}, year = {2007}, date = {2007-09-01}, booktitle = {Proceedings of the 30th Annual German Conference on Artificial Intelligence (KI 2007), September 10-13, Osnabrück, Germany}, volume = {4667}, pages = {506-509}, publisher = {Springer}, abstract = {o.A.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } o.A. |
Sonntag, Daniel Context-Sensitive Multimodal Mobile Interfaces - Speech and Gesture Based Information Seeking Interaction with Navigation on Mobile Devices Inproceedings Proceedings of the 9th International Conference on Human Computer Interaction with Mobile Devices and Services (MobileHCI-07), September 9-12, Singapore, pp. 142-148, ACM Publications, 2007. @inproceedings{2597, title = {Context-Sensitive Multimodal Mobile Interfaces - Speech and Gesture Based Information Seeking Interaction with Navigation on Mobile Devices}, author = {Daniel Sonntag}, url = {https://www.dfki.de/fileadmin/user_upload/import/2597_2007_Context-Sensitive_Multimodal_Mobile_Interfaces.pdf}, year = {2007}, date = {2007-09-01}, booktitle = {Proceedings of the 9th International Conference on Human Computer Interaction with Mobile Devices and Services (MobileHCI-07), September 9-12, Singapore}, pages = {142-148}, publisher = {ACM Publications}, address = {ACM Publications}, abstract = {o.A.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } o.A. |
Engel, Ralf; Sonntag, Daniel Text Generation in the SmartWeb Multimodal Dialogue System Inproceedings Hertzberg, J; Beetz, M; Englert, R (Ed.): KI 2007: Advances in Artificial Intelligence. 30th Annual German Conference on Artificial Intelligence (KI 2007), September 10-13, Osnabrück, Germany, pp. 448-451, Springer, 2007. @inproceedings{2589, title = {Text Generation in the SmartWeb Multimodal Dialogue System}, author = {Ralf Engel and Daniel Sonntag}, editor = {J Hertzberg and M Beetz and R Englert}, url = {https://www.dfki.de/fileadmin/user_upload/import/2589_2007_SmartWeb_Handheld_-_Multimodal_Interaction_with_On.pdf}, year = {2007}, date = {2007-09-01}, booktitle = {KI 2007: Advances in Artificial Intelligence. 30th Annual German Conference on Artificial Intelligence (KI 2007), September 10-13, Osnabrück, Germany}, volume = {4667}, pages = {448-451}, publisher = {Springer}, abstract = {o.A.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } o.A. |
Sonntag, Daniel Interaction Design and Implementation for Multimodal Mobile Semantic Web Interfaces Inproceedings Smith, Michael J; Salvendy, Gavriel (Ed.): Human Interface and the Management of Information. Interacting in Information Environments (Part 2), Springer, 2007. @inproceedings{2587, title = {Interaction Design and Implementation for Multimodal Mobile Semantic Web Interfaces}, author = {Daniel Sonntag}, editor = {Michael J Smith and Gavriel Salvendy}, url = {https://www.dfki.de/fileadmin/user_upload/import/2587_2007_INTERACTION_DESIGN_AND_IMPLEMENTATION_FOR_MULTIMODAL_MOBILE_SEMANTIC_WEB_INTERFACES.pdf}, year = {2007}, date = {2007-07-01}, booktitle = {Human Interface and the Management of Information. Interacting in Information Environments (Part 2)}, publisher = {Springer}, abstract = {o.A.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } o.A. |
Sonntag, Daniel Embedded Distributed Text Mining and Semantic Web Technology Inproceedings Proceedings of the Workshop: NATO Advanced Study Institute on Mining Massive Data Sets for Security, Advanced Study Institute, 2007. @inproceedings{3767, title = {Embedded Distributed Text Mining and Semantic Web Technology}, author = {Daniel Sonntag}, url = {https://www.dfki.de/fileadmin/user_upload/import/3767_nato_poster.pdf}, year = {2007}, date = {2007-01-01}, booktitle = {Proceedings of the Workshop: NATO Advanced Study Institute on Mining Massive Data Sets for Security}, publisher = {Advanced Study Institute}, abstract = {We first position Text Mining (TM) components and challenges in a Grid-based distributed TM architecture. On the basis of this infrastructure we declare an embedded TM workflow.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } We first position Text Mining (TM) components and challenges in a Grid-based distributed TM architecture. On the basis of this infrastructure we declare an embedded TM workflow. |
Technical Reports |
Sonntag, Daniel; Reithinger, Norbert SmartWeb Handheld Interaction: General Interactions and Result Display for User-System Multimodal Dialogue Technical Report DFKI , 2007. @techreport{2919, title = {SmartWeb Handheld Interaction: General Interactions and Result Display for User-System Multimodal Dialogue}, author = {Daniel Sonntag and Norbert Reithinger}, url = {https://www.dfki.de/fileadmin/user_upload/import/2919_InteractionSmartWeb6.pdf}, year = {2007}, date = {2007-01-01}, volume = {V 1.1}, institution = {DFKI}, abstract = {Question answering (QA) has become one of the fastest growing topics in computational linguistics and information access. In this context, SmartWeb (http://www.smartweb-projekt.de/) was a large-scale German research project that aimed to provide intuitive multimodal access to a rich selection of Web-based information services. In one of the main scenarios, the user interacts with a smartphone client interface, and asks natural language questions, to access the Semantic Web. The demonstrator systems were developed between 2004 and 2007 by partners from academia and industry. This document provides the interaction storyboard (the multimodal design of SmartWeb handheld's interaction), and a description of the actual implementation. We decided to publish this technical document in a second version in the context of the THESEUS project (http://www.theseus-programm.de) since this SmartWeb document provided many suggestions for the THESEUS usability guidelines (http://www.dfki.de/~sonntag/interactiondesign.htm) and the implementation of the THESEUS TEXO demonstrators. Theseus is the German flagship project on the Internet of Services, where the user can delegate complex tasks to dynamically composed semantic web services by utilizing multimodal interaction combining speech and multi-touch input on advanced smartphones. More information on SmartWeb's technical question-answering software architecture and the underlying multimodal dialogue system, which is further developed and commercialized in the THESEUS project, can be found in the book: "Ontologies and Adaptivity in Dialogue for Question Answering".}, keywords = {}, pubstate = {published}, tppubtype = {techreport} } Question answering (QA) has become one of the fastest growing topics in computational linguistics and information access. In this context, SmartWeb (http://www.smartweb-projekt.de/) was a large-scale German research project that aimed to provide intuitive multimodal access to a rich selection of Web-based information services. In one of the main scenarios, the user interacts with a smartphone client interface, and asks natural language questions, to access the Semantic Web. The demonstrator systems were developed between 2004 and 2007 by partners from academia and industry. This document provides the interaction storyboard (the multimodal design of SmartWeb handheld's interaction), and a description of the actual implementation. We decided to publish this technical document in a second version in the context of the THESEUS project (http://www.theseus-programm.de) since this SmartWeb document provided many suggestions for the THESEUS usability guidelines (http://www.dfki.de/~sonntag/interactiondesign.htm) and the implementation of the THESEUS TEXO demonstrators. 
Theseus is the German flagship project on the Internet of Services, where the user can delegate complex tasks to dynamically composed semantic web services by utilizing multimodal interaction combining speech and multi-touch input on advanced smartphones. More information on SmartWeb's technical question-answering software architecture and the underlying multimodal dialogue system, which is further developed and commercialized in the THESEUS project, can be found in the book: "Ontologies and Adaptivity in Dialogue for Question Answering". |
2006 |
Incollections |
Sonntag, Daniel Multimodale Interaktion Incollection Cramer, Irene; im Walde, Sabine Schulte (Ed.): Studienbibliographie Computerlinguistik und Sprachtechnologie, ISBN 3-87276-867-0 , Stauffenburg Verlag Brigitte Narr, 2006. @incollection{2592, title = {Multimodale Interaktion}, author = {Daniel Sonntag}, editor = {Irene Cramer and Sabine Schulte im Walde}, year = {2006}, date = {2006-01-01}, booktitle = {Studienbibliographie Computerlinguistik und Sprachtechnologie}, volume = {ISBN 3-87276-867-0}, publisher = {Stauffenburg Verlag Brigitte Narr}, keywords = {}, pubstate = {published}, tppubtype = {incollection} } |
2015 |
Inproceedings |
An autonomous industrial robot for loading and unloading goods Inproceedings 2015 International Conference on Informatics, Electronics & Vision (ICIEV), pp. 1-6, IEEE, 2015. |
Technical Reports |
Computational Modelling and Prediction of Gaze Estimation Error for Head-mounted Eye Trackers Technical Report DFKI , 2015. |
ISMAR 2015 Tutorial on Intelligent User Interfaces Technical Report DFKI , 2015. |
2014 |
Incollections |
Intelligent Semantic Mediation, Knowledge Acquisition and User Interaction Incollection Grallert, Hans-Joachim; Weiss, Stefan; Friedrich, Hermann; Widenka, Thomas; Wahlster, Wolfgang (Ed.): Towards the Internet of Services: The THESEUS Research Program, pp. 179-189, Springer, 2014. |
Inproceedings |
Proceedings of the 2014 International Conference on Pervasive Computing and Communications Workshops, pp. 320-325, IEEE, 2014. |
LabelMovie: a Semi-supervised Machine Annotation Tool with Quality Assurance and Crowd-sourcing Options for Videos Inproceedings Proceedings of the 12th International Workshop on Content-Based Multimedia Indexing, IEEE, 2014. |
Using Eye-gaze and Visualization To Augment Memory: A Framework for Improving Context Recognition and Recall Inproceedings Proceedings of the 16th International Conference on Human-Computer Interaction, pp. 282-291, LNCS Springer, 2014. |
A Mixed Reality Head-Mounted Text Translation System Using Eye Gaze Input Inproceedings Proceedings of the 2014 international conference on Intelligent user interfaces, pp. 329-334, ACM, 2014. |
A natural interface for multi-focal plane head mounted displays using 3D gaze Inproceedings Proceedings of the 2014 International Working Conference on Advanced Visual Interfaces, pp. 25-32, ACM, 2014. |
2013 |
Incollections |
Marcus, Aaron (Ed.): Design, User Experience, and Usability. User Experience in Novel Technological Environments, pp. 401-410, Springer, 2013. |
Inproceedings |
Gaze-based Online Face Learning and Recognition in Augmented Reality Inproceedings Proceedings of the IUI 2013 Workshop on Interactive Machine Learning, ACM, 2013. |
On-body IE: A Head-Mounted Multimodal Augmented Reality System for Learning and Recalling Faces Inproceedings 9th International Conference on Intelligent Environments, IEEE, 2013. |
Vision-Based Location-Awareness in Augmented Reality Applications Inproceedings 3rd Workshop on Location Awareness for Mixed and Dual Reality (LAMDa’13), ACM Press, 2013. |
Digital Pens as Smart Objects in Multimodal Medical Application Frameworks Inproceedings Proceedings of the Second Workshop on Interacting with Smart Objects, in conjunction with IUI’13, ACM Press, 2013. |
Multimodal Interaction Strategies in a Multi-Device Environment using Natural Speech Inproceedings Proceedings of the Companion Publication of the 2013 International Conference on Intelligent User Interfaces Companion, ACM Press, 2013. |
2012 |
Inproceedings |
RadSpeech, a mobile dialogue system for radiologists Inproceedings Proceedings of the International Conference on Intelligent User Interfaces (IUI), pp. 317-318, ACM, 2012. |
2011 |
Inproceedings |
Digital Pen in Mammography Patient Forms Inproceedings Proceedings of the 13th International Conference on Multimodal Interfaces, pp. 303-306, ACM, 2011. |
Monitoring and Explaining Reasoning Processes in a Dialogue System’s Input Interpretation Step Inproceedings Proceedings of the International Workshop on Explanation-aware Computing at IJCAI, IJCAI, 2011. |
Interactive Paper for Radiology Findings Inproceedings Proceedings of the 16th International Conference on Intelligent User Interfaces, pp. 459-460, ACM, 2011. |
2010 |
Books |
Ontologies and Adaptivity in Dialogue for Question Answering Book AKA and IOS Press, 2010. |
Book Chapters |
Pillars of Ontology Treatment in the Medical Domain Book Chapter Cases on Semantic Interoperability for Information Systems Integration: Practices and Applications, pp. 162-186, Information Science Reference, 2010. |
Inproceedings |
Prototyping Semantic Dialogue Systems for Radiologists Inproceedings Proceedings of the 6th International Conference on Intelligent Environments, AAAI, 2010. |
A Multimodal Dialogue Mashup for Medical Image Semantics Inproceedings Proceedings of the International Conference on Intelligent User Interfaces, o.A., 2010. |
Linked Data Integration for Semantic Dialogue and Backend Access Inproceedings Proceedings of the AAAI Spring Symposium "Linked Data meets Artificial Intelligence", o.A., 2010. |
HTTP/REST-based Meta Web Services in Mobile Application Frameworks Inproceedings Proceedings of the 4th International Conference on Mobile Ubiquitous Computing, Systems, Services and Technologies, XPS, 2010. |
Applications of an Ontology Engineering Methodology Accessing Linked Data for Medical Image Retrieval Inproceedings Proceedings of the AAAI Spring Symposium "Linked Data meets Artificial Intelligence", Stanford University, 2010. |
Speech Grammars for Textual Entailment Patterns in Multimodal Question Answering Inproceedings Proceedings of the Seventh Conference on International Language Resources and Evaluation, European Language Resources Association (ELRA), 2010. |
A Discourse and Dialogue Infrastructure for Industrial Dissemination Inproceedings Lee, Gary Geunbae; Mariani, Joseph; Minker, Wolfgang; Nakamura, Satoshi (Ed.): Spoken Dialogue Systems for Ambient Environments - IWSDS 2010: Proceedings of the Second International Workshop on Spoken Dialogue Systems, pp. 132-143, Springer, 2010. |
Technical Reports |
THESEUS CTC-WP4 Usability Guidelines for Use Case Applications Technical Report Bundesministerium für Wirtschaft und Technologie , 2010. |
A Methodology for Emergent Software Technical Report German Research Center for AI (DFKI), Stuhlsatzenhausweg 3, 66123 Saarbrücken, 2010. |
2009 |
Journal Articles |
Macrodynamics of users' behavior in Information Retrieval Journal Article Computing Research Repository eprint Journal, ArXiv online , pp. 1-15, 2009. |
Pillars of Ontology Treatment in the Medical Domain Journal Article Journal of Cases on Information Technology, 11 , pp. 47-73, 2009. |
Inproceedings |
Unifying Semantic Annotation and Querying in Biomedical Image Repositories Inproceedings Proceedings of the International Conference on Knowledge Management and Information Sharing, INSTICC Press, 2009. |
A Multimodal Mobile B2B Dialogue Interface on the iPhone Inproceedings Proceedings of the 4th Workshop on Speech in Mobile and Pervasive Environments, o.A., 2009. |
New Business to Business Interaction: Shake your iPhone and speak to it. Inproceedings Proceedings of the 11th International Conference on Human-Computer Interaction with Mobile Devices and Services, ACM, 2009. |
Design and Implementation of Combined Mobile and Touchscreen-Based Multimodal Web 3.0 Interfaces Inproceedings Proceedings of the International Conference on Artificial Intelligence (ICAI), CSREA Press, 2009. |
Introspection and Adaptable Model Integration for Dialogue-based Question Answering Inproceedings Proceedings of the Twenty-first International Joint Conference on Artificial Intelligence (IJCAI), Online, 2009. |
Supporting a Rapid Dialogue Engineering Process Inproceedings Proceedings of the First International Workshop On Spoken Dialogue Systems Technology, o.A., 2009. |
On Intuitive Dialogue-based Communication and Instinctive Dialogue Initiative Inproceedings Instinctive Computing, International Workshop, online, 2009. |
2008 |
Inproceedings |
Semiotic-based Ontology Evaluation Tool S-OntoEval Inproceedings Proceedings of the Sixth International Conference on Language Resources and Evaluation, ELRA, 2008. |
2007 |
Journal Articles |
DOLCE ergo SUMO: On Foundational and Domain Models in SWIntO (SmartWeb Integrated Ontology) Journal Article Journal of Web Semantics: Science, Services and Agents on the World Wide Web, 5 , pp. 156-174, 2007. |
Book Chapters |
SmartWeb Handheld - Multimodal Interaction with Ontological Knowledge Bases and Semantic Web Services (extended version) Book Chapter Huang, Thomas; Nijholt, Anton; Pantic, Maja; Pentland, Alex (Ed.): Artifical Intelligence for Human Computing, 4451 , pp. 272-295, Springer, 2007. |
Inproceedings |
A Constraint-Based Graph Visualisation Architecture for Mobile Semantic Web Interfaces Inproceedings Falcidieno, B; Spagnuolo, M; Avrithis, Y; Kompatsiaris, I; Buitelaar, Paul (Ed.): Semantic Multimedia. Proceedings of the 2nd International Conference on Semantic and Digital Media Technologies (SAMT-2007), December 5-7, Genoa, Italy, pp. 158-171, Springer, 2007. |
Semantic Graph Visualisation for Mobile Semantic Web Interfaces Inproceedings Hertzberg, J; Beetz, M; Englert, R (Ed.): Proceedings of the 30th Annual German Conference on Artificial Intelligence (KI 2007), September 10-13, Osnabrück, Germany, pp. 506-509, Springer, 2007. |
Context-Sensitive Multimodal Mobile Interfaces - Speech and Gesture Based Information Seeking Interaction with Navigation on Mobile Devices Inproceedings Proceedings of the 9th International Conference on Human Computer Interaction with Mobile Devices and Services (MobileHCI-07), September 9-12, Singapore, pp. 142-148, ACM Publications, 2007. |
Text Generation in the SmartWeb Multimodal Dialogue System Inproceedings Hertzberg, J; Beetz, M; Englert, R (Ed.): KI 2007: Advances in Artificial Intelligence. 30th Annual German Conference on Artificial Intelligence (KI 2007), September 10-13, Osnabrück, Germany, pp. 448-451, Springer, 2007. |
Interaction Design and Implementation for Multimodal Mobile Semantic Web Interfaces Inproceedings Smith, Michael J; Salvendy, Gavriel (Ed.): Human Interface and the Management of Information. Interacting in Information Environments (Part 2), Springer, 2007. |
Embedded Distributed Text Mining and Semantic Web Technology Inproceedings Proceedings of the Workshop: NATO Advanced Study Institute on Mining Massive Data Sets for Security, Advanced Study Institute, 2007. |
Technical Reports |
SmartWeb Handheld Interaction: General Interactions and Result Display for User-System Multimodal Dialogue Technical Report DFKI , 2007. |
2006 |
Incollections |
Multimodale Interaktion Incollection Cramer, Irene; im Walde, Sabine Schulte (Ed.): Studienbibliographie Computerlinguistik und Sprachtechnologie, ISBN 3-87276-867-0 , Stauffenburg Verlag Brigitte Narr, 2006. |