Multimodal interaction in an augmented reality scenario
Source: International Conference on Multimodal Interfaces
Proceedings of the 6th International Conference on Multimodal Interfaces
State College, PA, USA
SESSION: Architecture
Pages: 53-60
Year of Publication: 2004
ISBN: 1-58113-995-0
Authors
Gunther Heidemann  Bielefeld University, Bielefeld, Germany
Ingo Bax  Bielefeld University, Bielefeld, Germany
Holger Bekel  Bielefeld University, Bielefeld, Germany
Sponsors
SIGCHI: ACM Special Interest Group on Computer-Human Interaction
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1027933.1027944

ABSTRACT

We describe an augmented reality system designed for online acquisition of visual knowledge and retrieval of memorized objects. The system relies on a head-mounted camera and display, which allow the user to view the environment together with augmentations overlaid by the system. In this setup, communication by hand gestures and speech is mandatory, as common input devices such as mouse and keyboard are not available. Using gesture and speech, three types of tasks must be handled: (i) communication with the system about the environment, in particular directing attention towards objects and commanding the memorization of sample views; (ii) control of system operation, e.g., switching between display modes; and (iii) re-adaptation of the interface itself in case communication becomes unreliable due to changes in external factors, such as illumination conditions. We present an architecture to manage these tasks, and we describe and evaluate several of its key elements, including modules for pointing gesture recognition, menu control based on gesture and speech, and control strategies for situations in which vision becomes unreliable and has to be re-adapted by speech.



