Beyond attention: the role of deictic gesture in intention recognition in multimodal conversational interfaces
Source
International Conference on Intelligent User Interfaces
Proceedings of the 13th International Conference on Intelligent User Interfaces
Gran Canaria, Spain
SESSION: Analyzing interfaces
Pages 237-246  
Year of Publication: 2008
ISBN:978-1-59593-987-6
Authors
Shaolin Qu  Michigan State University, East Lansing, MI
Joyce Y. Chai  Michigan State University, East Lansing, MI
Sponsors
SIGART: ACM Special Interest Group on Artificial Intelligence
AAAI: Association for the Advancement of Artificial Intelligence
SIGCHI: ACM Special Interest Group on Computer-Human Interaction
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1378773.1378805

ABSTRACT

In a multimodal conversational interface that supports speech and deictic gesture, deictic gestures on the graphical display have traditionally been used to identify user attention, for example through reference resolution. Since the context of the identified attention can constrain the associated intention, we hypothesize that deictic gestures can go beyond attention and contribute to intention recognition. Driven by this hypothesis, this paper systematically investigates the role of deictic gestures in intention recognition. We experiment with different model-based and instance-based methods for incorporating gestural information into intention recognition. We examine the effects of using gestural information in two processing stages: the speech recognition stage and the language understanding stage. Our empirical results show that using gestural information improves intention recognition. Performance improves further when gestures are incorporated in both the speech recognition and language understanding stages rather than in either stage alone.
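As a rough illustration of what an instance-based method with gestural information might look like, the sketch below runs a k-nearest-neighbour vote over features that combine spoken words with the semantic types of the objects a gesture selects. This is not the authors' method: the feature scheme, the overlap similarity, the intention labels, and the toy training data are all assumptions made up for this example.

```python
from collections import Counter


def features(words, gestured_types):
    """Bag of features: spoken words plus semantic types of gestured objects.

    The "w:" / "g:" prefixes are an assumption of this sketch; they keep
    word features and gesture features from colliding.
    """
    feats = Counter(f"w:{w}" for w in words)
    feats.update(f"g:{t}" for t in gestured_types)
    return feats


def overlap(a, b):
    """Similarity = size of the multiset intersection of two feature bags."""
    return sum((a & b).values())


def classify(query, training, k=3):
    """Majority vote among the k training instances most similar to the query."""
    ranked = sorted(training, key=lambda ex: overlap(query, ex[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labelled instances: (features, intention label).
TRAINING = [
    (features("how much is this".split(), ["house"]), "ask_price"),
    (features("what is the price".split(), ["house"]), "ask_price"),
    (features("how big is this".split(), ["room"]), "ask_size"),
    (features("what is the size".split(), ["room"]), "ask_size"),
]

if __name__ == "__main__":
    words = "what about this".split()
    # Same spoken words, different gestured object type.
    print(classify(features(words, ["house"]), TRAINING))
    print(classify(features(words, ["room"]), TRAINING))
```

In this toy setup the utterance "what about this" is ambiguous on its own, and the type of the gestured object is what tips the vote toward one intention or the other, which mirrors the abstract's idea that the context of the attended object can constrain the intention.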

