ABSTRACT
With the explosive growth in mobile computing and communication over the past few years, it is possible to access almost any information from virtually anywhere. However, the efficiency and effectiveness of this interaction are severely limited by the inherent characteristics of mobile devices, including small screen size and the lack of a viable keyboard or mouse. This paper concerns the use of multimodal language processing techniques to enable interfaces that combine speech and gesture input and so overcome these limitations. Specifically, we focus on robust processing of pen gesture inputs in a local search application and demonstrate that edit-based techniques that have proven effective in spoken language processing can also be used to handle unexpected or erroneous gesture input. We also examine the use of a bottom-up gesture aggregation technique to improve the coverage of multimodal understanding.
INDEX TERMS
Primary Classification:
H. Information Systems
H.5 INFORMATION INTERFACES AND PRESENTATION (I.7)
H.5.2 User Interfaces (D.2.2, H.1.2, I.3.6)
Subjects: Input devices and strategies (e.g., mouse, touchscreen)
Additional Classification:
H. Information Systems
H.5 INFORMATION INTERFACES AND PRESENTATION (I.7)
H.5.2 User Interfaces (D.2.2, H.1.2, I.3.6)
Subjects: Natural language
I. Computing Methodologies
I.2 ARTIFICIAL INTELLIGENCE
I.2.7 Natural Language Processing
Subjects: Language parsing and understanding
General Terms: Algorithms, Experimentation, Human Factors
Keywords: finite-state methods, local search, mobile, multimodal interfaces, robustness, speech-gesture integration