Model-based and empirical evaluation of multimodal interactive error correction
Source Conference on Human Factors in Computing Systems
Proceedings of the SIGCHI conference on Human factors in computing systems: the CHI is the limit
Pittsburgh, Pennsylvania, United States
Pages: 584 - 591  
Year of Publication: 1999
ISBN:0-201-48559-1
Authors
Bernhard Suhm  Interactive Systems Laboratories, Carnegie Mellon University/Universität Karlsruhe
Brad Myers  Human Computer Interaction Institute, Carnegie Mellon University
Alex Waibel  Interactive Systems Laboratories, Carnegie Mellon University/Universität Karlsruhe
Sponsor
SIGCHI: ACM Special Interest Group on Computer-Human Interaction
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/302979.303165

ABSTRACT

Our research addresses the problem of error correction in speech user interfaces. Previous work hypothesized that switching modality could speed up interactive correction of recognition errors (so-called multimodal error correction). We present a user study that compares, on a dictation task, multimodal error correction with conventional interactive correction, such as speaking again, choosing from a list, and keyboard input. Results show that multimodal correction is faster than conventional correction without keyboard input, but slower than correction by typing for users with good typing skills. Furthermore, while users initially prefer speech, they learn to avoid ineffective correction modalities with experience. To extrapolate results from this user study, we developed a performance model of multimodal interaction that predicts input speed including time needed for error correction. We apply the model to estimate the impact of recognition technology improvements on correction speeds and the influence of recognition accuracy and correction method on the productivity of dictation systems. Our model is a first step towards formalizing multimodal (recognition-based) interaction.


