ABSTRACT
Our research addresses the problem of error correction in speech
user interfaces. Previous work hypothesized that switching modality
could speed up interactive correction of recognition errors
(so-called multimodal error correction). We present a user study
that compares, on a dictation task, multimodal error correction
with conventional interactive correction, such as speaking again,
choosing from a list, and keyboard input. Results show that
multimodal correction is faster than conventional correction
without keyboard input, but slower than correction by typing for
users with good typing skills. Furthermore, while users initially
prefer speech, they learn to avoid ineffective correction
modalities with experience. To extrapolate results from this user
study we developed a performance model of multimodal interaction
that predicts input speed including time needed for error
correction. We apply the model to estimate the impact of
recognition technology improvements on correction speeds and the
influence of recognition accuracy and correction method on the
productivity of dictation systems. Our model is a first step
towards formalizing multimodal (recognition-based) interaction.
CITED BY 16

Jennifer Mankoff, Scott E. Hudson, Gregory D. Abowd, Interaction techniques for ambiguity resolution in recognition-based interfaces, Proceedings of the 13th annual ACM symposium on User interface software and technology, p.11-20, November 06-08, 2000, San Diego, California, United States

Chris Schmandt, Kwan Hong Lee, Jang Kim, Mark Ackerman, Impromptu: managing networked audio applications for mobile users, Proceedings of the 2nd international conference on Mobile systems, applications, and services, June 06-09, 2004, Boston, MA, USA

Xiang Ao, Xugang Wang, Feng Tian, Guozhong Dai, Hongan Wang, Crossmodal error correction of continuous handwriting recognition by speech, Proceedings of the 12th international conference on Intelligent user interfaces, January 28-31, 2007, Honolulu, Hawaii, USA