| Search Vox: leveraging multimodal refinement and partial knowledge for mobile voice search |
Source
Symposium on User Interface Software and Technology
Proceedings of the 21st annual ACM symposium on User interface software and technology
Monterey, CA, USA
SESSION: Text and speech
Pages 141-150
Year of Publication: 2008
ISBN: 978-1-59593-975-3
Authors
Tim Paek, Microsoft Corporation, Redmond, WA, USA
Bo Thiesson, Microsoft Corporation, Redmond, WA, USA
Yun-Cheng Ju, Microsoft Corporation, Redmond, WA, USA
Bongshin Lee, Microsoft Corporation, Redmond, WA, USA
ABSTRACT
Internet usage on mobile devices continues to grow as users seek anytime, anywhere access to information. Because users frequently search for businesses, directory assistance has been the focus of many voice search applications utilizing speech as the primary input modality. Unfortunately, mobile settings often contain noise which degrades performance. As such, we present Search Vox, a mobile search interface that not only facilitates touch and text refinement whenever speech fails, but also allows users to assist the recognizer via text hints. Search Vox can also take advantage of any partial knowledge users may have about the business listing by letting them express their uncertainty in an intuitive way using verbal wildcards. In simulation experiments conducted on real voice search data, leveraging multimodal refinement resulted in a 28% relative reduction in error rate. Providing text hints along with the spoken utterance resulted in even greater relative reduction, with dramatic gains in recovery for each additional character.
INDEX TERMS
Primary Classification:
H. Information Systems
H.5 INFORMATION INTERFACES AND PRESENTATION (I.7)
H.5.2 User Interfaces (D.2.2, H.1.2, I.3.6)
Subjects: Graphical user interfaces (GUI)
Additional Classification:
H. Information Systems
H.5 INFORMATION INTERFACES AND PRESENTATION (I.7)
H.5.2 User Interfaces (D.2.2, H.1.2, I.3.6)
Subjects: Interaction styles (e.g., commands, menus, forms, direct manipulation); Input devices and strategies (e.g., mouse, touchscreen)
General Terms: Design, Human Factors, Performance
Keywords: mobile search, multimodal, speech recognition