A maximum entropy based approach for multimodal integration
Source
International Conference on Multimodal Interfaces
Proceedings of the 6th international conference on Multimodal interfaces
State College, PA, USA
DEMONSTRATION SESSION: Demo session 2
Pages: 337 - 338
Year of Publication: 2004
ISBN: 1-58113-995-0
ABSTRACT
Integrating the various user input channels of a multimodal interface is not just an engineering problem. To fully understand users in the context of an application and the current session, solutions are sought that process information from intentional (i.e. user-originated) sources as well as from passively available sources in a uniform manner. As a first step towards this goal, the work demonstrated here investigates how intentional user inputs (e.g. speech and gesture) can be seamlessly combined to provide a single semantic interpretation of the user input. For this classical multimodal integration problem, the Maximum Entropy approach is demonstrated with 76.52% integration accuracy for the top-ranked candidate and 86.77% accuracy within the 3-best candidates. The paper also presents the process that generates multimodal data for training the statistical integrator, using transcribed speech from MIT's Voyager application. The quality of the generated data is assessed by comparing it to real inputs to the multimodal version of Voyager.
INDEX TERMS
Primary Classification:
H. Information Systems
H.5 INFORMATION INTERFACES AND PRESENTATION (I.7)
H.5.2 User Interfaces (D.2.2, H.1.2, I.3.6)
Subjects: User interface management systems (UIMS)
Additional Classification:
H. Information Systems
H.5 INFORMATION INTERFACES AND PRESENTATION (I.7)
H.5.2 User Interfaces (D.2.2, H.1.2, I.3.6)
Subjects: Prototyping; Natural language
General Terms: Algorithms, Design, Experimentation
Keywords: machine learning, maximum entropy, multimodal database, multimodal integration, virtual modality