ABSTRACT
When talking about spatial domains, humans frequently accompany their explanations with iconic gestures to depict what they are referring to. For example, when giving directions, it is common to see people making gestures that indicate the shape of buildings or outline a route to be taken by the listener, and these gestures are essential to the understanding of the directions. Based on results from an ongoing study on language and gesture in direction-giving, we propose a framework to analyze such gestural images into semantic units (image description features), and to link these units to morphological features (hand shape, trajectory, etc.). This feature-based framework allows us to generate novel iconic gestures for embodied conversational agents, without drawing on a lexicon of canned gestures. We present an integrated microplanner that derives the form of both coordinated natural language and iconic gesture directly from given communicative goals, and whose output serves as input to the speech and gesture realization engine in our NUMACK project.