Learning a model of speaker head nods using gesture corpora
Source
International Conference on Autonomous Agents archive
Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Budapest, Hungary
SESSION: Virtual agents/agent-human interaction
Pages 289-296  
Year of Publication: 2009
ISBN:978-0-9817381-6-1
Authors
Jina Lee  University of Southern California, Marina del Rey, CA
Stacy Marsella  University of Southern California, Marina del Rey, CA
Sponsors
The Foundation for Intelligent Physical Agents
Microsoft Research
Wiley - Blackwell Ltd
Whitestein Technologies
European Office of Aerospace Research and Development, Air Force Office of Scientific Research, United States Air Force Research Laboratory
Drexel University

ABSTRACT

During face-to-face conversation, the speaker's head is continually in motion. These movements serve a variety of important communicative functions. Our goal is to develop a model of the speaker's head movements that can be used to generate head movements for virtual agents based on gesture annotation corpora. In this paper, we focus on the first step of the head movement generation process: predicting when the speaker should use head nods. We describe our machine-learning approach, which creates a head nod model from annotated corpora of face-to-face human interaction, relying on the linguistic features of the surface text. We also describe in detail the feature selection process, the training process, and the evaluation of the learned model with test data. The results show that the model is able to predict head nods with high precision and recall.
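
As an illustration of the evaluation measures named in the abstract, the following is a minimal sketch of how precision and recall could be computed for nod predictions. It assumes one binary label per word (1 = nod, 0 = no nod); the abstract does not state the unit of evaluation, and the function name and sample labels below are hypothetical, not taken from the paper.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for binary nod labels (1 = nod, 0 = no nod)."""
    if len(predicted) != len(actual):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p == 1 and a == 1)
    false_pos = sum(1 for p, a in pairs if p == 1 and a == 0)
    false_neg = sum(1 for p, a in pairs if p == 0 and a == 1)
    # Guard against division by zero when there are no predicted or no actual nods.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Hypothetical per-word labels for one utterance (illustrative data only).
actual = [0, 1, 1, 0, 0, 1, 0, 1]
predicted = [0, 1, 0, 0, 1, 1, 0, 1]
precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision is the fraction of predicted nods that match annotated nods, and recall is the fraction of annotated nods that the model recovers; reporting both shows whether a model over-predicts or under-predicts nods.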


