ABSTRACT
During face-to-face conversation, the speaker's head is continually in motion. These movements serve a variety of important communicative functions. Our goal is to develop a model of the speaker's head movements, based on gesture annotation corpora, that can be used to generate head movements for virtual agents. In this paper, we focus on the first step of the head movement generation process: predicting when the speaker should use head nods. We describe a machine-learning approach that creates a head nod model from annotated corpora of face-to-face human interaction, relying on the linguistic features of the surface text. We also describe in detail the feature selection process, the training process, and the evaluation of the learned model on test data. The results show that the model is able to predict head nods with high precision and recall.