ABSTRACT
We describe an implemented system that automatically generates and animates conversations between multiple human-like agents with appropriate, synchronized speech, intonation, facial expressions, and hand gestures. Conversation is created by a dialogue planner that produces both the text and the intonation of the utterances. The speaker/listener relationship, the text, and the intonation in turn drive the generators for facial expressions, lip motions, eye gaze, head motion, and arm gestures. Coordinated arm, wrist, and hand motions are invoked to create semantically meaningful gestures. Throughout, we use examples from an actual synthesized, fully animated conversation.
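The data flow described in the abstract (planner output of text plus intonation, combined with the speaker/listener relationship, driving several behavior generators) can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the `Utterance` record, the encoding of intonation as a set of accented word indices, the placeholder dialogue, and the rule that an accented word triggers a brow raise and a beat gesture are hypothetical stand-ins, not the system's actual planner, rules, or API.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# One scheduled behavior: (word index, agent, channel, action).
Event = Tuple[int, str, str, str]


@dataclass
class Utterance:
    """Hypothetical planner output: text plus a crude intonation marking."""
    speaker: str
    listener: str
    words: List[str]
    # Indices of words carrying a pitch accent (assumed encoding).
    accented: Set[int] = field(default_factory=set)


def plan_dialogue() -> List[Utterance]:
    """Stand-in for the dialogue planner; the dialogue is a placeholder."""
    return [
        Utterance("A", "B", ["can", "you", "help", "me"], {2}),
        Utterance("B", "A", ["yes", "of", "course"], {0}),
    ]


def schedule_behaviors(utt: Utterance) -> List[Event]:
    """Derive per-word events for each channel from one utterance.

    The rules here are illustrative placeholders only.
    """
    events: List[Event] = []
    for i, word in enumerate(utt.words):
        # Lip motion follows the text word by word.
        events.append((i, utt.speaker, "lips", word))
        if i in utt.accented:
            # Assumed rule: accents co-occur with a brow raise and a beat.
            events.append((i, utt.speaker, "brows", "raise"))
            events.append((i, utt.speaker, "gesture", "beat"))
    # The speaker/listener relationship drives the listener's gaze.
    events.append((0, utt.listener, "gaze", "look at " + utt.speaker))
    # Stable sort by word index keeps channels aligned with the speech.
    return sorted(events, key=lambda e: e[0])


if __name__ == "__main__":
    for utterance in plan_dialogue():
        for event in schedule_behaviors(utterance):
            print(event)
```

Keying every event to a word index is one simple way to keep the channels synchronized with speech; a real system would presumably schedule against phoneme or frame timings instead.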