ABSTRACT
One of the research goals in the human-computer interaction community is to build believable Embodied Conversational Agents (ECAs), that is, agents able to communicate complex information with human-like expressiveness and naturalness. Since emotions play a crucial role in human communication and most of them are expressed through the face, making ECAs more believable requires giving them the ability to display emotional facial expressions. This paper presents a system based on Hidden Markov Models (HMMs) for the synthesis of emotional facial expressions during speech. The HMMs were trained on a set of emotion examples in which a professional actor uttered Italian nonsense words while acting out various emotional facial expressions at different intensities. The experimental results were evaluated by comparing the "synthetic examples" (generated by the system) with a reference "natural example" (one of the actor's examples) in three different ways. The evaluation shows that HMMs for emotional facial expression synthesis have some limitations but are suitable for making a synthetic Talking Head more expressive and realistic.