ABSTRACT
Data-driven approaches have been used successfully for realistic visual speech synthesis. However, little effort has been devoted to real-time lip-synching for interactive applications. In particular, algorithms based on a graph of motions are notorious for their exponential complexity. In this paper, we present a greedy graph search algorithm that yields far better run-time performance and allows real-time motion synthesis from a large database of motions. The time complexity of the algorithm is linear in the length of the input utterance. In our experiments, the synthesis time for an input sentence of average length is under a second.
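To illustrate why a greedy search can run in time linear in the utterance length, here is a minimal toy sketch. It is not the paper's algorithm: the database layout (`db`), the scalar "mouth opening" candidates, the function name `greedy_lip_sync`, and the absolute-difference transition cost are all assumptions made for illustration only.

```python
from typing import Dict, List, Sequence

# Hypothetical toy database: each phoneme maps to candidate mouth-opening
# values, standing in for recorded motion segments. Not data from the paper.
MotionDB = Dict[str, List[float]]


def greedy_lip_sync(phonemes: Sequence[str], db: MotionDB,
                    start: float = 0.0) -> List[float]:
    """Pick one candidate per phoneme, always taking the candidate closest
    to the previously chosen one (an assumed stand-in transition cost)."""
    path: List[float] = []
    previous = start
    for ph in phonemes:              # a single pass over the utterance
        candidates = db[ph]          # size depends on the database only
        best = min(candidates, key=lambda c: abs(c - previous))
        path.append(best)
        previous = best              # commit to the choice; no backtracking
    return path


db = {"m": [0.0, 0.1], "a": [0.6, 0.9], "p": [0.05, 0.2]}
result = greedy_lip_sync(["m", "a", "p", "a"], db)
print(result)
```

Because each phoneme is visited once and each choice is final, the work grows linearly with the number of phonemes for a fixed database. The trade-off is that a greedy choice may miss a globally cheaper path that an exhaustive graph search would find.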