Exploiting prosodic structuring of coverbal gesticulation
Source: International Conference on Multimodal Interfaces
Proceedings of the 6th International Conference on Multimodal Interfaces
State College, PA, USA
Session: Multimodal communication
Pages: 105 - 112  
Year of Publication: 2004
ISBN: 1-58113-995-0
Author
Sanshzar Kettebekov, Advanced Interfaces, Inc., State College, PA
Sponsors
SIGCHI: ACM Special Interest Group on Computer-Human Interaction
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1027933.1027953

ABSTRACT

Although gesture recognition has been studied extensively, the communicative, affective, and biometric "utility" of natural gesticulation remains relatively unexplored. One of the main reasons is the complexity of modeling spontaneous gestures. While lexical information in speech provides additional cues for disambiguating gestures, it does not cover the rich paralinguistic domain. This paper offers initial findings, drawn from a large corpus of natural monologues, on the prosodic structuring between frequent beat-like strokes and concurrent speech. Using a set of audio-visual features in an HMM-based formulation, we are able to improve the discrimination between visually similar gestures. These types of articulatory strokes represent different communicative functions. The analysis is based on the temporal alignment of detected vocal perturbations with the concurrent hand movement. As a supplementary result, we show that recognized articulatory strokes may be used for quantifying gesturing behavior.
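
The abstract does not give the details of the HMM-based formulation, so the following is only a minimal sketch of the general idea: one HMM per stroke type, with a new observation sequence assigned to the model that gives it the higher likelihood. Everything specific here is an assumption made for illustration and is not taken from the paper: the two class names, the two-symbol observation alphabet (0 = hand movement with no vocal perturbation detected nearby, 1 = hand movement temporally aligned with a detected vocal perturbation), and all probability values.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (scaled forward algorithm).

    start[s]     : probability of starting in state s
    trans[p][s]  : probability of moving from state p to state s
    emit[s][o]   : probability that state s emits symbol o
    """
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_like = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate the previous step's probabilities, then weight by emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Rescale to avoid underflow; the log of each scale sums to log P(obs).
        alpha = [a / scale for a in alpha]
        log_like += math.log(scale)
    return log_like


def classify(obs, models):
    """Pick the model name with the highest log-likelihood for obs."""
    scores = {
        name: forward_log_likelihood(obs, *params) for name, params in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical two-state models; the numbers are made up, not estimated from data.
SHARED_START = [0.6, 0.4]
SHARED_TRANS = [[0.7, 0.3], [0.4, 0.6]]
MODELS = {
    # Assumed class whose strokes tend to coincide with vocal perturbations.
    "stroke_type_a": (SHARED_START, SHARED_TRANS, [[0.2, 0.8], [0.3, 0.7]]),
    # Assumed class whose strokes tend not to coincide with them.
    "stroke_type_b": (SHARED_START, SHARED_TRANS, [[0.8, 0.2], [0.7, 0.3]]),
}

if __name__ == "__main__":
    label, scores = classify([1, 1, 0, 1, 1], MODELS)
    print(label, scores)
```

In this toy setup the two stroke models share their transition structure and differ only in how often a stroke coincides with a vocal perturbation, so the audio-derived symbol is what separates sequences that would otherwise look alike. A real system would use continuous audio-visual feature vectors and parameters trained on annotated data.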

