Role recognition in multiparty recordings using social affiliation networks and discrete distributions
Source
International Conference on Multimodal Interfaces
Proceedings of the 10th International Conference on Multimodal Interfaces
Chania, Crete, Greece
SESSION: Special session on social signal processing (oral session)
Pages: 29-36
Year of Publication: 2008
ISBN: 978-1-60558-198-9
Authors
Sarah Favre, Idiap Research Institute, Martigny, Switzerland
Hugues Salamin, Idiap Research Institute, Martigny, Switzerland
John Dines, Idiap Research Institute, Martigny, Switzerland
Alessandro Vinciarelli, Idiap Research Institute, Martigny, Switzerland
Sponsors
SIGCHI: ACM Special Interest Group on Computer-Human Interaction
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1452392.1452401

ABSTRACT

This paper presents an approach for the recognition of roles in multiparty recordings. The approach includes two major stages: extraction of Social Affiliation Networks (speaker diarization and representation of people in terms of their social interactions), and role recognition (application of discrete probability distributions to map people to roles). The experiments are performed over several corpora, including broadcast data and meeting recordings, for a total of roughly 90 hours of material. The results are satisfactory for the broadcast data (around 80 percent of the data time correctly labeled in terms of role), but still need improvement for the meeting recordings (around 45 percent of the data time correctly labeled). In both cases, the approach significantly outperforms chance.
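The two stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the details, so the sketch assumes that each recording is split into uniform time windows, that a person is represented by a binary tuple marking the windows in which they speak, and that roles are modeled with smoothed Bernoulli distributions. The function names, the role labels "anchor" and "guest", and the toy speaker turns are all hypothetical. The last function computes the measure quoted in the abstract, the fraction of data time labeled with the correct role.

```python
from collections import defaultdict
from math import log

def affiliation_vectors(turns, total_duration, n_windows):
    """Map each speaker to a binary tuple: 1 if they speak in a window, else 0.

    turns: list of (speaker, start_sec, end_sec), e.g. from speaker diarization.
    Uniform windows are an assumption of this sketch.
    """
    width = total_duration / n_windows
    vectors = defaultdict(lambda: [0] * n_windows)
    for speaker, start, end in turns:
        for w in range(n_windows):
            # The turn overlaps window w if the two intervals intersect.
            if start < (w + 1) * width and end > w * width:
                vectors[speaker][w] = 1
    return {s: tuple(v) for s, v in vectors.items()}

def fit_bernoulli(examples, n_windows):
    """Estimate, per role, a prior and one Bernoulli parameter per window.

    examples: list of (binary_tuple, role). Laplace smoothing avoids log(0).
    """
    by_role = defaultdict(list)
    for vec, role in examples:
        by_role[role].append(vec)
    model = {}
    for role, vecs in by_role.items():
        probs = [(sum(v[w] for v in vecs) + 1) / (len(vecs) + 2)
                 for w in range(n_windows)]
        model[role] = (len(vecs) / len(examples), probs)
    return model

def predict_role(model, vec):
    """Return the role with the highest log posterior for a binary tuple."""
    def score(role):
        prior, probs = model[role]
        return log(prior) + sum(log(p if x else 1 - p)
                                for x, p in zip(vec, probs))
    return max(model, key=score)

def time_weighted_accuracy(turns, predicted, truth):
    """Fraction of speaking time attributed to a correctly labeled speaker."""
    total = sum(end - start for _, start, end in turns)
    correct = sum(end - start for s, start, end in turns
                  if predicted[s] == truth[s])
    return correct / total

# Toy data (hypothetical): three speakers in a 100-second recording.
turns = [("A", 0, 30), ("B", 30, 40), ("A", 40, 70), ("C", 70, 100)]
vectors = affiliation_vectors(turns, total_duration=100, n_windows=4)

# Train on labeled tuples, then label every speaker in the recording.
training = [((1, 1, 1, 0), "anchor"), ((0, 1, 0, 0), "guest"),
            ((0, 0, 1, 1), "guest")]
model = fit_bernoulli(training, n_windows=4)
predicted = {s: predict_role(model, v) for s, v in vectors.items()}

# Suppose speaker C is in fact an anchor; C's 30 seconds are then mislabeled.
truth = {"A": "anchor", "B": "guest", "C": "anchor"}
accuracy = time_weighted_accuracy(turns, predicted, truth)
```

Weighting by speaking time rather than counting speakers means that mislabeling a person who talks for most of the recording costs more than mislabeling an occasional speaker, which is consistent with the "percent of the data time" figures reported in the abstract.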


