Audio Puzzler: piecing together time-stamped speech transcripts with a puzzle game
Source
International Multimedia Conference archive
Proceedings of the 16th ACM international conference on Multimedia
Vancouver, British Columbia, Canada
SESSION: Applications track short papers session 1
Pages 865-868  
Year of Publication: 2008
ISBN: 978-1-60558-303-7
Authors
Nicholas Diakopoulos  Georgia Institute of Technology, Atlanta, GA, USA
Kurt Luther  Georgia Institute of Technology, Atlanta, GA, USA
Irfan Essa  Georgia Institute of Technology, Atlanta, GA, USA
Sponsors
ACM: Association for Computing Machinery
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1459359.1459507

ABSTRACT

We have developed an audio-based casual puzzle game that produces a time-stamped transcription of spoken audio as a by-product of play. Our evaluation indicates that the game is both fun and challenging. Transcripts generated using the game are more accurate than those produced by a standard automatic transcription system, and the word time-stamps are within several hundred milliseconds of ground truth.



