Interface design strategies for computer-assisted speech transcription
Source
OZCHI; Vol. 287
Proceedings of the 20th Australasian Conference on Computer-Human Interaction: Designing for Habitus and Habitat
Cairns, Australia
SESSION: Speech & voice
Pages 203-210  
Year of Publication: 2008
ISBN:0-9803063-4-5
Authors
Saturnino Luz  Trinity College Dublin, Ireland
Masood Masoodian  The University of Waikato, New Zealand
Bill Rogers  The University of Waikato, New Zealand
Chris Deering  Trinity College Dublin, Ireland
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1517744.1517812

ABSTRACT

A set of user interface design techniques for computer-assisted speech transcription is presented and evaluated with respect to task performance and usability. These techniques include error-correction mechanisms that originated in dictation systems and audio editors, as well as new techniques developed by us that exploit specific characteristics of existing speech recognition technologies to facilitate transcription in settings that typically yield considerable recognition inaccuracy, such as when the speech to be transcribed was produced by different speakers. In particular, we describe a mechanism for dynamic propagation of user feedback, which progressively adapts the system to different speakers and lexical contexts. Results of usability and performance evaluation trials indicate that feedback propagation, menu-based correction coupled with keyboard interaction, and text-driven audio playback are positively perceived by users and result in improved transcript accuracy.


