ABSTRACT
This paper describes our experience with creating, indexing, and providing access to a very large archive of videotaped oral histories: 116,000 hours of digitized interviews in 32 languages from 52,000 survivors, liberators, rescuers, and witnesses of the Nazi Holocaust. It then identifies a set of critical research issues that must be addressed if we are to provide full and detailed access to collections of this size: issues in user requirement studies, automatic speech recognition, automatic classification, segmentation, summarization, retrieval, and user interfaces. The paper ends by inviting others to discuss the use of these materials in their own research.