ABSTRACT
Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies could radically transform the way recorded interviews are made accessible, but realizing this vision will demand substantial investment from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state of the art in key component technologies. It identifies a large number of important research issues and, from that set, proposes a coherent research agenda.
CITED BY
Michael G. Christel, Robert V. Baron, Geoff Froh, Dan Benson, Julieanna Richardson, Accessing the Densho and HistoryMakers oral history collections via Informedia technologies, Proceedings of the 9th ACM/IEEE-CS joint conference on Digital libraries, June 15-19, 2009, Austin, TX, USA