Phonetic confusion matrix based spoken document retrieval
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Athens, Greece
Pages: 81 - 87  
Year of Publication: 2000
ISBN:1-58113-226-3
Authors
Savitha Srinivasan  IBM Almaden Research Center, 650 Harry Road, San Jose, CA
Dragutin Petkovic  IBM Almaden Research Center, 650 Harry Road, San Jose, CA
Sponsors
Athens U of Econ & Business : Athens University of Economics and Business
Greek Com Soc : Greek Computer Society
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/345508.345552

ABSTRACT

Combined word-based and phonetic indexes have been used to improve the performance of spoken document retrieval systems, primarily by addressing the out-of-vocabulary retrieval problem. However, a known problem with phonetic recognition is its limited accuracy in comparison with word-level recognition. We propose a novel method for phonetic retrieval in the CueVideo system based on the probabilistic formulation of term weighting using phone confusion data in a Bayesian framework. We evaluate this method of spoken document retrieval against word-based retrieval for the search levels identified in a realistic video-based distributed learning setting. Using our test data, we achieved an average recall of 0.88 with an average precision of 0.69 for retrieval of out-of-vocabulary words on phonetic transcripts with 35% word error rate. For in-vocabulary words, we achieved a 17% improvement in recall over word-based retrieval with a 17% loss in precision for word error rates ranging from 35 to 65%.
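
The abstract does not give the weighting formula, so the following is only a rough sketch of how phone confusion data could feed a Bayesian match score. It is not the CueVideo implementation. Everything in it is an assumption made for illustration: the confusion probabilities in CONFUSION, the FLOOR value for unlisted confusions, the 40-phone inventory with a uniform background model, the prior of 0.01, and the restriction to equal-length windows (no insertions or deletions).

```python
import math

# Hypothetical confusion data: CONFUSION[spoken][heard] = P(heard | spoken).
# These numbers are invented for illustration, not taken from the paper.
CONFUSION = {
    "k": {"k": 0.80, "g": 0.15, "t": 0.05},
    "ae": {"ae": 0.70, "eh": 0.30},
    "t": {"t": 0.75, "d": 0.20, "k": 0.05},
}
FLOOR = 1e-4      # assumed probability for any confusion not listed above
NUM_PHONES = 40   # assumed phone inventory size for the background model


def log_likelihood(query, window):
    """log P(window | query), treating each phone confusion as independent."""
    return sum(
        math.log(CONFUSION.get(spoken, {}).get(heard, FLOOR))
        for spoken, heard in zip(query, window)
    )


def posterior(query, window, prior=0.01):
    """Bayes' rule: P(query was spoken | recognized window).

    The alternative hypothesis is a uniform background model over phones.
    """
    match = math.exp(log_likelihood(query, window)) * prior
    background = (1.0 / NUM_PHONES) ** len(window) * (1.0 - prior)
    return match / (match + background)


def best_match(query, transcript, prior=0.01):
    """Slide the query over the phone transcript; return (posterior, start)."""
    n = len(query)
    scored = [
        (posterior(query, transcript[i:i + n], prior), i)
        for i in range(len(transcript) - n + 1)
    ]
    return max(scored) if scored else (0.0, -1)


# The recognizer heard "g ae t" where "k ae t" may have been spoken.
query = ["k", "ae", "t"]
transcript = ["dh", "ax", "g", "ae", "t", "s", "ae", "t"]
score, start = best_match(query, transcript)
print(start, round(score, 3))
```

In this sketch a window that differs from the query only by a likely confusion (here "g" for "k") still receives a high posterior, while windows that require unlisted confusions fall back to FLOOR and score near zero. A real system would also need to handle phone insertions and deletions, for example with a dynamic-programming alignment.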


