ABSTRACT
Recent work on spoken document retrieval (SDR) has made use of speech lattices. Lattices contain alternative transcription hypotheses beyond the 1-best transcript, and this information can improve retrieval accuracy by compensating for recognition errors in the 1-best transcription. In this paper, we look at using lattices for the query-by-example spoken document retrieval task: retrieving documents from a speech corpus, where the queries are themselves complete spoken documents (query exemplars). We extend a previously proposed method for SDR with short queries to the query-by-example task. Specifically, we use a retrieval method based on statistical modeling: we compute expected word counts from document and query lattices, estimate statistical models from these counts, and compute relevance scores as divergences between these models. Experimental results on a speech corpus of conversational English show that using statistics from lattices for both documents and query exemplars gives better retrieval accuracy than using only 1-best transcripts for documents, queries, or both. In addition, we investigate the effect of stop word removal, which further improves retrieval accuracy. To our knowledge, our work is the first to use a lattice-based approach to query-by-example spoken document retrieval.
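The pipeline described above (expected counts, statistical models, divergence-based scores) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: each lattice is approximated by a short list of weighted paths rather than a true lattice with a forward-backward pass, the models are unigram models smoothed by Jelinek-Mercer interpolation with weight `lam`, and the relevance score is the negative KL divergence from the query model to the document model. All function names are hypothetical.

```python
import math
from collections import Counter


def expected_counts(hypotheses):
    """Expected word counts from (posterior weight, word list) pairs.

    Assumption: a weighted list of paths stands in for a real lattice.
    """
    total = sum(p for p, _ in hypotheses)
    counts = Counter()
    for p, words in hypotheses:
        for w in words:
            counts[w] += p / total  # each path contributes its normalized posterior
    return counts


def background_model(all_counts):
    """Collection-wide unigram model, used only for smoothing."""
    total = Counter()
    for c in all_counts:
        total.update(c)
    n = sum(total.values())
    return {w: v / n for w, v in total.items()}


def unigram_model(counts, background, lam=0.5):
    """Jelinek-Mercer interpolation (an assumed smoothing choice)."""
    n = sum(counts.values())
    return {
        w: (1 - lam) * (counts.get(w, 0.0) / n if n else 0.0) + lam * pb
        for w, pb in background.items()
    }


def kl_divergence(p, q):
    """KL(p || q) over a shared vocabulary; q must be nonzero wherever p is."""
    return sum(pw * math.log(pw / q[w]) for w, pw in p.items() if pw > 0)


def relevance(query_hyps, doc_hyps, background, lam=0.5):
    """Higher is more relevant: negative divergence from query to document."""
    q = unigram_model(expected_counts(query_hyps), background, lam)
    d = unigram_model(expected_counts(doc_hyps), background, lam)
    return -kl_divergence(q, d)
```

In this sketch, a word that appears only in a lower-ranked hypothesis still receives a fractional count, which is how lattice statistics can compensate for an error in the 1-best transcript.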