A critical assessment of spoken utterance retrieval through approximate lattice representations
Source
International Multimedia Conference
Proceeding of the 1st ACM international conference on Multimedia information retrieval
Vancouver, British Columbia, Canada
SESSION: Audio retrieval
Pages 83-88
Year of Publication: 2008
ISBN: 978-1-60558-312-9
Authors
Siavash Kazemian, University of Toronto, Toronto, ON, Canada
Frank Rudzicz, University of Toronto, Toronto, ON, Canada
Gerald Penn, University of Toronto, Toronto, ON, Canada
Cosmin Munteanu, University of Toronto, Toronto, ON, Canada
ABSTRACT
This paper compares the performance of position-specific posterior lattices (PSPL) and confusion networks applied to spoken utterance retrieval, and tests these recent proposals against several baselines in two disparate domains. These lossy methods yield compact representations that generalize the original segment lattices and offer greater recall and robustness, but they have yet to be evaluated against each other under multiple word error rate (WER) conditions for spoken utterance retrieval. Our comparisons suggest that while PSPL and confusion networks have comparable recall, the former is slightly more precise, although its merit appears to be tied to the assumptions of low-frequency search queries and low-WER environments.