Statistical cross-language information retrieval using n-best query translations
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Tampere, Finland
SESSION: Cross-language Information Retrieval
Pages: 167-174
Year of Publication: 2002
ISBN: 1-58113-561-0
Authors
Marcello Federico, ITC-irst Centro per la Ricerca Scientifica e Tecnologica, Trento, Italy
Nicola Bertoldi, ITC-irst Centro per la Ricerca Scientifica e Tecnologica, Trento, Italy
Sponsor
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/564376.564407

ABSTRACT

This paper presents a novel statistical model for cross-language information retrieval. Given a written query in the source language, documents in the target language are ranked by integrating probabilities computed by two statistical models: a query-translation model, which generates the most probable term-by-term translations of the query, and a query-document model, which evaluates the likelihood of each document given each translation. The two scores are integrated over the set of the N most probable translations of the query. Experimental results for N = 1, 5, and 10 are presented on the Italian-English bilingual track data used in the CLEF 2000 and 2001 evaluation campaigns.
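
The integration over N-best translations described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact form of either model, so the sketch assumes that each source term is translated independently from a toy probability table, and that the query-document model is a unigram language model smoothed by linear interpolation with collection statistics. All names (`n_best_translations`, `query_likelihood`, `rank_documents`), the weight `lam`, and the toy data are hypothetical.

```python
import heapq
import itertools
import math
from collections import Counter


def n_best_translations(query, table, n):
    """Return the n most probable term-by-term translations of `query`.

    Assumption of this sketch: terms are translated independently, so the
    probability of a full translation is the product of its term
    probabilities. Candidates are enumerated exhaustively, which is only
    practical for short toy queries.
    """
    options = [list(table[term].items()) for term in query]
    candidates = (
        (tuple(word for word, _ in combo), math.prod(p for _, p in combo))
        for combo in itertools.product(*options)
    )
    return heapq.nlargest(n, candidates, key=lambda c: c[1])


def query_likelihood(translation, doc_tokens, coll_counts, coll_size, lam=0.5):
    """Likelihood of a translated query given a document (assumed unigram model).

    Each word probability interpolates the document's relative frequency
    with the collection's relative frequency, weighted by `lam`.
    """
    doc_counts = Counter(doc_tokens)
    prob = 1.0
    for word in translation:
        p_doc = doc_counts[word] / len(doc_tokens)
        p_coll = coll_counts[word] / coll_size
        prob *= lam * p_doc + (1 - lam) * p_coll
    return prob


def rank_documents(query, table, docs, n):
    """Score each document by summing, over the n best translations,
    translation probability times query-document likelihood."""
    coll_counts = Counter(w for tokens in docs.values() for w in tokens)
    coll_size = sum(coll_counts.values())
    translations = n_best_translations(query, table, n)
    scores = {
        doc_id: sum(
            p_t * query_likelihood(t, tokens, coll_counts, coll_size)
            for t, p_t in translations
        )
        for doc_id, tokens in docs.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy Italian-to-English data, invented for illustration only.
table = {
    "pesca": {"fishing": 0.6, "peach": 0.4},
    "fiume": {"river": 0.9, "stream": 0.1},
}
docs = {
    "d1": ["river", "fishing", "licence", "rules"],
    "d2": ["peach", "orchard", "harvest", "season"],
}

for n in (1, 2):
    print(n, rank_documents(["pesca", "fiume"], table, docs, n))
```

With N=1 only the single best translation contributes to the score. Larger N lets lower-ranked translations contribute in proportion to their probability, which is the effect the experiments with N = 1, 5, and 10 compare.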


