ABSTRACT
This work proposes and evaluates a probabilistic cross-lingual retrieval system. The system uses a generative model to estimate the probability that a document in one language is relevant, given a query in another language. An important component of the model is the set of translation probabilities from terms in documents to terms in a query. Our approach is evaluated when 1) the only resource is a manually generated bilingual word list, 2) the only resource is a parallel corpus, and 3) both resources are combined in a mixture model. The combined resources produce about 90% of monolingual performance in retrieving Chinese documents. For Spanish, the system achieves 85% of monolingual performance using only a pseudo-parallel Spanish-English corpus. Retrieval results are comparable with those of the structural query translation technique (Pirkola, 1998) when bilingual lexicons are used for query translation. When parallel texts are used in addition to conventional lexicons, our approach achieves better retrieval results than the structural query translation technique but requires more computation. It also produces slightly better results than using a machine translation (MT) system for cross-lingual information retrieval (CLIR), but the improvement over the MT system is not significant.
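The abstract names three ingredients: a generative model of the query given a document, translation probabilities from document terms to query terms, and a mixture of two translation resources. The Python sketch below shows one way these pieces could fit together. It is an illustration under stated assumptions, not the paper's exact formulation: the linear interpolation, the background smoothing, the weights `weight` and `alpha`, and all function and variable names are hypothetical.

```python
import math
from collections import Counter


def mix_translation(p_lexicon, p_corpus, weight=0.5):
    """Interpolate two tables of P(query_term | doc_term).

    Assumption: a simple linear mixture with a fixed weight; the abstract
    says only that the two resources are combined in a mixture model.
    """
    mixed = {}
    for doc_term in set(p_lexicon) | set(p_corpus):
        lex = p_lexicon.get(doc_term, {})
        cor = p_corpus.get(doc_term, {})
        mixed[doc_term] = {
            q: weight * lex.get(q, 0.0) + (1 - weight) * cor.get(q, 0.0)
            for q in set(lex) | set(cor)
        }
    return mixed


def log_query_likelihood(query, doc_terms, p_trans, p_background, alpha=0.3):
    """Log-probability of generating the query from a document.

    Assumption: each query term comes either from a background model of the
    query language (probability alpha) or from translating a term drawn
    from the document (probability 1 - alpha).
    """
    counts = Counter(doc_terms)
    total = sum(counts.values())
    score = 0.0
    for q in query:
        translated = 0.0
        if total > 0:
            translated = sum(
                (n / total) * p_trans.get(d, {}).get(q, 0.0)
                for d, n in counts.items()
            )
        # The small floor keeps the logarithm defined for unseen terms.
        background = p_background.get(q, 1e-9)
        score += math.log(alpha * background + (1 - alpha) * translated)
    return score


# Toy Spanish-English tables, invented for illustration only.
lexicon = {"banco": {"bank": 0.5, "bench": 0.5}}
corpus = {"banco": {"bank": 0.9, "bench": 0.1}}
table = mix_translation(lexicon, corpus)
background = {"bank": 0.01}
relevant = log_query_likelihood(["bank"], ["banco", "dinero"], table, background)
unrelated = log_query_likelihood(["bank"], ["perro", "gato"], table, background)
```

In this toy setup the document containing "banco" scores higher for the query "bank" than the unrelated document, because only the former can generate the query term through translation.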
REFERENCES
1. Allan, J., Callan, J., Feng, F-F., and Malin, D. 2000. "INQUERY at TREC8." In TREC8 Proceedings, NIST special publication, 2000.
7. Hull, D. 1997. "Using structured queries for disambiguation in cross-language information retrieval." In AAAI Symposium on Cross-Language Text and Speech Retrieval, 1997.
8. Klavans, J. and Hovy, E. 1999. "Multilingual (or Crosslingual) Information Retrieval." Chapter 2, Multilingual Information Management, current levels and future abilities. Editors, E. Hovy, N. Ide, R. Frederking, J. Mariani and A. Zampolli, April 1999.
10. Kwok, K.L. 2000. "TREC9 Cross-language, question-answering track experiments using PIRCS." In TREC9 Proceedings, published by NIST, 2000.
11. Lafferty, J. 1999. Personal communications.
12. Levow, G.A. and Oard, D. 1999. "Evaluating lexical coverage for cross-language information retrieval." In Workshop on Multilingual Information Processing and Asian Language Processing, Beijing, 1999.
14. Miller, D.R.H., Leek, T., and Schwartz, R.M. 1999. "A hidden Markov model information retrieval system." In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Berkeley, California, August 15-19, 1999, pages 214-221. doi:10.1145/312624.312680
18. Porter, M. 1980. "An algorithm for suffix stripping." Program 14(3), pages 130-137, 1980.
19. Rabiner, L. 1989. "A tutorial on hidden Markov models and selected applications in speech recognition." In Proceedings of the IEEE 77, pages 257-286, 1989.
22. Voorhees, E. and Harman, D. 1997. TREC-5 Proceedings. NIST special publication, 1997.
23. Voorhees, E. and Harman, D. 2000. TREC-9 Proceedings. To be published by NIST.
CITED BY 24
Pu-Jen Cheng, Jei-Wen Teng, Ruei-Cheng Chen, Jenq-Haur Wang, Wen-Hsiang Lu, Lee-Feng Chien, Translating unknown queries with web corpora for cross-language information retrieval, Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, July 25-29, 2004, Sheffield, United Kingdom
Xuanhui Wang, ChengXiang Zhai, Xiao Hu, Richard Sproat, Mining correlated bursty topic patterns from coordinated text streams, Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, August 12-15, 2007, San Jose, California, USA