ABSTRACT
The quality of translation resources is arguably the most important factor affecting the performance of a cross-language information retrieval system. While many investigations have explored the use of query expansion techniques to combat errors induced by translation, no study has yet examined the effectiveness of these techniques across resources of varying quality. This paper presents results using parallel corpora and bilingual wordlists that have been deliberately degraded prior to query translation. Across different languages, translingual resources, and degrees of resource degradation, pre-translation query expansion is tremendously effective. In several instances, pre-translation expansion results in better performance when no translations are available than when an uncompromised resource is used without pre-translation expansion. We also demonstrate that post-translation expansion using relevance feedback can confer modest performance gains. Measuring the efficacy of these techniques with resources of different quality suggests an explanation for the conflicting reports that have appeared in the literature.