Comparing cross-language query expansion techniques by degrading translation resources
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Tampere, Finland
SESSION: Cross-language Information Retrieval
Pages: 159 - 166  
Year of Publication: 2002
ISBN: 1-58113-561-0
Authors
Paul McNamee  The Johns Hopkins University Applied Physics Laboratory, Laurel, MD
James Mayfield  The Johns Hopkins University Applied Physics Laboratory, Laurel, MD
Sponsor
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/564376.564406

ABSTRACT

The quality of translation resources is arguably the most important factor affecting the performance of a cross-language information retrieval system. While many investigations have explored the use of query expansion techniques to combat errors induced by translation, no study has yet examined the effectiveness of these techniques across resources of varying quality. This paper presents results using parallel corpora and bilingual wordlists that have been deliberately degraded prior to query translation. Across different languages, translingual resources, and degrees of resource degradation, pre-translation query expansion is tremendously effective. In several instances, pre-translation expansion results in better performance when no translations are available than when an uncompromised resource is used without pre-translation expansion. We also demonstrate that post-translation expansion using relevance feedback can confer modest performance gains. Measuring the efficacy of these techniques with resources of different quality suggests an explanation for the conflicting reports that have appeared in the literature.
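
The two ideas in the abstract, degrading a bilingual wordlist and expanding a query before translation, can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how resources were degraded, so random removal of entries is an assumption here, and the function names (`degrade_wordlist`, `expand_query`, `translate`), the term-overlap scoring, and the sample data are all hypothetical.

```python
import random
from collections import Counter


def degrade_wordlist(wordlist, keep_fraction, seed=0):
    """Keep a random fraction of the entries (assumed degradation scheme)."""
    rng = random.Random(seed)
    keys = sorted(wordlist)
    kept = rng.sample(keys, round(len(keys) * keep_fraction))
    return {k: wordlist[k] for k in kept}


def expand_query(query_terms, source_docs, top_docs=2, extra_terms=2):
    """Pre-translation expansion: add frequent terms from the
    top-ranked source-language documents (toy term-overlap ranking)."""
    query = set(query_terms)
    ranked = sorted(source_docs, key=lambda d: -len(query & set(d.split())))
    counts = Counter()
    for doc in ranked[:top_docs]:
        counts.update(t for t in doc.split() if t not in query)
    # Order by frequency, then alphabetically, so the result is deterministic.
    best = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:extra_terms]
    return list(query_terms) + [t for t, _ in best]


def translate(terms, wordlist):
    """Word-by-word translation; terms missing from the wordlist are dropped."""
    return [wordlist[t] for t in terms if t in wordlist]


# Hypothetical source-language collection and a wordlist that lacks "oil".
docs = ["oil price crude", "oil price market", "football match"]
wordlist = {"price": "prix", "market": "marché"}

plain = translate(["oil"], wordlist)
expanded = translate(expand_query(["oil"], docs), wordlist)
```

In this toy setup the unexpanded query translates to nothing because its only term is missing from the wordlist, while the expanded query still yields a usable target-language term. That is the intuition behind expansion helping most when the translation resource is weak.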


