Resolving query translation ambiguity using a decaying co-occurrence model and syntactic dependence relations
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Tampere, Finland
SESSION: Cross-language Information Retrieval
Pages: 183-190
Year of Publication: 2002
ISBN: 1-58113-561-0
Authors
Jianfeng Gao  Microsoft Research, Asia
Ming Zhou  Microsoft Research, Asia
Jian-Yun Nie  Université de Montréal
Hongzhao He  Tianjin University, China
Weijun Chen  Tianjin University, China
Sponsor
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 10,   Downloads (12 Months): 68,   Citation Count: 27
DOI: http://doi.acm.org/10.1145/564376.564409

ABSTRACT

Bilingual dictionaries are commonly used for query translation in cross-language information retrieval (CLIR). However, a dictionary usually lists several candidate translations for a query term, which raises the problem of translation selection. Several recent studies have suggested using term co-occurrences to make this selection. This paper presents two extensions that improve on these approaches. First, we extend the basic co-occurrence model with a decaying factor that decreases the mutual information as the distance between the terms increases. Second, we incorporate a triple translation model, which integrates syntactic dependence relations (represented as triples). Our evaluation of translation accuracy shows that translating triples as units is more precise than word-by-word translation. Our CLIR experiments show that adding the decaying factor substantially improves the basic co-occurrence model, and that the triple translation model brings further improvements.
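
The first extension can be illustrated with a minimal sketch. The abstract does not give the formula, so everything below is an assumption made for illustration: the exponential form of the decay, the parameter `alpha`, the use of positional gaps in the query as the distance, the exhaustive search over candidate combinations, and the toy mutual-information table. It is not the authors' implementation.

```python
import math
from itertools import product


def decayed_similarity(mi, distance, alpha=0.8):
    """Scale a mutual-information score by a decay in term distance.

    Assumed form: mi * exp(-alpha * (distance - 1)), so adjacent
    terms (distance 1) keep their full mutual information.
    """
    return mi * math.exp(-alpha * (distance - 1))


def select_translations(candidates, mi_table, alpha=0.8):
    """Pick one translation per query term.

    candidates: list (in query order) of lists of candidate translations.
    mi_table:   dict mapping frozenset({word_a, word_b}) to a
                mutual-information score; missing pairs count as 0.
    Returns the combination with the highest summed decayed similarity
    over all term pairs, together with that score.
    """
    best, best_score = None, float("-inf")
    for combo in product(*candidates):
        score = 0.0
        for i in range(len(combo)):
            for j in range(i + 1, len(combo)):
                mi = mi_table.get(frozenset((combo[i], combo[j])), 0.0)
                # Assumption: distance is the positional gap in the query.
                score += decayed_similarity(mi, j - i, alpha)
        if score > best_score:
            best, best_score = combo, score
    return best, best_score


# Hypothetical toy data: two query terms, two candidate translations each.
toy_candidates = [["bank", "shore"], ["account", "story"]]
toy_mi = {
    frozenset(("bank", "account")): 3.0,
    frozenset(("shore", "story")): 0.5,
}
best_combo, best_score = select_translations(toy_candidates, toy_mi)
```

With the toy table, the pair with the strongest co-occurrence wins. In longer queries, pairs of distant terms contribute less to the total than adjacent pairs with the same mutual information, which is the effect the decaying factor is meant to capture. Exhaustive search is used only for brevity; it grows exponentially with query length.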


