Query term disambiguation for Web cross-language information retrieval using a search engine
Source: International Workshop on Information Retrieval with Asian Languages
Proceedings of the fifth international workshop on Information retrieval with Asian languages
Hong Kong, China
Pages: 25 - 32  
Year of Publication: 2000
ISBN: 1-58113-300-6
Authors
Akira Maeda  Graduate School of Information Science, Nara Institute of Science and Technology (NAIST), Japan
Fatiha Sadat  Graduate School of Information Science, Nara Institute of Science and Technology (NAIST), Japan
Masatoshi Yoshikawa  Graduate School of Information Science, Nara Institute of Science and Technology (NAIST), Japan
Shunsuke Uemura  Graduate School of Information Science, Nara Institute of Science and Technology (NAIST), Japan
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
SIGLINK: Hypertext, Hypermedia, and Web
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
ACM Hong Kong Chapter : ACM Hong Kong Chapter Executive Committee
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/355214.355218

ABSTRACT

With the worldwide growth of the Internet, research on Cross-Language Information Retrieval (CLIR) is receiving much attention. Existing CLIR approaches based on query translation require parallel or comparable corpora to disambiguate translated query terms. However, such language resources are not readily available. In this paper, we propose a disambiguation method for dictionary-based query translation that does not depend on such scarce language resources, yet achieves adequate retrieval effectiveness by using Web documents as a corpus and exploiting co-occurrence information between terms within that corpus. In our experiments, the method achieved 97% of the average precision obtained with manual translation.
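
The general idea of co-occurrence-based disambiguation can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which co-occurrence measure is used, so pointwise mutual information over document counts is assumed here, and the search engine is replaced by a small hand-made count table. The names `pmi_score`, `disambiguate`, `toy_hit_count` and all counts are hypothetical.

```python
from itertools import product
from math import log
from typing import Callable, Sequence, Tuple

# Hypothetical document counts standing in for search engine hit counts.
TOTAL_DOCS = 1000
TOY_COUNTS = {
    frozenset({"bank"}): 100,
    frozenset({"shore"}): 50,
    frozenset({"interest"}): 120,
    frozenset({"concern"}): 80,
    frozenset({"bank", "interest"}): 40,
    frozenset({"bank", "concern"}): 5,
    frozenset({"shore", "interest"}): 2,
    frozenset({"shore", "concern"}): 1,
}


def toy_hit_count(terms: Sequence[str]) -> int:
    """Number of toy documents containing all given terms (0 if unknown)."""
    return TOY_COUNTS.get(frozenset(terms), 0)


def pmi_score(pair_hits: int, hits_a: int, hits_b: int, total_docs: int) -> float:
    """Pointwise mutual information from document counts (assumed measure).

    Returns 0.0 when any count is zero, to avoid log(0) and division by zero.
    """
    if pair_hits == 0 or hits_a == 0 or hits_b == 0:
        return 0.0
    return log((pair_hits * total_docs) / (hits_a * hits_b))


def disambiguate(
    candidates: Sequence[Sequence[str]],
    hit_count: Callable[[Sequence[str]], int],
    total_docs: int,
) -> Tuple[str, ...]:
    """Pick one translation per query term, maximizing summed pairwise PMI.

    candidates[i] lists the dictionary translations of the i-th query term.
    """
    best_combo: Tuple[str, ...] = ()
    best_score = float("-inf")
    for combo in product(*candidates):
        score = 0.0
        for i in range(len(combo)):
            for j in range(i + 1, len(combo)):
                a, b = combo[i], combo[j]
                score += pmi_score(
                    hit_count((a, b)), hit_count((a,)), hit_count((b,)), total_docs
                )
        if score > best_score:
            best_combo, best_score = combo, score
    return best_combo


# Two source-language query terms, each with two dictionary translations.
chosen = disambiguate(
    [["bank", "shore"], ["interest", "concern"]], toy_hit_count, TOTAL_DOCS
)
print(chosen)
```

In a real setting, `toy_hit_count` would be replaced by a function that queries a search engine or a local Web index. The exhaustive search over all combinations grows exponentially with query length, so this form is practical only for short queries.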


