Selecting indexing strings using adaptation
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Tampere, Finland
Poster session
Pages: 427 - 428  
Year of Publication: 2002
ISBN: 1-58113-561-0
Authors
Yoshiyuki Takeda, Toyohashi University of Technology, Toyohashi, Aichi, Japan
Kyoji Umemura, Toyohashi University of Technology, Toyohashi, Aichi, Japan
Sponsor
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/564376.564477

ABSTRACT

It is not easy to tokenize agglutinative languages like Japanese and Chinese into words. Many IR systems start with a dictionary-based morphology program like ChaSen [4]. Unfortunately, dictionaries cannot cover all possible words; unknown words such as proper nouns are important for IR. This paper proposes a statistical dictionary-free method for selecting index strings based on recent work on adaptive language modeling.
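
The abstract does not spell out the selection rule, so the following is only a rough sketch of the adaptation idea it builds on [1]: the chance that a string appears again in a document, given that it has appeared once, can be estimated as df2/df1, where df1 is the number of documents containing the string at least once and df2 the number containing it at least twice. The function names, the use of fixed-length character n-grams as candidate strings, and the toy data are assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count (possibly overlapping) character n-grams in one document."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def adaptation_scores(docs, n):
    """Map each n-gram to (df1, df2, df2 / df1) over a list of strings.

    df1: documents containing the n-gram at least once.
    df2: documents containing the n-gram at least twice.
    """
    df1 = Counter()
    df2 = Counter()
    for doc in docs:
        for gram, count in ngram_counts(doc, n).items():
            df1[gram] += 1
            if count >= 2:
                df2[gram] += 1
    # Every key in df1 has df1 >= 1, so the division is safe.
    return {g: (df1[g], df2[g], df2[g] / df1[g]) for g in df1}


if __name__ == "__main__":
    toy_docs = ["abcabc", "abxx", "zzab"]  # hypothetical data
    for gram, (d1, d2, ratio) in sorted(adaptation_scores(toy_docs, 2).items()):
        print(gram, d1, d2, round(ratio, 3))
```

A string with a high df2/df1 ratio tends to repeat once it occurs, which is typical of content-bearing terms such as proper nouns; how such a score is turned into an index-string selection is left open here.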


REFERENCES


 
[1] Kenneth W. Church (2000), Empirical Estimates of Adaptation: The chance of Two Noriegas is closer to p/2 than p^2, Coling, pp. 173-179.
[2] NTCIR Project, http://research.nii.ac.jp/ntcir/
[3]
[4] Yuji Matsumoto, Akira Kitauchi, Tatsuo Yamashita, Yoshitaka Hirano, Osamu Imaichi, and Tomoaki Imamura (1997), Japanese Morphological Analysis System ChaSen Manual, NAIST Technical Report NAIST-IS-TR97007, http://cactus.aist-nara.ac.jp/lab/nlt/chasen.html
