Comparison of word-based and syllable-based retrieval for Tibetan (poster session)
Source
Proceedings of the Fifth International Workshop on Information Retrieval with Asian Languages
Hong Kong, China
Pages: 197-198
Year of Publication: 2000
ISBN: 1-58113-300-6
Authors
Paul G. Hackett, College of Information Studies, University of Maryland, College Park, MD
Douglas W. Oard, College of Information Studies, University of Maryland, College Park, MD
ABSTRACT
Tibetan retrieval based on automatically segmented words is compared with the use of overlapping syllable n-grams using a known-item retrieval evaluation. The optimal span of fixed-length n-grams is found to be 2 syllables, and indexing words is found to be as effective as indexing syllable bigrams.
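The overlapping syllable n-grams mentioned in the abstract can be illustrated with a minimal sketch. It assumes syllables are delimited by the Tibetan tsheg (U+0F0B) and that the shad (U+0F0D) and whitespace also act as boundaries; the abstract does not describe the authors' tokenization, and the function names `split_syllables`, `syllable_ngrams`, and `index_terms` are hypothetical, chosen only for this example.

```python
TSHEG = "\u0f0b"  # Tibetan intersyllabic mark (tsheg)
SHAD = "\u0f0d"   # Tibetan clause-ending mark (shad)


def split_syllables(text):
    """Split text into syllables on tsheg, shad, and whitespace (an assumption)."""
    for mark in (TSHEG, SHAD):
        text = text.replace(mark, " ")
    return text.split()


def syllable_ngrams(syllables, n=2):
    """Return overlapping n-grams of syllables as tuples.

    A non-empty sequence shorter than n yields a single shorter tuple,
    so very short texts still produce one index term.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not syllables:
        return []
    if len(syllables) < n:
        return [tuple(syllables)]
    return [tuple(syllables[i:i + n]) for i in range(len(syllables) - n + 1)]


def index_terms(text, n=2):
    """Produce the n-gram index terms for one piece of text."""
    return syllable_ngrams(split_syllables(text), n)


# Three syllables joined by tsheg produce two overlapping bigrams.
sample = TSHEG.join(["བོད", "སྐད", "ཡིག"])
for term in index_terms(sample, n=2):
    print(TSHEG.join(term))
```

With `n=2` this yields the syllable bigrams that the abstract reports as the optimal fixed-length span; word-based indexing would instead replace `split_syllables` with the output of an automatic word segmenter, which is not sketched here.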
REFERENCES
1. ACIP, Asian Classics Input Project, Release 4, 1998.
2. Carbonell, J., Y. Yang, R. Frederking, R.D. Brown, Y. Geng, and D. Lee, Translingual Information Retrieval: A comparative evaluation. In International Joint Conference on Artificial Intelligence, 1997.
3. Garofolo, J., E.M. Voorhees, V.M. Stanford, and K. Sparck-Jones, TREC-6 1997 Spoken Document Retrieval Track Overview and Results. In Proceedings of the Sixth Text Retrieval Conference, Gaithersburg, 1998, pp. 83-91, http://trec.nist.gov
4. Hackett, P.G., Approaches to Tibetan Information Retrieval: Segmentation vs. n-grams. Master's Thesis. College of Library and Information Services, University of Maryland, College Park, 2000. http://www.glue.umd.edu/~oard
5. Miller, E., D. Shen, J. Liu, and C. Nicholas, Performance and Scalability of a Large-scale N-gram Based Information Retrieval System. Journal of Digital Information, January 2000.
6. Wilkinson, R., Chinese Document Retrieval at TREC-6. In Proceedings of the Sixth Text Retrieval Conference, Gaithersburg, 1998, pp. 25-30.
7. Wilson, Joe, Translating Buddhism from Tibetan. Ithaca: Snow Lion Publ., 1992.
CITED BY
Daqing He, Douglas W. Oard, Jianqiang Wang, Jun Luo, Dina Demner-Fushman, Kareem Darwish, Philip Resnik, Sanjeev Khudanpur, Michael Nossal, Michael Subotin, Anton Leuski, Making MIRACLEs: Interactive translingual search for Cebuano and Hindi, ACM Transactions on Asian Language Information Processing (TALIP), v.2 n.3, p.219-244, September 2003