An information-theoretic measure for document similarity
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Toronto, Canada
POSTER SESSION: Posters
Pages: 449 - 450
Year of Publication: 2003
ISBN: 1-58113-646-3
ABSTRACT
Recent work has demonstrated that the assessment of pairwise object similarity can be approached in an axiomatic manner using information theory. We extend this concept specifically to document similarity and test the effectiveness of an information-theoretic measure for pairwise document similarity. We adapt query retrieval to rate the quality of document similarity measures and demonstrate that our proposed information-theoretic measure for document similarity yields statistically significant improvements over other popular measures of similarity.
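The abstract does not state the measure itself, so the following is only an illustrative sketch of how an information-theoretic pairwise similarity could be computed. It assumes a ratio of the information two documents share to the total information they carry, where a term's information is the negative log of the fraction of corpus documents containing it. The function names (`term_probs`, `doc_fraction`, `it_sim`) and the tokenized-list input format are hypothetical choices made for this example, not details taken from the paper.

```python
import math
from collections import Counter


def term_probs(tokens):
    """Relative frequency of each term within one document."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def doc_fraction(corpus):
    """Fraction of corpus documents that contain each term."""
    n = len(corpus)
    df = Counter(t for doc in corpus for t in set(doc))
    return {t: df[t] / n for t in df}


def it_sim(a, b, pi):
    """Assumed similarity: shared information over total information.

    a, b: token lists drawn from the corpus used to build pi.
    pi:   mapping from term to the fraction of documents containing it.
    """
    pa, pb = term_probs(a), term_probs(b)

    def info(t):
        # Rarer terms carry more information; a term found in
        # every document carries none.
        return -math.log(pi[t])

    shared = sum(min(pa[t], pb[t]) * info(t) for t in pa.keys() & pb.keys())
    total = (sum(p * info(t) for t, p in pa.items())
             + sum(p * info(t) for t, p in pb.items()))
    # Documents made only of zero-information terms have no basis
    # for comparison; return 0.0 rather than dividing by zero.
    return 2 * shared / total if total > 0 else 0.0


corpus = [["a", "b"], ["a", "c"], ["d", "e"]]
pi = doc_fraction(corpus)
print(it_sim(corpus[0], corpus[1], pi))
```

Under these assumptions the score is symmetric, equals 1 for a document compared with itself, and equals 0 for documents with no terms in common. Both documents must come from the corpus used to build `pi`, since an unseen term has no document fraction.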