Exploring the similarity space
Source
ACM SIGIR Forum
Volume 32, Issue 1 (Spring 1998)
Pages: 18-34
Year of Publication: 1998
ISSN: 0163-5840
Bibliometrics
Downloads (6 Weeks): 11, Downloads (12 Months): 82, Citation Count: 58
ABSTRACT
Ranked queries are used to locate relevant documents in text databases. In a ranked query a list of terms is specified, then the documents that most closely match the query are returned, in decreasing order of similarity, as answers. Crucial to the efficacy of ranked querying is the use of a similarity heuristic, a mechanism that assigns a numeric score indicating how closely a document and the query match. In this note we explore and categorise a range of similarity heuristics described in the literature. We have implemented all of these measures in a structured way, and have carried out retrieval experiments with a substantial subset of these measures.

Our purpose with this work is threefold: first, in enumerating the various measures in an orthogonal framework we make it straightforward for other researchers to describe and discuss similarity measures; second, by experimenting with a wide range of the measures, we hope to observe which features yield good retrieval behaviour in a variety of retrieval environments; and third, by describing our results so far, to gather feedback on the issues we have uncovered. We demonstrate that it is surprisingly difficult to identify which techniques work best, and comment on the experimental methodology required to support any claims as to the superiority of one method over another.
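As a rough illustration of the ranked-query mechanism the abstract describes, the sketch below scores each document against a list of query terms and returns the matches in decreasing order of similarity. The scoring function is an assumption made for this example: a cosine measure over TF-IDF weights with a logarithmic term frequency and a smoothed inverse document frequency. It is one commonly used similarity heuristic, not necessarily any of the formulations examined in the paper, and the names `tokenize` and `rank` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def rank(query, documents):
    """Return (doc_index, score) pairs in decreasing order of similarity.

    The score is a cosine measure over TF-IDF weights, chosen only as an
    illustration; it is an assumption, not a formulation taken from the paper.
    """
    doc_terms = [Counter(tokenize(d)) for d in documents]
    n_docs = len(documents)

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for terms in doc_terms:
        doc_freq.update(terms.keys())

    def weight(term, freq):
        # Logarithmic term frequency times a smoothed inverse document frequency.
        return (1.0 + math.log(freq)) * math.log(1.0 + n_docs / doc_freq[term])

    # Query terms that occur in no document cannot contribute to any score.
    query_terms = Counter(t for t in tokenize(query) if t in doc_freq)
    query_weights = {t: weight(t, f) for t, f in query_terms.items()}
    query_norm = math.sqrt(sum(w * w for w in query_weights.values()))

    scores = []
    for index, terms in enumerate(doc_terms):
        doc_weights = {t: weight(t, f) for t, f in terms.items()}
        doc_norm = math.sqrt(sum(w * w for w in doc_weights.values()))
        dot = sum(w * doc_weights.get(t, 0.0) for t, w in query_weights.items())
        if dot > 0.0:
            scores.append((index, dot / (query_norm * doc_norm)))

    # Highest similarity first; ties broken by document order.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores
```

Calling `rank("ranked queries", docs)` on a small list of strings returns only the documents that share at least one term with the query, best match first. Swapping the body of `weight` or the normalisation step is enough to produce a different heuristic, which hints at why the space of such measures is large.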
CITED BY 58
William Hersh, Andrew Turpin, Susan Price, Benjamin Chan, Dale Kramer, Lynetta Sacherek, Daniel Olson, Do batch and user evaluations give the same results?, Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval, p.17-24, July 24-28, 2000, Athens, Greece
Weiguo Fan, Ming Luo, Li Wang, Wensi Xi, Edward A. Fox, Tuning before feedback: combining ranking discovery and blind feedback for robust retrieval, Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, July 25-29, 2004, Sheffield, United Kingdom
H. C. Wu, R. W. P. Luk, K. F. Wong, K. L. Kwok, W. J. Li, A retrospective study of probabilistic context-based retrieval, Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, August 15-19, 2005, Salvador, Brazil
Feng Shao, Lin Guo, Chavdar Botev, Anand Bhaskar, Muthiah Chettiar, Fan Yang, Jayavel Shanmugasundaram, Efficient keyword search over virtual XML views, Proceedings of the 33rd international conference on Very large data bases, September 23-27, 2007, Vienna, Austria
Andrei Z. Broder, Peter Ciccolo, Marcus Fontoura, Evgeniy Gabrilovich, Vanja Josifovski, Lance Riedel, Search advertising using web relevance feedback, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
Feng Shao, Lin Guo, Chavdar Botev, Anand Bhaskar, Muthiah Chettiar, Fan Yang, Jayavel Shanmugasundaram, Efficient keyword search over virtual XML views, The VLDB Journal — The International Journal on Very Large Data Bases, v.18 n.2, p.543-570, April 2009