ABSTRACT
This paper examines whether the Cranfield evaluation methodology is robust to gross violations of the completeness assumption (i.e., the assumption that all relevant documents within a test collection have been identified and are present in the collection). We show that current evaluation measures are not robust to substantially incomplete relevance judgments. A new measure is introduced that is both highly correlated with existing measures when complete judgments are available and more robust to incomplete judgment sets. This finding suggests that substantially larger or dynamic test collections built using current pooling practices should be viable laboratory tools, despite the fact that the relevance information will be incomplete and imperfect.
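The abstract does not define the new measure. As an illustration only, the sketch below shows one way a measure can be made insensitive to unjudged documents: a preference-based score in the style of bpref, which counts how often a judged nonrelevant document is ranked above a judged relevant one and skips anything without a judgment. The function name, the input format, and the min(R, N) normalisation are assumptions made for this example, not details taken from the text above.

```python
from typing import Dict, List


def bpref(ranking: List[str], judgments: Dict[str, bool]) -> float:
    """Preference-based score that ignores unjudged documents.

    ranking:   document ids in retrieved order.
    judgments: doc id -> True (judged relevant) or False (judged nonrelevant);
               ids absent from the dict are treated as unjudged.
    """
    num_rel = sum(1 for rel in judgments.values() if rel)
    num_nonrel = len(judgments) - num_rel
    if num_rel == 0:
        return 0.0

    # Assumed normalisation: cap the penalty at min(R, N).
    bound = min(num_rel, num_nonrel)
    nonrel_seen = 0  # judged nonrelevant documents seen so far in the ranking
    total = 0.0
    for doc in ranking:
        if doc not in judgments:
            continue  # unjudged: neither rewarded nor penalised
        if judgments[doc]:
            if bound > 0:
                total += 1.0 - min(nonrel_seen, bound) / bound
            else:
                total += 1.0  # no judged nonrelevant documents exist
        else:
            nonrel_seen += 1
    # Relevant documents that were never retrieved contribute zero.
    return total / num_rel


judgments = {"d1": True, "d2": False, "d4": True, "d5": False}
with_unjudged = bpref(["d1", "d2", "d3", "d4", "d5"], judgments)
without_unjudged = bpref(["d1", "d2", "d4", "d5"], judgments)
print(with_unjudged, without_unjudged)
```

In this toy run, `d3` has no judgment, so removing it from the ranking leaves the score unchanged. A measure that treats unjudged documents as nonrelevant, such as precision at a fixed cutoff, would not generally have that property.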
CITED BY 76
David Lillis , Fergus Toolan , Rem Collier , John Dunnion, ProbFuse: a probabilistic approach to data fusion, Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, August 06-11, 2006, Seattle, Washington, USA
Fazli Can , Seyit Kocberber , Erman Balcik , Cihan Kaynak , H. Cagdas Ocalan , Onur M. Vursavas, First large-scale information retrieval experiments on turkish texts, Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, August 06-11, 2006, Seattle, Washington, USA
Michael Taylor , Hugo Zaragoza , Nick Craswell , Stephen Robertson , Chris Burges, Optimisation methods for ranking functions with multiple parameters, Proceedings of the 15th ACM international conference on Information and knowledge management, November 06-11, 2006, Arlington, Virginia, USA
Charles L.A. Clarke , Maheedhar Kolla , Gordon V. Cormack , Olga Vechtomova , Azin Ashkan , Stefan Büttcher , Ian MacKinnon, Novelty and diversity in information retrieval evaluation, Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, July 20-24, 2008, Singapore, Singapore
Thomaz Philippe C. Silva , Edleno Silva de Moura , João Marcos B. Cavalcanti , Altigran S. da Silva , Moisés Gomes de Carvalho , Marcos André Gonçalves, An evolutionary approach for combining different sources of evidence in search engines, Information Systems, v.34 n.2, p.276-289, April, 2009
David Fernandes , Edleno S. de Moura , Berthier Ribeiro-Neto , Altigran S. da Silva , Marcos André Gonçalves, Computing block importance for searching on web sites, Proceedings of the sixteenth ACM conference on Conference on information and knowledge management, November 06-10, 2007, Lisbon, Portugal
Tanuja Bompada , Chi-Chao Chang , John Chen , Ravi Kumar , Rajesh Shenoy, On the robustness of relevance measures with incomplete judgments, Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, July 23-27, 2007, Amsterdam, The Netherlands
Aris Anagnostopoulos , Andrei Z. Broder , Evgeniy Gabrilovich , Vanja Josifovski , Lance Riedel, Just-in-time contextual advertising, Proceedings of the sixteenth ACM conference on Conference on information and knowledge management, November 06-10, 2007, Lisbon, Portugal
Thomas Mandl , Christa Womser-Hacker , Giorgio Di Nunzio , Nicola Ferro, How robust are multilingual information retrieval systems?, Proceedings of the 2008 ACM symposium on Applied computing, March 16-20, 2008, Fortaleza, Ceara, Brazil
Jie Tang , Jing Zhang , Limin Yao , Juanzi Li , Li Zhang , Zhong Su, ArnetMiner: extraction and mining of academic social networks, Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA
Andrei Broder , Massimiliano Ciaramita , Marcus Fontoura , Evgeniy Gabrilovich , Vanja Josifovski , Donald Metzler , Vanessa Murdock , Vassilis Plachouras, To swing or not to swing: learning when (not) to advertise, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
Peter Bailey , Nick Craswell , Ian Soboroff , Paul Thomas , Arjen P. de Vries , Emine Yilmaz, Relevance assessment: are judges exchangeable and does it matter, Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, July 20-24, 2008, Singapore, Singapore
Alberto H.F. Laender , Marcos André Gonçalves , Ricardo G. Cota , Anderson A. Ferreira , Rodrygo L.T. Santos , Allan J.C. Silva, Keeping a digital library clean: new solutions to old problems, Proceeding of the eighth ACM symposium on Document engineering, September 16-19, 2008, Sao Paulo, Brazil
Tom Yeh , Boris Katz, Searching documentation using text, OCR, and image, Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, July 19-23, 2009, Boston, MA, USA
Andrew Turpin , Falk Scholer , Kalervo Järvelin , Mingfang Wu , J. Shane Culpepper, Including summaries in system evaluation, Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, July 19-23, 2009, Boston, MA, USA
Susan L. Price , Marianne Lykke Nielsen , Lois M. L. Delcambre , Peter Vedsted , Jeremy Steinhauer, Using semantic components to search for domain-specific documents: An evaluation from the system perspective and the user perspective, Information Systems, v.34 n.8, p.778-806, December, 2009