|
ABSTRACT
Given the size of the web, the search engine industry has argued that engines should be evaluated by their ability to retrieve highly relevant pages rather than all possible relevant pages. To explore the role highly relevant documents play in retrieval system evaluation, assessors for the \mbox{TREC-9} web track used a three-point relevance scale and also selected best pages for each topic. The relative effectiveness of runs evaluated by different relevant document sets differed, confirming the hypothesis that different retrieval techniques work better for retrieving highly relevant documents. Yet evaluating by highly relevant documents can be unstable since there are relatively few highly relevant documents. TREC assessors frequently disagreed in their selection of the best page, and subsequent evaluation by best page across different assessors varied widely. The discounted cumulative gain measure introduced by J\"{a}rvelin and Kek\"{a}l\"{a}inen increases evaluation stability by incorporating all relevance judgments while still giving precedence to highly relevant documents.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Internet Archive. The Internet Archive: Building an 'Internet Library'. http://www.archive.org.
|
 |
2
|
|
| |
3
|
|
 |
4
|
|
| |
5
|
W.S. Cooper. Expected search length: A single measure of retrieval effectiveness based on the weak ordering action of retrieval systems. American Documentation, 19:30-41, January 1968.
|
| |
6
|
|
| |
7
|
David Hawking, Nick Craswell, and Paul Thistlewaite. Overview of TREC-7 very large collection track. In E.M. Voorhees and D.K. Harman, editors, Proceedings of the Seventh Text REtrieval Conference (TREC-7), pages 91-103, August 1999. NIST Special Publication 500-242. Electronic version available at http://trec.nist.gov/pubs.html.
|
| |
8
|
|
| |
9
|
David Hawking, Ellen Voorhees, Nick Craswell, and Peter Bailey. Overview of the TREC-8 web track. In E.M. Voorhees and D.K. Harman, editors, Proceedings of the Eighth Text REtrieval Conference (TREC-8), pages 131-150, 2000. NIST Special Publication 500-246. Electronic version available at http://trec.nist.gov/pubs.html.
|
 |
10
|
|
| |
11
|
Chris Sherman. Special report|the 5th annual search engines meeting. http://websearch.about.com/ internet/websearch/library/blsem.htm. Section "The Fireworks Fly".
|
| |
12
|
Alan Stuart. Kendall's tau. In Samuel Kotz and Norman L. Johnson, editors, Encyclopedia of Statistical Sciences, volume 4, pages 367-369. John Wiley & Sons, 1983.
|
| |
13
|
Bob Travis and Andrei Broder. The need behind the query: Web search vs classic information retrieval. http://www.infonortics.com/searchengines/sh01/ slides-01/sh01pro.html.
|
| |
14
|
|
| |
15
|
|
CITED BY 47
|
|
Steven M. Beitzel , Eric C. Jensen , Abdur Chowdhury , David Grossman , Ophir Frieder, Using manually-built web directories for automatic evaluation of known-item retrieval, Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval, July 28-August 01, 2003, Toronto, Canada
|
|
|
|
|
|
|
|
|
|
|
|
Steven M. Beitzel , Eric C. Jensen , Abdur Chowdhury , David Grossman, Using titles and category names from editor-driven taxonomies for automatic evaluation, Proceedings of the twelfth international conference on Information and knowledge management, November 03-08, 2003, New Orleans, LA, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Irina Matveeva , Chris Burges , Timo Burkard , Andy Laucius , Leon Wong, High accuracy retrieval with multiple nested ranker, Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, August 06-11, 2006, Seattle, Washington, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Thomaz Philippe C. Silva , Edleno Silva de Moura , João Marcos B. Cavalcanti , Altigran S. da Silva , Moisés Gomes de Carvalho , Marcos André Gonçalves, An evolutionary approach for combining different sources of evidence in search engines, Information Systems, v.34 n.2, p.276-289, April, 2009
|
|
|
|
|
|
Tanuja Bompada , Chi-Chao Chang , John Chen , Ravi Kumar , Rajesh Shenoy, On the robustness of relevance measures with incomplete judgments, Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, July 23-27, 2007, Amsterdam, The Netherlands
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Andrew Turpin , Falk Scholer , Kalvero Jarvelin , Mingfang Wu , J. Shane Culpepper, Including summaries in system evaluation, Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, July 19-23, 2009, Boston, MA, USA
|
|
|
Susan L. Price , Marianne Lykke Nielsen , Lois M. L. Delcambre , Peter Vedsted , Jeremy Steinhauer, Using semantic components to search for domain-specific documents: An evaluation from the system perspective and the user perspective, Information Systems, v.34 n.8, p.778-806, December, 2009
|
|