ABSTRACT
Modern large retrieval environments tend to overwhelm their users with the sheer volume of their output. Since not all documents are of equal relevance to their users, highly relevant documents should be identified and ranked first for presentation. To develop IR techniques in this direction, it is necessary to develop evaluation approaches and methods that credit IR methods for their ability to retrieve highly relevant documents. This can be done by extending traditional evaluation methods, that is, recall and precision based on binary relevance judgments, to graded relevance judgments. Alternatively, novel measures based on graded relevance judgments may be developed. This article proposes several novel measures that compute the cumulative gain the user obtains by examining the retrieval result up to a given ranked position. The first one accumulates the relevance scores of retrieved documents along the ranked result list. The second one is similar but applies a discount factor to the relevance scores in order to devalue late-retrieved documents. The third one computes the relative-to-the-ideal performance of IR techniques, based on the cumulative gain they are able to yield. These novel measures are defined and discussed, and their use is demonstrated in a case study using TREC data: sample system run results for 20 queries in TREC-7. As a relevance base we used novel graded relevance judgments on a four-point scale. The test results indicate that the proposed measures credit IR methods for their ability to retrieve highly relevant documents and allow testing of statistical significance of effectiveness differences. The graphs based on the measures also provide insight into the performance of IR techniques and allow interpretation, for example, from the user point of view.
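The three measures described in the abstract can be illustrated with a minimal Python sketch. Several details here are assumptions, since the abstract does not specify them: the function names are hypothetical, the four-point scale is encoded as 0 to 3, the discount is taken to be logarithmic in the rank with a configurable base (ranks below the base are left undiscounted), and the ideal ranking is built by sorting a supplied list of judged gains in descending order.

```python
import math
from itertools import accumulate


def cumulated_gain(gains):
    """Running sum of graded relevance scores along the ranked list."""
    return list(accumulate(gains))


def discounted_cumulated_gain(gains, base=2):
    """Running sum in which a gain at rank i >= base is divided by log_base(i).

    The logarithmic discount is an assumption of this sketch; the abstract
    only says that late-retrieved documents are devalued.
    """
    total, out = 0.0, []
    for rank, gain in enumerate(gains, start=1):
        discount = math.log(rank, base) if rank >= base else 1.0
        total += gain / discount
        out.append(total)
    return out


def normalized(measure, gains, judged_gains):
    """Divide a gain vector, position by position, by that of an ideal ranking.

    The ideal ranking is assumed to be the judged gains sorted in descending
    order, truncated or zero-padded to the length of the result list.
    """
    ideal = sorted(judged_gains, reverse=True)[:len(gains)]
    ideal += [0] * (len(gains) - len(ideal))
    actual_vec = measure(gains)
    ideal_vec = measure(ideal)
    return [a / i if i > 0 else 0.0 for a, i in zip(actual_vec, ideal_vec)]


# Hypothetical ranked result: relevance scores of the documents at ranks 1..6.
run = [3, 2, 3, 0, 1, 2]
# Hypothetical relevance scores of all judged documents for the same query.
judged = [3, 3, 2, 2, 1, 0]

cg = cumulated_gain(run)
dcg = discounted_cumulated_gain(run)
ncg = normalized(cumulated_gain, run, judged)
ndcg = normalized(discounted_cumulated_gain, run, judged)
```

For the hypothetical run above, `cumulated_gain(run)` yields `[3, 5, 8, 8, 9, 11]`. Passing the measure as an argument to `normalized` lets the same relative-to-the-ideal step serve both the plain and the discounted variant.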