Beyond independent relevance: methods and evaluation metrics for subtopic retrieval
Full text: PDF (202 KB)
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Toronto, Canada
SESSION: Retrieval models
Pages: 10 - 17  
Year of Publication: 2003
ISBN:1-58113-646-3
Authors
Cheng Xiang Zhai  University of Illinois at Urbana-Champaign, Urbana, IL
William W. Cohen  Carnegie Mellon University, Pittsburgh, PA
John Lafferty  Carnegie Mellon University, Pittsburgh, PA
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 17,   Downloads (12 Months): 136,   Citation Count: 38
DOI: http://doi.acm.org/10.1145/860435.860440

ABSTRACT

We present a non-traditional retrieval problem we call subtopic retrieval. The subtopic retrieval problem is concerned with finding documents that cover many different subtopics of a query topic. In such a problem, the utility of a document in a ranking depends on the other documents in the ranking, violating the assumption of independent relevance made by most traditional retrieval methods. Subtopic retrieval poses challenges for evaluating performance, as well as for developing effective algorithms. We propose a framework for evaluating subtopic retrieval which generalizes the traditional precision and recall metrics by accounting for intrinsic topic difficulty as well as redundancy in documents. We propose and systematically evaluate several methods for performing subtopic retrieval using statistical language models and a maximal marginal relevance (MMR) ranking strategy. A mixture model combined with query likelihood relevance ranking is shown to modestly outperform a baseline relevance ranking on a data set used in the TREC interactive track.
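
The two ideas in the abstract, greedy MMR-style re-ranking and a recall measure counted over subtopics rather than documents, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the paper builds its scores from statistical language models, whereas the sketch assumes precomputed relevance scores and a Jaccard term-overlap similarity. The function names (`subtopic_recall`, `mmr_rerank`, `jaccard`), the trade-off parameter `lam`, and the three toy documents are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Set


def subtopic_recall(ranking: Sequence[str],
                    doc_subtopics: Dict[str, Set[str]],
                    all_subtopics: Set[str],
                    k: int) -> float:
    """Fraction of the topic's subtopics covered by the top-k documents."""
    covered: Set[str] = set()
    for doc_id in ranking[:k]:
        covered |= doc_subtopics.get(doc_id, set()) & all_subtopics
    return len(covered) / len(all_subtopics)


def mmr_rerank(candidates: Sequence[str],
               relevance: Dict[str, float],
               similarity: Callable[[str, str], float],
               lam: float = 0.5) -> List[str]:
    """Greedy MMR: repeatedly pick the document that best trades off
    relevance against similarity to the documents already selected."""
    remaining = list(candidates)
    selected: List[str] = []
    while remaining:
        def score(doc_id: str) -> float:
            # Redundancy is the highest similarity to any selected document
            # (0.0 when nothing has been selected yet).
            redundancy = max((similarity(doc_id, s) for s in selected),
                             default=0.0)
            return lam * relevance[doc_id] - (1.0 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (assumed for illustration only): d1 and d2 are near-duplicates.
TERMS = {
    "d1": {"robot", "arm", "factory"},
    "d2": {"robot", "arm", "plant"},
    "d3": {"robot", "surgery", "hospital"},
}
RELEVANCE = {"d1": 0.9, "d2": 0.8, "d3": 0.6}
SUBTOPICS = {"d1": {"industrial"}, "d2": {"industrial"}, "d3": {"medical"}}
ALL_SUBTOPICS = {"industrial", "medical"}


def jaccard(a: str, b: str) -> float:
    """Term-set overlap, standing in for a model-based similarity."""
    return len(TERMS[a] & TERMS[b]) / len(TERMS[a] | TERMS[b])


baseline = sorted(RELEVANCE, key=RELEVANCE.get, reverse=True)
reranked = mmr_rerank(baseline, RELEVANCE, jaccard, lam=0.5)
baseline_recall = subtopic_recall(baseline, SUBTOPICS, ALL_SUBTOPICS, k=2)
reranked_recall = subtopic_recall(reranked, SUBTOPICS, ALL_SUBTOPICS, k=2)
```

On this toy data the relevance-only ranking places the two near-duplicate documents first and covers one of the two subtopics at rank 2, while the re-ranked list promotes `d3` and covers both. The example shows only the mechanism; it says nothing about the effect sizes reported in the paper.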


