Minimal document set retrieval
Source: Conference on Information and Knowledge Management
Proceedings of the 14th ACM International Conference on Information and Knowledge Management
Bremen, Germany
SESSION: Paper session IR-11 (information retrieval): novelty detection
Pages: 752-759
Year of Publication: 2005
ISBN:1-59593-140-6
Authors
Wei Dai  State University of New York at Buffalo, Buffalo, NY
Rohini Srihari  State University of New York at Buffalo, Buffalo, NY
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1099554.1099735

ABSTRACT

This paper presents a novel formulation of, and approach to, the minimal document set retrieval problem. Minimal Document Set Retrieval (MDSR) is an information retrieval task in which each query topic is assumed to have a set of distinct subtopics; the task is to retrieve and rank relevant document sets such that each set covers as many subtopics as possible with as little redundancy as possible. For this task, we propose three document set retrieval and ranking algorithms: a novelty-based method, a cluster-based method, and a subtopic-extraction-based method. To evaluate system performance, we design a new evaluation framework for document set ranking that measures both the relevance of each set to the query topic and the redundancy within each set. Finally, we compare the performance of the three algorithms on the TREC interactive track dataset. Experimental results show the effectiveness of our algorithms.
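
The coverage-versus-redundancy trade-off described in the abstract can be illustrated with a small sketch. This is not any of the three algorithms proposed in the paper; it is a generic greedy set-cover heuristic, and it assumes that the subtopics of each document are already known (a real system would have to estimate them). The function names `greedy_minimal_set`, `coverage`, and `redundancy`, the toy document ids, and the particular redundancy measure are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def greedy_minimal_set(doc_subtopics: Dict[str, Set[str]]) -> List[str]:
    """Greedily pick documents until no document adds an uncovered subtopic.

    Assumption: doc_subtopics maps a document id to its known subtopic labels.
    """
    covered: Set[str] = set()
    chosen: List[str] = []
    remaining = dict(doc_subtopics)
    while remaining:
        # Pick the document that contributes the most uncovered subtopics;
        # ties are broken by document id so the result is deterministic.
        best = min(remaining, key=lambda d: (-len(remaining[d] - covered), d))
        gain = remaining[best] - covered
        if not gain:
            break  # every remaining document would be purely redundant
        chosen.append(best)
        covered |= gain
        del remaining[best]
    return chosen


def coverage(chosen: List[str], doc_subtopics: Dict[str, Set[str]],
             all_subtopics: Set[str]) -> float:
    """Fraction of the topic's subtopics covered by the chosen set."""
    covered = set().union(*(doc_subtopics[d] for d in chosen))
    return len(covered & all_subtopics) / len(all_subtopics)


def redundancy(chosen: List[str], doc_subtopics: Dict[str, Set[str]]) -> float:
    """Fraction of subtopic mentions that repeat an already covered subtopic
    (an illustrative measure, not the paper's evaluation framework)."""
    mentions = sum(len(doc_subtopics[d]) for d in chosen)
    distinct = len(set().union(*(doc_subtopics[d] for d in chosen)))
    return 0.0 if mentions == 0 else 1 - distinct / mentions


# Toy data: four documents and the subtopics each one discusses.
docs = {
    "d1": {"a", "b"},
    "d2": {"b", "c"},
    "d3": {"a"},
    "d4": {"c", "d"},
}
all_subtopics = {"a", "b", "c", "d"}

selected = greedy_minimal_set(docs)
print(selected)                                  # ['d1', 'd4']
print(coverage(selected, docs, all_subtopics))   # 1.0
print(redundancy(selected, docs))                # 0.0
print(redundancy(["d1", "d2"], docs))            # 0.25
```

In the toy data, the set {d1, d4} covers all four subtopics without repeating any, whereas {d1, d2} mentions subtopic "b" twice, so one of its four subtopic mentions is redundant.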

