Minimal document set retrieval
Source
Conference on Information and Knowledge Management
Proceedings of the 14th ACM international conference on Information and knowledge management
Bremen, Germany
SESSION: Paper session IR-11 (information retrieval): novelty detection
Pages: 752 - 759
Year of Publication: 2005
ISBN:1-59593-140-6
Authors
Wei Dai, State University of New York at Buffalo, Buffalo, NY
Rohini Srihari, State University of New York at Buffalo, Buffalo, NY
ABSTRACT
This paper presents a novel formulation of, and approach to, the minimal document set retrieval problem. Minimal Document Set Retrieval (MDSR) is a promising information retrieval task in which each query topic is assumed to have different subtopics; the task is to retrieve and rank relevant document sets, each with maximum coverage but minimum redundancy of subtopics. For this task, we propose three document set retrieval and ranking algorithms: a Novelty Based method, a Cluster Based method, and a Subtopic Extraction Based method. To evaluate system performance, we design a new evaluation framework for document set ranking that evaluates both the relevance between a set and the query topic and the redundancy within each set. Finally, we compare the performance of the three algorithms on the TREC interactive track dataset. Experimental results show the effectiveness of our algorithms.
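To make the coverage and redundancy trade-off concrete, the following is a minimal sketch, not one of the paper's three algorithms. It assumes each document already carries a hypothetical set of subtopic labels (in practice these would have to be inferred), builds one document set greedily, and scores it with two illustrative measures. The function names and both formulas are assumptions made for this example, not the paper's evaluation framework.

```python
from typing import Dict, List, Set


def greedy_minimal_set(doc_subtopics: Dict[str, Set[str]]) -> List[str]:
    """Greedily add documents until none contributes an uncovered subtopic."""
    covered: Set[str] = set()
    chosen: List[str] = []
    remaining = dict(doc_subtopics)
    while remaining:
        # Document with the most not-yet-covered subtopics; ties go to the smallest id.
        best = max(sorted(remaining), key=lambda d: len(remaining[d] - covered))
        gain = remaining[best] - covered
        if not gain:
            break  # every remaining document is fully redundant
        chosen.append(best)
        covered |= gain
        del remaining[best]
    return chosen


def coverage(chosen: List[str], doc_subtopics: Dict[str, Set[str]],
             all_subtopics: Set[str]) -> float:
    """Fraction of the topic's subtopics covered by the chosen set (illustrative)."""
    covered = set().union(*(doc_subtopics[d] for d in chosen))
    return len(covered & all_subtopics) / len(all_subtopics)


def redundancy(chosen: List[str], doc_subtopics: Dict[str, Set[str]]) -> float:
    """Fraction of subtopic mentions that repeat a subtopic already in the set (illustrative)."""
    mentions = sum(len(doc_subtopics[d]) for d in chosen)
    distinct = len(set().union(*(doc_subtopics[d] for d in chosen)))
    return 0.0 if mentions == 0 else (mentions - distinct) / mentions


# Hypothetical toy data: four documents, four subtopics of one query topic.
docs = {"d1": {"a", "b"}, "d2": {"b", "c"}, "d3": {"a", "b", "c"}, "d4": {"d"}}
subtopics = {"a", "b", "c", "d"}

best_set = greedy_minimal_set(docs)   # ["d3", "d4"]
top3_list = ["d1", "d2", "d3"]        # stand-in for the head of a plain ranked list
print(best_set, coverage(best_set, docs, subtopics), redundancy(best_set, docs))
print(top3_list, coverage(top3_list, docs, subtopics), redundancy(top3_list, docs))
```

On this toy input the greedy set covers every subtopic with no repeated mentions, while the three-document list misses subtopic `d` and repeats several others. Greedy selection is only a heuristic here, and it says nothing about how the sets themselves would be ranked against one another.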
CITED BY 3
Tao Qin, Tie-Yan Liu, Xu-Dong Zhang, De-Sheng Wang, Wen-Ying Xiong, Hang Li. Learning to rank relational objects and its application to web search. Proceeding of the 17th international conference on World Wide Web, April 21-25, 2008, Beijing, China.