ABSTRACT
Typical pseudo-relevance feedback methods assume that the top-retrieved documents are relevant and use these pseudo-relevant documents to select expansion terms. The initial retrieval set can, however, contain a great deal of noise. In this paper, we present a cluster-based resampling method, based on the relevance model, that selects better pseudo-relevant documents. The main idea is to use document clusters to find the dominant documents in the initial retrieval set, and to feed those documents back repeatedly to emphasize the core topics of a query. Experimental results on large-scale web TREC collections show significant improvements over the relevance model. To justify the resampling approach, we examine the relevance density of the feedback documents. A higher relevance density results in greater retrieval accuracy, ultimately approaching that of true relevance feedback. The resampling approach yields a higher relevance density than the baseline relevance model on all collections, resulting in better retrieval accuracy in pseudo-relevance feedback. This result indicates that the proposed method is effective for pseudo-relevance feedback.
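As an illustration of the relevance density measure mentioned above, the following is a minimal sketch that assumes relevance density means the fraction of feedback documents that are judged relevant. The abstract does not give the formula, so this definition, the function name `relevance_density`, and the document identifiers are assumptions made for illustration only.

```python
from typing import Iterable, Set


def relevance_density(feedback_doc_ids: Iterable[str],
                      relevant_doc_ids: Set[str]) -> float:
    """Fraction of feedback documents that appear in the relevance judgments.

    Assumption: density = (# relevant feedback documents) / (# feedback documents).
    Duplicates in feedback_doc_ids are counted each time they occur, since a
    resampling step may feed the same document back more than once.
    """
    feedback = list(feedback_doc_ids)
    if not feedback:
        # No feedback documents: define the density as 0.0 to avoid dividing by zero.
        return 0.0
    hits = sum(1 for doc_id in feedback if doc_id in relevant_doc_ids)
    return hits / len(feedback)


# Hypothetical example: two of the four feedback documents are judged relevant.
density = relevance_density(["d1", "d2", "d3", "d4"], {"d1", "d3"})
```

Under this assumed definition, a density of 1.0 would correspond to true relevance feedback, where every feedback document is known to be relevant.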