Regularized estimation of mixture models for robust pseudo-relevance feedback
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Seattle, Washington, USA
SESSION: Relevance feedback
Pages: 162-169
Year of Publication: 2006
ISBN: 1-59593-369-7
Authors
Tao Tao  University of Illinois at Urbana-Champaign, IL
ChengXiang Zhai  University of Illinois at Urbana-Champaign, IL
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1148170.1148201

ABSTRACT

Pseudo-relevance feedback has proven to be an effective strategy for improving retrieval accuracy in all retrieval models. However, the performance of existing pseudo feedback methods is often significantly affected by parameters such as the number of feedback documents to use and the relative weight of the original query terms; these parameters generally have to be set by trial and error without any guidance. In this paper, we present a more robust method for pseudo feedback based on statistical language models. Our main idea is to integrate the original query with the feedback documents in a single probabilistic mixture model and to regularize the estimation of the language model parameters so that the information in the feedback documents is added to the original query gradually. Unlike most existing feedback methods, our new method has no parameter to tune. Experimental results on two representative data sets show that the new method is significantly more robust than a state-of-the-art baseline language modeling approach for feedback, with comparable or better retrieval accuracy.
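
The abstract does not give the update equations, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a two-component mixture (a topic model plus a fixed background model) fitted to the feedback documents by EM, with the query entering as pseudo-counts whose strength is lowered at each iteration so that feedback evidence is added gradually. The function name, the noise weight, the starting prior strength, the decay schedule, and the iteration count are all hypothetical illustration values; the paper itself states that its method has no parameter to tune.

```python
from collections import Counter


def estimate_feedback_model(query_terms, feedback_docs, background,
                            noise_weight=0.5, prior_start=100.0,
                            decay=0.9, iterations=30):
    """Illustrative EM for a topic/background mixture with a decaying query prior.

    All parameters and their defaults are assumptions made for this sketch.
    Assumes a non-empty query and a background model given as {word: prob}.
    """
    # Pool term counts over all feedback documents.
    counts = Counter()
    for doc in feedback_docs:
        counts.update(doc)

    # Maximum-likelihood language model of the original query.
    query_counts = Counter(query_terms)
    query_len = sum(query_counts.values())
    query_model = {w: c / query_len for w, c in query_counts.items()}

    vocab = sorted(set(counts) | set(query_model))
    # Uniform start, so every observed word can gain probability mass.
    theta = {w: 1.0 / len(vocab) for w in vocab}

    mu = prior_start  # strength of the query prior, in pseudo-counts
    for _ in range(iterations):
        # E-step: expected number of occurrences generated by the topic model.
        expected = {}
        for w in vocab:
            topic = (1.0 - noise_weight) * theta[w]
            noise = noise_weight * background.get(w, 1e-9)
            expected[w] = counts[w] * topic / (topic + noise)

        # M-step: MAP re-estimate with the query acting as pseudo-counts.
        total = sum(expected.values()) + mu
        theta = {w: (expected[w] + mu * query_model.get(w, 0.0)) / total
                 for w in vocab}

        # Relax the prior so the feedback documents count for more over time.
        mu *= decay
    return theta


if __name__ == "__main__":
    docs = [["the", "jaguar", "car", "the"],
            ["jaguar", "speed", "the", "car"]]
    bg = {"the": 0.5, "jaguar": 0.01, "car": 0.05, "speed": 0.05}
    model = estimate_feedback_model(["jaguar"], docs, bg)
    for word, prob in sorted(model.items(), key=lambda kv: -kv[1]):
        print(f"{word}\t{prob:.3f}")
```

In this toy setup the background component absorbs common words such as "the", while the decaying prior keeps the estimate anchored to the query in early iterations and lets feedback terms enter later.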


