Model-based feedback in the language modeling approach to information retrieval
Source: Conference on Information and Knowledge Management
Proceedings of the Tenth International Conference on Information and Knowledge Management
Atlanta, Georgia, USA
Session: Similarity Measures
Pages: 403 - 410  
Year of Publication: 2001
ISBN:1-58113-436-3
Authors
Chengxiang Zhai  Carnegie Mellon Univ., Pittsburgh, PA
John Lafferty  Carnegie Mellon Univ., Pittsburgh, PA
Sponsors
SIGMIS: ACM Special Interest Group on Management Information Systems
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/502585.502654

ABSTRACT

The language modeling approach to retrieval has been shown to perform well empirically. One advantage of this new approach is its statistical foundation. However, feedback, an important component of a retrieval system, has only been dealt with heuristically in this approach: the original query is usually expanded literally by adding terms to it. Such expansion-based feedback creates an inconsistent interpretation of the original and the expanded query. In this paper, we present a more principled approach to feedback in the language modeling approach. Specifically, we treat feedback as updating the query language model based on the extra evidence carried by the feedback documents. Such a model-based feedback strategy fits easily into an extension of the language modeling approach. We propose and evaluate two different approaches to updating a query language model based on feedback documents, one based on a generative probabilistic model of feedback documents and one based on minimization of the KL-divergence over feedback documents. Experimental results show that both approaches are effective and outperform the Rocchio feedback approach.
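
The generative variant of model-based feedback described above can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: feedback documents are treated as samples from a two-component mixture of a fixed collection (background) model and an unknown feedback topic model, the topic model is fit with EM, and the query model is then updated by linear interpolation. The function names, the mixing weight `lam`, the interpolation weight `alpha`, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def estimate_feedback_model(feedback_docs, collection_model, lam=0.5, iterations=30):
    """Fit a feedback topic model with EM (illustrative sketch).

    Assumption: each word occurrence in the feedback documents comes from the
    collection model with probability lam, otherwise from the topic model.
    Requires 0 <= lam < 1.
    """
    counts = Counter(w for doc in feedback_docs for w in doc)
    total = sum(counts.values())
    # Start from the maximum-likelihood estimate over the feedback documents.
    topic = {w: c / total for w, c in counts.items()}
    for _ in range(iterations):
        # E-step: posterior probability that an occurrence of w came from the topic model.
        posterior = {}
        for w in counts:
            from_topic = (1 - lam) * topic[w]
            from_background = lam * collection_model.get(w, 1e-9)
            posterior[w] = from_topic / (from_topic + from_background)
        # M-step: re-estimate the topic model from the expected topic counts.
        expected = {w: counts[w] * posterior[w] for w in counts}
        norm = sum(expected.values())
        topic = {w: v / norm for w, v in expected.items()}
    return topic


def update_query_model(query_model, feedback_model, alpha=0.5):
    """Interpolate the original query model with the feedback model."""
    vocab = set(query_model) | set(feedback_model)
    return {
        w: (1 - alpha) * query_model.get(w, 0.0) + alpha * feedback_model.get(w, 0.0)
        for w in vocab
    }


# Toy data, invented for this example only.
docs = [["language", "model", "the", "the"], ["model", "feedback", "the", "of"]]
collection = {"the": 0.5, "of": 0.3, "language": 0.05, "model": 0.05,
              "feedback": 0.05, "retrieval": 0.05}
query = {"language": 0.5, "model": 0.5}

feedback_model = estimate_feedback_model(docs, collection)
updated_query = update_query_model(query, feedback_model)
for word, prob in sorted(updated_query.items(), key=lambda kv: -kv[1]):
    print(f"{word}\t{prob:.3f}")
```

In this sketch, words that are common in the collection ("the", "of") are largely explained by the background component, so the fitted topic model shifts probability toward topical words. The updated query remains a probability distribution over words and not a longer list of terms, which is the distinction the abstract draws between model-based and expansion-based feedback.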


