A study of smoothing methods for language models applied to Ad Hoc information retrieval
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
New Orleans, Louisiana, United States
Pages: 334 - 342  
Year of Publication: 2001
ISBN:1-58113-331-6
Authors
Chengxiang Zhai  Carnegie Mellon Univ., Pittsburgh, PA
John Lafferty  Carnegie Mellon Univ., Pittsburgh, PA
Sponsor
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 27,   Downloads (12 Months): 175,   Citation Count: 175
DOI: http://doi.acm.org/10.1145/383952.384019

ABSTRACT

Language modeling approaches to information retrieval are attractive and promising because they connect the problem of retrieval with that of language model estimation, which has been studied extensively in other application areas such as speech recognition. The basic idea of these approaches is to estimate a language model for each document, and then rank documents by the likelihood of the query according to the estimated language model. A core problem in language model estimation is smoothing, which adjusts the maximum likelihood estimator so as to correct the inaccuracy due to data sparseness. In this paper, we study the problem of language model smoothing and its influence on retrieval performance. We examine the sensitivity of retrieval performance to the smoothing parameters and compare several popular smoothing methods on different test collections.
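The query-likelihood ranking described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a unigram model and uses linear interpolation with a collection model (in the style of Jelinek-Mercer smoothing) as one possible smoothing method. The function names, the parameter `lam`, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def query_log_likelihood(query, doc, collection_counts, collection_len, lam=0.5):
    """Return log p(query | doc) under a smoothed unigram document model.

    Assumed smoothing: p(w | doc) = (1 - lam) * p_ml(w | doc) + lam * p(w | collection).
    """
    doc_counts = Counter(doc)
    doc_len = len(doc)
    score = 0.0
    for w in query:
        # Maximum likelihood estimate from the document alone.
        p_ml = doc_counts[w] / doc_len if doc_len else 0.0
        # Background estimate from the whole collection.
        p_coll = collection_counts[w] / collection_len
        p = (1 - lam) * p_ml + lam * p_coll
        if p == 0.0:
            # A zero-probability query word makes the whole query impossible.
            return float("-inf")
        score += math.log(p)
    return score


def rank(query, docs, lam=0.5):
    """Rank document names by the smoothed query log-likelihood, best first."""
    collection_counts = Counter(w for d in docs.values() for w in d)
    collection_len = sum(collection_counts.values())
    scores = {
        name: query_log_likelihood(query, d, collection_counts, collection_len, lam)
        for name, d in docs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Toy collection, invented for illustration only.
    docs = {
        "d1": ["language", "model", "smoothing"],
        "d2": ["speech", "recognition", "model"],
        "d3": ["cooking", "recipes"],
    }
    print(rank(["language", "model"], docs, lam=0.5))
```

Setting `lam=0` in this sketch recovers the unsmoothed maximum likelihood estimator, under which any document missing a single query word receives a log-likelihood of negative infinity. That is the data-sparseness problem smoothing is meant to correct, and varying `lam` is one simple way to see how sensitive a ranking is to a smoothing parameter.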

