Bayesian extension to the language model for ad hoc information retrieval
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Toronto, Canada
SESSION: Retrieval models
Pages: 4 - 9
Year of Publication: 2003
ISBN: 1-58113-646-3
ABSTRACT
We propose a Bayesian extension to the ad-hoc Language Model. Many smoothed estimators used for the multinomial query model in ad-hoc Language Models (including Laplace and Bayes-smoothing) are approximations to the Bayesian predictive distribution. In this paper we derive the full predictive distribution in a form amenable to implementation by classical IR models, and then compare it to other currently used estimators. In our experiments the proposed model outperforms Bayes-smoothing, and its combination with linear interpolation smoothing outperforms all other estimators.
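To make the relationship between Bayes-smoothing and a predictive distribution concrete, here is a minimal sketch. It assumes the standard textbook setup, which may differ in detail from the form derived in the paper: a multinomial document model with a Dirichlet prior whose parameters are `mu * coll_prob[w]`. The function names, the toy data, and the parameter `mu` are illustrative assumptions. Bayes-smoothing plugs the posterior mean into the multinomial, while the predictive distribution (here the Dirichlet-multinomial compound) integrates over the posterior.

```python
import math
from collections import Counter


def bayes_smoothing_log_lik(query, doc, coll_prob, mu):
    """Log query likelihood using the Bayes-smoothed (posterior mean) estimate.

    Assumes coll_prob sums to 1 over the vocabulary and covers every query word.
    """
    tf = Counter(doc)
    n = len(doc)
    return sum(math.log((tf[w] + mu * coll_prob[w]) / (n + mu)) for w in query)


def predictive_log_lik(query, doc, coll_prob, mu):
    """Log probability of the query word sequence under the
    Dirichlet-multinomial predictive distribution (parameters integrated out)."""
    tf = Counter(doc)
    n = len(doc)
    qf = Counter(query)
    # Normalising term: Gamma(A) / Gamma(A + |q|), with A = n + mu.
    log_p = math.lgamma(n + mu) - math.lgamma(n + mu + len(query))
    for w, q in qf.items():
        a = tf[w] + mu * coll_prob[w]  # posterior Dirichlet parameter for w
        log_p += math.lgamma(a + q) - math.lgamma(a)
    return log_p


# Hypothetical toy data for illustration only.
doc = ["a", "a", "b"]
coll_prob = {"a": 0.5, "b": 0.25, "c": 0.25}
mu = 2.0

single = ["a"]
repeated = ["a", "a"]
print(bayes_smoothing_log_lik(single, doc, coll_prob, mu),
      predictive_log_lik(single, doc, coll_prob, mu))
print(bayes_smoothing_log_lik(repeated, doc, coll_prob, mu),
      predictive_log_lik(repeated, doc, coll_prob, mu))
```

Under these assumptions the two scores coincide for a one-word query and diverge once the query has more than one term, which is one way to read the claim that the smoothed estimators approximate the predictive distribution.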
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
[1]
[2] Stanley F. Chen and Joshua Goodman. An empirical study of smoothing techniques for language modeling. Technical Report TR-10-98, Center for Research in Computing Technology, Harvard University, August 1998.
[3] W. B. Croft, D. J. Harper, D. H. Kraft, and J. Zobel, editors. SIGIR 2001: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 2001.
[4] A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin. Bayesian Data Analysis. Chapman and Hall/CRC, 1995.
[5] D. Hiemstra and W. Kraaij. Twenty-One at TREC-7: ad-hoc and cross-language track. In Voorhees and Harman [11], pages 227-238. NIST Special Publication 500-242.
[6]
[7] David J. C. MacKay and Linda C. Bauman Peto. A hierarchical Dirichlet language model. Natural Language Engineering, 1(3):289-307, 1995.
[8] D. Miller, T. Leek, and R. Schwartz. BBN at TREC-7: using hidden Markov models for information retrieval. In Voorhees and Harman [11], pages 133-142. NIST Special Publication 500-242.
[9]
[10] S. E. Robertson and D. Hiemstra. Language models and probability of relevance. In Proceedings of the first Workshop on Language Modeling and Information Retrieval, pages 21-25, 2001.
[11] E. M. Voorhees and D. K. Harman, editors. The Seventh Text REtrieval Conference (TREC-7). Gaithersburg, MD: NIST, 1999. NIST Special Publication 500-242.
[12] Grace Wahba. Spline Models for Observational Data, volume 59. SIAM, 1992.
[13]
[14]
CITED BY 12
Chin-Yew Lin, Guihong Cao, Jianfeng Gao, Jian-Yun Nie, An information-theoretic approach to automatic evaluation of summaries, Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, p.463-470, June 04-09, 2006, New York, New York
Jianhan Zhu, Jun Wang, Ingemar J. Cox, Michael J. Taylor, Risky business: modeling and exploiting uncertainty in information retrieval, Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, July 19-23, 2009, Boston, MA, USA