A study of Poisson query generation model for information retrieval
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Amsterdam, The Netherlands
Session: Formal models
Pages: 319-326
Year of Publication: 2007
ISBN: 978-1-59593-597-7
Authors
Qiaozhu Mei  University of Illinois at Urbana-Champaign
Hui Fang  University of Illinois at Urbana-Champaign
ChengXiang Zhai  University of Illinois at Urbana-Champaign
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1277741.1277797

ABSTRACT

Many variants of language models have been proposed for information retrieval. Most existing models are based on the multinomial distribution and score documents by the query likelihood computed from a probabilistic query generation model. In this paper, we propose and study a new family of query generation models based on the Poisson distribution. We show that in their simplest forms, the new family of models and the existing multinomial models are equivalent; with different smoothing methods, however, the two families behave differently. We show that the Poisson model has several advantages, including naturally accommodating per-term smoothing and modeling an accurate background more efficiently. We present several variants of the new model corresponding to different smoothing methods and evaluate them on four representative TREC test collections. The results show that while the basic models perform comparably, the Poisson model with per-term smoothing can outperform the multinomial model. Performance can be further improved with two-stage smoothing.
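The equivalence claim for the simplest forms can be illustrated with a small sketch. It is not the authors' implementation, and it rests on assumptions not stated in the abstract: each vocabulary term's count in the query is treated as an independent Poisson draw whose mean is a per-term document rate times the query length, and the rates are additively smoothed relative frequencies that sum to one over a fixed vocabulary. The function names, the toy documents, and the value of `alpha` are all hypothetical. Under these assumptions the Poisson and multinomial log-likelihoods differ only by a term that depends on the query, so both rank documents identically.

```python
import math
from collections import Counter


def unit_rates(doc, vocab, alpha=0.5):
    """Additively smoothed relative frequencies; they sum to 1 over vocab.

    alpha is an illustrative choice (not from the paper) that keeps every
    rate positive so that the logarithms below are defined.
    """
    counts = Counter(doc)
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def multinomial_loglik(query, rates):
    """Log query likelihood under a multinomial model.

    The multinomial coefficient is dropped: it depends only on the query.
    """
    return sum(math.log(rates[w]) for w in query)


def poisson_loglik(query, rates, vocab):
    """Log query likelihood when each vocabulary term's query count is an
    independent Poisson draw with mean rates[w] * len(query).

    Assumes a non-empty query whose terms all appear in vocab.
    """
    q_counts = Counter(query)
    q_len = len(query)
    total = 0.0
    for w in vocab:
        mean = rates[w] * q_len
        k = q_counts[w]
        # log of exp(-mean) * mean**k / k!
        total += k * math.log(mean) - mean - math.lgamma(k + 1)
    return total


# Toy data, invented for illustration only.
docs = {
    "d1": "poisson model for query generation".split(),
    "d2": "multinomial language model for retrieval".split(),
    "d3": "smoothing methods for language model retrieval".split(),
}
query = "poisson query model".split()
vocab = sorted(set(query).union(*docs.values()))

for name, doc in docs.items():
    rates = unit_rates(doc, vocab)
    m = multinomial_loglik(query, rates)
    p = poisson_loglik(query, rates, vocab)
    # p - m is the same for every document, so both scores rank identically.
    print(name, round(m, 4), round(p, 4), round(p - m, 4))
```

The constant gap arises because the rates sum to one: the summed Poisson means collapse to the query length. If the rates were smoothed separately per term so that they no longer sum to one, that cancellation would not hold, which is one way to see how the two families can diverge under different smoothing.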



