A proximity language model for information retrieval
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Boston, MA, USA
SESSION: Retrieval models II
Pages 291-298  
Year of Publication: 2009
ISBN: 978-1-60558-483-6
Authors
Jinglei Zhao, iZENEsoft, Inc., Shanghai, China
Yeogirl Yun, Wisenut, Inc., Seoul, South Korea
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1571941.1571993

ABSTRACT

The proximity of query terms in a document is important information that enables ranking models to go beyond the "bag of words" assumption in information retrieval. This paper studies the integration of term proximity information into unigram language modeling. A new proximity language model (PLM) is proposed, which views the query terms' proximity centrality as the Dirichlet hyper-parameter that weights the parameters of the unigram document language model. Several proximity measures are developed for use in PLM, each of which computes a query term's proximity centrality in a specific document. In experiments, the proximity language model is compared with the basic language model and with previous work that combines proximity information with the language model using linear score combination. The experimental results show that the proposed model performs better in both top precision and average precision.
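
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the formulas, so the centrality measure (exponential decay over the minimum distance to each other query term), the function names, and the parameters `lam`, `mu`, and `base` are all assumptions made for illustration. The sketch treats each query term's proximity centrality as extra pseudo-counts added to a Dirichlet-smoothed unigram document model, so that documents in which the query terms occur close together receive a higher query likelihood.

```python
import math
from collections import Counter


def min_pair_distance(positions_a, positions_b):
    """Smallest absolute gap between any occurrences of two terms."""
    return min(abs(a - b) for a in positions_a for b in positions_b)


def proximity_centrality(doc_tokens, query_terms, base=1.5):
    """Hypothetical centrality measure (an assumption, not from the abstract).

    For each query term found in the document, sum base ** (-distance) over
    the other query terms found in the document, where distance is the
    minimum positional gap between the two terms.
    """
    positions = {}
    for i, tok in enumerate(doc_tokens):
        if tok in query_terms:
            positions.setdefault(tok, []).append(i)
    return {
        q: sum(
            base ** -min_pair_distance(positions[q], positions[other])
            for other in positions
            if other != q
        )
        for q in positions
    }


def plm_log_score(doc_tokens, query_terms, collection_prob,
                  mu=2000.0, lam=5.0, base=1.5):
    """Query log-likelihood with proximity-weighted Dirichlet smoothing.

    Assumed form: P(w|d) = (c(w,d) + lam*prox(w) + mu*P(w|C))
                           / (|d| + lam*sum(prox) + mu).
    With lam = 0 this reduces to ordinary Dirichlet smoothing.
    """
    counts = Counter(doc_tokens)
    prox = proximity_centrality(doc_tokens, set(query_terms), base)
    denom = len(doc_tokens) + lam * sum(prox.values()) + mu
    score = 0.0
    for q in query_terms:
        num = counts[q] + lam * prox.get(q, 0.0) + mu * collection_prob[q]
        score += math.log(num / denom)
    return score


# Two toy documents with identical term counts but different term spacing.
doc_near = "a b x x x x".split()
doc_far = "a x x x x b".split()
coll = {"a": 0.01, "b": 0.01}  # made-up collection probabilities
near = plm_log_score(doc_near, ["a", "b"], coll)
far = plm_log_score(doc_far, ["a", "b"], coll)
```

In this toy setup the two documents are indistinguishable to a plain bag-of-words model, so any difference between `near` and `far` comes only from the proximity pseudo-counts. Setting `lam=0.0` removes that difference.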


