Dependence language model for information retrieval
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Sheffield, United Kingdom
SESSION: Language models
Pages: 170-177
Year of Publication: 2004
ISBN: 1-58113-881-4
Authors
Jianfeng Gao  Microsoft Research, Asia
Jian-Yun Nie  Université de Montréal
Guangyuan Wu  Tianjin University, China
Guihong Cao  Tianjin University, China
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1008992.1009024

ABSTRACT

This paper presents a new dependence language modeling approach to information retrieval. The approach extends the basic unigram language modeling approach by relaxing its term independence assumption. We integrate the linkage of a query as a hidden variable, which expresses the term dependencies within the query as an acyclic, planar, undirected graph. We then assume that a query is generated from a document in two stages: the linkage is generated first, and then each term is generated in turn, depending on other related terms according to the linkage. We also present a smoothing method for model parameter estimation and an approach to learning the linkage of a sentence in an unsupervised manner. The new approach is compared to the classical probabilistic retrieval model and to previously proposed language models, both with and without term dependencies. Results show that our model achieves substantial and significant improvements on TREC collections.
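
The two-stage generation described in the abstract can be illustrated with a small Python sketch. It is not the paper's implementation: the function names (`orient_linkage`, `log_p_query`), the choice of a head term, and the toy probability tables are all hypothetical, and the sketch assumes the linkage is a connected tree over the query terms and that every probability has already been smoothed so it is nonzero. Planarity is not checked.

```python
import math
from collections import defaultdict, deque


def orient_linkage(n_terms, edges, head=0):
    """Turn undirected linkage edges into (parent, child) pairs.

    Walks outward from the (assumed) head term, so each term is generated
    after the term it depends on. Raises ValueError if the edges do not
    form a tree that spans all query terms.
    """
    if len(edges) != n_terms - 1:
        raise ValueError("a tree over n terms needs exactly n - 1 edges")
    neighbours = defaultdict(set)
    for i, j in edges:
        neighbours[i].add(j)
        neighbours[j].add(i)
    parent_of = {head: None}
    queue = deque([head])
    ordered = []
    while queue:
        i = queue.popleft()
        for j in sorted(neighbours[i]):
            if j not in parent_of:
                parent_of[j] = i
                ordered.append((i, j))
                queue.append(j)
    if len(parent_of) != n_terms:
        raise ValueError("linkage is not a connected, acyclic graph")
    return ordered


def log_p_query(terms, edges, log_p_linkage, p_term, p_dep, head=0):
    """Log-probability of generating the query and its linkage from a document.

    Stage 1: the linkage is generated (log_p_linkage, supplied by the caller).
    Stage 2: the head term is drawn from the unigram model p_term, then every
    other term is drawn from p_dep(child, parent) following the linkage.
    """
    score = log_p_linkage + math.log(p_term(terms[head]))
    for parent, child in orient_linkage(len(terms), edges, head):
        score += math.log(p_dep(terms[child], terms[parent]))
    return score


# Toy, hand-picked numbers for one hypothetical document (not from the paper).
unigram = {"information": 0.02, "retrieval": 0.01, "model": 0.03}
dependent = {("retrieval", "information"): 0.4, ("model", "retrieval"): 0.2}

terms = ["information", "retrieval", "model"]
linkage = [(0, 1), (1, 2)]  # information - retrieval - model

score = log_p_query(
    terms,
    linkage,
    math.log(0.5),                       # assumed P(linkage | document)
    lambda t: unigram[t],                # assumed smoothed P(term | document)
    lambda c, p: dependent[(c, p)],      # assumed smoothed P(child | parent, document)
)
print(score)
```

Documents would then be ranked by this score for a given query. With an empty dependency contribution (every conditional probability equal to the unigram probability), the score reduces to the usual unigram query likelihood plus the linkage term, which is one way to see the approach as a relaxation of the independence assumption.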


