A Markov random field model for term dependencies
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Salvador, Brazil
SESSION: Theory 3
Pages: 472-479
Year of Publication: 2005
ISBN:1-59593-034-5
Authors
Donald Metzler, University of Massachusetts, Amherst, MA
W. Bruce Croft, University of Massachusetts, Amherst, MA
Sponsor
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 32,   Downloads (12 Months): 242,   Citation Count: 50
DOI: http://doi.acm.org/10.1145/1076034.1076115

ABSTRACT

This paper develops a general, formal framework for modeling term dependencies via Markov random fields. The model allows for arbitrary text features to be incorporated as evidence. In particular, we make use of features based on occurrences of single terms, ordered phrases, and unordered phrases. We explore full independence, sequential dependence, and full dependence variants of the model. A novel approach is developed to train the model that directly maximizes the mean average precision rather than maximizing the likelihood of the training data. Ad hoc retrieval experiments are presented on several newswire and web collections, including the GOV2 collection used at the TREC 2004 Terabyte Track. The results show significant improvements are possible by modeling dependencies, especially on the larger web collections.
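
The abstract states that the model is trained to maximize mean average precision directly. As a reminder of what that metric measures, here is a minimal Python sketch of the standard definition of average precision and its mean over queries. It illustrates only the metric, not the paper's training procedure, which the abstract does not describe. The function names, document IDs, and query IDs are hypothetical, and the sketch assumes each ranking contains no duplicate documents.

```python
from typing import Dict, List, Set


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against a set of relevant doc IDs."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at the rank where this relevant document appears.
            precision_sum += hits / rank
    # Dividing by the total number of relevant documents means that
    # relevant documents never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Dict[str, List[str]], judgments: Dict[str, Set[str]]
) -> float:
    """Mean of per-query average precision over all queries in `rankings`."""
    if not rankings:
        return 0.0
    total = sum(
        average_precision(ranking, judgments.get(query_id, set()))
        for query_id, ranking in rankings.items()
    )
    return total / len(rankings)


# Hypothetical toy data: two queries with made-up document IDs.
rankings = {"q1": ["d1", "d2", "d3"], "q2": ["d4", "d5"]}
judgments = {"q1": {"d1", "d3"}, "q2": {"d5"}}
map_score = mean_average_precision(rankings, judgments)
```

For the toy data, query `q1` has average precision (1/1 + 2/3) / 2 and query `q2` has 1/2, so `map_score` is their mean, 2/3. Because the metric depends only on the order of the ranked documents, it is not a smooth function of model parameters, which is one reason a training method that targets it directly differs from likelihood maximization.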


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

9. D. Metzler. Direct maximization of rank-based metrics. Technical report, University of Massachusetts, Amherst, 2005.

11. D. Metzler, T. Strohman, H. Turtle, and W. B. Croft. Indri at terabyte track 2004. In Text REtrieval Conference (TREC 2004), 2004.

12. G. Mishne and M. de Rijke. Boosting web retrieval through query operations. In Proc. 27th European Conf. on Information Retrieval, pages 502-516, 2005.

13. W. Morgan, W. Greiff, and J. Henderson. Direct maximization of average precision by hill-climbing with a comparison to a maximum entropy approach. Technical report, MITRE, 2004.

18. S. Robertson. The probability ranking principle in IR. Journal of Documentation, 33(4):294-303, 1977.

22. B. Taskar, C. Guestrin, and D. Koller. Max-margin Markov networks. In Proc. of Advances in Neural Information Processing Systems (NIPS 2003), 2003.

24. C. J. van Rijsbergen. A theoretical basis for the use of cooccurrence data in information retrieval. Journal of Documentation, 33(2):106-119, 1977.

CITED BY  50
