Beyond bags of words: effectively modeling dependence and features in information retrieval
Source
ACM SIGIR Forum archive
Volume 42, Issue 1 (June 2008)
COLUMN: Dissertation abstracts
Page 77
Year of Publication: 2008
ISSN: 0163-5840
Author
Donald Metzler, University of Massachusetts, Amherst, MA
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1394251.1394271

ABSTRACT

Current state-of-the-art information retrieval models treat documents and queries as bags of words. There have been many attempts to go beyond this simple representation. Unfortunately, few have shown consistent improvements in retrieval effectiveness across a wide range of tasks and data sets. Here, we propose a new statistical model for information retrieval based on Markov random fields. The proposed model goes beyond the bag-of-words assumption by allowing dependencies between terms to be incorporated into the model. It also allows a variety of textual and non-textual features to be easily combined under the umbrella of a single model. Within this framework, we explore the theoretical issues involved, parameter estimation, feature selection, and query expansion. We give experimental results from a number of information retrieval tasks, such as ad hoc retrieval and web search.
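
To make the contrast with a bag-of-words model concrete, the following is a minimal sketch of a ranking function that combines single-term features with ordered adjacent-pair features in one weighted sum. It is not the dissertation's formulation: the function names, the weights `w_term` and `w_pair`, the smoothing constant `mu`, and the add-one background estimate are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bigrams(tokens):
    """Return the adjacent ordered term pairs of a token list."""
    return list(zip(tokens, tokens[1:]))


def smoothed_log_prob(count, length, coll_count, coll_length, mu=2.0):
    """Dirichlet-style smoothed log probability of one feature.

    mu and the add-one background estimate are illustrative assumptions.
    """
    background = (coll_count + 1.0) / (coll_length + 1.0)  # never zero
    return math.log((count + mu * background) / (length + mu))


def score(query, doc, collection, w_term=0.8, w_pair=0.2):
    """Weighted sum of term features and ordered-pair features (hypothetical weights)."""
    doc_terms, doc_pairs = Counter(doc), Counter(bigrams(doc))
    coll_tokens = [t for d in collection for t in d]
    coll_terms = Counter(coll_tokens)
    coll_pairs = Counter(p for d in collection for p in bigrams(d))
    n = len(coll_tokens)

    # Bag-of-words part: each query term is scored independently.
    term_part = sum(
        smoothed_log_prob(doc_terms[t], len(doc), coll_terms[t], n) for t in query
    )
    # Dependence part: adjacent query terms are scored as ordered pairs.
    pair_part = sum(
        smoothed_log_prob(doc_pairs[p], len(doc), coll_pairs[p], n)
        for p in bigrams(query)
    )
    return w_term * term_part + w_pair * pair_part


d1 = "markov random field model".split()
d2 = "random model field markov".split()
query = "markov random field".split()
collection = [d1, d2]
```

Both documents contain the same words, so a pure bag-of-words score (`w_pair=0`) cannot separate them. With a nonzero pair weight, `d1` ranks higher because it preserves the query's term order.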