Compression-based document length prior for language models
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Boston, MA, USA
POSTER SESSION: Posters
Pages 652-653  
Year of Publication: 2009
ISBN: 978-1-60558-483-6
Authors
Javier Parapar  University of A Coruña, A Coruña, Spain
David E. Losada  University of Santiago de Compostela, Santiago de Compostela, Spain
Álvaro Barreiro  University of A Coruña, A Coruña, Spain
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1571941.1572061

ABSTRACT

The inclusion of document length factors has been a major topic in the development of retrieval models. We believe that current models can be further improved by more refined estimations of the document's scope. In this poster we present a new document length prior that uses the size of the compressed document. This new prior is introduced in the context of Language Modeling with Dirichlet smoothing. The evaluation performed on several collections shows significant improvements in effectiveness.
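
The idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the prior, so everything specific here is an assumption made for illustration: the prior is taken to be proportional to the compressed size of the document, zlib stands in for the compressor, tokenization is whitespace splitting, and the function names and the value of mu are hypothetical. The sketch adds the log prior to a standard query-likelihood score with Dirichlet smoothing.

```python
import math
import zlib
from collections import Counter
from typing import Sequence


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed text (zlib is an assumed compressor)."""
    return len(zlib.compress(text.encode("utf-8")))


def log_length_prior(doc: str, collection: Sequence[str]) -> float:
    """Hypothetical prior: P(d) proportional to the compressed size of d,
    normalized over the collection so the priors sum to 1."""
    total = sum(compressed_size(d) for d in collection)
    return math.log(compressed_size(doc) / total)


def dirichlet_log_likelihood(query: str, doc: str,
                             collection: Sequence[str],
                             mu: float = 2000.0) -> float:
    """Query log-likelihood with Dirichlet smoothing:
    sum over query terms of log((tf(w, d) + mu * P(w|C)) / (|d| + mu))."""
    doc_tf = Counter(doc.split())
    doc_len = sum(doc_tf.values())
    coll_tf = Counter(w for d in collection for w in d.split())
    coll_len = sum(coll_tf.values())

    total = 0.0
    for w in query.split():
        p_wc = coll_tf[w] / coll_len
        if p_wc == 0.0:
            continue  # term unseen in the collection: skipped in this sketch
        total += math.log((doc_tf[w] + mu * p_wc) / (doc_len + mu))
    return total


def score(query: str, doc: str, collection: Sequence[str],
          mu: float = 2000.0) -> float:
    """Rank-equivalent log posterior: log P(d) + log P(q|d)."""
    return (log_length_prior(doc, collection)
            + dirichlet_log_likelihood(query, doc, collection, mu))


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "dogs chase cats in the park",
        "compression estimates information content",
    ]
    ranked = sorted(docs, key=lambda d: score("cat", d, docs), reverse=True)
    for d in ranked:
        print(d)
```

Because the prior depends on compressed size rather than raw length, a long but highly repetitive document receives less prior mass than a document of the same length with more varied content, which is one way to read "more refined estimations of the document's scope".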


