Regularizing ad hoc retrieval scores
Source: Conference on Information and Knowledge Management
Proceedings of the 14th ACM International Conference on Information and Knowledge Management
Bremen, Germany
Session: Paper session IR-9 (information retrieval): IR models 2
Pages: 672-679
Year of Publication: 2005
ISBN: 1-59593-140-6
Author
Fernando Diaz, University of Massachusetts, Amherst, MA
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1099554.1099722

ABSTRACT

The cluster hypothesis states: closely related documents tend to be relevant to the same request. We exploit this hypothesis directly by adjusting ad hoc retrieval scores from an initial retrieval so that topically related documents receive similar scores. We refer to this process as score regularization. Score regularization can be presented as an optimization problem, allowing the use of results from semi-supervised learning. We demonstrate that regularized scores consistently and significantly rank documents better than unregularized scores, given a variety of initial retrieval algorithms. We evaluate our method on two large corpora across a substantial number of topics.
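
The idea of adjusting scores so that topically related documents receive similar scores can be illustrated with a small sketch. This is not the paper's exact formulation, which the abstract does not spell out; it assumes a standard graph-based smoothing objective from semi-supervised learning, in which the regularized scores f trade off smoothness over a document similarity graph against fidelity to the initial scores y. The function name `regularize_scores`, the `affinity` matrix, and the `alpha` trade-off parameter are all hypothetical choices made for illustration.

```python
import math


def regularize_scores(scores, affinity, alpha=0.5, iterations=200):
    """Smooth initial retrieval scores over a document similarity graph.

    Assumed objective (an illustration, not the paper's exact one):
        alpha * f^T (I - S) f + (1 - alpha) * ||f - y||^2
    where S is the symmetrically normalized affinity matrix. Its minimizer
    satisfies f = alpha * S f + (1 - alpha) * y, which is reached here by
    fixed-point iteration.
    """
    n = len(scores)
    y = [float(x) for x in scores]

    # Ignore self-similarity so a document does not reinforce itself.
    w = [[0.0 if i == j else float(affinity[i][j]) for j in range(n)]
         for i in range(n)]

    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    degree = [sum(row) for row in w]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degree]
    s = [[w[i][j] * inv_sqrt[i] * inv_sqrt[j] for j in range(n)]
         for i in range(n)]

    # Fixed-point iteration; converges for 0 <= alpha < 1.
    f = list(y)
    for _ in range(iterations):
        f = [alpha * sum(s[i][j] * f[j] for j in range(n))
             + (1.0 - alpha) * y[i]
             for i in range(n)]
    return f


# Hypothetical example: documents 0 and 1 are topically related,
# document 2 is unrelated to both.
initial = [1.0, 0.2, 0.5]
similarity = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
regularized = regularize_scores(initial, similarity, alpha=0.5)
```

In this toy example the two related documents are pulled toward each other, so the gap between their scores shrinks, while the isolated document is only rescaled by (1 - alpha). Larger values of `alpha` weight graph smoothness more heavily relative to the initial retrieval scores.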

