Using sampled data and regression to merge search engine results
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Tampere, Finland
SESSION: Web Information Retrieval
Pages: 19-26
Year of Publication: 2002
ISBN: 1-58113-561-0
Authors
Luo Si, Carnegie Mellon University, Pittsburgh, PA
Jamie Callan, Carnegie Mellon University, Pittsburgh, PA
Sponsor
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 12,   Downloads (12 Months): 70,   Citation Count: 20
DOI: http://doi.acm.org/10.1145/564376.564382

ABSTRACT

This paper addresses the problem of merging results obtained from different databases and search engines in a distributed information retrieval environment. Prior research on this problem either assumed the exchange of statistics necessary for normalizing scores (cooperative solutions) or was heuristic. Both approaches have disadvantages. We show that the problem in uncooperative environments is simpler when viewed as a component of a distributed IR system that uses query-based sampling to create resource descriptions. Documents sampled for creating resource descriptions can also be used to create a centralized sample index, and this index is a source of training data for adaptive results merging algorithms. A variety of experiments demonstrate that this new approach is more effective than a well-known alternative, and that it allows query-by-query tuning of the results merging function.
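
The abstract does not spell out the merging function, so the following is only a minimal sketch of the general idea, assuming a simple per-database linear regression fitted separately for each query. Documents that appear both in a database's result list and in the centralized index built from sampled documents serve as training pairs (database-specific score, centralized score). All names here (fit_line, merge_results, db_results, sample_scores) and all scores are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import Dict, List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = a * x + b."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two overlap documents to fit a line")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("database scores of overlap documents are all identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def merge_results(
    db_results: Dict[str, List[Tuple[str, float]]],
    sample_scores: Dict[str, float],
) -> List[Tuple[str, float, str]]:
    """Map each database's scores onto the sample-index scale, then merge.

    db_results:    {database name: [(doc_id, database-specific score), ...]}
    sample_scores: {doc_id: score assigned by the centralized sample index}
                   for the same query (assumed to be computed elsewhere).
    """
    merged = []
    for db, results in db_results.items():
        # Overlap documents: returned by the database AND present in the sample index.
        overlap = [(s, sample_scores[d]) for d, s in results if d in sample_scores]
        a, b = fit_line([x for x, _ in overlap], [y for _, y in overlap])
        # Apply the fitted mapping to every returned document, sampled or not.
        merged.extend((d, a * s + b, db) for d, s in results)
    merged.sort(key=lambda item: item[1], reverse=True)
    return merged


# Hypothetical scores for one query; the two databases use incompatible scales.
db_results = {
    "A": [("a1", 10.0), ("a2", 8.0), ("a3", 6.0)],
    "B": [("b1", 0.9), ("b2", 0.5), ("b3", 0.1)],
}
sample_scores = {"a1": 0.6, "a3": 0.2, "b1": 0.9, "b3": 0.1}

ranking = merge_results(db_results, sample_scores)
```

Because the line is refitted from the overlap documents of each query, the mapping can differ from query to query, which is one way to read the abstract's "query-by-query tuning". A real system would also need a fallback for databases with fewer than two overlap documents; this sketch simply raises an error in that case.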


