Relevant document distribution estimation method for resource selection
Source Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Toronto, Canada
SESSION: Distributed information retrieval
Pages: 298 - 305  
Year of Publication: 2003
ISBN: 1-58113-646-3
Authors
Luo Si  Carnegie Mellon University, Pittsburgh, PA
Jamie Callan  Carnegie Mellon University, Pittsburgh, PA
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 9,   Downloads (12 Months): 82,   Citation Count: 36
DOI: http://doi.acm.org/10.1145/860435.860490

ABSTRACT

Prior research under a variety of conditions has shown the CORI algorithm to be one of the most effective resource selection algorithms, but the range of database sizes studied was not large. This paper shows that the CORI algorithm does not do well in environments with a mix of "small" and "very large" databases. A new resource selection algorithm is proposed that uses information about database sizes as well as database contents. We also show how to acquire database size estimates in uncooperative environments as an extension of the query-based sampling used to acquire resource descriptions. Experiments demonstrate that the database size estimates are more accurate for large databases than estimates produced by a competing method; the new resource ranking algorithm is always at least as effective as the CORI algorithm; and the new algorithm results in better document rankings than the CORI algorithm.
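
The abstract names two ideas without giving formulas: estimating a database's size from a sample, and ranking databases using both size and content. The sketch below is only an illustration of how those two ideas could fit together, not the paper's exact procedure. It assumes that a probe term is as frequent in the sample as in the full database, and that each sampled document stands in for a proportional share of its database. The names estimate_size, rank_resources, and top_fraction, and the default cutoff value, are hypothetical choices made for this example.

```python
from collections import defaultdict


def estimate_size(sample_size, sample_df, reported_df):
    """Scale a sample up to a database size estimate.

    Assumption of this sketch: the probe term occurs in the sample with
    the same relative frequency as in the full database.
    """
    if sample_df == 0:
        raise ValueError("probe term must occur in the sample")
    return reported_df * sample_size / sample_df


def rank_resources(scored_sample_docs, sample_sizes, size_estimates,
                   top_fraction=0.003):
    """Rank databases by an estimated count of relevant documents.

    scored_sample_docs: (database_id, score) pairs for sampled documents,
    scored against the query in one pooled index of all samples.
    top_fraction: illustrative cutoff; only the top share of the
    (estimated) combined collection is treated as relevant.
    """
    threshold = top_fraction * sum(size_estimates.values())
    # Each sampled document represents size / sample_size real documents.
    weight = {db: size_estimates[db] / sample_sizes[db] for db in sample_sizes}
    estimated_relevant = defaultdict(float)
    covered = 0.0  # estimated documents ranked above the current one
    ranked = sorted(scored_sample_docs, key=lambda pair: pair[1], reverse=True)
    for db, _score in ranked:
        if covered >= threshold:
            break
        estimated_relevant[db] += weight[db]
        covered += weight[db]
    return sorted(estimated_relevant, key=estimated_relevant.get, reverse=True)
```

With equal-sized samples, a large database receives a larger weight per sampled document than a small one, so its score reflects its size as well as how well its sampled documents match the query.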


