SUSHI: scoring scaled samples for server selection
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Boston, MA, USA
SESSION: Federated, distributed search
Pages 419-426
Year of Publication: 2009
ISBN: 978-1-60558-483-6
Authors
Paul Thomas, CSIRO, Canberra, Australia
Milad Shokouhi, Microsoft Research, Cambridge, United Kingdom
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1571941.1572014

ABSTRACT

Modern techniques for distributed information retrieval use a set of documents sampled from each server, but these samples have been underutilised in server selection. We describe a new server selection algorithm, SUSHI, which, unlike earlier algorithms, can make full use of the text of each sampled document and does not need training data. SUSHI can directly optimise for many common cases, including high-precision retrieval, and, by including a simple stopping condition, can do so while reducing network traffic.

Our experiments compare SUSHI with alternatives and show that it achieves the same effectiveness as the best current methods while being substantially more efficient: it selects as few as 20% as many servers as those methods do.


