SUSHI: scoring scaled samples for server selection
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Boston, MA, USA
SESSION: Federated, distributed search
Pages 419-426
Year of Publication: 2009
ISBN: 978-1-60558-483-6
ABSTRACT
Modern techniques for distributed information retrieval use a set of documents sampled from each server, but these samples have been underutilised in server selection. We describe a new server selection algorithm, SUSHI, which unlike earlier algorithms can make full use of the text of each sampled document and which does not need training data. SUSHI can directly optimise for many common cases, including high precision retrieval, and by including a simple stopping condition can do so while reducing network traffic. Our experiments compare SUSHI with alternatives and show it achieves the same effectiveness as the best current methods while being substantially more efficient, selecting as few as 20% as many servers.