Efficiency trade-offs in two-tier web search systems
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Boston, MA, USA
Session: Efficiency
Pages 163-170
Year of Publication: 2009
ISBN: 978-1-60558-483-6
ABSTRACT
Search engines rely on searching multiple partitioned corpora to return results to users in a reasonable amount of time. In this paper we analyze the standard two-tier architecture for Web search with the difference that the corpus to be searched for a given query is predicted in advance. We show that any predictor better than random yields time savings, but this decrease in the processing time yields an increase in the infrastructure cost. We provide an analysis and investigate this trade-off in the context of two different scenarios on real-world data. We demonstrate that in general the decrease in answer time is justified by a small increase in infrastructure cost.