Boosting the performance of Web search engines: Caching and prefetching query results by exploiting historical usage data
Source: ACM Transactions on Information Systems (TOIS)
Volume 24, Issue 1 (January 2006)
Pages: 51-78
Year of Publication: 2006
ISSN: 1046-8188
Authors
Tiziano Fagni  Consiglio Nazionale delle Ricerche (CNR), Pisa, Italy
Raffaele Perego  Consiglio Nazionale delle Ricerche (CNR), Pisa, Italy
Fabrizio Silvestri  Consiglio Nazionale delle Ricerche (CNR), Pisa, Italy
Salvatore Orlando  Università Ca' Foscari di Venezia, Mestre (VE), Italy
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 47,   Downloads (12 Months): 241,   Citation Count: 19
DOI: http://doi.acm.org/10.1145/1125857.1125859

ABSTRACT

This article discusses efficiency and effectiveness issues in caching the results of queries submitted to a Web search engine (WSE). We propose SDC (Static Dynamic Cache), a new caching strategy designed to exploit the temporal and spatial locality present in the stream of processed queries. SDC extracts from historical usage data the results of the most frequently submitted queries and stores them in a static, read-only portion of the cache. The remaining cache entries are managed dynamically according to a given replacement policy and serve the queries that the static portion cannot satisfy. Moreover, we improve the hit ratio of SDC with an adaptive prefetching strategy, which anticipates future requests while introducing only limited overhead on the back-end WSE. We experimentally demonstrate the superiority of SDC over purely static and purely dynamic policies by measuring the hit ratio achieved on three large query logs while varying the cache parameters and the replacement policy used for the dynamic part of the cache. Finally, we deploy a concurrent version of our caching system and measure its throughput. Our tests show that the SDC cache can be efficiently exploited by many threads that concurrently serve the queries of different users.
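
The static/dynamic split described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `SDCCache`, the `fetch` callback standing in for the back-end WSE, the `static_fraction` parameter, and the choice of LRU as the replacement policy for the dynamic part are all assumptions made for illustration (the abstract only says "a given replacement policy"). Prefetching and concurrency are not modeled.

```python
from collections import Counter, OrderedDict


class SDCCache:
    """Illustrative static/dynamic cache: a read-only static part plus an LRU dynamic part."""

    def __init__(self, training_queries, capacity, static_fraction, fetch):
        # Static part: results of the most frequent queries in the historical log.
        n_static = int(capacity * static_fraction)
        top = Counter(training_queries).most_common(n_static)
        self.static = {query: fetch(query) for query, _ in top}
        # Dynamic part: the remaining entries, managed here with LRU (an assumption).
        self.dynamic_capacity = capacity - n_static
        self.dynamic = OrderedDict()
        self.fetch = fetch  # hypothetical stand-in for the back-end search engine
        self.hits = 0
        self.lookups = 0

    def get(self, query):
        self.lookups += 1
        if query in self.static:
            # Static entries are never modified or evicted.
            self.hits += 1
            return self.static[query]
        if query in self.dynamic:
            self.hits += 1
            self.dynamic.move_to_end(query)  # mark as most recently used
            return self.dynamic[query]
        # Miss in both parts: ask the back end and cache in the dynamic part.
        result = self.fetch(query)
        if self.dynamic_capacity > 0:
            if len(self.dynamic) >= self.dynamic_capacity:
                self.dynamic.popitem(last=False)  # evict least recently used
            self.dynamic[query] = result
        return result

    def hit_ratio(self):
        return self.hits / self.lookups if self.lookups else 0.0


def fake_backend(query):
    # Placeholder for an expensive query evaluation.
    return f"results for {query}"


# Toy historical log and test stream (invented data, for illustration only).
history = ["a", "a", "a", "b", "b", "c"]
cache = SDCCache(history, capacity=4, static_fraction=0.5, fetch=fake_backend)
for q in ["a", "c", "d", "c", "e", "d", "b"]:
    cache.get(q)
print(cache.hit_ratio())
```

In this sketch the static part is filled once from the training log and never changes, so only the dynamic part pays the bookkeeping cost of replacement. Varying `static_fraction` between 0 and 1 moves the cache between a purely dynamic and a purely static configuration, which is the kind of parameter sweep the abstract describes.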


