Predictive caching and prefetching of query results in search engines
Source: International World Wide Web Conference
Proceedings of the 12th International Conference on World Wide Web
Budapest, Hungary
Session: Information Retrieval
Pages: 19-28
Year of Publication: 2003
ISBN: 1-58113-680-3
Authors
Ronny Lempel, Technion, Haifa, Israel
Shlomo Moran, Technion, Haifa, Israel
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/775152.775156

ABSTRACT

We study the caching of query result pages in Web search engines. Popular search engines receive millions of queries per day, and efficient policies for caching query results may enable them to lower their response time and reduce their hardware requirements. We present PDC (probability driven cache), a novel scheme tailored for caching search results, that is based on a probabilistic model of search engine users. We then use a trace of over seven million queries submitted to the search engine AltaVista to evaluate PDC, as well as traditional LRU and SLRU based caching schemes. The trace driven simulations show that PDC outperforms the other policies. We also examine the prefetching of search results, and demonstrate that prefetching can increase cache hit ratios by 50% for large caches, and can double the hit ratios of small caches. When integrating prefetching into PDC, we attain hit ratios of over 0.53.
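
To make the abstract's notions of result-page caching, prefetching, and hit ratio concrete, here is a minimal sketch of a trace-driven simulation. It is not PDC and not the authors' simulator: it assumes a plain LRU cache keyed by (query, page number) and a simple prefetch rule that, on a miss, also caches the next few result pages of the same query. The names `LRUResultCache`, `simulate`, and `TOY_TRACE` are hypothetical, and the trace is invented for illustration.

```python
from collections import OrderedDict


class LRUResultCache:
    """LRU cache of result pages, keyed by (query, page_number). Illustrative only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._pages = OrderedDict()  # least recently used entry first

    def lookup(self, key):
        """Return True on a hit and mark the entry as most recently used."""
        if key in self._pages:
            self._pages.move_to_end(key)
            return True
        return False

    def insert(self, key):
        """Add (or refresh) an entry, evicting least recently used entries if full."""
        self._pages[key] = True
        self._pages.move_to_end(key)
        while len(self._pages) > self.capacity:
            self._pages.popitem(last=False)


def simulate(trace, capacity, prefetch=0):
    """Replay a trace of (query, page) requests and return the cache hit ratio.

    On a miss, the requested page and the next `prefetch` pages of the same
    query are cached (an assumed prefetch rule, not the paper's policy).
    """
    cache = LRUResultCache(capacity)
    hits = 0
    for query, page in trace:
        if cache.lookup((query, page)):
            hits += 1
            continue
        # Insert prefetched pages first so the requested page ends up most recent.
        for extra in range(prefetch, 0, -1):
            cache.insert((query, page + extra))
        cache.insert((query, page))
    return hits / len(trace) if trace else 0.0


# Invented toy trace: users of queries "a" and "b" browsing result pages.
TOY_TRACE = [("a", 1), ("a", 2), ("b", 1), ("a", 1), ("b", 2), ("a", 3)]

if __name__ == "__main__":
    print("no prefetch:", simulate(TOY_TRACE, capacity=4))
    print("prefetch=1: ", simulate(TOY_TRACE, capacity=4, prefetch=1))
```

On this toy trace, prefetching one extra page raises the hit ratio because follow-up requests for the next result page find it already cached. The figures reported in the abstract come from the authors' AltaVista trace and cannot be reproduced with this sketch.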


