ACM Home Page
Please provide us with feedback. Feedback
Improved techniques for result caching in web search engines
Full text PdfPdf (1.02 MB)
Source
International World Wide Web Conference archive
Proceedings of the 18th international conference on World wide web table of contents
Madrid, Spain
SESSION: Search/session: caching and indices table of contents
Pages 431-440  
Year of Publication: 2009
ISBN:978-1-60558-487-4
Authors
Qingqing Gan  Polytechnic Institute of NYU, Brooklyn, NY, USA
Torsten Suel  Yahoo! Research, Sunnyvale, CA, USA
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 69,   Downloads (12 Months): 248,   Citation Count: 1
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1526709.1526768
What is a DOI?

ABSTRACT

Query processing is a major cost factor in operating large web search engines. In this paper, we study query result caching, one of the main techniques used to optimize query processing performance. Our first contribution is a study of result caching as a weighted caching problem. Most previous work has focused on optimizing cache hit ratios, but given that processing costs of queries can vary very significantly we argue that total cost savings also need to be considered. We describe and evaluate several algorithms for weighted result caching, and study the impact of Zipf-based query distributions on result caching. Our second and main contribution is a new set of feature-based cache eviction policies that achieve significant improvements over all previous methods, substantially narrowing the existing performance gap to the theoretically optimal (clairvoyant) method. Finally, using the same approach, we also obtain performance gains for the related problem of inverted list caching.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
R.Baeza-Yates, F.Junqueira, V.Plachouras, and H.Witschel. Admission policies for caches of search engine results. In Proc. of the 14th String Processing and Information Retrieval Symposium, Sept. 2007.
 
3
R.Baeza-Yates and F.Saint-Jean. A three-level search-engine index based in query log distribution. In Proc. of the 10th String Processing and Information Retrieval Symposium, Sept. 2003.
 
4
 
5
 
6
S.F. Chen and J.Goodman. An empirical study of smoothing techniques for language modeling, 1996.
7
 
8
S.Garcia. Search Engine Optimisation Using Past Queries. Ph.D Thesis, Department of Computer Science and Information Technology, RMIT University, 2007.
9
10
11
12
13
 
14
 
15
E.Markatos. On caching search engine query results. In 5th International Web Caching and Content Delivery Workshop, May 2000.
16
 
17
C.Silverstein, M.Henzinger, H.Marais, and M.Moricz. Analysis of a Very Large AltaVista Query Log. Technical Report 014, SRC (Digital, Palo Alto), Oct. 1998.
 
18
Y.Xie and D.O'Hallaron. Locality in search engine queries and its implications for caching. In Proc. IEEE Infocom, 2002.
 
19
20
21


Collaborative Colleagues:
Qingqing Gan: colleagues
Torsten Suel: colleagues