| Nearest-neighbor caching for content-match applications |
| Full text |
Pdf
(799 KB)
|
Source
|
International World Wide Web Conference
archive
Proceedings of the 18th international conference on World wide web
table of contents
Madrid, Spain
SESSION: Search/session: caching and indices
table of contents
Pages 441-450
Year of Publication: 2009
ISBN:978-1-60558-487-4
|
|
Authors
|
|
Sandeep Pandey
|
Yahoo! Research, Santa Clara, CA, USA
|
|
Andrei Broder
|
Yahoo! Research, Santa Clara, CA, USA
|
|
Flavio Chierichetti
|
Yahoo! Research, Santa Clara, CA, USA
|
|
Vanja Josifovski
|
Yahoo! Research, Santa Clara, CA, USA
|
|
Ravi Kumar
|
Yahoo! Research, Santa Clara, CA, USA
|
|
Sergei Vassilvitskii
|
Yahoo! Research, Santa Clara, CA, USA
|
|
| Sponsor |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 16, Downloads (12 Months): 116, Citation Count: 1
|
|
|
ABSTRACT
Motivated by contextual advertising systems and other web applications involving efficiency-accuracy tradeoffs, we study similarity caching. Here, a cache hit is said to occur if the requested item is similar but not necessarily equal to some cached item. We study two objectives that dictate the efficiency-accuracy tradeoff and provide our caching policies for these objectives. By conducting extensive experiments on real data we show similarity caching can significantly improve the efficiency of contextual advertising systems, with minimal impact on accuracy. Inspired by the above, we propose a simple generative model that embodies two fundamental characteristics of page requests arriving to advertising systems, namely, long-range dependences and similarities. We provide theoretical bounds on the gains of similarity caching in this model and demonstrate these gains empirically by fitting the actual data to the model.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
M. F. Arlitt and C. L. Williamson. Trace-driven Simulation of Document Caching Strategies for Internet Web Servers. Simulation Journal, 68:23--33, 1997.
|
| |
2
|
|
 |
3
|
Andrei Z. Broder , David Carmel , Michael Herscovici , Aya Soffer , Jason Zien, Efficient query evaluation using a two-level retrieval process, Proceedings of the twelfth international conference on Information and knowledge management, November 03-08, 2003, New Orleans, LA, USA
[doi> 10.1145/956863.956944]
|
| |
4
|
|
| |
5
|
|
| |
6
|
F. Chierichetti, R. Kumar, and S. Vassilvitskii. Similarity Caching. In Submitted to PODS, 2009.
|
 |
7
|
Mayur Datar , Nicole Immorlica , Piotr Indyk , Vahab S. Mirrokni, Locality-sensitive hashing scheme based on p-stable distributions, Proceedings of the twentieth annual symposium on Computational geometry, June 08-11, 2004, Brooklyn, New York, USA
[doi> 10.1145/997817.997857]
|
 |
8
|
|
 |
9
|
Fabrizio Falchi , Claudio Lucchese , Salvatore Orlando , Raffaele Perego , Fausto Rabitti, A metric cache for similarity search, Proceeding of the 2008 ACM workshop on Large-Scale distributed systems for information retrieval, October 30-30, 2008, Napa Valley, California, USA
[doi> 10.1145/1458469.1458473]
|
| |
10
|
|
| |
11
|
T. F. Gonzalez. Clustering to Minimize the Maximum Intercluster Distance. Theoretical Computer Science, 38(2-3):293--306, 1985.
|
| |
12
|
P. Indyk. Low-distortion embeddings of finite metric spaces. In Handbook of Discrete and Computational Geometry. CRC Press, 2004.
|
 |
13
|
|
| |
14
|
|
 |
15
|
|
 |
16
|
|
| |
17
|
E. P. Markatos. On Caching Search Engine Query Results. In Computer Communications, pages 137--143, 2000.
|
| |
18
|
|
 |
19
|
Paricia Correia Saraiva , Edleno Silva de Moura , Novio Ziviani , Wagner Meira , Rodrigo Fonseca , Berthier Riberio-Neto, Rank-preserving two-level caching for scalable search engines, Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, p.51-58, September 2001, New Orleans, Louisiana, United States
[doi> 10.1145/383952.383959]
|
| |
20
|
|
 |
21
|
|
|