ACM Home Page
Please provide us with feedback. Feedback
Asymmetric distance estimation with sketches for similarity search in high-dimensional spaces
Full text PdfPdf (253 KB)
Source
Annual ACM Conference on Research and Development in Information Retrieval archive
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval table of contents
Singapore, Singapore
SESSION: High-performance & high dimensional indexing table of contents
Pages 123-130  
Year of Publication: 2008
ISBN:978-1-60558-164-4
Authors
Wei Dong  Princeton University, Princeton, NJ, USA
Moses Charikar  Princeton University, Princeton, NJ, USA
Kai Li  Princeton University, Princeton, NJ, USA
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 29,   Downloads (12 Months): 220,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1390334.1390358
What is a DOI?

ABSTRACT

Efficient similarity search in high-dimensional spaces is important to content-based retrieval systems. Recent studies have shown that sketches can effectively approximate L1 distance in high-dimensional spaces, and that filtering with sketches can speed up similarity search by an order of magnitude. It is a challenge to further reduce the size of sketches, which are already compact, without compromising accuracy of distance estimation.

This paper presents an efficient sketch algorithm for similarity search with L2 distances and a novel asymmetric distance estimation technique. Our new asymmetric estimator takes advantage of the original feature vector of the query to boost the distance estimation accuracy. We also apply this asymmetric method to existing sketches for cosine similarity and L1 distance. Evaluations with datasets extracted from images and telephone records show that our L2 sketch outperforms existing methods, and the asymmetric estimators consistently improve the accuracy of different sketch methods. To achieve the same search quality, asymmetric estimators can reduce the sketch size by 10% to 40%.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
2
3
 
4
 
5
I. Fodor. A survey of dimension reduction techniques.
 
6
7
8
 
9
10
11
 
12
SIFT demo. http://www.cs.ubc.ca/~lowe/keypoints/.
 
13
SWITCHBOARD-1 Release 2. http://www.ldc.upenn.edu/Catalog/docs/switchboard/.
 
14
15
 
16

Collaborative Colleagues:
Wei Dong: colleagues
Moses Charikar: colleagues
Kai Li: colleagues