ACM Home Page
Please provide us with feedback. Feedback
Approximate searches: k-neighbors + precision
Full text PdfPdf (155 KB)
Source Conference on Information and Knowledge Management archive
Proceedings of the twelfth international conference on Information and knowledge management table of contents
New Orleans, LA, USA
SESSION: Database session 1: querying high-dimensional data table of contents
Pages: 24 - 31  
Year of Publication: 2003
ISBN:1-58113-723-0
Authors
Sid-Ahmed Berrani  IRISA, Cesson-Sévigné France
Laurent Amsaleg  CNRS -- IRISA, Rennes, France
Patrick Gros  CNRS -- IRISA, Rennes, France
Sponsors
ACM: Association for Computing Machinery
SIGMIS: ACM Special Interest Group on Management Information Systems
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 5,   Downloads (12 Months): 37,   Citation Count: 7
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/956863.956870
What is a DOI?

ABSTRACT

It is known that all multi-dimensional index structures fail to accelerate content-based similarity searches when the feature vectors describing images are high-dimensional. It is possible to circumvent this problem by relying on approximate search-schemes trading-off result quality for reduced query execution time. Most approximate schemes, however, provide none or only complex control on the precision of the searches, especially when retrieving the k nearest neighbors (NNs) of query points.In contrast, this paper describes an approximate search scheme for high-dimensional databases where the precision of the search can be probabilistically controlled when retrieving the k NNs of query points. It allows a fine and intuitive control over this precision by setting at run time the maximum probability for a vector that would be in the exact answer set to be missed in the approximate set of answers eventually returned. This paper also presents a performance study of the implementation using real datasets showing its reliability and efficiency. It shows, for example, that our method is 6.72 times faster than the sequential scan when it handles more than 5 106 24-dimensional vectors, even when the probability of missing one of the true nearest neighbors is below 0.01.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
L. Amsaleg and P. Gros. Content-based retrieval using local descriptors: Problems and issues from a database perspective. Pattern Analysis and Applications, Special Issue on Image Indexation, 4:108--124, 2001.
 
2
L. Amsaleg, P. Gros, and S.-A. Berrani. A robust technique to recognize objects in images, and the db problems it raises. In Proceedings of the 7th International Workshop on Multimedia Information Systems, Capri, Italy, November 2001.
 
3
4
 
5
6
 
7
G. Brown. Modern Mathematics for the Engineer. 1956.
 
8
 
9
 
10
 
11
L. Florack, B. ter Haar Romeny, J. Koenderink, and M. Viergever. General intensity transformation and differential invariants. Journal of Mathematical Imaging and Vision , 4(2):171--187, 1994.
 
12
 
13
 
14
 
15
 
16
 
17
 
18
 
19
 
20
21

CITED BY  7

Collaborative Colleagues:
Sid-Ahmed Berrani: colleagues
Laurent Amsaleg: colleagues
Patrick Gros: colleagues