ACM Home Page
Please provide us with feedback. Feedback
Indexing high-dimensional data in dual distance spaces: a symmetrical encoding approach
Full text PdfPdf (497 KB)
Source ACM International Conference Proceeding Series; Vol. 261 archive
Proceedings of the 11th international conference on Extending database technology: Advances in database technology table of contents
Nantes, France
SESSION: Research sessions: Indexing table of contents
Pages 241-251  
Year of Publication: 2008
ISBN:978-1-59593-926-5
Authors
Yi Zhuang  Zhejiang University, P.R.China
Yueting Zhuang  Zhejiang University, P.R.China
Qing Li  City University of Hong Kong, HKSAR, P.R.China
Lei Chen  Hong Kong University of Science and Technology, HKSAR, P.R.China
Yi Yu  Nara Women's University, Japan
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 11,   Downloads (12 Months): 51,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1353343.1353375
What is a DOI?

ABSTRACT

Due to the well-known dimensionality curse problem, search in a high-dimensional space is considered as a "hard" problem. In this paper, a novel symmetrical encoding-based index structure, which is called EHD-Tree (for symmetrical Encoding-based Hybrid Distance Tree), is proposed to support fast k-Nearest-Neighbor (k-NN) search in high-dimensional spaces. In an EHD-Tree, all data points are first grouped into clusters by a k-Means clustering algorithm. Then the uniform ID number of each data point is obtained by a dual-distance-driven encoding scheme in which each cluster sphere is partitioned twice according to the dual distances of start- and centroid-distance. Finally, the uniform ID number and the centroid-distance of each data point are combined to get a uniform index key, the latter is then indexed through a partition-based B+-tree. Thus, given a query point, its k-NN search in high-dimensional spaces can be transformed into search in a single dimensional space with the aid of the EHD-Tree index. Extensive performance studies are conducted to evaluate the effectiveness and efficiency of our proposed scheme, and the results demonstrate that this method outperforms the state-of-the-art high dimensional search techniques such as the X-Tree, VA-file, iDistance and NB-Tree, especially when the query radius is not very large.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
2
3
4
 
5
 
6
 
7
8
 
9
 
10
 
11
12
13
 
14
15
 
16
 
17
 
18
19
 
20
UCI KDD Archive, http://www.kdd.ics.uci.edu, 2002.

Collaborative Colleagues:
Yi Zhuang: colleagues
Yueting Zhuang: colleagues
Qing Li: colleagues
Lei Chen: colleagues
Yi Yu: colleagues