|
ABSTRACT
In this paper, we propose the Pyramid-Technique, a new indexing method for high-dimensional data spaces. The Pyramid-Technique is highly adapted to range query processing using the maximum metric Lmax. In contrast to all other index structures, the performance of the Pyramid-Technique does not deteriorate when processing range queries on data of higher dimensionality. The Pyramid-Technique is based on a special partitioning strategy which is optimized for high-dimensional data. The basic idea is to divide the data space first into 2d pyramids sharing the center point of the space as a top. In a second step, the single pyramids are cut into slices parallel to the basis of the pyramid. These slices from the data pages. Furthermore, we show that this partition provides a mapping from the given d-dimensional space to a 1-dimensional space. Therefore, we are able to use a B+-tree to manage the transformed data. As an analytical evaluation of our technique for hypercube range queries and uniform data distribution shows, the Pyramid-Technique clearly outperforms index structures using other partitioning strategies. To demonstrate the practical relevance of our technique, we experimentally compared the Pyramid-Technique with the X-tree, the Hilbert R-tree, and the Linear Scan. The results of our experiments using both, synthetic and real data, demonstrate that the Pyramid-Technique outperforms the X-tree and the Hilbert R-tree by a factor of up to 14 (number of page accesses) and up to 2500 (total elapsed time) for range queries.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Bayer R., McCreight E.M.: 'Organization and Maintenance of Large Ordered Indices', Acta Informatica 1(3), 1977, pp. 173-189.
|
 |
2
|
Stefan Berchtold , Christian Böhm , Bernhard Braunmüller , Daniel A. Keim , Hans-Peter Kriegel, Fast parallel similarity search in multimedia databases, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.1-12, May 11-15, 1997, Tucson, Arizona, United States
|
| |
3
|
|
| |
4
|
Berchtold S., B0hm C., Keim D., Kriegel H.-P., Xu X.:'Optimal Multidimensional Query Processing Using Tree Striping', submitted.
|
 |
5
|
Stefan Berchtold , Christian Böhm , Daniel A. Keim , Hans-Peter Kriegel, A cost model for nearest neighbor search in high-dimensional data space, Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.78-86, May 11-15, 1997, Tucson, Arizona, United States
[doi> 10.1145/263661.263671]
|
| |
6
|
|
| |
7
|
|
| |
8
|
Beyer K., Goldstein J., Ramakrishnan R., Shaft U..: 'When Is "Nearest Neighbor" Meaningful?', submitted for publication, 1998.
|
 |
9
|
Norbert Beckmann , Hans-Peter Kriegel , Ralf Schneider , Bernhard Seeger, The R*-tree: an efficient and robust access method for points and rectangles, Proceedings of the 1990 ACM SIGMOD international conference on Management of data, p.322-331, May 23-26, 1990, Atlantic City, New Jersey, United States
|
 |
10
|
|
 |
11
|
|
| |
12
|
C. Faloutsos , R. Barber , M. Flickner , J. Hafner , W. Niblack , D. Petkovic , W. Equitz, Efficient and effective querying by image content, Journal of Intelligent Information Systems, v.3 n.3-4, p.231-262, July 1994
[doi> 10.1007/BF00962238]
|
| |
13
|
|
 |
14
|
|
| |
15
|
|
 |
16
|
|
| |
17
|
Jain R, White D.A.: 'Similarity indexing: Algorithms and Performance', Proc. SPIE Storage and Retrieval for Image and Video Databases IV, Vol. 2670, San Jose, CA, 1996, pp. 62-75.
|
 |
18
|
|
| |
19
|
|
| |
20
|
|
 |
21
|
|
| |
22
|
|
| |
23
|
|
CITED BY 89
|
|
|
|
|
|
|
|
Hakan Ferhatosmanoglu , Divyakant Agrawal , Amr El Abbadi, Clustering declustered data for efficient retrieval, Proceedings of the eighth international conference on Information and knowledge management, p.343-350, November 02-06, 1999, Kansas City, Missouri, United States
|
|
|
|
|
|
Beng Chin Ooi , Kian-Lee Tan , Cui Yu , Stephane Bressan, Indexing the edges—a simple and yet efficient approach to high-dimensional indexing, Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.166-174, May 15-18, 2000, Dallas, Texas, United States
|
|
|
|
|
|
Yuan-Chi Chang , Lawrence Bergman , Vittorio Castelli , Chung-Sheng Li , Ming-Ling Lo , John R. Smith, The onion technique: indexing for linear optimization queries, ACM SIGMOD Record, v.29 n.2, p.391-402, June 2000
|
|
|
|
|
|
|
|
|
|
|
|
Kristin P. Bennett , Usama Fayyad , Dan Geiger, Density-based indexing for approximate nearest-neighbor queries, Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, p.233-243, August 15-18, 1999, San Diego, California, United States
|
|
|
|
|
|
Agma Traina , Caetano Traina , Spiros Papadimitriou , Christos Faloutsos, Tri-plots: scalable tools for multidimensional data mining, Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, p.184-193, August 26-29, 2001, San Francisco, California
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Hyo-Sang Lim , Jae-Gil Lee , Min-Jae Lee , Kyu-Young Whang , Il-Yeol Song, Continuous query processing in data streams using duality of data and queries, Proceedings of the 2006 ACM SIGMOD international conference on Management of data, June 27-29, 2006, Chicago, IL, USA
|
|
|
|
|
|
Khanh Vu , Kien A. Hua , Hao Cheng , Sheau-Dong Lang, A non-linear dimensionality-reduction technique for fast similarity search in large databases, Proceedings of the 2006 ACM SIGMOD international conference on Management of data, June 27-29, 2006, Chicago, IL, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Weidong Chen , Jyh-Herng Chow , You-Chin Fuh , Jean Grandbois , Michelle Jou , Nelson Mendonça Mattos , Brian T. Tran , Yun Wang, High Level Indexing of User-Defined Types, Proceedings of the 25th International Conference on Very Large Data Bases, p.554-564, September 07-10, 1999
|
|
|
|
|
|
|
|
|
Yi Zhuang , Yueting Zhuang , Qing Li , Lei Chen , Yi Yu, Indexing high-dimensional data in dual distance spaces: a symmetrical encoding approach, Proceedings of the 11th international conference on Extending database technology: Advances in database technology, March 25-29, 2008, Nantes, France
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|