|
ABSTRACT
We compare two different techniques for browsing through a collection of spatial objects stored in an R-tree spatial data structure on the basis of their distances from an arbitrary spatial query object. The conventional approach is one that makes use of a k-nearest neighbor algorithm where k is known prior to the invocation of the algorithm. Thus if m < k neighbors are needed, the k-nearest neighbor algorithm has to be reinvoked for m neighbors, thereby possibly performing some redundant computations. The second approach is incremental in the sense that having obtained the k nearest neighbors, the k + 1st neighbor can be obtained without having to calculate the k + 1 nearest neighbors from scratch. The incremental approach is useful when processing complex queries where one of the conditions involves spatial proximity (e.g., the nearest city to Chicago with population greater than a million), in which case a query engine can make use of a pipelined strategy. We present a general incremental nearest neighbor algorithm that is applicable to a large class of hierarchical spatial data structures. This algorithm is adapted to the R-tree and its performance is compared to an existing k-nearest neighbor algorithm for R-trees [Rousseopoulos et al. 1995]. Experiments show that the incremental nearest neighbor algorithm significantly outperforms the k-nearest neighbor algorithm for distance browsing queries in a spatial database that uses the R-tree as a spatial index. Moreover, the incremental nearest neighbor algorithm usually outperforms the k-nearest neighber algorithm when applied to the k-nearest neighbor problem for the R-tree, although the improvement is not nearly as large as for distance browsing queries. In fact, we prove informally that at any step in its execution the incremental nearest neighbor algorithm is optimal with respect to the spatial data structure that is employed. Furthermore, based on some simplifying assumptions, we prove that in two dimensions the number of distance computations and leaf nodes accesses made by the algorithm for finding k neighbors is O(k + k).
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
AREF, W. G. AND SAMET, H. 1992. Uniquely reporting spatial objects: Yet another operation for comparing spatial data structures. In Proceedings of the 5th Symposium on Spatial Data Handling (Charleston, SC, Aug.). 178-189.
|
| |
3
|
AREF, W. G. AND SAMET, H. 1993. Estimating selectivity factors of spatial operations. In Optimization in Databases - Proceedings of the 5th International Workshop on Foundations of Models and Languages for Data and Objects (Aigen, Austria, Sept.). 31-40.
|
 |
4
|
|
 |
5
|
|
 |
6
|
Norbert Beckmann , Hans-Peter Kriegel , Ralf Schneider , Bernhard Seeger, The R*-tree: an efficient and robust access method for points and rectangles, Proceedings of the 1990 ACM SIGMOD international conference on Management of data, p.322-331, May 23-26, 1990, Atlantic City, New Jersey, United States
|
 |
7
|
|
 |
8
|
Stefan Berchtold , Christian Böhm , Daniel A. Keim , Hans-Peter Kriegel, A cost model for nearest neighbor search in high-dimensional data space, Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.78-86, May 11-15, 1997, Tucson, Arizona, United States
[doi> 10.1145/263661.263671]
|
| |
9
|
|
| |
10
|
|
| |
11
|
|
| |
12
|
|
 |
13
|
|
| |
14
|
|
 |
15
|
|
| |
16
|
EASTMAN, C. M. AND ZEMANKOVA, M. 1982. Partially specified nearest neighbor searches using k-d-trees. Inf. Process. Lett. 15, 2 (Sept.), 53-56.
|
| |
17
|
|
 |
18
|
Christos Faloutsos , King-Ip Lin, FastMap: a fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets, Proceedings of the 1995 ACM SIGMOD international conference on Management of data, p.163-174, May 22-25, 1995, San Jose, California, United States
|
| |
19
|
|
| |
20
|
|
 |
21
|
|
| |
22
|
FUKUNAGA, K. AND NARENDRA, P. M. 1975. A branch and bound algorithm for computing. IEEE Trans. Comput. 24, 7 (July), 750-753.
|
| |
23
|
|
 |
24
|
|
| |
25
|
|
| |
26
|
HENRICH, A. 1994. A distance-scan algorithm for spatial access structures. In Proceedings of the Second ACM Workshop on Geographic Information Systems (Gaithersburg, MD, Dec.). ACM Press, New York, NY, 136-143.
|
| |
27
|
|
| |
28
|
|
| |
29
|
|
| |
30
|
|
 |
31
|
|
| |
32
|
|
| |
33
|
KAMGAR-PARSI, B. AND KANAL, L. N. 1985. An improved branch and bound algorithm for computing k -nearest neighbors. Pattern Recogn. Lett. 3, 1 (Jan.).
|
 |
34
|
K. V. Ravi Kanth , Divyakant Agrawal , Ambuj Singh, Dimensionality reduction for similarity searching in dynamic databases, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.166-176, June 01-04, 1998, Seattle, Washington, United States
|
 |
35
|
|
| |
36
|
|
| |
37
|
|
| |
38
|
LINDENBAUM, M. AND SAMET, H. 1995. A probabilistic analysis of trie-based sorting of large collections of line segments. TR-3455. University of Maryland at College Park, College Park, MD.
|
| |
39
|
|
 |
40
|
|
| |
41
|
|
 |
42
|
|
| |
43
|
BUREAU OF THE CENSUS, 1989. Tiger/Line precensus files. Bureau of the Census, Washington DC.
|
 |
44
|
|
 |
45
|
Nick Roussopoulos , Stephen Kelley , Frédéric Vincent, Nearest neighbor queries, Proceedings of the 1995 ACM SIGMOD international conference on Management of data, p.71-79, May 22-25, 1995, San Jose, California, United States
|
 |
46
|
|
| |
47
|
|
 |
48
|
|
 |
49
|
P. Griffiths Selinger , M. M. Astrahan , D. D. Chamberlin , R. A. Lorie , T. G. Price, Access path selection in a relational database management system, Proceedings of the 1979 ACM SIGMOD international conference on Management of data, May 30-June 01, 1979, Boston, Massachusetts
[doi> 10.1145/582095.582099]
|
| |
50
|
|
| |
51
|
SPROULL, R. F. 1991. Refinements to nearest-neighbor searching in k-dimensional trees. Algorithmica 6, 4, 579-589.
|
| |
52
|
UHLMANN, J. K. 1991. Satisfying general proximity/similarity queries with metric trees. Inf. Process. Lett. 40, 4 (Nov.), 175-179.
|
| |
53
|
|
| |
54
|
WHITE, D. A. AND JAIN, R. 1996. Algorithms and strategies for similarity retrieval. Tech. Rep. VCL-96-101. University of California at San Diego, La Jolla, CA.
|
| |
55
|
|
CITED BY 132
|
|
|
|
|
|
|
|
Hanan Samet , Houman Alborzi , František Brabec , Claudio Esperança , Gísli R. Hjaltason , Frank Morgan , Egemen Tanin, Use of the SAND spatial browser for digital government applications, Communications of the ACM, v.46 n.1, January 2003
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Jun Zhang , Manli Zhu , Dimitris Papadias , Yufei Tao , Dik Lun Lee, Location-based spatial queries, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, June 09-12, 2003, San Diego, California
|
|
|
|
|
|
|
|
|
|
|
|
Thomas Schwarz , Markus Iofcea , Matthias Grossmann , Nicola Hönle , Daniela Nicklas , Bernhard Mitschang, On efficiently processing nearest neighbor queries in a loosely coupled set of data sources, Proceedings of the 12th annual ACM international workshop on Geographic information systems, November 12-13, 2004, Washington DC, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Evangelos Dellis , Akrivi Vlachou , Ilya Vladimirskiy , Bernhard Seeger , Yannis Theodoridis, Constrained subspace skyline computation, Proceedings of the 15th ACM international conference on Information and knowledge management, November 06-11, 2006, Arlington, Virginia, USA
|
|
|
|
|
|
|
|
|
|
|
|
Xiaobin Ma , Shashi Shekhar , Hui Xiong , Pusheng Zhang, Exploiting a page-level upper bound for multi-type nearest neighbor queries, Proceedings of the 14th annual ACM international symposium on Advances in geographic information systems, November 10-11, 2006, Arlington, Virginia, USA
|
|
|
Caetano Traina, Jr. , Agma J. M. Traina , Marcos R. Vieira , Adriano S. Arantes , Christos Faloutsos, Efficient processing of complex similarity queries in RDBMS through query rewriting, Proceedings of the 15th ACM international conference on Information and knowledge management, November 06-11, 2006, Arlington, Virginia, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Chenyi Xia , Hongjun Lu , Beng Chin Ooi , Jing Hu, Gorder: an efficient method for KNN join processing, Proceedings of the Thirtieth international conference on Very large data bases, p.756-767, August 31-September 03, 2004, Toronto, Canada
|
|
|
Yun-Jun Gao , Chun Li , Gen-Cai Chen , Ling Chen , Xian-Ta Jiang , Chun Chen, Efficient k-nearest-neighbor search algorthims for historical moving object trajectories, Journal of Computer Science and Technology, v.22 n.2, p.232-244, March 2007
|
|
|
|
|
|
|
|
|
|
|
|
Dimitris Papadias , Jun Zhang , Nikos Mamoulis , Yufei Tao, Query processing in spatial network databases, Proceedings of the 29th international conference on Very large data bases, p.802-813, September 09-12, 2003, Berlin, Germany
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Daniel Blank , Soufyane El Allali , Wolfgang Mueller , Andreas Henrich, Sample-based creation of peer summaries for efficient similarity search in scalable peer-to-peer networks, Proceedings of the international workshop on Workshop on multimedia information retrieval, September 24-29, 2007, Augsburg, Bavaria, Germany
|
|
|
|
|
|
Chee-Yong Chan , H. V. Jagadish , Kian-Lee Tan , Anthony K. H. Tung , Zhenjie Zhang, Finding k-dominant skylines in high dimensional space, Proceedings of the 2006 ACM SIGMOD international conference on Management of data, June 27-29, 2006, Chicago, IL, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Jochen Van den Bercken , Björn Blohsfeld , Jens-Peter Dittrich , Jürgen Krämer , Tobias Schäfer , Martin Schneider , Bernhard Seeger, XXL - A Library Approach to Supporting Efficient Implementations of Advanced Database Queries, Proceedings of the 27th International Conference on Very Large Data Bases, p.39-48, September 11-14, 2001
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Hideki Hayashi , Daisuke Ito , Masaaki Tanizaki , Kohji Kimura , Hisanori Kajiyama, Dual-heap kNN: k-nearest neighbor search for spatial data retrieval in embedded DBMS, Proceedings of the 16th ACM SIGSPATIAL international conference on Advances in geographic information systems, November 05-07, 2008, Irvine, California
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Qiuxia Chen , Lei Chen , Xiang Lian , Yunhao Liu , Jeffrey Xu Yu, Indexable PLA for efficient similarity search, Proceedings of the 33rd international conference on Very large data bases, September 23-27, 2007, Vienna, Austria
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ke Deng , Xiaofang Zhou , Heng Tao Shen , Qing Liu , Kai Xu , Xuemin Lin, A multi-resolution surface distance model for k-NN query processing, The VLDB Journal — The International Journal on Very Large Data Bases, v.17 n.5, p.1101-1119, August 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Humberto L. Razente , Maria Camila N. Barioni , Agma Juci M. Traina , Christos Faloutsos , Caetano Traina, Jr., A novel optimization approach to efficiently process aggregate similarity queries in metric access methods, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
|
|
|
|
|
|
|
|
|
Ken C. K. Lee , Wang-Chien Lee , Hong Va Leong , Brandon Unger , Baihua Zheng, Efficient valid scope computation for location-dependent spatial queries in mobile and wireless environments, Proceedings of the 3rd International Conference on Ubiquitous Information Management and Communication, January 15-16, 2009, Suwon, Korea
|
|
|
Zaiben Chen , Heng Tao Shen , Xiaofang Zhou , Jeffrey Xu Yu, Monitoring path nearest neighbor in road networks, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
|
|
|
Bin Wang , Xiao-Chun Yang , Guo-Ren Wang , Ge Yu , Lei Chen , X. Sean Wang , Xue-Min Lin, Continually answering constraint k-NN queries in unstructured P2P systems, Journal of Computer Science and Technology, v.23 n.4, p.538-556, July 2008
|
|
|
Goce Trajcevski , Roberto Tamassia , Hui Ding , Peter Scheuermann , Isabel F. Cruz, Continuous probabilistic nearest-neighbor queries for uncertain trajectories, Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, March 24-26, 2009, Saint Petersburg, Russia
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Yufei Tao , Ke Yi , Cheng Sheng , Panos Kalnis, Quality and efficiency in high dimensional nearest neighbor search, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|