|
ABSTRACT
As databases have expanded in scope to storing string data (XML documents, product catalogs), it has become increasingly important to search databases based on matching substrings, often on multiple, correlated dimensions. While string B-trees are I/O optimal in one dimension, no index structure with non-trivial query bounds is known for two-dimensional substring indexing.
In this paper, we present a technique for two-dimensional substring indexing based on a reduction to the geometric problem of identifying common colors in two ranges containing colored points. We develop an I/O efficient algorithm for solving the common colors problem, and use it to obtain an I/O efficient (poly-logarithmic query time) algorithm for the two-dimensional substring indexing problem. Our techniques result in a family of secondary memory index structures that trade space for time, with no loss of accuracy. We show how our technique can be practically realized using a combination of string B-trees and R-trees.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Lars Arge , Vasilis Samoladas , Jeffrey Scott Vitter, On two-dimensional indexability and optimal range search indexing, Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.346-357, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
[doi> 10.1145/303976.304010]
|
| |
2
|
P. Agarwal and J. Erickson. Geometric range searching and its relatives. Advances in Discrete and Computational Geometry, AMS, 1998.
|
 |
3
|
|
 |
4
|
Norbert Beckmann , Hans-Peter Kriegel , Ralf Schneider , Bernhard Seeger, The R*-tree: an efficient and robust access method for points and rectangles, Proceedings of the 1990 ACM SIGMOD international conference on Management of data, p.322-331, May 23-26, 1990, Atlantic City, New Jersey, United States
|
 |
5
|
|
| |
6
|
|
 |
7
|
|
 |
8
|
|
| |
9
|
|
| |
10
|
|
 |
11
|
|
 |
12
|
H. V. Jagadish , Nick Koudas , Divesh Srivastava, On effective multi-dimensional indexing for strings, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.403-414, May 15-18, 2000, Dallas, Texas, United States
|
| |
13
|
|
| |
14
|
|
| |
15
|
|
 |
16
|
|
 |
17
|
|
 |
18
|
|
| |
19
|
|
| |
20
|
|
| |
21
|
|
 |
22
|
|
 |
23
|
|
|