|
ABSTRACT
B-trees are the data structure of choice for maintaining searchable data on disk. However, B-trees perform suboptimally - when keys are long or of variable length,
- when keys are compressed, even when using front compression, the standard B-tree compression scheme,
- for range queries, and
- with respect to memory effects such as disk prefetching.
This paper presents a cache-oblivious string B-tree (COSB-tree) data structure that is efficient in all these ways: - The COSB-tree searches asymptotically optimally and inserts and deletes nearly optimally.
- It maintains an index whose size is proportional to the front-compressed size of the dictionary. Furthermore, unlike standard front-compressed strings, keys can be decompressed in a memory-efficient manner.
- It performs range queries with no extra disk seeks; in contrast, B-trees incur disk seeks when skipping from leaf block to leaf block.
- It utilizes all levels of a memory hierarchy efficiently and makes good use of disk locality by using cache-oblivious layout strategies.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
| |
2
|
S. Alstrup, M. A. Bender, E. D. Demaine, M. Farach-Colton, T. Rauhe, and M. Thorup. Efficient tree layout in a multilevel memory hierarchy. arXiv:cs.DS/0211010, 2004. http://www.arXiv.org/pdf/cs.DS/0211010.
|
| |
3
|
|
| |
4
|
R. Bayer and E. M. McCreight. Organization and maintenance of large ordered indexes. Acta Inf., 1(3):173--189, Feb. 1972.
|
 |
5
|
|
| |
6
|
|
| |
7
|
|
| |
8
|
|
| |
9
|
|
| |
10
|
Michael A. Bender , Ziyang Duan , John Iacono , Jing Wu, A locality-preserving cache-oblivious dynamic dictionary, Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms, p.29-38, January 06-08, 2002, San Francisco, California
|
| |
11
|
|
 |
12
|
|
| |
13
|
|
| |
14
|
H. K. Chang. Compressed indexing method. IBM Technical Disclosure Bulletin, 11(11):1400--1401, 1969.
|
| |
15
|
W. A. Clark IV, K. A. Salmond, and T. A. Stafford. Method and means for generating compressed keys. US Patent 3,593,309, 3 Jan. 1969.
|
 |
16
|
|
| |
17
|
E. D. Demaine, J. Iacono, and S. Langerman. Worst-case optimal tree layout in a memory hierarchy. Manuscript. arXiv:DS/0410048, 2004. http://john2.poly.edu/papers/manu04/paper.pdf.
|
 |
18
|
|
 |
19
|
|
| |
20
|
|
| |
21
|
|
| |
22
|
|
| |
23
|
|
| |
24
|
|
| |
25
|
I. Katriel. Implicit data structures based on local reorganizations. Master's thesis, Technion, Israel Inst. of Tech., Haifa, May 2002.
|
| |
26
|
D. E. Knuth. Sorting and Searching, vol. 3 of The Art of Computer Programming. Addison-Wesley, Reading, Massachusetts, 1973.
|
| |
27
|
K. Mehlhorn. Data Structures and Algorithms 1: Sorting and Searching, pp. 198--199. Springer-Verlag, Berlin, 1984.
|
| |
28
|
|
| |
29
|
H. Prokop. Cache-oblivious algorithms. Master's thesis, Department of Electrical Engineering and Computer Science, Massachusetts Inst. of Tech., June 1999.
|
| |
30
|
Sleepycat Software. The Berkeley Database. http://www.sleepycat.com, 2005.
|
 |
31
|
|
| |
32
|
R. E. Wagner. Indexing design considerations. IBM Syst. J., 12(4):351--367, 1973.
|
CITED BY 12
|
|
|
|
|
Paolo Ferragina , Roberto Grossi , Ankur Gupta , Rahul Shah , Jeffrey Scott Vitter, On searching compressed string collections cache-obliviously, Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 09-12, 2008, Vancouver, Canada
|
|
|
|
|
|
|
|
|
Michael A. Bender , Martin Farach-Colton , Jeremy T. Fineman , Yonatan R. Fogel , Bradley C. Kuszmaul , Jelani Nelson, Cache-oblivious streaming B-trees, Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures, June 09-11, 2007, San Diego, California, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|