ACM Home Page
Please provide us with feedback. Feedback
Digital Library logoTake a look at the new version of this page: [ beta version ]. Tell us what you think.
Cache-oblivious string B-trees
Full text PdfPdf (154 KB)
Source Symposium on Principles of Database Systems archive
Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems table of contents
Chicago, IL, USA
SESSION: Query processing table of contents
Pages: 233 - 242  
Year of Publication: 2006
ISBN:1-59593-318-2
Authors
Michael A. Bender  Stony Brook University
Martin Farach-Colton  Rutgers
Bradley C. Kuszmaul  MIT CSAIL
Sponsors
SIGMOD: ACM Special Interest Group on Management of Data
SIGART: ACM Special Interest Group on Artificial Intelligence
ACM: Association for Computing Machinery
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 9,   Downloads (12 Months): 81,   Citation Count: 12
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1142351.1142385
What is a DOI?

ABSTRACT

B-trees are the data structure of choice for maintaining searchable data on disk. However, B-trees perform suboptimally

  • when keys are long or of variable length,
  • when keys are compressed, even when using front compression, the standard B-tree compression scheme,
  • for range queries, and
  • with respect to memory effects such as disk prefetching.
This paper presents a cache-oblivious string B-tree (COSB-tree) data structure that is efficient in all these ways:
  • The COSB-tree searches asymptotically optimally and inserts and deletes nearly optimally.
  • It maintains an index whose size is proportional to the front-compressed size of the dictionary. Furthermore, unlike standard front-compressed strings, keys can be decompressed in a memory-efficient manner.
  • It performs range queries with no extra disk seeks; in contrast, B-trees incur disk seeks when skipping from leaf block to leaf block.
  • It utilizes all levels of a memory hierarchy efficiently and makes good use of disk locality by using cache-oblivious layout strategies.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
S. Alstrup, M. A. Bender, E. D. Demaine, M. Farach-Colton, T. Rauhe, and M. Thorup. Efficient tree layout in a multilevel memory hierarchy. arXiv:cs.DS/0211010, 2004. http://www.arXiv.org/pdf/cs.DS/0211010.
 
3
 
4
R. Bayer and E. M. McCreight. Organization and maintenance of large ordered indexes. Acta Inf., 1(3):173--189, Feb. 1972.
5
 
6
 
7
 
8
 
9
 
10
 
11
12
 
13
 
14
H. K. Chang. Compressed indexing method. IBM Technical Disclosure Bulletin, 11(11):1400--1401, 1969.
 
15
W. A. Clark IV, K. A. Salmond, and T. A. Stafford. Method and means for generating compressed keys. US Patent 3,593,309, 3 Jan. 1969.
16
 
17
E. D. Demaine, J. Iacono, and S. Langerman. Worst-case optimal tree layout in a memory hierarchy. Manuscript. arXiv:DS/0410048, 2004. http://john2.poly.edu/papers/manu04/paper.pdf.
18
19
 
20
 
21
 
22
 
23
 
24
 
25
I. Katriel. Implicit data structures based on local reorganizations. Master's thesis, Technion, Israel Inst. of Tech., Haifa, May 2002.
 
26
D. E. Knuth. Sorting and Searching, vol. 3 of The Art of Computer Programming. Addison-Wesley, Reading, Massachusetts, 1973.
 
27
K. Mehlhorn. Data Structures and Algorithms 1: Sorting and Searching, pp. 198--199. Springer-Verlag, Berlin, 1984.
 
28
 
29
H. Prokop. Cache-oblivious algorithms. Master's thesis, Department of Electrical Engineering and Computer Science, Massachusetts Inst. of Tech., June 1999.
 
30
Sleepycat Software. The Berkeley Database. http://www.sleepycat.com, 2005.
31
 
32
R. E. Wagner. Indexing design considerations. IBM Syst. J., 12(4):351--367, 1973.

CITED BY  12

Collaborative Colleagues:
Michael A. Bender: colleagues
Martin Farach-Colton: colleagues
Bradley C. Kuszmaul: colleagues