|
ABSTRACT
Databases have achieved orders-of-magnitude performance improvements by changing the layout of stored data -- for instance, by arranging data in columns or compressing it before storage. These improvements have been implemented in monolithic new engines, however, making it difficult to experiment with feature combinations or extensions. We present Anvil, a modular and extensible toolkit for building database back ends. Anvil's storage modules, called dTables, have much finer granularity than prior work. For example, some dTables specialize in writing data, while others provide optimized read-only formats. This specialization makes both kinds of dTable simple to write and understand. Unifying dTables implement more comprehensive functionality by layering over other dTables -- for instance, building a read/write store from read-only tables and a writable journal, or building a general-purpose store from optimized special-purpose stores. The dTable design leads to a flexible system powerful enough to implement many database storage layouts. Our prototype implementation of Anvil performs up to 5.5 times faster than an existing B-tree-based database back end on conventional workloads, and can easily be customized for further gains on specific data and workloads.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Daniel Abadi, Samuel Madden, and Miguel Ferreira. Integrating compression and execution in column-oriented database systems. In Proc. SIGMOD '06, pages 671--682, 2006.
|
| |
2
|
Don Steve Batory, J.R. Barnett, Jorge F. Garza, Kenneth Paul Smith, K. Tsukuda, C. Twichell, and T.E. Wise. GENESIS: an extensible database management system. IEEE Transactions on Software Engineering, 14(11):1711--1730, 1988.
|
| |
3
|
Rudolf Bayer and Edward M. McCreight. Organization and maintenance of large ordered indices. In SIGFIDET Workshop, pages 107--141, July 1970.
|
| |
4
|
Burton H. Bloom. Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 13(7):422--426, 1970.
|
| |
5
|
Peter Alexander Boncz. Monet: A Next-Generation DBMS Kernel For Query-Intensive Applications. PhD thesis, Universiteit van Amsterdam, Amsterdam, The Netherlands, May 2002.
|
| |
6
|
CDB Constant DataBase. http://cr.yp.to/cdb.html.
|
| |
7
|
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. Bigtable: a distributed storage system for structured data. In Proc. OSDI '06, pages 205--218, November 2006.
|
| |
8
|
DBT2. http://sf.net/projects/osdldbt/.
|
| |
9
|
Christopher Frost, Mike Mammarella, Eddie Kohler, Andrew de los Reyes, Shant Hovsepian, Andrew Matsuoka, and Lei Zhang. Generalized file system dependencies. In Proc. SOSP '07, pages 307--320, 2007.
|
| |
10
|
Stavros Harizopoulos, Daniel J. Abadi, Samuel Madden, and Michael Stonebraker. OLTP through the looking glass, and what we found there. In Proc. SIGMOD '08, pages 981--992, 2008.
|
| |
11
|
Stavros Harizopoulos, Velen Liang, Daniel J. Abadi, and Samuel Madden. Performance tradeoffs in read-optimized databases. In Proc. VLDB '06, pages 487--498, 2006.
|
| |
12
|
Nicholas Lester, Alistair Moffat, and Justin Zobel. Efficient online index construction for text databases. ACM Transactions on Database Systems, 33(3):1--33, 2008.
|
| |
13
|
Bruce Lindsay, John McPherson, and Hamid Pirahesh. A data management extension architecture. SIGMOD Record, 16(3):220--226, 1987.
|
| |
14
|
David E. Lowell and Peter M. Chen. Free transactions with Rio Vista. In Proc. SOSP '97, pages 92--101, 1997.
|
| |
15
|
MySQL. http://www.mysql.com/.
|
| |
16
|
MySQL Internals Custom Engine. http://forge.mysql.com/wiki/MySQL_Internals_Custom_Engine.
|
| |
17
|
Edmund B. Nightingale, Kaushik Veeraraghavan, Peter M. Chen, and Jason Flinn. Rethink the sync. In Proc. OSDI '06, pages 1--14, November 2006.
|
| |
18
|
Patrick O'Neil, Edward Cheng, Dieter Gawlick, and Elizabeth O'Neil. The log-structured merge-tree (LSM-tree). Acta Informatica, 33(4):351--385, 1996.
|
| |
19
|
Oracle. http://www.oracle.com/.
|
| |
20
|
Mendel Rosenblum and John K. Ousterhout. The design and implementation of a log-structured file system. ACM Transactions on Computer Systems, 10(1):1--15, 1992.
|
| |
21
|
Russell Sears and Eric Brewer. Stasis: flexible transactional storage. In Proc. OSDI '06, pages 29--44, November 2006.
|
| |
22
|
Russell Sears, Mark Callaghan, and Eric Brewer. Rose: Compressed, log-structured replication. In Proc. VLDB '08, August 2008.
|
| |
23
|
SQLite. http://www.sqlite.org/.
|
| |
24
|
Michael Stonebraker, Daniel J. Abadi, Adam Batkin, Xuedong Chen, Mitch Cherniack, Miguel Ferreira, Edmond Lau, Amerson Lin, Sam Madden, Elizabeth O'Neil, Pat O'Neil, Alex Rasin, Nga Tran, and Stan Zdonik. C-store: a column-oriented DBMS. In Proc. VLDB '05, pages 553--564, 2005.
|
| |
25
|
Michael Stonebraker and Greg Kemnitz. The POSTGRES next generation database management system. Communications of the ACM, 34(10):78--92, 1991.
|
| |
26
|
Michael Stonebraker, Samuel Madden, Daniel J. Abadi, Stavros Harizopoulos, Nabil Hachem, and Pat Helland. The end of an architectural era (It's time for a complete rewrite). In Proc. VLDB '07, pages 1150--1160, 2007.
|
| |
27
|
Subversion. http://subversion.tigris.org/.
|
| |
28
|
TPC-C. http://www.tpc.org/tpcc/.
|
| |
29
|
TPC-H. http://www.tpc.org/tpch/.
|
| |
30
|
Theodore Ts'o. Delayed allocation and the zero-length file problem. Theodore Ts'o's blog. http://tinyurl.com/dy7rgm (retrieved March 2009).
|
| |
31
|
Till Westmann, Donald Kossmann, Sven Helmer, and Guido Moerkotte. The implementation and performance of compressed databases. SIGMOD Record, 29(3):55--67, 2000.
|
| |
32
|
Charles P. Wright, Jay Dave, Puja Gupta, Harikesavan Krishnan, David P. Quigley, Erez Zadok, and Mohammad Nayyer Zubair. Versatility and Unix semantics in namespace unification. ACM Transactions on Storage, 2(1):74--105, February 2006.
|
| |
33
|
ZFS Space Maps. http://blogs.sun.com/bonwick/entry/space_maps.
|
| |
34
|
Jacob Ziv and Abraham Lempel. A universal algorithm for sequential data compression. IEEE Transactions on Information Theory, 23(3):337--343, May 1977.
|
|