ACM Home Page
Please provide us with feedback. Feedback
Self-organizing tuple reconstruction in column-stores
Full text PdfPdf (952 KB)
Source
International Conference on Management of Data archive
Proceedings of the 35th SIGMOD international conference on Management of data table of contents
Providence, Rhode Island, USA
SESSION: Research session 8: column stores table of contents
Pages 297-308  
Year of Publication: 2009
ISBN:978-1-60558-551-2
Authors
Stratos Idreos  CWI, Amsterdam, Netherlands
Martin L. Kersten  CWI, Amsterdam, Netherlands
Stefan Manegold  CWI, Amsterdam, Netherlands
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 56,   Downloads (12 Months): 193,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1559845.1559878
What is a DOI?

ABSTRACT

Column-stores gained popularity as a promising physical design alternative. Each attribute of a relation is physically stored as a separate column allowing queries to load only the required attributes. The overhead incurred is on-the-fly tuple reconstruction for multi-attribute queries. Each tuple reconstruction is a join of two columns based on tuple IDs, making it a significant cost component. The ultimate physical design is to have multiple presorted copies of each base table such that tuples are already appropriately organized in multiple different orders across the various columns. This requires the ability to predict the workload, idle time to prepare, and infrequent updates.

In this paper, we propose a novel design, partial sideways cracking, that minimizes the tuple reconstruction cost in a self-organizing way. It achieves performance similar to using presorted data, but without requiring the heavy initial presorting step itself. Instead, it handles dynamic, unpredictable workloads with no idle time and frequent updates. Auxiliary dynamic data structures, called cracker maps, provide a direct mapping between pairs of attributes used together in queries for tuple reconstruction. A map is continuously physically reorganized as an integral part of query evaluation, providing faster and reduced data access for future queries. To enable flexible and self-organizing behavior in storage-limited environments, maps are materialized only partially as demanded by the workload. Each map is a collection of separate chunks that are individually reorganized, dropped or recreated as needed. We implemented partial sideways cracking in an open-source column-store. A detailed experimental analysis demonstrates that it brings significant performance benefits for multi-attribute queries.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
D. Abadi et al. Materialization Strategies in a Column-Oriented DBMS. ICDE 2007.
 
3
S. Agrawal et al. Database Tuning Advisor for Microsoft SQL Server. VLDB 2004.
 
4
P. Boncz, M. Zukowski, and N. Nes. MonetDB/X100: Hyper-Pipelining Query Execution. CIDR 2005.
 
5
 
6
 
7
S. Idreos, M. Kersten, and S. Manegold. Database Cracking. CIDR 2007.
8
 
9
M. Kersten and S. Manegold. Cracking the Database Store. CIDR 2005.
 
10
11
 
12
 
13
 
14
TPC Benchmark H. http://www.tpc.org/tpch/.
 
15
MonetDB. http://monetdb.cwi.nl/.

Collaborative Colleagues:
Stratos Idreos: colleagues
Martin L. Kersten: colleagues
Stefan Manegold: colleagues