Multi-dimensional clustering: a new data layout scheme in DB2
Source: International Conference on Management of Data
Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data
San Diego, California
Session: Industrial track session 2: server technology
Pages: 637 - 641  
Year of Publication: 2003
ISBN:1-58113-634-X
Authors
Sriram Padmanabhan  IBM T.J. Watson Research Center, Hawthorne, New York
Bishwaranjan Bhattacharjee  IBM T.J. Watson Research Center, Hawthorne, New York
Tim Malkemus  IBM T.J. Watson Research Center, Hawthorne, New York
Leslie Cranston  IBM Toronto Laboratory, Markham, Ontario, Canada
Matthew Huras  IBM Toronto Laboratory, Markham, Ontario, Canada
Sponsor
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/872757.872835

ABSTRACT

We describe the design and implementation of a new data layout scheme, called multi-dimensional clustering, in DB2 Universal Database Version 8. Many applications, e.g., OLAP and data warehousing, process a table or tables in a database using a multi-dimensional access paradigm. Currently, most database systems can only support organization of a table using a primary clustering index. Secondary indexes are created to access the tables when the primary key index is not applicable. Unfortunately, secondary indexes perform many random I/O accesses against the table for a simple operation such as a range query. Our work in multi-dimensional clustering addresses this important deficiency in database systems. Multi-Dimensional Clustering is based on the definition of one or more orthogonal clustering attributes (or expressions) of a table. The table is organized physically by associating records with similar values for the dimension attributes in a cluster. We describe novel techniques for maintaining this physical layout efficiently and methods of processing database operations that provide significant performance improvements. We show results from experiments using a star-schema database to validate our claims of performance with minimal overhead.
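
The clustering idea in the abstract can be illustrated with a small in-memory sketch. It is not DB2's implementation: the class `ToyMDCTable`, its methods, and the sample columns (`region`, `year`, `amount`) are hypothetical names chosen for illustration. The sketch assumes that records with identical values for all dimension attributes are stored together in one cell, and that a per-dimension index maps each value to the cells containing it, so a query on a dimension reads whole cells instead of fetching individual records.

```python
from collections import defaultdict


class ToyMDCTable:
    """Toy model (hypothetical): rows with equal dimension values share one cell."""

    def __init__(self, dimensions):
        self.dimensions = tuple(dimensions)
        # cell key (one value per dimension) -> rows stored together
        self.cells = defaultdict(list)
        # dimension name -> dimension value -> set of cell keys holding that value
        self.dim_index = {d: defaultdict(set) for d in self.dimensions}

    def insert(self, row):
        key = tuple(row[d] for d in self.dimensions)
        self.cells[key].append(row)
        for dim, value in zip(self.dimensions, key):
            self.dim_index[dim][value].add(key)

    def range_query(self, dim, low, high):
        """Rows with low <= row[dim] <= high, reading only the matching cells."""
        keys = set()
        for value, cell_keys in self.dim_index[dim].items():
            if low <= value <= high:
                keys |= cell_keys
        return [row for key in sorted(keys) for row in self.cells[key]]

    def point_query(self, **conditions):
        """Equality on several dimensions: intersect the per-dimension cell sets."""
        sets = [self.dim_index[d].get(v, set()) for d, v in conditions.items()]
        keys = set.intersection(*sets) if sets else set(self.cells)
        return [row for key in sorted(keys) for row in self.cells[key]]


# Sample data (made up for illustration only).
table = ToyMDCTable(["region", "year"])
for region, year, amount in [
    ("east", 2001, 10),
    ("west", 2001, 20),
    ("east", 2002, 30),
    ("east", 2001, 40),
]:
    table.insert({"region": region, "year": year, "amount": amount})

rows_2001 = table.range_query("year", 2001, 2001)
east_2001 = table.point_query(region="east", year=2001)
```

In this toy model the four sample rows fall into three cells, and the query on `year` touches two of them. A real system would additionally have to map cells to disk blocks and maintain that layout under inserts and deletes, which is the part the paper addresses and this sketch leaves out.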

