|
ABSTRACT
Decision support applications involve complex queries on very large databases. Since response times should be small, query optimization is critical. Users typically view the data as multidimensional data cubes. Each cell of the data cube is a view consisting of an aggregation of interest, like total sales. The values of many of these cells are dependent on the values of other cells in the data cube. A common and powerful query optimization technique is to materialize some or all of these cells rather than compute them from raw data each time. Commercial systems differ mainly in their approach to materializing the data cube. In this paper, we investigate the issue of which cells (views) to materialize when it is too expensive to materialize all views. A lattice framework is used to express dependencies among views. We present greedy algorithms that work off this lattice and determine a good set of views to materialize. The greedy algorithm performs within a small constant factor of optimal under a variety of models. We then consider the most common case of the hypercube lattice and examine the choice of materialized views for hypercubes in detail, giving some good tradeoffs between the space used and the average time to answer a query.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
Arb
|
Arbor Software. Multidimensional Analysis: (',onverting Corporate Data into Strategic Information. White Paper. At ht, tp://www.arborsoft.com/ papers / multiTO ( ~,. ht ml
|
| |
Che96
|
C. Chekllri. Personal comnmnication, 1996.
|
| |
CS94
|
|
 |
Fei96
|
|
| |
GBLP95
|
J..Gray, A. Bosworth, A. Layman, H. Pirahesh Data Cube: A Relational Aggregation ()perator Generalizing Group-By, Cross-Tab, and Sub- Totals. Microsoft TedmicaI Report No. MSR-TR- 95-22.
|
| |
GHQ95
|
|
| |
GHRU96
|
H. Gupta, V. Harinarayan, A. Rajaralnan, and J. D. Ulhnan. Index Selection for ()LAP. Sift> mitred for publication. At http://db.stanford.edu/ pub/hgupt a/1996 / CubeIndex. ps
|
 |
Gra93
|
|
| |
HRU95
|
V. Harinarayan, A. Rajaraman, and J. D. Ullman. Implementing Data Cubes Efficiently. A flfll version of Lhis paper. At http://db.stanford.edu/ pllb / harinarayan / 1995 / cub e. ps
|
| |
HNSS95
|
|
 |
OG95
|
|
| |
Raa95
|
F. Raab, editor. TPC, Benchmark(tin) D (Decision Support), Proposed Revision 1.0. Transaction Processing Performance Council, San Jose, CA 95112, 4 April 1995.
|
 |
Rad95
|
|
| |
STG
|
Stanford Technology Group, Inc. Designing the Data Warehouse On Relational Databases. White Paper.
|
| |
Xen94
|
J. Xenakis, editor. Multidimensional Databases, tn Application Development Strategies, April 1994.
|
CITED BY 263
|
|
|
|
|
|
|
|
Jef Wijsen , Raymond T. Ng , Toon Calders, Discovering roll-up dependencies, Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, p.213-222, August 15-18, 1999, San Diego, California, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Jayavel Shanmugasundaram , Usama Fayyad , P. S. Bradley, Compressed data cubes for OLAP aggregate query approximation on continuous dimensions, Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, p.223-232, August 15-18, 1999, San Diego, California, United States
|
|
|
|
|
|
|
|
|
J. Albrecht , W. Hümmer , W. Lehner , L. Schlesinger, Query optimization by using derivability in a data warehouse environment, Proceedings of the 3rd ACM international workshop on Data warehousing and OLAP, p.49-56, November 06-11, 2000, McLean, Virginia, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Osmar R. Zaïane , Jiawei Han , Ze-Nian Li , Jean Hou, Mining multimedia data, Proceedings of the 1998 conference of the Centre for Advanced Studies on Collaborative research, p.24, November 30-December 03, 1998, Toronto, Ontario, Canada
|
|
|
Barbara Dinter , Carsten Sapia , Gabriele Höfling , Markus Blaschka, The OLAP market: state of the art and research issues, Proceedings of the 1st ACM international workshop on Data warehousing and OLAP, p.22-27, November 02-07, 1998, Washington, D.C., United States
|
|
|
|
|
|
|
|
|
|
|
|
Wai Gen Yee , Michael J. Donahoo , Edward Omiecinski , Shamkant B. Navathe, Scaling replica maintenance in intermittently synchronized mobile databases, Proceedings of the tenth international conference on Information and knowledge management, October 05-10, 2001, Atlanta, Georgia, USA
|
|
|
|
|
|
Ching-Tien Ho , Jehoshua Bruck , Rakesh Agrawal, Partial-sum queries in OLAP data cubes using covering codes, Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.228-237, May 11-15, 1997, Tucson, Arizona, United States
|
|
|
Jeffrey Scott Vitter , Min Wang , Bala Iyer, Data cube approximation and histograms via wavelets, Proceedings of the seventh international conference on Information and knowledge management, p.96-104, November 02-07, 1998, Bethesda, Maryland, United States
|
|
|
|
|
|
Goretti K. Y. Chan , Qing Li , Ling Feng, Design and selection of materialized views in a data warehousing environment: a case study, Proceedings of the 2nd ACM international workshop on Data warehousing and OLAP, p.42-47, November 02-06, 1999, Kansas City, Missouri, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
John R. Smith , Vittorio Castelli , Anant Jhingran , Chung-Sheng Li, Dynamic assembly of views in data cubes, Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.274-283, June 01-04, 1998, Seattle, Washington, United States
|
|
|
Stéphane Grumbach , Maurizio Rafanelli , Leonardo Tininini, Querying aggregate data, Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.174-184, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
|
|
|
|
|
|
|
|
|
David W. Cheung , Bo Zhou , Ben Kao , Hongjun Lu , Tak Wah Lam , Hing Fung Ting, Requirement-based data cube schema design, Proceedings of the eighth international conference on Information and knowledge management, p.162-169, November 02-06, 1999, Kansas City, Missouri, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Biswadeep Nag , Prasad M. Deshpande , David J. DeWitt, Using a knowledge cache for interactive discovery of association rules, Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, p.244-253, August 15-18, 1999, San Diego, California, United States
|
|
|
Hidetoshi Uchiyama , Kanda Runapongsa , Toby J. Teorey, A progressive view materialization algorithm, Proceedings of the 2nd ACM international workshop on Data warehousing and OLAP, p.36-41, November 02-06, 1999, Kansas City, Missouri, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
José Luis Ambite , Yigal Arens , Eduard Hovy , Andrew Philpot , Luis Gravano , Vasileios Hatzivassiloglou , Judith Klavans, Simplifying Data Access: The Energy Data Collection Project, Computer, v.34 n.2, p.47-54, February 2001
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
W. Lehner , W. Hümmer , L. Schlesinger , A. Bauer, On the problem of generating common predecessors, Proceedings of the 5th ACM international workshop on Data Warehousing and OLAP, p.43-48, November 08-08, 2002, McLean, Virginia, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Kanda Runapongsa , Thomas P. Nadeau , Toby J. Teorey, Storage estimation for multidimensional aggregates in OLAP, Proceedings of the 1999 conference of the Centre for Advanced Studies on Collaborative research, p.10, November 08-11, 1999, Mississauga, Ontario, Canada
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Yannis Sismanis , Antonios Deligiannakis , Yannis Kotidis , Nick Roussopoulos, Hierarchical dwarfs for the rollup cube, Proceedings of the 6th ACM international workshop on Data warehousing and OLAP, November 07-07, 2003, New Orleans, Louisiana, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Guozhu Dong , Jiawei Han , Joyce M. W. Lam , Jian Pei , Ke Wang , Wei Zou, Mining Constrained Gradients in Large Databases, IEEE Transactions on Knowledge and Data Engineering, v.16 n.8, p.922-938, August 2004
|
|
|
|
|
|
|
|
|
Cuiping Li , Gao Cong , Anthony K. H. Tung , Shan Wang, Incremental maintenance of quotient cube for median, Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, August 22-25, 2004, Seattle, WA, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
L. Schlesinger , A. Bauer , W. Lehner , G. Ediberidze , M. Gutzmann, Efficiently synchronizing multidimensional schema data, Proceedings of the 4th ACM international workshop on Data warehousing and OLAP, p.69-76, November 09-09, 2001, Atlanta, Georgia, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Panos Kalnis , Wee Siong Ng , Beng Chin Ooi , Dimitris Papadias , Kian-Lee Tan, An adaptive peer-to-peer network for distributed caching of OLAP results, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ronald Fagin , R. Guha , Ravi Kumar , Jasmine Novak , D. Sivakumar , Andrew Tomkins, Multi-structural databases, Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 13-15, 2005, Baltimore, Maryland
|
|
|
|
|
|
|
|
|
R. Fagin , Ph. Kolaitis , R. Kumar , J. Novak , D. Sivakumar , A. Tomkins, Efficient implementation of large-scale multi-structural databases, Proceedings of the 31st international conference on Very large data bases, August 30-September 02, 2005, Trondheim, Norway
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
José Luis Ambite , Yigal Arens , Luis Gravano , Vasileios Hatzivassiloglou , Eduard Hovy , Judith Klavans , Andrew Philpot , Usha Ramachandran , Jay Sandhaus , Anurag Singla , Brian Whitman, Simplifying data access: the energy data collection (EDC) project, Proceedings of the 2000 annual national conference on Digital government research, p.1-11, May 15-17, 2000
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Jiawei Han , Yixin Chen , Guozhu Dong , Jian Pei , Benjamin W. Wah , Jianyong Wang , Y. Dora Cai, Stream Cube: An Architecture for Multi-Dimensional Analysis of Data Streams, Distributed and Parallel Databases, v.18 n.2, p.173-197, September 2005
|
|
|
|
|
|
|
|
|
|
|
|
Ying Chen , Frank Dehne , Todd Eavis , Andrew Rau-Chaplin, PnP: sequential, external memory, and parallel iceberg cube computation, Distributed and Parallel Databases, v.23 n.2, p.99-126, April 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Roozbeh Derakhshan , Frank Dehne , Othmar Korn , Bela Stantic, Simulated annealing for materialized view selection in data warehousing environment, Proceedings of the 24th IASTED international conference on Database and applications, p.89-94, February 13-15, 2006, Innsbruck, Austria
|
|
|
|
|
|
|
|
|
Doug Burdick , Prasad M. Deshpande , T. S. Jayram , Raghu Ramakrishnan , Shivakumar Vaithyanathan, Efficient allocation algorithms for OLAP over imprecise data, Proceedings of the 32nd international conference on Very large data bases, September 12-15, 2006, Seoul, Korea
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Cristina Dutra de Aguiar Ciferri , Ricardo Rodrigues Ciferri , Diogo Tuler Forlani , Agma Juci Machado Traina , Fernando da Fonseca de Souza, Horizontal fragmentation as a technique to improve the performance of drill-down and roll-up queries, Proceedings of the 2007 ACM symposium on Applied computing, March 11-15, 2007, Seoul, Korea
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Dong Xin , Jiawei Han , Xiaolei Li , Benjamin W. Wah, Star-cubing: computing iceberg cubes by top-down and bottom-up integration, Proceedings of the 29th international conference on Very large data bases, p.476-487, September 09-12, 2003, Berlin, Germany
|
|
|
Yixin Chen , Guozhu Dong , Jiawei Han , Benjamin W. Wah , Jianyong Wang, Multi-dimensional regression analysis of time-series data streams, Proceedings of the 28th international conference on Very Large Data Bases, p.323-334, August 20-23, 2002, Hong Kong, China
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Cuiping Li , Beng Chin Ooi , Anthony K. H. Tung , Shan Wang, DADA: a data cube for dominant relationship analysis, Proceedings of the 2006 ACM SIGMOD international conference on Management of data, June 27-29, 2006, Chicago, IL, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Craig J. Bunker , Latha S. Colby , Richard L. Cole , William J. McKenna , Gopal Mulagund , David Wilhite, Aggregate Maintenance for Data Warehousing in Informix Red Brick Vista, Proceedings of the 27th International Conference on Very Large Data Bases, p.659-662, September 11-14, 2001
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Jim Gray , Surajit Chaudhuri , Adam Bosworth , Andrew Layman , Don Reichart , Murali Venkatrao , Frank Pellow , Hamid Pirahesh, Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals, Data Mining and Knowledge Discovery, v.1 n.1, p.29-53, 1997
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Sameet Agarwal , Rakesh Agrawal , Prasad Deshpande , Ashish Gupta , Jeffrey F. Naughton , Raghu Ramakrishnan , Sunita Sarawagi, On the Computation of Multidimensional Aggregates, Proceedings of the 22th International Conference on Very Large Data Bases, p.506-521, September 03-06, 1996
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Yixin Chen , Guozhu Dong , Jiawei Han , Jian Pei , Benjamin W. Wah , Jianyong Wang, Regression Cubes with Lossless Compression and Aggregation, IEEE Transactions on Knowledge and Data Engineering, v.18 n.12, p.1585-1599, December 2006
|
|
|
|
|
|
|
|
|
|
|
|
Xiaolei Li , Jiawei Han , Zhijun Yin , Jae-Gil Lee , Yizhou Sun, Sampling cube: a framework for statistical olap over sampling data, Proceedings of the 2008 ACM SIGMOD international conference on Management of data, June 09-12, 2008, Vancouver, Canada
|
|
|
Yu-Chin Liu , Ping-Yu Hsu , Gwo-Ji Sheen , Steve Ku , Kai-Wen Chang, Simultaneous determination of view selection and update policy with stochastic query and response time constraints, Information Sciences: an International Journal, v.178 n.18, p.3491-3509, September, 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Marcos Antonio Vaz Salles , Jens-Peter Dittrich , Shant Kirakos Karakashian , Olivier René Girard , Lukas Blunschi, iTrails: pay-as-you-go information integration in dataspaces, Proceedings of the 33rd international conference on Very large data bases, September 23-27, 2007, Vienna, Austria
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|