Efficient computation of Iceberg cubes with complex measures
Source
International Conference on Management of Data
Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Santa Barbara, California, United States
Pages: 1 - 12
Year of Publication: 2001
ISBN:1-58113-332-4
Authors
Jiawei Han, School of Computing Science, Simon Fraser University, B.C., Canada
Jian Pei, School of Computing Science, Simon Fraser University, B.C., Canada
Guozhu Dong, Department of Computer Science, Wright State University, Dayton, OH
Ke Wang, School of Computing Science, Simon Fraser University, B.C., Canada
ABSTRACT
It is often too expensive to compute and materialize a complete high-dimensional data cube. Computing an iceberg cube, which contains only aggregates above certain thresholds, is an effective way to derive nontrivial multi-dimensional aggregations for OLAP and data mining.
In this paper, we study efficient methods for computing iceberg cubes with some popularly used complex measures, such as average, and develop a methodology that adopts a weaker but anti-monotonic condition for testing and pruning search space. In particular, for efficient computation of iceberg cubes with the average measure, we propose a top-k average pruning method and extend two previously studied methods, Apriori and BUC, to Top-k Apriori and Top-k BUC. To further improve the performance, an interesting hypertree structure, called H-tree, is designed and a new iceberg cubing method, called Top-k H-Cubing, is developed. Our performance study shows that Top-k BUC and Top-k H-Cubing are two promising candidates for scalable computation, and Top-k H-Cubing has better performance in most cases.
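The pruning idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the iceberg condition is "average at least min_avg and count at least min_count", and it uses a simple BUC-style bottom-up recursion rather than the H-tree. The names top_k_average and iceberg_cells are hypothetical, chosen only for this example. The plain average is not anti-monotonic (a cell can fail the threshold while one of its descendants passes), but the average of the k largest values in a cell, with k = min_count, is an upper bound on the average of any descendant holding at least k rows, so it can safely be used to prune.

```python
from heapq import nlargest


def top_k_average(values, k):
    """Average of the k largest values, or None if there are fewer than k."""
    if len(values) < k:
        return None
    return sum(nlargest(k, values)) / k


def iceberg_cells(rows, num_dims, min_count, min_avg):
    """Return {cell: average} for cells with count >= min_count and avg >= min_avg.

    rows is a list of (dimension_tuple, measure) pairs; "*" in a cell marks
    an aggregated dimension. Illustrative sketch only.
    """
    result = {}

    def expand(cell, part, start_dim):
        values = [measure for _, measure in part]
        bound = top_k_average(values, min_count)
        # Prune this cell and all its descendants: either it has too few rows,
        # or even its best min_count rows cannot reach min_avg.
        if bound is None or bound < min_avg:
            return
        avg = sum(values) / len(values)
        if avg >= min_avg:
            result[cell] = avg
        # Specialize one more dimension at a time (BUC-style, bottom-up).
        for d in range(start_dim, num_dims):
            groups = {}
            for row in part:
                groups.setdefault(row[0][d], []).append(row)
            for value, sub in groups.items():
                child = cell[:d] + (value,) + cell[d + 1:]
                expand(child, sub, d + 1)

    expand(("*",) * num_dims, rows, 0)
    return result


rows = [
    (("a", "x"), 10.0),
    (("a", "x"), 20.0),
    (("a", "y"), 1.0),
    (("b", "x"), 2.0),
    (("b", "y"), 3.0),
    (("b", "y"), 1.0),
]
cube = iceberg_cells(rows, num_dims=2, min_count=2, min_avg=10.0)
for cell, avg in sorted(cube.items()):
    print(cell, round(avg, 2))
```

In this toy data the apex cell ("*", "*") has an average below 10 yet its descendant ("a", "x") qualifies, which is why the plain average cannot be used for pruning. The cell ("b", "*") is cut off by the top-k test, so none of its descendants are visited.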
CITED BY 44
Helen Pinto , Jiawei Han , Jian Pei , Ke Wang , Qiming Chen , Umeshwar Dayal, Multi-dimensional sequential pattern mining, Proceedings of the tenth international conference on Information and knowledge management, October 05-10, 2001, Atlanta, Georgia, USA
Jiawei Han , Jianyong Wang , Guozhu Dong , Jian Pei , Ke Wang, CubeExplorer: online exploration of data cubes, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin
Guozhu Dong , Jiawei Han , Joyce M. W. Lam , Jian Pei , Ke Wang , Wei Zou, Mining Constrained Gradients in Large Databases, IEEE Transactions on Knowledge and Data Engineering, v.16 n.8, p.922-938, August 2004
Y. Dora Cai , David Clutter , Greg Pape , Jiawei Han , Michael Welge , Loretta Auvil, MAIDS: mining alarming incidents from data streams, Proceedings of the 2004 ACM SIGMOD international conference on Management of data, June 13-18, 2004, Paris, France
Jiawei Han , Yixin Chen , Guozhu Dong , Jian Pei , Benjamin W. Wah , Jianyong Wang , Y. Dora Cai, Stream Cube: An Architecture for Multi-Dimensional Analysis of Data Streams, Distributed and Parallel Databases, v.18 n.2, p.173-197, September 2005
Jian Pei , Yidong Yuan , Xuemin Lin , Wen Jin , Martin Ester , Qing Liu , Wei Wang , Yufei Tao , Jeffrey Xu Yu , Qing Zhang, Towards multidimensional subspace skyline analysis, ACM Transactions on Database Systems (TODS), v.31 n.4, p.1335-1381, December 2006
Dong Xin , Jiawei Han , Xiaolei Li , Benjamin W. Wah, Star-cubing: computing iceberg cubes by top-down and bottom-up integration, Proceedings of the 29th international conference on Very large data bases, p.476-487, September 09-12, 2003, Berlin, Germany
Yixin Chen , Guozhu Dong , Jiawei Han , Benjamin W. Wah , Jianyong Wang, Multi-dimensional regression analysis of time-series data streams, Proceedings of the 28th international conference on Very Large Data Bases, p.323-334, August 20-23, 2002, Hong Kong, China
Yixin Chen , Guozhu Dong , Jiawei Han , Jian Pei , Benjamin W. Wah , Jianyong Wang, Regression Cubes with Lossless Compression and Aggregation, IEEE Transactions on Knowledge and Data Engineering, v.18 n.12, p.1585-1599, December 2006
Radu Berinde , Graham Cormode , Piotr Indyk , Martin J. Strauss, Space-optimal heavy hitters with strong error bounds, Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 29-July 01, 2009, Providence, Rhode Island, USA