|
ABSTRACT
Data Warehousing and OLAP applications typically view data an having multiple logical dimensions (e.g., product, location) with natural hierarchies defined on each dimension. OLAP queries usually involve hierarchical selections on some of the dimensions, and often aggregate measure attributes (e.g., sales, volume). Accurately estimating the distribution of measure attributes, under hierarchical selections, is important in a variety of scenarios, including approximate query evaluation and cost-based optimization of queries.In this paper, we propose fast (near linear time) algorithms for the problem of approximating the distribution of measure attributes with hierarchies defined on them, using histograms. Our algorithms are based on dynamic programming and a novel notion of sparse intervals that we introduce, and are the first practical algorithms for this problem. They effectively trade space for construction time without compromising histogram accuracy. We complement our analytical contributions with an experimental evaluation using real data sets, demonstrating the superiority of our approach.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Anna C. Gilbert , Yannis Kotidis , S. Muthukrishnan , Marin J. Strauss, Optimal and approximate computation of summary statistics for range aggregates, Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.227-236, May 2001, Santa Barbara, California, United States
[doi> 10.1145/375551.375598]
|
| |
2
|
|
 |
3
|
|
 |
4
|
|
| |
5
|
H. V. Jagadish , Nick Koudas , S. Muthukrishnan , Viswanath Poosala , Kenneth C. Sevcik , Torsten Suel, Optimal Histograms with Quality Guarantees, Proceedings of the 24rd International Conference on Very Large Data Bases, p.275-286, August 24-27, 1998
|
| |
6
|
|
 |
7
|
Nick Koudas , S. Muthukrishnan , Divesh Srivastava, Optimal histograms for hierarchical range queries (extended abstract), Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.196-204, May 15-18, 2000, Dallas, Texas, United States
[doi> 10.1145/335168.335223]
|
 |
8
|
|
 |
9
|
Yossi Matias , Jeffrey Scott Vitter , Min Wang, Wavelet-based histograms for selectivity estimation, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.448-459, June 01-04, 1998, Seattle, Washington, United States
|
 |
10
|
|
 |
11
|
Viswanath Poosala , Peter J. Haas , Yannis E. Ioannidis , Eugene J. Shekita, Improved histograms for selectivity estimation of range predicates, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.294-305, June 04-06, 1996, Montreal, Quebec, Canada
|
 |
12
|
P. Griffiths Selinger , M. M. Astrahan , D. D. Chamberlin , R. A. Lorie , T. G. Price, Access path selection in a relational database management system, Proceedings of the 1979 ACM SIGMOD international conference on Management of data, May 30-June 01, 1979, Boston, Massachusetts
[doi> 10.1145/582095.582099]
|
|