| MAXENT: consistent cardinality estimation in action |
| Full text |
Pdf
(312 KB)
|
| Source
|
International Conference on Management of Data
archive
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
table of contents
Chicago, IL, USA
DEMONSTRATION SESSION: Group C
table of contents
Pages: 775 - 777
Year of Publication: 2006
ISBN:1-59593-434-0
|
|
Authors
|
|
V. Markl
|
IBM Almaden Research Center, San Jose, CA
|
|
M. Kutsch
|
IBM Boeblingen Laboratory, Boeblingen, Germany
|
|
T. M. Tran
|
IBM Silicon Valley Laboratory, San Jose, CA
|
|
P. J. Haas
|
IBM Almaden Research Center, San Jose, CA
|
|
N. Megiddo
|
IBM Almaden Research Center, San Jose, CA
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 2, Downloads (12 Months): 37, Citation Count: 0
|
|
|
ABSTRACT
When comparing alternative query execution plans (QEPs), a cost-based query optimizer in a relational database management system needs to estimate the selectivity of conjunctive predicates. To avoid inaccurate independence assumptions, modern optimizers try to exploit multivariate statistics (MVS) that provide knowledge about joint frequencies in a table of a relation. Because the complete joint distribution is almost always too large to store, optimizers are given only partial knowledge about this distribution. As a result, there exist multiple, non-equivalent ways to estimate the selectivity of a conjunctive predicate. To consistently combine the partial knowledge during the estimation process, existing optimizers employ cumbersome ad hoc heuristics. These methods unjustifiably ignore valuable information, and the optimizer tends to favor QEPs for which the least information is available. This bias problem yields poor QEP quality and performance. We demonstrate MAXENT, a novel approach based on the maximum entropy principle, prototyped in IBM DB2 LUW. We illustrate MAXENT's ability to consistently estimate the selectivity of conjunctive predicates on a per-table basis. In contrast to the DB2 optimizer's current ad hoc methods, we show how MAXENT exploits all available information about the joint column distribution and thus avoids the bias problem. For some complex queries against a real-world database, we show that MAXENT improves selectivity estimates by orders of magnitude relative to the current DB2 optimizer, and also show how these improved estimate influence plan choices as well as query execution times.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
V. Markl , N. Megiddo , M. Kutsch , T. M. Tran , P. Haas , U. Srivastava, Consistently estimating the selectivity of conjuncts of predicates, Proceedings of the 31st international conference on Very large data bases, August 30-September 02, 2005, Trondheim, Norway
|
| |
2
|
Kutsch, M., Tran, T.M., Haas, P.J., Markl, V., Megiddo, N.: Integrating a Maximum-Entropy Cardinality Estimator into DB2 UDB. EDBT 2006, Munich, Germany
|
|