ACM Home Page
Please provide us with feedback. Feedback
A system for specification and verification of interactive, data-driven web applications
Full text PdfPdf (125 KB)
Source International Conference on Management of Data archive
Proceedings of the 2006 ACM SIGMOD international conference on Management of data table of contents
Chicago, IL, USA
DEMONSTRATION SESSION: Group B table of contents
Pages: 772 - 774  
Year of Publication: 2006
ISBN:1-59593-434-0
Authors
Alin Deutsch  University of California, San Diego
Liying Sui  University of California, San Diego
Victor Vianu  University of California, San Diego
Dayou Zhou  University of California, San Diego
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 11,   Downloads (12 Months): 59,   Citation Count: 5
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1142473.1142584
What is a DOI?

ABSTRACT

When comparing alternative query execution plans (QEPs), a cost-based query optimizer in a relational database management system needs to estimate the selectivity of conjunctive predicates. To avoid inaccurate independence assumptions, modern optimizers try to exploit multivariate statistics (MVS) that provide knowledge about joint frequencies in a table of a relation. Because the complete joint distribution is almost always too large to store, optimizers are given only partial knowledge about this distribution. As a result, there exist multiple, non-equivalent ways to estimate the selectivity of a conjunctive predicate. To consistently combine the partial knowledge during the estimation process, existing optimizers employ cumbersome ad hoc heuristics. These methods unjustifiably ignore valuable information, and the optimizer tends to favor QEPs for which the least information is available. This bias problem yields poor QEP quality and performance. We demonstrate MAXENT, a novel approach based on the maximum entropy principle, prototyped in IBM DB2 LUW. We illustrate MAXENT's ability to consistently estimate the selectivity of conjunctive predicates on a per-table basis. In contrast to the DB2 optimizer's current ad hoc methods, we show how MAXENT exploits all available information about the joint column distribution and thus avoids the bias problem. For some complex queries against a real-world database, we show that MAXENT improves selectivity estimates by orders of magnitude relative to the current DB2 optimizer, and also show how these improved estimate influence plan choices as well as query execution times.




Collaborative Colleagues:
Alin Deutsch: colleagues
Liying Sui: colleagues
Victor Vianu: colleagues
Dayou Zhou: colleagues