| Robust query processing through progressive optimization |
| Full text |
Pdf
(331 KB)
|
| Source
|
International Conference on Management of Data
archive
Proceedings of the 2004 ACM SIGMOD international conference on Management of data
table of contents
Paris, France
SESSION: Research sessions: query optimization
table of contents
Pages: 659 - 670
Year of Publication: 2004
ISBN:1-58113-859-8
|
|
Authors
|
|
Volker Markl
|
IBM Almaden Research Center, San Jose, CA
|
|
Vijayshankar Raman
|
IBM Almaden Research Center, San Jose, CA
|
|
David Simmen
|
IBM Silicon Valley Lab, San Jose, CA
|
|
Guy Lohman
|
IBM Almaden Research Center, San Jose, CA
|
|
Hamid Pirahesh
|
IBM Almaden Research Center, San Jose, CA
|
|
Miso Cilimdzic
|
IBM Toronto Lab, Toronto, Canada
|
|
| Sponsor |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 17, Downloads (12 Months): 126, Citation Count: 31
|
|
|
ABSTRACT
Virtually every commercial query optimizer chooses the best plan for a query using a cost model that relies heavily on accurate cardinality estimation. Cardinality estimation errors can occur due to the use of inaccurate statistics, invalid assumptions about attribute independence, parameter markers, and so on. Cardinality estimation errors may cause the optimizer to choose a sub-optimal plan. We present an approach to query processing that is extremely robust because it is able to detect and recover from cardinality estimation errors. We call this approach "progressive query optimization" (POP). POP validates cardinality estimates against actual values as measured during query execution. If there is significant disagreement between estimated and actual values, execution might be stopped and re-optimization might occur. Oscillation between optimization and execution steps can occur any number of times. A re-optimization step can exploit both the actual cardinality and partial results, computed during a previous execution step. Checkpoint operators (CHECK) validate the optimizer's cardinality estimates against actual cardinalities. Each CHECK has a condition that indicates the cardinality bounds within which a plan is valid. We compute this validity range through a novel sensitivity analysis of query plan operators. If the CHECK condition is violated, CHECK triggers re-optimization. POP has been prototyped in a leading commercial DBMS. An experimental evaluation of POP using TPC-H queries illustrates the robustness POP adds to query processing, while incurring only negligible overhead. A case-study applying POP to a real-world database and workload shows the potential of POP, accelerating complex OLAP queries by almost two orders of magnitude.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
 |
2
|
|
 |
3
|
|
 |
4
|
D. D. Chamberlin , M. M. Astrahan , W. F. King , R. A. Lorie , J. W. Mehl , T. G. Price , M. Schkolnick , P. Griffiths Selinger , D. R. Slutz , B. W. Wade , R. A. Yost, Support for repetitive transactions and ad hoc queries in System R, ACM Transactions on Database Systems (TODS), v.6 n.1, p.70-94, March 1981
[doi> 10.1145/319540.319550]
|
 |
5
|
|
 |
6
|
|
| |
7
|
|
 |
8
|
|
| |
9
|
HS93 P. Haas and A. Swami, Sampling-Based Selectivity Estimation for Joins Using Augmented Frequent Value Statistics, IBM Research Report, 1993
|
| |
10
|
HS02 A. Hulgeri and S. Sudarshan. Parametric Query Optimization for Linear and Piecewise Linear Cost Functions. VLDB 2002.
|
 |
11
|
|
| |
12
|
|
 |
13
|
|
 |
14
|
|
 |
15
|
Viswanath Poosala , Peter J. Haas , Yannis E. Ioannidis , Eugene J. Shekita, Improved histograms for selectivity estimation of range predicates, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.294-305, June 04-06, 1996, Montreal, Quebec, Canada
|
| |
16
|
|
| |
17
|
RAH03 V. Raman, A. Deshpande, and J. M. Hellerstein, Using State Modules for Adaptive Query Optimization. ICDE 2003
|
 |
18
|
P. Griffiths Selinger , M. M. Astrahan , D. D. Chamberlin , R. A. Lorie , T. G. Price, Access path selection in a relational database management system, Proceedings of the 1979 ACM SIGMOD international conference on Management of data, May 30-June 01, 1979, Boston, Massachusetts
[doi> 10.1145/582095.582099]
|
| |
19
|
|
 |
20
|
|
 |
21
|
Tolga Urhan , Michael J. Franklin , Laurent Amsaleg, Cost-based query scrambling for initial delays, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.130-141, June 01-04, 1998, Seattle, Washington, United States
|
| |
22
|
|
 |
23
|
|
CITED BY 31
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ihab F. Ilyas , Walid G. Aref , Ahmed K. Elmagarmid , Hicham G. Elmongui , Rahul Shah , Jeffrey Scott Vitter, Adaptive rank-aware query optimization in relational databases, ACM Transactions on Database Systems (TODS), v.31 n.4, p.1257-1304, December 2006
|
|
|
Holger Kache , Wook-Shin Han , Volker Markl , Vijayshankar Raman , Stephan Ewen, POP/FED: progressive query optimization for federated queries in DB2, Proceedings of the 32nd international conference on Very large data bases, September 12-15, 2006, Seoul, Korea
|
|
|
|
|
|
|
|
|
Vijayshankar Raman , Volker Markl , David Simmen , Guy Lohman , Hamid Pirahesh, Progressive optimization in action, Proceedings of the Thirtieth international conference on Very large data bases, p.1337-1340, August 31-September 03, 2004, Toronto, Canada
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Roald Lengu , Paolo Missier , Alvaro A. A. Fernandes , Giovanna Guerrini , Marco Mesiti, Time-completeness trade-offs in record linkage using adaptive query processing, Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, March 24-26, 2009, Saint Petersburg, Russia
|
|
|
|
|
|
Riham Abdel Kader , Peter Boncz , Stefan Manegold , Maurice van Keulen, ROX: run-time optimization of XQueries, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
|
|
|
Anastasios Gounaris , Jim Smith , Norman W. Paton , Rizos Sakellariou , Alvaro A. Fernandes , Paul Watson, Adaptive workload allocation in query processing in autonomous heterogeneous environments, Distributed and Parallel Databases, v.25 n.3, p.125-164, June 2009
|
|
|
|
|
|
Norman W. Paton , Jorge Buenabad-Chavez , Mengsong Chen , Vijayshankar Raman , Garret Swart , Inderpal Narang , Daniel M. Yellin , Alvaro A. Fernandes, Autonomic query parallelization using non-dedicated computers: an evaluation of adaptivity options, The VLDB Journal — The International Journal on Very Large Data Bases, v.18 n.1, p.119-140, January 2009
|
|
|
Ahmad Ghazal , Dawit Seid , Bhashyam Ramesh , Alain Crolotte , Manjula Koppuravuri , Vinod G, Dynamic plan generation for parameterized queries, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
|
|