|
ABSTRACT
Statistics play an important role in influencing the plans produced by a query optimizer. Traditionally, optimizers use statistics built over base tables and assume independence between attributes while propagating statistical information through the query plan. This approach can introduce large estimation errors, which may result in the optimizer choosing inefficient execution plans. In this paper, we show how to extend a generic optimizer so that it also exploits statistics built on expressions corresponding to intermediate nodes of query plans. We show that in some cases, the quality of the resulting plans is significantly better than when only base-table statistics are available. Unfortunately, even moderately-sized schemas may have too many relevant candidate statistics. We introduce a workload-driven technique to identify a small subset of statistics that can provide significant benefits over just maintaining base-table statistics. Finally, we present experimental results on an implementation of our approach in Microsoft SQL Server 2000.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
 |
2
|
Swarup Acharya , Phillip B. Gibbons , Viswanath Poosala , Sridhar Ramaswamy, Join synopses for approximate query answering, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.275-286, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
|
 |
3
|
Nicolas Bruno , Surajit Chaudhuri , Luis Gravano, STHoles: a multidimensional workload-aware histogram, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.211-222, May 21-24, 2001, Santa Barbara, California, United States
|
 |
4
|
|
| |
5
|
|
| |
6
|
|
| |
7
|
|
 |
8
|
|
 |
9
|
Jonathan Goldstein , Per-Åke Larson, Optimizing queries using materialized views: a practical, scalable solution, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.331-342, May 21-24, 2001, Santa Barbara, California, United States
|
| |
10
|
G. Graefe. The cascades framework for query optimization. Data Engineering Bulletin, 18(3), 1995.
|
 |
11
|
|
| |
12
|
|
 |
13
|
Dimitrios Gunopulos , George Kollios , Vassilis J. Tsotras , Carlotta Domeniconi, Approximating multi-dimensional aggregate range queries over real attributes, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.463-474, May 15-18, 2000, Dallas, Texas, United States
|
 |
14
|
L. M. Haas , J. C. Freytag , G. M. Lohman , H. Pirahesh, Extensible query processing in starburst, Proceedings of the 1989 ACM SIGMOD international conference on Management of data, p.377-388, June 1989, Portland, Oregon, United States
|
 |
15
|
|
 |
16
|
|
| |
17
|
|
 |
18
|
Viswanath Poosala , Peter J. Haas , Yannis E. Ioannidis , Eugene J. Shekita, Improved histograms for selectivity estimation of range predicates, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.294-305, June 04-06, 1996, Montreal, Quebec, Canada
|
| |
19
|
|
 |
20
|
P. Griffiths Selinger , M. M. Astrahan , D. D. Chamberlin , R. A. Lorie , T. G. Price, Access path selection in a relational database management system, Proceedings of the 1979 ACM SIGMOD international conference on Management of data, May 30-June 01, 1979, Boston, Massachusetts
[doi> 10.1145/582095.582099]
|
| |
21
|
|
| |
22
|
TPC Benchmark H. Decision support. Available at http://www.tpc.org.
|
CITED BY 37
|
|
Ihab F. Ilyas , Volker Markl , Peter Haas , Paul Brown , Ashraf Aboulnaga, CORDS: automatic discovery of correlations and soft functional dependencies, Proceedings of the 2004 ACM SIGMOD international conference on Management of data, June 13-18, 2004, Paris, France
|
|
|
V. Markl , N. Megiddo , M. Kutsch , T. M. Tran , P. Haas , U. Srivastava, Consistently estimating the selectivity of conjuncts of predicates, Proceedings of the 31st international conference on Very large data bases, August 30-September 02, 2005, Trondheim, Norway
|
|
|
Volker Markl , Vijayshankar Raman , David Simmen , Guy Lohman , Hamid Pirahesh , Miso Cilimdzic, Robust query processing through progressive optimization, Proceedings of the 2004 ACM SIGMOD international conference on Management of data, June 13-18, 2004, Paris, France
|
|
|
|
|
|
|
|
|
|
|
|
Shivnath Babu , Rajeev Motwani , Kamesh Munagala , Itaru Nishizawa , Jennifer Widom, Adaptive ordering of pipelined stream filters, Proceedings of the 2004 ACM SIGMOD international conference on Management of data, June 13-18, 2004, Paris, France
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ihab F. Ilyas , Walid G. Aref , Ahmed K. Elmagarmid , Hicham G. Elmongui , Rahul Shah , Jeffrey Scott Vitter, Adaptive rank-aware query optimization in relational databases, ACM Transactions on Database Systems (TODS), v.31 n.4, p.1257-1304, December 2006
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
César A. Galindo-Legaria , Milind M. Joshi , Florian Waas , Ming-Chuan Wu, Statistics on views, Proceedings of the 29th international conference on Very large data bases, p.952-962, September 09-12, 2003, Berlin, Germany
|
|
|
|
|
|
|
|
|
A. Aboulnaga , P. Haas , M. Kandil , S. Lightstone , G. Lohman , V. Markl , I. Popivanov , V. Raman, Automated statistics collection in DB2 UDB, Proceedings of the Thirtieth international conference on Very large data bases, p.1158-1169, August 31-September 03, 2004, Toronto, Canada
|
|
|
|
|
|
|
|
|
V. Markl , P. J. Haas , M. Kutsch , N. Megiddo , U. Srivastava , T. M. Tran, Consistent selectivity estimation via maximum entropy, The VLDB Journal — The International Journal on Very Large Data Bases, v.16 n.1, p.55-76, January 2007
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Riham Abdel Kader , Peter Boncz , Stefan Manegold , Maurice van Keulen, ROX: run-time optimization of XQueries, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
|
|
|
Pawel Terlecki , Hardik Bati , Cesar Galindo-Legaria , Peter Zabback, Filtered statistics, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
|
|
|
|
|
|
|
|