Conditional selectivity for statistics on query expressions
Source: International Conference on Management of Data
Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data
Paris, France
Session: Research sessions: statistics
Pages: 311 - 322  
Year of Publication: 2004
ISBN:1-58113-859-8
Authors
Nicolas Bruno  Microsoft Research
Surajit Chaudhuri  Microsoft Research
Sponsor
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1007568.1007604

ABSTRACT

Cardinality estimation during query optimization relies on simplifying assumptions that usually do not hold in practice. To diminish the impact of inaccurate estimates during optimization, statistics on query expressions (SITs) have previously been proposed. These statistics directly model the distribution of tuples in query sub-plans. Past work on statistics on query expressions has relied on view matching technology to harness their benefits. In this paper we argue against such an approach, as it overlooks significant opportunities to improve cardinality estimates. We then introduce a framework for reasoning with SITs based on the notion of conditional selectivity. We present a dynamic programming algorithm that efficiently finds the most accurate selectivity estimate for a given query, and discuss how such an approach can be incorporated into existing optimizers with a small number of changes. Finally, we demonstrate experimentally that our technique produces more accurate cardinality estimates than previous approaches with very little overhead.
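
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes the chain-rule decomposition Sel(P, Q) = Sel(P | Q) * Sel(Q) and memoizes over subsets of predicates to pick the decomposition with the lowest error. The statistics table `STATS`, the helper names `best_factor` and `estimate`, the selectivity values, and the error measure (the number of conditioning predicates a statistic ignores, i.e. the number of independence assumptions made) are all hypothetical choices made for this example.

```python
from functools import lru_cache

# Hypothetical statistics: (predicate, conditioning predicates) -> selectivity.
# An empty conditioning set stands for a plain base-table statistic; a
# non-empty one stands for a statistic built on a query expression.
STATS = {
    ("a", frozenset()): 0.10,
    ("b", frozenset()): 0.20,
    ("c", frozenset()): 0.50,
    ("b", frozenset({"a"})): 0.80,  # assumed value for Sel(b | a)
}


def best_factor(pred, given):
    """Approximate Sel(pred | given) with the usable statistic that
    ignores the fewest conditioning predicates. Returns (error, selectivity)."""
    best = None
    for (p, cond), sel in STATS.items():
        if p == pred and cond <= given:
            err = len(given - cond)  # independence assumptions made
            if best is None or err < best[0]:
                best = (err, sel)
    if best is None:
        raise ValueError(f"no statistic available for predicate {pred!r}")
    return best


@lru_cache(maxsize=None)
def estimate(preds):
    """Return (error, selectivity) for the conjunction of `preds`
    (a frozenset), trying every Sel(p | rest) * Sel(rest) decomposition."""
    if not preds:
        return (0, 1.0)
    best = None
    for p in sorted(preds):
        rest = preds - {p}
        err_f, sel_f = best_factor(p, rest)
        err_r, sel_r = estimate(rest)  # memoized sub-problem
        cand = (err_f + err_r, sel_f * sel_r)
        if best is None or cand[0] < best[0]:
            best = cand
    return best


if __name__ == "__main__":
    print(estimate(frozenset({"a", "b"})))
    print(estimate(frozenset({"a", "b", "c"})))
```

With these made-up numbers, the decomposition that uses the conditional statistic for `b` given `a` is preferred over the one that treats `a` and `b` as independent, because it makes fewer independence assumptions.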

