ACM Home Page
Please provide us with feedback. Feedback
Dynamic sample selection for approximate query processing
Full text PdfPdf (261 KB)
Source International Conference on Management of Data archive
Proceedings of the 2003 ACM SIGMOD international conference on Management of data table of contents
San Diego, California
SESSION: XML query processing I table of contents
Pages: 539 - 550  
Year of Publication: 2003
ISBN:1-58113-634-X
Authors
Brian Babcock  Stanford University
Surajit Chaudhuri  Microsoft Research
Gautam Das  Microsoft Research
Sponsor
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 11,   Downloads (12 Months): 72,   Citation Count: 24
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/872757.872822
What is a DOI?

ABSTRACT

In decision support applications, the ability to provide fast approximate answers to aggregation queries is desirable. One commonly-used technique for approximate query answering is sampling. For many aggregation queries, appropriately constructed biased (non-uniform) samples can provide more accurate approximations than a uniform sample. The optimal type of bias, however, varies from query to query. In this paper, we describe an approximate query processing technique that dynamically constructs an appropriately biased sample for each query by combining samples selected from a family of non-uniform samples that are constructed during a pre-processing phase. We show that dynamic selection of appropriate portions of previously constructed samples can provide more accurate approximate answers than static, non-adaptive usage of uniform or non-uniform samples.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
2
3
 
4
 
5
A. Agresti and B. A. Coull. Approximate is better than "exact" for interval estimation of binomial proportions. The American Statistician, 52:119--126, 1998.
 
6
D. Barbara, W. DuMouchel, C. Faloutsos, P. J. Haas, J. M. Hellerstein, Y. E. Ioannidis, H. V. Jagadish, T. Johnson, R. T. Ng, V. Poosala, K. A. Ross, and K. C. Sevcik. The New Jersey data reduction report. Data Engineering Bulletin, 20(4):3--45, 1997.
 
7
L. D. Brown, T. Cai, and A. DasGupta. Interval estimation for a binomial proportion. Statistical Science, 16:101--133, 2001.
 
8
 
9
10
11
12
 
13
S. Chaudhuri and V. Narasayya. Program for tpc-d data generation with skew. Available at ftp://ftp.research.microsoft.com/pub/users/viveknar/tpcdskew.
 
14
 
15
 
16
 
17
 
18
 
19
20
 
21
J. Hellerstein, M. Franklin, S. Chandrasekaran, A. Deshpande, K. Hildrum, S. Madden, V. Raman, and M. A. Shah. Adaptive query processing: Technology in evolution. IEEE Data Engineering Bulletin, 23(2):7--18, June 2000.
22
 
23
24
 
25
26
 
27
Transaction Processing Performance Council. TPC-H benchmark 1.1.0, June 1999. http://www.tpc.org.
28

CITED BY  24

Collaborative Colleagues:
Brian Babcock: colleagues
Surajit Chaudhuri: colleagues
Gautam Das: colleagues