ACM Home Page
Please provide us with feedback. Feedback
Relational confidence bounds are easy with the bootstrap
Full text PdfPdf (338 KB)
Source International Conference on Management of Data archive
Proceedings of the 2005 ACM SIGMOD international conference on Management of data table of contents
Baltimore, Maryland
SESSION: Research papers: estimation and approximation table of contents
Pages: 587 - 598  
Year of Publication: 2005
ISBN:1-59593-060-4
Authors
Abhijit Pol  University of Florida, Gainesville, FL
Christopher Jermaine  University of Florida, Gainesville, FL
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 4,   Downloads (12 Months): 38,   Citation Count: 0
Additional Information:

abstract   references   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1066157.1066224
What is a DOI?

ABSTRACT

Statistical estimation and approximate query processing have become increasingly prevalent applications for database systems. However, approximation is usually of little use without some sort of guarantee on estimation accuracy, or "confidence bound." Analytically deriving probabilistic guarantees for database queries over sampled data is a daunting task, not suitable for the faint of heart, and certainly beyond the expertise of the typical database system end-user. This paper considers the problem of incorporating into a database system a powerful "plug-in" method for computing confidence bounds on the answer to relational database queries over sampled or incomplete data. This statistical tool, called the bootstrap, is simple enough that it can be used by a data-base programmer with a rudimentary mathematical background, but general enough that it can be applied to almost any statistical inference problem. Given the power and ease-of-use of the bootstrap, we argue that the algorithms presented for supporting the bootstrap should be incorporated into any database system which is intended to support analytic processing.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
B. Efron and R. J. Tibshirani: An Introduction to the Bootstrap. Chapman & Hall/CRC 1998
 
3
4
 
5
 
6
 
7
8
9
 
10
P. Hall: The Bootstrap and Edgeworth Expansion. Springer-Verlag 1995
11
12
 
13
N. L Johnson, S. Kotz, A. W. Kemp: Univariate Discrete Distributions. John Wiley and Sons. Inc 1993.
 
14
15
 
16
 
17
C. E. Shannon: A Mathematical Theory of Communication. Bell System Technical Journal, vol. 27:379--423 and 623--656, (July and October, 1948)
 
18
D. F. Vysochanskii and Y. I Petunin: Justification of the 3 ó Rule for Unimodal Distributions. Theory of Prob. and Math. Stat., vol 21: 25--36 (1980)
 
19
J. Ziv and A. Lempel: A Universal Algorithm for Sequential Data Compression. IEEE Trans. Inform. Theory, 23:337--343 (1977)
 
20
 
21
Collaborative Colleagues:
Abhijit Pol: colleagues
Christopher Jermaine: colleagues