|
ABSTRACT
Though extensions to the relational data model have been proposed in order to handle probabilistic information, there has been very little work to date on handling aggregate operators in such databases. In this article, we present a very general notion of an aggregate operator and show how classical aggregation operators (such as COUNT, SUM, etc.) as well as statistical operators (such as percentiles, variance, etc.) are special cases of this general definition. We devise a formal linear programming based semantics for computing aggregates over probabilistic DBMSs, develop algorithms that satisfy this semantics, analyze their complexity, and introduce several families of approximation algorithms that run in polynomial time. We implemented all of these algorithms and tested them on a large set of data to help determine when each one is preferable.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Aitchison, J., and Dunsmore, I. R. 1975. Statistical Prediction Analysis. Cambridge University Press, Cambridge.
|
| |
2
|
|
| |
3
|
|
| |
4
|
|
 |
5
|
|
| |
6
|
|
| |
7
|
Cramton, P. 1997. The FCC spectrum auctions: An early assessment. J. Econ. Manage. Strat. 6, 3, 431--495.
|
 |
8
|
|
| |
9
|
Dekhtyar, A., and Subrahmanian, V. S. 1997. Hybrid probabilistic programs. In Proceedings of the 14th International Conference on Logic Program. MIT Press, Cambridge, Mass., 391--405.
|
 |
10
|
|
 |
11
|
|
| |
12
|
|
| |
13
|
|
| |
14
|
|
 |
15
|
|
| |
16
|
Fuhr, N., and Rolleke, T. 1996. A probabilistic NF2 relational algebra for integrated information retrieval and database systems. In Proceedings of the 2nd World Conference on Integrated Design and Process Technology. Society for Design and Process Science, Austin, Tex. 17--30.
|
| |
17
|
|
| |
18
|
Gee, J., Briquer, L., Barillot, C., and Haynor, D. 1995. Probabilistic matching of brain images. In Information Processing in Medical Imaging. Kluwer Academic Publishers, Dordrecht, 113--125.
|
| |
19
|
|
| |
20
|
|
| |
21
|
|
 |
22
|
|
| |
23
|
|
| |
24
|
|
| |
25
|
|
| |
26
|
|
| |
27
|
|
| |
28
|
|
| |
29
|
Oliver, N., Rosario, B., and Pentland, A. 1998. Statistical modeling of human interactions. In Proceedings of the Symposium on Computer Vision and Pattern Recognition (Santa Barbara, Calif.). IEEE Computer Society Press, Los Alamitos, Calif., 39--46.
|
| |
30
|
|
| |
31
|
Shiri, N. 1997. On a generalized theory of deductive databases. Ph.D. thesis, Concordia University, Montreal.
|
| |
32
|
Street, W. N., Mangasarian, O. L., and Wolberg, W. H. 1995. An inductive learning approach to prognostic prediction. In Proceedings of the 12th International Conference on Machine Learning. Morgan Kaufmann, San Francisco, Calif. 522--530.
|
| |
33
|
Subrahmanian, V. S. 1987. On the semantics of quantitative logic programs. In Proceedings of the 4th Symposium on Logic Programming (Washington, D.C.). IEEE Computer Society Press, Los Alamitos, Calif., 173--182.
|
| |
34
|
|
| |
35
|
|
| |
36
|
|
| |
37
|
Wolfram, C. D. 1998. Strategic bidding in a multi-unit auction: An empirical analysis of bids to supply electricity in England and Wales. RAND J. Econ. 29, 4, 703--773.
|
 |
38
|
|
CITED BY 13
|
|
Doug Burdick , Prasad M. Deshpande , T. S. Jayram , Raghu Ramakrishnan , Shivakumar Vaithyanathan, OLAP over uncertain and imprecise data, Proceedings of the 31st international conference on Very large data bases, August 30-September 02, 2005, Trondheim, Norway
|
|
|
Jihad Boulos , Nilesh Dalvi , Bhushan Mandhani , Shobhit Mathur , Chris Re , Dan Suciu, MYSTIQ: a system for finding more answers by using probabilities, Proceedings of the 2005 ACM SIGMOD international conference on Management of data, June 14-16, 2005, Baltimore, Maryland
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|