|
ABSTRACT
The database community has made rapid strides in capturing, representing, and querying uncertain data. Probabilistic databases capture the inherent uncertainty in derived tuples as probability estimates. Data acquisition and stream systems can produce succinct summaries of very large and time-varying datasets. This tutorial addresses the natural next step in harnessing uncertain data: How can we efficiently and quantifiably determine what, how, and how much to learn in order to make good decisions based on the imprecise information available. The material in this tutorial is drawn from a range of fields including database systems, control and information theory, operations research, convex optimization, and statistical learning. The focus of the tutorial is on the natural constraints that are imposed in a database context and the demands of imprecise information from an optimization point of view. We look both into the past as well as into the future; to discuss general tools and techniques that can serve as a guide to database researchers and practitioners, and to enumerate the challenges that lie ahead.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
J. Abernethy, E. Hazan, and A. Rakhlin. Competing in the dark: An efficient algorithm for bandit linear optimization. In COLT, pages 263--274, 2008.
|
| |
2
|
Barbara M. Anthony , Vineet Goyal , Anupam Gupta , Viswanath Nagarajan, A plant location guide for the unsure, Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms, p.1164-1173, January 20-22, 2008, San Francisco, California
|
 |
3
|
|
 |
4
|
|
 |
5
|
|
| |
6
|
S. Babu. Grand challenge: Experiment-driven adaptive systems. In Proc. Hot Topics in Autonomic Computing (HotAC III), 2008.
|
| |
7
|
S. Babu, N. Borisov, S. Duan, H. Herodotou, and V. Thummala. Automated experiment-driven management of (database) systems. In Proceedings of Workshop on Hot Topics in Operating Systems (HotOS), 2009.
|
| |
8
|
|
 |
9
|
Shivnath Babu , Rajeev Motwani , Kamesh Munagala , Itaru Nishizawa , Jennifer Widom, Adaptive ordering of pipelined stream filters, Proceedings of the 2004 ACM SIGMOD international conference on Management of data, June 13-18, 2004, Paris, France
[doi> 10.1145/1007568.1007615]
|
| |
10
|
|
 |
11
|
|
| |
12
|
Amol Deshpande , Carlos Guestrin , Samuel R. Madden , Joseph M. Hellerstein , Wei Hong, Model-driven data acquisition in sensor networks, Proceedings of the Thirtieth international conference on Very large data bases, p.588-599, August 31-September 03, 2004, Toronto, Canada
|
| |
13
|
|
| |
14
|
|
| |
15
|
|
| |
16
|
J.C. Gittins and D.M. Jones. A dynamic allocation index for the sequential design of experiments. Progress in statistics (European Meeting of Statisticians), 1972.
|
 |
17
|
|
 |
18
|
|
| |
19
|
|
 |
20
|
|
| |
21
|
S. Guha, K. Munagala, and S. Sarkar. Information acquisition and exploitation in multi-channel wireless systems. CoRR: 0804.1724, 2008.
|
| |
22
|
S. Guha, K. Munagala, and P. Shi. Approximation algorithms for restless bandit problems. CoRR: 0711.3861, 2008.
|
| |
23
|
|
| |
24
|
P. Horn. Autonomic computing: IBM's perspective on the state of information technology. IBM Tech. Rep., 2001. http://www.research.ibm.com/autonomic.
|
| |
25
|
|
| |
26
|
A. Krause and C. Guestrin. Near-optimal nonmyopic value of information in graphical models. UAI, 2005.
|
 |
27
|
Andreas Krause , Carlos Guestrin , Anupam Gupta , Jon Kleinberg, Near-optimal sensor placements: maximizing information while minimizing communication cost, Proceedings of the 5th international conference on Information processing in sensor networks, April 19-21, 2006, Nashville, Tennessee, USA
[doi> 10.1145/1127777.1127782]
|
| |
28
|
|
| |
29
|
|
| |
30
|
K. Munagala, S. Babu, R. Motwani, and J. Widom. The pipelined set cover problem. Proc. Intl. Conf. Database Theory, 2005.
|
| |
31
|
G.L. Nemhauser, L.A. Wolsey, and M.L. Fisher. An analysis of approximations for maximizing submodular set functions-I. Math Programming, 14(1):265--294, 1978.
|
| |
32
|
S. Pandey and C. Olston. Handling advertisements of unknown quality in search advertising. In NIPS, pages 1065--1072, 2006.
|
| |
33
|
H. Robbins. Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, 58:527--535, 1952.
|
| |
34
|
S. Sarawagi and W.W. Cohen. Semi-Markov conditional random fields for information extraction. Proc. NIPS, 2004.
|
 |
35
|
|
| |
36
|
|
| |
37
|
Adam Silberstein , Gavino Puggioni , Alan Gelfand , Kamesh Munagala , Jun Yang, Suppression and failures in sensor networks: a Bayesian approach, Proceedings of the 33rd international conference on Very large data bases, September 23-27, 2007, Vienna, Austria
|
| |
38
|
|
 |
39
|
|
| |
40
|
|
| |
41
|
R. Thonangi, V. Thummala, and S. Babu. Finding good configurations in high-dimensional spaces: Doing more with less. In Proc. MASCOTS, 2008.
|
| |
42
|
P. Whittle. Restless bandits: Activity allocation in a changing world. Appl. Prob., 25(A):287--298, 1988.
|
| |
43
|
M. Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. Proceedings of ICML, pages 928--936, 2003.
|
|