|
ABSTRACT
We consider the problem of pipelined filters, where a continuous stream of tuples is processed by a set of commutative filters. Pipelined filters are common in stream applications and capture a large class of multiway stream joins. We focus on the problem of ordering the filters adaptively to minimize processing cost in an environment where stream and filter characteristics vary unpredictably over time. Our core algorithm, A-Greedy (for Adaptive Greedy), has strong theoretical guarantees: If stream and filter characteristics were to stabilize, A-Greedy would converge to an ordering within a small constant factor of optimal. (In experiments A-Greedy usually converges to the optimal ordering.) One very important feature of A-Greedy is that it monitors and responds to selectivities that are correlated across filters (i.e., that are nonindependent), which provides the strong quality guarantee but incurs run-time overhead. We identify a three-way tradeoff among provable convergence to good orderings, run-time overhead, and speed of adaptivity. We develop a suite of variants of A-Greedy that lie at different points on this tradeoff spectrum. We have implemented all our algorithms in the STREAM prototype Data Stream Management System and a thorough performance evaluation is presented.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
| |
2
|
S. Babu, R. Motwani, K. Munagala, I. Nishizawa, and J. Widom. Adaptive ordering of pipelined stream filters. Technical report, Stanford University Database Group, Nov. 2003. Available at http://dbpubs.stanford.edu/pub/2003-69.
|
| |
3
|
S. Babu, K. Munagala, J. Widom, and R. Motwani. Adaptive caching for continuous queries. Technical report, Stanford University Database Group, Mar. 2004, Available at http://dbpubs.stanford.edu/pub/2004-14.
|
 |
4
|
|
| |
5
|
D. Carney et al. Monitoring streams-a new class of data management applications. In Proc. of the 2002 Intl. Conf. on Very Large Data Bases, Aug. 2002.
|
| |
6
|
S. Chandrasekaran et al. TelegraphCQ: Continuous dataflow processing for an uncertain world. In Proc. First Biennial Conf. on Innovative Data Systems Research, Jan. 2003.
|
 |
7
|
|
 |
8
|
Jianjun Chen , David J. DeWitt , Feng Tian , Yuan Wang, NiagaraCQ: a scalable continuous query system for Internet databases, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.379-390, May 15-18, 2000, Dallas, Texas, United States
|
 |
9
|
|
| |
10
|
|
 |
11
|
|
 |
12
|
|
 |
13
|
N. G. Duffield , M. Grossglauser, Trajectory sampling for direct traffic observation, Proceedings of the conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, p.271-282, August 28-September 01, 2000, Stockholm, Sweden
|
 |
14
|
Françoise Fabret , H. Arno Jacobsen , François Llirbat , Joăo Pereira , Kenneth A. Ross , Dennis Shasha, Filtering algorithms and implementation for very fast publish/subscribe systems, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.115-126, May 21-24, 2001, Santa Barbara, California, United States
|
| |
15
|
|
| |
16
|
L. Golab and T. Ozsu. Processing sliding window multi-joins in continuous queries over data streams. In Proc. of the 2003 Intl. Conf. on Very Large Data Bases, Sept. 2003.
|
| |
17
|
|
 |
18
|
|
| |
19
|
|
 |
20
|
Zachary G. Ives , Daniela Florescu , Marc Friedman , Alon Levy , Daniel S. Weld, An adaptive query execution system for data integration, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.299-310, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
|
 |
21
|
|
| |
22
|
J. Kang, J. Naughton, and S. Viglas. Evaluating window joins over unbounded streams. In Proc. of the 2003 Intl. Conf. on Data Engineering, Mar. 2003.
|
| |
23
|
|
| |
24
|
|
 |
25
|
|
| |
26
|
R. Motwani, J. Widom, et al. Query processing, approximation, and resource management in a data stream management system. In Proc. First Biennial Conf. on Innovative Data Systems Research (CIDR), Jan. 2003.
|
| |
27
|
K. Munagala, S. Babu, R. Motwani, and J. Widom. The pipelined set cover problem. Technical report, Stanford University Database Group, Oct. 2003. Available at http://dbpubs.stanford.edu/pub/2003-65.
|
| |
28
|
V. Raman, A. Deshpande, and J. Hellerstein. Using state modules for adaptive query processing. In Proc. of the 2003 Intl. Conf. on Data Engineering, Mar. 2003.
|
 |
29
|
|
| |
30
|
|
 |
31
|
Tolga Urhan , Michael J. Franklin , Laurent Amsaleg, Cost-based query scrambling for initial delays, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.130-141, June 01-04, 1998, Seattle, Washington, United States
|
| |
32
|
S. Viglas, J. Naughton, and J. Burger. Maximizing the output rate of multi-join queries over streaming information sources. In Proc. of the 2003 Intl. Conf. on Very Large Data Bases, Sept. 2003.
|
CITED BY 37
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Anne Condon , Amol Deshpande , Lisa Hellerstein , Ning Wu, Flow algorithms for two pipelined filter ordering problems, Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 26-28, 2006, Chicago, IL, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Navendu Jain , Dmitry Kit , Prince Mahajan , Praveen Yalagandula , Mike Dahlin , Yin Zhang, STAR: self-tuning aggregation for scalable monitoring, Proceedings of the 33rd international conference on Very large data bases, September 23-27, 2007, Vienna, Austria
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
K. Selçuk Candan , Mehmet E. Dönderler , Yan Qi , Jaikannan Ramamoorthy , Jong W. Kim, FMware: middleware for efficient filtering and matching of XML messages with local data, Proceedings of the ACM/IFIP/USENIX 2006 International Conference on Middleware, November 01-01, 2006, Melbourne, Australia
|
|
|
|
|
|
|
|
|
|
|
|
Kunal Agrawal , Anne Benoit , Fanny Dufossé , Yves Robert, Mapping filtering streaming applications with communication costs, Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures, August 11-13, 2009, Calgary, AB, Canada
|
|