ACM Home Page
Please provide us with feedback. Feedback
Chain: operator scheduling for memory minimization in data stream systems
Full text PdfPdf (300 KB)
Source International Conference on Management of Data archive
Proceedings of the 2003 ACM SIGMOD international conference on Management of data table of contents
San Diego, California
SESSION: Stream query processing II table of contents
Pages: 253 - 264  
Year of Publication: 2003
ISBN:1-58113-634-X
Authors
Brian Babcock  Stanford University, Stanford, CA
Shivnath Babu  Stanford University, Stanford, CA
Rajeev Motwani  Stanford University, Stanford, CA
Mayur Datar  Stanford University, Stanford, CA
Sponsor
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 3,   Downloads (12 Months): 60,   Citation Count: 39
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/872757.872789
What is a DOI?

ABSTRACT

In many applications involving continuous data streams, data arrival is bursty and data rate fluctuates over time. Systems that seek to give rapid or real-time query responses in such an environment must be prepared to deal gracefully with bursts in data arrival without compromising system performance. We discuss one strategy for processing bursty streams --- adaptive, load-aware scheduling of query operators to minimize resource consumption during times of peak load. We show that the choice of an operator scheduling strategy can have significant impact on the run-time system memory usage. We then present Chain scheduling, an operator scheduling strategy for data stream systems that is near-optimal in minimizing run-time memory usage for any collection of single-stream queries involving selections, projections, and foreign-key joins with stored relations. Chain scheduling also performs well for queries with sliding-window joins over multiple streams, and multiple queries of the above types. A thorough experimental evaluation is provided where we demonstrate the potential benefits of Chain scheduling, compare it with competing scheduling strategies, and validate our analytical conclusions.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
2
3
4
 
5
D. Carney, U. Cetintemel, M. Cherniack, C. Convey, S. Lee, G. Seidman, M. Stonebraker, N. Tatbul, and S. Zdonik. Monitoring streams--a new class of data management applications. In Proc. 28th Intl. Conf. on Very Large Data Bases, Aug. 2002.
 
6
S. Chandrasekaran and M. Franklin. Streaming queries over streaming data. In Proc. 28th Intl. Conf. on Very Large Data Bases, Aug. 2002.
7
 
8
B. Dageville and M. Zait. SQL memory management in Oracle9i. In Proc. of the 2002 Intl. Conf. on Very Large Data Bases, Aug. 2002.
 
9
 
10
J. Hellerstein, M. Franklin, S. Chandrasekaran, A. Deshpande, K. Hildrum, S. Madden, V. Raman, and M. A. Shah. Adaptive query processing: Technology in evolution. IEEE Data Engineering Bulletin, 23(2):7--18, June 2000.
 
11
Internet Traffic Archive. http://www.acm.org/sigcomm/ITA/.
12
13
 
14
J. Kang, J. F. Naughton, and S. Viglas. Evaluating window joins over unbounded streams. In Proc. of the 2003 Intl. Conf. on Data Engineering, Mar. 2003.
15
 
16
 
17
D. Lomet and A. Levy. Special issue on adaptive query processing. IEEE Data Engineering Bulletin, 23(2), June 2000.
18
 
19
R. Motwani, J. Widom, A. Arasu, B. Babcock, S. Babu, M. Datar, G. Manku, C. Olston, J. Rosenstein, and R. Varma. Query processing, approximation, and resource management in a data stream management system. In Proc. First Biennial Conf. on Innovative Data Systems Research (CIDR), Jan. 2003.
20
 
21
Niagara Project. http://www.cs.wisc.edu/niagara/.
 
22
 
23
 
24
V. Raman, A. Deshpande, and J. Hellerstein. Using state modules for adaptive query processing. In Proc. of the 2003 Intl. Conf. on Data Engineering, Mar. 2003.
25
 
26
Stanford Stream Data Management (STREAM) Project. http://www-db.stanford.edu/stream.
 
27
28
 
29
T. Urhan and M. Franklin. Xjoin: A reactively-scheduled pipelined join operator. IEEE Data Engineering Bulletin, 23(2):27--33, June 2000.
 
30
31
32
 
33
W. Willinger, V. Paxson, R. Riedi, and M. Taqqu. Long-range dependence and data network traffic. In Long-range Dependence: Theory and Applications, P. Doukhan, G. Oppenheim and M. S. Taqqu, eds., Birkhauser, 2002.
 
34
W. Willinger, M. Taqqu, and A. Erramilli. A bibliographical guide to self-similar traffic and performance modeling for modern high-speed networks. In F. P. Kelly, S. Zachary, and I. Ziedins, editors, Stochastic Networks: Theory and Applications, pages 339--366. Oxford University Press, 1996.
 
35

CITED BY  39

Collaborative Colleagues:
Brian Babcock: colleagues
Shivnath Babu: colleagues
Rajeev Motwani: colleagues
Mayur Datar: colleagues