ABSTRACT
Data Stream Management Systems (DSMSs) operate under strict performance requirements. Meeting these requirements depends on efficiently handling time-critical tasks: managing the internal state of continuous query operators, handling traffic on the queues between operators, and providing storage support for shared computation and archived data. In this paper, we introduce a general-purpose storage management framework for DSMSs that performs these tasks based on a clean, loosely coupled, and flexible system design that also facilitates performance optimization. An important contribution of the framework is that, in analogy to buffer management techniques in relational database systems, it uses information about the access patterns of streaming applications to tune and customize the performance of the storage manager. We first analyze typical application requirements at different granularities in order to identify important tunable parameters and their corresponding values. Based on these parameters, we define a general-purpose storage management interface. Using this interface, a developer can use our SMS (Storage Manager for Streams) to generate a customized storage manager for streaming applications. We explore the performance and potential of SMS through a set of experiments using the Linear Road benchmark.
CITED BY
Gustavo Alonso , Donald Kossmann , Timothy Roscoe , Nesime Tatbul , Andrew Baumann , Carsten Binnig , Peter Fischer , Oriana Riva , Jens Teubner, The ETH Zurich systems group and enterprise computing center, ACM SIGMOD Record, v.37 n.4, December 2008