ABSTRACT
Techniques for efficient, distributed processing of large, unbounded data streams have gained attention in the database community. Sensors and other data sources continuously produce data, such as the positions of moving objects, that is consumed by, e.g., location-aware applications. Depending on the domain of interest, e.g., visualization, processing such data often requires domain-specific functionality. This functionality is specified as dedicated operators that may require specialized hardware, e.g., GPUs. The result is a strong dependency between operators and hardware that a data stream processing system must take into account when deploying such operators. Many data stream processing systems have been presented so far. However, these systems assume homogeneous computing nodes, do not consider operator deployment constraints, and are not designed to address domain-specific needs. In this paper, we identify the features that a flexible and extensible middleware for distributed stream processing of context data must provide. We present NexusDS, our approach to meeting these requirements. In NexusDS, data processing is specified by orchestrating data flow graphs, which are modeled as processing pipelines of predefined, general operators as well as custom-built, domain-specific ones. We focus on easy extensibility and on support for domain-specific operators and services that may even use specific hardware available on dedicated computing nodes.
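To make the idea of operator deployment constraints on heterogeneous nodes concrete, the following is a minimal sketch and not the NexusDS API. The class names `Operator` and `Node`, the `place` function, the capability label `"gpu"`, and the operator and node names are all hypothetical, and the first-fit strategy is chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """One stage of a processing pipeline (hypothetical model)."""
    name: str
    requires: frozenset = frozenset()  # hardware capabilities the operator needs


@dataclass(frozen=True)
class Node:
    """A computing node and the hardware capabilities it offers (hypothetical)."""
    name: str
    provides: frozenset = frozenset()


def place(pipeline, nodes):
    """Assign each operator to the first node that satisfies its requirements.

    Raises ValueError if no node can host an operator.
    """
    placement = {}
    for op in pipeline:
        # A node qualifies when it provides every capability the operator requires.
        host = next((n for n in nodes if op.requires <= n.provides), None)
        if host is None:
            raise ValueError(f"no node satisfies {op.name}: {sorted(op.requires)}")
        placement[op.name] = host.name
    return placement


# A linear pipeline: two general operators and one domain-specific operator
# that is assumed to need a GPU.
pipeline = [
    Operator("position_source"),
    Operator("range_filter"),
    Operator("render_map", frozenset({"gpu"})),
]

# Heterogeneous nodes: only one of them offers a GPU.
nodes = [
    Node("edge-1"),
    Node("viz-1", frozenset({"gpu"})),
]

placement = place(pipeline, nodes)
```

In this sketch the two general operators land on the first node, while the GPU-dependent operator can only be placed on the node that offers a GPU. A system that assumes homogeneous nodes would have no basis for making that distinction.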