ABSTRACT
In many recent applications, data may take the form of continuous data streams, rather than finite stored data sets. Several aspects of data management need to be reconsidered in the presence of data streams, offering a new research direction for the database community. In this paper we focus primarily on the problem of query processing, specifically on how to define and evaluate continuous queries over data streams. We address semantic issues as well as efficiency concerns. Our main contributions are threefold. First, we specify a general and flexible architecture for query processing in the presence of data streams. Second, we use our basic architecture as a tool to clarify alternative semantics and processing techniques for continuous queries. The architecture also captures most previous work on continuous queries and data streams, as well as related concepts such as triggers and materialized views. Finally, we map out research topics in the area of query processing over data streams, showing where previous work is relevant and describing problems yet to be addressed.