ABSTRACT
Continuous queries are persistent queries that allow users to receive new results when they become available. While continuous query systems can transform a passive web into an active environment, they need to be able to support millions of queries due to the scale of the Internet. No existing system has achieved this level of scalability. NiagaraCQ addresses this problem by grouping continuous queries based on the observation that many web queries share similar structures. Grouped queries can share common computation, tend to fit in memory, and can reduce the I/O cost significantly. Furthermore, grouping on selection predicates can eliminate a large number of unnecessary query invocations. Our grouping technique is distinguished from previous group optimization approaches in the following ways. First, we use an incremental group optimization strategy with dynamic re-grouping. New queries are added to existing query groups without having to regroup already installed queries. Second, we use a query-split scheme that requires minimal changes to a general-purpose query engine. Third, NiagaraCQ groups both change-based and timer-based queries in a uniform way. To ensure that NiagaraCQ is scalable, we have also employed other techniques, including incremental evaluation of continuous queries, use of both pull and push models for detecting heterogeneous data source changes, and memory caching. This paper presents the design of the NiagaraCQ system and gives some experimental results on the system's performance and scalability.
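The grouping idea can be illustrated with a minimal sketch. It assumes the simplest case only: queries that watch the same source and differ solely in the constant of an equality selection predicate. The names `Query`, `GroupedQueries`, `add`, and `on_change`, and the sample source `quotes.xml`, are hypothetical and chosen for illustration; this is not NiagaraCQ's implementation, and it does not cover the query-split scheme or timer-based queries.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Query(NamedTuple):
    """A hypothetical continuous query with one equality selection predicate."""
    query_id: str
    source: str      # data source the query watches
    attribute: str   # attribute tested by the selection predicate
    constant: str    # constant the attribute is compared against


class GroupedQueries:
    """Groups queries by (source, attribute); each group indexes its constants."""

    def __init__(self) -> None:
        # signature -> constant -> ids of the queries that use that constant
        self._groups = defaultdict(lambda: defaultdict(list))

    def add(self, query: Query) -> None:
        # Incremental grouping: a new query joins a matching group (or starts
        # a new one) without touching queries that are already installed.
        signature = (query.source, query.attribute)
        self._groups[signature][query.constant].append(query.query_id)

    def group_count(self) -> int:
        return len(self._groups)

    def on_change(self, source: str, record: Dict[str, str]) -> List[str]:
        """Return the ids of the queries that a changed record should trigger."""
        triggered: List[str] = []
        for (group_source, attribute), constants in self._groups.items():
            if group_source != source or attribute not in record:
                continue
            # One lookup per group replaces one predicate test per query,
            # so queries whose constant does not match are never invoked.
            triggered.extend(constants.get(record[attribute], []))
        return triggered


groups = GroupedQueries()
groups.add(Query("q1", "quotes.xml", "symbol", "INTC"))
groups.add(Query("q2", "quotes.xml", "symbol", "MSFT"))
groups.add(Query("q3", "quotes.xml", "symbol", "INTC"))
print(groups.on_change("quotes.xml", {"symbol": "INTC", "price": "42"}))
# prints ['q1', 'q3']
```

In this sketch all three queries share a single group, so a change to the source costs one dictionary lookup regardless of how many queries are installed, and the query on `MSFT` is never invoked for an `INTC` update.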