|
ABSTRACT
Relational query optimizers have traditionally relied upon table cardinalities when estimating the cost of the query plans they consider. While this approach has been and continues to be successful, the advent of the Internet and the need to execute queries over streaming sources requires a different approach, since for streaming inputs the cardinality may not be known or may not even be knowable (as is the case for an unbounded stream.) In view of this, we propose shifting from a cardinality-based approach to a rate-based approach, and give an optimization framework that aims at maximizing the output rate of query evaluation plans. This approach can be applied to cases where the cardinality-based approach cannot be used. It may also be useful for cases where cardinalities are known, because by focusing on rates we are able not only to optimize the time at which the last result tuple appears, but also to optimize for the number of answers computed at any specified time after the query evaluation commences. We present a preliminary validation of our rate-based optimization framework on a prototype XML query engine, though it is generic enough to be used in other database contexts. The results show that rate-based optimization is feasible and can indeed yield correct decisions.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Arasu, B. Babcock, S. Babu, J. McAlister and J. Widom, Characterizing Memory Requirements for Queries over Continuous Data Streams, Stanford Techinical Report, November 2001, http://dbpubs.stanford.edu/pub/2001-49.
|
 |
2
|
|
| |
3
|
|
 |
4
|
|
 |
5
|
Jianjun Chen , David J. DeWitt , Feng Tian , Yuan Wang, NiagaraCQ: a scalable continuous query system for Internet databases, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.379-390, May 15-18, 2000, Dallas, Texas, United States
|
| |
6
|
Mayur Datar , Aristides Gionis , Piotr Indyk , Rajeev Motwani, Maintaining stream statistics over sliding windows: (extended abstract), Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms, p.635-644, January 06-08, 2002, San Francisco, California
|
| |
7
|
|
 |
8
|
|
| |
9
|
|
 |
10
|
Zachary G. Ives , Daniela Florescu , Marc Friedman , Alon Levy , Daniel S. Weld, An adaptive query execution system for data integration, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.299-310, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
|
| |
11
|
Z. G. Ives, A. Y. Levy and D. S. Weld. Efficient Evaluation of Regular Path Expressions on Streaming XML Data, University of Washington, Technical Report UW-CSE-2000-05-02.
|
 |
12
|
|
| |
13
|
|
| |
14
|
J. Naughton, D. J. DeWitt, D. Maier et al. The Niagara Internet Query System, IEEE Data Engineering Bulletin 24 (2): 27-33 (2001), http://www.cs.wisc.edu/niagara.
|
| |
15
|
|
 |
16
|
P. Griffiths Selinger , M. M. Astrahan , D. D. Chamberlin , R. A. Lorie , T. G. Price, Access path selection in a relational database management system, Proceedings of the 1979 ACM SIGMOD international conference on Management of data, May 30-June 01, 1979, Boston, Massachusetts
[doi> 10.1145/582095.582099]
|
| |
17
|
Michael Stonebraker , Paul M. Aoki , Witold Litwin , Avi Pfeffer , Adam Sah , Jeff Sidell , Carl Staelin , Andrew Yu, Mariposa: a wide-area distributed database system, The VLDB Journal — The International Journal on Very Large Data Bases, v.5 n.1, p.048-063, January 1996
[doi> 10.1007/s007780050015]
|
| |
18
|
G. Schumacher. GEI's Experience with Britton-Lee's IDM, IWDM, 1983, pp. 233-241.
|
 |
19
|
|
| |
20
|
Jayavel Shanmugasundaram , Eugene J. Shekita , Rimon Barr , Michael J. Carey , Bruce G. Lindsay , Hamid Pirahesh , Berthold Reinwald, Efficiently Publishing Relational Data as XML Documents, Proceedings of the 26th International Conference on Very Large Data Bases, p.65-76, September 10-14, 2000
|
| |
21
|
J. Shanmugasundaram, K. Tufte, D. J. DeWitt, J. F. Naughton and D. Maier. Architecting a Network Query Engine for Producing Partial Results, WebDB 2000.
|
| |
22
|
T. Urhan and M. J. Franklin. Xjoin: A Reactively-Scheduled Pipelined Join Operator, IEEE Data Engineering Bulletin, June 2000, (23) 2:27-33.
|
 |
23
|
Tolga Urhan , Michael J. Franklin , Laurent Amsaleg, Cost-based query scrambling for initial delays, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.130-141, June 01-04, 1998, Seattle, Washington, United States
|
| |
24
|
A. N. Wilschut and P. M. G. Apers. Pipelining in Query Execution, Conference on Databases, Parallel Architectures and their Applications, Miami, 1991.
|
 |
25
|
Chun Zhang , Jeffrey Naughton , David DeWitt , Qiong Luo , Guy Lohman, On supporting containment queries in relational database management systems, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.425-436, May 21-24, 2001, Santa Barbara, California, United States
|
CITED BY 46
|
|
|
|
|
|
|
|
Yufei Tao , Man Lung Yiu , Dimitris Papadias , Marios Hadjieleftheriou , Nikos Mamoulis, RPJ: producing fast join results on streams through rate-based optimization, Proceedings of the 2005 ACM SIGMOD international conference on Management of data, June 14-16, 2005, Baltimore, Maryland
|
|
|
|
|
|
Mohamed Medhat Gaber , Shonali Krishnaswamy , Arkady Zaslavsky, Cost-efficient mining techniques for data streams, Proceedings of the second workshop on Australasian information security, Data Mining and Web Intelligence, and Software Internationalisation, p.109-114, January 01, 2004, Dunedin, New Zealand
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Brian Babcock , Shivnath Babu , Mayur Datar , Rajeev Motwani , Jennifer Widom, Models and issues in data stream systems, Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 03-05, 2002, Madison, Wisconsin
|
|
|
|
|
|
|
|
|
|
|
|
Deepak S. Turaga , Brian Foo , Olivier Verscheure , Rong Yan, Configuring topologies of distributed semantic concept classifiers for continuous multimedia stream processing, Proceeding of the 16th ACM international conference on Multimedia, October 26-31, 2008, Vancouver, British Columbia, Canada
|
|
|
|
|
|
|
|
|
|
|
|
Ioana Stanoi , George Mihaila , Themis Palpanas , Christian Lang, Maximizing the sustained throughput of distributed continuous queries, Proceedings of the 15th ACM international conference on Information and knowledge management, November 06-11, 2006, Arlington, Virginia, USA
|
|
|
|
|
|
|
|
|
Gisik Kwon , K. Selçuk Candan, DANS: decentralized, autonomous, and networkwide service delivery and multimedia workflow processing, Proceedings of the 14th annual ACM international conference on Multimedia, October 23-27, 2006, Santa Barbara, CA, USA
|
|
|
|
|
|
|
|
|
Kristine Towne , Qiang Zhu , Calisto Zuzarte , Wen-Chi Hou, Window query processing for joining data streams with relations, Proceedings of the 2007 conference of the center for advanced studies on Collaborative research, October 22-25, 2007, Richmond Hill, Ontario, Canada
|
|
|
|
|
|
Don Carney , Uğur Çetintemel , Alex Rasin , Stan Zdonik , Mitch Cherniack , Mike Stonebraker, Operator scheduling in a data stream manager, Proceedings of the 29th international conference on Very large data bases, p.838-849, September 09-12, 2003, Berlin, Germany
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Daniel J. Abadi , Don Carney , Ugur Çetintemel , Mitch Cherniack , Christian Convey , Sangdon Lee , Michael Stonebraker , Nesime Tatbul , Stan Zdonik, Aurora: a new model and architecture for data stream management, The VLDB Journal — The International Journal on Very Large Data Bases, v.12 n.2, p.120-139, August 2003
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|