|
ABSTRACT
In large federated and shared-nothing databases, resources can exhibit widely fluctuating characteristics. Assumptions made at the time a query is submitted will rarely hold throughout the duration of query processing. As a result, traditional static query optimization and execution techniques are ineffective in these environments.
In this paper we introduce a query processing mechanism called an eddy, which continuously reorders operators in a query plan as it runs. We characterize the moments of symmetry during which pipelined joins can be easily reordered, and the synchronization barriers that require inputs from different sources to be coordinated. By combining eddies with appropriate join algorithms, we merge the optimization and execution phases of query processing, allowing each tuple to have a flexible ordering of the query operators. This flexibility is controlled by a combination of fluid dynamics and a simple learning algorithm. Our initial implementation demonstrates promising results, with eddies performing nearly as well as a static optimizer/executor in static scenarios, and providing dramatic improvements in dynamic execution environments.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
AAC+97
|
Andrea C. Arpaci-Dusseau , Remzi H. Arpaci-Dusseau , David E. Culler , Joseph M. Hellerstein , David A. Patterson, High-performance sorting on networks of workstations, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.243-254, May 11-15, 1997, Tucson, Arizona, United States
|
 |
AAT+99
|
Remzi H. Arpaci-Dusseau , Eric Anderson , Noah Treuhaft , David E. Culler , Joseph M. Hellerstein , David Patterson , Kathy Yelick, Cluster I/O with River: making the fast case common, Proceedings of the sixth workshop on I/O in parallel and distributed systems, p.10-22, May 05-05, 1999, Atlanta, Georgia, United States
[doi> 10.1145/301816.301823]
|
| |
AFTU96
|
Laurent Amsaleg , Michael J. Franklin , Anthony Tomasic , Tolga Urhan, Scrambling query plans to cope with unexpected delays, Proceedings of the fourth international conference on on Parallel and distributed information systems, p.208-219, December 18-20, 1996, Miami Beach, Florida, United States
|
| |
AH99
|
|
| |
Aok99
|
|
| |
AZ96
|
|
| |
Bar99
|
R. Barnes. Scale Out. In High Performance Transaction Processing Workshop (HPTS '99), Asilomar, September 1999.
|
| |
BDF+97
|
D. Barbara, W. DuMouchel, C. Faloutsos, P. J. Haas, J. M. Hellerstein, Y. E. Ioannidis, H. V. Jagadish, T. Johnson, R. T. Ng, V. Poosala, K. A. Ross, and K. C. Sevcik. The New Jersey Data Reduction Report. IEEE Data Engineering Bulletin, 20(4), December 1997.
|
 |
BO99
|
|
| |
DGS+90
|
D. J. Dewitt , S. Ghandeharizadeh , D. A. Schneider , A. Bricker , H. -I. Hsiao , R. Rasmussen, The Gamma Database Machine Project, IEEE Transactions on Knowledge and Data Engineering, v.2 n.1, p.44-62, March 1990
[doi> 10.1109/69.50905]
|
 |
DKO+84
|
David J DeWitt , Randy H Katz , Frank Olken , Leonard D Shapiro , Michael R Stonebraker , David Wood, Implementation techniques for main memory database systems, Proceedings of the 1984 ACM SIGMOD international conference on Management of data, June 18-21, 1984, Boston, Massachusetts
|
 |
FMLS99
|
Daniela Florescu , Alon Levy , Ioana Manolescu , Dan Suciu, Query optimization in the presence of limited access patterns, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.311-322, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
|
 |
GC94
|
|
| |
GMPQ+97
|
Hector Garcia-Molina , Yannis Papakonstantinou , Dallan Quass , Anand Rajaraman , Yehoshua Sagiv , Jeffrey Ullman , Vasilis Vassalos , Jennifer Widom, The TSIMMIS Approach to Mediation: Data Models and Languages, Journal of Intelligent Information Systems, v.8 n.2, p.117-132, March/April 1997
[doi> 10.1023/A:1008683107812]
|
 |
Gra90
|
|
| |
GWBC99
|
S. D. Gribble, M. Welsh, E. A. Brewer, and D. Culler. The Multi- Space: an Evolutionary Platform for Infrastructural Services. In Proceedings of the 1999 Usenix Annual Technical Conference, Monterey, June 1999.
|
| |
HAC+99
|
Joseph M. Hellerstein , Ron Avnur , Andy Chou , Christian Hidber , Chris Olston , Vijayshankar Raman , Tali Roth , Peter J. Haas, Interactive Data Analysis: The Control Project, Computer, v.32 n.8, p.51-59, August 1999
[doi> 10.1109/2.781635]
|
 |
Hel98
|
|
 |
HH99
|
|
| |
HKWY97
|
|
| |
HSC99
|
J. M. Hellerstein, M. Stonebraker, and R. Caccia. Open, Independent Enterprise Data Integration. IEEE Data Engineering Bulletin, 22(1), March 1999. http://www.cohera.com.
|
 |
IFF+99
|
Zachary G. Ives , Daniela Florescu , Marc Friedman , Alon Levy , Daniel S. Weld, An adaptive query execution system for data integration, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.299-310, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
|
 |
IK84
|
|
| |
INSS97
|
|
| |
KBZ86
|
|
 |
KD98
|
|
| |
Met97
|
R. Van Meter. Observing the Effects of Multi-Zone Disks. In Proceedings of the Usenix 1997 Technical Conference, Anaheim, January 1997.
|
| |
Mit97
|
|
| |
NWMN99
|
|
| |
RPK+99
|
|
| |
RRH99
|
|
| |
SB98
|
|
| |
SBH98
|
M. Stonebraker, P. Brown, and M. Herbach. Interoperability, Distributed Applications, and Distributed Databases: The Virtual Table Interface. IEEE Data Engineering Bulletin, 21(3):25-34, September 1998.
|
| |
Son98
|
|
 |
SWK76
|
|
| |
UF99
|
T. Urhan and M. Franklin. XJoin: Getting Fast Answers From Slow and Bursty Networks. Technical Report CS-TR-3994, University of Maryland, February 1999.
|
 |
UFA98
|
Tolga Urhan , Michael J. Franklin , Laurent Amsaleg, Cost-based query scrambling for initial delays, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.130-141, June 01-04, 1998, Seattle, Washington, United States
|
| |
WA91
|
|
| |
WW94
|
C. A. Waldspurger and W. E. Weihl. Lottery scheduling: Flexible proportional-share resource management. In Proc. of the First Symposium on Operating Systems Design and Implementation (OSDI '94), pages 1-11, Monterey, CA, November 1994. USENIX Assoc.
|
CITED BY 114
|
|
|
|
|
Vladimir Zadorozhny , Louiqa Raschid , Maria Esther Vidal , Tolga Urhan , Laura Bright, Efficient evaluation of queries in a mediator for WebSources, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Volker Markl , Vijayshankar Raman , David Simmen , Guy Lohman , Hamid Pirahesh , Miso Cilimdzic, Robust query processing through progressive optimization, Proceedings of the 2004 ACM SIGMOD international conference on Management of data, June 13-18, 2004, Paris, France
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Larry Huston , Rahul Sukthankar , Rajiv Wickremesinghe , M. Satyanarayanan , Gregory R. Ganger , Erik Riedel , Anastassia Ailamaki, Diamond: A Storage Architecture for Early Discard in Interactive Search, Proceedings of the 3rd USENIX Conference on File and Storage Technologies, March 31-31, 2004, San Francisco, CA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
V. Fontes , B. Schulze , M. Dutra , F. Porto , A. Barbosa, CoDIMS-G: a data and program integration service for the grid, Proceedings of the 2nd workshop on Middleware for grid computing, p.29-34, October 18-22, 2004, Toronto, Ontario, Canada
|
|
|
Brian Babcock , Shivnath Babu , Mayur Datar , Rajeev Motwani , Jennifer Widom, Models and issues in data stream systems, Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 03-05, 2002, Madison, Wisconsin
|
|
|
Shivnath Babu , Rajeev Motwani , Kamesh Munagala , Itaru Nishizawa , Jennifer Widom, Adaptive ordering of pipelined stream filters, Proceedings of the 2004 ACM SIGMOD international conference on Management of data, June 13-18, 2004, Paris, France
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Anne Condon , Amol Deshpande , Lisa Hellerstein , Ning Wu, Flow algorithms for two pipelined filter ordering problems, Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 26-28, 2006, Chicago, IL, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
David Chu , Kaisen Lin , Alexandre Linares , Giang Nguyen , Joseph M. Hellerstein, Sdlib: a sensor network data and communications library for rapid and robust application development, Proceedings of the fifth international conference on Information processing in sensor networks, April 19-21, 2006, Nashville, Tennessee, USA
|
|
|
Ihab F. Ilyas , Walid G. Aref , Ahmed K. Elmagarmid , Hicham G. Elmongui , Rahul Shah , Jeffrey Scott Vitter, Adaptive rank-aware query optimization in relational databases, ACM Transactions on Database Systems (TODS), v.31 n.4, p.1257-1304, December 2006
|
|
|
|
|
|
Hua-Gang Li , Songting Chen , Junichi Tatemura , Divyakant Agrawal , K. Selçuk Candan , Wang-Pin Hsiung, Safety guarantee of continuous join queries over punctuated data streams, Proceedings of the 32nd international conference on Very large data bases, September 12-15, 2006, Seoul, Korea
|
|
|
|
|
|
|
|
|
Holger Bast , Debapriyo Majumdar , Ralf Schenkel , Martin Theobald , Gerhard Weikum, IO-Top-k: index-access optimized top-k query processing, Proceedings of the 32nd international conference on Very large data bases, September 12-15, 2006, Seoul, Korea
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Choi Changbai , Lim Jaehyoung , Han Juyeon , Jang Insung , Kim Minsoo , Soon J. Hyun, SNQL: a query language for sensor network databases, Proceedings of the 7th WSEAS International Conference on Telecommunications and Informatics, p.114-119, May 27-30, 2008, Istanbul, Turkey
|
|
|
Don Carney , Uğur Çetintemel , Alex Rasin , Stan Zdonik , Mitch Cherniack , Mike Stonebraker, Operator scheduling in a data stream manager, Proceedings of the 29th international conference on Very large data bases, p.838-849, September 09-12, 2003, Berlin, Germany
|
|
|
|
|
|
Sailesh Krishnamurthy , Michael J. Franklin , Joseph M. Hellerstein , Garrett Jacobson, The case for precision sharing, Proceedings of the Thirtieth international conference on Very large data bases, p.972-984, August 31-September 03, 2004, Toronto, Canada
|
|
|
|
|
|
|
|
|
Ryan Huebsch , Joseph M. Hellerstein , Nick Lanham , Boon Thau Loo , Scott Shenker , Ion Stoica, Querying the internet with PIER, Proceedings of the 29th international conference on Very large data bases, p.321-332, September 09-12, 2003, Berlin, Germany
|
|
|
|
|
|
|
|
|
Don Carney , Uǧur Çetintemel , Mitch Cherniack , Christian Convey , Sangdon Lee , Greg Seidman , Michael Stonebraker , Nesime Tatbul , Stan Zdonik, Monitoring streams: a new class of data management applications, Proceedings of the 28th international conference on Very Large Data Bases, p.215-226, August 20-23, 2002, Hong Kong, China
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Navendu Jain , Dmitry Kit , Prince Mahajan , Praveen Yalagandula , Mike Dahlin , Yin Zhang, STAR: self-tuning aggregation for scalable monitoring, Proceedings of the 33rd international conference on Very large data bases, September 23-27, 2007, Vienna, Austria
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Daniel J. Abadi , Don Carney , Ugur Çetintemel , Mitch Cherniack , Christian Convey , Sangdon Lee , Michael Stonebraker , Nesime Tatbul , Stan Zdonik, Aurora: a new model and architecture for data stream management, The VLDB Journal — The International Journal on Very Large Data Bases, v.12 n.2, p.120-139, August 2003
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Mahmoud El Samad , Julien Gossa , Franck Morvan , Abdelkader Hameurlain , Jean-Marc Pierson , Lionel Brunie, A monitoring service for large-scale dynamic query optimisation in a grid environment, International Journal of Web and Grid Services, v.4 n.2, p.222-246, June 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Anastasios Gounaris , Jim Smith , Norman W. Paton , Rizos Sakellariou , Alvaro A. Fernandes , Paul Watson, Adaptive workload allocation in query processing in autonomous heterogeneous environments, Distributed and Parallel Databases, v.25 n.3, p.125-164, June 2009
|
|
|
|
|
|
Norman W. Paton , Jorge Buenabad-Chavez , Mengsong Chen , Vijayshankar Raman , Garret Swart , Inderpal Narang , Daniel M. Yellin , Alvaro A. Fernandes, Autonomic query parallelization using non-dedicated computers: an evaluation of adaptivity options, The VLDB Journal — The International Journal on Very Large Data Bases, v.18 n.1, p.119-140, January 2009
|
|
|
|
|
|
|
|
|
Riham Abdel Kader , Peter Boncz , Stefan Manegold , Maurice van Keulen, ROX: run-time optimization of XQueries, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|