ABSTRACT
Dryad is a general-purpose distributed execution engine for coarse-grain data-parallel applications. A Dryad application combines computational "vertices" with communication "channels" to form a dataflow graph. Dryad runs the application by executing the vertices of this graph on a set of available computers, communicating as appropriate through files, TCP pipes, and shared-memory FIFOs. The vertices provided by the application developer are quite simple and are usually written as sequential programs with no thread creation or locking. Concurrency arises from Dryad scheduling vertices to run simultaneously on multiple computers, or on multiple CPU cores within a computer. The application can discover the size and placement of data at run time, and modify the graph as the computation progresses to make efficient use of the available resources. Dryad is designed to scale from powerful multi-core single computers, through small clusters of computers, to data centers with thousands of computers. The Dryad execution engine handles all the difficult problems of creating a large distributed, concurrent application: scheduling the use of computers and their CPUs, recovering from communication or computer failures, and transporting data between vertices.
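To make the vertex-and-channel model concrete, the following is a minimal single-machine sketch in Python. It is not Dryad's API: `run_dataflow`, the vertex functions, and the graph layout are all hypothetical names chosen for illustration, channels are reduced to in-memory return values, and the scheduler simply runs every ready vertex of the graph concurrently on a thread pool. Failure recovery, placement across computers, and run-time graph modification are deliberately left out.

```python
from concurrent.futures import ThreadPoolExecutor


def run_dataflow(vertices, edges, workers=4):
    """Run an acyclic dataflow graph (illustrative only, not Dryad's interface).

    vertices: dict mapping vertex name -> function taking a list of inputs
    edges:    list of (source, destination) pairs, standing in for channels
    Returns a dict mapping vertex name -> that vertex's output.
    """
    # Record, for each vertex, which upstream vertices feed it (in edge order).
    upstream = {name: [] for name in vertices}
    for src, dst in edges:
        upstream[dst].append(src)

    results = {}
    pending = set(vertices)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending:
            # A vertex is ready once every upstream output is available.
            ready = sorted(v for v in pending
                           if all(u in results for u in upstream[v]))
            if not ready:
                raise ValueError("graph contains a cycle")
            # Ready vertices run concurrently; each one is plain sequential code.
            futures = {v: pool.submit(vertices[v],
                                      [results[u] for u in upstream[v]])
                       for v in ready}
            for v, future in futures.items():
                results[v] = future.result()
            pending -= set(ready)
    return results


# Hypothetical vertices: sequential functions with no threads or locks.
def read_a(_inputs):
    return [3, 1, 2]


def read_b(_inputs):
    return [6, 5, 4]


def sort_vertex(inputs):
    return sorted(inputs[0])


def merge_vertex(inputs):
    return sorted(inputs[0] + inputs[1])


VERTICES = {
    "read_a": read_a,
    "read_b": read_b,
    "sort_a": sort_vertex,
    "sort_b": sort_vertex,
    "merge": merge_vertex,
}
EDGES = [
    ("read_a", "sort_a"),
    ("read_b", "sort_b"),
    ("sort_a", "merge"),
    ("sort_b", "merge"),
]
```

Calling `run_dataflow(VERTICES, EDGES)["merge"]` yields the merged, sorted list. The point of the sketch is the division of labor the abstract describes: the vertex code stays sequential, and all concurrency comes from the engine deciding which vertices can run at the same time.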