|
ABSTRACT
The success of the von Neumann model of sequential computation is attributable to the fact that it is an efficient bridge between software and hardware: high-level languages can be efficiently compiled on to this model; yet it can be effeciently implemented in hardware. The author argues that an analogous bridge between software and hardware in required for parallel computation if that is to become as widely used. This article introduces the bulk-synchronous parallel (BSP) model as a candidate for this role, and gives results quantifying its efficiency both in implementing high-level language features and algorithms, as well as in being implemented in hardware.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
Aiello, B., Leighton, FT., Maggs, B., and Neumann, M. Fast algorithms for bit-serial routing on a hypercube. Manuscript, 1990.
|
| |
3
|
Anderson, R.J. and Miller, G.L. Optical communication for pointer based algorithms. Tech. Pep. CRI 88-14, Computer Science Dept~, Univ. of Southern California, 1988.
|
| |
4
|
|
| |
5
|
Carter, J.L. and Wegman, M.N. Universal classes of hash functions. J. Comput. Syst. Sci. 18 (1979) 143-154.
|
| |
6
|
|
 |
7
|
|
| |
8
|
Gottlieb, A. et al. The NYU ultracomputer-- Designing an MIMD shared memory parallel computer. IEEE 7~ans. Comput. 32, 2 (1983) 175-189.
|
| |
9
|
Hoeffding, W. Probability inequalities for sums of bounded random variables. Am. Stat. Assoc. J. 58 (1963) 13-30.
|
 |
10
|
|
| |
11
|
|
| |
12
|
|
 |
13
|
|
| |
14
|
|
| |
15
|
|
| |
16
|
Maniloff, E.S.,Johnson, K.M., and Reif, J.H. Holographic routing network for parallel processing machines. Society of Photo Optical Instrumentation Engineers (SPIE), Paris, France 1989, V 1136, Holographic Optics II, Principles and Applications, 283-289.
|
| |
17
|
|
 |
18
|
|
| |
19
|
|
| |
20
|
Ranade, A.G. How to emulate shared memory. In Proceedings of the Twenty-eighth IEEE Symposium on Foundations of Computer Science (1987) pp. 185-194.
|
 |
21
|
|
| |
22
|
Siegel, A. On universal classes of fast high performance hash functions. In Proceedings of the Thirtieth IEEE Symposium on Foundations of Computer Science (1989).
|
| |
23
|
|
| |
24
|
Turing, A.M. On computable numbers with an application to the Entscheidungs problem. In Proceedings of the London Mathematical Society 42 2 (1936) 230-265; correction, ibidem 43 (1937) 544-546.
|
 |
25
|
|
| |
26
|
Valiant, L.G. A scheme for fast parallel communication. SIAM jr. Comput. 11 (1982) 350-361.
|
| |
27
|
Valiant, L.G. Optimally universal parallel computers. Phil. Trans. R. Soc. Lond. A326 (1988) 373-376.
|
| |
28
|
Valiant, L.G. Bulk-synchronous parallel computers. In Parallel Processing and Artificial Intelligence, M. Reeve and S.E. Zenith, Eds., Wiley, 1989 15-22.
|
| |
29
|
|
 |
30
|
|
CITED BY 242
|
|
Uzi Vishkin , Shlomit Dascal , Efraim Berkovich , Joseph Nuzman, Explicit multi-threading (XMT) bridging models for instruction parallelism (extended abstract), Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures, p.140-151, June 28-July 02, 1998, Puerto Vallarta, Mexico
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Jesper Larsson Traff , Hubert Ritzdorf , Rolf Hempel, The implementation of MPI-2 one-sided communication for the NEC SX-5, Proceedings of the 2000 ACM/IEEE conference on Supercomputing (CDROM), p.1-es, November 04-10, 2000, Dallas, Texas, United States
|
|
|
Micah Adler , Phillip B. Gibbons , Vijaya Ramachandran , Yossi Matias, Modeling parallel bandwidth: local vs. global restrictions, Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures, p.94-105, June 23-25, 1997, Newport, Rhode Island, United States
|
|
|
Mark M. Mathis , Nancy M. Amato , Marvin L. Adams, A general performance model for parallel sweeps on orthogonal grids for particle transport calculations, Proceedings of the 14th international conference on Supercomputing, p.255-263, May 08-11, 2000, Santa Fe, New Mexico, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Micah Adler , Wolfgang Dittrich , Ben Juurlink , Mirosław Kutyłowski , Ingo Rieping, Communication-optimal parallel minimum spanning tree algorithms (extended abstract), Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures, p.27-36, June 28-July 02, 1998, Puerto Vallarta, Mexico
|
|
|
|
|
|
|
|
|
|
|
|
Vijaya Ramachandran , Brian Grayson , Michael Dahlin, Emulations between QSM, BSP, and LogP: a framework for general-purpose parallel algorithm design, Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms, p.957-958, January 17-19, 1999, Baltimore, Maryland, United States
|
|
|
|
|
|
|
|
|
|
|
|
Phillip B. Gibbons , Yossi Matias , Vijaya Ramachandran, Efficient low-contention parallel algorithms, Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures, p.236-247, June 27-29, 1994, Cape May, New Jersey, United States
|
|
|
|
|
|
|
|
|
Frank Dehne , Xiaotie Deng , Patrick Dymond , Andreas Fabri , Ashfaq A. Khokhar, A randomized parallel 3D convex hull algorithm for coarse grained multicomputers, Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures, p.27-33, June 24-26, 1995, Santa Barbara, California, United States
|
|
|
Richard M. Karp , Abhijit Sahay , Eunice E. Santos , Klaus Erik Schauser, Optimal broadcast and summation in the LogP model, Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures, p.142-153, June 30-July 02, 1993, Velen, Germany
|
|
|
|
|
|
|
|
|
|
|
|
Xiaotie Deng , Nian Gu , Tim Brecht , KaiCheng Lu, Preemptive scheduling of parallel jobs on multiprocessors, Proceedings of the seventh annual ACM-SIAM symposium on Discrete algorithms, p.159-167, January 28-30, 1996, Atlanta, Georgia, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Matthew Andrews , Tom Leighton , P. Takis Metaxas , Lisa Zhang, Automatic methods for hiding latency in high bandwidth networks (extended abstract), Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, p.257-265, May 22-24, 1996, Philadelphia, Pennsylvania, United States
|
|
|
|
|
|
Mark W. Goudreau , Kevin Lang , Girija Narlikar , Satish B. Rao, BOS is boss: a case for bulk-synchronous object systems, Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures, p.115-125, June 27-30, 1999, Saint Malo, France
|
|
|
|
|
|
|
|
|
Spyros C. Kontogiannis , Grammati E. Pantziou , Paul G. Spirakis , Moti Yung, “Dynamic-fault-prone BSP”: a paradigm for robust computations in changing environments, Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures, p.37-46, June 28-July 02, 1998, Puerto Vallarta, Mexico
|
|
|
|
|
|
Jaswinder Pal Singh , Edward Rothberg , Anoop Gupta, Modeling communication in parallel algorithms: a fruitful interaction between theory and systems?, Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures, p.189-199, June 27-29, 1994, Cape May, New Jersey, United States
|
|
|
Naohito Sato , Satoshi Matsuoka , Jean-Marc Jézéquel , Akinori Yonezawa, A methodology for specifying data distribution using only standard object-oriented features, Proceedings of the 11th international conference on Supercomputing, p.116-123, July 07-11, 1997, Vienna, Austria
|
|
|
|
|
|
Phillip B. Gibbons , Yossi Matias , Vijaya Ramachandran, Can shared-memory model serve as a bridging model for parallel computation?, Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures, p.72-83, June 23-25, 1997, Newport, Rhode Island, United States
|
|
|
Z. M. Kedem , K. V. Palem , M. O. Rabin , A. Raghunathan, Efficient program transformations for resilient parallel computation via randomization (preliminary version), Proceedings of the twenty-fourth annual ACM symposium on Theory of computing, p.306-317, May 04-06, 1992, Victoria, British Columbia, Canada
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Frank Dehne , Wolfgang Dittrich , David Hutchinson, Efficient external memory algorithms by simulating coarse-grained parallel algorithms, Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures, p.106-115, June 23-25, 1997, Newport, Rhode Island, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Mark Goudreau , Kevin Lang , Satish Rao , Torsten Suel , Thanasis Tsantilas, Towards efficiency and portability: programming with the BSP model, Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures, p.1-12, June 24-26, 1996, Padua, Italy
|
|
|
Bogdan S. Chlebus , Stefan Dobrev , Dariusz R. Kowalski , Grzegorz Malewicz , Alex Shvartsman , Imrich Vrto, Towards practical deteministic write-all algorithms, Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures, p.271-280, July 2001, Crete Island, Greece
|
|
|
|
|
|
|
|
|
Matthew Andrews , Tom Leighton , P. Takis Metaxas , Lisa Zhang, Improved methods for hiding latency in high bandwidth networks (extended abstract), Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures, p.52-61, June 24-26, 1996, Padua, Italy
|
|
|
Gianfranco Bilardi , Kieran T. Herley , Andrea Pietracaprina , Geppino Pucci , Paul Spirakis, BSP vs LogP, Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures, p.25-32, June 24-26, 1996, Padua, Italy
|
|
|
Albert Alexandrov , Mihai F. Ionescu , Klaus E. Schauser , Chris Scheiman, LogGP: incorporating long messages into the LogP model—one step closer towards a realistic model for parallel computation, Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures, p.95-105, June 24-26, 1995, Santa Barbara, California, United States
|
|
|
|
|
|
|
|
|
Francisco Almeida , Rumen Andonov , Daniel Gonzalez , Luz M. Moreno , Vincent Poirriez , Casiano Rodriguez, Optimal tiling for the RNA base pairing problem, Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures, August 10-13, 2002, Winnipeg, Manitoba, Canada
|
|
|
David Culler , Richard Karp , David Patterson , Abhijit Sahay , Klaus Erik Schauser , Eunice Santos , Ramesh Subramonian , Thorsten von Eicken, LogP: towards a realistic model of parallel computation, ACM SIGPLAN Notices, v.28 n.7, p.1-12, July 1993
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
David E. Culler , Richard M. Karp , David Patterson , Abhijit Sahay , Eunice E. Santos , Klaus Erik Schauser , Ramesh Subramonian , Thorsten von Eicken, LogP: a practical model of parallel computation, Communications of the ACM, v.39 n.11, p.78-85, Nov. 1996
|
|
|
Phillip B. Gibbons , Yossi Matias , Vijaya Ramachandran, The QRQW PRAM: accounting for contention in parallel algorithms, Proceedings of the fifth annual ACM-SIAM symposium on Discrete algorithms, p.638-648, January 23-25, 1994, Arlington, Virginia, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Z. M. Kedem , K. V. Palem , A. Raghunathan , P. G. Spirakis, Combining tentative and definite executions for very fast dependable parallel computing, Proceedings of the twenty-third annual ACM symposium on Theory of computing, p.381-390, May 05-08, 1991, New Orleans, Louisiana, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Seth Pettie , Vijaya Ramachandran, Minimizing randomness in minimum spanning tree, parallel connectivity, and set maxima algorithms, Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms, p.713-722, January 06-08, 2002, San Francisco, California
|
|
|
|
|
|
|
|
|
Guy E. Blelloch , Phillip B. Gibbons , Yossi Matias , Marco Zagha, Accounting for memory bank contention and delay in high-bandwidth multiprocessors, Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures, p.84-94, June 24-26, 1995, Santa Barbara, California, United States
|
|
|
Micah Adler , John W. Byers , Richard M. Karp, Parallel sorting with limited bandwidth, Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures, p.129-136, June 24-26, 1995, Santa Barbara, California, United States
|
|
|
Spyros C. Kontogiannis , Grammati E. Pantziou , Paul G. Spirakis, Efficient computations on fault-prone BSP machines, Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures, p.84-93, June 23-25, 1997, Newport, Rhode Island, United States
|
|
|
F. Dehne , W. Dittrich , D. Hutchinson , A. Maheshwari, Parallel virtual memory, Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms, p.889-890, January 17-19, 1999, Baltimore, Maryland, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
V. Blanco , J. A. González , C. León , C. Rodríguez , G. Rodríguez , M. Printista, Predicting the performance of parallel programs, Parallel Computing, v.30 n.3, p.337-356, March 2004
|
|
|
|
|
|
|
|
|
Raphael Y. de Camargo , Andrei Goldchleger , Fabio Kon , Alfredo Goldman, Checkpointing-based rollback recovery for parallel applications on the InteGrade grid middleware, Proceedings of the 2nd workshop on Middleware for grid computing, p.35-40, October 18-22, 2004, Toronto, Ontario, Canada
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Andrei Goldchleger , Alfredo Goldman , Ulisses Hayashida , Fabio Kon, The implementation of the BSP parallel computing model on the InteGrade Grid middleware, Proceedings of the 3rd international workshop on Middleware for grid computing, p.1-6, November 28-December 02, 2005, Grenoble, France
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Gail A. Alverson , William G. Griswold , Calvin Lin , David Notkin , Lawrence Snyder, Abstractions for Portable, Scalable Parallel Programming, IEEE Transactions on Parallel and Distributed Systems, v.9 n.1, p.71-86, January 1998
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Martha Mercaldi , Steven Swanson , Andrew Petersen , Andrew Putnam , Andrew Schwerin , Mark Oskin , Susan J. Eggers, Modeling instruction placement on a spatial architecture, Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures, July 30-August 02, 2006, Cambridge, Massachusetts, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Christof Krick , Friedhelm Meyer auf der Heide , Harald Räcke , Berthold Vöcking , Matthias Westermann, Data management in networks: experimental evaluation of a provably good strategy, Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures, p.165-174, June 27-30, 1999, Saint Malo, France
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Bradley J. Barnes , Barry Rountree , David K. Lowenthal , Jaxk Reeves , Bronis de Supinski , Martin Schulz, A regression-based approach to scalability prediction, Proceedings of the 22nd annual international conference on Supercomputing, June 07-12, 2008, Island of Kos, Greece
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Doruk Bozdağ , Assefaw H. Gebremedhin , Fredrik Manne , Erik G. Boman , Umit V. Catalyurek, A framework for scalable greedy coloring on distributed-memory parallel computers, Journal of Parallel and Distributed Computing, v.68 n.4, p.515-535, April, 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
John H. Kelm , Daniel R. Johnson , Matthew R. Johnson , Neal C. Crago , William Tuohy , Aqeel Mahesri , Steven S. Lumetta , Matthew I. Frank , Sanjay J. Patel, Rigel: an architecture and scalable programming interface for a 1000-core accelerator, ACM SIGARCH Computer Architecture News, v.37 n.3, June 2009
|
REVIEW
"Ronald J. Watro : Reviewer"
Valiant observes that parallel processing has not made the expected
progress in displacing sequential processing in computationally
intensive domains. He attributes this failure to the lack of an
appropriate bridging model for parallel computa
more...
|