ABSTRACT
Parallel computers with tens of thousands of processors are typically programmed in a data parallel style, as opposed to the control parallel style used in multiprocessing. The success of data parallel algorithms—even on problems that at first glance seem inherently serial—suggests that this style of programming has much wider applicability than was previously thought.
REVIEW
Reviewer: Frans J. Peters
This expository paper presents a number of algorithms that have been
implemented on the Connection Machine, a system consisting of tens of
thousands of processors. The paper expresses the view that such fine-grained
SIMD (Single Instruction stre