ABSTRACT
Parallel supercomputer architectures with complex memory hierarchies or distributed memory systems have become very common. Unfortunately, the problems associated with restructuring software to take advantage of these memory systems are not easily solved. This paper presents an overview of the mathematical issues behind several of these problems and takes a brief look at some potential solutions.
CITED BY 28
Kyle Gallivan, William Jalby, Allen Malony, Harry Wijshoff, Performance prediction of loop constructs on multiprocessor hierarchical-memory systems, Proceedings of the 3rd international conference on Supercomputing, p.433-442, June 05-09, 1989, Crete, Greece
Ernesto Su, Antonio Lain, Shankar Ramaswamy, Daniel J. Palermo, Eugene W. Hodges IV, Prithviraj Banerjee, Advanced compilation techniques in the PARADIGM compiler for distributed-memory multicomputers, Proceedings of the 9th international conference on Supercomputing, p.424-433, July 03-07, 1995, Barcelona, Spain
J. Ramanujam, Jinpyo Hong, Mahmut Kandemir, A. Narayan, Reducing memory requirements of nested loops for embedded systems, Proceedings of the 38th conference on Design automation, p.359-364, June 2001, Las Vegas, Nevada, United States
Robert Schreiber, Shail Aditya, Scott Mahlke, Vinod Kathail, B. Ramakrishna Rau, Darren Cronquist, Mukund Sivaraman, PICO-NPA: High-Level Synthesis of Nonprogrammable Hardware Accelerators, Journal of VLSI Signal Processing Systems, v.31 n.2, p.127-142, June 2002
D. Baxter, R. Mirchandaney, J. H. Saltz, Run-time parallelization and scheduling of loops, Proceedings of the first annual ACM symposium on Parallel algorithms and architectures, p.303-312, June 18-21, 1989, Santa Fe, New Mexico, United States