ABSTRACT
Modern supercomputers such as the CRAY X-MP and the IBM 3090 VF achieve their high computing speed by using both vector and parallel hardware. The available multitasking concepts, which support concurrent execution of tasks within a single application, have been designed for different purposes. Owing to its small dispatching overhead, fine-grain parallelism allows the parallelization of small units of computation, usually chunks of a DO loop. Larger units of computation, such as arithmetic-intensive subroutines, may be processed independently using coarse-grain parallelism.
This paper gives an introduction to the concepts of CRAY macrotasking and microtasking, and to the IBM Multitasking Facility (MTF), the ECSEC microtasking prototype, and Parallel FORTRAN. Basic parallelization using fine-grain as well as coarse-grain techniques has been applied to linear algebra kernels, namely matrix multiplication and LU decomposition, and to an application program that simulates a Czochralski bulk flow describing a crystal growing system. Depending on the problem, a parallel speedup of nearly four (on the CRAY X-MP/416) and nearly six (on the IBM 3090-600E) can be achieved for the implementation of the matrix multiplication. All other kernels and the application program were limited by serialization overheads arising from memory conflicts (bank and section conflicts on the CRAY, cache coherence on the IBM) and by the overheads of the multitasking primitives. However, with a careful implementation, a parallel efficiency of more than 0.9 can be obtained on both multiprocessors.
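The speedup and efficiency figures quoted above can be related by the usual definitions. The abstract does not state them, so the following is a sketch under assumptions: the symbols $T_1$, $T_p$, $S_p$, and $E_p$ are introduced here for illustration, and $p$ is taken to be the number of processors (four on the CRAY X-MP/416, six on the IBM 3090-600E).

```latex
% Assumed standard definitions; the notation is not taken from the paper.
% T_1 : execution time on one processor
% T_p : execution time on p processors
\[
  S_p = \frac{T_1}{T_p}, \qquad E_p = \frac{S_p}{p}
\]
% Under these definitions, an efficiency above 0.9 corresponds to
\[
  S_4 > 0.9 \cdot 4 = 3.6, \qquad S_6 > 0.9 \cdot 6 = 5.4
\]
```

Read this way, a speedup of nearly four on four processors and nearly six on six processors both correspond to an efficiency close to 1.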