ABSTRACT
This paper proposes -Harray-, a parallel processing system for scientific computations. Data flow computers are expected to achieve high performance because they can fully extract the parallelism in a program. However, they have many problems, such as the difficulty of controlling the sequence of execution. The -Harray- system is an array processor that adopts a two-level control mechanism, with data flow execution within each processor and control flow between processors, in order to take full advantage of both mechanisms. A task assigned to a processor is called a “macro-block”. Three types of macro-blocking and three types of activation schemes, which initiate the execution of a macro-block, are introduced to attain high performance. Moreover, a hardware synchronization mechanism is used to reduce synchronization overhead and to achieve linear speedup in the -Harray- system.
This paper presents the architecture of the -Harray- system and evaluates its performance by software simulation.
CITED BY
H. Yamana, T. Hagiwara, J. Kohdate, Y. Muraoka, A preceding activation scheme with graph unfolding for the parallel processing system-array, Proceedings of the 1989 ACM/IEEE conference on Supercomputing, p.675-684, November 12-17, 1989, Reno, Nevada, United States