ABSTRACT
We consider the algebraic multilevel iteration (AMLI) for the solution of systems of linear equations as they arise from a finite-difference discretization on a rectangular grid. The key operation is the matrix-vector product, which can be executed efficiently on vector and parallel-vector computer architectures if the nonzero entries of the matrix are concentrated in a few diagonals. To maintain this structure for all matrices on all levels, coarsening in alternating directions is used. In some cases it is necessary to introduce additional dummy grid hyperplanes. The data movements in the restriction and prolongation are crucial, as they produce massive memory conflicts on vector architectures. Using a simple performance model, the best of the possible vectorization strategies is selected automatically at runtime. Examples show that on a Fujitsu VPP300 the presented implementation of AMLI reaches about 85% of the useful performance, and that scalability with respect to computing time can be achieved.
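To illustrate why a matrix with few nonzero diagonals suits vector hardware, the following is a minimal sketch of a matrix-vector product in diagonal storage. It is not the implementation described in the paper. The function names `dia_matvec` and `laplace_5pt`, the storage convention, and the five-point Laplacian test matrix are all assumptions chosen for illustration.

```python
def dia_matvec(offsets, diagonals, x):
    """Return y = A x for a square matrix stored by diagonals.

    Assumed convention: diagonals[k][i] holds A[i, i + offsets[k]].
    Entries whose column index falls outside the matrix are skipped.
    """
    n = len(x)
    y = [0.0] * n
    for off, diag in zip(offsets, diagonals):
        lo = max(0, -off)       # first row with a valid column index
        hi = min(n, n - off)    # one past the last such row
        # Each diagonal contributes one long, stride-1 loop,
        # which is the access pattern a vector unit handles well.
        for i in range(lo, hi):
            y[i] += diag[i] * x[i + off]
    return y


def laplace_5pt(nx, ny):
    """Five-point Laplacian on an nx-by-ny grid, numbered row by row."""
    n = nx * ny
    offsets = [-nx, -1, 0, 1, nx]
    south = [-1.0] * n
    north = [-1.0] * n
    centre = [4.0] * n
    # West/east couplings vanish at the left and right grid boundaries.
    west = [0.0 if i % nx == 0 else -1.0 for i in range(n)]
    east = [0.0 if i % nx == nx - 1 else -1.0 for i in range(n)]
    return offsets, [south, west, centre, east, north]


offsets, diagonals = laplace_5pt(3, 2)
y = dia_matvec(offsets, diagonals, [1.0] * 6)
```

In this sketch the work per diagonal is a single contiguous loop over the vector length, so the cost grows with the number of diagonals rather than with irregular index lookups. This is one plausible reading of why the abstract insists on preserving the diagonal structure on every level.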