|
ABSTRACT
This paper concerns latency-tolerant schemes for the efficient parallel solution of sparse triangular linear systems on distributed memory multiprocessors. Such triangular solution is required when sparse Cholesky factors are used to solve for a sequence of right-hand-side vectors or when incomplete sparse Cholesky factors are used to precondition a Conjugate Gradients iterative solver. In such applications, the use of traditional distributed substitution schemes can create a performance bottleneck when the latency of interprocessor communication is large. We had earlier developed the Selective Inversion (SI) scheme to reduce communication latency costs by replacing distributed substitution by parallel matrix vector multiplication. We now present a new two-way mapping of the triangular sparse matrix to processors to improve the performance of SI by halving its communication latency costs. We provide analytic results for model sparse matrices and we report on the performance of our scheme for parallel preconditioning with incomplete sparse Cholesky factors.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
M. A. AJIZAND A. JENNINGS, A robust incomplete Cholesky-conjugate gradient algorithm, Internat. J. Numer. Methods. Engrg., 20 (1984), pp. 949--966.
|
| |
2
|
|
| |
3
|
|
| |
4
|
M. BOLLHÖFERA robust ILU with pivoting based on monitoring the growth of the inverse factors, Linear Algebra and its Applications, 338(1--3), pp. 201--218, 2001.
|
| |
5
|
E. CHOW, ParaSails User's Guide, Tech. Report UCRL-MA-137863, Lawrence Livermore National Laboratory, Livermore, CA, 2000.
|
| |
6
|
E. CHOW, Parallel implementation and performance characteristics of least squares sparse approximate inverse preconditioners, Tech. Report UCRL-JC-138883, Lawrence Livermore National Laboratory, Livermore, CA, 2000.
|
| |
7
|
P. CONCUS, G. GOLUB, AND D. O'LEARY, A generalized conjugate gradient method for the numerical solution of elliptic partial differential equations, J. R. Bunchand D. J. Rose, eds., Sparse Matrix Computations, pp. 309--332, Academic Press, 1976.
|
 |
8
|
|
 |
9
|
|
| |
10
|
|
| |
11
|
A. GEORGE, M. T. HEATH, J. W-H. LIUAND E. G-Y. NGSymbolic Cholesky factorization on a local-memory multiprocessor, Parallel Computing 5 (1987), pp. 85--95.
|
| |
12
|
|
| |
13
|
|
| |
14
|
A. GUPTAAND V. KUMAR, A scalable parallel algorithm for sparse matrix factorization, Technical Report 94--19, Department of Computer Science, University of Minnesota, Minneapolis, 1994.
|
| |
15
|
I. GUSTAFSSON, An incomplete factorization preconditioning method based on modification of element matrices BIT Numerical Mathematics 36, 1, pp. 86--100, 1996.
|
| |
16
|
|
| |
17
|
|
| |
18
|
|
| |
19
|
M. HEATHAND P. RAGHAVAN, Parallel sparse triangular solution, in IMA Volumes in Mathematics and its Applications, R. Gulliver, M. Heath, R. Schreiber, and P. Bjorstad, eds., vol. 105, Springer-Verlag (1998), pp. 289--306.
|
| |
20
|
|
| |
21
|
|
| |
22
|
|
| |
23
|
|
| |
24
|
|
| |
25
|
T. A. MANTEUFFEL, An incomplete factorization technique for positive definite linear systems, Math. Comput., 34 (1980), pp. 473--497.
|
| |
26
|
J. A. MEIJERINKAND H. A. VANDER VORST, An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix, Math. Comp., 31 (1977), pp. 148--162.
|
| |
27
|
MESSAGE PASSING INTERFACE FORUM, MPI: A Message-Passing Interface Standard, Int. J. of Supercomputer Applications, 8, (1994) pp. 165--414.
|
| |
28
|
E. G. NG, B. W.PEYTONAND P. RAGHAVAN, A Blocked incomplete Cholesky preconditioner for hierarchical-memory computers, Proceedings of the Fourth IMACS International Symposium on Iterative Methods in Scientific Computation. IMACS Series in Computational and Applied Mathematics: Iterative Methods in Scientific Computation IV, D. R. Kincaid and A. C. Elster, eds., 1999, pp. 211--222.
|
| |
29
|
P. RAGHAVAN, DSCPACK: A Domain-Separator Codes For The Parallel Solution Of Sparse Liner Systems, Technical Report CSE-02-004, Department of Computer Science and Engineering, The Pennsylvania State University.
|
| |
30
|
P. RAGHAVAN, Efficient parallel sparse triangular solution using selective inversion, Parallel processing Letters, Vol. 8 No. 1 (1998), pp. 29--40.
|
| |
31
|
P. RAGHAVAN, K. TERANISHIAND E. NG, Towards Scalable Preconditioning Using Incomplete Factorization, Proceedings of 2001 International Conference On Preconditioning Techniques For Large Sparse Matrix Problems In Industrial Applications, pp. 63--65, Tahoe City CA, April 29--May 1, 2001. Longer version submitted to Numerical Linear Algebra.
|
| |
32
|
|
| |
33
|
B. SMITH, L. C. MCINNES, AND W. GROPP, PETSc 2. 0 user's manual, Technical Report ANL 95/11 - Revision 2. 0. 22 (1997), Mathematics and Computer Science Division Argonne National Laboratory, Argonne IL 60439.
|
|