ACM Home Page
Please provide us with feedback. Feedback
A new data-mapping scheme for latency-tolerant distributed sparse triangular solution
Full text PdfPdf (287 KB)
Source Conference on High Performance Networking and Computing archive
Proceedings of the 2002 ACM/IEEE conference on Supercomputing table of contents
Baltimore, Maryland
Pages: 1 - 11  
Year of Publication: 2002
Authors
Keita Teranishi  The Pennsylvania State University, University Park, PA
Padma Raghavan  The Pennsylvania State University, University Park, PA
Esmond Ng  Lawrence Berkeley National Laboratory and NERSC, MS 50F-1602, Berkeley, CA
Sponsors
IEEE-CS\DATC : IEEE Computer Society
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
IEEE Computer Society Press  Los Alamitos, CA, USA
Bibliometrics
Downloads (6 Weeks): 2,   Downloads (12 Months): 16,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Review this Article  

ABSTRACT

This paper concerns latency-tolerant schemes for the efficient parallel solution of sparse triangular linear systems on distributed memory multiprocessors. Such triangular solution is required when sparse Cholesky factors are used to solve for a sequence of right-hand-side vectors or when incomplete sparse Cholesky factors are used to precondition a Conjugate Gradients iterative solver. In such applications, the use of traditional distributed substitution schemes can create a performance bottleneck when the latency of interprocessor communication is large. We had earlier developed the Selective Inversion (SI) scheme to reduce communication latency costs by replacing distributed substitution by parallel matrix vector multiplication. We now present a new two-way mapping of the triangular sparse matrix to processors to improve the performance of SI by halving its communication latency costs. We provide analytic results for model sparse matrices and we report on the performance of our scheme for parallel preconditioning with incomplete sparse Cholesky factors.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
M. A. AJIZAND A. JENNINGS, A robust incomplete Cholesky-conjugate gradient algorithm, Internat. J. Numer. Methods. Engrg., 20 (1984), pp. 949--966.
 
2
 
3
 
4
M. BOLLHÖFERA robust ILU with pivoting based on monitoring the growth of the inverse factors, Linear Algebra and its Applications, 338(1--3), pp. 201--218, 2001.
 
5
E. CHOW, ParaSails User's Guide, Tech. Report UCRL-MA-137863, Lawrence Livermore National Laboratory, Livermore, CA, 2000.
 
6
E. CHOW, Parallel implementation and performance characteristics of least squares sparse approximate inverse preconditioners, Tech. Report UCRL-JC-138883, Lawrence Livermore National Laboratory, Livermore, CA, 2000.
 
7
P. CONCUS, G. GOLUB, AND D. O'LEARY, A generalized conjugate gradient method for the numerical solution of elliptic partial differential equations, J. R. Bunchand D. J. Rose, eds., Sparse Matrix Computations, pp. 309--332, Academic Press, 1976.
8
9
 
10
 
11
A. GEORGE, M. T. HEATH, J. W-H. LIUAND E. G-Y. NGSymbolic Cholesky factorization on a local-memory multiprocessor, Parallel Computing 5 (1987), pp. 85--95.
 
12
 
13
 
14
A. GUPTAAND V. KUMAR, A scalable parallel algorithm for sparse matrix factorization, Technical Report 94--19, Department of Computer Science, University of Minnesota, Minneapolis, 1994.
 
15
I. GUSTAFSSON, An incomplete factorization preconditioning method based on modification of element matrices BIT Numerical Mathematics 36, 1, pp. 86--100, 1996.
 
16
 
17
 
18
 
19
M. HEATHAND P. RAGHAVAN, Parallel sparse triangular solution, in IMA Volumes in Mathematics and its Applications, R. Gulliver, M. Heath, R. Schreiber, and P. Bjorstad, eds., vol. 105, Springer-Verlag (1998), pp. 289--306.
 
20
 
21
 
22
 
23
 
24
 
25
T. A. MANTEUFFEL, An incomplete factorization technique for positive definite linear systems, Math. Comput., 34 (1980), pp. 473--497.
 
26
J. A. MEIJERINKAND H. A. VANDER VORST, An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix, Math. Comp., 31 (1977), pp. 148--162.
 
27
MESSAGE PASSING INTERFACE FORUM, MPI: A Message-Passing Interface Standard, Int. J. of Supercomputer Applications, 8, (1994) pp. 165--414.
 
28
E. G. NG, B. W.PEYTONAND P. RAGHAVAN, A Blocked incomplete Cholesky preconditioner for hierarchical-memory computers, Proceedings of the Fourth IMACS International Symposium on Iterative Methods in Scientific Computation. IMACS Series in Computational and Applied Mathematics: Iterative Methods in Scientific Computation IV, D. R. Kincaid and A. C. Elster, eds., 1999, pp. 211--222.
 
29
P. RAGHAVAN, DSCPACK: A Domain-Separator Codes For The Parallel Solution Of Sparse Liner Systems, Technical Report CSE-02-004, Department of Computer Science and Engineering, The Pennsylvania State University.
 
30
P. RAGHAVAN, Efficient parallel sparse triangular solution using selective inversion, Parallel processing Letters, Vol. 8 No. 1 (1998), pp. 29--40.
 
31
P. RAGHAVAN, K. TERANISHIAND E. NG, Towards Scalable Preconditioning Using Incomplete Factorization, Proceedings of 2001 International Conference On Preconditioning Techniques For Large Sparse Matrix Problems In Industrial Applications, pp. 63--65, Tahoe City CA, April 29--May 1, 2001. Longer version submitted to Numerical Linear Algebra.
 
32
 
33
B. SMITH, L. C. MCINNES, AND W. GROPP, PETSc 2. 0 user's manual, Technical Report ANL 95/11 - Revision 2. 0. 22 (1997), Mathematics and Computer Science Division Argonne National Laboratory, Argonne IL 60439.

Collaborative Colleagues:
Keita Teranishi: colleagues
Padma Raghavan: colleagues
Esmond Ng: colleagues