|
ABSTRACT
For multiprocessor systems with two level memory hierarchy; the communication requirements of parallel Cholesky factorization of dense and sparse symmetric, positive definite matrices are analyzed. The data traffic associated with computing the Chloesky factor of an nn ×n dense matrix using n&agr; processors, &agr; ≤ 2, is shown to be &OHgr;(n2+&agr;/2), assuming that the computational load is uniformly distributed. For an nn ×n sparse matrix, representing a √n × √n regular grid graph, the corresponding data traffic is shown to be &OHgr;(n1+&agr;/2), &agr; ≤ 1.
Partitioning schemes that are variations of block assignment scheme are described. The data traffic generated by these schemes are asymptotically optimal and these schemes allow efficient use of up to &Ogr;(n2) and &Ogr;(n) processors in the dense and the sparse case, respectively. The block based partitioning schemes are shown to provide a better utilization of the data accessed from the shared memory and reduce the total data traffic as compared to the schemes based on the column-wise wrap around assignment.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
CHARRIER, P., AND ROMAN, J. Study of the Parollclism Induced by a Nested Dissection ltiethod and of its lmplemertlation on a Message Passing Alultiprocessor Computer. Tech. Rep. 1-8722, Universite de Bordeaux L, Talence, France, 1987.
|
| |
2
|
GEORGE, J. A., HEATH, M. T., AND LIU, J. W. 11. Parallel cholesky h ctorizal, ion on a shared-memory multiproccssor. Linear Alqebra and Its Applications 77 (1986), 165 187.
|
| |
3
|
GEORGE, J. A., HEATH, M. T., LIU, J. W. H., ANd NG, E. Solution of Sparse Positive Definite Systems on a Shared-Memory Multiprocessor. Tech. Rep. ORNL/TM-10260, ()ak Ridge National Laboratory, Oak R.idge, Tenn., 1987.
|
| |
4
|
GEORGE, J. A., HEATH, M. T., LIU, J.W. H., AND NG, E. Sparse Cholesky Factorization on a Local Memory of Multiprocessor. Tech. Rep. ORNL/TM-9962, Oak Ridge National Laboratory, Oak Ridge, Tenn., 1986.
|
| |
5
|
GEORGE, J. A., HEATH,, M. T., No,, E., and LlU, J. W. H. Symbolic cholcsky factorization on a local-memory multiprocessor. Parallel Computing 5 (1987), 85-95.
|
| |
6
|
|
| |
7
|
GEORGE, J. A., AND LIU, J. W. H. AND No;, E. Communication reduction in parallel sparse cholesky factorizalion on a hypercube. In Hypercube Multiproccssors 1987, M. T. Heath, Ed., SIAM Publication, 1987, pp. 576 586.
|
| |
8
|
|
| |
9
|
|
| |
10
|
GOLUB, G. H., AND LOAN, C. F. V. Matriz Corn.. putations. The Johns Hopklns Unlvcrslty Press, Baltimore, MD, 1983.
|
| |
11
|
HEATH, M. T. Parallel Cholesky Factorization in Message-Passing Multiprocessor Environments. Tech. Rep. OR.NL-6150, Oak Ridge Nat. tonal Laboratory, Oak Ridge, Tenn., 1985.
|
| |
12
|
LIPTON, R.J. ROSE, D.J., AND TARJAN, R. E. Gcncra lizcd nested dissection. SIAM ,Journal of Numerical Analyis 16 (1979), 346 ..358.
|
| |
13
|
LIPTON, R.J. AND TARJAN, R. E.. A scparat or thereom for planar graphs. SIAM .Journal of Applied Mathematics 36 (1979), 177-189.
|
| |
14
|
|
| |
15
|
|
| |
16
|
NAIK, V. K., and Patrick M. L. Data Traffic Reduction Schemes for Sparse Cholesky Factorizalions, Report. 88-14, ICASE, NASA Langley Rescarch Center, Hampton, VA, 1988.
|
| |
17
|
SAAD, Y. Communication complexity of the gaussian climinination algorithm on multiiprocessors. Linear Alqebra and Its Applications 77 (1986), 315 340.
|
| |
18
|
WORLEY, P. H., AND SCHREIBER, R,. Nested dissection on a mesh-connected processor array. In Proceedings of the ARO Workshop on New Com. puting Environments: Parallel, Vector, and Systolic (1986), A. Wouk, Ed.
|
| |
19
|
|
|