ACM Home Page
Please provide us with feedback. Feedback
Data traffic reduction schemes for Cholesky factorization on asynchronous multiprocessor systems
Full text PdfPdf (1.27 MB)
Source International Conference on Supercomputing archive
Proceedings of the 3rd international conference on Supercomputing table of contents
Crete, Greece
Pages: 283 - 294  
Year of Publication: 1989
ISBN:0-89791-309-4
Authors
Vijay K. Naik  IBM Research, T.J. Watson Research Center, Yorktown Heights, NY
Merrell L. Patrick  ICASE, NASA Langley Research Center, Hampton, VA and Computer Science Department, Duke University, Durham, NC
Sponsors
Computer Tech Inst. : Computer Technology Institute
SIGARCH: ACM Special Interest Group on Computer Architecture
SIAM : Society for Industrial and Applied Mathematics
AICA : Assoc Italianai de Calcolo Automatico
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 18,   Downloads (12 Months): 23,   Citation Count: 5
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/318789.318820
What is a DOI?

ABSTRACT

For multiprocessor systems with two level memory hierarchy; the communication requirements of parallel Cholesky factorization of dense and sparse symmetric, positive definite matrices are analyzed. The data traffic associated with computing the Chloesky factor of an nn ×n dense matrix using n&agr; processors, &agr; ≤ 2, is shown to be &OHgr;(n2+&agr;/2), assuming that the computational load is uniformly distributed. For an nn ×n sparse matrix, representing a √n × √n regular grid graph, the corresponding data traffic is shown to be &OHgr;(n1+&agr;/2), &agr; ≤ 1. Partitioning schemes that are variations of block assignment scheme are described. The data traffic generated by these schemes are asymptotically optimal and these schemes allow efficient use of up to &Ogr;(n2) and &Ogr;(n) processors in the dense and the sparse case, respectively. The block based partitioning schemes are shown to provide a better utilization of the data accessed from the shared memory and reduce the total data traffic as compared to the schemes based on the column-wise wrap around assignment.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
CHARRIER, P., AND ROMAN, J. Study of the Parollclism Induced by a Nested Dissection ltiethod and of its lmplemertlation on a Message Passing Alultiprocessor Computer. Tech. Rep. 1-8722, Universite de Bordeaux L, Talence, France, 1987.
 
2
GEORGE, J. A., HEATH, M. T., AND LIU, J. W. 11. Parallel cholesky h ctorizal, ion on a shared-memory multiproccssor. Linear Alqebra and Its Applications 77 (1986), 165 187.
 
3
GEORGE, J. A., HEATH, M. T., LIU, J. W. H., ANd NG, E. Solution of Sparse Positive Definite Systems on a Shared-Memory Multiprocessor. Tech. Rep. ORNL/TM-10260, ()ak Ridge National Laboratory, Oak R.idge, Tenn., 1987.
 
4
GEORGE, J. A., HEATH, M. T., LIU, J.W. H., AND NG, E. Sparse Cholesky Factorization on a Local Memory of Multiprocessor. Tech. Rep. ORNL/TM-9962, Oak Ridge National Laboratory, Oak Ridge, Tenn., 1986.
 
5
GEORGE, J. A., HEATH,, M. T., No,, E., and LlU, J. W. H. Symbolic cholcsky factorization on a local-memory multiprocessor. Parallel Computing 5 (1987), 85-95.
 
6
 
7
GEORGE, J. A., AND LIU, J. W. H. AND No;, E. Communication reduction in parallel sparse cholesky factorizalion on a hypercube. In Hypercube Multiproccssors 1987, M. T. Heath, Ed., SIAM Publication, 1987, pp. 576 586.
 
8
 
9
 
10
GOLUB, G. H., AND LOAN, C. F. V. Matriz Corn.. putations. The Johns Hopklns Unlvcrslty Press, Baltimore, MD, 1983.
 
11
HEATH, M. T. Parallel Cholesky Factorization in Message-Passing Multiprocessor Environments. Tech. Rep. OR.NL-6150, Oak Ridge Nat. tonal Laboratory, Oak Ridge, Tenn., 1985.
 
12
LIPTON, R.J. ROSE, D.J., AND TARJAN, R. E. Gcncra lizcd nested dissection. SIAM ,Journal of Numerical Analyis 16 (1979), 346 ..358.
 
13
LIPTON, R.J. AND TARJAN, R. E.. A scparat or thereom for planar graphs. SIAM .Journal of Applied Mathematics 36 (1979), 177-189.
 
14
 
15
 
16
NAIK, V. K., and Patrick M. L. Data Traffic Reduction Schemes for Sparse Cholesky Factorizalions, Report. 88-14, ICASE, NASA Langley Rescarch Center, Hampton, VA, 1988.
 
17
SAAD, Y. Communication complexity of the gaussian climinination algorithm on multiiprocessors. Linear Alqebra and Its Applications 77 (1986), 315 340.
 
18
WORLEY, P. H., AND SCHREIBER, R,. Nested dissection on a mesh-connected processor array. In Proceedings of the ARO Workshop on New Com. puting Environments: Parallel, Vector, and Systolic (1986), A. Wouk, Ed.
 
19


Collaborative Colleagues:
Vijay K. Naik: colleagues
Merrell L. Patrick: colleagues