Vector and parallel algorithms for Cholesky factorization on IBM 3090
Source
Conference on High Performance Networking and Computing
Proceedings of the 1989 ACM/IEEE conference on Supercomputing
Reno, Nevada, United States
Pages: 225 - 233
Year of Publication: 1989
ISBN: 0-89791-341-8
Authors
R. C. Agarwal, IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, New York
F. G. Gustavson, IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, New York
ABSTRACT
In many engineering applications, a solution of Fx = b is required, where F is a symmetric positive definite matrix. This is usually done by Cholesky factorization, F = RR^T, where R is the lower triangular Cholesky factor. This is a compute-intensive problem. To achieve the best possible performance on the IBM 3090 Vector Facility (VF), however, the problem requires blocking at several levels to match the 3090 memory hierarchy. A large problem that does not fit in a particular level of memory is blocked so that each block fits in that level, which minimizes data transfers between levels of memory. In this paper, various blocking schemes are described for vector and parallel implementation on the 3090 VF. Some of these algorithms have been included in the Engineering and Scientific Subroutine Library (ESSL). Performance numbers are also included. These algorithms achieve close to the peak performance of the 3090 uniprocessor and multiprocessors.
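The blocking idea in the abstract can be illustrated with a minimal sketch of a generic right-looking blocked Cholesky factorization. This is a textbook formulation, not the specific schemes of the paper or the ESSL routines; the function name `blocked_cholesky` and the block-size parameter `nb` are hypothetical, and the sketch does not model the 3090 memory hierarchy or its vector hardware. It only shows how the work is organized so that each step touches one block-sized piece of the matrix at a time.

```python
import math


def blocked_cholesky(f, nb=2):
    """Return lower-triangular R with f = R R^T, working nb columns at a time.

    Generic right-looking blocked scheme (an illustration, not the paper's code).
    `nb` is a hypothetical block size; in practice it would be tuned so that a
    block fits in the target level of memory.
    """
    n = len(f)
    a = [list(row) for row in f]  # work on a copy; only the lower triangle is read
    for k0 in range(0, n, nb):
        k1 = min(k0 + nb, n)
        # Step 1: factor the diagonal block a[k0:k1, k0:k1] with the unblocked method.
        for j in range(k0, k1):
            d = a[j][j] - sum(a[j][k] ** 2 for k in range(k0, j))
            if d <= 0.0:
                raise ValueError("matrix is not positive definite")
            a[j][j] = math.sqrt(d)
            for i in range(j + 1, k1):
                s = sum(a[i][k] * a[j][k] for k in range(k0, j))
                a[i][j] = (a[i][j] - s) / a[j][j]
        # Step 2: panel solve. Rows below the block are solved against the
        # transposed diagonal factor, giving the block column of R.
        for i in range(k1, n):
            for j in range(k0, k1):
                s = sum(a[i][k] * a[j][k] for k in range(k0, j))
                a[i][j] = (a[i][j] - s) / a[j][j]
        # Step 3: trailing update. Subtract the panel's contribution from the
        # lower triangle of the not-yet-factored submatrix.
        for i in range(k1, n):
            for j in range(k1, i + 1):
                a[i][j] -= sum(a[i][k] * a[j][k] for k in range(k0, k1))
    # Zero the strict upper triangle so the result is exactly R.
    return [[a[i][j] if j <= i else 0.0 for j in range(n)] for i in range(n)]


F = [[4, 12, -16], [12, 37, -43], [-16, -43, 98]]
print(blocked_cholesky(F, nb=2))
# prints [[2.0, 0.0, 0.0], [6.0, 1.0, 0.0], [-8.0, 5.0, 3.0]]
```

Most of the arithmetic falls in the trailing update, which is a matrix-matrix operation on blocks. That is the property blocked schemes rely on to reuse data while it is resident in a fast level of memory.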
CITED BY 6
Ernie Chan, Enrique S. Quintana-Orti, Gregorio Quintana-Orti, Robert van de Geijn, Supermatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures, Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures, June 09-11, 2007, San Diego, California, USA
Ernie Chan, Field G. Van Zee, Paolo Bientinesi, Enrique S. Quintana-Orti, Gregorio Quintana-Orti, Robert van de Geijn, SuperMatrix: a multithreaded runtime scheduling system for algorithms-by-blocks, Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming, February 20-23, 2008, Salt Lake City, UT, USA