ABSTRACT
This paper describes a proposal for Level 3 Basic Linear Algebra Subprograms (Level 3 BLAS). The Level 3 BLAS are targeted at matrix-matrix operations with the aim of providing more efficient, but portable, implementations of algorithms on high-performance computers, especially those with hierarchical memory and parallel processing capability.