ACM Home Page
Please provide us with feedback. Feedback
Exploiting fast matrix multiplication within the level 3 BLAS
Full text PdfPdf (1.12 MB)
Source ACM Transactions on Mathematical Software (TOMS) archive
Volume 16 ,  Issue 4  (December 1990) table of contents
Pages: 352 - 368  
Year of Publication: 1990
ISSN:0098-3500
Author
Nicholas J. Higham  Univ. of Manchester, Manchester, UK
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 35,   Downloads (12 Months): 177,   Citation Count: 14
Additional Information:

abstract   references   cited by   index terms   review   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/98267.98290
What is a DOI?

ABSTRACT

The Level 3 BLAS (BLAS3) are a set of specifications of FORTRAN 77 subprograms for carrying out matrix multiplications and the solution of triangular systems with multiple right-hand sides. They are intended to provide efficient and portable building blocks for linear algebra algorithms on high-performance computers. We describe algorithms for the BLAS3 operations that are asymptotically faster than the conventional ones. These algorithms are based on Strassen's method for fast matrix multiplication, which is now recognized to be a practically useful technique once matrix dimensions exceed about 100. We pay particular attention to the numerical stability of these “fast BLAS3.” Error bounds are given and their significance is explained and illustrated with the aid of numerical experiments. Our conclusion is that the fast BLAS3, although not as strongly stable as conventional implementations, are stable enough to merit careful consideration in many applications.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
 
3
BINI, D., AND LOTTI, D. Stability of fast algorithms for matrix multiplication. Numer. Math. 36 (1980), 63-72.
 
4
 
5
 
6
BRENT, R.P. Error analysis of algorithms for matrix multiplication and triangular decomposition using Winograd's identity. Numer. Math. 16 (1970), 145-156.
7
 
8
DAYD~, M0' J', AND DUFF, I.S. Use of level 3 BLAS in LU factorization on the Cray-2, the ETA-10P, and the IBM 3090-200/VF. Tech. Rep. CSS 229, Computer Science and Systems Div., Harwell Lab., 1988.
 
9
DEMMEL, J. W., DONGARRA, J. J., Du CROZ, J. J., GREENBAUM, A., HAMMARLING, S. J., AND SORENSEN, D. C. Prospectus for the development of a linear algebra library for highperformance computers. Tech. Memor. 97. Mathematics and Computer Science Div., Argonne National Lab., Argonne, Ill., 1987.
10
11
 
12
 
13
GALLIVAN, K., JALBY, W., MEIER, U., AND SAMEH, A. I-I. Impact of hierarchical memory systems on linear algebra algorithm design. Int. J. Supercomput. Appl. 2 (1988), 12-48.
 
14
 
15
 
16
 
17
 
18
IBM. Engineering and Scientific Subroutine Library, Guide and Reference, Release 3. 4th ed., Program 5668-863, 1988.
 
19
MILLER, W. Computational complexity and numerical stability. SIAM J. Comput. 4 (1975), 97-107.
 
20
MOLER, C. B., LITTLE, J. N., AND BANGERT, S. Pro-Matlab User's Guide. The MathWorks, Inc., South Natick, Mass., 1987.
 
21
 
22
SCHREIBER, R.S. Block algorithms for parallel machines. In Numerical Algorithms for Modern Parallel Computer Architectures, M. H. Schultz, Ed., IMA Volumes In Mathematics and Its Applications 13, Springer-Verlag, Berlin, 1988, 197-207.
 
23
 
24
STRASSEN, V. Gaussian elimination is not optimal. Numer. Math. I3 (1969), 354-356.

CITED BY  14


REVIEW

"Peter Bruce Worland : Reviewer"

The BLAS3 (Level 3 Basic Linear Algebra Subprograms) is a set of specifications, based on matrix-matrix operations, for efficient and portable routines to be used on high-performance computers. This interesting and clearly written paper explor  more...