ACM Home Page
Please provide us with feedback. Feedback
Stability of block algorithms with fast level-3 BLAS
Full text PdfPdf (1.14 MB)
Source ACM Transactions on Mathematical Software (TOMS) archive
Volume 18 ,  Issue 3  (September 1992) table of contents
Pages: 274 - 291  
Year of Publication: 1992
ISSN:0098-3500
Authors
James W. Demmel  Univ. of California, Berkeley
Nicholas J. Higham  Univ. of Manchester, Manchester, UK
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 10,   Downloads (12 Months): 42,   Citation Count: 9
Additional Information:

abstract   references   cited by   index terms   review   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/131766.131769
What is a DOI?

ABSTRACT

Block algorithms are becoming increasingly popular in matrix computations. Since their basic unit of data is a submatrix rather than a scalar, they have a higher level of granularity than point algorithms, and this makes them well suited to high-performance computers. The numerical stability of the block algorithms in the new linear algebra program library LAPACK is investigated here. It is shown that these algorithms have backward error analyses in which the backward error bounds are commensurate with the error bounds for the underlying level-3 BLAS (BLAS3). One implication is that the block algorithms are as stable as the corresponding point algorithms when conventional BLAS3 are used. A second implication is that the use of BLAS3 based on fast matrix multiplication techniques affects the stability only insofar as it increases the constant terms in the normwise backward error bounds. For linear equation solvers employing LU factorization, it is shown that fixed precision iterative refinement helps to mitigate the effect of the larger error constants. Despite the positive results presented here, not all plausible block algorithms are stable; we illustrate this with the example of LU factorization with block triangular factors and describe how to check a block algorithm for stability without doing a full error analysis.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
ARIOLI, M., DEMMEL, J. W., AND DUFF, h S. Solving sparse linear systems with sparse backward error. SIAM J. Motrix Anal. Appl. lO (1989), 165-190.
 
2
 
3
 
4
BISCHOF, C. H., AND DONGARRA, J.J. A project for developing a linear algebra library for high-performance computers. Preprint MCS-P105-0989, Mathematics and Computer Science Div., Argonne National Lab., 1989.
 
5
 
6
BRENT, R.P. Error analysis of algorithms for matrix multiplication and triangular decomposition using Winograd's identity. Numer. Math. 16 (1970), 145 156.
 
7
DAHLQUIST, G., AND BJORCK, A. Numerlcc~ Methods. Prentice-Hall, Englewood Cliffs, N.J., 1974.
 
8
DEMMEL, J.W. On floating point errors in Cholesky. LAPACK Working Note 14, Dept. of Computer Science, Univ. of Tennessee, Knoxville, 1989.
 
9
DEMMEL, J. ~~., DONGARRA, J J., DU CROZ, J. J., GREENBAUM, A., I-IAMMARLING, ~. J., AND SORENSEN, D.C. Prospectus for the development of a linear algebra library for high-performance computers. Tech. Mem. 97, Mathematics and Computer Science Div., Argonne National Lab., 1987.
 
10
DEMMEL, J W., DU CROZ, J. J., HAMMARLING, S. J., AND SORENSON, D.C. Guidelines for the design of symmetric eigenroutines, SVD, and iterative refinement and condition estimation for linear systems. LAPACK Working Note 4, Tech. Memor. Ill, Mathematics and Computer Science Dlv., Argonne National Lab., 1988.
11
12
 
13
DONGARRA, J. J., HAMMARLING, S. J., AND SORENSEN, D. C. Block reduction of matrices to condensed forms for eigenvalue computations. J. Comput. Appl. Math. 27 (1989), 215 227.
 
14
DONGARRA, J. J., KAUFMAN, L., AND HAMMARLING, S.J. Squeezing the most out of eigenvalue solvers on high-performance computers. Linear Algebra Appl. 77 (1986), 113-136.
 
15
DuCRoz, J.J.,ANDHIGH~M,N.J. Stability of methods for matrix inversion. IMAJ. Numer. Anal. 12, (1992), 1-19.
 
16
GALLIVAN, K., JALBY, W., MEIER, U., AND SAMEH, A. H. Impact of hierarchical memory systems on linear algebra algorithm design. Int. J. Supercomput. Appl. 2 (1988), 12 48.
 
17
 
18
19
 
20
21
 
22
HIGHAM, N. J. How accurate is Gaussian elimination? In Numerical Analysis 1989, In Proceedings of the 13th Dundee Conference, D. F. Griffiths and G. A. Watson, Eds., Pitman Research Notes in Mathematics 228, LongTnan, 1990, pp. 137-154.
 
23
H~GHAM, N.J. Is fast matrix multiplication of practical use? SIAM News. 23 (Nov. 1990), 12 +.
 
24
 
25
HIGHAM, N. J. Stability of a method for multiplying complex matrices with three real matrix multiplications. Numerical Analysis Rep. No. 181, Univ. of Manchester, England, 1990; to appear in SIAM J. Matrix Anal App{.
 
26
M^YES, P., AND RADICATI, G. Banded Cholesky factorization using level 3 BLAS, LAPACK Working Note 12, Mathematics and Computer Science Div., Argonne National Lab. 1989.
 
27
OETTH, W., AND Pr~E~, W. Compatibility of approximate solution of linear equations with given error bounds for coefficients and right-hand sides. Nvmer. Math. 6 (1964), 405-409.
 
28
 
29
SC~REmER, R. S. Block algorithms for parallel machines. In Numerical Algorithms for Modern Parallel Computer Archztectures, M. H. Schultz, Ed., IMA Volumes In Mathematics and Its Applications 13, Springer-Verlag, Beriin~ 1988, pp. 197-207.
 
30
 
31
SKEEL, R. D. Iterative refinement implies numerical stability for Gaussian elimination. Math. Comput. 35 (1980), 817 832.
 
32
ST~SSEN, V. Gaussian elimination is not optimal. Numer. Math. 13 (1969), 354 356.
 
33
WILKINSON~ J.H. Error analysis of eigenvalue techniques based on orthogonal transformations. J. Soc. Indust. Appl. Math. 10 (1962), 162-195.
 
34
 
35
WmK~NSON, J.H. Error analysis revisited. IMA Bull. 22 (1986), 192-200.
 
36
WINOGR~, S. A new algorithm for inner product. IEEE Trans. Comput. C-18 (1968), 693-694.

CITED BY  9


REVIEW

"Chaya Gurwitz : Reviewer"

Many matrix computations are organized as block algorithms in which the steps of the algorithm are defined in terms of operations on submatrices, rather than in terms of operations on the individual matrix elements. If a block algorithm is cod  more...

Collaborative Colleagues:
James W. Demmel: colleagues
Nicholas J. Higham: colleagues