|
ABSTRACT
This article proposes a set of Level 3 Basic Linear Algebra Subprograms and associated kernels for sparse matrices. A major goal is to design and develop a common framework to enable efficient, and portable, implementations of iterative algorithms for sparse matrices on high-performance computers. We have designed the routines to shield the developer of mathematical software from most of the complexities of the various data structures used for sparse matrices. We have kept the interface and suite of codes as simple as possible while at the same time including sufficient functionality to cover most of the requirements of iterative solvers and sufficient flexibility to cover most sparse matrix data structures. An important aspect of our framework is that it can be easily extended to incorporate new kernels if the need arises. We discuss the design, implementation, and use of subprograms for the multiplication of a fully matrix by a sparse one and for the solution of sparse triangular systems with one or more (full) right-hand sides. We include a routine for checking the input data, generating a new sparse data structure from the input, and scaling a sparse matrix. The new data structure for the transformation can be specified by the user or can be chosen automatically by vendors to be efficient on their machines. We also include a routine for permuting the columns of a sparse matrix and one for permuting the rows of a full matrix.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
AEA TECHNOLOGY. 1996. Harwell Subroutine Library: A catalogue of subroutines (release 12). AEA Technology, Didcot, Oxon, United Kingdom.
|
| |
2
|
R. C. Agarwal , F. G. Gustavson , M. Zubair, A high performance algorithm using pre-processing for the sparse matrix-vector multiplication, Proceedings of the 1992 ACM/IEEE conference on Supercomputing, p.32-41, November 16-20, 1992, Minneapolis, Minnesota, United States
|
| |
3
|
AMESTOY, P. R., DAYDI~, M., AND DUFF, I.S. 1989. Use of level 3 BLAS in the solution of full and sparse linear equations. In High Performance Computing (Montpellier, France, March 22-24, 1989), J.-L. Delhaye and E. Gelenbe, Eds. North-Holland Publishing Co., Amsterdam, The Netherlands, 19-31.
|
| |
4
|
E. Anderson , Z. Bai , C. Bischof , J. Demmel , J. Dongarra , J. Du Croz , A. Greenbaum , S. Hammarling , A. McKenney , S. Ostrouchov , D. Sorensen, LAPACK's user's guide, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1992
|
| |
5
|
ASHBY, S. F. AND SEAGER, M.K. 1990. A proposed standard for iterative solvers. Tech. Rep. 102860, Lawrence Livermore National Laboratory, Livermore, CA.
|
| |
6
|
CARNEY, S., HEROUX, M. A., AND LI, G. 1993. A proposal for a sparse BLAS toolkit. Tech. Rep. TR/PA/92/90, Revised. CERFACS, Toulouse, France.
|
| |
7
|
CERIONI, F., COLAJANNI, M., FILIPPONE, S., AND MAIOLATESI, S. 1996. A proposal for parallel sparse BLAS. Tech. Rep. RI.96.05, University of Rome--Tor Vergata, Rome, Italy.
|
 |
8
|
|
 |
9
|
|
 |
10
|
|
| |
11
|
DUFF, I. S. 1981. Full matrix techniques in sparse Gaussian elimination. In Numerical Analysis Proceedings (Dundee, Scotland), G. A. Watson, Ed. Lecture Notes in Mathematics, vol. 912. Springer-Verlag New York, Inc., New York, NY, 71-84.
|
 |
12
|
|
 |
13
|
|
| |
14
|
DUFF, I. S., GRIMES, R. G., AND LEWIS, J. G. 1997. The Rutherford-Boeing Sparse Matrix Collection. Tech. Rep. RAL TR-97-031, Rutherford Appleton Laboratory, Didcot, Oxon, United Kingdom.
|
| |
15
|
ERHEL, J. 1990. Sparse matrix multiplication on vector computers. Int. J. High Speed Comput. 2, 101-116.
|
| |
16
|
IBM. 1990. IBM Engineering and Scientific Subroutine Library: Guide and Reference. Tech. Rep. IBM Corp., Riverton, NJ.
|
 |
17
|
|
| |
18
|
OPPE, T. C. AND KINCAID, D.R. 1990. Are there iterative BLAS?. Tech. Rep. CNA-240, The University of Texas at Austin, Austin, TX.
|
| |
19
|
OPPE, T. C., JOUBERT, W., AND KINCAID, D. R. 1988. NSPCG user's guide: A package for solving large linear systems by various iterative methods. Tech. Rep. CNA-216, The University of Texas at Austin, Austin, TX.
|
| |
20
|
|
| |
21
|
Pozo, R. AND REMINGTON, K. A. 1996. The NIST sparse BLAS library implementation: Design and performance. Tech. Rep. National Institute of Standards and Technology, Gaithersburg, MD.
|
| |
22
|
SAAD, Y. 1994. ILUT: A dual threshold incomplete factorization. Num. Lin. Alg. Appl. 1, 4, 387-402.
|
| |
23
|
THINKING MACHINES. 1992. CMSSL for CM Fortran. Version 3.0. Tech. Rep. Thinking Machines Corporation, Bedford, MA.
|
CITED BY 7
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Sriram Krishnamoorthy , Umit Catalyurek , Jarek Nieplocha , Atanas Rountev , P. Sadayappan, Data management and query---Hypergraph partitioning for automatic memory hierarchy management, Proceedings of the 2006 ACM/IEEE conference on Supercomputing, November 11-17, 2006, Tampa, Florida
|
|
|
|
|