Algorithm 784: GEMM-based level 3 BLAS: portability and optimization issues
ACM Transactions on Mathematical Software (TOMS)
Volume 24, Issue 3 (September 1998)
Pages: 303-316
Year of Publication: 1998
ISSN: 0098-3500
ABSTRACT
This companion article discusses portability and optimization issues of the GEMM-based level 3 BLAS model implementations and the performance evaluation benchmark. All software comes in all four data types (single- and double-precision, real and complex) and is designed to be easy to implement and use on different platforms. Each of the GEMM-based routines has a few machine-dependent parameters that specify internal block sizes, cache characteristics, and branch points for alternative code sections. These parameters provide a means of adjusting the routines to the characteristics of a given memory hierarchy.
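To illustrate what a machine-dependent block-size parameter does, here is a minimal sketch of a blocked matrix multiply. It is not the article's code: the function name `blocked_matmul` and the parameter `block` are hypothetical, and Python is used only for readability. The point is that the result is independent of `block`, while the tile size controls how much of each operand is touched at a time, which is what a tuned value would match to the cache.

```python
def blocked_matmul(a, b, block=2):
    """Return the product a*b for matrices stored as lists of rows.

    `block` is a hypothetical tuning parameter (tile edge length); it
    changes the order of the arithmetic, not the mathematical result.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    # Walk over tiles of C (i0, j0) and of the inner dimension (p0).
    for i0 in range(0, m, block):
        for p0 in range(0, k, block):
            for j0 in range(0, n, block):
                # Multiply one tile of A by one tile of B and
                # accumulate into the corresponding tile of C.
                for i in range(i0, min(i0 + block, m)):
                    for p in range(p0, min(p0 + block, k)):
                        aip = a[i][p]
                        for j in range(j0, min(j0 + block, n)):
                            c[i][j] += aip * b[p][j]
    return c
```

In a real implementation the tile size would be chosen per platform, for example so that the tiles in use fit in a given cache level; the value used here is arbitrary.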
REVIEW
Reviewer: Timothy R. Hopkins
The basic linear algebra subroutines (BLAS) consist of three libraries (known as Levels 1, 2, and 3) and form an integral part of much of the important numerical software developed over the last two decades. Efficient implementatio...