ACM Home Page
Please provide us with feedback. Feedback
Representation-transparent matrix algorithms with scalable performance
Full text PdfPdf (471 KB)
Source
International Conference on Supercomputing archive
Proceedings of the 21st annual international conference on Supercomputing table of contents
Seattle, Washington
SESSION: Algorithms and applications II table of contents
Pages: 116 - 125  
Year of Publication: 2007
ISBN:978-1-59593-768-1
Authors
Peter Gottschling  Indiana University Bloomington, IN
David S. Wise  Bloomington, IN
Michael D. Adams  Indiana University, Bloomington, IN
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 12,   Downloads (12 Months): 69,   Citation Count: 1
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1274971.1274989
What is a DOI?

ABSTRACT

Positive results from new object-oriented tools for scientific programming are reported. Using template classes, abstractions of matrix representations are available that subsume conventional row-major, column-major, either Z- or И-Morton-order, as well as block-wise combinations of these. Moreover, the design of the Matrix Template Library (MTL) has been independently extended to provide recursators, to support block-recursive algorithms, supplementing MTL's iterators. Data types modeling both concepts enable the programmer to implement both iterative and recursive algorithms (or even both) on all of the aforementioned matrix representations at once for a wide family of important scientific operations.

We illustrate the unrestricted applicability of our matrix-recursator on matrix multiplication. The same generic block-recursive function, unaltered, is instantiated on different triplets of matrix types. Within a base block, either a library multiplication or a user-provided, platform-specific code provides excellent performance. We achieve 77% of peak-performance using hand-tuned base cases without explicit prefetching. This excellent performance becomes available over a wide family of matrix representations from a single program. The techniques generalize to other applications in linear algebra.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
2
 
3
Advanced Micro Devices, Inc., Sunnyvale, CA. AMD Core Math Library (ACML), 2006. http://developer.amd.com/acml.jsp
 
4
P. Bientinesi, B. Gunter, and R. A. van de Geijn. Families of algorithms related to the inversion of a symmetric positive definite matrix. ACM Trans. Math. Softw., 2007. Accepted upon revisions. http://www.cs.utexas.edu/users/pauldj/pubs/TOMS_SPD.pdf
5
 
6
 
7
J. J. Dongarra, H. W. Meuer, E. Strohmaier, and H. Simon. Top 500 supercomputer sites. http://www.top500.org, 2006.
 
8
J. J. Dongarra, C. B. Moler, J. R. Bunch, and G. W. Stewart. LINPACK Users' Guide. Society for Industrial and Applied Mathematics, Philadelphia, 1979.
9
 
10
 
11
K. Goto and R. A. van de Geijn. Anatomy of high-performance matrix multiplication. Technical report, Univ. of Texas, Austin. Submittted for publication. Visited Sept. 2006. http://www.cs.utexas.edu/users/flame/pubs/GOTO_TOMS.pdf
 
12
P. Gottschling. Fundamental algebraic concepts in concept-enabled C++. Technical Report 638, Indiana University, Oct. 2006. http://www.cs.indiana.edu/cgi-bin/techreports/TRNNN.cgi?trnum=TR638
13
 
14
A. Lumsdaine, J. Siek, L.-Q. Lee, and P. Gottschling. The Matrix Template Library home page. http://ww.osl.iu.edu/research/mtl, 2006.
 
15
G. M. Morton. A computer oriented geodetic data base and a new technique in file sequencing. Technical report, IBM Ltd., Ottawa, Ontario, Mar. 1966.
 
16
 
17
 
18
R. Raman and D. S. Wise. Converting to and from dilated integers. Submitted for publication, Dec. 2006. http://www.cs.indiana.edu/~dswise/Arcee/castingDilated-comb.pdf
 
19
 
20
 
21
J. Spieß. Untersuchungen des Zeitgewinns durch neue Algorithmen zur Matrix-Multiplication. Computing, 17:23--36, 1976.
 
22
A. Stepanov. The Standard Template Library --- how do you build an algorithm that is both generic and efficient? Byte Magazine, 20(10), Oct. 1995.
 
23
 
24
R. C. Whaley, A. Petitet, and J. J. Dongarra. Automated empirical optimization of software and the ATLAS project. Parallel Computing, 27(1--2):3--35, Jan. 2001. http://dx.doi.org/10.1016/S0167-8191(00)00087-9
25


Collaborative Colleagues:
Peter Gottschling: colleagues
David S. Wise: colleagues
Michael D. Adams: colleagues