| Recursive array layouts and fast parallel matrix multiplication |
| Source
|
ACM Symposium on Parallel Algorithms and Architectures
Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures
Saint Malo, France
Pages: 222-231
Year of Publication: 1999
ISBN: 1-58113-124-0
|
|
Authors
|
|
Siddhartha Chatterjee
|
Department of Computer Science, The University of North Carolina, Chapel Hill, NC
|
|
Alvin R. Lebeck
|
Department of Computer Science, Duke University, Durham, NC
|
|
Praveen K. Patnala
|
Department of Computer Science, The University of North Carolina, Chapel Hill, NC
|
|
Mithuna Thottethodi
|
Department of Computer Science, Duke University, Durham, NC
|
|
REFERENCES
 |
1
|
|
| |
2
|
T. Bially. Space-filling curves: Their generation and their application to bandwidth reduction. IEEE Transactions on Information Theory, IT-15(6):658-664, Nov. 1969.
|
 |
3
|
Jeff Bilmes, Krste Asanovic, Chee-Whye Chin, Jim Demmel, Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology, Proceedings of the 11th international conference on Supercomputing, p.340-347, July 07-11, 1997, Vienna, Austria
[doi: 10.1145/263580.263662]
|
 |
4
|
Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou, Cilk: an efficient multithreaded runtime system, Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming, p.207-216, July 19-21, 1995, Santa Barbara, California, United States
|
 |
5
|
Steve Carr, Kathryn S. McKinley, Chau-Wen Tseng, Compiler optimizations for improving data locality, Proceedings of the sixth international conference on Architectural support for programming languages and operating systems, p.252-262, October 05-07, 1994, San Jose, California, United States
|
| |
6
|
|
 |
7
|
|
 |
8
|
|
| |
9
|
|
 |
10
|
|
| |
11
|
|
 |
12
|
|
| |
13
|
M. Frigo and S. G. Johnson. FFTW: An adaptive software architecture for the FFT. In Proceedings of ICASSP'98, volume 3, page 1381, Seattle, WA, 1998. IEEE.
|
| |
14
|
M. F. Goodchild and A. W. Grandfield. Optimizing raster storage: an examination of four alternatives. In Proceedings of Auto-Carto 6, volume 1, pages 400-407, Ottawa, Oct. 1983.
|
| |
15
|
|
| |
16
|
|
| |
17
|
|
| |
18
|
D. Hilbert. Über stetige Abbildung einer Linie auf ein Flächenstück. Mathematische Annalen, 38:459-460, 1891.
|
| |
19
|
|
 |
20
|
Y. Charlie Hu, S. Lennart Johnsson, Shang-Hua Teng, High performance Fortran for highly irregular problems, Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming, p.13-24, June 18-21, 1997, Las Vegas, Nevada, United States
|
| |
21
|
S. F. Hummel, I. Banicescu, C.-T. Wang, and J. Wein. Load balancing and data locality via fractiling: An experimental study. In Languages, Compilers and Run-Time Systems for Scalable Computers. Kluwer Academic Publishers, 1995.
|
 |
22
|
|
 |
23
|
|
| |
24
|
|
| |
25
|
Charles H. Koelbel, David B. Loveman, Robert S. Schreiber, Guy L. Steele, Jr., Mary E. Zosel, The high performance Fortran handbook, MIT Press, Cambridge, MA, 1994
|
 |
26
|
Monica D. Lam, Edward E. Rothberg, Michael E. Wolf, The cache performance and optimizations of blocked algorithms, Proceedings of the fourth international conference on Architectural support for programming languages and operating systems, p.63-74, April 08-11, 1991, Santa Clara, California, United States
|
| |
27
|
R. Laurini. Graphical data bases built on Peano space-filling curves. In C. E. Vandoni, editor, Proceedings of the EUROGRAPHICS'85 Conference, pages 327-338, Amsterdam, 1985. North-Holland.
|
| |
28
|
C. E. Leiserson. Personal communication, Aug. 1998.
|
| |
29
|
|
| |
30
|
|
| |
31
|
|
| |
32
|
G. Peano. Sur une courbe qui remplit toute une aire plane. Mathematische Annalen, 36:157-160, 1890.
|
| |
33
|
|
| |
34
|
H. Sagan. Space-Filling Curves. Springer-Verlag, 1994. ISBN 0-387-94265-3.
|
 |
35
|
J. P. Singh, T. Joe, J. L. Hennessy, A. Gupta, An empirical comparison of the Kendall Square Research KSR-1 and Stanford DASH multiprocessors, Proceedings of the 1993 ACM/IEEE conference on Supercomputing, p.214-225, December 1993, Portland, Oregon, United States
[doi: 10.1145/169627.169699]
|
| |
36
|
V. Strassen. Gaussian elimination is not optimal. Numer. Math., 13:354-356, 1969.
|
| |
37
|
|
 |
38
|
|
 |
40
|
|
 |
41
|
|
CITED BY 28
|
José E. Moreira, Samuel P. Midkiff, Manish Gupta, A comparison of three approaches to language, compiler, and library support for multidimensional arrays in Java, Proceedings of the 2001 joint ACM-ISCOPE conference on Java Grande, p.116-125, June 2001, Palo Alto, California, United States
|
Siddhartha Chatterjee, Vibhor V. Jain, Alvin R. Lebeck, Shyam Mundhra, Mithuna Thottethodi, Nonlinear array layouts for hierarchical memory systems, Proceedings of the 13th international conference on Supercomputing, p.444-453, June 20-25, 1999, Rhodes, Greece
|
G. Almasi, F. G. Gustavson, J. E. Moreira, Design and evaluation of a linear algebra package for Java, Proceedings of the ACM 2000 conference on Java Grande, p.150-159, June 03-04, 2000, San Francisco, California, United States
|
Venkata K. Pingali, Sally A. McKee, Wilson C. Hsieh, John B. Carter, Computation regrouping: restructuring programs for temporal data cache locality, Proceedings of the 16th international conference on Supercomputing, June 22-26, 2002, New York, New York, USA
|
Kamen Yotov, Tom Roeder, Keshav Pingali, John Gunnels, Fred Gustavson, An experimental comparison of cache-oblivious and cache-conscious programs, Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures, June 09-11, 2007, San Diego, California, USA
|