Roofline: an insightful visual performance model for multicore architectures
Source
Communications of the ACM
Volume 52, Issue 4 (April 2009)
A Direct Path to Dependable Software
SECTION: Contributed articles
Pages 65-76
Year of Publication: 2009
ISSN: 0001-0782
Authors
Samuel Williams, Lawrence Berkeley National Laboratory, Berkeley, CA
Andrew Waterman, University of California, Berkeley
David Patterson, University of California, Berkeley
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1498765.1498785

APPENDICES and SUPPLEMENTS
Appendix A (PDF, 1.34 MB): appendix associated with the Roofline article


ABSTRACT

The Roofline model offers insight on how to improve the performance of software and hardware.

