Accelerating linpack with CUDA on heterogenous clusters
Source: ACM International Conference Proceeding Series; Vol. 383
Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units
Washington, D.C.
Pages: 46-51
Year of Publication: 2009
ISBN: 978-1-60558-517-8
Author: Massimiliano Fatica, NVIDIA Corporation, Santa Clara, CA
Publisher: ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1513895.1513901

ABSTRACT

This paper describes the use of CUDA to accelerate the Linpack benchmark on heterogenous clusters, where both CPUs and GPUs are used in synergy with minor or no modifications to the original source code. A host library intercepts the calls to DGEMM and DTRSM and executes them simultaneously on both GPUs and CPU cores. An 8U cluster is able to sustain more than a Teraflop using a CUDA accelerated version of HPL.
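The idea of running one DGEMM call on GPU and CPU at the same time can be sketched as follows. This is a minimal illustration and not the paper's host library: the function names (`dgemm_ref`, `split_columns`, `hybrid_dgemm`), the column-wise split, and the `gpu_fraction` value are all assumptions, and a worker thread running plain Python stands in for the GPU. The sketch splits the columns of B and C into two panels, computes C = alpha*A*B + beta*C on each panel concurrently, and joins the results.

```python
from concurrent.futures import ThreadPoolExecutor


def dgemm_ref(alpha, A, B, beta, C):
    """Reference C = alpha*A*B + beta*C on row-major nested lists.

    Returns a new matrix; the inputs are not modified.
    """
    m, k, n = len(A), len(B), len(B[0])
    return [
        [alpha * sum(A[i][p] * B[p][j] for p in range(k)) + beta * C[i][j]
         for j in range(n)]
        for i in range(m)
    ]


def split_columns(M, n_left):
    """Split a matrix into its first n_left columns and the remaining columns."""
    left = [row[:n_left] for row in M]
    right = [row[n_left:] for row in M]
    return left, right


def hybrid_dgemm(alpha, A, B, beta, C, gpu_fraction=0.75):
    """Compute C = alpha*A*B + beta*C with the columns divided between two workers.

    gpu_fraction is a hypothetical tuning parameter: the share of columns
    given to the (simulated) GPU. A real library would choose it from the
    relative DGEMM throughput of the GPU and the CPU cores.
    """
    n = len(B[0])
    n_gpu = max(0, min(n, round(n * gpu_fraction)))

    B_gpu, B_cpu = split_columns(B, n_gpu)
    C_gpu, C_cpu = split_columns(C, n_gpu)

    with ThreadPoolExecutor(max_workers=1) as pool:
        # Stand-in for the asynchronous GPU call (assumption: a real
        # implementation would copy the panels to the device and launch
        # a GPU DGEMM here).
        gpu_future = pool.submit(dgemm_ref, alpha, A, B_gpu, beta, C_gpu)
        # Meanwhile the calling thread plays the role of the CPU cores.
        cpu_part = dgemm_ref(alpha, A, B_cpu, beta, C_cpu)
        gpu_part = gpu_future.result()

    # Join the two column panels back into one result matrix.
    return [g_row + c_row for g_row, c_row in zip(gpu_part, cpu_part)]


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6, 7, 8], [9, 10, 11, 12]]
    C = [[0, 0, 0, 0], [0, 0, 0, 0]]
    print(hybrid_dgemm(1, A, B, 0, C))
```

Because the two column panels of C do not overlap, the two workers need no synchronization beyond waiting for both to finish. The split ratio then determines how well the work is balanced between them.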

