ABSTRACT
The advent of multicore CPUs and manycore GPUs means that
mainstream processor chips are now parallel systems. Furthermore,
their parallelism continues to scale with Moore's law. The
challenge is to develop mainstream application software that
transparently scales its parallelism to leverage the increasing
number of processor cores, much as 3D graphics applications
transparently scale their parallelism to manycore GPUs with widely
varying numbers of cores.