ABSTRACT
This paper argues for an implicitly parallel programming model for many-core microprocessors and provides initial technical approaches towards this goal. In an implicitly parallel programming model, programmers maximize algorithm-level parallelism, express their parallel algorithms by asserting high-level properties on top of a traditional sequential programming language, and rely on parallelizing compilers and hardware support to perform parallel execution under the hood. In such a model, compilers and related tools require much more advanced program analysis capabilities and programmer assertions than are currently available, so that a comprehensive understanding of the input program's concurrency can be derived. This understanding is then used to drive automatic or interactive parallel code generation tools for a diverse set of parallel hardware organizations. The chip-level architecture and hardware should maintain parallel execution state in such a way that a strictly sequential execution state can always be derived for the purpose of verifying and debugging the program. We argue that implicitly parallel programming models are critical for addressing the software development crises and software scalability challenges for many-core microprocessors.