A comparison of programming models for multiprocessors with explicitly managed memory hierarchies
Source
Principles and Practice of Parallel Programming
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Raleigh, NC, USA
SESSION: Accelerator software
Pages 131-140
Year of Publication: 2009
ISBN: 978-1-60558-397-6
Authors
Scott Schneider, Virginia Tech, Blacksburg, VA, USA
Jae-Seung Yeom, Virginia Tech, Blacksburg, VA, USA
Benjamin Rose, Virginia Tech, Blacksburg, VA, USA
John C. Linford, Virginia Tech, Blacksburg, VA, USA
Adrian Sandu, Virginia Tech, Blacksburg, VA, USA
Dimitrios S. Nikolopoulos, Virginia Tech, Blacksburg, VA, USA
ABSTRACT
On multiprocessors with explicitly managed memory hierarchies (EMM), software has the responsibility of moving data in and out of fast local memories. This task can be complex and error-prone even for expert programmers. Before we can allow compilers to handle this complexity for us, we must identify the abstractions that are general enough to allow us to write applications with reasonable effort, yet specific enough to exploit the vast on-chip memory bandwidth of EMM multiprocessors. To this end, we compare two programming models against hand-tuned codes on the STI Cell, paying attention to programmability and performance. The first programming model, Sequoia, abstracts the memory hierarchy as private address spaces, each corresponding to a parallel task. The second, Cellgen, is a new framework which provides OpenMP-like semantics and the abstraction of a shared address space divided into private and shared data. We compare three applications programmed using these models against their hand-optimized counterparts in terms of abstractions, programming complexity, and performance.
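To make the idea of software-managed data movement concrete, the following is a minimal sketch, not code from the paper and not Sequoia or Cellgen syntax. It assumes a hypothetical local memory that holds only a few elements (`LOCAL_STORE_ELEMS`) and uses plain list copies as stand-ins for the transfers that a real EMM platform would issue explicitly. The function name and parameters are illustrative assumptions.

```python
# Hypothetical capacity of the fast local memory, in elements (an assumption).
LOCAL_STORE_ELEMS = 4


def scale_with_explicit_staging(main_memory, factor, local_elems=LOCAL_STORE_ELEMS):
    """Scale every element, touching data only through a small local buffer.

    The list copies below stand in for explicit transfers between main
    memory and a software-managed local memory; no real hardware API is used.
    """
    result = [0] * len(main_memory)
    local = [0] * local_elems  # stands in for the fast local memory

    for start in range(0, len(main_memory), local_elems):
        n = min(local_elems, len(main_memory) - start)

        # Transfer in: copy one chunk from main memory into the local buffer.
        local[:n] = main_memory[start:start + n]

        # Compute: operate only on data that is resident in the local buffer.
        for i in range(n):
            local[i] *= factor

        # Transfer out: copy the processed chunk back to main memory.
        result[start:start + n] = local[:n]

    return result
```

Even in this toy form, the chunking, bounds handling, and copy-in/copy-out bookkeeping are all the programmer's job, which is the burden that higher-level programming models aim to hide.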