ABSTRACT
Parallel architectures are the way of the future, but are notoriously difficult to program. In addition to the low-level constructs they often present (e.g., locks, DMA, and non-sequential memory models), most parallel programming environments admit data races: the environment may make nondeterministic scheduling choices that can change the function of the program. We believe the solution is model-based design, where the programmer is presented with a constrained higher-level language that prevents certain unwanted behavior. In this paper, we describe a compiler for the SHIM scheduling-independent concurrent language that generates code for the Cell Broadband heterogeneous multicore processor. The complexity of the code our compiler generates relative to the source illustrates how difficult it is to manually write code for the Cell. We demonstrate the efficacy of our compiler on two examples. While the SHIM language is (by design) not ideal for every algorithm, it works well for certain applications and simplifies the parallel programming process, especially on the Cell architecture.