SPENK: adding another level of parallelism on the cell broadband engine
Source: ACM International Conference Proceeding Series; Vol. 356
Proceedings of the 1st international forum on Next-generation multicore/manycore technologies
Cairo, Egypt
SESSION: Thread management and thread-level speculation
Article No. 2  
Year of Publication: 2008
ISBN:978-1-60558-407-2
Authors
Mohamed F. Ahmed, University of Connecticut, Storrs, CT
Reda A. Ammar, University of Connecticut, Storrs, CT
Sanguthevar Rajasekaran, University of Connecticut, Storrs, CT
Sponsors
IBM
IBM Center for Advanced Studies, Cairo, Egypt
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 9,   Downloads (12 Months): 87,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/1463768.1463771

ABSTRACT

The Cell Broadband Engine (CBE) is a heterogeneous multi-core processor with unique design properties for high-performance computing. It consists of one Power Processing Element (PPE) and eight Synergistic Processing Elements (SPEs) connected by the Element Interconnect Bus (EIB). It employs novel techniques, such as a software-managed cache, to hide memory latency and guarantee, by default, maximum utilization of the overall system resources. However, exploiting these facilities requires complex algorithm designs and implementations to achieve the best performance. In this paper we discuss our micro-threading model, realized by a nano-kernel implemented on top of each SPE. The SPE Nano-kernel, or SPENK, employs the micro-threading model to increase the utilization of the CBE resources while simplifying the programming model. Our framework boosted the processor's overall performance by a factor of five compared to the current threading model. It allowed us to build a distributed model for SPE task management and automated Local Storage (LS) management. We further used the micro-threading model to build an event-based programming model on top of the CBE architecture. We tested our framework on two types of algorithms: (1) uniform memory access algorithms, such as parallel summation, and (2) non-uniform or irregular memory access algorithms, specifically tree spanning algorithms. For the first type we obtained up to a threefold performance improvement, and for the second type a fivefold improvement.
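The abstract does not describe SPENK's interface, so the following is only a minimal sketch of the general idea behind cooperative micro-threading applied to parallel summation: each micro-thread gives up the processor whenever it requests a block of data, and a small scheduler runs another micro-thread in the meantime. It is written in Python with generators purely for illustration; the names `summing_microthread` and `run_microthreads`, the chunk size, and the thread count are all hypothetical and are not taken from the paper, and a list slice stands in for an asynchronous transfer into Local Storage.

```python
from collections import deque


def summing_microthread(start, stop, chunk, results):
    """Hypothetical micro-thread: sums the range [start, stop) chunk by chunk."""
    total = 0
    for lo in range(start, stop, chunk):
        hi = min(lo + chunk, stop)
        # Yielding a request stands in for issuing an asynchronous transfer
        # and releasing the processor while the data is in flight.
        block = yield (lo, hi)
        total += sum(block)
    results.append(total)


def run_microthreads(data, n_threads=4, chunk=8):
    """Round-robin scheduler (an assumption, not SPENK's actual policy)."""
    results = []
    n = len(data)
    step = max(1, -(-n // n_threads))  # ceiling division, at least 1
    ready = deque()
    for start in range(0, n, step):
        thread = summing_microthread(start, min(start + step, n), chunk, results)
        request = next(thread)  # run until the first transfer request
        ready.append((thread, request))

    switches = 0
    while ready:
        thread, (lo, hi) = ready.popleft()
        try:
            # "Complete" the transfer and resume the thread with its block.
            request = thread.send(data[lo:hi])
        except StopIteration:
            continue  # the thread finished and stored its partial sum
        ready.append((thread, request))
        switches += 1
    return sum(results), switches


if __name__ == "__main__":
    total, switches = run_microthreads(list(range(100)))
    print(total, switches)
```

Calling `run_microthreads(list(range(100)))` returns the total 4950 together with a count of how many times the scheduler switched between micro-threads. On real hardware the benefit would come from overlapping each pending transfer with computation in another micro-thread, which this single-threaded sketch only imitates.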


