ABSTRACT
The BlueGene/L supercomputer has been designed with a focus on power/performance efficiency to achieve high application performance under the thermal constraints of common data centers. To achieve this goal, emphasis was put on system solutions to engineer a power-efficient system. To exploit thread-level parallelism, the BlueGene/L system can scale to 64 racks with a total of 65536 compute nodes, each consisting of a single compute ASIC integrating all system functions with two industry-standard PowerPC microprocessor cores in a chip multiprocessor configuration. Each PowerPC processor exploits data-level parallelism with a high-performance SIMD floating point unit. To support good application scaling on such a massive system, special emphasis was put on efficient communication primitives by including five highly optimized communication networks. After an initial introduction of the BlueGene/L system architecture, we analyze power/performance efficiency for the BlueGene system using performance and power characteristics for the overall system performance (as exemplified by peak performance numbers). To understand application scaling behavior, and its impact on performance and power/performance efficiency, we analyze the NAMD molecular dynamics package using the ApoA1 benchmark. We find that even for strong scaling problems, BlueGene/L systems can deliver superior performance scaling and deliver significant power/performance efficiency. Application benchmark power/performance scaling for the voltage-invariant energy-delay² power/performance metric demonstrates that choosing a power-efficient 700MHz embedded PowerPC processor core and relying on application parallelism was the right decision to build a powerful, and power/performance efficient system.
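As a rough illustration of why an energy-delay² metric is described as voltage-invariant, the sketch below assumes a first-order model that the abstract does not spell out: energy scales with the square of the supply voltage, and clock frequency scales linearly with it. The function names and the numbers are hypothetical and chosen only for illustration.

```python
def energy_delay_squared(energy_joules: float, delay_seconds: float) -> float:
    """Energy-delay^2 metric; lower values indicate better efficiency."""
    return energy_joules * delay_seconds ** 2


def scale_voltage(energy_joules: float, delay_seconds: float, k: float):
    """Apply a supply-voltage scale factor k under a first-order model.

    Assumption (not from the source): energy ~ V^2 and frequency ~ V,
    so energy is multiplied by k^2 and delay is divided by k.
    """
    return energy_joules * k ** 2, delay_seconds / k


# Hypothetical baseline design point: 100 J to finish a run taking 2 s.
base_energy, base_delay = 100.0, 2.0

# The same design at half the supply voltage.
scaled_energy, scaled_delay = scale_voltage(base_energy, base_delay, 0.5)

ed2_base = energy_delay_squared(base_energy, base_delay)
ed2_scaled = energy_delay_squared(scaled_energy, scaled_delay)

# The plain energy-delay product, by contrast, changes with voltage.
ed_base = base_energy * base_delay
ed_scaled = scaled_energy * scaled_delay
```

Under these assumptions the energy-delay² value is the same at both voltages while the plain energy-delay product is not, which is why such a metric can compare designs independently of the operating voltage chosen.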
CITED BY 5
José E. Moreira, Valentina Salapura, George Almasi, Charles Archer, Ralph Bellofatto, Peter Bergner, Randy Bickford, Mathias Blumrich, José R. Brunheroto, Arthur A. Bright, Michael Brutman, José G. Castaños, Dong Chen, Paul Coteus, Paul Crumley, Sam Ellis, Thomas Engelsiepen, Alan Gara, Mark Giampapa, Tom Gooding, Shawn Hall, Ruud A. Haring, Roger Haskin, Philip Heidelberger, Dirk Hoenicke, Todd Inglett, Gerrard V. Kopcsay, Derek Lieber, David Limpert, Pat McCarthy, Mark Megerian, Mike Mundy, Martin Ohmacht, Jeff Parker, Rick A. Rand, Don Reed, Ramendra Sahoo, Alda Sanomiya, Richard Shok, Brian Smith, Gordon G. Stewart, Todd Takken, Pavlos Vranas, Brian Wallenfelt, Michael Blocksome, Joe Ratterman, The Blue Gene/L supercomputer: a hardware and software story, International Journal of Parallel Programming, v.35 n.3, p.181-206, June 2007
Valentina Salapura, Matthias Blumrich, Alan Gara, Improving the accuracy of snoop filtering using stream registers, Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture, p.25-32, September 16, 2007, Brasov, Romania