ABSTRACT
This paper addresses the problem of orchestrating and scheduling parallelism at multiple levels of granularity on heterogeneous multicore processors. We present mechanisms and policies for adaptive exploitation and scheduling of layered parallelism on the Cell Broadband Engine. Our policies combine event-driven task scheduling with malleable loop-level parallelism, which the runtime system exploits whenever task-level parallelism leaves cores idle. We present a scheduler for applications with layered parallelism on Cell and investigate its performance with RAxML, an application that infers large phylogenetic trees using the Maximum Likelihood (ML) method. Our experiments show that the Cell benefits significantly from dynamic methods that selectively exploit the layers of parallelism in the system in response to workload fluctuation. Our scheduler outperforms the MPI version of RAxML, scheduled by the Linux kernel, by up to a factor of 2.6. We are able to execute RAxML on one Cell four times faster than on a dual-processor system with Hyperthreaded Xeon processors, and 5-10% faster than on a single-processor system with a dual-core, quad-thread IBM Power5 processor.
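The idea of falling back to loop-level parallelism when task-level parallelism leaves cores idle can be illustrated with a minimal sketch. It is not the scheduler evaluated in this paper: the function name `allocate_cores`, the even-split policy, and the core count of 8 used in the example are assumptions chosen only for illustration.

```python
from typing import List


def allocate_cores(num_cores: int, num_tasks: int) -> List[int]:
    """Return how many cores each running task receives.

    Hypothetical policy (an assumption, not the paper's algorithm):
    every running task first gets one core (task-level parallelism).
    Cores that would otherwise stay idle are then spread as evenly as
    possible over the running tasks, so that each task can work-share
    its loops across its cores (loop-level parallelism).
    """
    if num_cores <= 0:
        raise ValueError("num_cores must be positive")
    if num_tasks <= 0:
        return []
    # At most one task per core can run; any further tasks wait.
    running = min(num_tasks, num_cores)
    base, extra = divmod(num_cores, running)
    # The first `extra` tasks receive one additional core.
    return [base + (1 if i < extra else 0) for i in range(running)]


if __name__ == "__main__":
    # Enough tasks to fill 8 cores: pure task-level parallelism.
    print(allocate_cores(8, 8))
    # Only 3 tasks: the 5 idle cores are handed out for loop-level work.
    print(allocate_cores(8, 3))
```

In an event-driven setting, a function like this would be re-evaluated whenever a task starts or finishes, so the degree of loop-level parallelism follows the fluctuation in task-level parallelism.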