|
ABSTRACT
Heuristics for calculating phylogenetic trees for a large sets of aligned rRNA sequences based on the maximum likelihood method are computationally expensive. The core of most parallel algorithms, which accounts for the greatest part of computation time, is the tree evaluation function, that calculates the likelihood value for each tree topology. This paper describes and uses Subtree Equality Vectors (SEVs) to reduce the number of required floating point operations during topology evaluation.We integrated our optimizations into various sequential programs and into parallel fastDNAml, one of the most common and efficient parallel programs for calculating large phylogenetic trees.Experimental results for our parallel program, which renders exactly the same output as parallel fastDNAml show global run time improvements of 26% to 65%. The optimization scales best on clusters of PCs, which also implies a substantial cost saving factor for the determination of large trees.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Hans L. Bodlaender , Michael R. Fellows , Michael T. Hallett , H. Todd Wareham , Tandy J. Warnow, The hardness of perfect phylogeny, feasible register assignment and other problems on thin colored graphs, Theoretical Computer Science, v.244 n.1-2, p.167-188, Aug. 6, 2000
[doi> 10.1016/S0304-3975(98)00342-9]
|
| |
2
|
|
| |
3
|
fastDNAml Distribution: http://geta.life.uiuc.edu/~gary/programs/fastDNAml.html
|
| |
4
|
Felsenstein, J. 1981. Evolutionary trees from DNA sequences: A maximum likelihood approach. J. Mol. Evol. 17: 368--376.
|
| |
5
|
HEidelberg LInux Cluster System: http://helics.uni-hd.de
|
| |
6
|
Höchstleistungsrechner in Bayern (HLRB): The Hitachi SR8000-F1: http://www.lrz-muenchen.de/services/compute/hlrb
|
| |
7
|
Jermiin, L.S., Olsen, G.J., Mengersen, K.L. and Easteal, S. 1997. Majority-rule consensus of phylogenetic trees obtained by maximum-likelihood analysis. Mol. Biol. Evol. 14: 1297--1302.
|
| |
8
|
|
| |
9
|
Korber, B., Muldoon, M., Theiler, J., Gao, F., Gupta, R., Lapedes, A., Hahn, B.H., Wolinsky, S., and Bhattacharya, T. June 2000. Timing the ancestor of the HIV-1 pandemic strains. Science 288: 1789--1796.
|
| |
10
|
Olsen, G.J., Matsuda, H., Hagstrom, R., and Overbeek, R. 1994. fastDNAml: A tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. Comput. Appl. Biosci. 10: 41--48.
|
| |
11
|
|
 |
12
|
Craig A. Stewart , David Hart , Donald K. Berry , Gary J. Olsen , Eric A. Wernert , William Fischer, Parallel implementation and performance of fastDNAml: a program for maximum likelihood phylogenetic inference, Proceedings of the 2001 ACM/IEEE conference on Supercomputing (CDROM), p.20-20, November 10-16, 2001, Denver, Colorado
[doi> 10.1145/582034.582054]
|
| |
13
|
Stewart, C.A., Tan, T.W., Buchhorn, M., Hart, D., Berry, D., Zhang L., Wernert, E., Sakharkar, M., Fisher, W., and McMullen, D. 1999. Evolutionary biology and computational grids. IBM CASCON 1999 Computational Biology Workshop: Software Tools for Computational Biology.
|
| |
14
|
The ARB project: http://www.arb-home.de
|
| |
15
|
The PHYLogeny Inference Package: http://evolution.genetics.washington.edu/phylip.html
|
| |
16
|
Top 500 supercomputer sites: http://www.top500.org/list/2002/06
|
| |
17
|
Wolf, M.J., Easteal, S., Kahn, M. McKay, B.D., and Jermiin, L.S. 2000. TrExML: A maximum likelihood program for extensive tree-space exploration. Bioinformatics 16:383--394.
|
|