ACM Home Page
Please provide us with feedback. Feedback
The Gene-Duplication Problem: Near-Linear Time Algorithms for NNI-Based Local Searches
Full text PdfPdf (1.16 MB)
Source IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB) archive
Volume 6 ,  Issue 2  (April 2009) table of contents
Pages 221-231  
Year of Publication: 2009
ISSN:1545-5963
Authors
Mukul S. Bansal  Iowa State University, Ames
Oliver Eulenstein  Iowa State University, Ames
André Wehe  Iowa State University, Ames
Publisher
IEEE Computer Society Press  Los Alamitos, CA, USA
Bibliometrics
Downloads (6 Weeks): 13,   Downloads (12 Months): 50,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: 10.1109/TCBB.2009.7

ABSTRACT

The gene-duplication problem is to infer a species supertree from a collection of gene trees that are confounded by complex histories of gene-duplication events. This problem is NP-complete and thus requires efficient and effective heuristics. Existing heuristics perform a stepwise search of the tree space, where each step is guided by an exact solution to an instance of a local search problem. A classical local search problem is the {\tt NNI} search problem, which is based on the nearest neighbor interchange operation. In this work, we 1) provide a novel near-linear time algorithm for the {\tt NNI} search problem, 2) introduce extensions that significantly enlarge the search space of the {\tt NNI} search problem, and 3) present algorithms for these extended versions that are asymptotically just as efficient as our algorithm for the {\tt NNI} search problem. The exceptional speedup achieved in the extended {\tt NNI} search problems makes the gene-duplication problem more tractable for large-scale phylogenetic analyses. We verify the performance of our algorithms in a comparison study using sets of large randomly generated gene trees.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
M.S. Bansal and O. Eulenstein, "The Gene-Duplication Problem: Near-Linear Time Algorithms for NNI-Based Local Searches," Proc. Int'l Symp. Bioinformatics Research and Applications (ISBRA), pp. 14-25, May 2008.
 
2
M. Goodman, J. Czelusniak, G.W. Moore, A.E. Romero-Herrera, and G. Matsuda, "Fitting the Gene Lineage into Its Species Lineage: A Parsimony Strategy Illustrated by Cladograms Constructed from Globin Sequences," Systematic Zoology, vol. 28, pp. 132-163, 1979.
 
3
R. Guigó, I. Muchnik, and T.F. Smith, "Reconstruction of Ancient Molecular Phylogeny," Molecular Phylogenetics and Evolution, vol. 6, no. 2, pp. 189-213, 1996.
 
4
 
5
I. Wapinski, A. Pfeffer, N. Friedman, and A. Regev, "Natural History and Evolutionary Principles of Gene Duplication in Fungi," Nature, vol. 449, pp. 54-61, 2007.
 
6
L. Arvestad, A.-C. Berglund, J. Lagergren, and B. Sennblad, "Bayesian Gene/Species Tree Reconciliation and Orthology Analysis Using MCMC," Proc. Int'l Conf. Intelligent Systems for Molecular Biology (ISMB) (Supplement of Bioinformatics), pp. 7-15, 2003.
7
 
8
 
9
R.D.M. Page, "GeneTree: Comparing Gene and Species Phylogenies Using Reconciled Trees," Bioinformatics, vol. 14, no. 9, pp. 819- 820, 1998.
 
10
J. Felsenstein, Inferring Phylogenies, pp. 37-46. Sinauer Assoc., 2004.
 
11
C. Semple and M. Steel, Phylogenetics, pp. 30-33. Oxford Univ. Press, 2003.
 
12
B.L. Allen and M. Steel, "Subtree Transfer Operations and Their Induced Metrics on Evolutionary Trees," Annals of Combinatorics, vol. 5, pp. 1-13, 2001.
 
13
M. Bordewich and C. Semple, "On the Computational Complexity of the Rooted Subtree Prune and Regraft Distance," Annals of Combinatorics, vol. 8, pp. 409-423, 2004.
 
14
R.D.M. Page and M.A. Charleston, "From Gene to Organismal Phylogeny: Reconciled Trees and the Gene Tree/Species Tree Problem," Molecular Phylogenetics and Evolution, vol. 7, pp. 231- 240, 1997.
 
15
D.L. Swofford, G.J. Olsen, P.J. Waddel, and D.M. Hillis, "Phylogenetic Inference," Molecular Systematics, D.M. Hillis, C. Moritz, and B.K. Mable, eds., chapter 11, pp. 407-509, Sinauer Assoc., 1996.
 
16
D. Chen, O. Eulenstein, D. Fernández-Baca, and J.G. Burleigh, "Improved Heuristics for Minimum-Flip Supertree Construction," Evolutionary Bioinformatics, vol. 2, pp. 347-356, 2006.
 
17
M.S. Bansal, J.G. Burleigh, O. Eulenstein, and A. Wehe, "Heuristics for the Gene-Duplication Problem: A ¿(n) Speedup for the Local Search," Proc. Ann. Int'l Conf. Research in Computational Molecular Biology (RECOMB), pp. 238-252, 2007.
 
18
 
19
G. Ganapathy, V. Ramachandran, and T. Warnow, "Better Hill-Climbing Searches for Parsimony," Proc. Workshop Algorithms in Bioinformatics (WABI), pp. 245-258, 2003.
 
20
 
21
R.D.M. Page, "Maps between Trees and Cladistic Analysis of Historical Associations Among Genes, Organisms, and Areas," Systematic Biology, vol. 43, no. 1, pp. 58-77, 1994.
 
22
B. Mirkin, I. Muchnik, and T.F. Smith, "A Biologically Consistent Model for Comparing Molecular Phylogenies," J. Computational Biology, vol. 2, no. 4, pp. 493-507, 1995.
 
23
O. Eulenstein, "Predictions of Gene-Duplications and Their Phylogenetic Development," PhD dissertation, Univ. of Bonn, GMD Research Series no. 20/1998, 1998.
 
24
L. Zhang, "On a Mirkin-Muchnik-Smith Conjecture for Comparing Molecular Phylogenies," J. Computational Biology, vol. 4, no. 2, pp. 177-187, 1997.
 
25
K. Chen, D. Durand, and M. Farach-Colton, "Notung: A Program for Dating Gene Duplications and Optimizing Gene Family Trees," J. Computational Biology, vol. 7, pp. 429-447, 2000.
 
26
 
27
P. Górecki and J. Tiuryn, "On the Structure of Reconciliations," Proc. Research in Computational Molecular Biology (RECOMB) Comparative Genomics Workshop, pp. 42-52, Oct. 2004.
 
28
 
29
J.B. Slowinski, A. Knight, and A.P. Rooney, "Inferring Species Trees from Gene Trees: A Phylogenetic Analysis of the Elapidae (Serpentes) Based on the Amino Acid Sequences of Venom Proteins," Molecular Phylogenetics and Evolution, vol. 8, pp. 349- 362, 1997.
 
30
R.D.M. Page, "Extracting Species Trees from Complex Gene Trees: Reconciled Trees and Vertebrate Phylogeny," Molecular Phylogenetics and Evolution, vol. 14, pp. 89-106, 2000.
 
31
R.D.M. Page and J. Cotton, "Vertebrate Phylogenomics: Reconciled Trees and Gene Duplications," Proc. Pacific Symp. Biocomputing , pp. 536-547, 2002.
 
32
J.A. Cotton and R.D.M. Page, "Tangled Tales from Multiple Markers: Reconciling Conflict between Phylogenies to Build Molecular Supertrees," Phylogenetic Supertrees: Combining Information to Reveal the Tree of Life, pp. 107-125, Springer-Verlag, 2004.
 
33
M.J. Sanderson and M.M. McMahon, "Inferring Angiosperm Phylogeny from EST Data with Widespread Gene Duplication," BMC Evolutionary Biology, vol. 7, (Suppl. 1): S3, 2007.
 
34
 
35
36
 
37
 
38

Collaborative Colleagues:
Mukul S. Bansal: colleagues
Oliver Eulenstein: colleagues
André Wehe: colleagues