Genetic subtyping using cluster analysis
Source: ACM SIGKDD Explorations Newsletter
Volume 3, Issue 1 (July 2001)
COLUMN: Contributed articles
Pages: 33-42
Year of Publication: 2001
ISSN:1931-0145
Authors
Tom Burr  Los Alamos National Laboratory, Los Alamos, NM
James R. Gattiker  Los Alamos National Laboratory, Los Alamos, NM
Greggory S. LaBerge  Denver Police Dept. Crime Lab, Denver, CO and University of Colorado
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/507533.507539

ABSTRACT

In this paper we (1) describe state-of-the-art methods to identify clusters in DNA sequence data for taxonomic analysis; (2) describe a new method, based on model-based clustering, with better scaling properties; and (3) present examples using the nucleoprotein and hemagglutinin regions of influenza and the env and gag regions of human immunodeficiency virus (HIV).

