Clustering gene expression data via mining ensembles of classification rules evolved using MOSES
Source
Genetic And Evolutionary Computation Conference archive
Proceedings of the 9th annual conference on Genetic and evolutionary computation table of contents
London, England
SESSION: Biological applications: papers table of contents
Pages: 407 - 414  
Year of Publication: 2007
ISBN:978-1-59593-697-4
Authors
Moshe Looks  Washington University in St. Louis; also SAIC, St. Louis, MO
Ben Goertzel  Biomind LLC, Rockville, MD
Lucio de Souza Coelho  Biomind LLC, Rockville, MD
Mauricio Mudado  Biomind LLC, Rockville, MD
Cassio Pennachin  Biomind LLC, Rockville, MD
Sponsors
SIGEVO: ACM Special Interest Group on Genetic and Evolutionary Computation
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1276958.1277041

ABSTRACT

A novel approach, model-based clustering, is described for identifying complex interactions between genes or gene-categories based on static gene expression data. The approach deals with categorical data, which consists of a set of gene expression profiles belonging to one category, and a set belonging to another category. An evolutionary algorithm (Meta-Optimizing Semantic Evolutionary Search, or MOSES) is used to learn an ensemble of classification models distinguishing the two categories, based on inputs that are features corresponding to gene expression values. Each feature is associated with a model-based vector, which encodes quantitative information regarding the utilization of the feature across the ensembles of models. Two different ways of constructing these vectors are explored. These model-based vectors are then clustered using a variant of hierarchical clustering called Omniclust. The result is a set of model-based clusters, in which features are gathered together if they are often considered together by classification models -- which may be because they're co-expressed, or may be for subtler reasons involving multi-gene interactions. The method is illustrated by applying it to two datasets regarding human gene expression, one drawn from brain cells and pertinent to the neurogenetics of aging, and the other drawn from blood cells and relating to differentiating between types of lymphoma. We find that, compared to traditional expression-based clustering, the new method often yields clusters that have higher mathematical quality (in the sense of homogeneity and separation) and also yield novel and meaningful insights into the underlying biological processes.
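
The idea of a model-based vector can be illustrated with a small sketch. The abstract does not say how the two vector constructions work, so the code below assumes one plausible encoding: each feature gets a 0/1 vector whose j-th entry records whether model j uses that feature. Plain average-linkage agglomerative clustering with Jaccard distance stands in for Omniclust, which is not described here. The model sets, gene names, and the 0.5 threshold are all hypothetical.

```python
from itertools import combinations

def usage_vectors(models, features):
    """Map each feature to a 0/1 vector: entry j is 1 if model j uses the feature.
    (Assumed encoding; the paper's actual vector constructions may differ.)"""
    return {f: [1 if f in m else 0 for m in models] for f in features}

def jaccard_distance(u, v):
    """1 - |models using both| / |models using either|."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return (1.0 - both / either) if either else 0.0

def average_linkage(c1, c2, vecs):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(jaccard_distance(vecs[a], vecs[b]) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))

def cluster_features(vecs, threshold):
    """Agglomerative clustering (a stand-in for Omniclust): repeatedly merge
    the two closest clusters until the closest pair exceeds the threshold."""
    clusters = [[f] for f in sorted(vecs)]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], vecs),
        )
        if average_linkage(clusters[i], clusters[j], vecs) > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

# Hypothetical ensemble: each model is the set of gene features it references.
models = [
    {"geneA", "geneB"},
    {"geneA", "geneB", "geneC"},
    {"geneC", "geneD"},
    {"geneD", "geneC"},
    {"geneA", "geneB"},
]
features = ["geneA", "geneB", "geneC", "geneD"]

vecs = usage_vectors(models, features)
clusters = cluster_features(vecs, threshold=0.5)
print(clusters)
```

In this toy ensemble, geneA and geneB end up in one cluster and geneC and geneD in another, because each pair tends to appear in the same models, regardless of whether their raw expression values are correlated.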


