Biological pathways as features for microarray data classification
Source
Conference on Information and Knowledge Management
Proceeding of the 2nd international workshop on Data and text mining in bioinformatics
Napa Valley, California, USA
SESSION: Bio-data mining
Pages 5-12
Year of Publication: 2008
ISBN: 978-1-60558-251-1
Authors
Brian Quanz, University of Kansas, Lawrence, KS, USA
Meeyoung Park, University of Kansas, Lawrence, KS, USA
Jun Huan, University of Kansas, Lawrence, KS, USA
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 10,   Downloads (12 Months): 113,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/1458449.1458455

ABSTRACT

Classification using microarray gene expression data is an important task in bioinformatics. Due to the high dimensionality and small sample size that characterize microarray data, there has recently been a drive to incorporate any available information beyond the expression data into the classification process. As a result, much work has begun on using gene expression data to select biological pathways that are closely related to a clinical outcome of interest. Incorporating this pathway information opens up new avenues for classification. As opposed to previous approaches that consider individual genes as features, we propose a new approach that treats biological pathways as features. Each pathway found to be significantly related to an outcome of interest is treated as a feature and mapped to a feature value. We define several methods for mapping pathways to features and, for different feature selection methods, compare the performance of several classifiers using our feature transformations with that of the same classifiers using individual genes as features.
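
The abstract does not say which mappings from pathways to feature values the authors define. As a rough illustration of the general idea only, the sketch below collapses each sample's gene-level expression into one value per pathway. The aggregation functions (mean, median), the function name `pathway_features`, the fallback value for unmeasured pathways, and the toy data are all assumptions made for this example and are not taken from the paper.

```python
from statistics import mean, median


def pathway_features(expression, pathways, aggregate=mean):
    """Map gene-level expression to one feature value per pathway.

    expression: dict of sample id -> dict of gene -> expression value
    pathways:   dict of pathway name -> list of member genes
    aggregate:  function reducing a list of numbers to one number
                (mean is an assumed choice, not the paper's method)
    """
    names = sorted(pathways)  # fixed feature order shared by all samples
    features = {}
    for sample, genes in expression.items():
        row = []
        for name in names:
            # Use only the member genes that were measured for this sample.
            values = [genes[g] for g in pathways[name] if g in genes]
            # Assumption: a pathway with no measured genes gets 0.0.
            row.append(aggregate(values) if values else 0.0)
        features[sample] = row
    return names, features


# Toy, made-up data: two samples, four measured genes, two pathways.
expression = {
    "s1": {"g1": 2.0, "g2": 4.0, "g3": 1.0, "g4": 3.0},
    "s2": {"g1": 0.0, "g2": 2.0, "g3": 5.0, "g4": 7.0},
}
pathways = {"pathway_A": ["g1", "g2"], "pathway_B": ["g3", "g4", "g5"]}

names, features = pathway_features(expression, pathways)
print(names)
print(features["s1"])
print(features["s2"])
```

The resulting rows have one column per pathway rather than one per gene, so they could be passed to any standard classifier in place of the gene-level matrix. Passing `aggregate=median` swaps in a different hypothetical mapping without changing the rest of the code.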


