Identifying biologically relevant genes via multiple heterogeneous data sources
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Las Vegas, Nevada, USA
SESSION: Research papers
Pages 839-847
Year of Publication: 2008
ISBN: 978-1-60558-193-4
Authors
Zheng Zhao  Arizona State University, Tempe, AZ, USA
Jiangxin Wang  Arizona State University, Tempe, AZ, USA
Huan Liu  Arizona State University, Tempe, AZ, USA
Jieping Ye  Arizona State University, Tempe, AZ, USA
Yung Chang  Arizona State University, Tempe, AZ, USA
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1401890.1401990

ABSTRACT

Selecting genes that are differentially expressed and critical to a particular biological process has been a major challenge in post-array analysis. Recent developments in bioinformatics have made various data sources available, such as mRNA and miRNA expression profiles, biological pathways, and gene annotations. Efficient and effective integration of multiple data sources enriches our knowledge about the involved samples and genes, which helps in selecting genes of significant biological relevance. In this work, we studied a novel problem of multi-source gene selection: given multiple heterogeneous data sources (or data sets), select genes from expression profiles by integrating information from the various data sources. We investigated how to effectively employ the information contained in multiple data sources to extract an intrinsic global geometric pattern and use it in covariance analysis for gene selection. We designed and conducted experiments to systematically compare the proposed approach with representative methods in terms of statistical and biological significance, and showed the efficacy and potential of the proposed approach with promising findings.
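
The abstract does not spell out the algorithm, so the following is only a minimal illustrative sketch of the general idea (combine sample-level structure from several sources into one global pattern, then score genes by covariance with it). It is not the authors' method. Everything in it is an assumption made for illustration: the function names `sample_similarity`, `global_pattern`, and `rank_genes`, the use of cosine similarity, the simple averaging of per-source similarity matrices, and the use of leading eigenvectors as the pattern.

```python
import numpy as np


def sample_similarity(X):
    """Cosine similarity between samples (rows) of one data source.

    Assumption: features are column-centered first; the source does not
    specify how per-source similarity is measured.
    """
    Xc = X - X.mean(axis=0)
    norms = np.linalg.norm(Xc, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for constant rows
    U = Xc / norms
    return U @ U.T


def global_pattern(sources, k=1):
    """Average the per-source similarity matrices and return the k
    eigenvectors with the largest eigenvalues as a sample-level pattern.

    All sources must describe the same samples in the same row order.
    """
    S = sum(sample_similarity(X) for X in sources) / len(sources)
    _, vecs = np.linalg.eigh(S)   # eigenvalues come back in ascending order
    return vecs[:, ::-1][:, :k]   # reorder so the leading vectors come first


def rank_genes(expr, pattern):
    """Score each gene (column of expr) by its summed squared covariance
    with the pattern columns; return gene indices, best first, and scores."""
    Ec = expr - expr.mean(axis=0)
    Pc = pattern - pattern.mean(axis=0)
    cov = Ec.T @ Pc / (expr.shape[0] - 1)
    scores = (cov ** 2).sum(axis=1)
    return np.argsort(scores)[::-1], scores


# Synthetic demo: 20 samples in two groups; gene 0 tracks the grouping,
# and a second (hypothetical) data source reflects the same grouping.
rng = np.random.default_rng(0)
labels = np.repeat([0.0, 1.0], 10)
expr = rng.normal(size=(20, 50))
expr[:, 0] += 5.0 * labels
other = rng.normal(size=(20, 8)) + 3.0 * labels[:, None]

pattern = global_pattern([expr, other], k=1)
order, scores = rank_genes(expr, pattern)
print("top-ranked genes:", order[:5])
```

Averaging the similarity matrices weights every source equally; a weighted combination would be a natural variation when sources differ in reliability, but the abstract gives no guidance on that choice.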

