Identifying biologically relevant genes via multiple heterogeneous data sources
Source
International Conference on Knowledge Discovery and Data Mining
Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Las Vegas, Nevada, USA
SESSION: Research papers
Pages 839-847
Year of Publication: 2008
ISBN: 978-1-60558-193-4
Authors
Zheng Zhao, Arizona State University, Tempe, AZ, USA
Jiangxin Wang, Arizona State University, Tempe, AZ, USA
Huan Liu, Arizona State University, Tempe, AZ, USA
Jieping Ye, Arizona State University, Tempe, AZ, USA
Yung Chang, Arizona State University, Tempe, AZ, USA
ABSTRACT
Selecting genes that are differentially expressed and critical to a particular biological process has been a major challenge in post-array analysis. Recent developments in bioinformatics have made various data sources available, such as mRNA and miRNA expression profiles, biological pathways, and gene annotations. Efficient and effective integration of multiple data sources enriches our knowledge of the samples and genes involved, which helps in selecting genes of significant biological relevance. In this work, we studied a novel problem, multi-source gene selection: given multiple heterogeneous data sources (or data sets), select genes from expression profiles by integrating information from the various data sources. We investigated how to use the information contained in multiple data sources to extract an intrinsic global geometric pattern, and how to use that pattern in covariance analysis for gene selection. We designed and conducted experiments to systematically compare the proposed approach with representative methods in terms of statistical and biological significance, and the results showed the efficacy and potential of the proposed approach, with promising findings.
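The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each data source has already been turned into a symmetric sample-by-sample similarity matrix, combines those matrices by simple averaging, takes the leading eigenvectors of the result as a stand-in for the "global geometric pattern", and scores each gene by how much of its centered expression lies in that subspace. The function names `combine_similarities` and `rank_genes`, the unit-trace scaling, and the scoring rule are all illustrative assumptions.

```python
import numpy as np


def combine_similarities(similarities):
    """Average per-source similarity matrices (assumed combination rule).

    Each matrix is scaled to unit trace first so that no single source
    dominates purely because of its numeric scale.
    """
    scaled = [s / np.trace(s) for s in similarities]
    return sum(scaled) / len(scaled)


def rank_genes(expression, similarities, n_components=2):
    """Rank genes by alignment with the combined sample-level pattern.

    expression:   (n_samples, n_genes) array of expression values.
    similarities: list of (n_samples, n_samples) symmetric matrices,
                  one per data source (hypothetical preprocessing).
    Returns (gene indices sorted by decreasing score, scores).
    """
    combined = combine_similarities(similarities)

    # eigh returns eigenvalues in ascending order, so the last columns
    # of eigvecs span the dominant subspace of the combined similarity.
    _, eigvecs = np.linalg.eigh(combined)
    basis = eigvecs[:, -n_components:]

    # Center each gene across samples and scale it to unit length.
    centered = expression - expression.mean(axis=0)
    norms = np.linalg.norm(centered, axis=0)
    norms[norms == 0] = 1.0  # avoid dividing by zero for constant genes
    unit = centered / norms

    # Score: fraction of each gene's variation captured by the subspace.
    scores = np.sum((basis.T @ unit) ** 2, axis=0)
    order = np.argsort(scores)[::-1]
    return order, scores
```

With this scoring rule, a gene whose expression follows the sample structure shared across the sources receives a score close to 1, while a gene unrelated to that structure scores near 0. Other combination rules (for example, weighted sums of the similarity matrices) would fit the same outline.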