Simple and effective visual models for gene expression cancer diagnostics
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Chicago, Illinois, USA
SESSION: Research track paper
Pages: 167 - 176  
Year of Publication: 2005
ISBN:1-59593-135-X
Authors
Gregor Leban  University of Ljubljana, Tržaška 25, Ljubljana, Slovenia
Minca Mramor  University of Ljubljana, Tržaška 25, Ljubljana, Slovenia
Ivan Bratko  University of Ljubljana, Tržaška 25, Ljubljana, Slovenia
Blaz Zupan  University of Ljubljana, Tržaška 25, Ljubljana, Slovenia
Sponsors
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1081870.1081892

ABSTRACT

We show that diagnostic classes in cancer gene expression data sets, which typically include thousands of features (genes), can be effectively separated with simple two-dimensional plots such as scatterplots and radviz graphs. The principal innovation of the paper is a method called VizRank, which scores candidate projections and identifies the best among possibly millions of them for visualization. Compared with techniques widely applied in cancer genomics in recent years, including neural networks, support vector machines and various ensemble-based approaches, VizRank is fast and finds visualization models that domain experts can easily examine and interpret. In our experiments on a number of gene expression data sets, VizRank always found data visualizations that use a small number of genes (two to seven) and show excellent class separation. In addition to providing grounds for gene expression cancer diagnosis, VizRank and its visualizations also identify small sets of relevant genes, uncover interesting gene interactions and point to outliers and potential misclassifications in cancer data sets.
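
The abstract does not say how VizRank scores a projection, so the following is only an illustrative sketch of the general idea of ranking candidate scatterplots by class separation. It assumes leave-one-out k-nearest-neighbor accuracy as a stand-in score; the function names (knn_loo_accuracy, rank_scatterplots), the choice of k, and the toy data are all hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import dist


def knn_loo_accuracy(points, labels, k=3):
    """Leave-one-out accuracy of a k-NN majority vote on projected points.

    Assumed stand-in for a projection score; not the paper's definition.
    """
    correct = 0
    for i, p in enumerate(points):
        # Distances from point i to every other point in the projection.
        neighbours = sorted(
            (dist(p, q), labels[j]) for j, q in enumerate(points) if j != i
        )[:k]
        votes = {}
        for _, lab in neighbours:
            votes[lab] = votes.get(lab, 0) + 1
        predicted = max(votes, key=votes.get)
        correct += predicted == labels[i]
    return correct / len(points)


def rank_scatterplots(rows, labels, k=3):
    """Score every pair of features as a 2-D scatterplot, best first."""
    n_features = len(rows[0])
    scored = []
    for a, b in combinations(range(n_features), 2):
        projection = [(r[a], r[b]) for r in rows]
        scored.append((knn_loo_accuracy(projection, labels, k), (a, b)))
    # Highest score first; ties broken by the feature-index pair.
    return sorted(scored, key=lambda t: (-t[0], t[1]))


# Hypothetical toy data: 8 samples, 4 "genes", two diagnostic classes.
rows = [
    (1.0, 1.2, 5.0, 0.3),
    (1.1, 0.9, 1.0, 0.9),
    (0.9, 1.0, 3.0, 0.1),
    (1.2, 1.1, 4.0, 0.7),
    (3.0, 3.1, 1.5, 0.8),
    (3.2, 2.9, 4.5, 0.2),
    (2.9, 3.0, 3.5, 0.6),
    (3.1, 3.2, 0.5, 0.4),
]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]

ranking = rank_scatterplots(rows, labels)
for score, pair in ranking:
    print(pair, round(score, 2))
```

Exhaustive enumeration as shown is feasible only for a handful of features. With thousands of genes the number of pairs alone runs into the millions, so a practical search would need some way to limit or prioritize the candidates it evaluates.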


