Handling very large numbers of association rules in the analysis of microarray data
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Edmonton, Alberta, Canada
Session: Industry track papers
Pages: 396-404
Year of Publication: 2002
ISBN: 1-58113-567-X
Authors
Alexander Tuzhilin  New York University, New York, NY
Gediminas Adomavicius  University of Minnesota, Minneapolis, MN
Sponsors
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
AAAI
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/775047.775104

ABSTRACT

The problem of analyzing microarray data has become one of the important topics in bioinformatics over the past several years, and different data mining techniques have been proposed for the analysis of such data. In this paper, we propose to use association rule discovery methods for determining associations among expression levels of different genes. One of the main problems related to the discovery of these associations is scalability. Microarrays usually contain very large numbers of genes, sometimes tens of thousands. Therefore, analysis of such data can generate a very large number of associations, often numbering in the millions. The paper addresses this problem by presenting a method that enables biologists to evaluate these very large numbers of discovered association rules during the post-analysis stage of the data mining process. This is achieved by providing several rule evaluation operators, including rule grouping, filtering, browsing, and data inspection operators, that allow biologists to validate multiple individual gene regulation patterns at a time. By iteratively applying these operators, biologists can explore a significant part of all the initially generated rules in an acceptable period of time and thus answer the biological questions that are of particular interest to them. To validate our method, we tested our system on microarray data pertaining to studies of environmental hazards and their influence on gene expression processes. As a result, we managed to answer several questions that were of interest to the biologists who had collected this data.
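
To make the idea of rule filtering and grouping operators concrete, the following is a minimal Python sketch. It is not the authors' system: the abstract does not specify how the operators are defined, so the `Rule` structure, the gene names, the thresholds, and the functions `filter_rules` and `group_rules` are all hypothetical and chosen only for illustration. The sketch assumes each rule relates discretized expression levels ("up" or "down") of genes and carries support and confidence values.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical association rule over (gene, expression level) items."""
    body: frozenset      # antecedent items, e.g. {("G1", "up")}
    head: tuple          # consequent item, e.g. ("G3", "down")
    support: float
    confidence: float


def filter_rules(rules, min_support=0.0, min_confidence=0.0, must_include=()):
    """Keep rules that meet the thresholds and mention all required genes."""
    wanted = set(must_include)
    kept = []
    for r in rules:
        genes = {g for g, _ in r.body} | {r.head[0]}
        if (r.support >= min_support
                and r.confidence >= min_confidence
                and wanted <= genes):
            kept.append(r)
    return kept


def group_rules(rules):
    """Group rules by the genes involved, ignoring expression levels."""
    groups = defaultdict(list)
    for r in rules:
        key = (frozenset(g for g, _ in r.body), r.head[0])
        groups[key].append(r)
    return dict(groups)


# Illustrative data only; these rules are made up for the example.
rules = [
    Rule(frozenset({("G1", "up")}), ("G3", "down"), 0.30, 0.90),
    Rule(frozenset({("G1", "down")}), ("G3", "up"), 0.25, 0.85),
    Rule(frozenset({("G2", "up")}), ("G3", "down"), 0.05, 0.60),
]

strong = filter_rules(rules, min_support=0.1, min_confidence=0.8,
                      must_include=["G1"])
groups = group_rules(strong)
for (body_genes, head_gene), members in groups.items():
    print(sorted(body_genes), "->", head_gene, ":", len(members), "rules")
```

Grouping by genes rather than by individual items lets one decision (accept or reject a group) cover several rules at once, which is one plausible way to reduce the number of rules a biologist has to inspect individually.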


