ABSTRACT
Microarray datasets typically contain a large number of columns but a small number of rows. Association rules have proven useful in analyzing such datasets. However, most existing association rule mining algorithms cannot efficiently handle datasets with a large number of columns. Moreover, the number of association rules generated from such datasets is enormous because of the large number of possible column combinations. In this paper, we describe a new algorithm called FARMER that is specially designed to discover association rules from microarray datasets. Instead of finding individual association rules, FARMER finds interesting rule groups, each of which is essentially a set of rules generated from the same set of rows. Unlike conventional rule mining algorithms, FARMER searches for interesting rules in the row enumeration space and exploits all user-specified constraints, including minimum support, confidence, and chi-square, to support efficient pruning. Experiments on real bioinformatics datasets show that FARMER is orders of magnitude faster than previous association rule mining algorithms.
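To make the idea of a rule group concrete, the following is a minimal brute-force sketch in Python, not the FARMER algorithm itself. It enumerates subsets of rows, takes the items shared by each subset, and groups rules by the set of rows that support them, then filters by minimum support, confidence, and chi-square. The toy data, the function names `chi_square` and `rule_groups`, and the choice to count support as the rows matching both antecedent and class are all assumptions made for illustration. The sketch performs no pruning and is exponential in the number of rows.

```python
from itertools import combinations

# Hypothetical toy dataset: each row is (set of items, class label).
# In a microarray setting, items would be discretized gene expression levels.
ROWS = [
    ({"a", "b", "c"}, "C"),
    ({"a", "b"}, "C"),
    ({"a", "c", "d"}, "C"),
    ({"b", "d"}, "N"),
    ({"c", "d"}, "N"),
]


def chi_square(x, y, n, m):
    """Chi-square of a 2x2 table: x rows match the antecedent, y of those
    carry the target class, n rows in total, m rows with the target class."""
    observed = [y, x - y, m - y, n - x - m + y]
    expected = [
        x * m / n,
        x * (n - m) / n,
        (n - x) * m / n,
        (n - x) * (n - m) / n,
    ]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def rule_groups(rows, target, min_sup=1, min_conf=0.0, min_chi=0.0):
    """Map each supporting row set to (upper bound itemset, support,
    confidence, chi-square) for rules of the form itemset -> target."""
    n = len(rows)
    m = sum(1 for _, label in rows if label == target)
    seen = set()
    groups = {}
    # Naive row enumeration: every non-empty subset of row indices.
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            common = set.intersection(*(rows[i][0] for i in subset))
            if not common:
                continue
            # All rows that contain every shared item.
            support_rows = frozenset(
                i for i, (items, _) in enumerate(rows) if common <= items
            )
            if support_rows in seen:
                continue
            seen.add(support_rows)
            # Largest itemset shared by exactly these rows.
            upper = frozenset(
                set.intersection(*(rows[i][0] for i in support_rows))
            )
            x = len(support_rows)
            y = sum(1 for i in support_rows if rows[i][1] == target)
            conf = y / x
            chi = chi_square(x, y, n, m)
            if y >= min_sup and conf >= min_conf and chi >= min_chi:
                groups[support_rows] = (upper, y, conf, chi)
    return groups


if __name__ == "__main__":
    found = rule_groups(ROWS, "C", min_sup=2, min_conf=1.0)
    for row_set, (upper, sup, conf, chi) in found.items():
        print(sorted(row_set), sorted(upper), sup, conf, round(chi, 3))
```

Every rule whose antecedent is matched by the same set of rows has the same support, confidence, and chi-square, which is why one entry per row set is enough to stand for the whole group. An efficient implementation would prune the row enumeration with the user-specified constraints rather than visiting every subset.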