ABSTRACT
In this paper, we propose a novel algorithm to discover the top-k covering rule groups for each row of gene expression profiles. Experiments on real bioinformatics datasets show that the new top-k covering rule mining algorithm is orders of magnitude faster than previous association rule mining algorithms. Furthermore, we propose a new classification method, RCBT. The RCBT classifier is constructed from the top-k covering rule groups, and the rule groups generated for building RCBT are bounded in number. This is in contrast to existing rule-based classification methods such as CBA [19], which, despite generating an excessive number of redundant rules, are still unable to cover some training data with the discovered rules. Experiments show that the RCBT classifier can match or outperform other state-of-the-art classifiers on several benchmark gene expression datasets. In addition, the top-k covering rule groups themselves directly provide insights into the mechanisms responsible for diseases.
REFERENCES
[2] T. R. Anderson and T. A. Slotkin. Maturation of the adrenal medulla--IV. Effects of morphine. Biochem Pharmacol, August 1975.
[4] Roberto J. Bayardo, Jr., Rakesh Agrawal, Mining the most interesting rules, Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, p.145-154, August 15-18, 1999, San Diego, California, United States. doi:10.1145/312129.312219
[5] K. S. Bose and R. H. Sarma. Delineation of the intimate details of the backbone conformation of pyridine nucleotide coenzymes in aqueous solution. Biochem Biophys Res Commun, October 1975.
[6] Gao Cong, Anthony K. H. Tung, Xin Xu, Feng Pan, Jiong Yang, FARMER: finding interesting rule groups in microarray datasets, Proceedings of the 2004 ACM SIGMOD international conference on Management of data, June 13-18, 2004, Paris, France. doi:10.1145/1007568.1007587
[7] C. Creighton and S. Hanash. Mining gene expression databases for association rules. Bioinformatics, 19, 2003.
[8] S. Doddi, A. Marathe, S. Ravi, and D. Torney. Discovery of association rules in medical data. Med. Inform. Internet. Med., 26:25-33, 2001.
[10] D. J. Glenn and R. A. Maurer. Mrg1 binds to the LIM domain of lhx2 and may function as a coactivator to stimulate glycoprotein hormone α-subunit gene expression. J Biol Chem, 274, December 1999.
[12] Jiawei Han, Jian Pei, Yiwen Yin, Mining frequent patterns without candidate generation, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.1-12, May 15-18, 2000, Dallas, Texas, United States
[13] Daxin Jiang, Jian Pei, Murali Ramanathan, Chun Tang, Aidong Zhang, Mining coherent gene clusters from gene-sample-time microarray data, Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, August 22-25, 2004, Seattle, WA, USA. doi:10.1145/1014052.1014101
[14] D. Jiang, J. Pei, and A. Zhang. A general approach to mining quality pattern-based clusters from gene expression data. In DASFAA 2005. To appear.
[16] M. Kasai, J. Guerrero-Santoro, R. Friedman, E. S. Leman, R. H. Getzenberg, and D. B. DeFranco. The group 3 LIM domain protein paxillin potentiates androgen receptor transactivation in prostate cancer cell lines. Cancer Research, 63:4927-4935, August 2003.
[17] S. Kurimoto, N. Moriyama, K. Takata, S. A. Nozaw, Y. Aso, and H. Hirano. Detection of a glycosphingolipid antigen in bladder cancer cells with monoclonal antibody mrg-1. Histochem J., 1995.
[18] J. Li and L. Wong. Identifying good diagnostic genes or genes groups from gene expression data by using the concept of emerging patterns. Bioinformatics, 18:725-734, 2002.
[19] B. Liu, W. Hsu, and Y. Ma. Integrating classification and association rule mining. In Proc. 1998 Int. Conf. Knowledge Discovery and Data Mining (KDD'98), 1998.
[20] Bing Liu, Wynne Hsu, Yiming Ma, Pruning and summarizing the discovered associations, Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, p.125-134, August 15-18, 1999, San Diego, California, United States. doi:10.1145/312129.312216
[21] M. Nagata, H. Fujita, H. Ida, H. Hoshina, T. Inoue, Y. Seki, M. Ohnishi, T. Ohyama, S. Shingaki, M. Kaji, T. Saku, and R. Takagi. Identification of potential biomarkers of lymph node metastasis in oral squamous cell carcinoma by cDNA microarray analysis. International Journal of Cancer, 106:683-689, June 2003.
[22] Raymond T. Ng, Laks V. S. Lakshmanan, Jiawei Han, Alex Pang, Exploratory mining and pruning optimizations of constrained associations rules, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.13-24, June 01-04, 1998, Seattle, Washington, United States
[23] Feng Pan, Gao Cong, Anthony K. H. Tung, Jiong Yang, Mohammed J. Zaki, Carpenter: finding closed patterns in long biological datasets, Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2003, Washington, D.C. doi:10.1145/956750.956832
[26] J. L. Pfaltz and C. M. Taylor. Closed set mining of biological data. Workshop on Data Mining in Bioinformatics, pages 43-48, 2002.
[27] J. R. Quinlan. Bagging, boosting, and C4.5. In Proc. 1996 Nat. Conf. Artificial Intelligence (AAAI'96), volume 1, pages 725-730, Portland, OR, Aug. 1996.
[31] M. Zaki and C. Hsiao. Charm: An efficient algorithm for closed association rule mining. In Proc. of SDM 2002, 2002.
CITED BY 13
Wei Fan, Kun Zhang, Hong Cheng, Jing Gao, Xifeng Yan, Jiawei Han, Philip Yu, Olivier Verscheure, Direct mining of discriminative and essential frequent patterns via model-based search tree, Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA
Hongyan Liu, Xiaoyu Wang, Jun He, Jiawei Han, Dong Xin, Zheng Shao, Top-down mining of frequent closed patterns from very high dimensional data, Information Sciences: an International Journal, v.179 n.7, p.899-924, March, 2009
Guihua Sun, Gao Cong, Xiaohua Liu, Chin-Yew Lin, Ming Zhou, Mining sequential patterns and tree patterns to detect erroneous sentences, Proceedings of the 22nd national conference on Artificial intelligence, p.925-930, July 22-26, 2007, Vancouver, British Columbia, Canada