Mining top-K covering rule groups for gene expression data
Source: International Conference on Management of Data
Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data
Baltimore, Maryland
SESSION: Research papers: mining biological and medical data
Pages: 670 - 681
Year of Publication: 2005
ISBN: 1-59593-060-4
Authors
Gao Cong, University of Edinburgh
Kian-Lee Tan, National University of Singapore
Anthony K. H. Tung, National University of Singapore
Xin Xu, National University of Singapore
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 6, Downloads (12 Months): 83, Citation Count: 13
DOI: http://doi.acm.org/10.1145/1066157.1066234

ABSTRACT

In this paper, we propose a novel algorithm to discover the top-k covering rule groups for each row of gene expression profiles. Several experiments on real bioinformatics datasets show that the new top-k covering rule mining algorithm is orders of magnitude faster than previous association rule mining algorithms. Furthermore, we propose a new classification method, RCBT. The RCBT classifier is constructed from the top-k covering rule groups, and the rule groups generated for building RCBT are bounded in number. This is in contrast to existing rule-based classification methods like CBA [19], which, despite generating an excessive number of redundant rules, are still unable to cover some training data with the discovered rules. Experiments show that the RCBT classifier can match or outperform other state-of-the-art classifiers on several benchmark gene expression datasets. In addition, the top-k covering rule groups themselves directly provide insights into the mechanisms responsible for diseases.
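
The notion of "top-k covering rule groups for each row" can be illustrated on toy data. The sketch below is a brute-force illustration only, not the algorithm proposed in the paper, and it rests on several assumptions that the abstract does not state: each row is a set of discretized items (the names `g1`, `g2`, `g3` are made up), rules of the form antecedent -> class that are matched by exactly the same rows form one rule group, and groups are ranked by confidence and then by support. The function names `rule_groups` and `top_k_covering` are hypothetical.

```python
from itertools import combinations


def rule_groups(rows, labels, target):
    """Enumerate every itemset and group antecedents by the rows they match."""
    items = sorted(set().union(*rows))
    groups = {}  # frozenset of matching row indices -> largest antecedent seen
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            antecedent = frozenset(candidate)
            covered = frozenset(
                i for i, row in enumerate(rows) if antecedent <= row
            )
            if not covered:
                continue
            # Antecedents matched by exactly the same rows share one group;
            # sizes grow across iterations, so the largest one is kept.
            groups[covered] = antecedent
    result = []
    for covered, antecedent in groups.items():
        # Support: matching rows of the target class (assumed definition).
        support = sum(1 for i in covered if labels[i] == target)
        result.append({
            "antecedent": antecedent,
            "rows": covered,
            "support": support,
            "confidence": support / len(covered),
        })
    return result


def top_k_covering(rows, labels, target, k, min_support=1):
    """For each row of class `target`, keep the k best groups that match it."""
    groups = [
        g for g in rule_groups(rows, labels, target)
        if g["support"] >= min_support
    ]
    # Assumed ranking: higher confidence first, then higher support;
    # the sorted antecedent only makes ties deterministic.
    groups.sort(
        key=lambda g: (-g["confidence"], -g["support"], sorted(g["antecedent"]))
    )
    return {
        i: [g for g in groups if i in g["rows"]][:k]
        for i, label in enumerate(labels)
        if label == target
    }


# Toy data: four samples, each a set of made-up discretized gene items.
rows = [{"g1", "g2"}, {"g1", "g3"}, {"g2", "g3"}, {"g3"}]
labels = ["cancer", "cancer", "normal", "normal"]
result = top_k_covering(rows, labels, "cancer", k=2)
for row_index, best in result.items():
    print(row_index, [sorted(g["antecedent"]) for g in best])
```

Because at most k groups are kept per row, the total number of groups retained is at most k times the number of rows of the target class, which is one way to read the abstract's remark that the rule groups are bounded in number. The exhaustive enumeration here is exponential in the number of items and is only workable on toy inputs.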


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
[2] T. R. Anderson and T. A. Slotkin. Maturation of the adrenal medulla--iv. effects of morphine. Biochem Pharmacol, August 1975.
[5] K. S. Bose and R. H. Sarma. Delineation of the intimate details of the backbone conformation of pyridine nucleotide coenzymes in aqueous solution. Biochem Biophys Res Commun, October 1975.
[7] C. Creighton and S. Hanash. Mining gene expression databases for association rules. Bioinformatics, 19, 2003.
[8] S. Doddi, A. Marathe, S. Ravi, and D. Torney. Discovery of association rules in medical data. Med. Inform. Internet. Med., 26:25--33, 2001.
[10] D. J. Glenn and R. A. Maurer. Mrg1 binds to the lim domain of lhx2 and may function as a coactivator to stimulate glycoprotein hormone α-subunit gene expression. J Biol Chem, 274, December 1999.
[14] D. Jiang, J. Pei, and A. Zhang. A general approach to mining quality pattern-based clusters from gene expression data. In DASFAA 2005. To Appear.
[16] M. Kasai, J. Guerrero-Santoro, R. Friedman, E. S. Leman, R. H. Getzenberg, and D. B. DeFranco. The group 3 lim domain protein paxillin potentiates androgen receptor transactivation in prostate cancer cell lines. Cancer Research, 63:4927--4935, August 2003.
[17] S. Kurimoto, N. Moriyama, K. Takata, S. A. Nozaw, Y. Aso, and H. Hirano. Detection of a glycosphingolipid antigen in bladder cancer cells with monoclonal antibody mrg-1. Histochem J., 1995.
[18] J. Li and L. Wong. Identifying good diagnostic genes or genes groups from gene expression data by using the concept of emerging patterns. Bioinformatics, 18:725--734, 2002.
[19] B. Liu, W. Hsu, and Y. Ma. Integrating classification and association rule mining. In Proc. 1998 Int. Conf. Knowledge Discovery and Data Mining (KDD'98), 1998.
[21] M. Nagata, H. Fujita, H. Ida, H. Hoshina, T. Inoue, Y. Seki, M. Ohnishi, T. Ohyama, S. Shingaki, M. Kaji, T. Saku, and R. Takagi. Identification of potential biomarkers of lymph node metastasis in oral squamous cell carcinoma by cdna microarray analysis. International Journal of Cancer, 106:683--689, June 2003.
[26] J. L. Pfaltz and C. M. Taylor. Closed set mining of biological data. Workshop on Data Mining in Bioinformatics, pages 43--48, 2002.
[27] J. R. Quinlan. Bagging, boosting, and C4.5. In Proc. 1996 Nat. Conf. Artificial Intelligence (AAAI'96), volume 1, pages 725--730, Portland, OR, Aug. 1996.
[31] M. Zaki and C. Hsiao. Charm: An efficient algorithm for closed association rule mining. In Proc. of SDM 2002, 2002.
