ABSTRACT
Microarray data provides a perfect riposte to the original assumption underlying association rule mining: large but sparse transaction sets. In a typical microarray, the number of columns (genes) is an order of magnitude larger than the number of rows (experiments). A new family of row-enumerated rule mining algorithms has emerged to facilitate mining in dense sets. However, to date, all the algorithms proposed to mine expression relationships rely on the support measure alone to prune the search space. This is a major shortcoming, as it results in the pruning of many potentially interesting rules that have low support but high confidence. In this paper we propose the MAXCONF algorithm, which exploits the weak downward closure of confidence to mine directly for high-confidence rules. We also provide a means to evaluate the biological significance of the gene relationships identified. An evaluation of MAXCONF and RERII on the BIND database shows that their recall is 94% and 0.15%, respectively.
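The shortcoming described above, that support-based pruning discards rules with low support but high confidence, can be illustrated with a minimal sketch. This is not the MAXCONF algorithm; it only computes the two standard measures on toy data. The gene labels, the ten toy experiments, and the `MIN_SUPPORT` threshold are all hypothetical and chosen purely for illustration.

```python
from typing import FrozenSet, Sequence

Transaction = FrozenSet[str]


def support(itemset: Transaction, transactions: Sequence[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent: Transaction,
               consequent: Transaction,
               transactions: Sequence[Transaction]) -> float:
    """support(antecedent ∪ consequent) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0.0:
        return 0.0  # rule is undefined when the antecedent never occurs
    return support(antecedent | consequent, transactions) / base


# Hypothetical toy data: 10 experiments (rows), items are made-up gene states.
experiments = (
    [frozenset({"geneA_up", "geneB_up"})] * 2
    + [frozenset({"geneC_up"})] * 8
)

MIN_SUPPORT = 0.3  # hypothetical pruning threshold

# Rule under test: geneA_up -> geneB_up
rule_support = support(frozenset({"geneA_up", "geneB_up"}), experiments)
rule_confidence = confidence(
    frozenset({"geneA_up"}), frozenset({"geneB_up"}), experiments
)
pruned_by_support = rule_support < MIN_SUPPORT
```

Under these assumptions, the rule holds in every experiment where `geneA_up` occurs, yet a miner that prunes on the assumed support threshold would never report it.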