Maximally informative k-itemsets and their efficient discovery
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Philadelphia, PA, USA
SESSION: Research track papers
Pages: 237 - 244  
Year of Publication: 2006
ISBN: 1-59593-339-5
Authors
Arno J. Knobbe  Kiminkii, Houten, The Netherlands & Utrecht University, Utrecht, The Netherlands
Eric K. Y. Ho  Kiminkii, Houten, The Netherlands
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1150402.1150431

ABSTRACT

In this paper we present a new approach to mining binary data. We treat each binary feature (item) as a means of distinguishing two sets of examples. Our interest is in selecting, from the total set of items, an itemset of specified size such that the database is partitioned with as uniform a distribution over the parts as possible. To achieve this goal, we propose the use of joint entropy as a quality measure for itemsets, and refer to optimal itemsets of cardinality k as maximally informative k-itemsets. We claim that this approach both maximises distinctive power and minimises redundancy within the feature set. We present a number of algorithms for computing optimal itemsets efficiently.
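
To make the quality measure concrete, the following is a minimal Python sketch of joint entropy over a binary database, together with a naive exhaustive search for a k-itemset of maximal joint entropy. It is an illustration only and is not one of the algorithms presented in the paper. The function names `joint_entropy` and `best_k_itemset`, the row-of-tuples data layout, and the toy data are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations
from math import log2


def joint_entropy(rows, itemset):
    """Joint entropy (in bits) of the partition that `itemset` induces on `rows`.

    `rows` is a sequence of equal-length tuples of 0/1 values (hypothetical
    layout); `itemset` is a tuple of column indices.
    """
    n = len(rows)
    # Each distinct combination of item values defines one part of the partition.
    parts = Counter(tuple(row[i] for i in itemset) for row in rows)
    return -sum((c / n) * log2(c / n) for c in parts.values())


def best_k_itemset(rows, k):
    """Exhaustively search all k-itemsets and return one with maximal joint entropy.

    This brute-force search is for illustration only; its cost grows
    combinatorially with the number of items.
    """
    n_items = len(rows[0])
    return max(combinations(range(n_items), k),
               key=lambda s: joint_entropy(rows, s))


# Toy database: items 0 and 1 are identical (redundant), item 2 is not.
rows = [
    (0, 0, 0),
    (0, 0, 1),
    (1, 1, 0),
    (1, 1, 1),
]
print(joint_entropy(rows, (0, 1)))  # redundant pair: two parts of equal size
print(joint_entropy(rows, (0, 2)))  # complementary pair: four parts of equal size
print(best_k_itemset(rows, 2))
```

In this toy database the redundant pair of items splits the rows into only two parts, while a pair of complementary items splits them into four equally sized parts, so the latter scores higher. This reflects the abstract's point that a high joint entropy rewards distinctive power and penalises redundancy.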

