On support thresholds in associative classification
Source Symposium on Applied Computing archive
Proceedings of the 2004 ACM symposium on Applied computing table of contents
Nicosia, Cyprus
SESSION: Data mining (DM) table of contents
Pages: 553 - 558  
Year of Publication: 2004
ISBN:1-58113-812-1
Authors
Elena Baralis  Corso Duca degli Abruzzi, Torino, Italy
Silvia Chiusano  Corso Duca degli Abruzzi, Torino, Italy
Paolo Garza  Corso Duca degli Abruzzi, Torino, Italy
Sponsor
SIGAPP: ACM Special Interest Group on Applied Computing
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/967900.968016

ABSTRACT

Associative classification is a well-known technique for structured data classification. Most previous work on associative classification uses support-based pruning for rule extraction and usually sets the support threshold to 1%. This threshold keeps rule extraction tractable and, on average, yields good accuracy. We believe that this threshold may not be appropriate in some cases, since it does not take the class distribution of the dataset into account. In this paper we investigate the effect of the support threshold on classification accuracy. Lower support thresholds are often infeasible with current extraction algorithms, or may cause the generation of a huge rule set. To observe the effect of varying the support threshold, we first propose a compact form to encode a complete rule set. We then develop a new classifier, named L3G, based on the compact form. Taking advantage of the compact form, the classifier can also be built with rules of rather low support. We ran a variety of experiments with different support thresholds on datasets from the UCI machine learning database repository. The experiments showed that the optimal accuracy is obtained for variable threshold values, sometimes lower than 1%.
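To make the concern about class distribution concrete, here is a minimal Python sketch of support-based pruning. It is not the paper's L3G classifier or its compact rule form. It assumes the common definition of rule support (the fraction of all records that contain the rule's items and carry its class label), and the toy data and the helper names `rule_supports` and `prune_by_support` are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def rule_supports(rows, max_len=2):
    """Return the support of every rule `items -> label` with up to max_len items.

    rows: list of (frozenset_of_items, class_label) pairs.
    Support is assumed to be the fraction of all rows that contain
    the items and carry the label.
    """
    counts = Counter()
    for items, label in rows:
        for k in range(1, max_len + 1):
            for subset in combinations(sorted(items), k):
                counts[(frozenset(subset), label)] += 1
    n = len(rows)
    return {rule: count / n for rule, count in counts.items()}


def prune_by_support(supports, min_support):
    """Keep only the rules whose support reaches min_support."""
    return {rule: s for rule, s in supports.items() if s >= min_support}


# Hypothetical, heavily skewed dataset: 995 "neg" records, 5 "pos" records.
rows = (
    [(frozenset({"a=1", "b=0"}), "neg")] * 995
    + [(frozenset({"a=0", "b=1"}), "pos")] * 5
)

supports = rule_supports(rows)
kept_1pct = prune_by_support(supports, 0.01)   # the usual 1% threshold
kept_low = prune_by_support(supports, 0.001)   # a lower, 0.1% threshold

classes_1pct = {label for (_, label) in kept_1pct}
classes_low = {label for (_, label) in kept_low}
print(classes_1pct, classes_low)
```

In this toy dataset the "pos" class covers only 0.5% of the records, so no rule predicting it can reach 1% support, however accurate the rule is. Lowering the threshold to 0.1% retains rules for both classes, at the cost of a larger rule set in realistic data.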


