Using an improved C4.5 for imbalanced dataset of intrusion
Source: PST; Vol. 380
Proceedings of the 2006 International Conference on Privacy, Security and Trust: Bridge the Gap Between PST Technologies and Business Services
Markham, Ontario, Canada
SESSION: Short papers: Security applications
Article No. 67  
Year of Publication: 2006
ISBN:1-59593-604-1
Authors
Quan Zhou  Nanjing University, Nanjing
Lin-gang Gu  Jiangsu Provincial Security Bureau of PRC
Chong-jun Wang  Nanjing University, Nanjing
Wang-jun  Nanjing University, Nanjing
Shi-fu Chen  Nanjing University, Nanjing
Sponsor
SIGSAC: ACM Special Interest Group on Security, Audit, and Control
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 5,   Downloads (12 Months): 30,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/1501434.1501513

ABSTRACT

Class imbalance in a dataset directly affects the precision of a classifier. This paper proposes PC4.5, an improved C4.5 algorithm. Experiments on the MIT dataset indicate that PC4.5 is effective on imbalanced datasets and that the size of the decision tree can be reduced.
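
To make the abstract's opening claim concrete, the following is a minimal sketch of how class imbalance can mask poor performance on the rare class. It is not PC4.5 and is not taken from the paper: the class counts, the labels "normal" and "attack", and the helper functions `accuracy` and `per_class_recall` are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical imbalanced data: 990 "normal" records and 10 "attack" records.
y_true = ["normal"] * 990 + ["attack"] * 10

# A degenerate classifier that always predicts the majority class.
y_pred = ["normal"] * 1000

acc = accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)

print(acc)     # 0.99
print(recall)  # {'normal': 1.0, 'attack': 0.0}
```

Under these assumed counts, overall accuracy is 99% even though no attack record is detected, which is why results on imbalanced intrusion data are usually read per class rather than in aggregate.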


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1.
2. Zhang Qi-rui, Zhang Lin, Dong Shou-bin. The influence of class distribution on Text Categorization. Journal of Tsinghua University (Science and Technology), 2005 (45): 1802--1805.
3.
4. Chris Drummond and Robert C. Holte. C4.5, Class Imbalance, and Cost Sensitivity: Why Under-Sampling beats Over-Sampling. In Proceedings of Learning from Imbalanced Datasets II, ICML, Washington DC, 2003.
5. Prati, R. C., Batista, G. E. A. P. A., and Monard, M. C. Class Imbalances versus Class Overlapping: an Analysis of a Learning System Behavior. In MICAI (2004), pp. 312--321. LNAI 2972.
6. A. Kolcz, A. Chowdhury, and J. Alspector. Data Duplication: An Imbalance Problem? ICML 2003 Workshop on Learning from Imbalanced Datasets, Washington, DC, USA, 2003.