Semi-supervised co-training and active learning based approach for multi-view intrusion detection
Source
Symposium on Applied Computing
Proceedings of the 2009 ACM Symposium on Applied Computing
Honolulu, Hawaii
SESSION: Computer security track
Pages: 2042-2048
Year of Publication: 2009
ISBN: 978-1-60558-166-8
Authors
Ching-Hao Mao, National Taiwan University of Science and Technology, Taipei, Taiwan
Hahn-Ming Lee, National Taiwan University of Science and Technology, Taipei, Taiwan, and Academia Sinica, Taipei, Taiwan
Devi Parikh, Carnegie Mellon University, Pittsburgh, Pennsylvania
Tsuhan Chen, Carnegie Mellon University, Pittsburgh, Pennsylvania
Si-Yu Huang, National Taiwan University of Science and Technology, Taipei, Taiwan
Sponsor
SIGAPP: ACM Special Interest Group on Applied Computing
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1529282.1529735

ABSTRACT

Although an immense amount of data is available from networks and hosts, only a very small proportion of it is labeled, because expert labels are costly to obtain. This is a significant bottleneck for developing supervised intrusion detection systems, which rely solely on labeled data. Although the unlabeled data is collected from real network environments and hence potentially holds valuable information for intrusion detection, such systems cannot exploit it. In this work, we leverage both labeled and unlabeled data. Intrusion detection tasks also lend themselves naturally to a multi-view scenario and can benefit significantly if the multiple views are combined meaningfully. In this paper, we propose a co-training framework for intrusion detection; co-training is a semi-supervised learning method that can both utilize unlabeled data and combine multi-view data. We also employ an active learning framework in which statistically ambiguous parts of the unlabeled data are identified and can then be labeled by an expert. This keeps expert labeling to a minimum while ensuring that the labels obtained from the expert are the most informative. In our experiments, we demonstrate that leveraging the unlabeled data with our proposed method significantly reduces the error rate compared with using the labeled data alone. In addition, our proposed multi-view method has a lower error rate than a single-view method.
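
The abstract does not give the details of the authors' algorithm, so the following is only a minimal sketch of generic two-view co-training followed by an uncertainty-based active learning query. Everything in it is an assumption made for illustration: the nearest-centroid classifier, the synthetic data, the margin used as a confidence score, and the names fit, predict, co_train and most_ambiguous. Each sample is a pair of feature vectors, one per view (for example, hypothetical network and host features).

```python
import random

def centroid(rows):
    """Component-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit(labeled, view):
    """Nearest-centroid model for one view: {label: centroid}."""
    by_label = {}
    for sample, label in labeled:
        by_label.setdefault(label, []).append(sample[view])
    return {label: centroid(rows) for label, rows in by_label.items()}

def predict(model, x):
    """Return (label, margin); a small margin means an ambiguous sample."""
    dists = sorted((sq_dist(c, x), label) for label, c in model.items())
    return dists[0][1], dists[1][0] - dists[0][0]

def co_train(labeled, unlabeled, rounds=3, per_round=4):
    """Each view's classifier pseudo-labels its most confident samples,
    which then join the labeled set shared by both views."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            if not pool:
                break
            model = fit(labeled, view)
            ranked = sorted(pool, key=lambda s: predict(model, s[view])[1],
                            reverse=True)
            for sample in ranked[:per_round]:
                labeled.append((sample, predict(model, sample[view])[0]))
                pool.remove(sample)
    return labeled, pool

def most_ambiguous(labeled, pool, k=3):
    """Active learning query: samples with the smallest summed margin
    over both views are the ones to send to a human expert."""
    models = [fit(labeled, 0), fit(labeled, 1)]
    def combined_margin(sample):
        return sum(predict(m, sample[v])[1] for v, m in enumerate(models))
    return sorted(pool, key=combined_margin)[:k]

def make_sample(rng, label):
    """Synthetic two-view sample; class 0 is 'normal', class 1 'attack'."""
    centre = 2.0 * label
    return ([rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)],
            [rng.gauss(-centre, 0.5)])

rng = random.Random(0)
truth = [i % 2 for i in range(60)]
samples = [make_sample(rng, y) for y in truth]
labeled = [(samples[i], truth[i]) for i in range(4)]  # two labels per class
unlabeled = samples[4:]

grown, remaining = co_train(labeled, unlabeled)
queries = most_ambiguous(grown, remaining)
print(len(grown), len(remaining), len(queries))
```

In this sketch the expert is asked only about the samples returned by most_ambiguous, which mirrors the idea of spending the labeling budget where the classifiers are least certain. A real system would replace the toy classifier and the synthetic views with models and features suited to the data at hand.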

