A preliminary performance comparison of five machine learning algorithms for practical IP traffic flow classification
Source: ACM SIGCOMM Computer Communication Review
Volume 36, Issue 5 (October 2006)
FEATURE: Reviewed articles
Pages: 5-16
Year of Publication: 2006
ISSN: 0146-4833
Authors
Nigel Williams  Swinburne University of Technology, Melbourne, Australia
Sebastian Zander  Swinburne University of Technology, Melbourne, Australia
Grenville Armitage  Swinburne University of Technology, Melbourne, Australia
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1163593.1163596

ABSTRACT

The identification of network applications through observation of associated packet traffic flows is vital to the areas of network management and surveillance. Currently popular methods such as port number and payload-based identification exhibit a number of shortfalls. An alternative is to use machine learning (ML) techniques and identify network applications based on per-flow statistics, derived from payload-independent features such as packet length and inter-arrival time distributions. The performance impact of feature set reduction, using Consistency-based and Correlation-based feature selection, is demonstrated on Naïve Bayes, C4.5, Bayesian Network and Naïve Bayes Tree algorithms. We then show that it is useful to differentiate algorithms based on computational performance rather than classification accuracy alone, as although classification accuracy between the algorithms is similar, computational performance can differ significantly.
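
As an illustration of the per-flow statistics the abstract refers to, the following is a minimal sketch that groups packets into flows and computes packet length and inter-arrival time statistics without inspecting payloads. The input tuple layout, the flow key, and the feature names are assumptions made for this example; they are not the feature set or tooling used in the paper.

```python
from collections import defaultdict
from statistics import mean, pstdev


def flow_features(packets):
    """Compute payload-independent statistics for each flow.

    packets: iterable of (timestamp_seconds, flow_key, length_bytes) tuples.
    The flow_key is any hashable identifier (hypothetical here; in practice
    it might be a 5-tuple of addresses, ports, and protocol).
    Returns a dict mapping flow_key to a dict of feature values.
    """
    flows = defaultdict(list)
    for timestamp, key, length in packets:
        flows[key].append((timestamp, length))

    features = {}
    for key, pkts in flows.items():
        pkts.sort()  # order packets by timestamp
        lengths = [length for _, length in pkts]
        times = [timestamp for timestamp, _ in pkts]
        # Inter-arrival times are the gaps between consecutive packets.
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        features[key] = {
            "packets": len(pkts),
            "len_mean": mean(lengths),
            "len_std": pstdev(lengths),
            # A single-packet flow has no gaps; report 0.0 in that case.
            "iat_mean": mean(gaps) if gaps else 0.0,
            "iat_std": pstdev(gaps) if gaps else 0.0,
        }
    return features
```

Each resulting feature dictionary could then serve as one training or test instance for a classifier; reducing the set of features before training is the step the abstract describes as feature selection.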

