ABSTRACT
The identification of network applications through observation of associated packet traffic flows is vital to the areas of network management and surveillance. Currently popular methods such as port number and payload-based identification exhibit a number of shortfalls. An alternative is to use machine learning (ML) techniques and identify network applications based on per-flow statistics, derived from payload-independent features such as packet length and inter-arrival time distributions. The performance impact of feature set reduction, using Consistency-based and Correlation-based feature selection, is demonstrated on Naïve Bayes, C4.5, Bayesian Network and Naïve Bayes Tree algorithms. We then show that it is useful to differentiate algorithms based on computational performance rather than classification accuracy alone: although classification accuracy is similar across the algorithms, computational performance can differ significantly.
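As a rough illustration of what "per-flow statistics derived from payload-independent features" can mean in practice, the following Python sketch groups packets by flow and summarises packet lengths and inter-arrival times. It is only a sketch under stated assumptions: the function name `flow_features`, the input tuple layout, the flow key, and the particular statistics (minimum, mean, maximum, standard deviation) are illustrative choices, not the feature set or tooling used in the paper.

```python
from collections import defaultdict
from statistics import mean, pstdev


def flow_features(packets):
    """Summarise each flow with payload-independent statistics.

    `packets` is an iterable of (timestamp_seconds, flow_key, length_bytes).
    The flow key is assumed to be any hashable identifier, for example a
    5-tuple of addresses, ports and protocol (hypothetical input format).
    """
    flows = defaultdict(list)
    for ts, key, length in packets:
        flows[key].append((ts, length))

    features = {}
    for key, pkts in flows.items():
        pkts.sort()  # order packets within the flow by timestamp
        lengths = [length for _, length in pkts]
        times = [ts for ts, _ in pkts]
        # Inter-arrival times; a single-packet flow has none, so use 0.0.
        gaps = [b - a for a, b in zip(times, times[1:])] or [0.0]
        features[key] = {
            "pkt_count": len(pkts),
            "len_min": min(lengths),
            "len_mean": mean(lengths),
            "len_max": max(lengths),
            "len_std": pstdev(lengths),
            "iat_min": min(gaps),
            "iat_mean": mean(gaps),
            "iat_max": max(gaps),
            "iat_std": pstdev(gaps),
        }
    return features
```

Each resulting dictionary is one feature vector that could be passed to a classifier. Feature selection would then keep only a subset of these keys, which is where the trade-off between classification accuracy and computational cost arises.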