Internet traffic classification demystified: myths, caveats, and the best practices
Source: International Conference on Emerging Networking Experiments and Technologies
Proceedings of the 2008 ACM CoNEXT Conference
Madrid, Spain
Article No. 11
Year of Publication: 2008
ISBN: 978-1-60558-210-8
Authors
Hyunchul Kim  CAIDA and Seoul National University
KC Claffy  UC San Diego
Marina Fomenkov  UC San Diego
Dhiman Barman  UC Riverside
Michalis Faloutsos  UC Riverside
KiYoung Lee  UC San Diego
Sponsors
ACM: Association for Computing Machinery
SIGCOMM: ACM Special Interest Group on Data Communication
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1544012.1544023

ABSTRACT

Recent research on Internet traffic classification algorithms has yielded a flurry of proposed approaches for distinguishing types of traffic, but no systematic comparison of the various algorithms. This fragmented approach to traffic classification research leaves the operational community with no basis for consensus on which approach to use when, and how to interpret results. In this work we critically revisit traffic classification by conducting a thorough evaluation of three classification approaches, based on transport layer ports, host behavior, and flow features. A strength of our work is the broad range of data against which we test the three classification approaches: seven traces with payload collected in Japan, Korea, and the US. The diverse geographic locations, link characteristics, and application traffic mix in these data allowed us to evaluate the approaches under a wide variety of conditions. We analyze the advantages and limitations of each approach, evaluate methods to overcome the limitations, and extract insights and recommendations for both the study and practical application of traffic classification. We make our software, classifiers, and data available for researchers interested in validating or extending this work.
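As an illustration of the simplest of the three approaches named in the abstract, classification by transport layer port, the following is a minimal sketch and not the authors' software. The table `WELL_KNOWN_PORTS` and the function `classify_by_port` are hypothetical names introduced here; the port-to-application pairs are standard IANA assignments, and the port list actually used in the paper is not given in this record.

```python
# Hypothetical lookup table of a few standard IANA port assignments.
# This is an assumption for illustration, not the paper's port list.
WELL_KNOWN_PORTS = {
    21: "ftp",
    22: "ssh",
    25: "smtp",
    53: "dns",
    80: "http",
    443: "https",
}


def classify_by_port(src_port: int, dst_port: int) -> str:
    """Label a flow by its ports, or return "unknown" if neither is listed."""
    # Try the destination port first, then fall back to the source port,
    # because either endpoint of a captured flow may be the server side.
    for port in (dst_port, src_port):
        label = WELL_KNOWN_PORTS.get(port)
        if label is not None:
            return label
    return "unknown"


if __name__ == "__main__":
    print(classify_by_port(51234, 80))
    print(classify_by_port(40000, 40001))
```

A lookup like this cannot label applications that use dynamic or non-standard ports, which is one reason approaches based on host behavior and flow features are evaluated alongside it.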



