Traffic classification using clustering algorithms
Source: Joint International Conference on Measurement and Modeling of Computer Systems
Proceedings of the 2006 SIGCOMM workshop on Mining network data
Pisa, Italy
Pages: 281 - 286  
Year of Publication: 2006
ISBN:1-59593-569-X
Authors
Jeffrey Erman  University of Calgary, Calgary, AB, Canada
Martin Arlitt  University of Calgary, Calgary, AB, Canada
Anirban Mahanti  University of Calgary, Calgary, AB, Canada
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 39,   Downloads (12 Months): 236,   Citation Count: 15
DOI: http://doi.acm.org/10.1145/1162678.1162679

ABSTRACT

Classification of network traffic using port-based or payload-based analysis is becoming increasingly difficult, with many peer-to-peer (P2P) applications using dynamic port numbers, masquerading techniques, and encryption to avoid detection. An alternative approach is to classify traffic by exploiting the distinctive characteristics of applications when they communicate on a network. We pursue this latter approach and demonstrate how cluster analysis can be used to effectively identify groups of traffic that are similar using only transport layer statistics. Our work considers two unsupervised clustering algorithms, namely K-Means and DBSCAN, that have not previously been used for network traffic classification. We evaluate these two algorithms and compare them to the previously used AutoClass algorithm, using empirical Internet traces. The experimental results show that both K-Means and DBSCAN work very well and run much more quickly than AutoClass. Our results indicate that although DBSCAN has lower accuracy compared to K-Means and AutoClass, DBSCAN produces better clusters.
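
To make the two algorithms named in the abstract concrete, here is a minimal Python sketch of textbook K-Means and DBSCAN applied to a handful of flows. It is an illustration only, not the authors' implementation: the two per-flow features, the values of `k`, `eps`, and `min_pts`, and the data itself are all assumptions invented for this example, and none of them come from the paper.

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Textbook Lloyd's K-Means: returns a cluster index (0..k-1) per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(xs) / len(members) for xs in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def dbscan(points, eps, min_pts):
    """Textbook DBSCAN: returns a cluster index per point, or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # All points within eps of point i (the point itself included).
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_nbrs)
    return labels

# Hypothetical, already-normalised transport-layer features per flow,
# e.g. (mean packet size, flow duration). These values are made up.
flows = [
    (0.10, 0.20), (0.15, 0.22), (0.12, 0.18), (0.11, 0.25),  # one dense group
    (0.80, 0.90), (0.85, 0.88), (0.82, 0.95), (0.79, 0.91),  # another group
    (0.50, 0.05),                                            # isolated flow
]

km_labels = kmeans(flows, k=2)                   # assumed k
db_labels = dbscan(flows, eps=0.1, min_pts=3)    # assumed eps and min_pts

print("K-Means labels:", km_labels)
print("DBSCAN labels: ", db_labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

On this toy input DBSCAN marks the isolated flow as noise (label -1), whereas K-Means always assigns every flow to one of its `k` clusters. This is a general property of the two algorithms and is shown here only as background for the comparison the abstract reports.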


