| Web user-session inference by means of clustering techniques |
| Source
|
IEEE/ACM Transactions on Networking (TON)
Volume 17, Issue 2 (April 2009)
Pages 405-416
Year of Publication: 2009
ISSN: 1063-6692
|
|
Authors
|
Andrea Bianco, Dipartimento di Elettronica, Politecnico di Torino, Torino, Italy
Gianluca Mardente, Cisco Systems, San Jose, CA
Marco Mellia, Dipartimento di Elettronica, Politecnico di Torino, Torino, Italy
Maurizio Munafò, Dipartimento di Elettronica, Politecnico di Torino, Torino, Italy
Luca Muscariello, France Telecom R&D, Issy-Les-Moulineaux, France
|
| Publisher |
IEEE Press
Piscataway, NJ, USA
|
| Bibliometrics |
Downloads (6 Weeks): 20, Downloads (12 Months): 90, Citation Count: 0
|
|
|
ABSTRACT
This paper focuses on the definition and identification of "Web user-sessions", aggregations of several TCP connections generated by the same source host. Identifying a user-session is nontrivial. Traditional approaches rely on threshold-based mechanisms. However, these techniques are very sensitive to the value chosen for the threshold, which may be difficult to set correctly. By applying clustering techniques, we define a novel methodology to identify Web user-sessions without requiring an a priori definition of threshold values. We define a clustering-based approach, discuss its pros and cons, and apply it to real traffic traces. The proposed methodology is also applied to artificially generated traces to evaluate its benefits against traditional threshold-based approaches. Finally, we analyze the characteristics of user-sessions extracted by the clustering methodology from real traces and study their statistical properties. Web user-sessions tend to be Poisson, but correlation may arise during periods of anomalous network or host behavior.
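The threshold sensitivity mentioned in the abstract can be illustrated with a minimal sketch. It assumes the simplest threshold rule: connections from one host belong to the same session when the gap between consecutive connection start times does not exceed a fixed threshold. The function name `split_sessions`, the sample timestamps, and the threshold values are hypothetical, chosen only for illustration; this is not the paper's methodology, which replaces the fixed threshold with clustering.

```python
from typing import List


def split_sessions(start_times: List[float], threshold: float) -> List[List[float]]:
    """Group connection start times (in seconds) from a single host into sessions.

    Assumed rule: a new session begins whenever the gap to the previous
    connection start exceeds `threshold`.
    """
    sessions: List[List[float]] = []
    for t in sorted(start_times):
        if sessions and t - sessions[-1][-1] <= threshold:
            # Gap is within the threshold: same session as the previous connection.
            sessions[-1].append(t)
        else:
            # First connection, or gap too large: open a new session.
            sessions.append([t])
    return sessions


# Hypothetical connection start times for one source host (seconds).
times = [0.0, 1.0, 2.5, 40.0, 41.0, 200.0, 203.0, 260.0]

# The number of inferred sessions changes with the chosen threshold.
for threshold in (10.0, 50.0, 100.0):
    print(threshold, len(split_sessions(times, threshold)))
```

On these sample timestamps the same trace yields a different number of sessions for each threshold, which is the kind of dependence on the threshold value that motivates a method without an a priori threshold.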