|
Warning: The download time has expired please click on the item to try again.
ABSTRACT
Knowing the distribution of the sizes of traffic flows passing through a network link helps a network operator to characterize network resource usage, infer traffic demands, detect traffic anomalies, and accommodate new traffic demands through better traffic engineering. Previous work on estimating the flow size distribution has been focused on making inferences from sampled network traffic. Its accuracy is limited by the (typically) low sampling rate required to make the sampling operation affordable. In this paper we present a novel data streaming algorithm to provide much more accurate estimates of flow distribution, using a "lossy data structure" which consists of an array of counters fitted well into SRAM. For each incoming packet, our algorithm only needs to increment one underlying counter, making the algorithm fast enough even for 40 Gbps (OC-768) links. The data structure is lossy in the sense that sizes of multiple flows may collide into the same counter. Our algorithm uses Bayesian statistical methods such as Expectation Maximization to infer the most likely flow size distribution that results in the observed counter values after collision. Evaluations of this algorithm on large Internet traces obtained from several sources (including a tier-1 ISP) demonstrate that it has very high measurement accuracy (within 2%). Our algorithm not only dramatically improves the accuracy of flow distribution measurement, but also contributes to the field of data streaming by formalizing an existing methodology and applying it to the context of estimating the flow-distribution.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Nick Duffield , Carsten Lund , Mikkel Thorup, Estimating flow distributions from sampled flow statistics, Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications, August 25-29, 2003, Karlsruhe, Germany
[doi> 10.1145/863955.863992]
|
 |
2
|
|
 |
3
|
|
 |
4
|
|
 |
5
|
Cristian Estan , George Varghese, New directions in traffic measurement and accounting, Proceedings of the 2002 conference on Applications, technologies, architectures, and protocols for computer communications, August 19-23, 2002, Pittsburgh, Pennsylvania, USA
|
 |
6
|
|
 |
7
|
A. Medina , N. Taft , K. Salamatian , S. Bhattacharyya , C. Diot, Traffic matrix estimation: existing techniques and new directions, Proceedings of the 2002 conference on Applications, technologies, architectures, and protocols for computer communications, August 19-23, 2002, Pittsburgh, Pennsylvania, USA
|
 |
8
|
|
 |
9
|
Yin Zhang , Matthew Roughan , Nick Duffield , Albert Greenberg, Fast accurate computation of large-scale IP traffic matrices from link loads, Proceedings of the 2003 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, June 11-14, 2003, San Diego, CA, USA
|
 |
10
|
Yin Zhang , Matthew Roughan , Carsten Lund , David Donoho, An information-theoretic approach to traffic matrix estimation, Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications, August 25-29, 2003, Karlsruhe, Germany
[doi> 10.1145/863955.863990]
|
| |
11
|
S. Muthukrishnan, "Data streams: Algorithms and applications," available at http://athos.rutgers.edu/~muthu/.
|
 |
12
|
|
| |
13
|
|
 |
14
|
|
| |
15
|
A. Dempster, N. Laird, and D. Rubin, "Maximum likelihood from incomplete data via the em algorithm," Journal of the Royal Statistical Society, Series B, vol. 39, no. 1, pp. 1--38, 1977.
|
| |
16
|
A. Kumar, J. Xu, J. Wang, O. Spatschek, and L. Li, "Space-Code Bloom Filter for Efficient per-flow Traffic Measurement," in Proc. IEEE Infocom, Mar. 2004.
|
 |
17
|
|
| |
18
|
D. Moor, V. Paxson, S. Savage, C. Shannon, S. Staniford, and N. Weaver, "The spread of the sapphire/slammer worm," in Technical Report, CAIDA, 2003.
|
 |
19
|
Balachander Krishnamurthy , Subhabrata Sen , Yin Zhang , Yan Chen, Sketch-based change detection: methods, evaluation, and applications, Proceedings of the 3rd ACM SIGCOMM conference on Internet measurement, October 27-29, 2003, Miami Beach, FL, USA
[doi> 10.1145/948205.948236]
|
 |
20
|
|
| |
21
|
|
CITED BY 22
|
|
|
|
|
|
|
|
|
|
|
Min Cai , Jianping Pan , Yu-Kwong Kwok , Kai Hwang, Fast and accurate traffic matrix measurement using adaptive cardinality counting, Proceeding of the 2005 ACM SIGCOMM workshop on Mining network data, August 26-26, 2005, Philadelphia, Pennsylvania, USA
|
|
|
|
|
|
|
|
|
|
|
|
Qi (George) Zhao , Mitsunori Ogihara , Haixun Wang , Jun (Jim) Xu, Finding global icebergs over distributed data sets, Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 26-28, 2006, Chicago, IL, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Vyas Sekar , Michael K. Reiter , Walter Willinger , Hui Zhang , Ramana Rao Kompella , David G. Andersen, CSAMP: a system for network-wide flow monitoring, Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation, p.233-246, April 16-18, 2008, San Francisco, California
|
|
|
|
|
|
|
|
|
George Nychis , Vyas Sekar , David G. Andersen , Hyong Kim , Hui Zhang, An empirical evaluation of entropy-based traffic anomaly detection, Proceedings of the 8th ACM SIGCOMM conference on Internet measurement, October 20-22, 2008, Vouliagmeni, Greece
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Hui Zhang , Guofei Jiang , Kenji Yoshihira , Haifeng Chen , Akhilesh Saxena, Resilient workload manager: taming bursty workload of scaling internet applications, Proceedings of the 6th international conference industry session on Autonomic computing and communications industry session, June 15-15, 2009, Barcelona, Spain
|
|