|
Warning: The download time has expired please click on the item to try again.
ABSTRACT
The Internet is teeming with high variability phenomena, from measured IP flow sizes to aspects of inferred router-level connectivity, but there still exists considerable debate about how best to deal with this encountered high variability and model it. While one popular approach favors modeling highly variable event sizes with conventional, finite variance distributions such as lognormal or Weibull distributions, Mandelbrot has argued for the last 40 years that there are compelling mathematical, statistical, and practical reasons for why infinite variance distributions are natural candidates for capturing the essence behind high variability phenomena. In this paper, we elaborate on Mandelbrot's arguments and present a methodology that often allows for a clear distinction between the two approaches. In particular, by requiring the resulting models to be resilient to ambiguities (i.e., robust to real-world deficiencies in the underlying network measurements) and internally self-consistent (i.e., insensitive with respect the duration, location, or time of the data collection), we provide a rigorous framework for a qualitative assessment of the observed high variability. We apply the proposed framework to assess previously reported findings about measured Internet traffic and inferred router- and AS-level connectivity. In the process, we also discuss what our approach has to say about recent discussions concerning network traffic being Poisson or self-similar and router-level or AS-level connectivity graphs of the Internet being scale-free or not.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
R. Albert, H. Jeong, and A.-L. Barabási. Attack and error tolerance of complex networks, Nature 406, pp. 378--382, 2000.
|
| |
2
|
D. Alderson. Technological and Economic Drivers and Constraints in the Internet's "Last Mile", Technical Report CIT-CDS-04-004, California Institute of Technology, 2004.
|
| |
3
|
A. Bookstein Informetric distributions, Part II: Resilience to ambiguity Journal of the Amer. Soc. for Information Science, Vol. 41, No. 5, pp. 376--386, 1990.
|
| |
4
|
Andrei Broder , Ravi Kumar , Farzin Maghoul , Prabhakar Raghavan , Sridhar Rajagopalan , Raymie Stata , Andrew Tomkins , Janet Wiener, Graph structure in the Web, Proceedings of the 9th international World Wide Web conference on Computer networks : the international journal of computer and telecommunications netowrking, p.309-320, June 2000, Amsterdam, The Netherlands
|
| |
5
|
J. Cao, W. S. Cleveland, D. Lin, and D. X. Sun. Internet traffic tends toward Poisson and independent as the load increases. in: Nonlinear Estimation and Classification, C. Holmes, D. Denison, M. Hansen, B. Yu, and B. Mallick (Eds.), Springer-Verlag, New York, 2002.
|
 |
6
|
Hyunseok Chang , Ramesh Govindan , Sugih Jamin , Scott J. Shenker , Walter Willinger, Towards capturing representative AS-level Internet topologies, Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, June 15-19, 2002, Marina Del Rey, California
|
| |
7
|
Q. Chen, H. Chang, R. Govindan, S. Jamin, S. Shenker, and W. Willinger. The Origin of power laws in Internet topologies revisited. Proc. IEEE INFOCOM '02, New York, NY, 2002.
|
| |
8
|
K. Claffy, H.-W. Braun, and G. Polyzos. A parameterizable methodology for Internet traffic flow profiling, IEEE Journal on Selected Areas in Communications 13, pp. 1481--1494, 1995.
|
| |
9
|
Cooperative Association for Internet Data Analysis (CAIDA), Skitter. Available from www.caida.org/tools/measurement/skitter/.
|
| |
10
|
A. B. Downey. The structural cause for file size distribution. www.allendowney.com/research/filesize/cds-tr25 2000.ps.gz, 2000.
|
 |
11
|
|
| |
12
|
A. B. Downey. Lognormal and Pareto distributions in the Internet. www.allendowney.com/research/longtail/downey03lognormal.pdf, 2003.
|
| |
13
|
D. Draper, J. S. Hodges, C. L. Mallows and D. Pregibon. Exchangeability and data analysis, J. R. Statist. Soc. A 156, pp. 9--37, 1993.
|
 |
14
|
Nick Duffield , Carsten Lund , Mikkel Thorup, Estimating flow distributions from sampled flow statistics, Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications, August 25-29, 2003, Karlsruhe, Germany
[doi> 10.1145/863955.863992]
|
 |
15
|
Michalis Faloutsos , Petros Faloutsos , Christos Faloutsos, On power-law relationships of the Internet topology, Proceedings of the conference on Applications, technologies, architectures, and protocols for computer communication, p.251-262, August 30-September 03, 1999, Cambridge, Massachusetts, United States
|
| |
16
|
A. Feldmann, Characteristics of TCP connection arrivals, in: Self-Similar Network Traffic and Performance Evaluation, K. Park and W. Willinger (Eds.), J. Wiley & Sons, New York, pp. 367--400, 2000.
|
| |
17
|
W. Feller. An Introduction to Probability Theory and its Applications, Volume II, Second Edition. Wiley & Sons, New York, 1966.
|
| |
18
|
Daniel R. Figueiredo , Benyuan Liu , Anja Feldmann , Vishal Misra , Don Towsley , Walter Willinger, On TCP and self-similar traffic, Performance Evaluation, v.61 n.2-3, p.129-141, July 2005
[doi> 10.1016/j.peva.2004.11.004]
|
| |
19
|
|
| |
20
|
|
| |
21
|
R. Govindan and H. Tangmunarunkit. Heuristics for Internet Map Discovery, Proceeding of IEEE INFOCOM '00, 2000.
|
| |
22
|
N.L. Johnson, S. Kotz, and N. Balakrishnan. Continuous Univariate Distributions, Volume1, Second Edition J. Wiley & Sons, New York, 1994.
|
| |
23
|
T. Karagiannis, M. Faloutsos, M. Molle, and A. Broido. A nonstationary Poisson view of Internet traffic. Proc. IEEE INFOCOM '04, 2004.
|
| |
24
|
T. G. Kurtz. Limit Theorems for Workload Input Models, in: Stochastic Networks: Theory and Applications, F.P. Kelly, S. Zachary, and I. Ziedins (Eds.), pp. 119--140, Oxford University Press, 1996.
|
| |
25
|
A. Lakhina, J. W. Byers, M. Crovella, and P. Xie. Sampling biases in IP topology measurements. Proc. IEEE INFOCOM '03, 2003.
|
 |
26
|
Lun Li , David Alderson , Walter Willinger , John Doyle, A first-principles approach to understanding the internet's router-level topology, Proceedings of the 2004 conference on Applications, technologies, architectures, and protocols for computer communications, August 30-September 03, 2004, Portland, Oregon, USA
|
| |
27
|
B. B. Mandelbrot. New methods in statistical economics, Journal of Political Economics 71, pp. 421--440, 1963.
|
| |
28
|
B. B. Mandelbrot. Fractals and Scaling in Finance. Springer-Verlag, New York, 1997.
|
| |
29
|
M. Mitzenmacher. A brief history of generative models for power law and lognormal distributions. Internet Mathematics, Vol. 1, No. 2, pp. 226--251, 2004.
|
| |
30
|
P. Newman, T. Lyons, and G. Minshall. Flow labelled IP: A connectionless approach to ATM, Proc. IEEE INFOCOM '96, 1996.
|
 |
31
|
|
| |
32
|
Passive Measurement and Analysis (PMA) Project, http://pma.nlanr.net/
|
| |
33
|
|
| |
34
|
S. I. Resnick. Heavy tail modeling and teletraffic data. Annals of Statistics 25, pp. 1805--1869, 1997.
|
| |
35
|
Route Views, University of Oregon Route Views Project, Available at http://www.antc.uoregon.edu/route-views/.
|
 |
36
|
Neil Spring , Ratul Mahajan , David Wetherall, Measuring ISP topologies with rocketfuel, Proceedings of the 2002 conference on Applications, technologies, architectures, and protocols for computer communications, August 19-23, 2002, Pittsburgh, Pennsylvania, USA
|
| |
37
|
G. Samorodnitsky and M. S. Taqqu. Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance. Chapman and Hall, New York, 1994.
|
| |
38
|
|
 |
39
|
|
 |
40
|
Renata Teixeira , Keith Marzullo , Stefan Savage , Geoffrey M. Voelker, In search of path diversity in ISP networks, Proceedings of the 3rd ACM SIGCOMM conference on Internet measurement, October 27-29, 2003, Miami Beach, FL, USA
[doi> 10.1145/948205.948247]
|
| |
41
|
J. W. Tukey. Data analysis and behavioral science or learning to bear the quantitative's man burden by shunning badmandments, in: The Collected Works of John W. Tukey, L. W. Jones (Ed.), Vol. III, Wadsworth, Monterey, CA, 1986.
|
| |
42
|
A. Veres and M. Boda. The chaotic nature of TCP congestion control. Proc. IEEE INFOCOM '00, 2000.
|
| |
43
|
W. Willinger, R. Govindan, S. Jamin, V. Paxson and S. Shenker. Scaling phenomena in the Internet: Critically examining criticality. Proc. Nat. Acad. Sci. 99, suppl. 1, pp. 2573--2580, 2002.
|
| |
44
|
W. Willinger and V. Paxson, Where Mathematics meets the Internet, Notices of the AMS 45, pp. 961--970, 1998.
|
| |
45
|
S.-H. Yook, H. Jeong, and A.-L. Barabási. Modeling the Internet's large-scale topology. Proc. Nat. Acad. Sci. 99, pp. 13382--13386, 2002.
|
| |
46
|
X. Zhu, J. Yu, and J. C. Doyle. Heavy tails, generalized coding, and optimal Web layout. Proc. IEEE INFOCOM '01, 2001.
|
CITED BY 8
|
|
David Alderson , Lun Li , Walter Willinger , John C. Doyle, Understanding internet topology: principles, models, and validation, IEEE/ACM Transactions on Networking (TON), v.13 n.6, p.1205-1218, December 2005
|
|
|
|
|
|
Daniel R. Figueiredo , Benyuan Liu , Anja Feldmann , Vishal Misra , Don Towsley , Walter Willinger, On TCP and self-similar traffic, Performance Evaluation, v.61 n.2-3, p.129-141, July 2005
|
|
|
Alan Mislove , Massimiliano Marcon , Krishna P. Gummadi , Peter Druschel , Bobby Bhattacharjee, Measurement and analysis of online social networks, Proceedings of the 7th ACM SIGCOMM conference on Internet measurement, October 24-26, 2007, San Diego, California, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
INDEX TERMS
Primary Classification:
G.
Mathematics of Computing
G.3
PROBABILITY AND STATISTICS
Additional Classification:
G.
Mathematics of Computing
G.3
PROBABILITY AND STATISTICS
Subjects:
Distribution functions
General Terms:
Experimentation,
Measurement,
Verification
Keywords:
Pareto distributions,
borrowing strength,
heavy-tailed distributions,
high variability,
lognormal distributions,
model consistency,
model robustness,
power-laws,
scaling distributions,
sequential moment plots
|