ACM Home Page
Please provide us with feedback. Feedback
Bigfoot, sasquatch, the yeti and other missing links: what we don't know about the as graph
Full text PdfPdf (167 KB)
Source
Internet Measurement Conference archive
Proceedings of the 8th ACM SIGCOMM conference on Internet measurement table of contents
Vouliagmeni, Greece
SESSION: Routing and network topology table of contents
Pages 325-330  
Year of Publication: 2008
ISBN:978-1-60558-334-1
Authors
Matthew Roughan  University of Adelaide, Adelaide, Australia
Simon Jonathan Tuke  University of Adelaide, University of Adelaide, Australia
Olaf Maennel  Techniche Universität Berlin and Deutsche Telekom Labs, Berlin, Germany
Sponsors
SIGCOMM: ACM Special Interest Group on Data Communication
SIGMETRICS: ACM Special Interest Group on Measurement and Evaluation
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 45,   Downloads (12 Months): 185,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1452520.1452558
What is a DOI?

ABSTRACT

Study of the Internet's high-level structure has for some time intrigued scientists. The AS-graph (showing interconnections between Autonomous Systems) has been measured, studied, modelled and discussed in many papers over the last decade. However, the quality of the measurement data has always been in question. It is by now well known that most measurements of the AS-graph are missing some set of links. Many efforts have been undertaken to correct this, primarily by increasing the set of measurements, but the issue remains: how much is enough? When will we know that we have enough measurements to be sure we can see all (or almost all) of the links. This paper aims to address the problem of estimating how many links are missing from our measurements. We use techniques pioneered in biostatistics and epidemiology for estimating the size of populations (for instance of fish or disease carriers). It is rarely possible to observe entire populations, and so sampling techniques are used. We extend those techniques to the domain of the AS-graph. The key difference between our work and the biological literature is that all links are not the same, and so we build a stratified model and specify an EM algorithm for estimating its parameters. Our estimates suggest that a very significant number of links (many of thousands) are missing from standard route monitor measurements of the AS-graph. Finally, we use the model to derive the number of monitors that would be needed to see a complete AS-graph with high-probability. We estimate that 700 route monitors would see 99.9% of links.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
J. C. Doyle, D. L. Alderson, L. Li, S. Low, M. Roughan, S. Shalunov, R. Tanaka, and W. Willinger, "The 'robust yet fragile' nature of the Internet," Proceedings of the National Academy of Sciences of the USA, vol. 102, pp. 14497--502, October 2005.
 
3
A. Lakhina, J. Byers, M. Crovella, and P. Xie, "Sampling biases in IP topology measurements," in Proc. IEEE Infocom, April 2003.
 
4
S. E. Fienberg, "The multiple recapture census for closed population and incomplete 2k contingency tables.," Biometrika, vol. 59, no. 3, pp. 591--603, 1972.
 
5
International Working Group for Disease Monitoring and Forecasting, "Capture-recapture and multiple-record systems estimation i: History and theoretical development," American Journal of Epidemiology, vol. 142, no. 10, pp. 1047--1057, 1995.
6
 
7
"University of Oregon Route Views Archive Project." www.routeviews.org.
 
8
"Ripe NCC: routing information service raw data." http://www.ripe.net/projects/ris/.
 
9
Y. He, G. Siganos, M. Faloutsos, and S. V. Krishnamurthy, "A systematic framework for unearthing the missing links: Measurements and impact," in USENIX/SIGCOMM NSDI, (Cambridge, MA, USA), April 2007.
10
 
11
P. F. Rider, "Truncated binomial and negative binomial distributions," Journal of the American Statistical Association, vol. 50, pp. 877--883, Sept. 1955.
 
12
T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer, 2001.
13
14

Collaborative Colleagues:
Matthew Roughan: colleagues
Simon Jonathan Tuke: colleagues
Olaf Maennel: colleagues