|
ABSTRACT
A pervasive requirement of distributed systems is to deal with churn-change in the set of participating nodes due to joins, graceful leaves, and failures. A high churn rate can increase costs or decrease service quality. This paper studies how to reduce churn by selecting which subset of a set of available nodes to use.First, we provide a comparison of the performance of a range of different node selection strategies in five real-world traces. Among our findings is that the simple strategy of picking a uniform-random replacement whenever a node fails performs surprisingly well. We explain its performance through analysis in a stochastic model.Second, we show that a class of strategies, which we call "Preference List" strategies, arise commonly as a result of optimizing for a metric other than churn, and produce high churn relative to more randomized strategies under realistic node failure patterns. Using this insight, we demonstrate and explain differences in performance for designs that incorporate varying degrees of randomization. We give examples from a variety of protocols, including anycast, over-lay multicast, and distributed hash tables. In many cases, simply adding some randomization can go a long way towards reducing churn.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
B. C. Arnold. Majorization and the Lorenz order: A Brief Introduction, volume 43. Lecture Notes in Statistics, Springer-Verlag, 1987.
|
| |
2
|
M. Bakkaloglu, J. J. Wylie, C. Wang, and G. R. Ganger. On correlated failures in survivable storage systems. Technical Report CMU-CS-02-129, Carnegie Mellon University, May 2002.
|
 |
3
|
Hitesh Ballani , Paul Francis, Towards a global IP anycast service, Proceedings of the 2005 conference on Applications, technologies, architectures, and protocols for computer communications, August 22-26, 2005, Philadelphia, Pennsylvania, USA
|
| |
4
|
A. Bavier, M. Bowman, B. Chun, D. Culler, S. Karlin, S. Muir, L. Peterson, T. Roscoe, T. Spalink, and M. Wawrzoniak. Operating system support for planetary-scale network services. In Proc. NSDI, 2004.
|
| |
5
|
R. Bhagwan, K. Tati, Y.-C. Cheng, S. Savage, and G. M. Voelker. Total recall: System support for automated availability management. In NSDI, 2004.
|
| |
6
|
C. Blake and R. Rodrigues. High availability, scalable storage, dynamic peer neetworks: Pick two. In Proc. HOTOS, May 2003.
|
 |
7
|
William J. Bolosky , John R. Douceur , David Ely , Marvin Theimer, Feasibility of a serverless distributed file system deployed on an existing set of desktop PCs, Proceedings of the 2000 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, p.34-43, June 18-21, 2000, Santa Clara, California, United States
|
 |
8
|
Frank Dabek , M. Frans Kaashoek , David Karger , Robert Morris , Ion Stoica, Wide-area cooperative storage with CFS, Proceedings of the eighteenth ACM symposium on Operating systems principles, October 21-24, 2001, Banff, Alberta, Canada
|
 |
9
|
|
| |
10
|
Philippe Flajolet , Danièle Gardy , Loÿs Thimonier, Birthday paradox, coupon collectors, caching algorithms and self-organizing search, Discrete Applied Mathematics, v.39 n.3, p.207-229, Nov. 11, 1992
[doi> 10.1016/0166-218X(92)90177-C]
|
 |
11
|
|
| |
12
|
M. J. Freedman, K. Lakshminarayanan, and D. Mazieres. Oasis: Anycast for any service. In NSDI, 2006.
|
| |
13
|
L. Garces-Erice, E. W. Biersack, K. W. Ross, P. A. Felber, and G. Urvoy-Keller. Hierarchical P2P systems. In Proc. ACM/IFIP International Conference on Parallel and Distributed Computing (Euro-Par), 2003.
|
| |
14
|
P. B. Godfrey, S. Shenker, and I. Stoica. Minimizing churn in distributed systems. Technical Report EECS-2006-25, EECS Department, University of California, Berkeley, 2006.
|
| |
15
|
S. Guha, N. Daswani, and R. Jain. An experimental study of the Skype peer-to-peer VoIP system. In IPTPS, 2006.
|
 |
16
|
K. Gummadi , R. Gummadi , S. Gribble , S. Ratnasamy , S. Shenker , I. Stoica, The impact of DHT routing geometry on resilience and proximity, Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications, August 25-29, 2003, Karlsruhe, Germany
[doi> 10.1145/863955.863998]
|
| |
17
|
M. F. Kaashoek and D. R. Karger. Koorde: A simple degree-optimal distributed hash table. In Proc. IPTPS, 2003.
|
| |
18
|
J. Ledlie, J. Shneidman, M. Amis, M. Mitzenmacher, and M. Seltzer. Reliability- and capacity-based selection in distributed hash tables. Technical report, Harvard University Computer Science, Sept. 2003.
|
 |
19
|
|
| |
20
|
J. Li, J. Stribling, R. Morris, and M. F. Kaashoek. Bandwidth-efficient management of DHT routing tables. In Proc. NSDI, 2005.
|
| |
21
|
J. Li, J. Stribling, R. Morris, M. F. Kaashoek, and T. M. Gil. A performance vs. cost framework for evaluating DHT design tradeoffs under churn. In Proc. INFOCOM, 2005.
|
| |
22
|
G. S. Manku, M. Bawa, and P. Raghavan. Symphony: distributed hashing in a small world. In USENIX Symposium on Internet Technologies and Systems, 2003.
|
 |
23
|
|
 |
24
|
|
 |
25
|
|
 |
26
|
Sylvia Ratnasamy , Paul Francis , Mark Handley , Richard Karp , Scott Schenker, A scalable content-addressable network, Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications, p.161-172, August 2001, San Diego, California, United States
|
| |
27
|
S. Rhea, B.-G. Chun, J. Kubiatowicz, and S. Shenker. Fixing the embarrassing slowness of OpenDHT on PlanetLab. In Proc. WORLDS, 2005.
|
| |
28
|
Sean Rhea , Patrick Eaton , Dennis Geels , Hakim Weatherspoon , Ben Zhao , John Kubiatowicz, Awarded Best Student Paper! - Pond: The OceanStore Prototype, Proceedings of the 2nd USENIX Conference on File and Storage Technologies, March 31-31, 2003, San Francisco, CA
|
| |
29
|
S. Rhea, D. Geels, T. Roscoe, and J. Kubiatowicz. Handling churn in a DHT. In Proc. USENIX Annual Technical Conference, June 2004.
|
 |
30
|
Antony Rowstron , Peter Druschel, Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility, Proceedings of the eighteenth ACM symposium on Operating systems principles, October 21-24, 2001, Banff, Alberta, Canada
|
| |
31
|
S. Saroiu, P. K. Gummadi, and S. D. Gribble. A Measurement Study of Peer-to-Peer File Sharing Systems. In Proc. MMCN, 2002.
|
 |
32
|
Kunwadee Sripanidkulchai , Aditya Ganjam , Bruce Maggs , Hui Zhang, The feasibility of supporting large-scale live streaming applications with dynamic application end-points, Proceedings of the 2004 conference on Applications, technologies, architectures, and protocols for computer communications, August 30-September 03, 2004, Portland, Oregon, USA
|
 |
33
|
Ion Stoica , Daniel Adkins , Shelley Zhuang , Scott Shenker , Sonesh Surana, Internet indirection infrastructure, Proceedings of the 2002 conference on Applications, technologies, architectures, and protocols for computer communications, August 19-23, 2002, Pittsburgh, Pennsylvania, USA
|
 |
34
|
Ion Stoica , Robert Morris , David Karger , M. Frans Kaashoek , Hari Balakrishnan, Chord: A scalable peer-to-peer lookup service for internet applications, Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications, p.149-160, August 2001, San Diego, California, United States
|
| |
35
|
J. Stribling. Planetlab all pairs ping. http://infospect.planet-lab.org/pings.
|
| |
36
|
K. Tati and G. M. Voelker. On object maintenance in peer-to-peer systems. In Proc. IPTPS, 2006.
|
| |
37
|
H. Weatherspoon, B.-G. Chun, C. W. So, and J. Kubiatowicz. Long-Term Data Maintenance: A Quantitative Approach. Technical Report UCB/CSD-05-1404, EECS Department, University of California, Berkeley, July 2005.
|
 |
38
|
Bernard Wong , Aleksandrs Slivkins , Emin Gün Sirer, Meridian: a lightweight network location service without virtual coordinates, Proceedings of the 2005 conference on Applications, technologies, architectures, and protocols for computer communications, August 22-26, 2005, Philadelphia, Pennsylvania, USA
|
| |
39
|
B. Zhang, T. S. Ng, A. Nandi, R. Riedi, P. Druschel, and G. Wang. Measurement-based analysis, modeling, and synthesis of the Internet delay space, 2006. http://www.cs.rice.edu/bozhang/synthesizer.html.
|
CITED BY 16
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Georgios Smaragdakis , Vassilis Lekakis , Nikolaos Laoutaris , Azer Bestavros , John W. Byers , Mema Roussopoulos, EGOIST: overlay routing using selfish neighbor selection, Proceedings of the 2008 ACM CoNEXT Conference, p.1-12, December 09-12, 2008, Madrid, Spain
|
|
|
|
|
|
Pramod Sanaga , Jonathon Duerig , Robert Ricci , Jay Lepreau, Modeling and emulation of internet paths, Proceedings of the 6th USENIX symposium on Networked systems design and implementation, p.199-212, April 22-24, 2009, Boston, Massachusetts
|
|