ACM Home Page
Please provide us with feedback. Feedback
Ranged hash functions and the price of churn
Full text PdfPdf (473 KB)
Source Symposium on Discrete Algorithms archive
Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms table of contents
San Francisco, California
Pages 1066-1075  
Year of Publication: 2008
Authors
James Aspnes  Yale University
Muli Safra  School of Mathematical Sciences, Tel-Aviv University
Yitong Yin  Yale University
Sponsors
: SIAM Activity Group on Discrete Mathematics
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
Publisher
Society for Industrial and Applied Mathematics  Philadelphia, PA, USA
Bibliometrics
Downloads (6 Weeks): 5,   Downloads (12 Months): 49,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Review this Article  

ABSTRACT

Ranged hash functions generalize hash tables to the setting where hash buckets may come and go over time, a typical case in distributed settings where hash buckets may correspond to unreliable servers or network connections. Monotone ranged hash functions are a particular class of ranged hash functions that minimize item reassignments in response to churn: changes in the set of available buckets. The canonical example of a monotone ranged hash function is the ring-based consistent hashing mechanism of Karger et al. [13]. These hash functions give a maximum load of Θ (n/mlogm) when n is the number of items and m is the number of buckets. The question of whether some better bound could be obtained using a more sophisticated hash function has remained open.

We resolve this question by showing two lower bounds. First, the maximum load of any randomized monotone ranged hash function is Ω(√n/mlnm) when n = o(mlogm). This bound covers almost all of the nontrivial case, because when n = Ω(mlogm) simple random assignment matches the trivial lower bound of Ω(n/m). We give a matching (though impractical) upper bound that shows that our lower bound is tight over almost all of its range. Second, for randomized monotone ranged hash functions derived from metric spaces, there is a further trade-off between the expansion factor of the metric and the load balance, which for the special case of growth-restricted metrics gives a bound of Ω(n/mlogm), asymptotically equal to that of consistent hashing. These are the first known non-trivial lower bounds for ranged hash functions. They also explain why in ten years no better ranged hash functions have arisen to replace consistent hashing.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
 
3
4
5
 
6
 
7
A. Hajnal and E. Szemeredi. Proof of a conjecture of Erdos. Combinatorial Theory and its Applications, 2:601--623, 1970.
8
 
9
P. Indyk, J. Goodman, and J. O'Rourke. Nearest neighbors in high-dimensional spaces. Handbook of Discrete and Computational Geometry, chapter 39, 2004.
 
10
 
11
M. F. Kaashoek and D. R. Karger. Koorde: A simple degree-optimal distributed hash table. In The 2nd International Workshop on Peer-to-Peer Systems (IPTPS 2003), pages 98--107, 2003.
12
13
14
15
 
16
17
 
18
19
 
20
V. Rodl and A. Rucinski. Threshold Functions for Ramsey Properties. Journal of the American Mathematical Society, 8(4):917--942, 1995.
 
21
22

Collaborative Colleagues:
James Aspnes: colleagues
Muli Safra: colleagues
Yitong Yin: colleagues