|
ABSTRACT
The study of hashing is closely related to the analysis of balls and bins. Azar et. al. [1] showed that instead of using a single hash function if we randomly hash a ball into two bins and place it in the smaller of the two, then this dramatically lowers the maximum load on bins. This leads to the concept of two-way hashing where the largest bucket contains O(log log n) balls with high probability. The hash look up will now search in both the buckets an item hashes to. Since an item may be placed in one of two buckets, we could potentially move an item after it has been initially placed to reduce maximum load. Using this fact, we present a simple, practical hashing scheme that maintains a maximum load of 2, with high probability, while achieving high memory utilization. In fact, with n buckets, even if the space for two items are pre-allocated per bucket, as may be desirable in hardware implementations, more than n items can be stored giving a high memory utilization. Assuming truly random hash functions, we prove the following properties for our hashing scheme.• Each lookup takes two random memory accesses, and reads at most two items per access.• Each insert takes O(log n) time and up to log log n+O(1) moves, with high probability, and constant time in expectation.• Maintains 83.75% memory utilization, without requiring dynamic allocation during inserts.We also analyze the trade-off between the number of moves performed during inserts and the maximum load on a bucket. By performing at most h moves, we can maintain a maximum load of O(log log n/h log (log log n/h)). So, even by performing one move, we achieve a better bound than by performing no moves at all.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Yossi Azar , Andrei Z. Broder , Anna R. Karlin , Eli Upfal, Balanced allocations (extended abstract), Proceedings of the twenty-sixth annual ACM symposium on Theory of computing, p.593-602, May 23-25, 1994, Montreal, Quebec, Canada
[doi> 10.1145/195058.195412]
|
 |
2
|
Micah Adler , Soumen Chakrabarti , Michael Mitzenmacher , Lars Rasmussen, Parallel randomized load balancing, Proceedings of the twenty-seventh annual ACM symposium on Theory of computing, p.238-247, May 29-June 01, 1995, Las Vegas, Nevada, United States
[doi> 10.1145/225058.225131]
|
| |
3
|
A. Broder and M. Mitzenmacher Using Multiple Hash Functions to Improve IP Lookups Proceedings of IEEE INFOCOM 2001, pp. 1454--1463, 2001.
|
| |
4
|
Richard Cole , Alan M. Frieze , Bruce M. Maggs , Michael Mitzenmacher , Andréa W. Richa , Ramesh K. Sitaraman , Eli Upfal, On Balls and Bins with Deletions, Proceedings of the Second International Workshop on Randomization and Approximation Techniques in Computer Science, p.145-158, October 08-10, 1998
|
| |
5
|
|
| |
6
|
|
| |
7
|
A. Czumaj, C. Riley, and C. Scheideler. Perfectly Balanced Allocation In Proceedings of the 7th International Workshop on Randomization and Approximation Techniques in Computer Science (RANDOM'03), pages 240--251.
|
| |
8
|
|
| |
9
|
|
| |
10
|
|
 |
11
|
|
 |
12
|
|
| |
13
|
Martin Dietzfelbinger and Christoph Weidling. Cache-friendly dictionary implementations with constant lookup time and small space overhead. Submitted to STACS 2005. Also part of Christoph Weidling's PhD thesis, to be published, Technical University of Ilmenau.
|
| |
14
|
Ehud Friedgut and Gil Kalai, Every monotone graph property has a sharp threshold, Proceedings of the American Mathematical Society 124 (1996), 2993--3002.
|
 |
15
|
|
| |
16
|
|
| |
17
|
M. Mitzenmacher, A. Richa, and R. Sitaraman The Power of Two Random Choices: A Survey of Techniques and Results Book chapter, in Handbook of Randomized Computing: volume 1, edited by P. Pardalos, S. Rajasekaran, and J. Rolim, pp. 255--312.
|
| |
18
|
|
 |
19
|
|
| |
20
|
|
| |
21
|
|
| |
22
|
Peter Sanders , Sebastian Egner , Jan Korst, Fast concurrent access to parallel disks, Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms, p.849-858, January 09-11, 2000, San Francisco, California, United States
|
|