|
ABSTRACT
Shared counters are among the most basic coordination structures in multiprocessor conputation, with applications ranging from barrier synchronization to concurrent-data-structure design. This article introduces diffracting trees, novel data structures for share counting and load balancing in a distributed/parallel environment. Empirical evidence, collected on a simulated distributed shared-memory machine and several simulated message-passing architectures, shows that diffracting trees scale better and are more robust than both combining trees and counting networks, currently the most effective known methods for implementing concurrent counters in software. The use of a randomized coordination method together with a combinatorial data structure overcomes the resiliency drawbacks of combining trees. Our simulations show that to handle the same load, diffracting trees and counting networks should have a similar width w, yet the depth of a diffracting tree is O(log w), whereas counting networks have depth O(log2 w). Diffracting trees have already been used to implement highly efficient producer/consumer queues, and we believe diffraction will prove to be an effective alternative paradigm to combining and queue-locking in the design of many concurrent data structures.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
| |
2
|
A. Agarwal , D. Chaiken , K. Johnson , D. Kranz , J. Kubiatowicz , K. Kurihara , B. H. Lim , G. Maa , D. Nussbaum , M. Parkin , D. Yeung, THE MIT ALEWIFE MACHINE: A LARGE-SCALE DISTRIBUTED-MEMORY MULTIPROCESSOR, Massachusetts Institute of Technology, Cambridge, MA, 1991
|
| |
3
|
Anant Agarwal , John Kubiatowicz , David Kranz , Beng-Hong Lim , Donald Yeung , Godfrey D'Souza , Mike Parkin, Sparcle: An Evolutionary Processor Design for Large-Scale Multiprocessors, IEEE Micro, v.13 n.3, p.48-61, May 1993
[doi> 10.1109/40.216748]
|
| |
4
|
|
 |
5
|
William Aiello , Ramarathnam Venkatesan , Moti Yung, Coins, weights and contention in balancing networks, Proceedings of the thirteenth annual ACM symposium on Principles of distributed computing, p.193-205, August 14-17, 1994, Los Angeles, California, United States
[doi> 10.1145/197917.198090]
|
| |
6
|
|
 |
7
|
James Aspnes , Maurice Herlihy , Nir Shavit, Counting networks and multi-processor coordination, Proceedings of the twenty-third annual ACM symposium on Theory of computing, p.348-358, May 05-08, 1991, New Orleans, Louisiana, United States
[doi> 10.1145/103418.103421]
|
| |
8
|
BATCHER, K.E. 1968. Sorting networks and their applications. In Proceedings of the AFIPS Joint Computer Conference. AFIPS, Montvale, N.J., 334-338.
|
| |
9
|
BLUMOFE, R.D. AND LEISERSON, C.E. 1994. Scheduling multithreaded computations by work stealing. In Proceedings of the 35th Symposium on the Foundations of Computer Science. IEEE, New York, 356-368.
|
| |
10
|
|
 |
11
|
|
 |
12
|
|
 |
13
|
|
| |
14
|
COVINGTON, R.G., DWARKADAS, S., JUMP, J.R., SINCLAIR, J.B., AND MADALA, S. 1991. The efficient simulation of parallel computer systems. Int. J. Comput. Simul. 1, 31-58.
|
| |
15
|
CRAY. 1996. CRAY T3D system architecture overview. Cray Research, Inc., Mountain View, Calif. Available as http://www.cray.com/public/product-info/mpp/T3D_Architecture_Over.
|
| |
16
|
DEC. 1994. Alpha System Reference Manual. Digital Equipment Corp., Maynard, Mass.
|
 |
17
|
Cynthia Dwork , Maurice Herlihy , Orli Waarts, Contention in shared memory algorithms, Proceedings of the twenty-fifth annual ACM symposium on Theory of computing, p.174-183, May 16-18, 1993, San Diego, California, United States
[doi> 10.1145/167088.167145]
|
| |
18
|
FELTEN, E.W., LAMARCA, A., AND LADNER, R. 1993. Building counting networks from larger balancers. Tech. Rep. 93-04-09, Univ. of Washington, Seattle, Wash.
|
 |
19
|
|
| |
20
|
GAWLICK, D. 1985. Processing 'hot spots' in high performance systems. In Proceedings of IEEE COMPCON '85. IEEE, New York.
|
 |
21
|
James R. Goodman , Mary K. Vernon , Philip J. Woest, Efficient synchronization primitives for large-scale cache-coherent multiprocessors, Proceedings of the third international conference on Architectural support for programming languages and operating systems, p.64-75, April 03-06, 1989, Boston, Massachusetts, United States
|
 |
22
|
|
| |
23
|
|
 |
24
|
|
 |
25
|
Maurice Herlihy , Beng-Hong Lim , Nir Shavit, Low contention load balancing on large-scale multiprocessors, Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures, p.219-227, June 29-July 01, 1992, San Diego, California, United States
[doi> 10.1145/140901.140924]
|
| |
26
|
|
| |
27
|
|
| |
28
|
JUMP, J.R. 1994. Netsim Reference Manual. Rice Univ., Houston, Tex. Available via ftp from titan.cs.rice.edu as/public/parallel/sim.tar.Z.
|
| |
29
|
KIRKPATRICK, S., GELATT, C.D., AND VECCHI, M.P. 1983. Optimization by simulated annealing. Science 220, 671-680.
|
 |
30
|
|
 |
31
|
|
 |
32
|
|
| |
33
|
|
 |
34
|
|
| |
35
|
|
| |
36
|
MIPS. 1994. The MIPS RISC Architecture. MIPS Computer Co., Mountain View, Calif.
|
 |
37
|
|
| |
38
|
|
| |
39
|
PFISTER, G.H. AND NORTON, A. 1985. 'Hot-spot' contention and combining in multistage interconnection networks. IEEE Trans. Comput. C-34, 11 (Nov.), 933-938.
|
 |
40
|
Larry Rudolph , Miriam Slivkin-Allalouf , Eli Upfal, A simple load balancing scheme for task allocation in parallel machines, Proceedings of the third annual ACM symposium on Parallel algorithms and architectures, p.237-245, July 21-24, 1991, Hilton Head, South Carolina, United States
[doi> 10.1145/113379.113401]
|
 |
41
|
|
 |
42
|
|
 |
43
|
|
| |
44
|
STONE, H.S. 1984. Database applications of the fetch-and-add instruction. IEEE Trans. Comput. C-33, 7 (July), 604-612.
|
| |
45
|
|
|