|
ABSTRACT
The fundamental problems in dynamic load balancing and job scheduling in parallel and distributed computers involve moving load between processors. In this paper, we consider a new model for load movement in synchronous parallel and distributed machines. In each step of our model, each processor can transfer load to at most one neighbor; also, any amount of load can be moved along a communication link between two processors in one step. This is a reasonable model for load movement in significant classes of dynamic load balancing problems.
We derive efficient algorithms for a number of task reallocation problems under our model of load movement. These include dynamic load balancing on processor networks, adaptive mesh re-partitioning such as those in finite element methods, and progressive job migration under dynamic generation and consumption of load.
To obtain the above-mentioned results, we introduce and solve the abstract problem of Incremental Weight Migration (IWM) on arbitrary graphs. Our main result is a simple, randomized, algorithm for this problem which provably results in asymptotically optimal convergence towards the state where weights on the nodes of the graph are all equal. This algorithm utilizes an appropriate random set of edges forming a matching. Our algorithm for the IWM problem is used in deriving efficient algorithms for all the problems mentioned above.
Our results are very general. The algorithms we derive are local, and hence, scalable. They work for arbitrary load distributions and for networks of arbitrary topology which can possibly undergo link failures. Of independent interest is our proof technique which we use to lower bound the convergence of our algorithms in terms of the eigenstructure of the underlying graph.
Finally, we present preliminary experimental results analyzing issues in load balancing related to our algorithms.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
AA+93
|
William Aiello , Baruch Awerbuch , Bruce Maggs , Satish Rao, Approximate load balancing on dynamic and asynchronous networks, Proceedings of the twenty-fifth annual ACM symposium on Theory of computing, p.632-641, May 16-18, 1993, San Diego, California, United States
[doi> 10.1145/167088.167250]
|
 |
AB92
|
|
 |
AHS91
|
James Aspnes , Maurice Herlihy , Nir Shavit, Counting networks and multi-processor coordination, Proceedings of the twenty-third annual ACM symposium on Theory of computing, p.348-358, May 05-08, 1991, New Orleans, Louisiana, United States
[doi> 10.1145/103418.103421]
|
| |
B87
|
B. Bollobas. Random Graphs. Academic Press, New York. 1987.
|
| |
BB87
|
|
| |
CA87
|
G. Cybenko and T. G. Allen. Parallel Algorithms for Classification and Clustering. In Proc. SPIE Conference on Advanced Architectures and Algorithms for Signal Processing, San Diego, CA 1987.
|
| |
C89
|
|
| |
F93
|
R. Feldmann. Game Tree Search on Massively Parallel Systems. PhD Thesis, Dept. of Mathematics and Computer Science, University of Paderborn. August 1993.
|
| |
FG91
|
M. Factor and D. Gelernter. Software Backplanes, Realtime Data Fusion and the Process Trellis. Research Report YALEU/DCS/TR-852, Yale Computer Science Department, March 1991.
|
 |
GH89
|
B. Goldberg , P. Hudak, Implementing functional programs on a hypercube multiprocessor, Proceedings of the third conference on Hypercube concurrent computers and applications: Architecture, software, computer systems, and general issues, p.489-504, January 19-20, 1988, Pasadena, California, United States
[doi> 10.1145/62297.62356]
|
| |
HCT89
|
J. Hong, M. Chen and X. Tan. Dynamic Cyclic Load Balancing on Hypercubes. in Proc. of the 4th Conference on Hypercubes, Concurrent Computers and Applications, Vol. 1,595-598, 1989.
|
 |
HLS92
|
Maurice Herlihy , Beng-Hong Lim , Nir Shavit, Low contention load balancing on large-scale multiprocessors, Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures, p.219-227, June 29-July 01, 1992, San Diego, California, United States
[doi> 10.1145/140901.140924]
|
 |
HT93
|
|
| |
K88
|
L.V. Kale. Comparing the Performance of Two Dynamic Load Distribution Methods. In Proc. of International Conference on Parallel Processing, Vol. 1, August 1988.
|
 |
KZ88
|
|
| |
LK87
|
|
 |
LM93
|
|
| |
LMR91
|
R. Lueling, B. Monien and F. Ramme. Load Balancing in Large Networks" A Comparative Study. In Proc. of IEEE Symp on Parallel and Distributed Computing, Dallas, 1991.
|
 |
LN+89
|
T. Leighton , M. Newman , A. G. Ranade , E. Schwabe, Dynamic tree embeddings in butterflies and hypercubes, Proceedings of the first annual ACM symposium on Parallel algorithms and architectures, p.224-234, June 18-21, 1989, Santa Fe, New Mexico, United States
[doi> 10.1145/72935.72959]
|
| |
M89
|
M. Mihail. Conductance and Convergence of Markov Chains - A Combinatorial Treatment of Expanders. In Proe. of 30th IEEE Symp on Foundations of Computer Science, 526-531, October 1989.
|
| |
MP92
|
B. Mohar and S. Poljak. Eigenvalues in Combinatorial Optimization. Research Report 92752, IMA, Minneapolis, 1992.
|
| |
N92
|
D. Nicol. Communication Efficient Global Load Balancing. In Proc. of Scalable High Performance Computing Conference, 292- 299. Williamsburg, VA. April 1992.
|
| |
NX+85
|
L. M. Ni, C. W. Xu and T. B. Gendreau. Drafting Algorithm- A Dynamic Process Migration Protocol for Distributed Systems. In Proc. of Int. Conf. on Distributed Computing Systems, 539-546, 1985.
|
 |
P89
|
|
| |
PU89
|
|
 |
R89
|
|
 |
R91
|
|
 |
RSU91
|
Larry Rudolph , Miriam Slivkin-Allalouf , Eli Upfal, A simple load balancing scheme for task allocation in parallel machines, Proceedings of the third annual ACM symposium on Parallel algorithms and architectures, p.237-245, July 21-24, 1991, Hilton Head, South Carolina, United States
[doi> 10.1145/113379.113401]
|
| |
W91
|
|
CITED BY 11
|
|
Rupak Biswas , Leonid Oliker , Andrew Sohn, Global load balancing with parallel mesh adaption on distributed-memory systems, Proceedings of the 1996 ACM/IEEE conference on Supercomputing (CDROM), p.33-es, January 01-01, 1996, Pittsburgh, Pennsylvania, United States
|
|
|
|
|
|
Bhaskar Ghosh , S. Muthukrishnan , Martin H. Schultz, First and second order diffusive methods for rapid, coarse, distributed load balancing (extended abstract), Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures, p.72-81, June 24-26, 1996, Padua, Italy
|
|
|
|
|
|
|
|
|
|
|
|
Bhaskar Ghosh , F. T. Leighton , Bruce M. Maggs , S. Muthukrishnan , C. Greg Plaxton , R. Rajaraman , Andréa W. Richa , Robert E. Tarjan , David Zuckerman, Tight analyses of two local load balancing algorithms, Proceedings of the twenty-seventh annual ACM symposium on Theory of computing, p.548-558, May 29-June 01, 1995, Las Vegas, Nevada, United States
|
|
|
|
|
|
William Aiello , Eyal Kushilevitz , Rafail Ostrovsky , Adi Rosén, Adaptive packet routing for bursty adversarial traffic, Proceedings of the thirtieth annual ACM symposium on Theory of computing, p.359-368, May 24-26, 1998, Dallas, Texas, United States
|
|
|
|
|
|
|
|