|
ABSTRACT
We introduce Randomized Local Balance (RLB), a routing algorithm that strikes a balance between locality and load balance in torus networks, and analyze RLB's performance for benign and adversarial traffic permutations. Our results show that RLB outperforms deterministic algorithms (25% more bandwidth than Dimension Order Routing) and minimal oblivious algorithms (50% more bandwidth than 2 phase ROMM [9]) on worst-case traffic. At the same time, RLB offers higher throughput on local traffic than a fully randomized algorithm (4.6 times more bandwidth than VAL (Valiant's algorithm) [15] in the best case). RLBth (RLB threshold) improves the locality of RLB to match the throughput of minimal algorithms on very local traffic in exchange for a 4% reduction in worst-case throughput compared to RLB. Both RLB and RLBth give better throughput than all other algorithms we tested on randomly selected traffic permutations. While RLB algorithms have somewhat lower guaranteed bandwidth than VAL they have much lower latency at low offered loads (up to 3.65 times less for RLBth).
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
W. Dally. Aoki: Deadlock-free adaptive routing in multicomputer networks using virtual channels, 1993.
|
| |
3
|
|
| |
4
|
William Dally, Philip Carvey, and Larry Dennison. Architecture of the avici terabit switch/router. In Proceedings of Hot Interconnects Symposium VI, August 1998, pages 41--50, 1998.
|
| |
5
|
William J. Dally and Charles L. Seitz. The torus routing chip. Distributed Computing, 1(4):187--196, 1986.
|
| |
6
|
|
 |
7
|
|
| |
8
|
|
 |
9
|
|
| |
10
|
G. Pfister. An introduction to the infiniband arechitecture. High Performance Mass Storage and Parallel I/O, IEEE Press, 2001., 2001.
|
| |
11
|
|
| |
12
|
S. Scott and G. Thorson. The cray t3e network: adaptive routing in a high performance 3d torus, 1996.
|
| |
13
|
H. Sullivan, T. Bashkow, and D. Klappholz. A large scale, homogeneous, fully distributed parallel machine, ii, 1977.
|
| |
14
|
|
| |
15
|
G. Valiant. A scheme for fast parallel communication. SIAM Journal on Computing , 11(2):350--361, 1982.
|
|