ACM Home Page
Please provide us with feedback. Feedback
Locality-preserving randomized oblivious routing on torus networks
Full text PdfPdf (277 KB)
Source ACM Symposium on Parallel Algorithms and Architectures archive
Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures table of contents
Winnipeg, Manitoba, Canada
SESSION: Session 1 table of contents
Pages: 9 - 13  
Year of Publication: 2002
ISBN:1-58113-529-7
Authors
Arjun Singh  Stanford University
William J. Dally  Stanford University
Brian Towles  Stanford University
Amit K. Gupta  Stanford University
Sponsors
SIGARCH: ACM Special Interest Group on Computer Architecture
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 9,   Downloads (12 Months): 44,   Citation Count: 5
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/564870.564873
What is a DOI?

ABSTRACT

We introduce Randomized Local Balance (RLB), a routing algorithm that strikes a balance between locality and load balance in torus networks, and analyze RLB's performance for benign and adversarial traffic permutations. Our results show that RLB outperforms deterministic algorithms (25% more bandwidth than Dimension Order Routing) and minimal oblivious algorithms (50% more bandwidth than 2 phase ROMM [9]) on worst-case traffic. At the same time, RLB offers higher throughput on local traffic than a fully randomized algorithm (4.6 times more bandwidth than VAL (Valiant's algorithm) [15] in the best case). RLBth (RLB threshold) improves the locality of RLB to match the throughput of minimal algorithms on very local traffic in exchange for a 4% reduction in worst-case throughput compared to RLB. Both RLB and RLBth give better throughput than all other algorithms we tested on randomly selected traffic permutations. While RLB algorithms have somewhat lower guaranteed bandwidth than VAL they have much lower latency at low offered loads (up to 3.65 times less for RLBth).


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
W. Dally. Aoki: Deadlock-free adaptive routing in multicomputer networks using virtual channels, 1993.
 
3
 
4
William Dally, Philip Carvey, and Larry Dennison. Architecture of the avici terabit switch/router. In Proceedings of Hot Interconnects Symposium VI, August 1998, pages 41--50, 1998.
 
5
William J. Dally and Charles L. Seitz. The torus routing chip. Distributed Computing, 1(4):187--196, 1986.
 
6
7
 
8
9
 
10
G. Pfister. An introduction to the infiniband arechitecture. High Performance Mass Storage and Parallel I/O, IEEE Press, 2001., 2001.
 
11
 
12
S. Scott and G. Thorson. The cray t3e network: adaptive routing in a high performance 3d torus, 1996.
 
13
H. Sullivan, T. Bashkow, and D. Klappholz. A large scale, homogeneous, fully distributed parallel machine, ii, 1977.
 
14
 
15
G. Valiant. A scheme for fast parallel communication. SIAM Journal on Computing , 11(2):350--361, 1982.


Collaborative Colleagues:
Arjun Singh: colleagues
William J. Dally: colleagues
Brian Towles: colleagues
Amit K. Gupta: colleagues