ABSTRACT
FM-QoS employs a novel communication architecture based on network feedback to provide predictable communication performance (e.g., deterministic latencies and guaranteed bandwidths) for high-speed cluster interconnects. Network feedback is combined with self-synchronizing communication schedules to achieve synchrony in the network interfaces (NIs). Based on this synchrony, the network can be scheduled to provide predictable performance without special network QoS hardware. We describe the key element of the FM-QoS approach, feedback-based synchronization (FBS), which exploits network feedback to synchronize senders. We use Petri nets to characterize the set of self-synchronizing communication schedules for which FBS is effective and to describe the resulting synchronization overhead as a function of the clock drift across the network nodes. Analytic modeling suggests that for clocks of quality 300 ppm (such as those found in the Myrinet NI), a synchronization overhead of less than 1% of the total communication traffic is achievable. This is significantly better than previous software-based schemes and comparable to hardware-intensive approaches such as virtual circuits (e.g., ATM).

We have built a prototype of FBS for Myricom's Myrinet network (a 1.28 Gbps cluster network), which demonstrates the viability of the approach by sharing network resources with predictable performance. The prototype, which implements the local node schedule in software, achieves predictable latencies of 23 µs for a single-switch, 8-node network and 2 KB packets. In comparison, the best-effort scheme without FBS achieves 104 µs on the same network. While this ratio of over four to one already demonstrates the viability of the approach, it includes nearly 10 µs of overhead due to the software implementation. For hardware implementations of local node scheduling, and for networks with cascaded switches, the ratio should be considerably larger.
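The arithmetic behind the quoted figures can be sketched as follows. This is an illustrative calculation only, not the paper's analytic model: the function names are hypothetical, the projected ratio assumes the roughly 10 µs of software overhead is subtracted from the 23 µs FBS latency, and the skew bound assumes two clocks that each deviate by at most 300 ppm in opposite directions over a hypothetical 1 ms interval.

```python
def latency_ratio(best_effort_us, fbs_us, software_overhead_us=0.0):
    """Ratio of best-effort latency to FBS latency.

    software_overhead_us is subtracted from the FBS latency to project
    what a hardware implementation of local scheduling might achieve
    (an assumption about how the overhead enters the measurement).
    """
    return best_effort_us / (fbs_us - software_overhead_us)


def worst_case_skew_us(interval_us, clock_ppm):
    """Upper bound on the skew between two free-running clocks.

    Assumes each clock is within +/- clock_ppm of nominal and that the
    two drift in opposite directions for the whole interval.
    """
    return 2 * clock_ppm * 1e-6 * interval_us


# Figures quoted in the abstract: 104 us best effort, 23 us with FBS.
measured = latency_ratio(104.0, 23.0)
projected = latency_ratio(104.0, 23.0, software_overhead_us=10.0)

# Hypothetical 1 ms interval between synchronization points.
skew = worst_case_skew_us(1000.0, 300.0)

print(f"measured ratio:  {measured:.2f}")
print(f"projected ratio: {projected:.2f}")
print(f"worst-case skew over 1 ms: {skew:.2f} us")
```

Under these assumptions the measured ratio is about 4.5, consistent with the "over four to one" claim, and removing the software overhead would raise it to about 8.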