The effects of communication parameters on end performance of shared virtual memory clusters
Source: Conference on High Performance Networking and Computing
Proceedings of the 1997 ACM/IEEE conference on Supercomputing (CDROM)
San Jose, CA
Pages: 1 - 35  
Year of Publication: 1997
ISBN:0-89791-985-8
Authors
Angelos Bilas  Princeton University
Jaswinder Pal Singh  Princeton University
Sponsors
IEEE-CS\DATC : IEEE Computer Society
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/509593.509594

ABSTRACT

Recently there has been considerable effort to provide cost-effective shared memory systems by employing software-only solutions on clusters of high-end workstations coupled with high-bandwidth, low-latency commodity networks. Much of the work so far has focused on improving protocols, and there has been some work on restructuring applications to perform better on shared virtual memory (SVM) systems. The result of this progress is the promise of good performance on a range of applications, at least in the 16-32 processor range. New system area networks and network interfaces provide significantly lower overhead, lower latency, and higher bandwidth communication in clusters; inexpensive SMPs have become common as the nodes of these clusters; and SVM protocols are now quite mature. With this progress, it is now useful to examine which system bottlenecks stand in the way of effective parallel performance: in particular, which parameters of the communication architecture are most important to improve further relative to processor speed, which are already adequate on modern systems for most applications, and how this will change with technology in the future. Such information can assist system designers in determining where to focus their energies in improving performance, and users in determining what system characteristics are appropriate for their applications.

We find that the most important system cost to improve is the overhead of generating and delivering interrupts. Improving network interface (and I/O bus) bandwidth relative to processor speed helps some bandwidth-bound applications, but currently available ratios of bandwidth to processor speed are already adequate for many others. Surprisingly, neither the processor overhead for handling messages nor the occupancy of the communication interface in preparing and pushing packets through the network appears to require much improvement.


