Evaluating NIC hardware requirements to achieve high message rate PGAS support on multi-core processors
Source
Conference on High Performance Networking and Computing
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Reno, Nevada
SESSION: Network interfaces
Article No. 36  
Year of Publication: 2007
ISBN:978-1-59593-764-3
Authors
Keith D. Underwood  Sandia National Laboratories, Albuquerque, NM
Michael J. Levenhagen  Sandia National Laboratories, Albuquerque, NM
Ron Brightwell  Sandia National Laboratories, Albuquerque, NM
Sponsors
IEEE-CS\DATC : IEEE Computer Society
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1362622.1362671

ABSTRACT

Partitioned global address space (PGAS) programming models have been identified as one of the few viable approaches for dealing with emerging many-core systems. These models tend to generate many small messages, which requires specific support from the network interface hardware to enable efficient execution. In the past, Cray included E-registers on the Cray T3E to support the SHMEM API; however, with the advent of multi-core processors, the balance of computation to communication capabilities has shifted toward computation. This paper explores the message rates that are achievable with multi-core processors and simplified PGAS support on a more conventional network interface. For message rate tests, we find that simple network interface hardware is more than sufficient. We also find that even typical data distributions, such as cyclic or block-cyclic, do not need specialized hardware support. Finally, we assess the impact of such support on the well-known RandomAccess benchmark.
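
To make the cyclic and block-cyclic distributions mentioned in the abstract concrete, the following is a minimal sketch of the address arithmetic a host core would perform in software to turn a global element index into an owning processing element (PE) and a local offset. It is an illustration of the standard textbook mapping, not the authors' implementation; the function names and parameters (`block_cyclic_owner`, `cyclic_owner`, `block_size`, `num_pes`) are hypothetical.

```python
def block_cyclic_owner(global_index, block_size, num_pes):
    """Map a global element index to (owning PE, local offset).

    Assumes a block-cyclic layout: consecutive blocks of `block_size`
    elements are dealt round-robin across `num_pes` PEs.
    """
    block = global_index // block_size      # block that contains the element
    pe = block % num_pes                    # blocks are dealt round-robin
    local_block = block // num_pes          # earlier blocks already held by that PE
    offset = local_block * block_size + global_index % block_size
    return pe, offset


def cyclic_owner(global_index, num_pes):
    """A cyclic layout is the block-cyclic layout with a block size of 1."""
    return block_cyclic_owner(global_index, 1, num_pes)
```

The mapping costs a few integer divisions and modulo operations per access, which is the kind of work that can be done either on the host processor or by dedicated address-translation hardware on the network interface.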


