Measuring memory hierarchy performance of cache-coherent multiprocessors using micro benchmarks
Source: Conference on High Performance Networking and Computing
Proceedings of the 1997 ACM/IEEE conference on Supercomputing (CDROM)
San Jose, CA
Pages: 1 - 12
Year of Publication: 1997
ISBN: 0-89791-985-8
Authors
Cristina Hristea  Massachusetts Institute of Technology, Cambridge, MA
Daniel Lenoski  Silicon Graphics, Inc.
John Keen  Silicon Graphics, Inc.
Sponsors
IEEE-CS\DATC : IEEE Computer Society
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/509593.509638

ABSTRACT

Even with today's large caches, the increasing performance gap between processors and memory systems imposes a memory bottleneck for many important scientific and commercial applications. This bottleneck is intensified in shared-memory multiprocessors by contention and the effects of cache coherency. Under heavy memory contention, the memory latency may increase 2 or 3 times. Nonetheless, as more sophisticated techniques are used to hide latency and increase bandwidth, measuring memory performance has become increasingly difficult. Previous simple methods to measure memory performance can overestimate uniprocessor memory latency and underestimate bandwidth by tens of percent. This paper introduces a micro benchmark suite that measures memory hierarchy performance in light of both uniprocessor optimizations and the contention and coherence effects of multiprocessors. The benchmark suite has been used to improve the memory system performance of the SGI Origin multiprocessor.
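
As an illustration of the general idea behind latency micro benchmarks, the sketch below builds a chain of dependent loads (often called pointer chasing) and times one step of the chain. It is not taken from the paper and is not the authors' suite: the names build_chain and chase are hypothetical, and the use of Sattolo's algorithm to produce a single random cycle is an assumption made here so that every slot is visited once and simple stride prefetching is less likely to hide the access pattern. In Python, interpreter overhead dominates the timing, so the numbers only show the structure of such a measurement; a real benchmark would be written in C or assembly.

```python
import random
import time


def build_chain(n, seed=0):
    """Return a list nxt where following i -> nxt[i] visits all n slots in one cycle."""
    rng = random.Random(seed)
    nxt = list(range(n))
    # Sattolo's algorithm: yields a random permutation consisting of a single cycle.
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, never i itself
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def chase(nxt, hops, start=0):
    """Follow the chain for `hops` dependent steps.

    Each load depends on the previous one, so the steps cannot overlap.
    Returns (final index, average seconds per hop).
    """
    i = start
    t0 = time.perf_counter()
    for _ in range(hops):
        i = nxt[i]
    elapsed = time.perf_counter() - t0
    return i, elapsed / hops


if __name__ == "__main__":
    # Sweep the working-set size (sizes chosen arbitrarily for illustration).
    for n in (1 << 10, 1 << 16, 1 << 20):
        chain = build_chain(n)
        _, per_hop = chase(chain, hops=n)
        print(f"{n:>8} slots: {per_hop * 1e9:.1f} ns per hop")
```

Sweeping the number of slots is the usual way to look for steps in per-hop time as the working set outgrows each cache level, although in this Python sketch those steps are blurred by interpreter and object overhead.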


