Measuring memory hierarchy performance of cache-coherent multiprocessors using micro benchmarks
Source
Conference on High Performance Networking and Computing
Proceedings of the 1997 ACM/IEEE conference on Supercomputing (CDROM)
San Jose, CA
Pages: 1 - 12
Year of Publication: 1997
ISBN: 0-89791-985-8
ABSTRACT
Even with today's large caches, the increasing performance gap between processors and memory systems imposes a memory bottleneck for many important scientific and commercial applications. This bottleneck is intensified in shared-memory multiprocessors by contention and the effects of cache coherency. Under heavy memory contention, the memory latency may increase 2 or 3 times. Nonetheless, as more sophisticated techniques are used to hide latency and increase bandwidth, measuring memory performance has become increasingly difficult. Previous simple methods to measure memory performance can overestimate uniprocessor memory latency and underestimate bandwidth by tens of percent. This paper introduces a micro benchmark suite that measures memory hierarchy performance in light of both uniprocessor optimizations and the contention and coherence effects of multiprocessors. The benchmark suite has been used to improve the memory system performance of the SGI Origin multiprocessor.
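The abstract does not describe how the suite takes its measurements, so the following is only a minimal sketch of one widely used latency-measurement idea, pointer chasing, and not the paper's method. Each load's address depends on the result of the previous load, so latency-hiding mechanisms cannot overlap the accesses. All names here (`build_chain`, `chase`, `ns_per_load`) are hypothetical, and in Python the interpreter overhead dominates the timing, so the numbers illustrate only the structure of such a benchmark, not real memory latency.

```python
import random
import time


def build_chain(n, seed=0):
    """Return a list nxt such that following nxt[i] visits all n slots in one random cycle."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    nxt = [0] * n
    for k in range(n):
        # Link each slot to the next slot in the shuffled order, wrapping at the end.
        nxt[order[k]] = order[(k + 1) % n]
    return nxt


def chase(nxt, steps, start=0):
    """Perform `steps` dependent loads along the chain and return the final index."""
    i = start
    for _ in range(steps):
        i = nxt[i]  # the next address depends on the value just loaded
    return i


def ns_per_load(n, steps, seed=0):
    """Time a chase over an n-slot chain and return the average nanoseconds per step."""
    nxt = build_chain(n, seed)
    chase(nxt, n)  # warm-up pass over the whole chain
    t0 = time.perf_counter()
    chase(nxt, steps)
    t1 = time.perf_counter()
    return (t1 - t0) * 1e9 / steps
```

A real micro benchmark would typically be written in a compiled language, sweep the working-set size across the cache levels, and, on a multiprocessor, run the chase on several processors at once to expose contention and coherence effects.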
CITED BY 21

Milo M. K. Martin, Daniel J. Sorin, Anastassia Ailamaki, Alaa R. Alameldeen, Ross M. Dickson, Carl J. Mauer, Kevin E. Moore, Manoj Plakal, Mark D. Hill, David A. Wood, Timestamp snooping: an approach for extending SMPs, ACM SIGPLAN Notices, v.35 n.11, p.25-36, Nov. 2000

Jeff Gibson, Robert Kunz, David Ofelt, Mark Horowitz, John Hennessy, Mark Heinrich, FLASH vs. (simulated) FLASH: closing the simulation loop, ACM SIGPLAN Notices, v.35 n.11, p.49-58, Nov. 2000

Ravi Iyer, Nancy M. Amato, Lawrence Rauchwerger, Laxmi Bhuyan, Comparing the memory system performance of the HP V-class and SGI Origin 2000 multiprocessors using microbenchmarks and scientific applications, Proceedings of the 13th international conference on Supercomputing, p.339-347, June 20-25, 1999, Rhodes, Greece

Dimitrios S. Nikolopoulos, Theodore S. Papatheodorou, Constantine D. Polychronopoulos, Jesús Labarta, Eduard Ayguadé, A case for user-level dynamic page migration, Proceedings of the 14th international conference on Supercomputing, p.119-130, May 08-11, 2000, Santa Fe, New Mexico, United States

Jeff Gibson, Robert Kunz, David Ofelt, Mark Horowitz, John Hennessy, Mark Heinrich, FLASH vs. (simulated) FLASH: closing the simulation loop, ACM SIGARCH Computer Architecture News, v.28 n.5, p.49-58, Dec. 2000

Milo M. K. Martin, Daniel J. Sorin, Anastassia Ailamaki, Alaa R. Alameldeen, Ross M. Dickson, Carl J. Mauer, Kevin E. Moore, Manoj Plakal, Mark D. Hill, David A. Wood, Timestamp snooping: an approach for extending SMPs, ACM SIGARCH Computer Architecture News, v.28 n.5, p.25-36, Dec. 2000

Dimitrios S. Nikolopoulos, Eduard Ayguadé, Theodore S. Papatheodorou, Constantine D. Polychronopoulos, Jesús Labarta, The trade-off between implicit and explicit data distribution in shared-memory programming paradigms, Proceedings of the 15th international conference on Supercomputing, p.23-37, June 2001, Sorrento, Italy