| MemorIES: a programmable, real-time hardware emulation tool for multiprocessor server design |
| Source
|
ACM SIGPLAN Notices
Volume 35, Issue 11 (November 2000)
Pages: 37 - 48
Year of Publication: 2000
ISSN: 0362-1340
|
|
Authors
Ashwini Nanda, IBM T.J. Watson Research Center, Yorktown Heights, NY
Kwok-Ken Mak, Cisco Systems and IBM T.J. Watson Research Center, Yorktown Heights, NY
Krishnan Sugavanam, IBM T.J. Watson Research Center, Yorktown Heights, NY
Ramendra K. Sahoo, IBM T.J. Watson Research Center, Yorktown Heights, NY
Vijayaraghavan Soundararajan, IBM T.J. Watson Research Center, Yorktown Heights, NY
T. Basil Smith, IBM T.J. Watson Research Center, Yorktown Heights, NY
ABSTRACT
Modern system design often requires multiple levels of simulation for design validation and performance debugging. However, while machines have gotten faster, and simulators have become more detailed, simulation speeds have not tracked machine speeds. As a result, it is difficult to simulate realistic problem sizes and hardware configurations for a target machine. Instead, researchers have focussed on developing scaling methodologies and running smaller problem sizes and configurations that attempt to represent the behavior of the real problem. Given the increasing size of problems today, it is unclear whether such an approach yields accurate results. Moreover, although commercial workloads are prevalent and important in today's marketplace, many simulation tools are unable to adequately profile such applications, let alone for realistic sizes.

In this paper we present a hardware-based emulation tool that can be used to aid memory system designers. Our focus is on the memory system because the ever-widening gap between processor and memory speeds means that optimizing the memory subsystem is critical for performance. We present the design of the Memory Instrumentation and Emulation System (MemorIES). MemorIES is a programmable tool designed using FPGAs and SDRAMs. It plugs into an SMP bus to perform on-line emulation of several cache configurations, structures and protocols while the system is running real-life workloads in real-time, without any slowdown in application execution speed. We demonstrate its usefulness in several case studies, and find several important results. First, using traces to perform system evaluation can lead to incorrect results (off by 100% or more in some cases) if the trace size is not sufficiently large. Second, MemorIES is able to detect performance problems by profiling miss behavior over the entire course of a run, rather than relying on a small interval of time. Finally, we observe that previous studies of SPLASH2 applications using scaled application sizes can result in optimistic miss rates relative to real sizes on real machines, providing potentially misleading data when used for design evaluation.
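
The abstract's first finding, that a trace which is too short can give misleading results, can be illustrated with a toy software cache model. The sketch below is not the MemorIES design (which is FPGA hardware attached to the SMP bus); the class `SetAssociativeCache`, the helpers `synthetic_trace` and `miss_rate_for`, and all parameter values are hypothetical and chosen only for illustration. It shows one plausible source of error, cold-start misses dominating a short trace, under the assumption of a cyclic access pattern whose working set fits in the cache.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy LRU set-associative cache model (illustrative only, not MemorIES)."""

    def __init__(self, num_sets, ways, line_bytes):
        self.num_sets = num_sets
        self.ways = ways
        self.line_bytes = line_bytes
        # One LRU-ordered tag store per set; the oldest entry comes first.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.accesses = 0
        self.misses = 0

    def access(self, address):
        """Record one reference; return True on a hit, False on a miss."""
        line = address // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        entries = self.sets[index]
        self.accesses += 1
        if tag in entries:
            entries.move_to_end(tag)  # mark as most recently used
            return True
        self.misses += 1
        if len(entries) >= self.ways:
            entries.popitem(last=False)  # evict the least recently used tag
        entries[tag] = None
        return False

    def miss_rate(self):
        return self.misses / self.accesses if self.accesses else 0.0


def synthetic_trace(length, working_set_lines=200, line_bytes=128):
    """Hypothetical workload: a cyclic sweep over a fixed set of cache lines."""
    return [(i % working_set_lines) * line_bytes for i in range(length)]


def miss_rate_for(trace, num_sets=64, ways=4, line_bytes=128):
    """Run a trace through a fresh (cold) cache and return its miss rate."""
    cache = SetAssociativeCache(num_sets, ways, line_bytes)
    for address in trace:
        cache.access(address)
    return cache.miss_rate()


if __name__ == "__main__":
    short_rate = miss_rate_for(synthetic_trace(400))
    long_rate = miss_rate_for(synthetic_trace(200_000))
    print(f"short trace miss rate: {short_rate:.4f}")
    print(f"long trace miss rate:  {long_rate:.4f}")
```

With these assumed parameters the 200-line working set fits in the 256-line cache, so the only misses are the 200 cold misses. Over 400 references that is a miss rate of 0.5, while over 200,000 references it falls to 0.001, so the short trace overstates the steady-state miss rate by a wide margin. Real workloads are far less regular, and the abstract does not attribute its reported errors to this effect alone.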