SIGMA: a simulator infrastructure to guide memory analysis
Source
Conference on High Performance Networking and Computing
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
Baltimore, Maryland
Pages: 1-13
Year of Publication: 2002
Publisher
IEEE Computer Society Press
Los Alamitos, CA, USA
ABSTRACT
In this paper we present SIGMA (Simulation Infrastructure to Guide Memory Analysis), a new data collection framework and family of cache analysis tools. The SIGMA environment provides detailed cache information by gathering memory reference data through software-based instrumentation. This infrastructure facilitates quick probing into the factors that influence the performance of an application by highlighting bottleneck scenarios such as excessive cache/TLB misses and inefficient data layouts. The tool can also assist in perturbation analysis to determine the performance variations caused by changes to the architecture or the program. Our validation tests using the SPEC Swim benchmark show that most of the performance metrics obtained with SIGMA are within 1% of the metrics obtained with hardware performance counters, with the advantage that SIGMA reports performance data at the data-structure level, as specified by the programmer.
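The idea of attributing cache behavior to individual data structures can be illustrated with a toy trace-driven simulator. The sketch below is not SIGMA's implementation: the direct-mapped cache, the line size, the cache size, the function names, and the two arrays `a` and `b` are all assumptions chosen for illustration. It replays a list of addresses and counts hits and misses per named address range.

```python
from collections import defaultdict

LINE_SIZE = 32   # bytes per cache line (assumed for illustration)
NUM_LINES = 4    # lines in the toy direct-mapped cache (assumed)


def simulate(trace, regions):
    """Replay byte addresses through a direct-mapped cache.

    trace   -- iterable of byte addresses in program order
    regions -- dict: data-structure name -> (start, end), end exclusive
    Returns {name: [hits, misses]}.
    """
    tags = [None] * NUM_LINES
    stats = defaultdict(lambda: [0, 0])
    for addr in trace:
        block = addr // LINE_SIZE
        index = block % NUM_LINES
        # Attribute the reference to the data structure that contains it.
        name = next(
            (n for n, (lo, hi) in regions.items() if lo <= addr < hi),
            "other",
        )
        if tags[index] == block:
            stats[name][0] += 1      # hit
        else:
            stats[name][1] += 1      # miss: load the block into the line
            tags[index] = block
    return dict(stats)


def interleaved(base_a, base_b, n=16, elem=8):
    """Addresses for the access pattern a[0], b[0], a[1], b[1], ..."""
    return [base + elem * i for i in range(n) for base in (base_a, base_b)]


# Layout 1: b starts exactly one cache size after a, so a[i] and b[i]
# always compete for the same cache line.
conflicting = simulate(interleaved(0, 128), {"a": (0, 128), "b": (128, 256)})

# Layout 2: b is shifted by one cache line of padding.
padded = simulate(interleaved(0, 160), {"a": (0, 128), "b": (160, 288)})
```

In this toy setup every reference misses under the first layout, while one line of padding leaves only 4 misses per array under the second. That is the kind of layout problem a per-data-structure report makes visible.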
CITED BY 21
Jaydeep Marathe, Frank Mueller, Tushar Mohan, Bronis R. de Supinski, Sally A. McKee, Andy Yoo, METRIC: tracking down inefficiencies in the memory hierarchy via binary rewriting, Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization, March 23-26, 2003, San Francisco, California
Martin Schulz, Brian S. White, Sally A. McKee, Hsien-Hsin S. Lee, Jürgen Jeitner, Owl: next generation system monitoring, Proceedings of the 2nd conference on Computing frontiers, May 04-06, 2005, Ischia, Italy
Jaydeep Marathe, Frank Mueller, Tushar Mohan, Sally A. McKee, Bronis R. de Supinski, Andy Yoo, METRIC: Memory tracing via dynamic binary rewriting to identify cache inefficiencies, ACM Transactions on Programming Languages and Systems (TOPLAS), v.29 n.2, p.12-es, April 2007
Tushar Mohan, Bronis R. de Supinski, Sally A. McKee, Frank Mueller, Andy Yoo, Martin Schulz, Identifying and Exploiting Spatial Regularity in Data Memory References, Proceedings of the 2003 ACM/IEEE conference on Supercomputing, p.49, November 15-21, 2003
S. Sbaraglia, H. Wen, S. Seelam, I. Chung, G. Cong, K. Ekanadham, D. Klepacki, A productivity centered application performance tuning framework, Proceedings of the 2nd international conference on Performance evaluation methodologies and tools, October 22-27, 2007, Nantes, France
Martin Schulz, Jim Galarowicz, Don Maghrak, William Hachfeld, David Montoya, Scott Cranford, Open | SpeedShop: An open source infrastructure for parallel performance analysis, Scientific Programming, v.16 n.2-3, p.105-121, April 2008
Michael Noeth, Prasun Ratn, Frank Mueller, Martin Schulz, Bronis R. de Supinski, ScalaTrace: Scalable compression and replay of communication traces for high-performance computing, Journal of Parallel and Distributed Computing, v.69 n.8, p.696-710, August 2009