Tools for application-oriented performance tuning
Source
International Conference on Supercomputing
Proceedings of the 15th international conference on Supercomputing
Sorrento, Italy
Pages: 154-165
Year of Publication: 2001
ISBN: 1-58113-410-X
Authors
John Mellor-Crummey, Dept. of Computer Science, Rice University, MS 132, 6100 Main Street, Houston, TX
Robert Fowler, Dept. of Computer Science, Rice University, MS 132, 6100 Main Street, Houston, TX
David Whalley, Dept. of Computer Science, Florida State University, Tallahassee, FL
ABSTRACT
Application performance tuning is a complex process that requires assembling various types of information and correlating it with source code to pinpoint the causes of performance bottlenecks. Existing performance tools do not adequately support this process in one or more dimensions. We discuss some of the critical utility and usability issues for application-level performance analysis tools in the context of two performance tools, MHSim and HPCView, that we built to support our own work on data layout and optimizing compilers. MHSim is a memory hierarchy simulator that produces source-level information, not otherwise available, about memory hierarchy utilization and the causes of cache conflicts. HPCView is a tool that combines data from arbitrary sets of instrumentation sources and correlates it with program source code. Both tools report their results in scope-hierarchy views of the corresponding source code and produce their output as HTML databases that can be analyzed portably and collaboratively using a commodity browser. In addition to daily use within our group, the tools are being used successfully by several code development teams in DoD and DoE laboratories.
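To make the idea of source-level memory hierarchy attribution concrete, the following is a minimal sketch, not MHSim's or HPCView's actual implementation. It replays a synthetic address trace through a small direct-mapped cache and charges each access and miss to every enclosing scope of the reference. The function names, the scope labels (`kernel.c`, `sum`, `loop`, `line 13`), and the cache geometry are all assumptions chosen for illustration.

```python
from collections import defaultdict


def simulate_direct_mapped(trace, num_sets=4, line_size=16):
    """Replay (scope_path, address) accesses through a direct-mapped cache.

    scope_path is a tuple naming the enclosing scopes, outermost first.
    Returns {scope_prefix: [accesses, misses]}, with each access charged
    to every prefix of its scope path (a simple scope-hierarchy roll-up).
    """
    tags = [None] * num_sets            # one resident tag per cache set
    stats = defaultdict(lambda: [0, 0])
    for scope_path, addr in trace:
        block = addr // line_size       # memory block holding this address
        index = block % num_sets        # set the block maps to
        tag = block // num_sets         # distinguishes blocks sharing a set
        miss = tags[index] != tag
        tags[index] = tag               # on a miss, the new block evicts the old
        for depth in range(1, len(scope_path) + 1):
            entry = stats[scope_path[:depth]]
            entry[0] += 1
            entry[1] += int(miss)
    return dict(stats)


def make_trace(base_a, base_b, n=4, elem=4):
    """Hypothetical loop body reading a[i] and b[i] on one source line."""
    path = ("kernel.c", "sum", "loop", "line 13")  # illustrative scope labels
    trace = []
    for i in range(n):
        trace.append((path, base_a + i * elem))
        trace.append((path, base_b + i * elem))
    return trace


# Arrays 64 bytes apart map to the same set of a 64-byte cache and evict
# each other on every access; shifting b by one cache line removes the conflict.
conflicting = simulate_direct_mapped(make_trace(0, 64))
padded = simulate_direct_mapped(make_trace(0, 80))
print(conflicting[("kernel.c", "sum", "loop")])  # [8, 8]
print(padded[("kernel.c", "sum", "loop")])       # [8, 2]
```

In this toy setup the two layouts perform the same number of accesses, but the roll-up shows all eight missing in the conflicting layout and only the two compulsory misses remaining after padding. This is the kind of data-layout effect that per-scope reporting is meant to expose.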