| Memory Profiling using Hardware Counters |
| Source
|
Conference on High Performance Networking and Computing
Proceedings of the 2003 ACM/IEEE conference on Supercomputing
Page: 17
Year of Publication: 2003
ISBN: 1-58113-695-1
|
| Authors |
Marty Itzkowitz, Sun Microsystems, Inc., Menlo Park, California
Brian J. N. Wylie, Sun Microsystems, Inc., Menlo Park, California
Christopher Aoki, Sun Microsystems, Inc., Menlo Park, California
Nicolai Kosche, Sun Microsystems, Inc., Menlo Park, California
|
| Publisher |
IEEE Computer Society
Washington, DC, USA
|
ABSTRACT
Although memory performance is often a limiting factor in application performance, most tools only show performance data relating to the instructions in the program, not to its data. In this paper, we describe a technique for directly measuring the memory profile of an application. We describe the tools and their user model, and then discuss a particular code, the MCF benchmark from SPEC CPU 2000. We show performance data for the data structures and elements, and discuss the use of the data to improve program performance. Finally, we discuss extensions to the work to provide feedback to the compiler for prefetching and to generate additional reports from the data.