Memory Profiling using Hardware Counters
Source: Conference on High Performance Networking and Computing
Proceedings of the 2003 ACM/IEEE conference on Supercomputing
Page: 17  
Year of Publication: 2003
ISBN:1-58113-695-1
Authors
Marty Itzkowitz  Sun Microsystems, Inc., Menlo Park, California
Brian J. N. Wylie  Sun Microsystems, Inc., Menlo Park, California
Christopher Aoki  Sun Microsystems, Inc., Menlo Park, California
Nicolai Kosche  Sun Microsystems, Inc., Menlo Park, California
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
IEEE Computer Society  Washington, DC, USA

ABSTRACT

Although memory performance is often a limiting factor in application performance, most tools only show performance data relating to the instructions in the program, not to its data. In this paper, we describe a technique for directly measuring the memory profile of an application. We describe the tools and their user model, and then discuss a particular code, the MCF benchmark from SPEC CPU 2000. We show performance data for the data structures and their elements, and discuss the use of the data to improve program performance. Finally, we discuss extensions to the work to provide feedback to the compiler for prefetching and to generate additional reports from the data.


