ABSTRACT
A program profile attributes run-time costs to portions of a program's execution. Most profiling systems suffer from two major deficiencies: first, they only apportion simple metrics, such as execution frequency or elapsed time, to static, syntactic units, such as procedures or statements; second, they aggressively reduce the volume of information collected and reported, although aggregation can hide striking differences in program behavior.

This paper addresses both concerns by exploiting the hardware counters available in most modern processors and by incorporating two concepts from data flow analysis--flow and context sensitivity--to report more context for measurements. This paper extends our previous work on efficient path profiling to flow sensitive profiling, which associates hardware performance metrics with a path through a procedure. In addition, it describes a data structure, the calling context tree, that efficiently captures calling contexts for procedure-level measurements.

Our measurements show that the SPEC95 benchmarks execute a small number (3--28) of hot paths that account for 9--98% of their L1 data cache misses. Moreover, these hot paths are concentrated in a few routines, which have complex dynamic behavior.
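To make the idea of a calling context tree concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes a profiler that receives explicit enter/leave events and a metric delta (for example, a hardware counter difference). The names `CCTNode`, `CallingContextTree`, `enter`, `record`, and `leave` are hypothetical, and the sketch makes no attempt to bound the tree's depth under recursion.

```python
class CCTNode:
    """One node per distinct calling context (a chain of callers)."""

    def __init__(self, procedure, parent=None):
        self.procedure = procedure
        self.parent = parent
        self.children = {}   # callee name -> CCTNode
        self.calls = 0       # times this context was entered
        self.metric = 0      # accumulated cost, e.g. cache misses


class CallingContextTree:
    """Hypothetical sketch: the same procedure gets separate nodes
    when it is reached through different chains of callers."""

    def __init__(self):
        self.root = CCTNode("<root>")
        self.current = self.root

    def enter(self, procedure):
        # Reuse the child for this callee if the context was seen before.
        child = self.current.children.get(procedure)
        if child is None:
            child = CCTNode(procedure, self.current)
            self.current.children[procedure] = child
        child.calls += 1
        self.current = child

    def record(self, amount):
        # Attribute a measured cost to the current calling context.
        self.current.metric += amount

    def leave(self):
        if self.current.parent is not None:
            self.current = self.current.parent

    def contexts(self):
        """Yield (call path, calls, metric) for every context in the tree."""
        stack = [(self.root, ())]
        while stack:
            node, path = stack.pop()
            for name, child in node.children.items():
                child_path = path + (name,)
                yield child_path, child.calls, child.metric
                stack.append((child, child_path))


# Usage: "lookup" is called from two contexts with very different costs.
cct = CallingContextTree()
cct.enter("main")
cct.enter("parse"); cct.enter("lookup"); cct.record(5); cct.leave(); cct.leave()
cct.enter("eval"); cct.enter("lookup"); cct.record(40); cct.leave(); cct.leave()
cct.leave()
profile = {path: (calls, metric) for path, calls, metric in cct.contexts()}
```

A flat, per-procedure profile would report only the combined cost of `lookup`; keeping one node per call path preserves the difference between the two contexts, which is the kind of behavior aggregation can hide.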