Non-intrusive dynamic application profiler for detailed loop execution characterization
International Conference on Compilers, Architecture and Synthesis for Embedded Systems
Proceedings of the 2008 international conference on Compilers, architectures and synthesis for embedded systems
Atlanta, GA, USA
SESSION: Compiler hardware interaction
Pages 23-30
Year of Publication: 2008
ISBN: 978-1-60558-469-0
ABSTRACT
Application profiling, the process of monitoring an application to determine the frequency of execution within specific regions, is an essential step in the design process for many software and hardware systems. In this paper, we present an efficient, innovative, non-intrusive dynamic application profiler (DAProf) that profiles an executing application by monitoring the application's short backward branches and provides detailed profiling statistics for characterizing loop execution behavior. DAProf is ideally suited to hardware/software partitioning approaches in which detailed loop execution information is needed to produce accurate performance estimates. DAProf achieves a profiling accuracy greater than 90% with an area overhead of only 11% relative to a small ARM9 processor.
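The idea of characterizing loops by watching short backward branches can be illustrated in software. The following is a minimal sketch only, not DAProf's hardware design: it assumes a hypothetical branch trace of `(pc, target, taken)` tuples, an arbitrary 256-byte threshold for "short", and the simplification that each loop is identified by the address of its closing branch. The names `LoopStats`, `profile_loops`, and `MAX_LOOP_BYTES` are illustrative and do not come from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed threshold for what counts as a "short" backward branch (in bytes).
# This value is illustrative; the abstract does not state one.
MAX_LOOP_BYTES = 256


@dataclass
class LoopStats:
    executions: int = 0   # number of times the loop was entered
    iterations: int = 0   # total loop-body iterations across all executions
    active: bool = False  # True while the loop is in the middle of an execution

    @property
    def avg_iterations(self) -> float:
        """Average iterations per execution (0.0 if the loop never ran)."""
        return self.iterations / self.executions if self.executions else 0.0


def is_short_backward_branch(pc: int, target: int,
                             max_bytes: int = MAX_LOOP_BYTES) -> bool:
    """A branch whose target lies at most max_bytes before the branch itself."""
    return target < pc and (pc - target) <= max_bytes


def profile_loops(trace, max_bytes: int = MAX_LOOP_BYTES) -> dict:
    """Collect per-loop statistics from (pc, target, taken) branch events.

    Each time a loop-closing branch is reached, one iteration has completed.
    A not-taken closing branch ends the current execution of that loop.
    """
    stats = defaultdict(LoopStats)
    for pc, target, taken in trace:
        if not is_short_backward_branch(pc, target, max_bytes):
            continue  # forward or long-distance branches are ignored
        loop = stats[pc]
        if not loop.active:
            loop.executions += 1
            loop.active = True
        loop.iterations += 1
        if not taken:
            loop.active = False
    return dict(stats)


# Hypothetical trace: one loop closed by the branch at 0x120, run twice
# (3 iterations, then 2), plus two branches that should be ignored.
example_trace = [
    (0x050, 0x080, True),    # forward branch: ignored
    (0x120, 0x100, True),
    (0x120, 0x100, True),
    (0x120, 0x100, False),   # first execution ends after 3 iterations
    (0x1000, 0x100, True),   # backward but too far: ignored
    (0x120, 0x100, True),
    (0x120, 0x100, False),   # second execution ends after 2 iterations
]

if __name__ == "__main__":
    for pc, s in profile_loops(example_trace).items():
        print(hex(pc), s.executions, s.iterations, s.avg_iterations)
```

Per-loop execution counts and average iterations per execution are the kind of statistics a hardware/software partitioner could use to estimate speedup for a candidate loop. A real non-intrusive profiler would gather them in hardware rather than from a software trace.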