| Using branch handling hardware to support profile-driven optimization |
| Full text |
Pdf
(954 KB)
|
| Source
|
International Symposium on Microarchitecture
archive
Proceedings of the 27th annual international symposium on Microarchitecture
table of contents
San Jose, California, United States
Pages: 12 - 21
Year of Publication: 1994
ISBN:0-89791-707-3
|
|
Authors
|
|
Thomas M. Conte
|
Department of Electrical and Computer Engineering, University of South Carolina, Columbia, South Carolina
|
|
Burzin A. Patel
|
Department of Electrical and Computer Engineering, University of South Carolina, Columbia, South Carolina
|
|
J. Stan Cox
|
Database and Compiler Technology, AT&T Global Information Solutions, Columbia, South Carolina
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 3, Downloads (12 Months): 24, Citation Count: 13
|
|
|
ABSTRACT
Profile-based optimizations can be used for instruction scheduling, loop scheduling, data preloading, function in-lining, and instruction cache performance enhancement. However, these techniques have not been embraced by software vendors because programs instrumented for profiling run 2–30 times slower, an awkward compile-run-recompile sequence is required, and a test input suite must be collected and validated for each program. This paper proposes using existing branch handling hardware to generate profile information in real time. Techniques are presented for both one-level and two-level branch hardware organizations. The approach produces high accuracy with small slowdown in execution (0.4%–4.6%). This allows a program to be profiled while it is used, eliminating the need for a test input suite. This practically removes the inconvenience of profiling. With contemporary processors driven increasingly by compiler support, hardware-based profiling is important for high-performance systems.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
J. A. Fisher, "Trace scheduhng: A technique for global microcode compaction," IEEE Trans. Cornput., vol. C-30, no. 7, pp. 478-490, July 1981.
|
| |
2
|
|
| |
3
|
|
 |
4
|
|
 |
5
|
|
 |
6
|
|
 |
7
|
|
 |
8
|
|
| |
9
|
Richard E. Hank , Scott A. Mahlke , Roger A. Bringmann , John C. Gyllenhaal , Wen-mei W. Hwu, Superblock formation using static program analysis, Proceedings of the 26th annual international symposium on Microarchitecture, p.247-255, December 01-03, 1993, Austin, Texas, United States
|
| |
10
|
|
| |
11
|
S. P. Song and M. Denman, "The PowerPC 604 RISC microprocessor," tech. rep., Somerset Design Center, Austin, TX, Apr. 1994.
|
| |
12
|
|
 |
13
|
|
 |
14
|
|
| |
15
|
|
| |
16
|
M.L. Golden, "Issues in trace collection through program instrumentation," Master's thesis, Department of Electrical and Computer Engineering, University of Illinois, Urbana-Champaign, Illinois, 1991.
|
| |
17
|
T. Ball and J. R. Larus, "Optimally profiling and tracing programs," Tech. Rep. 1031, Computer Sciences Dept., University of Wisconsin-Madison, 1991.
|
| |
18
|
S. I. Feldman, "Make- A program for maintaining computer programs," Software-Practice and Experience, vol. 9, pp. 255-265, Apr. 1979.
|
 |
19
|
Tim A. Wagner , Vance Maverick , Susan L. Graham , Michael A. Harrison, Accurate static estimators for program optimization, Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation, p.85-96, June 20-24, 1994, Orlando, Florida, United States
|
CITED BY 13
|
|
|
|
|
Jeffrey Dean , James E. Hicks , Carl A. Waldspurger , William E. Weihl , George Chrysos, ProfileMe: hardware support for instruction-level profiling on out-of-order processors, Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, p.292-302, December 01-03, 1997, Research Triangle Park, North Carolina, United States
|
|
|
|
|
|
|
|
|
Howard Chen , Wei-Chung Hsu , Jiwei Lu , Pen-Chung Yew , Dong-Yuan Chen, Dynamic trace selection using performance monitoring hardware sampling, Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization, March 23-26, 2003, San Francisco, California
|
|
|
|
|
|
|
|
|
|
|
|
Shashidhar Mysore , Banit Agrawal , Rodolfo Neuber , Timothy Sherwood , Nisheeth Shrivastava , Subhash Suri, Formulating and implementing profiling over adaptive ranges, ACM Transactions on Architecture and Code Optimization (TACO), v.5 n.1, p.1-32, May 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|