ABSTRACT
Compile-time optimizations generally improve program performance. Nevertheless, degradations caused by individual compiler optimization techniques are to be expected. Feedback-directed optimization orchestration systems generate optimized code versions under a series of optimization combinations, evaluate their performance, and search for the best version. One challenge for such systems is to tune program performance quickly in an exponential search space. Another challenge is to achieve high program performance, given that optimizations interact. Aiming at these two goals, this article presents an automated performance tuning system, PEAK, which searches for the best compiler optimization combinations for the important code sections in a program. The major contributions of this work are as follows: (1) an algorithm called Combined Elimination (CE) is developed to explore the optimization space quickly and effectively; (2) three fast and accurate rating methods are designed to evaluate the performance of an optimized code section based on a partial execution of the program; (3) an algorithm is developed to identify important code sections as candidates for performance tuning by trading off tuning speed and tuned program performance; and (4) a set of compiler tools is implemented to automate optimization orchestration. Orchestrating optimization options in SUN Forte compilers at the whole-program level, our CE algorithm improves performance by 10.8% over the SPEC CPU2000 FP baseline setting, compared to the 5.6% improvement achieved by manual tuning. Orchestrating GCC O3 optimizations, CE improves performance by 12% over O3, the highest optimization level. Applying the rating methods, PEAK reduces tuning time from 2.19 hours to 5.85 minutes on average, while achieving equal or better program performance.
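To make the search problem concrete, the following is a minimal sketch of a feedback-directed search over on/off optimization flags. It uses a simple greedy elimination heuristic chosen for illustration; it is not claimed to be the article's Combined Elimination algorithm. The flag names, the `toy_run_time` cost model, and the `eliminate` function are all hypothetical. In a real system, `run_time` would compile and time the program.

```python
from typing import Callable, FrozenSet, Iterable


def eliminate(flags: Iterable[str],
              run_time: Callable[[FrozenSet[str]], float]) -> FrozenSet[str]:
    """Greedy elimination: start with every flag on, then repeatedly
    switch off the single flag whose removal reduces run time the most."""
    enabled = frozenset(flags)
    best = run_time(enabled)
    while enabled:
        # Measure each configuration that has exactly one more flag off.
        trials = {f: run_time(enabled - {f}) for f in sorted(enabled)}
        flag, t = min(trials.items(), key=lambda kv: kv[1])
        if t >= best:
            break  # no single removal helps any more
        enabled, best = enabled - {flag}, t
    return enabled


def toy_run_time(enabled: FrozenSet[str]) -> float:
    """Hypothetical cost model (seconds); stands in for a real measurement."""
    t = 100.0
    if "unroll" in enabled:
        t -= 8    # helpful
    if "inline" in enabled:
        t -= 5    # helpful
    if "prefetch" in enabled:
        t += 4    # harmful on its own
    if "sched" in enabled:
        t -= 3    # helpful alone ...
        if "unroll" in enabled:
            t += 6  # ... but harmful when combined with unroll
    return t


if __name__ == "__main__":
    chosen = eliminate(["unroll", "inline", "prefetch", "sched"], toy_run_time)
    print(sorted(chosen), toy_run_time(chosen))
```

With n flags, an exhaustive search would time 2^n configurations, whereas this greedy loop times at most on the order of n^2. The `sched` flag illustrates interaction: it helps in isolation in this toy model but hurts alongside `unroll`, so a search must evaluate flags in the context of the current configuration.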