Automatic performance model construction for the fast software exploration of new hardware designs
Source: International Conference on Compilers, Architecture and Synthesis for Embedded Systems
Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Seoul, Korea
SESSION: Modeling and simulation
Pages: 24 - 34  
Year of Publication: 2006
ISBN: 1-59593-543-6
Authors
John Cavazos  University of Edinburgh, UK
Christophe Dubach  University of Edinburgh, UK
Felix Agakov  University of Edinburgh, UK
Edwin Bonilla  University of Edinburgh, UK
Michael F. P. O'Boyle  University of Edinburgh, UK
Grigori Fursin  Paris-Sud University, France
Olivier Temam  Paris-Sud University, France
Sponsors
SIGDA: ACM Special Interest Group on Design Automation
ACM: Association for Computing Machinery
SIGBED: ACM Special Interest Group on Embedded Systems
SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1176760.1176765

ABSTRACT

Developing an optimizing compiler for a newly proposed architecture is extremely difficult when there is only a simulator of the machine available. Designing such a compiler requires running many experiments in order to understand how different optimizations interact. Given that simulators are orders of magnitude slower than real processors, such experiments are highly restricted. This paper develops a technique to automatically build a performance model for predicting the impact of program transformations on any architecture, based on a limited number of automatically selected runs. As a result, the time for evaluating the impact of any compiler optimization in early design stages can be drastically reduced such that all selected potential compiler optimizations can be evaluated. This is achieved by first evaluating a small set of sample compiler optimizations on a prior set of benchmarks in order to train a model, followed by a very small number of evaluations, or probes, of the target program. We show that by training on less than 0.7% of all possible transformations (640 samples collected from 10 benchmarks out of 880000 possible samples, 88000 per training benchmark) and probing the new program on only 4 transformations, we can predict the performance of all program transformations with an error of just 7.3% on average. As each prediction takes almost no time to generate, this scheme provides an accurate method of evaluating compiler performance, which is several orders of magnitude faster than current approaches.
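
The abstract does not say which model is trained or how the probes are used, so the following is only a minimal Python sketch of the general train-then-probe idea, not the authors' method. It assumes each training benchmark already has a per-transformation speedup prediction, and it assumes a simple inverse-error weighting to combine them; the function names (sampling_fraction, probe_weights, predict_all) are hypothetical. It also checks the sampling arithmetic: 640 of 880000 samples is about 0.07%, consistent with the stated bound of less than 0.7%.

```python
def sampling_fraction(samples, total):
    """Fraction of the transformation space that was actually evaluated."""
    return samples / total


def probe_weights(probe_observed, probe_predicted, eps=1e-9):
    """Weight each training benchmark by how well it matches the probes.

    probe_observed: measured speedups of the new program on the probes.
    probe_predicted: one list per training benchmark, giving that
        benchmark's predicted speedups on the same probes.
    ASSUMPTION: inverse mean-squared-error weighting, chosen for
    illustration only; the source does not specify the combination rule.
    """
    raw = []
    for preds in probe_predicted:
        mse = sum((p - o) ** 2 for p, o in zip(preds, probe_observed))
        mse /= len(probe_observed)
        raw.append(1.0 / (mse + eps))  # eps avoids division by zero
    total = sum(raw)
    return [r / total for r in raw]  # normalised so the weights sum to 1


def predict_all(weights, model_predictions):
    """Predict the new program's speedup for every transformation.

    model_predictions: one list per training benchmark, with a predicted
        speedup for each transformation in the space.
    """
    n_transformations = len(model_predictions[0])
    return [
        sum(w * preds[t] for w, preds in zip(weights, model_predictions))
        for t in range(n_transformations)
    ]
```

In this sketch, only the probe transformations need to be run on the slow simulator for the new program; every other prediction is a weighted sum, which is why each one costs almost nothing to generate.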


