ABSTRACT
Analytical modeling is applied to the automated design of application-specific superscalar processors. Using an analytical method bridges the gap between the size of the design space and the time required for detailed cycle-accurate simulations. The proposed design framework takes as inputs the design targets (upper bounds on execution time, area, and energy), design alternatives, and one or more application programs. The output is the set of out-of-order superscalar processors that are Pareto-optimal with respect to performance-energy-area. The core of the new design framework is made up of analytical performance and energy activity models, and an analytical model-based design optimization process. For a set of benchmark programs and a design space of 2000 designs, the design framework arrives at all performance-energy-area Pareto-optimal design points within 16 minutes on a 2 GHz Pentium-4. In contrast, it is estimated that a naive cycle-accurate simulation-based exhaustive search would require at least two months to arrive at the Pareto-optimal design points for the same design space.
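To make the notion of a performance-energy-area Pareto-optimal set concrete, the following is a minimal sketch of filtering candidate designs against upper bounds and then discarding dominated points. It is an illustration only, not the optimization process of the paper: the `Design` record, the function names, and all numeric values are hypothetical, and the metrics are assumed to come from some external model where lower is better.

```python
from typing import List, NamedTuple


class Design(NamedTuple):
    """Hypothetical design point; all three metrics are 'lower is better'."""
    name: str
    time: float    # execution time
    energy: float  # energy
    area: float    # area


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b on every metric and strictly better on one."""
    a_metrics = (a.time, a.energy, a.area)
    b_metrics = (b.time, b.energy, b.area)
    no_worse = all(x <= y for x, y in zip(a_metrics, b_metrics))
    strictly_better = any(x < y for x, y in zip(a_metrics, b_metrics))
    return no_worse and strictly_better


def pareto_front(designs: List[Design], max_time: float,
                 max_energy: float, max_area: float) -> List[Design]:
    """Keep designs that meet the upper bounds and are not dominated."""
    feasible = [d for d in designs
                if d.time <= max_time
                and d.energy <= max_energy
                and d.area <= max_area]
    return [d for d in feasible
            if not any(dominates(other, d) for other in feasible)]


# Made-up example values, chosen only to exercise each case.
designs = [
    Design("A", 1.0, 5.0, 3.0),
    Design("B", 2.0, 4.0, 3.0),
    Design("C", 2.0, 5.0, 4.0),  # dominated by A
    Design("D", 0.5, 9.0, 2.0),  # violates the energy bound below
]
front = pareto_front(designs, max_time=3.0, max_energy=8.0, max_area=5.0)
print([d.name for d in front])
```

This pairwise check is quadratic in the number of feasible designs, which is cheap for a space of a few thousand points; the expensive part in practice is obtaining the metrics for each design, which is where an analytical model replaces cycle-accurate simulation.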