ABSTRACT
Architects use cycle-by-cycle simulation to evaluate design choices and to understand tradeoffs and interactions among design parameters. Efficiently exploring exponential-size design spaces with many interacting parameters remains an open problem: the sheer number of experiments renders detailed simulation intractable. We attack this problem via an automated approach that builds accurate, confident predictive design-space models. We simulate sampled points and use the results to teach our models the function describing relationships among design parameters. The models produce highly accurate performance estimates for other points in the space, can be queried to predict the performance impact of architectural changes, and are very fast compared to simulation, enabling efficient discovery of tradeoffs among parameters in different regions. We validate our approach via sensitivity studies on memory hierarchy and CPU design spaces: our models generally predict IPC with only 1-2% error and reduce required simulation by two orders of magnitude. We also show the efficacy of our technique for exploring chip multiprocessor (CMP) design spaces: when trained on a 1% sample drawn from a CMP design space with 250K points and up to 55x performance swings among different system configurations, our models predict performance with only 4-5% error on average. Our approach combines with techniques that reduce time per simulation, achieving net time savings of three to four orders of magnitude.
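The sample-then-predict workflow described above can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not name a model family, so a simple nearest-neighbor predictor stands in for the learned model, `simulate_ipc` is an invented formula standing in for a real cycle-by-cycle simulator, and the parameter names, values, and 25% sampling rate are all hypothetical.

```python
import itertools
import math
import random

# Hypothetical design space: the cross product of a few parameter settings.
PARAM_VALUES = [
    [8, 16, 32, 64],          # L1 cache size (KB)
    [256, 512, 1024, 2048],   # L2 cache size (KB)
    [1, 2, 4, 8],             # issue width
    [32, 64, 128, 256],       # reorder-buffer entries
]

def simulate_ipc(point):
    """Stand-in for an expensive cycle-by-cycle simulation (invented formula)."""
    l1, l2, width, rob = point
    miss_penalty = 1.0 / math.log2(l1) + 2.0 / math.log2(l2)
    window_limit = min(width, rob / 32.0)
    return math.sqrt(window_limit) / (1.0 + miss_penalty)

def features(point):
    """Map each parameter to log2 scale so every doubling counts equally."""
    return [math.log2(v) for v in point]

def predict_ipc(point, training, k=4):
    """Inverse-distance-weighted average of the k nearest simulated points."""
    target = features(point)
    nearest = sorted(
        (math.dist(target, features(p)), ipc) for p, ipc in training
    )[:k]
    if nearest[0][0] == 0.0:  # the point itself was simulated
        return nearest[0][1]
    weights = [1.0 / d for d, _ in nearest]
    return sum(w * ipc for w, (_, ipc) in zip(weights, nearest)) / sum(weights)

def mean_percentage_error(points, training):
    """Average |predicted - simulated| / simulated over the given points."""
    errors = [
        abs(predict_ipc(p, training) - simulate_ipc(p)) / simulate_ipc(p)
        for p in points
    ]
    return sum(errors) / len(errors)

rng = random.Random(0)
space = list(itertools.product(*PARAM_VALUES))    # 256 design points
sampled = rng.sample(space, len(space) // 4)      # "simulate" 25% of them
sampled_set = set(sampled)
training = [(p, simulate_ipc(p)) for p in sampled]
held_out = [p for p in space if p not in sampled_set]
error = mean_percentage_error(held_out, training)
print(f"simulated {len(training)} of {len(space)} points; "
      f"mean error on the rest: {100 * error:.1f}%")
```

In a real study the held-out error could not be computed this way, because simulating every unsampled point is exactly the cost being avoided; an estimate from cross-validation on the sampled points would be the usual substitute.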