DiST: a simple, reliable and scalable method to significantly reduce processor architecture simulation time
Source: Joint International Conference on Measurement and Modeling of Computer Systems
Proceedings of the 2003 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
San Diego, CA, USA
SESSION: Processor evaluation
Pages: 1 - 12
Year of Publication: 2003
ISBN: 1-58113-664-1
Authors
Sylvain Girbal  LRI, Paris South University and CEA, France
Gilles Mouchard  LRI, Paris South University and CEA, France
Albert Cohen  INRIA Rocquencourt, France
Olivier Temam  LRI, Paris South University, France
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/781027.781029

ABSTRACT

While architecture simulation is often treated as a methodology issue, it is at the core of most processor architecture research, and simulation speed is often the bottleneck of the typical trial-and-error research process. To speed up simulation during this process and identify trends faster, researchers usually reduce the trace size. More sophisticated techniques such as trace sampling or distributed simulation are rarely used because they are considered unreliable and complex, owing to their impact on accuracy and the associated warm-up issues.

In this article, we present DiST, a practical distributed simulation scheme. Unlike other simulation techniques that trade accuracy for speed, DiST relieves the user of most accuracy issues through an automatic, dynamic mechanism that adjusts the warm-up interval size. Moreover, the mechanism is designed to always favor accuracy over speedup. The speedup scales with the amount of available computing resources, reaching an average speedup of 7.35 on 10 machines with an average IPC error of 1.81% and a maximum IPC error of 5.06%.

Besides proposing a solution to the warm-up issues in distributed simulation, we show experimentally that our technique is significantly more accurate than trace size reduction or trace sampling for identical speedups. We also show that not only does the error always remain small for IPC and other metrics, but a researcher can also reliably base research decisions on DiST simulation results. Finally, we explain how the DiST tool is designed to plug easily into existing architecture simulators with very few modifications.
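The warm-up issue mentioned above can be illustrated with a small sketch. This is not the DiST algorithm, which the abstract does not detail. It is a toy model under stated assumptions: the "simulator" is a hypothetical direct-mapped cache that counts misses over an address trace, the trace is split into chunks that could each run on a separate machine, and each chunk replays a growing number of preceding addresses as warm-up until its miss count stops changing. The function names, the `step` parameter and the stopping rule are all illustrative assumptions.

```python
def simulate(trace, start, end, warmup, cache_lines=8):
    """Count misses in trace[start:end] on a toy direct-mapped cache.

    The `warmup` addresses preceding `start` are replayed first to
    populate the cache, but their misses are not counted.
    """
    cache = [None] * cache_lines
    misses = 0
    for i in range(max(0, start - warmup), end):
        addr = trace[i]
        slot = addr % cache_lines
        if cache[slot] != addr:
            cache[slot] = addr
            if i >= start:  # only count misses inside the measured chunk
                misses += 1
    return misses


def chunk_bounds(length, n_chunks):
    """Split [0, length) into at most n_chunks contiguous (start, end) ranges."""
    size = max(1, -(-length // n_chunks))  # ceiling division
    return [(s, min(length, s + size)) for s in range(0, length, size)]


def adaptive_chunk(trace, start, end, step=4):
    """Grow the warm-up until the miss count stops changing.

    Hypothetical stopping rule, chosen for illustration only.
    Returns (misses, warmup_used).
    """
    warmup = 0
    prev = simulate(trace, start, end, warmup)
    while warmup < start:
        warmup = min(start, warmup + step)
        cur = simulate(trace, start, end, warmup)
        if cur == prev:
            return cur, warmup
        prev = cur
    return prev, warmup


def distributed_misses(trace, n_chunks):
    """Sum per-chunk miss counts; each chunk could run on its own machine."""
    return sum(adaptive_chunk(trace, s, e)[0]
               for s, e in chunk_bounds(len(trace), n_chunks))


if __name__ == "__main__":
    trace = [(i * 7) % 23 for i in range(200)]
    print("sequential :", simulate(trace, 0, len(trace), 0))
    print("distributed:", distributed_misses(trace, 4))
```

In this toy model, a chunk that starts with a cold cache can only overcount misses, and replaying the entire prefix as warm-up reproduces the sequential result exactly at the cost of all the speedup. The stopping rule is a heuristic sitting between those two extremes, which is the accuracy-versus-speed tension the abstract describes.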

