FaCSim: a fast and cycle-accurate architecture simulator for embedded systems
Source
Language, Compiler and Tool Support for Embedded Systems
Proceedings of the 2008 ACM SIGPLAN-SIGBED conference on Languages, compilers, and tools for embedded systems
Tucson, AZ, USA
SESSION: Architecture
Pages 89-100
Year of Publication: 2008
ISBN: 978-1-60558-104-0
Authors
Jaejin Lee, Seoul National University, Seoul, South Korea
Junghyun Kim, Seoul National University, Seoul, South Korea
Choonki Jang, Seoul National University, Seoul, South Korea
Seungkyun Kim, Seoul National University, Seoul, South Korea
Bernhard Egger, Samsung Institute of Technology, Yongin-si, South Korea
Kwangsub Kim, LG Electronics, Seoul, South Korea
SangYong Han, Seoul National University, Seoul, South Korea
ABSTRACT
There has been strong demand for fast and cycle-accurate virtual platforms in the embedded systems area, where developers can do meaningful software development, including performance debugging, in the context of the entire platform. In this paper, we describe the design and implementation of a fast and cycle-accurate architecture simulator called FaCSim as a first step towards such a virtual platform. FaCSim accurately models the ARM9E-S processor core and the ARM926EJ-S processor's memory subsystem. It accurately simulates exceptions and interrupts to enable whole-system simulation including the OS. Since it is implemented in a modular manner in C++, it can be easily extended with other system components by subclassing or adding new classes. FaCSim is based on an interpretive simulation technique to provide flexibility, yet it achieves high speed. It enables fast cycle-accurate architecture simulation by means of three mechanisms. First, it computes the elapsed cycles in each pipeline stage as a chunk and incrementally adds them up to advance the core clock instead of performing cycle-by-cycle simulation. Second, it uses a basic-block cache that caches decoded instructions at the basic-block level. Finally, it is parallelized to exploit the multicore systems that are now widely available. Using 21 applications from the EEMBC benchmark suite, FaCSim's accuracy is validated against the ARM926EJ-S development board from ARM; it is accurate within a ±7% error margin. Due to basic-block level caching and parallelization, FaCSim is, on average, more than three times faster than ARMulator and more than six times faster than SimpleScalar.
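The first two mechanisms named in the abstract can be illustrated with a minimal C++ sketch. It is not FaCSim's actual code: the names `ToySimulator`, `DecodedInsn`, and `BasicBlock`, the toy decoder, and the cycle costs (1 cycle per instruction, 3 for a branch) are all assumptions made for illustration. The sketch decodes a basic block once, stores it in a cache keyed by its start address, and advances the clock by adding per-instruction cycle chunks rather than ticking cycle by cycle. Parallelization and branch-target handling are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical decoded form of one instruction (illustration only).
struct DecodedInsn {
    std::uint32_t raw;     // original instruction word
    std::uint32_t cycles;  // cycle cost, assumed to be known at decode time
    bool ends_block;       // true if the instruction terminates a basic block
};

// A basic block is a straight-line run of decoded instructions.
struct BasicBlock {
    std::vector<DecodedInsn> insns;
};

class ToySimulator {
public:
    explicit ToySimulator(std::vector<std::uint32_t> memory)
        : memory_(std::move(memory)) {}

    // Runs the basic block starting at word index `pc` and returns the
    // fall-through index. Branch targets are not followed in this sketch.
    std::size_t run_block(std::size_t pc) {
        auto it = cache_.find(pc);
        if (it == cache_.end()) {
            // Cache miss: decode the whole block once and remember it.
            ++decode_misses_;
            it = cache_.emplace(pc, decode_block(pc)).first;
        }
        const BasicBlock& bb = it->second;
        for (const DecodedInsn& insn : bb.insns) {
            // Advance the clock by a chunk of cycles per instruction
            // instead of simulating each cycle individually.
            clock_ += insn.cycles;
        }
        return pc + bb.insns.size();
    }

    std::uint64_t clock() const { return clock_; }
    std::size_t decode_misses() const { return decode_misses_; }

private:
    // Toy decoder (assumption): a top byte of 0xEA is treated as a branch
    // costing 3 cycles; every other word costs 1 cycle.
    static DecodedInsn decode(std::uint32_t word) {
        const bool is_branch = (word >> 24) == 0xEAu;
        return DecodedInsn{word, is_branch ? 3u : 1u, is_branch};
    }

    BasicBlock decode_block(std::size_t pc) const {
        BasicBlock bb;
        while (pc < memory_.size()) {
            const DecodedInsn insn = decode(memory_[pc++]);
            bb.insns.push_back(insn);
            if (insn.ends_block) break;
        }
        return bb;
    }

    std::vector<std::uint32_t> memory_;
    std::unordered_map<std::size_t, BasicBlock> cache_;
    std::uint64_t clock_ = 0;
    std::size_t decode_misses_ = 0;
};
```

Running the same block a second time adds the same number of cycles to the clock but does not decode it again, which is the saving a basic-block cache is meant to provide in an interpretive simulator.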