Energy-aware computation duplication for improving reliability in embedded chip multiprocessors
Source: Proceedings of the 2006 Asia and South Pacific Design Automation Conference
Yokohama, Japan
SESSION: Software techniques for efficient SoC design
Pages: 134-139
Year of Publication: 2006
ISBN:0-7803-9451-8
Authors
G. Chen, Pennsylvania State University
M. Kandemir, Pennsylvania State University
F. Li, Pennsylvania State University
Sponsors
IEEE Circuits and Systems Society
SIGDA: ACM Special Interest Group on Design Automation
IEICE ESS : Institute of Electronics, Information and Communication Engineers, Engineering Sciences Society
IPSJ SIG-SLDM : Information Processing Society of Japan, SIG System LSI Design Methodology
Publisher
IEEE Press, Piscataway, NJ, USA
DOI: http://doi.acm.org/10.1145/1118299.1118342

ABSTRACT

Compilers designed for current embedded systems must be capable of addressing multiple constraints such as low power, high performance, small memory footprint and form factor, and high reliability at the same time. In particular, optimizing for one constraint should be performed carefully, considering its impact on other constraints. Recent trends indicate that transient errors are becoming increasingly important in embedded systems. Focusing on an embedded chip multiprocessor and array-intensive applications, this paper demonstrates how reliability against transient errors can be improved without impacting execution time by utilizing idle processors for duplicating some of the computations of the active processors. It also shows how a balance between power savings and reliability improvement can be struck using a metric called the energy-delay-fallibility product. Our experimental results indicate that the "percentage of duplicated computations" is a useful high-level metric for studying the tradeoffs among performance, power, and reliability.
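
The abstract names the energy-delay-fallibility product but does not define it. As a rough illustration only, the sketch below assumes the metric is the plain product of an energy figure, a delay figure, and a fallibility figure (lower is better), and uses it to pick a percentage of duplicated computations. The function names, the dictionary layout, and all numbers are hypothetical and are not taken from the paper.

```python
def edf_product(energy, delay, fallibility):
    """Assumed definition: plain product of the three quantities (lower is better)."""
    return energy * delay * fallibility


def best_duplication(candidates):
    """Return the candidate with the smallest energy-delay-fallibility product."""
    return min(
        candidates,
        key=lambda c: edf_product(c["energy"], c["delay"], c["fallibility"]),
    )


# Hypothetical, made-up operating points (NOT results from the paper).
# "dup" is the percentage of duplicated computations. Energy is assumed to
# rise as idle processors are kept busy, delay is assumed to stay flat, and
# fallibility is assumed to fall as more computations are duplicated.
candidates = [
    {"dup": 0, "energy": 1.0, "delay": 1.0, "fallibility": 1.0},
    {"dup": 50, "energy": 1.2, "delay": 1.0, "fallibility": 0.5},
    {"dup": 100, "energy": 2.5, "delay": 1.0, "fallibility": 0.3},
]

best = best_duplication(candidates)
print("chosen duplication percentage:", best["dup"])
```

With these invented numbers the intermediate point wins, which mirrors the abstract's claim that the percentage of duplicated computations acts as a knob for trading energy against reliability. Real inputs would have to come from measurement or simulation.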

