| Performance and power of cache-based reconfigurable computing |
Source
|
International Symposium on Computer Architecture
Proceedings of the 36th annual international symposium on Computer architecture
Austin, TX, USA
SESSION: Memory system reconfiguration and acceleration
Pages 395-405
Year of Publication: 2009
ISBN: 978-1-60558-526-0
|
|
Authors
Andrew Putnam, University of Washington, Seattle, WA, USA
Susan Eggers, University of Washington, Seattle, WA, USA
Dave Bennett, Xilinx, Inc., San Jose, CA, USA
Eric Dellinger, Xilinx, Inc., San Jose, CA, USA
Jeff Mason, Xilinx, Inc., San Jose, CA, USA
Henry Styles, Xilinx, Inc., San Jose, CA, USA
Prasanna Sundararajan, Xilinx, Inc., San Jose, CA, USA
Ralph Wittig, Xilinx, Inc., San Jose, CA, USA
ABSTRACT
Many-cache is a memory architecture that efficiently supports caching in commercially available FPGAs. It facilitates FPGA programming for high-performance computing (HPC) developers by providing memory performance that is greater, and power consumption that is lower, than that of their current CPU platforms, without sacrificing their familiar C-based programming environment. Many-cache creates multiple, multi-banked caches on top of an FPGA's small, independent memories, each targeting a particular data structure or region of memory in an application and each customized for the memory operations that access it. The caches are automatically generated from C source by the CHiMPS C-to-FPGA compiler. This paper presents the analyses and optimizations of the CHiMPS compiler that construct many-cache caches. An architectural evaluation of CHiMPS-generated FPGAs demonstrates a performance advantage of 7.8x (geometric mean) over CPU-only execution of the same source code, FPGA power usage that is on average 4.1x less, and consequently performance per watt that is also greater, by a geometric mean of 21.3x.