Transmuting coprocessors: dynamic loading of FPGA coprocessors
Source: Annual ACM IEEE Design Automation Conference
Proceedings of the 46th Annual Design Automation Conference
San Francisco, California
SESSION: Leveraging parallelism in FPGAs and multicore systems
Pages 848-851  
Year of Publication: 2009
ISBN: 978-1-60558-497-3
Authors
Chen Huang  Univ. of California, Riverside
Frank Vahid  Univ. of California, Riverside
Sponsors
EDAC : Electronic Design Automation Consortium
SIGDA: ACM Special Interest Group on Design Automation
IEEE-CAS : Circuits & Systems
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1629911.1630127

ABSTRACT

Field-programmable gate arrays (FPGAs) are increasingly used in general-purpose computing platforms to augment microprocessors, enabling runtime loading of coprocessors customized to speed up particular applications. Such transmuting coprocessors create new dynamic management problems: deciding when to load a coprocessor, where to place it in the FPGA, and which resident coprocessor to replace. We define a transmuting coprocessor problem based on Intel's FSB-FPGA architecture, paying attention to communication and memory contention. We develop an online algorithm to manage coprocessor loading, the AG algorithm, which uses aggregated gains to guide coprocessor load, placement, replacement, and wait decisions. Experiments using embedded system applications cover random, biased, and periodic input application sequences, a range of reconfiguration times, and different FPGA types with different numbers of partially reconfigurable regions; they demonstrate that the AG algorithm is robust across a variety of situations. On average, the AG algorithm's results are within 15% of those of an unlimited-size FPGA, exhibit a small standard deviation, and show a 1.4x speedup over a static coprocessor loading approach and a 3x speedup over microprocessor-only execution.
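
The abstract does not give the details of the AG algorithm, so the following is only an illustrative sketch of how accumulated gains could drive online load and replacement decisions. Everything in it is an assumption made for illustration: the class name `GainManager`, the rule that a coprocessor is loaded once its accumulated time savings exceed the reconfiguration time, and the rule that the resident coprocessor with the smallest accumulated gain is the replacement candidate. Placement, wait decisions, and communication and memory contention are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class GainManager:
    """Hypothetical gain-based loader; NOT the paper's AG algorithm."""
    num_regions: int            # assumed number of reconfigurable regions
    reconfig_time: float        # assumed fixed cost to load one coprocessor
    gains: dict = field(default_factory=dict)   # app -> accumulated gain
    resident: set = field(default_factory=set)  # apps with a loaded coprocessor

    def request(self, app: str, sw_time: float, hw_time: float) -> float:
        """Handle one application request and return its execution time."""
        # Accumulate the time this app would save with a coprocessor.
        self.gains[app] = self.gains.get(app, 0.0) + (sw_time - hw_time)

        if app in self.resident:
            return hw_time  # coprocessor already loaded

        if self.gains[app] <= self.reconfig_time:
            return sw_time  # not yet worth a reconfiguration

        if len(self.resident) >= self.num_regions:
            # Replacement candidate: resident app with the smallest gain.
            victim = min(self.resident, key=lambda a: self.gains[a])
            if self.gains[app] <= self.gains[victim] + self.reconfig_time:
                return sw_time  # keep the current residents
            self.resident.remove(victim)

        self.resident.add(app)
        return self.reconfig_time + hw_time
```

With one region and a reconfiguration time of 5, an application whose software and hardware times are 10 and 2 is loaded on its first request, since its gain of 8 exceeds the reconfiguration time. A later application with a smaller accumulated gain keeps running in software rather than evicting it.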

