ABSTRACT
Field-programmable gate arrays (FPGAs) are increasingly used in general-purpose computing platforms to augment microprocessors, enabling coprocessors customized for particular applications to be loaded at runtime. Such transmuting coprocessors create new dynamic management problems: when to load a coprocessor, where to place it in the FPGA, and which resident coprocessor to replace. We define a transmuting coprocessor problem based on Intel's FSB-FPGA architecture, with attention to communication and memory contention. We develop an online algorithm to manage coprocessor loading, the AG algorithm, which uses aggregated gains to guide coprocessor load, placement, replacement, and wait decisions. Experiments with embedded system applications cover random, biased, and periodic input application sequences, a range of reconfiguration times, and different FPGA types with different numbers of partially reconfigurable regions; they demonstrate that the AG algorithm is robust across a variety of situations. On average, the AG algorithm's results are within 15% of those for an unlimited-size FPGA, exhibit a small standard deviation, and show a 1.4x speedup over a static coprocessor loading approach and a 3x speedup over microprocessor-only execution.