ABSTRACT
Soft-core microprocessors mapped onto field-programmable gate arrays (FPGAs) represent an increasingly common embedded software implementation option. Modern FPGA soft cores are parameterized to support application-specific customization, wherein pre-defined units, such as a multiplication unit or floating-point unit, may be included in the microprocessor architecture to speed up software execution at the expense of increased size. We introduce a methodology for fast application-specific customization of a parameterized FPGA soft core. The methodology uses synthesis and execution to obtain size and performance data, so that the resulting tool can be used across a variety of tool platforms and FPGA devices. Because synthesizing a soft core takes tens of minutes, developing heuristics that execute in an acceptable time of an hour or two, yet find near-optimal results, is a challenge. We consider two approaches. The first is a traditional CAD approach that performs an initial characterization using synthesis to create an abstract problem model and then explores the solution space using a knapsack algorithm. The second is a synthesis-in-the-loop exploration approach. We compare the approaches for a variety of design constraints, on 11 EEMBC benchmarks, using an actual Xilinx soft-core processor, and for two different commercial Xilinx FPGA devices. Our results show that the approaches can generate a customized configuration exhibiting roughly 2x speedups over a base soft core, reaching within 4% of optimal in about 1.5 hours, including complete synthesis of the soft core onto the FPGA, compared to over 11 hours for exhaustive search. Our results also show that including synthesis-in-the-loop, compared to a traditional CAD approach, improved speedups by an average of 20% when size constraints were tight. The approaches may also be applicable to soft-core processors targeted to ASICs in addition to FPGAs.
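
As a rough illustration of the knapsack formulation mentioned above, the following Python sketch selects optional units under a size budget using a standard 0-1 knapsack dynamic program. It is only a sketch under stated assumptions: the unit names, size costs, cycle savings, and the function name `select_units` are hypothetical placeholders, not data or code from the paper, and the model assumes each unit's size and benefit are independent and additive, which an abstract characterization may only approximate.

```python
from typing import List, Tuple

# Hypothetical optional units: (name, extra size, estimated cycles saved).
# All numbers are illustrative placeholders, not measurements from the paper.
UNITS: List[Tuple[str, int, int]] = [
    ("multiplier", 40, 300),
    ("barrel_shifter", 30, 200),
    ("divider", 50, 250),
    ("fpu", 120, 600),
]


def select_units(
    units: List[Tuple[str, int, int]], size_budget: int
) -> Tuple[int, List[str]]:
    """Pick a subset of units maximizing cycles saved within size_budget.

    Standard 0-1 knapsack dynamic program over integer size values.
    Assumes sizes and savings are additive across units (an assumption
    of this sketch).
    """
    # best[s] = (total cycles saved, chosen unit names) using at most s size
    best: List[Tuple[int, List[str]]] = [(0, [])] * (size_budget + 1)
    for name, size, saved in units:
        # Iterate budgets downward so each unit is included at most once.
        for s in range(size_budget, size - 1, -1):
            candidate = best[s - size][0] + saved
            if candidate > best[s][0]:
                best[s] = (candidate, best[s - size][1] + [name])
    return best[size_budget]


if __name__ == "__main__":
    saved, chosen = select_units(UNITS, 120)
    print(saved, chosen)
```

With these placeholder numbers and a budget of 120, the three smaller units together save more cycles than the single large unit at the same size. In a synthesis-in-the-loop flow, by contrast, each candidate configuration would be synthesized and measured rather than estimated by summing per-unit figures.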