ABSTRACT
Embedded systems are often implemented on FPGA devices, and 25% of the time they include a soft processor, that is, a processor built from the FPGA's reprogrammable fabric. Because of their prevalence and flexibility, soft processors are compelling targets for customization, although current soft processors provide few architectural variations. Recent work has proposed augmenting soft processors with customizable vector processing support, enabling designers to easily scale performance by exploiting the data parallelism available in an application. However, this approach provides only coarse-grain scaling: each step doubles the number of vector datapaths while yielding less than double the performance. In this work we further augment soft vector processors with more fine-grain architectural modifications: we add support for (i) vector chaining and (ii) heterogeneous vector lanes, allowing the soft vector processor to be customized not only to the data-level parallelism available in an application but also to its functional unit demand. We evaluate area and wall-clock performance with full hardware implementations on state-of-the-art FPGAs and find that chaining can provide an average performance improvement of 15-45% for less area than doubling the lanes, and that heterogeneous lanes can save 6-13% area with little or no performance loss in some cases. Finally, we implement 1200 soft vector processor variants and find that the peak performance per area, compared to our base vector processor, can be increased by an average of 13% and by up to 34% when choosing the best variant per application.
INDEX TERMS
Primary Classification: C. Computer Systems Organization > C.1 PROCESSOR ARCHITECTURES > C.1.3 Other Architecture Styles
Subjects: Adaptable architectures
General Terms: Design, Measurement, Performance
Keywords: ASIP, FPGA, SIMD, VESPA, application specific, custom, microarchitecture, soft processor, soft vector processor, vector, viram