ABSTRACT
The Streams-C compiler [5] synthesizes hardware circuits for reconfigurable FPGA-based computers from parallel C programs. The Streams-C language consists of a small number of libraries and intrinsic functions added to a synthesizable subset of C, and supports a communicating process programming model. The processes may be either software or hardware processes, and the compiler manages communication among the processes transparently to the programmer. For the hardware processes, the compiler generates Register-Transfer-Level (RTL) VHDL, targeting multiple FPGAs with dedicated memories. For the software processes, a multi-threaded software program is generated.

The Streams-C language and compiler offer a very high level of expressivity for reconfigurable computing application development, particularly for stream-processing applications. We find that this is reflected in productivity, with up to a tenfold reduction in the time needed to produce a program. However, use of the tool in the "real world" is predicated on performance: only if such a compiler can deliver performance comparable to that of hand-coded designs will it be used in practice.

This paper presents an application study of the Streams-C compiler. Four applications have been written in Streams-C and compiled to the AMS Wildforce board containing Xilinx 4036 FPGAs. Those same applications have been hand-coded in a combination of RTL and structural VHDL. We compare the performance of the generated code with that of the hand-optimized code. Our study shows that the compiler-generated designs are 1.37 to 4 times the area and 1/2 to 1 times the clock frequency of the hand designs. We find that the compiler, based on the SUIF infrastructure, can be greatly improved through various standard compiler optimizations that are not currently being exploited. Thus we are currently rewriting a public domain version of Streams-C to better optimize and target the Virtex chip.
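The communicating process model described in the abstract can be illustrated with a minimal sketch. This is not Streams-C syntax and does not use the Streams-C libraries; it is plain Python threads connected by bounded FIFO queues, and every name in it (`producer`, `accumulator`, `run_pipeline`, `END`, `depth`) is a hypothetical chosen for illustration. The `accumulator` thread merely stands in for what would be a hardware process on the FPGA.

```python
import queue
import threading

END = object()  # hypothetical end-of-stream marker, not part of Streams-C


def producer(out_stream, values):
    """Software-side process: write each value to the stream, then close it."""
    for v in values:
        out_stream.put(v)  # blocks while the bounded stream is full
    out_stream.put(END)


def accumulator(in_stream, out_stream):
    """Stand-in for a hardware process: emit the running sum of its input."""
    total = 0
    while True:
        item = in_stream.get()  # blocks until a value is available
        if item is END:
            out_stream.put(END)
            return
        total += item
        out_stream.put(total)


def run_pipeline(values, depth=4):
    """Connect the two processes with bounded FIFO streams; collect the output."""
    a = queue.Queue(maxsize=depth)
    b = queue.Queue(maxsize=depth)
    threads = [
        threading.Thread(target=producer, args=(a, values)),
        threading.Thread(target=accumulator, args=(a, b)),
    ]
    for t in threads:
        t.start()
    results = []
    while True:
        item = b.get()
        if item is END:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

Calling `run_pipeline([1, 2, 3, 4])` returns the running sums of the input. The point of the sketch is only that each process sees blocking reads and writes on a stream, while the plumbing between processes is set up elsewhere, which is the role the abstract assigns to the compiler.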
REFERENCES
[1] Joseph Arrowood. Comparison of filter banks for signal detection. In LAUR Number 99-4551, Los Alamos, NM, March 2000.
[2] Xilinx Corp. http://www.xilinx.com/xilinxonline/jbits.htm. 1999.
[3] M. B. Gokhale, J. Frigo, and J. Stone. Parallel C programming of reconfigurable computers: the Streams-C approach. In HPEC 2000, September 2000.
[4]
[5]
[6]
[7] Mary Hall et al. Defacto: A design environment for adaptive computing technology. Proceedings of the 6th Reconfigurable Architectures Workshop (RAW'99), 1999.
[8] Dominique Lavenier, James Theiler, John Szymanski, Maya Gokhale, and Janette Frigo. FPGA implementation of the pixel purity index algorithm. In SPIE, FPGAs and Reconfigurable Processors for Computing and Applications, vol 4212, Boston, MA, November 2000.
[9] Miriam Leeser. Applying reconfigurable hardware to segmentation for multispectral imagery. In HPEC 2000, Boston, MA, September 2000.
[10]
[11] P. Banerjee, N. Shenoy, A. Choudhary, S. Hauck, C. Bachmann, M. Haldar, P. Joisha, A. Jones, A. Kanhare, A. Nayak, S. Periyacheri, M. Walkden, D. Zaretsky. A MATLAB Compiler for Distributed, Heterogeneous, Reconfigurable Computing Systems. Proceedings of the 2000 IEEE Symposium on Field-Programmable Custom Computing Machines, p.39, April 17-19, 2000.
[12]
[13] Eric K. Pauer, Paul D. Fiore, John M. Smith, Cory S. Myers. Algorithm analysis and mapping environment for adaptive computing systems (poster abstract). Proceedings of the 2000 ACM/SIGDA eighth international symposium on Field programmable gate arrays, p.217, February 10-11, 2000, Monterey, California, United States. doi:10.1145/329166.329209
[14] S. Periyacheri et al. Library functions in reconfigurable hardware for matrix and signal processing operations in MATLAB. Proc. 11th IASTED Parallel and Distributed Computing and Systems Conference (PDCS'99), November 1999.
[15]
CITED BY 23
Maya Gokhale, Jan Frigo, Kevin McCabe, James Theiler, Christophe Wolinski, Dominique Lavenier. Experience with a Hybrid Processor: K-Means Clustering. The Journal of Supercomputing, v.26 n.2, p.131-148, September 2003.
Girish Venkataramani, Walid Najjar, Fadi Kurdahi, Nader Bagherzadeh, Wim Bohm, Jeff Hammes. Automatic compilation to a coarse-grained reconfigurable system-on-chip. ACM Transactions on Embedded Computing Systems (TECS), v.2 n.4, p.560-589, November 2003.
Zhi Guo, Walid Najjar, Frank Vahid, Kees Vissers. A quantitative analysis of the speedup factors of FPGAs over processors. Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field programmable gate arrays, February 22-24, 2004, Monterey, California, USA.
Andrea Di Blas, David M. Dahle, Mark Diekhans, Leslie Grate, Jeffrey Hirschberg, Kevin Karplus, Hansjorg Keller, Mark Kendrick, Francisco J. Mesa-Martinez, David Pease, Eric Rice, Angela Schultz, Don Speck, Richard Hughey. The UCSC Kestrel Parallel Processor. IEEE Transactions on Parallel and Distributed Systems, v.16 n.1, p.80-92, January 2005.
Andrew R. Putnam, Dave Bennett, Eric Dellinger, Jeff Mason, Prasanna Sundararajan. CHiMPS: a high-level compilation flow for hybrid CPU-FPGA architectures. Proceedings of the 16th international ACM/SIGDA symposium on Field programmable gate arrays, February 24-26, 2008, Monterey, California, USA.