ABSTRACT
To improve FPGA performance for arithmetic circuits, this paper proposes a new architecture for FPGA logic cells that includes a 6:2 compressor. The new cell features additional fast carry chains that concatenate adjacent compressors and can be routed locally, without using the global routing network. Unlike previous carry chains for binary and ternary addition, the carry chain used by the new cell spans only two logic blocks, which significantly improves the delay of multi-input addition operations mapped onto the FPGA. The delay and area overhead of augmenting a traditional FPGA logic cell with the new compressor structure is minimal. Using this new cell, we observed an average speedup in combinational delay of 1.41x compared to adder trees synthesized using ternary adders.
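As a rough illustration of what a 6:2 compressor computes, the following Python sketch models one bit-level column built from four full adders and chains such columns to reduce six operands to two. It is a generic textbook-style construction, not the cell proposed in this paper: the function names, the use of three carry-in and three carry-out bits, and the internal wiring are all assumptions chosen for illustration. With this particular wiring, two carry-outs depend only on the column's own inputs and the third depends on a single carry-in, so a carry can influence at most the next two columns.

```python
from itertools import product


def full_adder(a, b, c):
    """3:2 counter: returns (sum bit, carry bit) for three input bits."""
    total = a + b + c
    return total & 1, total >> 1


def compressor_6_2(x, cin):
    """Hypothetical 6:2 compressor column (illustrative wiring, not the paper's).

    x   : six input bits of equal weight
    cin : three carry-in bits from the previous column
    Returns (s, c, cout): s has the column's weight; c and the three
    cout bits have twice that weight.
    """
    s1, cout0 = full_adder(x[0], x[1], x[2])   # depends on inputs only
    s2, cout1 = full_adder(x[3], x[4], x[5])   # depends on inputs only
    s3, cout2 = full_adder(s1, s2, cin[0])     # depends on one carry-in
    s, c = full_adder(s3, cin[1], cin[2])      # final two output bits
    return s, c, (cout0, cout1, cout2)


def compress_six_operands(operands, width):
    """Reduce six non-negative integers (< 2**width) to two integers S and C
    such that S + C equals the sum of the operands."""
    assert len(operands) == 6
    S = C = 0
    cin = (0, 0, 0)
    # One extra column absorbs the carry-outs of the most significant column.
    for i in range(width + 1):
        x = [(op >> i) & 1 for op in operands]
        s, c, cin = compressor_6_2(x, cin)
        S |= s << i
        C |= c << (i + 1)
    assert cin == (0, 0, 0)  # nothing left in the carry chain
    return S, C


def check_column_identity():
    """Exhaustively check: inputs + carry-ins == s + 2 * (c + carry-outs)."""
    for bits in product((0, 1), repeat=9):
        x, cin = bits[:6], bits[6:]
        s, c, cout = compressor_6_2(x, cin)
        if sum(bits) != s + 2 * (c + sum(cout)):
            return False
    return True
```

The two results `S` and `C` would still need one final carry-propagate addition to produce the sum; the point of the compressor is that everything before that last step avoids long carry chains.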