A faster distributed arithmetic architecture for FPGAs
Source:
International Symposium on Field Programmable Gate Arrays
Proceedings of the 2002 ACM/SIGDA tenth international symposium on Field-programmable gate arrays
Monterey, California, USA
Session: Arithmetic
Pages: 31 - 39
Year of Publication: 2002
ISBN: 1-58113-452-5
ABSTRACT
Distributed Arithmetic (DA) is an important technique for implementing digital signal processing (DSP) functions in FPGAs. However, traditional lookup table (LUT) based DA architectures contain one or more carry-propagation chains in the critical path, which dictates the maximum speed at which the entire design can run. In this paper, we describe a novel technique that can reduce or eliminate the carry-propagate chain from the critical path in LUT-based DA architectures on FPGAs. In the proposed scheme, the individual bits of a word do not have to be processed as a unit. Instead, the current iteration can start as soon as the least significant bit (LSB) of the previous iteration is available, without waiting for the entire word from the previous iteration to be fully computed. This technique has great potential for speeding up DSP applications based on DA. Designs are described for serial and parallel DALUT and accumulator structures in which an n-bit carry chain, where n is the word length, is broken into smaller r-bit chains, 1 ≤ r < n. A cost-performance analysis of the designs is presented. The analysis shows that the designs proposed in this paper have a lower cost-performance ratio (indicating better performance) than traditional DA designs. We also show that the 8-bit (r = 8) designs offer a good compromise between cost and performance. The implementation is on a Xilinx XC4028XL-3-BG256 chip using Xilinx Foundation tools v3.1i. The results show that the proposed designs can achieve a speedup of at least a factor of 1.5 over traditional DA designs in some cases.
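As a rough illustration of the two ideas the abstract refers to, the following Python sketch models textbook LUT-based DA for an inner product with unsigned inputs, followed by an n-bit addition split into r-bit segments with an explicit carry passed between segments. It is a software model under stated assumptions, not the architecture proposed in the paper: the function names (`build_da_lut`, `da_inner_product`, `add_segmented`) are hypothetical, inputs are assumed unsigned, and the segmented adder only shows how a long carry chain can be cut into r-bit pieces, not how the paper schedules those pieces in hardware.

```python
def build_da_lut(coeffs):
    """Precompute the DA lookup table.

    Entry `addr` holds the sum of the coefficients whose index bit is set
    in `addr`, so one lookup replaces len(coeffs) one-bit multiplications.
    """
    k = len(coeffs)
    return [
        sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
        for addr in range(1 << k)
    ]


def da_inner_product(coeffs, xs, width):
    """Compute sum(c * x) bit-serially, LSB first (assumes unsigned xs)."""
    lut = build_da_lut(coeffs)
    acc = 0
    for b in range(width):
        # Gather bit b of every input word to form the LUT address.
        addr = 0
        for i, x in enumerate(xs):
            addr |= ((x >> b) & 1) << i
        # Weight the partial sum by 2**b and accumulate.
        acc += lut[addr] << b
    return acc


def add_segmented(a, b, n, r):
    """Add two n-bit unsigned words using r-bit segments.

    Each segment has its own short carry chain; only a single carry bit
    crosses a segment boundary. Returns (sum mod 2**n, carry_out).
    """
    result = 0
    carry = 0
    for lo in range(0, n, r):
        w = min(r, n - lo)              # last segment may be narrower
        m = (1 << w) - 1
        s = ((a >> lo) & m) + ((b >> lo) & m) + carry
        result |= (s & m) << lo
        carry = s >> w
    return result, carry


coeffs = [3, -5, 7, 2]
xs = [10, 4, 255, 1]
y = da_inner_product(coeffs, xs, width=8)
total, carry_out = add_segmented(0xABCD, 0x1234, n=16, r=8)
```

In this model the r = 8 case mentioned in the abstract corresponds to calling `add_segmented` with `r=8`; in hardware, the single carry crossing each segment boundary is where a register could be placed so that no combinational carry path is longer than r bits.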