Design space exploration for field programmable compressor trees
Source
International Conference on Compilers, Architecture and Synthesis for Embedded Systems
Proceedings of the 2008 international conference on Compilers, architectures and synthesis for embedded systems
Atlanta, GA, USA
SESSION: Power, reconfigurability, and simulation
Pages 207-216
Year of Publication: 2008
ISBN: 978-1-60558-469-0
Authors
Seyed Hosein Attarzadeh Niaki, Royal Institute of Technology, Stockholm, Sweden
Alessandro Cevrero, Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland
Philip Brisk, Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland
Chrysostomos Nicopoulos, Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland
Frank K. Gurkaynak, Swiss Federal Institute of Technology, Zurich, Zurich, Switzerland
Yusuf Leblebici, Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland
Paolo Ienne, Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland
ABSTRACT
The Field Programmable Compressor Tree (FPCT) is a programmable compressor tree (e.g., a Wallace or Dadda tree) intended for integration in an FPGA or other reconfigurable device. This paper presents a design space exploration (DSE) method that identifies the best FPCT architecture for a given set of arithmetic benchmark circuits; in practice, an FPGA vendor can use the DSE to tailor the FPCT to the most important benchmark circuits of the vendor's largest-volume clients. One novel feature of the DSE is a metric called I/O utilization, which we found to correlate strongly with both the critical path delay and the area of the benchmark circuits under study. Pruning the search space using I/O utilization allowed us to significantly reduce the number of FPCTs that must be synthesized and evaluated during the DSE, while giving high confidence that the best architectures are still explored. The DSE was applied to seven small-to-medium benchmark circuits; one FPCT architecture was found that was 30% faster than the second best in terms of critical path delay, and only 3.34% larger than the smallest.
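For readers unfamiliar with compressor trees, the following is a minimal software sketch of the general idea behind Wallace- and Dadda-style trees: rows of operands are repeatedly reduced with 3:2 carry-save steps until two rows remain, and a single carry-propagate addition finishes the sum. It is not the FPCT architecture or the DSE method from the paper; the function names `carry_save_step` and `compressor_tree_sum` are hypothetical and chosen only for illustration, and the operands are assumed to be non-negative integers.

```python
def carry_save_step(a, b, c):
    """3:2 compression: three operands in, two out, with the same total.

    Each bit position behaves like a full adder; no carry is propagated
    across positions, which is what makes the step fast in hardware.
    """
    partial_sum = a ^ b ^ c                        # per-bit sum
    carry = ((a & b) | (a & c) | (b & c)) << 1     # per-bit carry, shifted up
    return partial_sum, carry


def compressor_tree_sum(operands):
    """Sum non-negative integers by repeated 3:2 compression (illustrative)."""
    rows = list(operands)
    while len(rows) > 2:
        next_rows = []
        # Compress each complete group of three rows into two rows.
        for i in range(0, len(rows) - 2, 3):
            next_rows.extend(carry_save_step(rows[i], rows[i + 1], rows[i + 2]))
        # Rows that did not fit into a group of three pass through unchanged.
        leftover = len(rows) % 3
        if leftover:
            next_rows.extend(rows[-leftover:])
        rows = next_rows
    # Final carry-propagate addition of the (at most) two remaining rows.
    return sum(rows)


if __name__ == "__main__":
    values = [13, 7, 22, 5, 9]
    print(compressor_tree_sum(values) == sum(values))
```

In hardware the benefit comes from deferring carry propagation to the single final adder; the software version only mirrors the data flow and carries no timing or area meaning.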