ABSTRACT
This paper presents novel techniques to integrate the use of Single Instruction Multiple Data (SIMD) functional units in a high-level synthesis (HLS) design methodology. SIMD functional units can be configured to operate in one or more SIMD modes, in which they process multiple sets of smaller bitwidth operands in parallel. Conceptually, the use of SIMD functional units enables HLS to (i) exploit parallelism to a higher degree without using additional resources, (ii) improve resource utilization by enabling hardware re-use at a fine-grained level, and (iii) improve energy efficiency for a given area and/or performance constraint. We illustrate the issues involved in performing high-level synthesis with SIMD functional units, and discuss how algorithms involved in a typical high-level synthesis flow can be enhanced to result in maximal performance and energy improvements. These techniques are not restricted to specific high-level synthesis tools/algorithms, and can be plugged into any generic high-level synthesis system. Experimental results indicate that the use of SIMD units can improve performance by up to 1.9X (average of 1.57X), and simultaneously reduce energy consumption by up to 33.16% (average of 28.03%) compared to well-optimized conventional designs, with minimal area overheads (average of 2.18%). The performance improvements can be translated into additional energy savings, resulting in up to 66.26% (average of 55.88%) energy reductions. Further, our experiments demonstrate that the use of SIMD units in an HLS tool results in a shift in the entire area-delay-energy tradeoff envelope that can be obtained, to include desirable parts of the design space (i.e., higher quality designs) that were hitherto unreachable.
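As an illustration of what a SIMD mode means for a single functional unit, the sketch below models a 32-bit adder that can be switched to process two independent 16-bit additions at once. It is a software model under stated assumptions, not the hardware design from the paper: the lane width, the two-lane configuration, and the names `simd_add_2x16`, `pack`, and `unpack` are all hypothetical and chosen only for this example.

```python
MASK32 = 0xFFFFFFFF
LANE_MSB = 0x80008000  # most significant bit of each 16-bit lane (assumed layout)


def pack(hi, lo):
    """Pack two 16-bit operands into one 32-bit word (hypothetical helper)."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def unpack(word):
    """Split a 32-bit word back into its two 16-bit lanes."""
    return (word >> 16) & 0xFFFF, word & 0xFFFF


def simd_add_2x16(a, b):
    """Model a 32-bit adder in a 2x16-bit SIMD mode.

    Each lane wraps modulo 2**16 and no carry crosses the lane boundary.
    """
    # Add the low 15 bits of every lane; the carry out of bit 14 lands in
    # bit 15 of the same lane, so nothing leaks into the neighbouring lane.
    low = (a & ~LANE_MSB & MASK32) + (b & ~LANE_MSB & MASK32)
    # Fold the lane MSBs back in with XOR (sum bit without carry-out).
    return (low ^ ((a ^ b) & LANE_MSB)) & MASK32


def plain_add_32(a, b):
    """The same adder in its ordinary 32-bit mode, for comparison."""
    return (a + b) & MASK32


if __name__ == "__main__":
    x = pack(0xFFFF, 0x0001)
    y = pack(0x0001, 0xFFFF)
    print(unpack(simd_add_2x16(x, y)))  # lanes wrap independently
    print(unpack(plain_add_32(x, y)))   # carry crosses the lane boundary
```

In the SIMD mode both 16-bit sums are produced in one operation of the same unit, which is the kind of fine-grained re-use the abstract refers to; in the ordinary mode the carry from the low half propagates into the high half, so the two packed operands cannot be treated as independent.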