ABSTRACT
Behavior synthesis and optimization beyond the register-transfer level require efficient use of the underlying platform features. This article presents a platform-based resource binding approach based on a Distributed Register-File Microarchitecture (DRFM), which makes efficient use of distributed embedded memory blocks as register files in modern FPGAs. A DRFM contains multiple islands, each having a local register file, a functional unit pool, and data-routing logic. Compared to its traditional discrete-register counterpart, a DRFM can implement its local register files with the platform's on-chip memory or register-file IP blocks, which substantially reduces multiplexing logic and global interconnects. For synthesis algorithms, DRFM provides a useful architectural template and a direct optimization objective: minimizing inter-island connections. Given the scheduling solution and resource (functional-unit) constraints, two novel algorithms for the resource binding stage are developed based on DRFM: (i) a simultaneous DRFM clustering and binding algorithm, which decides the configuration of the DRFM and the assignment of operations to islands, with a focus on optimizing global connections; (ii) a data-forwarding scheduling algorithm, which takes advantage of operation slack to handle the read-port restriction of register files. On the Xilinx Virtex-4 FPGA platform, experimental results with a set of real-life test cases show that our approach achieves a 50% logic area reduction and a 14.6% performance improvement compared to the traditional discrete-register-based approach. In addition, experiments on small designs show that our algorithm produces the same number of total connections and at most one more maximum feeding-in connection compared to optimal solutions generated by ILP.
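To make the optimization objective concrete, the following minimal sketch evaluates a candidate assignment of operations to islands by counting inter-island connections. It is an illustration only, not the article's clustering and binding algorithm. The function name `inter_island_connections`, the edge-list input format, and the definition of a connection as a distinct directed (source island, destination island) pair are all assumptions introduced here.

```python
from collections import defaultdict


def inter_island_connections(edges, island_of):
    """Return (total_connections, max_feeding_in) for one binding.

    edges: iterable of (producer_op, consumer_op) data dependencies.
    island_of: dict mapping each operation to its island id.
    Assumption (not from the article): one "connection" is a distinct
    directed (source island, destination island) pair that carries at
    least one value between different islands.
    """
    links = set()
    for producer, consumer in edges:
        src, dst = island_of[producer], island_of[consumer]
        if src != dst:  # transfers inside an island need no global wire
            links.add((src, dst))

    # Count how many distinct islands feed into each destination island.
    feeding_in = defaultdict(int)
    for _, dst in links:
        feeding_in[dst] += 1

    return len(links), max(feeding_in.values(), default=0)


# Hypothetical data-flow graph with four operations.
edges = [("a", "c"), ("b", "c"), ("c", "d"), ("a", "d")]

# Two candidate bindings of operations to islands.
binding_1 = {"a": 0, "b": 0, "c": 0, "d": 1}
binding_2 = {"a": 0, "b": 1, "c": 2, "d": 2}

print(inter_island_connections(edges, binding_1))
print(inter_island_connections(edges, binding_2))
```

Under this assumed cost model, a binding that keeps producers and consumers on the same island scores lower on both metrics, which are the two quantities the abstract compares against the ILP solutions.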