Simultaneous resource binding and interconnection optimization based on a distributed register-file microarchitecture
Source
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Volume 14, Issue 3 (May 2009)
Article No. 35  
Year of Publication: 2009
ISSN:1084-4309
Authors
Jason Cong  University of California, Los Angeles, Los Angeles, CA
Yiping Fan  AutoESL, Inc., Cupertino, CA
Junjuan Xu  University of California, Los Angeles, Los Angeles, CA
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1529255.1529257

ABSTRACT

Behavior synthesis and optimization beyond the register-transfer level require an efficient utilization of the underlying platform features. This article presents a platform-based resource binding approach based on a Distributed Register-File Microarchitecture (DRFM), which makes efficient use of distributed embedded memory blocks as register files in modern FPGAs. DRFM contains multiple islands, each having a local register file, a functional unit pool, and data-routing logic. Compared to the traditional discrete-register counterpart, a DRFM allows use of the platform-featured on-chip memory or register-file IP blocks to implement its local register files, and this results in a substantial saving of multiplexing logic and global interconnects. DRFM provides a useful architectural template and a direct optimization objective for minimizing interisland connections for synthesis algorithms. Given the scheduling solution and resource (functional units) constraints, two novel algorithms in the resource binding stage are developed based on DRFM: (i) a simultaneous DRFM clustering and binding algorithm, which decides the configuration of DRFM and the assignment of operations into islands with the focus on optimizing global connections; (ii) a data-forwarding scheduling algorithm, which takes advantage of the operation slacks to handle the read-port restriction of register files. On the Xilinx Virtex4 FPGA platform, experimental results with a set of real-life test cases show a 50% logic area reduction achieved by applying our approach, with a 14.6% performance improvement, compared to the traditional discrete-register-based approach. Also, experiments on small-size designs show that our algorithm produces the same number of total connections and at most one more maximum feeding-in connection compared to optimal solutions generated by ILP.
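The interisland-connection objective named in the abstract can be illustrated with a small sketch. This is not the authors' clustering and binding algorithm. It only evaluates a given assignment of operations to islands, under assumed definitions: a "connection" is a distinct ordered pair of islands (source, destination) that must exchange at least one value, and the "maximum feeding-in connection" count is the largest number of distinct source islands feeding any single island. The function name, the edge-list input format, and the example graph are all hypothetical.

```python
from collections import defaultdict


def interisland_connections(edges, island_of):
    """Evaluate a binding under the assumed connection definitions.

    edges     -- iterable of (producer_op, consumer_op) data dependencies
    island_of -- dict mapping each operation to its island id

    Returns (total, max_feed_in): the number of distinct ordered
    island pairs that exchange data, and the largest number of
    distinct source islands feeding any one island.
    """
    links = set()
    for producer, consumer in edges:
        src, dst = island_of[producer], island_of[consumer]
        if src != dst:  # values kept inside one island need no global wire
            links.add((src, dst))

    feeders = defaultdict(set)
    for src, dst in links:
        feeders[dst].add(src)

    max_feed_in = max((len(s) for s in feeders.values()), default=0)
    return len(links), max_feed_in


# Hypothetical five-operation data-flow graph.
edges = [("a", "c"), ("b", "c"), ("c", "e"), ("d", "e"), ("a", "d")]

# Two candidate bindings of the same graph.
clustered = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
scattered = {"a": 0, "b": 1, "c": 2, "d": 1, "e": 0}

print(interisland_connections(edges, clustered))
print(interisland_connections(edges, scattered))
```

Under these assumptions the clustered binding needs far fewer global connections than the scattered one for the same graph, which is the kind of difference a binding algorithm targeting this objective would try to exploit.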


