High-level synthesis of distributed logic-memory architectures
Source International Conference on Computer Aided Design archive
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design table of contents
San Jose, California
Pages: 564 - 571  
Year of Publication: 2002
ISSN: 1092-3152, ISBN: 0-7803-7607-2
Authors
Chao Huang  Princeton University, Princeton, NJ
Srivaths Ravi  NEC USA, Princeton, NJ
Anand Raghunathan  NEC USA, Princeton, NJ
Niraj K. Jha  Princeton University, Princeton, NJ
Sponsors
: IEEE Circuits & Systems Society
IEEE-CS\DATC : IEEE Computer Society
SIGDA: ACM Special Interest Group on Design Automation
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/774572.774655

ABSTRACT

With the increasing cost of global on-chip communication, high-performance designs for data-intensive applications require architectures that distribute hardware resources (computing logic, memories, interconnect, etc.) throughout a chip, while restricting computation and communication to geographic proximity. In this paper, we present a methodology for high-level synthesis (HLS) of distributed logic-memory architectures, i.e., architectures that have logic and memory distributed across several partitions in a chip. Conventional HLS tools can extract parallelism from a behavior, but only for architectures that assume a monolithic controller/datapath communicating with a memory or memory hierarchy. This work provides techniques to extend the synthesis frontier to more general architectures that can extract both coarse- and fine-grained parallelism from data accesses and computations in a synergistic manner. Our methodology selects many possible ways of organizing data and computations, carefully examines the trade-offs (i.e., communication overheads, synchronization costs, area overheads) in choosing one solution over another, and utilizes conventional HLS techniques for intermediate steps. We have evaluated the proposed framework on several benchmarks by generating register-transfer level (RTL) implementations using an existing commercial HLS tool with and without our enhancements, and by subjecting the resulting RTL circuits to logic synthesis and layout. The results show that circuits designed as distributed logic-memory architectures using our framework achieve significant performance improvements (up to 5.31X, 3.45X on average) over well-optimized conventional designs, with small area overheads (up to 19.3%, 15.1% on average).
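The trade-off the abstract describes, weighing communication, synchronization, and area overheads when choosing how to distribute data and computation, can be illustrated with a toy cost model. The sketch below is not the authors' algorithm: the cost formulas, the function names (`evaluate`, `best_candidate`), and every numeric parameter are hypothetical assumptions chosen only to show the shape of such an exploration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    partitions: int  # number of logic-memory partitions
    cycles: int      # estimated latency in clock cycles
    area: float      # estimated area, relative to the monolithic design


def evaluate(partitions, n_items=1024, cycles_per_item=4,
             sync_cycles=16, comm_cycles_per_boundary=8,
             area_per_extra_partition=0.05):
    """Hypothetical cost model: all constants are illustrative assumptions."""
    # Each partition processes its share of the data in parallel (ceiling division).
    compute = -(-n_items // partitions) * cycles_per_item
    # Distribution adds a synchronization cost plus communication per partition boundary.
    overhead = 0 if partitions == 1 else (
        sync_cycles + comm_cycles_per_boundary * (partitions - 1))
    # Every extra partition replicates some control logic (assumed 5% each).
    area = 1.0 + area_per_extra_partition * (partitions - 1)
    return Candidate(partitions, compute + overhead, area)


def best_candidate(choices, max_area_overhead=0.20):
    """Pick the fastest candidate whose area overhead stays within the budget."""
    baseline = evaluate(1)
    feasible = [c for c in map(evaluate, choices)
                if c.area <= baseline.area * (1 + max_area_overhead)]
    best = min(feasible, key=lambda c: c.cycles)
    return best, baseline.cycles / best.cycles


if __name__ == "__main__":
    best, speedup = best_candidate([1, 2, 4, 8])
    print(best, f"speedup={speedup:.2f}x")
```

Under these assumed constants, the eight-partition candidate is the fastest but exceeds the 20% area budget, so the four-partition candidate is selected. A real flow would obtain such estimates from scheduling and layout rather than from closed-form formulas.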


