A scalable wide-issue clustered VLIW with a reconfigurable interconnect
Source: International Conference on Compilers, Architecture and Synthesis for Embedded Systems
Proceedings of the 2003 International Conference on Compilers, Architecture and Synthesis for Embedded Systems
San Jose, California, USA
Session: Microprocessor architecture
Pages: 148-158
Year of Publication: 2003
ISBN: 1-58113-676-5
Authors
Osvaldo Colavin, STMicroelectronics, Inc., San Diego, CA
Davide Rizzo, STMicroelectronics, Inc., San Diego, CA
Sponsors
ACM: Association for Computing Machinery
SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/951710.951731

ABSTRACT

Clustered VLIW architectures have been widely adopted in modern embedded multimedia applications for their ability to exploit high degrees of ILP at a reasonable trade-off in complexity and silicon cost. Studies have shown, however, that performance scales poorly for wide-issue machines. In this paper we describe the architecture of a clustered VLIW with a runtime-reconfigurable inter-cluster bus designed to address this scalability problem. The architecture targets kernel-loop acceleration through a coprocessor approach and allows the interconnect between neighboring register files to be customized before each loop execution. We have adopted an inter-cluster communication mechanism based on a constant-complexity interconnect: because its complexity and latency are independent of the number of clusters, scalability in issue width is preserved. To handle the limited connectivity, the interconnection resources in the inter-cluster bus are exposed to the compiler and scheduled like other resources with an adapted version of modulo scheduling. Other relevant features include the ability to define shifting queues in the register files for more effective software pipelining support. Adding a limited amount of reconfigurability to the well-established VLIW programming model results in low-overhead inter-cluster communication and a scalable ILP architecture. Simulation results show that near-linear scalability can be achieved for certain classes of kernel loops.
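
As an illustration of what it can mean to schedule interconnect resources "like other resources" under modulo scheduling, the sketch below uses the textbook notions of a resource-constrained minimum initiation interval (ResMII) and a modulo reservation table. It is not taken from the paper: the resource names (alu, mem, link_c0_c1), the usage counts, and the capacities are hypothetical values chosen only to show an inter-cluster link competing for slots in the same way as a functional unit.

```python
from math import ceil


def res_mii(uses, capacity):
    """Resource-constrained lower bound on the initiation interval (II).

    uses[r]     = how many times one loop iteration needs resource r
    capacity[r] = how many units of r are available per cycle
    """
    return max(ceil(uses[r] / capacity[r]) for r in uses)


class ModuloReservationTable:
    """Tracks resource usage per cycle modulo II (standard modulo scheduling)."""

    def __init__(self, ii, capacity):
        self.ii = ii
        self.capacity = capacity
        self.slots = {r: [0] * ii for r in capacity}

    def try_reserve(self, resource, cycle):
        """Reserve one unit of `resource` at `cycle`; return False on conflict."""
        slot = cycle % self.ii
        if self.slots[resource][slot] >= self.capacity[resource]:
            return False
        self.slots[resource][slot] += 1
        return True


# Hypothetical loop body: the inter-cluster link is listed as one more resource.
uses = {"alu": 6, "mem": 2, "link_c0_c1": 3}
capacity = {"alu": 4, "mem": 1, "link_c0_c1": 1}

ii = res_mii(uses, capacity)
mrt = ModuloReservationTable(ii, capacity)

first = mrt.try_reserve("link_c0_c1", 0)        # slot 0 is free
conflict = mrt.try_reserve("link_c0_c1", ii)    # same slot modulo II, already taken
shifted = mrt.try_reserve("link_c0_c1", ii + 1) # next slot is free
print(ii, first, conflict, shifted)
```

In this made-up example the single link, used three times per iteration, sets the bound on the initiation interval rather than the ALUs or the memory port, which is the kind of constraint a scheduler sees once limited connectivity is exposed to it.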



