| Instruction scheduling for clustered VLIW architectures |
| Source |
International Symposium on Systems Synthesis
Proceedings of the 13th international symposium on System synthesis
Madrid, Spain
SESSION: Code generation and scheduling
Pages: 41 - 46
Year of Publication: 2000
ISBN:1080-1082
| Authors |
Jesús Sánchez
Universitat Politècnica de Catalunya, Dept. of Computer Architecture, Barcelona - SPAIN, E-mail: fran@ac.upc.es
Antonio González
Universitat Politècnica de Catalunya, Dept. of Computer Architecture, Barcelona - SPAIN, E-mail: antonio@ac.upc.es
| Publisher |
IEEE Computer Society
Washington, DC, USA
| Bibliometrics |
Downloads (6 Weeks): 0, Downloads (12 Months): 25, Citation Count: 12
ABSTRACT
Clustered VLIW organizations are now common in the design of embedded and DSP processors. In this work we propose a novel modulo scheduling approach for such architectures. The proposed technique performs cluster assignment and instruction scheduling in a single pass, which is more effective than performing the assignment first and the scheduling later. We also show that loop unrolling significantly enhances the performance of the proposed scheduler, especially when the communication channel among clusters is the main performance bottleneck. By selectively unrolling some loops, we can obtain the best performance with the minimum increase in code size. A performance evaluation on SPECfp95 shows that the clustered architecture achieves about the same IPC (instructions per cycle) as a unified architecture with the same resources. Moreover, when cycle time is taken into account, a 4-cluster configuration is 3.6 times faster than the unified architecture.
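The abstract does not spell out the algorithm, so the following is only a toy sketch of the general idea of choosing a cluster and a cycle for each operation in one pass, not the authors' method. It assumes an acyclic loop body given in topological order, identical functional units, and a shared inter-cluster bus; the names `modulo_schedule`, `_try_ii`, `LOOP_BODY`, and all parameters are hypothetical.

```python
from math import ceil

def _try_ii(ops, ii, n_clusters, fus, buses, bus_lat):
    """Try to place every op for a fixed initiation interval (II)."""
    fu_use = [[0] * ii for _ in range(n_clusters)]  # per-cluster modulo reservation table
    bus_use = [0] * ii                              # shared inter-cluster bus table
    placed = {}                                     # op name -> (cluster, start cycle)
    for name, (_lat, preds) in ops.items():
        best = None
        for c in range(n_clusters):
            trial_bus = bus_use[:]                  # tentative bus reservations
            est, feasible = 0, True
            for p in preds:
                pc, ps = placed[p]
                ready = ps + ops[p][0]              # cycle at which p's result exists
                if pc != c:                         # operand must cross clusters
                    slot = next((t for t in range(ready, ready + ii)
                                 if trial_bus[t % ii] < buses), None)
                    if slot is None:
                        feasible = False
                        break
                    trial_bus[slot % ii] += 1
                    ready = slot + bus_lat
                est = max(est, ready)
            if not feasible:
                continue
            start = next((t for t in range(est, est + ii)
                          if fu_use[c][t % ii] < fus), None)
            if start is None:
                continue
            if best is None or start < best[0]:     # keep the earliest-start cluster
                best = (start, c, trial_bus)
        if best is None:
            return None                             # this II is too small
        start, c, bus_use = best
        fu_use[c][start % ii] += 1
        placed[name] = (c, start)
    return placed

def modulo_schedule(ops, n_clusters=2, fus=1, buses=1, bus_lat=1, max_ii=64):
    """ops: dict name -> (latency, [predecessors]), in topological order."""
    mii = ceil(len(ops) / (n_clusters * fus))       # resource-constrained lower bound
    for ii in range(mii, max_ii + 1):
        placed = _try_ii(ops, ii, n_clusters, fus, buses, bus_lat)
        if placed is not None:
            return ii, placed
    raise RuntimeError("no schedule found up to max_ii")

# Hypothetical four-operation loop body: c needs a and b, d needs c.
LOOP_BODY = {"a": (1, []), "b": (1, []), "c": (2, ["a", "b"]), "d": (1, ["c"])}

if __name__ == "__main__":
    ii, placement = modulo_schedule(LOOP_BODY)
    print("II =", ii)
    for op, (cluster, start) in placement.items():
        print(op, "-> cluster", cluster, "cycle", start)
```

Because the cluster is chosen while the cycle is being searched, the cost of each inter-cluster copy (a bus slot plus `bus_lat` cycles) directly influences the placement. A two-phase scheme would fix the clusters before these costs are known.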
CITED BY 12
Marcio Buss, Rodolfo Azevedo, Paulo Centoducatte, Guido Araujo, Tailoring pipeline bypassing and functional unit mapping to application in clustered VLIW architectures, Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems, November 16-17, 2001, Atlanta, Georgia, USA
Martha Mercaldi, Steven Swanson, Andrew Petersen, Andrew Putnam, Andrew Schwerin, Mark Oskin, Susan J. Eggers, Instruction scheduling for a tiled dataflow architecture, ACM SIGOPS Operating Systems Review, v.40 n.5, December 2006