ABSTRACT
Power consumption and wire delays are two important limiting factors for current and forthcoming processors. Monolithic designs that keep reasonable power consumption and operate at high clock frequencies are ever harder to implement. In this paper we propose a novel multithreaded clustered microarchitecture that consists of a clustered front-end capable of fetching instructions from multiple threads and a clustered back-end where instructions are executed. This microarchitecture combines the concepts of multithreading and clustering to attack both problems: power consumption and wire delays. A key aspect of this microarchitecture is the assignment of resources to the simultaneously running threads. We propose two back-end assignment schemes: in the Static Back-end Assignment (SBA) the back-ends are statically assigned to the front-ends, while in the Dynamic Back-end Assignment (DBA) the back-ends are dynamically assigned according to the demands of each front-end. A limit study of the potential performance of DBA shows a minor benefit compared to SBA. The reasons why the DBA scheme does not perform as initially expected are investigated, and the main limiting factors of this architecture are evaluated. Finally, we point out the advantages of DBA versus SBA.
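
The difference between the two assignment schemes can be illustrated with a minimal sketch. The abstract does not specify the actual assignment policies, so everything below is an assumption made for illustration: the function names, the round-robin binding used for the static case, and the notion of a per-front-end "demand" expressed as a number of back-ends wanted are all hypothetical.

```python
# Illustrative sketch only: the policies below are assumptions, not the
# paper's mechanisms. Each function returns a list whose i-th entry is the
# index of the front-end that owns back-end i.

def static_assignment(num_frontends, num_backends):
    """SBA-like: bind each back-end to a fixed front-end (round-robin here),
    independent of what the threads currently need."""
    return [b % num_frontends for b in range(num_backends)]


def dynamic_assignment(demands, num_backends):
    """DBA-like: hand out back-ends one at a time to the front-end with the
    largest remaining demand (hypothetical greedy policy)."""
    remaining = list(demands)  # copy so the caller's list is not modified
    owners = []
    for _ in range(num_backends):
        # max() returns the first index on ties, so lower-numbered
        # front-ends win when demands are equal.
        fe = max(range(len(remaining)), key=lambda i: remaining[i])
        owners.append(fe)
        remaining[fe] -= 1
    return owners


if __name__ == "__main__":
    print(static_assignment(2, 4))        # fixed split between two front-ends
    print(dynamic_assignment([3, 1], 4))  # split follows the stated demands
```

With two front-ends and four back-ends, the static variant always splits the back-ends evenly, whereas the dynamic variant gives more back-ends to the front-end that reports the higher demand. How much this flexibility helps in practice is exactly what the limit study evaluates.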