The shared-thread multiprocessor
Source
International Conference on Supercomputing
Proceedings of the 22nd annual international conference on Supercomputing
Island of Kos, Greece
SESSION: Architecture 1
Pages 73-82
Year of Publication: 2008
ISBN: 978-1-60558-158-3
Bibliometrics: Downloads (6 Weeks): 12, Downloads (12 Months): 147, Citation Count: 2
ABSTRACT
This paper describes initial results for an architecture called the Shared-Thread Multiprocessor (STMP). The STMP combines features of a multithreaded processor and a chip multiprocessor; specifically, it enables distinct cores on a chip multiprocessor to share thread state. This shared thread state allows the system to schedule threads from a shared pool onto individual cores, allowing for rapid movement of threads between cores. This paper demonstrates and evaluates three benefits of this architecture: (1) By providing more thread state storage than available in the cores themselves, the architecture enjoys the ILP benefits of many threads, but carries the in-core complexity of supporting just a few. (2) Threads can move between cores fast enough to hide long-latency events such as memory accesses. This enables very-short-term load balancing in response to such events. (3) The system can redistribute threads to maximize symbiotic behavior and balance load much more often than traditional operating system thread scheduling and context switching.
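The scheduling idea in the abstract can be illustrated with a small software model. What follows is only a sketch under stated assumptions, not the paper's hardware mechanism: the names `Thread`, `Core`, `SharedThreadPool`, and `rebalance` are hypothetical, stalls are modeled as a single "ready at cycle N" timestamp, and the cost of moving a thread between the shared pool and a core is ignored. It shows cores with only a few hardware contexts drawing from a larger pool of shared thread state, and a stalled thread being swapped out for a ready one.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Thread:
    tid: int
    stalled_until: int = 0  # assumed model: cycle at which a pending long-latency event completes

    def ready(self, cycle: int) -> bool:
        return cycle >= self.stalled_until


@dataclass
class Core:
    contexts: int                                  # hardware thread contexts inside the core
    resident: list = field(default_factory=list)   # threads currently held by the core


class SharedThreadPool:
    """Hypothetical stand-in for thread state stored outside the cores."""

    def __init__(self, threads):
        self.inactive = deque(threads)

    def take_ready(self, cycle: int):
        # Scan the pool once; rotate past threads that are still stalled.
        for _ in range(len(self.inactive)):
            t = self.inactive.popleft()
            if t.ready(cycle):
                return t
            self.inactive.append(t)
        return None

    def put(self, t: Thread) -> None:
        self.inactive.append(t)


def rebalance(cores, pool, cycle: int) -> int:
    """Swap stalled resident threads for ready pool threads, then fill empty contexts.

    Returns the number of threads moved onto cores in this step.
    """
    moves = 0
    for core in cores:
        for i, t in enumerate(core.resident):
            if not t.ready(cycle):
                replacement = pool.take_ready(cycle)
                if replacement is not None:
                    core.resident[i] = replacement
                    pool.put(t)
                    moves += 1
        while len(core.resident) < core.contexts:
            t = pool.take_ready(cycle)
            if t is None:
                break
            core.resident.append(t)
            moves += 1
    return moves
```

In this model, six threads shared by two cores with two contexts each leave two threads waiting in the pool; when a resident thread stalls, the next call to `rebalance` replaces it with a waiting thread and returns the stalled one to the pool.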