Adaptive reorder buffers for SMT processors
Source: PACT
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Seattle, Washington, USA
SESSION: Out-of-order microarchitecture
Pages: 244 - 253
Year of Publication: 2006
ISBN:1-59593-264-X
ABSTRACT
In SMT processors, the complex interplay between private and shared datapath resources must be considered to realize the full performance potential. In this paper, we show that blindly increasing the size of the per-thread reorder buffers (ROBs) to support more in-flight instructions does not deliver the expected performance gains; on the contrary, it degrades instruction throughput for virtually all multithreaded workloads. The cause of this performance loss is excessive pressure on the shared datapath resources, especially the instruction scheduling logic. We propose intelligent mechanisms that dynamically adapt the number of reorder buffer entries allocated to each thread, avoiding allocations that would detrimentally affect the scheduler. We achieve this by categorizing program execution into issue-bound and commit-bound phases and allocating buffer entries only to threads operating in commit-bound phases. Our adaptive technique improves instruction throughput by 21% and the fairness metric by 10% compared to the best-performing baseline configuration with static ROBs.
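The phase-based allocation described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how phases are detected or how many entries are granted, so everything concrete here is an assumption made for illustration: the `ThreadSample` counters, the `unissued_threshold` heuristic, the even split of a hypothetical `extra_pool`, and the default sizes. It is not the authors' mechanism, only one plausible shape for it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Phase(Enum):
    ISSUE_BOUND = "issue-bound"
    COMMIT_BOUND = "commit-bound"


@dataclass
class ThreadSample:
    """Hypothetical per-interval counters for one thread (not from the paper)."""
    rob_occupancy: int    # ROB entries in use when the sample was taken
    unissued_in_rob: int  # of those, entries still waiting in the shared scheduler


def classify(sample: ThreadSample, unissued_threshold: float = 0.5) -> Phase:
    """Assumed heuristic: a thread whose ROB mostly holds instructions that
    have not issued yet is pressuring the scheduler (issue-bound); otherwise
    its ROB is treated as held up by commit (commit-bound)."""
    if sample.rob_occupancy == 0:
        # An empty ROB needs no extra entries, so treat it as issue-bound.
        return Phase.ISSUE_BOUND
    unissued_fraction = sample.unissued_in_rob / sample.rob_occupancy
    if unissued_fraction >= unissued_threshold:
        return Phase.ISSUE_BOUND
    return Phase.COMMIT_BOUND


def adapt_rob(samples: List[ThreadSample],
              base_entries: int = 32,
              extra_pool: int = 64) -> List[int]:
    """Return per-thread ROB sizes for the next interval.

    Every thread keeps base_entries. The extra pool is split evenly among
    threads classified as commit-bound; issue-bound threads get none."""
    phases = [classify(s) for s in samples]
    commit_bound = [i for i, p in enumerate(phases) if p is Phase.COMMIT_BOUND]
    sizes = [base_entries] * len(samples)
    if commit_bound:
        share = extra_pool // len(commit_bound)
        for i in commit_bound:
            sizes[i] += share
    return sizes
```

For example, `adapt_rob([ThreadSample(32, 28), ThreadSample(32, 4)])` keeps the first thread at its base size, because most of its ROB is still waiting to issue, and grants the whole extra pool to the second.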