| Speculation techniques for improving load related instruction scheduling |
| Source
|
International Symposium on Computer Architecture
Proceedings of the 26th Annual International Symposium on Computer Architecture
Atlanta, Georgia, United States
Pages: 42-53
Year of Publication: 1999
ISBN: 0-7695-0170-2
|
|
Authors
|
|
Adi Yoaz
|
Intel Corporation, Intel Israel (74) Ltd., BMD Architecture Dept., MS: IDC-3C, P.O. Box 1659, Haifa 31015, Israel
|
|
Mattan Erez
|
Intel Corporation, Intel Israel (74) Ltd., BMD Architecture Dept., MS: IDC-3C, P.O. Box 1659, Haifa 31015, Israel
|
|
Ronny Ronen
|
Intel Corporation, Intel Israel (74) Ltd., BMD Architecture Dept., MS: IDC-3C, P.O. Box 1659, Haifa 31015, Israel
|
|
Stephan Jourdan
|
Intel Corporation, Intel Israel (74) Ltd., BMD Architecture Dept., MS: IDC-3C, P.O. Box 1659, Haifa 31015, Israel
|
|
| Publisher |
IEEE Computer Society
Washington, DC, USA
|
ABSTRACT
State-of-the-art microprocessors achieve high performance by executing multiple instructions per cycle. In an out-of-order engine, the instruction scheduler is responsible for dispatching instructions to execution units based on dependencies, latencies, and resource availability. Most existing instruction schedulers are doing a less than optimal job of scheduling memory accesses and instructions dependent on them, for the following reasons:

• Memory dependencies cannot be resolved prior to execution, so loads are not advanced ahead of preceding stores.
• The dynamic latencies of load instructions are unknown, so scheduling dependent instructions is based on either optimistic load-use delay (may cause re-scheduling and re-execution) or pessimistic delay (creating unnecessary delays).
• Memory pipelines are more expensive than other execution units, and as such, are a scarce resource. Currently, an increase in the memory execution bandwidth is usually achieved through multi-banked caches where bank conflicts limit efficiency.

In this paper we present three techniques to address these scheduler limitations. One is to improve the scheduling of load instructions by using a simple memory disambiguation mechanism. The second is to improve the scheduling of load-dependent instructions by employing a Data Cache Hit-Miss Predictor to predict the dynamic load latencies. And the third is to improve the efficiency of load scheduling in a multi-banked cache through Cache-Bank Prediction.
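The abstract does not say how the Data Cache Hit-Miss Predictor is organized. The following is only an illustrative sketch of the general idea: it assumes a table of 2-bit saturating counters indexed by the load's instruction address, a structure borrowed from branch predictors rather than taken from the paper. The class name, table size, index function, and latency values are all hypothetical.

```python
class HitMissPredictor:
    """Per-load hit/miss predictor built from 2-bit saturating counters.

    Hypothetical structure for illustration; not the paper's design.
    """

    def __init__(self, entries=1024):
        # Require a power of two so the index can be computed with a mask.
        if entries <= 0 or entries & (entries - 1) != 0:
            raise ValueError("entries must be a power of two")
        self.mask = entries - 1
        # Counter values 0..3; values >= 2 mean "predict hit".
        # Start at 3 on the assumption that most loads hit the cache.
        self.table = [3] * entries

    def _index(self, load_pc):
        # Drop the two low bits (assumes 4-byte instruction alignment).
        return (load_pc >> 2) & self.mask

    def predict_hit(self, load_pc):
        return self.table[self._index(load_pc)] >= 2

    def update(self, load_pc, was_hit):
        # Train with the actual outcome once the load has executed.
        i = self._index(load_pc)
        if was_hit:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


def load_use_delay(predictor, load_pc, hit_latency=3, miss_latency=12):
    """Latency the scheduler would assume for instructions dependent on this load.

    The cycle counts are placeholder values, not measurements.
    """
    return hit_latency if predictor.predict_hit(load_pc) else miss_latency
```

A scheduler using such a predictor would wake dependents after `hit_latency` cycles for loads predicted to hit, and hold them back for loads predicted to miss, which avoids both the re-execution cost of the optimistic policy and the blanket delay of the pessimistic one.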
CITED BY 32
|
Jih-Kwon Peir, Shih-Chang Lai, Shih-Lien Lu, Jared Stark, Konrad Lai, Bloom filtering cache misses for accurate data speculation and prefetching, Proceedings of the 16th international conference on Supercomputing, June 22-26, 2002, New York, New York, USA
|
Michael Bekerman, Adi Yoaz, Freddy Gabbay, Stephan Jourdan, Maxim Kalaev, Ronny Ronen, Early load address resolution via register tracking, ACM SIGARCH Computer Architecture News, v.28 n.2, p.306-315, May 2000
|
David N. Armstrong, Hyesoon Kim, Onur Mutlu, Yale N. Patt, Wrong Path Events: Exploiting Unusual and Illegal Program Behavior for Early Misprediction Detection and Recovery, Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture, p.119-128, December 04-08, 2004, Portland, Oregon
|
Francisco J. Cazorla, Alex Ramirez, Mateo Valero, Enrique Fernandez, Dynamically Controlled Resource Allocation in SMT Processors, Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture, p.171-182, December 04-08, 2004, Portland, Oregon
|
Changpeng Fang, Steve Carr, Soner Önder, Zhenlin Wang, Feedback-directed memory disambiguation through store distance analysis, Proceedings of the 20th annual international conference on Supercomputing, June 28-July 01, 2006, Cairns, Queensland, Australia
|
Deniz Balkan, Joseph Sharkey, Dmitry Ponomarev, Kanad Ghose, SPARTAN: speculative avoidance of register allocations to transient values for performance and energy efficiency, Proceedings of the 15th international conference on Parallel architectures and compilation techniques, September 16-20, 2006, Seattle, Washington, USA