Correlated load-address predictors
Source
International Symposium on Computer Architecture
Proceedings of the 26th annual international symposium on Computer architecture
Atlanta, Georgia, United States
Pages: 54 - 63
Year of Publication: 1999
ISBN: 0-7695-0170-2
Authors
Michael Bekerman, Intel Corporation, Intel Israel (74) Ltd., Haifa 31015, Israel
Stephan Jourdan, Intel Corporation, Intel Israel (74) Ltd., Haifa 31015, Israel
Ronny Ronen, Intel Corporation, Intel Israel (74) Ltd., Haifa 31015, Israel
Gilad Kirshenboim, Intel Corporation, Intel Israel (74) Ltd., Haifa 31015, Israel
Lihu Rappoport, Intel Corporation, Intel Israel (74) Ltd., Haifa 31015, Israel
Adi Yoaz, Intel Corporation, Intel Israel (74) Ltd., Haifa 31015, Israel
Uri Weiser, Intel Corporation, Intel Israel (74) Ltd., Haifa 31015, Israel
Publisher
IEEE Computer Society, Washington, DC, USA
Bibliometrics
Downloads (6 Weeks): 8, Downloads (12 Months): 44, Citation Count: 27
ABSTRACT
As microprocessors become faster, the relative performance cost of memory accesses increases. Bigger and faster caches significantly reduce the absolute load-to-use time delay. However, increase in processor operational frequencies impairs the relative load-to-use latency, measured in processor cycles (e.g. from two cycles on the Pentium® processor to three cycles or more in current designs). Load-address prediction techniques were introduced to partially cut the load-to-use latency. This paper focuses on advanced address-prediction schemes to further shorten program execution time.

Existing address prediction schemes are capable of predicting simple address patterns, consisting mainly of constant addresses or stride-based addresses. This paper explores the characteristics of the remaining loads and suggests new enhanced techniques to improve prediction effectiveness:

• Context-based prediction to tackle part of the remaining, difficult-to-predict, load instructions.
• New prediction algorithms to take advantage of global correlation among different static loads.
• New confidence mechanisms to increase the correct prediction rate and to eliminate costly mispredictions.
• Mechanisms to prevent long or random address sequences from polluting the predictor data structures while providing some hysteresis behavior to the predictions.

Such an enhanced address predictor accurately predicts 67% of all loads, while keeping the misprediction rate close to 1%. We further prove that the proposed predictor works reasonably well in a deep pipelined architecture where the predict-to-update delay may significantly impair both prediction rate and accuracy.
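
For illustration, the simple constant and stride-based prediction that the abstract describes as existing practice, paired with a basic confidence counter, might look roughly like the following. This is a minimal sketch and not the predictor proposed in the paper: the names (`Entry`, `StridePredictor`, `simulate`), the unbounded per-load table keyed by the load's program counter, the counter width, and the threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    last_addr: int = 0   # most recent address seen for this static load
    stride: int = 0      # difference between the last two addresses
    confidence: int = 0  # saturating counter; predict only when high enough


class StridePredictor:
    """Hypothetical per-load stride predictor with a confidence threshold."""

    def __init__(self, threshold=2, max_conf=3):
        self.table = {}            # assumed unbounded; real tables are finite
        self.threshold = threshold
        self.max_conf = max_conf

    def predict(self, pc):
        """Return a predicted address, or None when confidence is too low."""
        entry = self.table.get(pc)
        if entry is None or entry.confidence < self.threshold:
            return None
        return entry.last_addr + entry.stride

    def update(self, pc, addr):
        """Train the entry for this load with the address it actually used."""
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = Entry(last_addr=addr)
            return
        new_stride = addr - entry.last_addr
        if new_stride == entry.stride:
            entry.confidence = min(entry.confidence + 1, self.max_conf)
        else:
            # Pattern changed: learn the new stride and stop predicting.
            entry.confidence = 0
            entry.stride = new_stride
        entry.last_addr = addr


def simulate(trace, predictor):
    """Run (pc, address) pairs through the predictor.

    Returns (number of predictions made, number that were correct).
    """
    predicted = correct = 0
    for pc, addr in trace:
        guess = predictor.predict(pc)
        if guess is not None:
            predicted += 1
            correct += int(guess == addr)
        predictor.update(pc, addr)
    return predicted, correct
```

A constant address is the special case of stride zero. In this sketch a load that walks an array with a fixed stride is predicted only after its stride has repeated enough times to reach the threshold, which trades some coverage for fewer mispredictions; an irregular sequence never builds confidence and so is never predicted.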
CITED BY 27
Byung-Kwon Chung , Jinsuo Zhang , Jih-Kwon Peir , Shih-Chang Lai , Konrad Lai, Direct load: dependence-linked dataflow resolution of load address and cache coordinate, Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture, December 01-05, 2001, Austin, Texas
Michael Bekerman , Adi Yoaz , Freddy Gabbay , Stephan Jourdan , Maxim Kalaev , Ronny Ronen, Early load address resolution via register tracking, ACM SIGARCH Computer Architecture News, v.28 n.2, p.306-315, May 2000
Binu K. Mathew , Sally A. McKee , John B. Carter , Al Davis, Algorithmic foundations for a parallel vector access memory system, Proceedings of the twelfth annual ACM symposium on Parallel algorithms and architectures, p.156-165, July 09-13, 2000, Bar Harbor, Maine, United States