| Memory-system design considerations for dynamically-scheduled processors |
| Source |
ACM SIGARCH Computer Architecture News
Volume 25, Issue 2 (May 1997)
Special Issue: Proceedings of the 24th annual international symposium on Computer architecture (ISCA '97)
Pages: 133 - 143
Year of Publication: 1997
ISSN: 0163-5964
| Authors |
Keith I. Farkas
Electrical and Computer Engineering, University of Toronto, 10 Kings College Road, Toronto, Ontario M5S 3G4, Canada
Paul Chow
Electrical and Computer Engineering, University of Toronto, 10 Kings College Road, Toronto, Ontario M5S 3G4, Canada
Norman P. Jouppi
Digital Equipment Corporation, Western Research Lab, 250 University Avenue, Palo Alto, California
Zvonko Vranesic
Electrical and Computer Engineering, University of Toronto, 10 Kings College Road, Toronto, Ontario M5S 3G4, Canada
ABSTRACT
In this paper, we identify performance trends and design relationships between the following components of the data memory hierarchy in a dynamically-scheduled processor: the register file, the lockup-free data cache, the stream buffers, and the interface between these components and the lower levels of the memory hierarchy. Similar performance was obtained from all systems having support for fewer than four in-flight misses, irrespective of the register-file size, the issue width of the processor, and the memory bandwidth. While providing support for more than four in-flight misses did increase system performance, the improvement was less than that obtained by increasing the number of registers. The addition of stream buffers to the investigated systems led to a significant performance increase, with the larger increases for systems having less in-flight-miss support, greater memory bandwidth, or more instruction issue capability. The performance of these systems was not significantly affected by the inclusion of traffic filters or dynamic-stride calculators, or by the per-load non-unity stride predictor and the incremental-prefetching technique, both of which we introduce. However, the incremental-prefetching technique reduces the bandwidth consumed by stream buffers by 50% without a significant impact on performance.
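The abstract names a per-load non-unity stride predictor without describing its mechanism. As a rough illustration of the general idea only, the sketch below shows a generic stride table indexed by the address of the load instruction. The class names, the unbounded table, and the rule that a stride is trusted after two matching deltas are all assumptions made for this example; none of them are taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class StrideEntry:
    last_addr: int            # address of the most recent access by this load
    stride: int = 0           # last observed address delta
    confident: bool = False   # True once the same non-zero delta repeats


class PerLoadStridePredictor:
    """Generic PC-indexed stride table (hypothetical; not the paper's design)."""

    def __init__(self):
        # load PC -> StrideEntry; unbounded here, a real table would be finite
        self.table = {}

    def access(self, pc, addr):
        """Record an access and return a predicted next address, or None."""
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = StrideEntry(last_addr=addr)
            return None
        delta = addr - entry.last_addr
        # Trust the stride only if it matches the previous delta and is non-zero.
        entry.confident = (delta == entry.stride and delta != 0)
        entry.stride = delta
        entry.last_addr = addr
        return addr + delta if entry.confident else None
```

In this sketch, a load that walks an array with a 24-byte stride yields no prediction for its first two accesses, then predicts the next address on every later access until the pattern breaks. A prefetcher such as a stream buffer could use that predicted address as its next fetch target.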
CITED BY 24
Kevin Skadron, Pritpal S. Ahuja, Margaret Martonosi, Douglas W. Clark, Branch Prediction, Instruction-Window Size, and Cache Size: Performance Trade-Offs and Simulation Techniques, IEEE Transactions on Computers, v.48 n.11, p.1260-1281, November 1999
Lixin Zhang, Zhen Fang, Mike Parker, Binu K. Mathew, Lambert Schaelicke, John B. Carter, Wilson C. Hsieh, Sally A. McKee, The Impulse Memory Controller, IEEE Transactions on Computers, v.50 n.11, p.1117-1132, November 2001
Sorin Iacobovici, Lawrence Spracklen, Sudarshan Kadambi, Yuan Chou, Santosh G. Abraham, Effective stream-based and execution-based data prefetching, Proceedings of the 18th annual international conference on Supercomputing, June 26-July 01, 2004, Saint-Malo, France
Sally A. McKee, William A. Wulf, James H. Aylor, Maximo H. Salinas, Robert H. Klenke, Sung I. Hong, Dee A. B. Weikle, Dynamic Access Ordering for Streamed Computations, IEEE Transactions on Computers, v.49 n.11, p.1255-1271, November 2000
Chi-Keung Luk, Robert Muth, Harish Patil, Richard Weiss, P. Geoffrey Lowney, Robert Cohn, Profile-guided post-link stride prefetching, Proceedings of the 16th international conference on Supercomputing, June 22-26, 2002, New York, New York, USA
John B. Carter, Wilson C. Hsieh, Leigh B. Stoller, Mark Swanson, Lixin Zhang, Sally A. McKee, Impulse: Memory system support for scientific applications, Scientific Programming, v.7 n.3-4, p.195-209, August 1999
Akihiro Yamamoto, Yusuke Tanaka, Hideki Ando, Toshio Shimada, Data prefetching and address pre-calculation through instruction pre-execution with two-step physical register deallocation, Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture, p.33-40, September 16, 2007, Brasov, Romania