Informing memory operations: providing memory performance feedback in modern processors
Source: International Symposium on Computer Architecture
Proceedings of the 23rd Annual International Symposium on Computer Architecture
Philadelphia, Pennsylvania, United States
Pages: 260-270
Year of Publication: 1996
ISBN: 0-89791-786-3
Authors
Mark Horowitz  Computer Systems Laboratory, Stanford University
Margaret Martonosi  Department of Electrical Engineering, Princeton University
Todd C. Mowry  Department of Electrical and Computer Engineering, University of Toronto
Michael D. Smith  Division of Applied Sciences, Harvard University
Sponsors
IEEE-CS TCCA: TC on Computer Architecture
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA

DOI: http://doi.acm.org/10.1145/232973.233000

ABSTRACT

Memory latency is an important bottleneck in system performance that cannot be adequately solved by hardware alone. Several promising software techniques have been shown to address this problem successfully in specific situations. However, the generality of these software approaches has been limited because current architectures do not provide a fine-grained, low-overhead mechanism for observing and reacting to memory behavior directly. To fill this need, we propose a new class of memory operations called informing memory operations, which essentially consist of a memory operation combined (either implicitly or explicitly) with a conditional branch-and-link operation that is taken only if the reference suffers a cache miss. We describe two different implementations of informing memory operations (one based on a cache-outcome condition code and another based on low-overhead traps) and find that modern in-order-issue and out-of-order-issue superscalar processors already contain the bulk of the necessary hardware support. We describe how a number of software-based memory optimizations can exploit informing memory operations to enhance performance, and look at cache coherence with fine-grained access control as a case study. Our performance results demonstrate that the runtime overhead of invoking the informing mechanism on the Alpha 21164 and MIPS R10000 processors is generally small enough to provide considerable flexibility to hardware and software designers, and that the cache coherence application has improved performance compared to other current solutions. We believe that the inclusion of informing memory operations in future processors may spur even more innovative performance optimizations.
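
The core idea of the abstract, a load that transfers control to a handler only when it misses in the cache, can be illustrated with a small software model. The sketch below is not the paper's hardware design: the direct-mapped cache geometry, the class names `DirectMappedCache` and `InformingMemory`, and the `last_hit` flag (standing in for a cache-outcome condition code) are all assumptions made for illustration. The `miss_handler` callback plays the role of the branch-and-link target, here used to build a per-address miss profile.

```python
from typing import Callable, Dict, List, Optional


class DirectMappedCache:
    """Toy direct-mapped cache that tracks tags only (hypothetical geometry)."""

    def __init__(self, num_lines: int = 4, line_size: int = 16) -> None:
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags: List[Optional[int]] = [None] * num_lines

    def access(self, addr: int) -> bool:
        """Return True on a hit; on a miss, fill the line and return False."""
        block = addr // self.line_size
        index = block % self.num_lines
        tag = block // self.num_lines
        hit = self.tags[index] == tag
        if not hit:
            self.tags[index] = tag
        return hit


class InformingMemory:
    """Memory whose loads report cache outcomes to software."""

    def __init__(self, cache: DirectMappedCache,
                 miss_handler: Optional[Callable[[int], None]] = None) -> None:
        self.cache = cache
        self.data: Dict[int, int] = {}
        self.miss_handler = miss_handler
        # Stands in for a cache-outcome condition code: outcome of the last load.
        self.last_hit = True

    def store(self, addr: int, value: int) -> None:
        # Simplification: stores bypass the cache model entirely.
        self.data[addr] = value

    def informing_load(self, addr: int) -> int:
        self.last_hit = self.cache.access(addr)
        if not self.last_hit and self.miss_handler is not None:
            # Models the branch-and-link: the handler runs only on a miss,
            # then control returns to the instruction after the load.
            self.miss_handler(addr)
        return self.data.get(addr, 0)


# Example use: a miss handler that counts misses per address.
miss_counts: Dict[int, int] = {}


def count_miss(addr: int) -> None:
    miss_counts[addr] = miss_counts.get(addr, 0) + 1


mem = InformingMemory(DirectMappedCache(), miss_handler=count_miss)
for addr in (0, 4, 64, 0):
    mem.informing_load(addr)

print(miss_counts)  # {0: 2, 64: 1}
```

In this model, address 4 shares a line with address 0 and hits, while address 64 maps to the same cache index and evicts that line, so the final load of address 0 misses again. Software that prefers the condition-code style can pass no handler and test `last_hit` after each load instead.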

