Owner prediction for accelerating cache-to-cache transfer misses in a cc-NUMA architecture
Source  Conference on High Performance Networking and Computing
Proceedings of the 2002 ACM/IEEE Conference on Supercomputing
Baltimore, Maryland
Pages: 1 - 12  
Year of Publication: 2002
Authors
Manuel E. Acacio  Universidad de Murcia, Spain
José González  Intel Barcelona Research Center, Intel Labs, Barcelona
José M. García  Universidad de Murcia, Spain
José Duato  Universidad Politécnica de Valencia, Spain
Sponsors
IEEE-CS\DATC : IEEE Computer Society
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
IEEE Computer Society Press  Los Alamitos, CA, USA

ABSTRACT

Cache misses for which data must be obtained from a remote cache (cache-to-cache transfer misses) account for an important fraction of the total miss rate. Unfortunately, cc-NUMA designs put the directory access on the critical path of 3-hop misses, which significantly penalizes these misses compared to SMP designs. This work studies owner prediction as a means of giving cc-NUMA multiprocessors more efficient support for cache-to-cache transfer misses. Our proposal comprises an effective prediction scheme and a coherence protocol designed to support prediction. Results indicate that owner prediction can significantly reduce the latency of cache-to-cache transfer misses, which translates into application speed-ups of up to 12%. To also accelerate most of the 3-hop misses that are either not predicted or mispredicted, we evaluate the inclusion of a small, fast directory cache in every node, which improves final performance by up to 16%.

