Enhancing memory level parallelism via recovery-free value prediction
Source: International Conference on Supercomputing
Proceedings of the 17th Annual International Conference on Supercomputing
San Francisco, CA, USA
SESSION: Speculative execution
Pages: 326 - 335  
Year of Publication: 2003
ISBN:1-58113-733-8
Authors
Huiyang Zhou  North Carolina State University
Thomas M. Conte  North Carolina State University
Sponsors
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 9,   Downloads (12 Months): 32,   Citation Count: 11
DOI: http://doi.acm.org/10.1145/782814.782859

ABSTRACT

The ever-increasing computational power of contemporary microprocessors significantly reduces the execution time spent on arithmetic computations (i.e., computations that do not involve slow memory operations such as cache misses). Therefore, for memory-intensive workloads, it becomes more important to overlap multiple cache misses with one another than to overlap slow memory operations with other computations. In this paper, we propose a novel technique to parallelize sequential cache misses, thereby increasing memory-level parallelism (MLP). Our idea is based on value prediction, which was originally proposed as an instruction-level-parallelism (ILP) optimization to break true data dependencies. We advocate value prediction for its ability to enhance MLP rather than ILP. We propose to use value prediction and value-speculative execution only for prefetching, so that the complex prediction validation and misprediction recovery mechanisms are avoided and only minor changes to the microarchitecture are needed. The same hardware modifications also enable aggressive memory disambiguation for prefetching. The experimental results show that our technique enhances MLP effectively and achieves significant speedups even with a simple stride value predictor.
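As a rough illustration of the "simple stride value predictor" mentioned in the abstract, the following Python sketch models a per-instruction table that predicts a load's next value as its last value plus the last observed stride. It is a software toy under stated assumptions, not the hardware design evaluated in the paper: the class name `StrideValuePredictor`, the helper `run_loads`, the table organization, and the example addresses are all hypothetical.

```python
class StrideValuePredictor:
    """Toy per-PC stride predictor (hypothetical sketch, not the paper's design)."""

    def __init__(self):
        # Maps a load's program counter to (last_value, stride).
        self.table = {}

    def predict(self, pc):
        """Return the predicted next value for this load, or None if unseen."""
        entry = self.table.get(pc)
        if entry is None:
            return None
        last_value, stride = entry
        return last_value + stride

    def update(self, pc, actual):
        """Record the actual loaded value and the stride from the previous one."""
        entry = self.table.get(pc)
        stride = 0 if entry is None else actual - entry[0]
        self.table[pc] = (actual, stride)


def run_loads(pc, values):
    """Count how many loaded values were predicted correctly.

    In a prefetch-only scheme, a correct prediction would let a dependent
    miss be issued early; a wrong one would cost only a useless prefetch,
    so this model needs no recovery step.
    """
    predictor = StrideValuePredictor()
    correct = 0
    for actual in values:
        if predictor.predict(pc) == actual:
            correct += 1
        predictor.update(pc, actual)
    return correct


if __name__ == "__main__":
    # Assumed example: a load that returns pointers spaced 0x40 bytes apart.
    pointers = [0x1000, 0x1040, 0x1080, 0x10C0, 0x1100]
    print(run_loads(0x400, pointers))
```

In this sketch the first two loads only train the table, so a regular pointer sequence starts to be predicted from the third load onward, while an irregular sequence is never predicted and simply yields no useful prefetches.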


