Trace preconstruction
Source: International Symposium on Computer Architecture
Proceedings of the 27th Annual International Symposium on Computer Architecture
Vancouver, British Columbia, Canada
Pages: 37-46
Year of Publication: 2000
ISBN: 1-58113-232-8
Authors
Quinn Jacobson  Sun Microsystems, 901 San Antonio Road, Palo Alto, CA
James E. Smith  University of Wisconsin, Madison, Department of Electrical & Computer Engineering, Madison, WI
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 10,   Downloads (12 Months): 25,   Citation Count: 7
DOI: http://doi.acm.org/10.1145/339647.339653

ABSTRACT

Trace caches enable high-bandwidth, low-latency instruction supply, but they have a high miss penalty and relatively large working sets. Consequently, their performance may suffer from capacity and compulsory misses. Trace preconstruction augments a trace cache by performing a function analogous to prefetching. The preconstruction mechanism observes the processor's instruction dispatch stream to detect opportunities for jumping ahead of the processor. When it finds one, it fetches static instructions from the predicted future region of the program and constructs a set of traces before they are needed. Trace preconstruction can significantly increase both the performance of the trace cache and its robustness to varying workloads. All but one of the SPECint95 benchmarks see a notable reduction in trace cache miss rates from preconstruction. The three benchmarks with the largest working sets (gcc, go and vortex) see a 30% to 80% reduction in trace cache misses. We also consider the integration of preconstruction with another trace-specific mechanism (preprocessing) to produce a high-performance front end. When combined, preconstruction and trace preprocessing produce an average speedup of 14% for the SPECint95 benchmarks.
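
The idea of jumping ahead of the processor and building traces early can be illustrated with a toy model. The following is a minimal sketch under stated assumptions, not the mechanism evaluated in the paper: the names `TraceCache`, `build_trace`, `preconstruct_region`, `TRACE_LEN` and `PROGRAM` are hypothetical, traces are assumed to end at calls and returns, the return point of a call is assumed to be the only "jump ahead" opportunity, and preconstructed traces are inserted directly into the trace cache.

```python
from collections import OrderedDict

TRACE_LEN = 4  # assumed maximum instructions per trace (illustrative only)


class TraceCache:
    """Tiny LRU trace cache keyed by trace start address (hypothetical model)."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.misses = 0

    def lookup(self, start):
        """Return the cached trace, or None and count a miss."""
        trace = self.lines.get(start)
        if trace is None:
            self.misses += 1
        else:
            self.lines.move_to_end(start)
        return trace

    def insert(self, start, trace):
        self.lines[start] = trace
        self.lines.move_to_end(start)
        while len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict least recently used


def build_trace(program, start):
    """Walk static code from `start`; stop at TRACE_LEN, a call, a ret, or halt."""
    trace, pc = [], start
    while len(trace) < TRACE_LEN:
        trace.append(pc)
        if program[pc][0] != "op":
            break
        pc += 1
    return trace


def preconstruct_region(program, cache, start, max_traces=2):
    """Assumed heuristic: build a few consecutive traces from a future region."""
    pc = start
    for _ in range(max_traces):
        trace = build_trace(program, pc)
        if pc not in cache.lines:
            cache.insert(pc, trace)
        if program[trace[-1]][0] != "op":
            break  # region ends at a control transfer
        pc = trace[-1] + 1


def run(program, cache, preconstruct):
    """Execute trace by trace and return the number of trace cache misses."""
    pc, stack = 0, []
    while True:
        trace = cache.lookup(pc)
        if trace is None:  # demand miss: build the trace on the spot
            trace = build_trace(program, pc)
            cache.insert(pc, trace)
        last = trace[-1]
        kind = program[last][0]
        if kind == "halt":
            return cache.misses
        if kind == "call":
            if preconstruct:  # jump ahead to the return point of this call
                preconstruct_region(program, cache, last + 1)
            stack.append(last + 1)
            pc = program[last][1]
        elif kind == "ret":
            pc = stack.pop()
        else:
            pc = last + 1


# Hypothetical program: main (addresses 0-6) calls a two-instruction function at 7.
PROGRAM = [
    ("op",), ("call", 7), ("op",), ("op",), ("call", 7), ("op",), ("halt",),
    ("op",), ("ret",),
]

if __name__ == "__main__":
    print("demand only:", run(PROGRAM, TraceCache(), preconstruct=False))
    print("with preconstruction:", run(PROGRAM, TraceCache(), preconstruct=True))
```

On this toy program the demand-only run takes four compulsory misses and the preconstructing run takes two, because the traces at each return point are built while the callee executes. The numbers say nothing about real workloads; they only show how building traces ahead of the fetch stream can remove compulsory misses.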


