An analysis of the effects of miss clustering on the cost of a cache miss
Source
Conference On Computing Frontiers
Proceedings of the 4th international conference on Computing frontiers
Ischia, Italy
SESSION: Memory hierarchy
Pages: 3-12
Year of Publication: 2007
ISBN: 978-1-59593-683-7
Authors
Thomas R. Puzak  IBM -- T. J. Watson Research Center, Yorktown Heights, NY
A. Hartstein  IBM -- T. J. Watson Research Center, Yorktown Heights, NY
P. G. Emma  IBM -- T. J. Watson Research Center, Yorktown Heights, NY
V. Srinivasan  IBM -- T. J. Watson Research Center, Yorktown Heights, NY
Jim Mitchell  IBM -- T. J. Watson Research Center, Yorktown Heights, NY
Sponsors
ACM: Association for Computing Machinery
SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1242531.1242536

ABSTRACT

In this paper we describe a new technique, called pipeline spectroscopy, and use it to measure the cost of each cache miss. The cost of a miss is graphed as a histogram, which gives a precise, detailed view of the cost of each cache miss throughout all levels of the memory hierarchy. We call the graphs 'spectrograms' because they reveal certain signature features of the processor's memory hierarchy, the pipeline, and the miss pattern itself. We then provide two examples that use spectroscopy to optimize the processor's hardware or an application's software. The first example demonstrates how a miss spectrogram can aid software designers in analyzing the performance of an application. The second example uses a miss spectrogram to analyze bus queueing. Our experiments show that performance gains of up to 8% are possible. Detailed analysis of a spectrogram leads to much greater insight into pipeline dynamics, including effects due to miss clustering, miss overlap, prefetching, and miss queueing delays.
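
As a rough illustration of the histogram view described above, the following Python sketch bins per-miss costs into a simple text "spectrogram". It is only a sketch under stated assumptions: the abstract does not say how the per-miss cost is measured, so the costs are taken as given, and the function names (`miss_spectrogram`, `render`), the bin width, and the sample values are all hypothetical and not taken from the paper.

```python
from collections import Counter


def miss_spectrogram(miss_costs, bin_width=10):
    """Bin per-miss costs (in cycles) into a histogram.

    Returns a dict mapping the start of each bin to the number of
    misses whose cost falls in [bin_start, bin_start + bin_width).
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    histogram = Counter()
    for cost in miss_costs:
        bin_start = (cost // bin_width) * bin_width
        histogram[bin_start] += 1
    # Sort by bin start so the output reads left to right like a plot axis.
    return dict(sorted(histogram.items()))


def render(histogram, bin_width=10):
    """Render the histogram as text, one line per non-empty bin."""
    lines = []
    for bin_start, count in histogram.items():
        bin_end = bin_start + bin_width - 1
        lines.append(f"{bin_start:4d}-{bin_end:<4d} {'#' * count}")
    return "\n".join(lines)


# Hypothetical per-miss costs in cycles, made up for illustration only.
sample_costs = [12, 14, 15, 15, 80, 82, 85, 85, 86, 300, 305]
spectrogram = miss_spectrogram(sample_costs)
print(render(spectrogram))
```

With made-up data like this, misses that cost about the same number of cycles pile up in the same bin. Separate peaks of that kind are the sort of feature one might hope to relate to different levels of the memory hierarchy, although that reading is an assumption of this sketch and not a result from the paper.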

