ACM Home Page
Please provide us with feedback. Feedback
A fast and accurate framework to analyze and optimize cache memory behavior
Full text PdfPdf (270 KB)
Source ACM Transactions on Programming Languages and Systems (TOPLAS) archive
Volume 26 ,  Issue 2  (March 2004) table of contents
Pages: 263 - 300  
Year of Publication: 2004
ISSN:0164-0925
Authors
Xavier Vera  Universitat Politècnica de Catalunya-Barcelona, Barcelona, Spain
Nerina Bermudo  Universitat Politècnica de Catalunya-Barcelona, Barcelona, Spain
Josep Llosa  Universitat Politècnica de Catalunya-Barcelona, Barcelona, Spain
Antonio González  Universitat Politècnica de Catalunya-Barcelona, Barcelona, Spain
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 6,   Downloads (12 Months): 73,   Citation Count: 6
Additional Information:

abstract   references   cited by   index terms   review   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/973097.973099
What is a DOI?

ABSTRACT

The gap between processor and main memory performance increases every year. In order to overcome this problem, cache memories are widely used. However, they are only effective when programs exhibit sufficient data locality. Compile-time program transformations can significantly improve the performance of the cache. To apply most of these transformations, the compiler requires a precise knowledge of the locality of the different sections of the code, both before and after being transformed.Cache miss equations (CMEs) allow us to obtain an analytical and precise description of the cache memory behavior for loop-oriented codes. Unfortunately, a direct solution of the CMEs is computationally intractable due to its NP-complete nature.This article proposes a fast and accurate approach to estimate the solution of the CMEs. We use sampling techniques to approximate the absolute miss ratio of each reference by analyzing a small subset of the iteration space. The size of the subset, and therefore the analysis time, is determined by the accuracy selected by the user. In order to reduce the complexity of the algorithm to solve CMEs, effective mathematical techniques have been developed to analyze the subset of the iteration space that is being considered. These techniques exploit some properties of the particular polyhedra represented by CMEs.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
3
 
4
Ayguadé, E. et al. 1995. A uniform internal representation for high-level and instruction-level transformations. Tech rep. UPC-DAC-95-02. Universitat Politècnica de Catalunya, Barcelona, Spain.
 
5
 
6
7
 
8
9
10
11
12
13
 
14
DeGroot, M. 1998. Probability and Statistics. Addison-Wesley, Reading, MA.
 
15
 
16
 
17
 
18
 
19
20
21
22
23
24
 
25
 
26
Hill, M. n.d. DineroIII: a uniprocessor cache simulator (http://www.cs.wisc.edu/∼larus/warts.html).
 
27
28
 
29
Kreisel, G. and Krevine, J. L. 1967. Elements of Mathematical Logic. North-Holland, Amsterdam, The Netherlands.
30
 
31
 
32
33
34
 
35
McCabe, M. 1989. Introduction to the Practice of Statistics. Freeman & Co., New York, NY.
36
 
37
MIPS. 1988. RISCompiler Languages Programmer's Guide. MIPS Computer Systems, Sunnyvale, CA.
38
 
39
Padua, D. et al. 1994. Polaris Developer's Document. Available online at http://polaris.is.uiuc.edu/polaris/polaris--developer/polaris--developer.html.
40
41
42
 
43
 
44
Sánchez, F. and González, A. 1998. Fast, flexible and accurate data locality analysis. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT'98).
45
 
46
47
48
49
 
50
 
51
Vera, X., Llosa, J., and González, A. 2002. Near-optimal padding for removing conflict misses. In Proceedings of the 15th Workshop on Languages and Compilers for Parallel Computers (LCPC'02).
 
52
 
53
Wilde, D. 1993. A library for doing polyhedral operations. Tech. rep. 785, Oregon State University.
54
55



REVIEW

"Eno Thereska : Reviewer"

The disparity between central processing unit (CPU) and memory speeds is addressed in this paper. Memory speeds lag behind CPU speeds. Caches are often used to mitigate this problem; programs that exhibit a high degree of locality can benefit most  more...

Collaborative Colleagues:
Xavier Vera: colleagues
Nerina Bermudo: colleagues
Josep Llosa: colleagues
Antonio González: colleagues