ACM Home Page
Please provide us with feedback. Feedback
Exact analysis of the cache behavior of nested loops
Full text PdfPdf (1.66 MB)
Source Conference on Programming Language Design and Implementation archive
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation table of contents
Snowbird, Utah, United States
Pages: 286 - 297  
Year of Publication: 2001
ISBN:1-58113-414-2
Also published in ...
Authors
Siddhartha Chatterjee  Department of Computer Science, The University of North Carolina, Chapel Hill, NC
Erin Parker  Department of Computer Science, The University of North Carolina, Chapel Hill, NC
Philip J. Hanlon  Department of Mathematics, University of Michigan, Ann Arbor, MI
Alvin R. Lebeck  Department of Computer Science, Duke University, Durham, NC
Sponsor
SIGPLAN: ACM Special Interest Group on Programming Languages
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 11,   Downloads (12 Months): 74,   Citation Count: 37
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/378795.378859
What is a DOI?

ABSTRACT

We develop from first principles an exact model of the behavior of loop nests executing in a memory hicrarchy, by using a nontraditional classification of misses that has the key property of composability. We use Presburger formulas to express various kinds of misses as well as the state of the cache at the end of the loop nest. We use existing tools to simplify these formulas and to count cache misses. The model is powerful enough to handle imperfect loop nests and various flavors of non-linear array layouts based on bit interleaving of array indices. We also indicate how to handle modest levels of associativity, and how to perform limited symbolic analysis of cache behavior. The complexity of the formulas relates to the static structure of the loop nest rather than to its dynamic trip count, allowing our model to gain efficiency in counting cache misses by exploiting repetitive patterns of cache behavior. Validation against cache simulation confirms the exactness of our formulation. Our method can serve as the basis for a static performance predictor to guide program and data transformations to improve performance.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
 
3
 
4
 
5
6
7
 
8
 
9
 
10
P. Boulet and X. Redon. SPPoC: fonctionnemen et applications. Research Report 00-04, LIFL (Laboratoire de Recherche en Informatique de l'Universite des Sciences et Technologies de Lille), 2000. In French. Also see http://www.lifl.fr/west/sppoc/.
 
11
M. Brehob and R. Enbody. A mathematical model of locality and caching. Technical Report TR-MSU-CPS-96-TBD, Michigan State University, Nov. 1996.
 
12
13
14
15
 
16
S. Chatterjee and S. Sen. Cache-efficient matrix transposition. In Proceedings of HPCA-6, pages 195-205, Toulouse, France, Jan. 2000.
17
18
19
 
20
P. Feautrier. Dataflow analysis of array and scalar references. International Journal of Parallel Programming, 20(1):23-54, 1991.
21
22
 
23
24
25
26
 
27
 
28
29
 
30
 
31
 
32
 
33
W. Kelly, V. Maslov, W. Pugh, E. Rosser, T. Shpeisman, and D. Wonnacott. The Omega Calculator and Library, version 1.1.0, Nov. 1996.
 
34
W. Kelly, V. Maslov, W. Pugh, E. Rosser, T. Shpeisman, and D. Wonnacott. The Omega Library Version 1.1.0 Interface Guide, Nov. 1996.
 
35
 
36
W. Kelly and W. Pugh. Finding legal reordering transformations using mappings. Technical Report CS-TR-3297, Department of Compute Science, University of Maryland, College Park, MD, June 1994.
37
38
 
39
40
41
 
42
V. Loechner. PolyLib: A Library for Manipulating Parameterized Polyhedra, Mar. 1999.
43
44
 
45
 
46
D. C. Oppen. A 2 2 2 pn upper bound on the complexity of Presburger arithmetic. J. Comput. Syst. Sci., 16(3):323-332, July 1978.
47
 
48
49
50
51
 
52
U. Sch~ning. Complexity of Presburger arithmetic with fixed quantifier dimension. Theory of Computing Systems, 30:423-428, 1997.
 
53
N. Shibata, K. Okana, T. Higashino, and K. Taniguchi. A decision algorithm dor prenex form rational Presburger sentences based on combinatorial geometry. In Proceedings of the 2nd International Conference on Discrete Mathematics and Theoretical Computer Science and the 5th Australasian Theory Symposium (DMTCS'99+CATS'99), pages 344-359, Jan. 1999.
54
 
55
The Stanford Compiler Group. SUIF: An Infrastructure for Research on Parallelizing and Optimizing Compilers. http://suif.stanford.edu.
 
56
R. A. Sugumar and S. G. Abraham. Efficient simulation of multiple cache configurations using binomial trees. Technical Report CSE-TR-111-91, 1991.
57
 
58
 
59
60
61
62
63
 
64
H. Zhang and M. Martonosi. Mathematical cache miss analysis for pointer data structures. In Proceedings of the SIAM Conference on Parallel Processing for Scientific Computing, Portsmouth, VA, Mar. 2001. CD-ROM.

CITED BY  37

Collaborative Colleagues:
Siddhartha Chatterjee: colleagues
Erin Parker: colleagues
Philip J. Hanlon: colleagues
Alvin R. Lebeck: colleagues