Skewed associativity enhances performance predictability
Source International Symposium on Computer Architecture archive
Proceedings of the 22nd annual international symposium on Computer architecture table of contents
S. Margherita Ligure, Italy
Pages: 265 - 274  
Year of Publication: 1995
ISBN:0-89791-698-0
Authors
François Bodin  IRISA-INRIA, Campus de Beaulieu, 35042 Rennes Cedex, France
André Seznec  IRISA-INRIA, Campus de Beaulieu, 35042 Rennes Cedex, France
Sponsors
IEEE-CS\TCCA : TC on Computer Architecture
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/223982.224437

ABSTRACT

Performance tuning becomes harder as computer technology advances. One factor is the increasing complexity of memory hierarchies. Most modern machines now use at least one level of cache memory. To reduce execution stalls, the number of cache misses must be very low. Software techniques that improve locality, such as loop blocking and copying, have been developed for numerical codes. Unfortunately, the behavior of direct-mapped and set-associative caches is still erratic when large numerical data sets are accessed. Execution time can vary drastically for the same loop kernel depending on uncontrolled factors such as array leading size. The only software method available to improve execution time stability is the copying of frequently used data, which is costly in execution time. Users are not usually cache organisation experts. They are not aware of such phenomena and have no control over them.

In this paper, we show that the recently proposed four-way skewed-associative cache yields very stable execution times and good average miss ratios on blocked algorithms. As a result, execution time is faster and much more predictable than with conventional caches. Because of this better behavior, it is possible to use larger block sizes with blocked algorithms, which further reduces blocking overhead costs.
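
The dependence on array leading size can be illustrated with a toy model. The following is a minimal sketch and is not taken from the paper: it assumes a hypothetical 8 KB direct-mapped cache with 32-byte lines, 8-byte array elements, and a column-major array whose row is read repeatedly. The function names `direct_mapped_misses` and `row_walk` are illustrative only.

```python
def direct_mapped_misses(addresses, cache_bytes=8192, line_bytes=32):
    """Count misses for a byte-address trace in a direct-mapped cache.

    Cache geometry is an assumption chosen for illustration.
    """
    num_lines = cache_bytes // line_bytes
    tags = [None] * num_lines          # one resident memory block per cache line
    misses = 0
    for addr in addresses:
        block = addr // line_bytes     # memory block holding this address
        index = block % num_lines      # the single cache line it may occupy
        if tags[index] != block:
            tags[index] = block        # evict whatever was there
            misses += 1
    return misses


def row_walk(leading_dim, n_elems=64, passes=10, elem_bytes=8):
    """Byte addresses touched when one row of a column-major array
    (stride = leading_dim elements) is read `passes` times."""
    one_pass = [j * leading_dim * elem_bytes for j in range(n_elems)]
    return one_pass * passes


for leading_dim in (1024, 1028):
    trace = row_walk(leading_dim)
    print(leading_dim, direct_mapped_misses(trace), "misses of", len(trace))
```

In this model, a leading size of 1024 makes the stride equal to the cache size, so all 64 elements compete for the same cache line and every access misses. Padding the leading size to 1028 spreads the elements over 64 different lines, so only the first pass misses. The access pattern is identical in both cases; only the array leading size differs.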



