ACM Home Page
Please provide us with feedback. Feedback
Towards a theory of cache-efficient algorithms
Full text PdfPdf (273 KB)
Source Journal of the ACM (JACM) archive
Volume 49 ,  Issue 6  (November 2002) table of contents
Pages: 828 - 858  
Year of Publication: 2002
ISSN:0004-5411
Authors
Sandeep Sen  Indian Institute of Technology Delhi, New Delhi, India
Siddhartha Chatterjee  IBM Research, Yorktown Heights, New York
Neeraj Dumir  Indian Institute of Technology Delhi, New Delhi, India
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 9,   Downloads (12 Months): 107,   Citation Count: 8
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/602220.602225
What is a DOI?

ABSTRACT

We present a model that enables us to analyze the running time of an algorithm on a computer with a memory hierarchy with limited associativity, in terms of various cache parameters. Our cache model, an extension of Aggarwal and Vitter's I/O model, enables us to establish useful relationships between the cache complexity and the I/O complexity of computations. As a corollary, we obtain cache-efficient algorithms in the single-level cache model for fundamental problems like sorting, FFT, and an important subclass of permutations. We also analyze the average-case cache behavior of mergesort, show that ignoring associativity concerns could lead to inferior performance, and present supporting experimental evidence.We further extend our model to multiple levels of cache with limited associativity and present optimal algorithms for matrix transpose and sorting. Our techniques may be used for systematic exploitation of the memory hierarchy starting from the algorithm design stage, and for dealing with the hitherto unresolved problem of limited associativity.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
2
 
3
Aggarwal, A., Chandra, A., and Snir, M. 1987b. Hierarchical memory with block transfer. In Proceedings of IEEE Foundations of Computer Science. IEEE Computer Society Press, Los Alamitos, Calif., 204--216.
4
 
5
Alpern, B., Carter, L., Feig, E., and Selker, T. 1994. The uniform memory hierarchy model of computation. Algorithmica 12, 2, 72--109.
 
6
 
7
 
8
9
10
 
11
Chatterjee, S., and Sen, S. 2000. Cache-efficient matrix transposition. In Proceedings of HPCA-6 (Toulouse, France). 195--205.
 
12
 
13
 
14
Floyd, R. 1972. Permuting information in idealized two-level storage. In Complexity of Computer Computations, R. E. Miller and J. W. Thatcher, Eds. Plenum Press, New York, N.Y., 105--109.
15
 
16
Frigo, M., and Johnson, S. G. 1998. FFTW: An adaptive software architecture for the FFT. In Proceedings of ICASSP'98, vol. 3 (Seattle, Wash.). IEEE Computer Society Press, Los Alamitos, Calif., 1381.
 
17
 
18
Goodrich, M., Tsay, J., Vengroff, D., and Vitter, J. 1993. External memory computational geometry. In Proceeding of IEEE Foundations of Computer Science. IEEE Computer Society Press, Los Alamitos, Calif., 714--723.
 
19
20
 
21
Kamath, A., Motwani, R., Palem, K., and Spirakis, P. 1994. Tail bounds for occupancy and the satisfiability threshold conjecture. In Proceeding of IEEE Foundations of Computer Science. IEEE Computer Society Press, Los Alamitos, Calif., 592--603.
 
22
23
 
24
 
25
 
26
Mehlhorn, K., and Sanders, P. 2000. Scanning multiple sequences via cache memory. citeseen.nj.nec.com/506957.html.
 
27
 
28
 
29
30
 
31
 
32
 
33
Vitter, J., and Shriver, E. 1994. Algorithms for parallel memory. I: Two-level memories. Algorithmica 12, 2, 110--147.


Collaborative Colleagues:
Sandeep Sen: colleagues
Siddhartha Chatterjee: colleagues
Neeraj Dumir: colleagues