| Influence of cross-interferences on blocked loops: a case study with matrix-vector multiply |
| Full text |
Pdf
(915 KB)
|
| Source
|
ACM Transactions on Programming Languages and Systems (TOPLAS)
archive
Volume 17 , Issue 4 (July 1995)
table of contents
Pages: 561 - 575
Year of Publication: 1995
ISSN:0164-0925
|
|
Authors
|
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 2, Downloads (12 Months): 33, Citation Count: 8
|
|
|
ABSTRACT
State-of-the art data locality optimizing algorithms are targeted for local memories rather than for cache memories. Recent work on cache interferences seems to indicate that these phenomena can severely affect blocked algorithms cache performance. Because of cache conflicts, it is not possible to know the precise gain brought by blocking. It is even difficult to determine for which problem sizes blocking is useful. Computing the actual optimal block size is difficult because cache conflicts are highly irregular. In this article, we illustrate the issue of precisely evaluating cross-interferences in blocked loops with blocked matrix-vector multiply. Most significant interference phenomena are captured because unusual parameters such as array base addresses are being considered. The techniques used allow us to compute the precise improvement due to blocking and the threshold value of problem parameters for which the blocked loop should be preferred. It is also possible to derive an expression of the optimal block size as a function of problem parameters. Finally, it is shown that a precise rather than an approximate evaluation of cache conflicts is sometimes necessary to obtain near-optimal performance.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
EISENBEIS, C., JALBY, W., WINDHEISER, D., AND BODIN, F. 1990. A strategy for array management in local memory. In Proceedings of the 3rd Workshop on Programming Languages and Compilers for Parallel Computzng. Irvine, California.
|
| |
2
|
ESSEGHIR, K. 1993. Improving data locality for caches. M.S. thesis, Univ of Texas, Houston, ~rex.
|
| |
3
|
|
| |
4
|
FRICKER, C., TEMAM, O., AND JALBY, W. 1993. Accurate evMuation of blocked algorithms cache interferences. Tech. rep., Leiden Univ., Leiden, The Netherlands. Mar.
|
| |
5
|
|
 |
6
|
Monica D. Lam , Edward E. Rothberg , Michael E. Wolf, The cache performance and optimizations of blocked algorithms, Proceedings of the fourth international conference on Architectural support for programming languages and operating systems, p.63-74, April 08-11, 1991, Santa Clara, California, United States
|
| |
7
|
|
| |
8
|
|
| |
9
|
|
| |
10
|
TEMAM, 0., FRICKER, C., AND JALBY, W. 1993. Impact of cache interferences on usual numer- ical dense loop nests. In Proc. IEEE, special issue on Computer Performance Evaluation.
|
 |
11
|
O. Temam , C. Fricker , W. Jalby, Cache interference phenomena, Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems, p.261-271, May 16-20, 1994, Nashville, Tennessee, United States
|
 |
12
|
|
CITED BY 8
|
|
|
|
|
Richard E. Ladner , James D. Fix , Anthony LaMarca, Cache performance analysis of traversals and random accesses, Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms, p.613-622, January 17-19, 1999, Baltimore, Maryland, United States
|
|
|
Siddhartha Chatterjee , Vibhor V. Jain , Alvin R. Lebeck , Shyam Mundhra , Mithuna Thottethodi, Nonlinear array layouts for hierarchical memory systems, Proceedings of the 13th international conference on Supercomputing, p.444-453, June 20-25, 1999, Rhodes, Greece
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
REVIEW
"Michael Wolfe : Reviewer"
The cache behavior of “blocked” or “tiled”
linear-algebra loops is investigated. Unlike previous work, this paper
looks at the cache misses for fetches of one array due to interference
from another array. The goal is to
more...
|