|
ABSTRACT
Caches have become increasingly important with the widening gap between main memory and processor speeds. Small and fast cache memories are designed to bridge this discrepancy. However, they are only effective when programs exhibit sufficient data locality. In addition, caches are a source of unpredictability, resulting in programs sometimes behaving in a different way than expected. Detailed information about the number of cache misses and their causes allows us to predict cache behavior and to detect bottlenecks. Small modifications in the source code may change memory patterns, thereby altering the cache behavior. Code transformations, which take the cache behavior into account, might result in a high cache performance improvement. However, cache memory behavior is very hard to predict, thus making the task of optimizing and timing cache behavior very difficult. This article proposes and evaluates a new compiler framework that times cache behavior for multitasking systems. Our method explores the use of cache partitioning and dynamic cache locking to provide worst-case performance estimates in a safe and tight way for multitasking systems. We use cache partitioning, which divides the cache among tasks to eliminate intertask cache interferences. We combine static cache analysis and cache-locking mechanisms to ensure that all intratask conflicts, and consequently, memory access times, are exactly predictable. The results of our experiments demonstrate the capability of our framework to describe cache behavior at compile time. We compare our timing approach with a system equipped with a nonpartitioned, but statically, locked data cache. Our method outperforms static cache locking for all analyzed task sets under various cache architectures, demonstrating that our fully predictable scheme does not compromise the performance of the transformed programs.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
Arnold, R., Müeller, F., Whalley, D., and Harmon, M. 1994. Bounding worst-case instruction cache performance. In Proceedings of 15th Real-Time Systems Symposium (RTSS'94). 172--181.
|
| |
3
|
Basumallick, S. and Nielsen, K. 1994. Cache issues in real-time systems. In Proceedings ACM Workshop on Languages, Compilers and Tools for Real-Time Systems (LCTES'94).
|
| |
4
|
|
| |
5
|
|
| |
6
|
|
| |
7
|
Busquets-Mataix, J. V., Serrano, J. J., and Wellings, A. 1997. Hybrid instruction cache partitioning for preemptive real-time systems. In Proceedings of 9th Euromicro Workshop on Real-Time Systems (EUROMICRO-RTS'97).
|
| |
8
|
Campoy, M., Ivars, A. P., and Busquets-Mataix, J. V. 2001. Static use of locking caches in multitask preemptive real-time systems. In Proceedings of IEEE/IEE Real-Time Embedded Systems Workshop (Satellite of the IEEE Real-Time Systems Symposium).
|
| |
9
|
|
| |
10
|
Engblom, J. and Ermedahl, A. 2000. Modeling complex flows for worst-case execution time analysis. In Proceedings of the 21st IEEE Real-Time Systems Symposium (RTSS'00).
|
| |
11
|
|
| |
12
|
|
| |
13
|
|
| |
14
|
|
 |
15
|
|
| |
16
|
Gustafsson, J. 2000. Analyzing execution time of object-oriented programs using abstract interpretation. Ph.D. thesis, Uppsala University.
|
| |
17
|
|
| |
18
|
|
| |
19
|
IBM Microelectronics Division. 1999. The PowerPC 440 core.
|
| |
20
|
Integrated Device Technologies. 2001. 79RC64574/RC64575 Data Sheet.
|
| |
21
|
Jeffay, K. and Stone, D. L. 1993. Accounting for interrupt handling costs in dynamic priority task systems. In Proceedings of 14th Real-Time Systems Symposium (RTSS'93). 212--221.
|
| |
22
|
Joseph, M. and Pandya, P. 1986. Finding response times in a real-time system. The Computer Journal 29, 5, 390--395.
|
| |
23
|
|
| |
24
|
|
| |
25
|
Kirk, D. B. 1989. SMART (strategic memory allocation for real-time) cache design. In Proceedings of 10th Real-Time Systems Symposium (RTSS'89).
|
 |
26
|
Monica D. Lam , Edward E. Rothberg , Michael E. Wolf, The cache performance and optimizations of blocked algorithms, Proceedings of the fourth international conference on Architectural support for programming languages and operating systems, p.63-74, April 08-11, 1991, Santa Clara, California, United States
|
| |
27
|
Chang-Gun Lee , Joosun Hahn , Yang-Min Seo , Sang Lyul Min , Rhan Ha , Seongsoo Hong , Chang Yun Park , Minsuk Lee , Chong Sang Kim, Analysis of Cache-Related Preemption Delay in Fixed-Priority Preemptive Scheduling, IEEE Transactions on Computers, v.47 n.6, p.700-713, June 1998
[doi> 10.1109/12.689649]
|
| |
28
|
|
| |
29
|
Li, Y. T. S., Malik, S., and Wolfe, A. 1996. Cache modeling and path analysis for real-time software. In Proceedings of 17th Real-Time Systems Symposium (RTSS'96).
|
| |
30
|
|
| |
31
|
Lim, S. S., Bae, Y. H., Jang, G. T., Rhee, B. D., Min, S. L., Park, C. Y., Shin, H., Park, K., and Kim, C. S. 1994. An accurate worst case timing analysis technique for RISC processors. In Proceedings of 15th Real-Time Systems Symposium (RTSS'94). 97--108.
|
| |
32
|
|
| |
33
|
|
| |
34
|
|
| |
35
|
MIPS Technologies. 2001. MIPS32 4Kp- Embedded, MIPS Processor Core.
|
| |
36
|
Motorola Inc. 1996. PowerPC 604e RISC Microprocessor Technical Summary.
|
 |
37
|
|
| |
38
|
|
 |
39
|
|
| |
40
|
|
| |
41
|
Sun Microelectronics. 1997. microSPARC-IIep User's Manual.
|
| |
42
|
|
| |
43
|
|
| |
44
|
|
 |
45
|
|
| |
46
|
|
 |
47
|
|
| |
48
|
|
| |
49
|
|
 |
50
|
|
| |
51
|
Wolfe, A. 1993. Software-based cache partitioning for real-time applications. In Proceedings of the 3rd International Workshop on Responsive Computer Systems.
|
| |
52
|
|
| |
53
|
|
|