|
ABSTRACT
Multicores are becoming ubiquitous, not only in general-purpose but also embedded computing. This trend is a reflexion of contemporary embedded applications posing steadily increasing demands in processing power. On such platforms, prediction of timing behavior to ensure that deadlines of real-time tasks can be met is becoming increasingly difficult. While real-time multicore scheduling approaches help to assure deadlines based on firm theoretical properties, their reliance on task migration poses a significant challenge to timing predictability in practice. Task migration actually (a) reduces timing predictability for contemporary multicores due to cache warm-up overheads while (b) increasing traffic on the network-on-chip (NoC) interconnect. This paper puts forth a fundamentally new approach to increase the timing predictability of multicore architectures aimed at task migration in embedded environments. A task migration between two cores imposes cache warm-up overheads on the migration target, which can lead to missed deadlines for tight real-time schedules. We propose novel micro-architectural support to migrate cache lines. Our scheme shows dramatically increased predictability in the presence of cross-core migration. Experimental results for schedules demonstrate that our scheme enables real-time tasks to meet their deadlines in the presence of task migration. Our results illustrate that increases in execution time due to migration is reduced by our scheme to levels that may prevent deadline misses of real-time tasks that would otherwise occur. Our mechanism imposes an overhead at a fraction of the task's execution time, yet this overhead can be steered to fill idle slots in the schedule, i.e., it does not contribute to the execution time of the migrated task. Overall, our novel migration scheme provides a unique mechanism capable of significantly increasing timing predictability in the wake of task migration.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Tera-scale research prototype: Connecting 80 simple sores on a single test chip. ftp://download.intel.com/research/platform/terascale/terascaleresearchprototypebackgrounder.pdf.
|
| |
2
|
Tilera processor family. http://www.tilera.com/products/processors.php.
|
| |
3
|
Wcet project benchmarks, 2007. http://www.mrtc.mdh.se/projects/wcetbenchmarks.html.
|
| |
4
|
|
| |
5
|
|
| |
6
|
J. Anderson and A. Srinivasan. Early-release fair scheduling. In Euromicro Conference on Real-Time Systems, pages 35--43, June 2000.
|
| |
7
|
|
| |
8
|
R. Arnold, F. Mueller, D. B. Whalley, and M. Harmon. Bounding worst-case instruction cache performance. In IEEE Real-Time Systems Symposium, pages 172--181, Dec. 1994.
|
| |
9
|
|
| |
10
|
S. Baruah, N. Cohen, C. Plaxton, and D. Varvel. Proportionate progress: A notion of fairness in resource allocation. Algorithmica, 15:600--625, 1996.
|
| |
11
|
Stefano Bertozzi , Andrea Acquaviva , Davide Bertozzi , Antonio Poggiali, Supporting task migration in multi-processor systems-on-chip: a feasibility study, Proceedings of the conference on Design, automation and test in Europe: Proceedings, March 06-10, 2006, Munich, Germany
|
| |
12
|
|
| |
13
|
|
| |
14
|
|
| |
15
|
Robit Chandra , Leonardo Dagum , Dave Kohr , Dror Maydan , Jeff McDonald , Ramesh Menon, Parallel programming in OpenMP, Morgan Kaufmann Publishers Inc., San Francisco, CA, 2001
|
 |
16
|
|
 |
17
|
|
 |
18
|
|
| |
19
|
S. Dhall and C. Liu. On a real-time scheduling problem. Operations Research, 26(1):127--140, 1978.
|
| |
20
|
|
 |
21
|
|
| |
22
|
A. Fedorova, M. Seltzer, and M. Smith. Cache-fair thread scheduling for multicore processors. Technical Report TR-17-06, Harvard University, Oct. 2006.
|
| |
23
|
|
| |
24
|
|
 |
25
|
|
| |
26
|
|
| |
27
|
|
| |
28
|
S. Lauzac, R. Melhem, and D. Mosse. Comparison of global and partitioning schemes for scheduling rate monotonic tasks on a multiprocessor. In Euromicro Workshop on Real-Time Systems, pages 188--195, 1998.
|
| |
29
|
Chang-Gun Lee , J. Hahn , Sang Lyul Min , R. Ha , Seongsoo Hong , Chang Yun Park , Minsuk Lee , Chong Sang Kim, Analysis of cache-related preemption delay in fixed-priority preemptive scheduling, Proceedings of the 17th IEEE Real-Time Systems Symposium, p.264, December 04-06, 1996
|
 |
30
|
|
| |
31
|
T. Li, P. Brett, B. Hohlt, R. Knauerhase, S. McElderry, and S. Hahn. Operating system support for shared-isa asymmetric multi-core architectures. In Workshop on the Interaction between Operating Systems and Computer Architecture, pages 19--26, June 2008.
|
| |
32
|
J. Liu. Real-Time Systems. Prentice Hall, 2000.
|
| |
33
|
Michael R. Marty , Jesse D. Bingham , Mark D. Hill , Alan J. Hu , Milo M. K. Martin , David A. Wood, Improving Multiple-CMP Systems Using Token Coherence, Proceedings of the 11th International Symposium on High-Performance Computer Architecture, p.328-339, February 12-16, 2005
[doi> 10.1109/HPCA.2005.17]
|
 |
34
|
|
| |
35
|
|
| |
36
|
F. Mueller. Timing predictions for multi-level caches. In ACM SIGPLAN Workshop on Language, Compiler, and Tool Support for Real-Time Systems, pages 29--36, June 1997.
|
| |
37
|
|
| |
38
|
H. Ramaprasad and F. Mueller. Tightening the bounds on feasible preemptions. Transactions on Embedded Computing Systems, Mar. 2008 (accepted).
|
| |
39
|
J. Renau, B. Fragela, J. Tuck, W. Liu, L. Ceze, S. Sarangi, P. Sack, and a. P. M. K. Strauss. Sesc simulator. http://sesc.sourceforge.net, Jan. 2005.
|
 |
40
|
Shane Ryoo , Christopher I. Rodrigues , Sara S. Baghsorkhi , Sam S. Stone , David B. Kirk , Wen-mei W. Hwu, Optimization principles and application performance evaluation of a multithreaded GPU using CUDA, Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming, February 20-23, 2008, Salt Lake City, UT, USA
[doi> 10.1145/1345206.1345220]
|
 |
41
|
|
 |
42
|
|
| |
43
|
|
| |
44
|
|
| |
45
|
|
| |
46
|
|
 |
47
|
Reinhard Wilhelm , Jakob Engblom , Andreas Ermedahl , Niklas Holsti , Stephan Thesing , David Whalley , Guillem Bernat , Christian Ferdinand , Reinhold Heckmann , Tulika Mitra , Frank Mueller , Isabelle Puaut , Peter Puschner , Jan Staschulat , Per Stenström, The worst-case execution-time problem—overview of methods and survey of tools, ACM Transactions on Embedded Computing Systems (TECS), v.7 n.3, p.1-53, April 2008
[doi> 10.1145/1347375.1347389]
|
| |
48
|
www.openmp.org. Official OpenMP Specification, May 2005.
|
| |
49
|
J. Yan and W. Zhang. Time-predictable l2 caches for real-time multicore processors. In Work in Progress session of IEEE Real-Time Systems Symposium, Dec. 2007.
|
| |
50
|
J. Yan and W. Zhang. Wcet analysis of multi-core processors. In Work in Progress session of IEEE Real-Time Systems Symposium, Dec. 2007.
|
| |
51
|
|
| |
52
|
M. Zhang and K. Asanovic. Victim migration: Dynamically adapting between private and shared cmp caches. TR 2005-064, MIT CSAIL, 2005.
|
|