ABSTRACT
This paper proposes a cache coherence strategy for large-scale multiprocessor systems that combines software and hardware support. The new strategy has the scalability advantages of existing software strategies and does not rely on shared hardware resources to maintain coherence. It exploits as much intra-task temporal locality as previously proposed low-cost, compiler-based strategies such as Simple Invalidation and Fast Selective Invalidation. With a small amount of additional hardware and a small set of cache management instructions, it also preserves more inter-task-level temporal locality than those strategies. It is an economical alternative whose potential performance is close to that of more elaborate strategies such as Version Control and Time Stamp. The new strategy can also be easily extended to cover Doacross loops.
INDEX TERMS
Primary Classification:
D. Software
D.3 PROGRAMMING LANGUAGES
D.3.4 Processors
Subjects: Compilers
Additional Classification:
C. Computer Systems Organization
C.1 PROCESSOR ARCHITECTURES
C.5 COMPUTER SYSTEM IMPLEMENTATION
C.5.1 Large and Medium ("Mainframe") Computers
Subjects: Super (very large) computers
General Terms: Algorithms, Design, Performance
Keywords: Doacross loop, compiler-based cache coherence, fast selective invalidation, inter-task-level temporal locality, life span strategy, parallel task execution, simple invalidation, time-stamp approach, version control