ABSTRACT
The performance of high-speed multiprocessor systems is limited by the available bandwidth to memory and by the need to synchronize write-sharable data. This paper presents a new memory system that separates synchronization-related data from all other data. The memory system has two tiers: synchronization memory and high-bandwidth (HB) memory. The synchronization memory consists of snooping caches connected to a bus and is used to store synchronization variables such as locks and semaphores. The HB memory is used to store the bulk of the application program code and data. It contains caches and a high-bandwidth interconnection network to memory, such as a crossbar, but does not have full snooping among caches.
The two-tier memory system has been evaluated by analyzing the memory behavior of simulated parallel executions of Prolog programs. Initial results indicate that the two-tier memory system has the potential to reduce memory interference and speed up synchronization. Three different schemes for the caches on the HB memory have been studied, and the results are presented. The two-tier memory system has potential applications in areas where synchronization is light to medium and local data is accessed often.
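As a rough illustration of the split described above, the following Python sketch routes each memory access to one of the two tiers. It is a toy model, not the paper's design: the abstract does not say how variables are assigned to a tier, so the explicit `declare_sync` call, the class name `TwoTierMemory`, and the access counters are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TwoTierMemory:
    """Toy model: routes accesses to a synchronization tier or an HB tier."""

    # Assumption: software explicitly marks which addresses hold
    # synchronization variables (locks, semaphores).
    sync_addresses: set = field(default_factory=set)
    sync_accesses: int = 0  # accesses served by the snooping-bus tier
    hb_accesses: int = 0    # accesses served by the high-bandwidth tier

    def declare_sync(self, address: int) -> None:
        """Mark an address as holding a synchronization variable."""
        self.sync_addresses.add(address)

    def route(self, address: int) -> str:
        """Return the tier that would serve an access to this address."""
        if address in self.sync_addresses:
            self.sync_accesses += 1
            return "sync"
        self.hb_accesses += 1
        return "hb"


mem = TwoTierMemory()
mem.declare_sync(0x10)      # hypothetical address of a lock
tier_for_lock = mem.route(0x10)
tier_for_data = mem.route(0x20)
```

In this model only the lock access would contend for the shared bus, while ordinary code and data accesses go to the HB tier.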