ACM Home Page
Please provide us with feedback. Feedback
Efficient and correct execution of parallel programs that share memory
Full text PdfPdf (2.35 MB)
Source ACM Transactions on Programming Languages and Systems (TOPLAS) archive
Volume 10 ,  Issue 2  (April 1988) table of contents
Pages: 282 - 312  
Year of Publication: 1988
ISSN:0164-0925
Authors
Dennis Shasha  Courant Institute, New York Univ., New York, NY
Marc Snir  IBM T. J. Watson Research Center, Yorktown Heights, NY
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 23,   Downloads (12 Months): 135,   Citation Count: 55
Additional Information:

abstract   references   cited by   index terms   review   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/42190.42277
What is a DOI?

ABSTRACT

In this paper we consider an optimization problem that arises in the execution of parallel programs on shared-memory multiple-instruction-stream, multiple-data-stream (MIMD) computers. A program on such machines consists of many sequential program segments, each executed by a single processor. These segments interact as they access shared variables. Access to memory is asynchronous, and memory accesses are not necessarily executed in the order they were issued. An execution is correct if it is sequentially consistent: It should seem as if all the instructions were executed sequentially, in an order obtained by interleaving the instruction streams of the processors. Sequential consistency can be enforced by delaying each access to shared memory until the previous access of the same processor has terminated. For performance reasons, however, we want to allow several accesses by the same processor to proceed concurrently. Our analysis finds a minimal set of delays that enforces sequential consistency. The analysis extends to interprocessor synchronization constraints and to code where blocks of operations have to execute atomically. We use a conflict graph similar to that used to schedule transactions in distributed databases. Our graph incorporates the order on operations given by the program text, enabling us to do without locks even when database conflict graphs would suggest that locks are necessary. Our work has implications for the design of multiprocessors; it offers new compiler optimization techniques for parallel languages that support shared variables.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
AHO, A. V., GAREY, M. R., AND ULLMAN, J.D. The transitive reduction of a directed graph. SIAM J. Comput. 1, 2 (Dec. 1972), 131-137.
 
3
ALLEN, J. R., AND KENNEDY, K. Automatic translation of Fortran programs to vector form. COMP TR84-9, Dept. of Computer Science, Rice Univ., Houston, Tex., July 1984.
 
4
BEERI, C., BERNSTEIN, P. A., AND GOODMAN, N. A model for nested transaction systems. Tech. Rep. CS~86-1, Computer Science Dept., Hebrew Univ., Jerusalem, 1986.
5
6
 
7
COLLIER, W. Principles of architecture for systems of parallel processes. Tech. Rep. TR00.3100, IBM, T. J. Watson Research Center, Yorktown Heights, N.Y., Mar. 1981.
8
 
9
GOTTLIEB, A., GRISHMAN, a., KRUSKAL, C. K., McAULIFFE, K. P., RUDOLPH, L., AND SNIR, M. The NYU Ultracomputer--Designing a MIMD, shared-memory parallel computer. IEEE Trans. Comput. C-32, 3 (Feb. 1983), 175-189.
10
11
 
12
LAMPORT, L. How to make a multiprocessor computer that correctly executes multiprocess programs. IEEE Trans. Comput. C-28, 9 (Sept. 1979), 690-691.
 
13
LAMPORT, L. On interprocess communication, parts I and II. Dist. Comput. 1 (Jan. 1986), 77-101.
 
14
LYNCH, N. A., AND FISHER, M.J. Semantics of concurrent computations. Theor. Comput. Sci. 13, i (Jan. 1981), 17-43.
15
 
16
PFISTER, G. F., BRANTLEY, W. C., GEORGE, D. A., HARVEY, S. L., KLEINFELDER, W. J., McAUUFFE, K. P., MELTON, E. A., NORTON, V. A., AND WEISS, J. The IBM research parallel processor prototype (RP3): Introduction and architecture. In Proceedings of the IEEE 1985 International Conference on Parallel Processing (Boston, June 1985). IEEE Press, New York, 1985, pp. 764-771.

CITED BY  55


REVIEW

"Edward W. Davis : Reviewer"

The title of this paper will attract the attention of many people, but the content will be meaningful to only a select few. The very formal treatment of the subject includes 24 theorems, lemmas, and corollaries and many definitions. The authors   more...

Collaborative Colleagues:
Dennis Shasha: colleagues
Marc Snir: colleagues