|
ABSTRACT
In this paper we consider an optimization problem that arises in the execution of parallel programs on shared-memory multiple-instruction-stream, multiple-data-stream (MIMD) computers. A program on such machines consists of many sequential program segments, each executed by a single processor. These segments interact as they access shared variables. Access to memory is asynchronous, and memory accesses are not necessarily executed in the order they were issued. An execution is correct if it is sequentially consistent: It should seem as if all the instructions were executed sequentially, in an order obtained by interleaving the instruction streams of the processors. Sequential consistency can be enforced by delaying each access to shared memory until the previous access of the same processor has terminated. For performance reasons, however, we want to allow several accesses by the same processor to proceed concurrently. Our analysis finds a minimal set of delays that enforces sequential consistency. The analysis extends to interprocessor synchronization constraints and to code where blocks of operations have to execute atomically. We use a conflict graph similar to that used to schedule transactions in distributed databases. Our graph incorporates the order on operations given by the program text, enabling us to do without locks even when database conflict graphs would suggest that locks are necessary. Our work has implications for the design of multiprocessors; it offers new compiler optimization techniques for parallel languages that support shared variables.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Alfred V. Aho , Ravi Sethi , Jeffrey D. Ullman, Compilers: principles, techniques, and tools, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1986
|
| |
2
|
AHO, A. V., GAREY, M. R., AND ULLMAN, J.D. The transitive reduction of a directed graph. SIAM J. Comput. 1, 2 (Dec. 1972), 131-137.
|
| |
3
|
ALLEN, J. R., AND KENNEDY, K. Automatic translation of Fortran programs to vector form. COMP TR84-9, Dept. of Computer Science, Rice Univ., Houston, Tex., July 1984.
|
| |
4
|
BEERI, C., BERNSTEIN, P. A., AND GOODMAN, N. A model for nested transaction systems. Tech. Rep. CS~86-1, Computer Science Dept., Hebrew Univ., Jerusalem, 1986.
|
 |
5
|
|
 |
6
|
|
| |
7
|
COLLIER, W. Principles of architecture for systems of parallel processes. Tech. Rep. TR00.3100, IBM, T. J. Watson Research Center, Yorktown Heights, N.Y., Mar. 1981.
|
 |
8
|
Jan Edler , Allan Gottlieb , Clyde P. Kruskal , Kevin P. McAuliffe , Larry Rudolph , Marc Snir , Patricia J. Teller , James Wilson, Issues related to MIMD shared-memory computers: the NYU ultracomputer approach, Proceedings of the 12th annual international symposium on Computer architecture, p.126-135, June 17-19, 1985, Boston, Massachusetts, United States
|
| |
9
|
GOTTLIEB, A., GRISHMAN, a., KRUSKAL, C. K., McAULIFFE, K. P., RUDOLPH, L., AND SNIR, M. The NYU Ultracomputer--Designing a MIMD, shared-memory parallel computer. IEEE Trans. Comput. C-32, 3 (Feb. 1983), 175-189.
|
 |
10
|
|
 |
11
|
|
| |
12
|
LAMPORT, L. How to make a multiprocessor computer that correctly executes multiprocess programs. IEEE Trans. Comput. C-28, 9 (Sept. 1979), 690-691.
|
| |
13
|
LAMPORT, L. On interprocess communication, parts I and II. Dist. Comput. 1 (Jan. 1986), 77-101.
|
| |
14
|
LYNCH, N. A., AND FISHER, M.J. Semantics of concurrent computations. Theor. Comput. Sci. 13, i (Jan. 1981), 17-43.
|
 |
15
|
|
| |
16
|
PFISTER, G. F., BRANTLEY, W. C., GEORGE, D. A., HARVEY, S. L., KLEINFELDER, W. J., McAUUFFE, K. P., MELTON, E. A., NORTON, V. A., AND WEISS, J. The IBM research parallel processor prototype (RP3): Introduction and architecture. In Proceedings of the IEEE 1985 International Conference on Parallel Processing (Boston, June 1985). IEEE Press, New York, 1985, pp. 764-771.
|
CITED BY 55
|
|
Phillip B. Gibbons , Michael Merritt , Kourosh Gharachorloo, Proving sequential consistency of high-performance shared memories (extended abstract), Proceedings of the third annual ACM symposium on Parallel algorithms and architectures, p.292-303, July 21-24, 1991, Hilton Head, South Carolina, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Manoj Plakal , Daniel J. Sorin , Anne E. Condon , Mark D. Hill, Lamport clocks: verifying a directory cache-coherence protocol, Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures, p.67-76, June 28-July 02, 1998, Puerto Vallarta, Mexico
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Kourosh Gharachorloo , Daniel Lenoski , James Laudon , Phillip Gibbons , Anoop Gupta , John Hennessy, Memory consistency and event ordering in scalable shared-memory multiprocessors, 25 years of the international symposia on Computer architecture (selected papers), p.376-387, June 27-July 02, 1998, Barcelona, Spain
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Kourosh Gharachorloo , Daniel Lenoski , James Laudon , Phillip Gibbons , Anoop Gupta , John Hennessy, Memory consistency and event ordering in scalable shared-memory multiprocessors, ACM SIGARCH Computer Architecture News, v.18 n.3a, p.15-26, June 1990
|
|
|
|
|
|
|
|
|
|
|
|
Hagit Attiya , Soma Chaudhuri , Roy Friedman , Jennifer L. Welch, Shared memory consistency conditions for non-sequential execution: definitions and programming strategies, Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures, p.241-250, June 30-July 02, 1993, Velen, Germany
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Jyh-Herng Chow , William Ludwell Harrison, III, Compile-time analysis of parallel programs that share memory, Proceedings of the 19th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, p.130-141, January 19-22, 1992, Albuquerque, New Mexico, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Zehra Sura , Xing Fang , Chi-Leung Wong , Samuel P. Midkiff , Jaejin Lee , David Padua, Compiler techniques for high performance sequentially consistent java programs, Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, June 15-17, 2005, Chicago, IL, USA
|
|
|
|
|
|
|
|
|
Vijay A. Saraswat , Radha Jagadeesan , Maged Michael , Christoph von Praun, A theory of memory models, Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming, March 14-17, 2007, San Jose, California, USA
|
|
|
|
|
|
Arun Kejariwal , Hideki Saito , Xinmin Tian , Milind Girkar , Wel Li , Utpal Banerjee , Alexandru Nicolau , Constantine D. Polychronopoulos, Lightweight lock-free synchronization methods for multithreading, Proceedings of the 20th annual international conference on Supercomputing, June 28-July 01, 2006, Cairns, Queensland, Australia
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
REVIEW
"Edward W. Davis : Reviewer"
The title of this paper will attract the attention of many people, but the
content will be meaningful to only a select few. The very formal treatment
of the subject includes 24 theorems, lemmas, and corollaries and many
definitions. The authors
more...
|