ABSTRACT
The design and correctness of a communication facility for a distributed computer system are reported on. The facility provides support for fault-tolerant process groups in the form of a family of reliable multicast protocols that can be used in both local- and wide-area networks. These protocols attain high levels of concurrency, while respecting application-specific delivery ordering constraints, and have varying cost and performance that depend on the degree of ordering desired. In particular, a protocol that enforces causal delivery orderings is introduced and shown to be a valuable alternative to conventional asynchronous communication protocols. The facility also ensures that the processes belonging to a fault-tolerant process group will observe consistent orderings of events affecting the group as a whole, including process failures, recoveries, migration, and dynamic changes to group properties like member rankings. A review of several uses for the protocols in the ISIS system, which supports fault-tolerant resilient objects and bulletin boards, illustrates the significant simplification of higher level algorithms made possible by our approach.
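The causal delivery ordering mentioned in the abstract can be illustrated with a small sketch. It uses vector timestamps, a standard textbook technique chosen here for brevity; the abstract does not describe the paper's mechanism, so this should not be read as the paper's protocol. The names `Message`, `CausalProcess`, `send`, and `receive` are hypothetical, and the sketch assumes a fixed group of `n` processes with no failures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: int      # index of the sending process
    clock: tuple     # sender's vector timestamp at send time
    payload: str


class CausalProcess:
    """Illustrative receiver that delays messages until causal order allows delivery."""

    def __init__(self, pid: int, n: int):
        self.pid = pid
        # clock[j] counts messages from process j delivered here
        # (for j == pid it counts this process's own sends).
        self.clock = [0] * n
        self.pending = []    # received but not yet deliverable
        self.delivered = []  # payloads in delivery order

    def send(self, payload: str) -> Message:
        # Stamp the message with the current vector after counting this send.
        self.clock[self.pid] += 1
        return Message(self.pid, tuple(self.clock), payload)

    def _deliverable(self, m: Message) -> bool:
        for j, t in enumerate(m.clock):
            if j == m.sender:
                # Must be the next message expected from the sender.
                if t != self.clock[j] + 1:
                    return False
            elif t > self.clock[j]:
                # The sender had seen a message from j that we have not delivered yet.
                return False
        return True

    def receive(self, m: Message) -> None:
        # Buffer the message, then deliver everything that has become deliverable.
        self.pending.append(m)
        progress = True
        while progress:
            progress = False
            for q in list(self.pending):
                if self._deliverable(q):
                    self.pending.remove(q)
                    self.clock[q.sender] += 1
                    self.delivered.append(q.payload)
                    progress = True
```

In this sketch, a message that arrives before one it causally depends on is held in `pending` and delivered only after its predecessor, while causally unrelated messages are delivered as soon as they arrive.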
CITED BY 155
Ajei Gopal, Ray Strong, Sam Toueg, Flaviu Cristian, Early-delivery atomic broadcast, Proceedings of the ninth annual ACM symposium on Principles of distributed computing, p.297-309, August 22-24, 1990, Quebec City, Quebec, Canada
Lawrence C. N. Tseung, Keh-Chiang Yu, The implementation of guaranteed, reliable, secure broadcast networks, Proceedings of the 1990 ACM annual conference on Cooperation, p.259-265, February 20-22, 1990, Washington, D.C., United States
Rivka Ladin, Barbara Liskov, Liuba Shrira, Lazy replication: exploiting the semantics of distributed services, Proceedings of the ninth annual ACM symposium on Principles of distributed computing, p.43-57, August 22-24, 1990, Quebec City, Quebec, Canada
Ashwani Gahlot, Mohan Ahuja, Timothy Carlson, Global flush communication primitive for inter-process communication, Proceedings of the thirteenth annual ACM symposium on Principles of distributed computing, p.111-120, August 14-17, 1994, Los Angeles, California, United States
Brendan Tangney, Vinny Cahill, Chris Horn, Dominic Herity, Alan Judge, Gradimir Starovic, Mark Sheppard, Some ideas on support for fault tolerance in COMANDOS, an object oriented distributed system, Proceedings of the 4th workshop on ACM SIGOPS European workshop, p.1-6, September 03-05, 1990, Bologna, Italy
Mahadev Satyanarayanan, James J. Kistler, Puneet Kumar, Maria E. Okasaki, Ellen H. Siegel, David C. Steere, Coda: A Highly Available File System for a Distributed Workstation Environment, IEEE Transactions on Computers, v.39 n.4, p.447-459, April 1990
Rajendra Yavatkar, James Griffoen, Madhu Sudan, A reliable dissemination protocol for interactive collaborative applications, Proceedings of the third ACM international conference on Multimedia, p.333-344, November 05-09, 1995, San Francisco, California, United States
Robert Simon, Robert Sclabassi, Taieb Znati, Communication control in computer supported cooperative work systems, Proceedings of the 1994 ACM conference on Computer supported cooperative work, p.311-321, October 22-26, 1994, Chapel Hill, North Carolina, United States
Hans-Ruedi Aschmann, Niklaus Giger, Elisabeth Hoepli, Peter Janak, Hubert Kirrmann, Alphorn: A Remote Procedure Call Environment for Fault-Tolerant, Heterogeneous, Distributed Systems, IEEE Micro, v.11 n.5, p.16-19, 60-67, September 1991
Bo Xu, Ouri Wolfson, Sam Chamberlain, Spatially distributed databases on sensors, Proceedings of the 8th ACM international symposium on Advances in geographic information systems, p.153-160, November 06-11, 2000, Washington, D.C., United States
Brendan Tangney, Vinny Cahill, Chris Horn, Dominic Herity, Alan Judge, Gradimir Starovic, Mark Sheppard, Some ideas on support for fault tolerance in COMANDOS, an object oriented distributed system, ACM SIGOPS Operating Systems Review, v.25 n.2, p.130-135, April 1991
Jean Botev, Alexander Hohfeld, Hermann Schloss, Ingo Scholtes, Peter Sturm, Markus Esch, The HyperVerse: concepts for a federated and Torrent-based '3D Web', International Journal of Advanced Media and Communication, v.2 n.4, p.331-350, December 2008
REVIEW
"Andrew Robert Huber : Reviewer"
The premise of this paper is that message orderings should be included in the
communications layer of a distributed system. This approach is intended to
maximize concurrency at the communications level, yet allow processes to
determine (when des