ACM Home Page
Please provide us with feedback. Feedback
Timestamp-Based Orphan Elimination
Full text Publisher SitePublisher Site
Source IEEE Transactions on Software Engineering archive
Volume 15 ,  Issue 7  (July 1989) table of contents
Pages: 825 - 831  
Year of Publication: 1989
ISSN:0098-5589
Authors
M. Herlihy  Carnegie-Mellon Univ., Pittsburgh, PA
M. S. McKendry  FileNet Corp., Costa Mesa, CA
Publisher
IEEE Press  Piscataway, NJ, USA
Bibliometrics
Downloads (6 Weeks): n/a,   Downloads (12 Months): n/a,   Citation Count: 6
Additional Information:

abstract   references   cited by   index terms   review   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: 10.1109/32.29482

ABSTRACT

An orphan in a distributed transaction system is an activity executing on behalf of an aborted transaction. A method is proposed for managing orphans created by crashes and by aborts that ensures that orphans are detected and eliminated in a timely manner, and also prevents them from observing inconsistent states. The method uses timestamps generated at each site. Transactions are assigned timeouts at different sites. These timeouts are related by a global invariant, and they may be adjusted by simple two-phase protocols. The principal advantage of this method is simplicity: it is easy to understand, and to implement, and it can be proved correct. An 'eager' version of this method uses approximately synchronized real-time clocks to ensure that orphans are eliminated within a fixed duration, and a 'lazy' version uses logical clocks to ensure that orphans are eventually eliminated as information propagates through the system. The method is fail-safe: unsynchronized clocks and lost messages may affect performance, but they cannot produce inconsistencies or protect orphans from eventual elimination. Although the method is informally described in terms of two-phase locking, the formal argument shows it is applicable to any concurrency control method that preserved atomicity.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
{1} J. Allchin, "An architecture for reliable decentralized systems," Georgia Inst. Technol., Tech. Rep. GIT-ICS-83/23, 1983.
2
3
 
4
{4} M. P. Herlihy, N. A. Lynch, M. Merritt, and W. E. Weihl, "On the correctness of orphan elimination algorithms," <i>ACM</i>, to be published; abbreviated version in 17th FTCS.
 
5
 
6
 
7
{7} G. G. Kenky, "An action management system for a distributed operating system," Master's thesis, Georgia Inst. Technol., Dec., 1985.
8
 
9
{9} B. Lampson, <i>Remote Procedure Calls (Lecture Notes in Computer Science 105)</i>. Berlin: Springer-Verlag, 1981, pp. 365-370.
10
 
11
{11} B. H. Liskov, R. Scheifler, E. Walker, and W. E. Weihl, "Orphan detection," in <i>17th Symp. Fault-Tolerant Computer Systems (FTCS)</i>, July 1987, pp. 2-7.
12
13
 
14
{14} M. S. McKendry, "Clouds: A fault-tolerant distributed operating system," <i>IEEE Tech. Com. Distributed Processing Newslett.</i>, vol. 2, no. 6, June 1984.
 
15
 
16
{16} B. Nelson, "Remote procedure call," Xerox Palo Alto Research Center, Tech. Rep. CSL-79-3, 1981.
17
 
18
{18} Rajdoot, "A remote procedure call mechanism supporting orphan detection and killing," Univ. Newcastle upon Tyne, Tech. Rep. TR 200, Apr. 1985.
19
 
20
{20} M. S. McKendry and M. P. Herlihy. "Time-driven orphan elimination," in <i>Proc. Fifth Symp. Reliability in Distributed Software and Database Systems</i>, Jan. 1986; also available as Tech. Rep. CMU-CS- 85-138.
 
21
{21} M. D. Skeen, "Crash recovery in a distributed database system," Ph.D. dissertation, Univ. California, Berkeley, May 1982.
22
 
23
{23} A. Z. Spector, J. J. Bloch, D. S. Daniels, R. P. Draves, D. Duchamp, J. L. Eppinger, S. G. Menees, and D. S. Thompson, "The Camelot project," <i>Database Eng.</i>, vol. 9, no. 4, Dec. 1986; also available as Tech. Rep. CMU-CS-86-166, Carnegie-Mellon Univ., Nov. 1986.
 
24
 
25
26



REVIEW

"Robert Joel Hofkin : Reviewer"

Many mechanisms have been proposed for recovery after the partial failure of a distributed transaction processing system. The problem is to force consistency throughout the system and to eliminate “orphans,” or activities executing  more...

Collaborative Colleagues:
M. Herlihy: colleagues
M. S. McKendry: colleagues