A new adaptive accrual failure detector for dependable distributed systems
Source
Symposium on Applied Computing
Proceedings of the 2007 ACM symposium on Applied computing
Seoul, Korea
SESSION: Dependable and adaptive distributed systems
Pages: 551 - 555
Year of Publication: 2007
ISBN: 1-59593-480-4
ABSTRACT
Detecting failures in distributed environments is a crucial part of developing dependable, robust, and self-healing systems. The contribution of this paper is a new failure detection algorithm that can be described as an adaptive accrual algorithm coupled with features that increase flexibility and decrease computation costs. Furthermore, our evaluation results show very good detection quality in the presence of message losses.
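To illustrate what "accrual" means in general, the following is a minimal sketch of a generic accrual-style detector. It is not the algorithm from the paper. It assumes heartbeats with caller-supplied timestamps, a sliding window of inter-arrival gaps (which is what makes it adapt to changing conditions), and a suspicion level computed as the fraction of recorded gaps no longer than the current silence. The names `AccrualDetector`, `heartbeat`, `suspicion`, and `window_size` are hypothetical.

```python
from collections import deque


class AccrualDetector:
    """Generic accrual-style failure detector (illustrative assumption,
    not the algorithm described in the paper)."""

    def __init__(self, window_size=1000):
        # Sliding window of recent heartbeat inter-arrival gaps;
        # old samples drop out, so the estimate adapts over time.
        self.intervals = deque(maxlen=window_size)
        self.last_arrival = None

    def heartbeat(self, now):
        """Record a heartbeat that arrived at time `now`."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def suspicion(self, now):
        """Return a suspicion level in [0, 1] rather than a binary verdict.

        The value is the empirical share of recorded gaps that are no
        longer than the time elapsed since the last heartbeat.
        """
        if not self.intervals:
            return 0.0  # not enough history to suspect anything
        elapsed = now - self.last_arrival
        shorter = sum(1 for gap in self.intervals if gap <= elapsed)
        return shorter / len(self.intervals)


detector = AccrualDetector(window_size=4)
for t in (0, 10, 20, 31, 40):
    detector.heartbeat(t)

print(detector.suspicion(45))  # silence shorter than every recorded gap
print(detector.suspicion(50))  # silence as long as most recorded gaps
print(detector.suspicion(60))  # silence longer than every recorded gap
```

An application using such a detector would pick its own threshold on the suspicion level, trading detection time against false suspicions.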
CITED BY 2
Fernando Castor Filho, Augusta Marques, Raphael Y. de Camargo, Fabio Kon, A group membership service for large-scale grids, Proceedings of the 6th international workshop on Middleware for grid computing, p.1-6, December 01-05, 2008, Leuven, Belgium
INDEX TERMS
Primary Classification:
C. Computer Systems Organization
C.2 COMPUTER-COMMUNICATION NETWORKS
C.2.3 Network Operations
Subjects: Network monitoring
Additional Classification:
C. Computer Systems Organization
C.2 COMPUTER-COMMUNICATION NETWORKS
C.2.4 Distributed Systems
Subjects: Distributed applications
C.4 PERFORMANCE OF SYSTEMS
Subjects: Reliability, availability, and serviceability
General Terms: Reliability
Keywords: accrual, algorithm, asynchronous systems, dependable systems, distributed systems, failure detection, fault-tolerance, heartbeat, histogram, probability distribution, self-healing