ABSTRACT
Design diversity has been used for many years as a means of achieving a degree of fault tolerance in software-based systems. While there is clear evidence that the approach can be expected to deliver some increase in reliability compared to a single version, there is no agreement about the extent of this increase. More importantly, it remains difficult to evaluate exactly how reliable a particular diverse fault-tolerant system is. This difficulty arises because assumptions of independence of failures between different versions have been shown to be untenable: assessment of the actual level of dependence present is therefore needed, and this is difficult. In this tutorial, we survey the modeling issues involved, with an emphasis upon the impact these have upon the problem of assessing the reliability of fault-tolerant systems. The intended audience comprises designers, assessors, and project managers with only a basic knowledge of probabilities, as well as reliability experts without detailed knowledge of software, who seek an introduction to the probabilistic issues in decisions about design diversity.
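To make the independence issue concrete, the following is a minimal sketch under one simple, assumed model that is not spelled out in the abstract itself: each class of demand x has a "difficulty" theta(x), the probability that a randomly chosen, independently developed version fails on it. Given the demand, two versions fail independently, so both fail with probability theta(x)^2. Averaged over demands, the probability that both fail is E[theta^2] = (E[theta])^2 + Var(theta), which exceeds the naive product (E[theta])^2 whenever difficulty varies across demands. The demand classes, weights, and difficulty values below are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def failure_probabilities(difficulty, weights):
    """Compare the independence prediction with the model's two-version failure probability.

    difficulty[i]: assumed probability that a randomly chosen version fails
                   on demand class i (hypothetical values).
    weights[i]:    relative frequency of demand class i (positive integers).

    Returns (single, independent, actual):
      single      = E[theta], failure probability of one version
      independent = E[theta]**2, what an independence assumption predicts
      actual      = E[theta**2], probability that both versions fail
    """
    total = sum(weights)
    probs = [Fraction(w, total) for w in weights]
    single = sum(p * d for p, d in zip(probs, difficulty))
    actual = sum(p * d * d for p, d in zip(probs, difficulty))
    return single, single * single, actual


# Hypothetical profile: most demands are easy, a few are hard.
weights = [97, 2, 1]
difficulty = [Fraction(0), Fraction(1, 100), Fraction(1, 10)]

single, independent, actual = failure_probabilities(difficulty, weights)
print("single version:        ", float(single))
print("independence predicts: ", float(independent))
print("model gives:           ", float(actual))
print("underestimate factor:  ", float(actual / independent))
```

In this illustrative profile the two-version system is still more reliable than a single version, but far less so than the independence assumption would suggest. The size of the gap depends entirely on how difficulty varies over demands, which is why the level of dependence has to be assessed rather than assumed.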