A message-based fault diagnosis procedure
Source: Applications, Technologies, Architectures, and Protocols for Computer Communication
Proceedings of the ACM SIGCOMM Conference on Communications Architectures & Protocols
Stowe, Vermont, United States
Pages: 328-337
Year of Publication: 1986
ISBN: 0-89791-201-2
Author
J. R. Agre, Information Sciences Department, Rockwell International Science Center, 1049 Camino Dos Rios, Thousand Oaks, CA
Sponsor
SIGCOMM: ACM Special Interest Group on Data Communication
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/18172.18209

ABSTRACT

A new diagnostic message protocol that provides fault diagnosis capabilities for the communications in a distributed system environment is described. The protocol is designed to operate in conjunction with a standard end-to-end communication protocol and uses special messages to determine the system fault state. A diagnosis message is represented using a test dependency model that is derived from the system topology. These messages are used by an adaptive strategy designed to achieve specific objectives such as reduced testing cost. Using the test dependency model, a general purpose algorithm is developed for generating these strategies based on an information theory criterion. Specific properties of the protocol are discussed, and several examples of strategies for a distributed system topology are provided.
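
As a rough illustration of the idea in the abstract, the following minimal Python sketch pairs a test dependency model with an adaptive strategy driven by an information theory criterion. It is not the paper's algorithm, which the abstract does not specify. Everything in it is an assumption made for illustration: the component names, the message names (`m1` to `m4`), the dependency table, the single-fault and equal-prior assumptions, and the choice of pass/fail outcome entropy as the selection criterion.

```python
import math

# Hypothetical test dependency model: each diagnostic message maps to the set
# of components it depends on. A message fails if any of them is faulty.
DEPENDS = {
    "m1": {"A", "B"},
    "m2": {"B", "C"},
    "m3": {"C", "D"},
    "m4": {"A"},
}
COMPONENTS = {"A", "B", "C", "D"}


def outcome_entropy(test, suspects):
    """Entropy in bits of the pass/fail outcome, assuming exactly one
    faulty component that is equally likely to be any current suspect."""
    p_fail = len(DEPENDS[test] & suspects) / len(suspects)
    if p_fail in (0.0, 1.0):
        return 0.0  # outcome is already certain, so the test tells us nothing
    return -(p_fail * math.log2(p_fail) + (1 - p_fail) * math.log2(1 - p_fail))


def diagnose(run_test, suspects=COMPONENTS):
    """Greedily send the most informative message until one suspect remains.

    run_test(name) returns True if the message passed, False if it failed.
    Returns the remaining suspects and the ordered list of messages sent.
    """
    suspects = set(suspects)
    unused = set(DEPENDS)
    sent = []
    while len(suspects) > 1 and unused:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(unused), key=lambda t: outcome_entropy(t, suspects))
        if outcome_entropy(best, suspects) == 0.0:
            break  # no remaining message can split the suspect set
        unused.remove(best)
        sent.append(best)
        if run_test(best):
            suspects -= DEPENDS[best]   # passed: its dependencies are cleared
        else:
            suspects &= DEPENDS[best]   # failed: fault is among its dependencies
    return suspects, sent
```

For example, simulating a fault in the hypothetical component `C` with `diagnose(lambda t: "C" not in DEPENDS[t])` isolates `C` after two messages, because each message chosen halves the suspect set. A cost-aware variant could divide the entropy by a per-message cost to reflect an objective such as reduced testing cost.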

