| Implementing efficient fault containment for multiprocessors: confining faults in a shared-memory multiprocessor environment |
|
Source
|
Communications of the ACM
Volume 39, Issue 9 (September 1996)
Pages: 52-61
Year of Publication: 1996
ISSN: 0001-0782
|
|
Authors
|
|
Mendel Rosenblum
|
Stanford Univ., Stanford, CA
|
|
John Chapin
|
Stanford Univ., Stanford, CA
|
|
Dan Teodosiu
|
Stanford Univ., Stanford, CA
|
|
Scott Devine
|
Stanford Univ., Stanford, CA
|
|
Tirthankar Lahiri
|
Stanford Univ., Stanford, CA
|
|
Anoop Gupta
|
Stanford Univ., Stanford, CA
|
|
| Publisher |
|
|
|
|
REFERENCES
| |
1
|
|
| |
2
|
Bartlett, J., Gray, J., and Horst, B. Fault tolerance in Tandem computer systems. In Evolution of Fault-Tolerant Computing, A. Avizienis, H. Kopetz, and J.C. Laprie, Eds. Springer-Verlag, New York, 1987.
|
 |
3
|
J. Chapin, M. Rosenblum, S. Devine, T. Lahiri, D. Teodosiu, A. Gupta, Hive: fault containment for shared-memory multiprocessors, Proceedings of the fifteenth ACM symposium on Operating systems principles, p.12-25, December 03-06, 1995, Copper Mountain, Colorado, United States
|
| |
4
|
|
| |
5
|
Khalidi, Y., Bernabeu, J., Matena, V., et al. Solaris MC: A multicomputer OS. In Proceedings of the USENIX 1996 Annual Technical Conference (San Diego, Calif., Jan. 22-26, 1996). USENIX Association, Berkeley, Calif., 1996, pp. 191-204.
|
 |
6
|
J. Kuskin, D. Ofelt, M. Heinrich, J. Heinlein, R. Simoni, K. Gharachorloo, J. Chapin, D. Nakahira, J. Baxter, M. Horowitz, A. Gupta, M. Rosenblum, J. Hennessy, The Stanford FLASH multiprocessor, Proceedings of the 21st annual international symposium on Computer architecture, p.302-313, April 18-21, 1994, Chicago, Illinois, United States
|
| |
7
|
|
| |
8
|
Mitchell, J.G., Gibbons, J.J., Hamilton, G., et al. An overview of the Spring system. In Digest of Papers, Spring COMPCON 94 (San Francisco, Calif., Feb. 28-Mar. 4, 1994). IEEE Computer Society Press, 1994, pp. 122-131.
|
| |
9
|
|
| |
10
|
|
| |
11
|
|
| |
12
|
Sullivan, M. and Chillarege, R. Software defects and their impact on system availability: A study of field failures in operating systems. In Proceedings of the 21st International Symposium on Fault-Tolerant Computing (Montreal, Canada, Jun. 25-27). IEEE Computer Society Press, 1991, pp. 2-9.
|
| |
13
|
|
 |
14
|
Steven Cameron Woo, Moriyoshi Ohara, Evan Torrie, Jaswinder Pal Singh, Anoop Gupta, The SPLASH-2 programs: characterization and methodological considerations, Proceedings of the 22nd annual international symposium on Computer architecture, p.24-36, June 22-24, 1995, S. Margherita Ligure, Italy
|
 |
15
|
M. Young, A. Tevanian, R. Rashid, D. Golub, J. Eppinger, The duality of memory and communication in the implementation of a multiprocessor operating system, Proceedings of the eleventh ACM Symposium on Operating systems principles, p.63-76, November 08-11, 1987, Austin, Texas, United States
|
CITED BY 3
|
|
|
|
|
|
|
|
Dan Teodosiu, Joel Baxter, Kinshuk Govil, John Chapin, Mendel Rosenblum, Mark Horowitz, Hardware fault containment in scalable shared-memory multiprocessors, ACM SIGARCH Computer Architecture News, v.25 n.2, p.73-84, May 1997
|
REVIEW
"Ivan Flores : Reviewer"
The authors contend that large-scale multiprocessors are plagued by
failures in hardware and software that frequently bring down the entire
system, requiring that the machine be rebooted. They propose a scheme
for fault containment, then attem...
|