| Fast cluster failover using virtual memory-mapped communication |
| Source
|
International Conference on Supercomputing
Proceedings of the 13th international conference on Supercomputing
Rhodes, Greece
Pages: 373 - 382
Year of Publication: 1999
ISBN: 1-58113-164-X
|
|
Authors
|
|
Yuanyuan Zhou
|
Computer Science Department, Princeton University, Princeton, NJ
|
|
Peter M. Chen
|
Computer Science and Engineering Division, Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI
|
|
Kai Li
|
Computer Science Department, Princeton University, Princeton, NJ
|
|
REFERENCES
 |
1
|
|
| |
2
|
Joel F. Bartlett. A NonStop Operating System. Tandem Computers, Inc., 1977.
|
| |
3
|
Nanette J. Boden, Danny Cohen, Robert E. Felderman, Alan E. Kulawik, Charles L. Seitz, Jakov N. Seizovic, Wen-King Su, Myrinet: A Gigabit-per-Second Local Area Network, IEEE Micro, v.15 n.1, p.29-36, February 1995
[doi> 10.1109/40.342015]
|
 |
4
|
Anita Borg, Jim Baumbach, Sam Glazer, A message system supporting fault tolerance, Proceedings of the ninth ACM symposium on Operating systems principles, p.90-99, October 10-13, 1983, Bretton Woods, New Hampshire, United States
|
 |
5
|
|
| |
6
|
K.M. Chandy and C.V. Ramamoorthy. Rollback and Recovery Strategies for Computer Programs. In IEEE Transactions on Computers, pages 546-556, June 1972.
|
 |
7
|
Yuqun Chen, Angelos Bilas, Stefanos N. Damianakis, Cezary Dubnicki, Kai Li, UTLB: a mechanism for address translation on network interfaces, Proceedings of the eighth international conference on Architectural support for programming languages and operating systems, p.193-204, October 02-07, 1998, San Jose, California, United States
|
 |
8
|
|
| |
9
|
|
| |
10
|
E. N. Elnozahy et al. A Survey of Rollback-Recovery Protocols in Message Passing Systems. Technical Report TR 96-181, Carnegie Mellon University, 1996.
|
| |
11
|
G. S. Delp et al. Memory as a Network Abstraction. IEEE Network, 5, July 1991.
|
| |
12
|
Greg Minshall et al. An Overview of the NetWare Operating System. In USENIX'94.
|
 |
13
|
Matthias A. Blumrich, Richard D. Alpert, Yuqun Chen, Douglas W. Clark, Stefanos N. Damianakis, Cezary Dubnicki, Edward W. Felten, Liviu Iftode, Kai Li, Margaret Martonosi, Robert A. Shillner, Design choices in the SHRIMP system: an empirical study, Proceedings of the 25th annual international symposium on Computer architecture, p.330-341, June 27-July 02, 1998, Barcelona, Spain
|
 |
14
|
M. A. Blumrich, K. Li, R. Alpert, C. Dubnicki, E. W. Felten, J. Sandberg, Virtual memory mapped network interface for the SHRIMP multicomputer, Proceedings of the 21st annual international symposium on Computer architecture, p.142-153, April 18-21, 1994, Chicago, Illinois, United States
|
 |
15
|
Peter M. Chen, Wee Teck Ng, Subhachandra Chandra, Christopher Aycock, Gurushankar Rajamani, David Lowell, The Rio file cache: surviving operating system crashes, Proceedings of the seventh international conference on Architectural support for programming languages and operating systems, p.74-83, October 01-04, 1996, Cambridge, Massachusetts, United States
|
 |
16
|
Peter M. Chen, Edward K. Lee, Garth A. Gibson, Randy H. Katz, David A. Patterson, RAID: high-performance, reliable secondary storage, ACM Computing Surveys (CSUR), v.26 n.2, p.145-185, June 1994
[doi> 10.1145/176979.176981]
|
| |
17
|
Peter M. Chen et al. Discount Checking: Transparent, Low-Overhead Recovery for General Applications. Technical report, University of Michigan, July 1998.
|
| |
18
|
R. Chillarege et al. Challenges in Designing Fault-Tolerant Systems. In FTCS'91.
|
| |
19
|
|
| |
20
|
W. Vogels et al. Scalability of the Microsoft Cluster Service. In Proceedings of the 2nd USENIX Windows NT Symposium, 1998.
|
| |
21
|
|
| |
22
|
|
| |
23
|
|
| |
24
|
J. A. Katzman et al. A Fault-tolerant multiprocessor system. United States Patent 4,817,091, March 1989.
|
 |
25
|
|
| |
26
|
|
 |
27
|
|
| |
28
|
|
| |
29
|
James S. Plank, Micah Beck, Gerry Kingsley, and Kai Li. Libckpt: Transparent Checkpointing under Unix. In Proceedings of the 1995 Winter USENIX Technical Conference, 1995.
|
 |
30
|
|
| |
31
|
Kenneth Salem and Hector Garcia-Molina. Checkpointing Memory-Resident Databases. Technical Report CS-TR-126-87, Department of Computer Science, Princeton University, 1987.
|
| |
32
|
Siewiorek and Swarz. The Theory and Practice of Reliable Systems Design. Digital, Bedford, 1982.
|
| |
33
|
|
| |
34
|
|
| |
35
|
Michael Stonebraker. The Postgres DBMS. In SIGMOD'90.
|
 |
36
|
|
| |
37
|
W. Vogels, D. Dumitriu, K. Birman, R. Gamache, M. Massa, R. Short, J. Vert, J. Barrera, J. Gray, The Design and Architecture of the Microsoft Cluster Service - A Practical Approach to High-Availability and Scalability, Proceedings of the Twenty-Eighth Annual International Symposium on Fault-Tolerant Computing, p.422, June 23-25, 1998
|
 |
38
|
Michael Wu, Willy Zwaenepoel, eNVy: a non-volatile, main memory storage system, Proceedings of the sixth international conference on Architectural support for programming languages and operating systems, p.86-97, October 05-07, 1994, San Jose, California, United States
|
CITED BY 7
|
|
|
|
|
Florin Sultan, Aniruddha Bohra, Stephen Smaldone, Yufei Pan, Pascal Gallard, Iulian Neamtiu, Liviu Iftode, Recovering Internet Service Sessions from Operating System Failures, IEEE Internet Computing, v.9 n.2, p.17-27, March 2005
|
|
|
|
|
|
Rosalia Christodoulopoulou, Kaloian Manassiev, Angelos Bilas, Cristiana Amza, Fast and transparent recovery for continuous availability of cluster-based servers, Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming, March 29-31, 2006, New York, New York, USA
|
|
|
|
|
|
Sudarshan M. Srinivasan, Srikanth Kandula, Christopher R. Andrews, Yuanyuan Zhou, Flashback: a lightweight extension for rollback and deterministic replay for software debugging, Proceedings of the USENIX Annual Technical Conference 2004, p.3-3, June 27-July 02, 2004, Boston, MA
|
|
|
|
|