ABSTRACT
Significant performance advantages can be gained by implementing a database system on a cache-coherent shared memory multiprocessor. However, problems arise when failures occur. The crash of a single node (where a node refers to a processor/memory pair) may require a reboot of the entire shared memory system. Fortunately, shared memory multiprocessors that isolate individual node failures are currently being developed. Even with these, because of the side effects of the cache coherency protocol, a transaction executing strictly on a single node may become dependent on the validity of the memory of many nodes, thereby inducing unnecessary transaction aborts. This happens when database objects, such as records, and database support structures, such as index structures and shared lock tables, are stored in shared memory.

In this paper, we propose crash recovery protocols for shared memory database systems which avoid the unnecessary transaction aborts induced by the effects of using shared physical memory. Our recovery protocols guarantee that if one or more nodes crash, all the effects of active transactions running on the crashed nodes will be undone, and no effects of transactions running on nodes which did not crash will be undone. In order to show the practicality of our protocols, we discuss how existing features of cache-coherent multiprocessors can be utilized to implement these recovery protocols. Specifically, we demonstrate that (1) for many types of database objects and support structures, volatile (in-memory) logging is sufficient to avoid unnecessary transaction aborts, and (2) a very low overhead implementation of this strategy can be achieved with existing multiprocessor features.
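The recovery guarantee stated above can be illustrated with a small toy model. This is a minimal sketch under stated assumptions, not the protocols proposed in the paper: the class `SharedMemoryStore`, its methods, and the per-node undo log layout are all hypothetical names introduced for illustration. It assumes that the in-memory undo log of a crashed node remains readable during recovery and that active transactions never write the same key concurrently (for example, because of strict locking).

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List

_ABSENT = object()  # sentinel: the key did not exist before the update


@dataclass
class UndoRecord:
    txn_id: str
    key: str
    before: Any  # before-image of the key, or _ABSENT


class SharedMemoryStore:
    """Toy stand-in for database objects kept in shared memory (hypothetical)."""

    def __init__(self, num_nodes: int) -> None:
        self.data: Dict[str, Any] = {}
        # One volatile (in-memory) undo log per node. Assumption: the log of a
        # crashed node is still readable when recovery runs.
        self.undo_logs: Dict[int, List[UndoRecord]] = {n: [] for n in range(num_nodes)}
        self.active: Dict[str, int] = {}  # active transaction -> node it runs on

    def begin(self, txn_id: str, node: int) -> None:
        self.active[txn_id] = node

    def write(self, txn_id: str, key: str, value: Any) -> None:
        node = self.active[txn_id]
        # Log the before-image in memory before applying the update.
        self.undo_logs[node].append(UndoRecord(txn_id, key, self.data.get(key, _ABSENT)))
        self.data[key] = value

    def commit(self, txn_id: str) -> None:
        node = self.active.pop(txn_id)
        # A committed transaction is never undone, so drop its undo records.
        self.undo_logs[node] = [r for r in self.undo_logs[node] if r.txn_id != txn_id]

    def recover(self, crashed_nodes: Iterable[int]) -> None:
        for node in crashed_nodes:
            # Undo, newest first, every update made by a transaction that was
            # still active on the crashed node.
            for rec in reversed(self.undo_logs[node]):
                if rec.before is _ABSENT:
                    self.data.pop(rec.key, None)
                else:
                    self.data[rec.key] = rec.before
            self.undo_logs[node].clear()
            for txn_id in [t for t, n in self.active.items() if n == node]:
                del self.active[txn_id]


store = SharedMemoryStore(num_nodes=2)
store.begin("t0", node=0)
store.write("t0", "x", 1)
store.commit("t0")
store.begin("t1", node=0)   # active on the node that will crash
store.begin("t2", node=1)   # active on a surviving node
store.write("t1", "x", 10)
store.write("t2", "y", 20)
store.recover(crashed_nodes=[0])
print(store.data)  # {'x': 1, 'y': 20}
```

In this toy run, the update by `t1` on the crashed node is rolled back to the committed value, while the update by `t2` on the surviving node is left in place and `t2` stays active. The sketch ignores cache line migration and durability entirely; it only shows the shape of the guarantee.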