ABSTRACT
Researchers have made great strides in improving the fault tolerance of both centralized and replicated systems against arbitrary (Byzantine) faults. However, there are hard limits to how much can be done with entirely untrusted components; for example, replicated state machines cannot tolerate more than a third of their replica population being Byzantine. In this paper, we investigate how minimal trusted abstractions can push through these hard limits in practical ways. We propose Attested Append-Only Memory (A2M), a trusted system facility that is small, easy to implement and easy to verify formally. A2M provides the programming abstraction of a trusted log, which leads to protocol designs immune to equivocation -- the ability of a faulty host to lie in different ways to different clients or servers -- which is a common source of Byzantine headaches. Using A2M, we improve upon the state of the art in Byzantine-fault tolerant replicated state machines, producing A2M-enabled protocols (variants of Castro and Liskov's PBFT) that remain correct (linearizable) and keep making progress (live) even when half the replicas are faulty, in contrast to the previous upper bound. We also present an A2M-enabled single-server shared storage protocol that guarantees linearizability despite server faults. We implement A2M and our protocols, evaluate them experimentally through micro- and macro-benchmarks, and argue that the improved fault tolerance is cost-effective for a broad range of uses, opening up new avenues for practical, more reliable services.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Amazon S3. http://aws.amazon.com/s3/.
|
| |
2
|
Intel Active Management Technology (AMT). http://www.intel.com/technology/platform-technology/intel-amt/index.htm.
|
| |
3
|
Java. http://java.sun.com/.
|
| |
4
|
SFSlite. http://www.okws.org/doku.php?id=sfslite.
|
| |
5
|
Trusted Computing Group (TCG). http://www.trustedcomputinggroup.org/.
|
 |
6
|
Michael Abd-El-Malek , Gregory R. Ganger , Garth R. Goodson , Michael K. Reiter , Jay J. Wylie, Fault-scalable Byzantine fault-tolerant services, Proceedings of the twentieth ACM symposium on Operating systems principles, October 23-26, 2005, Brighton, United Kingdom
|
 |
7
|
Amitanand S. Aiyer , Lorenzo Alvisi , Allen Clement , Mike Dahlin , Jean-Philippe Martin , Carl Porth, BAR fault tolerance for cooperative services, Proceedings of the twentieth ACM symposium on Operating systems principles, October 23-26, 2005, Brighton, United Kingdom
|
| |
8
|
|
| |
9
|
|
 |
10
|
Paul Barham , Boris Dragovic , Keir Fraser , Steven Hand , Tim Harris , Alex Ho , Rolf Neugebauer , Ian Pratt , Andrew Warfield, Xen and the art of virtualization, Proceedings of the nineteenth ACM symposium on Operating systems principles, October 19-22, 2003, Bolton Landing, NY, USA
|
| |
11
|
Stefan Berger , Ramón Cáceres , Kenneth A. Goldman , Ronald Perez , Reiner Sailer , Leendert van Doorn, vTPM: virtualizing the trusted platform module, Proceedings of the 15th conference on USENIX Security Symposium, p.21-21, July 31-August 04, 2006, Vancouver, B.C., Canada
|
| |
12
|
|
 |
13
|
|
| |
14
|
James Cowling , Daniel Myers , Barbara Liskov , Rodrigo Rodrigues , Liuba Shrira, HQ replication: a hybrid quorum protocol for byzantine fault tolerance, Proceedings of the 7th symposium on Operating systems design and implementation, November 06-08, 2006, Seattle, Washington
|
| |
15
|
S. Haber and W. S. Stornetta. How to time-stamp a digital document. Journal of Cryptology, 3(2):99--111, 1991.
|
 |
16
|
|
 |
17
|
|
 |
18
|
Galen Hunt , Mark Aiken , Manuel Fähndrich , Chris Hawblitzel , Orion Hodson , James Larus , Steven Levi , Bjarne Steensgaard , David Tarditi , Ted Wobber, Sealing OS processes to improve dependability and safety, Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007, March 21-23, 2007, Lisbon, Portugal
|
 |
19
|
|
| |
20
|
Mahesh Kallahalla , Erik Riedel , Ram Swaminathan , Qian Wang , Kevin Fu, Plutus: Scalable Secure File Sharing on Untrusted Storage, Proceedings of the 2nd USENIX Conference on File and Storage Technologies, March 31-31, 2003, San Francisco, CA
|
| |
21
|
|
 |
22
|
Ramakrishna Kotla , Lorenzo Alvisi , Mike Dahlin , Allen Clement , Edmund Wong, Zyzzyva: speculative byzantine fault tolerance, Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles, October 14-17, 2007, Stevenson, Washington, USA
|
 |
23
|
|
 |
24
|
|
 |
25
|
|
| |
26
|
Jinyuan Li , Maxwell Krohn , David Mazières , Dennis Shasha, Secure untrusted data repository (SUNDR), Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation, p.9-9, December 06-08, 2004, San Francisco, CA
|
| |
27
|
J. Li and D. Mazißres. Beyond One-third Faulty Replicas in Byzantine Fault Tolerant Systems. In Proc. of NSDI, 2007.
|
 |
28
|
|
| |
29
|
|
| |
30
|
|
| |
31
|
|
| |
32
|
|
 |
33
|
|
| |
34
|
M. Naor. Bit commitment using pseudorandomness. Journal of Cryptology, 4(2):151--158, 1991.
|
 |
35
|
Karin Petersen , Mike J. Spreitzer , Douglas B. Terry , Marvin M. Theimer , Alan J. Demers, Flexible update propagation for weakly consistent replication, Proceedings of the sixteenth ACM symposium on Operating systems principles, p.288-301, October 05-08, 1997, Saint Malo, France
|
 |
36
|
|
 |
37
|
|
| |
38
|
P. Thambidurai and Y.-K. Park. Interactive consistency with multiple failure modes. In Proc. of SRDS, 1988.
|
 |
39
|
Jian Yin , Jean-Philippe Martin , Arun Venkataramani , Lorenzo Alvisi , Mike Dahlin, Separating agreement from execution for byzantine fault tolerant services, Proceedings of the nineteenth ACM symposium on Operating systems principles, October 19-22, 2003, Bolton Landing, NY, USA
|
| |
40
|
|
CITED BY 9
|
|
|
|
|
|
|
|
|
|
|
Byung-Gon Chun , Petros Maniatis , Scott Shenker , John Kubiatowicz, Tiered fault tolerance for long-term integrity, Proccedings of the 7th conference on File and stroage technologies, p.267-282, February 24-27, 2009, San Francisco, California
|
|
|
|
|
|
|
|
|
Atul Singh , Pedro Fonseca , Petr Kuznetsov , Rodrigo Rodrigues , Petros Maniatis, Zeno: eventually consistent Byzantine-fault tolerance, Proceedings of the 6th USENIX symposium on Networked systems design and implementation, p.169-184, April 22-24, 2009, Boston, Massachusetts
|
|
|
Dave Levin , John R. Douceur , Jacob R. Lorch , Thomas Moscibroda, TrInc: small trusted hardware for large distributed systems, Proceedings of the 6th USENIX symposium on Networked systems design and implementation, p.1-14, April 22-24, 2009, Boston, Massachusetts
|
|
|
|
|