ABSTRACT
Pangaea is a wide-area file system that supports data sharing among a community of widely distributed users. It is built on a symmetrically decentralized infrastructure that consists of commodity computers provided by the end users. Computers act autonomously to serve data to their local users. When possible, they exchange data with nearby peers to improve the system's overall performance, availability, and network economy. This approach is realized by aggressively creating a replica of a file whenever and wherever it is accessed.

This paper presents the design, implementation, and evaluation of the Pangaea file system. Pangaea offers efficient, randomized algorithms to manage highly dynamic and potentially large groups of file replicas. It applies optimistic consistency semantics to replica contents, but it also offers stronger guarantees when required by the users. The evaluation demonstrates that Pangaea outperforms existing distributed file systems in large heterogeneous environments, typical of the Internet and of large corporate intranets.