|
ABSTRACT
This paper presents and evaluates the storage management and caching in PAST, a large-scale peer-to-peer persistent storage utility. PAST is based on a self-organizing, Internet-based overlay network of storage nodes that cooperatively route file queries, store multiple replicas of files, and cache additional copies of popular files.In the PAST system, storage nodes and files are each assigned uniformly distributed identifiers, and replicas of a file are stored at nodes whose identifier matches most closely the file's identifier. This statistical assignment of files to storage nodes approximately balances the number of files stored on each node. However, non-uniform storage node capacities and file sizes require more explicit storage load balancing to permit graceful behavior under high global storage utilization; likewise, non-uniform popularity of files requires caching to minimize fetch distance and to balance the query load.We present and evaluate PAST, with an emphasis on its storage management and caching system. Extensive trace-driven experiments show that the system minimizes fetch distance, that it balances the query load for popular files, and that it displays graceful degradation of performance as the global storage utilization increases beyond 95%.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Napster. http://www.napster.com/.
|
| |
2
|
The Gnutella protocol specification, 2000. http: / / dss.clip2.com/GnuteUaProtocolO4.pdf.
|
 |
3
|
William Adjie-Winoto , Elliot Schwartz , Hari Balakrishnan , Jeremy Lilley, The design and implementation of an intentional naming system, Proceedings of the seventeenth ACM symposium on Operating systems principles, p.186-201, December 12-15, 1999, Charleston, South Carolina, United States
|
| |
4
|
|
| |
5
|
R. Anderson. The Eternity service. In Proc. PRAGOCRYPT'96, pages 242-252. CTU Publishing House, 1996. Prague, Czech Republic.
|
 |
6
|
T. E. Anderson , M. D. Dahlin , J. M. Neefe , D. A. Patterson , D. S. Roselli , R. Y. Wang, Serverless network file systems, Proceedings of the fifteenth ACM symposium on Operating systems principles, p.109-126, December 03-06, 1995, Copper Mountain, Colorado, United States
|
| |
7
|
F. Bennett, D. Clarke, J. B. Evans, A. Hopper, A. Jones, and D. Leask. Piconet - embedded mobile networking. IBEE Personal Communications, 4(5):8-15, October 1997.
|
 |
8
|
William J. Bolosky , John R. Douceur , David Ely , Marvin Theimer, Feasibility of a serverless distributed file system deployed on an existing set of desktop PCs, Proceedings of the 2000 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, p.34-43, June 18-21, 2000, Santa Clara, California, United States
|
| |
9
|
|
| |
10
|
L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker. Web caching and Zipf-like distributions: Evidence and implications. In Proc. IEEE Infoeom'g9, New York, NY, Mar. 1999.
|
| |
11
|
P. Cao and S. Irani. Cost-aware WWW proxy caching algorithms. In Proc. USENIX Symposium on Internet Technologies and Systems (USITS), Monterey, CA, Dec. 1997.
|
 |
12
|
|
| |
13
|
Ian Clarke , Oskar Sandberg , Brandon Wiley , Theodore W. Hong, Freenet: a distributed anonymous information storage and retrieval system, International workshop on Designing privacy enhancing technologies: design issues in anonymity and unobservability, p.46-66, January 2001, Berkeley, California, United States
|
 |
14
|
Frank Dabek , M. Frans Kaashoek , David Karger , Robert Morris , Ion Stoica, Wide-area cooperative storage with CFS, Proceedings of the eighteenth ACM symposium on Operating systems principles, October 21-24, 2001, Banff, Alberta, Canada
|
| |
15
|
|
| |
16
|
|
| |
17
|
J. Jannotti, D. K. Gifford, K. L. Johnson, M. F. Kaashoek, and J. W. O'Toole. Overcast: Reliable multicasting with an overlay network. In Proc. OSDI 2000, San Diego, CA, October 2000.
|
| |
18
|
J. Kangasharju, J. W. Roberts, and K. W. Ross. Performance evaluation of redirection schemes in content distribution networks. In Proc. 4th Web Caching Workshop, San Diego, CA, Mar. 1999.
|
| |
19
|
J. Kangasharju and K. W. Ross. A replicated architecture for the domain name system. In Proc. IEEE Infocom 2000, Tel Aviv, Israel, Max. 2000.
|
 |
20
|
John Kubiatowicz , David Bindel , Yan Chen , Steven Czerwinski , Patrick Eaton , Dennis Geels , Ramakrishna Gummadi , Sean Rhea , Hakim Weatherspoon , Chris Wells , Ben Zhao, OceanStore: an architecture for global-scale persistent storage, Proceedings of the ninth international conference on Architectural support for programming languages and operating systems, p.190-201, November 2000, Cambridge, Massachusetts, United States
|
 |
21
|
|
 |
22
|
Jinyang Li , John Jannotti , Douglas S. J. De Couto , David R. Karger , Robert Morris, A scalable location service for geographic ad hoc routing, Proceedings of the 6th annual international conference on Mobile computing and networking, p.120-130, August 06-11, 2000, Boston, Massachusetts, United States
[doi> 10.1145/345910.345931]
|
| |
23
|
|
| |
24
|
C. G. Plaxton, R. Rajaraman, and A. W. Richa. Accessing nearby copies of replicated objects in a distributed environment. Theory of Computing Systems, 32:241-280, 1999.
|
 |
25
|
Sylvia Ratnasamy , Paul Francis , Mark Handley , Richard Karp , Scott Schenker, A scalable content-addressable network, Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications, p.161-172, August 2001, San Diego, California, United States
|
| |
26
|
J. Reynolds. RFC 1309: Technical overview of directory services using the X.500 protocol, Mar. 1992.
|
| |
27
|
|
| |
28
|
S. Saroiu, P. K. Gummadi, and S. D. Gribble. A measurement study of peer-to-peer file sharing systems. Technical Roport UW-CSE-01-06-02, University of Washington, July 2001.
|
| |
29
|
Mark A. Sheldon , Andrzej Duda , Ron Weiss , David K. Gifford, Discover: a resource discovery system based on content routing, Proceedings of the Third International World-Wide Web conference on Technology, tools and applications, p.953-972, April 1995, Darmstadt, Germany
|
 |
30
|
Ion Stoica , Robert Morris , David Karger , M. Frans Kaashoek , Hari Balakrishnan, Chord: A scalable peer-to-peer lookup service for internet applications, Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications, p.149-160, August 2001, San Diego, California, United States
|
| |
31
|
|
CITED BY 171
|
|
|
|
|
|
|
|
|
|
|
André Brinkmann , Kay Salzwedel , Christian Scheideler, Compact, adaptive placement schemes for non-uniform requirements, Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures, August 10-13, 2002, Winnipeg, Manitoba, Canada
|
|
|
|
|
|
|
|
|
Petros Maniatis , David S. H. Rosenthal , Mema Roussopoulos , Mary Baker , TJ Giuli , Yanto Muliadi, Preserving peer replicas by rate-limited sampled voting, Proceedings of the nineteenth ACM symposium on Operating systems principles, October 19-22, 2003, Bolton Landing, NY, USA
|
|
|
|
|
|
|
|
|
|
|
|
Kirsten Hildrum , John D. Kubiatowicz , Satish Rao , Ben Y. Zhao, Distributed object location in a dynamic network, Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures, August 10-13, 2002, Winnipeg, Manitoba, Canada
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Atul Adya , William J. Bolosky , Miguel Castro , Gerald Cermak , Ronnie Chaiken , John R. Douceur , Jon Howell , Jacob R. Lorch , Marvin Theimer , Roger P. Wattenhofer, Farsite: federated, available, and reliable storage for an incompletely trusted environment, ACM SIGOPS Operating Systems Review, v.36 n.SI, Winter 2002
|
|
|
Brent Chun , David Culler , Timothy Roscoe , Andy Bavier , Larry Peterson , Mike Wawrzoniak , Mic Bowman, PlanetLab: an overlay testbed for broad-coverage services, ACM SIGCOMM Computer Communication Review, v.33 n.3, July 2003
|
|
|
|
|
|
|
|
|
K. Gummadi , R. Gummadi , S. Gribble , S. Ratnasamy , S. Shenker , I. Stoica, The impact of DHT routing geometry on resilience and proximity, Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications, August 25-29, 2003, Karlsruhe, Germany
|
|
|
Mohamed Hefeeda , Ahsan Habib , Boyan Botev , Dongyan Xu , Bharat Bhargava, PROMISE: peer-to-peer media streaming using CollectCast, Proceedings of the eleventh ACM international conference on Multimedia, November 02-08, 2003, Berkeley, CA, USA
|
|
|
|
|
|
Venkata N. Padmanabhan , Helen J. Wang , Philip A. Chou , Kunwadee Sripanidkulchai, Distributing streaming media content using cooperative networking, Proceedings of the 12th international workshop on Network and operating systems support for digital audio and video, May 12-14, 2002, Miami, Florida, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Akihiro Nakao , Larry Peterson , Andy Bavier, A routing underlay for overlay networks, Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications, August 25-29, 2003, Karlsruhe, Germany
|
|
|
Jian Yin , Jean-Philippe Martin , Arun Venkataramani , Lorenzo Alvisi , Mike Dahlin, Separating agreement from execution for byzantine fault tolerant services, Proceedings of the nineteenth ACM symposium on Operating systems principles, October 19-22, 2003, Bolton Landing, NY, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Byung-Gon Chun , Kamalika Chaudhuri , Hoeteck Wee , Marco Barreno , Christos H. Papadimitriou , John Kubiatowicz, Selfish caching in distributed systems: a game-theoretic analysis, Proceedings of the twenty-third annual ACM symposium on Principles of distributed computing, July 25-28, 2004, St. John's, Newfoundland, Canada
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Qin Lv , Pei Cao , Edith Cohen , Kai Li , Scott Shenker, Search and replication in unstructured peer-to-peer networks, Proceedings of the 16th international conference on Supercomputing, June 22-26, 2002, New York, New York, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Mark Gaynor , Steven L. Moulton , Matt Welsh , Ed LaCombe , Austin Rowan , John Wynne, Integrating Wireless Sensor Networks with the Grid, IEEE Internet Computing, v.8 n.4, p.32-39, July 2004
|
|
|
|
|
|
Miguel Castro , Peter Druschel , Ayalvadi Ganesh , Antony Rowstron , Dan S. Wallach, Secure routing for structured peer-to-peer overlay networks, Proceedings of the 5th symposium on Operating systems design and implementation Due to copyright restrictions we are not able to make the PDFs for this conference available for downloading, December 09-11, 2002, Boston, Massachusetts
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Sumeet Sobti , Nitin Garg , Fengzhou Zheng , Junwen Lai , Yilei Shao , Chi Zhang , Elisha Ziskind , Arvind Krishnamurthy , Randolph Y. Wang, Segank: A Distributed Mobile Storage System, Proceedings of the 3rd USENIX Conference on File and Storage Technologies, March 31-31, 2004, San Francisco, CA
|
|
|
|
|
|
Sean Rhea , Patrick Eaton , Dennis Geels , Hakim Weatherspoon , Ben Zhao , John Kubiatowicz, Awarded Best Student Paper! - Pond: The OceanStore Prototype, Proceedings of the 2nd USENIX Conference on File and Storage Technologies, March 31-31, 2003, San Francisco, CA
|
|
|
Yi-Cheng Tu , Jianzhong Sun , Mohamed Hefeeda , Sunil Prabhakar, An analytical study of peer-to-peer media streaming systems, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP), v.1 n.4, p.354-376, November 2005
|
|
|
|
|
|
Mahesh Kallahalla , Erik Riedel , Ram Swaminathan , Qian Wang , Kevin Fu, Plutus: Scalable Secure File Sharing on Untrusted Storage, Proceedings of the 2nd USENIX Conference on File and Storage Technologies, March 31-31, 2003, San Francisco, CA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Sean Rhea , Brighten Godfrey , Brad Karp , John Kubiatowicz , Sylvia Ratnasamy , Scott Shenker , Ion Stoica , Harlan Yu, OpenDHT: a public DHT service and its uses, ACM SIGCOMM Computer Communication Review, v.35 n.4, October 2005
|
|
|
|
|
|
Atul Adya , William J. Bolosky , Miguel Castro , Gerald Cermak , Ronnie Chaiken , John R. Douceur , Jon Howell , Jacob R. Lorch , Marvin Theimer , Roger P. Wattenhofer, Farsite: federated, available, and reliable storage for an incompletely trusted environment, Proceedings of the 5th symposium on Operating systems design and implementation Due to copyright restrictions we are not able to make the PDFs for this conference available for downloading, December 09-11, 2002, Boston, Massachusetts
|
|
|
Renata Teixeira , Keith Marzullo , Stefan Savage , Geoffrey M. Voelker, In search of path diversity in ISP networks, Proceedings of the 3rd ACM SIGCOMM conference on Internet measurement, October 27-29, 2003, Miami Beach, FL, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Nadia Shalaby , Andy Bavier , Yitzchak Gottlieb , Scott Karlin , Larry Peterson , Xiaohu Qie , Tammo Spalink , Mike Wawrzoniak, Building extensible routers using network processors: Research Articles, Software—Practice & Experience, v.35 n.12, p.1155-1194, October 2005
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Salma Ktari , Mathieu Zoubert , Artur Hecker , Houda Labiod, Performance evaluation of replication strategies in DHTs under churn, Proceedings of the 6th international conference on Mobile and ubiquitous multimedia, p.90-97, December 12-14, 2007, Oulu, Finland
|
|
|
|
|
|
|
|
|
|
|
|
Sudharshan S. Vazhkudai , Xiaosong Ma , Vincent W. Freeh , Jonathan W. Strickland , Nandan Tammineedi , Tyler Simon , Stephen L. Scott, Constructing collaborative desktop storage caches for large scientific datasets, ACM Transactions on Storage (TOS), v.2 n.3, p.221-254, August 2006
|
|
|
|
|
|
Mary Baker , Mehul Shah , David S. H. Rosenthal , Mema Roussopoulos , Petros Maniatis , TJ Giuli , Prashanth Bungale, A fresh look at the reliability of long-term digital storage, ACM SIGOPS Operating Systems Review, v.40 n.4, October 2006
|
|
|
Fan Chung , Ronald Graham , Ranjita Bhagwan , Stefan Savage , Geoffrey M. Voelker, Maximizing data locality in distributed systems, Journal of Computer and System Sciences, v.72 n.8, p.1309-1316, December, 2006
|
|
|
|
|
|
|
|
|
|
|
|
Chad Yoshikawa , Brent Chun , Amin Vahdat , Fred Annexstein , Ken Berman, The lonely NATed node, Proceedings of the 11th workshop on ACM SIGOPS European workshop: beyond the PC, p.36-es, September 19-22, 2004, Leuven, Belgium
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Adolfo Rodriguez , Charles Killian , Sooraj Bhat , Dejan Kostić , Amin Vahdat, MACEDON: methodology for automatically creating, evaluating, and designing overlay networks, Proceedings of the 1st conference on Symposium on Networked Systems Design and Implementation, p.20-20, March 29-31, 2004, San Francisco, California
|
|
|
Charles Blake , Rodrigo Rodrigues, High availability, scalable storage, dynamic peer networks: pick two, Proceedings of the 9th conference on Hot Topics in Operating Systems, p.1-1, May 18-21, 2003, Lihue, Hawaii
|
|
|
|
|
|
|
|
|
|
|
|
Ali Fessi , Heiko Niedermayer , Holger Kinkelin , Georg Carle, A cooperative SIP infrastructure for highly reliable telecommunication services, Proceedings of the 1st international conference on Principles, systems and applications of IP telecommunications, July 19-20, 2007, New York City, New York
|
|
|
|
|
|
Leonidas Galanis , Yuan Wang , Shawn R. Jeffery , David J. DeWitt, Locating data sources in large distributed systems, Proceedings of the 29th international conference on Very large data bases, p.874-885, September 09-12, 2003, Berlin, Germany
|
|
|
James Walkerdine , Danny Hughes , Paul Rayson , John Simms , Kiel Gilleade , John Mariani , Ian Sommerville, A framework for P2P application development, Computer Communications, v.31 n.2, p.387-401, February, 2008
|
|
|
Chreston Miller , Patrick Butler , Ankur Shah , Ali R. Butt, PeerStripe: a p2p-based large-file storage for desktop grids, Proceedings of the 16th international symposium on High performance distributed computing, June 25-29, 2007, Monterey, California, USA
|
|
|
|
|
|
Zheng Zhang , Qiao Lian , Shiding Lin , Wei Chen , Yu Chen , Chao Jin, BitVault: a highly reliable distributed data retention platform, ACM SIGOPS Operating Systems Review, v.41 n.2, p.27-36, April 2007
|
|
|
Sudharshan S. Vazhkudai , Xiaosong Ma , Vincent W. Freeh , Jonathan W. Strickland , Nandan Tammineedi , Stephen L. Scott, FreeLoader: Scavenging Desktop Storage Resources for Scientific Data, Proceedings of the 2005 ACM/IEEE conference on Supercomputing, p.56, November 12-18, 2005
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Sage A. Weil , Andrew W. Leung , Scott A. Brandt , Carlos Maltzahn, RADOS: a scalable, reliable storage service for petabyte-scale storage clusters, Proceedings of the 2nd international workshop on Petascale data storage: held in conjunction with Supercomputing '07, November 11-11, 2007, Reno, Nevada
|
|
|
Harry C. Li , Allen Clement , Edmund L. Wong , Jeff Napper , Indrajit Roy , Lorenzo Alvisi , Michael Dahlin, BAR gossip, Proceedings of the 7th symposium on Operating systems design and implementation, November 06-08, 2006, Seattle, Washington
|
|
|
Yuhui Deng , Frank Wang , Na Helian , Sining Wu , Chenhan Liao, Dynamic and scalable storage management architecture for Grid Oriented Storage devices, Parallel Computing, v.34 n.1, p.17-31, January, 2008
|
|
|
|
|
|
Kirsten Hildrum , Fred Douglis , Joel L. Wolf , Philip S. Yu , Lisa Fleischer , Akshay Katta, Storage optimization for large-scale distributed stream-processing systems, ACM Transactions on Storage (TOS), v.3 n.4, p.1-28, February 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Giuseppe DeCandia , Deniz Hastorun , Madan Jampani , Gunavardhan Kakulapati , Avinash Lakshman , Alex Pilchin , Swaminathan Sivasubramanian , Peter Vosshall , Werner Vogels, Dynamo: amazon's highly available key-value store, ACM SIGOPS Operating Systems Review, v.41 n.6, December 2007
|
|
|
|
|
|
|
|
|
Elena Meshkova , Janne Riihijärvi , Marina Petrova , Petri Mähönen, A survey on resource discovery mechanisms, peer-to-peer and service discovery frameworks, Computer Networks: The International Journal of Computer and Telecommunications Networking, v.52 n.11, p.2097-2128, August, 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Pascal Felber , Toan Luu , Martin Rajman , Etienne Riviere, Managing collaborative feedback information for distributed retrieval, Proceeding of the 2008 ACM workshop on Large-Scale distributed systems for information retrieval, October 30-30, 2008, Napa Valley, California, USA
|
|
|
|
|
|
|
|
|
Byung-Gon Chun , Petros Maniatis , Scott Shenker , John Kubiatowicz, Tiered fault tolerance for long-term integrity, Proccedings of the 7th conference on File and stroage technologies, p.267-282, February 24-27, 2009, San Francisco, California
|
|
|
John MacCormick , Nicholas Murphy , Venugopalan Ramasubramanian , Udi Wieder , Junfeng Yang , Lidong Zhou, Kinesis: A new approach to replica placement in distributed storage systems, ACM Transactions on Storage (TOS), v.4 n.4, p.1-28, January 2009
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Animesh Nandi , Tsuen-Wan Johnny Ngan , Atul Singh , Peter Druschel , Dan S. Wallach, Scrivener: providing incentives in cooperative content distribution systems, Proceedings of the ACM/IFIP/USENIX 2005 International Conference on Middleware, p.270-291, November 01-01, 2005, Grenoble, France
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Maysam Yabandeh , Nikola Knezevic , Dejan Kostic , Viktor Kuncak, CrystalBall: predicting and preventing inconsistencies in deployed distributed systems, Proceedings of the 6th USENIX symposium on Networked systems design and implementation, p.229-244, April 22-24, 2009, Boston, Massachusetts
|
|
|
|
|
|
|
|
|
Avishek Anand , Srikanta Bedathur , Klaus Berberich , Ralf Schenkel , Christos Tryfonopoulos, EverLast: a distributed architecture for preserving the web, Proceedings of the 9th ACM/IEEE-CS joint conference on Digital libraries, June 15-19, 2009, Austin, TX, USA
|
|
|
|
|
|
|
|