ABSTRACT
High-end computing is suffering a data deluge from experiments, simulations, and apparatus that creates overwhelming application dataset sizes. End-user workstations, despite having more processing power than ever before, are ill-equipped to cope with such data demands because of insufficient secondary storage space and I/O rates. Meanwhile, a large portion of desktop storage is unused. We present the FreeLoader framework, which aggregates unused desktop storage space and I/O bandwidth into a shared cache/scratch space for hosting large, immutable datasets and exploiting data access locality. Our experiments show that FreeLoader is an appealing low-cost solution for storing massive datasets, delivering higher data access rates than traditional storage facilities. In particular, we present novel data striping techniques that allow FreeLoader to efficiently aggregate a workstation's network communication bandwidth and local I/O bandwidth. In addition, the performance impact on the native workload of donor machines is small and can be effectively controlled.