ABSTRACT
In numerous scientific disciplines, terabyte- and soon petabyte-scale data collections are emerging as critical community resources. A new class of Data Grid infrastructure is required to support management, transport, distributed access to, and analysis of these datasets by potentially thousands of users. Researchers who face this challenge include the climate modeling community, which performs long-duration computations that frequently output very large files requiring further analysis. We describe the Earth System Grid prototype, which brings together advanced analysis, replica management, data transfer, request management, and other technologies to support high-performance, interactive analysis of replicated data. We present performance results that demonstrate our ability to manage the location and movement of large datasets from the user's desktop. We also report on experiments conducted over SciNET at SC'2000, where we achieved peak performance of 1.55 Gb/s and sustained performance of 512.9 Mb/s for data transfers between Texas and California.