DataStager: scalable data staging services for petascale applications
Source
High Performance Distributed Computing
Proceedings of the 18th ACM International Symposium on High Performance Distributed Computing
Garching, Germany
SESSION: I/O and parallel computing
Pages 39-48
Year of Publication: 2009
ISBN: 978-1-60558-587-1
Authors
Hasan Abbasi  Georgia Institute of Technology, Atlanta, GA, USA
Matthew Wolf  Georgia Institute of Technology, Atlanta, GA, USA
Greg Eisenhauer  Georgia Institute of Technology, Atlanta, GA, USA
Scott Klasky  Oak Ridge National Laboratory, Oak Ridge, TN, USA
Karsten Schwan  Georgia Institute of Technology, Atlanta, GA, USA
Fang Zheng  Georgia Institute of Technology, Atlanta, GA, USA
Sponsors
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1551609.1551618

ABSTRACT

Two known challenges for petascale machines are that (1) the cost of I/O for high-performance applications can be substantial, especially for output tasks such as checkpointing, and (2) noise from I/O actions can inject undesirable delays into the runtimes of such codes on individual compute nodes. This paper introduces the flexible 'DataStager' framework for data staging, along with alternative services within it that jointly address (1) and (2). Data staging services, which move output data from compute nodes to staging or I/O nodes prior to storage, reduce the I/O overhead on applications' total processing times, and explicit management of data staging reduces perturbation when extracting output data from a petascale machine's compute partition. Experimental evaluations of DataStager on the Cray XT machine at Oak Ridge National Laboratory, using the GTC fusion modeling code and benchmarks running on 1000+ processors, establish both the necessity of intelligent data staging and the high performance of our approach.


