Scaling parallel I/O performance through I/O delegate and caching system
Source: Conference on High Performance Networking and Computing
Proceedings of the 2008 ACM/IEEE conference on Supercomputing - Volume 00
Austin, Texas
SECTION: Papers
Article No. 9  
Year of Publication: 2008
ISBN: 978-1-4244-2835-9
Authors
Arifa Nisar  Northwestern University, Evanston, Illinois
Wei-keng Liao  Northwestern University, Evanston, Illinois
Alok Choudhary  Northwestern University, Evanston, Illinois
Publisher
IEEE Press  Piscataway, NJ, USA
Bibliometrics
Downloads (6 Weeks): 24,   Downloads (12 Months): 258,   Citation Count: 1
ABSTRACT

Increasingly complex scientific applications require massive parallelism to achieve the goals of fidelity and high computational performance. Such applications periodically offload checkpoint data to the file system for post-processing and program resumption. As a side effect of this high degree of parallelism, I/O contention at the servers prevents overall performance from scaling with an increasing number of processors. To bridge the gap between parallel computational and I/O performance, we propose a portable MPI-IO layer in which certain tasks, such as file caching, consistency control, and collective I/O optimization, are delegated to a small set of compute nodes, collectively termed I/O delegate nodes. A collective cache design is incorporated to resolve cache coherence and hence alleviate lock contention at the I/O servers. Using a popular parallel I/O benchmark and application I/O kernels, our experimental evaluation indicates considerable performance improvement with a small percentage of compute resources reserved for I/O.
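
The delegation idea can be illustrated with a small, single-process sketch. It is not the paper's implementation and uses no MPI. It assumes a hypothetical round-robin rule (`owner_of`) under which each file page is cached by exactly one delegate, so only one cached copy of any page exists and no coherence traffic between caches is needed. The names `PAGE_SIZE`, `Delegate`, `delegated_write`, and `flush` are invented for illustration.

```python
PAGE_SIZE = 4  # bytes per cache page; tiny on purpose, for illustration only


def owner_of(page_index, num_delegates):
    """Hypothetical placement rule: pages are assigned round-robin."""
    return page_index % num_delegates


class Delegate:
    """Stands in for one I/O delegate node holding cached file pages."""

    def __init__(self):
        self.cache = {}  # page index -> bytearray of PAGE_SIZE bytes

    def write(self, page_index, offset_in_page, data):
        page = self.cache.setdefault(page_index, bytearray(PAGE_SIZE))
        page[offset_in_page:offset_in_page + len(data)] = data


def delegated_write(delegates, offset, data):
    """Split a write at page boundaries and forward each piece to its owner."""
    pos = 0
    while pos < len(data):
        page_index, in_page = divmod(offset + pos, PAGE_SIZE)
        chunk = data[pos:pos + (PAGE_SIZE - in_page)]
        owner = delegates[owner_of(page_index, len(delegates))]
        owner.write(page_index, in_page, chunk)
        pos += len(chunk)


def flush(delegates):
    """Assemble the cached pages into the file image a server would receive."""
    pages = {}
    for delegate in delegates:
        pages.update(delegate.cache)
    if not pages:
        return b""
    image = bytearray((max(pages) + 1) * PAGE_SIZE)
    for index, page in pages.items():
        image[index * PAGE_SIZE:(index + 1) * PAGE_SIZE] = page
    return bytes(image)


# Two "compute processes" write adjacent regions that share page 1.
delegates = [Delegate(), Delegate()]
delegated_write(delegates, 0, b"AAAAAA")
delegated_write(delegates, 6, b"BBBBBB")
file_image = flush(delegates)
```

In this sketch the two writes overlap on page 1, yet both pieces land in the same delegate's cache, so the file servers would see one writer per page rather than two processes competing for a lock.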



