Y-lib: a user level library to increase the performance of MPI-IO in a lustre file system environment
Source
High Performance Distributed Computing
Proceedings of the 18th ACM international symposium on High performance distributed computing
Garching, Germany
SESSION: I/O and parallel computing
Pages 31-38  
Year of Publication: 2009
ISBN: 978-1-60558-587-1
Authors
Phillip M. Dickens  The University of Maine, Orono, ME, USA
Jeremy Logan  The University of Maine, Orono, ME, USA
Sponsors
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 12,   Downloads (12 Months): 47,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/1551609.1551617

ABSTRACT

It is widely known that MPI-IO performs poorly in a Lustre file system environment, although the reasons for such performance are currently not well understood. The research presented in this paper strongly supports our hypothesis that MPI-IO performs poorly in this environment because of the fundamental assumptions upon which most parallel I/O optimizations are based. In particular, it is almost universally believed that parallel I/O performance is optimized when aggregator processes perform large, contiguous I/O operations in parallel. Our research shows that this approach generally provides the worst performance in a Lustre environment, and that the best performance is often obtained when the aggregator processes perform a large number of small, non-contiguous I/O operations.

In this paper, we first demonstrate and explain these non-intuitive results. We then present a user-level library, termed Y-lib, which redistributes data in a way that conforms much more closely to the Lustre storage architecture than does the data redistribution pattern employed by MPI-IO. We then provide experimental results showing that Y-lib can increase performance by between 300% and 1000%, depending on the number of aggregator processes and the file size. Finally, we cause MPI-IO itself to use our data redistribution scheme, and show that doing so results in an increase in performance of a similar magnitude when compared to the current MPI-IO data redistribution algorithms.
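
The contrast between the two access patterns can be illustrated with a small sketch. It is not Y-lib's code, and the abstract does not spell out the actual redistribution scheme; it only assumes Lustre's round-robin striping of a file across object storage targets (OSTs). The function names and the parameter values below are hypothetical, chosen for illustration.

```python
def osts_touched_contiguous(rank, n_aggregators, file_size, stripe_size, stripe_count):
    """OSTs touched when aggregator `rank` owns one large contiguous block.

    Assumes round-robin striping: stripe s is stored on OST (s % stripe_count).
    """
    block = file_size // n_aggregators
    start = rank * block
    end = start + block
    first_stripe = start // stripe_size
    last_stripe = (end - 1) // stripe_size
    return {s % stripe_count for s in range(first_stripe, last_stripe + 1)}


def osts_touched_stripe_aligned(rank, n_aggregators, file_size, stripe_size, stripe_count):
    """OSTs touched when aggregator `rank` owns every n_aggregators-th stripe.

    This is an illustrative stripe-aligned layout (many small, non-contiguous
    pieces), not necessarily the layout used by Y-lib.
    """
    n_stripes = -(-file_size // stripe_size)  # ceiling division
    return {s % stripe_count for s in range(rank, n_stripes, n_aggregators)}


MiB = 1024 * 1024
# Hypothetical configuration: 64 MiB file, 1 MiB stripes, 4 OSTs, 4 aggregators.
params = dict(n_aggregators=4, file_size=64 * MiB, stripe_size=MiB, stripe_count=4)

contiguous = osts_touched_contiguous(0, **params)
aligned = osts_touched_stripe_aligned(2, **params)
print(sorted(contiguous))  # every OST is contacted by this one aggregator
print(sorted(aligned))     # a single OST is contacted by this aggregator
```

Under these assumptions, an aggregator writing one large contiguous block must communicate with every OST, while an aggregator writing many small stripe-sized pieces can be confined to a single OST. This is one plausible reading of why the small, non-contiguous pattern can match the storage architecture more closely.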


