Scalable load-balance measurement for SPMD codes
Source: Conference on High Performance Networking and Computing
Proceedings of the 2008 ACM/IEEE conference on Supercomputing - Volume 00
Austin, Texas
Section: Papers
Article No. 46
Year of Publication: 2008
ISBN: 978-1-4244-2835-9
Authors
Todd Gamblin  University of North Carolina at Chapel Hill
Bronis R. de Supinski  Lawrence Livermore National Laboratory
Martin Schulz  Lawrence Livermore National Laboratory
Rob Fowler  University of North Carolina at Chapel Hill
Daniel A. Reed  Microsoft Research
Publisher
IEEE Press, Piscataway, NJ, USA

ABSTRACT

Good load balance is crucial on very large parallel systems, but the most sophisticated algorithms introduce dynamic imbalances through adaptation in domain decomposition or use of adaptive solvers. To observe and diagnose imbalance, developers need system-wide, temporally ordered measurements from full-scale runs. This potentially requires collecting data from multiple code regions on all processors over the entire execution. Naive instrumentation of this kind can, in combination with the application itself, exceed available I/O bandwidth and storage capacity, and can severely perturb application behavior.

We present and evaluate a novel technique for scalable, low-error load-balance measurement. The technique uses a parallel wavelet transform and other parallel encoding methods. We show that it collects and reconstructs system-wide measurements with low error. Compression time scales sublinearly with system size, and the compressed data volume is several orders of magnitude smaller than the raw data. The overhead is low enough for online use in a production environment.
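
To make the idea of wavelet-based compression of load measurements concrete, the following is a minimal serial sketch, not the authors' implementation. It assumes a Haar wavelet (the abstract does not name the basis), treats one code region's per-process times as a one-dimensional signal, and compresses by keeping only the largest coefficients. The function names `haar_forward`, `haar_inverse`, and `keep_largest` are hypothetical, and the parallel transform and encoding stages described in the abstract are not modeled here.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """Multi-level orthonormal Haar transform of a power-of-two-length list."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    out = [float(x) for x in signal]
    length = n
    while length > 1:
        half = length // 2
        # Pairwise scaled sums (coarse trend) and differences (local detail).
        sums = [(out[2 * i] + out[2 * i + 1]) / SQRT2 for i in range(half)]
        diffs = [(out[2 * i] - out[2 * i + 1]) / SQRT2 for i in range(half)]
        out[:length] = sums + diffs
        length = half
    return out


def haar_inverse(coeffs):
    """Invert haar_forward, rebuilding the signal level by level."""
    n = len(coeffs)
    out = [float(c) for c in coeffs]
    length = 1
    while length < n:
        merged = []
        for s, d in zip(out[:length], out[length:2 * length]):
            merged.append((s + d) / SQRT2)
            merged.append((s - d) / SQRT2)
        out[:2 * length] = merged
        length *= 2
    return out


def keep_largest(coeffs, keep):
    """Zero every coefficient except the `keep` largest in magnitude."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:keep])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]


if __name__ == "__main__":
    # Hypothetical per-process times (seconds) for one region on 8 processes:
    # the second half of the processes carries more load than the first.
    loads = [5.0, 5.0, 5.0, 5.0, 9.0, 9.0, 9.0, 9.0]
    coeffs = haar_forward(loads)
    sparse = keep_largest(coeffs, keep=2)
    approx = haar_inverse(sparse)
    print("nonzero coefficients kept:", sum(1 for c in sparse if c != 0.0))
    print("max reconstruction error:", max(abs(a - b) for a, b in zip(loads, approx)))
```

In this illustrative input the load varies smoothly across neighboring processes, so most coefficients are zero and a small subset is enough to reconstruct the signal. That is the property a wavelet-based encoder relies on; real measurements would generally require a tunable threshold and would reconstruct only approximately.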


