Quartz: a tool for tuning parallel program performance
Source: Joint International Conference on Measurement and Modeling of Computer Systems
Proceedings of the 1990 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems
Univ. of Colorado, Boulder, Colorado, United States
Pages: 115 - 125  
Year of Publication: 1990
ISBN: 0-89791-359-0
Authors
Thomas E. Anderson  Department of Computer Science and Engineering, University of Washington, Seattle WA
Edward D. Lazowska  Department of Computer Science and Engineering, University of Washington, Seattle WA
Sponsor
SIGMETRICS: ACM Special Interest Group on Measurement and Evaluation
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 5,   Downloads (12 Months): 30,   Citation Count: 38
DOI: http://doi.acm.org/10.1145/98457.98518

ABSTRACT

Initial implementations of parallel programs typically yield disappointing performance. Tuning to improve performance is thus a significant part of the parallel programming process. The effort required to tune a parallel program, and the level of performance that eventually is achieved, both depend heavily on the quality of the instrumentation that is available to the programmer. This paper describes Quartz, a new tool for tuning parallel program performance on shared memory multiprocessors. The philosophy underlying Quartz was inspired by that of the sequential UNIX tool gprof: to appropriately direct the attention of the programmer by efficiently measuring just those factors that are most responsible for performance and by relating these metrics to one another and to the structure of the program. This philosophy is even more important in the parallel domain than in the sequential domain, because of the dramatically greater number of possible metrics and the dramatically increased complexity of program structures. The principal metric of Quartz is normalized processor time: the total processor time spent in each section of code divided by the number of other processors that are concurrently busy when that section of code is being executed. Tied to the logical structure of the program, this metric provides a “smoking gun” pointing towards those areas of the program most responsible for poor performance. This information can be acquired efficiently by checkpointing to memory the number of busy processors and the state of each processor, and then statistically sampling these using a dedicated processor. In addition to describing the design rationale, functionality, and implementation of Quartz, the paper examines how Quartz would be used to solve a number of performance problems that have been reported as being frequently encountered, and describes a case study in which Quartz was used to significantly improve the performance of a CAD circuit verifier.
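The normalized processor time metric described in the abstract can be illustrated with a small sketch. This is not Quartz's implementation. It assumes a sampler has already recorded snapshots in which each processor is either idle or executing a named section of code. It also assumes the divisor is the number of processors busy at the sample, including the one being charged, whereas the abstract words it as the number of other busy processors; counting the charged processor avoids a zero divisor. The function name, the snapshot layout, and the section names are hypothetical.

```python
from collections import defaultdict


def normalized_processor_time(samples, sample_interval=1.0):
    """Accumulate normalized processor time per code section.

    samples: list of snapshots; each snapshot maps a processor id to the
             name of the section it is executing, or None if it is idle.
    sample_interval: time represented by one snapshot (assumed constant).
    """
    totals = defaultdict(float)
    for snapshot in samples:
        # Sections being executed by busy processors at this sample.
        busy = [section for section in snapshot.values() if section is not None]
        if not busy:
            continue  # nothing to charge when every processor is idle
        # Assumption: each busy processor is charged the interval divided
        # by the number of processors busy at the same moment.
        weight = sample_interval / len(busy)
        for section in busy:
            totals[section] += weight
    return dict(totals)


# Hypothetical four-processor run with three samples.
samples = [
    {0: "init", 1: None, 2: None, 3: None},            # serial phase
    {0: "solve", 1: "solve", 2: "solve", 3: "solve"},  # fully parallel phase
    {0: "merge", 1: None, 2: "solve", 3: None},        # partly parallel phase
]
result = normalized_processor_time(samples)
```

In this made-up run, "init" occupies one processor for one sample and "solve" occupies five processor-samples in total, yet "init" is charged 1.0 against 1.5 for "solve". Code that runs while few processors are busy is weighted heavily, which is how the metric points at sections that limit parallelism.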



