ABSTRACT
Contemporary microprocessors provide a rich set of integrated performance counters that give application developers and system architects alike the opportunity to gather important information about workload behavior. Current techniques for analyzing the data these counters produce rely on raw counts, ratios, and visualization to help users make decisions about application performance. While these techniques are appropriate for analyzing data from one process, they do not scale easily to the levels demanded by contemporary computing systems. This paper addresses these concerns by evaluating several multivariate statistical techniques on these datasets. We find that several techniques, such as statistical clustering, can automatically extract important features from the data. These derived results can, in turn, be fed directly back to an application developer or used as input to a more comprehensive performance analysis environment, such as a visualization or expert system.
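To make the idea of statistical clustering on counter data concrete, the following is a minimal sketch and not the paper's own method. It assumes one vector of counter totals per process, standardizes each counter, and groups the processes with plain k-means. The counter names, the sample values, and the helper functions `standardize` and `kmeans` are all hypothetical and chosen only for illustration.

```python
import math

def standardize(rows):
    """Scale each counter (column) to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a counter is constant, to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-first seeding."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from every centroid chosen so far.
        centroids.append(max(
            points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(v) / len(members) for v in zip(*members)]
    return labels

# Hypothetical per-process totals: (cycles, L1 data-cache misses, FP operations).
# These numbers are invented for the example, not measurements from the paper.
counts = [
    [1.00e9, 2.0e6, 5.0e8],
    [1.02e9, 2.1e6, 5.1e8],
    [0.98e9, 1.9e6, 4.9e8],
    [1.50e9, 9.0e6, 3.0e8],
    [1.55e9, 9.5e6, 3.1e8],
    [1.48e9, 8.8e6, 2.9e8],
]

labels = kmeans(standardize(counts), k=2)
print(labels)  # one cluster label per process
```

Standardizing first keeps a counter with large absolute values, such as cycles, from dominating the distance computation. With many processes, the cluster labels give a compact summary that a developer or a downstream tool could inspect instead of every raw count.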