Performance tools for large-scale clusters
Source Conference on High Performance Networking and Computing
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Tampa, Florida
SESSION: Birds of a feather
Article No. 7  
Year of Publication: 2006
ISBN:0-7695-2700-0
Author
Sponsors
IEEE : Institute of Electrical and Electronics Engineers
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1188455.1188463

ABSTRACT

Identifying the factors that limit the scalability of large-scale clusters remains a challenging area of research. Several efforts have focused on developing tools that address different aspects of this problem space. Presenters from both industry and research will describe their tools, including tools for monitoring cluster-wide utilization of system components (CPUs, memory, I/O, interconnect); monitoring node-level load (memory bandwidth, cache/TLB misses, stall components); and high-frequency, low-overhead, fine-grained application profiling. Our first goal is to discuss how such tools can help improve application performance, reduce hot-spots, and determine the best match between applications and platforms. The second goal is to solicit developers' and users' experiences with problems in this space, along with ideas and techniques that might help address them.