The implications of working set analysis on supercomputing memory hierarchy design
Source: International Conference on Supercomputing
Proceedings of the 19th annual international conference on Supercomputing
Cambridge, Massachusetts
SESSION: Session 9: operating systems
Pages: 332 - 340  
Year of Publication: 2005
ISBN:1-59593-167-8
Authors
Richard Murphy  University of Notre Dame, Notre Dame, IN
Arun Rodrigues  University of Notre Dame, Notre Dame, IN
Peter Kogge  University of Notre Dame, Notre Dame, IN
Keith Underwood  Sandia National Lab, Albuquerque, NM
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1088149.1088193

ABSTRACT

Supercomputer architects strive to maximize the performance of scientific applications. Unfortunately, the large, unwieldy nature of most scientific applications has led to the creation of artificial benchmarks, such as SPEC-FP, for architecture research. Given the impact that these benchmarks have on architecture research, this paper seeks an understanding of how they relate to real-world applications within the Department of Energy (DOE). Since the memory system has been found to be a key issue for many applications, the paper focuses on how the SPEC-FP benchmarks and the DOE applications each use the memory system. The results indicate that while SPEC-FP is a well-balanced suite, supercomputing applications typically demand more from the memory system and must perform more "other work" (in the form of integer computations) alongside the floating-point operations. The SPEC-FP suite generally demonstrates slightly more temporal locality, leading to somewhat lower bandwidth demands. The most striking result is the cumulative difference between the benchmarks and the applications in the requirements to sustain the floating-point operation rate: the DOE applications require significantly more data from main memory (not cache) per FLOP and dramatically more integer instructions per FLOP.
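
The two per-FLOP quantities named at the end of the abstract are simple ratios, and a minimal sketch may make them concrete. The class, field names, and numbers below are hypothetical illustrations and are not taken from the paper, which does not describe its measurement tooling here.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """Hypothetical per-run event counts; field names are illustrative only."""
    flops: int               # floating-point operations executed
    int_instructions: int    # integer instructions executed
    main_memory_bytes: int   # bytes fetched from main memory (not served by cache)


def per_flop_ratios(sample: CounterSample) -> dict:
    """Return main-memory bytes per FLOP and integer instructions per FLOP."""
    if sample.flops <= 0:
        raise ValueError("flops must be positive")
    return {
        "bytes_per_flop": sample.main_memory_bytes / sample.flops,
        "int_per_flop": sample.int_instructions / sample.flops,
    }


if __name__ == "__main__":
    # Made-up counts for illustration, not results from the paper.
    sample = CounterSample(flops=1_000_000,
                           int_instructions=2_500_000,
                           main_memory_bytes=500_000)
    print(per_flop_ratios(sample))
```

Under this framing, a workload with higher values for both ratios needs more memory traffic and more integer work to sustain the same floating-point rate, which is the comparison the abstract draws between the DOE applications and SPEC-FP.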



