Detailed cache simulation for detecting bottleneck, miss reason and optimization potentialities
Source: ACM International Conference Proceeding Series; Vol. 180
Proceedings of the 1st international conference on Performance evaluation methodologies and tools
Pisa, Italy
SESSION: Work in progress session: tools
Article No. 62  
Year of Publication: 2006
ISBN:1-59593-504-5
Authors
Jie Tao  Universität Karlsruhe (TH), Karlsruhe, Germany
Wolfgang Karl  Universität Karlsruhe (TH), Karlsruhe, Germany
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1190095.1190174

ABSTRACT

Cache locality optimization is an effective way to reduce the time modern processors spend idle while waiting for data. It can be achieved either by programmers or compilers through code-level optimization, or at the system level through appropriate schemes such as reconfigurable cache organization and suitable prefetching or replacement strategies. For the former, users need to know the problem, its cause, and the solution; for the latter, a platform is required for evaluating proposed and novel approaches. As existing simulation systems do not provide such information and platforms, we implemented a cache simulator that models the complete cache hierarchy and associated techniques. More specifically, it analyzes the characteristics of cache misses and provides information about runtime accesses to data structures and about cache access behavior. Together with a visualization tool, this information enables the user to detect access hotspots and to identify optimization strategies for tackling them. To support the study of different cache configuration and management techniques, the simulator models a variety of cache line replacement and prefetching policies and allows the user to specify any cache organization, including cache size, cache set size, block size, and associativity. The simulator hence forms a research platform for investigating the influence of these techniques on the execution behavior of applications.
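
To make the configuration parameters named in the abstract concrete, the following is a minimal sketch of a single-level set-associative cache model. It is not the authors' simulator: the class name, the method names, the choice of LRU as the only replacement policy, and the coarse split of misses into "cold" and "other" are all assumptions made for illustration, and prefetching and multi-level hierarchies are left out.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy single-level cache with LRU replacement (hypothetical names)."""

    def __init__(self, cache_size, block_size, associativity):
        # Sizes are in bytes; the number of sets follows from the other three.
        assert cache_size % (block_size * associativity) == 0
        self.block_size = block_size
        self.associativity = associativity
        self.num_sets = cache_size // (block_size * associativity)
        # One OrderedDict per set, ordered from least to most recently used.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.seen_blocks = set()  # every block that has been referenced so far
        self.stats = {"hit": 0, "cold_miss": 0, "other_miss": 0}

    def access(self, address):
        """Simulate one memory reference and return its outcome as a string."""
        block = address // self.block_size
        lines = self.sets[block % self.num_sets]
        if block in lines:
            lines.move_to_end(block)  # mark as most recently used
            self.stats["hit"] += 1
            return "hit"
        # Assumed classification: first reference is a cold miss,
        # anything else (capacity or conflict) is lumped together.
        kind = "other_miss" if block in self.seen_blocks else "cold_miss"
        self.seen_blocks.add(block)
        if len(lines) >= self.associativity:
            lines.popitem(last=False)  # evict the least recently used block
        lines[block] = True
        self.stats[kind] += 1
        return kind


if __name__ == "__main__":
    # 256-byte cache, 32-byte blocks, 2-way associative: 4 sets.
    cache = SetAssociativeCache(cache_size=256, block_size=32, associativity=2)
    for addr in (0, 0, 128, 256, 0):  # all four distinct references map to set 0
        print(hex(addr), cache.access(addr))
    print(cache.stats)
```

In this trace, addresses 0, 128, and 256 all map to the same set, so the third distinct block evicts the first, and the final reference to address 0 misses again even though most of the cache is empty. Per-set or per-data-structure counts of such misses are the kind of information a user would need in order to locate access hotspots.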

