Detailed cache simulation for detecting bottleneck, miss reason and optimization potentialities
ACM International Conference Proceeding Series; Vol. 180
Proceedings of the 1st international conference on Performance evaluation methodologies and tools
Pisa, Italy
SESSION: Work in progress session: tools
Article No. 62
Year of Publication: 2006
ISBN: 1-59593-504-5
Authors
Jie Tao, Universität Karlsruhe (TH), Karlsruhe, Germany
Wolfgang Karl, Universität Karlsruhe (TH), Karlsruhe, Germany
ABSTRACT
Cache locality optimization is an effective way to reduce the time modern processors spend idle while waiting for data. It can be applied at the code level, by programmers or compilers, or at the system level through schemes such as reconfigurable cache organizations and suitable prefetching or replacement strategies. The former requires users to know the problem, its cause, and the solution; the latter requires a platform for evaluating proposed and novel approaches.
Because existing simulation systems provide neither this information nor such a platform, we implemented a cache simulator that models the complete cache hierarchy and its associated techniques. More specifically, it analyzes the characteristics of cache misses and reports on runtime accesses to data structures and on cache access behavior. Combined with a visualization tool, this information enables the user to detect access hotspots and to identify optimization strategies for tackling them. To support the study of different cache configuration and management techniques, the simulator models a variety of cache line replacement and prefetching policies and lets the user specify the cache organization, including cache size, cache set size, block size, and associativity. The simulator thus forms a research platform for investigating how these techniques influence the execution behavior of applications.
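To make the configurable parameters named in the abstract concrete (cache size, block size, associativity, replacement policy), the following is a minimal sketch of a single-level set-associative cache with LRU replacement. It is not the authors' simulator: the class name `SetAssociativeCache`, its method names, and the two-way split into "cold" misses (first touch of a block) and "other" misses (capacity or conflict) are assumptions made for illustration only. The abstract does not say how the simulator classifies misses.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy single-level cache with LRU replacement (hypothetical, illustrative only)."""

    def __init__(self, cache_size, block_size, associativity):
        if cache_size % (block_size * associativity) != 0:
            raise ValueError("cache_size must be a multiple of block_size * associativity")
        self.block_size = block_size
        self.associativity = associativity
        self.num_sets = cache_size // (block_size * associativity)
        # One OrderedDict per set: keys are tags, ordered from least to most recently used.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.seen_blocks = set()  # every block number touched at least once
        self.stats = {"hit": 0, "cold_miss": 0, "other_miss": 0}

    def access(self, address):
        """Simulate one memory access and return 'hit', 'cold_miss' or 'other_miss'."""
        block = address // self.block_size
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]

        if tag in lines:
            lines.move_to_end(tag)  # mark as most recently used
            self.stats["hit"] += 1
            return "hit"

        # Assumed classification: first touch is a cold miss, anything else
        # is lumped together as a capacity/conflict miss.
        kind = "other_miss" if block in self.seen_blocks else "cold_miss"
        self.seen_blocks.add(block)
        if len(lines) >= self.associativity:
            lines.popitem(last=False)  # evict the least recently used line
        lines[tag] = True
        self.stats[kind] += 1
        return kind


if __name__ == "__main__":
    # 256-byte cache, 16-byte blocks, 2-way associative: 8 sets.
    cache = SetAssociativeCache(cache_size=256, block_size=16, associativity=2)
    for addr in (0, 4, 128, 256, 0):  # addresses 0, 128 and 256 all map to set 0
        print(addr, cache.access(addr))
    print(cache.stats)
```

In this trace the final access to address 0 misses because the two later blocks mapping to the same set evicted it. Access patterns of this kind are what a miss-reason analysis is meant to expose. Swapping the eviction line for a different policy, or adding a prefetch step after a miss, would be the natural extension points for the policy studies the abstract describes.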