A component model of spatial locality
Source
International Symposium on Memory Management
Proceedings of the 2009 international symposium on Memory management
Dublin, Ireland
SESSION: Paper session 4
Pages 99-108
Year of Publication: 2009
ISBN: 978-1-60558-347-1
Authors
Xiaoming Gu, Intel China Research Center, Beijing, China
Ian Christopher, University of Rochester, Rochester, NY, USA
Tongxin Bai, University of Rochester, Rochester, NY, USA
Chengliang Zhang, Microsoft Corporation, Redmond, WA, USA
Chen Ding, University of Rochester, Rochester, NY, USA
ABSTRACT
Good spatial locality alleviates both the latency and the bandwidth problems of memory by boosting the effect of prefetching and improving cache utilization. However, conventional definitions of spatial locality are inadequate for a programmer to precisely quantify the quality of a program, to identify the causes of poor locality, and to estimate how much spatial locality can be improved. This paper describes a new, component-based model of spatial locality. The model is based on measuring the change of reuse distances as a function of the data-block size, and it divides spatial locality into components at the program and behavior levels. While the base model is costly because it requires tracking the locality of every memory access, the overhead can be reduced by using small inputs and by extending a sampling-based tool. The paper presents the results of the analysis for a large set of benchmarks, the cost of the analysis, and the experience of a user study in which the analysis helped to locate a data-layout problem and improve performance by 7% with a 6-line change in an application of over 2,000 lines.
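To make the central measurement concrete, the sketch below computes reuse distances for one address trace at several data-block sizes. It is only an illustration of the quantity the abstract names, not the paper's model or tool: the function names `reuse_distances` and `mean_finite`, the toy trace, and the choice of block sizes are all assumptions made for this example, and the quadratic-time stack search is chosen for readability rather than efficiency.

```python
from collections import OrderedDict


def reuse_distances(trace, block_size):
    """Return the reuse distance of each access at the given block size.

    The reuse distance of an access is the number of distinct blocks touched
    since the previous access to the same block. First accesses get None.
    (Hypothetical helper for illustration, not the paper's implementation.)
    """
    stack = OrderedDict()  # blocks ordered from least to most recently used
    distances = []
    for addr in trace:
        block = addr // block_size
        if block in stack:
            keys = list(stack)
            # Blocks after `block` in the order were used more recently.
            distances.append(len(keys) - 1 - keys.index(block))
            stack.move_to_end(block)
        else:
            distances.append(None)
            stack[block] = True
    return distances


def mean_finite(distances):
    """Average reuse distance over accesses that are not first touches."""
    finite = [d for d in distances if d is not None]
    return sum(finite) / len(finite) if finite else None


# Assumed toy trace: two sequential sweeps over 16 consecutive 8-byte elements.
trace = [8 * i for i in range(16)] * 2

for size in (8, 16, 32, 64):
    d = reuse_distances(trace, size)
    print(f"block={size:3d}B  first touches={d.count(None):2d}  "
          f"mean reuse distance={mean_finite(d):.2f}")
```

For this sequential trace, each doubling of the block size halves the number of first touches and shortens the reuse distances, which is the kind of change across block sizes that the abstract describes measuring. A trace with poor spatial locality, such as a large-stride access pattern, would show little or no such improvement.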