|
ABSTRACT
Improving program locality has become increasingly important on modern computer systems. An effective strategy is to group computations on the same data so that once the data are loaded into cache, the program performs all their operations before the data are evicted. However, computation regrouping is difficult to automate for programs with complex data and control structures. This paper studies the potential of locality improvement through trace-driven computation regrouping. First, it shows that maximizing the locality is different from maximizing the parallelism or maximizing the cache utilization. The problem is NP-hard even without considering data dependences and cache organization. Then the paper describes a tool that performs constrained computation regrouping on program traces. The new tool is unique because it measures the exact control dependences and applies complete memory renaming and re-allocation. Using the tool, the paper measures the potential locality improvement in a set of commonly used benchmark programs written in C.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
J. R. Allen , Ken Kennedy , Carrie Porterfield , Joe Warren, Conversion of control dependence to data dependence, Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages, p.177-189, January 24-26, 1983, Austin, Texas
[doi> 10.1145/567067.567085]
|
| |
2
|
|
 |
3
|
|
| |
4
|
[4] L. A. Belady. A study of replacement algorithms for a virtual-storage computer. IBM Systems Journal, 5(2):78-101, 1966.
|
 |
5
|
Doug Burger , James R. Goodman , Alain Kägi, Memory bandwidth limitations of future microprocessors, Proceedings of the 23rd annual international symposium on Computer architecture, p.78-89, May 22-24, 1996, Philadelphia, Pennsylvania, United States
|
| |
6
|
|
| |
7
|
|
| |
8
|
|
 |
9
|
|
 |
10
|
|
 |
11
|
|
| |
12
|
|
 |
13
|
|
 |
14
|
|
 |
15
|
|
 |
16
|
|
| |
17
|
|
| |
18
|
|
| |
19
|
[19] H. Han and C. W. Tseng. Locality optimizations for adaptive irregular scientific codes. Technical report, Department of Computer Science, University of Maryland, College Park, 2000.
|
| |
20
|
|
 |
21
|
|
 |
22
|
|
| |
23
|
[23] K. Kennedy and K. S. McKinley. Typed fusion with applications to parallel and sequential code generation. Technical Report TR93-208, Dept. of Computer Science, Rice University, Aug. 1993. (also available as CRPC-TR94370).
|
 |
24
|
Induprakas Kodukula , Nawaaz Ahmed , Keshav Pingali, Data-centric multi-level blocking, Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation, p.346-357, June 16-18, 1997, Las Vegas, Nevada, United States
|
 |
25
|
|
 |
26
|
|
| |
27
|
[27] R. L. Mattson, J. Gecsei, D. Slutz, and I. L. Traiger. Evaluation techniques for storage hierarchies. IBM System Journal, 9(2):78-117, 1970.
|
 |
28
|
|
 |
29
|
|
| |
30
|
|
| |
31
|
[31] A. Nicolau and J. A. Fisher. Measuring the parallelism available for very long instruction word architecture. IEEE Transactions on Computers, 33(11), 1984.
|
| |
32
|
|
| |
33
|
|
| |
34
|
[34] X. Shen, Y. Zhong, and C. Ding. Regression-based multi-model prediction of data reuse signature. In Proceedings of the 4th Annual Symposium of the Las Alamos Computer Science Institute, Sante Fe, New Mexico, November 2003.
|
 |
35
|
|
 |
36
|
|
 |
37
|
|
| |
38
|
[38] K. B. Theobald, G. R. Gao, and L. J. Hendren. On the limits of program parallelism and its smoothability. Technical Report ACAPS Memo 40, School of Computer Science, McGill University, June 1992.
|
 |
39
|
|
| |
40
|
|
 |
41
|
Qing Yi , Vikram Adve , Ken Kennedy, Transforming loops to recursion for multi-level memory hierarchies, Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation, p.169-181, June 18-21, 2000, Vancouver, British Columbia, Canada
|
 |
42
|
|
| |
43
|
[43] Y. Zhong, C. Ding, and K. Kennedy. Reuse distance analysis for scientific programs. In Proceedings of Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers, Washington DC, March 2002.
|
| |
44
|
|
|