ABSTRACT
As the computer memory hierarchy becomes adaptive, its performance increasingly depends on forecasting dynamic program locality. This paper presents a method that predicts the locality phases of a program by combining locality profiling with run-time prediction. By profiling a training input, it identifies locality phases by sifting through all accesses to all data elements using variable-distance sampling, wavelet filtering, and optimal phase partitioning. It then constructs a phase hierarchy through grammar compression. Finally, it inserts phase markers into the program using binary rewriting. When the instrumented program runs, it uses the first few executions of a phase to predict all its later executions. Compared with existing methods based on program code and execution intervals, locality phase prediction is unique because it uses locality profiles and marks phase boundaries in program code. The second half of the paper presents a comprehensive evaluation. It measures the accuracy and coverage of the new technique and compares it with the best known run-time methods. It measures the technique's benefit in adaptive cache resizing and memory remapping. Finally, it compares the automatic analysis with manual phase marking. The results show that locality phase prediction is well suited for identifying large, recurring phases in complex programs.
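The run-time half of the method, using the first few executions of a phase to predict its later ones, can be illustrated with a minimal sketch. This is not the paper's implementation: the class `PhasePredictor`, the function `run`, the choice of two training executions, and the use of a simple average over one numeric metric per phase are all assumptions made for illustration. A phase marker is modeled as a phase ID that arrives together with a measured value.

```python
from collections import defaultdict


class PhasePredictor:
    """Hypothetical predictor: learn each phase from its first few executions."""

    def __init__(self, training_runs=2):
        # Number of initial executions observed before predicting (assumed value).
        self.training_runs = training_runs
        self.history = defaultdict(list)

    def predict(self, phase_id):
        """Return the predicted metric for a phase, or None while still learning."""
        seen = self.history[phase_id]
        if len(seen) < self.training_runs:
            return None
        return sum(seen) / len(seen)

    def record(self, phase_id, measured):
        """Keep a measurement only during the first few executions of a phase."""
        seen = self.history[phase_id]
        if len(seen) < self.training_runs:
            seen.append(measured)


def run(trace, training_runs=2):
    """Replay (phase_id, measured_metric) pairs and collect the predictions."""
    predictor = PhasePredictor(training_runs)
    predictions = []
    for phase_id, measured in trace:
        # Predict at the phase marker, before the phase body executes.
        predictions.append(predictor.predict(phase_id))
        predictor.record(phase_id, measured)
    return predictions
```

With a trace such as `[("A", 100), ("B", 10), ("A", 110), ("B", 12), ("A", 90)]`, the sketch returns no prediction for the first two executions of each phase and then predicts later executions of phase `A` from the average of its first two measurements.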
REVIEW
"Peter C. Patton : Reviewer"
The principles of task and data locality are very important in computer architecture, especially in the application of cache memory to bridge the speed mismatch between a central processing unit (CPU) and its primary memory. This paper goes well b