Better exploration of region-level value locality with integrated computation reuse and value prediction
Source
International Symposium on Computer Architecture
Proceedings of the 28th annual international symposium on Computer architecture
Göteborg, Sweden
Pages: 98 - 108
Year of Publication: 2001
ISBN:0-7695-1162-7
Authors
Youfeng Wu, Microprocessor Research Labs (MRL), Intel Corporation, Santa Clara, CA
Dong-Yuan Chen, Microprocessor Research Labs (MRL), Intel Corporation, Santa Clara, CA
Jesse Fang, Microprocessor Research Labs (MRL), Intel Corporation, Santa Clara, CA
ABSTRACT
Computation reuse and value prediction are two recent techniques for improving microprocessor performance by exploiting value locality. Both aim to break the data-dependence limit of traditional processors. In this paper, we propose a speculative multithreading scheme in which the same hardware can be used efficiently for both computation reuse and value prediction. For the SpecInt95 benchmarks, our experiments show that the integrated approach significantly outperforms either computation reuse or value prediction alone. For example, it raises the speedup from 1.25 with computation reuse alone, and from 1.28 with value prediction alone, to 1.40. In particular, the integrated approach outperforms a computation reuse configuration that has twice as many reuse buffer entries (a speedup of 1.40 versus 1.33). Furthermore, unlike the computation reuse approach, the integrated approach does not rely on value profiles during region formation for its performance, and is therefore more suitable for production systems.
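To make the two techniques named in the abstract concrete, the following is a minimal software sketch of one buffer serving both purposes. It is not the hardware scheme proposed in the paper. It assumes a region is a pure function of its live-in values, that reuse means an exact match on those values, and that prediction is a simple last-value guess of the region's output. The names `RegionBuffer`, `clamp_sum`, and `demo` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class RegionBuffer:
    """Toy buffer shared by computation reuse and value prediction (hypothetical)."""
    # Maps a region's live-in values to the live-out value it produced.
    entries: Dict[Tuple[int, ...], int] = field(default_factory=dict)
    # Most recent output, used as a last-value prediction (an assumption).
    last_output: Optional[int] = None
    stats: Dict[str, int] = field(
        default_factory=lambda: {
            "reused": 0, "predicted_ok": 0, "mispredicted": 0, "executed": 0
        }
    )

    def run(self, region: Callable[..., int], inputs: Tuple[int, ...]) -> int:
        key = tuple(inputs)
        # Computation reuse: identical live-ins, so the region can be skipped.
        if key in self.entries:
            self.stats["reused"] += 1
            return self.entries[key]
        # Reuse failed: fall back to predicting the output, then execute the
        # region to verify the guess. In hardware, dependent work could run
        # speculatively on the predicted value while verification proceeds.
        predicted = self.last_output
        actual = region(*key)
        if predicted is None:
            self.stats["executed"] += 1      # nothing to predict from yet
        elif predicted == actual:
            self.stats["predicted_ok"] += 1  # speculation would have been kept
        else:
            self.stats["mispredicted"] += 1  # speculation would be squashed
        self.entries[key] = actual
        self.last_output = actual
        return actual


def clamp_sum(a: int, b: int) -> int:
    """Hypothetical region whose output saturates, so outputs repeat often."""
    return min(a + b, 10)


def demo():
    buf = RegionBuffer()
    calls = [(1, 2), (1, 2), (8, 5), (9, 9), (8, 5)]
    results = [buf.run(clamp_sum, args) for args in calls]
    return results, buf.stats
```

In this sketch the call with inputs `(9, 9)` misses in the reuse table because its inputs are new, yet its output matches the previous one, so the prediction path succeeds where reuse alone would not. That is the kind of complementary coverage the abstract attributes to the integrated approach, shown here only under the simplifying assumptions stated above.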
CITED BY 3
Tomoaki Tsumura, Ikuma Suzuki, Yasuki Ikeuchi, Hiroshi Matsuo, Hiroshi Nakashima, Yasuhiko Nakashima, Design and evaluation of an auto-memoization processor, Proceedings of the 25th IASTED International Multi-Conference: parallel and distributed computing and networks, p.245-250, February 13-15, 2007, Innsbruck, Austria