Architectural and compiler support for energy reduction in the memory hierarchy of high performance microprocessors
Source
International Symposium on Low Power Electronics and Design
Proceedings of the 1998 international symposium on Low power electronics and design
Monterey, California, United States
Pages: 70 - 75
Year of Publication: 1998
ISBN: 1-58113-059-7
Authors
Nikolaos Bellas, Ibrahim Hajj
Department of Electrical & Computer Engineering and the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, 1308 West Main Street, Urbana, IL
George Stamoulis
Department of Electrical & Computer Engineering and the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, 1308 West Main Street, Urbana, IL and Intel Corporation, Santa Clara, CA
C. Polychronopoulos
Department of Electrical & Computer Engineering and the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, 1308 West Main Street, Urbana, IL
Bibliometrics
Downloads (6 Weeks): 0, Downloads (12 Months): 8, Citation Count: 23
ABSTRACT
In this paper we propose a technique that places an additional mini cache between the I-Cache and the CPU core to buffer instructions that are nested within loops and would otherwise be fetched repeatedly from the I-Cache. This mechanism is combined with compiler-driven code modifications that greatly simplify the required hardware, eliminate unnecessary instruction fetching, and consequently reduce signal switching activity and dissipated energy.
We show that the additional cache, dubbed the L-Cache, is much smaller and simpler than the I-Cache when the compiler assumes the role of allocating instructions to it. Through simulation, we show that, for the SPECfp95 benchmarks, the I-Cache remains disabled most of the time and the “cheaper” extra cache is used instead. We present experimental results that validate the effectiveness of this technique and report the energy gains for most of the SPEC95 benchmarks.
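The idea in the abstract can be illustrated with a toy fetch counter. This is only a sketch under stated assumptions, not the authors' simulator: the trace, the addresses, the names `simulate` and `FetchCounts`, and the fill policy (an instruction marked by a hypothetical compiler pass is copied into the L-Cache the first time it is fetched from the I-Cache) are all invented for illustration. It counts accesses only and makes no claim about energy figures.

```python
from dataclasses import dataclass


@dataclass
class FetchCounts:
    icache: int = 0  # fetches served by the I-Cache
    lcache: int = 0  # fetches served by the small L-Cache


def simulate(trace, lcache_candidates):
    """Count which structure serves each instruction fetch.

    trace: instruction addresses in execution order.
    lcache_candidates: addresses a (hypothetical) compiler pass marked
        for placement in the L-Cache, e.g. loop bodies.
    Assumption: a marked instruction is fetched from the I-Cache once,
    fills the L-Cache, and is served from the L-Cache afterwards.
    Capacity limits and evictions are ignored.
    """
    counts = FetchCounts()
    filled = set()
    for addr in trace:
        if addr in filled:
            counts.lcache += 1
        else:
            counts.icache += 1
            if addr in lcache_candidates:
                filled.add(addr)
    return counts


# Toy program: 4 straight-line instructions, an 8-instruction loop body
# executed 100 times, then 2 more straight-line instructions.
prologue = list(range(0, 4))
loop_body = list(range(4, 12))
epilogue = list(range(12, 14))
trace = prologue + loop_body * 100 + epilogue

counts = simulate(trace, set(loop_body))
total = counts.icache + counts.lcache
print(f"I-Cache fetches: {counts.icache}, L-Cache fetches: {counts.lcache}")
print(f"Fraction served by L-Cache: {counts.lcache / total:.3f}")
```

In this toy trace nearly all fetches come from the loop body, so the small structure serves most of them. Real programs, and the benchmarks named in the abstract, will behave differently depending on how much execution time is spent in loops the compiler can place.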
CITED BY 23
Nikolaos Bellas , Ibrahim Hajj , Constantine Polychronopoulos, Using dynamic cache management techniques to reduce energy in a high-performance processor, Proceedings of the 1999 international symposium on Low power electronics and design, p.64-69, August 16-17, 1999, San Diego, California, United States
Koji Inoue , Tohru Ishihara , Kazuaki Murakami, Way-predicting set-associative cache for high performance and low energy consumption, Proceedings of the 1999 international symposium on Low power electronics and design, p.273-275, August 16-17, 1999, San Diego, California, United States
Murali Jayapala , Francisco Barat , Tom Vander Aa , Francky Catthoor , Henk Corporaal , Geert Deconinck, Clustered Loop Buffer Organization for Low Energy VLIW Embedded Processors, IEEE Transactions on Computers, v.54 n.6, p.672-683, June 2005
Tom Vander Aa , Murali Jayapala , Francisco Barat , Geert Deconinck , Rudy Lauwereins , Francky Catthoor , Henk Corporaal, Instruction buffering exploration for low energy VLIWs with instruction clusters, Proceedings of the 2004 conference on Asia South Pacific design automation: electronic design and solution fair, p.824-829, January 27-30, 2004, Yokohama, Japan
Tom Vander Aa , Murali Jayapala , Francisco Barat , Geert Deconinck , Rudy Lauwereins , Henk Corporaal , Francky Catthoor, Instruction buffering exploration for low energy embedded processors, Journal of Embedded Computing, v.1 n.3, p.341-351, August 2005