ABSTRACT
The large and growing impact of memory hierarchies on overall system performance compels designers to investigate innovative techniques to improve memory-system efficiency. We propose and analyze a memory hierarchy that increases both the effective capacity of memory structures and the effective bandwidth of interconnects by storing and transmitting data in compressed form.

Caches play a key role in hiding memory latencies. However, cache sizes are constrained by die area and cost. A cache's effective size can be increased by storing compressed data, if the storage unused by a compressed block can be allocated to other blocks. We use a modified Indirect Index Cache to allocate variable amounts of storage to different blocks, depending on their compressibility.

By coupling our compressed cache design with a similarly compressed main memory, we can easily transfer data between these structures in a compressed state, increasing the effective memory bus bandwidth. This optimization further improves performance when bus bandwidth is critical.

Our simulation results, using the SPEC CPU2000 benchmarks, show that our design increases performance by up to 225% on some benchmarks while degrading performance in general by no more than 2%, other than a 12% decrease on a single benchmark. Compressed bus transfers alone account for up to 80% of this improvement, with the remainder coming from increased effective cache capacity. As memory latencies increase, our design becomes even more beneficial.
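
The two effects described in the abstract (more blocks per unit of storage, fewer bus cycles per block) can be illustrated with a small sketch. This is not the paper's design or its Indirect Index Cache: the 64-byte block size, the 16-byte allocation segment, the 8-byte bus width, and the function names are all assumptions chosen only to make the arithmetic concrete.

```python
import math

BLOCK_BYTES = 64    # assumed uncompressed block size (hypothetical)
SEGMENT_BYTES = 16  # assumed allocation granularity (hypothetical)
BUS_BYTES = 8       # assumed bytes moved per bus cycle (hypothetical)


def segments_needed(compressed_bytes):
    """Segments one block occupies; capped at the uncompressed block size."""
    size = min(compressed_bytes, BLOCK_BYTES)
    return max(1, math.ceil(size / SEGMENT_BYTES))


def blocks_that_fit(compressed_sizes, data_bytes):
    """Count blocks stored, in order, before the data array runs out of segments."""
    free = data_bytes // SEGMENT_BYTES
    stored = 0
    for size in compressed_sizes:
        need = segments_needed(size)
        if need > free:
            break
        free -= need
        stored += 1
    return stored


def bus_cycles(compressed_bytes):
    """Bus cycles needed to transfer one block in its compressed form."""
    size = min(compressed_bytes, BLOCK_BYTES)
    return max(1, math.ceil(size / BUS_BYTES))


if __name__ == "__main__":
    data_bytes = 4 * BLOCK_BYTES  # room for four uncompressed blocks
    incompressible = [64, 64, 64, 64, 64]
    mixed = [20, 64, 10, 33, 16, 16, 64]
    print(blocks_that_fit(incompressible, data_bytes))
    print(blocks_that_fit(mixed, data_bytes))
    print(bus_cycles(64), bus_cycles(20))
```

Under these assumed parameters, storage left unused by well-compressed blocks is available to other blocks, and a compressed block crosses the bus in fewer cycles than an uncompressed one. The actual gains reported above depend on the paper's compression scheme and simulated workloads, which this sketch does not model.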