ACM Home Page
Please provide us with feedback. Feedback
Communist, utilitarian, and capitalist cache policies on CMPs: caches as a shared resource
Full text PdfPdf (1.29 MB)
Source PACT archive
Proceedings of the 15th international conference on Parallel architectures and compilation techniques table of contents
Seattle, Washington, USA
SESSION: Multi-core design I table of contents
Pages: 13 - 22  
Year of Publication: 2006
ISBN:1-59593-264-X
Authors
Lisa R. Hsu  University of Michigan, Ann Arbor, Michigan
Steven K. Reinhardt  University of Michigan, Ann Arbor, Michigan
Ravishankar Iyer  Intel Corp, Hillsboro, Oregon
Srihari Makineni  Intel Corp, Hillsboro, Oregon
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 10,   Downloads (12 Months): 102,   Citation Count: 17
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1152154.1152161
What is a DOI?

ABSTRACT

As chip multiprocessors (CMPs) become increasingly mainstream, architects have likewise become more interested in how best to share a cache hierarchy among multiple simultaneous threads of execution. The complexity of this problem is exacerbated as the number of simultaneous threads grows from two or four to the tens or hundreds. However, there is no consensus in the architectural community on what "best" means in this context. Some papers in the literature seek to equalize each thread's performance loss due to sharing, while others emphasize maximizing overall system performance. Furthermore, the specific effect of these goals varies depending on the metric used to define "performance".In this paper we label equal performance targets as Communist cache policies and overall performance targets as Utilitarian cache policies. We compare both of these models to the most common current model of a free-for-all cache (a Capitalist policy). We consider various performance metrics, including miss rates, bandwidth usage, and IPC, including both absolute and relative values of each metric. Using analytical models and behavioral cache simulation, we find that the optimal partitioning of a shared cache can vary greatly as different but reasonable definitions of optimality are applied. We also find that, although Communist and Utilitarian targets are generally compatible, each policy has workloads for which it provides poor overall performance or poor fairness, respectively. Finally, we find that simple policies like LRU replacement and static uniform partitioning are not sufficient to provide near-optimal performance under any reasonable definition, indicating that some thread-aware cache resource allocation mechanism is required.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
D. Chiou, P. Jain, S. Devadas, and L. Rudolph. Dynamic cache partitioning via columnization. In Proceedings of Design Automation Conference, Los Angeles, June 2000.
 
3
A. Fedorova, M. Seltzer, C. Small, and D. Nussbaum. Performance of multithreaded chip multiprocessors and implications for operating system design. In Proc. 2005 USENIX Technical Conference, pages 395--398, 2005.
 
4
R. Goodwins. Does hyperthreading hurt server performance? http://news.com.com/Does+hyperthreading+hurt+server+performance/2100-1006_3-5965435.html?tag=nefd.top, Nov. 2005.
5
 
6
Intel Corp. Next leap in microprocessor architecture: Intel core duo. White paper. http://ces2006.akamai.com.edgesuite.net/yonahassets/CoreDuo_WhitePaper.pdf.
 
7
R. R. Iyer. On modeling and analyzing cache hierarchies using CASPER. In Proc. 11th Int'l Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems, pages 182--187, Oct. 2003.
8
 
9
R. Kalla, B. Sinharoy, and J. M. Tendler. Ibm power5 chip: A dual-core multithreaded processor. IEEE Micro, 24(2):40--47, Mar. 2004.
 
10
 
11
 
12
S. R. Kunkel, R. J. Eickemeyer, M. H. Lipasti, T. J. Mullins, B. O'Krafka, H. Rosenberg, S. P. VanderWiel, P. L. Vitale, and L. D. Whitley. A performance methodology for commercial servers. IBM Journal of Research and Development, 44(6):851--871, November 2000.
 
13
M5 Development Team. The M5 Simulator. http://m5.eecs.umich.edu.
14
 
15
16
 
17
 
18
G. E. Suh, L. Rudolph, and S. Devadas. Dynamic cache partitioning for simultaneous multithreading systems. In Proc. 13th IASTED Int'l Conference on Parallel and Distributed Computing Systems, 2001.
 
19
20
21

CITED BY  17

Collaborative Colleagues:
Lisa R. Hsu: colleagues
Steven K. Reinhardt: colleagues
Ravishankar Iyer: colleagues
Srihari Makineni: colleagues