Scaling the bandwidth wall: challenges in and avenues for CMP scaling
Source
International Symposium on Computer Architecture
Proceedings of the 36th Annual International Symposium on Computer Architecture
Austin, TX, USA
SESSION: Potpourri
Pages 371-382
Year of Publication: 2009
ISBN:978-1-60558-526-0
Authors
Brian M. Rogers  North Carolina State University, Raleigh, NC, USA
Anil Krishna  IBM, Research Triangle Park, NC, USA
Gordon B. Bell  IBM, Research Triangle Park, NC, USA
Ken Vu  IBM, Research Triangle Park, NC, USA
Xiaowei Jiang  North Carolina State University, Raleigh, NC, USA
Yan Solihin  North Carolina State University, Raleigh, NC, USA
Sponsors
SIGARCH: ACM Special Interest Group on Computer Architecture
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1555754.1555801
ABSTRACT

As transistor density continues to grow exponentially in accordance with Moore's law, the goal for many Chip Multi-Processor (CMP) systems is to scale the number of on-chip cores proportionally. Unfortunately, off-chip memory bandwidth is projected to grow slowly compared to the desired growth in the number of cores. As a result, each core will have a decreasing share of off-chip bandwidth with which to load its data from off-chip memory. This situation, in which off-chip bandwidth becomes a performance and throughput bottleneck, is referred to as the bandwidth wall problem.

In this study, we seek to answer two questions: (1) to what extent does the bandwidth wall problem restrict future multicore scaling, and (2) to what extent can various bandwidth conservation techniques mitigate this problem? To address them, we develop a simple but powerful analytical model that predicts the number of on-chip cores a CMP can support given limited growth in memory traffic capacity. We find that the bandwidth wall can severely limit core scaling. Starting from a balanced 8-core CMP, in four technology generations the number of cores can scale only to 24 without increasing the memory traffic requirement, as opposed to 128 cores under proportional scaling. We find that the individual bandwidth conservation techniques we evaluate have a wide-ranging impact on core scaling and that, when combined, these techniques have the potential to enable super-proportional core scaling for up to four technology generations.
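
The abstract does not give the model's equations, so the following is only a rough sketch of how such a core-count estimate might be computed, not the authors' implementation. It assumes (all of these are illustrative assumptions, not statements from the abstract) that chip area is counted in units of one core's area, that a "balanced" 8-core chip devotes 8 units to cores and 8 units to cache, that total area doubles each technology generation, and that per-core off-chip traffic follows a power law in cache per core with exponent alpha = 0.5. The function names `traffic` and `max_cores` are hypothetical.

```python
def traffic(cores, total_area, alpha=0.5):
    """Relative off-chip traffic of a chip with `cores` cores.

    Assumption: each core occupies one area unit, the remaining area is
    cache split evenly among cores, and per-core traffic scales as
    (cache per core) ** -alpha.
    """
    cache_per_core = (total_area - cores) / cores
    return cores * cache_per_core ** (-alpha)


def max_cores(total_area, traffic_budget, alpha=0.5):
    """Largest core count whose traffic stays within `traffic_budget`."""
    best = 0
    # Leave at least one area unit for cache so cache_per_core stays positive.
    for cores in range(1, total_area):
        if traffic(cores, total_area, alpha) <= traffic_budget:
            best = cores
    return best


# Assumed baseline: 8 cores plus 8 units of cache on a 16-unit chip.
baseline_cores = 8
baseline_area = 16
budget = traffic(baseline_cores, baseline_area)

generations = 4
scaled_area = baseline_area * 2 ** generations       # area doubles per generation
proportional = baseline_cores * 2 ** generations     # cores double per generation
constrained = max_cores(scaled_area, budget)         # traffic held at baseline

print(f"proportional scaling: {proportional} cores")
print(f"constant-traffic scaling: {constrained} cores")
```

Under these assumed parameters the sketch yields 128 cores for proportional scaling and 24 cores when traffic is held at the baseline level, which matches the figures quoted in the abstract. A different exponent or area split would give different numbers.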

