Achieving predictable performance through better memory controller placement in many-core CMPs
Source
International Symposium on Computer Architecture
Proceedings of the 36th annual international symposium on Computer architecture
Austin, TX, USA
SESSION: On-chip interconnection networks
Pages 451-461  
Year of Publication: 2009
ISBN:978-1-60558-526-0
Authors
Dennis Abts  Google Inc, Madison, WI, USA
Natalie D. Enright Jerger  University of Toronto, Toronto, ON, Canada
John Kim  KAIST, Daejeon, South Korea
Dan Gibson  University of Wisconsin - Madison, Madison, WI, USA
Mikko H. Lipasti  University of Wisconsin - Madison, Madison, WI, USA
Sponsors
SIGARCH: ACM Special Interest Group on Computer Architecture
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1555754.1555810

ABSTRACT

In the near term, Moore's law will continue to provide an increasing number of transistors and therefore an increasing number of on-chip cores. Limited pin bandwidth, however, prevents the integration of a large number of memory controllers on-chip. With many cores and few memory controllers, where to locate the memory controllers in the on-chip interconnection fabric becomes an important and as yet unexplored question. In this paper we show how the location of the memory controllers can reduce contention (hot spots) in the on-chip fabric and lower the variance in reference latency. This in turn provides predictable performance for memory-intensive applications regardless of the processing core on which a thread is scheduled. We explore the design space of on-chip fabrics to find optimal memory controller placement relative to different topologies (i.e., mesh and torus), routing algorithms, and workloads.
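The link between controller placement and hot spots can be illustrated with a small counting model. The sketch below is not taken from the paper: the 8x8 mesh size, the two placements (`top_bottom` and `diagonal`), the choice of 16 controllers, the uniform one-reply-per-core-per-controller traffic, and the dimension-ordered (X then Y) routing are all assumptions made for illustration. It counts how many reply flows cross each directed link and reports the busiest link as a rough proxy for contention.

```python
from collections import Counter
from itertools import product

N = 8  # assumed mesh dimension (N x N tiles), chosen only for illustration


def xy_route(src, dst):
    """Yield the directed links on a dimension-ordered path.

    Nodes are (row, col). The packet first travels along its row
    (X dimension) to the destination column, then along that column
    (Y dimension) to the destination row.
    """
    r, c = src
    dst_r, dst_c = dst
    while c != dst_c:
        step = 1 if dst_c > c else -1
        yield (r, c), (r, c + step)
        c += step
    while r != dst_r:
        step = 1 if dst_r > r else -1
        yield (r, c), (r + step, c)
        r += step


def max_link_load(flows):
    """Route every (src, dst) flow and return the load on the busiest link."""
    load = Counter()
    for src, dst in flows:
        for link in xy_route(src, dst):
            load[link] += 1
    return max(load.values())


# Every tile hosts a core (assumption).
cores = list(product(range(N), repeat=2))

# Two hypothetical placements of 16 memory controllers.
top_bottom = [(r, c) for r in (0, N - 1) for c in range(N)]
diagonal = [(i, i) for i in range(N)] + [(i, N - 1 - i) for i in range(N)]


def reply_hotspot(controllers):
    """Busiest link when each controller sends one reply to each core."""
    return max_link_load((m, core) for m in controllers for core in cores)


if __name__ == "__main__":
    print("top/bottom rows:", reply_hotspot(top_bottom))
    print("diagonal:       ", reply_hotspot(diagonal))
```

Under these assumptions the busiest link carries 128 reply flows with the controllers on the top and bottom rows, but only 56 when they are spread along the two diagonals, because the row-based placement funnels all replies through the two edge rows before they turn into a column. The model ignores queueing, request traffic, and real workloads, so it shows only why placement and routing interact, not how large the effect is in a real fabric.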


