A light-weight fairness mechanism for chip multiprocessor memory systems
Source
Conference On Computing Frontiers
Proceedings of the 6th ACM conference on Computing frontiers
Ischia, Italy
SESSION: Innovative memory systems
Pages 1-10
Year of Publication: 2009
ISBN: 978-1-60558-413-3
Authors
Magnus Jahre, Norwegian University of Science and Technology, Trondheim, Norway
Lasse Natvig, Norwegian University of Science and Technology, Trondheim, Norway
ABSTRACT
Chip Multiprocessor (CMP) memory systems suffer from destructive thread interference. This interference reduces performance predictability because its severity depends heavily on the memory access patterns and intensities of the co-scheduled threads. In this work, we confirm that all shared units must be thread-aware in order to provide memory system fairness. However, current proposals for fair memory systems are complex, as they require an interference measurement mechanism and a fairness enforcement policy for every hardware-controlled shared unit. Furthermore, they often sacrifice system throughput to reach their fairness goals, which is not desirable in all systems. We show that our novel fairness mechanism, the Dynamic Miss Handling Architecture (DMHA), reduces implementation complexity by using a single fairness enforcement policy for the complete hardware-managed shared memory system. Specifically, it controls the total miss bandwidth available to each thread by dynamically adjusting the number of Miss Status Holding Registers (MSHRs) available in each private data cache. When fairness is chosen as the metric of interest, DMHA improves fairness by 26% on average, with the single program baseline, compared to a state-of-the-art fairness-aware memory system. With a different configuration, DMHA improves throughput by 13% on average compared to a conventional memory system.
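The idea of throttling a thread's miss bandwidth through its MSHR count can be illustrated with a small Python sketch. This is a toy model, not the DMHA design from the paper: the class `MSHRFile`, the function `adjust_limits`, the halving/doubling rule, and the `threshold` parameter are all hypothetical, and the per-thread slowdown estimates are simply passed in, because the abstract does not describe how interference is measured.

```python
from dataclasses import dataclass, field


@dataclass
class MSHRFile:
    """Toy model of the MSHRs of one private data cache (hypothetical)."""
    physical: int                       # registers physically present
    limit: int = 0                      # registers the policy currently allows
    outstanding: set = field(default_factory=set)

    def __post_init__(self):
        if self.limit == 0:
            self.limit = self.physical  # start with every register usable

    def try_allocate(self, block_addr: int) -> bool:
        """Return True if the miss can proceed, False if the cache must block."""
        if block_addr in self.outstanding:
            return True                 # merge with a miss already in flight
        if len(self.outstanding) >= self.limit:
            return False                # no usable MSHR left: miss is stalled
        self.outstanding.add(block_addr)
        return True

    def complete(self, block_addr: int) -> None:
        """Free the register when the miss returns from the memory system."""
        self.outstanding.discard(block_addr)

    def set_limit(self, new_limit: int) -> None:
        """Clamp the usable register count to the range [1, physical]."""
        self.limit = max(1, min(self.physical, new_limit))


def adjust_limits(files, slowdowns, threshold=1.5):
    """Assumed policy: if slowdowns are unbalanced, halve the MSHR limit of
    the least-slowed thread (treated as the interferer); otherwise relax all
    limits by doubling them. Slowdowns must be positive numbers."""
    if max(slowdowns) / min(slowdowns) > threshold:
        least = min(range(len(files)), key=lambda i: slowdowns[i])
        files[least].set_limit(files[least].limit // 2)
    else:
        for f in files:
            f.set_limit(f.limit * 2)
```

Lowering `limit` caps how many misses a cache can have in flight at once, which in turn bounds the pressure that thread puts on the shared memory system. A single knob per private cache is the property the abstract emphasizes, in contrast to a separate enforcement policy in each shared unit.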