An adaptive resource partitioning algorithm for SMT processors
Source
Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques (PACT)
Toronto, Ontario, Canada
Session: Multithreading improvements
Pages 230-239
Year of Publication: 2008
ISBN: 978-1-60558-282-5
Authors
Huaping Wang  University of Massachusetts, Amherst, MA, USA
Israel Koren  University of Massachusetts, Amherst, MA, USA
C. Mani Krishna  University of Massachusetts, Amherst, MA, USA
Sponsors
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1454115.1454148

ABSTRACT

Simultaneous Multithreading (SMT) increases processor throughput by allowing the parallel execution of several threads. However, fully sharing processor resources may cause resource monopolization by a single thread or other misallocations, resulting in overall performance degradation. Static resource partitioning techniques have been suggested, but are not as effective as dynamically controlling the resource usage of each thread since program behavior does change during its execution.

In this paper, we propose an Adaptive Resource Partitioning Algorithm (ARPA) that dynamically assigns resources to threads according to thread behavior changes. ARPA analyzes the resource usage efficiency of each thread in a time period and assigns more resources to threads which can use them in a more efficient way. The purpose of ARPA is to improve the efficiency of resource utilization, thereby improving overall instruction throughput. Our simulation results on a set of 42 multiprogramming workloads show that ARPA outperforms the traditional fetch policy ICOUNT by 55.8% with regard to overall instruction throughput and achieves a 33.8% improvement over Static Partitioning. It also outperforms the current best dynamic resource allocation technique, Hill-climbing, by 5.7%. Considering fairness accorded to each thread, ARPA attains 43.6%, 18.5% and 9.2% improvements over ICOUNT, Static Partitioning and Hill-climbing, respectively, using a common fairness metric.
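
The abstract describes ARPA only at a high level: measure how efficiently each thread uses its resources over a time period, then shift resources toward the more efficient threads. The following Python sketch illustrates that general idea and is not the authors' algorithm. The efficiency metric (committed instructions per allocated entry), the `ThreadStats` structure, and the `step` and `floor` parameters are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class ThreadStats:
    """Per-thread counters for one epoch (hypothetical structure)."""
    committed: int   # instructions committed during the epoch
    allocated: int   # resource entries currently assigned to the thread


def efficiency(stats: ThreadStats) -> float:
    """Committed instructions per allocated entry (assumed metric)."""
    return stats.committed / stats.allocated if stats.allocated else 0.0


def repartition(threads: list[ThreadStats], step: int = 4, floor: int = 8) -> list[int]:
    """Move up to `step` entries from the least to the most efficient thread.

    Returns the new allocation per thread; the total stays constant.
    `step` and `floor` are illustrative values, not taken from the paper.
    """
    shares = [t.allocated for t in threads]
    if len(threads) < 2:
        return shares
    # Rank thread indices from least to most efficient in this epoch.
    ranked = sorted(range(len(threads)), key=lambda i: efficiency(threads[i]))
    loser, winner = ranked[0], ranked[-1]
    if efficiency(threads[loser]) == efficiency(threads[winner]):
        return shares  # no thread is clearly more efficient; keep the partition
    # Never push a thread below the floor, so it cannot be starved.
    moved = max(0, min(step, shares[loser] - floor))
    shares[loser] -= moved
    shares[winner] += moved
    return shares
```

For example, with two threads that each hold 32 entries and commit 900 and 200 instructions in an epoch, `repartition` returns `[36, 28]`: four entries move to the thread that commits more instructions per entry. Moving a small, bounded amount each period keeps the partition responsive to behavior changes without letting one thread monopolize the resources.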


