Dynamic performance tuning for speculative threads
Source
International Symposium on Computer Architecture
Proceedings of the 36th annual international symposium on Computer architecture
Austin, TX, USA
SESSION: Speculative threading and parallelization
Pages 462-473  
Year of Publication: 2009
ISBN: 978-1-60558-526-0
Authors
Yangchun Luo  University of Minnesota - Twin Cities, Minneapolis, MN, USA
Venkatesan Packirisamy  University of Minnesota - Twin Cities, Minneapolis, MN, USA
Wei-Chung Hsu  University of Minnesota - Twin Cities, Minneapolis, MN, USA
Antonia Zhai  University of Minnesota - Twin Cities, Minneapolis, MN, USA
Nikhil Mungre  University of Minnesota - Twin Cities, Minneapolis, MN, USA
Ankit Tarkas  University of Minnesota - Twin Cities, Minneapolis, MN, USA
Sponsors
SIGARCH: ACM Special Interest Group on Computer Architecture
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1555754.1555812

ABSTRACT

In response to the emergence of multicore processors, various novel and sophisticated execution models have been introduced to fully utilize these processors. One such execution model is Thread-Level Speculation (TLS), which allows potentially dependent threads to execute speculatively in parallel. While TLS offers significant performance potential for applications that are otherwise non-parallel, extracting efficient speculative threads in the presence of complex control flow and ambiguous data dependences is a real challenge. This task is further complicated by the fact that the performance of speculative threads is often architecture-dependent and input-sensitive, and exhibits phase behavior. Thus we propose dynamic performance tuning mechanisms that determine where and how to create speculative threads at runtime.

This paper describes the design, implementation, and evaluation of hardware and software support that takes advantage of runtime performance profiles to extract efficient speculative threads. In our proposed framework, speculative threads are monitored by hardware-based performance counters and their performance impact is estimated. The creation of speculative threads is adjusted based on the estimation. This paper proposes performance estimation techniques for speculative threads that correctly determine whether speculation can improve performance for loops that correspond to 83.8% of total loop execution time across all benchmarks. This paper also examines several dynamic performance tuning policies and finds that the best tuning policy achieves an overall speedup of 36.8% on a set of benchmarks from the SPEC2000 suite, which outperforms static thread management by 9.5%.
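
The monitor, estimate, and adjust cycle described above can be illustrated with a minimal sketch. It is not the paper's estimation technique, which the abstract does not detail. For simplicity it assumes that cycle counts per iteration are available for both sequential and speculative runs of a loop; the names LoopProfile, record, estimated_speedup, and tune are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoopProfile:
    """Hypothetical per-loop counters (illustrative, not from the paper)."""
    seq_cycles: int = 0      # cycles observed while the loop ran sequentially
    seq_iters: int = 0       # iterations observed in sequential mode
    spec_cycles: int = 0     # cycles observed while the loop ran speculatively
    spec_iters: int = 0      # iterations observed in speculative mode
    speculate: bool = True   # current decision: create speculative threads?

    def record(self, cycles: int, iters: int, speculative: bool) -> None:
        """Accumulate one sample, as a performance counter readout might."""
        if speculative:
            self.spec_cycles += cycles
            self.spec_iters += iters
        else:
            self.seq_cycles += cycles
            self.seq_iters += iters


def estimated_speedup(p: LoopProfile) -> Optional[float]:
    """Ratio of sequential to speculative cycles per iteration.

    Returns None until both modes have been observed with nonzero cost.
    """
    if p.seq_iters == 0 or p.spec_iters == 0 or p.spec_cycles == 0:
        return None
    seq_cost = p.seq_cycles / p.seq_iters
    spec_cost = p.spec_cycles / p.spec_iters
    return seq_cost / spec_cost


def tune(p: LoopProfile, threshold: float = 1.0) -> bool:
    """Assumed policy: keep speculating only if the estimate beats threshold."""
    s = estimated_speedup(p)
    if s is not None:
        p.speculate = s > threshold
    return p.speculate


# A loop whose speculative iterations are slower gets speculation disabled.
slow = LoopProfile()
slow.record(cycles=1000, iters=10, speculative=False)
slow.record(cycles=1500, iters=10, speculative=True)
tune(slow)

# A loop whose speculative iterations are faster keeps speculation enabled.
fast = LoopProfile()
fast.record(cycles=1000, iters=10, speculative=False)
fast.record(cycles=500, iters=10, speculative=True)
tune(fast)
```

In this sketch the decision is re-evaluated whenever tune is called, so a loop whose behavior changes between program phases can have speculation switched back on or off as new samples arrive. The threshold of 1.0 is an assumed default, not a value taken from the paper.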


