Compiler optimization of scalar value communication between speculative threads
Source: Architectural Support for Programming Languages and Operating Systems
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
San Jose, California
SESSION: Speculative threads
Pages: 171 - 183  
Year of Publication: 2002
ISBN:1-58113-574-2
Authors
Antonia Zhai  Carnegie Mellon University, Pittsburgh, PA
Christopher B. Colohan  Carnegie Mellon University, Pittsburgh, PA
J. Gregory Steffan  Carnegie Mellon University, Pittsburgh, PA
Todd C. Mowry  Carnegie Mellon University, Pittsburgh, PA
Sponsors
SIGPLAN: ACM Special Interest Group on Programming Languages
SIGOPS: ACM Special Interest Group on Operating Systems
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 6,   Downloads (12 Months): 66,   Citation Count: 24
DOI: http://doi.acm.org/10.1145/605397.605416

ABSTRACT

While there have been many recent proposals for hardware that supports Thread-Level Speculation (TLS), there has been relatively little work on compiler optimizations to fully exploit this potential for parallelizing programs optimistically. In this paper, we focus on one important limitation of program performance under TLS, which is stalls due to forwarding scalar values between threads that would otherwise cause frequent data dependences. We present and evaluate dataflow algorithms for three increasingly-aggressive instruction scheduling techniques that reduce the critical forwarding path introduced by the synchronization associated with this data forwarding. In addition, we contrast our compiler techniques with related hardware-only approaches. With our most aggressive compiler and hardware techniques, we improve performance under TLS by 6.2-28.5% for 6 of 14 applications, and by at least 2.7% for half of the other applications.
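The effect of the critical forwarding path can be illustrated with a toy timing model. The sketch below is not the paper's dataflow algorithm: the function `simulate`, its parameters, and the cost numbers are hypothetical and chosen only for illustration. It assumes each speculative thread (epoch) waits for one scalar from its predecessor at offset `wait_at`, signals the value to its successor at offset `signal_at`, and that epochs are assigned to processors round-robin.

```python
def simulate(num_epochs, epoch_len, wait_at, signal_at,
             forward_latency=10, num_cpus=4):
    """Return the finish time of the last epoch in a toy TLS timing model.

    All names and costs here are illustrative assumptions, not taken
    from the paper. Each epoch runs `epoch_len` cycles, blocks at offset
    `wait_at` until its predecessor's value arrives, and sends its own
    value at offset `signal_at`.
    """
    if not 0 <= wait_at <= signal_at <= epoch_len:
        raise ValueError("need 0 <= wait_at <= signal_at <= epoch_len")

    cpu_free = [0] * num_cpus   # time at which each processor becomes idle
    prev_signal = None          # time the previous epoch sent its value
    finish = 0

    for i in range(num_epochs):
        cpu = i % num_cpus                  # round-robin epoch assignment
        reach_wait = cpu_free[cpu] + wait_at
        if prev_signal is None:
            wait_done = reach_wait          # first epoch never stalls
        else:
            # Stall until the forwarded value has arrived.
            wait_done = max(reach_wait, prev_signal + forward_latency)
        prev_signal = wait_done + (signal_at - wait_at)
        finish = wait_done + (epoch_len - wait_at)
        cpu_free[cpu] = finish

    return finish


# Same work per epoch; only the position of the signal differs.
late = simulate(num_epochs=8, epoch_len=100, wait_at=10, signal_at=90)
early = simulate(num_epochs=8, epoch_len=100, wait_at=10, signal_at=20)
print("signal scheduled late: ", late)
print("signal scheduled early:", early)
```

In this model the distance from the wait to the signal, plus the forwarding latency, is serialized across epochs, so moving the signal closer to the wait shortens total execution time even though each epoch performs the same amount of work.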


