Exposing speculative thread parallelism in SPEC2000
Source: Principles and Practice of Parallel Programming
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
Chicago, IL, USA
SESSION: Automatic parallelization
Pages: 142-152
Year of Publication: 2005
ISBN: 1-59593-080-9
Authors
Manohar K. Prabhu  Stanford University, Stanford, CA
Kunle Olukotun  Stanford University, Stanford, CA
Sponsors
SIGPLAN: ACM Special Interest Group on Programming Languages
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1065944.1065964

ABSTRACT

As improving the performance of single-threaded processors becomes increasingly difficult, consumer desktop processors are moving toward multi-core designs. One approach to enhancing the performance of chip multiprocessors that has received considerable attention is thread-level speculation (TLS). As a case study, we manually parallelized several of the SPEC CPU2000 floating point and integer applications using TLS. Manual parallelization enabled us to apply techniques and programmer expertise that are beyond the current capabilities of automated parallelizers. With the experience gained, we provide insight into ways to aggressively apply TLS to parallelize applications for high performance. This information can help guide future advanced TLS compiler design.

For each application, we discuss how and where parallelism was located within the application, the impediments to extracting this parallelism using TLS, and the code transformations that were required to overcome these impediments. We also generalize these experiences to a discussion of common hindrances to TLS parallelization, and describe methods of programming that help expose application parallelism to TLS systems. These guidelines can assist developers of uniprocessor programs to create applications that can easily port to TLS systems and yield good performance. By using manual parallelization on SPEC2000, we provide guidance on where thread-level parallelism exists in these well-known benchmarks, what limits its extraction, how to reduce these limitations, and what performance can be expected on these applications from a chip multiprocessor system with TLS.


REVIEW

Reviewer: Henk Sips

The authors present results and experience gathered from using thread-level speculation (TLS) techniques to manually parallelize seven applications chosen from CPU2000, one of the most popular benchmark suites for measuring intensive performance.
