|
ABSTRACT
This work presents a quantitative approach to analyze parallelization opportunities in programs with irregular memory access where potential data dependencies mask available parallelism. The model captures data and causal dependencies among critical sections as algorithmic properties and quantifies them as a density computed over the number of executed instructions. The model abstracts from runtime aspects such as scheduling, the number of threads, and concurrency control used in a particular parallelization. We illustrate the model on several applications requiring ordered and unordered execution of critical sections. We describe a run-time tool that computes the dependence densities from a deterministic single-threaded program execution. This density metric provides insights into the potential for optimistic parallelization, opportunities for algorithmic scheduling, and performance defects due to synchronization bottlenecks. Based on the results of our analysis, we classify applications into three categories with low, medium, and high dependence densities. Applications with low dependence density are naturally good candidates for optimistic concurrency, applications with medium density may require a scheduler that is aware of the algorithmic dependencies for optimistic concurrency to be effective, and applications with high dependence density may not be suitable for parallelization.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
D. Bader and K. Madduri. Design and Implementation of the HPCS Graph Analysis Benchmark on Symmetric Multiprocessors. In Proc. of HiPC 2005, pages 465--476, December 2005.
|
 |
2
|
Jayaram Bobba , Kevin E. Moore , Haris Volos , Luke Yen , Mark D. Hill , Michael M. Swift , David A. Wood, Performance pathologies in hardware transactional memory, Proceedings of the 34th annual international symposium on Computer architecture, June 09-13, 2007, San Diego, California, USA
|
| |
3
|
|
 |
4
|
|
| |
5
|
T. Chen, J. Lin, X. Dai, W.-C. Hsu, and P.-C. Yew. Data Dependence Proceeding for Speculative Optimizations. In Proc. of Compiler Construction'04, pages 57--72, 2004.
|
| |
6
|
|
 |
7
|
Lance Hammond , Mark Willey , Kunle Olukotun, Data speculation support for a chip multiprocessor, Proceedings of the eighth international conference on Architectural support for programming languages and operating systems, p.58-69, October 02-07, 1998, San Jose, California, United States
|
 |
8
|
Tim Harris , Keir Fraser, Language support for lightweight transactions, Proceedings of the 18th annual ACM SIGPLAN conference on Object-oriented programing, systems, languages, and applications, October 26-30, 2003, Anaheim, California, USA
|
| |
9
|
T. Harris and S. Stipic. Abstract Nested Transactions. In Proc. of TRANSACT'07, 2007.
|
| |
10
|
M. Herlihy and E. Koskinen. Transactional Boosting: A Methodology for Highly-Concurrent Transactional Objects. Technical Report CS07-08, Department of Computer Science, Brown University, July 2007.
|
 |
11
|
|
 |
12
|
|
| |
13
|
M. Kulkarni, L. P. Chew, and K. Pingali. Using Transactions in Delaunay Mesh Generation. In Workshop on Transactional Memory Workloads (WTW'06), 2006.
|
| |
14
|
M. Kulkarni and K. Pingali. Scheduling Issues in Optimistic Parallelization. In Proc. of IPDPS'07, pages 1--7, 2007.
|
 |
15
|
Milind Kulkarni , Keshav Pingali , Bruce Walter , Ganesh Ramanarayanan , Kavita Bala , L. Paul Chew, Optimistic parallelism requires abstractions, Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation, June 10-13, 2007, San Diego, California, USA
|
 |
16
|
|
 |
17
|
Wei Liu , James Tuck , Luis Ceze , Wonsun Ahn , Karin Strauss , Jose Renau , Josep Torrellas, POSH: a TLS compiler that exploits program structure, Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming, March 29-31, 2006, New York, New York, USA
[doi> 10.1145/1122971.1122997]
|
 |
18
|
Chi-Keung Luk , Robert Cohn , Robert Muth , Harish Patil , Artur Klauser , Geoff Lowney , Steven Wallace , Vijay Janapa Reddi , Kim Hazelwood, Pin: building customized program analysis tools with dynamic instrumentation, Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation, June 12-15, 2005, Chicago, IL, USA
|
| |
19
|
P. Marcuello and A. Gonzalez. A Quantitative Assessment of TLS Techniques . In Proc. of IPDPS'00, 2000.
|
 |
20
|
Chi Cao Minh , Martin Trautmann , JaeWoong Chung , Austen McDonald , Nathan Bronson , Jared Casper , Christos Kozyrakis , Kunle Olukotun, An effective hybrid transactional memory system with strong isolation guarantees, Proceedings of the 34th annual international symposium on Computer architecture, June 09-13, 2007, San Diego, California, USA
|
| |
21
|
E. J. B. Moss and T. Hosking. Nested Transactional Memory: Model and Preliminary Architecture Sketches. In In Proceedings, Workshop on Synchronization and Concurrency in Object-Oriented Languages, Oct 2005.
|
| |
22
|
MySQL - The World's Most Popular Open Source Database. http://www.mysql.com.
|
| |
23
|
R. Narayanan, B. Ozisikyilmaz, J. Zamberno, G. Memik, and A. Choudhary. MineBench: A Benchmark Suite for Data Mining Workloads. In Proc. of IISWC'06, pages 182--188, 2006.
|
 |
24
|
Naveen Neelakantam , Ravi Rajwar , Suresh Srinivas , Uma Srinivasan , Craig Zilles, Hardware atomicity for reliable software speculation, Proceedings of the 34th annual international symposium on Computer architecture, June 09-13, 2007, San Diego, California, USA
|
| |
25
|
|
 |
26
|
Bratin Saha , Ali-Reza Adl-Tabatabai , Richard L. Hudson , Chi Cao Minh , Benjamin Hertzberg, McRT-STM: a high performance software transactional memory system for a multi-core runtime, Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming, March 29-31, 2006, New York, New York, USA
[doi> 10.1145/1122971.1123001]
|
| |
27
|
|
 |
28
|
|
| |
29
|
|
| |
30
|
The UMT Benchmark Code. http://www.llnl.gov/asci/purple/benchmarks/limited/umt.
|
| |
31
|
|
 |
32
|
|
| |
33
|
Antonia Zhai , Christopher B. Colohan , J. Gregory Steffan , Todd C. Mowry, Compiler Optimization of Memory-Resident Value Communication Between Speculative Threads, Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization, p.39, March 20-24, 2004, Palo Alto, California
|
CITED BY 6
|
|
|
|
|
Milind Kulkarni , Martin Burtscher , Rajeshkar Inkulu , Keshav Pingali , Calin Casçaval, How much parallelism is there in irregular applications?, Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming, February 14-18, 2009, Raleigh, NC, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|