| How much parallelism is there in irregular applications? |
| Full text |
Pdf
(819 KB)
|
Source
|
Principles and Practice of Parallel Programming
archive
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
table of contents
Raleigh, NC, USA
SESSION: Parallel applications
table of contents
Pages 3-14
Year of Publication: 2009
ISBN:978-1-60558-397-6
Also published in ...
|
|
Authors
|
|
Milind Kulkarni
|
The University of Texas at Austin, Austin, TX, USA
|
|
Martin Burtscher
|
The University of Texas at Austin, Austin, TX, USA
|
|
Rajeshkar Inkulu
|
The University of Texas at Austin, Austin, TX, USA
|
|
Keshav Pingali
|
The University of Texas at Austin, Austin, TX, USA
|
|
Calin Casçaval
|
IBM Research, Yorktown Heights, NY, USA
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 64, Downloads (12 Months): 483, Citation Count: 0
|
|
|
ABSTRACT
Irregular programs are programs organized around pointer-based data structures such as trees and graphs. Recent investigations by the Galois project have shown that many irregular programs have a generalized form of data-parallelism called amorphous data-parallelism. However, in many programs, amorphous data-parallelism cannot be uncovered using static techniques, and its exploitation requires runtime strategies such as optimistic parallel execution. This raises a natural question: how much amorphous data-parallelism actually exists in irregular programs? In this paper, we describe the design and implementation of a tool called ParaMeter that produces parallelism profiles for irregular programs. Parallelism profiles are an abstract measure of the amount of amorphous data-parallelism at different points in the execution of an algorithm, independent of implementation-dependent details such as the number of cores, cache sizes, load-balancing, etc. ParaMeter can also generate constrained parallelism profiles for a fixed number of cores. We show parallelism profiles for seven irregular applications, and explain how these profiles provide insight into the behavior of these applications.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Arvind, David Culler, and Gino Maa. Assessing the benefits of fine-grain parallelism in dataflow programs. International Journal of High-performance Computing Applications, 2(3), 1988.
|
 |
2
|
Robert D. Blumofe , Christopher F. Joerg , Bradley C. Kuszmaul , Charles E. Leiserson , Keith H. Randall , Yuli Zhou, Cilk: an efficient multithreaded runtime system, Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming, p.207-216, July 19-21, 1995, Santa Barbara, California, United States
|
| |
3
|
|
| |
4
|
|
 |
5
|
|
| |
6
|
|
| |
7
|
|
 |
8
|
Matteo Frigo , Charles E. Leiserson , Keith H. Randall, The implementation of the Cilk-5 multithreaded language, Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation, p.212-223, June 17-19, 1998, Montreal, Quebec, Canada
|
 |
9
|
|
| |
10
|
Leonidas J. Guibas, Donald E. Knuth, and Micha Sharir. Randomized incremental construction of delaunay and voronoi diagrams. Algorithmica, 7(1):381--413, December 1992.
|
 |
11
|
Tim Harris , Keir Fraser, Language support for lightweight transactions, Proceedings of the 18th annual ACM SIGPLAN conference on Object-oriented programing, systems, languages, and applications, October 26-30, 2003, Anaheim, California, USA
|
 |
12
|
|
| |
13
|
|
 |
14
|
|
 |
15
|
|
| |
16
|
|
| |
17
|
|
 |
18
|
Milind Kulkarni , Patrick Carribault , Keshav Pingali , Ganesh Ramanarayanan , Bruce Walter , Kavita Bala , L. Paul Chew, Scheduling strategies for optimistic parallel execution of irregular programs, Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures, June 14-16, 2008, Munich, Germany
[doi> 10.1145/1378533.1378575]
|
 |
19
|
|
 |
20
|
|
 |
21
|
Wei Liu , James Tuck , Luis Ceze , Wonsun Ahn , Karin Strauss , Jose Renau , Josep Torrellas, POSH: a TLS compiler that exploits program structure, Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming, March 29-31, 2006, New York, New York, USA
[doi> 10.1145/1122971.1122997]
|
 |
22
|
|
| |
23
|
|
 |
24
|
|
| |
25
|
|
 |
26
|
Kevin B. Theobald , Guang R. Gao , Laurie J. Hendren, On the limits of program parallelism and its smoothability, Proceedings of the 25th annual international symposium on Microarchitecture, p.10-19, December 01-04, 1992, Portland, Oregon, United States
|
 |
27
|
|
| |
28
|
Bruce Walter, Kavita Bala, Milind Kulkarni, and Keshav Pingali. Fast agglomerative clustering for rendering. In IEEE Symposium on Interactive Ray Tracing (RT), 2008.
|
| |
29
|
Hongtao Zhong, Mojtaba Mehrara, Steve Lieberman, and Scott Mahlke. Uncovering hidden loop level parallelism in sequential applications. IEEE 14th International Symposium on High Performance Computer Architecture, pages 290--301, Feb. 2008.
|
|