ACM Home Page
Please provide us with feedback. Feedback
Beyond nested parallelism: tight bounds on work-stealing overheads for parallel futures
Full text PdfPdf (465 KB)
Source
ACM Symposium on Parallel Algorithms and Architectures archive
Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures table of contents
Calgary, AB, Canada
SESSION: Concurrency mechanisms table of contents
Pages 91-100  
Year of Publication: 2009
ISBN:978-1-60558-606-9
Authors
Daniel Spoonhower  Carnegie Mellon University, Pittsburgh, PA, USA
Guy E. Blelloch  Carnegie Mellon University, Pittsburgh, PA, USA
Phillip B. Gibbons  Intel Research Pittsburgh, Pittsburgh, PA, USA
Robert Harper  Carnegie Mellon University, Pittsburgh, PA, USA
Sponsors
SIGOPS: ACM Special Interest Group on Operating Systems
ACM: Association for Computing Machinery
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 13,   Downloads (12 Months): 50,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1583991.1584019
What is a DOI?

ABSTRACT

Work stealing is a popular method of scheduling fine-grained parallel tasks. The performance of work stealing has been extensively studied, both theoretically and empirically, but primarily for the restricted class of nested-parallel (or fully strict) computations. We extend this prior work by considering a broader class of programs that also supports pipelined parallelism through the use of parallel futures.

Though the overhead of work-stealing schedulers is often quantified in terms of the number of steals, we show that a broader metric, the number of deviations, is a better way to quantify work-stealing overhead for less restrictive forms of parallelism, including parallel futures. For such parallelism, we prove bounds on work-stealing overheads--scheduler time and cache misses--as a function of the number of deviations. Deviations can occur, for example, when work is stolen or when a future is touched. We also show instances where deviations can occur independently of steals and touches.

Next, we prove that, under work stealing, the expected number of deviations is O(Pd + td) in a P-processor execution of a computation with span d and t touches of futures. Moreover, this bound is existentially tight for any work-stealing scheduler that is parsimonious (those where processors steal only when their queues are empty); this class includes all prior work-stealing schedulers. We also present empirical measurements of the number of deviations incurred by a classic application of futures, Halstead's quicksort, using our parallel implementation of ML. Finally, we identify a family of applications that use futures and, in contrast to quicksort, incur significantly smaller overheads.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
Umut A. Acar, Guy E. Blelloch, and Robert D. Blumofe. The data locality of work stealing. Theory of Comput. Syst., 35 (3): 321--347, 2002.
2
3
4
5
6
7
8
9
10
11
12
 
13
 
14
15
 
16
17
18
19
 
20
Intel Corporation. Intel Threading Building Blocks. www.threadingbuildingblocks.org, 2009.
21
22
 
23
24
25
26

Collaborative Colleagues:
Daniel Spoonhower: colleagues
Guy E. Blelloch: colleagues
Phillip B. Gibbons: colleagues
Robert Harper: colleagues