ACM Home Page
Please provide us with feedback. Feedback
Effective distributed scheduling of parallel workloads
Full text PdfPdf (1.80 MB)
Source Joint International Conference on Measurement and Modeling of Computer Systems archive
Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems table of contents
Philadelphia, Pennsylvania, United States
Pages: 25 - 36  
Year of Publication: 1996
ISBN:0-89791-793-6
Also published in ...
Authors
Andrea C. Dusseau  Computer Science Division, University of California, Berkeley
Remzi H. Arpaci  Computer Science Division, University of California, Berkeley
David E. Culler  Computer Science Division, University of California, Berkeley
Sponsor
SIGMETRICS: ACM Special Interest Group on Measurement and Evaluation
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 7,   Downloads (12 Months): 72,   Citation Count: 28
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/233013.233020
What is a DOI?

ABSTRACT

We present a distributed algorithm for time-sharing parallel workloads that is competitive with coscheduling. Implicit scheduling allows each local scheduler in the system to make independent decisions that dynamically coordinate the scheduling of cooperating processes across processors. Of particular importance is the blocking algorithm which decides the action of a process waiting for a communication or synchronization event to complete. Through simulation of bulk-synchronous parallel applications, we find that a simple two-phase fixed-spin blocking algorithm performs well; a two-phase adaptive algorithm that gathers run-time data on barrier wait-times performs slightly better. Our results hold for a range of machine parameters and parallel program characteristics. These findings are in direct contrast to the literature that states explicit coscheduling is necessary for fine-grained programs. We show that the choice of the local scheduler is crucial, with a priority-based scheduler performing two to three times better than a round-robin scheduler. Overall, we find that the performance of implicit scheduling is near that of coscheduling (+/- 35%), without the requirement of explicit, global coordination.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
2
 
3
 
4
M. J. Atallah, C. L. Black, D. C. Marinescu, H. J. Siegel, and T. L. Casavant. Models and Algorithms for Co-scheduling Compute-Intensive Tasks on a Network of Workstations Journal of Parallel and Distrbuted Computing, 16-319-327, 1992.
5
6
7
 
8
M. Crovella, P. Das, C. Dubnicki, T. LeBlanc, and E. Markatos. Multiprogramming on Multiprocessors. In Proceed,ngs of the Third IEEE Symposium on Parallel and D~stmbuted Processzng, pages 590-597, Dec. 1991.
 
9
10
 
11
J. J. Dongarra, J. R. Bunch, C. B. Moler, and G W. Stewart LINPACK User's Guide. SIAM, Philadelphia, Penn, 1979.
 
12
 
13
S. Evans, K. Clarke, D. Singleton, and B. Smaalders. Optimizing Unix Resource Scheduling for User Interaction. In 1993 Summer Usen,x, pages 205-218. USENIX, June 1993.
 
14
 
15
D. G. Feltelson and L Rudolph. Gang Scheduling Performance Benefits for Fine-Grained Synchronization Journal of Parallel and D,stmbuted Computing, 16(4):306-318, Dec. 1992
 
16
 
17
 
18
19
 
20
High Performance Fortran Forum. High Performance Fortran language specification version 1 0 Draft, Jan. 1993
21
22
 
23
K. Keeton, T. Anderson, and D. Patterson. LogP Quantified: The Case for Low-Overhead Local Area Networks. In Proceedings of Hot Interconnects II{, Aug. 1995
24
 
25
26
27
 
28
R. P. Martin HPAM An Active Message Layer for a Network of Workstations. In Proceedzngs of Hot Interconnects {I, July 1994.
29
30
31
 
32
L McVoy and C. Staelin. lmbench: Portable Tools for Performance Analysis. In Proceedings of the i996 W, nter USENIX, Jan. 1996.
33
 
34
 
35
J. K. Ousterhout. Scheduling Techniques for Concurrent Systems. In Th,rd International Conference on D~stmbuted Comput,ng Systems, pages 22-30, May 1982
36
37
38
 
39
40
41
42
 
43
J. Zahor9an and E. D. Lazowska. Spinning Versus Blocking m Parallel ~y~tems with Uncertain~y. In Proceedings of the IFIP International Seminar on Performance of D,stmbuted and Parallel Systems, pages 455-472, Dec. 1988.

CITED BY  28

Collaborative Colleagues:
Andrea C. Dusseau: colleagues
Remzi H. Arpaci: colleagues
David E. Culler: colleagues