ABSTRACT
In modern distributed systems, coordinated time-sharing is required for communicating processes to leverage the performance of switch-based networks and low-overhead protocols. Coordinated time-sharing has traditionally been achieved with gang scheduling or explicit coscheduling, implementations of which often suffer from many deficiencies: multiple points of failure, high context-switch overheads, and poor interaction with client-server, interactive, and I/O-intensive workloads. Implicit coscheduling dynamically coordinates communicating processes across distributed machines without these structural deficiencies. In implicit coscheduling, no communication is required across operating system schedulers; instead, cooperating processes achieve coordination by reacting to implicit information carried by the communication that already exists within the parallel application. The implementation of this approach is simple and allows participating nodes to act autonomously. We introduce two key mechanisms in implicit coscheduling. The first is conditional two-phase waiting, a generalization of traditional two-phase waiting in which spin time may be increased depending upon events occurring while the process waits. The second is an extension to stride scheduling that provides preemption and is fair to processes that block. To demonstrate that implicit coscheduling performs well, we show results from an extensive set of simulation and implementation experiments. To exercise the conditional two-phase waiting algorithm, we examine three workloads: bulk-synchronous synthetic applications, continuous-communication synthetic applications, and application kernels written in the Split-C language. To exercise the local scheduler, we examine competing jobs with different communication characteristics. We demonstrate that our implementation scales well with the number of jobs and workstations and is robust to process placement. Our experiments show that implicit coscheduling is effective and fair for a wide range of workloads; most perform within 30% of an idealized model of gang scheduling.
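As an illustration of the first mechanism, the following is a minimal sketch of conditional two-phase waiting, written as a deterministic simulation rather than a real scheduler hook. The abstract does not say which events lengthen the spin phase or by how much, so the sketch assumes that each message arriving from a peer while the process spins extends the spin deadline by a fixed amount. The names `conditional_two_phase_wait`, `base_spin`, and `extra_spin` are hypothetical and do not come from the paper.

```python
def conditional_two_phase_wait(response_time, arrivals, base_spin, extra_spin):
    """Simulate one wait for a response (all times in arbitrary units).

    Assumption (not from the paper): every message that arrives while the
    process is still spinning pushes the spin deadline out to at least
    arrival_time + extra_spin. With extra_spin == 0 this reduces to
    traditional two-phase waiting: spin for base_spin, then block.

    Returns ("spin", t) if the response arrives at time t while spinning,
    or ("block", t) if the process gives up spinning and blocks at time t.
    """
    deadline = base_spin
    pending = sorted(arrivals)
    i = 0
    while True:
        # The awaited response arrived before the spin deadline expired.
        if response_time <= deadline:
            return ("spin", response_time)
        # Consume every event seen before the current deadline; each one
        # is taken as a hint that peers are scheduled, so keep spinning.
        extended = False
        while i < len(pending) and pending[i] <= deadline:
            deadline = max(deadline, pending[i] + extra_spin)
            i += 1
            extended = True
        # No new event arrived in time: release the processor.
        if not extended:
            return ("block", deadline)


# No incoming messages: behaves like plain two-phase waiting.
print(conditional_two_phase_wait(10, [], base_spin=5, extra_spin=4))
# Messages at t=3 and t=6 keep the process spinning until the response.
print(conditional_two_phase_wait(10, [3, 6], base_spin=5, extra_spin=4))
```

In this sketch the baseline spin bounds the cost of waiting for a peer that is not scheduled, while the conditional extension lets a process stay on the processor when incoming traffic suggests its peers are currently running.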