ABSTRACT
In this paper, we conduct an in-depth evaluation of a broad spectrum of scheduling alternatives for clusters. These include the widely used batch scheduling, local scheduling, gang scheduling, all prior communication-driven coscheduling algorithms (Dynamic Coscheduling (DCS), Spin Block (SB), Periodic Boost (PB), and Co-ordinated Coscheduling (CC)), and a newly proposed HYBRID coscheduling algorithm, all evaluated on a 16-node, Myrinet-connected Linux cluster. Performance and energy measurements using several NAS, LLNL, and ANL benchmarks on this cluster lead to three main conclusions. First, although batch scheduling is currently used in most clusters, blocking-based coscheduling techniques such as SB, CC, and HYBRID, as well as gang scheduling, can provide much better performance even on a dedicated cluster platform. Second, in contrast to some prior studies, we observe that blocking-based schemes such as SB and HYBRID can outperform spin-based techniques such as PB on a Linux platform. Third, the proposed HYBRID scheduling provides the best performance-energy behavior and can be implemented on any cluster with little effort. Together, these results suggest that blocking-based coscheduling techniques are viable candidates for use in clusters, offering significant performance-energy benefits.