ABSTRACT
As cluster computers are used for a wider range of applications, we encounter the need to deliver resources at particular times, to meet particular deadlines, and/or at the same time as other resources are provided elsewhere. To address such requirements, we describe a scheduling approach in which users request resource leases, where leases can request either as-soon-as-possible ("best-effort") or reservation start times. We present the design of a lease management architecture, Haizea, that implements leases as virtual machines (VMs), leveraging their ability to suspend, migrate, and resume computations and to provide leased resources with customized application environments. We discuss methods to minimize the overhead introduced by having to deploy VM images before the start of a lease. We also present the results of simulation studies that compare alternative approaches. Using workloads with various mixes of best-effort and advance reservation requests, we compare the performance of our VM-based approach with that of non-VM-based schedulers. We find that a VM-based approach can provide better performance (measured in terms of both total execution time and average delay incurred by best-effort requests) than a scheduler that does not support task pre-emption, and only slightly worse performance than a scheduler that does support task pre-emption. We also compare the impact of different VM image popularity distributions and VM image caching strategies on performance. These results emphasize the importance of VM image caching for the workloads studied and quantify the sensitivity of scheduling performance to VM image popularity distribution.
INDEX TERMS
Primary Classification:
D. Software / D.4 OPERATING SYSTEMS / D.4.7 Organization and Design
Subjects: Distributed systems
Additional Classification:
C. Computer Systems Organization / C.2 COMPUTER-COMMUNICATION NETWORKS / C.2.4 Distributed Systems
D. Software / D.4 OPERATING SYSTEMS / D.4.5 Reliability
Subjects: Checkpoint/restart
General Terms: Design, Management, Performance
Keywords: advance reservations, backfilling, batch processing, checkpoint/restart, resource leasing, resource management, virtual machine overhead, virtual machines, virtual workspaces