ABSTRACT
Large-scale distributed computing systems such as grids are serving a growing number of scientists. These environments bring not only the advantages of an economy of scale, but also the challenges of resource and workload heterogeneity. A consequence of these two forms of heterogeneity is that job runtimes and queue wait times are highly variable, which generally reduces system performance and makes grids difficult for the common scientist to use. The prediction of job runtimes and queue wait times has been widely studied for parallel environments. However, there has been no detailed investigation of how the proposed prediction methods perform in grids, whose resource structure and workload characteristics are very different from those of parallel systems. In this paper, we assess the performance and benefit of predicting job runtimes and queue wait times in grids, based on traces gathered from various research and production grid environments. First, we evaluate the performance of simple yet widely used time series prediction methods and the effect of applying them to different types of job classes (e.g., all jobs submitted by single users or to single sites). Then, we investigate the performance of two kinds of queue wait time prediction methods for grids. Finally, we investigate whether prediction-based grid-level scheduling policies can outperform policies that do not use predictions.