Trace-based evaluation of job runtime and queue wait time predictions in grids
Source
High Performance Distributed Computing archive
Proceedings of the 18th ACM international symposium on High performance distributed computing
Garching, Germany
SESSION: Resource management and scheduling
Pages 111-120
Year of Publication: 2009
ISBN: 978-1-60558-587-1
Authors
Ozan Sonmez  Delft University of Technology, Delft, Netherlands
Nezih Yigitbasi  Delft University of Technology, Delft, Netherlands
Alexandru Iosup  Delft University of Technology, Delft, Netherlands
Dick Epema  Delft University of Technology, Delft, Netherlands
Sponsors
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1551609.1551632

ABSTRACT

Large-scale distributed computing systems such as grids serve a growing number of scientists. These environments bring not only the advantages of an economy of scale, but also the challenges of resource and workload heterogeneity. A consequence of these two forms of heterogeneity is that job runtimes and queue wait times are highly variable, which generally reduces system performance and makes grids difficult for the common scientist to use. The prediction of job runtimes and queue wait times has been widely studied for parallel environments. However, there has been no detailed investigation of how the proposed prediction methods perform in grids, whose resource structure and workload characteristics differ greatly from those of parallel systems. In this paper, we assess the performance and benefit of predicting job runtimes and queue wait times in grids, based on traces gathered from various research and production grid environments. First, we evaluate the performance of simple yet widely used time series prediction methods and the effect of applying them to different types of job classes (e.g., all jobs submitted by single users or to single sites). Then, we investigate the performance of two kinds of queue wait time prediction methods for grids. Last, we investigate whether prediction-based grid-level scheduling policies can perform better than policies that do not use predictions.


