| The quest for scalable support of data-intensive workloads in distributed systems |
Source
High Performance Distributed Computing
Proceedings of the 18th ACM international symposium on High performance distributed computing
Garching, Germany
SESSION: Data management
Pages 207-216
Year of Publication: 2009
ISBN:978-1-60558-587-1
Authors
Ioan Raicu | University of Chicago, Chicago, IL, USA
Ian T. Foster | University of Chicago & Argonne National Laboratory, Chicago, IL, USA
Yong Zhao | Microsoft Corporation, Redmond, WA, USA
Philip Little | University of Notre Dame, Notre Dame, IN, USA
Christopher M. Moretti | University of Notre Dame, Notre Dame, IN, USA
Amitabh Chaudhary | University of Notre Dame, Notre Dame, IN, USA
Douglas Thain | University of Notre Dame, Notre Dame, IN, USA
ABSTRACT
Data-intensive applications that analyze large datasets often require large amounts of compute and storage resources, and data locality can be crucial to achieving high throughput and performance. We propose a "data diffusion" approach that acquires compute and storage resources dynamically, replicates data in response to demand, and schedules computations close to data. As demand increases, more resources are acquired, allowing faster response to subsequent requests that refer to the same data; when demand drops, resources are released. Depending on workload and resource characteristics, this approach can provide the benefits of dedicated hardware without the associated high costs. To explore the feasibility of data diffusion, we offer both a theoretical and an empirical analysis. We define an abstract model for data diffusion, introduce new scheduling policies with heuristics to optimize real-world performance, and develop a competitive online cache eviction policy. We also report many experiments that explore the benefits of dynamically expanding and contracting resources based on load, improving system responsiveness while keeping wasted resources small. Compared to parallel file systems, we show performance improvements of one to two orders of magnitude across three diverse workloads, with throughputs approaching 80 Gb/s on a modest cluster of 200 processors. We also compare data diffusion with a best model for active storage, contrasting the pull model found in data diffusion with the push model found in active storage.
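The idea of scheduling computations close to cached data can be illustrated with a small toy model. The sketch below is not the paper's scheduler or its eviction policy: the names `Executor`, `dispatch`, and `replay` are hypothetical, the cache is measured in whole files rather than bytes, and plain LRU eviction is an assumed stand-in for the competitive online policy the abstract mentions. It only shows how preferring a node that already holds a file turns repeated requests into local cache hits.

```python
from collections import OrderedDict


class Executor:
    """A compute node with a small local cache (LRU is an assumed stand-in policy)."""

    def __init__(self, name, cache_slots):
        self.name = name
        self.cache_slots = cache_slots
        self.cache = OrderedDict()  # file name -> True, oldest entry first
        self.tasks_run = 0

    def run(self, file_name):
        """Run one task on file_name; return True on a local cache hit."""
        self.tasks_run += 1
        if file_name in self.cache:
            self.cache.move_to_end(file_name)  # mark as most recently used
            return True
        # Miss: the file would be copied from shared storage, then cached.
        if len(self.cache) >= self.cache_slots:
            self.cache.popitem(last=False)  # evict the least recently used file
        self.cache[file_name] = True
        return False


def dispatch(executors, file_name):
    """Prefer an executor that already caches the file, else the least-loaded one."""
    holders = [e for e in executors if file_name in e.cache]
    candidates = holders or executors
    target = min(candidates, key=lambda e: e.tasks_run)
    return target, target.run(file_name)


def replay(workload, n_executors=2, cache_slots=2):
    """Replay a list of requested file names and return (hits, misses)."""
    executors = [Executor(f"node{i}", cache_slots) for i in range(n_executors)]
    hits = 0
    for file_name in workload:
        _, hit = dispatch(executors, file_name)
        hits += hit
    return hits, len(workload) - hits
```

Calling `replay` with a workload that revisits the same files shows the hit count rising as the caches warm up. Acquiring and releasing executors in response to load, which the abstract also describes, is deliberately left out of this sketch.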