Optimal File-Bundle Caching Algorithms for Data-Grids
Source Conference on High Performance Networking and Computing archive
Proceedings of the 2004 ACM/IEEE conference on Supercomputing table of contents
Page: 6  
Year of Publication: 2004
ISBN:0-7695-2153-3
Authors
Ekow Otoo  Lawrence Berkeley National Laboratory
Doron Rotem  Lawrence Berkeley National Laboratory
Alexandru Romosan  Lawrence Berkeley National Laboratory
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
IEEE Computer Society  Washington, DC, USA
DOI Bookmark: 10.1109/SC.2004.36

ABSTRACT

The file-bundle caching problem arises frequently in scientific applications where jobs process several files concurrently. Consider a host system in a data-grid that maintains a disk cache for servicing jobs, each of which requests a set of files, and where a job is serviced only if all of its requested files are present in the disk cache. Files must therefore be admitted into the cache and replaced as sets, called file-bundles. We show that traditional caching algorithms based on file popularity measures do not perform well because they may hold in the cache combinations of files that are not relevant to the jobs being serviced. We present and analyze a new caching algorithm for maximizing the throughput of jobs and minimizing data replacement costs at such data-grid hosts. We tested the new algorithm using a disk cache simulation model under a wide range of conditions, including different file request distributions, cache sizes, and file size distributions. The results show significant improvement over traditional caching algorithms.
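
The bundle constraint described in the abstract can be illustrated with a small sketch. This is not the algorithm from the paper; it only shows, on a hypothetical toy workload, why ranking files by popularity can leave the cache unable to service any job. The job list, the unit file sizes, the capacity measured in number of files, and the function names `serviceable_jobs`, `popularity_cache`, and `best_bundle_cache` are all assumptions made for illustration. The bundle-aware selection here is a brute-force search that is only practical for tiny inputs.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy workload: each job is the bundle (set) of files it needs.
# All files are assumed to have unit size.
JOBS = [
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"a", "d"}),
    frozenset({"e", "f"}),
    frozenset({"e", "f"}),
]


def serviceable_jobs(cache, jobs):
    """Count the jobs whose entire bundle is present in the cache."""
    return sum(1 for bundle in jobs if bundle <= cache)


def popularity_cache(jobs, capacity):
    """Fill the cache with the most frequently requested individual files.

    Ties are broken by file name so the result is deterministic.
    """
    counts = Counter(f for bundle in jobs for f in bundle)
    ranked = sorted(counts, key=lambda f: (-counts[f], f))
    return set(ranked[:capacity])


def best_bundle_cache(jobs, capacity):
    """Brute-force the cache content that services the most jobs.

    Exhaustive search over file combinations; for illustration only.
    """
    files = sorted(set().union(*jobs))
    size = min(capacity, len(files))
    best = max(
        combinations(files, size),
        key=lambda combo: serviceable_jobs(set(combo), jobs),
    )
    return set(best)


if __name__ == "__main__":
    for name, chooser in [("popularity", popularity_cache),
                          ("bundle-aware", best_bundle_cache)]:
        cache = chooser(JOBS, capacity=2)
        print(name, sorted(cache), serviceable_jobs(cache, JOBS))
```

In this toy workload the file "a" is the most requested, so the popularity-based choice caches "a" and "e" and services no job, while caching the complete bundle "e", "f" services two jobs.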

