R-ADMAD: high reliability provision for large-scale de-duplication archival storage systems
Source
International Conference on Supercomputing
Proceedings of the 23rd International Conference on Supercomputing
Yorktown Heights, NY, USA
SESSION: Storage solutions for supercomputing
Pages 370-379
Year of Publication: 2009
ISBN: 978-1-60558-498-0
Authors
Chuanyi Liu  Tsinghua University, Beijing, China
Yu Gu  Tsinghua University, Beijing, China
Linchun Sun  Tsinghua University, Beijing, China
Bin Yan  Tsinghua University, Beijing, China
Dongsheng Wang  Tsinghua University, Beijing, China
Sponsors
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1542275.1542327

ABSTRACT

Data de-duplication has become a commodity component in data-intensive systems, and these systems are expected to provide reliability comparable to that of other storage systems. Unfortunately, by storing duplicate data chunks just once, a de-duplicated system improves storage utilization at the cost of error resilience and reliability. This paper proposes R-ADMAD, a high-reliability provision mechanism. It packs variable-length data chunks into fixed-size objects, encodes the objects with error-correcting codes (ECC), and distributes them among the storage nodes of a redundancy group, which is generated dynamically according to current status and actual failure domains. Upon failures, R-ADMAD performs a distributed and dynamic recovery process. Experimental results show that R-ADMAD can provide the same storage utilization as RAID-like schemes while offering reliability comparable to replication-based schemes that use much more redundancy. The average recovery time of R-ADMAD-based configurations is about 2 to 6 times shorter than that of RAID-like schemes. Moreover, R-ADMAD can provide dynamic load balancing even without the involvement of the overloaded storage nodes.
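
The packing, encoding, and recovery steps named in the abstract can be illustrated with a minimal sketch. The abstract does not specify which error-correcting code R-ADMAD uses, how large its objects are, or how redundancy groups are chosen, so everything below is an assumption made for illustration: single XOR parity stands in for the ECC, the object size is a toy value, and the names OBJECT_SIZE, pack_chunks, read_chunk, xor_parity, and recover are hypothetical rather than taken from the paper.

```python
from functools import reduce

OBJECT_SIZE = 16  # hypothetical fixed object size in bytes (toy value)


def pack_chunks(chunks, object_size=OBJECT_SIZE):
    """Concatenate variable-length chunks and cut the stream into fixed-size objects.

    Returns (objects, index); index[n] is the (offset, length) of chunk n
    in the concatenated stream, so the chunk can be located again later.
    """
    stream = bytearray()
    index = []
    for chunk in chunks:
        index.append((len(stream), len(chunk)))
        stream.extend(chunk)
    # Zero-pad the tail so every object has exactly object_size bytes.
    stream.extend(b"\x00" * (-len(stream) % object_size))
    objects = [bytes(stream[i:i + object_size])
               for i in range(0, len(stream), object_size)]
    return objects, index


def read_chunk(objects, index, n):
    """Extract chunk n back out of the packed objects."""
    offset, length = index[n]
    return b"".join(objects)[offset:offset + length]


def xor_parity(objects):
    """Byte-wise XOR of equally sized objects.

    Assumption: a single-erasure stand-in for the paper's ECC.
    """
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*objects))


def recover(survivors):
    """Rebuild the one missing group member from all surviving members."""
    return xor_parity(survivors)


chunks = [b"alpha", b"bravo-charlie", b"delta!", b"echo-foxtrot-golf"]
objects, index = pack_chunks(chunks)

# One group member per (hypothetical) storage node: data objects plus parity.
group = objects + [xor_parity(objects)]

# Simulate losing the node that held member 1, then rebuild it from the rest.
lost = 1
survivors = [member for i, member in enumerate(group) if i != lost]
rebuilt = recover(survivors)
```

With XOR parity the group tolerates only one lost member; a code that tolerates several simultaneous failures would replace xor_parity and recover while leaving the packing step unchanged.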

