|
ABSTRACT
As storage systems grow in size and complexity, they are increasingly confronted with concurrent disk failures together with multiple unrecoverable sector errors. To ensure high data reliability and availability, erasure codes with high fault tolerance are required. In this article, we present a new family of erasure codes with high fault tolerance, named GRID codes. They are called such because they are a family of strip-based codes whose strips are arranged into multi-dimensional grids. In the construction of GRID codes, we first introduce a concept of matched codes and then discuss how to use matched codes to construct GRID codes. In addition, we propose an iterative reconstruction algorithm for GRID codes. We also discuss some important features of GRID codes. Finally, we compare GRID codes with several categories of existing codes. Our comparisons show that for large-scale storage systems, our GRID codes have attractive advantages over many existing erasure codes: (a) They are completely XOR-based and have very regular structures, ensuring easy implementation; (b) they can provide up to 15 and even higher fault tolerance; and (c) their storage efficiency can reach up to 80% and even higher. All the advantages make GRID codes more suitable for large-scale storage systems.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
|
 |
3
|
Lakshmi N. Bairavasundaram , Garth R. Goodson , Shankar Pasupathy , Jiri Schindler, An analysis of latent sector errors in disk drives, Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, June 12-16, 2007, San Diego, California, USA
|
| |
4
|
Blaum, M., Brady, J., Bruck, J., Menon, J., and Vardy, A. 2001. The EVENODD code and its generalization: An efficient scheme for tolerating multiple disk failures in RAID architectures. InHigh Performance Mass Storage and Parallel I/O: Technologies and Applications, 187--208.
|
| |
5
|
Blaum, M., Bruck, J., and Vardy, A. 1996. MDS array codes with independent parity symnbols. IEEE Trans. Inf. Theory 42, 2, 529--542.
|
| |
6
|
|
| |
7
|
Bloemer, J., Kalfane, M., Karp, R., Karpinski, M., Luby, M., and Zuckerman, D. 1995. An XOR-based erasure resilient coding scheme. Tech. rep. TR-95-048, International Computer Science Institute, Berkeley, California.
|
 |
8
|
Peter M. Chen , Edward K. Lee , Garth A. Gibson , Randy H. Katz , David A. Patterson, RAID: high-performance, reliable secondary storage, ACM Computing Surveys (CSUR), v.26 n.2, p.145-185, June 1994
[doi> 10.1145/176979.176981]
|
| |
9
|
|
| |
10
|
Peter Corbett , Bob English , Atul Goel , Tomislav Grcanac , Steven Kleiman , James Leong , Sunitha Sankar, Awarded Best Paper! -- Row-Diagonal Parity for Double Disk Failure Correction, Proceedings of the 3rd USENIX Conference on File and Storage Technologies, March 31-31, 2004, San Francisco, CA
|
| |
11
|
|
| |
12
|
|
| |
13
|
|
| |
14
|
Gallager, R. G. 1962. Low density parity check codes. IRE Trans. Inf. Theory 8, 1, 21--28.
|
| |
15
|
|
| |
16
|
Greenan, K. M., Miller, E. L., and Wylie, J. J. 2008. Reliability of flat XOR-based erasure codes on heterogeneous devices. In Proceedings of the 38th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'08). IEEE Computer Society, 147--156.
|
| |
17
|
|
| |
18
|
|
| |
19
|
Hafner, J. L., Deenadhayalan, V., Kanungo, T., and Rao, K. K. 2004. Performance metrics for erasure codes in storage systems. Tech. rep. RJ 10321 (A0408-003). IBM Research Division, Almaden Research Center. August.
|
| |
20
|
|
| |
21
|
Luby, M. C., Mitzenmacher, M., Shokrollahi, M. A., and Spielman, D. A. 2001. Efficient erasure correcting codes. IEEE Trans. Inf. Theory 47, 2, 569--584.
|
| |
22
|
MacWilliams, F. J. and Sloane, N. J. A. 1977. The Theory of Error-Correcting Codes. North-Holland, New York.
|
| |
23
|
|
| |
24
|
|
| |
25
|
Plank, J. S. 2005. Erasure codes for storage applications. Tutorial slides presented at the 4th USENIX Conference on File and Storage Technologies (FAST'05). http://www.cs.utk.edu/~plank/plank/papers/FAST-2005.html.
|
| |
26
|
|
| |
27
|
|
| |
28
|
|
| |
29
|
Reed, I. S. and Solomon, G. 1960. Polynomial codes over certain finite fields. J. Soc. Industrial Appl. Math. 8, 2, 300--304.
|
| |
30
|
Roth, R. M. and Lempel, A. 1989. On MDS codes via Cauchy matrices. IEEE Trans. Inf. Theory 35, 6, 1314--1319.
|
 |
31
|
|
| |
32
|
Bianca Schroeder , Garth A. Gibson, Disk failures in the real world: what does an MTTF of 1,000,000 hours mean to you?, Proceedings of the 5th USENIX conference on File and Storage Technologies, p.1-es, February 13-16, 2007, San Jose, CA
|
| |
33
|
Tanner, R. M. 1981. A recursive approach to low complexity codes. IEEE Trans. Inf. Theory 27, 5, 533--547.
|
| |
34
|
W. W. Wilcke , R. B. Garner , C. Fleiner , R. F. Freitas , R. A. Golding , J. S. Glider , D. R. Kenchammana-Hosekote , J. L. Hafner , K. M. Mohiuddin , K. K. Rao , R. A. Becker-Szendy , T. M. Wong , O. A. Zaki , M. Hernandez , K. R. Fernandez , H. Huels , H. Lenk , K. Smolin , M. Ries , C. Goettert , T. Picunko , B. J. Rubin , H. Kahn , T. Loo, IBM intelligent Bricks project: petabytes and beyond, IBM Journal of Research and Development, v.50 n.2/3, p.181-197, March 2006
[doi> 10.1147/rd.502.0181]
|
| |
35
|
Wong, T. E. and Shea, J. M. 2001. Multi-Dimensional parity check codes for bursty channels. In Proceedings of the IEEE International Symposium on Information Theory (ISIT'01), 123.
|
| |
36
|
|
 |
37
|
|
| |
38
|
Xu, L. and Bruck, J. 1999. X-Code: MDS array codes with optimal encoding. IEEE Trans. Inf. Theory 45, 1, 272--276.
|
|