A fresh look at the reliability of long-term digital storage
Source: European Conference on Computer Systems
Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
Leuven, Belgium
SESSION: Storage
Pages: 221 - 234  
Year of Publication: 2006
ISBN:1-59593-322-0
Authors
Mary Baker  HP Labs
Mehul Shah  HP Labs
David S. H. Rosenthal  Stanford University
Mema Roussopoulos  Harvard University
Petros Maniatis  Intel Research Berkeley
TJ Giuli  Ford Research and Advanced Engineering
Prashanth Bungale  Harvard University
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 9,   Downloads (12 Months): 88,   Citation Count: 19
DOI: http://doi.acm.org/10.1145/1217935.1217957

ABSTRACT

Emerging Web services, such as email, photo sharing, and web site archives, must preserve large volumes of quickly accessible data indefinitely into the future. The costs of doing so often determine whether the service is economically viable. We make the case that these applications' demands on large-scale storage systems over long time horizons require us to reevaluate traditional system designs. We examine threats to long-lived data from an end-to-end perspective, taking into account not just hardware and software faults but also faults due to humans and organizations. We present a simple model of long-term storage failures that helps us reason about various strategies for addressing some of these threats. Using this model, we show that the most important strategies for increasing the reliability of long-term storage are detecting latent faults quickly, automating fault repair to make it cheaper and faster, and increasing the independence of data replicas.
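
The three strategies named in the abstract can be illustrated with a small back-of-the-envelope calculation. The sketch below is not the model from the paper; it is a simplified illustration under stated assumptions: two replicas, exponentially distributed faults with the same mean time to fault, and a "window of vulnerability" after the first fault equal to the time to detect it plus the time to repair it. The function name `mttdl_mirrored` and the parameters `detect_hours`, `repair_hours`, and `alpha` (a hypothetical correlation multiplier, where 1.0 means fully independent replicas) are all introduced here for illustration only.

```python
import math


def mttdl_mirrored(mttf_hours, detect_hours, repair_hours, alpha=1.0):
    """Rough mean time to data loss (hours) for two replicas.

    Assumptions (illustrative, not taken from the paper):
    - each replica faults at rate 1 / mttf_hours (exponential model);
    - after the first fault, data is lost if the surviving replica
      also faults before the first fault is detected and repaired;
    - alpha >= 1 scales the second replica's fault rate during that
      window to mimic correlated faults (alpha = 1 means independent).
    """
    window = detect_hours + repair_hours
    if mttf_hours <= 0 or window <= 0 or alpha <= 0:
        raise ValueError("mttf_hours, window, and alpha must be positive")

    # Rate at which either of the two replicas suffers a first fault.
    first_fault_rate = 2.0 / mttf_hours

    # Probability that the other replica faults inside the window.
    p_second = 1.0 - math.exp(-alpha * window / mttf_hours)

    # Data-loss events arrive at roughly first_fault_rate * p_second.
    return 1.0 / (first_fault_rate * p_second)


if __name__ == "__main__":
    base = mttdl_mirrored(1e5, detect_hours=24 * 30, repair_hours=24)
    fast_detect = mttdl_mirrored(1e5, detect_hours=24, repair_hours=24)
    fast_repair = mttdl_mirrored(1e5, detect_hours=24 * 30, repair_hours=1)
    correlated = mttdl_mirrored(1e5, detect_hours=24 * 30, repair_hours=24, alpha=10)
    print("baseline       :", base)
    print("faster detect  :", fast_detect)
    print("faster repair  :", fast_repair)
    print("correlated x10 :", correlated)
```

Under these assumptions, shrinking the detection or repair time shortens the window and raises the estimate, while a larger `alpha` (less independent replicas) lowers it, which is the qualitative behavior the abstract describes.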

