|
ABSTRACT
This paper presents a new technique for disk storage management called a log-structured file system. A log-structured file system writes all modifications to disk sequentially in a log-like structure, thereby speeding up both file writing and crash recovery. The log is the only structure on disk; it contains indexing information so that files can be read back from the log efficiently. In order to maintain large free areas on disk for fast writing, we divide the log intosegmentsand use a segment cleaner to compress the live information from heavily fragmented segments. We present a series of simulations that demonstrate the efficiency of a simple cleaning policy based on cost and benefit. We have implemented a prototype log-structured file system called Sprite LFS; it outperforms current Unix file systems by an order of magnitude for small-file writes while matching or exceeding Unix performance for reads and large writes. Even when the overhead for cleaning is included, Sprite LFS can use 70% of the disk bandwidth for writing, whereas Unix file systems typically can use only 5–10%.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
BAKER, H. G. List processing in real time on a serial computer. A.I. Working Paper 139, MIT-AI Lab, Boston, MA, Apr. 1977.
|
 |
2
|
Mary G. Baker , John H. Hartman , Michael D. Kupfer , Ken W. Shirriff , John K. Ousterhout, Measurements of a distributed file system, Proceedings of the thirteenth ACM symposium on Operating systems principles, p.198-212, October 13-16, 1991, Pacific Grove, California, United States
|
| |
3
|
|
 |
4
|
David J DeWitt , Randy H Katz , Frank Olken , Leonard D Shapiro , Michael R Stonebraker , David Wood, Implementation techniques for main memory database systems, Proceedings of the 1984 ACM SIGMOD international conference on Management of data, June 18-21, 1984, Boston, Massachusetts
|
 |
5
|
|
| |
6
|
|
| |
7
|
|
 |
8
|
|
| |
9
|
KAZAR, M. L, LEVERETT, B. W., ANDERSON, O. T, APOSTOLIDES, V., BOTTOS, B. A CHUTANI, S., EVERHART, C. F., MASON, W. A., Tu, S. T., AND ZAYAS, E. R. Decorum file system architectural overview. In Proceedings of the USENIX 1990 Summer Conference (Anaheim, Calif., Jun. 1990), pp. 151-164.
|
 |
10
|
|
 |
11
|
|
 |
12
|
|
| |
13
|
McKusIcK, M K., JoY, W. N., LEFFLER~ S. J., AND FABRY, R. S. Fsck--The UNIX file system check program. Unix System Manager's Manual--4.3 BSD Virtual VAX-11 Version, USENIX, Berkeley, Calif., Apr. 1986.
|
| |
14
|
McVoY, L. AND KL~mAN, S. Extent-like performance from a UNIX file system. In Proceedings of the USENIX 1991 Winter Conference (Dallas, Tex., Jan. 1991).
|
 |
15
|
|
| |
16
|
OUSTERHOU% J. K. Why aren't operating systems getting faster as fast as hardware? In Proceedings of the USENIX 1990 Summer Conference (Anaheim, Calif., Jun. 1990), pp. 247-256.
|
| |
17
|
|
 |
18
|
John K. Ousterhout , Hervé Da Costa , David Harrison , John A. Kunze , Mike Kupfer , James G. Thompson, A trace-driven analysis of the UNIX 4.2 BSD file system, Proceedings of the tenth ACM symposium on Operating systems principles, p.15-24, December 1985, Orcas Island, Washington, United States
|
 |
19
|
David A. Patterson , Garth Gibson , Randy H. Katz, A case for redundant arrays of inexpensive disks (RAID), Proceedings of the 1988 ACM SIGMOD international conference on Management of data, p.109-116, June 01-03, 1988, Chicago, Illinois, United States
|
| |
20
|
REED, D. AND SVOBODOVA, L. SWALLOW: A distributed data storage system for a local network. In Local Networks for Computer Communications. North-Holland, 1981, pp. 355-373.
|
| |
21
|
SANDSERG, R. Design and implementation of the Sun network filesystem. In Proceedings of the USENIX 1985 Summer Conference (Portland, Ore , Jun. 1985) pp. 119-130.
|
| |
22
|
SALEM, K. AND GARCIA-MOLINA, H. Crash recovery mechamsms for main storage database systems. CS-TR-034-86, Princeton Univ., Princeton, NJ 1986.
|
 |
23
|
|
| |
24
|
SELTZER, M. I., CHEN, P. M., AND OUSTERHOUT, J. K Disk scheduling revisited. In Proceedlngs of the Winter 1990 USENIX Techmcal Conference (Jan. 1990), pp.
|
CITED BY 195
|
|
|
|
|
Peter M. Chen , Wee Teck Ng , Subhachandra Chandra , Christopher Aycock , Gurushankar Rajamani , David Lowell, The Rio file cache: surviving operating system crashes, ACM SIGPLAN Notices, v.31 n.9, p.74-83, Sept. 1996
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Thomas E. Anderson , Michael D. Dahlin , Jeanna M. Neefe , David A. Patterson , Drew S. Roselli , Randolph Y. Wang, Serverless network file systems, ACM Transactions on Computer Systems (TOCS), v.14 n.1, p.41-79, Feb. 1996
|
|
|
|
|
|
|
|
|
|
|
|
T. E. Anderson , M. D. Dahlin , J. M. Neefe , D. A. Patterson , D. S. Roselli , R. Y. Wang, Serverless network file systems, ACM SIGOPS Operating Systems Review, v.29 n.5, p.109-126, Dec. 3, 1995
|
|
|
|
|
|
|
|
|
|
|
|
Apratim Purakayastha , Carla Schlatter Ellis , David Kotz, ENWRICH: a compute-processor write caching scheme for parallel file systems, Proceedings of the fourth workshop on I/O in parallel and distributed systems: part of the federated computing research conference, p.55-68, May 27-27, 1996, Philadelphia, Pennsylvania, United States
|
|
|
|
|
|
Douglas S. Santry , Michael J. Feeley , Norman C. Hutchinson , Alistair C. Veitch , Ross W. Carton , Jacob Ofir, Deciding when to forget in the Elephant file system, ACM SIGOPS Operating Systems Review, v.33 n.5, p.110-123, Dec. 1999
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
M. Rosenblum , E. Bugnion , S. A. Herrod , E. Witchel , A. Gupta, The impact of architectural trends on operating system performance, ACM SIGOPS Operating Systems Review, v.29 n.5, p.285-298, Dec. 3, 1995
|
|
|
|
|
|
|
|
|
|
|
|
Jun Wang , Rui Min , Yingwu Zhu , Yiming Hu, UCFS-A Novel User-Space, High Performance, Customized File System for Web Proxy Servers, IEEE Transactions on Computers, v.51 n.9, p.1056-1073, September 2002
|
|
|
|
|
|
|
|
|
|
|
|
Jiri Schindler , Steven W. Schlosser , Minglong Shao , Anastassia Ailamaki , Gregory R. Ganger, Atropos: A Disk Array Volume Manager for Orchestrated Use of Disks, Proceedings of the 3rd USENIX Conference on File and Storage Technologies, March 31-31, 2004, San Francisco, CA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Mary Baker , Satoshi Asami , Etienne Deprit , John Ouseterhout , Margo Seltzer, Non-volatile memory for fast, reliable file systems, ACM SIGPLAN Notices, v.27 n.9, p.10-22, Sept. 1992
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Matt DeBergalis , Peter Corbett , Steve Kleiman , Arthur Lent , Dave Noveck , Tom Talpey , Mark Wittle, The Direct Access File System, Proceedings of the 2nd USENIX Conference on File and Storage Technologies, March 31-31, 2003, San Francisco, CA
|
|
|
Sumeet Sobti , Nitin Garg , Chi Zhang , Xiang Yu , Arvind Krishnamurthy , Randolph Wang, PersonalRAID: Mobile Storage for Distributed and Disconnected Computers, Proceedings of the 1st USENIX Conference on File and Storage Technologies, January 28-30, 2002, Monterey, CA
|
|
|
|
|
|
|
|
|
|
|
|
John Zedlewski , Sumeet Sobti , Nitin Garg , Fengzhou Zheng , Arvind Krishnamurthy , Randolph Wang, Modeling Hard-Disk Power Consumption, Proceedings of the 2nd USENIX Conference on File and Storage Technologies, March 31-31, 2003, San Francisco, CA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Bo Hong , Feng Wang , Scott A. Brandt , Darrell D. E. Long , Thomas J. E. Schwarz, S. J., Using MEMS-based storage in computer systems---MEMS storage architectures, ACM Transactions on Storage (TOS), v.2 n.1, p.1-21, February 2006
|
|
|
Ryusuke Konishi , Yoshiji Amagai , Koji Sato , Hisashi Hifumi , Seiji Kihara , Satoshi Moriai, The Linux implementation of a log-structured file system, ACM SIGOPS Operating Systems Review, v.40 n.3, p.102-107, July 2006
|
|
|
|
|
|
|
|
|
|
|
|
Margo I. Seltzer , Gregory R. Ganger , M. Kirk McKusick , Keith A. Smith , Craig A. N. Soules , Christopher A. Stein, Journaling versus soft updates: asynchronous meta-data protection in file systems, Proceedings of the Annual Technical Conference on 2000 USENIX Annual Technical Conference, p.6-6, June 18-23, 2000, San Diego, California
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Lakshmi Ganesh , Hakim Weatherspoon , Mahesh Balakrishnan , Ken Birman, Optimizing power consumption in large scale storage systems, Proceedings of the 11th USENIX workshop on Hot topics in operating systems, p.1-6, May 07-09, 2007, San Diego, CA
|
|
|
|
|
|
|
|
|
|
|
|
Orran Krieger , Marc Auslander , Bryan Rosenburg , Robert W. Wisniewski , Jimi Xenidis , Dilma Da Silva , Michal Ostrowski , Jonathan Appavoo , Maria Butrico , Mark Mergen , Amos Waterland , Volkmar Uhlig, K42: building a complete operating system, ACM SIGOPS Operating Systems Review, v.40 n.4, October 2006
|
|
|
Fred Douglis , Ramón Cáceres , Frans Kaashoek , Kai Li , Brian Marsh , Joshua A. Tauber, Storage alternatives for mobile computers, Proceedings of the 1st USENIX conference on Operating Systems Design and Implementation, p.3-es, November 14-17, 1994, Monterey, California
|
|
|
Adam Sweeney , Doug Doucette , Wei Hu , Curtis Anderson , Mike Nishimoto , Geoff Peck, Scalability in the XFS file system, Proceedings of the Annual Technical Conference on USENIX 1996 Annual Technical Conference, p.1-1, January 22-26, 1996, San Diego, CA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Fred Douglis , John Palmer , Elizabeth S. Richards , David Tao , William H. Tetzlaff , John M. Tracey , Jian Yin, Position: short object lifetimes require a delete-optimized storage system, Proceedings of the 11th workshop on ACM SIGOPS European workshop: beyond the PC, September 19-22, 2004, Leuven, Belgium
|
|
|
Dean Hildebrand , Lee Ward , Peter Honeyman, Large files, small writes, and pNFS, Proceedings of the 20th annual international conference on Supercomputing, June 28-July 01, 2006, Cairns, Queensland, Australia
|
|
|
|
|
|
Gaurav Mathur , Peter Desnoyers , Deepak Ganesan , Prashant Shenoy, Capsule: an energy-optimized object storage system for memory-constrained sensor devices, Proceedings of the 4th international conference on Embedded networked sensor systems, October 31-November 03, 2006, Boulder, Colorado, USA
|
|
|
|
|
|
|
|
|
Ramakrishna Gummadi , Nupur Kothari , Todd Millstein , Ramesh Govindan, Declarative failure recovery for sensor networks, Proceedings of the 6th international conference on Aspect-oriented software development, March 12-16, 2007, Vancouver, British Columbia, Canada
|
|
|
|
|
|
|
|
|
John Linwood Griffin , Steven W. Schlosser , Gregory R. Ganger , David F. Nagle, Operating system management of MEMS-based storage devices, Proceedings of the 4th conference on Symposium on Operating System Design & Implementation, p.16-16, October 22-25, 2000, San Diego, California
|
|
|
|
|
|
|
|
|
Michael Abd-El-Malek , William V. Courtright, II , Chuck Cranor , Gregory R. Ganger , James Hendricks , Andrew J. Klosterman , Michael Mesnier , Manish Prasad , Brandon Salmon , Raja R. Sambasivan , Shafeeq Sinnamohideen , John D. Strunk , Eno Thereska , Matthew Wachs , Jay J. Wylie, Ursa minor: versatile cluster-based storage, Proceedings of the 4th conference on USENIX Conference on File and Storage Technologies, p.5-5, December 13-16, 2005, San Francisco, CA
|
|
|
Christopher R. Lumb , Jiri Schindler , Gregory R. Ganger , David F. Nagle , Erik Riedel, Towards higher disk head utilization: extracting free bandwidth from busy disk drives, Proceedings of the 4th conference on Symposium on Operating System Design & Implementation, p.7-7, October 22-25, 2000, San Diego, California
|
|
|
|
|
|
|
|
|
Adam G. Pennington , John D. Strunk , John Linwood Griffin , Craig A. N. Soules , Garth R. Goodson , Gregory R. Ganger, Storage-based intrusion detection: watching storage activity for suspicious behavior, Proceedings of the 12th conference on USENIX Security Symposium, p.10-10, August 04-08, 2003, Washington, DC
|
|
|
|
|
|
John D. Strunk , Garth R. Goodson , Michael L. Scheinholtz , Craig A. N. Soules , Gregory R. Ganger, Self-securing storage: protecting data in compromised system, Proceedings of the 4th conference on Symposium on Operating System Design & Implementation, p.12-12, October 22-25, 2000, San Diego, California
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Edmund B. Nightingale , Kaushik Veeraraghavan , Peter M. Chen , Jason Flinn, Rethink the sync, Proceedings of the 7th conference on USENIX Symposium on Operating Systems Design and Implementation, p.1-1, November 06-08, 2006, Seattle, WA
|
|
|
Margo Seltzer , Keith Bostic , Marshall Kirk Mckusick , Carl Staelin, An implementation of a log-structured file system for UNIX, Proceedings of the USENIX Winter 1993 Conference Proceedings on USENIX Winter 1993 Conference Proceedings, p.3-3, January 25-29, 1993, San Diego, California
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Richard Golding , Peter Bosch , Carl Staelin , Tim Sullivan , John Wilkes, Idleness is not sloth, Proceedings of the USENIX 1995 Technical Conference Proceedings on USENIX 1995 Technical Conference Proceedings, p.17-17, January 16-20, 1995, New Orleans, Louisiana
|
|
|
Margo Seltzer , Keith A. Smith , Hari Balakrishnan , Jacqueline Chang , Sara McMains , Venkata Padmanabhan, File system logging versus clustering: a performance comparison, Proceedings of the USENIX 1995 Technical Conference Proceedings on USENIX 1995 Technical Conference Proceedings, p.21-21, January 16-20, 1995, New Orleans, Louisiana
|
|
|
|
|
|
|
|
|
|
|
|
Jaeheung Lee , Junyoung Heo , Yookun Cho , Jiman Hong , Sung Y. Shin, Secure deletion for NAND flash file system, Proceedings of the 2008 ACM symposium on Applied computing, March 16-20, 2008, Fortaleza, Ceara, Brazil
|
|
|
|
|
|
Marcus Fontoura , Engene Shekita , Jason Y. Zien , Sridhar Rajagopalan , Andreas Neumann, High performance index build algorithms for intranet search engines, Proceedings of the Thirtieth international conference on Very large data bases, p.1122-1133, August 31-September 03, 2004, Toronto, Canada
|
|
|
|
|
|
|
|
|
|
|
|
Muthian Sivathanu , Vijayan Prabhakaran , Florentina I. Popovici , Timothy E. Denehy , Andrea C. Arpaci-Dusseau , Remzi H. Arpaci-Dusseau, Semantically-Smart Disk Systems, Proceedings of the 2nd USENIX Conference on File and Storage Technologies, March 31-31, 2003, San Francisco, CA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Peter Muth , Patrick E. O'Neil , Achim Pick , Gerhard Weikum, Design, Implementation, and Performance of the LHAM Log-Structured History Data Access Method, Proceedings of the 24rd International Conference on Very Large Data Bases, p.452-463, August 24-27, 1998
|
|
|
|
|
|
|
|
|
|
|
|
R. Hugo Patterson , Stephen Manley , Mike Federwisch , Dave Hitz , Steve Kleiman , Shane Owara, SnapMirror: File-System-Based Asynchronous Mirroring for Disaster Recovery, Proceedings of the 1st USENIX Conference on File and Storage Technologies, January 28-30, 2002, Monterey, CA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Seungjae Baek , Seongjun Ahn , Jongmoo Choi , Donghee Lee , Sam H. Noh, Uniformity improving page allocation for flash memory file systems, Proceedings of the 7th ACM & IEEE international conference on Embedded software, September 30-October 03, 2007, Salzburg, Austria
|
|
|
|
|
|
Jongmin Lee , Sunghoon Kim , Hunki Kwon , Choulseung Hyun , Seongjun Ahn , Jongmoo Choi , Donghee Lee , Sam H. Noh, Block recycling schemes and their cost-based optimization in nand flash memory based storage system, Proceedings of the 7th ACM & IEEE international conference on Embedded software, September 30-October 03, 2007, Salzburg, Austria
|
|
|
|
|
|
|
|
|
|
|
|
Yong-Goo Lee , Dawoon Jung , Dongwon Kang , Jin-Soo Kim, μ-FTL:: a memory-efficient flash translation layer supporting multiple mapping granularities, Proceedings of the 7th ACM international conference on Embedded software, October 19-24, 2008, Atlanta, GA, USA
|
|
|
|
|
|
Dawoon Jung , Jaegeuk Kim , Jin-Soo Kim , Joonwon Lee, ScaleFFS: A scalable log-structured flash file system for mobile multimedia systems, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP), v.5 n.1, p.1-18, October 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Kyoungmoon Sun , Seungjae Baek , Jongmoo Choi , Donghee Lee , Sam H. Noh , Sang Lyul Min, LTFTL: lightweight time-shift flash translation layer for flash memory based embedded storage, Proceedings of the 7th ACM international conference on Embedded software, October 19-24, 2008, Atlanta, GA, USA
|
|
|
Jongmin Lee , Eujoon Byun , Hanmook Park , Jongmoo Choi , Donghee Lee , Sam H. Noh, CPS-SIM: configurable and accurate clock precision solid state drive simulator, Proceedings of the 2009 ACM symposium on Applied Computing, March 08-12, 2009, Honolulu, Hawaii
|
|
|
|
|
|
Hakim Weatherspoon , Lakshmi Ganesh , Tudor Marian , Mahesh Balakrishnan , Ken Birman, Smoke and mirrors: reflecting files at a geographically remote location without loss of performance, Proccedings of the 7th conference on File and stroage technologies, p.211-224, February 24-27, 2009, San Francisco, California
|
|
|
|
|
|
Richard P. Spillane , Sachin Gaikwad , Manjunath Chinni , Erez Zadok , Charles P. Wright, Enabling transactional file access via lightweight kernel extensions, Proccedings of the 7th conference on File and stroage technologies, p.29-42, February 24-27, 2009, San Francisco, California
|
|
|
Chuanyi Liu , Yu Gu , Linchun Sun , Bin Yan , Dongsheng Wang, R-ADMAD: high reliability provision for large-scale de-duplication archival storage systems, Proceedings of the 23rd international conference on Supercomputing, June 08-12, 2009, Yorktown Heights, NY, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Anirudh Badam , KyoungSoo Park , Vivek S. Pai , Larry L. Peterson, HashCache: cache storage for the next billion, Proceedings of the 6th USENIX symposium on Networked systems design and implementation, p.123-136, April 22-24, 2009, Boston, Massachusetts
|
|
|
|
|
|
Xiao-Yu Hu , Evangelos Eleftheriou , Robert Haas , Ilias Iliadis , Roman Pletka, Write amplification analysis in flash-based solid state drives, Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference, May 04-April 06, 2009, Haifa, Israel
|
|
|
|
|
|
|
|
|
|
INDEX TERMS
Primary Classification:
D.
Software
D.4
OPERATING SYSTEMS
D.4.2
Storage Management
Subjects:
Secondary storage
Additional Classification:
D.
Software
D.4
OPERATING SYSTEMS
D.4.2
Storage Management
Subjects:
Allocation/deallocation strategies
D.4.5
Reliability
Subjects:
Checkpoint/restart
D.4.8
Performance
Subjects:
Measurements;
Operational analysis;
Simulation
H.
Information Systems
H.2
DATABASE MANAGEMENT
H.2.2
Physical Design
Subjects:
Recovery and restart
H.3
INFORMATION STORAGE AND RETRIEVAL
H.3.2
Information Storage
Subjects:
File organization
General Terms:
Algorithms,
Design,
Measurement,
Performance
Keywords:
Unix,
disk storage management,
fast crash recovery,
file system organization,
file system performance,
high write performance,
log-structured,
logging
REVIEW
"Charles N. Schroeder : Reviewer"
This well-written paper is of particular interest at this time
since it deals with a much-improved disk storage management system for
UNIX-based systems. The system described shows an order of magnitude
increase in performance over
more...
|