Transparent checkpoints of closed distributed systems in Emulab
Source
European Conference on Computer Systems
Proceedings of the 4th ACM European Conference on Computer Systems
Nuremberg, Germany
Session: Real, running systems
Pages 173-186
Year of Publication: 2009
ISBN: 978-1-60558-482-9
Authors
Anton Burtsev  University of Utah, Salt Lake City, UT, USA
Prashanth Radhakrishnan  NetApp, Bangalore, India
Mike Hibler  University of Utah, Salt Lake City, UT, USA
Jay Lepreau  University of Utah, Salt Lake City, UT, USA
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1519065.1519084

ABSTRACT

Emulab is a testbed for networked and distributed systems experimentation. Two guiding principles of its design are realism and control of experimentation. There is an inherent tension between these goals, however, and in some aspects of the testbed's design, Emulab's implementers favored realism over control. Thus, Emulab provides wide-ranging control over an experiment's environment and initial conditions, but relatively little control over its execution--in particular, the ability to suspend, preempt, or replay the experiment.

We have extended Emulab with a new means of control over experiment execution: the ability to cleanly checkpoint the execution of the set of nodes and networks that comprise an experiment. Conventional checkpoint mechanisms can easily degrade the fidelity of experiment results as a consequence of checkpoint downtimes, overheads of background state saving, and unintended distributed checkpoint synchronization effects. In this paper we demonstrate a checkpointing technique that is transparent with respect to the execution of the system under test, almost completely concealing the underlying checkpoint activity.
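One way to picture how checkpoint downtime could be concealed from the system under test is a guest-visible clock that stops while a checkpoint is in progress. The sketch below is only an illustration under that assumption; the names `VirtualClock` and `FakeHostClock` are hypothetical and are not taken from the paper or from Emulab.

```python
class FakeHostClock:
    """Hypothetical stand-in for real (host) time, advanced by hand."""

    def __init__(self):
        self.t = 0.0

    def __call__(self):
        return self.t

    def advance(self, seconds):
        self.t += seconds


class VirtualClock:
    """Guest-visible clock that excludes time spent inside checkpoints."""

    def __init__(self, now_fn):
        self._now = now_fn          # callable returning real time in seconds
        self._hidden = 0.0          # total downtime concealed so far
        self._suspended_at = None   # real time when the current checkpoint began

    def suspend(self):
        # Called when a checkpoint starts: freeze guest-visible time.
        if self._suspended_at is None:
            self._suspended_at = self._now()

    def resume(self):
        # Called when the checkpoint ends: record the downtime as hidden.
        if self._suspended_at is not None:
            self._hidden += self._now() - self._suspended_at
            self._suspended_at = None

    def time(self):
        # While suspended, report the frozen instant; otherwise real time
        # minus all concealed downtime.
        real = self._suspended_at if self._suspended_at is not None else self._now()
        return real - self._hidden


host = FakeHostClock()
clock = VirtualClock(host)
host.advance(10.0)    # experiment runs for 10 s
clock.suspend()
host.advance(3.0)     # 3 s of checkpoint downtime
clock.resume()
host.advance(2.0)     # experiment runs for 2 more seconds
print(clock.time())   # 12.0: the 3 s checkpoint is invisible to the guest
```

A real testbed would also have to deal with in-flight network packets and with keeping checkpoints coordinated across nodes, which this single-clock sketch does not attempt.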

Building on our checkpoint mechanism, we have implemented two powerful facilities for experiment execution control: the ability to preemptively swap-out experiments without losing their run-time state, and the ability to time-travel through the run of a system.
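The time-travel facility can be thought of as keeping a history of checkpoints and restoring an earlier one on demand. The following is a minimal sketch of that idea only, assuming experiment state can be modeled as a plain dictionary; `CheckpointHistory` and its methods are hypothetical names, not an Emulab interface.

```python
import copy


class CheckpointHistory:
    """Toy store of labeled snapshots of an experiment's state."""

    def __init__(self):
        self._snapshots = {}  # label -> deep copy of the state at that point

    def take(self, label, state):
        # Deep-copy so later changes to the live state do not alter the snapshot.
        self._snapshots[label] = copy.deepcopy(state)

    def travel_to(self, label):
        # Return a fresh copy so the stored snapshot can be revisited again.
        return copy.deepcopy(self._snapshots[label])


# Hypothetical two-node experiment state.
state = {"node0": {"packets_sent": 0}, "node1": {"packets_sent": 0}}
history = CheckpointHistory()

history.take("start", state)
state["node0"]["packets_sent"] = 5
history.take("after-burst", state)
state["node1"]["packets_sent"] = 7

# Go back to the earlier point in the run; node1's later activity is undone.
state = history.travel_to("after-burst")
print(state)  # {'node0': {'packets_sent': 5}, 'node1': {'packets_sent': 0}}
```

Returning copies rather than the stored objects keeps each snapshot immutable, so the same point in the run can be revisited repeatedly.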


