|
ABSTRACT
Parallel programs are difficult to debug because they run for a, long time and two executions may yield different results. Reverse execution, is a simple and powerful concept that solves both these problems. We are designing a tool for debugging parallel programs, called Recap, that provides the illusion of reverse execution using checkpoints and event recording and playback. During normal execution, Recap logs the results of system calls and shared memory reads: as well as the times that asynchronous events (signals) occur. Recap periodically checkpoints the state of a process by forking and suspending a new process. To reverse execute to a certain point in time, Recap continues the nearest checkpoint process forward in a self-contained environment, simulating all events using the log. We are implementing Recap as part of a larger environment for parallel program development.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
 |
2
|
|
| |
3
|
R. Curt, is and L. Wittie, "Bugnet: A Debugging System for Pa.rallel Programming Environments", Proceedings of the 3rd Interna.tional Conference on Distributed Computing Systems, Miami, Florida, October 1982, pp. 394-399.
|
 |
4
|
|
| |
5
|
|
| |
6
|
M. A. Linton, "Distributed Management of a Software Database", IEEE Software, Vol. 4, No. 6, November 1987, pp 70-76.
|
| |
7
|
B. P. Miller and Jong-Deok Choi, "A Mechanism for Efficient Debugging of Parallel Programs", Technical l~eport~ TR754, University of Wisconsin-Madison, 1987.
|
 |
8
|
M. Young , A. Tevanian , R. Rashid , D. Golub , J. Eppinger, The duality of memory and communication in the implementation of a multiprocessor operating system, Proceedings of the eleventh ACM Symposium on Operating systems principles, p.63-76, November 08-11, 1987, Austin, Texas, United States
|
CITED BY 25
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
James S. Plank , Micah Beck , Gerry Kingsley , Kai Li, Libckpt: transparent checkpointing under Unix, Proceedings of the USENIX 1995 Technical Conference Proceedings on USENIX 1995 Technical Conference Proceedings, p.18-18, January 16-20, 1995, New Orleans, Louisiana
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Peer to Peer - Readers of this Article have also read:
-
Data structures for quadtree approximation and compression
Communications of the ACM
28, 9
Hanan Samet
-
A hierarchical single-key-lock access control using the Chinese remainder theorem
Proceedings of the 1992 ACM/SIGAPP Symposium on Applied computing
Kim S. Lee
, Huizhu Lu
, D. D. Fisher
-
Putting innovation to work: adoption strategies for multimedia communication systems
Communications of the ACM
34, 12
Ellen Francik
, Susan Ehrlich Rudman
, Donna Cooper
, Stephen Levine
-
An intelligent component database for behavioral synthesis
Proceedings of the 27th ACM/IEEE Design Automation Conference on
Gwo-Dong Chen
, Daniel D. Gajski
-
The GemStone object database management system
Communications of the ACM
34, 10
Paul Butterworth
, Allen Otis
, Jacob Stein
|