ABSTRACT
We present a replication-based approach to fault-tolerant distributed stream processing in the face of node failures, network failures, and network partitions. Our approach aims to reduce the degree of inconsistency in the system while guaranteeing that available inputs capable of being processed are processed within a specified time threshold. This threshold allows a user to trade availability for consistency: a larger time threshold decreases availability but limits inconsistency, while a smaller threshold increases availability but produces more inconsistent results based on partial data. In addition, when failures heal, our scheme corrects previously produced results, ensuring eventual consistency.

Our scheme uses a data-serializing operator to ensure that all replicas process data in the same order, and thus remain consistent in the absence of failures. To regain consistency after a failure heals, we experimentally compare approaches based on checkpoint/redo and undo/redo techniques and illustrate the performance trade-offs between these schemes.
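The idea of serializing data so that replicas stay consistent can be illustrated with a minimal sketch. It is not the paper's operator: the tuple layout `(timestamp, stream_id, payload)`, the function name `serialize`, and the choice to sort a whole batch at once are assumptions made only for illustration. The point it shows is that if the processing order depends only on tuple content, two replicas that receive the same tuples in different arrival orders still process them identically.

```python
from typing import Iterable, List, Tuple

# Hypothetical tuple layout: (timestamp, stream_id, payload).
StreamTuple = Tuple[int, str, str]


def serialize(arrivals: Iterable[StreamTuple]) -> List[StreamTuple]:
    """Order tuples by content only, so the result ignores arrival order.

    Assumption: the whole batch is available before sorting. A streaming
    operator would instead have to work on bounded intervals of input.
    """
    return sorted(arrivals, key=lambda t: (t[0], t[1], t[2]))


# Two replicas receive the same tuples, interleaved differently by the network.
replica_a = [(1, "s1", "x"), (1, "s2", "y"), (2, "s1", "z")]
replica_b = [(1, "s2", "y"), (2, "s1", "z"), (1, "s1", "x")]

order_a = serialize(replica_a)
order_b = serialize(replica_b)
```

Here `order_a` and `order_b` are equal, so any deterministic computation applied downstream yields the same results on both replicas. The sketch does not model the time threshold or the recovery techniques mentioned in the abstract.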