ACM Home Page
Please provide us with feedback. Feedback
Repair checking in inconsistent databases: algorithms and complexity
Full text PdfPdf (471 KB)
Source ACM International Conference Proceeding Series; Vol. 361 archive
Proceedings of the 12th International Conference on Database Theory table of contents
St. Petersburg, Russia
SESSION: Inconsistency and repairs table of contents
Pages 31-41  
Year of Publication: 2009
ISBN:978-1-60558-423-2
Authors
Foto N. Afrati  National Technical University of Athens, Greece
Phokion G. Kolaitis  UC Santa Cruz and IBM Almaden
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 17,   Downloads (12 Months): 115,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1514894.1514899
What is a DOI?

ABSTRACT

Managing inconsistency in databases has long been recognized as an important problem. One of the most promising approaches to coping with inconsistency in databases is the framework of database repairs, which has been the topic of an extensive investigation over the past several years. Intuitively, a repair of an inconsistent database is a consistent database that differs from the given inconsistent database in a minimal way. So far, most of the work in this area has addressed the problem of obtaining the consistent answers to a query posed on an inconsistent database. Repair checking is the following decision problem: given two databases r and r', is r' a repair of r? Although repair checking is a fundamental algorithmic problem about inconsistent databases, it has not received as much attention as consistent query answering. In this paper, we give a polynomial-time algorithm for subset-repair checking under integrity constraints that are the union of a weakly acyclic set of local-as-view (LAV) tuple-generating dependencies and a set of equality-generating dependencies. This result significantly generalizes earlier work for subset-repair checking when the integrity constraints are the union of an acyclic set of inclusion dependencies and a set of functional dependencies. We also give a polynomial-time algorithm for symmetric-difference repair checking, when the integrity constraints form a weakly acyclic set of LAV tgds. After this, we establish a number of complexity-theoretic results that delineate the boundary between tractability and intractability for the repair-checking problem. Specifically, we show that the aforementioned tractability results are optimal; in particular, subset-repair checking for arbitrary weakly acyclic sets of tuple-generating dependencies is a coNP-complete problem. We also study cardinality-based repairs and show that cardinality-repair checking is coNP-complete for various classes of integrity constraints encountered in database design and data exchange.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
3
 
4
L. E. Bertossi, L. Bravo, E. Franconi, and A. Lopatenko. Data cleansing for numerical data sets. In SEBD, pages 292--299, 2005.
 
5
L. E. Bertossi, L. Bravo, E. Franconi, and A. Lopatenko. Fixing inconsistent databases by updating numerical attributes. In DEXA Workshops, pages 854--858, 2005.
 
6
L. E. Bertossi, L. Bravo, E. Franconi, and A. Lopatenko. Fixing inconsistent databases by updating numerical attributes. In DEXA Workshops, pages 854--858, 2005.
 
7
P. Bohannon, W. Fan, F. Geerts, X. Jia, and A. Kementsietsidis. Conditional functional dependencies for data cleaning. In ICDE, pages 746--755, 2007.
8
 
9
J. Chomicki. Consistent query answering: Five easy pieces. In ICDT, pages 1--17, 2007.
 
10
 
11
 
12
W. Eckerson. Data quality and the bottom line: Achieving business success through a commitment to high quality data. Technical report, The Data Warehousing Institute, 2002. http://www.tdwi.org/research/display.aspx?ID=6064.
 
13
 
14
15
 
16
H. Galhardas, D. Florescu, D. Shasha, E. Simon, and C.-A. Saita. Improving data cleaning quality using a data lineage facility. In DMDW, page 3, 2001.
17
 
18
 
19
20
21
 
22
A. Lopatenko and L. E. Bertossi. Complexity of consistent query answering in databases under cardinality-based and incremental repair semantics. In ICDT, pages 179--193, 2007.
 
23
D. Menestrina, O. Benjelloun, and H. Garcia-Molina. Generic entity resolution with data confidences. In CleanDB, 2006.
 
24
C. Moore and J. M. Robson. Hard tiling problems with simple tiles. Discrete and Computational Geometry, 26(4):573--590, 2001.
 
25
E. Rahm and H. Do. Data cleaning: Problems and current approaches. IEEE Data Engineering Bulletin, 23(4), 2000.
 
26
 
27
28

Collaborative Colleagues:
Foto N. Afrati: colleagues
Phokion G. Kolaitis: colleagues