ACM Home Page
Please provide us with feedback. Feedback
Data exchange: getting to the core
Full text PdfPdf (262 KB)
Source ACM Transactions on Database Systems (TODS) archive
Volume 30 ,  Issue 1  (March 2005) table of contents
Special Issue: SIGMOD/PODS 2003
Pages: 174 - 210  
Year of Publication: 2005
ISSN:0362-5915
Authors
Ronald Fagin  IBM Almaden Research Center, San Jose, CA
Phokion G. Kolaitis  IBM Almaden Research Center, San Jose, CA
Lucian Popa  IBM Almaden Research Center, San Jose, CA
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 21,   Downloads (12 Months): 130,   Citation Count: 32
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1061318.1061323
What is a DOI?

ABSTRACT

Data exchange is the problem of taking data structured under a source schema and creating an instance of a target schema that reflects the source data as accurately as possible. Given a source instance, there may be many solutions to the data exchange problem, that is, many target instances that satisfy the constraints of the data exchange problem. In an earlier article, we identified a special class of solutions that we call universal. A universal solution has homomorphisms into every possible solution, and hence is a “most general possible” solution. Nonetheless, given a source instance, there may be many universal solutions. This naturally raises the question of whether there is a “best” universal solution, and hence a best solution for data exchange. We answer this question by considering the well-known notion of the core of a structure, a notion that was first studied in graph theory, and has also played a role in conjunctive-query processing. The core of a structure is the smallest substructure that is also a homomorphic image of the structure. All universal solutions have the same core (up to isomorphism); we show that this core is also a universal solution, and hence the smallest universal solution. The uniqueness of the core of a universal solution together with its minimality make the core an ideal solution for data exchange. We investigate the computational complexity of producing the core. Well-known results by Chandra and Merlin imply that, unless P = NP, there is no polynomial-time algorithm that, given a structure as input, returns the core of that structure as output. In contrast, in the context of data exchange, we identify natural and fairly broad conditions under which there are polynomial-time algorithms for computing the core of a universal solution. We also analyze the computational complexity of the following decision problem that underlies the computation of cores: given two graphs G and H, is H the core of G? Earlier results imply that this problem is both NP-hard and coNP-hard. Here, we pinpoint its exact complexity by establishing that it is a DP-complete problem. Finally, we show that the core is the best among all universal solutions for answering existential queries, and we propose an alternative semantics for answering queries in data exchange settings.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
3
4
 
5
 
6
Cosmadakis, S. S. and Kanellakis, P. C. 1986. Functional and inclusion dependencies: A graph theoretic approach. In Advances in Computing Research., vol. 3. JAI Press, Greenwich, CT, 163--184.
 
7
 
8
9
 
10
 
11
12
 
13
 
14
 
15
 
16
17
18
 
19
20
 
21
Papadimitriou, C. H. 1994. Computational Complexity. Addison-Wesley, Reading, MA.
 
22
Popa, L., Velegrakis, Y., Miller, R. J., Hernandez, M. A., and Fagin, R. 2002. Translating Web data. In Proceedings of the International Conference on Very Large Data Bases (VLDB). 598--609.
23
 
24

CITED BY  32

Collaborative Colleagues:
Ronald Fagin: colleagues
Phokion G. Kolaitis: colleagues
Lucian Popa: colleagues