ACM Home Page
Please provide us with feedback. Feedback
Meaningful change detection in structured data
Full text PdfPdf (1.67 MB)
Source International Conference on Management of Data archive
Proceedings of the 1997 ACM SIGMOD international conference on Management of data table of contents
Tucson, Arizona, United States
Pages: 26 - 37  
Year of Publication: 1997
ISBN:0-89791-911-4
Also published in ...
Authors
Sudarshan S. Chawathe  Computer Science Department, Stanford University, Stanford, California
Hector Garcia-Molina  Computer Science Department, Stanford University, Stanford, California
Sponsor
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 14,   Downloads (12 Months): 83,   Citation Count: 53
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/253260.253266
What is a DOI?

ABSTRACT

Detecting changes by comparing data snapshots is an important requirement for difference queries, active databases, and version and configuration management. In this paper we focus on detecting meaningful changes in hierarchically structured data, such as nested-object data. This problem is much more challenging than the corresponding one for relational or flat-file data. In order to describe changes better, we base our work not just on the traditional “atomic” insert, delete, update operations, but also on operations that move an entire sub-tree of nodes, and that copy an entire sub-tree. These operations allows us to describe changes in a semantically more meaningful way. Since this change detection problem is NP-hard, in this paper we present a heuristic change detection algorithm that yields close to “minimal” descriptions of the changes, and that has fewer restrictions than previous algorithms. Our algorithm is based on transforming the change detection problem to a problem of computing a minimum-cost edge cover of a bipartite graph. We study the quality of the solution produced by our algorithm, as well as the running time, both analytically and experimentally.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
CGM97
S. Chawathe and H. Garcia-Molina. Meaningful change detection in structured data. Available at URL http://w~ra-db, stanford, edu, 1997. Extended version.
 
CGMH+94
S. Chawathe, H. Garcia-Molina, J. Hammer, K. Ireland, Y. Papakonstantinou, J. Ullman, and J. Widom. The Tsimmis project: Integration of heterogeneous information sources. In Proceedings of lOOth Anniversary Meeting of the information Processing Society of Japan, pages 7-18, Tokyo, Japan, October 1994.
CRGMW96
 
HHS+
M. Haertel, D. Hayes, R. Stallman, L. Tower, P. Eggert., and W. Davison. The GNU diff program. Texinfo system documentation. Available by anonymous FTP from prep. ai .mit. edu.
 
Law76
E. Lawler. Combinatorial Optimization: Networks and Matroids. Holt, Rinehart and Winston, 1976.
 
LGM96
 
Mye86
E. Myers. An O(UO) difference algorithm and its variations. Algorithmica, 1(2):251-266, 1986.
 
PS82
 
Rot
E. Rothberg. The wmatch program for finding a maximum-weight matching for undirected graphs. Live OR collection. Available at URL http ://w~. orsoc, org. uk.
 
SWZS94
D. Shasha, J. Wang, K. Zhang, and F. Shih. Exact and approximate algorithms for unordered tree matching. IEEE Transactions on Systems, Man, and Cybernetics, 24(4):668-678, April 1994.
 
SZ90
Wag75
WF74
 
WMG90
 
WU95
J. Widom and J. Ullman. The C3 project: Changes, consistency, and configurations in heterogeneous distributed information systems. Unpublished manuscript; available at URL http://www-db, stan~ord, edu, 1995.
 
ZS89
 
ZWS95
K. Zhang, J. Wang, and D. Shasha. On the editing distance between undirected acyclic graphs. International Journal of Foundations of Computer Science, 1995.

CITED BY  53

Collaborative Colleagues:
Sudarshan S. Chawathe: colleagues
Hector Garcia-Molina: colleagues