|
ABSTRACT
XML-based documents play a major role in modern information architectures and their corresponding workflows. In this context, the ability to identify and represent differences between two versions of a document is essential. Several approaches to finding the differences between XML documents have already been proposed. Typically, they are based on tree-to-tree correction, or sequence alignment. Most of these algorithms, however, are too slow and do not support the subsequent merging of changes. In this paper, we present a differencing algorithm tailored to ordered XML documents, called DocTreeDiff. It relies on our context-oriented XML versioning model which allows for document merging, presented in earlier work. An empiric evaluation demonstrates the efficiency of our approach as well as the high quality of the generated deltas.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
|
| |
3
|
|
| |
4
|
M. Brauer, R. Weir, and M. McRae. OpenDocument v1.1 specification, 2007.
|
 |
5
|
|
 |
6
|
Sudarshan S. Chawathe , Anand Rajaraman , Hector Garcia-Molina , Jennifer Widom, Change detection in hierarchically structured information, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.493-504, June 04-06, 1996, Montreal, Quebec, Canada
|
| |
7
|
|
| |
8
|
|
| |
9
|
S. DeRose and J. Clark. XML path language (XPath) version 1.0. W3C recommendation, W3C, Nov. 1999. http://www.w3.org/TR/1999/REC-xpath-19991116.
|
| |
10
|
R. L. Fontaine. Merging xml files: a new approach providing intelligent merge of xml data sets. In Proceedings of XML Europe 2002, 2002.
|
| |
11
|
Kuo-Si Huang , Chang-Biau Yang , Kuo-Tsung Tseng , Hsing-Yen Ann , Yung-Hsing Peng, Efficient algorithms for finding interleaving relationship between sequences, Information Processing Letters, v.105 n.5, p.188-193, February, 2008
[doi> 10.1016/j.ipl.2007.08.028]
|
| |
12
|
|
| |
13
|
|
 |
14
|
|
 |
15
|
|
| |
16
|
E. W. Myers. An o(nd) difference algorithm and its variations. Algorithmica, 1:251--266, 1986.
|
| |
17
|
J. Paoli, I. Valet-Harper, A. Farquhar, and I. Sebestyen. ECMA-376 Office Open XML File Formats, 2006.
|
| |
18
|
S. Pemberton. XHTML™ 1.0 the extensible hypertext markup language (second edition). W3C recommendation, W3C, Aug. 2002. http://www.w3.org/TR/2002/REC-xhtml1-20020801.
|
| |
19
|
M. O. Rabin. Fingerprinting by random polynomials. Technical Report TR-CSE-03-01, Center for Research in Computing Technology, Harvard University, 1981.
|
| |
20
|
|
 |
21
|
|
 |
22
|
|
| |
23
|
|
| |
24
|
S. M. Selkow. The tree-to-tree editing problem. Inf. Process. Lett., 6(6):184--186, 1977.
|
 |
25
|
|
| |
26
|
G. Valiente. An efficient bottom-up distance between trees. In SPIRE, pages 212--219, 2001.
|
 |
27
|
|
| |
28
|
|
| |
29
|
|
| |
30
|
K. Zhang, J. T.-L. Wang, and D. Shasha. On the editing distance between undirected acyclic graphs and related problems. In CPM '95: Proceedings of the 5th Annual Symposium on Combinatorial Pattern Matching, pages 395--407, 1995.
|
|