ABSTRACT
We present an archiving technique for hierarchical data with key structure. Our approach is based on timestamps: an element that appears in multiple versions of the database is stored only once, along with a compact description of the versions in which it appears. The basic idea of timestamping was discovered by Driscoll et al. in the context of persistent data structures, where one wishes to track the sequence of changes made to a data structure. We extend this idea to develop an archiving tool for XML data that provides meaningful change descriptions and efficiently supports a variety of basic functions concerning the evolution of data, such as retrieving any specific version from the archive and querying the temporal history of any element. This is in contrast to diff-based approaches, where such operations may require undoing a large number of changes or significant reasoning with the deltas. Surprisingly, our archiving technique does not incur any significant space overhead compared with other approaches. Our experimental results support this and also show that the compacted archive file interacts well with other compression techniques. Finally, another useful property of our approach is that the resulting archive is also in XML and hence can directly leverage existing XML tools.
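The timestamping idea can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes each version is a nested Python dict whose keys stand in for element keys (leaf values are encoded as keys such as `"name=alpha"`), and the names `Node`, `merge`, `retrieve`, and `history` are hypothetical. For simplicity, every node stores its own list of version intervals.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Inclusive [first, last] version intervals in which this element appears.
    versions: list = field(default_factory=list)
    # Child elements, indexed by their (assumed unique) key.
    children: dict = field(default_factory=dict)


def _stamp(node, v):
    """Record that `node` is present in version `v` (versions arrive in order)."""
    if node.versions and node.versions[-1][1] >= v - 1:
        # Extend the latest interval instead of storing a new one.
        node.versions[-1][1] = max(node.versions[-1][1], v)
    else:
        node.versions.append([v, v])


def merge(archive, doc, v):
    """Merge version `v` of a keyed document into the archive in place."""
    _stamp(archive, v)
    for key, sub in doc.items():
        # An element with the same key path is stored only once.
        child = archive.children.setdefault(key, Node())
        merge(child, sub, v)


def _present(node, v):
    return any(lo <= v <= hi for lo, hi in node.versions)


def retrieve(archive, v):
    """Rebuild version `v` by keeping only elements stamped with `v`."""
    return {k: retrieve(c, v) for k, c in archive.children.items() if _present(c, v)}


def history(archive, path):
    """Return the version intervals of the element at the given key path."""
    node = archive
    for key in path:
        node = node.children[key]
    return [tuple(iv) for iv in node.versions]
```

Under these assumptions, retrieving a version is a single filtered traversal of the archive, and an element's temporal history is read directly from its interval list, with no deltas to undo or replay.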