ABSTRACT
Design principles for XML schemas that eliminate redundancies and avoid update anomalies have been studied recently. Several normal forms, generalizing those for relational databases, have been proposed. All of them, however, are based on the assumption of a native XML storage, while in practice most XML data is stored in relational databases. In this paper we study XML design and normalization for relational storage of XML documents. To be able to relate and compare XML and relational designs, we use an information-theoretic framework that measures information content in relations and documents, with higher values corresponding to lower levels of redundancy. We show that most common relational storage schemes preserve the notion of being well-designed (i.e., anomaly- and redundancy-free). Thus, existing XML normal forms guarantee well-designed relational storage as well. We further show that if this perfect option is not achievable, then a slight restriction on XML constraints guarantees a "second-best" relational design, according to possible values of the information-theoretic measure. We finally consider an edge-based relational representation of XML documents, and show that while its information-theoretic properties are similar to those of other relational representations, it can behave significantly worse in terms of enforcing integrity constraints.