ACM Home Page
Please provide us with feedback. Feedback
An information-theoretic approach to normal forms for relational and XML data
Full text PdfPdf (366 KB)
Source Journal of the ACM (JACM) archive
Volume 52 ,  Issue 2  (March 2005) table of contents
Pages: 246 - 283  
Year of Publication: 2005
ISSN:0004-5411
Authors
Marcelo Arenas  University of Toronto, Toronto, Ontario, Canada
Leonid Libkin  University of Toronto, Toronto, Ontario, Canada
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 14,   Downloads (12 Months): 126,   Citation Count: 10
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1059513.1059519
What is a DOI?

ABSTRACT

Normalization as a way of producing good relational database designs is a well-understood topic. However, the same problem of distinguishing well-designed databases from poorly designed ones arises in other data models, in particular, XML. While, in the relational world, the criteria for being well designed are usually very intuitive and clear to state, they become more obscure when one moves to more complex data models.Our goal is to provide a set of tools for testing when a condition on a database design, specified by a normal form, corresponds to a good design. We use techniques of information theory, and define a measure of information content of elements in a database with respect to a set of constraints. We first test this measure in the relational context, providing information-theoretic justification for familiar normal forms such as BCNF, 4NF, PJ/NF, 5NFR, DK/NF. We then show that the same measure applies in the XML context, which gives us a characterization of a recently introduced XML normal form called XNF. Finally, we look at information-theoretic criteria for justifying normalization algorithms.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
3
4
 
5
Beeri, C., Bernstein, P., and Goodman, N. 1978. A sophisticate's introduction to database normalization theory. In Proceedings of the 4th International Conference on Very Large Data Bases. 113--124.
 
6
 
7
 
8
9
 
10
11
12
13
 
14
 
15
 
16
 
17
 
18
 
19
Ley, M. 2003. DBLP. http://www.informatik.uni-trier.de/~ley/db/index.html.
20
21
22
 
23
Papadimitriou, C. 1994. Computational complexity. Addison-Wesley, Reading, Mass.
 
24
Shannon, C. 1948. A mathematical theory of communication. Bell Syst. Tech. J. 27, 379--423 (Part I), 623--656 (Part II).
25
26
27
 
28
 
29
Vincent, M. 1999. Semantic foundations of 4NF in relational database design. Acta Inf. 36, 3, 173--213.

CITED BY  10

Collaborative Colleagues:
Marcelo Arenas: colleagues
Leonid Libkin: colleagues