ACM Home Page
Please provide us with feedback. Feedback
Simplifying XML schema: effortless handling of nondeterministic regular expressions
Full text PdfPdf (444 KB)
Source
International Conference on Management of Data archive
Proceedings of the 35th SIGMOD international conference on Management of data table of contents
Providence, Rhode Island, USA
SESSION: Research session 19: semi-structured data management table of contents
Pages 731-744  
Year of Publication: 2009
ISBN:978-1-60558-551-2
Authors
Geert Jan Bex  Hasselt University and Transnational University of Limburg, Hasselt, Belgium
Wouter Gelade  Hasselt University and Transnational University of Limburg, Hasselt, Belgium
Wim Martens  Technical University of Dortmund, Dortmund, Germany
Frank Neven  Hasselt University and Transnational University of Limburg, Hasselt, Belgium
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 36,   Downloads (12 Months): 135,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1559845.1559922
What is a DOI?

ABSTRACT

Whether beloved or despised, XML Schema is momentarily the only industrially accepted schema language for XML and is unlikely to become obsolete any time soon. Nevertheless, many nontransparent restrictions unnecessarily complicate the design of XSDs. For instance, complex content models in XML Schema are constrained by the infamous unique particle attribution (UPA) constraint. In formal language theoretic terms, this constraint restricts content models to deterministic regular expressions. As the latter constitute a semantic notion and no simple corresponding syntactical characterization is known, it is very difficult for non-expert users to understand exactly when and why content models do or do not violate UPA. In the present paper, we therefore investigate solutions to relieve users from the burden of UPA by automatically transforming nondeterministic expressions into concise deterministic ones defining the same language or constituting good approximations. The presented techniques facilitate XSD construction by reducing the design task at hand more towards the complexity of the modeling task. In addition, our algorithms can serve as a plug-in for any model management tool which supports export to XML Schema format.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
3
 
4
P. A. Bernstein. Applying Model Management to Classical Meta Data Problems. In Conference on Innovative Data Systems Research (CIDR), 2003.
5
6
 
7
 
8
9
 
10
 
11
12
 
13
 
14
15
 
16
W. Gelade and F. Neven. Succinctness of the complement and intersection of regular expressions. In Symposium on Theoretical Aspects of Computer Science (STACS), p. 325--336, 2008.
 
17
G. Ghelli, D. Colazzo, C. Sartiani. Efficient inclusion for a class of xml types with interleaving and counting. In Database Programming Languages (DBPL), p. 231--245, 2007.
18
 
19
H. Gruber and J. Johannsen. Optimal lower bounds on regular expression size using communication complexity. In Foundations of Software Science and Computation Structures (FOSSACS), p. 273--286, 2008.
 
20
 
21
 
22
 
23
 
24
W. Martens, F. Neven, and T. Schwentick. Complexity of decision problems for simple regular expressions. In Mathematical Foundations of Computer Science (MFCS), p. 889--900, 2004.
25
26
 
27
F. Neven and T. Schwentick. On the complexity of XPath containment in the presence of disjunction, DTDs, and variables. Logical Methods in Computer Science, 2(3), 2006.
 
28
29
 
30
H.S. Thompson, D. Beech, M. Maloney, and N. Mendelsohn. XML Schema part 1: Structures. World Wide Web Consortium (W3C), May 2001.
 
31
 
32
P. van Emde Boas. The convenience of tilings. In Complexity, Logic and Recursion Theory, p. 331--363.

Collaborative Colleagues:
Geert Jan Bex: colleagues
Wouter Gelade: colleagues
Wim Martens: colleagues
Frank Neven: colleagues