|
ABSTRACT
The common abstraction of XML Schema by unranked regular tree languages is not entirely accurate. To shed some light on the actual expressive power of XML Schema, intuitive semantical characterizations of the Element Declarations Consistent (EDC) rule are provided. In particular, it is obtained that schemas satisfying EDC can only reason about regular properties of ancestors of nodes. Hence, with respect to expressive power, XML Schema is closer to DTDs than to tree automata. These theoretical results are complemented with an investigation of the XML Schema Definitions (XSDs) occurring in practice, revealing that the extra expressiveness of XSDs over DTDs is only used to a very limited extent. As this might be due to the complexity of the XML Schema specification and the difficulty of understanding the effect of constraints on typing and validation of schemas, a simpler formalism equivalent to XSDs is proposed. It is based on contextual patterns rather than on recursive types and it might serve as a light-weight front end for XML Schema. Next, the effect of EDC on the way XML documents can be typed is discussed. It is argued that a cleaner, more robust, larger but equally feasible class is obtained by replacing EDC with the notion of 1-pass preorder typing (1PPT): schemas that allow one to determine the type of an element of a streaming document when its opening tag is met. This notion can be defined in terms of grammars with restrained competition regular expressions and there is again an equivalent syntactical formalism based on contextual patterns. Finally, algorithms for recognition, simplification, and inclusion of schemas for the various classes are given.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
 |
2
|
|
 |
3
|
Geert Jan Bex , Wim Martens , Frank Neven , Thomas Schwentick, Expressiveness of XSDs: from practice to theory, there and back again, Proceedings of the 14th international conference on World Wide Web, May 10-14, 2005, Chiba, Japan
[doi> 10.1145/1060745.1060848]
|
 |
4
|
|
| |
5
|
Bray, T., Paoli, J., Sperberg-McQueen, C., Maler, E., and Yergeau, F. 2004. Extensible Markup Language (XML). Tech. rep. World Wide Web Consortium. Go online to http://www.w3.org/TR/REC-xml/.
|
| |
6
|
Brüggemann-Klein, A., Murata, M., and Wood, D. 2001. Regular tree and regular hedge languages over unranked alphabets: Version 1, April 3, 2001. Tech. rep. HKUST-TCSC-2001-0. The Hongkong University of Science and Technology, Hong Kong.
|
| |
7
|
|
| |
8
|
Buck, L., Goldfarb, C., and Prescod, P. 2000. Datatypes for DTDs (DT4DTD) 1.0. Tech. rep. World Wide Web Consortium. Go online to http://www.w3.org/TR/dt4dtd/.
|
| |
9
|
Clark, J. 2002. Multi-format schema converter based on RELAX NG. Go online to http://www.thaiopensource.com/relaxng/trang.html.
|
| |
10
|
Clark, J. and Murata, M. 2001. Relax NG specification. Go online to http://www.relaxng.org/spec-20011203.html.
|
 |
11
|
Claudio Sacerdoti Coen , Paolo Marinelli , Fabio Vitali, Schemapath, a minimal extension to xml schema for conditional constraints, Proceedings of the 13th international conference on World Wide Web, May 17-20, 2004, New York, NY, USA
[doi> 10.1145/988672.988695]
|
| |
12
|
Cover, R. 2005. The Cover Pages. Go online to http://xml.coverpages.org/.
|
| |
13
|
Cristau, J., Löding, C., and Thomas, W. 2005. Deterministic automata on unranked trees. In Proceedings of the 15th International Symposium on Fundamentals of Computation Theory (FCT 2005). Springer, Berlin, Germany, 68--79.
|
| |
14
|
DuCharme, B. 2002. Filling in the DTD gaps with schematron. O'Reilly xml.com, Sebastopol, CA.
|
| |
15
|
Fernandez, M., Malhotra, A., Marsh, J., Nagy, M., and Walsh, N. 2005. XQuery 1.0 and XPath 2.0 data model. Tech. rep. World Wide Web Consortium. Go online to http://www.w3.org/TR/xpath-datamodel/.
|
| |
16
|
Fiorello, D., Gessa, N., Marinelli, P., and Vitali, F. 2004. DTD++ 2.0: Adding support for co-constraints. In Proceedings of the Extreme Markup Languages 2004 Conference (Extreme Markup Languages 2004).
|
| |
17
|
Fokoué, A. and Schloss, B. 2004. XML Schema quality checker. Go online to http://www.alphaworks.ibm.com/tech/xmlsqc.
|
| |
18
|
|
 |
19
|
|
| |
20
|
|
| |
21
|
Jelliffe, R. 2001. The current state of the art of schema languages for XML. Presentation at XML Asia Pacific Conference, Sidney, Australia.
|
| |
22
|
Jelliffe, R. 2005. Schematron. Go online to http://xml.ascc.net/schematron/.
|
| |
23
|
|
| |
24
|
Koch, C. and Scherzinger, S. 2003. Attribute grammars for scalable query processing on XML streams. In Proceedings of the 9th International Workshop on Database Programming Languages (DBPL 2003). Springer, Berlin, Germany, 233--256.
|
 |
25
|
|
| |
26
|
Mani, M. 2001. Keeping chess alive---do we need 1-unambiguous content models? Presentation at Extreme Markup Languages Conference 2001.
|
 |
27
|
|
| |
28
|
|
| |
29
|
Martens, W., Neven, F., and Schwentick, T. 2004. Complexity of decision problems for simple regular expressions. In Proceedings of the 29th International Symposium on Mathematical Foundations of Computer Science (MFCS 2004). Springer, Berlin, Germany, 889--900.
|
| |
30
|
Martens, W., Neven, F., and Schwentick, T. 2005. Which XML schemas admit 1-pass preorder typing? In Proceedings of the 10th International Conference on Database Theory (ICDT 2005). Springer, Berlin, Germany, 68--82.
|
| |
31
|
Martens, W. and Niehren, J. 2005. Minimizing tree automata for unranked trees. In Prodeedings of the 10th International Symposium on Database Programming Languages (DBPL 2005). Springer, Berlin, Germany, 232--246.
|
| |
32
|
Murata, M., Lee, D., and Mani, M. 2001. Taxonomy of XML schema languages using formal language theory. In Proceedings of the Extreme Markup Languages 2001 Conference (Extreme Markup Languages 2001), Montreal, P. Q., Canada.
|
 |
33
|
|
| |
34
|
|
 |
35
|
|
 |
36
|
|
| |
37
|
|
 |
38
|
|
| |
39
|
|
 |
40
|
|
| |
41
|
Sperberg-McQueen, C. 2003. XML Schema 1.0: A language for document grammars. Presentation at the Joint International Conference of the Association for Computers and the Humanities Association for Literary and Linguistic Computing (ACH/ALLC 2003). Go online to http://www.w3.org/People/cmsmcq/2003/achallc/achallc2003.html.
|
| |
42
|
Sperberg-McQueen, C. and Thompson, H. 2005. XML Schema. Go online to http://www.w3.org/XML/Schema.
|
 |
43
|
|
| |
44
|
|
| |
45
|
Thompson, H., Beech, D., Maloney, M., and Mendelsohn, N. 2004. XML Schema Part 1: Structures. Tech. rep. World Wide Web Consortium. Go online to http://www.w3.org/TR/xmlschema-1/.
|
| |
46
|
|
| |
47
|
Vitali, F., Amorosi, N., and Gessa, N. 2003. Datatype- and namespace-aware DTDs: A minimal extension. In Proceedings of the Extreme Markup Languages 2003 Conference (Extreme Markup Languages 2003).
|
CITED BY 12
|
|
Geert Jan Bex , Wouter Gelade , Wim Martens , Frank Neven, Simplifying XML schema: effortless handling of nondeterministic regular expressions, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|