ACM Home Page
Please provide us with feedback. Feedback
Efficient asymmetric inclusion between regular expression types
Full text PdfPdf (486 KB)
Source ACM International Conference Proceeding Series; Vol. 361 archive
Proceedings of the 12th International Conference on Database Theory table of contents
St. Petersburg, Russia
SESSION: XML table of contents
Pages 174-182  
Year of Publication: 2009
ISBN:978-1-60558-423-2
Authors
Dario Colazzo  Université Paris Sud, Orsay
Giorgio Ghelli  Università di Pisa, Pisa - Italy
Carlo Sartiani  Università di Pisa, Pisa - Italy
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 4,   Downloads (12 Months): 28,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1514894.1514916
What is a DOI?

ABSTRACT

The inclusion of Regular Expressions (REs) is the kernel of any subtype checking algorithm for XML schema languages. XML applications would benefit from the extension of REs with interleaving and counting, but this is not feasible in general, since inclusion is EXPSPACE-complete for such extended REs. In [9] we introduced a notion of "conflict-free REs", which are extended REs with excellent complexity behaviour, including a cubic inclusion algorithm [9] and linear membership [10]. Conflict-free REs have interleaving and counting, but the complexity is tamed by the "conflict-free" limitations, which have been found to be satisfied by the vast majority of the content models published on the Web.

However, the most important use of subtype checking is in the context of type-cheching of XML manipulation languges. A type checker works by testing the inclusion of inferred subtypes in declared supertypes. The conflict-free restriction, while quite harmless for the human-defined supertype, is far too restrictive for the inferred subtype, whose shape is difficult to constrain.

We show here that the PTIME inclusion algorithm can be actually extended to deal with totally unrestricted REs with counting and interleaving in the subtype position, provided that the supertype is conflict-free. This is exactly the expressive power that we need in order to use subtyping inside type-checking algorithms, and the cost of this generalized algorithm is only quadratic, which is as good as the best algorithm we have for the symmetric case (see [5]). The result is extremely surprising, since we had previously found that asymmetric inclusion becomes NP-hard as soon as the candidate subtype is enriched with binary intersection, a generalization that looked much more innocent than what we achieve here.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
D. Barbosa, G. Leighton, and A. Smith. Efficient incremental validation of XML documents after composite updates. In XSym, volume 4156 of LNCS, pages 107--121. Springer, 2006.
 
2
 
3
 
4
B. Choi. What are real DTDs like? In WebDB, pages 43--48, 2002.
 
5
D. Colazzo, G. Ghelli, and C. Sartiani. Efficient inclusion for a class of xml types with interleaving and counting. Information Systems, 2008. To Appear.
 
6
D. Colazzo and C. Sartiani. Efficient subtyping for unordered XML types. Technical report, Dipartimento di Informatica - Università di Pisa, 2007.
 
7
J. N. Foster, B. C. Pierce, and A. Schmitt. A logic your typechecker can count on: Unordered tree types in practice. In Workshop on Programming Language Technologies for XML (PLAN-X), informal proceedings, Jan. 2007.
 
8
W. Gelade, W. Martens, and F. Neven. Optimizing schema languages for xml: Numerical constraints and interleaving. In T. Schwentick and D. Suciu, editors, Proceedings of the 11th International Conference on Database Theory - ICDT 2007, Barcelona, Spain, January 10--12, 2007, volume 4353 of Lecture Notes in Computer Science, pages 269--283. Springer, 2007.
 
9
G. Ghelli, D. Colazzo, and C. Sartiani. Efficient inclusion for a class of XML types with interleaving and counting. In M. Arenas and M. I. Schwartzbach, editors, Proceedings of the 11th International Symposium on Database Programming Languages, DBPL 2007, Vienna, Austria, September 23--24, 2007, Revised Selected Papers, volume 4797 of Lecture Notes in Computer Science, pages 231--245. Springer, 2007.
10
11
 
12

Collaborative Colleagues:
Dario Colazzo: colleagues
Giorgio Ghelli: colleagues
Carlo Sartiani: colleagues