| Linear time membership in a class of regular expressions with interleaving and counting |
| Full text |
Pdf
(313 KB)
|
Source
|
Conference on Information and Knowledge Management
archive
Proceeding of the 17th ACM conference on Information and knowledge management
table of contents
Napa Valley, California, USA
SESSION: DB: stream processing
table of contents
Pages 389-398
Year of Publication: 2008
ISBN:978-1-59593-991-3
|
|
Authors
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 5, Downloads (12 Months): 74, Citation Count: 2
|
|
|
ABSTRACT
The extension of Regular Expressions (REs) with an interleaving (shuffle) operator has been proposed in many occasions, since it would be crucial to deal with unordered data. However, interleaving badly affects the complexity of basic operations, and, expecially, makes membership NP-hard [13], which is unacceptable for most uses of REs. REs form the basis of most XML type languages, such as DTDs and XML Schema types, and XDuce types [16, 11]. In this context, the interleaving operator would be a natural addition to the language of REs, as witnessed by the presence of limited forms of interleaving in XSD (the all group), Relax-NG, and SGML, provided that the NP-hardness of membership could be avoided. We present here a restricted class of REs with interleaving and counting which admits a linear membership algorithm, and which is expressive enough to cover the vast majority of real-world XML types. We first present an algorithm for membership of a list of words into a RE with interleaving and counting, based on the translation of the RE into a set of constraints. We generalize the approach in order to check membership of XML trees into a class of EDTDs with interleaving and counting, which models the crucial aspects of DTDs and XSD schemas.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
D. Barbosa, G. Leighton, and A. Smith. Efficient incremental validation of XML documents after composite updates. In XSym, volume 4156 of LNCS, pages 107--121. Springer, 2006.
|
| |
3
|
|
 |
4
|
|
| |
5
|
|
| |
6
|
|
 |
7
|
|
| |
8
|
B. Choi. What are real DTDs like? In WebDB, pages 43--48, 2002.
|
| |
9
|
W. Gelade, W. Martens, and F. Neven. Optimizing schema languages for XML: Numerical constraints and interleaving. In ICDT, 2007.
|
| |
10
|
G. Ghelli, D. Colazzo, and C. Sartiani. Efficient inclusion for a class of XML types with interleaving and counting. In DBPL, 2007.
|
 |
11
|
|
| |
12
|
John E. Hopcroft , Rajeev Motwani , Jeffrey D. Ullman, Introduction to Automata Theory, Languages, and Computation (3rd Edition), Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 2006
|
| |
13
|
|
| |
14
|
M. Montazerian, P. T. Wood, and S. R. Mousavi. XPath query satisfiability is in PTIME for real-world DTDs. In XSym, volume 4704 of LNCS, pages 17--30. Springer, 2007.
|
| |
15
|
A. R. Schmidt , Florian Waas , Martin L. Kersten , D. Florescu , I. Manolescu , M. J. Carey , R. Busse, The XML benchmark project, CWI (Centre for Mathematics and Computer Science), Amsterdam, The Netherlands, 2001
|
| |
16
|
H. S. Thompson, D. Beech, M. Maloney, and N. Mendelsohn. XML Schema Part 1: Structures Second Edition. Technical report, World Wide Web Consortium, Oct 2004. W3C Recommendation.
|
| |
17
|
|
CITED BY 2
|
|
|
|
|
Geert Jan Bex , Wouter Gelade , Wim Martens , Frank Neven, Simplifying XML schema: effortless handling of nondeterministic regular expressions, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
|
|