Schema extraction from XML collections
Source: International Conference on Digital Libraries
Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries
Portland, Oregon, USA
SESSION: Federating and harvesting metadata
Pages: 291 - 292  
Year of Publication: 2002
ISBN:1-58113-513-0
Author
Boris Chidlovskii  Xerox Research Centre Europe, Grenoble Laboratory, Meylan, France
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/544220.544288

ABSTRACT

The XML Schema language has been proposed to replace Document Type Definitions (DTDs) as the schema mechanism for XML data. This language consistently extends grammar-based constructions with constraint- and pattern-based ones and has higher expressive power than DTDs. As schemas remain optional for XML, we address the problem of XML Schema extraction. We model XML schemas as extended context-free grammars and develop a novel extraction algorithm inspired by methods of grammatical inference. The algorithm also copes with the schema determinism requirement imposed by the XML DTD and XML Schema languages.
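
To make the idea of schema extraction concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes that a schema can be approximated by one production per element name, whose right-hand side is built from the child-name sequences observed in the collection, with consecutive repeats of a child collapsed to `child+`. The function names `collect_child_sequences`, `collapse`, and `extract_productions` are hypothetical and chosen for illustration only.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from itertools import groupby


def collect_child_sequences(documents):
    """Map each element name to the set of child-name sequences observed for it."""
    sequences = defaultdict(set)
    for text in documents:
        root = ET.fromstring(text)
        for elem in root.iter():
            sequences[elem.tag].add(tuple(child.tag for child in elem))
    return sequences


def collapse(seq):
    """Turn one child sequence into a regular-expression-like string.

    Consecutive repeats of the same child become `child+`; an element
    with no children is written as EMPTY (an assumption of this sketch).
    """
    parts = []
    for tag, run in groupby(seq):
        parts.append(tag + "+" if len(list(run)) > 1 else tag)
    return " ".join(parts) if parts else "EMPTY"


def extract_productions(documents):
    """Return {element name: sorted list of alternative right-hand sides}."""
    sequences = collect_child_sequences(documents)
    return {
        name: sorted({collapse(s) for s in seqs})
        for name, seqs in sequences.items()
    }


if __name__ == "__main__":
    docs = [
        "<book><title/><author/><author/></book>",
        "<book><title/><author/><year/></book>",
    ]
    for name, alternatives in extract_productions(docs).items():
        print(name, "->", " | ".join(alternatives))
```

For the two sample documents, the `book` production has two alternatives that both start with `title author`, so this naive result is not deterministic. A genuine extraction algorithm would have to generalize further, for example by merging the alternatives into a single expression such as `title author+ year?`, which this sketch does not attempt.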