A document object modeling method to retrieve data from a very large XML document
Source
Document Engineering
Proceedings of the 2007 ACM Symposium on Document Engineering
Winnipeg, Manitoba, Canada
SESSION: XML documents
Pages: 59-68
Year of Publication: 2007
ISBN: 978-1-59593-776-6
Authors
Seung Min Kim  Seoul National University
Suk I. Yoo  Seoul National University
Eunji Hong  Sung-Kong-Hoe University
Tae Gwon Kim  Kangnam University
Il Kon Kim  Kyungpook National University
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1284420.1284439

ABSTRACT

Document Object Modeling (DOM) is a widely used approach for retrieving data from an XML document. When the XML document is very large, however, the DOM approach may run out of main memory while building the associated XML tree. To alleviate this problem, we propose a method that splits the very large XML document into n small XML documents, retrieves data from the XML tree built from each small document, and combines the results from all n XML trees to generate the final result. With the proposed approach, the memory space and processing time required to retrieve data from a very large XML document using DOM are reduced so that they can be managed by a single general-purpose personal computer.
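
The split, retrieve, and combine steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the document is split, so the sketch assumes that the document is a flat sequence of record elements directly under the root and that records are not nested. The function names (`split_records`, `query_chunk`, `retrieve`), the `<chunk>` wrapper element, and the `id` attribute are hypothetical and chosen only for illustration.

```python
import io
import xml.etree.ElementTree as ET
from xml.dom import minidom


def split_records(source, record_tag, chunk_size):
    """Yield small XML documents (as strings) of up to chunk_size records.

    Assumption: records are direct children of the root and are not nested.
    The splitter streams the input, so the full tree is never held in memory.
    """
    batch = []
    root = None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if root is None:
            root = elem  # the first "start" event is the document root
        if event == "end" and elem.tag == record_tag:
            elem.tail = None  # drop inter-record whitespace
            batch.append(ET.tostring(elem, encoding="unicode"))
            root.clear()  # release records that were already serialized
            if len(batch) == chunk_size:
                yield "<chunk>" + "".join(batch) + "</chunk>"
                batch = []
    if batch:
        yield "<chunk>" + "".join(batch) + "</chunk>"


def query_chunk(chunk_xml, record_tag, field_tag, wanted):
    """Build a DOM tree for one small document and collect matching ids."""
    dom = minidom.parseString(chunk_xml)
    hits = []
    for rec in dom.getElementsByTagName(record_tag):
        for field in rec.getElementsByTagName(field_tag):
            text = "".join(
                node.data
                for node in field.childNodes
                if node.nodeType == node.TEXT_NODE
            )
            if text.strip() == wanted:
                hits.append(rec.getAttribute("id"))
    dom.unlink()  # free the small tree before the next one is built
    return hits


def retrieve(source, record_tag, field_tag, wanted, chunk_size):
    """Combine the partial results from every small DOM tree."""
    results = []
    for chunk in split_records(source, record_tag, chunk_size):
        results.extend(query_chunk(chunk, record_tag, field_tag, wanted))
    return results


if __name__ == "__main__":
    sample = (
        b"<db>"
        b"<entry id='a'><org>human</org></entry>"
        b"<entry id='b'><org>mouse</org></entry>"
        b"<entry id='c'><org>human</org></entry>"
        b"</db>"
    )
    print(retrieve(io.BytesIO(sample), "entry", "org", "human", 2))
```

Only one small DOM tree exists at a time in this sketch, so peak memory depends on `chunk_size` rather than on the size of the whole document. Queries that relate records in different small documents would need a more elaborate combining step than simple concatenation.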


