ACM Home Page
Please provide us with feedback. Feedback
Sequencing XML data and query twigs for fast pattern matching
Full text PdfPdf (582 KB)
Source ACM Transactions on Database Systems (TODS) archive
Volume 31 ,  Issue 1  (March 2006) table of contents
Pages: 299 - 345  
Year of Publication: 2006
ISSN:0362-5915
Authors
Praveen Rao  University of Arizona, AZ
Bongki Moon  University of Arizona, AZ
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 16,   Downloads (12 Months): 119,   Citation Count: 4
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1132863.1132871
What is a DOI?

ABSTRACT

We propose a new way of indexing XML documents and processing twig patterns in an XML database. Every XML document in the database can be transformed into a sequence of labels by prüfer's method that constructs a one-to-one correspondence between trees and sequences. During query processing, a twig pattern is also transformed into its Prüfer sequence. By performing subsequence matching on the set of sequences in the database and performing a series of refinement phases that we have developed, we can find all the occurrences of a twig pattern in the database. Our approach allows holistic processing of a twig pattern without breaking the twig into root-to-leaf paths and processing these paths individually. Furthermore, we show in the article that all correct answers are found without any false dismissals or false alarms. Experimental results demonstrate the performance benefits of our proposed techniques.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
Berglund, A., Boag, S., Chamberlin, D., Fernandez, M. F., Kay, M., Robie, J., and Simon, J. 2002. XML path language (XPath) 2.0 W3C working draft 16. Tech. Rep. WD-xpath20-20020816, World Wide Web Consortium. (Aug.).
 
3
Boag, S., Chamberlin, D., Fernandez, M. F., Florescu, D., Robie, J., and Simon, J. 2002. XQuery 1.0: An XML Query Language W3C working draft 16. Tech. Rep. WD-xquery-20020816, World Wide Web Consortium. (Aug.).
 
4
Bray, T., Paoli, J., Sperberg-McQueen, C. M., and Maler, E. 2000. Extensible markup language (XML) 1.0 2nd Ed. W3C recommendation. Tech. Rep. REC-xml-20001006, World Wide Web Consortium. (Oct.).
5
 
6
Catherine Bow, B. H. and Bird, S. 2003. Towards a general model of Interlinear text. In Proceedings of EMELD Workshop. Lansing, MI.
 
7
Chien, S.-Y., Vagena, Z., Zhang, D., Tsotras, V. J., and Zaniolo, C. 2002. Efficient structural joins on indexed XML documents. In Proceedings of the 28th VLDB Conference. Morgan-Kaufmann, Hong Kong, China. 263--274.
 
8
9
 
10
11
 
12
Grust, T., van Keulen, M., and Teubner, J. 2003. Staircase join: Teach a relational DBMS to watch its (axis) steps. In Proceedings of the 29th VLDB Conference. Morgan-Kaufmann, Berlin, Germany. 524--535.
 
13
 
14
Jiang, H., Lu, H., Wang, W., and Ooi, B. C. 2003. XR-Tree: Indexing XML data for efficient structural joins. In Proceedings of the 19th IEEE International Conference on Data Engineering. IEEE Computer Society, Bangalore, India. 253--264.
 
15
Jiang, H., Wang, W., Lu, H., and Yu, J. X. 2003. Holistic twig joins on indexed XML documents. In Proceedings of the 29th VLDB Conference. Morgan-Kaufmann, Berlin, Germany. 273--284.
16
 
17
 
18
 
19
 
20
 
21
Müller, K. 2004. Semi-automatic construction of a question treebank. In Proceedings of the 4th International Conference on Language Resources and Evaluation. Lisbon, Portugal.
 
22
 
23
Prüfer, H. 1918. Neuer Beweis eines Satzes über Permutationen. Archiv für Mathematik und Physik 27, 142--144.
 
24
 
25
UW XML Repository. 2002. http://www.cs.washington.edu/research/xmldatasets.
26
 
27
William D. Lewis. 2005. Personal communications. http://zimmer.csufresno.edu/ wlewis/.
28
 
29
Zezula, P., Mandreoli, F., and Martoglia, R. 2004. Tree signatures and unordered XML pattern matching. In Proceedings of the 32nd International Conference on Current Trends in Theory and Practice of Computer Science. Springer, Merin, Czech Republic. 122--139.
30


Collaborative Colleagues:
Praveen Rao: colleagues
Bongki Moon: colleagues