ABSTRACT
We have implemented and released the XSQ system for evaluating XPath queries on streaming XML data. XSQ supports XPath features such as multiple predicates, closures, and aggregation, which pose interesting challenges for streaming evaluation. Our implementation is based on a hierarchical arrangement of augmented finite-state automata. A design goal of XSQ is to buffer data for the least amount of time possible. We present a detailed experimental study that characterizes the performance of XSQ and related systems and illustrates the performance implications of XPath features such as closures.
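To make the idea of automaton-based streaming evaluation concrete, the following is a minimal sketch, not the XSQ implementation. It assumes a tiny XPath subset (absolute paths built only from the child axis `/` and the closure axis `//`, with element names or `*`), and it omits predicates and aggregation entirely. The names `parse_path`, `PathMatcher`, and `evaluate` are hypothetical and chosen for illustration. The matcher keeps a stack of automaton state sets, one per open element, and buffers text only while a matching element is open.

```python
import xml.sax


def parse_path(path):
    """Split '/pub/book//name' into [('/', 'pub'), ('/', 'book'), ('//', 'name')].

    Assumes the path is absolute (starts with '/' or '//').
    """
    steps, i = [], 0
    while i < len(path):
        axis = '//' if path.startswith('//', i) else '/'
        i += len(axis)
        j = path.find('/', i)
        j = len(path) if j == -1 else j
        steps.append((axis, path[i:j]))
        i = j
    return steps


class PathMatcher(xml.sax.ContentHandler):
    """Nondeterministic automaton over SAX events; state s means 's steps matched'."""

    def __init__(self, path):
        super().__init__()
        self.steps = parse_path(path)
        self.stack = [{0}]    # one state set per open element, plus the document root
        self.matched = []     # per open element: did it satisfy the whole path?
        self.buffers = []     # one text buffer per open matching element
        self.results = []

    def startElement(self, name, attrs):
        nxt = set()
        for s in self.stack[-1]:
            if s == len(self.steps):
                continue      # accepting state has no outgoing transitions
            axis, tag = self.steps[s]
            if tag in (name, '*'):
                nxt.add(s + 1)    # this element satisfies step s
            if axis == '//':
                nxt.add(s)        # closure: step s may match deeper in the tree
        self.stack.append(nxt)
        is_match = len(self.steps) in nxt
        self.matched.append(is_match)
        if is_match:
            self.buffers.append([])

    def characters(self, content):
        for buf in self.buffers:
            buf.append(content)

    def endElement(self, name):
        self.stack.pop()
        if self.matched.pop():
            # Emit as soon as the element closes, then drop its buffer.
            self.results.append(''.join(self.buffers.pop()))


def evaluate(path, xml_text):
    handler = PathMatcher(path)
    xml.sax.parseString(xml_text.encode('utf-8'), handler)
    return handler.results
```

For example, on the document `<pub><book><name>A</name><author><name>B</name></author></book><name>C</name></pub>`, `evaluate('/pub/book//name', ...)` returns `['A', 'B']`. Even in this reduced setting the closure axis forces the matcher to track several states at once, which hints at why closures affect performance; predicates would additionally require holding candidate results until their conditions are resolved.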