Flexible and efficient XML search with complex full-text predicates
Source: International Conference on Management of Data
Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data
Chicago, IL, USA
Session: Potpourri
Pages: 575-586
Year of Publication: 2006
ISBN: 1-59593-434-0
Authors
Sihem Amer-Yahia, AT&T Labs Research
Emiran Curtmola, University of California, San Diego
Alin Deutsch, University of California, San Diego
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 10,   Downloads (12 Months): 92,   Citation Count: 6
DOI: http://doi.acm.org/10.1145/1142473.1142537

ABSTRACT

Recent research has generated a wealth of new XML full-text query languages, ranging from simple Boolean search to combinations of sophisticated proximity and order predicates on keywords. While computing least common ancestors of query terms was proposed for efficient evaluation of conjunctive keyword queries by exploiting the document structure, no such solution was developed to evaluate complex full-text queries. We present efficient evaluation algorithms based on a formalization of XML queries in terms of keyword patterns and an algebra that manipulates pattern matches. Our algebra captures most existing languages and their varying semantics, and our algorithms combine relational query evaluation techniques with the exploitation of document structure to process queries with complex full-text predicates. We show how scoring can be incorporated into our framework without compromising the algorithms' complexity. Our experiments show that considering element nesting dramatically improves the performance of queries with complex full-text predicates.
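
To make the ideas in the abstract concrete, the following is a minimal sketch of evaluating a keyword pattern with proximity and order predicates and returning the least common ancestor of each surviving match. It is not the algebra or the algorithms of the paper, which the abstract does not spell out. The Dewey-style element labels, the toy inverted index, and the names lca, pattern_matches, and search are all assumptions made for illustration. The naive cross product over keyword occurrences is used only for clarity.

```python
from itertools import product
from typing import Dict, Iterable, List, Set, Tuple

# Assumption: each element is labeled by its path of child indexes from the root.
Dewey = Tuple[int, ...]
# Assumption: a keyword occurrence is (containing element, word position in the document).
Match = Tuple[Dewey, int]


def lca(a: Dewey, b: Dewey) -> Dewey:
    """The longest common prefix of two Dewey labels is their least common ancestor."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return tuple(prefix)


def pattern_matches(index: Dict[str, List[Match]],
                    keywords: List[str]) -> Iterable[Tuple[Match, ...]]:
    """One candidate pattern match per combination of keyword occurrences."""
    return product(*(index.get(k, []) for k in keywords))


def search(index: Dict[str, List[Match]], keywords: List[str],
           max_distance: int, ordered: bool) -> Set[Dewey]:
    """Filter pattern matches by proximity and order, then return their LCAs."""
    results: Set[Dewey] = set()
    for combo in pattern_matches(index, keywords):
        positions = [pos for _, pos in combo]
        # Proximity predicate: all keywords within a window of max_distance words.
        if max(positions) - min(positions) > max_distance:
            continue
        # Order predicate: keywords appear in the order given in the query.
        if ordered and positions != sorted(positions):
            continue
        node = combo[0][0]
        for dewey, _ in combo[1:]:
            node = lca(node, dewey)
        results.add(node)
    return results


# Hypothetical toy index for a small document.
index = {
    "xml":    [((0, 0), 1), ((0, 1, 0), 7)],
    "search": [((0, 0), 2), ((0, 1, 1), 12)],
}

print(search(index, ["xml", "search"], max_distance=5, ordered=True))
```

Dropping the order predicate (ordered=False) admits one more match in this toy index, whose keywords meet only at the root element (0,). This illustrates how the choice of predicates changes which elements are returned.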



