ACM Home Page
Please provide us with feedback. Feedback
Retrieving meaningful relaxed tightest fragments for XML keyword search
Full text PdfPdf (1.22 MB)
Source Extending Database Technology; Vol. 360 archive
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology table of contents
Saint Petersburg, Russia
SESSION: Research sessions: Information retrieval table of contents
Pages 815-826  
Year of Publication: 2009
ISBN:978-1-60558-422-5
Authors
Lingbo Kong  INRIA Futurs, Villeneuve d'Ascq, Lille, France
Rémi Gilleron  INRIA Futurs, Villeneuve d'Ascq, Lille, France
Aurélien Lemay Mostrare  INRIA Futurs, Villeneuve d'Ascq, Lille, France
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 18,   Downloads (12 Months): 102,   Citation Count: 1
Additional Information:

abstract   references   cited by   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1516360.1516454
What is a DOI?

ABSTRACT

Adapting keyword search to XML data has been attractive recently, generalized as XML keyword search (XKS). One of its key tasks is to return the meaningful fragments as the result. [1] is the latest work following this trend, and it focuses on returning the fragments rooted at SLCA (Smallest LCA -- Lowest Common Ancestor) nodes. To guarantee that the fragments only contain interesting nodes, [1] proposes a contributor-based filtering mechanism in its MaxMatch algorithm. However, the filtering mechanism is not sufficient. It will commit the false positive problem (discarding interesting nodes) and the redundancy problem (keeping uninteresting nodes).

In this paper, our interest is to propose a framework of retrieving meaningful fragments rooted at not only the SLCA nodes, but all LCA nodes. We begin by introducing the concept of Relaxed Tightest Fragment (RTF) as the basic result type. Then we propose a new filtering mechanism to overcome those two problems in Max-Match. Its kernel is the concept of valid contributor, which helps to distinguish the interesting children of a node. The new filtering mechanism is then to prune the nodes in a RTF which are not valid contributors to their parents. Based on the valid contributor concept, our ValidRTF algorithm not only overcomes those two problems in MaxMatch, but also satisfies the axiomatic properties deduced in [1] that an XKS technique should satisfy. We compare ValidRTF with MaxMatch on real and synthetic XML data. The result verifies our claims, and shows the effectiveness of our valid-contributor-based filtering mechanism.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
2
 
3
4
 
5
 
6
7
 
8
9
10
11
12
 
13
 
14
 
15
 
16
 
17
J. Xu, J. Lu, W. Wang, and B. Shi, "Effective keyword search in xml documents based on miu," in DASFAA, 2006.
18
 
19
"http://www.cs.washington.edu/research/xmldatasets/."
 
20
"http://monetdb.cwi.nl/xml/."
 
21
"http://lucene.apache.org/."
 
22
"www.syger.com/jsc/docs/stopwords/english.htm."
 
23
 
24
Z. Liu and Y. Chen, "Answering keyword queries on xml using materialized views," in ICDE, 2008.
25
26
27
 
28
D. Theodoratos and X. Wu, "An original semantics to keyword queries for xml using structural patterns," in DASFAA, 2007.
29
30
 
31
E. Curtmola, S. Amer-Yahia, P. Brown, and M. Fernàndez, "Galatex: A conformant implementation of the xquery fulltext language," in XIME-P, 2005.
 
32
 
33
Z. Bao, T. W. Ling, B. Chen, and J. Lu, "Effective xml keyword search with relevance oriented ranking," in ICDE, 2009.
34

Collaborative Colleagues:
Lingbo Kong: colleagues
Rémi Gilleron: colleagues
Aurélien Lemay Mostrare: colleagues