| Retrieving meaningful relaxed tightest fragments for XML keyword search |
| Full text |
Pdf
(1.22 MB)
|
| Source
|
Extending Database Technology; Vol. 360
archive
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
table of contents
Saint Petersburg, Russia
SESSION: Research sessions: Information retrieval
table of contents
Pages 815-826
Year of Publication: 2009
ISBN:978-1-60558-422-5
|
|
Authors
|
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 18, Downloads (12 Months): 102, Citation Count: 1
|
|
|
ABSTRACT
Adapting keyword search to XML data has been attractive recently, generalized as XML keyword search (XKS). One of its key tasks is to return the meaningful fragments as the result. [1] is the latest work following this trend, and it focuses on returning the fragments rooted at SLCA (Smallest LCA -- Lowest Common Ancestor) nodes. To guarantee that the fragments only contain interesting nodes, [1] proposes a contributor-based filtering mechanism in its MaxMatch algorithm. However, the filtering mechanism is not sufficient. It will commit the false positive problem (discarding interesting nodes) and the redundancy problem (keeping uninteresting nodes). In this paper, our interest is to propose a framework of retrieving meaningful fragments rooted at not only the SLCA nodes, but all LCA nodes. We begin by introducing the concept of Relaxed Tightest Fragment (RTF) as the basic result type. Then we propose a new filtering mechanism to overcome those two problems in Max-Match. Its kernel is the concept of valid contributor, which helps to distinguish the interesting children of a node. The new filtering mechanism is then to prune the nodes in a RTF which are not valid contributors to their parents. Based on the valid contributor concept, our ValidRTF algorithm not only overcomes those two problems in MaxMatch, but also satisfies the axiomatic properties deduced in [1] that an XKS technique should satisfy. We compare ValidRTF with MaxMatch on real and synthetic XML data. The result verifies our claims, and shows the effectiveness of our valid-contributor-based filtering mechanism.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
 |
2
|
Igor Tatarinov , Stratis D. Viglas , Kevin Beyer , Jayavel Shanmugasundaram , Eugene Shekita , Chun Zhang, Storing and querying ordered XML using a relational database system, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin
[doi> 10.1145/564691.564715]
|
| |
3
|
|
 |
4
|
|
| |
5
|
Sara Cohen , Jonathan Mamou , Yaron Kanza , Yehoshua Sagiv, XSEarch: a semantic search engine for XML, Proceedings of the 29th international conference on Very large data bases, p.45-56, September 09-12, 2003, Berlin, Germany
|
| |
6
|
|
 |
7
|
|
| |
8
|
|
 |
9
|
|
 |
10
|
|
 |
11
|
|
 |
12
|
|
| |
13
|
|
| |
14
|
|
| |
15
|
|
| |
16
|
|
| |
17
|
J. Xu, J. Lu, W. Wang, and B. Shi, "Effective keyword search in xml documents based on miu," in DASFAA, 2006.
|
 |
18
|
|
| |
19
|
"http://www.cs.washington.edu/research/xmldatasets/."
|
| |
20
|
"http://monetdb.cwi.nl/xml/."
|
| |
21
|
"http://lucene.apache.org/."
|
| |
22
|
"www.syger.com/jsc/docs/stopwords/english.htm."
|
| |
23
|
|
| |
24
|
Z. Liu and Y. Chen, "Answering keyword queries on xml using materialized views," in ICDE, 2008.
|
 |
25
|
|
 |
26
|
Peter Buneman , Susan Davidson , Wenfei Fan , Carmem Hara , Wang-Chiew Tan, Keys for XML, Proceedings of the 10th international conference on World Wide Web, p.201-210, May 01-05, 2001, Hong Kong, Hong Kong
[doi> 10.1145/371920.371984]
|
 |
27
|
|
| |
28
|
D. Theodoratos and X. Wu, "An original semantics to keyword queries for xml using structural patterns," in DASFAA, 2007.
|
 |
29
|
|
 |
30
|
|
| |
31
|
E. Curtmola, S. Amer-Yahia, P. Brown, and M. Fernàndez, "Galatex: A conformant implementation of the xquery fulltext language," in XIME-P, 2005.
|
| |
32
|
|
| |
33
|
Z. Bao, T. W. Ling, B. Chen, and J. Lu, "Effective xml keyword search with relevance oriented ranking," in ICDE, 2009.
|
 |
34
|
Guoliang Li , Beng Chin Ooi , Jianhua Feng , Jianyong Wang , Lizhu Zhou, EASE: an effective 3-in-1 keyword search method for unstructured, semi-structured and structured data, Proceedings of the 2008 ACM SIGMOD international conference on Management of data, June 09-12, 2008, Vancouver, Canada
[doi> 10.1145/1376616.1376706]
|
CITED BY
|
|
Yi Chen , Wei Wang , Ziyang Liu , Xuemin Lin, Keyword search on structured and semi-structured data, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
|
|