|
ABSTRACT
Keyword search is a proven, user-friendly way to query HTML documents in the World Wide Web. We propose keyword search in XML documents, modeled as labeled trees, and describe corresponding efficient algorithms. The proposed keyword search returns the set of smallest trees containing all keywords, where a tree is designated as "smallest" if it contains no tree that also contains all keywords. Our core contribution, the Indexed Lookup Eager algorithm, exploits key properties of smallest trees in order to outperform prior algorithms by orders of magnitude when the query contains keywords with significantly different frequencies. The Scan Eager variant is tuned for the case where the keywords have similar frequencies. We analytically and experimentally evaluate two variants of the Eager algorithm, along with the Stack algorithm [13]. We also present the XKSearch system, which utilizes the Indexed Lookup Eager, Scan Eager and Stack algorithms and a demo of which on DBLP data is available at http://www.db.ucsd.edu/projects/xksearch. Finally, we extend the Indexed Lookup Eager algorithm to answer Lowest Common Ancestor (LCA) queries.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
S. Agrawal, S. Chaudhuri, and G. Das. DBXplorer: A system for keyword-based search over relational databases. In ICDE, 2002.
|
| |
2
|
V. Aguilera et al. Querying XML documents in XYleme. In SIGIR Workshop on XML and Information Retrieval, 2000.
|
| |
3
|
|
| |
4
|
BerkeleyDB. http://www.sleepycat.com/.
|
| |
5
|
|
| |
6
|
|
| |
7
|
Zhiyuan Chen , H. V. Jagadish , Flip Korn , Nick Koudas , S. Muthukrishnan , Raymond T. Ng , Divesh Srivastava, Counting Twig Matches in a Tree, Proceedings of the 17th International Conference on Data Engineering, p.595-604, April 02-06, 2001
|
| |
8
|
S. Cohen, J. Namou, Y. Kanza, and Y. Sagiv. XSEarch: A semantic search engine for XML. In VLDB, 2003.
|
| |
9
|
|
 |
10
|
|
| |
11
|
|
| |
12
|
|
 |
13
|
|
| |
14
|
V. Hristidis, N. Koudas, Y. Papakonstantinou, and D. Srivastava. Keyword Proximity Search in XML Trees. Available at http://www.db.ucsd.edu/publications/treeproximity.pdf.
|
| |
15
|
V. Hristidis and Y. Papakonstantinou. Discover: Keyword search in relational databases. In VLDB, 2002.
|
| |
16
|
V. Hristidis, Y. Papakonstantinou, and A. Balmin. Keyword proximity search on XML graphs. In ICDE, 2003.
|
| |
17
|
|
| |
18
|
Y. Li, C. Yu, and H. V. Jagadish. Schema-free xquery. In VLDB, 2004.
|
| |
19
|
J. Naughton et al. The Niagara Internet Query System. IEEE Data Engineering Bulletin, 24(2):27--33, 2001.
|
| |
20
|
|
| |
21
|
|
| |
22
|
D. Srivastava et al. Structural joins: A primitive for efficient XML query pattern matching. In ICDE, 2002.
|
 |
23
|
Igor Tatarinov , Stratis D. Viglas , Kevin Beyer , Jayavel Shanmugasundaram , Eugene Shekita , Chun Zhang, Storing and querying ordered XML using a relational database system, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, June 03-06, 2002, Madison, Wisconsin
[doi> 10.1145/564691.564715]
|
| |
24
|
|
| |
25
|
|
| |
26
|
|
| |
27
|
XYZFind. http://www.searchtools.com/tools/xyzfind.html.
|
CITED BY 37
|
|
|
|
|
|
|
|
Guoliang Li , Beng Chin Ooi , Jianhua Feng , Jianyong Wang , Lizhu Zhou, EASE: an effective 3-in-1 keyword search method for unstructured, semi-structured and structured data, Proceedings of the 2008 ACM SIGMOD international conference on Management of data, June 09-12, 2008, Vancouver, Canada
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Filipe Mesquita , Altigran S. da Silva , Edleno S. de Moura , Pável Calado , Alberto H. F. Laender, LABRADOR: Efficiently publishing relational databases on the web by using keyword-based query interfaces, Information Processing and Management: an International Journal, v.43 n.4, p.983-1004, July, 2007
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Lin Xudong , Xu De , Wang Ning, NNQM: a novel non-navigating XML query model, Proceedings of the 7th Conference on 7th WSEAS International Conference on Multimedia, Internet & Video Technologies, p.270-274, September 15-17, 2007, Beijing, China
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Guoliang Li , Shengyue Ji , Chen Li , Jianhua Feng, Efficient type-ahead search on relational data: a TASTIER approach, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Liang Xu , Tok Wang Ling , Huayu Wu , Zhifeng Bao, DDE: from dewey to a fully dynamic XML labeling scheme, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
|
|
|
Yi Chen , Wei Wang , Ziyang Liu , Xuemin Lin, Keyword search on structured and semi-structured data, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
|
|
|
|
|
|
|
|