|
ABSTRACT
Relational Keyword Search (R-KWS) provides an intuitive way to query relational data without requiring SQL, or knowledge of the underlying schema. In this article we describe a comprehensive framework for R-KWS covering snapshot queries on conventional tables and continuous queries on relational streams. Our contributions are summarized as follows: (i) We provide formal semantics, addressing the temporal validity and order of results, spanning uniformly over tables and streams; (ii) we investigate two general methodologies for query processing, graph based and operator based, that resolve several problems of previous approaches; and (iii) we develop a range of algorithms and optimizations covering both methodologies. We demonstrate the effectiveness of R-KWS, as well as the significant performance benefits of the proposed techniques, through extensive experiments with static and streaming datasets.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Abadi, D. J., Carney, D., Çetintemel, U., Cherniack, M., Convey, C., Lee, S., Stonebraker, M., Tatbul, N., and Zdonik, S. 2003. Aurora: A new model and architecture for data stream management. VLDB J. 12, 2, 120--139.
|
| |
2
|
Agrawal, S., Chaudhuri, S., and Das, G. 2002. DBXplorer: A system for keyword-based search over relational databases. In Proceedings of the IEEE International Conference on Data Engineering (ICDE). 5--16.
|
| |
3
|
Arasu, A., Babu, S., and Widom, J. 2006. The CQL continuous query language: Semantic foundations and query execution. VLDB J. 15, 2, 121--142.
|
| |
4
|
Avnur, R. and Hellerstein, J. M. 2000. Eddies: Continuously adaptive query processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD). 261--272.
|
| |
5
|
Balmin, A., Hristidis, V., and Papakonstantinou, Y. 2004. ObjectRank: Authority-Based keyword search in databases. In Proceedings of the International Conference on Very Large Data Bases (VLDB). 564--575.
|
| |
6
|
Bhalotia, G., Hulgeri, A., Nakhe, C., Chakrbarti, S., and Sudarshan, S. 2002. Keyword searching and browsing in databases using BANKS. In Proceedings of the IEEE International Conference on Data Engineering (ICDE). 431--440.
|
| |
7
|
Chandrasekaran, S., Cooper, O., Deshpande, A., Franklin, M. J., Hellerstein, J. M., Hong, W., Krishnamurthy, S., Madden, S. R., Raman, V., Reiss, F., and Shah, M. A. 2003. TelegraphCQ: Continuous dataflow processing for an uncertain world. In Proceedings of the Biennial Conference on Innovative Data Systems Research (CIDR).
|
| |
8
|
Chaudhuri, S., Dayal, U., and Yan T. W. 1995. Join queries with external text sources: Execution and optimization techniques. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD). 410--422.
|
| |
9
|
Cohen, S., Kanza, Y., Kimelfeld, B., and Sagiv, Y. 2005. Interconnection semantics for keyword search in XML. In Proceedings of the ACM CIKM International Conference on Information and Knowledge Management (CIKM). 389--396.
|
| |
10
|
Dar, S., Entin, G., Geva, S., and Palmon, E. 1998. DTL's dataspot: Database exploration using plain language. In Proceedings of the International Conference on Very Large Data Bases (VLDB). 645--649.
|
| |
11
|
De Felipe, I., Hristidis, V., and Rishe, N. 2008. Keyword search on spatial databases. In Proceedings of the IEEE International Conference on Data Engineering (ICDE). 656--665.
|
| |
12
|
Fabret, F., Jacobsen, H. A., Llirbat, F., Pereira, J., Ross, K. A., and Shasha, D. 2001. Filtering algorithms and implementation for very fast publish/subscribe systems. SIGMOD Rec. 30, 2, 115--126.
|
| |
13
|
Golab, L. and Öszu, T. M. 2003. Issues in data stream management. SIGMOD Rec. 32, 2, 5--14.
|
| |
14
|
Golenberg, K., Kimelfeld, B., and Sagiv, Y. 2008. Keyword proximity search in complex data graphs. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD). 927--940.
|
| |
15
|
Gravano, L., Ipeirotis, P., Koudas, N., and Srivastava, D. 2003. Text joins in an RDBMS for web data integration. In Proceedings of the International World Wide Web Conference (WWW). 90--101.
|
| |
16
|
Guo, L., Shao, F., Botev C., and Shanmugasundaram, J. 2003. XRANK: Ranked keyword search over XML documents. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD). 16--27.
|
| |
17
|
He, H., Wang, H., Yang, J., and Yu, P. 2007. BLINKS: Ranked keyword searches on graphs. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD). 305--316.
|
| |
18
|
Hristidis, V., Gravano, L., and Papakonstantinou, Y. 2003. Efficient IR-style keyword search over relational databases. In Proceedings of 29th International Conference on Very Large Data Bases (VLDB). 850--861.
|
| |
19
|
Hristidis, V. and Papakonstantinou, Y. 2002. DISCOVER: Keyword search in relational databases. In Proceedings of the International Conference on Very Large Data Bases (VLDB). 670--681.
|
| |
20
|
Hristidis, V., Papakonstantinou, Y., and Balmin, A. 2003. Keyword proximity search on XML graphs. In Proceedings of the IEEE International Conference on Data Engineering (ICDE). 367--378.
|
| |
21
|
Hristidis, V., Valdivia, O., Vlachos, M., and Yu, P. 2006. Continuous keyword search on multiple text streams. In Proceedings of the ACM CIKM International Conference on Information and Knowledge Management (CIKM). 802--803.
|
| |
22
|
Irmak, U., Mihaylov, S., Suel, T., Ganguly, S., and Izmailov, R. 2006. Efficient query subscription processing for prospective search engines. In Proceedings of the USENIX Annual Technical Conference. 375--380.
|
| |
23
|
Kacholia, V., Pandit, S., Chakrabarti, S., Sudarshan, S., Desai, R., and Karambelkar, H. 2005. Bidirectional expansion for keyword search on graph databases. In Proceedings of the International Conference on Very Large Data Bases (VLDB). 505--516.
|
| |
24
|
Kimelfeld, B. and Sagiv, Y. 2005. Efficiently enumerating results of keyword search. In Proceedings of the International Symposium on Database Programming Languages (DBPL). Lecture Notes in Computer Science, vol. 3774, 58--73.
|
| |
25
|
Kimelfeld, B. and Sagiv, Y. 2006. Finding and approximating top-k answers in keyword proximity search. In Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS). 173--182.
|
| |
26
|
Krämer, J. and Seeger, B. 2004. PIPES: A public infrastructure for processing and exploring streams. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD). 925--926.
|
| |
27
|
Li, G., Ooi, B. C., Feng, J., Wang, J., and Zhou, L. 2008. EASE: Efficient and adaptive keyword search on unstructured, semi-structured and structured data. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD). 903--914.
|
| |
28
|
Liu, F., Yu, C., Meng, W., and Chowdhury, A. 2006. Effective keyword search in relational databases. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD). 563--574.
|
| |
29
|
Liu, Z. and Chen, Y. 2007. Identifying meaningful return information for XML keyword search. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD). 329--340.
|
| |
30
|
Luo, Y., Lin, X., Wang, W., and Zhou, X. 2007. SPARK: Top-k keyword query in relational databases. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD). 115--126.
|
| |
31
|
Markowetz, A., Yang, Y., and Papadias, D. 2007. Keyword search on relational data streams. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD). 605--616.
|
| |
32
|
Markowetz, A., Yang, Y., and Papadias, D. 2009. Reachability indexes for relational keyword search. In Proceedings of the IEEE International Conference on Data Engineering (ICDE). 1163--1166.
|
| |
33
|
Sarda, N. L. and Jain, A. 2001. Mragyati: A system for keyword-based searching in databases. Tech. rep. CoRR, cs.DB/0110052.
|
| |
34
|
Sayyadian, M., Lekhac, H., Doan, A., and Gravano, L. 2007. Efficient keyword search across heterogeneous relational databases. In Proceedings of the IEEE International Conference on Data Engineering (ICDE). 346--355.
|
| |
35
|
Su, Q. and Widom, J. 2005. Indexing relational database content offline for efficient keyword-based search. In Proceedings of the International Database Engineering and Applications Symposium (IDEAS). 297--306.
|
| |
36
|
Vu, Q. H., Ooi, B. C, Papadias, D., and Tung, A. 2008. A graph method for keyword-based selection of the Top-k databases. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD). 915--926.
|
| |
37
|
Wu, P., Sismanis, Y., and Reinwald, B. 2007. Towards keyword-driven analytical processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD). 617--628.
|
| |
38
|
Xu, Y. and Papakonstantinou, Y. 2005. Efficient keyword search for smallest LCAs in XML databases. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD). 537--538.
|
| |
39
|
Yu, B., Li., G., Sollins, K., and Tung, A. 2007. Effective keyword-based selection of relational databases. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD). 139--150.
|
| |
40
|
Yan, T. W. and Garcia-Molina, H. 1999. The SIFT information dissemination system. ACM Trans. Datab. Syst. 24, 4, 529--565.
|
|