Spark: top-k keyword query in relational databases
Source
International Conference on Management of Data
Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data
Beijing, China
SESSION: Top-k queries and ranking
Pages: 115 - 126  
Year of Publication: 2007
ISBN:978-1-59593-686-8
Authors
Yi Luo  University of New South Wales, Sydney, Australia
Xuemin Lin  University of New South Wales, Sydney, Australia
Wei Wang  University of New South Wales, Sydney, Australia
Xiaofang Zhou  University of Queensland, Brisbane, Australia
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1247480.1247495

ABSTRACT

With the increasing amount of text data stored in relational databases, there is a demand for RDBMSs to support keyword queries over text data. As a search result is often assembled from multiple relational tables, traditional IR-style ranking and query evaluation methods cannot be applied directly.

In this paper, we study the effectiveness and efficiency issues of answering top-k keyword queries in relational database systems. We propose a new ranking formula by adapting existing IR techniques based on a natural notion of a virtual document. Compared with previous approaches, our new ranking method is simple yet effective, and agrees with human perceptions. We also study efficient query processing methods for the new ranking method, and propose algorithms that make minimal accesses to the database. We have conducted extensive experiments on large-scale real databases using two popular RDBMSs. The experimental results demonstrate significant improvements over the alternative approaches in terms of retrieval effectiveness and efficiency.
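
To make the virtual-document idea concrete, here is a minimal sketch that treats the text of all tuples in one joined result as a single document and ranks candidate results by a plain TF-IDF sum. The scoring function, the helper names (`virtual_document`, `score`, `top_k`), and the in-memory candidate list are all assumptions for illustration; this is not the ranking formula or the query processing algorithm proposed in the paper.

```python
import heapq
import math
from collections import Counter


def virtual_document(joined_tuples):
    """Concatenate the text of every tuple in one joined result into one token list."""
    tokens = []
    for text in joined_tuples:
        tokens.extend(text.lower().split())
    return tokens


def score(query_terms, doc_tokens, doc_freq, num_docs):
    """Plain TF-IDF sum over query terms (illustrative stand-in, not the paper's formula)."""
    tf = Counter(doc_tokens)
    total = 0.0
    for term in query_terms:
        if tf[term] == 0:
            continue  # term absent from this virtual document
        idf = math.log((num_docs + 1) / (doc_freq.get(term, 0) + 1))
        total += (1 + math.log(tf[term])) * idf
    return total


def top_k(query, candidates, k):
    """Return the k best (score, candidate_index) pairs, highest score first."""
    query_terms = query.lower().split()
    docs = [virtual_document(c) for c in candidates]
    # Document frequency: number of virtual documents containing each term.
    doc_freq = Counter()
    for d in docs:
        doc_freq.update(set(d))
    scored = [(score(query_terms, d, doc_freq, len(docs)), i)
              for i, d in enumerate(docs)]
    return heapq.nlargest(k, scored)
```

Each candidate here is a list of text fields, one per tuple in a joined result. For example, `top_k("keyword databases", candidates, 2)` scores every candidate in full before selecting the best two, whereas the abstract describes algorithms designed to minimize database accesses.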


