Effective keyword-based selection of relational databases
Source
International Conference on Management of Data
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Beijing, China
SESSION: Data source selection and integration
Pages: 139 - 150  
Year of Publication: 2007
ISBN:978-1-59593-686-8
Authors
Bei Yu  National University of Singapore, Singapore, Singapore
Guoliang Li  Tsinghua University, Beijing, China
Karen Sollins  MIT, Boston, MA
Anthony K. H. Tung  National University of Singapore, Singapore, Singapore
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1247480.1247498

ABSTRACT

The wide popularity of free-and-easy keyword-based searches over the World Wide Web has fueled the demand for incorporating keyword-based search over structured databases. However, most current research focuses on keyword-based searching over a single structured data source. With the growing interest in distributed databases and service-oriented architecture over the Internet, it is important to extend such a capability over multiple structured data sources. One of the most important problems in enabling such a query facility is selecting the most useful data sources relevant to the keyword query. Traditional database summary techniques developed in the IR literature for selecting unstructured data sources are inadequate for our problem, as they do not capture the structure of the data sources. In this paper, we study the database selection problem for relational data sources, and propose a method that effectively summarizes the relationships between keywords in a relational database based on its structure. We develop effective ranking methods based on the keyword relationship summaries in order to select the most useful databases for a given keyword query. We have implemented our system on PlanetLab. In that environment, we use extensive experiments with real datasets to demonstrate the effectiveness of our proposed summarization method.


