EASE: an effective 3-in-1 keyword search method for unstructured, semi-structured and structured data
Source
International Conference on Management of Data
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Vancouver, Canada
SESSION: Research Session 19: Keywords on Structure
Pages 903-914
Year of Publication: 2008
ISBN:978-1-60558-102-6
Authors
Guoliang Li  Tsinghua University, Beijing, China
Beng Chin Ooi  National University of Singapore, Singapore, Singapore
Jianhua Feng  Tsinghua University, Beijing, China
Jianyong Wang  Tsinghua University, Beijing, China
Lizhu Zhou  Tsinghua University, Beijing, China
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 42,   Downloads (12 Months): 338,   Citation Count: 9
DOI: http://doi.acm.org/10.1145/1376616.1376706

ABSTRACT

Conventional keyword search engines are restricted to a given data model and cannot easily adapt to unstructured, semi-structured or structured data. In this paper, we propose an efficient and adaptive keyword search method, called EASE, for indexing and querying large collections of heterogeneous data. To achieve high efficiency in processing keyword queries, we first model unstructured, semi-structured and structured data as graphs, and then summarize the graphs and construct graph indices instead of using traditional inverted indices. We propose an extended inverted index to facilitate keyword-based search, and present a novel ranking mechanism for enhancing search effectiveness. We have conducted an extensive experimental study using real datasets, and the results show that EASE achieves both high search efficiency and high accuracy, and outperforms the existing approaches significantly.
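The general idea of treating documents, XML elements and relational tuples uniformly as graph nodes can be illustrated with a minimal sketch. This is not the EASE algorithm, its graph index, or its ranking function, none of which the abstract specifies. The class `DataGraph`, the node names, the hop-count cost and the `radius` parameter are all assumptions introduced here for illustration only.

```python
from collections import defaultdict, deque


class DataGraph:
    """Toy undirected graph whose nodes carry text (hypothetical helper, not from the paper)."""

    def __init__(self):
        self.text = {}                  # node id -> text content
        self.adj = defaultdict(set)     # node id -> neighbouring node ids
        self.index = defaultdict(set)   # keyword -> ids of nodes containing it

    def add_node(self, node, text):
        self.text[node] = text
        for word in text.lower().split():
            self.index[word].add(node)

    def add_edge(self, a, b):
        self.adj[a].add(b)
        self.adj[b].add(a)

    def distances_from(self, start, radius):
        """Breadth-first search returning {node: hops} for nodes within `radius` hops."""
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if dist[node] == radius:
                continue  # do not expand beyond the radius
            for nxt in self.adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        return dist

    def search(self, keywords, radius=2):
        """Return (cost, center) pairs, lowest cost first.

        A center qualifies if every keyword occurs within `radius` hops of it;
        the cost (an assumed stand-in for ranking) sums the nearest hop counts.
        """
        results = []
        for center in self.text:
            dist = self.distances_from(center, radius)
            cost = 0
            for kw in (k.lower() for k in keywords):
                hits = [dist[n] for n in self.index.get(kw, ()) if n in dist]
                if not hits:
                    break  # keyword not reachable from this center
                cost += min(hits)
            else:
                results.append((cost, center))
        return sorted(results)


g = DataGraph()
g.add_node("doc1", "keyword search survey")      # unstructured document
g.add_node("xml:paper", "paper")                 # XML element
g.add_node("xml:title", "EASE keyword search")   # child XML element
g.add_node("xml:author", "Guoliang Li")          # child XML element
g.add_node("row:author7", "Li Tsinghua")         # relational tuple
g.add_node("row:paper3", "graph index")          # relational tuple
g.add_edge("xml:paper", "xml:title")             # parent-child edge
g.add_edge("xml:paper", "xml:author")            # parent-child edge
g.add_edge("row:author7", "row:paper3")          # foreign-key edge

results = g.search(["keyword", "Li"], radius=1)
print(results)  # [(2, 'xml:paper')]
```

With `radius=1`, only the `xml:paper` node has both keywords within one hop, so it is the single answer. A real system would need a precomputed index to avoid running a search from every node, which is the kind of cost the abstract's graph indices are meant to address.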


