Querying text databases and the web: beyond traditional keyword search
Source
International Conference on Management of Data
Proceedings of the First International Workshop on Keyword Search on Structured Data
Providence, Rhode Island
SESSION: Keynote talk 2
Pages: 2
Year of Publication: 2009
ISBN: 978-1-60558-570-3
Author
Luis Gravano, Columbia University
Sponsors
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
SIGMOD: ACM Special Interest Group on Management of Data
SIGART: ACM Special Interest Group on Artificial Intelligence
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1557670.1557674

ABSTRACT

Traditional keyword search, where a query is a list of keywords and the result is a relevance-ordered list of documents, is a powerful query paradigm for text databases and the Web. However, more expressive query paradigms, in which both queries and their results can exhibit richer structure than in traditional keyword search, are often desirable. Information extraction systems identify and extract the intrinsically structured data embedded in natural-language text documents, and thereby enable these alternative query paradigms. Unfortunately, information extraction is time-consuming and often involves complex text analysis, so exhaustively processing all documents in a large text database, or on the Web, can be prohibitively expensive. Beyond efficiency, query result quality also matters: information extraction is error-prone, and not all extracted data is equally likely to be correct. In this talk, I will discuss recent work on cost-based optimization of structured queries in this information extraction scenario, where modeling query result quality, in addition to execution efficiency, is a distinctive and important challenge.