Querying structured information sources on the web
Source: Proceedings of the 10th International Conference on Information Integration and Web-based Applications & Services
Linz, Austria
WORKSHOP SESSION: iiWAS 2008 workshops: RED 2008
Pages 470-476  
Year of Publication: 2008
ISBN:978-1-60558-349-5
Authors
Sergio Mergen  Universidade Federal do Rio, Porto Alegre - RS - Brasil
Juliana Freire  University of Utah, Salt Lake City - UT
Carlos Alberto Heuser  Universidade Federal do Rio, Porto Alegre - RS - Brasil
Sponsor
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1497308.1497394

ABSTRACT

To provide access to distributed and heterogeneous sources, information integration systems have traditionally relied on the availability of a mediated schema, along with mappings between this schema and the schema of the information sources. Queries posed to the mediated schema are then reformulated in terms of the source schemas. On the Web, where sources are plentiful, autonomous and extremely volatile, a system based on the existence of a pre-defined mediated schema and mapping information presents several drawbacks. Notably, the cost of keeping the mappings up to date as new sources are found or existing sources change can be prohibitively high. In this paper, we propose a new querying mechanism for integrating a large number of sources that requires neither a mediated schema nor source mappings. In the absence of a mediated schema, the user formulates queries based on what she expects to find. These queries are rewritten using a best-effort approach: the rewriting component compares a user query against the source schemas and produces a set of rewritings based on the matches found. We demonstrate the feasibility of this approach by providing a query interface for integrating hundreds of (real) structured Web information sources. We also discuss experimental results which indicate that our query rewriting algorithm can be effective.
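
The best-effort rewriting idea can be illustrated with a minimal sketch. The abstract does not specify how the query is matched against source schemas, so everything below is an assumption made for illustration: the function names `similarity` and `rewrite`, the 0.6 threshold, the coverage-based score, and the three example sources are all hypothetical, and plain string similarity from Python's `difflib` stands in for whatever matcher the authors actually use.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (a stand-in matcher)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rewrite(query_attrs, source_schemas, threshold=0.6):
    """Return (source, mapping, score) candidates, best first.

    query_attrs: attribute names the user expects to find.
    source_schemas: dict of source name -> list of attribute names.
    threshold: assumed minimum similarity for an attribute match.
    """
    rewritings = []
    for source, attrs in source_schemas.items():
        mapping = {}
        for q in query_attrs:
            # Pick the most similar source attribute for this query attribute.
            best = max(attrs, key=lambda a: similarity(q, a), default=None)
            if best is not None and similarity(q, best) >= threshold:
                mapping[q] = best
        if mapping:
            # Best effort: a partial match still yields a rewriting,
            # scored by the fraction of query attributes it covers.
            score = len(mapping) / len(query_attrs)
            rewritings.append((source, mapping, score))
    return sorted(rewritings, key=lambda r: r[2], reverse=True)


# Hypothetical sources, invented for illustration only.
sources = {
    "bookstore_a": ["title", "author", "price"],
    "library_b": ["book_title", "writer", "isbn"],
    "cars_c": ["make", "model", "year"],
}
result = rewrite(["title", "author"], sources)
for source, mapping, score in result:
    print(source, mapping, score)
```

In this sketch no mediated schema or hand-written mapping is involved: the user's expected attribute names are compared directly with each source schema, sources with no match are skipped, and partially matching sources are kept with a lower score.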

