ABSTRACT
To provide access to distributed and heterogeneous sources, information integration systems have traditionally relied on the availability of a mediated schema, along with mappings between this schema and the schemas of the information sources. Queries posed to the mediated schema are then reformulated in terms of the source schemas. On the Web, where sources are plentiful, autonomous, and extremely volatile, a system based on a pre-defined mediated schema and mapping information has several drawbacks. Notably, the cost of keeping the mappings up to date as new sources are found or existing sources change can be prohibitively high. In this paper, we propose a new querying mechanism for integrating a large number of sources that requires neither a mediated schema nor source mappings. In the absence of a mediated schema, the user formulates queries based on what she expects to find. These queries are rewritten using a best-effort approach: the rewriting component compares a user query against the source schemas and produces a set of rewritings based on the matches found. We demonstrate the feasibility of this approach by providing a query interface for integrating hundreds of (real) structured Web information sources. We also discuss experimental results that indicate that our query rewriting algorithm can be effective.
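As an illustration of the best-effort idea described above, the following is a minimal sketch and not the paper's algorithm, which the abstract does not specify. It assumes a query is just a list of attribute names, that each source exposes a flat list of attribute names, and that a match is decided by a string-similarity threshold. The names `similarity`, `rewrite`, the `threshold` value, and the example sources are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (an assumed matcher)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rewrite(query_attrs, sources, threshold=0.7):
    """Compare the query attributes against every source schema.

    Returns a list of (score, source_name, mapping) tuples, best first.
    Best-effort: a source is kept even if only some attributes match;
    the score is the summed similarity divided by the query size.
    """
    rewritings = []
    for name, schema in sources.items():
        mapping = {}
        for q in query_attrs:
            # Pick the source attribute most similar to the query attribute.
            best = max(schema, key=lambda s: similarity(q, s), default=None)
            if best is not None and similarity(q, best) >= threshold:
                mapping[q] = best
        if mapping:
            total = sum(similarity(q, s) for q, s in mapping.items())
            rewritings.append((total / len(query_attrs), name, mapping))
    rewritings.sort(key=lambda r: r[0], reverse=True)
    return rewritings


# Hypothetical source schemas, invented for this example.
sources = {
    "cars_a": ["make", "model", "price"],
    "books_b": ["title", "author", "isbn"],
    "cars_d": ["model", "year"],
}

for score, name, mapping in rewrite(["make", "model", "price"], sources):
    print(name, round(score, 2), mapping)
```

In this sketch a source that covers only part of the query still yields a rewriting, ranked below sources that cover more of it, and sources with no matching attribute are dropped. A real system would need a richer matcher than plain string similarity.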