ABSTRACT
We propose a framework for querying heterogeneous XML data sources. The framework preserves a high degree of autonomy for participating sources, since it relies neither on a global schema nor on semantic mappings between schemas. The basic intuition is to extend traditional approaches to approximate query evaluation with techniques for combining partial answers coming from different sources, possibly on the basis of limited knowledge about the local schemas (i.e., key constraints). We define a query language and its associated semantics, which allow us to collect as much information as possible from several heterogeneous XML sources. We provide algorithms for query evaluation and characterize the complexity of the query language. Finally, we validate the approach in a medical application scenario.
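To make the idea of combining partial answers through key constraints concrete, the following is a minimal sketch only, not the algorithm of the paper. It assumes that each partial answer has already been flattened into a dictionary and that a single field acts as the key; the function name `combine_partial_answers`, the field `patient_id`, and the sample records are all hypothetical.

```python
from collections import defaultdict


def combine_partial_answers(partial_answers, key):
    """Merge partial records that agree on the value of `key` (assumed key constraint)."""
    merged = defaultdict(dict)
    for record in partial_answers:
        if key not in record:
            # Without a key value we cannot tell which entity the record describes.
            continue
        target = merged[record[key]]
        for field, value in record.items():
            # Keep the first value seen for each field; conflict handling is out of scope here.
            target.setdefault(field, value)
    return dict(merged)


# Hypothetical partial answers from two sources with different local schemas.
source_a = [{"patient_id": "p1", "name": "Rossi"}]
source_b = [{"patient_id": "p1", "diagnosis": "asthma"}, {"diagnosis": "flu"}]

combined = combine_partial_answers(source_a + source_b, key="patient_id")
```

In this sketch the two records for `p1` are merged into one richer answer, while the record that lacks a key value is left out because it cannot be safely attributed to any entity.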