ABSTRACT
Peer data management systems (PDMS) offer a flexible architecture for decentralized data sharing. In a PDMS, every peer is associated with a schema that represents the peer's domain of interest, and semantic relationships between peers are provided locally between pairs (or small sets) of peers. By traversing semantic paths of mappings, a query over one peer can obtain relevant data from any reachable peer in the network. Semantic paths are traversed by reformulating queries at a peer into queries on its neighbors.

Naively following semantic paths is highly inefficient in practice. We describe several techniques for optimizing the reformulation process in a PDMS and validate their effectiveness using real-life data sets. In particular, we develop techniques for pruning paths in the reformulation process and for minimizing the reformulated queries as they are created. In addition, we consider the effect of the strategy we use to search through the space of reformulations. Finally, we show that pre-computing semantic paths in a PDMS can greatly improve the efficiency of the reformulation process. Together, all of these techniques form a basis for scalable query reformulation in PDMS.

To enable our optimizations, we developed practical algorithms, of independent interest, for checking containment and minimization of XML queries, and for composing XML mappings.
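
As a rough illustration of why naive traversal of semantic paths is costly, the following toy sketch enumerates paths through a small peer network breadth-first, with and without a simple pruning rule. It is not the paper's algorithm: the function name `enumerate_semantic_paths`, the `mappings` dictionary, and the "skip peers already reached" rule are all assumptions made for illustration, and a real system would rewrite the query at each step rather than merely record the path.

```python
from collections import deque


def enumerate_semantic_paths(mappings, start, prune=True):
    """Enumerate paths of mappings from `start`, breadth-first.

    mappings: dict from a peer name to the list of neighbouring peers it has
        a mapping to (hypothetical toy model, not the paper's data structure).
    prune: if True, skip any peer that some earlier path already reached;
        this is only a stand-in for the pruning ideas the abstract mentions.
    """
    paths = []
    reached = {start}
    queue = deque([(start,)])
    while queue:
        path = queue.popleft()
        for neighbour in mappings.get(path[-1], []):
            if neighbour in path:
                continue  # never follow a cycle back onto the same path
            if prune and neighbour in reached:
                continue  # assumed pruning rule: peer is already covered
            reached.add(neighbour)
            new_path = path + (neighbour,)
            paths.append(new_path)
            queue.append(new_path)
    return paths


# Hypothetical four-peer network in which D is reachable via B and via C.
example_mappings = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
naive = enumerate_semantic_paths(example_mappings, "A", prune=False)
pruned = enumerate_semantic_paths(example_mappings, "A", prune=True)
```

In this toy network the unpruned search follows both routes to peer D, while the pruned search keeps only the first. In larger networks with many alternative routes, the number of unpruned paths can grow very quickly.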