Efficient query reformulation in peer data management systems
Source: International Conference on Management of Data
Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Paris, France
Session: Research sessions: P2P and sensor networks
Pages: 539-550
Year of Publication: 2004
ISBN: 1-58113-859-8
Authors
Igor Tatarinov  University of Washington, Seattle, WA
Alon Halevy  University of Washington, Seattle, WA
Sponsor
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1007568.1007629

ABSTRACT

Peer data management systems (PDMS) offer a flexible architecture for decentralized data sharing. In a PDMS, every peer is associated with a schema that represents the peer's domain of interest, and semantic relationships between peers are provided locally between pairs (or small sets) of peers. By traversing semantic paths of mappings, a query over one peer can obtain relevant data from any reachable peer in the network. Semantic paths are traversed by reformulating queries at a peer into queries on its neighbors.

Naively following semantic paths is highly inefficient in practice. We describe several techniques for optimizing the reformulation process in a PDMS and validate their effectiveness using real-life data sets. In particular, we develop techniques for pruning paths in the reformulation process and for minimizing the reformulated queries as they are created. In addition, we consider the effect of the strategy we use to search through the space of reformulations. Finally, we show that pre-computing semantic paths in a PDMS can greatly improve the efficiency of the reformulation process. Together, all of these techniques form a basis for scalable query reformulation in PDMS.

To enable our optimizations, we developed practical algorithms, of independent interest, for checking containment and minimization of XML queries, and for composing XML mappings.
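
The idea of following semantic paths while pruning unproductive ones can be illustrated with a toy sketch. It is not the paper's algorithm: it assumes a query is just a set of attribute names, a mapping between two peers is a plain attribute-renaming dictionary, and "pruning" means dropping a path when a mapping cannot express the query or when the same reformulation was already reached by another path. The function name `reformulate`, the peer names, and the attribute names are all hypothetical.

```python
from collections import deque


def reformulate(start_peer, query, mappings):
    """Breadth-first reformulation of `query` starting at `start_peer`.

    `query` is a frozenset of attribute names (a toy stand-in for a real query).
    `mappings[(src, dst)]` is a dict renaming src attributes to dst attributes.
    Returns {peer: set of reformulated queries reachable at that peer}.
    """
    results = {start_peer: {query}}
    frontier = deque([(start_peer, query)])
    while frontier:
        peer, q = frontier.popleft()
        for (src, dst), rename in mappings.items():
            if src != peer:
                continue
            # Prune: the mapping cannot express every attribute the query needs.
            if not all(attr in rename for attr in q):
                continue
            q2 = frozenset(rename[attr] for attr in q)
            # Prune: the same reformulation already arrived at dst by another path.
            seen = results.setdefault(dst, set())
            if q2 in seen:
                continue
            seen.add(q2)
            frontier.append((dst, q2))
    return results


# Hypothetical four-peer network with a cycle (C -> A) and a lossy mapping (A -> D).
mappings = {
    ("A", "B"): {"title": "name", "author": "writer"},
    ("B", "C"): {"name": "label", "writer": "creator"},
    ("A", "C"): {"title": "label", "author": "creator"},
    ("C", "A"): {"label": "title", "creator": "author"},
    ("A", "D"): {"title": "t"},
}
results = reformulate("A", frozenset({"title", "author"}), mappings)
```

In this toy network, peer C is reachable both directly and through B, so the second arrival is discarded as redundant; the cycle back to A is cut for the same reason, and D is never reached because its mapping has no counterpart for `author`. Real reformulation over XML queries would need containment checks rather than set equality to detect redundancy.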


