|
ABSTRACT
We consider an architecture of mediators and wrappers for Internet accessible WebSources of limited query capability. Each call to a source is a WebSource Implementation (WSI) and it is associated with both a capability and (a possibly dynamic) cost. The multiplicity of WSIs with varying costs and capabilities increases the complexity of a traditional optimizer that must assign WSIs for each remote relation in the query while generating an (optimal) plan. We present a two-phase Web Query Optimizer (WQO). In a pre-optimization phase, the WQO selects one or more WSIs for a pre-plan; a pre-plan represents a space of query evaluation plans (plans) based on this choice of WSIs. The WQO uses cost-based heuristics to evaluate the choice of WSI assignment in the pre-plan and to choose a good pre-plan. The WQO uses the pre-plan to drive the extended relational optimizer to obtain the best plan for a pre-plan. A prototype of the WQO has been developed. We compare the effectiveness of the WQO, i.e., its ability to efficiently search a large space of plans and obtain a low cost plan, in comparison to a traditional optimizer. We also validate the cost-based heuristics by experimental evaluation of queries in the noisy Internet environment.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
S. Adali , K. S. Candan , Y. Papakonstantinou , V. S. Subrahmanian, Query caching and optimization in distributed mediator systems, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.137-146, June 04-06, 1996, Montreal, Quebec, Canada
|
| |
2
|
|
 |
3
|
|
| |
4
|
L. Bright, J.-R. Gruser, L. Raschid, and M. E. Vidal. A wrapper generation toolkit to specify and construct wrappers for webaccesible data sources (websources). Journal of Computer Systems Special Issue on Semantics in the WWW, 14(2):83-98, 1999.
|
| |
5
|
L. Bright and L. Raschid. Cost modeling of wrappers for web accesible data sources (websources). http://www.umiacs.umd.edu/labs/CLIP/DARPA/ww97.html. (under review), 1998.
|
| |
6
|
|
 |
7
|
Jianjun Chen , David J. DeWitt , Feng Tian , Yuan Wang, NiagaraCQ: a scalable continuous query system for Internet databases, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.379-390, May 15-18, 2000, Dallas, Texas, United States
|
| |
8
|
|
| |
9
|
B. Eckman, A. Kosky, and L. Laroco. Extending traditional query-based integration approaches for functional characterization of post-genomic data. BioInformatics, 17(7):587-601, 2001.
|
| |
10
|
|
 |
11
|
Daniela Florescu , Alon Levy , Ioana Manolescu , Dan Suciu, Query optimization in the presence of limited access patterns, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.311-322, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
|
| |
12
|
|
| |
13
|
Georges Gardarin , Sofiane Gannouni , Béatrice Finance , Peter Fankhausera , Wolfgang Klas , Dominique Pastrea , Régis Legoff , Antonis Ramfos, IRO-DB: a distributed system federating object and relational databases, Object-oriented multidatabase systems: a solution for advanced applications, Prentice Hall International (UK) Ltd., Hertfordshire, UK, 1995
|
 |
14
|
|
| |
15
|
|
| |
16
|
|
| |
17
|
L. M. Haas , P. M. Schwarz , P. Kodali , E. Kotlar , J. E. Rice , W. C. Swope, DiscoveryLink: a system for integrated access to life sciences data sources, IBM Systems Journal, v.40 n.2, p.489-511, February 2001
|
 |
18
|
|
| |
19
|
J. Hellerstein et al. Adaptive query processing: Technology in evolution. IEEE Data Engineering Bulletin, 2000.
|
 |
20
|
|
 |
21
|
Zachary G. Ives , Daniela Florescu , Marc Friedman , Alon Levy , Daniel S. Weld, An adaptive query execution system for data integration, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.299-310, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
|
| |
22
|
Z. Ives, A. Levy, D. Weld, D. Florescu, and M. Friedman. Adaptive query processing for internet applications. IEEE Data Engineering Bulletin, 23(2):19-26, 2000.
|
| |
23
|
|
| |
24
|
C. Li and E. Chang. Query planning with limited source capabilities. Proceedings of ICDE, 2000.
|
| |
25
|
|
| |
26
|
ACM Digital Library. http://www.acm.org/dl/Search.html.
|
| |
27
|
|
| |
28
|
|
| |
29
|
Bureau of Labor Statistics. http://stats.bls.gov.
|
| |
30
|
Yannis Papakonstantinou , Ashish Gupta , Laura Haas, Capabilities-based query rewriting in mediator systems, Proceedings of the fourth international conference on on Parallel and distributed information systems, p.170-183, December 18-20, 1996, Miami Beach, Florida, United States
|
| |
31
|
|
| |
32
|
P. Selinger, M. Astrahan, D. Chamberlin, R. Lorie, and T. Price. Access path selection in a relational database management system. 1979.
|
| |
33
|
|
| |
34
|
|
| |
35
|
|
| |
36
|
T. Urhan and M. Franklin. Xjoin: A reactively-scheduled pipelined join operator. IEEE Data Engineering Bulletin, 23(2):27-33, 2000.
|
| |
37
|
|
 |
38
|
Tolga Urhan , Michael J. Franklin , Laurent Amsaleg, Cost-based query scrambling for initial delays, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.130-141, June 01-04, 1998, Seattle, Washington, United States
|
| |
39
|
|
| |
40
|
V. Vassalos and Y. Papakonstantinou. Using knowledge of redundancy for query optimization in mediators. Proceedings of the AAAI Symposium on AI and Data Integration, 1998.
|
| |
41
|
M. E. Vidal. A Mediator for Scaling up to Multiple WebSources. PhD thesis, University Simon Bolivar, 2000.
|
| |
42
|
M. E. Vidal, L. Raschid, and V. Zadorozhny. Decision support model for pre-plans. In preparation, 2001.
|
| |
43
|
|
| |
44
|
|
|