Adapting to source properties in processing data integration queries
Source: Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data
Paris, France
Session: Research sessions: data integration
Pages: 395-406
Year of Publication: 2004
ISBN: 1-58113-859-8
Authors
Zachary G. Ives  University of Pennsylvania
Alon Y. Halevy  University of Washington
Daniel S. Weld  University of Washington
Sponsor
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 7,   Downloads (12 Months): 72,   Citation Count: 19
DOI: http://doi.acm.org/10.1145/1007568.1007613

ABSTRACT

An effective query optimizer finds a query plan that exploits the characteristics of the source data. In data integration, little is known in advance about sources' properties, which necessitates the use of adaptive query processing techniques to adjust query processing on the fly. Prior work in adaptive query processing has focused on compensating for delays and adjusting for mis-estimated cardinality or selectivity values. In this paper, we present a generalized architecture for adaptive query processing and introduce a new technique, called adaptive data partitioning (ADP), which is based on the idea of dividing the source data into regions, each executed by different, complementary plans. We show how this model can be applied in novel ways not only to correct for underestimated selectivity and cardinality values, but also to discover and exploit order in the source data, and to detect and exploit source data that can be effectively pre-aggregated. We experimentally compare a number of alternative strategies and show that our approach is effective.
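
The partitioning idea in the abstract can be illustrated with a toy example. The sketch below is not the paper's algorithm or architecture; it is a minimal illustration under stated assumptions: each input relation is a list of `(key, payload)` tuples, each is split at a fixed position into two regions, and a region pair is joined with a merge join when both sides happen to be sorted and with a hash join otherwise. The function names (`partitioned_join`, `hash_join`, `merge_join`, `is_sorted`), the fixed split points, and the final cross-region step are all hypothetical choices made for this sketch; the cross-region step is included only because the toy join would otherwise miss matches that span regions.

```python
from collections import defaultdict


def hash_join(left, right):
    """Plan A: build a hash table on `left`, then probe it with `right`."""
    table = defaultdict(list)
    for key, payload in left:
        table[key].append(payload)
    return [(key, l, r) for key, r in right for l in table.get(key, [])]


def merge_join(left, right):
    """Plan B: merge join; correct only when both inputs are sorted by key."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Locate the run of equal keys on each side, then pair them up.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            out.extend((lk, l[1], r[1])
                       for l in left[i:i_end] for r in right[j:j_end])
            i, j = i_end, j_end
    return out


def is_sorted(rows):
    """True when the rows are in non-decreasing key order."""
    return all(a[0] <= b[0] for a, b in zip(rows, rows[1:]))


def partitioned_join(r, s, split_r, split_s):
    """Join r and s region by region, picking a plan per region (toy sketch)."""
    r_regions = [r[:split_r], r[split_r:]]
    s_regions = [s[:split_s], s[split_s:]]
    results, plans_used = [], []
    for r_part, s_part in zip(r_regions, s_regions):
        # Exploit order where it exists; fall back to hashing otherwise.
        plan = merge_join if is_sorted(r_part) and is_sorted(s_part) else hash_join
        plans_used.append(plan.__name__)
        results.extend(plan(r_part, s_part))
    # Assumption of this sketch: matches that span regions are produced
    # in a final pass so that the union equals the full join.
    results.extend(hash_join(r_regions[0], s_regions[1]))
    results.extend(hash_join(r_regions[1], s_regions[0]))
    return results, plans_used


R = [(3, "a"), (1, "b"), (2, "c"), (4, "d"), (5, "e"), (5, "f")]
S = [(2, "x"), (1, "y"), (3, "w"), (4, "v"), (5, "z")]
joined, plans = partitioned_join(R, S, split_r=3, split_s=2)
print(plans)
print(sorted(joined))
```

In this example the first region of each input is unsorted and the second is sorted, so the two regions end up being joined by different plans while the combined output matches a join over the whole input.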

