ACM Home Page
Please provide us with feedback. Feedback
Optimizing complex queries with multiple relation instances
Full text PdfPdf (616 KB)
Source
International Conference on Management of Data archive
Proceedings of the 2008 ACM SIGMOD international conference on Management of data table of contents
Vancouver, Canada
SESSION: Research Session 12: Query Optimization table of contents
Pages 525-538  
Year of Publication: 2008
ISBN:978-1-60558-102-6
Authors
Yu Cao  National University of Singapore, Singapore, Singapore
Gopal C. Das  National University of Singapore, Singapore, Singapore
Chee-Yong Chan  National University of Singapore, Singapore, Singapore
Kian-Lee Tan  National University of Singapore, Singapore, Singapore
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 22,   Downloads (12 Months): 239,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1376616.1376671
What is a DOI?

ABSTRACT

Today's query processing engines do not take advantage of the multiple occurrences of a relation in a query to improve performance. Instead, each instance is treated as a distinct relation and has its own independent table access method. In this paper, we present MAPLE, a Multi-instance-Aware PLan Evaluation engine that enables multiple instances of a relation to share one physical scan (called SharedScan) with limited buffer space. During execution, as SharedScan pulls a tuple for any instance, that tuple is also pushed to the buffers of other instances with matching predicates. To avoid buffer overflow, a novel interleaved execution strategy is proposed: whenever an instance's buffer becomes full, the execution is temporarily switched to a drainer (an ancestor blocking operator of the instance) to consume all the tuples in the buffer. Thus, the execution is interleaved between normal processing and drainers. We also propose a cost-based approach to generate a plan to maximize the shared scan benefit as well as to avoid interleaved execution deadlocks. MAPLE is light-weight and can be easily integrated into existing RDBMS executors. We have implemented MAPLE in PostgreSQL, and our experimental study on the TPC-DS benchmark shows significant reduction in execution time.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
Microsoft SQL Server Library. http://msdn2.microsoft.com/en-us/library/bb545450.aspx.
 
2
PostgreSQL. http://www.postgresql.org/.
 
3
TPC BENCHMARK Decision Support. http://www.tpc.org/tpcds/.
4
 
5
 
6
 
7
8
9
10
 
11
C. A. Lang, B. Bhattacharjee, T. Malkemus, S. Padmanabhan, and K. Wong. Increasing buffer-locality for multiple relational table scans through grouping and throttling. In ICDE, pages 1136--1145, 2007.
 
12
13
14
15
 
16
K. Wilkinson, C. Sayers, H. A. Kuno, and D. Reynolds. Efficient rdf storage and retrieval in jena2. In SWDB, pages 131--150, 2003.
17
18
 
19

Collaborative Colleagues:
Yu Cao: colleagues
Gopal C. Das: colleagues
Chee-Yong Chan: colleagues
Kian-Lee Tan: colleagues