ACM Home Page
Please provide us with feedback. Feedback
Exact and inexact methods for selecting views and indexes for OLAP performance improvement
Full text PdfPdf (858 KB)
Source ACM International Conference Proceeding Series; Vol. 261 archive
Proceedings of the 11th international conference on Extending database technology: Advances in database technology table of contents
Nantes, France
SESSION: Research sessions: Materialization and caching table of contents
Pages 311-322  
Year of Publication: 2008
ISBN:978-1-59593-926-5
Authors
Zohreh Asgharzadeh Talebi  NC State University, Raleigh, NC
Rada Chirkova  NC State Univ University, Raleigh, NC
Yahya Fathi  NC State University, Raleigh, NC
Matthias Stallmann  NC State Univ University, Raleigh, NC
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 8,   Downloads (12 Months): 120,   Citation Count: 1
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1353343.1353383
What is a DOI?

ABSTRACT

In on-line analytical processing (OLAP), precomputing (materializing as views) and indexing auxiliary data aggregations is a common way of reducing query-evaluation time costs for important data-analysis queries. We consider an OLAP view- and index-selection problem stated as an optimization problem, where (i) the inputs include the data-warehouse schema, a set of data-analysis queries of interest, and a storage-limit constraint, and (ii) the output is a set of views and indexes that minimizes the costs of the input queries, subject to the storage limit. While greedy and other heuristic strategies for choosing views or indexes might help to some extent in improving the costs, it is highly nontrivial to arrive at a globally optimum solution, one that reduces the processing costs of typical OLAP queries as much as is theoretically possible. In fact, as observed in [17] and to the best of our knowledge, there is no known approximation algorithm for OLAP view or index selection with nontrivial performance guarantees.

In this paper we propose a systematic study of the OLAP view- and index-selection problem. Our specific contributions are as follows: (1) We develop an algorithm that effectively and efficiently prunes the space of potentially beneficial views and indexes when given realistic-size instances of the problem. (2) We provide formal proofs that our pruning algorithm keeps at least one globally optimum solution in the search space, thus the resulting integer-programming model is guaranteed to find an optimal solution. (3) We develop a family of algorithms to further reduce the size of the search space, so that we are able to solve larger problem instances, although we no longer guarantee the global optimality of the resulting solution. (4) Finally, we present an experimental comparison of our proposed approaches with the state-of-the-art approaches of [2, 12]. Our experiments show that our approaches to view and index selection result in high-quality solutions --- in fact, in globally optimum solutions for many realistic-size problem instances. Thus, they compare favorably with the well-known OLAP-centered approach of [12] and provide for a winning combination with the end-to-end framework of [2] for generic view and index selection.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
F. N. Afrati and R. Chirkova. Selecting and using views to compute aggregate queries (extended abstract). In ICDT, pages 383--397, 2005.
 
2
 
3
Z. Asgharzadeh Talebi, R. Chirkova, and Y. Fathi. Exact and inexact methods for solving the problem of view selection for aggregate queries. Technical Report TR-2007-27, NC State University, 2007.
 
4
Z. Asgharzadeh Talebi, R. Chirkova, Y. Fathi, and M. Stallmann. Exact and inexact methods for selecting views and indexes for OLAP performance improvement. Technical report, NC State University, 2007. Available from ftp://ftp.ncsu.edu/pub/unity/lockers/ftp/csc_anon/tech/2007/TR-2007-31.pdf.
 
5
 
6
C. M. Broughton. IBM DB2 cube views and DB2 materialized query tables in a SAS environment. http://www.sas.com/partners/directory/ibm/cubeviews.pdf, 2005.
 
7
 
8
9
 
10
 
11
 
12
 
13
14
 
15
ILOG. CPLEX Homepage, 2004. Information on CPLEX is available at http://www.ilog.com/products/cplex/.
 
16
17
 
18
R. Kimball and M. Ross. The Data Warehouse Toolkit (second edition). Wiley Computer Publishing, 2002.
 
19
M. Kormilitsin, R. Chirkova, Y. Fathi, and M. Stallmann. View and index selection for query-performance improvement: Algorithms, heuristics and complexity. Technical report, NC State University, 2007.
 
20
J. Li, Z. A. Talebi, R. Chirkova, and Y. Fathi. A formal model for the problem of view selection for aggregate queries. In ADBIS, pages 125--138, 2005.
 
21
Microsoft. Web page of the AutoAdmin project: Self-tuning and self-administering databases. http://research.microsoft.com/research/dmx/autoadmin.
 
22
Microsoft. Web page of the data management, exploration and mining group. http://research.microsoft.com/research/dmx/.
 
23
 
24
TPC-H:. TPC Benchmark H (Decision Support). Available from http://www.tpc.org/tpch/spec/tpch2.1.0.pdf.
 
25


Collaborative Colleagues:
Zohreh Asgharzadeh Talebi: colleagues
Rada Chirkova: colleagues
Yahya Fathi: colleagues
Matthias Stallmann: colleagues