|
ABSTRACT
In this paper, we propose a new strategy for optimizing the placement of bin boundaries to minimize the cost of query evaluation using bitmap indices with binning. For attributes with a large number of distinct values, often the most efficient index scheme is a bitmap index with binning. However, this type of index may not be able to fully resolve some user queries. To fully resolve these queries, one has to access parts of the original data to check whether certain candidate records actually satisfy the specified conditions. We call this procedure the candidate check, which usually dominates the total query processing time. Given a set of user queries, we seek to minimize the total time required to an-swer the queries by optimally placing the bin boundaries. We show that our dynamic programming based algorithm can efficiently determine the bin boundaries. We verify our analysis with some real user queries from the Sloan Digital Sky Survey. For queries that require significant amount of time to perform candidate check, using our optimal bin boundaries reduces the candidate check time by a factor of 2 and the total query processing time by 40%.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
G. Antoshenkov. Byte-aligned Bitmap Compression. Technical report, Oracle Corp., 1994. U.S. Patent number 5,363,098.
|
| |
3
|
|
 |
4
|
|
 |
5
|
|
 |
6
|
|
| |
7
|
|
 |
8
|
|
 |
9
|
Nick Koudas , S. Muthukrishnan , Divesh Srivastava, Optimal histograms for hierarchical range queries (extended abstract), Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.196-204, May 15-18, 2000, Dallas, Texas, United States
[doi> 10.1145/335168.335223]
|
| |
10
|
|
 |
11
|
|
| |
12
|
D. Rotem, K. Stockinger, and K. Wu. Efficient Binning for Bitmap Indices on High-Cardinality Attributes. Technical Report LBNL-56936, Berkeley Lab, Berkeley, California, USA, Nov. 2004.
|
| |
13
|
Sloan Digital Sky Survey. http://www.sdss.org/dr1/.
|
| |
14
|
K. Stockinger, K. Wu, R. Brun, and P. Canal. Bitmap Indices for Fast End-User Physics Analysis in ROOT. In Nuclear Instruments and Methods in Physics Research. Elsevier. to appear.
|
| |
15
|
K. Stockinger, K. Wu, and A. Shoshani. Evaluation Strategies for Bitmap Indices with Binning. In International Conference on Database and Expert Systems Applications (DEXA 2004), Zaragoza, Spain, Sept. 2004. Springer-Verlag.
|
 |
16
|
Alexander S. Szalay , Peter Z. Kunszt , Ani Thakar , Jim Gray , Don Slutz , Robert J. Brunner, Designing and mining multi-terabyte astronomy archives: the Sloan Digital Sky Survey, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.451-462, May 15-18, 2000, Dallas, Texas, United States
|
| |
17
|
K. Wu, E. J. Otoo, and A. Shoshani. On the Performance of Bitmap Indices for High Cardinality Attributes. In VLDB 2004, Toronto, Canada, Sept. 2004. Morgan Kaufmann.
|
| |
18
|
K. Wu, W.-M. Zhang, V. Perevoztchikov, J. Lauret, and A. Shoshani. The Grid Collector: Using an Event Catalog to Speedup User Analysis in Distributed Environment. In Computing in High Energy and Nuclear Physics, (CHEP 2004), Interlaken, Switzerland, Sept. 2004.
|
| |
19
|
K.-L. Wu and P.S. Yu. Range-Based Bitmap Indexing for High-Cardinality Attributes with Skew. Technical report, IBM Watson Research Center, May 1996.
|
| |
20
|
|
|