ACM Home Page
Please provide us with feedback. Feedback
Strategies for processing ad hoc queries on large data warehouses
Full text PdfPdf (245 KB)
Source Data Warehousing and OLAP archive
Proceedings of the 5th ACM international workshop on Data Warehousing and OLAP table of contents
McLean, Virginia, USA
Pages: 72 - 79  
Year of Publication: 2002
ISBN:1-58113-590-4
Authors
Kurt Stockinger  CERN, Geneva, Switzerland
Kesheng Wu  Lawrence Berkeley Nat'l Lab, Berkeley, CA
Arie Shoshani  Lawrence Berkeley Nat'l Lab, Berkeley, CA
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
SIGMIS: ACM Special Interest Group on Management Information Systems
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 10,   Downloads (12 Months): 61,   Citation Count: 6
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/583890.583901
What is a DOI?

ABSTRACT

As data warehousing applications grow in size, existing data organizations and access strategies, such as relational tables and B-tree indexes, are becoming increasingly ineffective. The two primary reasons for this are that these datasets involve many attributes and the queries on the data usually involve conditions on small subsets of the attributes. Two strategies are known to address these difficulties well, namely vertical partitioning and bitmap indexes. In this paper, we summarize our experience of implementing a number of bitmap index schemes on vertically partitioned data tables. One important observation is that simply scanning the vertically partitioned data tables is often more efficient than using B-tree based indexes to answer ad hoc range queries on static datasets. For these range queries, compressed bitmap indexes are in most cases more efficient than scanning vertically partitioned tables. We evaluate the performance of two different compression schemes for bitmap indexes stored is various ways. Using the compression scheme called Word-Aligned Hybrid Code (WAH) to store the bitmaps in plain files shows the best overall performance for bitmap indexes. Tests indicate that our bitmap index strategy based on WAH is not only efficient for attributes of low cardinality, say, <100, but also for high-cardinality attributes with 200,000 or more distinct values.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
G. Antoshenkov. Byte-Aligned Bitmap Compression. Technical Report, Oracle Corp., 1994. U.S. Patent number 5,363,098.
 
3
 
4
5
6
7
8
9
 
10
 
11
M. Jürgens and H.-J. Lenz. Tree Based Indexes vs. Bitmap Indexes - a Performance Study. International Journal of Cooperative Information Systems, 10(3):355--376, 2001.
 
12
V. Markl and R. Bayer. Processing Relational OLAP Queries with UB-Trees and Multidimensional Hierarchical Clustering. In Proceedings of DMDW 2000, June 5--6, 2000.
 
13
14
 
15
 
16
 
17
18
 
19
K. Wu, E. J. Otoo, A. Shoshani, and H. Nordberg. Notes on Design and Implementation of Compressed Bit Vectors. Technical Report LBNL/PUB-3161, Lawrence Berkeley National Laboratory, Berkeley, CA, 2001.

CITED BY  6

Collaborative Colleagues:
Kurt Stockinger: colleagues
Kesheng Wu: colleagues
Arie Shoshani: colleagues