Fast scans and joins using flash drives
Source: Data Management On New Hardware
Proceedings of the 4th international workshop on Data management on new hardware
Vancouver, Canada
SESSION: Query processing on novel storage
Pages 17-24
Year of Publication: 2008
ISBN:978-1-60558-184-2
Authors
Mehul A. Shah  HP Labs
Stavros Harizopoulos  HP Labs
Janet L. Wiener  HP Labs
Goetz Graefe  HP Labs
Sponsors
IBM
Intel
Microsoft
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1457150.1457154

ABSTRACT

As access times to main memory and disks continue to diverge, faster non-volatile storage technologies become more attractive for speeding up data analysis applications. NAND flash is one such promising substitute for disks. Flash offers faster random reads than disk, consumes less power than disk, and is cheaper than DRAM. In this paper, we investigate alternative data layouts and join algorithms suited for systems that use flash drives as the non-volatile store.

All of our techniques take advantage of the fast random reads of flash. We convert traditional sequential I/O algorithms to ones that use a mixture of sequential and random I/O to process less data in less time. Our measurements on commodity flash drives show that a column-major layout of data pages is faster than a traditional row-based layout for simple scans. We present a new join algorithm, RARE-join, designed for a column-based page layout on flash and compare it to a traditional hash join algorithm. Our analysis shows that RARE-join is superior in many practical cases: when join selectivities are small and only a few columns are projected in the join result.
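
The trade-off described above can be illustrated with a rough page-count model. This is a minimal sketch under stated assumptions, not the paper's cost analysis or the RARE-join algorithm itself: it assumes fixed-width columns, a column layout in which each column is stored in its own run of pages, and a uniform page size. The function names, the 4 KB page size, and the table shape are all hypothetical and chosen only for illustration.

```python
import math


def pages_row_scan(num_rows, col_widths, page_size):
    """Pages read by a full scan of a row-major table.

    Every page holds whole rows, so all columns are read
    even if the query needs only a few of them.
    """
    row_bytes = sum(col_widths)
    rows_per_page = page_size // row_bytes
    return math.ceil(num_rows / rows_per_page)


def pages_column_scan(num_rows, col_widths, projected, page_size):
    """Pages read when each column lives in its own pages
    and only the projected columns are scanned."""
    total = 0
    for c in projected:
        values_per_page = page_size // col_widths[c]
        total += math.ceil(num_rows / values_per_page)
    return total


def pages_touched(row_ids, col_width, page_size):
    """Distinct pages of one column that must be fetched (by random
    reads) to retrieve the values for the given row ids."""
    values_per_page = page_size // col_width
    return len({r // values_per_page for r in row_ids})


if __name__ == "__main__":
    # Hypothetical table: 1,000,000 rows, ten 8-byte columns, 4 KB pages.
    widths = [8] * 10
    n, page = 1_000_000, 4096

    print("row-major scan pages:   ", pages_row_scan(n, widths, page))
    print("column scan (2 columns):", pages_column_scan(n, widths, [0, 1], page))

    # A selective join result: only every 10,000th row qualifies.
    matches = range(0, n, 10_000)
    print("pages fetched for matches:", pages_touched(matches, widths[2], page))
```

In this model, scanning two of ten columns reads roughly a fifth of the pages that a row-major scan reads, and fetching a projected column for a small set of qualifying rows touches only the pages that contain those rows. The second saving depends on cheap random reads, which is why the abstract singles out low join selectivity and few projected columns as the favorable cases.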



