ACM Home Page
Please provide us with feedback. Feedback
Frequent itemset mining on graphics processors
Full text PdfPdf (715 KB)
Source Data Management On New Hardware archive
Proceedings of the Fifth International Workshop on Data Management on New Hardware table of contents
Providence, Rhode Island
SESSION: Exploiting parallel hardware table of contents
Pages 34-42  
Year of Publication: 2009
ISBN:978-1-60558-701-1
Authors
Wenbin Fang  Hong Kong University of Science and Technology
Mian Lu  Hong Kong University of Science and Technology
Xiangye Xiao  Hong Kong University of Science and Technology
Bingsheng He  Microsoft Research Asia
Qiong Luo  Hong Kong University of Science and Technology
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 36,   Downloads (12 Months): 85,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1565694.1565702
What is a DOI?

ABSTRACT

We present two efficient Apriori implementations of Frequent Itemset Mining (FIM) that utilize new-generation graphics processing units (GPUs). Our implementations take advantage of the GPU's massively multi-threaded SIMD (Single Instruction, Multiple Data) architecture. Both implementations employ a bitmap data structure to exploit the GPU's SIMD parallelism and to accelerate the frequency counting operation. One implementation runs entirely on the GPU and eliminates intermediate data transfer between the GPU memory and the CPU memory. The other implementation employs both the GPU and the CPU for processing. It represents itemsets in a trie, and uses the CPU for trie traversing and incremental maintenance. Our preliminary results show that both implementations achieve a speedup of up to two orders of magnitude over optimized CPU Apriori implementations on a PC with an NVIDIA GTX 280 GPU and a quad-core CPU.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
2
 
3
 
4
Lamine M. Aouad, Nhien-An Le-Khac, and Tahar M. Kechadi. Distributed frequent itemsets mining in heterogeneous platforms. Journal of Engineering, Computing and Architecture, 2007.
5
 
6
Ferenc Bodon. A fast apriori implementation. FIMI, 2003.
7
 
8
9
 
10
 
11
 
12
Bart Goethals and Mohammed Javeed Zaki. Advances in frequent itemset mining implementations: Introduction to fimi'03. FIMI, 2003.
13
14
15
 
16
17
18
19
 
20
 
21
 
22
Ykä Huhtala, Juha Kärkkäinen, Pasi Porkka, and Hannu Toivonen. Tane: An efficient algorithm for discovering functional and approximate dependencies. The Computer Journal, 1999.
23
24
 
25
 
26
John D. Owens, David Luebke, Naga Govindaraju, Mark Harris, Jens Krĺźger, Aaron E. Lefohn, and Timothy J. Purcell. A survey of general-purpose computation on graphics hardware. In Computer Graphics Forum, 2007.
 
27
Lance Parsons, Ehtesham Haque, and Huan Liu. Evaluating subspace clustering algorithms. SDM, 2004.
 
28
S. Parthasarathy, M. J. Zaki, M. Ogihara, and W. Li. Parallel data mining for association rules on shared memory systems. In Knowledge and Information Systems, 2001.
 
29
Jayaprakash Pisharath, Ying Liu, Wei keng Liao, Alok Choudhary, Gokhan Memik, and Janaki Parhi. Nu-minebench 2.0. Technical report, Northwestern University, 2005.
 
30
 
31
 
32
 
33
Mohammed J Zaki, Srinivasan Parthasarathy, Mitsunori Ogihara, and Wei Li. New algorithms for fast discovery of association rules. KDD, 1997.
34

Collaborative Colleagues:
Wenbin Fang: colleagues
Mian Lu: colleagues
Xiangye Xiao: colleagues
Bingsheng He: colleagues
Qiong Luo: colleagues