Out-of-core frequent pattern mining on a commodity PC
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Philadelphia, PA, USA
SESSION: Research track papers
Pages: 86 - 95  
Year of Publication: 2006
ISBN: 1-59593-339-5
Authors
Gregory Buehrer  Ohio State University, Columbus, OH
Srinivasan Parthasarathy  Ohio State University, Columbus, OH
Amol Ghoting  Ohio State University, Columbus, OH
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 8,   Downloads (12 Months): 92,   Citation Count: 3
DOI: http://doi.acm.org/10.1145/1150402.1150416

ABSTRACT

In this work, we focus on the problem of frequent itemset mining on large, out-of-core data sets. After presenting a characterization of existing out-of-core frequent itemset mining algorithms and their drawbacks, we introduce our efficient, highly scalable solution. Presented in the context of the FPGrowth algorithm, our technique involves several novel I/O-conscious optimizations, such as approximate hash-based sorting and blocking, and leverages recent architectural advancements in commodity computers, such as 64-bit processing. We evaluate the proposed optimizations on truly large data sets, up to 75 GB, and show that they yield a greater than 400-fold improvement in execution time. Finally, we discuss the impact of this research in the context of other pattern mining challenges, such as sequence mining and graph mining.
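
For readers unfamiliar with the setting, the following is a minimal in-memory sketch of the standard first stage of FP-Growth: count each item's support, discard infrequent items, and reorder every transaction by descending support. It is not the out-of-core technique described in the abstract, and it does not implement the hash-based sorting or blocking optimizations. The function name, the tie-breaking rule, and the sample data are illustrative assumptions.

```python
from collections import Counter


def frequent_items_and_ordered_transactions(transactions, min_support):
    """Illustrative FP-Growth preprocessing (hypothetical helper, in-memory only).

    Returns a dict of frequent items with their support counts, plus each
    transaction filtered to frequent items and sorted by descending support.
    """
    # Pass 1: count how many transactions contain each item.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {item: c for item, c in counts.items() if c >= min_support}

    # Pass 2: drop infrequent items and order the rest by descending support.
    # Ties are broken by item name (an assumption) so the output is deterministic.
    ordered = [
        sorted((i for i in set(t) if i in frequent),
               key=lambda i: (-frequent[i], i))
        for t in transactions
    ]
    return frequent, ordered


if __name__ == "__main__":
    sample = [["a", "b", "c"], ["a", "b"], ["a", "d"], ["b", "c"]]
    items, ordered = frequent_items_and_ordered_transactions(sample, min_support=2)
    print(items)
    print(ordered)
```

Both passes here scan a list held in memory. When the data set is larger than main memory, each such scan, and the prefix tree later built from the ordered transactions, incurs disk I/O, which is the cost the abstract's I/O-conscious optimizations target.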



