| Efficient pattern mining on shared memory systems: implications for chip multiprocessor architectures |
| Full text |
Pdf
(233 KB)
|
| Source
|
Memory System Performance
archive
Proceedings of the 2006 workshop on Memory system performance and correctness
table of contents
San Jose, California
SESSION: Workload optimization
table of contents
Pages: 31 - 40
Year of Publication: 2006
ISBN:1-59593-578-9
|
|
Authors
|
|
Gregory Buehrer
|
The Ohio State University, Columbus, OH
|
|
Yen-Kuang Chen
|
Intel Corporation, Santa Clara, CA
|
|
Srinivasan Parthasarathy
|
The Ohio State University, Columbus, OH
|
|
Anthony Nguyen
|
Intel Corporation, Santa Clara, CA
|
|
Amol Ghoting
|
The Ohio State University, Columbus, OH
|
|
Daehyun Kim
|
Intel Corporation, Santa Clara, CA
|
|
| Sponsor |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 6, Downloads (12 Months): 60, Citation Count: 1
|
|
|
ABSTRACT
Frequent pattern mining is a fundamental data mining process which has practical applications ranging from market basket data analysis to web link analysis. In this work, we show that state-of-the-art frequent pattern mining applications are inefficient when executing on a shared memory multiprocessor system, due primarily to poor utilization of the memory hierarchy. To improve the efficiency of these applications, we explore memory performance improvements, task partitioning strategies, and task queuing models designed to maximize the scalability of pattern mining on SMP systems. Empirically, we show that the proposed strategies afford significantly improved performance. We also discuss implications of this work in light of recent trends in micro-architecture design, particularly chip multiprocessors (CMPs).
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Rakesh Agrawal , Tomasz Imieliński , Arun Swami, Mining association rules between sets of items in large databases, Proceedings of the 1993 ACM SIGMOD international conference on Management of data, p.207-216, May 25-28, 1993, Washington, D.C., United States
|
| |
2
|
|
| |
3
|
|
 |
4
|
Emery D. Berger , Kathryn S. McKinley , Robert D. Blumofe , Paul R. Wilson, Hoard: a scalable memory allocator for multithreaded applications, Proceedings of the ninth international conference on Architectural support for programming languages and operating systems, p.117-128, November 2000, Cambridge, Massachusetts, United States
|
| |
5
|
|
 |
6
|
Sergey Brin , Rajeev Motwani , Craig Silverstein, Beyond market baskets: generalizing association rules to correlations, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.265-276, May 11-15, 1997, Tucson, Arizona, United States
|
 |
7
|
Julia Chen , Philo Juang , Kevin Ko , Gilberto Contreras , David Penry , Ram Rangan , Adam Stoler , Li-Shiuan Peh , Margaret Martonosi, Hardware-modulated parallelism in chip multiprocessors, ACM SIGARCH Computer Architecture News, v.33 n.4, November 2005
[doi> 10.1145/1105734.1105742]
|
| |
8
|
Y. Chen, L. Yang, and Y. Wang. Incremental mining of frequent xml query patterns. In Proceedings of the International Conference on Data Mining (ICDM), 1999.
|
 |
9
|
Shengnan Cong , Jiawei Han , Jay Hoeflinger , David Padua, A sampling-based framework for parallel data mining, Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, June 15-17, 2005, Chicago, IL, USA
[doi> 10.1145/1065944.1065979]
|
| |
10
|
|
| |
11
|
L. Dehaspe, H. Toivonen, and R. D. King. Finding frequent substructures in chemical compounds. In Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining, pages 30--36. AAAI Press, 1998.
|
| |
12
|
Amol Ghoting , Gregory Buehrer , Srinivasan Parthasarathy , Daehyun Kim , Anthony Nguyen , Yen-Kuang Chen , Pradeep Dubey, Cache-conscious frequent pattern mining on a modern processor, Proceedings of the 31st international conference on Very large data bases, August 30-September 02, 2005, Trondheim, Norway
|
| |
13
|
A. Ghoting, G. Buehrer, S. Parthasarathy, D. Kim, A. Nguyen, Y. Chen, and P. Dubey. Cache-conscious frequent pattern mining on modern and emerging architectures. In OSU Technical Report, volume OSU-CISRC-3/06-TR31, 2006.
|
| |
14
|
B. Goethals and M. Zaki. Advances in frequent itemset mining implementations. In Proceedings of the ICDM workshop on frequent itemset mining implementations, 2003.
|
| |
15
|
|
| |
16
|
V. Guralnik and G. Karypis. Dynamic load balancing algorithms for sequence mining. In University of Minnesota Technical Report TR 00-056, 2001.
|
| |
17
|
|
 |
18
|
Jiawei Han , Jian Pei , Yiwen Yin, Mining frequent patterns without candidate generation, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.1-12, May 15-18, 2000, Dallas, Texas, United States
|
| |
19
|
|
| |
20
|
|
| |
21
|
B. McKay. Practical graph isomorphism. In Congressus Numerantium, volume 30, pages 45--87, 1981.
|
| |
22
|
T. Meinl, I. Fischer, and M. Philippsen. Parallel mining for frequent fragments on a shared-memory multiprocessor -results and java-obstacles-. In LWA 2005 - Beitrge zur GI-Workshopwoche Lernen, Wissensentdeckung, Adaptivitt, pages 196--201, Saarbrcken, Germany, 2005.
|
 |
23
|
|
 |
24
|
Jong Soo Park , Ming-Syan Chen , Philip S. Yu, An effective hash-based algorithm for mining association rules, Proceedings of the 1995 ACM SIGMOD international conference on Management of data, p.175-186, May 22-25, 1995, San Jose, California, United States
|
| |
25
|
|
| |
26
|
|
| |
27
|
|
| |
28
|
|
| |
29
|
|
 |
30
|
|
| |
31
|
|
| |
32
|
|
| |
33
|
M. Zaki and C. Hsiao. CHARM: An efficient algorithm for closed itemset mining. In Proceedings of SIAM International Conference on Data Mining (SDM), 2002.
|
| |
34
|
M. Zaki, S. Parthasarathy, M. Ogihara, and W. Li. New algorithms for fast discovery of association rules. pages 283--296, 1997.
|
| |
35
|
|
|