|
ABSTRACT
Various constrained frequent pattern mining problem formulations and associated algorithms have been developed that enable the user to specify various itemset-based constraints that better capture the underlying application requirements and characteristics. In this paper we introduce a new class of block constraints that determine the significance of an itemset pattern by considering the dense block that is formed by the pattern's items and its associated set of transactions. Block constraints provide a natural framework by which a number of important problems can be specified and make it possible to solve numerous problems on binary and real-valued datasets. However, developing computationally efficient algorithms to find these block constraints poses a number of challenges as unlike the different itemset-based constraints studied earlier, these block constraints are tough as they are neither anti-monotone, monotone, nor convertible. To overcome this problem, we introduce a new class of pruning methods that significantly reduce the overall search space and present a computationally efficient and scalable algorithm called CBMiner to find the closed itemsets that satisfy the block constraints.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
R. Agarwal, C. Aggarwal, V. Prasad, and V. Crestana. A tree projection algorithm for generation of large itemsets for association rules. IBM Research Report, RC21341, November 1998.
|
| |
2
|
|
 |
3
|
|
| |
4
|
|
 |
5
|
Sergey Brin , Rajeev Motwani , Jeffrey D. Ullman , Shalom Tsur, Dynamic itemset counting and implication rules for market basket data, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.255-264, May 11-15, 1997, Tucson, Arizona, United States
|
 |
6
|
Cristian Bucila , Johannes Gehrke , Daniel Kifer , Walker White, DualMiner: a dual-pruning algorithm for itemsets with constraints, Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, July 23-26, 2002, Edmonton, Alberta, Canada
[doi> 10.1145/775047.775054]
|
| |
7
|
|
| |
8
|
A. Grama, A. Gupta, G. Karypis, and V. Kumar. Introduction to Parallel Computing: Design and Analysis of Algorithms, 2nd Edition. Adison Wesley Publishing Company, 2003.
|
 |
9
|
Jiawei Han , Jian Pei , Yiwen Yin, Mining frequent patterns without candidate generation, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.1-12, May 15-18, 2000, Dallas, Texas, United States
|
| |
10
|
K. Gade, J. Wang, and G. Karypis. Effcient Closed Pattern Mining in the Presence of Tough Block Constraints. Technical Report TR #03--45, Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, 2003. Available on the WWW at: http://cs.umn.edu/~karypis/publications.
|
 |
11
|
|
 |
12
|
Daniel Kifer , Johannes Gehrke , Cristian Bucila , Walker White, How to quickly find a witness, Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.272-283, June 09-11, 2003, San Diego, California
[doi> 10.1145/773153.773180]
|
 |
13
|
|
 |
14
|
Bing Liu , Wynne Hsu , Yiming Ma, Mining association rules with multiple minimum supports, Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, p.337-341, August 15-18, 1999, San Diego, California, United States
[doi> 10.1145/312129.312274]
|
 |
15
|
Guimei Liu , Hongjun Lu , Wenwu Lou , Jeffrey Xu Yu, On computing, storing and querying frequent patterns, Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2003, Washington, D.C.
[doi> 10.1145/956750.956827]
|
 |
16
|
Raymond Ng , Laks V. S. Lakshmanan , Jiawei Han , Teresa Mah, Exploratory mining via constrained frequent set queries, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.556-558, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
|
| |
17
|
|
 |
18
|
|
| |
19
|
|
| |
20
|
Jian Pei , Jiawei Han , Hongjun Lu , Shojiro Nishio , Shiwei Tang , Dongqing Yang, H-Mine: Hyper-Structure Mining of Frequent Patterns in Large Databases, Proceedings of the 2001 IEEE International Conference on Data Mining, p.441-448, November 29-December 02, 2001
|
| |
21
|
J. Pei, J. Han, and R. Mao. Closet: An efficient algorithm for mining frequent closed itemsets. In Proc. of the DMKD'00, May 2000.
|
| |
22
|
|
| |
23
|
R. Srikant, Q. Vu, and R. Agrawal. Mining associations rules with item constraints. In Proc. of the ACM SIGKDD'97, August 1997.
|
 |
24
|
|
| |
25
|
|
| |
26
|
K. Wang, Y. Jiang, J. Xu Yu, G. Dong, and J. Han. Pusing aggregate constraints by divide-and-approximate. In Proc. of the ICDE'03, March 2003.
|
| |
27
|
J. Wang and G. Karypis. BAMBOO: Accelerating closed itemset mining by deeply pushing the length-decreasing support constraint. In Proc. of the SDM'04, April 2004.
|
 |
28
|
|
| |
29
|
M. Zaki and C. Hsiao. Charm: An efficient algorithm for closed itemset mining. In Proc. of the SDM'02, April 2002.
|
|