|
ABSTRACT
Constraint-based mining is an active field of research which is a necessary step to achieve interactive and successful KDD processes. The limitations of the task lies in languages being limited to describe the mined patterns and the ability to express varied constraints. In practice, current approaches focus on a language and the most generic frameworks mine individually or simultaneously a monotone and an anti-monotone constraints. In this paper, we propose a generic framework dealing with any partially ordered language and a large set of constraints. We prove that this set of constraints called primitive-based constraints not only is a superclass of both kinds of monotone ones and their boolean combinations but also other classes such as convertible and succinct constraints. We show that the primitive-based constraints can be efficiently mined thanks to a relaxation method based on virtual patterns which summarize the specificities of the search space. Indeed, this approach automatically deduces pruning conditions having suitable monotone properties and thus these conditions can be pushed into usual constraint mining algorithms. We study the optimal relaxations. Finally, we provide an experimental illustration of the efficiency of our proposal by experimenting it on several contexts.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Rakesh Agrawal , Tomasz Imieliński , Arun Swami, Mining association rules between sets of items in large databases, Proceedings of the 1993 ACM SIGMOD international conference on Management of data, p.207-216, May 25-28, 1993, Washington, D.C., United States
|
| |
2
|
|
| |
3
|
|
| |
4
|
C. Antunes and A.L. Oliveira. Constraint relaxations for discovering unknown sequential patterns, in: KDID 2004, Knowledge Discovery in Inductive Databases, Proceedings of the Third International Workshop on Knowledge Discovery inInductive Databases, Pisa, Italy, September 20, 2004, Revised Selected and Invited Papers, volume 3377 of Lecture Notes in Computer Science, B. Goethals and A. Siebes, eds, Springer, 2004, pp. 11-32.
|
| |
5
|
R.J. Bayardo, The hows, whys, and whens of constraints in itemset and rule discovery, in: Proc. of the Workshop on Inductive Databases and Constraint Based Mining (IDW'05), 2005.
|
| |
6
|
S. Bistarelli and F. Bonchi, Interestingness is not a dichotomy: Introducing softness in constrained pattern mining, in: PKDD, volume 3721 of Lecture Notes in Computer Science, A. Jorge, L. Torgo, P. Brazdil, R. Camacho and J. Gama, eds, Springer, 2005, pp. 22-33.
|
| |
7
|
F. Bonchi, F. Giannotti, A. Mazzanti and D. Pedreschi, Exante: Anticipated data reduction in constrained pattern mining, in: PKDD, volume 2838 of Lecture Notes in Computer Science, N. Lavrac, D. Gamberger, H. Blockeel and L. Todorovski, eds, Springer, 2003, pp. 59-70.
|
| |
8
|
|
| |
9
|
F. Bonchi and C. Lucchese, Pushing tougher constraints in frequent pattern mining, in: Advances in Knowledge Discovery and Data Mining, 9th Pacific-Asia Conference, PAKDD 2005, Hanoi, Vietnam, May 18-20, 2005, Proceedings, volume 3518 of Lecture Notes in Computer Science, T.B. Ho, D. Cheung and H. Liu, eds, Springer, 2005, pp. 114-124.
|
 |
10
|
Cristian Bucila , Johannes Gehrke , Daniel Kifer , Walker White, DualMiner: a dual-pruning algorithm for itemsets with constraints, Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, July 23-26, 2002, Edmonton, Alberta, Canada
[doi> 10.1145/775047.775054]
|
 |
11
|
|
| |
12
|
M. El-Hajj and O.R. Zaïane, Non-recursive generation of frequent k-itemsets from frequent pattern tree representations, in: Data Warehousing and Knowledge Discovery, 5th International Conference, DaWaK 2003, Prague, Czech Republic, September 3-5, 2003, Proceedings, volume 2737 of Lecture Notes in Computer Science, Y. Kambayashi, M.K. Mohania and W. Wöß, eds, Springer, 2003, pp. 371-380.
|
| |
13
|
|
| |
14
|
J. Fischer, Version Spaces in Constraint-based Data Mining, Master thesis, University of Freiburg, 2003.
|
 |
15
|
|
| |
16
|
|
| |
17
|
F. Geerts, B. Goethals and T. Mielikäinen, Tiling databases, in: Proceedings of the 7th International Conference on Discovery Science (DS'04), volume 3245, LNAI, 2004, pp. 278-289.
|
| |
18
|
A. Giacometti, D. Laurent and C.T. Diop, Condensed representations for sets of mining queries, in: Proceedings of KDID'02, 2002.
|
 |
19
|
Dimitrios Gunopulos , Heikki Mannila , Roni Khardon , Hannu Toivonen, Data mining, hypergraph transversals, and machine learning (extended abstract), Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.209-216, May 11-15, 1997, Tucson, Arizona, United States
[doi> 10.1145/263661.263684]
|
 |
20
|
Jiawei Han , Jian Pei , Yiwen Yin, Mining frequent patterns without candidate generation, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.1-12, May 15-18, 2000, Dallas, Texas, United States
|
 |
21
|
|
| |
22
|
B. Jeudy and F. Rioult, Database transposition for constrained closed pattern mining, in: Proceedings of Third International Workshop on Knowledge Discovery in Inductive Databases (KDID) co-located with ECML/PKDD, 2004.
|
| |
23
|
J. Klema, S. Blachon, A. Soulet, B. Crémilleux and O. Gandrillon, Constraintbased knowledge discovery from sage data, In Silico Biology 8 (2008), 157-175.
|
| |
24
|
D.E. Knuth, The Art of Computer Programming, Third volumes, AddisonWesley, 1968.
|
 |
25
|
|
| |
26
|
S.D. Lee and L.D. Raedt, An algebra for inductive query evaluation, in: Proceedings of the Second International Workshop on Inductive Databases (KDID'03), Rudjer Boskovic Institute, Zagreb, Croatia, September 2003, pp. 80-96.
|
| |
27
|
|
| |
28
|
|
| |
29
|
T.M. Mitchell, Generalization as search, Artif Intell 18(2) (1982), 203-226.
|
 |
30
|
|
 |
31
|
Raymond T. Ng , Laks V. S. Lakshmanan , Jiawei Han , Alex Pang, Exploratory mining and pruning optimizations of constrained associations rules, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.13-24, June 01-04, 1998, Seattle, Washington, United States
|
| |
32
|
|
| |
33
|
|
| |
34
|
J. Pei, J. Han and R. Mao, CLOSET: An efficient algorithm for mining frequent closed itemsets, in: ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, 2000, pp. 21-30.
|
| |
35
|
Jian Pei , Jiawei Han , Behzad Mortazavi-Asl , Helen Pinto , Qiming Chen , Umeshwar Dayal , Meichun Hsu, PrefixSpan: Mining Sequential Patterns by Prefix-Projected Growth, Proceedings of the 17th International Conference on Data Engineering, p.215-224, April 02-06, 2001
|
 |
36
|
|
| |
37
|
L.D. Raedt and S. Kramer, The levelwise version space algorithm and its application to molecular fragment finding, in: Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, IJCAI 2001, Seattle, Washington, USA, August 4-10, 2001, B. Nebel, ed., Morgan Kaufmann, 2001, pp. 853-862.
|
| |
38
|
A. Soulet and B. Crémilleux, An efficient framework for mining flexible constraints, in: Advances in Knowledge Discovery and Data Mining, 9th Paciffic-Asia Conference, PAKDD 2005, Hanoi, Vietnam, May 18-20, 2005, Proceedings, volume 3518 of Lecture Notes in Computer Science, T.B. Ho, D. Cheung and H. Liu, eds, Springer, 2005, pp. 661-671.
|
| |
39
|
A. Soulet and B. Crémilleux, Exploiting virtual patterns for automatically pruning the search space, in: Knowledge Discovery in Inductive Databases, 4th International Workshop, KDID 2005, Porto, Portugal, October 3, 2005, Revised Selected and Invited Papers, volume 3933 of Lecture Notes in Computer Science, F. Bonchi and J.-F. Boulicaut, eds, Springer, 2005, pp. 202-221.
|
| |
40
|
|
| |
41
|
R. Srikant, Q. Vu and R. Agrawal, Mining association rules with item constraints, in: KDD, 1997, pp. 67-73.
|
| |
42
|
|
| |
43
|
|
| |
44
|
|
| |
45
|
X. Yan, J. Han and R. Afshar, CloSpan: Mining closed sequential patterns in large databases, in: Proceedings of the Third SIAM International Conference on Data Mining, D. Barbará and C. Kamath, eds, San Francisco, CA, USA, SIAM, May 1-3, 2003.
|
| |
46
|
|
| |
47
|
M.J. Zaki, N. Parimi, N. De, F. Gao, B. Phoophakdee, J. Urban, V. Chaoji, M.A. Hasan and S. Salem, Towards generic pattern mining, in: Formal Concept Analysis, Third International Conference, ICFCA 2005, Lens, France, February 14-18, 2005, Proceedings, volume 3403 of Lecture Notes in Computer Science, B. Ganter and R. Godin, eds, Springer, 2005, pp. 1-20.
|
|