|
ABSTRACT
Currently, there is tremendous interest in providing ad-hoc mining capabilities in database management systems. As a first step towards this goal, in [15] we proposed an architecture for supporting constraint-based, human-centered, exploratory mining of various kinds of rules including associations, introduced the notion of constrained frequent set queries (CFQs), and developed effective pruning optimizations for CFQs with 1-variable (1-var) constraints.
While 1-var constraints are useful for constraining the antecedent and consequent separately, many natural examples of CFQs illustrate the need for constraining the antecedent and consequent jointly, for which 2-variable (2-var) constraints are indispensable. Developing pruning optimizations for CFQs with 2-var constraints is the subject of this paper. But this is a difficult problem because: (i) in 2-var constraints, both variables keep changing and, unlike 1-var constraints, there is no fixed target for pruning; (ii) as we show, “conventional” monotonicity-based optimization techniques do not apply effectively to 2-var constraints.
The contributions are as follows. (1) We introduce a notion of quasi-succinctness, which allows a quasi-succinct 2-var constraint to be reduced to two succinct 1-var constraints for pruning. (2) We characterize the class of 2-var constraints that are quasi-succinct. (3) We develop heuristic techniques for non-quasi-succinct constraints. Experimental results show the effectiveness of all our techniques. (4) We propose a query optimizer for CFQs and show that for a large class of constraints, the computation strategy generated by the optimizer is ccc-optimal, i.e., minimizing the effort incurred w.r.t. constraint checking and support counting.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Rakesh Agrawal , Tomasz Imieliński , Arun Swami, Mining association rules between sets of items in large databases, Proceedings of the 1993 ACM SIGMOD international conference on Management of data, p.207-216, May 25-28, 1993, Washington, D.C., United States
|
| |
2
|
|
 |
3
|
|
 |
4
|
Sergey Brin , Rajeev Motwani , Craig Silverstein, Beyond market baskets: generalizing association rules to correlations, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.265-276, May 11-15, 1997, Tucson, Arizona, United States
|
| |
5
|
S. Chaudhuri. Data mining and database systems: Where is the intersection? Bulletin of the Technical Committee on Data Engineering, 21:4-8, March 1998.
|
| |
6
|
|
 |
7
|
Takeshi Fukuda , Yasukiko Morimoto , Shinichi Morishita , Takeshi Tokuyama, Data mining using two-dimensional optimized association rules: scheme, algorithms, and visualization, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.13-23, June 04-06, 1996, Montreal, Quebec, Canada
|
| |
8
|
|
 |
9
|
|
| |
10
|
M. Kamber, J. Han, and J. Y. Chiang. Metarule-guided mining of multi-dimensional association rules using data cubes. In Proc. 3rd KDD'97, pp 207-210.
|
 |
11
|
Mika Klemettinen , Heikki Mannila , Pirjo Ronkainen , Hannu Toivonen , A. Inkeri Verkamo, Finding interesting rules from large sets of discovered association rules, Proceedings of the third international conference on Information and knowledge management, p.401-407, November 29-December 02, 1994, Gaithersburg, Maryland, United States
[doi> 10.1145/191246.191314]
|
| |
12
|
|
| |
13
|
L. V. S. Lakshmanan, R. Ng, J. Han, and A. Pang. Optimization of constrained frequent set queries: 2-var constraints. Technical Report, Department of Computer Science, University of British Columbia, 1998.
|
 |
14
|
|
 |
15
|
Raymond T. Ng , Laks V. S. Lakshmanan , Jiawei Han , Alex Pang, Exploratory mining and pruning optimizations of constrained associations rules, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.13-24, June 01-04, 1998, Seattle, Washington, United States
|
 |
16
|
Jong Soo Park , Ming-Syan Chen , Philip S. Yu, An effective hash-based algorithm for mining association rules, Proceedings of the 1995 ACM SIGMOD international conference on Management of data, p.175-186, May 22-25, 1995, San Jose, California, United States
|
| |
17
|
|
 |
18
|
Sunita Sarawagi , Shiby Thomas , Rakesh Agrawal, Integrating association rule mining with relational database systems: alternatives and implications, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.343-354, June 01-04, 1998, Seattle, Washington, United States
|
 |
19
|
|
| |
20
|
|
| |
21
|
|
 |
22
|
|
| |
23
|
R. Srikant, Q. Vu, and R. Agrawal. Mining association rules with item constraints. In Proc. KDD'97, pp 67-73.
|
| |
24
|
|
 |
25
|
Dick Tsur , Jeffrey D. Ullman , Serge Abiteboul , Chris Clifton , Rajeev Motwani , Svetlozar Nestorov , Arnon Rosenthal, Query flocks: a generalization of association-rule mining, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.1-12, June 01-04, 1998, Seattle, Washington, United States
|
CITED BY 48
|
|
Anthony K.H. Tung , Hongjun Lu , Jiawei Han , Ling Feng, Breaking the barrier of transactions: mining inter-transaction association rules, Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, p.297-301, August 15-18, 1999, San Diego, California, United States
|
|
|
|
|
|
Ramesh C. Agarwal , Charu C. Aggarwal , V. V. V. Prasad, Depth first generation of long patterns, Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, p.108-118, August 20-23, 2000, Boston, Massachusetts, United States
|
|
|
|
|
|
|
|
|
Xiuzhen Zhang , Guozu Dong , Ramamohanarao Kotagiri, Exploring constraints to efficiently mine emerging patterns from large high-dimensional datasets, Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, p.310-314, August 20-23, 2000, Boston, Massachusetts, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Daniel Kifer , Johannes Gehrke , Cristian Bucila , Walker White, How to quickly find a witness, Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.272-283, June 09-11, 2003, San Diego, California
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Cheng-Ru Lin , Chang-Hung Lee , Ming-Syan Chen , Philip S. Yu, Distributed data mining in a chain store database of short transactions, Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, July 23-26, 2002, Edmonton, Alberta, Canada
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Francesco Bonchi , Fosca Giannotti , Claudio Lucchese , Salvatore Orlando , Raffaele Perego , Roberto Trasarti, A constraint-based querying system for exploratory pattern discovery, Information Systems, v.34 n.1, p.3-27, March, 2009
|
|
|
María N. Moreno , Saddys Segrera , Vivian F. López , M. José Polo, A method for mining quantitative association rules, Proceedings of the 6th WSEAS International Conference on Simulation, Modelling and Optimization, p.173-178, September 22-24, 2006, Lisbon, Portugal
|
|
|
|
|
|
|
|
|
|
|