Cost-based query optimization for complex pattern mining on multiple databases
Source: ACM International Conference Proceeding Series; Vol. 261
Proceedings of the 11th international conference on Extending database technology: Advances in database technology
Nantes, France
SESSION: Research sessions: Data mining
Pages 380-391  
Year of Publication: 2008
ISBN: 978-1-59593-926-5
Authors
Ruoming Jin, Kent State University, Kent, OH
Dave Fuhry, Kent State University, Kent, OH
Abdulkareem Alali, Kent State University, Kent, OH
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1353343.1353391

ABSTRACT

Complex data mining queries raise query optimization issues similar to those of traditional database queries. However, few works have applied cost-based query optimization, the key technique for optimizing traditional database queries, to complex mining queries. In this work, we develop a cost-based query optimization framework for an important class of data mining queries, namely frequent pattern mining across multiple databases. Specifically, we make the following contributions: 1) We present a rich class of queries for mining frequent itemsets across multiple datasets, supported by a SQL-based mechanism. 2) We present an approach to enumerate all possible query plans for these mining queries, and, building on the enumeration algorithm, develop a dynamic programming approach and a branch-and-bound approach that find the optimal query plan, i.e. the plan with the least mining cost. 3) We introduce models to estimate the cost of individual mining operators. 4) We evaluate our query optimization techniques on both real and synthetic datasets and show significant performance improvements.
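
To make the kind of query discussed in the abstract concrete, the following is a minimal Python sketch of one multi-dataset query: "itemsets frequent in both A and B but not frequent in C". It is not the paper's SQL mechanism or its optimizer. It evaluates the query with the most naive plan, mining every dataset independently by brute force and then combining the results with set operations, which is the kind of plan a cost-based optimizer would try to improve on. The function names, the toy datasets A, B, C, and the support thresholds are all hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force miner: return {itemset: relative support} for all
    itemsets whose support is at least min_support.
    Only suitable for tiny illustrative inputs."""
    n = len(transactions)
    counts = {}
    for t in transactions:
        items = sorted(set(t))
        # Count every non-empty subset of the transaction once.
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


# Hypothetical toy datasets (each inner list is one transaction).
A = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]]
B = [["a", "b"], ["a", "b", "d"], ["b", "d"], ["a", "d"]]
C = [["a"], ["a", "c"], ["c"], ["b", "c"]]


def frequent_in_a_and_b_not_c(min_ab=0.5, min_c=0.5):
    """Naive plan: mine each dataset in full, then combine the results."""
    fa = frequent_itemsets(A, min_ab)
    fb = frequent_itemsets(B, min_ab)
    fc = frequent_itemsets(C, min_c)
    return (set(fa) & set(fb)) - set(fc)


result = frequent_in_a_and_b_not_c()
```

With these toy inputs, `result` contains the itemsets {b} and {a, b}. The naive plan mines C in full even though only three candidate itemsets survive the intersection of A and B; choosing a cheaper order of operations is the kind of decision the abstract attributes to cost-based plan selection.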

