ACM Home Page
Please provide us with feedback. Feedback
Tight upper bounds on the number of candidate patterns
Full text PdfPdf (458 KB)
Source ACM Transactions on Database Systems (TODS) archive
Volume 30 ,  Issue 2  (June 2005) table of contents
Pages: 333 - 363  
Year of Publication: 2005
ISSN:0362-5915
Authors
Floris Geerts  University of Edinburgh, Scotland, UK
Bart Goethals  University of Helsinki
Jan Van Den Bussche  Limburgs Universitair Centrum, Diepenbeek, Belgium
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 5,   Downloads (12 Months): 66,   Citation Count: 6
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1071610.1071611
What is a DOI?

ABSTRACT

In the context of mining for frequent patterns using the standard levelwise algorithm, the following question arises: given the current level and the current set of frequent patterns, what is the maximal number of candidate patterns that can be generated on the next level? We answer this question by providing tight upper bounds, derived from a combinatorial result from the sixties by Kruskal and Katona. Our result is useful to secure existing algorithms from a combinatorial explosion of the number of candidate patterns.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
3
 
4
 
5
 
6
Agrawal, R. and Srikant, R. 1994b. Fast algorithms for mining association rules. IBM Research Report RJ9839, IBM Alamaden Research Center, San Jose, California. June.
 
7
Agrawal, R. and Srikant, R. 1994c. Quest Synthetic Data Generator. IBM Alamaden Research Center, http://www.almaden.ibm.com/software/quest/Resources/index.shtml.
8
 
9
Blake, C. and Merz, C. 1998. UCI Repository of machine learning databases. University of California, Irvine, Dept. of Information and Computer Sciences, http://www.ics.uci.edu/~mlearn/MLRepository.html.
 
10
Bollobás, B. 1986. Combinatorics. Cambridge University Press.
 
11
12
 
13
 
14
Frankl, P. 1984. A new short proof for the Kruskal--Katona theorem. Discrete Mathematics 48, 327--329.
 
15
 
16
Goethals, B. and Zaki, M. J., Eds. 2003. Proceedings of the Workshop on Frequent Itemset Mining Implementations (FIMI-03), Melbourne Florida, USA, November 19, 2003. CEUR Workshop Proceedings, vol. 90. http://CEUR-WS.org/Vol-90/.
17
 
18
Katona, G. 1968. A theorem of finite sets. In Theory Of Graphs. Akadémia Kiadó, 187--207.
19
 
20
Kruskal, J. 1963. The number of simplices in a complex. In Mathematical Optimization Techniques. Univ. of California Press, 251--278.
 
21
22
 
23
24
 
25
 
26
Pei, J., Han, J., and Mao, R. 2000. Closet: An efficient algorithm for mining frequent closed itemsets. ACM SIGMOD'00 Workshop on Research Issues in Data Mining and Knowledge Discovery.
 
27
 
28
 
29
Zaki, M. and Hsiao, C.-J. 2002. CHARM: An efficient algorithm for closed itemset mining. In Proceedings of the Second SIAM International Conference on Data Mining, R. Grossman, J. Han, V. Kumar, H. Mannila, and R. Motwani, Eds. SIAM.
 
30
Zaki, M., Parthasarathy, S., Ogihara, M., and Li, W. 1997. New algorithms for fast discovery of association rules. In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining, D. Heckerman, H. Mannila, and D. Pregibon, Eds. AAAI Press, 283--296.
31


Collaborative Colleagues:
Floris Geerts: colleagues
Bart Goethals: colleagues
Jan Van Den Bussche: colleagues