Summarizing itemset patterns using probabilistic models
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Philadelphia, PA, USA
POSTER SESSION: Research track posters
Pages: 730 - 735
Year of Publication: 2006
ISBN: 1-59593-339-5
Bibliometrics
Downloads (6 Weeks): 18, Downloads (12 Months): 115, Citation Count: 3
ABSTRACT
In this paper, we propose a novel probabilistic approach to summarizing frequent itemset patterns. Such techniques are useful for summarization, post-processing, and end-user interpretation, particularly for problems where the resulting set of patterns is huge. In our approach, items in the dataset are modeled as random variables. We then construct a Markov Random Field (MRF) on these variables based on frequent itemsets and their occurrence statistics. The summarization proceeds in a level-wise iterative fashion. Occurrence statistics of itemsets at the lowest level are used to construct an initial MRF. Statistics of itemsets at the next level can then be inferred from the model. We use those patterns whose occurrence cannot be accurately inferred from the model to augment the model in an iterative manner, repeating the procedure until all frequent itemsets can be modeled. The resulting MRF model affords a concise and useful representation of the original collection of itemsets. An extensive empirical study on real datasets shows that the new approach can effectively summarize a large number of itemsets and typically significantly outperforms extant approaches.
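The level-wise loop described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: the model is stored as a brute-force joint table over all 0/1 item assignments (feasible only for a handful of items), it is fitted with iterative proportional fitting, candidates are enumerated exhaustively instead of with a frequent-itemset miner, and the names `fit_joint`, `estimate`, `summarize`, `min_support`, and `tolerance` (an absolute-error threshold) are hypothetical. The abstract does not specify how the MRF is learned, how inference is done, or how "accurately inferred" is measured.

```python
from itertools import combinations, product


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def fit_joint(items, constraints, sweeps=200):
    """Fit a joint table over all 0/1 item assignments so that, for each
    (itemset, freq) constraint, P(all items of itemset present) == freq.
    Assumption: plain iterative proportional fitting from a uniform start."""
    index = {item: i for i, item in enumerate(items)}
    states = list(product((0, 1), repeat=len(items)))
    p = {s: 1.0 / len(states) for s in states}
    for _ in range(sweeps):
        for itemset, target in constraints.items():
            cols = [index[i] for i in itemset]
            mass = sum(p[s] for s in states if all(s[c] for c in cols))
            if not 0.0 < mass < 1.0:
                continue  # nothing to rescale on one side of the split
            for s in states:
                if all(s[c] for c in cols):
                    p[s] *= target / mass
                else:
                    p[s] *= (1.0 - target) / (1.0 - mass)
    return p, index


def estimate(itemset, p, index):
    """Frequency of `itemset` inferred from the fitted model."""
    cols = [index[i] for i in itemset]
    return sum(prob for s, prob in p.items() if all(s[c] for c in cols))


def summarize(transactions, min_support=0.15, tolerance=0.05):
    """Level-wise summarization: keep only the frequent itemsets whose
    support the current model fails to infer within `tolerance`."""
    all_items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in all_items
             if support(frozenset([i]), transactions) >= min_support]
    # Lowest level: frequent single items define the initial model.
    constraints = {s: support(s, transactions) for s in level}
    model_items = sorted(i for s in level for i in s)
    p, index = fit_joint(model_items, constraints)
    k = 2
    while True:
        # Assumption: exhaustive candidate enumeration, fine for toy data.
        level = [frozenset(c) for c in combinations(model_items, k)
                 if support(frozenset(c), transactions) >= min_support]
        if not level:
            break
        poorly_inferred = {
            s: support(s, transactions) for s in level
            if abs(estimate(s, p, index) - support(s, transactions)) > tolerance
        }
        if poorly_inferred:
            # Augment the model with the patterns it could not explain.
            constraints.update(poorly_inferred)
            p, index = fit_joint(model_items, constraints)
        k += 1
    return constraints, p, index


# Toy data (made up): a and b co-occur often, c is independent of both.
transactions = (
    [frozenset("abc")] * 4 + [frozenset("ab")] * 4
    + [frozenset("ac"), frozenset("a"), frozenset("bc"), frozenset("b")]
    + [frozenset("c")] * 4 + [frozenset()] * 4
)
summary, p, index = summarize(transactions)
print(sorted("".join(sorted(s)) for s in summary))
```

On this toy data all seven non-empty itemsets are frequent, but only the three single items and the pair {a, b} need to be stored: the remaining supports are recovered from the fitted model within the tolerance. The joint table grows as 2^n in the number of items, so this form is useful only for checking the idea on small inputs.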
CITED BY 3
Chen Chen, Cindy Xide Lin, Xifeng Yan, Jiawei Han, On effective presentation of graph patterns: a structural representative approach, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
Ruoming Jin, Muad Abu-Ata, Yang Xiang, Ning Ruan, Effective and efficient itemset pattern summarization: regression-based approaches, Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA