ACM Home Page
Please provide us with feedback. Feedback
Probabilistic frequent itemset mining in uncertain databases
Full text MovMov (22:38),  PdfPdf (888 KB)
Source
International Conference on Knowledge Discovery and Data Mining archive
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining table of contents
Paris, France
SESSION: Research track papers table of contents
Pages 119-128  
Year of Publication: 2009
ISBN:978-1-60558-495-9
Authors
Thomas Bernecker  Ludwig-Maximilians-Universität, Munich, Germany
Hans-Peter Kriegel  Ludwig-Maximilians-Universität, Munich, Germany
Matthias Renz  Ludwig-Maximilians-Universität, Munich, Germany
Florian Verhein  Ludwig-Maximilians-Universität, Munich, Germany
Andreas Zuefle  Ludwig-Maximilians-Universität, Munich, Germany
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 75,   Downloads (12 Months): 245,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1557019.1557039
What is a DOI?

ABSTRACT

Probabilistic frequent itemset mining in uncertain transaction databases semantically and computationally differs from traditional techniques applied to standard "certain" transaction databases. The consideration of existential uncertainty of item(sets), indicating the probability that an item(set) occurs in a transaction, makes traditional techniques inapplicable. In this paper, we introduce new probabilistic formulations of frequent itemsets based on possible world semantics. In this probabilistic context, an itemset X is called frequent if the probability that X occurs in at least minSup transactions is above a given threshold τ. To the best of our knowledge, this is the first approach addressing this problem under possible worlds semantics. In consideration of the probabilistic formulations, we present a framework which is able to solve the Probabilistic Frequent Itemset Mining (PFIM) problem efficiently. An extensive experimental evaluation investigates the impact of our proposed techniques and shows that our approach is orders of magnitude faster than straight-forward approaches.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In Proc. ACM SIGMOD Int. Conf. on Management of Data (SIGMOD'94), Minneapolis, MN, 1994.
 
3
 
4
 
5
Chun Kit Chui and Ben Kao. A decremental approach for mining frequent itemsets from uncertain data. In The 12th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), pages 64--75, 2008.
 
6
Chun Kit Chui, Ben Kao, and Edward Hung. Mining frequent itemsets from uncertain data. In 11th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2007, Nanjing, China, pages 47--58, 2007.
 
7
 
8
Karolien Geurts, Geert Wets, Tom Brijs, and Koen Vanhoof. Profiling high frequency accident locations using association rules. In Proceedings of the 82nd Annual Transportation Research Board, Washington DC. (USA), January 12-16, page 18pp, 2003.
9
 
10
H.-P. Kriegel, P. Kunath, M. Pfeifle, and M. Renz. "Probabilistic Similarity Join on Uncertain Data". In Proc. 11th Int. Conf. on Database Systems for Advanced Applications, Singapore, pp. 295--309, 2006.
 
11
 
12
C. Re, N. Dalvi, and D. Suciu. "Efficient top-k query evaluation on probalistic databases". In Proc. 23rd Int. Conf. on Data Engineering, Istanbul, Turkey, 2007.
 
13
P. Sen and A. Deshpande. "Representing and querying correlated tuples in probabilistic databases". In Proc. 23rd Int. Conf. on Data Engineering, Istanbul, Turkey, 2007.
 
14
M.A. Soliman, I.F. Ilyas, and K. Chen-Chuan Chang. "Top-k Query Processing in Uncertain Databases". In Proc. 23rd Int. Conf. on Data Engineering, Istanbul, Turkey, pages 896--905, 2007.
 
15
16
 
17
18

Collaborative Colleagues:
Thomas Bernecker: colleagues
Hans-Peter Kriegel: colleagues
Matthias Renz: colleagues
Florian Verhein: colleagues
Andreas Zuefle: colleagues