Fast discovery of unexpected patterns in data, relative to a Bayesian network
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Chicago, Illinois, USA
SESSION: Research track paper
Pages: 118 - 127  
Year of Publication: 2005
ISBN:1-59593-135-X
Authors
Szymon Jaroszewicz  Technical University of Szczecin, Szczecin, Poland
Tobias Scheffer  Humboldt-Universität zu Berlin, Berlin, Germany
Sponsors
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1081870.1081887

ABSTRACT

We consider a model in which background knowledge on a given domain of interest is available in terms of a Bayesian network, in addition to a large database. The mining problem is to discover unexpected patterns: our goal is to find the strongest discrepancies between network and database. This problem is intrinsically difficult because it requires inference in a Bayesian network and processing the entire, potentially very large, database. A sampling-based method that we introduce is efficient and yet provably finds the approximately most interesting unexpected patterns. We give a rigorous proof of the method's correctness. Experiments shed light on its efficiency and practicality for large-scale Bayesian networks and databases.
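
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a toy two-attribute network A -> B with made-up parameters, a synthetic database in which A and B are independent, and a discrepancy score equal to the absolute difference between a pattern's estimated frequency under the network and in the data. The function names and the use of a Hoeffding bound to choose the sample size are illustrative assumptions.

```python
import itertools
import math
import random


def hoeffding_sample_size(epsilon, delta):
    """Samples needed so that one estimated frequency lies within epsilon of
    its true value with probability at least 1 - delta (two-sided Hoeffding)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def sample_network(rng):
    """Draw one record (a, b) from a toy network A -> B.
    The parameters are hypothetical, chosen only for illustration."""
    a = 1 if rng.random() < 0.5 else 0
    p_b = 0.8 if a == 1 else 0.2
    b = 1 if rng.random() < p_b else 0
    return (a, b)


def all_patterns(num_attrs):
    """Enumerate every conjunction of attribute=value tests over each
    non-empty subset of binary attributes."""
    for size in range(1, num_attrs + 1):
        for attrs in itertools.combinations(range(num_attrs), size):
            for values in itertools.product((0, 1), repeat=size):
                yield tuple(zip(attrs, values))


def frequency(records, pattern):
    """Fraction of records that satisfy every (attribute, value) test."""
    hits = sum(1 for r in records if all(r[i] == v for i, v in pattern))
    return hits / len(records)


def rank_discrepancies(network_records, data_records, num_attrs):
    """Score each pattern by |network frequency - data frequency| and
    return (score, pattern) pairs, largest discrepancy first."""
    scored = [
        (abs(frequency(network_records, p) - frequency(data_records, p)), p)
        for p in all_patterns(num_attrs)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


rng = random.Random(0)
n = hoeffding_sample_size(epsilon=0.02, delta=0.05)

# Samples standing in for inference in the network.
network_records = [sample_network(rng) for _ in range(n)]
# Synthetic "database" sample in which A and B are independent,
# contradicting the dependency encoded in the toy network.
data_records = [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(n)]

ranking = rank_discrepancies(network_records, data_records, num_attrs=2)
for score, pattern in ranking:
    print(f"{score:.3f}  {pattern}")
```

In this toy setup the marginals of A and B agree between network and data, so single-attribute patterns score near zero, while the two-attribute patterns expose the dependency the data lacks and rise to the top of the ranking.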


