Theory of dependence values
Source: ACM Transactions on Database Systems (TODS)
Volume 25, Issue 3 (September 2000)
Pages: 380-406
Year of Publication: 2000
ISSN: 0362-5915
Author
Rosa Meo, Univ. degli Studi di Torino, Torino, Italy
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/363951.363956

ABSTRACT

A new model to evaluate dependencies in data mining problems is presented and discussed. The well-known concept of the association rule is replaced by the new definition of dependence value, which is a single real number uniquely associated with a given itemset. Knowledge of dependence values is sufficient to describe all the dependencies characterizing a given data mining problem. The dependence value of an itemset is the difference between the occurrence probability of the itemset and a corresponding “maximum independence estimate.” This estimate can be determined as a function of the joint probabilities of the subsets of the itemset being considered, by maximizing a suitable entropy function. It is therefore possible to separate, in an itemset of cardinality k, the dependence inherited from its subsets of cardinality (k − 1) from the specific inherent dependence of that itemset. The absolute value of the difference between the probability p(i) of the event i that indicates the presence of the itemset {a,b,...} and its maximum independence estimate is constant for any combination of values of ⟨a,b,...⟩. In addition, the Boolean function specifying the combinations of values for which the dependence is positive is a parity function, so the determination of such combinations is immediate. The model appears to be simple and powerful.
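
The simplest case, an itemset of cardinality 2, can be illustrated with a short sketch. It assumes that for a pair {a,b} the maximum independence estimate is the product p(a)p(b), which is the maximum-entropy joint distribution when only the single-item probabilities are fixed. The helper name `pair_dependence` and the toy transactions are hypothetical and do not come from the article, and the sketch does not implement the general entropy maximization for larger k.

```python
from itertools import product


def pair_dependence(transactions, a, b):
    """Return p(va, vb) - p(va) * p(vb) for the four presence/absence combinations.

    Assumption: for a 2-itemset the maximum independence estimate is the
    product of the single-item probabilities (hypothetical helper, k = 2 only).
    """
    n = len(transactions)
    # Empirical joint probability of each (a present?, b present?) combination.
    joint = {combo: 0.0 for combo in product((True, False), repeat=2)}
    for t in transactions:
        joint[(a in t, b in t)] += 1.0 / n

    # Single-item probabilities obtained by marginalizing the joint table.
    p_a = joint[(True, True)] + joint[(True, False)]
    p_b = joint[(True, True)] + joint[(False, True)]

    deltas = {}
    for va, vb in joint:
        marg_a = p_a if va else 1.0 - p_a
        marg_b = p_b if vb else 1.0 - p_b
        deltas[(va, vb)] = joint[(va, vb)] - marg_a * marg_b
    return deltas


# Toy data (made up for illustration): each transaction is a set of items.
transactions = [
    {"a", "b"}, {"a", "b"}, {"a", "b"},
    {"a"}, {"b"},
    set(), set(), {"c"},
]

deltas = pair_dependence(transactions, "a", "b")
for combo, delta in sorted(deltas.items(), reverse=True):
    print(combo, delta)
```

In this toy dataset the four differences share one absolute value, and the sign is positive exactly when a and b are both present or both absent, which matches the parity behaviour the abstract describes.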

