ABSTRACT
In many applications, association rules are interesting only if they represent non-trivial correlations between all constituent items. Numerous techniques have been developed that seek to avoid false discoveries. However, while each provides a useful solution to some aspect of this problem, none provides a generic solution that is both flexible enough to accommodate varying definitions of true and false discoveries and powerful enough to provide strict control over the risk of false discoveries. This paper presents generic techniques that allow definitions of true and false discoveries to be specified in terms of arbitrary statistical hypothesis tests and that provide strict control over the experimentwise risk of false discoveries.