Estimating replicability of classifier learning experiments
Source: ACM International Conference Proceeding Series; Vol. 69
Proceedings of the Twenty-First International Conference on Machine Learning
Banff, Alberta, Canada
Page: 15  
Year of Publication: 2004
ISBN: 1-58113-828-5
Author
Remco R. Bouckaert, University of Waikato, Hamilton, New Zealand
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1015330.1015338

ABSTRACT

Replicability of machine learning experiments measures how likely it is that the outcome of one experiment is repeated when the experiment is performed with a different randomization of the data. In this paper, we present an efficient estimator of the replicability of an experiment. More precisely, the estimator is unbiased and has the lowest variance in the class of estimators formed by a linear combination of outcomes of experiments on a given data set. We gathered empirical data for comparing experiments consisting of different sampling schemes and hypothesis tests. Both factors are shown to have an impact on the replicability of experiments. The data suggest that sign tests should not be used due to low replicability. Ranked sum tests show better performance, but the combination of a sorted runs sampling scheme with a t-test gives the most desirable performance judged on Type I and II error and replicability.
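
To make the notion of replicability concrete, the following is a minimal sketch and not the estimator defined in the paper. It assumes each repetition of an experiment, run with a different randomization of the data, is recorded as a boolean outcome (for example, whether a hypothesis test rejected the null hypothesis), and it reports the fraction of pairs of runs that agree. The function name `pairwise_agreement` and the sample outcomes are hypothetical.

```python
from itertools import combinations


def pairwise_agreement(outcomes):
    """Return the fraction of unordered pairs of runs with equal outcomes.

    `outcomes` holds one value per run (assumed here to be a boolean,
    e.g. True if the hypothesis test rejected the null hypothesis).
    """
    outcomes = list(outcomes)
    if len(outcomes) < 2:
        raise ValueError("need at least two runs to compare outcomes")
    pairs = list(combinations(outcomes, 2))
    agreeing = sum(1 for a, b in pairs if a == b)
    return agreeing / len(pairs)


# Hypothetical outcomes of 10 runs with different randomizations.
runs = [True, True, False, True, True, True, False, True, True, True]
print(pairwise_agreement(runs))
```

A value of 1.0 means every run produced the same outcome; lower values mean the conclusion depends more strongly on the randomization of the data.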

