|
ABSTRACT
Learnability in Valiant's PAC learning model has been shown to be strongly related to the existence of uniform laws of large numbers. These laws define a distribution-free convergence property of means to expectations uniformly over classes of random variables. Classes of real-valued functions enjoying such a property are also known as uniform Glivenko-Cantelli classes. In this paper, we prove, through a generalization of Sauer's lemma that may be interesting in its own right, a new characterization of uniform Glivenko-Cantelli classes. Our characterization yields Dudley, Gine´, and Zinn's previous characterization as a corollary. Furthermore, it is the first based on a Gine´, and Zinn's previous characterization as a corollary. Furthermore, it is the first based on a simple combinatorial quantity generalizing the Vapnik-Chervonenkis dimension. We apply this result to obtain the weakest combinatorial condition known to imply PAC learnability in the statistical regression (or “agnostic”) framework. Furthermore, we find a characterization of learnability in the probabilistic concept model, solving an open problem posed by Kearns and Schapire. These results show that the accuracy parameter plays a crucial role in determining the effective complexity of the learner's hypothesis class.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
ALON, N., AND MILMAN, V. 1983. Embedding of 1k in finite dimension Banach spaces. Israel J. Math. 45, 265-280.
|
| |
2
|
ASSOUAD, P., AND DUDLEY, R. 1989. Minimax nonparametric estimation over classes of sets. Preprint.
|
 |
3
|
|
| |
4
|
|
| |
5
|
Shai Ben-David , Nicolò Cesa-Bianchi , David Haussler , Philip M. Long, Characterizations of learnability for classes of {0, …, n}-valued functions, Journal of Computer and System Sciences, v.50 n.1, p.74-86, Feb. 1995
[doi> 10.1006/jcss.1995.1008]
|
 |
6
|
|
| |
7
|
|
| |
8
|
DUDLEY, R. 1984. A course on empirical processes. In Lecture Notes in Mathematics, vol. 1097. Springer-Verlag, New York, pp. 2-142.
|
| |
9
|
DUDLEY, R., GIN#, E., AND ZINN, J. 1991. Uniform and universal Glivenko-Cantelli classes. J. Theoret. Prob. 4, 485-510.
|
| |
10
|
GINI#, E., AND ZINN, J. 1984. Some limit theorems for empirical processes. Ann. Prob. 12, 929-989.
|
| |
11
|
GUYON, I., VAPNIK, g., BOSER, B., BOTTOU, L., AND SOLLA, S. 1991. Structural risk minimization for character recognition. In Proceedings of the 1991 Conference on Advances in Neural Information Processing Systems. pp. 471-479.
|
| |
12
|
|
| |
13
|
|
| |
14
|
|
| |
15
|
MILMAN, g. 1982. Some remarks about embedding of 1k in finite dimensional spaces. Israel J. Math. 43, 129-138.
|
| |
16
|
POLLARD, D. 1990. Empirical Processes: Theory and Applications, Volume 2 of NSF-CBMS Regional Conference Series in Probability and Statistics. Institute of Mathematical Statistics and American Statistical Association.
|
| |
17
|
RISSANEN, J. 1978. Modeling by shortest data description. Automatica 14, 465-471.
|
| |
18
|
SAUER, N. 1972. On the density of families of sets. J. Combin. Theory, Ser. A 13, 145-147.
|
| |
19
|
SHELAH, S. 1972. A combinatorial problem: Stability and order for models and theories in infinitary languages. Pac. J. Math. 41, 247-261.
|
| |
20
|
|
| |
21
|
VAN LINT, J. 1985. {0, 1, *} distance problems in combinatorics. In Lecture Notes of London Mathematical Society, vol. 103, Cambridge University Press, Cambridge, England, pp. 113-135.
|
| |
22
|
|
| |
23
|
|
| |
24
|
VAPNIK, g., AND CHERVONENKIS, A. 1971. On the uniform convergence of relative frequencies of events to their probabilities. Theory of Prob. Its Applic. 16, 2, 264-280.
|
| |
25
|
VAPNIK, g., AND CHERVONENKIS, A. 1981. Necessary and sufficient conditions for uniform convergence of means to mathematical expectations. Theory Prob. Applic. 26, 3, 532-553.
|
CITED BY 29
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
John Shawe-Taylor , Peter L. Bartlett , Robert C. Williamson , Martin Anthony, A framework for structural risk minimisation, Proceedings of the ninth annual conference on Computational learning theory, p.68-76, June 28-July 01, 1996, Desenzano del Garda, Italy
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
REVIEW
"Richard L. Frautschi : Reviewer"
Inspired by Valiant's PAC learning model, the authors, using
discretization techniques, seek a convergence of distribution-free
expectations over classes of random variables. Real-valued functions
(such as Glivenko-Cantelli classes) thus acqui
more...
|