| Fat-shattering and the learnability of real-valued functions |
| Full text |
Pdf
(1.15 MB)
|
| Source
|
Annual Workshop on Computational Learning Theory
archive
Proceedings of the seventh annual conference on Computational learning theory
table of contents
New Brunswick, New Jersey, United States
Pages: 299 - 310
Year of Publication: 1994
ISBN:0-89791-655-7
|
|
Authors
|
|
Peter L. Bartlett
|
Department of Systems Engineering, RSISE, Australian National University, Canberra, 0200 Australia
|
|
Philip M. Long
|
Computer Science Department, Duke University, P.O. Box 90129, Durham, NC
|
|
Robert C. Williamson
|
Department of Engineering, Australian National University, Canberra, 0200 Australia
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 13, Downloads (12 Months): 33, Citation Count: 12
|
|
|
ABSTRACT
We consider the problem of learning real-valued functions from random examples when the function values are corrupted with noise. With mild conditions on independent observation noise, we provide characterizations of the learnability of a real-valued function class in terms of a generalization of the Vapnik-Chervonenkis dimension, the fat shattering function, introduced by Kearns and Schapire. We show that, given some restrictions on the noise, a function class is learnable in our model if and only if its fat-shattering function is finite. With different (also quite mild) restrictions, satisfied for example by gaussian noise, we show that a function class is learnable from polynomially many examples if and only if its fat-shattering function grows polynomially. We prove analogous results in an agnostic setting, where there is no assumption of an underlying function class.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
N. ALON, S. BEN-DAVID, N. CESA-BIANCHI AND D. HAUS- SLER, Scale-sensitive dimensions, uniform convergence, and learnability, Symposium on Foundations of Computer Science, 1993.
|
| |
2
|
D. ANGLUIN AND L. G. VALIANT, Fast probabilistic algorithms for Hamiltonian circuits and matchings, Journal of Computer and System Sciences, 18 (1979), pp. 155-193.
|
| |
3
|
|
 |
4
|
Peter Auer , Philip M. Long , Wolfgang Maass , Gerhard J. Woeginger, On the complexity of function learning, Proceedings of the sixth annual conference on Computational learning theory, p.392-401, July 26-28, 1993, Santa Cruz, California, United States
[doi> 10.1145/168304.168384]
|
 |
5
|
|
 |
6
|
|
 |
7
|
Shai Ben-David , Nicolò Cesa-Bianchi , Philip M. Long, Characterizations of learnability for classes of {O, …, n}-valued functions, Proceedings of the fifth annual workshop on Computational learning theory, p.333-340, July 27-29, 1992, Pittsburgh, Pennsylvania, United States
[doi> 10.1145/130385.130423]
|
| |
8
|
|
 |
9
|
|
| |
10
|
S. VAN DE GEER, Regression Analysis and Empirical Processes, Centrnm voor Wiskunde en Informatica, Amsterdam, 1988.
|
| |
11
|
|
| |
12
|
M.J. KEARNS AND R. E. SCHAPIRE, Efficient distrbutionfree learning of probabilistic concepts (extended abstracO, Proceedings of the 31 st Annual Symposium on the Foundations of Computer Science, 1990.
|
 |
13
|
Michael J. Kearns , Robert E. Schapire , Linda M. Sellie, Toward efficient agnostic learning, Proceedings of the fifth annual workshop on Computational learning theory, p.341-352, July 27-29, 1992, Pittsburgh, Pennsylvania, United States
[doi> 10.1145/130385.130424]
|
| |
14
|
N. MERHAVANDM. FEDER, Universal schemes for sequential decision from individual data sequences, IEEE Transaction on Information Theory, 39 (1993), pp. 1280-1292.
|
| |
15
|
|
 |
16
|
|
| |
17
|
J.K. PATEL AND C. B. READ, The BigBook of Facts About the Normal Distribution, Marcel Dekker, New York, 1982.
|
| |
18
|
D. POLLARD, Convergence of Stochastic Processes, Springer, New York, 1984.
|
| |
19
|
H.L. ROYDEN, Real Analysis, Macmillan, New York, 1988.
|
| |
20
|
|
| |
21
|
H.U. S~ON, Bounds on the number of examples needed for learning functions, FB Informatik, LS II, Universitat Dortmund, Forschungsbericht Nr. 501, 1993.
|
 |
22
|
|
| |
23
|
V.N. VAPNIK AND A. Y. CHERVONENKIS, On the uniform convergence of relative frequencies of events to theirprobabilities, Theory of Probability and its Applications, XVI (1971), pp. 264-280.
|
CITED BY 12
|
|
|
|
|
|
|
|
|
|
|
Ilan Kremer , Noam Nisan , Dana Ron, On randomized one-round communication complexity, Proceedings of the twenty-seventh annual ACM symposium on Theory of computing, p.596-605, May 29-June 01, 1995, Las Vegas, Nevada, United States
|
|
|
|
|
|
|
|
|
Wee Sun Lee , Peter L. Bartlett , Robert C. Williamson, On efficient agnostic learning of linear combinations of basis functions, Proceedings of the eighth annual conference on Computational learning theory, p.369-376, July 05-08, 1995, Santa Cruz, California, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|