|
ABSTRACT
Valiant's learnability model is extended to learning classes of concepts defined by regions in Euclidean space En. The methods in this paper lead to a unified treatment of some of Valiant's results, along with previous results on distribution-free convergence of certain pattern recognition algorithms. It is shown that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned. Using this parameter, the complexity and closure properties of learnable classes are analyzed, and the necessary and sufficient conditions are provided for feasible learnability.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
|
| |
3
|
|
| |
4
|
|
| |
5
|
ANGLUIN, D., AND VALIANT, L. Fast probabilistic algorithms for Hamiltonian circuits and matchings. J. Comput. Syst. Sci. 19 (1979), 155-193.
|
| |
6
|
ASSOUAD, P. Densit6 et Dimension. Ann. Inst. Fourier, Grenoble 33, 3 (1983), 233-282.
|
| |
7
|
|
| |
7a
|
|
| |
8
|
|
| |
9
|
Avrim Blum , Ronald L. Rivest, Training a 3-node neural network is NP-complete, Proceedings of the first annual workshop on Computational learning theory, p.9-18, August 03-05, 1988, MIT, Cambridge, Massachusetts, United States
|
| |
10
|
|
| |
11
|
BOLLOB/~S, B. Combinatorics. Cambridge Univ. Press, Cambridge, Mass., 1986.
|
| |
12
|
Stéphane Boucheron , Jean Sallantin, Some remarks about space-complexity of learning, and circuit complexity of recognizing, Proceedings of the first annual workshop on Computational learning theory, p.125-138, August 03-05, 1988, MIT, Cambridge, Massachusetts, United States
|
| |
13
|
COVER, T. Geometrical and statistical properties of systems of linear inequalities with applications to pattern recognition. IEEE Trans. Elect. Comp. 14 (1965), 326-334.
|
| |
14
|
|
| |
15
|
DEVROYE, L. P., AND WAGNER, T.J. A distribution-free performance bound in error estimation. IEEE Trans. inf. Theorem 22 (1976), 586-587.
|
| |
16
|
DUDA, R., AND HART, P. Pattern Classification and Scene Analysis. Wiley, New York, 1973.
|
| |
17
|
DUDLEY, R.M. A course on empirical processes. In Lecture Notes in Mathematics, vol. 1097. Springer-Verlag, New York, 1984.
|
| |
18
|
DUDLEY, R. M. Universal Donsker classes and metric entropy. Ann. Prob. 15, 4 (1987), 1306-1326.
|
| |
19
|
|
| |
20
|
|
| |
21
|
|
| |
22
|
GII~L, J. Probabilistic Turing machines. SlAM J. Comput. 6, 4 (1977), 675-695.
|
| |
23
|
Gm~, E., AND ZINN, J. Lectures on the central limit theorem for empirical processes. In Lecture Notes in Mathematics, vol. 122 I. Springer-Verlag, New York, 1986.
|
 |
24
|
|
| |
25
|
|
| |
26
|
|
| |
27
|
HA USSLER, D. Applying Valiant's learning framework to AI concept learning problems. In Proceedings of the 4th International Workshop on Machine Learning. Morgan Kaufinann, San Mateo, Calif., 1987, pp. 324-336.
|
| |
28
|
|
| |
28a
|
HA USSLER, D. Generalizing the PAC model: Sample size bounds from metric dimension-based uniform convergence results. In Proceedings of the 30th IEEE Symposium on Foundations of Computer Science (Research Triangle Park, N.C., Oct. 30-Nov. 1). IEEE, New York:, 1989, to appear.
|
| |
29
|
HA USSLER, D., AND WELZL, E. Epsilon-nets and simplex range queries. Disc. Comput. Geometry 2 (1987), 127-151.
|
| |
30
|
HAUSSLER, D., KEARNS, M., LITTLESTONE, N., AND WARMUTH, M.K. Equivalence of models for polynomial learnability. Tech. Rep. UCSC-CRL-88-06. Univ. California at Santa Cruz, Santa Cruz, Calif., 1988.
|
| |
31
|
HAUSSLER, D., LITTLESTONE, N., AND WARMUTH, M.K. Predicting {0, 1}-functions on~ randomly drawn points. In Proceedings of the 29th IEEE Symposium on Foundations of Computer Science (White Plains, N.Y., Oct.). IEEE, New York, 1988, pp. 100-109.
|
| |
32
|
JOHNSON, D.S. Approximation algorithms for combinatorial problems. J. Comput. Syst. Sci. 9 (1974), 256-278.
|
| |
33
|
|
 |
34
|
|
 |
35
|
|
 |
36
|
M. Kearns , M. Li , L. Pitt , L. Valiant, On the learnability of Boolean formulae, Proceedings of the nineteenth annual ACM conference on Theory of computing, p.285-295, January 1987, New York, New York, United States
[doi> 10.1145/28395.28426]
|
| |
37
|
KEARNS, M., LI, M., PITT, L., AND VALIANT, L. Recent results on Boolean concept learning. In Proceedings of the 4th International Workshop on Machine Learning. Morgan-Kaufinann, San Mateo, Calif., 1987, pp. 337-352.
|
| |
38
|
|
| |
39
|
LAIRD, P.D. Learning from good data and bad. Tech. Rep. YALEU/DCS/TR-551. Yale Univ., New Haven, Conn., 1987.
|
| |
40
|
LEE, D. T., AND PREPARATA, F.P. Computational geometry--A survey. IEEE Trans. Comput. 33, 12 (1984), 1072-1101.
|
| |
41
|
LINIAL, N., MANSOUR, Y., AND RWEST, R. Results on learnability and the Vapnik-Chervonenkis dimension. In Proceedings of the 29th IEEE Symposium on Foundations of Computer Science (White Plains, N.Y., Oct.). IEEE, New York, 1988, pp. 120-129.
|
| |
42
|
|
| |
43
|
MASEK, W.J. Some NP-complete set cover problems. MIT Laboratory for Computer Science, unpublished manuscript.
|
 |
44
|
|
| |
45
|
MEGIDDO, N. On the complexity of polyhedral separability. Discrete Comput. Geometry 3 (1988), 325-337.
|
| |
46
|
MUROGA, S. Threshold Logic and Its Applications. Wiley, New York, 1971.
|
 |
47
|
|
| |
48
|
NATARAJAN, B.K. Learning functions from examples. Tech. Rep. CMU-RI-TR-87-19. Carnegie Mellon Univ., Pittsburgh, Pa., Aug. 1987.
|
| |
49
|
NIGMATULLIN, R. G. The fastest descent method for coveting problems (in Russian). In Proceedings of a Symposium on Questions of Precision and Efficiency of Computer Algorithms, Book 5. Kiev, 1969, pp. 116-126.
|
| |
50
|
PEARL, J. On the connection between the complexity and credibility of inferred models. Int. J. Gen. Syst. 4 (1978), 255-264.
|
| |
51
|
PEARL, J. Capacity and error estimates for Boolean classifiers with limited capacity. IEEE Trans. Pattern Analysis Mach. Intell. 1, 4 (1979), 350-355.
|
 |
52
|
|
| |
53
|
PITT, L., AND WARMUTH, M. Reductions among prediction problems, on the difficulty of predicting automata. In Proceedings of the 3rd IEEE Structure in Complexity Theory Conference (Washington, D.C.). IEEE, New York, 1988, pp. 62-69.
|
| |
54
|
POLLARD, D. Convergence of Stochastic Processes. Springer-Verlag, New York, 1984.
|
| |
55
|
|
| |
56
|
|
| |
57
|
ROYDEN, H.L. RealAnalysis, 2nd ed. MacMillan, New York, 1968.
|
| |
58
|
|
 |
59
|
|
| |
60
|
VALIANT, L.G. Learning disjunctions of conjunctions. In Proceedings of the 9th International Conference on Artificial Intelligence (Los Angeles, Calif., Aug.), vol. 1. Morgan Kaufmann, San MateD, Calif., 1985, pp. 560-566.
|
| |
61
|
|
| |
62
|
VAPNIK, V. N., AND CHERVONENKIS, A. YA. On the uniform convergence of relative frequencies of events to their probabilities. Theoret. Probi. and its Appl. 16, 2 ( 1971), 264-280.
|
| |
63
|
VAPNIK, V. N., AND CHERVONENKIS, A. YA. Theory of Pattern Recognition (in Russian). Nauka, Moscow, 1974.
|
| |
64
|
Jeffrey Scott Vitter , Jyh-Han Lin, Learning in parallel, Proceedings of the first annual workshop on Computational learning theory, p.106-124, August 03-05, 1988, MIT, Cambridge, Massachusetts, United States
|
| |
65
|
WATANABE, S. Pattern recognition as information compression. In Frontiers of Pattern Recognition, S. Watanabe, Ed. Academic Press, Orlando, Fla., 1972.
|
| |
66
|
WENOCUR, R. S., AND DUDLEY, R.M. Some special Vapnik-Chervonenkis classes. Discrete Math. 33 (1981), 313-318.
|
| |
67
|
WINSTON, P. Learning structural descriptions from examples. In The Psychology of Computer Vision. McGraw-Hill, New York, 1975.
|
CITED BY 221
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Alberto Bertoni , Paola Campadelli , Anna Morpurgo , Sandra Panizza, Polynomial uniform convergence and polynomial-sample learnability, Proceedings of the fifth annual workshop on Computational learning theory, p.265-271, July 27-29, 1992, Pittsburgh, Pennsylvania, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Jiří Matoušek , Raimund Seidel , E. Welzl, How to net a lot with little: small &egr;-nets for disks and halfspaces, Proceedings of the sixth annual symposium on Computational geometry, p.16-22, June 07-09, 1990, Berkley, California, United States
|
|
|
|
|
|
Lisa Hellerstein , Vijay Raghavan , Krishnan Pillaipakkamnatt , Dawn Wilkins, How many queries are needed to learn?, Proceedings of the twenty-seventh annual ACM symposium on Theory of computing, p.190-199, May 29-June 01, 1995, Las Vegas, Nevada, United States
|
|
|
|
|
|
Nicolò Cesa-Bianchi , Eli Dichterman , Paul Fischer , Hans Ulrich Simon, Noise-tolerant learning near the information-theoretic bound, Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, p.141-150, May 22-24, 1996, Philadelphia, Pennsylvania, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Nicolò Cesa-Bianchi , Yoav Freund , David Haussler , David P. Helmbold , Robert E. Schapire , Manfred K. Warmuth, How to use expert advice, Journal of the ACM (JACM), v.44 n.3, p.427-485, May 1997
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Felipe Cucker , Marek Karpinski , Pascal Koiran , Thomas Lickteig , Kai Werther, On real Turing machines that toss coins, Proceedings of the twenty-seventh annual ACM symposium on Theory of computing, p.335-342, May 29-June 01, 1995, Las Vegas, Nevada, United States
|
|
|
Scott Decatur , Oded Goldreich , Dana Ron, Computational sample complexity, Proceedings of the tenth annual conference on Computational learning theory, p.130-142, July 06-09, 1997, Nashville, Tennessee, United States
|
|
|
Nader H. Bshouty , Sally A. Goldman , H. David Mathias , Subhash Suri , Hisao Tamaki, Noise-tolerant distribution-free learning of general geometric concepts, Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, p.151-160, May 22-24, 1996, Philadelphia, Pennsylvania, United States
|
|
|
|
|
|
|
|
|
|
|
|
Peter Auer , Philip M. Long , Aravind Srinivasan, Approximating hyper-rectangles: learning and pseudo-random sets, Proceedings of the twenty-ninth annual ACM symposium on Theory of computing, p.314-323, May 04-06, 1997, El Paso, Texas, United States
|
|
|
Wee Sun Lee , Peter L. Bartlett , Robert C. Williamson, Lower bounds on the VC-dimension of smoothly parametrized function classes, Proceedings of the seventh annual conference on Computational learning theory, p.362-367, July 12-15, 1994, New Brunswick, New Jersey, United States
|
|
|
|
|
|
Martin Anthony , Graham Brightwell , Dave Cohen , John Shawe-Taylor, On exact specification by examples, Proceedings of the fifth annual workshop on Computational learning theory, p.311-318, July 27-29, 1992, Pittsburgh, Pennsylvania, United States
|
|
|
|
|
|
|
|
|
Nader H. Bshouty , Sally A. Goldman , H. David Mathias, Noise-tolerant parallel learning of geometric concepts, Proceedings of the eighth annual conference on Computational learning theory, p.345-352, July 05-08, 1995, Santa Cruz, California, United States
|
|
|
Shai Ben-David , Nicolò Cesa-Bianchi , Philip M. Long, Characterizations of learnability for classes of {O, …, n}-valued functions, Proceedings of the fifth annual workshop on Computational learning theory, p.333-340, July 27-29, 1992, Pittsburgh, Pennsylvania, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Svetlana Anoulova , Paul Fischer , Stefan Pölt , Hans Ulrich Simon, PAB-decisions for Boolean and real-valued features, Proceedings of the fifth annual workshop on Computational learning theory, p.353-362, July 27-29, 1992, Pittsburgh, Pennsylvania, United States
|
|
|
Michael J. Kearns , Robert E. Schapire , Linda M. Sellie, Toward efficient agnostic learning, Proceedings of the fifth annual workshop on Computational learning theory, p.341-352, July 27-29, 1992, Pittsburgh, Pennsylvania, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Rocco A. Servedio, On PAC learning using Winnow, Perceptron, and a Perceptron-like algorithm, Proceedings of the twelfth annual conference on Computational learning theory, p.296-307, July 07-09, 1999, Santa Cruz, California, United States
|
|
|
|
|
|
|
|
|
Michael Kearns , Yishay Mansour , Dana Ron , Ronitt Rubinfeld , Robert E. Schapire , Linda Sellie, On the learnability of discrete distributions, Proceedings of the twenty-sixth annual ACM symposium on Theory of computing, p.273-282, May 23-25, 1994, Montreal, Quebec, Canada
|
|
|
Noga Alon , Shai Ben-David , Nicolò Cesa-Bianchi , David Haussler, Scale-sensitive dimensions, uniform convergence, and learnability, Journal of the ACM (JACM), v.44 n.4, p.615-631, July 1997
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Paul W. Goldberg , Sally A. Goldman , H. David Mathias, Learning unions of boxes with membership and equivalence queries, Proceedings of the seventh annual conference on Computational learning theory, p.198-207, July 12-15, 1994, New Brunswick, New Jersey, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Peter L. Bartlett , Philip M. Long , Robert C. Williamson, Fat-shattering and the learnability of real-valued functions, Proceedings of the seventh annual conference on Computational learning theory, p.299-310, July 12-15, 1994, New Brunswick, New Jersey, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ilan Kremer , Noam Nisan , Dana Ron, On randomized one-round communication complexity, Proceedings of the twenty-seventh annual ACM symposium on Theory of computing, p.596-605, May 29-June 01, 1995, Las Vegas, Nevada, United States
|
|
|
|
|
|
Shai Ben-David , Michal Jacovi, On learning in the limit and non-uniform (&egr;,&dgr;)-learning, Proceedings of the sixth annual conference on Computational learning theory, p.209-217, July 26-28, 1993, Santa Cruz, California, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Michael Frazier , Sally Goldman , Nina Mishra , Leonard Pitt, Learning from a consistently ignorant teacher, Proceedings of the seventh annual conference on Computational learning theory, p.328-339, July 12-15, 1994, New Brunswick, New Jersey, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Wee Sun Lee , Peter L. Bartlett , Robert C. Williamson, On efficient agnostic learning of linear combinations of basis functions, Proceedings of the eighth annual conference on Computational learning theory, p.369-376, July 05-08, 1995, Santa Cruz, California, United States
|
|
|
Avrim Blum , Merrick Furst , Jeffrey Jackson , Michael Kearns , Yishay Mansour , Steven Rudich, Weakly learning DNF and characterizing statistical query learning using Fourier analysis, Proceedings of the twenty-sixth annual ACM symposium on Theory of computing, p.253-262, May 23-25, 1994, Montreal, Quebec, Canada
|
|
|
Charles W. Glover , Nageswara S. V. Rao , E. M. Oblow, Hybrid pattern recognition system capable of self-modification, Proceedings of the second international conference on Information and knowledge management, p.239-244, November 01-05, 1993, Washington, D.C., United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
REVIEW
"Vitit Kantabutra : Reviewer"
The type of learning considered in this paper, which extends
Valiant's theory of “learning by sampling” [1] from Boolean
domains to multidimensional Euclidean spaces, is exemplified by the
following case. We want an algorithm that
more...
|