ACM Home Page
Please provide us with feedback. Feedback
Learnability and the Vapnik-Chervonenkis dimension
Full text PdfPdf (3.25 MB)
Source Journal of the ACM (JACM) archive
Volume 36 ,  Issue 4  (October 1989) table of contents
Pages: 929 - 965  
Year of Publication: 1989
ISSN:0004-5411
Authors
Anselm Blumer  Tufts Univ., Medford, MA
A. Ehrenfeucht  Univ. of Colorado at Boulder, Boulder
David Haussler  Univ. of California at Santa Cruz, Santa Cruz
Manfred K. Warmuth  Univ. of California at Santa Cruz, Santa Cruz
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 32,   Downloads (12 Months): 239,   Citation Count: 221
Additional Information:

abstract   references   cited by   index terms   review   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/76359.76371
What is a DOI?

ABSTRACT

Valiant's learnability model is extended to learning classes of concepts defined by regions in Euclidean space En. The methods in this paper lead to a unified treatment of some of Valiant's results, along with previous results on distribution-free convergence of certain pattern recognition algorithms. It is shown that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned. Using this parameter, the complexity and closure properties of learnable classes are analyzed, and the necessary and sufficient conditions are provided for feasible learnability.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
 
3
 
4
 
5
ANGLUIN, D., AND VALIANT, L. Fast probabilistic algorithms for Hamiltonian circuits and matchings. J. Comput. Syst. Sci. 19 (1979), 155-193.
 
6
ASSOUAD, P. Densit6 et Dimension. Ann. Inst. Fourier, Grenoble 33, 3 (1983), 233-282.
 
7
 
7a
 
8
 
9
 
10
 
11
BOLLOB/~S, B. Combinatorics. Cambridge Univ. Press, Cambridge, Mass., 1986.
 
12
 
13
COVER, T. Geometrical and statistical properties of systems of linear inequalities with applications to pattern recognition. IEEE Trans. Elect. Comp. 14 (1965), 326-334.
 
14
 
15
DEVROYE, L. P., AND WAGNER, T.J. A distribution-free performance bound in error estimation. IEEE Trans. inf. Theorem 22 (1976), 586-587.
 
16
DUDA, R., AND HART, P. Pattern Classification and Scene Analysis. Wiley, New York, 1973.
 
17
DUDLEY, R.M. A course on empirical processes. In Lecture Notes in Mathematics, vol. 1097. Springer-Verlag, New York, 1984.
 
18
DUDLEY, R. M. Universal Donsker classes and metric entropy. Ann. Prob. 15, 4 (1987), 1306-1326.
 
19
 
20
 
21
 
22
GII~L, J. Probabilistic Turing machines. SlAM J. Comput. 6, 4 (1977), 675-695.
 
23
Gm~, E., AND ZINN, J. Lectures on the central limit theorem for empirical processes. In Lecture Notes in Mathematics, vol. 122 I. Springer-Verlag, New York, 1986.
24
 
25
 
26
 
27
HA USSLER, D. Applying Valiant's learning framework to AI concept learning problems. In Proceedings of the 4th International Workshop on Machine Learning. Morgan Kaufinann, San Mateo, Calif., 1987, pp. 324-336.
 
28
 
28a
HA USSLER, D. Generalizing the PAC model: Sample size bounds from metric dimension-based uniform convergence results. In Proceedings of the 30th IEEE Symposium on Foundations of Computer Science (Research Triangle Park, N.C., Oct. 30-Nov. 1). IEEE, New York:, 1989, to appear.
 
29
HA USSLER, D., AND WELZL, E. Epsilon-nets and simplex range queries. Disc. Comput. Geometry 2 (1987), 127-151.
 
30
HAUSSLER, D., KEARNS, M., LITTLESTONE, N., AND WARMUTH, M.K. Equivalence of models for polynomial learnability. Tech. Rep. UCSC-CRL-88-06. Univ. California at Santa Cruz, Santa Cruz, Calif., 1988.
 
31
HAUSSLER, D., LITTLESTONE, N., AND WARMUTH, M.K. Predicting {0, 1}-functions on~ randomly drawn points. In Proceedings of the 29th IEEE Symposium on Foundations of Computer Science (White Plains, N.Y., Oct.). IEEE, New York, 1988, pp. 100-109.
 
32
JOHNSON, D.S. Approximation algorithms for combinatorial problems. J. Comput. Syst. Sci. 9 (1974), 256-278.
 
33
34
35
36
 
37
KEARNS, M., LI, M., PITT, L., AND VALIANT, L. Recent results on Boolean concept learning. In Proceedings of the 4th International Workshop on Machine Learning. Morgan-Kaufinann, San Mateo, Calif., 1987, pp. 337-352.
 
38
 
39
LAIRD, P.D. Learning from good data and bad. Tech. Rep. YALEU/DCS/TR-551. Yale Univ., New Haven, Conn., 1987.
 
40
LEE, D. T., AND PREPARATA, F.P. Computational geometry--A survey. IEEE Trans. Comput. 33, 12 (1984), 1072-1101.
 
41
LINIAL, N., MANSOUR, Y., AND RWEST, R. Results on learnability and the Vapnik-Chervonenkis dimension. In Proceedings of the 29th IEEE Symposium on Foundations of Computer Science (White Plains, N.Y., Oct.). IEEE, New York, 1988, pp. 120-129.
 
42
 
43
MASEK, W.J. Some NP-complete set cover problems. MIT Laboratory for Computer Science, unpublished manuscript.
44
 
45
MEGIDDO, N. On the complexity of polyhedral separability. Discrete Comput. Geometry 3 (1988), 325-337.
 
46
MUROGA, S. Threshold Logic and Its Applications. Wiley, New York, 1971.
47
 
48
NATARAJAN, B.K. Learning functions from examples. Tech. Rep. CMU-RI-TR-87-19. Carnegie Mellon Univ., Pittsburgh, Pa., Aug. 1987.
 
49
NIGMATULLIN, R. G. The fastest descent method for coveting problems (in Russian). In Proceedings of a Symposium on Questions of Precision and Efficiency of Computer Algorithms, Book 5. Kiev, 1969, pp. 116-126.
 
50
PEARL, J. On the connection between the complexity and credibility of inferred models. Int. J. Gen. Syst. 4 (1978), 255-264.
 
51
PEARL, J. Capacity and error estimates for Boolean classifiers with limited capacity. IEEE Trans. Pattern Analysis Mach. Intell. 1, 4 (1979), 350-355.
52
 
53
PITT, L., AND WARMUTH, M. Reductions among prediction problems, on the difficulty of predicting automata. In Proceedings of the 3rd IEEE Structure in Complexity Theory Conference (Washington, D.C.). IEEE, New York, 1988, pp. 62-69.
 
54
POLLARD, D. Convergence of Stochastic Processes. Springer-Verlag, New York, 1984.
 
55
 
56
 
57
ROYDEN, H.L. RealAnalysis, 2nd ed. MacMillan, New York, 1968.
 
58
59
 
60
VALIANT, L.G. Learning disjunctions of conjunctions. In Proceedings of the 9th International Conference on Artificial Intelligence (Los Angeles, Calif., Aug.), vol. 1. Morgan Kaufmann, San MateD, Calif., 1985, pp. 560-566.
 
61
 
62
VAPNIK, V. N., AND CHERVONENKIS, A. YA. On the uniform convergence of relative frequencies of events to their probabilities. Theoret. Probi. and its Appl. 16, 2 ( 1971), 264-280.
 
63
VAPNIK, V. N., AND CHERVONENKIS, A. YA. Theory of Pattern Recognition (in Russian). Nauka, Moscow, 1974.
 
64
 
65
WATANABE, S. Pattern recognition as information compression. In Frontiers of Pattern Recognition, S. Watanabe, Ed. Academic Press, Orlando, Fla., 1972.
 
66
WENOCUR, R. S., AND DUDLEY, R.M. Some special Vapnik-Chervonenkis classes. Discrete Math. 33 (1981), 313-318.
 
67
WINSTON, P. Learning structural descriptions from examples. In The Psychology of Computer Vision. McGraw-Hill, New York, 1975.

CITED BY  221


REVIEW

"Vitit Kantabutra : Reviewer"

The type of learning considered in this paper, which extends Valiant's theory of “learning by sampling” [1] from Boolean domains to multidimensional Euclidean spaces, is exemplified by the following case. We want an algorithm that   more...

Collaborative Colleagues:
Anselm Blumer: colleagues
A. Ehrenfeucht: colleagues
David Haussler: colleagues
Manfred K. Warmuth: colleagues