ACM Home Page
Please provide us with feedback. Feedback
On a learnability question associated to neural networks with continuous activations (extended abstract)
Full text PdfPdf (1.01 MB)
Source Annual Workshop on Computational Learning Theory archive
Proceedings of the seventh annual conference on Computational learning theory table of contents
New Brunswick, New Jersey, United States
Pages: 47 - 56  
Year of Publication: 1994
ISBN:0-89791-655-7
Authors
Bhaskar DasGupta  Department of Computer Science, University of Minnesota, Minneapolis, MN
Hava T. Siegelmann  Department of Computer Science, Bar-Ilan University, Ramat-Gan 52900, Israel
Eduardo Sontag  Department of Mathematics, Rutgers University, New Brunswick, NJ
Sponsors
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
SIGART: ACM Special Interest Group on Artificial Intelligence
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 12,   Downloads (12 Months): 37,   Citation Count: 1
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/180139.181009
What is a DOI?

ABSTRACT

This paper deals with learnability of concept classes defined by neural networks, showing the hardness of PAC-learning (in the complexity, not merely information-theoretic sense) for networks with a particular class of activation. The obstruction lies not with the VC dimension, which is known to grow slowly; instead, the result follows the fact that the loading problem is NP-complete. (The complexity scales badly with input dimension; the loading problem is polynomial-time if the input dimension is constant.) Similar and well-known theorems had already been proved by Megiddo and by Blum and Rivest, for binary-threshold networks. It turns out the general problem for continuous sigmoidal-type functions, as used in practical applications involving steepest descent, is not NP-hard—there are “sigmoidals” for which the problem is in fact trivial—so it is an open question to determine what properties of the activation function cause difficulties. Ours is the first result on the hardness of loading networks which do not consist of binary neurons; we employ a piecewise-linear activation function that has been used in the neural network literature. Our theoretical results lend further justification to the use of incremental (architecture-changing) techniques for training networks.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
Albertini, F., E.D. Sontag, and V. Maillot, "Uniqueness of weights for neural networks," in Artificial Neural Networks for Speech and Vision (R. Mammone, ed.), Chapman and Hall, London, 1993, pp. 115-125.
 
2
 
3
Batruni, R., "A multilayer neural network with piecewise-linear structure and back-propagation learning," IEEE Transactions on Neural Networks 2(1991): 395-403.
 
4
 
5
 
6
Brown, J., M. Garber, and S. Vanable, "Artificial neural network on a SIMD architecture," in Proc. 2nd Symposium on the Frontier of Massively Parallel Computation, Fairfax, VA, 1988, pp. 43-47.
7
 
8
 
9
DasGupta, B., Siegelmann, H. T., and Sontag, E., "On the Complexity of Training Neural Networks with Continuous Activation Functions", Tech Report 4~ 93-61, Department of Computer Science, University of Minnesota, September, 1993.
 
10
11
 
12
 
13
Jones, K.L., "A simple lemma on greedy approximation in Hilbert space and convergence rates for projection pursuit regression and neural network training," Annals of Statistics, 20(1992): 608-613.
 
14
 
15
16
 
17
 
18
Lippmann, R., "An introduction to computing with neural nets," IEEE Acoustics, Speech, and Signal Processing Magazine, 1987, pp. 4-22.
19
20
 
21
Megiddo, M., "On the complexity of polyhedral separability," Discrete Computational Geometry, 3(1988): 325-337.
 
22
Muroga, S., Threshold Logic and its Applications, John Wiley & Sons Inc., 1971.
 
23
 
24
 
25
Sontag, E.D. and H.J. Sussmann, "Backpropagation can give rise to spurious local minima even for networks without hidden layers," Complex Systems 3(1989): 91-106.
 
26
 
27
 
28
Zhang, X-D., "Complexity of neural network learning in the real number model," preprint, Comp. Sci. Dept., U. Mass., 1992.


Collaborative Colleagues:
Bhaskar DasGupta: colleagues
Hava T. Siegelmann: colleagues
Eduardo Sontag: colleagues