Abstract
Artificial neural networks are growing in both number of applications and complexity, which makes minimizing the number of units important for many practical implementations. A particular problem is the minimum number of units that a feed-forward neural network needs in its first layer. To study this problem, we define a family of classification problems based on a continuity hypothesis, under which inputs that are close to a given set of points may share the same category. Given a set S of k-dimensional inputs, and letting \(\mathcal{N}\) be a feed-forward neural network that classifies every input in S within a fixed error, we prove that \(\mathcal{N}\) requires \({\Theta}\left(k\right)\) units in its first layer if \(\mathcal{N}\) can solve every instance of the given family of classification problems. Furthermore, this asymptotic bound is tight.
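Stated compactly (an informal restatement of the abstract's main claim; the first-layer width \(n_1\) and the error bound \(\varepsilon\) are notation introduced here for illustration, not taken from the paper):
\[
S \subseteq \mathbb{R}^{k},\quad \mathcal{N} \text{ classifies every } x \in S \text{ within error } \varepsilon \text{ on all instances of the family} \;\Longrightarrow\; n_1(\mathcal{N}) = {\Theta}\left(k\right),
\]
where the \({\Theta}\left(k\right)\) bound combines a lower bound of \(\Omega(k)\) units with a matching \(O(k)\) upper bound, which is the sense in which the asymptotic result is optimal.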
Cite this article
Grillo, S.A. A linear relation between input and first layer in neural networks. Ann Math Artif Intell 87, 361–372 (2019). https://doi.org/10.1007/s10472-019-09657-3