Abstract
In spite of the appealing theoretical properties of mixtures of logistic activation functions, standard feedforward neural networks with limited resources and gradient-descent optimization of the connection weights may fail in practice on several difficult learning tasks. Such tasks would be better faced by relying on a more appropriate, problem-specific basis of activation functions. This paper introduces a connectionist model featuring adaptive activation functions. Each hidden unit in the network is associated with a specific pair (f(·), p(·)), where f(·) (the activation function proper) is modeled via a specialized neural network, and p(·) is a probabilistic measure of the likelihood that the unit itself is relevant to the computation of the output over the current input. While f(·) is optimized in a supervised manner (through a novel scheme that backpropagates the target outputs, and which does not suffer from the traditional “vanishing gradient” phenomenon of standard backpropagation), p(·) is realized via a statistical parametric model learned through unsupervised estimation. The overall machine is implicitly a co-trained coupled model, where the topology chosen for learning each f(·) may vary on a unit-by-unit basis, resulting in a highly non-standard neural architecture.
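To make the (f(·), p(·)) pairing concrete, here is a minimal, hedged sketch, not the authors' implementation: each hidden unit carries an adaptive activation f(·) realized as a small inner MLP over the unit's pre-activation, and a relevance measure p(·) realized, purely by way of assumption, as a diagonal Gaussian density over the input whose parameters are fitted by unsupervised maximum-likelihood estimation. All names here (AdaptiveUnit, inner_hidden, fit_p) are illustrative and do not come from the paper.

import numpy as np

class AdaptiveUnit:
    """Sketch of one hidden unit with an adaptive activation f(.) and a
    relevance measure p(.). Illustrative only; not the paper's algorithm."""

    def __init__(self, in_dim, inner_hidden=8, rng=np.random.default_rng(0)):
        self.w = rng.normal(scale=0.1, size=in_dim)   # incoming weights
        self.b = 0.0                                  # bias of the unit
        # Inner one-hidden-layer MLP modeling f(.): R -> R.
        self.V = rng.normal(scale=0.1, size=(inner_hidden, 1))
        self.c = np.zeros(inner_hidden)
        self.u = rng.normal(scale=0.1, size=inner_hidden)
        # Assumed parametric form for p(.): diagonal Gaussian over the input.
        self.mu = np.zeros(in_dim)
        self.var = np.ones(in_dim)

    def f(self, a):
        # Adaptive activation: tiny MLP applied to the pre-activation a.
        h = np.tanh(self.V @ np.atleast_1d(a) + self.c)
        return float(self.u @ h)

    def p(self, x):
        # Likelihood that this unit is relevant to input x (Gaussian density).
        z = (x - self.mu) ** 2 / self.var
        norm = np.prod(np.sqrt(2 * np.pi * self.var))
        return float(np.exp(-0.5 * z.sum()) / norm)

    def forward(self, x):
        a = self.w @ x + self.b          # standard pre-activation
        return self.p(x) * self.f(a)     # relevance-weighted adaptive output

    def fit_p(self, X):
        # Unsupervised maximum-likelihood estimation of the Gaussian for p(.).
        self.mu = X.mean(axis=0)
        self.var = X.var(axis=0) + 1e-6  # floor the variance for stability

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))            # unlabeled inputs
unit = AdaptiveUnit(in_dim=3)
unit.fit_p(X)                            # unsupervised step for p(.)
y = unit.forward(X[0])                   # relevance-weighted activation value

Weighting the unit's output by p(x) lets each unit specialize to the region of input space where its adaptive activation is most credible, which is one plausible reading of the co-trained coupling described in the abstract; the supervised training of the inner MLPs for f(·) is left out here.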
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Castelli, I., Trentin, E. (2012). Supervised and Unsupervised Co-training of Adaptive Activation Functions in Neural Nets. In: Schwenker, F., Trentin, E. (eds) Partially Supervised Learning. PSL 2011. Lecture Notes in Computer Science, vol 7081. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28258-4_6
DOI: https://doi.org/10.1007/978-3-642-28258-4_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28257-7
Online ISBN: 978-3-642-28258-4