Abstract
One of the main requirements in modelling, i.e. neural network training, is to ensure good generalization of the obtained model. This requirement therefore has to be built into the training mechanism, either by constantly monitoring the behavior of the trained network on an independent data set during learning or by appropriately modifying the cost function. Akaike’s and Rissanen’s methods for formulating cost functions that naturally include model complexity terms are presented in Chapter 7, while the problem of generalization over an infinite ensemble of networks is presented in Chapter 8.
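The abstract names two mechanisms for building generalization into training: monitoring the network's behavior on an independent data set (early stopping) and adding a complexity penalty to the cost function. The sketch below (not from the chapter) combines both for a small one-hidden-layer network in plain NumPy; the weight-decay penalty, the patience rule, and all constants are illustrative assumptions, not the book's formulation.

```python
# A minimal sketch, assuming a squared-error cost penalized by simple
# weight decay, J = MSE + lam * (||W1||^2 + ||W2||^2), with early
# stopping against an independent validation set. All names and
# hyperparameters here are illustrative, not taken from the chapter.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D regression task, split into training and validation sets.
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X) + 0.1 * rng.standard_normal(X.shape)
X_tr, y_tr, X_va, y_va = X[:150], y[:150], X[150:], y[150:]

# One-hidden-layer tanh network with parameters W1, b1, W2, b2.
H = 20
W1 = 0.5 * rng.standard_normal((1, H)); b1 = np.zeros(H)
W2 = 0.5 * rng.standard_normal((H, 1)); b2 = np.zeros(1)

lam, lr = 1e-4, 0.05                 # weight-decay strength, learning rate
best_va, patience, bad = np.inf, 50, 0

for epoch in range(5000):
    # Forward pass on the training set.
    h = np.tanh(X_tr @ W1 + b1)
    out = h @ W2 + b2
    err = out - y_tr

    # Gradients of the penalized cost; the 2*lam*W terms come from
    # the complexity penalty added to the data-fit term.
    g_out = 2.0 * err / len(X_tr)
    gW2 = h.T @ g_out + 2.0 * lam * W2
    gb2 = g_out.sum(axis=0)
    g_h = (g_out @ W2.T) * (1.0 - h**2)
    gW1 = X_tr.T @ g_h + 2.0 * lam * W1
    gb1 = g_h.sum(axis=0)

    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

    # Early stopping: monitor error on the independent validation set
    # and halt once it has stopped improving for `patience` epochs.
    va_err = np.mean((np.tanh(X_va @ W1 + b1) @ W2 + b2 - y_va) ** 2)
    if va_err < best_va:
        best_va, bad = va_err, 0
    else:
        bad += 1
        if bad > patience:
            break
```

The patience counter is one common realization of "constantly monitoring the behavior of the trained network"; the chapter's own methods replace the ad hoc lam with information-theoretic complexity terms.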
Copyright information
© 1996 Springer-Verlag New York, Inc.
Cite this chapter
Deco, G., Obradovic, D. (1996). Information Theory Based Regularizing Methods. In: An Information-Theoretic Approach to Neural Computing. Perspectives in Neural Computing. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-4016-7_10
DOI: https://doi.org/10.1007/978-1-4612-4016-7_10
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4612-8469-7
Online ISBN: 978-1-4612-4016-7