
Variance Reduction via Noise and Bias Constraints

Chapter in: Combining Artificial Neural Nets

Part of the book series: Perspectives in Neural Computing (PERSPECT.NEURAL)

Summary

Bootstrap samples with noise are shown to be an effective smoothness and capacity-control technique for training feed-forward networks, as well as for other statistical methods such as generalized additive models. The noisy bootstrap is shown to perform best in conjunction with weight-decay regularisation and ensemble averaging. The two-spiral problem, a highly nonlinear, noise-free classification task, is used to demonstrate these findings.
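As a concrete illustration of the procedure described in the summary, the following is a minimal sketch (not the authors' implementation) of a noisy-bootstrap ensemble on the two-spiral problem. It assumes scikit-learn, with the MLPClassifier alpha L2 penalty standing in for weight decay; the noise level, hidden-layer size, and ensemble size are illustrative choices, not values from the chapter.

```python
# A minimal sketch of bootstrapping with noise plus weight decay and ensemble
# averaging, assuming scikit-learn; this is NOT the authors' implementation.
# The noise level, hidden-layer size, and ensemble size are illustrative only.
import numpy as np
from sklearn.neural_network import MLPClassifier


def two_spirals(n_per_class=200, turns=3.0):
    """Generate the classic noise-free two-spiral classification problem."""
    t = np.linspace(0.1, 1.0, n_per_class) * turns * 2.0 * np.pi
    spiral = np.column_stack([t * np.cos(t), t * np.sin(t)])
    X = np.vstack([spiral, -spiral])            # second spiral mirrors the first
    y = np.hstack([np.zeros(n_per_class), np.ones(n_per_class)])
    return X, y


def noisy_bootstrap_ensemble(X, y, n_members=10, noise_sd=0.3,
                             weight_decay=1e-3, seed=0):
    """Train one MLP per bootstrap sample, adding Gaussian noise to the inputs."""
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        idx = rng.integers(0, len(X), size=len(X))   # resample with replacement
        X_b = X[idx] + rng.normal(0.0, noise_sd, size=X[idx].shape)
        net = MLPClassifier(hidden_layer_sizes=(30,),
                            alpha=weight_decay,      # L2 penalty ~ weight decay
                            max_iter=3000,
                            random_state=int(rng.integers(2**31)))
        net.fit(X_b, y[idx])
        members.append(net)
    return members


def ensemble_predict(members, X):
    """Ensemble averaging: mean of the members' class-probability outputs."""
    probs = np.mean([m.predict_proba(X) for m in members], axis=0)
    return probs.argmax(axis=1)


if __name__ == "__main__":
    X, y = two_spirals()
    ensemble = noisy_bootstrap_ensemble(X, y)
    print("training accuracy:", (ensemble_predict(ensemble, X) == y).mean())
```

Averaging the members' probability outputs, rather than their hard labels, is one common way to realise the ensemble-averaging step; setting noise_sd to zero reduces the sketch to plain bagging.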




Copyright information

© 1999 Springer-Verlag London Limited

About this chapter

Cite this chapter

Raviv, Y., Intrator, N. (1999). Variance Reduction via Noise and Bias Constraints. In: Sharkey, A.J.C. (eds) Combining Artificial Neural Nets. Perspectives in Neural Computing. Springer, London. https://doi.org/10.1007/978-1-4471-0793-4_7


  • DOI: https://doi.org/10.1007/978-1-4471-0793-4_7

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-85233-004-0

  • Online ISBN: 978-1-4471-0793-4

