Skip to main content

Regularization Theory, Radial Basis Functions and Networks

  • Conference paper
From Statistics to Neural Networks

Part of the book series: NATO ASI Series ((NATO ASI F,volume 136))

Abstract

This paper consists of two parts. In the first part we consider the problem of learning from examples in the setting of the theory of the approximation of multivariate functions from sparse data. We first will show how an approach based on regularization theory leads to develop a family of approximation techniques, including Radial Basis Functions, and some tensor product and additive splines. Then we will show how this fairly classical approach has to be extended in order to cope with special features of the problem of learning of examples, such as high dimensionality and strong anisotropics. Furthermore, the same extension that leads from Radial Basis Functions (RBF) to Hyper Basis Functions (HBF) also leads from additive models to ridge approximation models, such as some forms of Projection Pursuit Regression.

In the second part we consider the problem of estimating the complexity of the approximation problem. We first briefly discuss the problem of degree of approximation, that is related to the number of parameters that are needed in order to guarantee a generalization error smaller than a certain amount. Then we consider certain Radial Basis Functions techniques, and, using a lemma by Jones (1992) and Barron (1991, 1993) we show that it is possible to define function spaces and basis functions for which the rate of convergence to zero of the error is O \(\left( {\frac{1} {{\sqrt n }}} \right)\) in any number of dimensions. The apparent avoidance of the “curse of dimensionality” is due to the fact that these function spaces are more and more constrained as the dimension increases. Examples include spaces of the Sobolev type, in which the number of weak derivatives is required to be larger than the number of dimensions.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. N. Aronszajn. Theory of reproducing kernels. Trans. Amer. Math. Soc, 686:337–404, 1950.

    Article  MathSciNet  Google Scholar 

  2. A.R. Barron. Universal approximation bounds for superpositions of a sigmoidal function. Technical Report 58, Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, IL, March 1991.

    Google Scholar 

  3. A.R. Barron. Approximation and estimation bounds for artificial neural networks. Technical Report 59, Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, IL, March 1991a.

    Google Scholar 

  4. A.R. Barron. Universal approximation bounds for superpositions of a sigmoidal function. IEEE Trans. Info. Theory, 39(3):930–945, May 1993.

    Article  MATH  MathSciNet  Google Scholar 

  5. A.R. Barron. Approximation and estimation bounds for artificial neural networks. Machine Learning, 14:115–133, 1994. (to appear).

    MATH  Google Scholar 

  6. R.E. Bellman. Adaptive Control Processes. Princeton University Press, Princeton, NJ, 1961.

    MATH  Google Scholar 

  7. M. Bertero. Regularization methods for linear inverse problems. In C. G. Talenti, editor, Inverse Problems. Springer-Verlag, Berlin, 1986.

    Google Scholar 

  8. L. Breiman. Hinging hyperplanes for regression, classification, and function approximation. 1992. (submitted for publication).

    Google Scholar 

  9. D.S. Broomhead and D. Lowe. Multivariable functional interpolation and adaptive networks. Complex Systems, 2:321–355, 1988.

    MATH  MathSciNet  Google Scholar 

  10. A. Buja, T. Hastie, and R. Tibshirani. Linear smoothers and additive models. Annals of Statistics, 17:453–555, 1989.

    Article  MATH  MathSciNet  Google Scholar 

  11. R. DeVore, R. Howard, and C. Micchelli. Optimal nonlinear approximation. Manuskripta Mathematika, 1989.

    Google Scholar 

  12. R.A. DeVore. Degree of nonlinear approximation. In C.K. Chui, L.L. Schumaker, and D.J. Ward, editors, Approximation Theory, VI, pages 175–201. Academic Press, New York, 1991.

    Google Scholar 

  13. D.L. Donohue and I.M. Johnstone. Projection-based approximation and a duality with kernel methods. Annals of Statistics, 17(1):58–106, 1989.

    Article  MathSciNet  Google Scholar 

  14. J. Duchon. Spline minimizing rotation-invariant semi-norms in Sobolev spaces. In W. Schempp and K. Zeller, editors, Constructive theory of functions os several variables, Lecture Notes in Mathematics, 571. Springer-Verlag, Berlin, 1977.

    Google Scholar 

  15. R.M. Dudley. Comments on two preprints: Barron (1991), Jones (1991). Personal communication, March 1991.

    Google Scholar 

  16. N. Dyn. Interpolation of scattered data by radial functions. In C.K. Chui, L.L. Schumaker, and F.I. Utreras, editors, Topics in multivariate approximation. Academic Press, New York, 1987.

    Google Scholar 

  17. N. Dyn. Interpolation and approximation by radial and related functions. In C.K. Chui, L.L. Schumaker, and D.J. Ward, editors, Approximation Theory, VI, pages 211–234. Academic Press, New York, 1991.

    Google Scholar 

  18. R. Franke. Scattered data interpolation: tests of some method. Math. Comp., 38(5):181–200, 1982.

    MATH  MathSciNet  Google Scholar 

  19. R. Franke. Recent advances in the approximation of surfaces from scattered data. In C.K. Chui, L.L. Schumaker, and F.I. Utreras, editors, Topics in multivariate approximation. Academic Press, New York, 1987.

    Google Scholar 

  20. J.H. Friedman and W. Stuetzle. Projection pursuit regression. Journal of the American Statistical Association, 76(376):817–823, 1981.

    Article  MathSciNet  Google Scholar 

  21. F. Girosi. On some extensions of radial basis functions and their applications in artificial intelligence. Computers Math. Applic., 24(12):61–80, 1992.

    Article  MATH  MathSciNet  Google Scholar 

  22. F. Girosi. Rates of convergence of approximation by translates and dilates of a given function. 1993a. (in preparation).

    Google Scholar 

  23. F. Girosi and G. Anzellotti. Rates of convergence of approximation by translates. A.I. Memo 1288, Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 1992.

    Google Scholar 

  24. F. Girosi and G. Anzellotti. Rates of convergence for radial basis functions and neural networks. In Artificial Neural Networks with Applications in Speech and Vision, London, 1993. Chapman & Hall.

    Google Scholar 

  25. F. Girosi, M. Jones, and T. Poggio. Priors, stabilizers and basis functions: From regularization to radial, tensor and additive splines. A.I. Memo No. 1430, Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 1993.

    Google Scholar 

  26. R.L. Harder and R.M. Desmarais. Interpolation using surface splines. J. Aircraft, 9:189–191, 1972.

    Article  Google Scholar 

  27. R.L. Hardy. Theory and applications of the multiquadric-biharmonic method. Computers Math. Applic., 19(8/9):163–208, 1990.

    Article  MATH  MathSciNet  Google Scholar 

  28. T. Hastie and R. Tibshirani. Generalized additive models. Statistical Science, 1:297–318, 1986.

    Article  MathSciNet  Google Scholar 

  29. T. Hastie and R. Tibshirani. Generalized additive models: some applications. J. Amer. Statistical Assoc, 82:371–386, 1987.

    Article  MATH  MathSciNet  Google Scholar 

  30. T. Hastie and R. Tibshirani. Generalized Additive Models, volume 43 of Monographs on Statistics and Applied Probability. Chapman and Hall, London, 1990.

    Google Scholar 

  31. P.J. Huber. Projection pursuit. Annals of Statistics, 13(2):435–475, 1985.

    Article  MATH  MathSciNet  Google Scholar 

  32. L.K. Jones. A simple lemma on greedy approximation in Hubert space and convergence rates for Projection Pursuit Regression and neural network training. Annals of Statistics, 20(1):608–613, March 1992.

    Article  MATH  MathSciNet  Google Scholar 

  33. E.J. Kansa. Multiquadrics — a scattered data approximation scheme with applications to computational fluid dynamics —. Computers Math. Applic., 19(8/9), 1990.

    Google Scholar 

  34. G. G. Lorentz. Approximation of Functions. Chelsea Publishing Co., New York, 1986.

    MATH  Google Scholar 

  35. W.R. Madych and S.A. Nelson. Multivariate interpolation and conditionally positive definite functions. II. Mathematics of Computation, 54(189): 211–230, January 1990.

    Article  MATH  MathSciNet  Google Scholar 

  36. B. Maurey. In: “Remarques sur un resultat non publiè de B. Maurey” by G. Pisier. In Centre de Mathematique, editor, Seminarte d’analyse fonctionelle 1980–1981, Palaiseau, 1981.

    Google Scholar 

  37. J. Meinguet. Multivariate interpolation at arbitrary points made simple. J. Appl. Math. Phys., 30:292–304, 1979.

    Article  MATH  MathSciNet  Google Scholar 

  38. C. A. Micchelli. Interpolation of scattered data: distance matrices and conditionally positive definite functions. Constr. Approx., 2:11–22, 1986.

    Article  MATH  MathSciNet  Google Scholar 

  39. C. A. Micchelli. Interpolation of scattered data: distance matrices and conditionally positive definite functions. Constr. Approx., 2:11–22, 1986.

    Article  MATH  MathSciNet  Google Scholar 

  40. J. Moody and C. Darken. Fast learning in networks of locally-tuned processing units. Neural Computation, 1(2):281–294, 1989.

    Article  Google Scholar 

  41. V.A. Morozov. Methods for solving incorrectly posed problems. Springer-Verlag, Berlin, 1984.

    Book  Google Scholar 

  42. A. Pinkus. N-widths in Approximation Theory. Springer-Verlag, New York, 1986.

    Google Scholar 

  43. T. Poggio and F. Girosi. A theory of networks for approximation and learning. A.I. Memo No. 1140, Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 1989.

    Google Scholar 

  44. T. Poggio and F. Girosi. Networks for approximation and learning. Proceedings of the IEEE, 78(9), September 1990.

    Google Scholar 

  45. T. Poggio and F. Girosi. Regularizaron algorithms for learning that are equivalent to multilayer networks. Science, 247:978–982, 1990.

    Article  MATH  MathSciNet  Google Scholar 

  46. M. J. D. Powell. Radial basis functions for multivariable interpolation: a review. In J. C. Mason and M. G. Cox, editors, Algorithms for Approximation. Clarendon Press, Oxford, 1987.

    Google Scholar 

  47. M.J.D. Powell. The theory of radial basis functions approximation in 1990. Technical Report NAH, Department of Applied Mathematics and Theoretical Physics, Cambridge, England, December 1990.

    Google Scholar 

  48. N. Sivakumar and J.D. Ward. On the best least square fit by radial functions to multidimensional scattered data. Technical Report 251, Center for Approximation Theory, Texas A & M University, June 1991.

    Google Scholar 

  49. E.M. Stein. Singular integrals and differentiability properties of functions. Princeton, N.J., Princeton University Press, 1970.

    MATH  Google Scholar 

  50. A. N. Tikhonov. Solution of incorrectly formulated problems and the regularization method. Soviet Math. Dokl., 4:1035–1038, 1963.

    Google Scholar 

  51. A. N. Tikhonov and V. Y. Arsenin. Solutions of Ill-posed Problems. W. H. Winston, Washington, D.C., 1977.

    MATH  Google Scholar 

  52. V. N. Vapnik. Estimation of Dependences Based on Empirical Data. Springer-Verlag, Berlin, 1982.

    MATH  Google Scholar 

  53. G. Wahba. Practical approximate solutions to linear operator equations when the data are noisy. SIAM J. Numer. Anal., 14, 1977.

    Google Scholar 

  54. G. Wahba. Splines Models for Observational Data. Series in Applied Mathematics, Vol. 59, SIAM, Philadelphia, 1990.

    Book  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 1994 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Girosi, F. (1994). Regularization Theory, Radial Basis Functions and Networks. In: Cherkassky, V., Friedman, J.H., Wechsler, H. (eds) From Statistics to Neural Networks. NATO ASI Series, vol 136. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-79119-2_8

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-79119-2_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-79121-5

  • Online ISBN: 978-3-642-79119-2

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics