Skip to main content

Suboptimal Behavior of Bayes and MDL in Classification Under Misspecification

  • Conference paper
Learning Theory (COLT 2004)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 3120))

Included in the following conference series:

Abstract

We show that forms of Bayesian and MDL inference that are often applied to classification problems can be inconsistent. This means there exists a learning problem such that for all amounts of data the generalization errors of the MDL classifier and the Bayes classifier relative to the Bayesian posterior both remain bounded away from the smallest achievable generalization error.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Barron, A.R.: Information-theoretic characterization of Bayes performance and the choice of priors in parametric and nonparametric problems. In: Bayesian Statistics, vol. 6, pp. 27–52. Oxford University Press, Oxford (1998)

    Google Scholar 

  2. Barron, A.R., Rissanen, J., Yu, B.: The MDL Principle in coding and modeling. IEEE Trans. Inform. Theory 44(6), 2743–2760 (1998)

    Article  MATH  MathSciNet  Google Scholar 

  3. Barron, A.R.: Complexity regularization with application to artificial neural networks. In: Nonparametric Functional Estimation and Related Topics, pp. 561–576. Kluwer Academic Publishers, Dordrecht (1990)

    Google Scholar 

  4. Barron, A.R., Cover, T.M.: Minimum complexity density estimation. IEEE Trans. Inform. Theory 37(4), 1034–1054 (1991)

    Article  MATH  MathSciNet  Google Scholar 

  5. Bernardo, J.M., Smith, A.F.M.: Bayesian theory. John Wiley, Chichester (1994)

    Book  MATH  Google Scholar 

  6. Blackwell, D., Dubins, L.: Merging of opinions with increasing information. The Annals of Mathematical Statistics 33, 882–886 (1962)

    Article  MATH  MathSciNet  Google Scholar 

  7. Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.: Occam’s razor. Information Processing Letters 24, 377–380 (1987)

    Article  MATH  MathSciNet  Google Scholar 

  8. Bunke, O., Milhaud, X.: Asymptotic behaviour of Bayes estimates under possibly incorrect models. The Annals of Statistics 26, 617–644 (1998)

    Article  MATH  MathSciNet  Google Scholar 

  9. Cover, T.M., Thomas, J.A.: Elements of Information Theory. Wiley, Chichester (1991)

    Book  MATH  Google Scholar 

  10. Diaconis, P., Freedman, D.: On the consistency of Bayes estimates. The Annals of Statitics 14(1), 1–26 (1986)

    Article  MATH  MathSciNet  Google Scholar 

  11. Grünwald, P.D.: MDL tutorial. In: Grünwald, P.D., Myung, I.J., Pitt, M.A. (eds.) Minimum Description Length: recent developments in theory and practice, ch.1, MIT Press, Cambridge (2004) (to appear)

    Google Scholar 

  12. Grünwald, P.D.: The Minimum Description Length Principle and Reasoning under Uncertainty. PhD thesis, University of Amsterdam, The Netherlands (1998)

    Google Scholar 

  13. Heckerman, D., Chickering, D.M., Meek, C., Rounthwaite, R., Kadie, C.: Dependency networks for inference, collaborative filtering, and data visualization. Journal of Machine Learning Research 1, 49–75 (2000)

    Article  Google Scholar 

  14. Jordan, M.I.: Why the logistic funtion? a tutorial discussion on probabilities and neural networks. Computational Cognitive Science Tech. Rep. 9503, MIT (1995)

    Google Scholar 

  15. Kearns, M., Mansour, Y., Ng, A.Y., Ron, D.: An experimental and theoretical comparison of model selection methods. Machine Learning 27, 7–50 (1997)

    Article  Google Scholar 

  16. Kleijn, B., van der Vaart, A.: Misspecification in infinite-dimensional Bayesian statistics. (2004) (submitted)

    Google Scholar 

  17. Li, J.K.: Estimation of Mixture Models. PhD thesis, Yale University, Department of Statistics (1997)

    Google Scholar 

  18. McAllester, D.: PAC-Bayesian model averaging. In: Proceedings COLT 1999 (1999)

    Google Scholar 

  19. Meir, R., Merhav, N.: On the stochastic complexity of learning realizable and unrealizable rules. Machine Learning 19, 241–261 (1995)

    MATH  Google Scholar 

  20. Quinlan, J., Rivest, R.: Inferring decision trees using the minimum description length principle. Information and Computation 80, 227–248 (1989)

    Article  MATH  MathSciNet  Google Scholar 

  21. Rissanen, J.: Stochastic Complexity in Statistical Inquiry. World Scientific, Singapore (1989)

    MATH  Google Scholar 

  22. Tipping, M.E.: Sparse Bayesian learning and the relevance vector machine. Journal of Machine Learning Research 1, 211–244 (2001)

    Article  MATH  MathSciNet  Google Scholar 

  23. Viswanathan, M., Wallace, C.S., Dowe, D.L., Korb, K.B.: Finding cutpoints in noisy binary sequences - a revised empirical evaluation. In: Foo, N.Y. (ed.) AI 1999. LNCS, vol. 1747, pp. 405–416. Springer, Heidelberg (1999)

    Chapter  Google Scholar 

  24. Yamanishi, K.: A decision-theoretic extension of stochastic complexity and its applications to learning. IEEE Trans. Inform. Theory 44(4), 1424–1439 (1998)

    Article  MATH  MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Grünwald, P., Langford, J. (2004). Suboptimal Behavior of Bayes and MDL in Classification Under Misspecification. In: Shawe-Taylor, J., Singer, Y. (eds) Learning Theory. COLT 2004. Lecture Notes in Computer Science(), vol 3120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27819-1_23

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-27819-1_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22282-8

  • Online ISBN: 978-3-540-27819-1

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics