Suboptimal Behavior of Bayes and MDL in Classification Under Misspecification

Grünwald, Peter; Langford, John

doi:10.1007/978-3-540-27819-1_23

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3120)

Included in the following conference series:

International Conference on Computational Learning Theory

Abstract

We show that forms of Bayesian and MDL inference that are often applied to classification problems can be inconsistent. This means there exists a learning problem such that for all amounts of data the generalization errors of the MDL classifier and the Bayes classifier relative to the Bayesian posterior both remain bounded away from the smallest achievable generalization error.
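
One way to make the inconsistency claim concrete is sketched below. The notation is an assumption of this sketch and does not come from the abstract: $D$ is the data-generating distribution, $\mathcal{C}$ the set of candidate classifiers, $S$ a sample of $m$ labeled examples, $\hat{c}_S$ the classifier returned by the learning procedure (MDL or the Bayes classifier based on the posterior), and $K$ a positive constant. Reading "generalization error" as an expected error, and taking the smallest achievable error over $\mathcal{C}$, are likewise assumptions.

```latex
% Assumed notation (not from the source); requires amsmath and amssymb.
\[
\operatorname{err}_D(c) = \Pr_{(x,y)\sim D}\bigl[c(x) \neq y\bigr],
\]
\[
\exists\, D,\ \exists\, K > 0 \ \text{such that}\ \forall m:\quad
\mathbb{E}_{S \sim D^m}\bigl[\operatorname{err}_D(\hat{c}_S)\bigr]
\;\ge\; \inf_{c \in \mathcal{C}} \operatorname{err}_D(c) + K .
\]
```

Under this reading, consistency would require the gap between the two sides to vanish as $m$ grows; the abstract's claim is that a learning problem exists for which it does not.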

Author information

Authors and Affiliations

CWI Amsterdam
Peter Grünwald
TTI-Chicago
John Langford

Editor information

Editors and Affiliations

The Centre for Computational Statistics and Machine Learning, Department of Computer Science, University College London, Gower St., London WC1E 6BT
John Shawe-Taylor
Google, 1600 Amphitheater Parkway, Mountain View, CA 94043, USA
Yoram Singer

About this paper

Cite this paper

Grünwald, P., Langford, J. (2004). Suboptimal Behavior of Bayes and MDL in Classification Under Misspecification. In: Shawe-Taylor, J., Singer, Y. (eds) Learning Theory. COLT 2004. Lecture Notes in Computer Science, vol 3120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27819-1_23

DOI: https://doi.org/10.1007/978-3-540-27819-1_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22282-8
Online ISBN: 978-3-540-27819-1
eBook Packages: Springer Book Archive
