Abstract
We develop the idea that the PAC-Bayes prior can be informed by the data-generating distribution. We prove sharp bounds for an existing framework, develop insights into function class complexity in this model, and suggest means of controlling it with new algorithms. In particular, we consider controlling capacity with respect to the unknown geometry of the data-generating distribution. Finally, we extend this localization to more practical learning methods.
Keywords
- Empirical Risk
- Structural Risk Minimization
- Machine Learn Research
- Intrinsic Geometry
- Empirical Counterpart
These keywords were added by machine and not by the authors.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lever, G., Laviolette, F., Shawe-Taylor, J. (2010). Distribution-Dependent PAC-Bayes Priors. In: Hutter, M., Stephan, F., Vovk, V., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 2010. Lecture Notes in Computer Science, vol 6331. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16108-7_13
DOI: https://doi.org/10.1007/978-3-642-16108-7_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16107-0
Online ISBN: 978-3-642-16108-7
eBook Packages: Computer Science (R0)