Abstract
We establish a generic theoretical tool for constructing probabilistic bounds for algorithms whose output is a subset of objects from an initial pool of candidates (or, more generally, a probability distribution on that pool). This general device, dubbed “Occam’s hammer”, acts as a meta-layer when a probabilistic bound is already known for each object of the pool taken individually, and aims at controlling the proportion of objects in the output set that fail to satisfy their individual bound. In this regard, it can be seen as a non-trivial generalization of the “union bound with a prior” (“Occam’s razor”), a familiar tool in learning theory. We give applications of this principle to randomized classifiers (providing an interesting alternative approach to PAC-Bayes bounds) and to multiple testing (where it makes it possible to exactly recover and extend the so-called Benjamini-Yekutieli testing procedure).
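For context on the multiple-testing application mentioned above, the Benjamini-Yekutieli procedure that the paper recovers is the classical step-up test controlling the false discovery rate under arbitrary dependence. The following is a minimal sketch of that standard procedure (not the authors' code, and not the Occam's-hammer derivation itself): reject the hypotheses with the k largest-rank p-values satisfying p_(k) ≤ kα / (m · c(m)), where c(m) is the m-th harmonic number.

```python
def benjamini_yekutieli(p_values, alpha=0.05):
    """Benjamini-Yekutieli step-up procedure for controlling the
    false discovery rate at level alpha under arbitrary dependence.

    Returns the (sorted) indices of the rejected hypotheses.
    """
    m = len(p_values)
    # Harmonic correction c(m) = 1 + 1/2 + ... + 1/m.
    c_m = sum(1.0 / k for k in range(1, m + 1))
    # Ranks of hypotheses by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up: find the largest rank k with p_(k) <= k * alpha / (m * c(m)).
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / (m * c_m):
            k_max = rank
    # Reject the hypotheses with the k_max smallest p-values.
    return sorted(order[:k_max])
```

For example, with three p-values (0.0001, 0.02, 0.8) at α = 0.05, the corrected rank-wise thresholds are roughly 0.0091, 0.0182, 0.0273, so only the first hypothesis is rejected.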
References
Audibert, J.-Y.: Data-dependent generalization error bounds for (noisy) classification: a PAC-Bayesian approach. Technical Report PMA-905, Laboratoire de Probabilités et Modèles Aléatoires, Universités Paris 6 and Paris 7 (2004)
Benjamini, Y., Hochberg, Y.: Controlling the false discovery rate – a practical and powerful approach to multiple testing. J. Roy. Stat. Soc. B 57(1), 289–300 (1995)
Benjamini, Y., Yekutieli, D.: The control of the false discovery rate in multiple testing under dependency. Annals of Statistics 29(4), 1165–1188 (2001)
Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.: Occam’s razor. Information Processing Letters 24, 377–380 (1987)
Catoni, O.: A PAC-Bayesian approach to adaptive classification. Technical report, LPMA, Université Paris 6 (submitted to Annals of Statistics) (2004)
Gavin, G., Gelly, S., Guermeur, Y., Lallich, S., Mary, J., Sebag, M., Teytaud, O.: PASCAL theoretical challenge. Type I and type II errors for multiple simultaneous hypothesis testing, http://www.lri.fr/~teytaud/risq
Langford, J.: Tutorial on practical prediction theory for classification. Journal of Machine Learning Research 6, 273–306 (2005)
Langford, J., Blum, A.: Microchoice bounds and self bounding learning algorithms. Machine Learning (First communicated at COLT’99) 51(2), 165–179 (2003)
McAllester, D.: Bayesian stochastic model selection. Machine Learning (First communicated at COLT’98 and ’99) 51(1), 5–21 (2003)
Zhang, T.: Information theoretical upper and lower bounds for statistical estimation. IEEE Transactions on Information Theory 52(4), 1307–1321 (2006)
© 2007 Springer Berlin Heidelberg
Cite this paper
Blanchard, G., Fleuret, F. (2007). Occam’s Hammer. In: Bshouty, N.H., Gentile, C. (eds) Learning Theory. COLT 2007. Lecture Notes in Computer Science(), vol 4539. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72927-3_10
Print ISBN: 978-3-540-72925-9
Online ISBN: 978-3-540-72927-3