Abstract
Schapire and Singer's improved version of AdaBoost for handling weak hypotheses with confidence-rated predictions represents an important advance in the theory and practice of boosting. Its success results from a more efficient use of the information in weak hypotheses during updating. Instead of simple binary voting, a weak hypothesis is allowed to vote for or against a classification with a variable strength or confidence. The Pool Adjacent Violators (PAV) algorithm is a method for converting a score into a probability. We show how PAV may be applied to a weak hypothesis to yield a new weak hypothesis which is, in a sense, an ideal confidence-rated prediction, and that this leads to an optimal updating for AdaBoost. The result is a new algorithm which we term PAV-AdaBoost. We give several examples illustrating problems for which this new algorithm provides advantages in performance.
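As a rough illustration of the score-to-probability conversion that the abstract attributes to PAV, the following is a minimal sketch of generic pool-adjacent-violators (isotonic) regression on binary labels. It is not the PAV-AdaBoost algorithm itself, and the function names `pav` and `pav_predict`, the block representation, and the example data are assumptions made for illustration only.

```python
import bisect
from itertools import groupby


def pav(scores, labels):
    """Fit a non-decreasing step function from score to estimated P(label == 1).

    Hypothetical helper: returns a list of (upper_score, probability) pairs,
    one per pooled block, in increasing score order.
    """
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block is [sum_of_labels, count, largest_score_in_block]
    # Examples with identical scores are grouped first so they share one estimate.
    for score, group in groupby(pairs, key=lambda p: p[0]):
        ys = [y for _, y in group]
        blocks.append([float(sum(ys)), len(ys), score])
        # Pool while the previous block's mean exceeds the newest block's mean
        # (means are compared by cross-multiplication to avoid division).
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s, n, hi = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
            blocks[-1][2] = hi
    return [(hi, s / n) for s, n, hi in blocks]


def pav_predict(model, score):
    """Look up the probability of the first block whose upper score >= score."""
    uppers = [u for u, _ in model]
    i = bisect.bisect_left(uppers, score)
    return model[min(i, len(model) - 1)][1]


# Made-up data: six scored examples with binary labels.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
labels = [0, 1, 0, 0, 1, 1]
model = pav(scores, labels)
```

In this sketch the examples scored 0.2, 0.3, and 0.4 violate monotonicity and are pooled into a single block, so all three receive the same estimate, the fraction of positives in the pooled block.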
References
Apte, C., Damerau, F., & Weiss, S. (1998). Text mining with decision rules and decision trees. Conference Proceedings The Conference on Automated Learning and Discovery, CMU.
Aslam, J. (2000). Improving algorithms for boosting. Conference Proceedings 13th COLT. Palo Alto, California.
Ayer, M., Brunk, H. D., Ewing, G. M., Reid, W. T., & Silverman, E. (1955). An empirical distribution function for sampling with incomplete information. Annals of Mathematical Statistics, 26, 641–647.
Bennett, K. P., Demiriz, A., & Shawe-Taylor, J. (2000). A column generation algorithm for boosting. Conference Proceedings 17th ICML.
Buja, A., Hastie, T., & Tibshirani, R. (1989). Linear smoothers and additive models. The Annals of Statistics, 17:2, 453–555.
Burges, C. J. C. (1999). A tutorial on support vector machines for pattern recognition (Available electronically from the author): Bell Laboratories, Lucent Technologies.
Carreras, X., & Marquez, L. (2001). Boosting trees for anti-spam email filtering. Conference Proceedings RANLP2001, Tzigov Chark, Bulgaria, September 5–7, 2001.
Collins, M., Schapire, R. E., & Singer, Y. (2002). Logistic regression, AdaBoost and Bregman distances. Machine Learning, 48:1, 253–285.
Duda, R. O., Hart, P. E., & Stork, D. G. (2000). Pattern Classification (2nd edn.). New York: John Wiley & Sons, Inc.
Duffy, N., & Helmbold, D. (1999). Potential boosters? Conference Proceedings Advances in Neural Information Processing Systems 11.
Duffy, N., & Helmbold, D. (2000). Leveraging for regression. Conference Proceedings 13th Annual Conference on Computational Learning Theory. San Francisco.
Freund, Y., & Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55:1, 119–139.
Friedman, J., Hastie, T., & Tibshirani, R. (2000). Additive logistic regression: A statistical view of boosting. The Annals of Statistics, 28:2, 337–374.
Hardle, W. (1991). Smoothing Techniques: With Implementation in S. New York: Springer-Verlag.
Johnson, M., Geman, S., Canon, S., Chi, Z., & Riezler, S. (1999). Estimators for stochastic “unification-based” grammars. Conference Proceedings ACL'99. Univ. Maryland.
Kim, W., Aronson, A. R., & Wilbur, W. J. (2001). Automatic MeSH term assignment and quality assessment. Conference Proceedings Proc. AMIA Symp. Washington, D.C.
Kim, W. G., & Wilbur, W. J. (2001). Corpus-based statistical screening for content-bearing terms. Journal of the American Society for Information Science, 52:3, 247–259.
Langley, P., & Sage, S. (1994). Induction of selective Bayesian classifiers. Conference Proceedings Tenth Conference on Uncertainty in Artificial Intelligence, Seattle, WA.
Maclin, R. (1998). Boosting classifiers locally. Conference Proceedings of AAAI.
Mason, L., Bartlett, P. L., & Baxter, J. (2000). Improved generalization through explicit optimization of margins. Machine Learning, 38, 243–255.
McCallum, A., & Nigam, K. (1998). A comparison of event models for naive bayes text classification. Conference Proceedings AAAI-98 Workshop on Learning for Text Categorization.
Meir, R., El-Yaniv, R., & Ben-David, S. (2000). Localized boosting. Conference Proceedings 13th COLT. Palo Alto, California.
Mitchell, T. M. (1997). Machine learning. Boston: WCB/McGraw-Hill.
Moerland, P., & Mayoraz, E. (1999). DynamBoost: combining boosted hypotheses in a dynamic way (Technical Report RR 99-09): IDIAP Switzerland.
Nock, R., & Sebban, M. (2001). A Bayesian boosting theorem. Pattern Recognition Letters, 22, 413–419.
Pardalos, P. M., & Xue, G. (1999). Algorithms for a class of isotonic regression problems. Algorithmica, 23, 211–222.
Ratsch, G., Mika, S., & Warmuth, M. K. (2001). On the Convergence of Leveraging (NeuroCOLT2 Technical Report 98). London: Royal Holloway College.
Ratsch, G., Onoda, T., & Muller, K.-R. (2001). Soft margins for AdaBoost. Machine Learning, 42, 287–320.
Robertson, S. E., & Walker, S. (1994). Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval. Conference Proceedings 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.
Schapire, R. E., & Singer, Y. (1999). Improved boosting algorithms using confidence-rated predictions. Machine Learning, 37:3, 297–336.
Vapnik, V. (1998). Statistical Learning Theory. New York: John Wiley & Sons, Inc.
Witten, I. H., Moffat, A., & Bell, T. C. (1999). Managing Gigabytes (2nd edn.). San Francisco: Morgan-Kaufmann Publishers, Inc.
Zhang, T., & Oles, F. J. (2001). Text categorization based on regularized linear classification methods. Information Retrieval, 4:1, 5–31.
Additional information
Editor: Robert Schapire
Cite this article
Wilbur, W.J., Yeganova, L. & Kim, W. The Synergy Between PAV and AdaBoost. Mach Learn 61, 71–103 (2005). https://doi.org/10.1007/s10994-005-1123-6
DOI: https://doi.org/10.1007/s10994-005-1123-6