Abstract
We propose a shrinkage procedure for simultaneous variable selection and estimation in generalized linear models (GLMs) with an explicit predictive motivation. The procedure estimates the coefficients by minimizing the Kullback-Leibler divergence of a set of predictive distributions to the corresponding predictive distributions for the full model, subject to an ℓ1 constraint on the coefficient vector. This results in selection of a parsimonious model with similar predictive performance to the full model. Thanks to its similar form to the original Lasso problem for GLMs, our procedure can benefit from available ℓ1-regularization path algorithms. Simulation studies and real data examples confirm the efficiency of our method in terms of predictive performance on future observations.
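To make the idea concrete, here is a minimal sketch for the simplest special case, a Gaussian linear model with a common known variance. It is an illustration under stated assumptions, not the authors' implementation. In that case the Kullback-Leibler divergence between N(x'β_full, σ²) and N(x'β, σ²) is (x'β_full − x'β)² / (2σ²), so minimizing the summed divergence under an ℓ1 penalty amounts to a Lasso fit with the full-model fitted values as the response. The ordinary least squares fit used as the "full model", the plug-in predictive distributions, the simulated data, the penalty value `lam`, and the helper names `soft_threshold` and `lasso_cd` are all assumptions introduced here.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used in Lasso coordinate descent."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=500):
    """Minimize (1/2n)||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y - X @ b                       # current residual
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]         # remove coordinate j from the fit
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]         # add the updated coordinate back
    return b

# Simulated data (assumption: purely illustrative, not from the paper).
rng = np.random.default_rng(0)
n, p = 200, 8
X = rng.normal(size=(n, p))
beta_true = np.array([3.0, 1.5, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0])
y = X @ beta_true + rng.normal(size=n)

# "Full model" stand-in: ordinary least squares on all predictors (assumption).
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)
y_full = X @ beta_full                  # plug-in predictive means of the full model

# Gaussian case: the KL objective with an l1 penalty is a Lasso on y_full.
beta_pred = lasso_cd(X, y_full, lam=0.3)
print(np.round(beta_pred, 3))
```

Because the objective has the same form as an ordinary Lasso problem, any existing ℓ1 solver could replace the hand-written coordinate descent. For non-Gaussian GLMs the divergence is no longer a squared error, and this sketch does not cover that case.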
Additional information
The authors would like to thank the Editor, Associate Editor and referees for insightful comments which helped to improve the manuscript.
About this article
Cite this article
Tran, MN., Nott, D.J. & Leng, C. The predictive Lasso. Stat Comput 22, 1069–1084 (2012). https://doi.org/10.1007/s11222-011-9279-3
DOI: https://doi.org/10.1007/s11222-011-9279-3