Abstract
Estimating high-dimensional graphical models and choosing the regularization parameter for dependent data remain crucial problems. Classical methods such as Akaike’s information criterion (AIC) and the Bayesian information criterion (BIC) address this problem, and more recent methods have also been proposed, such as stability selection, the stability approach to regularization selection (StARS), and extensions of AIC and BIC that are more appropriate for high-dimensional datasets. In this review, we give an overview of these methods and present their consistency properties for the graphical lasso. Then, we evaluate the performance of these approaches on real datasets. Finally, we present the theoretical background of our proposed model selection criterion, which is based on the KL-divergence and bootstrap computation and is particularly recommended for sparse biological networks.
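As an illustration of the kind of high-dimensional BIC extension mentioned above, the following minimal sketch scores a candidate precision matrix with the extended BIC for Gaussian graphical models, EBIC = -2 l_n(Θ) + |E| log n + 4 γ |E| log p, where l_n(Θ) = (n/2)(log det Θ - tr(SΘ)) up to an additive constant. It is not the chapter's own implementation: the function names, the edge threshold `tol`, and the default `gamma=0.5` are assumptions chosen for illustration, and the criterion is applied here to whatever estimate is passed in (for example a graphical lasso fit) rather than to a refitted maximum likelihood estimate.

```python
import math


def log_det_spd(theta):
    """Log-determinant of a symmetric positive definite matrix via Cholesky."""
    p = len(theta)
    chol = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                d = theta[i][i] - s
                if d <= 0:
                    raise ValueError("matrix is not positive definite")
                chol[i][j] = math.sqrt(d)
            else:
                chol[i][j] = (theta[i][j] - s) / chol[j][j]
    # det(theta) = prod(diag(chol))^2
    return 2.0 * sum(math.log(chol[i][i]) for i in range(p))


def gaussian_loglik(S, theta, n):
    """Gaussian log-likelihood up to a constant: (n/2)(log det theta - tr(S theta))."""
    p = len(theta)
    trace = sum(S[i][j] * theta[j][i] for i in range(p) for j in range(p))
    return 0.5 * n * (log_det_spd(theta) - trace)


def ebic(S, theta, n, gamma=0.5, tol=1e-8):
    """Extended BIC of a precision matrix theta given sample covariance S.

    gamma = 0 gives the ordinary BIC; larger gamma penalizes edges more
    strongly when the number of variables p is large.
    tol (hypothetical) decides which off-diagonal entries count as edges.
    """
    p = len(theta)
    n_edges = sum(
        1 for i in range(p) for j in range(i + 1, p) if abs(theta[i][j]) > tol
    )
    return (
        -2.0 * gaussian_loglik(S, theta, n)
        + n_edges * math.log(n)
        + 4.0 * gamma * n_edges * math.log(p)
    )
```

In practice one would compute this score for the precision matrix estimated at each value of the regularization parameter and keep the value with the smallest score.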
Acknowledgement
The authors thank Ms Gül Bahar Bülbül for her help in preparing the tables.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Kaygusuz, M.A., Purutçuoğlu, V. (2020). The Model Selection Methods for Sparse Biological Networks. In: Hemanth, D., Kose, U. (eds) Artificial Intelligence and Applied Mathematics in Engineering Problems. ICAIAME 2019. Lecture Notes on Data Engineering and Communications Technologies, vol 43. Springer, Cham. https://doi.org/10.1007/978-3-030-36178-5_10
DOI: https://doi.org/10.1007/978-3-030-36178-5_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36177-8
Online ISBN: 978-3-030-36178-5
eBook Packages: Intelligent Technologies and Robotics (R0)