Credit Risk Scoring: A Stacking Generalization Approach

Raimundo, Bernardo; Bravo, Jorge M.

doi:10.1007/978-3-031-45642-8_38

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 799)

Included in the following conference series:

World Conference on Information Systems and Technologies

Abstract

Forecasting the creditworthiness of customers in new and existing loan contracts is central to lenders’ activity. Credit scoring uses analytical methods to transform historical loan application and loan performance data into credit scores that signal creditworthiness, inform credit decisions, determine credit limits and loan rates, and assist in fraud detection, delinquency intervention, and loss mitigation. The standard approach to credit scoring takes a “winner-take-all” perspective: for each dataset, a single statistical learning or machine learning classifier believed to be the “best” is selected from a set of candidates using some method or criterion, often neglecting model uncertainty. This paper empirically compares the predictive accuracy of single classifiers with that of the stacking generalization approach in credit risk modelling, using real-world peer-to-peer lending data. The findings show that stacking ensembles consistently outperform most traditional individual credit scoring models in predicting the default probability. Moreover, the findings show that feature selection and hyperparameter tuning improve the performance of both the individual credit risk models and the super-learner scoring algorithm, helping models to be simpler, more comprehensive, and less prone to classification error. Improving credit scoring models to better identify loan delinquency can substantially reduce loan impairments and losses, improving the financial performance of credit institutions.
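To make the stacking idea concrete, the following is a minimal standard-library sketch of stacked generalization. It is not the authors' pipeline: the abstract does not name the base learners, the meta-learner, the number of folds, or the features. The synthetic data from `make_loans`, the two single-feature logistic base models, the logistic meta-learner, and all function names are assumptions chosen only for illustration. The point it shows is that the meta-learner is trained on out-of-fold base predictions, so it never sees a prediction a base model made on its own training rows.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_loans(n, seed):
    """Hypothetical synthetic data: two features, label 1 = default."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x0, x1 = rng.gauss(0, 1), rng.gauss(0, 1)
        X.append((x0, x1))
        y.append(1 if rng.random() < sigmoid(1.5 * x0 + 1.5 * x1) else 0)
    return X, y

def fit_logistic(X, y, epochs=200, lr=0.5):
    """Batch gradient-descent logistic regression; returns a probability function."""
    d, n = len(X[0]), len(X)
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            gb += err
            for j in range(d):
                gw[j] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return lambda x: sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def fit_stack(X, y, views, k=5):
    """Level 0: one base model per feature view. Level 1: meta-model on their outputs."""
    n = len(X)
    oof = [[0.0] * len(views) for _ in range(n)]  # out-of-fold base predictions
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held = [i for i in range(n) if i % k == fold]
        for m, view in enumerate(views):
            model = fit_logistic([view(X[i]) for i in train], [y[i] for i in train])
            for i in held:
                oof[i][m] = model(view(X[i]))
    meta = fit_logistic(oof, y)  # meta-learner sees only out-of-fold predictions
    bases = [fit_logistic([view(x) for x in X], y) for view in views]  # refit on all rows
    return lambda x: meta([base(view(x)) for base, view in zip(bases, views)])

def accuracy(predict, X, y):
    """Share of rows where the 0.5-thresholded probability matches the label."""
    return sum((predict(x) >= 0.5) == (yi == 1) for x, yi in zip(X, y)) / len(y)

# Each hypothetical base learner sees a single feature only.
views = [lambda x: (x[0],), lambda x: (x[1],)]
X_train, y_train = make_loans(300, seed=0)
X_test, y_test = make_loans(200, seed=1)

stacked = fit_stack(X_train, y_train, views)
stack_acc = accuracy(stacked, X_test, y_test)
print(f"stacked accuracy on held-out synthetic data: {stack_acc:.3f}")
```

In practice one would more likely use an established implementation with heterogeneous base classifiers and tuned hyperparameters. The sketch only exposes the out-of-fold mechanism that distinguishes stacking from picking a single "best" model.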

Acknowledgements

This work has been supported by Fundação para a Ciência e a Tecnologia, grants UIDB/04152/2020 - Centro de Investigação em Gestão de Informação (MagIC) and UIDB/00315/2020 (BRU-ISCTE-IUL).

Author information

Authors and Affiliations

NOVA IMS - Universidade Nova de Lisboa, Lisbon, Portugal
Bernardo Raimundo
NOVA IMS - Universidade Nova de Lisboa & MagIC & Université Paris-Dauphine PSL & ISCTE-IUL Business Research Unit (BRU-IUL) & CEFAGE-UE, Lisbon, Portugal
Jorge M. Bravo

Corresponding author

Correspondence to Jorge M. Bravo.

Editor information

Editors and Affiliations

ISEG, Universidade de Lisboa, Lisbon, Cávado, Portugal
Alvaro Rocha
College of Engineering, The Ohio State University, Columbus, OH, USA
Hojjat Adeli
Institute of Data Science and Digital Technologies, Vilnius University, Vilnius, Lithuania
Gintautas Dzemyda
DCT, Universidade Portucalense, Porto, Portugal
Fernando Moreira
TeCIP Institute, Scuola Superiore Sant’Anna, Pisa, Italy
Valentina Colla

About this paper

Cite this paper

Raimundo, B., Bravo, J.M. (2024). Credit Risk Scoring: A Stacking Generalization Approach. In: Rocha, A., Adeli, H., Dzemyda, G., Moreira, F., Colla, V. (eds) Information Systems and Technologies. WorldCIST 2023. Lecture Notes in Networks and Systems, vol 799. Springer, Cham. https://doi.org/10.1007/978-3-031-45642-8_38

DOI: https://doi.org/10.1007/978-3-031-45642-8_38
Published: 16 February 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-45641-1
Online ISBN: 978-3-031-45642-8
eBook Packages: Intelligent Technologies and Robotics (R0)
