Credit Risk Assessment Based on Flexible Neural Tree Model

Zhang, Yishen; Wang, Dong; Chen, Yuehui; Zhao, Yaou; Shao, Peng; Meng, Qingfang

doi:10.1007/978-3-319-59072-1_26

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10261)

Included in the following conference series:

International Symposium on Neural Networks

Abstract

In recent years, as China’s credit market has continued to expand, a large number of P2P (person-to-person borrowing and lending of money in Internet finance) platforms have emerged and developed. Most P2P platforms in China use data mining methods to evaluate the credit risk of loan applicants. The artificial neural network (ANN) is an emerging data mining tool with good classification ability in many application fields. This paper presents a credit risk assessment model based on the flexible neural tree (FNT), which can reduce the overdue rate and save analysis time. Overdue and non-overdue sample data provided by Jinan Hengxin Micro-Investment Advisory Co., Ltd. are used to build the model. Experiments show that the proposed model is more accurate and less time-consuming for overdue classification in credit risk assessment.

1 Introduction

A credit loan is an unsecured loan. In recent years, the credit market has been expanding rapidly in China. On the one hand, the rapid development of China’s economy has shortened the cycle of capital turnover. On the other hand, as Chinese consumers’ spending power has improved, businesses have a growing demand for funds, so a large number of P2P Internet inclusive-finance platforms have come into being. Because P2P platforms in China lack a complete credit evaluation system like that of banks, they have little tolerance for risk from customers without collateral, and they can obtain better risk predictions only by building a suitable credit risk assessment model. Many platforms are therefore exploring their own methods of credit risk assessment. Most of them use data mining to collect and understand customer information in order to better establish its authenticity and validity, to evaluate customers’ financial situation more reasonably, and to predict borrowers’ business conditions, repayment intention and repayment ability more accurately.

Establishing a good credit risk evaluation model is the biggest challenge to the development of P2P platforms and the credit market. If the model is too strict, the platform will lose some high-quality customers and fall behind in industry competition. Conversely, if the model is too lenient, the platform’s overdue rate will continue to rise, making it difficult for financial managers to meet their responsibilities and costing the platform its credibility. It is therefore important to establish a credit risk evaluation model that prevents bad debts, speeds up capital flow and maintains the security and stability of capital. In the field of credit risk assessment, artificial neural networks, genetic programming, genetic algorithms, support vector machines, logistic regression and some hybrid models have achieved gratifying results in terms of performance and precision.

In the past few years, many algorithms and research methods have been tested on customer information data in the field of credit risk assessment. Khashman applied an artificial neural network to the German customer dataset and achieved an accuracy of 83.6% [1]. Bekhet and Eletter applied an RBF network to a Jordanian commercial bank data set and obtained an accuracy of 86.5% on the test sets [2]. Wang et al. used an improved BP neural network and reported an accuracy of 86% [3]. A traditional artificial neural network has a fixed structure, whereas the flexible neural tree (FNT) has a flexible tree structure. This characteristic allows the FNT model to achieve better performance through learning.

In this paper, a new method based on the FNT model is proposed for classifying customer information. The results of 10-fold cross validation show that our method achieves better performance than the other state-of-the-art methods.

2 Data Collection and Variable Definition

Customer information data can be described along many dimensions. In this paper, we randomly took 300 positive samples of overdue customers and 300 negative samples of non-overdue customers, all drawn from 2,000 customers of Jinan Hengxin Micro-Investment Advisory Co., Ltd. between 2014 and 2016. We chose 13 dimensions to describe the customer information. The criteria for selecting dimensions are: (1) they do not contain the customer’s identity information; (2) they exclude information that is subjective from the point of view of an actual human audit, such as the use of the loan, the business model and profits, as well as other information that only a third party can verify, because such information is difficult to verify and collect.

Following these principles, the selected dimensions make the most of customer-provided data that is objective and difficult to forge, so that classification rests on verifiable data rather than on subjective descriptions. Table 1 shows the variables, values and definitions of the 13 selected dimensions, and Table 2 shows examples from the dataset.

Table 1. Proposed variables for building dataset


Table 2. Examples


The 600 samples are encoded according to the variables in Table 1. All data are then processed by “Max_Min standardization” (min-max normalization) before being input to the FNT model; the normalized samples are shown in Table 3. The normalization rule is given in Eq. (1).

$$ P_{ij}^{'} = \frac{{P_{ij} - m_{i} }}{{M_{i} - m_{i} }} $$

(1)

where $ P_{ij}^{'} $ is the normalized customer data, $ P_{ij} $ is the original customer data, and $ M_{i} $ and $ m_{i} $ are the maximum and minimum values of dimension i, respectively.
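As an illustration of Eq. (1), the following is a minimal sketch of per-dimension min-max normalization. The function name, the toy values and the handling of a constant dimension are assumptions made for this example; they are not taken from the paper or its dataset.

```python
def min_max_normalize(samples):
    """Scale every dimension (column) of `samples` to [0, 1] following Eq. (1)."""
    n_dims = len(samples[0])
    mins = [min(row[i] for row in samples) for i in range(n_dims)]  # m_i
    maxs = [max(row[i] for row in samples) for i in range(n_dims)]  # M_i
    normalized = []
    for row in samples:
        new_row = []
        for i, value in enumerate(row):
            span = maxs[i] - mins[i]
            # Assumption: a constant dimension is mapped to 0.0 to avoid
            # dividing by zero; the paper does not discuss this case.
            new_row.append((value - mins[i]) / span if span else 0.0)
        normalized.append(new_row)
    return normalized


# Hypothetical toy data: 3 customers described by 2 dimensions.
toy = [[20.0, 1.0], [30.0, 3.0], [40.0, 2.0]]
scaled = min_max_normalize(toy)
```

Each column of `scaled` then lies in [0, 1], with the smallest original value mapped to 0 and the largest to 1.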

Table 3. Normalized samples


3 Classification Method

3.1 Flexible Neural Tree

The flexible neural tree (FNT) is a special artificial neural network with a flexible tree structure. It was proposed by Chen et al. [4, 5], and it is relatively easy for this model to reach a near-optimal structure by using optimization algorithms. The FNT model uses a tree-structural encoding method and a specific instruction set. It is generated from a function set F and a terminal instruction set T, described as follows.

$$ S = F \cup T = \left\{ +_{2} , +_{3} , \cdots , +_{N} \right\} \cup \left\{ x_{1} , \cdots , x_{n} \right\} $$

(2)

where $ +_{i} (i = 2,3, \cdots ,N) $ denotes a non-leaf node instruction taking i arguments, and $ x_{1} ,x_{2} , \cdots ,x_{n} $ are leaf node instructions taking no arguments.

Figure 1 shows how the output of a non-leaf node is calculated in the FNT model. The instruction $ +_{i} $ is also called a flexible neuron operator with i inputs. For a flexible neuron $ +_{n} $, the total excitation is given by

$$ net_{n} = \sum\nolimits_{j = 1}^{n} {w_{j} } x_{j} $$

(3)

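In Eq. (3), $ x_{j} (j = 1,2, \cdots ,n) $ are the inputs to node $ +_{n} $ and $ w_{j} $ are the corresponding connection weights. The output of node $ +_{n} $ is then calculated by a flexible activation function with adjustable parameters $ a_{n} $ and $ b_{n} $: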

$$ out_{n} = f\left( {a_{n} ,b_{n} ,net_{n} } \right) = e^{{ - (\frac{{net_{n} - a_{n} }}{{b_{n} }})^{2} }} $$

(4)

A typical FNT model is illustrated in Fig. 2. Its overall output is computed recursively, from left to right, in depth-first order.
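To make Eqs. (3) and (4) and the recursive evaluation concrete, here is a minimal sketch. The tuple-based tree encoding, the function names and the toy tree are assumptions chosen for illustration; the paper does not specify how the tree is stored.

```python
import math


def flexible_neuron(inputs, weights, a, b):
    """Output of one flexible neuron: Eq. (3) followed by Eq. (4)."""
    net = sum(w * x for w, x in zip(weights, inputs))  # total excitation, Eq. (3)
    z = (net - a) / b
    return math.exp(-(z * z))                          # activation, Eq. (4)


def evaluate(node, x):
    """Evaluate a tree recursively, depth-first and left to right.

    Assumed encoding (hypothetical):
      leaf     -> ("x", index into the input vector)
      non-leaf -> ("+", [children], [weights], a, b)
    """
    if node[0] == "x":
        return x[node[1]]
    _, children, weights, a, b = node
    child_outputs = [evaluate(child, x) for child in children]
    return flexible_neuron(child_outputs, weights, a, b)


# Hypothetical tree: +_2(x_0, +_2(x_1, x_2)) with made-up parameters.
inner = ("+", [("x", 1), ("x", 2)], [0.5, 0.5], 0.0, 1.0)
root = ("+", [("x", 0), inner], [1.0, 1.0], 0.0, 1.0)
output = evaluate(root, [0.0, 0.0, 0.0])
```

With all inputs at zero, the inner node outputs 1 (its excitation equals its centre a), so the root sees an excitation of 1 and returns exp(-1).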

General learning algorithm of FNT

Step 1. Initialize the parameters used in the particle swarm optimization (PSO) algorithm. Set the elitist program to NULL and set its fitness value to the largest positive real number. Create the initial population.
Step 2. Optimize the structure using the PSO algorithm, with the root mean square error (RMSE) as the fitness function.
Step 3. If a better structure has been found, go to step 4; otherwise go to step 2.
Step 4. Optimize the parameters using the PSO algorithm.
Step 5. If the maximum number of local search steps is reached, or no better parameter vector has been found for a significantly long time (100 steps), go to step 6; otherwise go to step 4.
Step 6. If a satisfactory solution has been found, stop; otherwise go to step 2.
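The parameter-optimization part of this procedure (step 4) can be sketched as follows. This is a plain global-best PSO with RMSE as fitness, applied to the four parameters of a single two-input flexible neuron. The hyperparameters, the toy data and all function names are assumptions for illustration only; the paper does not report its PSO settings, and the structure search of steps 2 and 3 is omitted here.

```python
import math
import random


def neuron_output(params, x):
    """Single two-input flexible neuron; params = [w1, w2, a, b]."""
    w1, w2, a, b = params
    z = (w1 * x[0] + w2 * x[1] - a) / b
    return math.exp(-(z * z))


def rmse_fitness(params, data):
    """RMSE over (inputs, target) pairs; lower is better."""
    if abs(params[3]) < 1e-6:  # guard: b must not be (near) zero
        return float("inf")
    err = sum((neuron_output(params, x) - y) ** 2 for x, y in data)
    return math.sqrt(err / len(data))


def pso_minimize(fitness, dim, n_particles=20, n_iters=100, seed=0):
    """Global-best PSO. Inertia and acceleration values are assumed."""
    rng = random.Random(seed)
    inertia, c1, c2 = 0.7, 1.5, 1.5
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    initial_fit = gbest_fit
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            fit = fitness(pos[i])
            if fit < pbest_fit[i]:          # update personal best
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:         # update global best
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit, initial_fit


# Hypothetical toy targets, not taken from the paper's dataset.
toy_data = [((0.0, 0.0), 0.2), ((1.0, 0.0), 0.9),
            ((0.0, 1.0), 0.9), ((1.0, 1.0), 0.2)]
best, best_fit, initial_fit = pso_minimize(
    lambda p: rmse_fitness(p, toy_data), dim=4)
```

Because the global best is replaced only when a strictly better particle is found, `best_fit` can never exceed `initial_fit`. In a full FNT learner, a loop of this kind would alternate with the structure search until the stopping rule of step 6 is met.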

3.2 Prediction Assessment

In statistical analysis, two methods can be used to check the effectiveness of a classifier in applications: independent dataset tests and 10-fold cross validation tests. In 10-fold cross validation, the full data set is divided equally into 10 subsets. Each subset is used in turn as the test set to compute the overall accuracy (OA) of the model trained on the remaining nine subsets. In addition, sensitivity (Sens) and specificity (Spec) are used to evaluate the performance of the classifier.
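The evaluation protocol can be sketched as below, using the standard definitions OA = (TP + TN) / total, Sens = TP / (TP + FN) and Spec = TN / (TN + FP). The function names, the random shuffling and the choice of label 1 (overdue) as the positive class are assumptions for this example; the paper does not describe how its folds were drawn.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal subsets
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def confusion_metrics(y_true, y_pred, positive=1):
    """Return (overall accuracy, sensitivity, specificity)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    oa = (tp + tn) / len(pairs)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return oa, sens, spec


# 600 samples split into 10 folds: 540 for training, 60 for testing.
splits = list(k_fold_indices(600, k=10))

# Hypothetical labels and predictions, for illustration only.
oa, sens, spec = confusion_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

In practice a classifier would be trained on each `train` list and scored on the matching `test` list, and the three metrics would be averaged over the 10 folds.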

4 Discussion and Results

In this study, the FNT model was evaluated by 10-fold cross validation on a data set of 600 samples, i.e. each experiment used 540 training samples and 60 testing samples. The results show that the average accuracy on the test sets is 88.32% (Table 4). In Table 4, “T” stands for “trial”, “D” for “data”, “OA” for “overall accuracy”, “A-acc” for “average accuracy rate” and “acc” for “accuracy rate”; the values of “A-acc” and “acc” are percentages.

Table 4. Partial results of the FNT model in 10-fold cross validation

We compared the average accuracy, sensitivity and specificity of our model with those of other methods. The results are shown in Table 5: our method has higher accuracy than the other methods, and its specificity is slightly better. Note that the sensitivity of the improved BP neural network method, 91.6%, was calculated from a single experiment with only 20 samples (14 positive and 6 negative). Because the proportion of positive samples is much higher, the sensitivity is also high. Moreover, the sensitivity index is reported only for that experiment and nowhere else, so this value is included in Table 5 for reference only.

Table 5. Comparison of our method with other methods

5 Conclusion

In this study, we proposed a redesigned and redefined set of customer information feature dimensions and an FNT model for credit risk assessment. Compared with other methods, the proposed method improves on the evaluation indexes to varying degrees, which supports the validity of the FNT model. In the future, we will continue to improve the algorithm and search for more effective classifiers in order to obtain better classification accuracy in this field.

References

1. Khashman, A.: Neural network for credit risk evaluation: investigation of different neural models and learning schemes. Exp. Syst. Appl. 37(9), 6233–6239 (2010)
2. Bekhet, H., Eletter, S.: Credit risk assessment model for Jordanian commercial banks: neural scoring approach. Rev. Dev. Financ. 4(1), 20–28 (2014)
3. Wang, L., Chen, Y., Zhao, Y., Meng, Q., Zhang, Y.: Credit management based on improved BP neural network. IHMSC 1, 497–500 (2016)
4. Chen, Y., Yang, B., Dong, J., Abraham, A.: Time-series forecasting using flexible neural tree model. Inf. Sci. 174, 219–235 (2005)
5. Yang, B., Chen, Y., Jiang, M.: Reverse engineering of gene regulatory networks using flexible neural tree models. Neurocomputing 99, 458–466 (2013)
6. Abdou, H., Pointon, J., El-Masry, A.: On the applicability of credit scoring models in Egyptian banks. Banks Bank Syst. 2(1), 4–19 (2007)
7. Bensic, M., Sarlija, N., Zekic-Susac, M.: Modeling small-business credit scoring by using logistic regression, neural networks and decision trees. Intell. Syst. Account. Financ. Manag. 13(3), 133–150 (2005)
8. Blanco, A., Mejias, R., Lara, J., Rayo, S.: Credit scoring models for the microfinance industry using neural networks: evidence from Peru. Exp. Syst. Appl. 40(1), 356–364 (2013)
9. Heiat, A.: Comparing performance of data mining models for computer credit scoring. J. Int. Financ. Econ. 12(1), 78–83 (2012)
10. Koh, H., Tan, W., Goh, C.: A two-step method to construct credit scoring models with data mining techniques. Int. J. Bus. Inf. 1(1), 96–118 (2006)
11. Jagric, V., Kracun, D., Jagric, T.: Does non-linearity matter in retail credit risk modeling? Financ. uver-Czech J. Econ. Financ. 61(4), 384–402 (2011)
12. Wu, C., Guo, Y., Zhang, X., Xia, H.: Study of personal credit risk assessment based on support vector machine ensemble. Int. J. Innov. 6(5), 2353–2360 (2010)
13. Xie, T., Yu, H., Wilamowski, B.: Comparison between traditional neural networks and radial basis function networks. In: Proceedings of 2011 IEEE International Symposium on Industrial Electronics, pp. 1194–1199 (2011)
14. Yap, P., Ong, S., Husain, N.: Using data mining to improve assessment of credit worthiness via credit scoring models. Exp. Syst. Appl. 38(10), 1374–1383 (2011)
15. Memarian, H., Balasundram, S.: Comparison between multi-layer perceptron and radial basis function networks for sediment load estimation in a tropical watershed. J. Water Resour. Prot. 4, 870–876 (2012)

Acknowledgements

This research was supported by the National Natural Science Foundation of China (Grant No. 61302128, 61573166, 61572230, 61671220, 61640218), the Youth Science and Technology Star Program of Jinan City (201406003), the Shandong Distinguished Middle-aged and Young Scientist Encourage and Reward Foundation, China (Grant No. ZR2016FB14), the Project of Shandong Province Higher Educational Science and Technology Program, China (Grant No. J16LN07), the Shandong Province Key Research and Development Program, China (Grant No. 2016GGX101022).

Author information

Authors and Affiliations

School of Information Science and Engineering, University of Jinan, Jinan, People’s Republic of China
Yishen Zhang, Dong Wang, Yuehui Chen, Yaou Zhao & Qingfang Meng
Shandong Provincial Key Laboratory of Network Based Intelligent Computing, Jinan, 250022, People’s Republic of China
Yishen Zhang, Dong Wang, Yuehui Chen, Yaou Zhao & Qingfang Meng
School of Mathematics, Dalian University of Technology, Dalian, People’s Republic of China
Peng Shao

Corresponding authors

Correspondence to Dong Wang or Yuehui Chen.

Editor information

Editors and Affiliations

Dalian University of Technology, Dalian, China
Fengyu Cong
City University of Hong Kong, Kowloon Tong, Hong Kong
Andrew Leung
Chinese Academy of Sciences, Beijing, China
Qinglai Wei

About this paper

Cite this paper

Zhang, Y., Wang, D., Chen, Y., Zhao, Y., Shao, P., Meng, Q. (2017). Credit Risk Assessment Based on Flexible Neural Tree Model. In: Cong, F., Leung, A., Wei, Q. (eds) Advances in Neural Networks - ISNN 2017. ISNN 2017. Lecture Notes in Computer Science(), vol 10261. Springer, Cham. https://doi.org/10.1007/978-3-319-59072-1_26

DOI: https://doi.org/10.1007/978-3-319-59072-1_26
Published: 31 May 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59071-4
Online ISBN: 978-3-319-59072-1
eBook Packages: Computer Science, Computer Science (R0)
