Abstract
In recent years, as China’s credit market has continued to expand, a large number of P2P (person-to-person borrowing and lending of money through Internet finance) platforms have emerged and developed. Most P2P platforms in China use data mining methods to evaluate the credit risk of loan applicants. The artificial neural network (ANN) is an emerging data mining tool that has shown good classification ability in many application fields. This paper presents a credit risk assessment model based on the flexible neural tree (FNT), which can reduce the overdue rate and save analysis time. Overdue and non-overdue sample data provided by Jinan Hengxin Micro-Investment Advisory Co., Ltd. are used to build the model. Experiments show that the proposed model is more accurate and has a lower time cost for overdue classification in credit risk assessment.
1 Introduction
A credit loan is an unsecured loan. In recent years, the credit market in China has been expanding rapidly. On the one hand, the rapid development of China’s economy has shortened the cycle of capital turnover. On the other hand, as national consumption capacity has improved, businesses have an increasing demand for funds, so a large number of P2P Internet inclusive finance platforms have come into being. Because P2P platforms in China lack a complete credit evaluation system like that of banks, they have limited tolerance for customers without collateral and can obtain better risk predictions only by establishing a corresponding credit risk assessment model. A large number of platforms are therefore exploring their own methods of credit risk assessment. Most of them use data mining to collect and understand customer information, in order to better grasp its authenticity and validity, to evaluate the financial situation of customers more reasonably, and to predict the business conditions, repayment intention, and repayment ability of borrowers more accurately.
Establishing a good credit risk evaluation model is the biggest challenge for the development of P2P platforms and the credit market. If the model is too strict, the platform loses some high-quality customers and is put at a disadvantage in industry competition. If it is too lenient, the overdue rate of the platform continues to rise, which makes it difficult for financial managers to meet their responsibilities and damages the platform’s credibility. It is therefore important to establish a credit risk evaluation model that prevents bad debts, speeds up capital flow, and maintains the security and stability of capital. In the field of credit risk assessment, artificial neural networks, genetic programming, genetic algorithms, support vector machines, logistic regression, and some hybrid models have achieved good results in terms of performance and precision.
In the past few years, many algorithms and research methods have been tested on customer information data in the field of credit risk assessment. Khashman applied an artificial neural network to a German customer dataset and achieved an accuracy of 83.6% [1]. Bekhet and Eletter applied an RBF network to a Jordanian commercial bank data set, and the accuracy on the test set was 86.5% [2]. Wang et al. used an improved BP neural network and obtained an accuracy of 86% [3]. A traditional artificial neural network has a fixed structure, whereas the flexible neural tree (FNT) has a flexible tree structure; this characteristic allows the FNT model to achieve better performance through learning.
In this paper, a new method based on the FNT model is proposed for the classification of customer information. The results of 10-fold cross validation show that our method achieves better performance than the other state-of-the-art methods considered.
2 Data Collection and Variable Definition
Customer information data can be described along many dimensions. In this paper, we randomly selected 300 positive samples (overdue customers) and 300 negative samples (non-overdue customers) from 2,000 customers of Jinan Hengxin Micro-Investment Advisory Co., Ltd. between 2014 and 2016. We chose 13 dimensions to describe the customer information. The criteria for selecting a dimension are: (1) it does not contain the customer’s identity information; (2) it excludes information that is subjective from the point of view of actual manual auditing, such as the purpose of the loan, the business model, and profits, as well as other information that can only be verified by a third party and is therefore difficult to verify and collect.
Following these principles, the selected dimensions make maximal use of customer-provided data that are objective and difficult to forge. Classification is thus based on actual, verifiable data and excludes subjective descriptions. Table 1 shows the variables, values, and definitions of the 13 selected dimensions, and Table 2 shows example records from the dataset.
The 600 samples are encoded according to the variable definitions in Table 1. All data are then processed with max-min standardization so that they can be fed to the FNT model; the normalized samples are shown in Table 3. The normalization rule is given in Eq. (1).
\[ P_{ij}^{'} = \frac{P_{ij} - m_{i}}{M_{i} - m_{i}} \tag{1} \]
where \( P_{ij}^{'} \) is the normalized customer data, \( P_{ij} \) is the original customer data, \( M_{i} \) is the maximum value of dimension i, and \( m_{i} \) is the minimum value of dimension i.
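The normalization step can be illustrated with a minimal Python sketch. It assumes that each sample is a row and each dimension is a column; the toy records and their field meanings are hypothetical and are not taken from Tables 1 to 3. The handling of a constant dimension is also an assumption, since the paper does not discuss that case.

```python
from typing import List, Sequence


def min_max_normalize(samples: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale every dimension (column) of `samples` to the range [0, 1]."""
    if not samples:
        return []
    n_dims = len(samples[0])
    # M_i and m_i: per-dimension maximum and minimum over all samples.
    maxima = [max(row[i] for row in samples) for i in range(n_dims)]
    minima = [min(row[i] for row in samples) for i in range(n_dims)]
    normalized = []
    for row in samples:
        new_row = []
        for i, value in enumerate(row):
            span = maxima[i] - minima[i]
            # Assumption: a constant dimension (M_i == m_i) is mapped to 0.0
            # to avoid division by zero.
            new_row.append((value - minima[i]) / span if span else 0.0)
        normalized.append(new_row)
    return normalized


# Hypothetical toy records: (age, monthly income, number of credit cards).
raw = [[25, 3000, 1], [40, 9000, 1], [55, 6000, 1]]
scaled = min_max_normalize(raw)
```

After this step every dimension lies in [0, 1], so dimensions with large raw ranges (such as income) do not dominate dimensions with small ranges.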
3 Classification Method
3.1 Flexible Neural Tree
The flexible neural tree (FNT) is a special artificial neural network with a flexible tree structure. It was proposed by Chen et al. [4, 5], and it is relatively easy for this model to reach a near-optimal structure by using optimization algorithms. The FNT model uses a tree-structural encoding method and a specific instruction set, and it is generated from a function set F and a terminal instruction set T.
Here \( +_{i} (i = 1,2 \cdots N) \) denotes a non-leaf node instruction taking i arguments, and \( x_{1} ,x_{2} \cdots x_{n} \) are leaf node instructions taking no arguments.
Figure 1 shows how the FNT model computes the output of a non-leaf node. The instruction \( +_{i} \) is also called a flexible neuron operator with i inputs. For a flexible neuron \( +_{n} \), the total excitation is given by Eq. (3).
In Eq. (3), \( x_{j} (j = 1,2, \cdots ,n) \) are the inputs to node \( +_{n} \). The output of node \( +_{n} \) is then calculated from this total excitation.
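For reference, the instruction set and the flexible neuron are commonly written as follows in the FNT formulation of Chen et al. [4]. This is a sketch based on that formulation rather than a transcription of the equations of this paper: the connection weights \( w_{j} \), the activation parameters \( a_{n} \) and \( b_{n} \), and the Gaussian-type activation function are assumptions taken from [4], and the index range of \( +_{i} \) follows the text above.

\[ S = F \cup T = \{ +_{1} , +_{2} , \cdots , +_{N} \} \cup \{ x_{1} ,x_{2} , \cdots ,x_{n} \} \]

\[ net_{n} = \sum\limits_{j = 1}^{n} w_{j} x_{j} \]

\[ out_{n} = f(a_{n} ,b_{n} ,net_{n} ) = \exp \left( - \left( \frac{net_{n} - a_{n} }{b_{n} } \right)^{2} \right) \]

Under these assumptions, the output of a flexible neuron lies in (0, 1] and is largest when the total excitation equals \( a_{n} \).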
A typical FNT model is illustrated in Fig. 2. Its overall output is computed recursively, from left to right, by a depth-first traversal of the tree.
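The recursive, depth-first computation can be sketched in Python as follows. The class names `Leaf` and `FlexibleNeuron`, the per-child weights, and the Gaussian-type activation with parameters `a` and `b` are assumptions modelled on the FNT formulation of Chen et al. [4]; the tree at the end is a hypothetical example and is not the tree of Fig. 2.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Union


@dataclass
class Leaf:
    """Terminal instruction x_k: returns the k-th input feature."""
    index: int


@dataclass
class FlexibleNeuron:
    """Non-leaf instruction +_n with one connection weight per child."""
    children: List["Node"]
    weights: List[float]
    a: float  # activation centre (assumed parameter)
    b: float  # activation width (assumed parameter, must be non-zero)


Node = Union[Leaf, FlexibleNeuron]


def evaluate(node: Node, x: Sequence[float]) -> float:
    """Depth-first, left-to-right evaluation of an FNT on one sample."""
    if isinstance(node, Leaf):
        return x[node.index]
    # Children are evaluated first (recursively), from left to right.
    child_outputs = [evaluate(child, x) for child in node.children]
    # Total excitation: weighted sum of the child outputs.
    net = sum(w * o for w, o in zip(node.weights, child_outputs))
    # Assumed Gaussian-type flexible activation function.
    return math.exp(-(((net - node.a) / node.b) ** 2))


# Hypothetical tree: the root +_2 combines x_0 with an inner +_2 over x_1, x_2.
inner = FlexibleNeuron([Leaf(1), Leaf(2)], weights=[1.0, 1.0], a=1.0, b=1.0)
root = FlexibleNeuron([Leaf(0), inner], weights=[2.0, 1.0], a=2.0, b=1.0)
output = evaluate(root, [0.5, 0.25, 0.75])
```

Because each node only needs the outputs of its own children, the same function evaluates trees of any shape, which is what allows the structure to change during optimization.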
General learning algorithm of FNT
Step 1. Initialize the values of the parameters used in the particle swarm optimization (PSO) algorithm. Set the elitist program to NULL and its fitness value to the largest positive real number. Create the initial population.
Step 2. Optimize the structure using the PSO algorithm, with the root mean square error (RMSE) as the fitness function.
Step 3. If a better structure is found, go to step 4; otherwise go to step 2.
Step 4. Optimize the parameters using the PSO algorithm.
Step 5. If the maximum number of local search steps is reached, or no better parameter vector is found for a significantly long time (100 steps), go to step 6; otherwise go to step 4.
Step 6. If a satisfactory solution is found, stop; otherwise go to step 2.
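The parameter optimization in steps 4 and 5 can be sketched with a basic PSO loop. This is a minimal illustration and not the implementation used in the paper: the inertia and acceleration coefficients, the initialization range, and the toy fitting problem are assumptions, and the structure optimization in steps 2 and 3 is not covered.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def rmse(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Root mean square error, used as the fitness value (lower is better)."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target))


def pso_minimize(fitness: Callable[[List[float]], float], dim: int,
                 n_particles: int = 30, max_iters: int = 200,
                 patience: int = 100, seed: int = 0) -> Tuple[List[float], float]:
    """Minimise `fitness` over a `dim`-dimensional parameter vector."""
    rng = random.Random(seed)
    inertia, c1, c2 = 0.729, 1.494, 1.494  # assumed coefficient values
    pos = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    stale = 0  # iterations since the global best last improved
    for _ in range(max_iters):
        improved = False
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull each particle towards its own best and the global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            fit = fitness(pos[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
                    improved = True
        stale = 0 if improved else stale + 1
        if stale >= patience:  # mirrors the "no improvement for 100 steps" rule
            break
    return gbest, gbest_fit


# Hypothetical toy problem: recover w and b of y = 2x + 1 from five points.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
best_params, best_rmse = pso_minimize(
    lambda p: rmse([p[0] * x + p[1] for x in xs], ys), dim=2)
```

In an FNT setting the parameter vector would hold the weights and activation parameters of a fixed tree, and the fitness would be the RMSE between the tree outputs and the class labels of the training samples.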
3.2 Prediction Assessment
In statistical analysis, two methods can be used to check the effectiveness of a classifier in applications, namely independent dataset tests and 10-fold cross validation tests. In 10-fold cross validation, the full training set is divided equally into 10 subsets. Each subset is used in turn as the test set to compute the overall accuracy (OA) of the model trained on the remaining nine subsets. In addition, sensitivity (Sens) and specificity (Spec) are used to evaluate the performance of the classifier.
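The evaluation protocol can be sketched in Python as follows. The sketch assumes the usual definitions OA = (TP + TN) / total, Sens = TP / (TP + FN), and Spec = TN / (TN + FP), with overdue customers (label 1) as the positive class. The function names and the toy labels are hypothetical, and in practice the samples would be shuffled before splitting.

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return (train_indices, test_indices) pairs for k equally sized folds."""
    fold_size = n_samples // k  # assumes n_samples is divisible by k
    folds = []
    for f in range(k):
        test = list(range(f * fold_size, (f + 1) * fold_size))
        held_out = set(test)
        train = [i for i in range(n_samples) if i not in held_out]
        folds.append((train, test))
    return folds


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Overall accuracy, sensitivity and specificity (label 1 = overdue)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "OA": (tp + tn) / len(pairs),
        # Guard against folds that contain no positive or no negative samples.
        "Sens": tp / (tp + fn) if (tp + fn) else 0.0,
        "Spec": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# With 600 samples, every fold uses 540 training and 60 testing samples.
folds = k_fold_indices(600, k=10)

# Hypothetical labels and predictions for a single small test set.
metrics = classification_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
```

Averaging the per-fold OA values over the 10 folds gives the average accuracy of the kind reported for a cross-validation experiment.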
4 Discussion and Results
In this study, the FNT model was evaluated by 10-fold cross validation on a data set containing 600 samples, i.e., 540 training samples and 60 testing samples were used in each experiment, and the experiments were performed on each data set. The results show that the average accuracy on the test set is 88.32% (Table 4). In Table 4, “T” stands for “trial”, “D” for “data”, “OA” for “overall”, “A-acc” for “average accuracy rate”, and “acc” for “accuracy rate”; the values of “A-acc” and “acc” are percentages.
We compared the average accuracy, sensitivity, and specificity of our model with those of other methods. The results are shown in Table 5. Our method has higher accuracy than the other methods, and its specificity is slightly better. One further point should be noted: the sensitivity of the improved BP neural network method is 91.6%, but this value was calculated from a single experiment with 14 positive samples and 6 negative samples, 20 samples in total. The proportion of positive samples is much higher, so the sensitivity is also high. In addition, the sensitivity index is reported only there and nowhere else, so this value is included in Table 5 for reference only.
5 Conclusion
In this study, we proposed a redesigned and redefined set of customer information feature dimensions together with an FNT model for credit risk assessment. Compared with other methods, the proposed method improves the evaluation indexes to varying degrees, which supports the validity of the FNT model. In the future, we will continue to improve the algorithm and search for more effective classifiers in order to obtain better classification accuracy in this field.
References
1. Khashman, A.: Neural network for credit risk evaluation: investigation of different neural models and learning schemes. Exp. Syst. Appl. 37(9), 6233–6239 (2010)
2. Bekhet, H., Eletter, S.: Credit risk assessment model for Jordanian commercial banks: neural scoring approach. Rev. Dev. Financ. 4(1), 20–28 (2014)
3. Wang, L., Chen, Y., Zhao, Y., Meng, Q., Zhang, Y.: Credit management based on improved BP neural network. IHMSC 1, 497–500 (2016)
4. Chen, Y., Yang, B., Dong, J., Abraham, A.: Time-series forecasting using flexible neural tree model. Inf. Sci. 174, 219–235 (2005)
5. Yang, B., Chen, Y., Jiang, M.: Reverse engineering of gene regulatory networks using flexible neural tree models. Neurocomputing 99, 458–466 (2013)
6. Abdou, H., Pointon, J., El-Masry, A.: On the applicability of credit scoring models in Egyptian banks. Banks Bank Syst. 2(1), 4–19 (2007)
7. Bensic, M., Sarlija, N., Zekic-Susac, M.: Modeling small-business credit scoring by using logistic regression, neural networks and decision trees. Intell. Syst. Account. Financ. Manag. 13(3), 133–150 (2005)
8. Blanco, A., Mejias, R., Lara, J., Rayo, S.: Credit scoring models for the microfinance industry using neural networks: evidence from Peru. Exp. Syst. Appl. 40(1), 356–364 (2013)
9. Heiat, A.: Comparing performance of data mining models for computer credit scoring. J. Int. Financ. Econ. 12(1), 78–83 (2012)
10. Koh, H., Tan, W., Goh, C.: A two-step method to construct credit scoring models with data mining techniques. Int. J. Bus. Inf. 1(1), 96–118 (2006)
11. Jagric, V., Kracun, D., Jagric, T.: Does non-linearity matter in retail credit risk modeling? Financ. uver-Czech J. Econ. Financ. 61(4), 384–402 (2011)
12. Wu, C., Guo, Y., Zhang, X., Xia, H.: Study of personal credit risk assessment based on support vector machine ensemble. Int. J. Innov. 6(5), 2353–2360 (2010)
13. Xie, T., Yu, H., Wilamowski, B.: Comparison between traditional neural networks and radial basis function networks. In: Proceedings of 2011 IEEE International Symposium on Industrial Electronics, pp. 1194–1199 (2011)
14. Yap, P., Ong, S., Husain, N.: Using data mining to improve assessment of credit worthiness via credit scoring models. Exp. Syst. Appl. 38(10), 1374–1383 (2011)
15. Memarian, H., Balasundram, S.: Comparison between multi-layer perceptron and radial basis function networks for sediment load estimation in a tropical watershed. J. Water Resour. Prot. 4, 870–876 (2012)
Acknowledgements
This research was supported by the National Natural Science Foundation of China (Grant No. 61302128, 61573166, 61572230, 61671220, 61640218), the Youth Science and Technology Star Program of Jinan City (201406003), the Shandong Distinguished Middle-aged and Young Scientist Encourage and Reward Foundation, China (Grant No. ZR2016FB14), the Project of Shandong Province Higher Educational Science and Technology Program, China (Grant No. J16LN07), the Shandong Province Key Research and Development Program, China (Grant No. 2016GGX101022).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Zhang, Y., Wang, D., Chen, Y., Zhao, Y., Shao, P., Meng, Q. (2017). Credit Risk Assessment Based on Flexible Neural Tree Model. In: Cong, F., Leung, A., Wei, Q. (eds) Advances in Neural Networks - ISNN 2017. ISNN 2017. Lecture Notes in Computer Science(), vol 10261. Springer, Cham. https://doi.org/10.1007/978-3-319-59072-1_26
DOI: https://doi.org/10.1007/978-3-319-59072-1_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59071-4
Online ISBN: 978-3-319-59072-1
eBook Packages: Computer Science, Computer Science (R0)