
1 Introduction

Since the 1960s, researchers have paid a great deal of attention to finding a reliable way to predict business failures. The task can be described as developing a methodology that predicts financial distress from existing financial features of an enterprise. Business failure prediction, also known as financial distress prediction or firm failure prediction, is of considerable importance to shareholders, investors, credit managers and other stakeholders. A business failure prediction model alerts a stakeholder or a manager in time to take precautions before a failure occurs. For investors, such a model provides vital information that helps them decide whether to invest in a firm. In other words, it reduces the risk of poor investment decisions and the resulting financial loss. Credit managers can also use such a model to evaluate the level of risk and the credit limit for an enterprise.

There are many studies in the literature on business failure prediction. The first was conducted by Beaver in 1966 [1]. Beaver used univariate analysis to forecast bankruptcy. Altman then proposed multivariate discriminant analysis for the same problem [2]. Most of the subsequent studies were based on Altman’s study. After 1980, regression models such as logit and probit were proposed to predict business failures more accurately. Afterwards, machine learning algorithms were introduced as alternatives to the statistical models. Most of the recent studies compare traditional statistical models with machine learning models or combine several models in one methodology [3,4,5,6,7,8]. In general, the reported results show that machine learning algorithms outperform statistical models in predicting business failure.

In this study, we propose a framework for predicting business failures. The framework contains nine prediction models, namely Logistic Regression, Multilayer Perceptron (MLP), Sequential Minimal Optimization (SMO), Bayesian Network, Naive Bayes, J48, Random Forest, Random Tree and the NARX (nonlinear autoregressive network with exogenous inputs) feedback neural network. To the best of our knowledge, NARX has not been used for business failure prediction before this study. In addition, the framework makes multi-step ahead prediction possible through the NARX model. For evaluation, the nine models were applied to the same datasets within the same framework, and the results are reported in detail.

The paper is organized as follows. In Sect. 2, we review the related work. The constructed datasets and the proposed methodology are described in Sect. 3. The performance of the applied methods is evaluated and compared in Sect. 4. Sect. 5 concludes the paper by summarizing the achievements and giving future directions.

2 Related Work

Financial distress prediction has remained a highly popular topic since the 1960s. After Altman’s multivariate discriminant analysis, Ohlson was the first to propose logit analysis for bankruptcy prediction [9].

After that, machine learning algorithms came into use as an alternative to statistical models. For instance, neural networks were used in numerous studies to predict business failure [3,4,5, 10, 11]. In these studies, neural networks were compared with traditional statistical models such as multivariate discriminant analysis, and most of them report that neural networks performed better than discriminant analysis. SVM has also been used in several studies for predicting business failures and was found to outperform the classical methods [12, 13]. Tree algorithms such as ID3 and decision trees are another popular machine learning approach to firm failure prediction [14, 15]. In these studies, tree algorithms were compared with discriminant analysis and provided better results than the statistical models. Overall, the literature indicates that machine learning models generally outperform traditional statistical models such as multivariate discriminant analysis.

Combining a model with other models to compensate for its weak points is a common approach in machine learning studies. In this direction, researchers compared neural networks with decision trees, SVM and majority voting, and concluded that neural networks were the best of these methods for forecasting financial distress [7]. Azayite and Achchab composed a hybrid model based on discriminant analysis, a back propagation neural network and self-organizing maps [8]. They applied the hybrid model to Moroccan firms and reported that it outperformed discriminant analysis. Wu et al. proposed a genetic-based SVM to predict bankruptcy [16]. This methodology was tested on a Taiwan dataset and compared with discriminant analysis, logit, probit, neural networks and traditional SVM. According to the experimental results, the proposed hybrid methodology gave the best predictive accuracy. Another hybrid study brought together SVM and logistic regression [17]. The methodology modified the outputs of the SVM classifiers according to the result of the logistic regression analysis. In [18], single classifiers were trained by SVM algorithms with different kernel functions on different feature subsets of one initial dataset. This ensemble SVM performed better than an individual SVM classifier. Lin et al. proposed another hybrid method, which combines locally linear embedding (LLE) and SVM to predict firm failures [19].

Although the big data approach is extremely popular, it has rarely been used for predicting business failures. In the literature, there is only one study that uses a big data approach for business failure prediction [20]. A likely reason is that it is quite difficult to obtain large amounts of data for business failure prediction.

In this study, we propose a framework for business failure prediction that makes the following contributions:

  • Our framework contains the NARX network algorithm, which has not been used for business failure prediction before.

  • Thanks to the NARX network, multi-step ahead prediction can be performed in addition to one-step ahead prediction.

  • Owing to its flexible structure, the proposed framework can be used not only for business failure prediction but also for other suitable prediction problems in areas such as finance and biomedicine.

3 The Dataset and the Proposed Framework

3.1 Details of Dataset

Financial statements of enterprises registered with IMKB BIST [21] are published periodically on the Public Disclosure Platform. Deteriorated firms are also announced on the same platform. The datasets for our study are derived from these resources. Ten financial ratios are defined as input variables. They are selected according to Aktan’s study, which identifies the 10 best financial ratios for bankruptcy prediction among 53 financial ratios [22]. The selected financial ratios are listed in Table 1.

Table 1. Selected financial ratios

Class values, which correspond to the financial status of firms, are defined as good, bad and very bad in the constructed datasets.

In the first dataset, input variables and class values are calculated for quarterly periods. A second dataset is constructed from the yearly values of the selected variables.

3.2 Proposed Framework

The proposed framework contains three main steps: Data Preparation, Prediction and Evaluation, as shown in Fig. 1.

Fig. 1. Proposed framework.

Data Preparation Step. In the data preparation step, data rows that include null values for one or more financial ratios are removed from the dataset. After cleaning, the 10 financial ratios are calculated from several financial variables. Lastly, a matrix is composed from the calculated financial ratios and class values.
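The data preparation step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the field names, the two ratio definitions and the sample records are hypothetical, and the actual ten ratios are those listed in Table 1.

```python
# Illustrative ratio definitions (hypothetical); the study's ten ratios are in Table 1.
RATIOS = {
    "current_ratio": lambda r: r["current_assets"] / r["current_liabilities"],
    "debt_ratio": lambda r: r["total_debt"] / r["total_assets"],
}
REQUIRED = ("current_assets", "current_liabilities",
            "total_debt", "total_assets", "status")


def prepare(records):
    """Return (X, y): the ratio matrix and class labels, skipping incomplete rows."""
    X, y = [], []
    for rec in records:
        # Data cleaning: drop any row that has a null value.
        if any(rec.get(field) is None for field in REQUIRED):
            continue
        # Ratio calculation: one matrix row per remaining firm-period.
        X.append([compute(rec) for compute in RATIOS.values()])
        y.append(rec["status"])
    return X, y


# Hypothetical sample records; None marks a missing value.
records = [
    {"current_assets": 200.0, "current_liabilities": 100.0,
     "total_debt": 50.0, "total_assets": 500.0, "status": "good"},
    {"current_assets": None, "current_liabilities": 120.0,
     "total_debt": 90.0, "total_assets": 300.0, "status": "bad"},
    {"current_assets": 80.0, "current_liabilities": 160.0,
     "total_debt": 300.0, "total_assets": 400.0, "status": "very bad"},
]
X, y = prepare(records)
```

Here the second record is dropped because of its missing value, so the resulting matrix has two rows.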

Prediction Step. This step is responsible for producing the business failure prediction results. For this purpose, we constructed Logistic Regression [23], Multilayer Perceptron [24], Sequential Minimal Optimization [25], Bayesian Network [26], Naive Bayes [27], J48 [28], Random Forest [29], Random Tree [30] and NARX models. One of these nine models must be selected to proceed with this step of the framework. The selected model is then trained on the given data, and the prediction results are produced by the trained model.

Due to page limitations, we only briefly explain the NARX model, which has not previously been employed for business failure prediction.

NARX is a dynamic network that is useful for time series modeling. As can be seen in Eq. 1, previous output values of the network and previous values of the input parameters are used to produce the next value of the output.

$$\begin{aligned} y(t) = f(y(t - 1), y(t - 2), \ldots, y(t - n_y), u(t - 1), u(t - 2), \ldots, u(t - n_u)) \end{aligned}$$
(1)

In this equation, u represents the exogenous inputs and y represents the target variable to be predicted, while t denotes the discrete time step. To predict y(t), the previous values of the exogenous input and the previous values of the output are regressed together through the function f. A general NARX network architecture can be seen in Fig. 2.

Fig. 2. NARX network architecture.
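To make the argument list of f in Eq. 1 concrete, the following sketch assembles the lagged values for a given time index. The function name, the numeric encoding of the class values and the sample series are illustrative assumptions, not part of the original study.

```python
def narx_regressor(y_hist, u_hist, t, n_y, n_u):
    """Return [y(t-1), ..., y(t-n_y), u(t-1), ..., u(t-n_u)] as a flat list."""
    if t < max(n_y, n_u):
        raise ValueError("not enough history for the requested lags")
    past_y = [y_hist[t - k] for k in range(1, n_y + 1)]
    # Each u(t-k) is a vector of exogenous inputs (e.g. financial ratios).
    past_u = [value for k in range(1, n_u + 1) for value in u_hist[t - k]]
    return past_y + past_u


# Hypothetical series: status encoded as 0 = good, 1 = bad, 2 = very bad,
# with two exogenous inputs per time step.
y_hist = [0.0, 1.0, 2.0, 2.0]
u_hist = [[0.5, 1.5], [0.4, 1.4], [0.3, 1.3], [0.2, 1.2]]
x = narx_regressor(y_hist, u_hist, t=3, n_y=2, n_u=1)
```

With n_y = 2 and n_u = 1, the vector for t = 3 holds y(2), y(1) and the two components of u(2).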

There are two types of NARX network architecture: series-parallel and parallel. The series-parallel architecture, also called open-loop, uses the existing (observed) output as one of the network inputs. The parallel architecture (closed-loop) uses the output produced by the previous iteration as one of the network inputs.

First, the series-parallel architecture is constructed in order to train the network. In this network, the inputs are the selected financial ratios (\(u_1(t), u_2(t), \ldots, u_{10}(t)\)) and the existing outputs (\(y(t)\)). Series-parallel NARX completes the training phase in a shorter time than parallel NARX because it uses the existing output values.

Afterwards, the trained network is transformed to the parallel architecture for the prediction step. The parallel NARX network is used because it makes multi-step ahead prediction possible.
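The difference between the two architectures can be illustrated with a small sketch. It assumes n_y = n_u = 1, uses a fixed linear function as a stand-in for the trained network f, and assumes that the exogenous inputs needed for each forecast step are available; none of these choices comes from the original study.

```python
def f(y_prev, u_prev):
    """Stand-in for the trained network: a fixed linear map (illustrative only)."""
    return 0.5 * y_prev + 0.25 * sum(u_prev)


def open_loop_step(y_obs, u, t):
    """Series-parallel (open-loop): regress on the observed output y(t-1)."""
    return f(y_obs[t - 1], u[t - 1])


def closed_loop_forecast(y_last, u_seq, steps):
    """Parallel (closed-loop): feed each prediction back as the next y(t-1)."""
    preds, y_prev = [], y_last
    for k in range(steps):
        y_prev = f(y_prev, u_seq[k])
        preds.append(y_prev)
    return preds


# Hypothetical data: two observed outputs and three input vectors.
y_obs = [1.0, 2.0]
u = [[1.0, 1.0], [2.0, 2.0], [0.0, 2.0]]

one_step = open_loop_step(y_obs, u, t=1)                       # uses observed y(0)
multi_step = closed_loop_forecast(y_obs[-1], u[1:], steps=2)   # feeds predictions back
```

In the closed-loop call, only the first step starts from an observed output; the second step depends on the first prediction, which is what allows forecasting more than one step ahead.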

Evaluation Step. Accuracy, Type I error and Type II error are measured to assess the performance of the applied algorithms. Brief descriptions of these measures are given below:

Accuracy is the ratio of the number of correct predictions to the total number of predictions.

Type I error (false positive) means predicting a firm’s financial status as good when it is actually bad or very bad. Predicting a firm’s financial status as bad when it is actually very bad is also a Type I error.

Type II error (false negative) means predicting a firm’s financial status as bad when it is actually good. Predicting a firm’s status as very bad when it is actually bad or good is also a Type II error.

Comparing the two, Type I error is more significant than Type II error for our problem. If a firm’s financial status is bad or very bad but our methodology says that it is good, the firm’s managers will not take the necessary precautions and the firm may end up in bankruptcy.
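The three measures can be computed with a short sketch such as the one below. It treats the classes as ordered by severity, so a Type I error is any prediction that is healthier than the actual status and a Type II error is any prediction that is worse. Dividing the error counts by the total number of predictions is an assumption made here for illustration, since the exact rate definition is not stated above; the function and variable names are also hypothetical.

```python
# Ordinal encoding of the class values (assumed for this sketch).
SEVERITY = {"good": 0, "bad": 1, "very bad": 2}


def evaluate(actual, predicted):
    """Return accuracy, Type I rate and Type II rate over all predictions."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    correct = type1 = type2 = 0
    for a, p in zip(actual, predicted):
        if SEVERITY[p] == SEVERITY[a]:
            correct += 1
        elif SEVERITY[p] < SEVERITY[a]:
            type1 += 1  # predicted healthier than the actual status
        else:
            type2 += 1  # predicted worse than the actual status
    return {"accuracy": correct / n, "type1": type1 / n, "type2": type2 / n}


# Hypothetical labels for four firms.
actual = ["good", "bad", "very bad", "bad"]
predicted = ["good", "good", "bad", "very bad"]
scores = evaluate(actual, predicted)
```

In this toy example one prediction is correct, two are Type I errors and one is a Type II error.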

4 Performance Evaluation Results

As mentioned before, two separate datasets are constructed from the raw data. The first contains quarterly data and the second contains annual data. In both datasets, 2015 data is used for testing. The test sets of the quarter-period and annual datasets contain 222 and 66 samples, respectively. Class values are defined as good, bad and very bad. The optimal parameters are determined using a validation set that consists of 2014 data.
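A year-based split of this kind might look like the sketch below. Using all years before 2014 for training is an assumption made here, as the training period is not stated explicitly; the record layout and function name are hypothetical.

```python
def split_by_year(samples, validation_year=2014, test_year=2015):
    """Split samples chronologically into training, validation and test sets."""
    # Assumption: everything before the validation year is used for training.
    train = [s for s in samples if s["year"] < validation_year]
    validation = [s for s in samples if s["year"] == validation_year]
    test = [s for s in samples if s["year"] == test_year]
    return train, validation, test


# Hypothetical samples, one per firm-year.
samples = [
    {"firm": "A", "year": 2012},
    {"firm": "A", "year": 2013},
    {"firm": "A", "year": 2014},
    {"firm": "A", "year": 2015},
    {"firm": "B", "year": 2015},
]
train, validation, test = split_by_year(samples)
```

Splitting by year rather than at random keeps the test period strictly later than the data used for fitting and parameter selection.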

Table 2. Comparison results for quarter-period dataset

The constructed NARX network has one hidden layer with 10 neurons, and the Levenberg-Marquardt algorithm [31] is used to train the network. The 10 financial ratios given in Table 1 are used as inputs. The network has a single output, which corresponds to the financial status of the firm. The transfer function of the NARX model is the sigmoid function. In Eq. 1, \(n_y\) and \(n_u\) are the lags of the output and the input of our NARX model, respectively. With \(n\) denoting the number of steps predicted ahead, \(n = 1\) means one-step ahead prediction, while any larger value of \(n\) means multi-step ahead prediction (if \(n = 2\), the model predicts the value 2 steps ahead).

Apart from NARX, the prediction algorithms are applied using Weka. The NARX algorithm is implemented in MATLAB. The results of the applied methods for the first dataset (quarter-period dataset) are given in Table 2.

As can be seen in Table 2, Random Forest gives the best accuracy for the quarter-period dataset. The lowest Type I and Type II error rates for this dataset are also obtained with Random Forest. One-step ahead NARX provides the second-best results for accuracy, Type I and Type II error rates.

As shown in Table 3, one-step ahead NARX gives the best accuracy for the annual dataset. For Type I error, Random Forest outperforms one-step ahead NARX. For Type II error, NARX gives the lowest error rate.

Table 3. Comparison results for annual dataset

Random Forest gives satisfactory results because it is an ensemble learning method. It contains a multitude of decision trees, and the final prediction is chosen by a voting mechanism. The ensemble learning approach is based on obtaining highly accurate classifiers by combining less accurate ones.
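The voting mechanism can be illustrated with a toy sketch. The three hand-written rules below merely stand in for trained decision trees, and the thresholds and feature names are hypothetical; a real Random Forest also trains each tree on a random sample of the data and features.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most frequent label among the individual predictions."""
    return Counter(votes).most_common(1)[0][0]


def forest_predict(trees, x):
    """Collect one vote per tree and return the majority class."""
    return majority_vote([tree(x) for tree in trees])


# Hand-written stand-ins for trained trees (illustrative thresholds only).
trees = [
    lambda x: "bad" if x["debt_ratio"] > 0.6 else "good",
    lambda x: "bad" if x["current_ratio"] < 1.0 else "good",
    lambda x: "very bad" if x["debt_ratio"] > 0.9 else "good",
]

firm = {"debt_ratio": 0.7, "current_ratio": 0.8}
label = forest_predict(trees, firm)
```

Two of the three rules vote bad for this firm, so the combined prediction is bad even though one rule disagrees.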

In addition to Random Forest, one-step ahead NARX also gives better results than the other prediction models of the framework on our datasets. A plausible explanation is that NARX is commonly used for time series prediction, and our datasets have a temporal ordering across the financial ratios.

Since this framework also provides multi-step ahead business failure prediction, we ran additional experiments for multi-step ahead prediction using the parallel NARX network. Detailed results of these experiments are given in Tables 4 and 5.

In Table 4, the 5-step ahead prediction gives the result for one year later in the quarter-period dataset. As can be seen from Table 4, the accuracy of 3-step ahead NARX is lower than expected. We suspect that this decrease is caused by class imbalance in the 3-step ahead test set.

Table 4. Comparison results for one step and multistep ahead NARX for quarter-period dataset
Table 5. Comparison results for one step and multistep ahead NARX for annual dataset

In Table 5, each step corresponds to one year, so the 5-step ahead prediction gives results for five years later. Not surprisingly, the results for accuracy, Type I and Type II error rates worsen year after year. Long-term business failure prediction is challenging because political and societal changes, which are themselves difficult to predict, also play a role in business failure.

5 Conclusions

In this study, we presented a framework for business failure prediction. To this end, Logistic Regression, Multilayer Perceptron, Sequential Minimal Optimization, Bayesian Network, Naive Bayes, J48 Tree, Random Forest, Random Tree and NARX models are constructed within the framework. To the best of our knowledge, this is the first study that uses NARX for business failure prediction. All prediction models of the framework are tested separately on two different datasets, which contain firms from Turkey. The first dataset uses quarterly data, while the second uses annual data for the financial ratios and class values.

In conclusion, the results indicate that the proposed framework is useful for business failure prediction. With this framework, a suitable business failure prediction model for a given dataset can be chosen easily. Moreover, the NARX model makes multi-step ahead business failure prediction possible.