1 Introduction

Academics and practitioners have long shown interest in the prediction of corporate bankruptcy and financial distress.Footnote 1 Since the seminal work of Beaver (1966) and Altman (1968) researchers have almost without exception developed bankruptcy prediction models using accounting information as the variables (covariates) that predict financial distress.Footnote 2 More recent studies have focused on how models based on a market-driven measure of distance to default (DD), derived from the Merton (1974) structural model for pricing corporate debt,Footnote 3 perform in predicting bankruptcy; see, for example, Vassalou and Xing (2004), Duffie et al. (2007) and Bharath and Shumway (2008). Shumway (2001) shows that a combination of both accounting and market-based variables in a hazard model can predict corporate financial distress more accurately than accounting or market-based variables alone for US firms.

While there is a wealth of evidence on the ability of these models to predict financial distress for US firms there is much less evidence on how they perform on samples outside of the US, both in terms of how the different models perform and whether different predictors are needed. We provide an evaluation of how well these models predict financial distress outside of the US in this paper. We first examine their performance in predicting the financial distress of UK firms. The UK provides a particularly valuable focus for this empirical investigation since the London Stock Exchange is the third largest in terms of capitalisation following the New York and NASDAQ exchanges. Further, London’s second tier market for small and growing companies, AIM, continually attracts new listings and is considered to be a successful growth market. Therefore, the UK provides an ideal environment to examine how well corporate financial distress prediction models work in a developed economy other than the US. Taffler (1983) uses an accounting-based model to estimate Z scores to predict corporate financial distress in the UK which Agarwal and Taffler (2007) find has clear predictive ability. Agarwal and Taffler (2008) compare Taffler’s Z-score model with market-driven measures from the structural Merton DD model. However, they treat the accounting and market-based information sets as mutually exclusive, something which, according to US evidence, is not appropriate. Our paper can therefore be thought of as a complement to, and extension of, Agarwal and Taffler (2008), where we do not treat accounting and market-based information as mutually exclusive but as complementary. We then investigate whether a model that predicts bankruptcy for firms in developed economies can also predict corporate financial distress efficiently in an emerging market. In particular, we evaluate the performance of bankruptcy forecast models for Indian firms and explore whether there is a common hazard model that can be applied successfully in both developed and emerging markets.

To predict the probability of financial distress we use a reduced-form hazard model in the spirit of Shumway (2001).Footnote 4 We divide the information set from which we draw our predictor variables into three categories. First, we examine the ability of accounting information to predict the probability of financial distress. Starting our analysis with the UK, we begin by examining whether the popular Z-score measure contains any predictive information about the probability of bankruptcy. We calculate the Z-score using Taffler’s model (Taffler (1983) and Agarwal and Taffler (2007)) as we are interested in predicting the probability of financial distress for UK firms. Given that the Z-score and the probability of financial distress should capture the same information, we would expect to find a significant association between them. We also decompose the Z-score into its individual components to examine which of the accounting ratios that comprise the Z-score significantly predict the probability of bankruptcy. Second, we examine the ability of market information to predict bankruptcy by exploring whether the expected default frequency (EDF) is related to the prediction of financial distress. We also evaluate the performance of a hazard model that contains what we term Shumway market-based variables, i.e. market capitalization, past excess returns and stock return volatility. Third, we investigate the role of both accounting and market-based information in predicting bankruptcy by comparing a set of reduced-form models. We investigate the extent to which a model that combines Z-score, or its components, and EDF can predict bankruptcy. We also assess the predictive properties of a model that combines Z-score with the market-based variables used by Shumway (2001). Finally, we examine whether a model that incorporates profitability, book leverage and market-driven variables can predict financial distress more accurately.

We find that in a univariate model, Z-score significantly predicts the probability of financial distress but when the Z-score is decomposed into its individual components we find that only two of the four accounting ratios that comprise it are statistically significant. Using market-based information we find that EDF is positively associated with the prediction of financial distress while in a model that is based solely on Shumway’s market-based variables, we find that excess returns, size based on market capitalization, and return volatility all significantly predict the probability of bankruptcy for UK firms. When we combine Z-score and EDF, we find that both significantly predict the probability of financial distress with the expected signs. Things deteriorate rapidly for accounting ratios based on the Z-score when faced with market-based information, however. In a model that incorporates EDF and the accounting ratio components of Z-score we find that only EDF is associated with the prediction of financial distress. When we include the Z-score in the model with market-based variables other than EDF, Z-score has no incremental predictive power above that contained in the market-based variables. Given the apparent inability of the Z-score to predict bankruptcy once excess returns, volatility and relative size are considered in the model, we broaden the set of accounting ratios. We find that in addition to the Shumway market-based variables, book leverage is statistically significant. Results from Vuong’s (1989) model comparison test and in-sample prediction tests confirm that the combination of book leverage and Shumway market-based variables best explains the probability of bankruptcy for UK firms. Finally, we consider the out-of-sample predictive capabilities of these models. This test is all the more stringent in our case given that we test how successfully these models predict corporate failures during the global financial crisis. Out-of-sample forecast tests show that the model for the UK that combines leverage with three market-driven variables identifies the highest number of bankruptcies that occurred during the financial crisis and yields the lowest misclassification rate.

We then turn our attention to an evaluation of the performance of these models in predicting the probability of financial distress for Indian firms. Similar to what we find for the UK, among the accounting ratios only profitability and financial risk (FRISK) impact on the probability of financial distress for Indian firms. EDF has a strong positive effect on the probability of bankruptcy. However, when we combine EDF with the accounting ratios only profitability is significant for the Indian sample. With respect to the Shumway market-based variables, while size affects the probability of corporate financial distress in India, excess past stock returns and stock return volatility fail to predict bankruptcy. When we combine the Shumway market-based variables with profitability and financial risk we find that only profitability and FRISK are strongly associated with the probability of financial distress. We conclude that a model that includes only accounting ratios reflecting profitability and FRISK best describes the probability of financial distress for Indian firms. In-sample and out-of-sample forecast tests confirm the model with the two accounting ratios has the highest predictive ability.

Finally, we conduct a variety of robustness tests to examine whether our results are robust to the inclusion of variables that capture movements in the macroeconomy and whether they are robust to controlling for industry effects both for the UK and the Indian sample. We find that our results for UK and Indian firms are robust.

The results of this paper have important implications. First, in terms of developed markets, we provide evidence that hazard models predicting the financial distress of US firms using accounting and market-based variables seem robust in that they perform well out-of-sample in terms of their ability to predict the probability of financial distress not only for US firms but also for UK firms. Second, the paper offers evidence on how bankruptcy forecast models perform in an emerging market, in this case India. Unlike for developed markets we find that market-based variables fail to predict financial distress for Indian firms. We find that the best financial distress prediction model is exclusively based on two accounting ratios: profitability and FRISK. Our results suggest that we cannot use a common model to predict financial distress regardless of the state of development of the economy of a country; in the Indian market, at least, it appears that accounting information alone provides better predictions of financial distress.

The rest of the paper is organized as follows. Section 2 provides an overview of how we model the probability of financial distress. Section 3 describes our data. Section 4 presents the results from the various bankruptcy forecasting models for the UK, including an analysis of the in-sample and out-of-sample forecast accuracy of the models. Section 5 evaluates bankruptcy prediction models in an emerging economy, India, while Sect. 6 reports the results from robustness tests. Section 7 concludes.

2 Modeling the probability of financial distress

A variety of estimation techniques have been employed to develop bankruptcy forecasting models. Beaver (1966) uses a multiple regression model to predict corporate failure with accounting ratios. Altman (1968) employs multivariate discriminant analysis to derive the Z-score measure for predicting bankruptcy for US firms; Taffler (1983) uses the same technique for UK firms, while Altman et al. (1977) use quadratic discriminant analysis to identify firms in danger of going bankrupt. Ohlson (1980) estimates a conditional logit model to generate the probability that a firm will enter bankruptcy (known as the “O-score”) while Zmijewski (1984) estimates a probit model.Footnote 5 On a cautionary note, however, Grice and Dugan (2001) find that the models of Ohlson (1980) and Zmijewski (1984) are sensitive to the time periods used, with the accuracy of these two models declining over time. Further, Shumway (2001) argues that these bankruptcy forecasting models are misspecified as they do not properly account for the length of time that a healthy firm has survived. In particular, such models typically use observations on the explanatory variables for a single firm-year (hence they can be thought of as static), which can be arbitrarily chosen for non-bankrupt firms, while for non-bankrupt firms the firm-year observation is not randomly selected, typically corresponding to the year before bankruptcy. This induces a selection bias; see, Shumway (2001) and Hillegeist et al. (2004). Shumway (2001) shows that ignoring information about the length of time a healthy firm has survived produces biased and inconsistent estimates of the parameters of the model. To properly address this, Shumway (2001) uses a discrete time hazard model. In the hazard model, the hazard rate is the probability of the firm going bankrupt at time t conditional upon having survived until time t. Therefore, in a hazard model the probability of bankruptcy changes through time. This variation in the probability of bankruptcy over time not only allows researchers to take advantage of all the available firm-year observations, it allows the probability of bankruptcy to change as a function of a vector of explanatory variables, known as covariates, that also change over time. The general form of the hazard model is

$$\ln \left[ \frac{h_i(t)}{1-h_i(t)} \right] = \alpha(t) + {\varvec{\beta}}^{\prime }{\mathbf{x}}_{it}$$
(1)

where h i (t) represents the hazard of bankruptcy at time t for firm i, conditional on survival to t; α(t) is the baseline hazard; β is a vector of coefficients and x it a k × 1 vector of observations on the ith covariate at time t. The attraction of this approach, as Shumway (2001) shows, is that the discrete-time hazard model is econometrically equivalent to a dynamic (multi-period) logit model where each period that a firm survives is included as a non-failing firm. Therefore, we estimate the probability of bankruptcy as

$$P_{t-1}(Y_{it}=1) =\frac{1}{1+\exp {(-\alpha -\varvec{\beta }^{\prime }\mathbf {x}_{it-1})}}$$
(2)

where Y it is a variable that equals one if firm i goes bankrupt in year t, zero otherwise. β and x are as before. Notice that we use data dated t − 1 in estimating the probability of bankruptcy. This is to ensure that we only use data that is actually available at the beginning of the year in which bankruptcy occurs. Given that it is possible to set the hazard model up as a logit model, it is very easy to estimate with one proviso. Before any inference can be undertaken in relation to the significance or otherwise of the elements of β, it is necessary to adjust the Wald statistic that tests the significance of the coefficients. The reason for this is that because we treat each firm-year observation as if it were a separate firm, estimation using standard logit routines treats the model as if it were static. In the static logit model, the number of firm-years is used in calculating the Wald statistics. However, this is not correct for the dynamic logit model because in the dynamic logit model, unlike the static logit model, firm-year observations are not independent of each other. For the dynamic logit model, it is the number of firms rather than the number of firm years that should be used. The test statistics therefore need to be scaled by the average number of firm years per firm.Footnote 6

3 Data

For the UK sample, we obtain the accounting data from Datastream and the market data from the London Share Price Database (LSPD). We use the returns on the FTSE All Share Index as a measure of the returns on the market index. We exclude financial firms and utilities from the sample. We exclude firm-year observations for which we do not have available data. The final estimation sample consists of 2,277 (alive and dead) UK listed firms with 17,662 firm-year observations over the period 1980–2006. We identify bankrupt firms as follows. For each year in the sample, we search the LSPD database based on the LSPD death type. We define a firm as bankrupt if a firm’s LSPD death type is liquidation, voluntary liquidation, receiver appointed/liquidation, in administration/administrative receivership, and canceled assumed valueless, otherwise it is classed as non-bankrupt. We identify 175 bankrupt firms who provide 1,055 firm-year observations in total. There are 2,102 non-bankrupt firms in our sample, providing 16,607 firm-year observations. We also use data from 2007–2011, covering the period of the financial crisis, to conduct out-of-sample forecast accuracy tests. This sample consists of 5,315 firm-year observations from 1,476 firms, of which 92 enter financial distress. Table 1 provides detailed information on the definition of the variables we use in the study.

Table 1 Definition of variables

Of the accounting information we use, profit before tax (PROF), working capital (WCAP), FRISK and Liquidity (LIQUID) are the accounting ratios on which Taffler (1983) bases his Z-score. We also include book leverage (BLEV) in the accounting-variable information set since leverage has been found to predict financial distress in the US (see Zmijewski 1984; Shumway 2001, for example.) Of the market information we use, the EDF is estimated from the Merton DD Model. More detail on how we estimate the EDF from the Merton DD model is provided in the Appendix. In addition to the EDF we use the three market-based variables that Shumway (2001) uses for the US. These are relative size (REL_SIZE), which expresses the equity market capitalization of the firm relative to total equity market capitalization, excess stock returns (EXRET) and idiosyncratic stock return volatility (SIGMA). Both Shumway (2001) and Campbell et al. (2008) find that these variables are statistically significant predictors of the probability of financial distress for US firms. We include them here to examine whether these variables are robust predictors of the probability of financial distress. If they are, we would expect their UK equivalents to be significant predictors of the probability of bankruptcy for UK firms. With the exception of REL_SIZE, which appears on the basis of normality tests to be normally distributed, and Z-score, we truncate the independent variables at the 1st and 99th percentiles of the distribution to avoid outliers. For the Z-score, we follow Agarwal and Taffler (2008) and winsorize Z-score so that it lies between ±18.4207. We winsorize the sample so that the minimum (maximum) value of EDF will be 0.00001 (0.99999). However, we cannot include EDF in a logit model as it is already expressed in the form of a probability. Therefore, following Hillegeist et al. (2004), in the next section we transform EDF into a “score” using the inverse logistic function \(\text {EDF-score}=\ln (EDF/(1-EDF))\). The winsorization of EDF leads EDF-score to lie between ±11.512992. Descriptive statistics for the explanatory variables of the UK sample are provided in Panel A of Table 2. We observe that the average value of PROF is lower than its median value as the distribution of PROF is negatively skewed. Z-score, which includes PROF follows the same distribution. The average EDF for our sample is 10 %.Footnote 7 The average of REL_SIZE is negative as it is defined as the logarithm of a generally small fraction (see Table 1.)

Table 2 Summary statistics

With respect to the Indian sample we use Thomson Financial Datastream to collect accounting and market data. We use the returns on the CNX500 as measure of the returns of the market index in India. Similar to the UK sample, we exclude firm-year observations for which we do not have available data. The final sample includes 958 alive and dead Indian firms listed on the National Stock Exchange of India over the period 2002–2012. We collect data on delisted firms and the reason for delisting from the National Stock Exchange of India. For the estimation sample, which covers the period 2002–2006, we identify 43 financially distressed firms, yielding 250 firm-year observations, while the non-bankrupt sample contains 915 firms with 4,717 firm-year observations. To perform the out-of-sample forecast test we use data from 2007–2012. This sample consists of 4,408 firm-year observations on 949 firms, of which 37 firms are bankrupt. Descriptive statistics for the variables for the Indian sample are presented in Panel B of Table 2. Unlike the UK sample, PROF is normally distributed while FRISK is negatively skewed. We note that the average EDF for Indian firms is 20 %, twice than that observed for UK firms.Footnote 8

4 Results For UK firms

The results from estimating a series of dynamic logit models for the UK sample using different information sets are presented in Panel A of Table 3.Footnote 9 The first two columns provide evidence on the ability of accounting information to predict financial distress. The column headed ZSCORE is for a model where the only predictor of the probability of financial distress is the Z-score. The column headed ZDECOMP is a multivariate model where we decompose the Z-score into its component accounting ratios to examine which, if any, of the ratios that make up the Z-score individually help predict the probability of financial distress.

Table 3 Results for hazard models predicting the probability of financial distress for UK firms

Focusing on the ZSCORE column, encouragingly, but perhaps not surprisingly given that the Z-score should be capable of identifying firms in financial distress, we find that the Z-score is a statistically significant predictor of the probability of financial distress and has the correct sign: the higher is the Z-score, the less likely the firm is to fail, and this is what we observe. This is an encouraging start, but if Z-score is a powerful predictor of financial distress, as the results in the ZSCORE column suggest it might be, we might expect all of its component ratios to predict financial distress. The ZDECOMP column reports the results from examining which of the components of the Z-score best predict financial distress. When we break the Z-score down into its constituent parts, we find that the significance of the Z-score in predicting the probability of financial distress is driven by profitability and FRISK, with more profitable firms less likely to fail while firms with higher FRISK (higher current liabilities relative to total assets) more likely to fail. WCAP and LIQUID appear to have no incremental predictive power over that contained in profitability and financial risk. The predictive ability of Z-score, then, derives from firms’ profitability and FRISK.

Turning our attention to the market information, the column headed EDF presents the results from a model that uses an equivalent measure of the expected default frequency (EDF-score) derived from the Merton DD model, as a predictor of financial distress. The results show that there is a positive association between EDF-score and the probability of financial distress. The next column, ZEDFSCORE, presents the results of a model that combines Z-score with EDF-score to forecast financial distress. There is clear evidence that both Z-score and EDF-score are strongly related to the probability of financial distress for UK firms, and both enter the model with the expected signs. The EDFZDECOMP column documents the parameter estimates from a model that includes the EDF-score and the accounting-ratio components of Z-score. Surprisingly, we find that while EDF-score remains a significant predictor of bankruptcy the accounting ratios on which Z-score is based are no longer associated with bankruptcy.

The model in the MV column predicts the probability of financial distress using Shumway’s three market-driven variables, i.e., relative size, excess returns and idiosyncratic return volatility. Size, as measured by relative market capitalization, and excess returns are significant, although size is only significant at the 10 % level, and are negatively related to failure while volatility is significantly positively related to failure. The column labelled MVZSCORE reports the results from a model that combines the three market-based predictors with Z-score. We find that Z-score fails to provide any incremental ability to predict bankruptcy above that contained in the MV model.Footnote 10

Shumway (2001) shows that we can improve corporate financial distress prediction by using both accounting and market-driven variables. He uses profitability and book leverage along with the three market-driven variables relative, excess returns and volatility to predict bankruptcy and finds that leverage coupled with the three market-driven variables improves our ability to predict financial distress. We follow Shumway (2001) and apply his model to our sample of UK firms to see whether his findings generalize to the UK. In particular, we estimate a dynamic logit model that includes profitability and book leverage, along with relative size, excess past returns and idiosyncratic volatility, to forecast bankruptcy for UK firms. The results of this model can be found in the MVACC column. Like Shumway (2001), we find that profitability is statistically insignificant but book leverage is significant at the 5 % level. The Shumway market-driven variables also retain their earlier significance. The results suggest that a model with book leverage, past excess returns, relative size and idiosyncratic return volatility may be the best model (in terms of statistical significance) to predict the probability of financial distress. To examine whether this is the case, we re-estimate the MVACC model with profitability excluded. The results can be found in the MVLEV column, where it can be seen that book leverage, relative size, excess returns and idiosyncratic return volatility are strongly associated with the probability of financial distress.

The intriguing question that arises from this analysis is which of the above models provides the best prediction of the probability of a firm entering financial distress? To investigate this question, we follow Hillegeist et al. (2004) and use the Vuong (1989) model comparison test.

Vuong (1989) develops a test for choosing between two models, i and j. Under the null hypothesis that there is no difference between the two models, the log of the ratio of the likelihood for model i to that for model j should be zero. If the difference is significantly positive, i is preferred to j and vice versa. Vuong (1989) derives a statistic that allows us to test this hypothesis. Under the null hypothesis that there is no difference between the competing models, the test statistic has a standard normal distribution. Panel B of Table 3 contains results of the Vuong test for various models. Of interest here is how the MVLEV model performs against the ZSCORE, ZDECOMP, EDF, ZEDFSCORE, and MV models. This information is contained in Panel B of Table 3. The Vuong tests show that the null hypothesis that there is no difference between MVLEV and ZSCORE, MVLEV and ZDECOMP, MVLEV and EDF, MVLEV and MV is soundly rejected in all four cases. It shows that MVLEV performs better than the ZSCORE, ZDECOMP, EDF and MV models. In addition we compare the MV and ZSCORE models, and the MV and EDF models. Unlike Agarwal and Taffler (2008), we find that the MV model contains more significant information about the probability of financial distress than the univariate ZSCORE model. In line with Agarwal and Taffler (2008) we find that the MV model outperforms the univariate EDF model. The last row of Panel B in Table 3 compares the performance of Z-score with the EDF model. Based on the Vuong test there is no difference between Z-score and the EDF model, which is in line with Agarwal and Taffler (2008).

We then compare the predictive power of our “best” reduced-form model, MVLEV, with that of the univariate ZSCORE and EDF models to explore whether each of Z-score and EDF provide incremental information about the probability of bankruptcy that could further improve the MVLEV model. The column headed “MVLEV with Z-score” in Panel A of Table 4 reports the estimated coefficients when we include Z-score in the MVLEV model. The column headed “MVLEV with EDF-score” does likewise with the EDF-score derived from the Merton DD model. The table clearly shows that neither Z-score nor EDF-score offer any incremental predictive ability above that contained in the specification of the MVLEV model. This means that when we account for the variables used in the MVLEV model Z-score and EDF lose their predictive power. While Agarwal and Taffler (2008) provide evidence that Merton DD models convey no incremental information once conditioned on control variables, they find that Z-score continues to provide incremental information about the probability of financial distress when it is combined with control variables. This is possibly because Agarwal and Taffler (2008) use contemporaneous predictors of financial distress instead of lagged predictors. Panel B of Table 4 reports the Vuong test statistics, which show there is no difference between either MVLEV and the MVLEV model augmented with Z-score or MVLEV and the MVLEV model augmented with the EDF-score. This confirms the results in Panel A of Table 4, suggesting that each of the Z-score and EDF-score fail to make an incremental contribution to financial distress prediction above that of MVLEV model.

Table 4 Comparison of the MVLEV model with the ZSCORE and EDF models for the UK sample

Taken together the results in Tables 3 and 4 provide some interesting insights into forecasting the probability of financial distress for UK firms. We find that by combining accounting and equity market-driven variables that are strongly associated with financial distress we can predict the probability of financial distress for UK firms more accurately. However, we need to be cautious about the type of accounting and market information used when developing a financial distress prediction model as the choice of variables matters. We provide evidence that a model that combines book leverage with the three market-driven variables used in Shumway (2001) describes better the probability of financial distress than a model that combines Z-score and EDF-score derived from a Merton DD model. Therefore, book leverage and Shumway market-driven variables, which are directly obtained from the balance sheets and the equity markets, are more capable of forecasting financial distress than variables that have already been estimated as a function of accounting ratios (Z-score) or market-driven variables (EDF). Our results also cast doubt on the robustness of Z-score and the Merton DD model in predicting financial distress. While Taffler (1983) and Agarwal and Taffler (2008) show that Z-score and EDF have predictive ability with respect to UK firms, we find that Z-score and EDF do not remain significant predictors of bankruptcy once conditioned on book leverage and the Shumway market-based variables. Our results are consistent with Shumway’s (2001) findings for the US. However, what we have yet to see is whether things change when we directly consider the in and out-of-sample predictive capabilities of these models. This is the subject matter of the next section.

4.1 In-sample and Out-of-sample forecast accuracy

4.1.1 In-sample performance

To evaluate the predictive ability of the models from the previous section, we sort firms into deciles based on the probability of bankruptcy estimated by the relevant model. Deciles one through five contain firms that are more likely to go bankrupt, decile one containing those firms with the highest predicted probability of bankruptcy, while deciles six through ten contain those firms that are considered least likely to go bankrupt, decile ten containing those firms with the lowest predicted probability of bankruptcy. We then calculate the percentage of bankrupt firms that are allocated to the various deciles. Table 5 presents the results for in-sample predictive ability. This can be thought of as a means by which we can assess the explanatory capability of the various models using the entire sample to estimate the models. We assess their ability to correctly classify those firms that went bankrupt as likely to go bankrupt. In particular, for each model, we report the percentage of bankrupt firms classified in firms with high probability of financial distress (deciles 1–5). Also, for each model, we show the percentage of bankrupt firms classified in firms with low probability of financial distress (deciles 6–10). This represents the misclassification rate of each model. We consider the ability of the models to predict bankruptcy out-of-sample in the next subsection.

Table 5 In-sample forecast accuracy examining the percentage of firms predicted to go bankrupt that actually went bankrupt, UK sample

Looking at Table 5, an interesting point to note is that all of the models perform well in terms of identifying bankrupt firms as firms more likely to go bankrupt than less. The ZSCORE model correctly places around 81 % of those firms that do go bankrupt in deciles 1 through 5, which contain those firms that are predicted more likely than not to go bankrupt. The ZDECOMP model only correctly predicts that around 73 % of those firms that actually went bankrupt were more likely to go bankrupt than not. This is quite a deterioration in explanatory capability relative to the ZSCORE model. The EDF model correctly classifies around 79 % of bankrupt firms in the deciles 1–5. Things improve once we include both Z-score and EDF-score (ZEDFSCORE). The ZEDFSCORE model classifies 83 % of bankrupt firms within deciles 1–5. However, the model that includes only the Shumway market-driven variables (see the column headed MV) correctly classifies 86 % of those firms that actually go bankrupt as more likely than not to go bankrupt while the model including the Shumway market-driven variables and an accounting ratio reflecting book leverage (MVLEV) correctly classifies 89 % of bankrupt firms as more likely than not to go bankrupt. Further, compared to the previous models, the MVLEV model classifies the highest proportion of firms that go bankrupt into the first decile (45 %) and exhibits the lowest misclassification rate (11 %). Thus, while all of the models perform well in terms of correctly classifying those firms more likely to go bankrupt, the best performer by some way is the model that combines Shumway’s market-driven variables with an accounting ratio capturing leverage.

4.1.2 Out-of-sample performance

While our in-sample forecast results in the previous section appear to lend strong support to a model that predicts bankruptcy using book leverage, size, excess returns and volatility there is a danger that the model is uninformative out-of-sample due to over-fitting. Over-fitting is a situation in which a model includes regressors which improve the in-sample fit of the model but penalize the model in an out-of-sample forecasting test. In this sub-section we examine the out-of-sample forecasting performance of the models discussed in Sect. 4. In particular, we investigate their forecasting ability over the period 2007–2011, a period which includes the recent financial crisis. The financial crisis that erupted in 2007 offers a unique challenge to test the out-of-sample predictive ability of the bankruptcy-forecasting models discussed in Sect. 4. To examine the out-of-sample performance of the models we use the estimation sample parameter estimates in Table 3 to forecast corporate financial distress over the period 2007–2011. Table 6 reports the results. As the number of bankrupt firms is somewhat reduced we sort the firms into hexiles rather than deciles, based on the estimated probability derived from each model. Hexiles 1 through 3 contain firms that are predicted as more likely to go bankrupt, with “1” containing those firms with the highest predicted probability of bankruptcy, while 4 through 6 contain those firms that are considered least likely to go bankrupt, “6” containing those firms with the lowest predicted probability of bankruptcy. 92 firms in the sample entered financial distress during 2007–2011. We then calculate the percentage of bankrupt firms that are allocated to the various categories. Again, all of the models do reasonably well. For example, the ZSCORE model correctly identifies approximately 82 % of those firms that go bankrupt over the period 2007–2011 (the sum of hexiles 1, 2 and 3.) The ZDECOMP and EDF models correctly categorize 73 and 75 % of financially distressed firms, respectively. ZEDFSCORE performs better than ZSCORE, ZDECOMP and EDF. It correctly classifies 85 % of bankrupt firms. The MV model performs better than ZEDFSCORE as it correctly categorizes 87 % of bankrupt firms. However, in line with the in-sample forecast accuracy results presented in Table 5, it is the model with the Shumway market-driven variables and an accounting ratio capturing leverage that delivers the best out-of-sample performance. In particular, the MVLEV model correctly classifies around 89 % of firms that went bankrupt as more likely to go bankrupt and produces the lowest misclassification rate of approximately 11 %.

Table 6 Out-of-sample forecast accuracy examining the percentage of firms predicted to go bankrupt that actually went bankrupt during the financial crisis, UK sample

5 Predicting corporate financial distress in emerging markets

Our knowledge about the performance of bankruptcy prediction models has mostly been derived from data from developed economies such as the UK and the US, economies that have many institutional similarities. The evidence on the performance of corporate bankruptcy prediction models for emerging markets, however, is on the thin side.Footnote 11 In this section we investigate whether the model that best predicts the probability of financial distress for UK firms, a model that contains very similar variables to those found to predict the probability of bankruptcy for US firms, best predicts the probability of financial distress for Indian firms.Footnote 12

Table 7 Results for hazard models predicting the probability of financial distress for Indian firms

We explore the performance of several hazard models to assess their ability to predict the probability of bankruptcy for Indian firms. The results are presented in Table 7. Unlike for the UK and US, there is no established Z-score model for predicting financial distress of Indian firms. We therefore begin our analysis by looking directly at the predictive ability of the individual accounting ratios that comprise the Z-score, that is, we start with the ZDECOMP model. The results appear in the ZDECOMP column in Table 7. We find that profitability and FRISK are statistically significant and that profitability is negatively related to the probability of financial distress while FRISK is positively related to the probability of financial distress for Indian firms. WCAP and LIQUID have no significant impact on the probability of financial distress. Overall, the evidence on the predictive ability of the accounting ratios is in line with the evidence documented in Table 3 for the UK. With respect to the predictive ability of EDF, we find that, as for the UK, EDF-score is positively and significantly related to the probability of bankruptcy, as can be seen in the EDF column of Table 7. However, when EDF-score is combined with the accounting ratios, only profitability and FRISK are significant predictors of corporate financial distress (see the ZDECOMPEDF column in Table 7.) The results in the MV column show that of the market variables, only relative size significantly predicts financial distress; excess past stock returns and sigma do not significantly predict the probability of financial distress for Indian firms. The MVACC column reports results from combining market information and accounting information in the information set. The results show that the market-based variables (relative size, excess past stock returns and sigma) do not play a significant role in predicting the probability of financial distress for Indian firms. In this case, only accounting information matters. In particular, profitability and FRISK contribute to the prediction of financial distress. The column entitled 2ACCSIZE shows results from a model that uses profitability, FRISK and relative size to forecast bankruptcy. While profitability and FRISK are strongly related to the prediction of bankruptcy, there is no association between relative size and the probability of financial distress once profitability and FRISK are accounted for. The final column in Table 7 (2ACC) reports results from a model that only includes profitability and FRISK as predictors of bankruptcy for Indian firms. Both are statistically significant. Panel B of Table 7 reports the results from Vuong model comparison tests. These results suggest that no one model dominates the others as we observe that the \(z\) statistic is statistically insignificant in all cases. Therefore, we need to perform in-sample and out-of-sample forecast accuracy tests to assess the predictive ability of each model.

Table 8 In-sample forecast accuracy examining the percentage of firms predicted to go bankrupt that actually went bankrupt, Indian sample

Table 8 presents results for the in-sample predictive ability of each model. As with the UK sample, firms are sorted into groups based on the probability of financial distress derived from each model. Hexiles 1–3 contain the firms that are more likely to go bankrupt while hexiles 3–6 contain firms that are less likely to go bankrupt. The ZDECOMP model correctly allocates 65.12 % of bankrupt firms to hexiles 1–3. The EDF model exhibits the worst predictive ability as it correctly classifies less than 50 % of bankrupt firms as being bankrupt. The EDFZDECOMP model, which combines the individual accounting ratios with the EDF-score, has similar predictive ability to the ZDECOMP model (it correctly classifies 65.11 % into hexiles 1–3.) The MV model correctly identifies 67.44 % of the bankrupt firms whereas the MVACC model correctly allocates 62.79 % of bankrupt firms into hexiles 1–3. The 2ACCSIZE model classifies a higher percentage of financially distressed Indian firms (65.12 %) than the MVACC model but the 2ACC model has the best in-sample predictive ability, classifying the highest percentage of bankrupt firms in hexiles 1–3, 69.77 % in total.

Table 9 Out-of-sample forecast accuracy examining the percentage of firms predicted to go bankrupt that actually went bankrupt during the financial crisis, Indian sample

As for our analysis of UK firms, we also perform an out-of-sample forecast accuracy test for each model. We estimate each model using data from 2002–2006 and use the coefficients estimated using this sample to estimate the probability of financial distress for the period 2007–2012. We then sort the firms into hexiles based on the estimated probability of financial distress. Hexiles 1–3 contain firms that are more likely to enter bankruptcy while hexiles 4–6 contain firms that are less likely to go bankrupt. The results are presented in Table 9. The ZDECOMP model correctly allocates 62.16 % of bankrupt firms to hexiles 1–3 and 37.84 % of bankrupt firms are misclassified into hexiles 4–6. The EDF model has poor out-of-sample predictive ability as it correctly allocates only 40.54 % of financially distressed firms into hexiles 1–3. The EDFZDECOMP model allocates 54.05 % of bankrupt firms to hexiles 1–3, a pronounced deterioration in performance relative to the ZDECOMP model. Unlike the in-sample performance, the MV model exhibits the worst out-of-sample predictive ability as it correctly classifies only 37.84 % of bankrupt firms into the hexiles with the highest probability of financial distress. The MVACC model allocates a substantially higher percentage of bankrupt firms to hexiles 1–3 than the MV model (56.75 %), showing the importance of the accounting ratios as predictors of financial distress relative to the market-based variables. The 2ACCSIZE model performs better again, correctly classifiying 64.86 % of bankrupt firms into hexiles 1–3. However, the 2ACC model exhibits the best out-of-sample predictive ability, correctly identifying the largest number of bankrupt firms (70.27 %.) This result is in line with the in-sample evidence suggesting that the 2ACC model predicts financial distress most accurately for Indian firms .

6 Robustness tests

The results in the previous sections provide strong support for the use of a hazard model combining market- and accounting-based information to predict the probability of bankruptcy for UK firms and one using accounting-based information for Indian firms. In this section, we consider the robustness of the models to the inclusion of the actual bankruptcy rate in the previous year (the number of bankruptcies relative to the total number of firms), macroeconomic variables, industry effects and the sensitivity of our results to alternative measures of profitability and leverage.

Table 10 Robustness of the MVLEV model to the inclusion of the bankruptcy rate, macroeconomic factors and industry dummies for UK firms

Table 10 reports results from the robustness tests for the UK sample. The MVLEV-BR column in Table 10 reports the results from augmenting the MVLEV model with the previous year’s bankruptcy rate. This can be thought of as a proxy for recessionary and expansionary periods; we would expect the bankruptcy rate to be higher during a recession and lower during an economic expansion. Consistent with Agarwal and Taffler (2008), we find that this variable has no incremental explanatory power. The Vuong test also confirms that MVLEV-BR does not contain incremental predictive information compared to the MVLEV model.

With regard to the macroeconomic variables, we consider several alternatives. We augment the specification of our best hazard model, MVLEV, with the term premium, which is the difference between the yield on 10 years UK government bonds and the 3 month Treasury Bill rate, the inflation rate, the default premium, calculated as the difference between the yield on corporate bonds and the yield on UK government bonds, UK aggregate domestic credit to the private sector scaled by the GDP growth rate, and GDP growth. We document the results for this model in the column entitled MVLEV-MACRO. We find that our results are robust to the inclusion of macroeconomic variables, with none of the macro variables being statistically significant. In no case do we find that these macroeconomic variables contain incremental predictive ability in terms of predicting the probability of bankruptcy for UK firms, a result confirmed by the Vuong test which shows that the MVLEV-MACRO model does not contain superior information to the MVLEV model in terms of predicting financial distress.

Chava and Jarrow (2004) find that there are strong industry effects in the probability of bankruptcy especially for financial, insurance and real-estate firms. To investigate whether industry effects matter for our model, the final column of Table 10, MVLEV-IND, reports the results from including industry dummy variables to capture any possible industry effects.Footnote 13 None of the parameters are statistically significant, although given we exclude financial, insurance and real-estate firms from our sample this is perhaps not too surprising, and the Vuong test confirms that MVLEV-IND does not contain incremental information compared to the MVLEV model.

Table 11 Robustness of the 2ACC model to the inclusion of the bankruptcy rate, macroeconomic factors and industry dummies for Indian firms

Table 11 reports similar robustness tests for Indian firms. The column entitled 2ACC-BR presents the results when we augment the 2ACC model with the bankruptcy rate. Unlike the UK, we find that the bankruptcy rate is positively related to the probability of financial distress for Indian firms. However, the Vuong model comparison test in Panel B of Table 11 suggests that the inclusion of the bankruptcy rate does not add value to the 2ACC model in terms of predictive ability, that is, the 2ACC-BR model does not significantly outperform the 2ACC model, the \(z\) statistic testing this hypothesis being insignificant. The 2ACC-MACRO column in Table 11 documents the results when we augment the 2ACC model with macroeconomic factors. We find that the coefficient on inflation is negative and statistically significant. This in line with Kang and Pflueger (2013), who find that low inflation raises firms’ real debt burdens and increases default risk. The remaining macroeconomic variables (aggregate domestic credit scaled by GDP, GDP growth and the term premium)Footnote 14 are statistically insignificant as predictors of financial distress for Indian firms. The Vuong test also shows that the macroeconomic variables do not contain incremental predictive information over and above that contained in the 2ACC model. Finally, the 2ACC-IND column reports the results when taking into consideration industry effects. We find that none of the four industry dummies is associated with the prediction of financial distress. Moreover, the Vuong test clearly shows that the 2ACC model is better than the 2ACC-IND as the \(z\) statistic is positive and significant.

A final issue we consider is whether the significance or otherwise of the profitability and leverage accounting variables is sensitive to the definition of these variables. This is particularly important for the measure of profitability for the UK sample as this is insignificant. We use three alternative measures of profitability in the MVACC model: EBITDA divided by total assets, net income divided by total assets and profit before tax divided by total assets. The adjusted \(\chi ^{2}\) statistics testing the null hypothesis that the coefficient is zero are, to two decimal places, 0.00, 0.01 and 0.03. The test has a \(\chi ^{2}(1)\) distribution under the null hypothesis so in no case is profitability statistically significant. With regard to leverage, we use two alternative measures of market leverage: book value of debt divided by book value of debt plus market value of equity; and book value of debt divided by market value of assets, and an alternative measure of book leverage: book value of debt divided by total assets. We find that the two measures of market leverage are positively and significantly associated with the probability of financial distress at the 10 % level (the test statistics are 3.01 and 2.96 respectively) although debt to total assets is not statistically significant. In all cases, profitability remains an insignificant predictor of financial distress. Vuong tests show that neither of the two measures of market leverage contain predictive over that contained in the measure of book leverage used in our MVLEV model. Also, neither book nor market leverage has an impact on the probability of financial distress for Indian firms regardless of the alternative definitions used. In all cases profitability remains a significant predictor of corporate financial distress in India.

7 Concluding remarks

While there is a wealth of evidence on the ability of bankruptcy prediction models to predict financial distress for US firms, there is much less evidence on how they perform on samples outside of the US, both in terms of how the different models perform and whether different predictors are needed. We have provided an evaluation of how well these models predict financial distress outside of the US by examining their performance in predicting the financial distress of firms (a) in another developed economy, the UK and (b) in an emerging economy, India.

Our results show that for the UK, a model that combines book leverage and the three market-based variables (excess returns, market capitalization and return volatility) provides the best information for predicting financial distress for UK firms. This is consistent with Shumway’s (2001) findings for the US and confirms that the variables Shumway (2001) uses appear to be robust predictors of financial distress. We also document that a model solely based on the three Shumway market-based variables of excess returns, market capitalization and return volatility, outperforms a multivariate model that combines Z-score with the EDF, a multivariate model that combines EDF with the accounting ratio components of Z-score, a univariate model that uses Z-score and a univariate model that uses EDF. In addition we find that neither Z-score nor EDF provide incremental predictive ability when each of them is combined with the variables used in our best model. We also compare the in-sample and out-of-sample forecasting ability of the various models we estimate. The out-of-sample forecast period of 2007–2011 provides a stringent test of the different models as it incorporates the global financial crisis. We find that the hazard model combining book leverage and the three market-related variables generates the best in-sample and out-of-sample predictive performance. It identifies the highest number of bankrupt firms and has the lowest misclassification rate. We also document that a model that uses excess returns, market capitalization and stock return volatility performs better in terms of predictive ability than a model that uses Z-score, a model that uses EDF and a model that combines both Z-score and EDF.

Our results for India tell a rather different story, however. We find that a model based on profitability and FRISK predicts financial distress more accurately. Further to this, we find that market-based variables do not impact on the probability of financial distress for Indian firms when they are combined with accounting information. Neither EDF nor the three Shumway market-based variables of excess returns, market capitalization and return volatility,are significant predictors of financial distress when we additionally include profitability and FRISK. In-sample and out-of-sample forecast tests confirm that the model with the two accounting ratios has the highest predictive ability. Robustness checks show that controlling for macroeconomic effects, industry effects and using different definitions of leverage and profitability do not alter our results or our conclusions for both the UK and Indian samples.

Overall, our findings provide important additional insights on the prediction of corporate bankruptcy. Evaluating bankruptcy forecast models using a dataset from a developed market other than the US, and an emerging market provides insightful information on the following question: Can we use a common model to predict corporate financial distress regardless of the stage of development of the economy of a country? The answer is no. Our findings strongly suggest that an accurate bankruptcy forecast model in a developed economy (the UK in our case) can not be successfully applied to the emerging economy that we consider (India). We find that, like for the US, both accounting and market-based variables are essential to predict corporate financial distress in the UK. However, this finding does not carry over to the developing economy we consider, India: we find that only accounting information is needed to predict corporate financial distress in India; market-based information does not appear to contain additional predictive information over and above that contained in measures of profitability and FRISK for Indian firms.