1 Introduction

Credit risk is often divided by the characteristics of the debtor into sovereign, corporate, retail, and so on. Retail debt centers on consumer credit, which includes short- and intermediate-term credit to finance the purchase of goods and services for consumption or to refinance debt incurred for such purposes. Corporate credit evaluation methods are now almost all based on quantitative analysis. A good credit risk evaluation tool helps grant credit to more creditworthy applicants, thereby increasing profit; moreover, it can deny credit to non-creditworthy applicants, thereby decreasing losses.

Today, academic research on forecasting enterprise default builds on two major milestones: one is the structural model, represented by the Merton model and the KMV model; the other is credit scoring. The structural model has entered Basel III and has become a market standard for monitoring default risk. In practice, however, the structural model requires the enterprise's equity value and the volatility of that equity value, which limits the scope of the model, and the prediction quality depends directly on how well these two parameters are estimated.

This paper discusses in depth the methods for estimating the parameters of the structural model and puts forward a wavelet structural model. Using the wavelet transform, the return sequence is decomposed into low-frequency and high-frequency parts, and the two parts are modeled separately. Since the low-frequency part carries more information while the high-frequency part contains more noise, modeling them with different equations minimizes the cumulative calculation error of the equity volatility and retains the core information for reconstructing the predicted sequence. This yields a considerable improvement in default prediction, which we verify with Chinese listed companies.

The rest of this paper is structured as follows: the next section reviews prior research on structural models and wavelet analysis. Section 3 describes our method for default prediction in detail. Section 4 presents the empirical studies on Chinese listed companies. Section 5 reports the robustness tests of the structural models in China's actual practice. The final section discusses the results and gives some concluding remarks.

2 Prior Research

Deep research on default prediction for large listed companies can be traced back to Merton (1974), who constructed structural equations to price debt. By appropriately simplifying the corporate capital structure and the dynamics of firm value, he matched the prices of corporate debt and corporate equity to options. The KMV company (Crosbie and Bohn 2002) then developed an empirical estimator of the probability of default on the basis of the Merton model, known as the expected default frequency (EDF), which can be seen as a consistent estimator of the probability of default (PD), rather than using the cumulative normal distribution of the Merton model to calculate PD.

At present, research on structural models for default prediction focuses mainly on practical assessment and on improving the estimators. Ye et al. (2005) used a small sample of 22 enterprises to adjust the parameters of the KMV model, making the model better suited to conditions in China. Lee (2011) used a genetic algorithm to improve the optimal default point of the KMV model. Camara and Popova (2011) applied several structural models to evaluate default risk in financial enterprises after the subprime crisis and found that the KMV model had better accuracy in default prediction. Chen et al. (2010) built the KMV model on a larger sample of 80 enterprises between 2004 and 2006 and found that structural models cannot give early warning of default risk for small and medium-sized enterprises in China. Although structural models are seen as the most effective methods for default prediction of large companies, they cannot avoid the calculation of yield volatility, which is also the key to these models.

At present, the majority of studies calculate yield volatility with time series models, which rely on linearity and symmetry assumptions. However, several authors have discussed in detail the inadequacy of linear models in capturing asymmetries. Notably, Hamilton (2003) established that non-linear specifications are better candidates than traditional linear approaches for capturing the significantly stronger effects of oil shocks. Chiou and Lee (2009) argued that most time series models experience structural changes and that, when applied to real data, one must determine the break locations. There is general agreement in the literature that inference without consideration of regime-switching phenomena may well lead to unreliable results for many financial time series. Regime-switching models emerged as an alternative to standard GARCH models, allowing the dynamic behavior of variables to depend on the state prevailing at any given point in time.

Undoubtedly, GARCH models capture well the leptokurtosis and volatility clustering generally observed in financial time series, but they show some inaccuracies across changes of time scale (Yalamova 2006). One major advantage of wavelet analysis is the ability to perform local analysis, that is, to analyze a localized sub-area of a larger image (or signal). Wavelet analysis is therefore capable of revealing aspects of data that other signal analysis techniques (such as GARCH models) usually miss: trends, sharp spikes, discontinuities in higher derivatives, self-similarity, etc. Likewise, because it affords a different view of the data than traditional techniques, wavelet analysis can often compress or de-noise a signal without appreciable degradation (Nguyen and Nabney 2010). In their brief history within signal processing, wavelets have already proven to be a very useful tool for data de-noising (Chen et al. 1986) and deconvolution (separation of two convolved signals, namely smooth and detail). Wavelet analysis provides better resolution in the time domain since wavelet basis functions are time-localized, which is useful for capturing changing volatility (Yogo 2008). Jagric (2002, 2003, 2004), Raihan et al. (2005), and Fernandez and Kutan (2005) use wavelet analysis to extract cyclical components. Crowley et al. (2005) use wavelet transforms (MODWT and CWT) to analyze productivity cycles in the Euro area, US and UK. Crivellini et al. (2003) apply wavelet analysis to industrial output fluctuations in developing countries and conclude that time-scale decomposition through wavelet analysis may reveal very different aspects of the characteristics and correlations of business cycle fluctuations. Yamada and Honda (2005) use wavelet analysis to predict business turning points of the Nikkei 225 index and find that it can capture business peaks and troughs as an alternative to structural break analysis. Yogo (2008) uses a wavelet filter to extract the business cycle component of US real GDP and concludes that the wavelet filter outperforms the band-pass filter of Baxter and King (1999). Bowden and Zhu (2008) combine wavelet analysis with structural breaks and apply the combined model to the agribusiness cycle.

In conclusion, the structural model performs well in default prediction, but it has a number of parameters to estimate, and these directly influence the accuracy of the model. Wavelet analysis has been widely applied in engineering. Through joint time-frequency analysis, a wavelet can decompose any signal into independent high-frequency and low-frequency parts, and, according to the sampling density, the time window can be adjusted, so the characteristics of a signal can be examined under different magnifications to filter noise effectively. Wavelets have also been shown to perform well in the stock market. For these reasons, this article combines the structural model with wavelet analysis to better estimate the parameters, and then uses actual data of listed companies to assess the accuracy of the structural model in China.

3 Preliminaries

The structural model requires mark-to-market values for listed companies' credit assessment and describes default as the explicit result of deterioration in the company's value. The firm can then be simplified so that the owners' equity is viewed as a call option on the firm's value and the debt as a risk-free bond minus a put option. Once the firm valuation model and the capital structure are specified, the option pricing formula can be used to price the equity and the debt in order to predict default.

3.1 Assumption

The structural model usually requires the following assumptions:

  1. There are only two ways for the company (with value V) to finance: equity (with value S) and debt (with principal D and maturity T).

  2. At any \(t\le T\), the value of the company equals the sum of its debt value and equity value: \(V_{t} =S_{t} +D_t \).

  3. The value of the company follows a geometric Brownian motion: \(dV=\mu Vdt+\sigma _{V} VdZ\).

  4. Before maturity, bondholders cannot force the company into bankruptcy. At maturity T, if the value of the company covers the debt principal, the company is able to pay; otherwise the value of the company is insufficient to repay the principal, i.e. \(V<D\), and default occurs.

  5. When default occurs, bondholders have priority over shareholders and receive the full value V of the company; otherwise, bondholders receive their principal D.

According to the assumptions above, the shareholders' payoff can be thought of as a call option on the company's value with strike price D, and the bondholders' payoff as a risk-free bond with principal D minus a put option on the company's value. On this basis, default can be predicted by pricing the value of the equity and the debt.

3.2 Models

According to the Black and Scholes option pricing formula, we have the following relationship:

$$\begin{aligned} S=VN(d_{1} )-De^{-rT}N(d_{2} ) \end{aligned}$$
(1)

where \(d_{1} =\frac{\ln (V/D)+(r+\sigma _{V}^{2}/2)T}{\sigma _{V} \sqrt{T}}\), \(d_{2} =d_{1} -\sigma _{V} \sqrt{T}\), and \(N(\cdot )\) is the cumulative distribution function of the standard normal distribution.

According to the sensitivity analysis \(\sigma _{s} =\frac{V}{S}\big (\frac{\partial S}{\partial V}\big )\sigma _{V} \), where \(\frac{\partial S}{\partial V}=N(d_{1} )\), we get:

$$\begin{aligned} \sigma _{s} =\frac{V}{S}N(d_{1} )\sigma _{V} \end{aligned}$$
(2)

The value V of the company and its volatility cannot be observed directly, but the equity value S and the equity volatility \(\sigma _{s} \) can be calculated from the capital market; V and \(\sigma _{v} \) are then obtained by solving Eqs. (1) and (2) simultaneously. It can therefore be seen that S and \(\sigma _{s} \) are the key inputs for the model's accuracy. Furthermore, in an efficient financial market S can be observed directly, so the calculation of \(\sigma _{s} \) becomes the core of the structural model.
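To make the solution step concrete, here is a minimal numerical sketch (not the authors' code) that solves Eqs. (1) and (2) simultaneously with SciPy's root finder; all input figures and variable names are illustrative assumptions.

```python
# Solve Eqs. (1) and (2) for the unobserved asset value V and asset
# volatility sigma_V, given observed equity value S and equity
# volatility sigma_S; all inputs below are hypothetical.
import numpy as np
from scipy.optimize import fsolve
from scipy.stats import norm

def merton_system(x, S, sigma_S, D, r, T):
    V, sigma_V = x
    d1 = (np.log(V / D) + (r + 0.5 * sigma_V**2) * T) / (sigma_V * np.sqrt(T))
    d2 = d1 - sigma_V * np.sqrt(T)
    eq1 = V * norm.cdf(d1) - D * np.exp(-r * T) * norm.cdf(d2) - S  # Eq. (1)
    eq2 = (V / S) * norm.cdf(d1) * sigma_V - sigma_S                # Eq. (2)
    return [eq1, eq2]

S, sigma_S, D, r, T = 120.0, 0.35, 100.0, 0.0225, 1.0
x0 = [S + D, sigma_S * S / (S + D)]   # common starting guess: V ~ S + D
V, sigma_V = fsolve(merton_system, x0, args=(S, sigma_S, D, r, T))
print(V, sigma_V)
```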

3.3 Kernel Parameters

Recent research typically uses GARCH(1,1) to estimate \(\sigma _{s} \); details can be found in the references (Yogo 2008). The model is given below:

$$\begin{aligned} r_{t}= & {} \sqrt{h_{t}}\,e_{t}\nonumber \\ h_{t}= & {} \alpha _{0} +\alpha _{1} r^2_{t-1} +\beta _{1} h_{t-1}\nonumber \\ e_{t}\sim & {} iid\;N(0,1) \end{aligned}$$
(3)

From the model we can see that forecasting future volatility requires repeated iteration, which tends to compound the prediction errors. Meanwhile, as time passes the forecast volatility weakens and converges toward a constant long-run level, which can be called convergence.
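For illustration, a GARCH(1,1) of the form in Eq. (3) can be fitted with the Python arch package; the sketch below uses a random placeholder series rather than data from the paper.

```python
# Fit the GARCH(1,1) of Eq. (3); `returns` is a random placeholder
# standing in for a real daily return series.
import numpy as np
import pandas as pd
from arch import arch_model

returns = pd.Series(np.random.default_rng(0).normal(0, 1, 250))
am = arch_model(returns, mean='Zero', vol='GARCH', p=1, q=1)
res = am.fit(disp='off')
print(res.params)   # omega (alpha_0), alpha[1], beta[1]
```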

Wavelet analysis is similar to Fourier analysis: the basic principle is to use a family of basis functions to represent or approximate a signal. This family, called the wavelet function system, is generated by stretching and shifting a wavelet function, and its transform coefficients describe the characteristics of the original signal.

Definition 1

Let \(\psi \in L^{2}(R)\cap L^{1}(R)\) with \(\hat{{\psi }}(0)=0\); then the family of functions \(\left\{ {\psi _{a,b} } \right\} \) obtained as below is called a wavelet function system.

$$\begin{aligned} \psi _{a,b} (x)=\left| a \right| ^{-\frac{1}{2}}\psi \left( {\frac{x-b}{a}} \right) \quad b\in R,\quad a\in R^{+} \end{aligned}$$
(4)

Here \(\psi (x)\) is called the basis wavelet or mother wavelet, a the stretch factor, and b the shift factor; \(\hat{{\psi }}(w)\) is the Fourier transform of \(\psi (x)\).

Definition 2

Let \(\psi (x)\) be the basis wavelet and \(\left\{ {\psi _{a,b} } \right\} \) the continuous wavelets obtained from (4); then for any function \(f\in L^{2}(R)\), the continuous wavelet transform \(W_{f} (a,b)\) is defined as follows:

$$\begin{aligned} W_{f} (a,b)=\left\langle {f,\psi _{a,b} } \right\rangle =\left| a \right| ^{-\frac{1}{2}}\int _{R} {f(x)\overline{\psi \Big (\frac{x-b}{a}\Big )} } dx \end{aligned}$$
(5)
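Definition 2 can be evaluated numerically. The sketch below uses PyWavelets with a Morlet wavelet and an arbitrary scale grid, both purely illustrative choices.

```python
# Numerical continuous wavelet transform W_f(a, b) of a toy signal;
# the wavelet ('morl') and the scales are illustrative assumptions.
import numpy as np
import pywt

f = np.sin(2 * np.pi * np.linspace(0, 4, 512))   # toy signal f(x)
scales = np.arange(1, 64)                        # stretch factors a
W, freqs = pywt.cwt(f, scales, 'morl')
print(W.shape)                                   # (63, 512): one row per scale
```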

Definition 3

Let \(\psi \in L^{2}\cap L^{1}\) satisfy \(C_{\psi } =\int {\frac{\left| {\hat{{\psi }}(w)} \right| ^{2}}{\left| w \right| }dw<+\infty }\), where \(\hat{{\psi }}(w)\) is the Fourier transform of \(\psi (x)\); then \(\psi (x)\) is called an admissible wavelet. Admissibility means that the wavelet decays sufficiently fast and has zero mean.

Theorem 1

Let \(\psi (x)\) be an admissible wavelet. For any \(f,h\in L^{2}(R,dx)\), \(\iint _{R^{2}} {W_{f} (a,b)\overline{W_{h} (a,b)} } \frac{da}{a^{2}}db=C_{\psi } \left\langle {f,h} \right\rangle \). Hence any \(f\in L^{2}(R)\) can be reconstructed as follows: \(f(x)=\frac{1}{C_{\psi } }\int ^{+\infty }_{-\infty } {\int ^\infty _0 {W_{f} (a,b)} \psi _{(a,b)} \frac{da}{a^{2}}db} \).

From the definitions and theorem above, any f(x) can be reconstructed from its wavelet transform \(W_{f} (a,b)\); the wavelet transform can be seen as the decomposition of f(x) on the wavelet basis, so there is a family of transforms indexed by a. In practice we only need an orthogonal wavelet basis, so that the basis functions are uncorrelated, which can be achieved by sampling discretely.

Definition 4

If the stretch factor a and shift factor b are discretized by the rules \(a=a^{-m}_{0}\) with \(a_{0} >1\) and \(b=nb_{0} a^{-m}_{0}\) with \(b_{0} \in R\), \(m,n\in Z\), then the basis wavelets can be written as \(\psi _{m,n} (x)=a^{m/2}_0 \psi (a^{m}_{0} x-nb_{0} )\), and the discrete wavelet transform is \(DW_{f} =\int ^{+\infty }_{-\infty } {f(x)\overline{\psi _{m,n} (x)} dx} =\left\langle {f(x),\psi _{m,n} (x)} \right\rangle \). If, in this discretization, the stretch factor follows the binary rule \(a_{j} =2^{j}\), the wavelet is called a dyadic (binary) wavelet.
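For the dyadic case, one level of the discrete transform splits a signal into low- and high-frequency coefficients at half the sampling rate; a brief sketch with PyWavelets and an arbitrary db3 wavelet:

```python
# One level of the dyadic discrete wavelet transform: the signal splits
# into approximation (low-frequency) and detail (high-frequency)
# coefficients, each roughly half the input length.
import numpy as np
import pywt

x = np.random.default_rng(1).normal(size=256)
cA, cD = pywt.dwt(x, 'db3')
print(len(cA), len(cD))             # downsampling by two (plus filter padding)
x_rec = pywt.idwt(cA, cD, 'db3')    # inverse transform
print(np.allclose(x, x_rec[:len(x)]))
```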

Theorem 2

Let \(\varphi (x)\) be the scale function of a multiresolution analysis \(\left\{ {V_{j} } \right\} _{j\in Z} \) in \(L^{2}(R)\) satisfying: (1) \(\left\{ {\varphi (x-n)} \right\} _{n\in Z} \) is an orthonormal basis of \(V_{0} \); (2) \(\varphi (x)=\sqrt{2}\sum _{k} {p_{k} \varphi (2x-k)},\;\left\{ {p_k } \right\} _{k\in Z} \in l^{2}\). Let \(\psi (x)=\sqrt{2}\sum _{k} {p_{1-k} \varphi (2x-k)=} \sqrt{2}\sum _{k} {q_{k} \varphi (2x-k)} \). Then: (1) \(L^{2}(R)=\overline{\mathop \oplus _{j\in Z} W_{j} } \); (2) \(W_{j} \bot W_{{j}'} ,\; j\ne {j}'\); (3) \(\left\{ {\psi _{j,k} (x)} \right\} _{k\in Z} \) is an orthonormal basis of \(W_{j} \), and over all j an orthonormal basis of \(L^{2}(R)\).

Therefore, any f(x) has two forms of wavelet expansion: \(f(x)=\sum _{m,n} \left\langle {f,\psi _{m,n} } \right\rangle \psi _{m,n} (x) \) and \(f(x)=\sum _{n} {\left\langle {f,\varphi _{m_{0} ,n} } \right\rangle } \varphi _{m_{0} ,n} (x)+\sum _{m>m_{0} ,n} {\left\langle {f,\psi _{m,n} } \right\rangle \psi _{m,n} (x)} \).

On the one hand, let \(\psi (x)\) be the wavelet function and \(\varphi (x)\) the scale function; then for any \(N\in Z\) we can use \(f_{N} \in V_{N} \) to approximate f, while for all \(j\in Z\) there is \(V_{j} =V_{j-1} \oplus W_{j-1} \). So \(f_{N} =f_{N-1} +g_{N-1} \) with \(f_{N-1} \in V_{N-1} ,g_{N-1} \in W_{N-1} \); repeating this process gives \(f_{N} =g_{N-1} +g_{N-2} +\ldots +g_{N-M} +f_{N-M} \).

On the other hand, both \(\psi (x)\in W_{0} \) and \(\varphi (x)\in V_{0} \) belong to \(V_{1} \), which is spanned by \(\varphi _{1,k} (x)=2^{1/2}\varphi (2x-k),\;k\in Z\), so there exist \(\left\{ {p_{k} } \right\} \) and \(\left\{ {q_{k} } \right\} \) that make the two-scale relations hold. Since both \(\varphi (2x)\) and \(\varphi (2x-1)\) belong to \(V_{1} =V_{0} \oplus W_{0} \), it follows that \(\varphi (2x-l)=\sum _{k} {[a_{l-2k} \varphi (x-k)+b_{l-2k} \psi (x-k)]} \), which is called the decomposition relation of \(\varphi (x)\) and \(\psi (x)\); here \(\left\{ {p_{k} } \right\} \) and \(\left\{ {q_{k} } \right\} \) are called the reconstruction sequences, and \(\left\{ {a_k } \right\} \), \(\left\{ {b_{k} } \right\} \) the decomposition sequences. Finally, \(f_{j} \in V_{j} \) and \(g_{j} \in W_{j} \) can be written with unique coefficient sequences as below:

$$\begin{aligned} \left\{ {\begin{array}{l} f_{j} (x)=\sum _{k} {c_{j,k} \varphi \big (2^{j}x-k\big )} \\ c^{j}=\left\{ {c_{j,k} } \right\} \in l^{2} \\ \end{array}} \right. \end{aligned}$$
(6)
$$\begin{aligned} \left\{ {\begin{array}{l} g_{j} (x)=\sum _{k} {d_{j,k} \psi (2^{j}x-k)} \\ d^{j}=\left\{ {d_{j,k} } \right\} \in l^{2} \\ \end{array}} \right. \end{aligned}$$

The process of decomposition is

$$\begin{aligned} \left\{ {\begin{array}{l} c_{j-1,k} =\sum _{l} {a_{l-2k} c_{j,l} } \\ d_{j-1,k} =\sum _{l} {b_{l-2k} c_{j,l} } \\ \end{array}} \right. \end{aligned}$$

The process of reconstruction is

$$\begin{aligned} c_{j,k} =\sum _{l} {\big [p_{k-2l} c_{j-1,l} +q_{k-2l} d_{j-1,l}\big ]} \end{aligned}$$

This is the Mallat algorithm. Both \(c^{j-1}\) and \(d^{j-1}\) are obtained from \(c^{j}\) by moving averages with weights \(\left\{ {a_{k} } \right\} \) and \(\left\{ {b_{k} } \right\} \); because this process keeps only the samples at even integer points, it is called downsampling. The reconstruction is the corresponding upsampling: \(c^{j-1}\) and \(d^{j-1}\) are convolved with \(\left\{ {p_{k} } \right\} \) and \(\left\{ {q_{k} } \right\} \) at even integer points to recover \(c^{j}\).
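The full cascade can be exercised with PyWavelets' multilevel interface; the round-trip below (db3 wavelet, three levels, toy data, all arbitrary choices) decomposes and then reconstructs a series exactly:

```python
# Multilevel Mallat decomposition and reconstruction round-trip.
import numpy as np
import pywt

r = np.cumsum(np.random.default_rng(2).normal(size=256))   # toy series
coeffs = pywt.wavedec(r, 'db3', level=3)    # [cA3, cD3, cD2, cD1]
r_rec = pywt.waverec(coeffs, 'db3')         # inverse cascade
print(np.allclose(r, r_rec))                # exact up to float error
```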

Based on the Mallat algorithm, we can apply a wavelet decomposition to the return series r(x), construct appropriate predictive models separately for the low-frequency and high-frequency coefficient sequences, and finally reconstruct the predicted series \(r(x{)}'\) from the predicted coefficients, which ensures that as \(l\rightarrow \infty \), \(r(x)'\rightarrow r(x)\).

4 Empirical Analysis

4.1 Data and Parameters

In this paper we use the CSMAR database (http://www.gtarsc.com/) as the data source and select listed companies in China that appeared on the market before 2009 and were not delisted in 2009. There are 1697 such listed companies (including 184 ST companies), belonging to 13 industry groups.

This article is aimed at the default prediction of listed companies. Financial listed companies have more liquid assets and liabilities than ordinary corporate entities, their default risk may cause shocks in the financial system, and many regulatory factors restrict them, so their default cannot be predicted from market information alone. For these reasons we exclude financial listed companies from our experiment. We then sampled the remaining listed companies randomly according to industry category and finally selected 100 sample enterprises (50 ST enterprises and 50 non-ST enterprises); the enterprise codes and names can be seen in Table 1.

Table 1 Table of results

The key parameters of the model are estimated as follows:

First, the period. The structural model considers only defaults due within the next year. This section uses the listed companies' market data from 2009 to predict the default probability in 2010, and in Sect. 5 we use the market data from 2010 to validate the model's results.

Second, the risk-free rate. We use the one-year deposit rate of 2.25 % in 2009 as the risk-free rate.

Third, the equity value. For a long time, listed companies in China have had two kinds of shares: tradable shares and non-tradable shares. There are two main ways to calculate the equity value: (1) equity value = tradable shares \(\times \) market price + non-tradable shares \(\times \) conversion ratio; (2) equity value = tradable shares \(\times \) market price + non-tradable shares \(\times \) net assets per share. This paper adopts the second method. The results are shown in Table 1.

Finally, the default point. This paper follows the KMV model in determining the default point. Long-term debt, written LT for short, is debt with maturity of more than one year; short-term debt, written ST for short, is debt maturing within one year. The default point DP is then determined according to the following standard; the results are shown in Table 1.

$$\begin{aligned} DP= & {} ST+0.5\times LT\quad \quad \quad \quad \quad \quad \quad \quad \qquad \quad LT/ST<1.5 \\ DP= & {} ST+(0.7-0.3\times ST/LT)\times LT\quad LT/ST\ge 1.5 \end{aligned}$$
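A direct transcription of this rule, with hypothetical debt figures:

```python
# KMV default point: short-term debt ST plus a haircut fraction of
# long-term debt LT, per the rule above; figures are hypothetical.
def default_point(st: float, lt: float) -> float:
    if lt / st < 1.5:
        return st + 0.5 * lt
    return st + (0.7 - 0.3 * st / lt) * lt

print(default_point(st=80.0, lt=60.0))   # LT/ST < 1.5 branch: 110.0
print(default_point(st=40.0, lt=90.0))   # LT/ST >= 1.5 branch: 91.0
```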

4.2 GARCH Structural Model

From the closing prices of a stock in 2009 we obtain the return series \(r_{t} =\log (\frac{p_t }{p_{t-1} })\), where \(p_{t} \) is the day's closing price and \(p_{t-1} \) the previous day's closing price. Take the stock with code 000713 as an example; its daily return series is shown in Fig. 1 below:
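In code, the return series is obtained in one line (the closing prices below are placeholders):

```python
# Daily log returns r_t = log(p_t / p_{t-1}) from closing prices.
import numpy as np
import pandas as pd

prices = pd.Series([10.0, 10.2, 10.1, 10.4, 10.3])   # placeholder closes
r = np.log(prices / prices.shift(1)).dropna()
print(r.values)
```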

Fig. 1 Return series figure

We apply autocorrelation tests to \(r_{t} \) and \(r^{2}_{t} \); the results are shown in Tables 2 and 3. Table 2 shows no autocorrelation in \(r_{t} \), while Table 3 shows that \(r^{2}_{t} \) is autocorrelated.

Table 2 Autocorrelation of \(r_t \)
Table 3 Autocorrelation of \(r^2_{t} \)

We then apply the LM test to \(r_{t} \) to check for an ARCH effect. The result is shown in Table 4.

From Table 4 we find a high-order ARCH effect, so the series can be fitted with a GARCH(1,1) model. The GARCH(1,1) results are shown in Table 5.

From Table 5, the GARCH(1,1) model passes the tests and has a small AIC, so it can be said that GARCH(1,1) fits the volatility of the sequence \(r_{t} \) well.

Applying the LM test to the residuals, we find no remaining ARCH effect. The fitted volatility model of the return series is therefore: \(\sigma ^{2} _{t} =0.0000894+0.1670\varepsilon _{t-1}^2 +0.7313\sigma _{t-1}^2 \).

Based on the aggregation formula of the GARCH model, the predicted volatility is \(\sigma ^{2}_{t+h,t} =0.0000894\times \frac{1-(0.167+0.7313)^{h}}{1-(0.167+0.7313)}+(0.167+0.7313)^{h}\sigma ^{2} _{t} \). The total variance over the coming year is \(\sigma ^{2}=\sum _{h=1}^{252} {\sigma _{t+h,t}^2 } \), obtained by cumulating the daily variances.
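Using the fitted coefficients, the aggregation can be sketched as follows; the current daily variance is an assumed value:

```python
# h-step-ahead variance forecasts from the fitted GARCH(1,1), summed
# over 252 trading days; sigma2_t is an assumed current daily variance.
alpha0, alpha1, beta1 = 0.0000894, 0.167, 0.7313
phi = alpha1 + beta1

def var_forecast(sigma2_t, h):
    return alpha0 * (1 - phi**h) / (1 - phi) + phi**h * sigma2_t

sigma2_t = 0.0004
annual_var = sum(var_forecast(sigma2_t, h) for h in range(1, 253))
print(annual_var ** 0.5)   # annual volatility estimate
```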

This process shows that the prediction error grows as the horizon lengthens, so the model is only suitable for short-term prediction. The volatilities of the other stocks are predicted in the same way; the results are shown in Table 1.

Table 4 LM Tests of \(r_t \)
Table 5 Statistic of GARCH(1,1)
Fig. 2 Wavelet decomposition of \(r_{t} \)

Table 6 Autocorrelations of ca3
Table 7 Partial autocorrelations of ca3
Table 8 Check for white noise of ca3

4.3 Wavelet Structural Model

The wavelet structural model predicts the return series in a different way, which avoids the accumulation process. Again take the stock with code 000713 as an example. For \(r_{t} \), we apply the Mallat algorithm of Sect. 3.3 with the db3 wavelet, an asymmetric wavelet suitable for sequences with large fluctuations, to decompose the series into its low-frequency and high-frequency parts. The results are shown in Fig. 2, in which ca3 denotes the low-frequency coefficient sequence of the three-layer decomposition and cd1 to cd3 denote the corresponding high-frequency coefficient sequences. In a wavelet decomposition the low-frequency coefficient sequence tends to contain the trend of the series and thus carries the useful information; as can be seen in Fig. 2, ca3 shows a growth trend, while the high-frequency coefficient sequences are usually noisy disturbances.

Table 9 Test ARIMA of ca3

We apply stationarity and pure randomness tests to ca3, cd3, cd2 and cd1 respectively. Upon examination, all sequences except ca3 are white noise. The results of the stationarity and pure randomness tests for ca3 are given in Tables 6, 7 and 8.

From these tables, the autocorrelation function of ca3 tails off while the partial autocorrelation function cuts off, which suits an AR model. The modeling experiments are reported in Table 9. According to the smallest Akaike information criterion, we build an AR(2) model as the prediction model for the low-frequency coefficients ca3: \(ca3_{t} =1-0.27373\times ca3_{t-2} \).
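For illustration, such an AR(2) can be fitted with statsmodels; the series below is a synthetic stand-in for the actual ca3 coefficients:

```python
# Fit an AR(2) to the low-frequency coefficients and forecast future
# coefficients; `ca3` is a synthetic placeholder series.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

ca3 = np.cumsum(np.random.default_rng(3).normal(size=40))
fit = ARIMA(ca3, order=(2, 0, 0)).fit()
print(fit.params)              # const, ar.L1, ar.L2, sigma2
print(fit.forecast(steps=10))  # predicted low-frequency coefficients
```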

Using ca3's AR(2) prediction model together with the high-frequency coefficient sequences cd3, cd2 and cd1, we reconstruct the return series \(r_{t} \) with the Mallat reconstruction formula. The layers of the reconstruction are shown in Fig. 3.

Fig. 3 Wavelet reconstruction of \(r_{t} \)

The reconstructed return series is compared with the original in Fig. 4, where the reconstructed sequence is marked with circles and the original with asterisks. The calculated mean daily return error between the two is only 0.026164.

Fig. 4 Original \(r_{t} \) and new \(r_{t} \)

As can be seen, although the low-frequency coefficients are obtained from a predictive model, removing the noisy information makes the low-frequency part more accurate and efficient, while the high-frequency information is not reduced, so the reconstructed sequence closely approximates the original series.

Furthermore, for a given stock it can be assumed that the volatility characteristics will not change in the short term, that is, the high-frequency disturbance sequences will not differ significantly. We can therefore use the low-frequency prediction model to forecast the future return trend, and then convolve the forecasts successively with the high-frequency coefficient sequences to obtain the forecast returns for the coming year, as the sketch below illustrates. The prediction results are shown in Fig. 5.
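A compact sketch of this forecasting step, under the paper's assumption that the high-frequency coefficients are reused unchanged (wavelet, horizon, and data are illustrative):

```python
# Wavelet-structural forecast: decompose returns, forecast low-frequency
# coefficients with the AR model, reuse high-frequency coefficients
# unchanged, and reconstruct a forecast return path.
import numpy as np
import pywt
from statsmodels.tsa.arima.model import ARIMA

r = np.random.default_rng(5).normal(0, 0.02, 256)        # placeholder returns
cA3, cD3, cD2, cD1 = pywt.wavedec(r, 'db3', level=3)

cA3_new = ARIMA(cA3, order=(2, 0, 0)).fit().forecast(steps=len(cA3))
r_new = pywt.waverec([cA3_new, cD3, cD2, cD1], 'db3')    # forecast path
print(r_new.shape)
```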

Finally, we obtain the standard deviation \(\hat{{\sigma }}=\sqrt{\frac{1}{n-1}\sum ^{n}_{i=1} {(r_{i} -\bar{{r}})^{2}} }\) of the return forecasts for the coming year, which can be seen as a consistent estimate of the volatility.
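Equivalently, in code (with a placeholder forecast series):

```python
import numpy as np

r_new = np.random.default_rng(6).normal(0, 0.02, 252)   # placeholder forecasts
sigma_hat = r_new.std(ddof=1)    # ddof=1 matches the (n - 1) denominator
print(sigma_hat)
```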

The volatility of other sample enterprises can also be calculated following similar processes, which are shown in Table 1.

5 Robust Tests

Following the modeling process of Sect. 3, we determine the parameters using Matlab. By solving Eqs. (1) and (2) simultaneously, we obtain the asset value V and its volatility \(\sigma _{V} \); the results are also shown in Table 1.

Although structural models calculate the default probability in very different ways, researchers now generally agree that the KMV calculation method is the most suitable. The KMV model uses the distance to default, given in Eq. (7), to judge the possibility of default; the default distances of the selected samples are shown in Table 1.

$$\begin{aligned} DD=\frac{V-DP}{V\times \sigma _{v} } \end{aligned}$$
(7)
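As a worked instance of Eq. (7), with hypothetical inputs:

```python
# Distance to default DD = (V - DP) / (V * sigma_V); inputs hypothetical.
def distance_to_default(V, sigma_V, DP):
    return (V - DP) / (V * sigma_V)

print(distance_to_default(V=220.0, sigma_V=0.25, DP=110.0))   # -> 2.0
```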
Fig. 5 Forecast of \(r_{t} \)

On the one hand, it is very difficult to collect information on a company's actual defaults; on the other hand, Chinese regulation requires that a listed company in financial distress or in an abnormal situation that makes it hard for investors to judge its prospects, so that their interests may be impaired, must be given special treatment and marked "ST". In these robustness checks we therefore use "ST" status to represent default. To validate the two models, we perform paired t-tests; the results are shown in Tables 10 and 11.

As can be seen from the tables above, at the 95 % confidence level both models can distinguish ST enterprises from healthy enterprises, so both models give effective judgments in default prediction for listed companies in China; it can thus be inferred that the structural model is effective for risk assessment of Chinese listed companies.

Furthermore, to compare the two structural models, we design two sets of experiments.

First, we test the discriminating ability using the 2009 data. For the GARCH structural model and the wavelet structural model, we construct paired groups of non-ST and ST companies and perform paired-sample t-tests. The results are shown in Tables 12 and 13.

Table 12 shows that, at the 95 % confidence level, the wavelet structural model yields smaller default distances for ST enterprises than the time series model; meanwhile, Table 13 shows that, at the 95 % confidence level, the wavelet structural model yields larger default distances for non-ST enterprises than the time series model. It can therefore be said that the wavelet structural model discriminates default better for both ST and non-ST companies.

Table 10 Pair T-test of time series model
Table 11 Pair T-test of wavelet analysis model
Table 12 Identify ST companies of these two models

Second, we test the prediction ability. Using the 2010 data, as in the discrimination test, we compare the predictive power of the two models on ST and non-ST companies. Only one company (stock code 600355) was non-ST in 2009 and became ST in 2010: the time series model gives it a default distance of 2.067, while the wavelet structural model gives 1.957. Because of the limited sample size we cannot statistically infer that the wavelet structural model is superior to the time series model, but in this case the wavelet structural model, whose default distance is closer to 0, shows the better predictive power. We also perform a paired t-test comparing the two models' predictions on the 13 companies that were ST in 2009 and non-ST in 2010; the results are shown in Table 14 below.

Table 13 Identify Not ST companies of these two models
Table 14 Forecast check of these two models

From Table 14, the mean default distance of the wavelet structural model is larger than that of the time series model. Because of the sample size limitation the result is not highly significant, but it can still be said that, at the 80 % confidence level, the wavelet structural model has better predictive power than the time series model in default prediction.

6 Conclusion

Credit risk management is one of the most important problems that commercial banks face. Normally, the optimal approach to credit risk management is to forecast default accurately before granting a loan.

Recent studies usually use the structural model to forecast the default of listed companies, because the model captures the company's market value by marking to market; it can thus be inferred that the model reflects the overall information about a company, so it is more accurate and widely used in practice. Applying the structural model, however, requires estimating the company's equity value and its volatility. Recent research usually models the volatility with time series methods, which cannot avoid iterative calculation, so as the time interval accumulates the prediction develops large deviations.

Based on this, the paper puts forward a new method, named the wavelet structural model, for default prediction. Considering the characteristics of listed companies, the authors sampled 100 companies according to industry type to construct the wavelet structural model. The process is as follows: first apply a wavelet decomposition to the returns, then build separate models for the low-frequency part and the high-frequency part, and finally reconstruct the predicted returns. Through this process the wavelet structural model avoids the accumulated volatility calculation of the time series model.

Checked against the actual situation of Chinese listed companies, the wavelet structural model proves more sensitive and more precise than the time series model. But like other structural models, it still cannot avoid the calculation of the equity value, so it depends strongly on the market environment, which limits its application to small and medium-sized enterprises.