1 Introduction

The fact that investors holding fixed income portfolios to diversify risk or enhance investment returns have suffered from systematic credit risk of different entities has been observed more recently. This raises the question whether there are common factors determining systematic credit risk across different entities, different credit ratings, different countries and different maturities. In fact, an increase in systematic credit risk will harm the benefit from a well-diversified bond portfolio. An examination into common credit risk factors explores the nature of correlated defaults. Several illustrations for correlated defaults were proposed by Das et al. (2007). Firstly, firms may be exposed to common or correlated risk factors. Secondly, the event of default by one firm may be contagious. Thirdly, learning from default may generate default correlation. This study examines what are the common factors that determine systematic credit risk, and estimates and interprets the common risk factors. In further steps, we estimate the market prices of risk factors and test their significance. Based on factor models, we propose various time series properties for common factors and idiosyncratic components, and examine which one can produce the best forecasting to the dynamics of credit default swap (CDS) spreads.

Understanding how corporate defaults are correlated is particularly important for the risk management of corporate debt portfolio, since banks have to retain greater capital to survive default losses if defaults are heavily clustered in time. An investigation of the sources and degree of default clustering is also crucial for the rating and risk analysis of structured credit products, such as collateralized debt obligations (CDOs) and options on portfolios of default swaps that are exposed to correlated default. Several attempts have been made in the literature to address this issue. The first one incorporates correlated default into the reduce-form credit risk modeling (Das et al. 2006, 2007). The second research stream assumes that default probabilities depend on firm-specific and market-wide factors. Typically, portfolio loss distributions are based on the correlating influence from such observable market-wide factors. A number of potentially observable factors from macroeconomic fundamentals have been proposed to analyze correlated defaults (Collin-Dufresne et al. 2001; Benkert 2004; Ericsson et al. 2009). The third research stream, however, extracts some latent/unobservable factors mainly from the principal components analysis (PCA) method to avoid a possible downward bias from estimating tail loss (Duffie et al. 2009; Cesare and Guazzarotti 2010; Anderson 2008). As we know, not all relevant risk factors are potentially observable by econometricians (Duffie et al. 2009).

Recent research claims that common latent factors increasingly and apparently explain the time-variation of credit risk, especially during the financial crisis. Anderson (2008) finds that a very high fraction of weekly variations in the implied default intensity is explained by a single common factor. Cesare and Guazzarotti (2010) found that CDS spread changes were increasingly driven by a common factor during the US subprime crisis. This paper goes beyond these two studies by additionally interpreting the common latent factors and modelling their time-variation patterns. We demonstrate this by using a very extensive CDS data set, encompassing different maturities, different credit ratings, different entities and different countries, and produce robust common factors with a convincing interpretation.

We compare the contributions of common factors in explaining the CDS spreads changes during the pre-crisis, the crisis and the post-crisis period. We find that the fraction of CDS variation explained by the first principal component increases from 58.7 to 72.3 % during the crisis period, and then declines to 47 % after the crisis. The results suggest that during the crisis, the changes of CDS spreads are increasingly driven by common factors and less by idiosyncratic components. Furthermore, the eigenstructures across three sub-periods are distinct based on the result of a likelihood ratio test that compares the common principal components model against the unrestricted model indicates. To interpret the estimated factors, we investigate the association between the latent factors and the observed economic variables.

Having applied the factor model to CDS spreads, we model the time-variation of common factors to examine the predictability of CDS spreads. This prediction will certainly benefit investors to hedge, speculate and arbitrage in the credit markets. We propose various factor models and compare their out-of-sample forecasting performance. Testing their equal predictive ability is also required to show whether relatively outperformance is statistically significant.

The remainder of this research is organized as follows. The next section describes the data we have used. Section 3 presents the factor models used in this study, and provides an economic interpretation for the estimated factors. In Sect. 4, we propose several factor specifications to predict the times-variation of CDS spreads; evaluating their out-of-sample forecasting performances and testing their equal predicting ability are both conducted in this section.

2 Data description

Credit default swap data are collectable from Markit, an aggregator of CDS pricing data from the leading broker-dealers. In terms of our focus on the commonality of CDS spreads, we are interested in the CDS indices rather than single name reference entity CDS contracts to mitigate the idiosyncratic components and liquidity risk. Our concern coincides with Driessen et al. (2003) in studying the common factors in international bond returns. They suggest that bond portfolio data is the preferred method to clear idiosyncratic risk embedded in individual bonds. Markit provides a detailed CDS index series, for example, the Markit CDX indices comprise the most liquid baskets of names covering North American Investment Grade and High Yield single name credit default swaps with various maturities, while the Markit iTraxx indices comprise of the most liquid names in the rest of regions such as Europe, Asia, Australia and Japan. Each index rolls biannually in March and September. Credit events that trigger settlement for individual components are bankruptcy and failure to pay, and are subsequently settled via credit event auctions. For traders, trading CDS indices is more attractive since they are allowed to trade large sizes and confirm all trades electronically. Stronger support from dealers and industry participants has prominently enhanced liquidity in all market conditions. The transparency of CDS markets has gradually improved since the default of Lehman (Avellaneda and Cont 2010). Central clearing and increased reporting of CDS trades to data repositories are important steps towards increased transparency, which regulators intend to use for monitoring and enhancing market stability. As such, they are quite acceptable as a representative benchmark of the overall market credit risk.

The indices quoted on a spread basis are selected by its regions: North American (CDX), Europe (iTraxx EU), by maturities: 5 and 10-year, by credit ratings: investment-grade (IG) and high-yield grade (HY). From October 2004 to June 2011, these eight indices with different regions, maturities and credit ratings will be analyzed in the subsequent sections. The US subprime crisis period is emphasized since the function of money markets in the U.S. was severely impaired in the summer of 2007, and then even further following the collapse of Bear Sterns in mid-March 2008 and the bankruptcy of Lehman Brother in September 2008. The turmoil from June 2007 to July 2009 is referred to a crisis period. After mapping the trading date among eight CDS indices, each index has 315 weekly observations: 134 in the pre-crisis period (from October 2004 to May 2007), 104 in the crisis period (from June 2007 to July 2009) and 76 in the post-crisis period (from August 2009 to June 2011). Table 1 summarizes the descriptive statistics for the entire sample period, the pre-crisis, the crisis and the post-crisis period. During the crisis period, the average changes of CDS spreads are all apparently positive, and are extremely volatile.

Table 1 Summary statistics for entire sample period, pre-, during and post-crisis period

The time-variations of CDS indices as displayed in Fig. 1 exhibit a changing dynamic. One noticeable feature is a high level of comovement across various maturities and credit ratings, which motivates the study of common factors. Specially, in Fig. 1 the apparent spike during the outbreak of the U.S. subprime crisis shows an inversion of the risk structure. For a given maturity, a high-yield (HY) index should be higher than an investment-grade (IG) one to compensate for a higher default risk taken by investors. The default risk premium between a HY and an IG may expand during the financial crisis to reflect a shift in investor risk appetite. Due to this changing risk attitude in a distressed time, risk-averse investors require a higher default risk premium. Pan and Singleton (2008) claimed that a comovement effect in the CDS markets is partly caused by a shift in investor risk appetite, especially for the turbulent period.

Fig. 1
figure 1

Time series plots of CDX index and iTraxx EU index

Figure 1 also shows the term structure of CDS markets. Normally, the slope of CDS term structure is upward in which the longer-term CDS spreads are higher than the respective shorter-term ones due to a greater risk-taking in longer maturity contracts. In this regard, the term structure should never be inverted. But, the term structure did occasionally invert, especially during the financial crisis (Pan and Singleton 2008). For an upcoming crisis, the demand for short-term CDS contracts is appealing. To cover a higher hedging cost faced by protection sellers, the bid-ask spreads of short-term contracts should be comparable to those of longer-dated contracts. As shown in Fig. 1, we have consistent evidence in the CDS term structure of an inverted slope in the crisis period and an upward slope in the rest of periods.

3 Factor representation of CDS spreads change

3.1 Model specifications

Let \(S_{it}\) be the observed change of CDS spreads for the \(i\hbox {th}\) cross-section unit at time \(t\), for \(i=1,\ldots ,N\), and \(t=1,\ldots ,T\). The factor model for given \(i\)th unit is:

$$\begin{aligned} S_{it} = {\varvec{F}}_t \lambda _i +e_{it} \end{aligned}$$
(1)

where \(F_t\) is a vector of common factors and is not observable, \(\lambda _i\) is a vector of factor loadings associated with \(F_t\), and \(e_{it}\) is the idiosyncratic component of \(S_{it} \). It is assumed that factors and idiosyncratic disturbances are mutually uncorrelated, \(E\left( {F_t ,e_{it} } \right) =0\). Obviously, Eq. (1) is the static factor representation of the change of the CDS spreads. For the forecasting exercise in subsequent sections, we will invoke the assumptions about the cross-sectional and temporal dependence in the idiosyncratic components.

The asymptotic principal components technique established by Stock and Watson (2002) and Bai and Ng (2002) can be used to consistently estimate the common factors. One starts with an arbitrary number of factors \(k\left( {k<min\left\{ {N,T} \right\} } \right) \) and estimates \(\lambda ^{k}\) and \(F^{k}\) by solving:

$$\begin{aligned} \left( {\lambda ^{k},F^{k}} \right) = \hbox {arg} \mathop {min}\limits _{\varLambda ^{k},F^{k}} \left( {NT} \right) ^{-1}\sum \limits _{i=1}^N \sum \limits _{t=1}^T \left( {S_{it} - {\varvec{F}}_t^k \lambda _i^k } \right) ^{2} \end{aligned}$$
(2)

subject to the normalization of either \({\varvec{\varLambda }}^{k^\mathrm{T}}{\varvec{\varLambda }}^{k}/N=I_k \) with \({\varvec{\varLambda }}^{k}=\left[ {\lambda _1^k \ldots \lambda _N^k } \right] ^\mathrm{T}\) or \({\varvec{F}}^{k^\mathrm{T}} {\varvec{F}}^{k}/T=I_k\). One solution of this optimization is given by \(\left( {\hat{\varLambda }^{k},\hat{F}^{k}} \right) \), where \(\hat{{\varvec{\varLambda }}}^{k}\) is \(\sqrt{N}\) times the eigenvectors corresponding to the \(k\) largest eigenvalues of the \(N\times N\) matrix \({\varvec{S}}^\mathrm{T}{\varvec{S}}\) where \({\varvec{S}}\) is a \(T\) by \(N\) dimension matrix comprising \(N\) units until time \(T\), and \(\hat{{\varvec{F}}}^{{\varvec{k}}} = {\varvec{S}}\hat{{\varvec{\varLambda }}}^{k}/N\).

3.2 Common principal components in the different sub-periods

In Table 2 we present the results for the factor model using the CDS index data, and find that a four-factor model in general explains up to 90.5 % of the variance in the changes of CDS spreads. The first factor explains 63 % of the variance of the change of CDS spreads, the explained variance of the second, third and fourth factors are 12.1, 8, and 7.4 %. When turning to three sub-periods, the first factor explains 58.7 % of the variance in the pre-crisis period, 72.3 % of the variance in the crisis period and 47 % of the variance in the post-crisis period. The fraction of CDS variation explained by the first principal component increases from 58.7 % before the crisis to 72.3 % during the crisis period, but declines to 47 % after the crisis. The CDS spreads during the crisis are increasingly driven by common factors and less by idiosyncratic components, which is evident by an increased explanatory power up to 94.1 %.

Table 2 Explained variance by principal component analysis

To formally test whether the eigenstructures across three sub-periods are distinct, we perform a likelihood ratio test comparing a restricted (the Common Principal Components (CPC) model) against the unrestricted model (the model where all covariances are treated separately). The likelihood ratio statistic is given by

$$\begin{aligned} \hbox {T}_{\left( {n_1 ,n_2 ,\ldots ,n_h } \right) \,=\, -2\hbox {log}\frac{L\left( {\hat{\varSigma }_1 ,\ldots ,\hat{\varSigma }_h } \right) }{L\left( {S_1 ,\ldots ,S_h } \right) }} \end{aligned}$$
(3)

where \(\varSigma _i =\Gamma \Lambda _i \Gamma ^\mathrm{T}, i=1,\ldots , h\), is a positive definite \(N\times N\) covariance matrix for every \(i\), \(\Gamma =\left( {\gamma _1 ,\ldots ,\gamma _N } \right) \) is an orthogonal \(N\times N\) transformation matrix and \(\varLambda _i =\hbox {diag}\left( {\vartheta _{i1} ,\ldots ,\vartheta _{iN}} \right) \) is the matrix of eigenvalues where all \(\vartheta _i\) are assumed to be distinct. The CPC is motivated by the similarity of the covariance matrices in the \(h\)-sample problem. The basic assumption of CPC is that the space spanned by the eigenvectors is identical across several groups, whereas variances associated with the components are allowed to vary (Flury 1988).

Let \(S\) be the sample covariance matrix of an underlying \(N\)-variate normal distribution with sample size \(n\). Then the distribution of nS has \(n-1\) degree of freedom and is known as the Wishart distribution.

$$\begin{aligned} nS\sim W_N \left( {\Sigma , n-1} \right) \end{aligned}$$

Hence, for Wishart covariance matrices \(S_{i}, i=1, \ldots , h\) with sample size \(n_{i}\), the likelihood function can be expressed as

$$\begin{aligned} L\left( {\varSigma _1 ,\ldots ,\varSigma _h } \right) =\hbox {C}\mathop {\prod \limits _{i=1}^h} \hbox {exp}\left[ {\hbox {tr}\left\{ {-\frac{1}{2}\left( {n_i -1} \right) \varSigma _i^{-1} S_i } \right\} } \right] \left| {\varSigma _i } \right| ^{-\frac{1}{2}\left( {n_i -1} \right) } \end{aligned}$$
(4)

where C is a constant independent of the parameters \(\varSigma _i\). See Härdle and Simar (2011), inserting (4) to (3), the likelihood ratio statistic is obtained and has a \(\chi ^{2}\) distribution as min(\(n_{i}\)) tends to infinity with

$$\begin{aligned} h\left\{ {\frac{1}{2}N\left( {N-1} \right) +1} \right\} -\left\{ {\frac{1}{2}N\left( {N-1} \right) +hN} \right\} =\frac{1}{2}\left( {h-1} \right) N\left( {N-1} \right) \end{aligned}$$

degree of freedom. Using \(h=3\) sub-periods sample covariance matrix data, the calculation yields 897.54 for the likelihood ratio statistic, which corresponds to a zero \(p\)-value for the \(\chi ^{2}\left( {56} \right) \) distribution. Hence, the CPC model is rejected against the unrestricted model, where the PCA model is applied to each sub-period separately. The finding indicates that the eigenstructures across three sub-periods, pre-, during and post-crisis, are dramatically distinct. There is no common eigenstructures (e.g. of CPC type) for these periods. Indeed, the outbreak of subprime credit crisis has led to a structure change in the commonality of CDS markets.

3.3 Interpreting the factor loadings

To get a better feel from the estimated factor loadings in Table 3, we plot the estimated factor loadings against credit rating and maturity in Fig. 2. The characteristics of factors seem intuitive and interpretable. For factor 1, the factor loadings all have the same sign and same magnitude across maturities and ratings. It can therefore be interpreted as a level effect. The CDS spreads, resembled in bond spreads, are sensitive to the level and movement of the interest rate. As pointed out by Longstaff and Schwartz (1995), the static effect of a higher spot rate increases the risk-neutral drift of the firm value process, which reduces the probability of default and in turn, reduces the CDS spreads. Further empirical evidence is supported by Duffie (1998) and the above references.

Table 3 Estimated factor loadings
Fig. 2
figure 2

The association between factor loadings, credit ratings and maturities

Factor 2 can be interpreted as a region effect. The factor loadings of CDX series are higher than those of iTraxx Europe family. Since the PCA technology joins the U.S. and European CDS indices, at least one factor should capture the fundamental or economic differences between the regions. It is not so straightforward to interpret factor 3 in the CDX case, but factor 3 in the iTraxx Europe case may be related to a volatility effect. In Table 3 and Fig. 2, we find that for iTraxx Europe, the factor loadings of HY are higher than those of IG. The evidence that the HY spreads are more sensitive to volatility than IG ones is well documented in the literature. The contingent–claims approach implies that the debt claim has features similar to a short position in a put option. Since option values increase with volatility, increased volatility increases the probability of default. Finally, we interpret factor 4 as a term structure effect. This is certainly clear because in Table 3 and Fig. 2, the sign of loading of 5-year CDS spreads is always negative while that of 10-year CDS spreads is positive. This is in accordance with Pan and Singleton (2008) who found that the term structure of CDS spreads is associated with a default risk premium. An increase in the default risk premium pushes up the long-term CDS spreads more than the short-term CDS spreads, leading to a steeper term structure of CDS spreads.

We admit that the information from Fig. 2 is insufficient to label the latent factors, therefore we have regressed the latent factors on the economic variables and find that it’s not easy to label the factors by the chosen economic variables.Footnote 1 The difficulty is attributable to that the chosen economic variables such as the change of interest rate, the change of yield curve, the credit spread change and the change of VIX level generally exhibit the indistinguishable contributions or explanatory powers for the latent factors. In our findings, the latent factors are linear combination of the economic variables. These economic variables are highly correlated since they are governed by the same latent factors. Applying them together into the regression may result in a collinearity problem and bias our interpretation. For instance, in our case the change of VIX level almost dominates across the four factors. Eichengreen et al. (2012) claim that the exact association of a economic variable with any one of the latent factors is hard to define due to non-uniqueness of the factor estimates. Although our interpretation for Fig. 2. is not testable, the information from Fig. 2 helps to propose the observed economic variables in the subsequent analysis.

3.4 Connecting latent factors with observed variables

To realize the degree of association between the unobservable factors and observable economic variables, and to answer the question of interest; whether some of the observables are in fact underlying latent factors, we apply the method developed by Bai and Ng (2006) to determine if the observed and the latent are identical. The observed indicator with a stronger coherence with the latent factors is a good proxy. Two statistical criteria, the \(R^{2}\) and the noise-to-signal ratio, are used to examine whether any of the economic series yields the same information that is contained in the factors.

Let \(G_t\) be an \(J\)-dimensional vector of observed economic variables. The basic idea behind the test developed by Bai and Ng (2006) is to investigate whether any of the economic series can be represented as a linear combination of the latent factors by permitting a limited degree of noise in this association, thus

$$\begin{aligned} G_{j,t} =\beta _j^\mathrm{T} \hbox {F}_t +\varsigma _{j,t} \end{aligned}$$
(5)

where \(\beta _j\) is estimated by the OLS regression, and \(\varsigma _{j,t}\) is denoted as the error term. The above equation yields the predicted value \(\hat{G}_{j,t} =\hat{\beta }_j^\mathrm{T} \hat{\hbox {F}}_t \). \(R^{2}\left( j \right) \) is designed to measure the association between \(G_{j,t}\) and \(\hat{G}_{j,t}\), and defined as:

$$\begin{aligned} R^{2}\left( j \right) =\frac{\widehat{\hbox {var}} \left( {\hat{G}_j } \right) }{\widehat{\hbox {var}} \left( {G_j } \right) } \end{aligned}$$
(6)

where \(\widehat{\hbox {var}} \left( \cdot \right) \) denotes the sample variance and \(\widehat{\hbox {var}} \left( {\hat{G}} \right) \) is computed by using the sample analog of the factors’ asymptotic covariance matrix. \(R^{2}\left( j \right) \) is bounded between zero and one. It is equal to one if they have a high association, and is close to zero in the absence of correlation. A second measure NS(j), called the noise-to-signal ratio, is constructed as:

$$\begin{aligned} NS\left( j \right) =\frac{\widehat{\hbox {var}} \left( {\hat{\varsigma }_j} \right) }{\widehat{\hbox {var}} \left( {G_j } \right) } \end{aligned}$$
(7)

A larger NS(j) thus indicates an important departure of \(G_j\) from the latent factors. Normally, the magnitude of \(R^{2}\left( j \right) \) is reverse to that of \(NS\left( j \right) \) since the sum of \(R^{2}\left( j \right) \) and \(NS\left( j \right) \) should be equal to one.

As further observed economic variables in Eq. (5), one may include the change of the interest rate level, change of the credit spread, change of the interest rate term structure and the change of the stock index volatility. These variables are suggested by Collin-Dufresne et al. (2001), Benkert (2004) and Ericsson et al. (2009) since they are important determinants of credit assets. We limit our attention to the U.S. variables because the corresponding European variables are highly correlated with the U.S. series. The 1-year Treasury bond rate represents the level of the risk-free interest rate in the U.S. The difference between the 10-year Treasury bond rate and the 1-year Treasury bond rate is used to evaluate the slope of the yield curve in the U.S. The credit spread in the U.S. is the difference between the average Moody’s Baa yield and the average Moody’s Aaa yield of U.S. corporate bonds. We also employ the CBOE VIX index to measure generalized risk aversion.

Table 4 shows the association of the first four factors with the chosen economic variables. For the entire sample period, the \(R^{2}\) criterion gives a value of 0.3 and 0.375 on the credit spread and VIX index, respectively. The four factors are more correlated with the credit spread and VIX, and less correlated with the level and the term structure of the interest rate. This finding is accordance with Cao et al. (2010), Cremers et al. (2008) and Collin-Dufresne et al. (2001). The implication is that perceptions of credit risk were shaped by the common factors that are best summarized by credit spread and a generalized risk aversion. In other words, the result suggests that a higher credit spread or a higher generalized risk aversion does actually translate into systematic credit risk. Analogically, the sub-period analysis reports that credit spread and VIX are relatively correlated with the latent factors prior to the crisis. During the crisis, the \(R^{2}\) criterion even gives a value of 0.418 on credit spread, implying that the latent factors are best summarized by credit spread. The post-crisis analysis reveals that a generalized risk aversion with 0.59 \(R^{2}\) criterion is highly associated to common factors.

Table 4 The association between the latent factors and the economic variables

3.5 Factor risk prices

How the market prices the factor risk inherent in the CDS spreads is of interest, since one can deduce how the market compensates investors, often referred to as the protection sellers, for bearing credit risk. If we fit the factor model into the framework of the arbitrage pricing theory (Ross 1976), the factor model for an \(N\)-dimensional returns on CDS indices of different credit ratings, maturities and regions, \(R_t\), at time \(t\) can be presented as

$$\begin{aligned} R_t =\lambda \varUpsilon +\lambda F_t +e_t \end{aligned}$$
(8)

The arbitrage pricing theory states that the cross-section returns, \(R_t\), are determined by \(K\) common factors \(F_t\) through the \(N\times K\) factor loading matrix \(\lambda \). Given the assumption that the unobservable common factor \(F_t\) and error term \(e_t\) are \(i.i.d\). distributed, the elements of the \(K\)-dimensional vector \(\varUpsilon \) can be interpreted as the market prices of factor risk. Eq. (8) implies that the expected CDS returns satisfy

$$\begin{aligned} \hbox {E}\left( {R_t } \right) =\lambda \varUpsilon \end{aligned}$$
(9)

Given the estimated factor loadings \(\lambda \), we can estimate the prices of factor risk \(\varUpsilon \) by the generalized methods of moments (GMM) (Hansen 1982) on the moment restrictions in Eq. (9). This is equivalent to a GLS regression of the average changes of CDS indices on the factor loading matrix \(\lambda \). Since we adopted a four-factor model in the previous sections, the GMM method enables us to estimate the prices of factor risk in this model and test their significance. As shown in Table 5, the market prices of a four-factor model are all significant, and the first two factors exhibit a promising size in their risk prices. If we consider a five-factor model, the risk prices are significant in the first four factors but insignificant in the fifth factor.

Table 5 Estimation of factor risk prices

Table 5 also contains the GMM \(J\)-statistic, a test statistic for testing the over identifying restrictions in Eq. (9), and the corresponding \(p\) value. The \(J\)-statistic acts as an omnibus test statistic for model miss-specification. In a well specified over identifying model with valid moment conditions, the \(J\)-statistic behaves like a Chi-square random variable with degrees of freedom equal to the number of over identifying restrictions. Typically, a large \(J\)-statistic indicates a miss-specified model. In Table 5, the \(J\)-statistics in the both four- and five-factor models cannot reject the null hypothesis, implying that both models are well-specified. Furthermore, the four- and five-factor models provide a good fit, as measured by the \(R^{2}\) of the GLS regression, which is equal to 95.42 and 95.89 %, respectively. The results from \(J\)-statistic, \(R^{2}\) of the GLS and the significance of factor prices suggest that the four-factor model is efficient enough to measure the CDS returns.

4 Method of asymptotic principal components and forecast performance

4.1 Competing factor models

According to this study and previous literature, the common latent factors extracted from factor models have proven their representative ability for systematic credit risk. This motivates us to examine whether modelling the time series properties of the factors can improve our ability to forecast the time-variation of CDS index changes. Acting as the benchmark model, the static model in Eq. (1) is too restricted to accommodate the realistic time-variation. The latent factors it produces can only follow one of the few plausible, realistic patterns that do actually appear in the credit markets. The generalized models in which the factors could be defined in a general way are developed to minimize the gap, and should entail less restrictions.

The dynamic factor model, a simple vector autoregressive (VAR) specification, is the first shown to achieve a remarkable fit of the factors’ dynamics. By permitting a VAR specification in the factors with autoregressive parameters \({\varvec{B}}\), this model captures the common dynamics in the cross-sectional analysis. Additionally, the error term, \({\varvec{u}}_t\), from a VAR equation in Eq. (10) is conditionally heteroscedastic and follows a GARCH(\(p,q\)) process.

$$\begin{aligned}&\displaystyle S_{it} = {\varvec{F}}_t \lambda _{it} +e_{it} \nonumber \\&\displaystyle \left( {{\varvec{I}} - {\varvec{B}}_1 L-\cdots - {\varvec{B}}_h L^{h}} \right) {\varvec{F}}_t = {\varvec{u}}_t \end{aligned}$$
(10)
$$\begin{aligned}&\displaystyle {\varvec{u}}_t = {\varvec{H}}_t^{1/2} {\varvec{\eta }}_t \end{aligned}$$
(11)
$$\begin{aligned}&\displaystyle vech\left( {{\varvec{H}}_{\varvec{t}}} \right) = {\varvec{c}} + \sum \limits _{j=1}^q {\varvec{A}}_{\varvec{j}} vech\left( {{\varvec{u}}_{t-j} {\varvec{u}}_{t-j}^\mathrm{T} } \right) + \sum \limits _{j=1}^p {\varvec{D}}_{\varvec{j}} vech\left( {{\varvec{H}}_{t-j} } \right) \end{aligned}$$
(12)

where \(i=1,\ldots ,N,t=1,\ldots ,T, {\varvec{F}}_t\) is \(T\times k\) and \({\varvec{\lambda }}_{it}\) is \(k\times 1\). \({\varvec{\eta }}_t\) is white noise.

To take into account the possibility that the idiosyncratic errors in Eq. (1) may entail serial and cross-section correlation, the dynamic factor with dependent error model is built with additional assumptions on the idiosyncratic components shown in Eqs. (13), (14) and (15).

$$\begin{aligned} S_{it} = {\varvec{F}}_t \lambda _{it} +e_{it} \end{aligned}$$
$$\begin{aligned} \left( {{\varvec{I}}-{\varvec{B}}_1 L-\cdots - {\varvec{B}}_h L^{h}} \right) {\varvec{F}}_t = {\varvec{u}}_t \end{aligned}$$
$$\begin{aligned} {\varvec{u}}_t = {\varvec{H}}_t^{1/2} {\varvec{\eta }}_t \end{aligned}$$
$$\begin{aligned} vech\left( {{\varvec{H}}_{\varvec{t}}} \right) = {\varvec{c}} + \sum \limits _{j=1}^q {\varvec{A}}_{\varvec{j}} vech\left( {{\varvec{u}}_{t-j} {\varvec{u}}_{t-j}^\mathrm{T} } \right) + \sum \limits _{j=1}^p {\varvec{D}}_{\varvec{j}} vech\left( {{\varvec{H}}_{t-j} } \right) \end{aligned}$$
$$\begin{aligned}&\displaystyle \left( {1-\alpha L} \right) e_{it} =\upsilon _{it} +\theta _1 \upsilon _{i+1,t} +\theta _2 \upsilon _{i-1,t} \end{aligned}$$
(13)
$$\begin{aligned}&\displaystyle \upsilon _{it} =\sigma _{it} \eta _{it} \end{aligned}$$
(14)
$$\begin{aligned}&\displaystyle \sigma _{it}^2 =\delta _0 +\delta _1 \sigma _{i,t-1}^2 +\delta _2 \upsilon _{i,t-1}^2 \end{aligned}$$
(15)

The idiosyncratic components, \(e_{it}\), in Eq. (13) are serially correlated, with an AR(1) coefficient \(\alpha \), and weakly cross-section correlated with the coefficients \(\theta _1\) and \(\theta _2\). The innovations \(\upsilon _{it}\) are conditionally heteroscedastic and follow a GARCH(1,1) process with parameters \(\delta _0 ,\delta _1\), and \(\delta _2\) in Eq. (15).

In practice, when factors are constructed over a long period, some degree of temporal instability is inevitable. Following Stock and Watson (2002), we model this instability as stochastic drift in the factor loadings, and the factor loading evolves through time with a serial correlation \(\rho _i\) shown in Eq. (16).

$$\begin{aligned} \lambda _{it} =\rho _i \lambda _{i,t-1} +\left( {c/T} \right) \zeta _{it} \end{aligned}$$
(16)

where \(\zeta _{it}\) is white noise. Equation (16) implies that factor loadings for the \(i\hbox {th}\) variable shift by an amount, \(\left( {c/T} \right) \zeta _{it}\), in time period \(t\). In addition, it keeps a relationship with its previous level which is measured by \(\rho _i\). The time-varying factor loading model ideally incorporates all of the features covering from Eqs. (10) to (16). Whether this model is more superior due to its abundant generalization will be examined with respect to its predictive ability, and will be analyzed in the subsequent section.

4.2 Out-of-sample forecasting performance

Having proposed the competing models developed by more general ways, we take an explicit out-of-sample forecasting approach to evaluate their predicting performance regarding the CDS dynamics. Using the previous 1-year weekly data, we estimate the parameters and produce a 1-week ahead forecast. After estimation, we find that the dynamic of the CDS index captured by these factor models exhibits significant time-variation and persistence, and we summarize their forecasting performance in Table 6. The most outperformed one can be potentially applied to price credit risk accurately and achieve a better credit risk management.

Table 6 Forecasting performance

To assess an out-of-sample forecasting performance, for each proposed model we compute each day \(t\), the following four measures (a) mean squared error (MSE) between the observed change of CDS spreads and the predicted change of CDS spreads from the competing factor models; (b) mean absolute error (MAE); (c) mean correct prediction (MCP) of the direction of change in CDS spreads. The MCP exhibits the average numbers from \(N\) CDS indices are correctly forecast based on their signs of changes; (d) the trace of \(R^{2}\) of the multivariate regression of \(\hat{{\varvec{S}}}\) onto \({\varvec{S}}\),

$$\begin{aligned} R_{\hat{S} ,S}^2 =\hat{E} \parallel {\varvec{P}}_{\varvec{S}} \hat{{\varvec{S}}} \parallel ^{2} {\big /}\hat{E} \parallel \hat{{\varvec{S}}} \parallel ^{2}=\hat{E} tr\left( {\hat{{\varvec{S}}}^\mathrm{T} {\varvec{P}}_{\varvec{S}} \hat{{\varvec{S}}}} \right) {\big /}\hat{E} tr\left( {\hat{{\varvec{S}}}^\mathrm{T}\hat{{\varvec{S}}}} \right) \!, \end{aligned}$$
(17)

where S is a \(T\times N\) matrix comprising \(N\) units until time \(T, \hat{E}\) denotes the expectation estimated by averaging the relevant statistic and \({\varvec{P}}_{\varvec{S}} = {\varvec{S}} \left( {{\varvec{S}}^\mathrm{T}{\varvec{S}}} \right) ^{-\mathbf{1}}{\varvec{S}}^\mathrm{T}\). As shown in Table 6, the time-varying factor loading model exhibits the best 1-week ahead point-forecast performance with the lowest MSE, MAE and the highest MCP, trace of \(R^{2}\). For each model, we measure the forecasting performances under different numbers of factors that range from one to seven. Table 6 indicates that the dynamic factor model and the time-varying factor loading model constitute a promising improvement over the static factor model. A poorest forecast performance in the static factor model implies that the factors exhibit persistency, predictability and temporal instability, and these characteristics contribute to the prediction on the changes of CDS spreads. We further conduct a test for their equal predictive ability against the static factor model in Sect. 4.3.

Determining the number of factors can be regarded as a model selection problem, which is a trade-off between goodness-of-fit and parsimony. Following Bai and Ng (2002), the number of factors is estimated by an information criteria function (IC):

$$\begin{aligned} k=\arg {\min }_{0\le k \le kmax} IC\left( k \right) \end{aligned}$$
(18)

where \(\hbox {IC}\left( k \right) =\hbox {log}\left( {V\left( {k,\hat{{\varvec{F}}}^{k}} \right) } \right) +kg\left( {N,T} \right) \). \(V\left( {k,\hat{{\varvec{F}}}^{k}} \right) \!=\!\frac{1}{NT} {\sum \nolimits _{i=1}^N} {\sum \nolimits _{t=1}^T} \left( S_{it}\right. \left. -\hat{{\varvec{F}}}_t^k \lambda _i^k \right) ^{2}\) is simply the average residual variance, and \(g\left( {N,T} \right) \) is a penalty function for overfitting. Bai and Ng (2002) have proposed three specific formulations of \(g\left( {N,T} \right) \) that depend on both \(N\) and \(T\).

$$\begin{aligned}&IC_{p1} \left( k \right) =\hbox {log}\left( {V\left( {k,\hat{{\varvec{F}}}^{k}} \right) } \right) +k\left( {\frac{N+T}{NT}} \right) \hbox {log}\left( {\frac{NT}{N+T}} \right) \end{aligned}$$
(19)
$$\begin{aligned}&IC_{p2} \left( k \right) =\hbox {log}\left( {V\left( {k,\hat{{\varvec{F}}}^{k}} \right) } \right) +k\left( {\frac{N+T}{NT}} \right) \hbox {log}\left( {min\left\{ {N,T} \right\} } \right) \end{aligned}$$
(20)
$$\begin{aligned}&IC_{p3} \left( k \right) =\hbox {log}\left( {V\left( {k,\hat{{\varvec{F}}}^{k}} \right) } \right) +k\left( {\frac{\hbox {log}\left( {min\left\{ {N,T} \right\} } \right) }{min\left\{ {N,T} \right\} }} \right) \end{aligned}$$
(21)

Table 6 summarizes the results of the IC function and shows that for both the static factor model and the dynamic factor model, the one-factor model with the minimized information criteria is the best one to model the common factors in the changes of CDS spreads. However, for both the dynamic factor with dependent errors model and for the time-varying factor loading model, the two-factor model is relatively adequate.

4.3 Testing equal predictive ability

To formally assess the statistical significance of the superior out-of-sample performance of the dynamic factor models over the static factor model, we employ the equal predictive ability test of Diebold and Mariano (1995) and report the testing results in Table 7. Diebold and Mariano (1995) propose a method for measuring and assessing the significance of divergences between two competing forecasts, and allow for forecast errors that are potentially non-Gaussian, serially correlated and contemporaneously correlated.

Table 7 Comparing predictive accuracy

To be specific, let \(d_t\) be the loss differential between two forecast errors. The null hypothesis is no difference in the accuracy of two forecasts, that is \(\hbox {E}d_t =0\). The asymptotic distribution of the sample mean loss differential is:

$$\begin{aligned} \sqrt{T}\left( {\bar{d} -\mu } \right) \sim N\left( {0,2\pi f_d \left( 0 \right) } \right) \end{aligned}$$
(22)

where \(f_d \left( 0 \right) \) is the spectral density of the loss differential at frequency 0.

The statistical significance of the difference in forecast errors between the models is summarized in Table 7. The tabulated \(p\) values indicate that we can reject the null hypothesis of equal forecasting ability between the static factor model and the time-varying factor model. We also reject the equal predicting ability between the static factor model and the dynamic factor with dependent errors model. With the exception in CDX 5-year IG and 10-year HY indices, the equal predictive ability between the static factor model and the dynamic factor model is rejected. Furthermore, to claim that the time-varying factor model is the best one, we compare its forecast ability with the dynamic factor model, and the dynamic factor with dependent errors model. We find that significant differences exist in their predicting ability in both cases.

In summary, the results in Table 6 together with Table 7 indicate that the time-varying factor model reveals a statistically significant outperformance for most of the cases, suggesting that common factors drive the time-variation of CDS spreads and that the dynamics in the factors exhibit moderate predictability in the short-run. As evident, the temporal instability in the common factors is inevitable and contributes to forecasting. However, the serial or cross correlation in the idiosyncratic components only have little effect on the forecasts, implying that the common factors dominate the predicting performance. The predictability of CDS spreads changes, certainly benefits the hedging, speculating and arbitraging activities in the credit markets.

5 Conclusion

The commonalities in CDS spreads and their factor loadings are analyzed in this study. We collect CDS indices in North American and Europe with 5- and 10-year maturities, and with different credit ratings (IG and HY) from October 2004 to June 2011. The estimated risk factors can be interpreted as the level, the region, the volatility and the term structure effect. By conducting a test if there are common principal components, we find that the eigenstructures are distinct for the pre-, during and post-crisis periods. The first factor explains 58.7 % of the variance in the pre-crisis period, 72.3 % of the variance in the crisis period and 47 % of the variance in the post-crisis period, indicating that during the crisis, CDS spreads are increasingly driven by common factors and less by idiosyncratic components. We also find that during the crisis the latent factors are more correlated with the credit spread and VIX, and less correlated with the level and the term structure of the interest rate.

The time-variation of CDS spreads changes is modelled via various dynamic factor models. We apply the asymptotic principal component technique to extract the common factors, and then determine the number of factors by information criteria functions. The out-of-sample forecasting performance and the results of equal predictive ability indicate that the common factors drive the time-variation of CDS spreads and the dynamics in the factors exhibit moderate predictability in the short-run. In addition, the temporal instability in the common factors is inevitable and contributes to forecasting, but the serial or cross correlation in the idiosyncratic components have little effect on the forecasts.