Abstract
In this paper, we study large losses arising from defaults of a credit portfolio. We assume that the portfolio dependence structure is modelled by the Archimedean copula family as opposed to the widely used Gaussian copula. The resulting model is new, and it has the capability of capturing extremal dependence among obligors. We first derive sharp asymptotics for the tail probability of portfolio losses and the expected shortfall. Then we demonstrate how to utilize these asymptotic results to produce two variance reduction algorithms that significantly enhance the classical Monte Carlo methods. Moreover, we show that the estimator based on the proposed two-step importance sampling method is logarithmically efficient while the estimator based on the conditional Monte Carlo method has bounded relative error as the number of obligors tends to infinity. Extensive simulation studies are conducted to highlight the efficiency of our proposed algorithms for estimating portfolio credit risk. In particular, the variance reduction achieved by the proposed conditional Monte Carlo method, relative to the crude Monte Carlo method, is in the order of millions.
1 Introduction
Credit risk in the banking and trading books is one of the largest financial risk exposures for many financial institutions. As such, quantifying the risk of a credit portfolio is essential in credit risk management. One of the key challenges in credit risk management is the accurate modelling of dependence between obligors, particularly in the tails. This is attributed to two phenomena. First, many financial institutions’ exposure to credit risk is not confined to a single obligor but extends to a large portfolio of multiple obligors. Second, simultaneous defaults have been observed empirically in large credit portfolios as financial institutions are affected by common macroeconomic or systemic factors. This suggests that obligors tend to exhibit stronger dependence in a stressed market, and hence simultaneous defaults tend to be more likely. For this reason, the model for the dependence structure of the default events has a direct impact on the tail of the loss for a large portfolio. In light of these issues and challenges, the first objective of this paper is to analyze the large credit portfolio loss when the obligors are modeled as strongly dependent. Building on the asymptotic analysis, the second objective of the paper is to propose two variance reduction simulation algorithms that provide efficient estimation of the risk of a credit portfolio.
To accomplish the above two objectives, we model the credit risk based on so-called threshold models, which are widely used to capture the event of default for an individual obligor within the portfolio. A default in a threshold model occurs if some critical random variable, usually called a latent variable, exceeds (or falls below) a pre-specified threshold. The dependence among defaults then stems from the dependence among the latent variables. It has been found that the copula representation is a useful tool for studying the dependence structure. Specifically, the copula of the latent variables determines the link between marginal default probabilities for individual obligors and joint default probabilities for groups of obligors. Most threshold models used in the industry are based explicitly or implicitly on the Gaussian copula; see, for example, CreditMetrics (Gupton et al., 1997) and Moody’s KMV system (Kealhofer & Bohn, 2001). While Gaussian copula models can accommodate a wide range of correlation structures, they are inadequate for modeling extremal dependence between the latent variables, as these models are known to exhibit weaker dependence in the tails of the underlying random variables (see, for example, Section 7.2.4 of McNeil et al. (2015) for further discussion on tail dependence). This limitation raises considerable concern since, as noted earlier, when market conditions worsen, simultaneous defaults can occur with nonnegligible probability in large credit portfolios. To better reflect the empirical evidence, copulas that can capture “stronger” tail dependence among obligors, such as the t-copula and its generalizations, have been proposed (see Bassamboo et al., 2008; Chan & Kroese, 2010; Tang et al., 2019). As pointed out in Section 11.1.4 of McNeil et al. (2015), the Archimedean copula is another plausible class of dependence models for obligors.
The Archimedean copula offers great flexibility in modeling dependence, as it is capable of covering dependence structures ranging from independence to comonotonicity (perfect dependence). Typical examples of Archimedean copulas include the Clayton, Gumbel and Frank copulas. Because of their flexibility in dependence modeling, Archimedean copulas have been applied in credit risk (Cherubini et al., 2004; Hofert, 2010; Hofert & Scherer, 2011; Naifar, 2011), insurance (Frees & Valdez, 1998; Embrechts et al., 2001; Denuit et al., 2004; Albrecher et al., 2011; Cossette et al., 2018), and many other areas of application, such as Genest and Favre (2007), Zhang and Singh (2007) and Wang (2003). See also Charpentier and Segers (2009), Hofert et al. (2013), Okhrin et al. (2013) and Zhu et al. (2016) for higher dimensional applications of Archimedean copulas and their generalizations (such as hierarchical Archimedean copulas) in finance and risk management. Motivated by their dependence modeling flexibility and their wide applications, this paper likewise uses the Archimedean copula to model the dependence of obligors in order to account for the market phenomenon of simultaneous defaults. Moreover, as will be discussed in Sect. 3.1, when obligors are modeled with Archimedean copulas, the threshold model can be understood as a one-factor Bernoulli mixture model, which leads to great convenience in the asymptotic analysis and simulation of large portfolio losses in the later sections.
In terms of quantifying portfolio credit risk, the most popular measure is the probability of large portfolio loss over a fixed time horizon, say, a year (see Glasserman, 2004; Glasserman et al., 2007; Tang et al., 2019). The expected shortfall of large portfolio loss, which has been found to be very useful in risk management and in the pricing of credit instruments, is another important measure of credit risk. A general discussion on quantifying portfolio credit risk can be found in Hong et al. (2014). With the extremal dependence modelled by the Archimedean copula, there are no analytic expressions for the above two measures, so we need to rely on numerical methods to evaluate them. While the Monte Carlo (MC) simulation method is a popular alternative numerical tool, a naive application of the method to evaluate these measures is very inefficient, since defaults of high-quality obligors are rare events and naive MC is notoriously inefficient for rare-event applications. Hence, variance reduction techniques such as importance sampling (Glasserman & Li, 2005; Bassamboo et al., 2008; Glasserman et al., 2008) and the conditional Monte Carlo method (Chan & Kroese, 2010) have been proposed to increase the efficiency of MC methods for estimating these measures.
We now summarize the key contributions of the paper in studying the large portfolio loss from the following two aspects. By exploiting the threshold model and using an Archimedean copula to capture the dependence among latent variables, the first key contribution is to derive sharp asymptotics for two performance measures: the probability of large portfolio loss and the expected shortfall. These results quantify the asymptotic behavior of the two measures when the portfolio size is large and provide a better understanding of how dependence affects the large portfolio loss. While the effectiveness of estimating the portfolio loss from the asymptotic expansions may deteriorate if the portfolio size is not sufficiently large, the expansions remain very useful as they provide a solid foundation for the MC based algorithms that we develop subsequently. In particular, the second key contribution of the paper is to exploit these asymptotic results and propose two efficient MC based methods for estimating the above two performance measures. More specifically, the first is a two-step full importance sampling algorithm that provides efficient estimators for both measures; we show that the resulting estimator of the probability of large portfolio loss is logarithmically efficient. The second algorithm, based on the conditional Monte Carlo method, is shown to have bounded relative error. Relative to the importance sampling method, the conditional Monte Carlo algorithm has the advantage of simplicity, though it can only be used to estimate the probability of portfolio loss. Simulation studies also show that the second algorithm performs better than the first. Overall, both algorithms generate significant variance reductions when compared to naive MC simulations.
The rest of the paper is organized as follows. We formulate our problem in Sect. 2 and describe Archimedean copulas and regular variation in Sect. 3. Main results are presented in Sects. 4, 5 and 6: Sect. 4 derives the sharp asymptotics, while the latter two sections present our proposed efficient Monte Carlo algorithms and analyze their performance. Through an extensive simulation study, Sect. 7 provides a further comparative analysis of the relative effectiveness of our proposed algorithms. Proofs are relegated to the “Appendix”.
2 Problem formulation
Consider a large credit portfolio of n obligors. Similar to Bassamboo et al. (2008), we employ a static structural model for the portfolio loss by introducing latent variables \(\{X_{1},\ldots ,X_{n}\}\) such that obligor i defaults if its latent variable \(X_{i}\) exceeds some pre-specified threshold \(x_{i}\), for \(i=1,2,\ldots ,n\). Denoting by \(c_{i}>0\) the risk exposure at default of obligor i, for \(i=1,2,\ldots ,n\), the portfolio loss incurred from defaults is given by
where \(1_{A}\) is the indicator function of an event A. Such a threshold model can be traced back to Merton (1974). Let \(F_{i}\) and \(\overline{F}_{i}\) be, respectively, the marginal distribution function and marginal survival function of \(X_{i}\). Then \(F_{i}=1-\overline{F}_{i}\), for \(i=1,2,\ldots ,n\). In the threshold model, \(\overline{F}_{i}(x_i)\) can be interpreted as the marginal default probability of obligor i, which we denote by \(p_{i}\).
As pointed out in the last section, the dependence among the latent variables has a direct impact on the tail of the loss for a large portfolio, and their dependence structure is conveniently modelled via copulas. Lemma 11.2 of McNeil et al. (2015) also highlights that, in a threshold model, the copula of the latent variables determines the link between marginal default probabilities and portfolio default probabilities. To see this, let \(U_{i}=F_{i}(X_{i})\) and \(p_i= \overline{F}_{i} (x_i)\) for \(i=1,\ldots ,n\). It follows immediately from Lemma 11.2 of McNeil et al. (2015) that \((X_{i},x_{i})_{1\le i\le n}\) and \((U_{i},p_{i})_{1\le i\le n}\) are two equivalent threshold models. The portfolio loss is therefore affected by the dependence among the latent variables rather than the marginal distribution of each latent variable. This is also why we conduct our analysis by focusing on the dependence structure of the obligors, under the assumption that the dependence of \((U_{1},U_{2},\ldots ,U_{n})\) is adequately captured by an Archimedean copula.
Recall that the main focus of the paper is a credit portfolio that consists of a large number of obligors, each with a low default probability. While default events are rare, the potential loss is significant once they are triggered, especially with simultaneous defaults. By the transformation \(U_{i}=F_{i}(X_{i})\) for \(i=1,\ldots ,n\), the default event of obligor i, \(\{X_{i}>x_{i}\}\), is equivalent to \(\{U_{i}>1-p_{i}\}\). By the theory of diversification, the probability of large portfolio loss should diminish as n increases. To capture this feature, the individual default probability can also be expressed as \(p_{i}=l_{i}f_{n}\) for \(i=1,\ldots ,n\), where \(f_{n}\) is a positive deterministic function converging to 0 as \(n\rightarrow \infty \) and \(\{l_{1},\ldots ,l_{n}\}\) are strictly positive constants accounting for variation across obligors. We emphasize that, on one hand, the assumption that \(f_{n}\) converges to 0 reflects the diversification effect in a large portfolio, whereby the individual default probability diminishes as n increases. On the other hand, it provides mathematical convenience for deriving sharp asymptotics for the large portfolio loss (see the discussion after Theorem 4.1) and for proving the algorithms’ efficiency (Theorems 5.1 and 6.1). Such a condition is also assumed in Gordy (2003), Bassamboo et al. (2008), Chan and Kroese (2010) and Tang et al. (2019), for example. More detailed explanations of the assumption on \(f_n\) are provided in Sect. 4.1. With this representation, we can rewrite the overall portfolio loss (2.1) as
In the remainder of the paper, we use (2.2) to analyze the large portfolio loss. To characterize the potential heterogeneity among obligors, we further impose some restrictions on the sequence \(\{(c_{i},l_{i}):i\ge 1\}\), as in Bassamboo et al. (2008).
Assumption 2.1
Let the positive sequence \(((c_{i},l_{i}):i\ge 1)\) take values in a finite set \(\mathcal {W}\). Denoting by \(n_{j}\) the number of obligors in the portfolio with characteristics \((c_{j},l_{j})\in \mathcal {W}\), we further assume that \(n_{j}/n\) converges to \(w_{j}>0\) for each \(j\le |\mathcal {W}|\) as \(n\rightarrow \infty \).
In practice, Assumption 2.1 can be interpreted as a heterogeneous credit portfolio that comprises a finite number of homogeneous sub-portfolios based on risk types and exposure sizes. We note that it is easy to relax this assumption to the case where \(c_{i}\) and \(l_{i}\) are random variables; see Tong et al. (2016) and Tang et al. (2019) for recent discussions.
3 Preliminaries
3.1 Archimedean copulas
Archimedean copulas have a simple closed form and can be represented by a generator function \(\phi \) as follows:
where \(C:[0,1]^{n}\rightarrow [0,1]\) is a copula function. The generator function \(\phi :[0,1]\rightarrow [0,\infty ]\) is continuous, decreasing and convex with \(\phi (1)=0\) and \(\phi (0)=\infty \), and \(\phi ^{-1}\) is the inverse of \(\phi \). We further assume that \(\phi ^{-1}\) is completely monotonic, i.e. \((-1)^{i}\left( \phi ^{-1}\right) ^{(i)}\ge 0\) for all \(i\in \mathbb {N}\), which allows \(\phi ^{-1}\) to be the Laplace-Stieltjes (LS) transform of a distribution function G on \([0,\infty ]\) with \(G(0)=0\). Let V be a random variable with distribution function G on \([0,\infty ]\). The LS transform of V (or G) is defined as
Archimedean copulas that are generated from LS transforms of different distributions are referred to as LT-Archimedean copulas, as formally defined below:
Definition 3.1
An LT-Archimedean copula is a copula of the form (3.1), where \(\phi ^{-1}\) is the Laplace-Stieltjes transform of a distribution function G on \([0,\infty ]\) such that \(G(0)=0\).
For many popular Archimedean copulas, the random variable V has a known distribution. For example, V is Gamma distributed for the Clayton copula, while V is a one-sided stable random variable for the Gumbel copula. A detailed specification of V can be found in Table 1 of Hofert (2008).
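For instance, for the Clayton generator \(\phi (t)=t^{-\theta }-1\) one has \(\phi ^{-1}(s)=(1+s)^{-1/\theta }\), which is the LS transform of a Gamma\((1/\theta ,1)\) distribution. A minimal numerical sketch (the parameter values are purely illustrative) confirms this identification by comparing the closed form with a Monte Carlo estimate of \(\mathbb {E}[e^{-sV}]\):

```python
import math
import random

random.seed(0)
theta = 2.0   # Clayton parameter (illustrative)
s = 1.5       # point at which to evaluate the LS transform

# Closed form: phi^{-1}(s) = (1 + s)^(-1/theta)
closed_form = (1.0 + s) ** (-1.0 / theta)

# Monte Carlo estimate of E[exp(-s * V)] with V ~ Gamma(1/theta, 1)
n = 200_000
mc = sum(math.exp(-s * random.gammavariate(1.0 / theta, 1.0))
         for _ in range(n)) / n

print(closed_form, mc)   # the two values should agree closely
```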
The following result provides a stochastic representation of \(\mathbf {U} =(U_{1},\ldots ,U_{n})\) where \(\mathbf {U}\) follows an LT-Archimedean copula of the form:
Here V is a positive random variable with LS transform \(\phi ^{-1}\) and \(\{R_{1},\ldots ,R_{n}\}\) is a sequence of independent and identically distributed (i.i.d.) standard exponential random variables independent of V. This representation was first recognized by Marshall and Olkin (1988) and later formally proved in Proposition 7.51 of McNeil et al. (2015).
The construction (3.2) is especially useful in the field of credit risk. To see this, consider the threshold model defined in (2.2) and suppose \(\mathbf {U}\) has an LT-Archimedean copula with generator \(\phi \) as defined in (3.2). Then the random variable V can be considered a proxy for systematic risk. Conditional on V, the random variables \(U_{1},\ldots ,U_{n}\) are independent with conditional distribution function \(\mathbb {P}(U_{i}\le u|V=v)=\exp (-v\phi (u))\) for \(u\in [0,1]\); in particular, the conditional default probability of obligor i given \(V=v\) is \(p_{i}(v)=1-\exp (-v\phi (1-p_{i}))\). By such a construction, the threshold model (2.2) can be represented succinctly as a one-factor Bernoulli mixture model with mixing variable V and mixing probabilities \(p_{i}(v),i=1,\ldots ,n\). This property is important in two respects. First, it facilitates the asymptotic analysis of the large portfolio loss. By viewing the model as a one-factor Bernoulli mixture model, we will show in Sect. 4 that the large portfolio loss is essentially determined by the mixing distribution of V, or equivalently its LS transform \(\phi ^{-1}\). Gordy (2003) similarly observes that asymptotic analysis provides a simple yet quite accurate way of evaluating the large portfolio loss. In the current paper, we push one step further by deriving sharp asymptotics for the large portfolio loss in a more explicit way and under the Archimedean copula model. Second, Bernoulli mixture models lend themselves to practical implementation of MC simulations (see McNeil et al., 2015; Basoğlu et al., 2018). To be more specific, a Bernoulli mixture model can be simulated by first generating a realization v of V and then conducting independent Bernoulli experiments with conditional default probabilities \(p_{i}(v)\). This generation algorithm is explicitly exploited in Sect. 5 as the starting point of our proposed importance sampling simulations.
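The two-stage generation just described can be sketched in a few lines. The crude Monte Carlo sketch below assumes, purely for concreteness, a homogeneous portfolio under a Clayton copula with generator \(\phi (t)=t^{-\theta }-1\), so that V is Gamma\((1/\theta ,1)\) distributed; all parameter values are illustrative:

```python
import math
import random

random.seed(1)
theta = 2.0                          # Clayton parameter (illustrative)
n, p, b, c = 500, 0.01, 0.05, 1.0    # obligors, default prob., loss level, exposure

def phi(t):                          # Clayton generator phi(t) = t^{-theta} - 1
    return t ** (-theta) - 1.0

def simulate_loss():
    # Stage 1: draw the mixing variable V ~ Gamma(1/theta, 1).
    v = random.gammavariate(1.0 / theta, 1.0)
    # Stage 2: conditionally independent Bernoulli defaults with
    # P(U_i > 1 - p | V = v) = 1 - exp(-v * phi(1 - p)).
    q = 1.0 - math.exp(-v * phi(1.0 - p))
    defaults = sum(random.random() < q for _ in range(n))
    return c * defaults

trials = 10_000
est = sum(simulate_loss() > n * b for _ in range(trials)) / trials
print("crude MC estimate of P(L_n > nb):", est)
```

The crude estimator above is exactly the baseline that the importance sampling and conditional Monte Carlo algorithms of the later sections are designed to improve upon.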
3.2 Regular variation
Regular variation is an important notion in our modeling. Intuitively, a function f is regularly varying at infinity if it behaves like a power law function near infinity. Interested readers may refer to Bingham et al. (1989) and Resnick (2013) for textbook treatments. In our model, we assume the generator function \(\phi \) of the LT-Archimedean copula is regularly varying in order to capture the upper tail dependence.
Definition 3.2
A positive Lebesgue measurable function f on \((0,\infty )\) is said to be regularly varying at \(\infty \) with index \(\alpha \in \mathbb {R}\), written as \(f\in \mathrm {RV}_{\alpha }\), if for \(x>0\),
Similarly, f is said to be regularly varying at 0 if \(f(\frac{1}{\cdot })\in \mathrm {RV}_{\alpha }\), and f is said to be regularly varying at \(a>0\) if \(f(a-\frac{1}{\cdot })\in \mathrm {RV}_{\alpha }\).
It turns out that many LT-Archimedean copulas commonly used in practice have generators that are regularly varying at 1. For example, the Gumbel copula has generator \(\phi (t)=(-\ln (t))^{\alpha }\) for \(\alpha \in [1,\infty )\), for which \(\phi ^{-1}\) is completely monotonic and \(\phi (1-\frac{1}{\cdot })\in \mathrm {RV}_{-\alpha }\).
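This can be checked numerically: for the Gumbel generator, \(\phi (1-\frac{1}{x})=(-\ln (1-\frac{1}{x}))^{\alpha }\approx x^{-\alpha }\) for large x, so the ratio \(\phi (1-\frac{1}{tx})/\phi (1-\frac{1}{x})\) approaches \(t^{-\alpha }\). A small sketch (with an illustrative \(\alpha \)):

```python
import math

alpha = 1.5                                  # illustrative Gumbel parameter

def phi(t):                                  # Gumbel generator
    return (-math.log(t)) ** alpha

t, x = 2.0, 1e6
ratio = phi(1.0 - 1.0 / (t * x)) / phi(1.0 - 1.0 / x)
print(ratio, t ** (-alpha))                  # both close to 2^{-1.5}
```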
4 Asymptotic analysis for large portfolio loss
In this section, we conduct an asymptotic analysis in a regime where the number of obligors is large and each individual obligor has an excellent credit rating (i.e. a small default probability). Our asymptotic analysis focuses on large portfolio losses, with Sect. 4.1 analyzing the tail probabilities of the losses and Sect. 4.2 tackling the expected shortfall of the losses.
4.1 Asymptotics for probabilities of large portfolio loss
By considering the portfolio loss model (2.2), this subsection analyzes the asymptotic probability \(\mathbb {P}(L_{n}>nb)\) as \(n\rightarrow \infty \), where b is an arbitrarily fixed number. We restrict our analysis to LT-Archimedean copulas for modeling the dependence among the latent variables in order to take full advantage of the Bernoulli mixture structure (as explained in Sect. 3.1). Recall that the random variable V in the representation (3.2) can be interpreted as the systematic risk or common shock factor. The dependence of obligors is mainly induced by V. Note that \(\phi ^{-1}\) is a decreasing function. When V takes on large values, all \(U_{i}\)’s are likely to be large (close to 1), which leads many obligors to default simultaneously. To incorporate strong dependence among obligors, we assume V to be heavy tailed. In particular, we assume \(\overline{F}_{V} \in \mathrm {RV}_{-1/\alpha }\), where \(\overline{F}_{V}(\cdot )\) denotes the survival function of V and \(\alpha >1\). Here \(\alpha \) represents the heavy-tailedness of V: the larger \(\alpha \) is, the heavier the tail of V and the more dependent the obligors are, which means simultaneous defaults are more likely to occur. By Karamata’s Tauberian theorem, this is equivalent to assuming that \(\phi (1-\frac{1}{\cdot })\in \mathrm {RV}_{-\alpha }\) with \(\alpha >1\), where, by the convexity of the generator \(\phi \), the condition \(\alpha >1\) necessarily holds. Such a heavy-tailed assumption on the systematic risk factor (or common shock) can be seen in Bassamboo et al. (2008), Chan and Kroese (2010) and Tang et al. (2019). This is formalized in the following assumption.
Assumption 4.1
Assume \(\mathbf {U}=(U_{1},\ldots ,U_{n})\) follows an LT-Archimedean copula with generator \(\phi \) satisfying \(\phi (1-\frac{1}{\cdot } )\in \mathrm {RV}_{-\alpha }\) with \(\alpha >1\). Let V be the random variable associated with the Laplace-Stieltjes transform \(\phi ^{-1}\). Assume that V has an eventually monotone density function.
Before presenting the main result of this section, it is useful to note that by conditioning on \(V=\dfrac{v}{\phi (1-f_{n})}\), we have
Under Assumption 4.1 that \(\phi (1-\frac{1}{\cdot })\in \mathrm {RV} _{-\alpha }\), we immediately obtain
With the condition \(V=\dfrac{v}{\phi (1-f_{n})}\), and by Kolmogorov’s strong law of large numbers, it follows that, almost surely,
Recall \(w_j\) is formally defined in Assumption 2.1. Note that r(v) is strictly increasing in v and attains its upper bound \(\bar{c}=\sum _{j\le |\mathcal {W}|}c_{j}w_{j}\) at infinity, where \(\bar{c}\) can be interpreted as the limiting average loss when all obligors default. Thus, for each \(b\in (0,\bar{c})\), we denote \(v^{*}\) as the unique solution to
Essentially, \(v^{*}\) represents the threshold value such that for \(V\in (0,v^{*}/\phi (1-f_{n}))\) the limiting average portfolio loss r(v) is less than b, while for \(V\in (v^{*}/\phi (1-f_{n}),\infty )\) the limiting average portfolio loss r(v) is greater than b.
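For a concrete sense of how \(v^{*}\) is obtained, note that the limiting conditional average loss takes the form \(r(v)=\sum _{j\le |\mathcal {W}|}c_{j}w_{j}(1-e^{-vl_{j}^{\alpha }})\) (cf. Example 4.1 below for the homogeneous case), so \(v^{*}\) can be found by bisection, since r is strictly increasing. The sketch below uses a hypothetical two-group portfolio with illustrative parameters:

```python
import math

alpha = 1.5
groups = [(1.0, 1.0, 0.6), (2.0, 0.5, 0.4)]   # (c_j, l_j, w_j), illustrative
b = 0.5                                        # loss level, must lie in (0, c_bar)

def r(v):
    # limiting conditional average loss: sum_j c_j w_j (1 - e^{-v l_j^alpha})
    return sum(c * w * (1.0 - math.exp(-v * l ** alpha)) for c, l, w in groups)

# r is strictly increasing with r(0) = 0 and limit c_bar, so bisection applies.
c_bar = sum(c * w for c, l, w in groups)
assert 0.0 < b < c_bar
lo, hi = 0.0, 1.0
while r(hi) < b:                               # bracket the root
    hi *= 2.0
for _ in range(80):                            # bisection to high precision
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if r(mid) < b else (lo, mid)
v_star = 0.5 * (lo + hi)
print("v* =", v_star, " r(v*) =", r(v_star))
```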
Now we are ready to present the main theorem of this section, which gives a sharp asymptotic for the probability of large portfolio losses. The proof is relegated to the “Appendix”.
Theorem 4.1
Consider the portfolio loss defined in (2.2). Suppose that Assumptions 2.1 and 4.1 hold and, in addition, that \(\exp (-n\beta )=o(f_{n})\) for any \(\beta >0\). Then for any fixed \(b\in (0,\bar{c})\), as \(n\rightarrow \infty \),
where \(v^{*}\) is the unique solution that solves (4.3).
Remark 4.1
We emphasize that the asymptotic behavior of the portfolio loss is mostly dictated by \(\alpha \) and \(f_{n}\). Recall that \(\alpha \) is the index of regular variation of the generator function \(\phi \). It controls the dependence among the latent variables: the larger \(\alpha \), the more likely obligors are to default simultaneously. Once \(\alpha \) is fixed, Theorem 4.1 shows that the probability of large portfolio loss diminishes to zero at the same rate as \(f_{n}\). This result is sharp and more explicit than the results in Gordy (2003). As explained in Sect. 3.1, the threshold model (2.2) under the Archimedean copula reduces to a Bernoulli mixture model, while Gordy (2003) studies a risk-factor model, which is essentially a Bernoulli mixture model. In that paper, the author shows that the capital of the fine-grained portfolio asymptotically converges to the capital of the systematic risk factor. In other words, the asymptotics are stated between the portfolio loss and the systematic risk factor (equivalent to the random variable V here). Moreover, through numerical experiments, the author shows that approximating the large portfolio loss by the systematic risk factor is quite accurate and simple. In the current paper, we go one step further: the large portfolio loss is given in a more explicit form, being linearly proportional to the individual default probability scale \(f_{n}\). Thus, to approximate the large portfolio loss, we do not rely on full information about the systematic risk factor. At the same time, the asymptotics derived in our paper should share the same advantages of accuracy and simplicity as discussed in Gordy (2003).
Now we discuss the assumption on \(f_n\) in greater detail. As mentioned earlier, due to the effect of diversification, the individual default probability diminishes in a large portfolio as n increases. On the technical side, letting \(f_{n}\) converge to 0 ensures that a large portfolio loss occurs primarily when V takes large values, whereas \(R_{i}\), \(i=1,\ldots ,n\), generally plays no role in the occurrence of the large portfolio loss. To better understand this requirement, consider the case where \(f_{n}\equiv f\) is a constant. Then calculations similar to (4.1) lead to
Note that \(p_{0}(v,i)\) is strictly increasing in v. Under the condition \(V=v\), and by Kolmogorov’s strong law of large numbers, we have, almost surely,
where the limit follows from Assumption 2.1. Clearly, \(r_{0}(v)\) is also strictly increasing in v. Define \(v_{0}^{*}\) as the unique solution to
It then follows that, for portfolio size n large enough, \(\mathbb {P} (L_{n}>nb|V=v)=0\) for \(v\le v_{0}^{*}\) and \(\mathbb {P}(L_{n}>nb|V=v)=1\) for \(v>v_{0}^{*}\). Thus, for any \(b\in (0,\bar{c})\) and large enough n, we have
This leads to a mathematically trivial result. Moreover, it is counter-intuitive in the sense that, as the size of the portfolio increases, the probability of large portfolio loss remains significant (i.e. it does not converge to 0, in contradiction with portfolio diversification). This illustration exemplifies the importance of the assumption that \(f_{n}\) diminishes to 0 as \(n\rightarrow \infty \) in order to account for the rarity of large losses. The assumption \(\exp (-n\beta )=o(f_{n})\) essentially requires that \(f_{n}\) decay to 0 more slowly than any exponential function. By choosing different \(f_{n}\), portfolios will have different credit rating classes. For example, if \(f_{n}\) decays at a faster rate such as 1/n, then the portfolio has higher quality obligors, whereas if \(f_{n}\) decays at a slower rate such as \(1/\ln n\), then the portfolio consists of more risky obligors. There are also many similar discussions in the literature on the requirement that the individual default probability diminishes in a large portfolio (equivalent to our \(f_{n}\rightarrow 0\)), all rooted in the diversification effect of a large portfolio. For example, in Gordy (2003), assumption (A-2) guarantees that the share of the largest single exposure in total portfolio exposure vanishes as the number of exposures in the portfolio increases. Tang et al. (2019) explain this more explicitly in their Sect. 2 as follows. As the size of the portfolio increases, each latent variable \(X_{i}\) should be modified to \(\frac{X_{i}}{\iota _{i}g_{n}}\), where \(g_{n}\) is a positive function diverging to \(\infty \) to reflect an overall improvement in credit quality, and \(\iota _{i}\) is a positive random variable reflecting a minor variation in the portfolio effect on obligor i.
With the endogenously determined default threshold of obligor i fixed at \(a_{i}>0\), an individual default occurs as \(\frac{X_{i}}{\iota _{i}g_{n}}>a_{i}\), i.e. \(X_{i}>\iota _{i}a_{i}g_{n}\), which is equivalent to \(U_{i}>1-l_{i}f_{n}\) with \(f_{n}\) decreasing to 0 in our context.
Next we use an example involving a fully homogeneous portfolio to further illustrate our results.
Example 4.1
Assume a fully homogeneous portfolio, that is, \(l_{i}\equiv l\) and \(c_{i}\equiv c\). Under this assumption, (4.2) simplifies to
Thus, \(v^{*}=l^{-\alpha }\ln \dfrac{c}{c-b}\) is the unique solution to \(r(v)=b\). It immediately follows from relation (4.4) that, for \(b\in (0,c)\), we have
Direct calculation further shows that the right-hand side of (4.5) is an increasing function of \(\alpha \) if \(\ln \frac{c}{c-b}\ge \exp (-\gamma )\), i.e., \(b/c\ge 1-e^{-e^{-\gamma }}\), where \(\gamma \) denotes Euler’s constant. This monotonicity can be interpreted in an intuitive way. Recall that \(\alpha \) is the index of regular variation of the generator function \(\phi \). A larger \(\alpha \) corresponds to stronger upper tail dependence, and therefore joint defaults of obligors are more likely to occur. However, the monotonicity fails if b is not large. In this case, the mean portfolio loss (\(L_{n}/n\)) is compared to a lower level b, and such an event may occur due to a single obligor’s default. Thus, both the upper tail dependence and the level of the mean portfolio loss affect the probability of large portfolio loss.
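A quick numerical check (with illustrative parameter values) confirms that the closed form \(v^{*}=l^{-\alpha }\ln \frac{c}{c-b}\) indeed solves \(r(v)=c(1-e^{-vl^{\alpha }})=b\) in the homogeneous case:

```python
import math

alpha, l, c, b = 1.5, 0.8, 1.0, 0.4            # illustrative parameters

# Closed-form root of r(v) = b from Example 4.1
v_star = l ** (-alpha) * math.log(c / (c - b))

# Evaluate the homogeneous limiting average loss at v_star
r_at_v_star = c * (1.0 - math.exp(-v_star * l ** alpha))
print(v_star, r_at_v_star)                      # r(v*) recovers b
```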
4.2 Asymptotics for expected shortfall of large portfolio loss
The asymptotic expansions for the tail probabilities of the large portfolio loss provide the foundation for the analysis of the expected shortfall. To see this, note that the expected shortfall can be rewritten as
Theorem 4.1 becomes the key to establishing an asymptotic for the expected shortfall, as formally stated in the following theorem.
Theorem 4.2
Under the same assumptions as in Theorem 4.1, the following relation
holds for any fixed \(b\in (0,\bar{c})\), where
The above theorem states that the expected shortfall grows almost linearly with the size of the portfolio n.
5 Importance sampling (IS) simulations for large portfolio loss
The asymptotic results established in the last section (Theorems 4.1 and 4.2) characterize the behavior of large portfolio losses. These results, however, may not be applicable in practice unless the portfolio size is large. In practice, the tail probability and the expected shortfall of the portfolio loss are typically estimated via MC simulation methods due to their intractability. Naive application of the MC method to this type of rare-event problem, on the other hand, is notoriously inefficient. For this reason, variance reduction methods are often used to enhance the underlying MC methods. We follow this line of inquiry and propose two variance reduction algorithms. In particular, an IS algorithm based on hazard rate twisting is presented in this section, while a second algorithm based on conditional Monte Carlo simulation is discussed in the next section. The asymptotic analysis in Sect. 4 plays an important role in proving the efficiency of both algorithms.
5.1 Preliminary of importance sampling
We are interested in estimating \(\mathbb {P}\left( L_{n}>nb\right) \), where \(L_{n}\) can be viewed as a linear combination of conditionally independent Bernoulli random variables \(\{1_{\{U_{i}>1-l_{i}f_{n}\}},i=1,\ldots ,n\}\). For each Bernoulli variable, the associated probability is denoted by \(p_{j}\) for \(j\le |\mathcal {W}|\), which is a function of the generated variable V. Following the analysis in Sect. 4, \(p_{j}\) is explicitly given by p(v, j) as shown in (4.1). The simulation of \(\mathbb {P}\left( L_{n}>nb\right) \) is then conducted in two steps: in step 1, the common factor V is simulated from its density function \(f_{V}(\cdot )\), and in step 2, the corresponding Bernoulli random variables are generated. When the portfolio size is very large, the event \(\{L_{n}>nb\}\) occurs only when V takes large values, which in turn makes the default probability \(p_{j}\) of each Bernoulli variable small. Thus, both steps in the simulation of \(\mathbb {P}\left( L_{n}>nb\right) \) are rare-event simulations. Estimation by naive MC simulation becomes impractical due to the large number of samples needed, and therefore one has to resort to variance reduction techniques. IS is a widely used variance reduction technique that places greater probability mass on the rare event of interest and then appropriately normalizes the resulting output. Next, we briefly discuss how we apply IS in the two-step simulation.
In the first step, the tail behavior of the large portfolio loss depends heavily on the tail distribution of V; that is, the occurrence of the large loss event corresponds to V taking a large value. A good importance sampling distribution for the random variable V should therefore be more heavy-tailed than its original distribution, so that a larger probability is assigned to the event that the average portfolio loss conditional on V exceeds the level b. Such an importance sampling distribution can be obtained via hazard rate twisting of V. Let \(\tilde{f}_{V}(\cdot )\) denote the importance sampling density function of V after the application of IS. In the second step, we improve the efficiency of calculating the conditional probabilities by replacing each Bernoulli success probability \(p_{j}\) with some probability \(\tilde{p}_{j}\), for \(j\le |\mathcal {W}|\). Here, exponential twisting is a well-established approach for Bernoulli random variables; see, e.g., Glasserman and Li (2005). Let \(\tilde{\mathbb {P}}\) denote the corresponding IS probability measure and \(\tilde{\mathbb {E}}\) the expectation under the measure \(\tilde{\mathbb {P}}\). Then the following identity holds:
where \(\tilde{L}=\dfrac{d\mathbb {P}}{d\tilde{\mathbb {P}}}\) is the Radon-Nikodym derivative of \(\mathbb {P}\) with respect to \(\tilde{\mathbb {P}}\) and equals
In the above expression, \(Y_{j}=1_{\{U_{j}>1-l_{j}f_{n}\}}\) and \(n_{j}Y_{j}\) denotes the number of defaults in sub-portfolio j. We refer to \(\tilde{L}\) as the unbiasing likelihood ratio. The key observation from (5.1) is that calculating the tail probability \(\mathbb {P}\left( L_{n}>nb\right) \) is equivalent to evaluating either the expectation \(\mathbb {E}\left[ 1_{\{L_{n}>nb\}}\right] \) or \(\tilde{\mathbb {E}}\left[ 1_{\{L_{n}>nb\}}\tilde{L}\right] \). We refer to the estimator based on the latter expectation as the IS estimator; its efficiency crucially depends on the choice of the IS density function \(\tilde{f}_{V}(\cdot )\).
We now discuss two criteria that characterize the performance of the proposed IS estimator. The strongest form of good performance commonly achievable in realistic situations is bounded relative error (see Asmussen & Kroese, 2006; McLeish, 2010). We say that a sequence of estimators \((1_{\{L_{n} >nb\}}\tilde{L}:n\ge 1)\) under probability measure \(\tilde{\mathbb {P}}\) has bounded relative error if
A slightly weaker criterion, called asymptotic optimality, is also widely used (see Glasserman & Li, 2005; Glasserman et al., 2007, 2008); it holds if the following condition is satisfied,
This condition is equivalent to saying that \(\lim \limits _{n\rightarrow \infty }\tilde{\mathbb {E}}\left[ 1_{\{L_{n}>nb\}}\tilde{L}^{2}\right] /\mathbb {P}\left( L_{n}>nb\right) ^{2-\varepsilon }=0\) for every \(\varepsilon >0\). It is easy to check that bounded relative error implies asymptotic optimality.
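For reference, the two criteria can be written out in the notation above; these are the standard formulations (cf. Asmussen & Kroese, 2006), which should agree with the paper's displayed conditions:

```latex
% Bounded relative error:
\limsup_{n\to\infty}\,
\frac{\tilde{\mathbb{E}}\bigl[\,1_{\{L_n>nb\}}\tilde{L}^2\,\bigr]}
     {\mathbb{P}(L_n>nb)^2} \;<\; \infty .
% Asymptotic optimality (logarithmic efficiency):
\lim_{n\to\infty}
\frac{\log \tilde{\mathbb{E}}\bigl[\,1_{\{L_n>nb\}}\tilde{L}^2\,\bigr]}
     {\log \mathbb{P}(L_n>nb)} \;=\; 2 .
```

Since the second moment is always at least the squared first moment, the ratio in the second display is always at most 2; optimality means the bound is attained in the limit.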
5.2 Two-step importance sampling for tail probabilities
5.2.1 First step: twisting V
As a first step in developing our IS algorithm for LT-Archimedean copulas, we apply IS to the distribution of the random variable V. In Assumption 4.1, we assume the generator \(\phi (1-\frac{1}{\cdot })\in \mathrm {RV}_{-\alpha }\), where \(\phi ^{-1}\) is the LS transform of the random variable V. Then, by Karamata's Tauberian theorem (see Feller, 1971, pp. 442–446), V is heavy-tailed with tail index \(1/\alpha \). As noted in Asmussen et al. (2000), the traditional exponential twisting approach does not work directly for heavy-tailed distributions, since the cumulant generating function in (5.7) is infinite for any positive twisting parameter. An alternative method must therefore be used. In this subsection we describe an IS algorithm that assigns a larger probability to the event \(\left\{ V>\frac{v^{*}}{\phi (1-f_{n})}\right\} \) by hazard rate twisting the original distribution of V; see Juneja and Shahabuddin (2002) for an introduction to hazard rate twisting. We prove that this leads to an estimator that is asymptotically optimal.
Let us define the hazard rate function associated to the random variable V as
By changing \(\mathcal {H}(x)\) to \((1-\theta )\mathcal {H}(x)\) for some \(0<\theta <1\), the tail distribution changes to
and the density function becomes
Note that we have imposed an additional subscript \(\theta \) on both \(\overline{F}_{V,\theta }(x)\) and \(f_{V,\theta }(x)\) to emphasize that these are the functions corresponding to the twisted hazard function \((1-\theta )\mathcal {H}(x)\). The prescribed transformation is similar to exponential twisting, except that the twisting rate is \(\theta \mathcal {H}(x)\) rather than \(\theta x\). By (5.2), one can also see that the tail of the random variable V becomes heavier after twisting.
The key, then, is finding the best parameter \(\theta \). By (5.3), the corresponding likelihood ratio \(f_{V}(x)/f_{V,\theta }(x)\) equals \(\frac{1}{1-\theta }\exp (-\theta \mathcal {H}(x))\), which is bounded above by
on the set \(\left\{ V>\frac{v^{*}}{\phi (1-f_{n})}\right\} \). It is standard practice in IS to choose \(\tilde{\theta }\) by minimizing the upper bound on the likelihood ratio, since this also minimizes the upper bound on the second moment of the estimator \(1_{\{L_{n}>nb\}}\frac{f_{V}(V)}{f_{V,\theta }^{*}(V)}\). Taking the derivative of the upper bound (5.4) with respect to \(\theta \), we obtain
Then, the tail distribution in (5.2) corresponding to hazard rate twisting by \(\tilde{\theta }\) equals
An explicit form of (5.5) is usually difficult to derive, because the tail distribution of the random variable V is only specified semiparametrically. Alternatively, we can replace the hazard function \(\mathcal {H}(x)\) by \(\tilde{\mathcal {H}}(x)\), where \(\mathcal {H}(x)\sim \tilde{\mathcal {H}}(x)\) and \(\tilde{\mathcal {H}}(x)\) is available in closed form. Juneja et al. (2007) prove that estimators derived by such an “asymptotic” hazard rate twisting method can achieve asymptotic optimality.
By Proposition B.1.9(1) of de Haan and Ferreira (2007), \(\overline{F}_{V} \in \mathrm {RV}_{-1/\alpha }\) implies \(\mathcal {H}(x)\sim \frac{1}{\alpha } \log (x)\) as \(x\rightarrow \infty \). This, along with (5.5), suggests that the tail distribution \(\overline{F}_{V,\tilde{\theta }}\) should be close to
For sufficiently large n, we can further ignore the term \(\log (v^{*})\) to achieve additional simplification. Hence, the corresponding density function can be taken as
which is a Pareto distribution with shape parameter \(-1/\log \phi (1-f_{n})\). Now we define
where \(x_{0}\) is chosen so that the ratio \(f_{V}(x)/f_{V}^{*}(x)\) remains bounded above by a constant for all x. Thus, the tail of the random variable V becomes heavier after twisting, while the probability of small values remains unchanged.
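Concretely, for the Gumbel generator \(\phi (t)=(-\ln t)^{\alpha }\), the tail part of \(f_{V}^{*}\) is Pareto with shape \(\gamma _{n}=-1/\log \phi (1-f_{n})\) and can be drawn by inverse transform. The sketch below samples that tail part only; how the mass below \(x_{0}\) is blended in follows the paper's definition of \(f_{V}^{*}\) and is omitted here, and the names are illustrative.

```python
import numpy as np

def sample_twisted_tail(alpha, f_n, x0, size, rng):
    """Draw from the Pareto tail part of the hazard-rate-twisted density
    f_V^*: tail P(X > x) = (x / x0)**(-gamma_n) for x > x0, with shape
    gamma_n = -1 / log(phi(1 - f_n)) and Gumbel generator
    phi(t) = (-ln t)**alpha."""
    phi_val = (-np.log1p(-f_n)) ** alpha       # phi(1 - f_n), small for large n
    gamma_n = -1.0 / np.log(phi_val)           # Pareto shape; positive, -> 0
    u = rng.uniform(size=size)
    return x0 * (1.0 - u) ** (-1.0 / gamma_n), gamma_n
```

Since \(\gamma _{n}\rightarrow 0\) as n grows, the twisted factor is drawn from an increasingly heavy tail, which is exactly the intent of the twist.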
Remark 5.1
The role of \(x_{0}\) is crucial for showing the asymptotic optimality of the algorithm, as seen later in the proof of Lemma 5.1. In theory, its value relies on the explicit expression of the density function \(f_{V}(x)\). In practice, our numerical results are not sensitive to \(x_{0}\), and hence, for ease of implementation, one may fix \(x_{0}\) at an arbitrary constant.
5.2.2 Second step: twisting the Bernoulli random variables
We now apply exponential twisting to the Bernoulli random variables \(\{1_{\{U_{i}>1-l_{i}f_{n}\}},i=1,\ldots ,n\}\) conditional on the common factor V. A measure \(\tilde{\mathbb {P}}\) is said to be an exponentially twisted measure of \(\mathbb {P}\) with parameter \(\theta \), for some random variable X, if
where \(\Lambda _{X}(\theta )=\log \mathbb {E}[\exp (\theta X)]\) is the cumulant generating function. If the random variable X has density function \(f_{X}(x)\), then the exponentially twisted density has the form \(\exp (\theta x-\Lambda _{X}(\theta ))f_{X}(x)\).
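For a Bernoulli(p) random variable X, this definition specializes to the standard formulas (cf. Glasserman & Li, 2005):

```latex
\Lambda_X(\theta) = \log\bigl(1 - p + p\,e^{\theta}\bigr),
\qquad
\tilde{\mathbb{P}}(X = 1)
  = e^{\theta - \Lambda_X(\theta)}\,p
  = \frac{p\,e^{\theta}}{1 - p + p\,e^{\theta}} ,
```

which is the form of the twisted success probabilities used next.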
Now consider the Bernoulli success probability \(p_{j}\), which is p(v, j) as defined in (4.1), conditional on \(V=\dfrac{v}{\phi (1-f_{n})}\). To increase the conditional default probabilities, following the idea in Glasserman and Li (2005), we apply an exponential twist by choosing a parameter \(\theta \) and taking
where \(p_{j}^{\theta }\) denotes the \(\theta \)-twisted probability conditional on \(V=\dfrac{v}{\phi (1-f_{n})}\). Note that \(p_{j}^{\theta }\) is a strictly increasing function in \(\theta \) if \(\theta >0\). With this new choice of conditional default probabilities \(\left\{ p_{j}^{\theta }:j\le |\mathcal {W}|\right\} \), straightforward calculation shows that the likelihood ratio conditioning on V simplifies to
where
is the cumulant generating function of \(L_{n}\) conditional on V. For any \(\theta \), the estimator
is unbiased for \(\mathbb {P}\left( L_{n}>nb\left| V=\frac{v}{\phi (1-f_{n})}\right. \right) \) if probabilities \(\left\{ p_{j}^{\theta } :j\le |\mathcal {W}|\right\} \) are used to generate \(L_{n}\). Equation (5.8) formally establishes that applying an exponential twist on the probabilities is equivalent to applying an exponential twist to \(L_{n}|V\) itself.
It remains to choose the parameter \(\theta \). A standard practice in IS is to select a parameter \(\theta \) that minimizes the upper bound of the second moment of the estimator to reduce the variance. As we can see,
where \(\mathbb {E}_{\theta }\) denotes expectation using the \(\theta \)-twisted probabilities. The problem is then identical to finding a parameter \(\theta \) that maximizes \(nb\theta -\Lambda _{L_{n}|V}(\theta )\). Straightforward calculation shows that
By the strictly increasing property of \(\Lambda _{L_{n}|V}^{\prime }(\theta )\), the maximum is attained at
By (5.9), the two cases in (5.10) are distinguished by the value of \(\mathbb {E}\left[ L_{n}\left| V=\frac{v}{\phi (1-f_{n})}\right. \right] =\sum _{j\le |\mathcal {W}|}n_{j}c_{j}p_{j} \). For the former case, our choice of twisting parameter \(\theta ^{*}\) shifts the distribution of \(L_{n}\) so that the average portfolio loss is b; while for the latter case, the event \(\{L_{n}>nb\}\) is not rare, so we use the original probabilities.
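The choice of \(\theta ^{*}\) in (5.10) amounts to a one-dimensional root-finding problem: solve \(\Lambda _{L_{n}|V}^{\prime }(\theta )=nb\) when the conditional mean loss is below nb, and take \(\theta ^{*}=0\) otherwise. A sketch following the Glasserman-Li recipe, with each Bernoulli twisted by \(e^{\theta c_{j}}\) (assuming \(b<\bar{c}\) so a root exists; the names are illustrative):

```python
import numpy as np
from scipy.optimize import brentq

def twisted_probs(p, c, n_j, nb):
    """Exponential twisting of the conditional default probabilities:
    choose theta* so the twisted conditional mean loss equals nb, or
    theta* = 0 if the mean loss already exceeds nb (event not rare)."""
    p, c, n_j = map(np.asarray, (p, c, n_j))

    def p_theta(theta):
        e = p * np.exp(theta * c)          # twist Bernoulli j by exp(theta*c_j)
        return e / (1.0 - p + e)

    def mean_loss(theta):
        return np.sum(n_j * c * p_theta(theta))

    if mean_loss(0.0) >= nb:               # not rare: keep original probabilities
        return p, 0.0
    hi = 1.0                               # mean_loss is strictly increasing,
    while mean_loss(hi) < nb:              # so bracket the root and solve
        hi *= 2.0
    theta_star = brentq(lambda t: mean_loss(t) - nb, 0.0, hi)
    return p_theta(theta_star), theta_star
```

For example, with \(p_{j}=0.01\), \(c_{j}=1\) and \(n_{j}=100\), twisting the mean loss up to 50 yields \(\theta ^{*}=\log 99\approx 4.6\).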
5.2.3 Algorithm
Now we are ready to present the algorithm, which consists of three stages. First, a sample of V is generated using hazard rate twisting. Second, depending on the value of V, samples of the Bernoulli variables \(1_{\{U_{i}>1-l_{i}f_{n}\}}\) are generated, using either naive simulation (original probabilities) or importance sampling; the details of how to adjust the conditional default probabilities were discussed in the previous subsections. Finally, we compute the portfolio loss \(L_{n}\) and return the estimator after incorporating the likelihood ratio.
The following algorithm is for each replication.
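In code, one replication might be organized as follows. Since the density \(f_{V}\) is model-specific (and, for stable V, has no closed form), all model ingredients are passed in as functions; every name here is illustrative rather than the paper's notation.

```python
import numpy as np

def is_replication(f_V, sample_twisted_V, f_V_star, p_of_v, twist,
                   c, n_j, nb, rng):
    """One replication of the two-step IS estimator of P(L_n > nb).
    f_V / f_V_star: original and hazard-rate-twisted densities of V;
    sample_twisted_V: draws V from f_V_star; p_of_v: conditional default
    probabilities p_j at V = v; twist: the exponential twist of
    Sect. 5.2.2, returning (p_theta, theta). All are assumed given."""
    c, n_j = np.asarray(c), np.asarray(n_j)

    # Stage 1: draw the common factor from the twisted density
    v = sample_twisted_V(rng)
    lr = f_V(v) / f_V_star(v)                 # first likelihood-ratio factor

    # Stage 2: twist and sample the conditional Bernoulli variables
    p = np.asarray(p_of_v(v))
    p_tw, theta = twist(p, c, n_j, nb)
    defaults = rng.binomial(n_j, p_tw)        # defaults per sub-portfolio
    L_n = np.sum(c * defaults)

    # Stage 3: unbias with exp(-theta*L_n + Lambda_{L_n|V}(theta))
    Lam = np.sum(n_j * np.log(1.0 - p + p * np.exp(theta * c)))
    lr *= np.exp(-theta * L_n + Lam)
    return float(L_n > nb) * lr
```

Averaging this quantity over independent replications gives the IS estimate of the tail probability.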
Let \(\mathbb {P}^{*}\) and \(\mathbb {E}^{*}\) denote the IS probability measure and expectation corresponding to this algorithm. The likelihood ratio is given by
The following lemma is important in demonstrating the efficiency of our IS Algorithm.
Lemma 5.1
Under the same assumptions as in Theorem 4.1, we have
In view of Theorem 4.1, which provides the asymptotic estimate of the tail probability \(\mathbb {P}\left( L_{n}>nb\right) \), we conclude in the following theorem that our proposed algorithm is asymptotically optimal.
Theorem 5.1
Under the same assumptions as in Theorem 4.1, we have
Thus, the IS estimator (5.11) achieves asymptotic zero variance on the logarithmic scale.
5.3 Importance sampling for expected shortfall
In risk management, one is usually interested in estimating the expected shortfall at a confidence level close to 1, which is again a rare event simulation. In this subsection, we discuss how to apply our proposed IS algorithm to estimate the expected shortfall.
First, note that the expected shortfall can be understood as follows,
By introducing the unbiasing likelihood ratio \(L^{*}\), (5.12) is equivalent to
where \(\mathbb {E}^{*}\) is the expectation corresponding to the IS algorithm in Sect. 5.2.3. Suppose m i.i.d. samples \((L_{n}^{1},\ldots ,L_{n}^{m})\) are generated under measure \(\mathbb {P}^{*}\). Let \(L_{i} ^{*}\) denote the corresponding likelihood ratio for each sample i. Then the IS estimator of the expected shortfall is given as
Note that the samples generated to estimate the numerator in (5.13) take positive value only when large losses occur. Therefore, one can expect the IS algorithm that works for estimating the probability of the event \(\{L_{n}>nb\}\) should also work well in estimating \(\mathbb {E}[L_{n}-nb]_{+}\). This is later confirmed by our numerical results.
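Given IS samples \(L_{n}^{i}\) with likelihood ratios \(L_{i}^{*}\), a ratio-type estimator of the expected shortfall along the lines of (5.13) might look as follows. It uses the decomposition \(\mathbb {E}[L_{n}\,|\,L_{n}>nb]=nb+\mathbb {E}[L_{n}-nb]_{+}/\mathbb {P}(L_{n}>nb)\); the paper's exact display may differ, and the helper name is ours.

```python
import numpy as np

def es_estimator(L, lr, nb):
    """Ratio-type IS estimator of E[L_n | L_n > nb] from IS samples L
    with likelihood ratios lr: both the excess expectation and the tail
    probability are estimated under the same IS measure."""
    L, lr = np.asarray(L), np.asarray(lr)
    excess = np.mean(np.maximum(L - nb, 0.0) * lr)   # IS estimate of E[L_n - nb]_+
    tail = np.mean((L > nb) * lr)                    # IS estimate of P(L_n > nb)
    return nb + excess / tail
```

Note that only the samples with \(L_{n}^{i}>nb\) contribute to either term, which is why the IS measure designed for the tail probability serves here as well.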
6 Conditional Monte Carlo simulations for large portfolio loss
In this section, we propose another estimation method based on the conditional Monte Carlo approach, which is another variance reduction technique; see, e.g., Asmussen and Kroese (2006) and Asmussen (2018). Our proposed algorithm is motivated by Chan and Kroese (2010), in which the authors derived simple simulation algorithms to estimate the probability of large portfolio losses under the t-copula.
By the stochastic representation (3.2) for LT-Archimedean copulas and the asymptotic expansions in Theorem 4.1, the rare event \(\{L_{n}>nb\}\) occurs primarily when the random variable V takes a large value, while \(\mathbf {R}=(R_{1},\ldots ,R_{n})\) generally has little influence on its occurrence. This suggests that integrating out V analytically could lead to a substantial variance reduction.
To proceed, it is useful to define
An individual obligor defaults if \(U_{i}>1-l_{i}f_{n}\), which is equivalent to \(V>O_{i}\). Thus, the portfolio loss in (2.2) can be rewritten as
We rank \(O_{1},\ldots ,O_{n}\) as \(O_{(1)}\le O_{(2)}\le \cdots \le O_{(n)}\), and let \(c_{(i)}\) denote the exposure at default associated with \(O_{(i)}\). One can then check that the event \(\{L_{n}>nb\}\) occurs if and only if \(V>O_{(k)}\), where \(k=\min \{l:\sum _{i=1}^{l}c_{(i)}>nb\}\). In particular, if \(c_{i}\equiv c\) for all \(i=1,\ldots ,n\), then \(k=\lceil nb/c\rceil \). Now, conditional on \(\mathbf {R}\), we have
We summarize our proposed conditional Monte Carlo algorithm, which is labelled as CondMC, in the following algorithm.
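One CondMC replication can be sketched as follows, assuming the Marshall-Olkin representation (so that \(O_{i}=E_{i}/\phi (1-l_{i}f_{n})\) with \(E_{i}\sim \mathrm {Exp}(1)\)) and assuming the survival function \(\overline{F}_{V}\) of V is available; both assumptions and all names are for illustration only.

```python
import numpy as np

def condmc_replication(survival_V, phi_vals, c, nb, rng):
    """One CondMC replication: draw the idiosyncratic exponentials, form
    the thresholds O_i (obligor i defaults iff V > O_i), and integrate V
    out analytically; the conditional estimator is F_bar_V(O_(k)).
    phi_vals[i] holds phi(1 - l_i f_n)."""
    e = rng.exponential(1.0, size=len(phi_vals))   # E_i = -ln R_i ~ Exp(1)
    O = e / np.asarray(phi_vals)                   # O_i = E_i / phi(1 - l_i f_n)
    order = np.argsort(O)                          # sort thresholds ascending
    cum = np.cumsum(np.asarray(c)[order])          # cumulative sorted exposures
    k = np.searchsorted(cum, nb, side="right")     # smallest k with cum > nb
    if k >= len(O):
        return 0.0                                 # loss can never exceed nb
    return float(survival_V(O[order][k]))          # P(L_n > nb | R)
```

The replication returns a probability rather than an indicator, which is the source of the variance reduction: the only remaining noise comes from \(\mathbf {R}\).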
We now show that the conditional Monte Carlo estimator has bounded relative error, a stronger notion of asymptotic optimality than that for the IS estimator (5.11) established in Theorem 5.1.
Lemma 6.1
Under the same assumptions as in Theorem 4.1 except that \(\frac{1}{n}=O(f_{n})\), we have
In view of Theorem 4.1, we immediately obtain the following theorem concerning the algorithm efficiency.
Theorem 6.1
Under the same assumptions as in Lemma 6.1, we have
In other words, the conditional Monte Carlo estimator (6.2) has bounded relative error.
7 Numerical results
In this section, we assess the relative performance of our proposed algorithms via simulations, and investigate their sensitivity to \(\alpha \) (the heavy-tailedness of the systematic risk factor V), n (the size of the portfolio) and b (a pre-fixed number that controls the proportion of obligors who default). The numerical results indicate that our proposed algorithms, especially the CondMC algorithm, provide considerable variance reductions when compared to naive MC simulations. This supports our theoretical result that both proposed algorithms are asymptotically optimal.
In line with the assumption that \(\phi (1-\frac{1}{\cdot })\in \mathrm {RV}_{-\alpha }\) with \(\alpha >1\), we consider the Gumbel copula in our numerical experiments. The generator of the Gumbel copula is \(\phi (t)=(-\ln (t))^{\alpha }\) with \(\alpha >1\). By varying \(\alpha \), the Gumbel copula ranges from independence (\(\alpha \rightarrow 1\)) to comonotonicity (\(\alpha \rightarrow \infty \)).
In all the experiments below, only homogeneous portfolios are considered. We emphasize, however, that the performance of our algorithms is not affected for inhomogeneous portfolios: Theorems 5.1 and 6.1 are proved in a general setting that covers both homogeneous and inhomogeneous portfolios. To evaluate the accuracy of the estimators, for each set of specified parameters we generate 50,000 samples for our proposed algorithms, estimate the probability of large portfolio loss, and report the relative error (in \(\%\)), defined as the ratio of the estimator's standard deviation to the estimator. More precisely, if \(\hat{p}\) is an unbiased estimator of \(\mathbb {P}\left( L_{n}>nb\right) \), its relative error is defined as \(\sqrt{\mathrm {Var} (\hat{p})}/\hat{p}\). We also report the variance reduction achieved by our proposed algorithms compared with naive simulation. Under naive simulation, it is quite possible that the rare event would not be observed in any of the 50,000 sample paths. Therefore, the variance under naive simulation is estimated indirectly by exploiting the fact that the variance of a Bernoulli(p) random variable equals \(p(1-p)\).
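The two reported performance measures can be computed from the simulated estimator values as follows (a direct transcription of the definitions above; the helper name is ours):

```python
import numpy as np

def report(samples):
    """Relative error and variance-reduction factor versus naive MC.
    The naive-MC variance of the mean of m Bernoulli(p) indicators is
    taken as p*(1-p)/m, as described in the text."""
    samples = np.asarray(samples)
    m = len(samples)
    p_hat = samples.mean()
    var_hat = samples.var(ddof=1) / m              # variance of the estimator
    rel_err = np.sqrt(var_hat) / p_hat * 100.0     # relative error in %
    var_naive = p_hat * (1.0 - p_hat) / m
    return p_hat, rel_err, var_naive / var_hat     # variance-reduction factor
```

The same routine applies to the CondMC output, whose samples are conditional probabilities rather than indicators.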
Table 1 provides a first comparison of our IS algorithm and CondMC algorithm with naive simulation as \(\alpha \) changes. The chosen model parameter values are \(n=500\), \(f_{n}=1/n\), \(b=0.8\), \(l_{i}=0.5\) and \(c_{i}=1\) for each i. As can be seen from Table 1, both algorithms outperform naive simulation, especially when \(\alpha \) is small, in which case obligors exhibit weaker dependence and the probability of large portfolio losses is smaller. Relative to the naive MC method, the variance reduction attained by the IS estimator is in the order of hundreds to thousands, while that of the CondMC estimator is in the order of millions. This demonstrates that the CondMC estimator significantly outperforms the IS estimator.
In Table 2, we perform the same comparison by varying b while keeping \(\alpha \) fixed at 1.5. Under the setting that \(c=1\), the parameter b controls the level of the proportion of obligors that default. As is clear from the table, when b increases, the estimated probability decreases and the variance reduction becomes larger.
Table 3 provides the relative error and variance reduction of our algorithms compared with naive simulation as the number of obligors changes. All other parameters are identical to the previous experiments, with \(\alpha =1.5\) and \(b=0.8\). In the last column, we also report the sharp asymptotic for the probability of large portfolio loss based on the expression in (4.4). Note that as n increases, both the accuracy of the sharp asymptotic and the variance reduction improve.
In Table 4, we study the accuracy of the sharp asymptotic for expected shortfall as the number of obligors increases. Model parameters are taken to be \(f_{n}=1/n\), \(\alpha =1.5\), \(b=0.8\), \(l_{i}=0.5\) and \(c=1\) for each i. For estimating expected shortfall, we simply use all the 50,000 sample paths generated under the proposed IS measure, and then consider those with portfolio loss exceeding nb. As shown in Table 4, the accuracy is quite high even for small values of n. This is mainly due to the fact that the hazard rate density is chosen based on the asymptotic result in Theorem 4.1. Discrepancy here is measured as the percentage difference between the ES estimated via importance sampling and the sharp asymptotic in (4.7).
To conclude the section, we note again that, to the best of our knowledge, this is the first paper that adopts the Archimedean copula in the analysis of large credit portfolio losses and proposes the corresponding importance sampling and conditional Monte Carlo estimators. In contrast, the importance sampling estimators of Bassamboo et al. (2008) and the conditional Monte Carlo estimators of Chan and Kroese (2010) assume that the dependence structure of obligors is modeled with a t-copula. Because of this difference in the assumed dependence structure, the estimators considered in this paper are not directly comparable to the corresponding estimators in those two papers. Nevertheless, comparing our simulation results to theirs, it is reassuring that even under a very different dependence structure, significant variance reduction, especially for the estimator based on the conditional Monte Carlo method, can be expected. Furthermore, regardless of the assumed dependence structure, all estimators exhibit consistent behavior in the sense that they perform better for weaker dependence and larger portfolio sizes.
8 Conclusion
In this paper, we consider an Archimedean copula-based model for measuring portfolio credit risk. An analytic expression for the probability that such a portfolio incurs large losses is not available, and directly applying naive MC simulation to these rare events is not efficient either. We first derive sharp asymptotic expansions for the probability of large portfolio losses and the expected shortfall of the losses. Using these as a stepping stone, we develop two efficient algorithms to estimate the risk of a credit portfolio via simulation. The first is a two-step full IS algorithm, which can be used to estimate both the tail probability and the expected shortfall of the portfolio loss; we show that the proposed estimator is logarithmically efficient. The second algorithm is based on conditional Monte Carlo simulation and can be used to estimate the probability of large portfolio losses; this estimator is shown to have bounded relative error. In extensive simulation studies, both algorithms, especially the second one, show significant variance reductions compared to naive MC simulations.
References
Albrecher, H., Constantinescu, C., & Loisel, S. (2011). Explicit ruin formulas for models with dependence among risks. Insurance: Mathematics and Economics, 48(2), 265–270.
Asmussen, S. (2018). Conditional Monte Carlo for sums, with applications to insurance and finance. Annals of Actuarial Science, 12(2), 455–478.
Asmussen, S., Binswanger, K., Højgaard, B., et al. (2000). Rare events simulation for heavy-tailed distributions. Bernoulli, 6(2), 303–322.
Asmussen, S., & Kroese, D. P. (2006). Improved algorithms for rare event simulation with heavy tails. Advances in Applied Probability, 38(2), 545–558.
Basoğlu, I., Hörmann, W., & Sak, H. (2018). Efficient simulations for a Bernoulli mixture model of portfolio credit risk. Annals of Operations Research, 260, 113–128.
Bassamboo, A., Juneja, S., & Zeevi, A. (2008). Portfolio credit risk with extremal dependence: Asymptotic analysis and efficient simulation. Operations Research, 56(3), 593–606.
Berndt, B. C. (1998). Ramanujan’s notebooks part V. Springer.
Bingham, N. H., Goldie, C. M., & Teugels, J. L. (1989). Regular variation (Vol. 27). Cambridge University Press.
Chan, J. C., & Kroese, D. P. (2010). Efficient estimation of large portfolio loss probabilities in \(t\)-copula models. European Journal of Operational Research, 205(2), 361–367.
Charpentier, A., & Segers, J. (2009). Tails of multivariate Archimedean copulas. Journal of Multivariate Analysis, 100(7), 1521–1537.
Cherubini, U., Luciano, E., & Vecchiato, W. (2004). Copula methods in finance. Wiley.
Cossette, H., Marceau, E., Mtalai, I., & Veilleux, D. (2018). Dependent risk models with Archimedean copulas: A computational strategy based on common mixtures and applications. Insurance: Mathematics and Economics, 78, 53–71.
de Haan, L., & Ferreira, A. (2007). Extreme value theory: An introduction. Springer.
Denuit, M., Purcaru, O., Van Keilegom, I., et al. (2004). Bivariate Archimedean copula modelling for loss-alae data in non-life insurance. IS Discussion Papers, 423.
Embrechts, P., Lindskog, F., & McNeil, A. (2001). Modelling dependence with copulas (p. 14). Département de mathématiques, Institut Fédéral de Technologie de Zurich, Zurich: Rapport technique.
Feller, W. (1971). An introduction to probability theory and its applications (Vol. 2). Wiley.
Frees, E. W., & Valdez, E. A. (1998). Understanding relationships using copulas. North American Actuarial Journal, 2(1), 1–25.
Genest, C., & Favre, A.-C. (2007). Everything you always wanted to know about copula modeling but were afraid to ask. Journal of Hydrologic Engineering, 12(4), 347–368.
Glasserman, P. (2004). Tail approximations for portfolio credit risk. The Journal of Derivatives, 12(2), 24–42.
Glasserman, P., Kang, W., & Shahabuddin, P. (2007). Large deviations in multifactor portfolio credit risk. Mathematical Finance, 17(3), 345–379.
Glasserman, P., Kang, W., & Shahabuddin, P. (2008). Fast simulation of multifactor portfolio credit risk. Operations Research, 56(5), 1200–1217.
Glasserman, P., & Li, J. (2005). Importance sampling for portfolio credit risk. Management Science, 51(11), 1643–1656.
Gordy, M. B. (2003). A risk-factor model foundation for ratings-based bank capital rules. Journal of Financial Intermediation, 12(3), 199–232.
Gupton, G. M., Finger, C. C., & Bhatia, M. (1997). Creditmetrics: Technical document. JP Morgan & Co.
Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301), 13–30.
Hofert, M. (2008). Sampling Archimedean copulas. Computational Statistics & Data Analysis, 52(12), 5163–5174.
Hofert, M. (2010). Sampling nested Archimedean copulas with applications to CDO pricing. PhD thesis, Universität Ulm.
Hofert, M., Mächler, M., & McNeil, A. J. (2013). Archimedean copulas in high dimensions: Estimators and numerical challenges motivated by financial applications. Journal de la Société Française de Statistique, 154(1), 25–63.
Hofert, M., & Scherer, M. (2011). CDO pricing with nested Archimedean copulas. Quantitative Finance, 11(5), 775–787.
Hong, L. J., Juneja, S., & Luo, J. (2014). Estimating sensitivities of portfolio credit risk using Monte Carlo. INFORMS Journal on Computing, 26(4), 848–865.
Juneja, S., Karandikar, R., & Shahabuddin, P. (2007). Asymptotics and fast simulation for tail probabilities of maximum of sums of few random variables. ACM Transactions on Modeling and Computer Simulation (TOMACS), 17(2), 7.
Juneja, S., & Shahabuddin, P. (2002). Simulating heavy tailed processes using delayed hazard rate twisting. ACM Transactions on Modeling and Computer Simulation (TOMACS), 12(2), 94–118.
Kealhofer, S. & Bohn, J. (2001). Portfolio management of credit risk. Technical Report.
Marshall, A. W., & Olkin, I. (1988). Families of multivariate distributions. Journal of the American Statistical Association, 83(403), 834–841.
McLeish, D. L. (2010). Bounded relative error importance sampling and rare event simulation. ASTIN Bulletin: The Journal of the IAA, 40(1), 377–398.
McNeil, A. J., Frey, R., & Embrechts, P. (2015). Quantitative risk management: Concepts, techniques and tools. Princeton University Press.
Merton, R. C. (1974). On the pricing of corporate debt: The risk structure of interest rates. The Journal of Finance, 29(2), 449–470.
Naifar, N. (2011). Modelling dependence structure with Archimedean copulas and applications to the iTraxx CDS index. Journal of Computational and Applied Mathematics, 235(8), 2459–2466.
Okhrin, O., Okhrin, Y., & Schmid, W. (2013). On the structure and estimation of hierarchical Archimedean copulas. Journal of Econometrics, 173(2), 189–204.
Rényi, A. (1953). On the theory of order statistics. Acta Mathematica Hungarica, 4(3–4), 191–231.
Resnick, S. I. (2013). Extreme values, regular variation and point processes. Springer.
Tang, Q., Tang, Z., & Yang, Y. (2019). Sharp asymptotics for large portfolio losses under extreme risks. European Journal of Operational Research, 276(2), 710–722.
Tong, E. N., Mues, C., Brown, I., & Thomas, L. C. (2016). Exposure at default models with and without the credit conversion factor. European Journal of Operational Research, 252(3), 910–920.
Wang, W. (2003). Estimating the association parameter for copula models under dependent censoring. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 65(1), 257–273.
Zhang, L., & Singh, V. P. (2007). Bivariate rainfall frequency distributions using Archimedean copulas. Journal of Hydrology, 332(1–2), 93–109.
Zhu, W., Wang, C., & Tan, K. S. (2016). Levy subordinated hierarchical Archimedean copula: Theory and application. Journal of Banking and Finance, 69, 20–36.
Acknowledgements
We are grateful to the Editor and the anonymous reviewer for the helpful comments and suggestions that have greatly improved the presentation of the paper. Hengxin Cui thanks the support from the Hickman Scholar Program of the Society of Actuaries. Ken Seng Tan acknowledges the research funding from the Society of Actuaries CAE’s grant and the Singapore University Grant. Fan Yang acknowledges financial support from the Natural Sciences and Engineering Research Council of Canada (Grant Number: 04242).
Appendix: Proofs
To simplify the notation, for any two positive functions g and h, we write \(g\lesssim h\) or \(h\gtrsim g\) if \(\limsup g/h\le 1\).
1.1 A.1 Proofs for LT-Archimedean copulas
We first list a series of lemmas that will be useful for proving Theorem 4.1 and Theorem 4.2. The following is a restatement of Theorem 2 of Hoeffding (1963).
Lemma A.1
If \(X_{1},X_{2},\ldots ,X_{n}\) are independent and \(a_{i}\le X_{i}\le b_{i}\) for \(i=1,\ldots ,n\), then for \(\varepsilon >0\)
with \(\bar{X}_{n}=\left( X_{1}+X_{2}+\ldots +X_{n}\right) /n\).
Applying Lemma A.1, we obtain the following inequality:
Lemma A.2
For any \(\varepsilon >0\) and any large M, there exists a constant \(\beta >0\) such that
uniformly for all \(0<v\le M\) and for all sufficiently large n, where \(\mathbb {P}_{v}\) denotes the original probability measure conditioned on \(V=\frac{v}{\phi (1-f_{n})}\).
Proof
Note that the \(U_{i}\) are conditionally independent given V. Then, by Lemma A.1, for every n,
where \(\beta \) is some unimportant constant not depending on n and v.
Using (A.1), to obtain the desired result it suffices to show the existence of N such that, for all \(n\ge N\),
holds uniformly for all \(v\le M\). Recall that \(r(v)=\sum _{j\le |\mathcal {W} |}c_{j}w_{j}\tilde{p}(v,j)\). Note that \(n_{j}\) denotes the number of obligors in sub-portfolio j. Then
where \(\bar{c}=\sum _{j\le |\mathcal {W}|}c_{j}w_{j}\). By Assumption 2.1, there exists \(N_{1}\) satisfying \(\sum _{j\le |\mathcal {W}|} c_{j}\left| \frac{n_{j}}{n}-w_{j}\right| \le \frac{\varepsilon }{2}\) for all \(n\ge N_{1}\). For the second part of (A.3), by noting that \(e^{x}\ge 1+x\) for all \(x\in \mathbb {R}\), we have
Since \(\phi \in \mathrm {RV}_{\alpha }(1)\), there exists \(N_{2}\) such that for all \(n\ge N_{2}\), \(\bar{c}\max \limits _{j\le |\mathcal {W}|,v\in A}\left| p(v,j)-\tilde{p}(v,j)\right| \le \frac{\varepsilon }{2}\).
Combining the upper bound for both parts in (A.3) and letting \(N=\max \{N_{1},N_{2}\}\), (A.2) holds uniformly for all \(v\le M\). The proof is then completed. \(\square \)
The following proof of Theorem 4.1 is motivated by the proof of Theorem 1 in Bassamboo et al. (2008).
Proof of Theorem 4.1
Let \(v_{\delta }^{*}\) denote the unique solution to the equation \(r(v)=b-\delta \). By using continuity and monotonicity of r(v) in v, we have
as \(\delta \rightarrow 0\).
Fix \(\delta >0\). We decompose the probability of the event \(\{L_{n}>nb\}\) into two terms as
The remaining part of the proof is divided into three steps. We first show that \(I_{1}\) is asymptotically negligible, and then develop upper and lower bounds for \(I_{2}\) in the second and third steps.
Step 1. We show \(I_{1}=o(f_{n})\). Note that for any \(v\le v_{\delta }^{*}\), \(r(v)\le b-\delta \). Thus, by Lemma A.2, for all sufficiently large n, there exists a constant \(\beta >0\) such that
uniformly for all \(v\le v_{\delta }^{*}\). So the same upper bound holds for \(I_{1}\). Due to the condition on \(f_{n}\), \(I_{1}=o(f_{n})\).
Step 2. We now develop an asymptotic upper bound for \(I_{2} \). Note that
Recall that \(\phi ^{-1}\) is the LS transform of the random variable V. Then, by \(\phi (1-\frac{1}{\cdot })\in \mathrm {RV}_{-\alpha }\) and Karamata’s Tauberian theorem, we obtain
where in the first step we used \(\overline{F}_{V}\in \mathrm {RV}_{-1/\alpha }\) and the second step is due to \(1-\phi ^{-1}(\frac{1}{\cdot })\in \mathrm {RV} _{1/\alpha }\). Letting \(\delta \downarrow 0\), we obtain
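To make the Tauberian step concrete, consider the Gumbel generator \(\phi (t)=(-\log t)^{\theta }\) with \(\theta >1\) (an illustrative choice, not one imposed by the paper). Then
\[
\phi ^{-1}(s)=e^{-s^{1/\theta }},
\]
which is the LS transform of a positive \(1/\theta \)-stable random variable V. Since \(1-\phi ^{-1}(s)\sim s^{1/\theta }\) as \(s\downarrow 0\), Karamata’s Tauberian theorem yields
\[
\overline{F}_{V}(x)\sim \frac{x^{-1/\theta }}{\Gamma (1-1/\theta )},\qquad x\rightarrow \infty ,
\]
so that \(\overline{F}_{V}\in \mathrm {RV}_{-1/\alpha }\) with \(\alpha =\theta \), as used in the first step above.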
Step 3. We now develop an asymptotic lower bound for \(I_{2}\). Let \(v_{\widehat{\delta }}^{*}\) denote the unique solution to the equation \(r(v)=b+\delta \). Similarly, we have \(v_{\widehat{\delta }}^{*}\rightarrow v^{*}\) as \(\delta \rightarrow 0\). It also follows from the monotonicity of r(v) that \(v_{\widehat{\delta }}^{*}\ge v_{\delta }^{*}\). Thus,
Note that, for any large \(M>0\), \(r(v)\ge b+\delta \) holds uniformly for \(v\in \left[ v_{\widehat{\delta }}^{*},M\right] \); then, as \(n\rightarrow \infty \), by Lemma A.2
Hence,
Taking \(M\rightarrow \infty \) followed by \(\delta \rightarrow 0\), we get
Combining (A.4), (A.5) with Step 1 completes the proof of the theorem. \(\square \)
Proof of Theorem 4.2
We first note that the expected shortfall can be rewritten as in (4.6). Using Theorem 4.1, in order to get the desired result, it suffices to show that
We decompose the left-hand side of (A.6) into the following two terms
where \(\bar{c}=\sum _{j\le |\mathcal {W}|}c_{j}w_{j}\). The remainder of the proof is divided into three steps. In the first two steps we show that \(\mathbb {P}\left( L_{n}>n\bar{c}\right) \) and \(J_{2}\) are asymptotically negligible; in the last step we derive the asymptotics for \(J_{1}\). For simplicity, we denote by \(r^{\leftarrow }(s)\) the unique solution of the equation \(r(v)=s\) for \(0\le s\le \bar{c}\).
Step 1. In this step, we show
Fix an arbitrarily small \(\delta >0\). Proceeding in the same way as in Step 1 of the proof of Theorem 4.1, for all sufficiently large n, there exists a constant \(\beta >0\) such that
Due to the condition on \(f_{n}\) and letting \(\delta \downarrow 0\), we have the desired result in (A.7).
Step 2. In this step, we show \(J_{2}=o(f_{n}).\) Note that \(J_{2}\) can be rewritten as follows,
Since \(\frac{L_{n}}{n}<\max \limits _{j\le \vert \mathcal {W}\vert }c_{j}\), we have
It follows from (A.7) that \(J_{2}=o(f_{n})\).
Step 3. Finally, we show
First note that, for any \(x\in [b,\bar{c}]\), by Theorem 4.1 we have
Further, the following inequality holds for any \(x\in [b,\bar{c}]\):
Applying the dominated convergence theorem, we obtain
The last equality follows from the change of variable \(v=r^{\leftarrow }(x)\).
Combining Step 2 and Step 3 completes the proof of the theorem. \(\square \)
A.2 Proofs for algorithm efficiency
Lemmas A.3 and A.4 will be used in proving Lemma 5.1.
Lemma A.3
For sufficiently large n, there exists a constant C such that
for all x, where \(f_{V}^{*}(x)\) is defined in (5.6).
Proof
By the definition of \(f_{V}^{*}(x)\), the ratio \(\frac{f_{V}(x)}{f_{V}^{*}(x)}\) equals 1 for \(x<x_{0}\). Hence, to show (A.8), it suffices to show the existence of such a constant C for all \(x\ge x_{0}\).
Note that when \(x\ge x_{0}\),
By Assumption 4.1, V has an eventually monotone density function, so \(f_{V}\in \mathrm {RV}_{-1/\alpha -1}\). Then by Potter’s bounds [see, e.g., Proposition B.1.9(5) of de Haan and Ferreira (2007)], for any small \(\varepsilon >0\), there exist \(x_{0}>0\) and a constant \(C_{0}>0\) such that for all \(x\ge x_{0}\)
Thus,
which yields our desired result by noting the fact that \(x\ge x_{0}\) and \(-1/\alpha -\frac{1}{\log \phi (1-f_{n})}+\varepsilon <0\). \(\square \)
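For reference, the form of Potter’s bounds invoked in the proof above can be stated as follows: if \(f\in \mathrm {RV}_{\rho }\), then for every \(\varepsilon >0\) there exists \(x_{0}>0\) such that
\[
\frac{f(y)}{f(x)}\le (1+\varepsilon )\max \left\{ \left( \frac{y}{x}\right) ^{\rho +\varepsilon },\left( \frac{y}{x}\right) ^{\rho -\varepsilon }\right\} ,\qquad x,y\ge x_{0}.
\]
Here it is applied with \(\rho =-1/\alpha -1\), the regular variation index of \(f_{V}\).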
Lemma A.4
If \(\phi (1-\frac{1}{\cdot })\in \mathrm {RV}_{-\alpha }\) for some \(\alpha >1\) and \(f_{n}\) is a positive deterministic function converging to 0 as \(n\rightarrow \infty \), then
Proof
By Proposition B.1.9(1) of de Haan and Ferreira (2007), \(\phi \in \mathrm {RV}_{\alpha }(1)\) implies that
as \(x\rightarrow 0\). \(\square \)
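As a quick numerical sanity check of Lemma A.4 (illustrative only; the Gumbel generator \(\phi (t)=(-\log t)^{\theta }\) is an assumed example, for which \(\phi (1-\frac{1}{\cdot })\in \mathrm {RV}_{-\theta }\), so \(\alpha =\theta \)):

```python
import math

def gumbel_phi(t, theta):
    # Gumbel generator phi(t) = (-log t)^theta; phi(1 - 1/x) ~ x^(-theta),
    # so alpha = theta in the notation of Lemma A.4
    return (-math.log(t)) ** theta

theta = 2.5
ratios = []
for fn in (1e-2, 1e-4, 1e-6):
    lhs = -math.log(gumbel_phi(1.0 - fn, theta))  # -log phi(1 - f_n)
    rhs = theta * math.log(1.0 / fn)              # alpha * log(1 / f_n)
    ratios.append(lhs / rhs)
# the ratios approach 1 as f_n -> 0, consistent with
# -log phi(1 - f_n) ~ alpha * log(1 / f_n)
```

The ratios move toward 1 as \(f_{n}\downarrow 0\), in line with the asymptotic equivalence asserted by the lemma.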
The following proof is motivated by the proof of Theorem 3 in Bassamboo et al. (2008).
Proof of Lemma 5.1
Let
Note that if \(\mathbb {E}\left[ L_{n}\left| V=\frac{v}{\phi (1-f_{n} )}\right. \right] <nb\), then \(p_{j}^{*}=p_{\theta ^{*}}(V\phi (1-f_{n}),j)\), where \(\theta ^{*}\) is chosen by solving \(\Lambda _{L_{n}|V}^{\prime } (\theta )=nb\); otherwise \(p_{j}^{*}=p\left( V\phi (1-f_{n}),j\right) \) by setting \(\theta ^{*}=0\). Moreover, (5.8) shows that \(\hat{L}\) can be written as follows.
Then it follows that, for any v,
Since \(\Lambda _{L_{n}|V}(\theta )\) is a strictly convex function, one can observe that \(-\theta nb+\Lambda _{L_{n}|V}(\theta )\) is minimized at \(\theta ^{*}\) and equals 0 at \(\theta =0\). Hence, the following relation
holds for any v.
To prove the theorem, we now re-express
where \(v_{\delta }^{*}\) is the unique solution to the equation \(r(v)=b-\delta \).
The remainder of the proof is divided into three steps.
Step 1. In this step, we show
By Lemma A.3, for sufficiently large n, there exists a finite positive constant C such that
for all v. From (A.10), it then follows that
Therefore, \(K_{1}\) is upper bounded by
The last step is due to Step 1 in the proof of Theorem 4.1. Moreover, by Lemma A.4, \(-\log \phi (1-f_{n})\sim \alpha \log \left( \frac{1}{f_{n}}\right) =o\left( \frac{1}{f_{n}}\right) \). Since \(f_{n}\) has a sub-exponential decay rate, it follows that \(\frac{1}{f_{n}}\exp (-\beta n/2)\rightarrow 0\). Therefore, \(K_{1}\) is still \(o(f_{n})\).
Step 2. We show that
By Jensen’s inequality,
where the last step is due to Theorem 4.1. Then (A.11) follows by applying the logarithm function on both sides and using the fact that \(\log \left( f_{n}\right) <0\) for all sufficiently large n.
Step 3. We show that
First note that, on the set \(\left\{ L_{n}>nb,V>\frac{v_{\delta }^{*}}{\phi (1-f_{n})}\right\} \), the likelihood ratio \(L^{*}\) is bounded above by \(\frac{f_{V}(v)}{f_{V}^{*}(v)}\) by (A.10); hence, by (A.9), for all sufficiently large n, it holds for all \(v>\frac{v_{\delta }^{*}}{\phi (1-f_{n})}\) that
Multiplying by the indicator and taking expectations under \(\mathbb {E}^{*}\), we obtain
Then, taking logarithms on both sides, dividing by \(\log f_{n}\) and by Lemma A.4, we obtain
Finally, (A.12) follows by letting \(\varepsilon \downarrow 0\).
Combining Step 1, Step 2 and Step 3, the desired result asserted in the theorem is obtained. \(\square \)
The following two proofs are motivated by Chan and Kroese (2010). Lemma A.5 below will be used in proving Lemma 6.1.
Lemma A.5
Let \(R_{1},\ldots ,R_{n}\) be an i.i.d. sequence of standard exponential random variables. Suppose \(R_{(k)}\) is the kth order statistic and \(\lim _{n\rightarrow \infty }\frac{k}{n}=a<1\). Then, for every \(\varepsilon >0\), there exists a constant \(\beta >0\) such that the following inequality
holds for all sufficiently large n.
Proof
For i.i.d. standard exponential random variables \(R_{i},i=1,\ldots ,n\), it follows from Rényi (1953) that
Then,
where \(H_{n}\) denotes the nth harmonic number, i.e., \(H_{n}=1+\frac{1}{2}+\cdots +\frac{1}{n}\) for \(n\ge 1\). (A.13) is verified by noting the following asymptotic expansion; see, e.g., Berndt (1998),
where \(\gamma \) is Euler’s constant. Similarly,
where \(H_{n}^{(2)}\) is the nth harmonic number of order 2, i.e., \(H_{n}^{(2)}=1+\frac{1}{2^{2}}+\cdots +\frac{1}{n^{2}}\) for \(n\ge 1\). (A.14) is derived by applying the asymptotic expansion of \(H_{n}^{(2)} \); see, e.g., Berndt (1998),
Then, by Chebyshev’s inequality, it follows that, for every \(n>0\),
Due to (A.13) and (A.14), there exists N, such that for all \(n\ge N\),
where \(\beta \) only depends on \(\varepsilon \) and a. \(\square \)
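The Rényi representation underlying (A.13) can be checked numerically: for n i.i.d. standard exponentials, \(\mathbb {E}[R_{(k)}]=H_{n}-H_{n-k}\), which tends to \(-\log (1-a)\) when \(k/n\rightarrow a<1\). A small Monte Carlo illustration (the values n, k below are arbitrary):

```python
import math
import random

def harmonic(m):
    # m-th harmonic number H_m = 1 + 1/2 + ... + 1/m
    return sum(1.0 / i for i in range(1, m + 1))

n, k, sims = 100, 60, 20000            # a = k/n = 0.6 < 1
rng = random.Random(0)
mc_mean = 0.0
for _ in range(sims):
    sample = sorted(rng.expovariate(1.0) for _ in range(n))
    mc_mean += sample[k - 1]           # k-th order statistic R_(k)
mc_mean /= sims

exact = harmonic(n) - harmonic(n - k)  # Renyi: E[R_(k)] = H_n - H_{n-k}
limit = -math.log(1.0 - k / n)         # asymptotic mean -log(1 - a)
```

With these parameters the simulated mean, the exact harmonic-number formula, and the limiting value \(-\log (1-a)\) all agree to about two decimal places.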
Proof of Lemma 6.1
Recall that \(O_{i}=\frac{R_{i}}{\phi (1-l_{i}f_{n})}\), for all \(i=1,\ldots ,n\). Then the order statistic \(O_{(k)}\) is almost surely lower bounded by
Since \(k=\min \{l:\sum _{i=1}^{l}c_{(i)}>nb\}\), we have
Fix \(\varepsilon >0\). For all sufficiently large n, \(\mathbb {E}\left[ S^{2}(\mathbf {R})\right] \) can be bounded as follows,
Then,
The last step is due to the regular variation of V, Lemma A.5 and the condition that \(\frac{1}{n}=O(f_{n})\). \(\square \)
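To make the objects in this proof concrete, the following is a minimal sketch of a conditional Monte Carlo estimator of this type. It is an illustration only, not the paper’s algorithm: the Clayton generator with \(\theta =1\) (so \(\alpha =1\), illustrating the mechanics rather than the heavy-tail regime of the lemma), the exposures, the default probabilities, and all function names are assumptions made for the example. With \(\phi (t)=t^{-1}-1\), the mixing variable is \(V\sim \mathrm {Exp}(1)\); obligor i defaults when \(V\ge O_{i}=R_{i}/\phi (1-p_{i})\), so conditioning on \(\mathbf {R}\) gives \(S(\mathbf {R})=\overline{F}_{V}(O_{(k)})=e^{-O_{(k)}}\).

```python
import math
import random

def clayton_phi(t, theta=1.0):
    # Clayton generator phi(t) = (t^(-theta) - 1)/theta; its inverse is the
    # LS transform of V ~ Gamma(1/theta, 1), so theta = 1 gives V ~ Exp(1)
    return (t ** (-theta) - 1.0) / theta

def conditional_mc_estimate(c, p, n_sims=2000, b=0.25, seed=1):
    """Estimate P(L_n > n*b) by averaging S(R) = bar F_V(O_(k)) = exp(-O_(k))."""
    n = len(c)
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        R = [rng.expovariate(1.0) for _ in range(n)]
        O = [R[i] / clayton_phi(1.0 - p[i]) for i in range(n)]
        # sort obligors by their default threshold O_i; the exposures c_(i)
        # are taken in the same (concomitant) order
        order = sorted(range(n), key=lambda i: O[i])
        cum = 0.0
        for i in order:
            cum += c[i]
            if cum > n * b:
                # k = min{l : sum of the l smallest-threshold exposures > nb}
                total += math.exp(-O[i])  # bar F_V(O_(k)) for V ~ Exp(1)
                break
    return total / n_sims

# homogeneous portfolio: 50 obligors, unit exposures, marginal default prob 5%
est = conditional_mc_estimate([1.0] * 50, [0.05] * 50)
```

Averaging \(S(\mathbf {R})\) integrates out V analytically, leaving only the exponentials \(\mathbf {R}\) as a source of randomness; this is the mechanism behind the bounded-relative-error property analyzed in Lemma 6.1.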
Cui, H., Tan, K.S. & Yang, F. Portfolio credit risk with Archimedean copulas: asymptotic analysis and efficient simulation. Ann Oper Res 332, 55–84 (2024). https://doi.org/10.1007/s10479-022-04717-0