Abstract
This paper introduces a general semi-parametric method for estimating a vector of parameters in multivariate copula models. The proposed approach uses the moments of the multivariate probability integral random variable to generalize the inversion of Kendall’s tau estimator. What makes the new methodology attractive is the fact that it can be performed as soon as one can simulate from the assumed parametric family of copulas. This feature is especially helpful when explicit expressions are not available for the theoretical moments. The consistency and asymptotic normality of the proposed estimators are established under mild conditions. An extensive simulation study indicates that the price to pay for the estimation of the moments is modest and that the new estimators are almost as accurate as the pseudo-maximum likelihood (PML) estimator. The usefulness of the proposed estimators is illustrated on the modelling of multivariate data with copula models where the PML estimator is hardly computable.
1 Introduction
When the goal is to find an appropriate model for a random vector \(\textbf{X}\in \mathbb {R}^d\), a well-established strategy is to model the marginal behaviours and the dependence structure separately. This approach is possible thanks to a theorem of Sklar (1959), which states that there exists \(C:[0,1]^d \rightarrow [0,1]\) such that for all \({\textbf{x}}= (x_1, \ldots , x_d) \in \mathbb {R}^d\),
Whenever the marginal distributions are continuous, the function C is unique and is called the copula of \(\textbf{X}\). See Nelsen (2007) and Joe (2015) for details on copulas. In that context, it is customary to assume that C belongs to a parametric family \(\mathcal C = \{ C_{\varvec{\theta }}; {\varvec{\theta }}\in \varTheta \subseteq \mathbb {R}^p \}\) and then estimate the unknown parameter from a sample of copies \(\textbf{X}_1, \ldots , \textbf{X}_T\) of \(\textbf{X}\).
A popular estimator of a copula parameter is the pseudo-maximum likelihood (PML) estimator introduced by Oakes (1994) and later investigated by Genest et al. (1995). In principle, this method is applicable regardless of the dimension of the vector and of the number of parameters. However, the PML estimator requires an explicit expression for the copula density, which is not always available; even when the density is explicit, it may be intractable. The PML estimator can also be numerically unstable, especially when the family has several parameters. Some authors have proposed approaches to address these shortcomings. For example, minimum-distance estimators were considered by Biau and Wegkamp (2005) based on copula densities and by Tsukahara (2005) and Weiß (2011) relying on goodness-of-fit metrics. When there is only one parameter to estimate, a common strategy is the inversion of Kendall’s tau. This estimator was considered by Genest et al. (2006) in the bivariate case and by Genest et al. (2011) for d-dimensional copulas.
This paper extends the use of the inversion of Kendall’s tau to families comprising multiple parameters. The proposed estimators are based on unbiased estimators of the moments of the multivariate probability integral transformation (MPIT) random variable, thereby avoiding the need to estimate the marginal distributions. The proposed method is similar in spirit to that of Brahimi and Necir (2012), who suggested using higher moments of the MPIT. However, their estimators are biased and their approach is limited to cases where the vector of theoretical moments is explicitly invertible. To circumvent this constraint, an approach based on simulated moments à la McFadden (1989) is adopted. Hence, the proposed estimators can be computed as soon as it is possible to simulate from a given copula family.
The manuscript is organized as follows. A generic method-of-moments estimator based on a vector of U-statistics is introduced in Sect. 2; its consistency and asymptotic normality are established under mild conditions for both the standard and simulated versions. Section 3 explains how to estimate the moments of the multivariate probability integral transformation without bias and describes the new estimators of copula parameters suitably adapted to the parametric structure at hand. In Sect. 4, the performance of the proposed estimators is investigated and compared to competing procedures through an extensive simulation study. Section 5 illustrates the introduced methodologies on the modelling of multivariate data with copula models that have a complex parametric structure. Section 6 ends the paper with a brief discussion. The mathematical proofs can be found in an appendix and the Matlab code is available at www.uqtr.ca/MyMatlabWebpage.
2 A generic method-of-moments estimator of copula parameters
2.1 Statistical functionals
The estimators proposed in this paper are special cases of a generic method-of-moments estimator based on a vector of U-statistics. Specifically, let \(\textbf{X}= (X_1,\ldots , X_d)\) be a random vector from a d-variate distribution function F that has continuous marginals \(F_1, \ldots , F_d\) and a unique copula C. Consider the vector \({\varvec{\kappa }}= ({\varvec{\kappa }}_1, \ldots , {\varvec{\kappa }}_L)\), where for each \(\ell \in \{1, \ldots , L\}\), \({\varvec{\kappa }}_\ell := {\varvec{\kappa }}_\ell ({\textbf{x}}_1, \ldots , {\textbf{x}}_m)\) is a symmetric function in its m arguments. Then, for \(\textbf{X}_1, \ldots , \textbf{X}_m\) i.i.d. F, define the statistical functional
In order to develop semi-parametric estimators of the parameters of a given family of copulas, it is necessary that \(\mathcal S_{\varvec{\kappa }}(F)\) be free of the marginal distributions. To this end, \({\varvec{\kappa }}\) must be such that \(\mathcal S_{\varvec{\kappa }}(F) = \mathcal S_{\varvec{\kappa }}(C)\), i.e.,
where \({\textbf{U}}_1, \ldots , {\textbf{U}}_m\), with \({\textbf{U}}_j = ( F_1(X_{j1}), \ldots , F_d(X_{jd}) )\), are i.i.d. C.
Assuming a random sample \(\textbf{X}_1, \ldots , \textbf{X}_T\) i.i.d. F, an unbiased estimator of the vector of means \({\varvec{\mu }}_C^{\varvec{\kappa }}= {\mathbb {E}}\left\{ {\varvec{\kappa }}({\textbf{U}}_1, \ldots , {\textbf{U}}_m) \right\} \) is given by the L-dimensional vector of U-statistics
where in the above expression, the sum of vectors is taken component by component. From Theorem 3, p. 122 of Lee (1990), \({\varvec{\mu }}_T^{\varvec{\kappa }}\) converges almost surely to \({\varvec{\mu }}_C^{\varvec{\kappa }}\) as long as \({\mathbb {E}}\{ |{\varvec{\kappa }}(\textbf{X}_1,\ldots ,\textbf{X}_m)| \} < \infty \).
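In code, computing such a vector of U-statistics amounts to averaging each kernel over all size-m subsets of the sample. A minimal Python sketch (function names are illustrative; the paper's own implementation is in MATLAB):

```python
import itertools
import numpy as np

def u_statistic(kernel, sample, m):
    """Average a symmetric kernel of order m over all size-m subsets of
    the sample: an unbiased estimator of E{kernel(X_1, ..., X_m)}."""
    values = [kernel(*(sample[i] for i in idx))
              for idx in itertools.combinations(range(len(sample)), m)]
    return float(np.mean(values))

# Sanity check: with the order-2 kernel (x + y)/2 applied to scalar data,
# the U-statistic reduces to the ordinary sample mean.
rng = np.random.default_rng(0)
x = rng.normal(size=25)
u_val = u_statistic(lambda a, b: (a + b) / 2, x, 2)
```

For an order-m kernel, the enumeration of subsets costs O(T^m) kernel evaluations, which is the usual computational price of U-statistics.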
The next result states the asymptotic normality of \(\sqrt{T} ({\varvec{\mu }}_T^{\varvec{\kappa }}- {\varvec{\mu }}_C^{\varvec{\kappa }})\). The proof is a direct application of Theorem 2, p. 76 of Lee (1990) and is therefore omitted. Here and in the sequel, \(\rightsquigarrow \) means “converges in distribution to”.
Lemma 1
For each \(\ell \in \{ 1, \ldots , L \}\), let \({\textbf{U}}_1, \ldots , {\textbf{U}}_m\) be i.i.d. C and define \({\varvec{\kappa }}_C^\star = ({\varvec{\kappa }}^\star _1, \ldots , {\varvec{\kappa }}^\star _L)\), where
Provided that \(\varSigma _C^{\varvec{\kappa }}= {\mathbb {E}}\{ {\varvec{\kappa }}_C^\star ({\textbf{U}}_1)^\top {\varvec{\kappa }}_C^\star ({\textbf{U}}_1) \} \in \mathbb {R}^{L\times L}\) is non-singular, \(\sqrt{T} ({\varvec{\mu }}_T^{\varvec{\kappa }}- {\varvec{\mu }}_C^{\varvec{\kappa }}) \rightsquigarrow \mathcal {N}(\textbf{0}_L, \varSigma _C^{\varvec{\kappa }})\).
2.2 A generalized method-of-moments estimator
Suppose that the copula C of the population belongs to a copula family
In that case, there exists \({\varvec{\theta }}_0 \in \varTheta \), called the “true value”, such that \(C = C_{{\varvec{\theta }}_0}\). As is well known, a method-of-moments estimator consists of estimating \({\varvec{\theta }}_0\) by selecting \({\varvec{\theta }}\in \varTheta \) such that the sample moments match their theoretical counterparts. In the current context, let \({\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }}) = {\varvec{\mu }}_{C_{\varvec{\theta }}}^{\varvec{\kappa }}\) and assume that the map \({\varvec{\mu }}^{\varvec{\kappa }}: \varTheta \rightarrow \mathbb {R}^L\) is one-to-one on an open set \(\varTheta \subset \mathbb {R}^L\). A method-of-moments estimator of \({\varvec{\theta }}_0\) is the unique solution of \({\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }}) = {\varvec{\mu }}_T^{\varvec{\kappa }}\), namely
The consistency and asymptotic normality of \({\varvec{\theta }}_T^{\varvec{\kappa }}\) are stated next.
Proposition 1
Let \(\textbf{X}_1, \ldots , \textbf{X}_T\) be i.i.d. \(C_{{\varvec{\theta }}_0}\) and assume that \({\varvec{\mu }}^{\varvec{\kappa }}: \varTheta \rightarrow \mathbb {R}^L\) is one-to-one and continuously differentiable at \({\varvec{\theta }}_0 \in \varTheta \) with nonsingular first order derivative \({\varvec{\nu }}_0^{\varvec{\kappa }}\in \mathbb {R}^{L\times L}\) at \({\varvec{\theta }}_0\). Let also \(A_0 = ({\varvec{\nu }}_0^{\varvec{\kappa }})^{-1}\) and define \(\varSigma _0^{\varvec{\kappa }}\) as the covariance matrix in Lemma 1 when \(C = C_{{\varvec{\theta }}_0}\). Then \({\varvec{\theta }}_T^{\varvec{\kappa }}\) exists with a probability tending to one and
2.3 Simulated version of the generalized method-of-moments estimator
It is difficult to compute the proposed method-of-moments estimator when \({\varvec{\mu }}^{\varvec{\kappa }}\) is not explicitly invertible. The situation is even worse when \({\varvec{\mu }}^{\varvec{\kappa }}\) itself admits no explicit expression. In such cases, it is useful to express \({\varvec{\theta }}_T^{\varvec{\kappa }}\) as the minimum-distance estimator
where \(M_T \in \mathbb {R}^{L\times L}\) is a weight matrix that converges in probability to a positive definite matrix \(M_0 \in \mathbb {R}^{L\times L}\) as \(T \rightarrow \infty \). Nevertheless, this expression does not solve the problem of cases where \({\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }})\) admits no explicit expression.
To avoid the above-mentioned drawbacks, a simulated version of the generic method-of-moments estimator \({\varvec{\theta }}_T^{\varvec{\kappa }}\) is proposed. The idea is in the same spirit as that investigated by Oh and Patton (2013) for copula-based time series models, which is itself inspired by the simulated method-of-moments estimators studied by McFadden (1989), Pakes and Pollard (1989) and Newey and McFadden (1994). To describe the idea in the current context, consider a version of \({\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }})\) based on a simulated sample \({\textbf{U}}_1^{\varvec{\theta }}, \ldots , {\textbf{U}}_S^{\varvec{\theta }}\) i.i.d. \(C_{\varvec{\theta }}\), namely
The proposed simulated method-of-moments estimator of \({\varvec{\theta }}_0 \in \varTheta \) is then
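As an illustration of this simulated estimator, the following Python sketch fits the one-parameter Clayton family by matching a single simulated moment (Kendall's tau), with a crude grid search standing in for the derivative-free optimizer used later in the paper; all names are illustrative. Note the simulation seed held fixed across candidate parameter values, a point emphasized in Sect. 4.1:

```python
import numpy as np

def clayton_pairs(theta, n, rng):
    """Marshall-Olkin sampler for the bivariate Clayton copula (theta > 0)."""
    v = rng.gamma(1.0 / theta, size=(n, 1))      # Gamma frailty
    e = rng.exponential(size=(n, 2))
    return (1.0 + e / v) ** (-1.0 / theta)

def kendall_tau(u):
    """Empirical Kendall's tau of a bivariate sample without ties."""
    a = np.sign(u[:, 0][:, None] - u[:, 0][None, :])
    b = np.sign(u[:, 1][:, None] - u[:, 1][None, :])
    n = len(u)
    return np.sum(a * b) / (n * (n - 1))

def smm_clayton(u_obs, S=1000, grid=np.linspace(0.25, 8.0, 64)):
    """Simulated method-of-moments with Kendall's tau as the moment:
    the simulation seed is held fixed across candidate theta values so
    that the objective behaves smoothly in theta."""
    tau_obs = kendall_tau(u_obs)

    def loss(theta):
        u_sim = clayton_pairs(theta, S, np.random.default_rng(2024))
        return (kendall_tau(u_sim) - tau_obs) ** 2

    return min(grid, key=loss)

rng = np.random.default_rng(1)
u = clayton_pairs(2.0, 400, rng)   # true theta = 2, i.e. Kendall's tau = 1/2
theta_hat = smm_clayton(u)
```

The same template applies to any family from which one can simulate: only the sampler and the moment vector change.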
Proposition 2 states the consistency of \({\varvec{\theta }}_{T,S}^{\varvec{\kappa }}\) as \(T,S \rightarrow \infty \), i.e., \({\varvec{\theta }}_{T,S}^{\varvec{\kappa }}\) converges in probability to the true value \({\varvec{\theta }}_0 \in \varTheta \). The conditions under which this happens are mild. In particular, it is no longer required that \({\varvec{\mu }}^{\varvec{\kappa }}\) be one-to-one.
Proposition 2
Let \(\textbf{X}_1, \ldots , \textbf{X}_T\) be i.i.d. from a distribution with continuous marginals and unique copula that belongs to a family \(\{ C_{\varvec{\theta }}; {\varvec{\theta }}\in \varTheta \subset \mathbb {R}^L \}\), where \(\varTheta \) is compact. For \({\varvec{\theta }}_0 \in \varTheta \) being the true value, assume that \({\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }}_0) \ne {\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }})\) whenever \({\varvec{\theta }}\ne {\varvec{\theta }}_0\) and that \({\varvec{\mu }}^{\varvec{\kappa }}\) is Lipschitz continuous on \(\varTheta \). Then, as \(T, S \rightarrow \infty \), \({\varvec{\theta }}_{T,S}^{\varvec{\kappa }}\) is consistent for \({\varvec{\theta }}_0\).
Remark 1
Unlike standard results on the consistency of simulated method-of-moments estimators (see Pakes and Pollard 1989; McFadden 1989, for instance), Proposition 2 allows S and T to go to infinity at different rates. In other words, it is only assumed that \(T/S \rightarrow \zeta \in [0,\infty )\) as \(T,S \rightarrow \infty \), so that the case where \({\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }})\) is explicit is recovered in the limit \(T/S \rightarrow 0\).
Lemma 2
A sufficient condition for the Lipschitz continuity of \({\varvec{\mu }}^{\varvec{\kappa }}\) is that the density \(c_{\varvec{\theta }}\) of \(C_{\varvec{\theta }}\) be uniformly Lipschitz, i.e., there exists \(K \in (0,\infty )\) such that
where \(\Vert \cdot \Vert \) is the Euclidean norm.
The uniform Lipschitz property as stated in Lemma 2 holds for the Farlie–Gumbel–Morgenstern family of bivariate copulas whose members have density \(c_\theta (u_1,u_2) = 1 + \theta (1-2u_1)(1-2u_2)\) for \(\theta \in [-1,1]\), since one readily obtains
However, this uniform Lipschitz condition is rather strong and not verified for many commonly used copula families.
It is possible to show the almost sure convergence (strong consistency) of \({\varvec{\theta }}_{T,S}^{\varvec{\kappa }}\), but its weak consistency is enough to establish its asymptotic normality. This result is precisely the subject of the next proposition.
Proposition 3
Let \(\textbf{X}_1, \ldots , \textbf{X}_T\) be i.i.d. from a distribution with continuous marginals and unique copula that belongs to a family \(\{ C_{\varvec{\theta }}; {\varvec{\theta }}\in \varTheta \subset \mathbb {R}^L \}\). Assume that the true value \({\varvec{\theta }}_0\) is an interior point of \(\varTheta \) and that \(g_0^{\varvec{\kappa }}({\varvec{\theta }}) = {\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }}_0) - {\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }})\) possesses a derivative \(G_0^{\varvec{\kappa }}\) at \({\varvec{\theta }}_0\) such that \(B_0^{\varvec{\kappa }}= G_0^{\varvec{\kappa }}\, M_0 \, (G_0^{\varvec{\kappa }})^\top \) is nonsingular. Also, suppose that for \(g_{T,S}^{\varvec{\kappa }}({\varvec{\theta }}) = {\varvec{\mu }}_T^{\varvec{\kappa }}- {\varvec{\mu }}_S^{\varvec{\kappa }}({\varvec{\theta }})\),
Then for \(\varOmega _0^{\varvec{\kappa }}= (B_0^{\varvec{\kappa }})^{-1} \, G_0^{\varvec{\kappa }}\, M_0 \, \varSigma _0^{\varvec{\kappa }}\, M_0 \, (G_0^{\varvec{\kappa }})^\top (B_0^{\varvec{\kappa }})^{-1}\),
3 Moments of the multivariate probability integral transformation
3.1 Population versions
The estimators that will be developed in the sequel are based on the multivariate probability integral transformation (MPIT). The MPIT was first introduced by Genest and Rivest (1993) as a tool for assessing the goodness of fit of bivariate copulas. Specifically, the MPIT of a random vector \(\textbf{X}= (X_1, \ldots , X_d) \sim F\) is the random variable \(W = F(\textbf{X})\). When the marginals \(F_1, \ldots , F_d\) of F are continuous, one can invoke Sklar’s Theorem and write
where \({\textbf{U}}= (F_1(X_1), \ldots , F_d(X_d)) \sim C\). In other words, the stochastic behaviour of W depends only on the unique copula C of F.
For certain parametric structures, it is advantageous to work with the pair-by-pair probability integral transformations related to \(\textbf{X}\). Specifically, let \(C_{jj'}\) be the copula of the pair \((X_j,X_{j'})\) and consider \(W_{jj'} = C_{jj'}(U_j,U_{j'})\), where \(U_j = F_j(X_j)\) and \(U_{j'} = F_{j'}(X_{j'})\). The first moment of \(W_{jj'}\) is related to Kendall’s measure of association of \((U_j,U_{j'})\) through the relationship
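The relationship in question is the classical identity \(\tau = 4 \, {\mathbb {E}}(W_{jj'}) - 1\). A quick Monte-Carlo check of it in Python for the Clayton copula, whose Kendall's tau is \(\theta /(\theta +2)\) (an illustrative sketch, not the paper's code):

```python
import numpy as np

def clayton_cdf(u, v, theta):
    """Clayton copula C_theta(u, v) = (u^-theta + v^-theta - 1)^(-1/theta)."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

# Monte-Carlo check of tau = 4 E(W) - 1 with W = C(U1, U2): for Clayton
# with theta = 2, Kendall's tau is theta / (theta + 2) = 1/2.
theta = 2.0
rng = np.random.default_rng(0)
v = rng.gamma(1.0 / theta, size=(50000, 1))      # Marshall-Olkin frailty
e = rng.exponential(size=(50000, 2))
u = (1.0 + e / v) ** (-1.0 / theta)              # Clayton sample
tau_mc = 4.0 * np.mean(clayton_cdf(u[:, 0], u[:, 1], theta)) - 1.0
```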
A similar claim can be made about a d-variate extension due to Kendall and Smith (1940) defined as the mean of the pair-by-pair Kendall’s tau. Another extension of Kendall’s tau proposed by Joe (1990) is related to the first moment of W. These two d-variate Kendall measures are given respectively by
3.2 Unbiased estimation of the moments of the MPIT
The next proposition provides an unbiased estimator of the moments of W based on a sample \(\textbf{X}_1, \ldots , \textbf{X}_T\) i.i.d. from a d-variate distribution F with continuous marginals \(F_1, \ldots , F_d\) and unique copula C.
Proposition 4
For \(a \in \mathbb {N}\), let \(\mu _a = {\mathbb {E}}(W^a)\), where \(W = C({\textbf{U}})\) and \({\textbf{U}}\sim C\). Then an unbiased estimator of \(\mu _a\) is provided by the U-statistic \(\widehat{\mu }_a\) with symmetric kernel of order \(a+1\) defined for \({\textbf{x}}_1, \ldots , {\textbf{x}}_{a+1} \in \mathbb {R}^d\) by
This kernel satisfies (1), i.e., is marginal-free.
Note that when \(a=1\), \(\mathcal {K}_1({\textbf{x}}_1,{\textbf{x}}_2) = \{ \mathbb {I}({\textbf{x}}_1< {\textbf{x}}_2) + \mathbb {I}({\textbf{x}}_2 < {\textbf{x}}_1) \} / 2\). This is, up to a constant, the kernel of the empirical Kendall’s measure of association.
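For bivariate data without ties, this U-statistic satisfies \(\widehat{\mu }_1 = (\tau _T + 1)/4\) exactly, which the following Python sketch verifies (illustrative names):

```python
import numpy as np
from itertools import combinations

def mu_hat_1(x):
    """U-statistic with kernel K_1(x1, x2) = {I(x1 < x2) + I(x2 < x1)}/2,
    the inequalities being componentwise; unbiased for E(W)."""
    values = [(float(np.all(a < b)) + float(np.all(b < a))) / 2.0
              for a, b in combinations(x, 2)]
    return float(np.mean(values))

def tau_hat(x):
    """Empirical Kendall's tau for bivariate data without ties."""
    values = [np.sign((a[0] - b[0]) * (a[1] - b[1]))
              for a, b in combinations(x, 2)]
    return float(np.mean(values))

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 2))
# For these data, mu_hat_1(x) equals (tau_hat(x) + 1) / 4 exactly.
```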
One is now in a position to establish the convergence in distribution of the vector of U-statistics \(\widehat{\varvec{\mu }}= ( \widehat{\mu }_1, \ldots , \widehat{\mu }_L)\).
Proposition 5
Let \(\textbf{X}_1, \ldots , \textbf{X}_T\) be i.i.d. from a d-variate distribution with continuous marginals and unique copula C. For \(\widehat{\varvec{\mu }}= ( \widehat{\mu }_1, \ldots , \widehat{\mu }_L)\) and \({\varvec{\mu }}= ( \mu _1, \ldots , \mu _L)\), \(\sqrt{T} (\widehat{\varvec{\mu }}- {\varvec{\mu }})\) converges in distribution to the L-variate normal distribution with zero mean and variance-covariance matrix \(\varSigma \in \mathbb {R}^{L\times L}\) such that for each \(a, a' \in \{ 1, \ldots , L \}\), \(\varSigma _{aa'} = {\mathbb {E}}\{ \mathcal {K}_a^\star ({\textbf{U}}) \, \mathcal {K}_{a'}^\star ({\textbf{U}}) \}\), \({\textbf{U}}\sim C\), where
3.3 Estimators of copula parameters
Let \(\textbf{X}_1, \ldots , \textbf{X}_T\) be independent and identically distributed \(\mathbb {R}^d\)-valued vectors whose joint distribution F has continuous marginal distributions. It is assumed that the unique copula C of F belongs to \(\{ C_{\varvec{\theta }}; {\varvec{\theta }}\in \varTheta \subseteq \mathbb {R}^L \}\) and the goal is to estimate the unknown parameter \({\varvec{\theta }}_0 \in \varTheta \subset \mathbb {R}^L\).
One possibility to estimate \({\varvec{\theta }}_0\) is to use the first L moments of \(W = C_{\varvec{\theta }}({\textbf{U}})\), i.e., let \({\varvec{\mu }}({\varvec{\theta }}) = ({\mathbb {E}}(W), \ldots , {\mathbb {E}}(W^L) )\), where it is understood that the expectation is taken with respect to \(C_{\varvec{\theta }}\). An empirical version of \({\varvec{\mu }}({\varvec{\theta }})\) is the vector \({\varvec{\mu }}_T\) of the first L empirical moments. The corresponding vector of kernels is then \({\varvec{\kappa }}= (\mathcal {K}_1, \ldots , \mathcal {K}_L)\), where \(\mathcal {K}_a\) is defined in Proposition 4. The simulated method-of-moments estimator is thus of the form given in (3), i.e.,
where \({\varvec{\mu }}_S({\varvec{\theta }})\) estimates \({\varvec{\mu }}({\varvec{\theta }})\) based on a sample of size S from \(C_{\varvec{\theta }}\).
Many parametrization schemes have a pair-by-pair structure of the form \({\varvec{\theta }}= (\varSigma ,{\varvec{\gamma }})\), where \(\varSigma \in \mathbb {R}^{d\times d}\) is a correlation matrix whose off-diagonal entry \(\varSigma _{jj'}\) appears only in the distribution of \((X_j,X_{j'})\) for each \(j \ne j' \in \{1,\ldots ,d\}\). This pattern occurs, for example, for models derived from the multivariate Normal. If \({\varvec{\gamma }}\in \mathbb {R}^q\) is a parameter that appears in the distribution of every sub-vector of \(\textbf{X}\), one can estimate \(\varSigma _{jj'}\) and \({\varvec{\gamma }}\) from the first \(q+1\) moments of \(W_{jj'} = C_{\varSigma _{jj'},{\varvec{\gamma }}}(U_j,U_{j'})\), yielding \(\widehat{\varSigma }_{jj'}\) and \(\widehat{{\varvec{\gamma }}}_{jj'}\). The global parameter can then be estimated with
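A simplified Python illustration of this pair-by-pair strategy, using the Normal copula, for which the inversion of the pairwise Kendall's tau is explicit (\(\varSigma _{jj'} = \sin (\pi \tau _{jj'}/2)\)); in the paper's setting the pairwise inversions would instead be performed via simulated moments of \(W_{jj'}\):

```python
import numpy as np
from itertools import combinations

def tau_pair(x, j, jp):
    """Empirical Kendall's tau of the pair of columns (j, j')."""
    a = np.sign(x[:, j][:, None] - x[:, j][None, :])
    b = np.sign(x[:, jp][:, None] - x[:, jp][None, :])
    n = len(x)
    return np.sum(a * b) / (n * (n - 1))

def sigma_pairwise(x):
    """Assemble the correlation matrix of a Normal copula pair by pair,
    using the explicit inversion sigma_jj' = sin(pi * tau_jj' / 2)."""
    d = x.shape[1]
    sigma = np.eye(d)
    for j, jp in combinations(range(d), 2):
        sigma[j, jp] = sigma[jp, j] = np.sin(0.5 * np.pi * tau_pair(x, j, jp))
    return sigma

rng = np.random.default_rng(0)
true_sigma = np.array([[1.0, 0.5, 0.3],
                       [0.5, 1.0, 0.4],
                       [0.3, 0.4, 1.0]])
x = rng.multivariate_normal(np.zeros(3), true_sigma, size=500)
sigma_hat = sigma_pairwise(x)
```

A common parameter \({\varvec{\gamma }}\) would similarly be estimated pair by pair and then averaged over the \(d(d-1)/2\) pairs.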
Remark 2
It is worth mentioning the work by Brahimi and Necir (2012), who also suggested using the first L moments of W. Their methodology is however limited to cases where the vector of theoretical moments is explicitly invertible and is based on a biased estimation of the moments using the empirical copula.
3.4 On the use of alternative probability integral random variables
Other moments of a copula could be used for the purpose of parameter estimation. For instance, as pointed out by Quessy (2009), Spearman’s measure of association and some of its multivariate extensions can be expressed as the expectation of a symmetric kernel of a U-statistic. Indeed, Spearman’s rho is an affine transformation of the expectation of \(W_\textrm{Sp} = F_1(X_1) \times \cdots \times F_d(X_d)\), where \(\textbf{X}= (X_1, \ldots , X_d)\) follows a distribution F with continuous marginals \(F_1, \ldots , F_d\). One could then consider \({\mathbb {E}}(W_\textrm{Sp}^a)\), but its estimation with a U-statistic involves a kernel of order \(a \times (d+1)\). To see this, note that
where \(\textbf{Y}_1^\varPi , \ldots , \textbf{Y}_a^\varPi \) are i.i.d. \(F_1 \times \cdots \times F_d\) and \(\textbf{X}\sim F\). The expression inside the brackets involves \(a \times (d+1)\) independent random variables. This makes the use of \(W_\textrm{Sp}\) less attractive than the use of W, especially as the number of parameters increases.
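For instance, in dimension \(d=2\) the affine transformation is \(\rho _\textrm{Sp} = 12\,{\mathbb {E}}(W_\textrm{Sp}) - 3\). A quick Monte-Carlo sanity check in Python under independence, where \(\rho _\textrm{Sp} = 0\):

```python
import numpy as np

# In dimension d = 2, Spearman's rho is rho_Sp = 12 * E(W_Sp) - 3 with
# W_Sp = F1(X1) * F2(X2).  Under independence, E(W_Sp) = 1/4 and the
# Monte-Carlo average below recovers rho_Sp = 0.
rng = np.random.default_rng(0)
u = rng.uniform(size=(100000, 2))
rho_sp = 12.0 * np.mean(u[:, 0] * u[:, 1]) - 3.0
```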
4 Sampling properties of the estimators
4.1 Preliminaries
This section investigates the performance of the estimators defined in Eqs. (4) and (5). Comparisons with competing estimators are also made. In the sequel, the accuracy of an estimator \(\widehat{\theta }\) of a given parameter \(\theta \in \mathbb {R}\) is measured by its relative bias (RB) and relative root mean-squared error (RRMSE), namely
As explained by Oh and Patton (2013) (see also Gouriéroux et al. 1996), it is crucial that the seed of the random number generator be fixed across the generation of the simulated datasets of size S. Otherwise, the evaluation of the objective function is unstable and the optimization fails to converge. The minimization in Eq. (3) is performed using the MATLAB routine fminsearchbnd written by John D’Errico; unlike the Newton algorithm, for example, this routine does not require the existence of derivatives. The maximum number of iterations has been set to 40. Many choices are possible for the weight matrix, including the efficient weight matrix given by the inverse of the asymptotic covariance matrix. However, in line with a recommendation by Oh and Patton (2013), only \(M_T = \textrm{I}_L\) is considered throughout in order to simplify the analyses.
4.2 Calibration of the simulated estimator
A popular estimator for one-parameter bivariate copula families \(\mathcal C = \{ C_\theta ; \theta \in \varTheta \subseteq \mathbb {R}\}\) is the inversion of Kendall’s tau (IKT). Specifically, if \((X_{11},X_{12}), \ldots , (X_{T1},X_{T2})\) are random pairs from a population with continuous marginals and a copula \(C \in \mathcal C\), the IKT estimator is defined by
In that expression, \(\tau _C\) is the population value of Kendall’s tau, i.e.,
and \(\tau _T\) is its empirical counterpart, i.e.,
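For the Clayton family, where \(\tau _C = \theta /(\theta +2)\), the IKT estimator has the closed form \(\widehat{\theta }^\tau = 2\tau _T/(1-\tau _T)\). A Python sketch of this special case (illustrative; the paper's code is MATLAB):

```python
import numpy as np

def tau_T(u1, u2):
    """Empirical Kendall's tau (continuous data, no ties)."""
    a = np.sign(u1[:, None] - u1[None, :])
    b = np.sign(u2[:, None] - u2[None, :])
    n = len(u1)
    return np.sum(a * b) / (n * (n - 1))

def ikt_clayton(u1, u2):
    """Inversion of Kendall's tau for the Clayton family, for which
    tau_C = theta / (theta + 2), hence theta = 2 * tau / (1 - tau)."""
    tau = tau_T(u1, u2)
    return 2.0 * tau / (1.0 - tau)

rng = np.random.default_rng(0)
v = rng.gamma(0.5, size=(800, 1))    # Marshall-Olkin frailty for theta = 2
e = rng.exponential(size=(800, 2))
u = (1.0 + e / v) ** -0.5            # Clayton sample with tau = 1/2
theta_hat = ikt_clayton(u[:, 0], u[:, 1])
```

For families such as Frank, where \(\tau (\theta )\) has no closed-form inverse, the same inversion is carried out numerically.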
Although the simulated method-of-moments estimator is designed for cases where an explicit expression for the vector of moments is unavailable, it can also be applied to copula families whose moments have explicit expressions. This is the case for the one-parameter Clayton, Gumbel, Normal and Frank bivariate copula families described in Table 1.
The accuracy of \(\widehat{\theta }^\tau \) and of its simulated version \(\widehat{\theta }\) has been investigated through their RRMSE estimated from 1000 replicates. This study provides information not only on the loss of accuracy due to the use of the simulated version instead of the mathematical inversion, but also on the role of the number of simulated samples S in the performance of \(\widehat{\theta }\). To this end, the values \(S \in \{ 100, 250, 500 \}\) have been considered; the corresponding estimators are referred to respectively as \(\widehat{\theta }_{100}\), \(\widehat{\theta }_{250}\) and \(\widehat{\theta }_{500}\). The results for sample sizes \(T \in \{50,100\}\) and a Kendall’s tau in \(\{ 1/4, 1/2, 3/4 \}\) are found in Table 2. Note that the expression of Kendall’s tau for Frank’s copula has been inverted numerically.
Looking at Table 2, one can see that the RRMSEs are smaller when \(T=100\) than when \(T=50\), as expected. It is also not a surprise that \(\widehat{\theta }^\tau \) is more accurate than its simulated versions. However, the loss of efficiency from using \(\mu _S(\theta )\) instead of \(\mu (\theta )\) is rather mild as soon as \(S=250\). In fact, when \(T=100\), the average of the relative efficiency \(\textrm{RRMSE}(\widehat{\theta }^\tau ) / \textrm{RRMSE}(\widehat{\theta })\) over the twelve models is 74.0%, 87.2% and 93.3% when \(S=100,250,500\), respectively. Note finally that the RRMSE decreases as the level of dependence increases, i.e., as \(\tau _C\) increases, for all the models.
4.3 On the use of higher moments of the MPIT
One could ask whether considering higher moments of W leads to better estimators. To answer this question, at least in part, let \(\widehat{\theta }_{\{2\}}\) and \(\widehat{\theta }_{\{1,2\}}\) be the simulated method-of-moments estimators based respectively on the second moment of W and on its first two moments. The estimated relative bias and relative root mean-squared error of these two estimators, as well as of \(\widehat{\theta }\), are found in Table 3 for the one-parameter families of Table 1. The sample size is \(T=100\) and the number of simulated samples has been set to \(S=100\); it is reasonable to expect that higher values of S would not much affect the relative efficiency of the estimators. The conclusion from Table 3 is that using higher moments does not yield more efficient estimators. Even though the RRMSE of \(\widehat{\theta }_{\{1,2\}}\) is slightly smaller than that of \(\widehat{\theta }\) in some cases, the gain in accuracy is rather small. As a general rule, one can advocate the use of the simulated method-of-moments estimator with as many moments of W as there are parameters to estimate.
4.4 Comparison with competing procedures
In this section, the performance of \(\widehat{\varvec{\theta }}\) is compared to that of two other semi-parametric procedures, namely the pseudo-maximum likelihood (PML) estimator and the simulated method-of-moments estimator of Oh and Patton (2013). Other possibilities exist, for instance minimum-distance (MD) estimators derived from goodness-of-fit criteria. Based on an extensive simulation study, however, Weiß (2011) concluded that no MD estimator stands out and that all are outperformed by the PML estimator in terms of bias and mean-squared error. For that reason, no MD estimator is considered in the upcoming simulation study.
The PML estimator is the rank-based version of the maximum-likelihood estimator and is sometimes referred to as the canonical maximum likelihood estimator (see Cherubini et al. 2004, for instance). If \(C_{\varvec{\theta }}\) admits a density \(c_{\varvec{\theta }}\), the PML estimator of \({\varvec{\theta }}\) as defined by Shih and Louis (1995) and Genest et al. (1995) is
where for each \(j \in \{ 1, \ldots , d \}\), \(R_{tj}\) is the rank of \(X_{tj}\) among \(X_{1j}, \ldots , X_{Tj}\). The estimator of Oh and Patton (2013) is based on the vector of statistics \({\varvec{\nu }}_T = (\rho _T^\textrm{Sp}, \lambda _T^{.05}, \lambda _T^{.10}, \lambda _T^{.90}, \lambda _T^{.95})\), where \(\rho _T^\textrm{Sp}\) is Spearman’s rank correlation and
where \(C_T\) is the bivariate empirical copula. Recall that
Letting \({\varvec{\nu }}_S({\varvec{\theta }})\) be a version of \({\varvec{\nu }}_T\) based on a sample of size S from \(C_{\varvec{\theta }}\), the estimator of Oh and Patton (2013) is
As suggested by Oh and Patton (2013), the weight matrix is set to \(M_T = \textrm{I}_5\).
4.4.1 One-parameter families
The performances of \(\widehat{\theta }_\textrm{ML}\), \(\widehat{\theta }_\textrm{OP}\) and \(\widehat{\theta }\) have been compared under the four families of Table 1. The maximization of the pseudo-likelihood for the computation of \(\widehat{\theta }_\textrm{ML}\) uses the MATLAB procedure fminsearchbnd. To increase the numerical stability, the computation of Gumbel’s density uses the finite-difference approximation \(c_\theta ^\textrm{Gu}(u_1,u_2) \approx C_\theta ^\textrm{Gu}(u_1+\epsilon ,u_2+\epsilon ) + C_\theta ^\textrm{Gu}(u_1,u_2) - C_\theta ^\textrm{Gu}(u_1+\epsilon ,u_2) - C_\theta ^\textrm{Gu}(u_1,u_2+\epsilon )\), where \(\epsilon = 1\times 10^{-8}\). The four copula models are parametrized in terms of Kendall’s tau \(\tau _C \in [0,1]\) and the optimum is searched inside [.01, .99] with the initial value \(x_0 = 1/2\). The results on the relative bias and relative root mean-squared error are found in Table 4 when \(T=100\). The number of simulated samples has been set to \(S=500\).
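The finite-difference device can be sketched in Python as follows. Here the difference quotient is divided by \(\epsilon ^2\) to recover the density itself (a constant factor that is immaterial when only the argmax of the log-likelihood is sought), and a larger \(\epsilon \) is used because the quotient is formed explicitly; \(\alpha = 1/(1-\tau )\) is the standard Gumbel parameter under the text's parametrization:

```python
import numpy as np

def gumbel_cdf(u, v, alpha):
    """Gumbel copula with its standard parameter alpha >= 1; under the
    Kendall's-tau parametrization of the text, alpha = 1/(1 - tau)."""
    s = (-np.log(u)) ** alpha + (-np.log(v)) ** alpha
    return np.exp(-s ** (1.0 / alpha))

def gumbel_density_fd(u, v, alpha, eps=1e-5):
    """Finite-difference approximation of c = d2C/(du dv).  Dividing by
    eps^2 recovers the density itself; this constant factor drops out of
    the maximization of the log-likelihood."""
    num = (gumbel_cdf(u + eps, v + eps, alpha) + gumbel_cdf(u, v, alpha)
           - gumbel_cdf(u + eps, v, alpha) - gumbel_cdf(u, v + eps, alpha))
    return num / eps ** 2

# At alpha = 1, the Gumbel copula is the independence copula C(u, v) = uv,
# whose density is identically equal to 1.
dens = gumbel_density_fd(0.3, 0.7, 1.0)
```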
First note that the estimator \(\widehat{\theta }_\textrm{OP}\) of Oh and Patton (2013) is significantly more biased than \(\widehat{\theta }\), except when \(\tau _C=1/4\) for the Normal and Frank copulas. This difference in accuracy is also reflected in their respective RRMSEs. As expected, the pseudo-maximum likelihood (PML) estimator generally performs better than the simulated method-of-moments (SMM) estimator in terms of RRMSE. However, the average of the relative efficiency over the twelve models is 87.8%, so the loss of efficiency from using \(\widehat{\theta }\) instead of \(\widehat{\theta }_\textrm{ML}\) is small. Interestingly, the relative bias of the SMM estimator is often smaller than that of the PML estimator when the level of dependence is small or moderate, i.e., when \(\tau _C \in \{1/4, 1/2\}\). Overall, the SMM estimator can be safely recommended when the use of the PML estimator is problematic, e.g., when the density is not tractable and/or when there is a large number of parameters to estimate.
4.4.2 The two-parameter chi-square copula
As defined by Bárdossy (2006) and later investigated by Quessy et al. (2016), the d-variate chi-square copula is the dependence structure of \(\textbf{X}= ( (Z_1+\gamma _1)^2, \ldots , (Z_d+\gamma _d)^2 )\), where \((Z_1, \ldots , Z_d)\) is d-variate normal with zero means, unit variances and correlation \(\varSigma \), and \(\gamma _1, \ldots , \gamma _d \in [0,\infty )\) are non-centrality parameters. One recovers the Normal copula at the limit when \(\gamma _1 = \cdots = \gamma _d \rightarrow \infty \). Unlike the Normal, the chi-square copula is radially asymmetric.
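A Python sketch of sampling pseudo-observations from this copula, together with a check of the Normal-copula limit for large \(\gamma \) (for the Normal copula, \(\tau = (2/\pi )\arcsin \theta \)); the function names are illustrative:

```python
import numpy as np

def chi_square_pseudo_obs(rho, gamma, n, rng):
    """Pseudo-observations from the bivariate chi-square copula: the
    dependence structure of ((Z1 + gamma)^2, (Z2 + gamma)^2), where
    (Z1, Z2) is standard bivariate normal with correlation rho."""
    cov = [[1.0, rho], [rho, 1.0]]
    z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    x = (z + gamma) ** 2
    ranks = np.argsort(np.argsort(x, axis=0), axis=0) + 1.0
    return ranks / (n + 1.0)

def tau_T(u):
    """Empirical Kendall's tau of a bivariate sample without ties."""
    a = np.sign(u[:, 0][:, None] - u[:, 0][None, :])
    b = np.sign(u[:, 1][:, None] - u[:, 1][None, :])
    n = len(u)
    return np.sum(a * b) / (n * (n - 1))

# For large gamma, x -> (x + gamma)^2 is effectively monotone on the
# relevant range, so the copula approaches the Normal copula, for which
# Kendall's tau equals (2/pi) * arcsin(rho).
rng = np.random.default_rng(0)
u = chi_square_pseudo_obs(0.7, 10.0, 1500, rng)
tau_limit = 2.0 / np.pi * np.arcsin(0.7)
```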
Table 5 reports the estimated relative bias and relative root mean-squared error of \((\widehat{\gamma }_\textrm{ML},\widehat{\theta }_\textrm{ML})\), \((\widehat{\gamma }_\textrm{OP},\widehat{\theta }_\textrm{OP})\) and \((\widehat{\gamma },\widehat{\theta })\) for the bivariate chi-square copula with \(\gamma _1 = \gamma _2 = \gamma \) when \(T=100\) and \(S=100\). In that case, the density is
where \(\phi _\theta \) is the density of the bivariate normal distribution with zero means, unit variances and correlation \(\theta \in [-1,1]\), and for \(G_\gamma (x) = \varPhi (\sqrt{x}-\gamma ) + \varPhi (\sqrt{x}+\gamma ) - 1\), \(h_\gamma (u) = \{ G_\gamma ^{-1}(u) \}^{1/2}\) and \(D_\gamma (u) = \phi \{h_\gamma (u)-\gamma \} + \phi \{h_\gamma (u)+\gamma \}\).
For the estimation of the non-centrality parameter \(\gamma \), one observes that the accuracy of the three estimators, both in terms of RB and RRMSE, increases as \(\gamma \) increases. For a given value of \(\gamma \), their accuracy also increases as \(\tau _C\) increases, except when \(\gamma =2\). Looking at the relative performance of the three estimators, the PML estimator is clearly the best. Overall, \(\widehat{\gamma }_\textrm{OP}\) is systematically, although slightly, more accurate than \(\widehat{\gamma }\). Turning to the estimation of the dependence parameter \(\theta \), \(\widehat{\theta }\) stands out positively from its two competitors when \(\gamma \in \{ 3/2, 2 \}\). When \(\gamma \in \{1/2, 1 \}\), there is no clear trend as to which estimator performs best overall. Generally speaking, \(\widehat{\theta }\) is the best when \(\tau _C = 1/4\), \(\widehat{\theta }_\textrm{OP}\) when \(\tau _C = 1/2\) and \(\widehat{\theta }_\textrm{ML}\) when \(\tau _C = 3/4\).
4.5 Multivariate models
4.5.1 The Archimedean family
A d-dimensional copula is a member of the Archimedean family if it can be expressed in the form \(C_\varPsi ({\textbf{u}}) = \varPsi \left\{ \varPsi ^{-1}(u_1) + \cdots + \varPsi ^{-1}(u_d) \right\} \), where \(\varPsi : [0,\infty ) \rightarrow [0,1]\) is called the generator and satisfies \((-1)^j \, \varPsi ^{[j]} \ge 0\) for each \(j \in \{1,\ldots ,d\}\), where \(\varPsi ^{[j]}(t) = \partial ^j \varPsi (t) / \partial t^j\). See McNeil and Nešlehová (2009) for more details. The Clayton, Gumbel and Frank copulas, whose bivariate versions are detailed in Table 1, are particular cases of this class, with respective generators \(\varPsi _\theta ^{\textrm{C}\ell }(t) = (\theta t + 1)^{-1/\theta }\), \(\varPsi _\theta ^\textrm{Gu}(t) = e^{-t^{1/\theta }}\) and \(\varPsi _\theta ^\textrm{Fr}(t) = -\theta ^{-1} \log \left\{ 1 - \left( 1-e^{-\theta } \right) e^{-t} \right\} \).
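For illustration, the Archimedean form can be evaluated directly from a generator and its inverse; the minimal sketch below (our own helper names, using the Clayton generator as parametrized in this section) is not the paper's code:

```python
import numpy as np

def clayton_psi(t, theta):
    """Clayton generator Psi(t) = (theta*t + 1)^(-1/theta), theta > 0."""
    return (theta * t + 1.0) ** (-1.0 / theta)

def clayton_psi_inv(u, theta):
    """Inverse generator: Psi^{-1}(u) = (u^{-theta} - 1) / theta."""
    return (u ** (-theta) - 1.0) / theta

def archimedean_cdf(u, psi, psi_inv):
    """Evaluate C_Psi(u) = Psi( Psi^{-1}(u_1) + ... + Psi^{-1}(u_d) )."""
    return psi(np.sum([psi_inv(v) for v in u]))

# trivariate Clayton copula with theta = 2
C = lambda u: archimedean_cdf(u, lambda t: clayton_psi(t, 2.0),
                              lambda v: clayton_psi_inv(v, 2.0))
```

Setting any argument to 1 leaves the others unchanged, since \(\varPsi^{-1}(1)=0\), which is a quick correctness check on the implementation.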
If it is assumed that the d-variate copula of a population belongs to a given parametric Archimedean family with \({\varvec{\theta }}\in \varTheta \subset \mathbb {R}^L\), then \({\varvec{\theta }}_0\) can be estimated by the simulated method-of-moments estimator in (4) based on the first L moments of W. Alternatively, since \({\varvec{\theta }}_0\) appears in the distribution of any possible pair of variables, another estimator similar to the one in (5) is
Table 6 reports the performance of \(\widehat{\theta }\) and \(\widehat{\theta }^\star \) in terms of relative bias and RRMSE when \(T=100\) and \(S=250\) for the multivariate one-parameter Clayton, Gumbel and Frank copulas in dimensions \(d \in \{ 3, 4, 5 \}\). First observe that the RRMSE decreases as the dimension d increases. Note also the drastic decrease in RRMSE as Kendall’s tau passes from 1/4 to 1/2, then a slightly smaller decrease for the passage from 1/2 to 3/4. Note also that in most cases, \(\widehat{\theta }\) is less biased than \(\widehat{\theta }^\star \). However, the opposite is true when looking at the RRMSEs, although the differences are minimal. In short, it can be concluded that both estimation strategies work well.
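To make the estimation strategy concrete, here is a hypothetical end-to-end sketch for a trivariate one-parameter Clayton copula: the empirical moments of the MPIT variable W are matched against moments computed from samples simulated at candidate parameter values. A crude grid search stands in for the numerical minimization used in the paper, and all names are ours:

```python
import numpy as np

def mpit_moments(U, L):
    """First L empirical moments of the MPIT variable W = C(U): for each
    observation, W is estimated by the fraction of sample points lying
    componentwise below it."""
    below = (U[None, :, :] <= U[:, None, :]).all(axis=2)   # T x T indicator
    W = below.mean(axis=1)
    return np.array([np.mean(W ** a) for a in range(1, L + 1)])

def clayton_sample(theta, n, d, seed):
    # Marshall-Olkin frailty sampler for the generator (theta*t + 1)^(-1/theta):
    # Psi is the Laplace transform of a Gamma(1/theta, scale=theta) frailty V
    rng = np.random.default_rng(seed)
    V = rng.gamma(1.0 / theta, scale=theta, size=(n, 1))
    E = rng.exponential(size=(n, d))
    return (theta * E / V + 1.0) ** (-1.0 / theta)

# "data": trivariate Clayton with theta0 = 2, T = 500
U = clayton_sample(2.0, 500, 3, seed=0)
mu_T = mpit_moments(U, L=1)

# simulated method-of-moments over a parameter grid, S = 500 per candidate
thetas = np.linspace(0.2, 8.0, 40)
objs = [np.sum((mu_T - mpit_moments(clayton_sample(t, 500, 3, seed=1), 1)) ** 2)
        for t in thetas]
theta_hat = thetas[np.argmin(objs)]
```

Fixing the simulation seed across candidate values keeps the objective deterministic in \(\theta\), the usual common-random-numbers device for simulated method-of-moments.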
4.5.2 Elliptical copulas and their squared versions
Elliptical distributions are parametrized in a pair-by-pair fashion. Specifically, as initially defined by Cambanis et al. (1981), a vector \(\textbf{X}\in \mathbb {R}^d\) is said to follow an elliptically contoured distribution if it admits the stochastic representation \(\textbf{X}= R A \mathcal U\), where \(R>0\) is the radial random variable, \(\varSigma = A^\top A \in \mathbb {R}^{d\times d}\) is symmetric and positive definite and \(\mathcal U\) is uniformly distributed on the unit sphere in \(\mathbb {R}^d\). An elliptical copula is simply the copula extracted from an elliptical distribution, as first investigated by Fang et al. (2002). Elliptical copulas thus inherit the pairwise parametrization of elliptical distributions.
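The stochastic representation above yields a generic sampler. The sketch below (our own function names; A is taken as a Cholesky factor of \(\varSigma\), which is one valid choice) accepts any simulator for the radial variable R:

```python
import numpy as np

def sample_elliptical(T, Sigma, radial, seed=0):
    """Draws from X = R * A * U, with A a Cholesky factor of Sigma and U
    uniform on the unit sphere; radial(T, rng) returns T draws of R > 0."""
    rng = np.random.default_rng(seed)
    d = Sigma.shape[0]
    A = np.linalg.cholesky(Sigma)
    Z = rng.standard_normal((T, d))
    U = Z / np.linalg.norm(Z, axis=1, keepdims=True)   # uniform on the sphere
    R = radial(T, rng)
    return R[:, None] * (U @ A.T)

Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
# taking R^2 ~ chi-square(d) recovers the multivariate normal N(0, Sigma)
X = sample_elliptical(20000, Sigma,
                      lambda T, rng: np.sqrt(rng.chisquare(Sigma.shape[0], T)))
```

Swapping in another radial law (e.g. one yielding Student or Laplace margins) gives the other elliptical families discussed in this section.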
Apart from the Normal copula, a popular elliptical model is Student's copula with \(\gamma \in (0,\infty )\) degrees of freedom and parameter \(\theta \in (-1,1)\), which can be expressed implicitly by \(C_{\gamma ,\theta }(u_1,u_2) = \varOmega _{\gamma ,\theta } \left\{ \varOmega _\gamma ^{-1}(u_1), \varOmega _\gamma ^{-1}(u_2) \right\} \), where \(\varOmega _\gamma \) is the cumulative distribution function of the univariate Student distribution and \(\varOmega _{\gamma ,\theta }\) is that of the bivariate Student distribution. Another model is the generalized Laplace copula (see Kozubowski et al. 2013, for details), extracted from the multivariate distribution whose density is
with \(K_\lambda \) the modified Bessel function of index \(\lambda \).
As is well known, Kendall's tau of a bivariate elliptical copula is given by the simple formula \(\tau _C(\theta ) = (2/\pi ) \sin ^{-1}\theta \), whatever the form of the radial random variable (see Fang et al. 2002, for instance). Hence, \({\mathbb {E}}(W) = \{ \tau _C(\theta ) + 1 \} / 4\) does not depend on R. As stated in the next proposition, this is indeed the case for any moment of order \(a \in \mathbb {N}\) of W. This theoretical result is illustrated in the two top panels of Fig. 1, where one can find the estimated curves of \({\mathbb {E}}(W^2)\) as a function of \(\theta \in (0,1)\) for various values of \(\gamma \) in the case of the Student and Laplace copulas.
Proposition 6
The moment of order \(a \in \mathbb {N}\) of an elliptical copula characterized by some radial variable R does not depend on R.
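Proposition 6 can be checked by Monte Carlo: rank-based estimates of \({\mathbb {E}}(W^2)\) computed from Student copula samples should not vary with the degrees of freedom. A small sketch under these assumptions (our own helpers, not the paper's code):

```python
import numpy as np

def empirical_w_moment(X, a):
    """Rank-based estimate of E(W^a), with W = F(X) the MPIT variable."""
    below = (X[None, :, :] <= X[:, None, :]).all(axis=2)
    return np.mean(below.mean(axis=1) ** a)

def sample_bivariate_student(T, rho, df, rng):
    """X = Z / sqrt(Y/df), Z bivariate normal with correlation rho, Y ~ chi2(df)."""
    L = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
    Z = rng.standard_normal((T, 2)) @ L.T
    return Z / np.sqrt(rng.chisquare(df, size=(T, 1)) / df)

rng = np.random.default_rng(0)
m2 = [empirical_w_moment(sample_bivariate_student(4000, 0.5, df, rng), 2)
      for df in (3, 10, 200)]
# Proposition 6: the three estimates agree up to Monte Carlo error
```

The estimate is rank-based, hence marginal-free, so the comparison isolates the copula and the invariance in R is exactly what Proposition 6 predicts.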
In the light of Proposition 6, it is not possible to estimate \((\gamma ,\theta )\) with the SMM estimator based on the MPIT random variable. It is however possible to estimate the parameters of the so-called squared version of an elliptical copula. As defined by Quessy and Durocher (2019), the squared copula associated to a d-variate copula C is the distribution of \((|1-2U_1|, \ldots , |1-2U_d|)\) when \((U_1, \ldots , U_d) \sim C\). When C is an elliptical copula, the resulting squared construction has a pair-by-pair parametric structure \(\varSigma \in \mathbb {R}^{d\times d}\) and a global parameter \({\varvec{\gamma }}\in \mathbb {R}^q\). See the bottom panels of Fig. 1 for \({\mathbb {E}}(W^2)\) as a function of \(\theta \) for the squared–Student (introduced by Favre et al. (2018) as the Fisher copula) and squared–Laplace copulas.
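The squared-copula transformation itself is a one-liner; applied to pseudo-observations from a Student copula it yields a sample from the Fisher copula. A minimal sketch (our own names):

```python
import numpy as np

def to_squared_copula(U):
    """Map a sample (U_1,...,U_d) from a copula C to a sample from its
    squared version: (|1 - 2U_1|, ..., |1 - 2U_d|)."""
    return np.abs(1.0 - 2.0 * np.asarray(U))

# |1 - 2U| is again Uniform(0,1) when U is, so the output has uniform
# margins and is a valid copula sample (here: the independence case)
rng = np.random.default_rng(0)
V = to_squared_copula(rng.uniform(size=(100000, 2)))
```

Because the map \(u \mapsto |1-2u|\) folds each margin, the construction suppresses the radial information that Proposition 6 shows is invisible to the moments of W, which is precisely why \((\gamma, \theta)\) becomes estimable for squared versions.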
As explained in Sect. 3.3, the entries of \(\varSigma \) can be estimated using the first \(q+1\) moments of \(W_{jj'}\) for each \(j<j' \in \{1, \ldots , d\}\), and \({\varvec{\gamma }}\) can then be estimated with the mean of \(\widehat{{\varvec{\gamma }}}_{jj'}\), \(j<j' \in \{1, \ldots , d\}\), i.e., with the estimator in (5). The results in Table 7 concern the Fisher and squared–Laplace copulas in dimension \(d \in \{ 3, 4, 5 \}\) when \(T=100\) and \(S=100\). To simplify the presentation of the results, the correlation matrix \(\varSigma \in \mathbb {R}^{d\times d}\) has been taken equicorrelated in such a way that \(\varSigma _{jj'} = \sin (\pi \tau _C /2)\) for each \(j \ne j' \in \{1, \ldots , d\}\), where \(\tau _C \in \{ 1/4,1/2,3/4\}\). From experiments not reported here, considering more general correlation matrices has a negligible influence on the performance of the estimator.
Looking at the results in Table 7, it can be seen that the performance of \(\widehat{\gamma }\) is roughly equivalent across the three dimensions considered. For the Fisher copula, the accuracy increases significantly as \(\tau _C\) increases when \(\gamma \in \{3,6\}\); when \(\gamma =10\), the estimator is most accurate when \(\tau _C=1/2\). Turning to the squared–Laplace copula, the accuracy of the estimator in relation to the level of Kendall's tau depends on \(\gamma \): whereas the accuracy increases with \(\tau _C\) when \(\gamma =1\), there is no clear trend when \(\gamma \in \{3,5\}\).
5 Data analysis of hockey data
Ice hockey is a fast-paced team game played continuously, and measuring a player's quality with appropriate indicators is a real challenge. For the illustration that follows, the \(T=410\) forwards who played at least 16,000 s in the 2019–2020 season of the National Hockey League (NHL) have been considered. Five variables, namely \(X_1\): Points, \(X_2\): Expected goals (xg) with a player on, \(X_3\): Playing in attack, \(X_4\): Scoring chances and \(X_5\): Number of shots, have been selected to characterize their offensive skills. Each variable has been rescaled to a block of 60 min. The pairwise scatterplots of the raw data and of the standardized ranks are found in Fig. 2.
Looking at Fig. 2, a radially asymmetric dependence structure with more weight in the upper tail seems to emerge. This is confirmed by the test of radial symmetry of Bahraoui and Quessy (2017) based on the copula characteristic function with the normal weight and smoothing parameter \(\sigma = 1\): the p-value of the test, as estimated from 10,000 multiplier bootstrap samples, is 4.61%. For this reason, parameter estimation has been performed for the following seven radially asymmetric copula families:
(i) The one-parameter survival-Clayton and Gumbel Archimedean copulas;
(ii) The chi-square copula with non-centrality parameter \(\gamma \in [0,\infty )\);
(iii) The squared versions of the Student, Laplace and Pearson type II copulas;
(iv) A special case of the skew–Student copula as defined by Demarta and McNeil (2005), i.e., the dependence structure of \(\textbf{X}= {\textbf{Z}}/ \sqrt{Y} + \gamma _2 \, \textbf{1}_d / Y\), where \({\textbf{Z}}\) is standard normal with correlation matrix \(\varSigma \in \mathbb {R}^{d\times d}\), \(\gamma _1 \, Y\) is chi-square with \(\gamma _1 \ge 1\) degrees of freedom, \(\gamma _2 \in \mathbb {R}\) is an asymmetry parameter and \(\textbf{1}_d = (1, \ldots , 1)\).
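For model (iv), the stochastic representation translates directly into a sampler; the bivariate sketch below (our own names, not the paper's code) draws vectors whose copula is the skew–Student model:

```python
import numpy as np

def sample_skew_student(T, Sigma, g1, g2, seed=0):
    """Draws whose copula is the skew-Student model (iv):
    X = Z / sqrt(Y) + g2 * 1/Y, with Z ~ N(0, Sigma) and g1 * Y ~ chi2(g1)."""
    rng = np.random.default_rng(seed)
    d = Sigma.shape[0]
    Z = rng.multivariate_normal(np.zeros(d), Sigma, size=T)
    Y = rng.chisquare(g1, size=(T, 1)) / g1
    return Z / np.sqrt(Y) + g2 / Y

Sigma = np.array([[1.0, 0.4], [0.4, 1.0]])
X = sample_skew_student(2000, Sigma, g1=5.0, g2=1.0)
```

The common mixing variable Y both thickens the tails and, through the \(\gamma_2 \textbf{1}_d / Y\) term, skews the dependence toward the upper corner, which is the feature sought for the hockey data.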
The results of the estimation based on the simulated method-of-moments estimator performed with \(S=250\) are in Table 8. As a criterion for choosing an appropriate model among the seven copula families, the ability of a model to reproduce Kendall's matrix has been considered. Specifically, a sample of size \(T=2,500\) has been simulated from each estimated model and the Frobenius matrix distance between the sample Kendall matrix \(K_T\) of the data and that of the simulated sample has been computed. For the Hockey data,
Looking at the results in the third column of Table 8, the chi-square (\(\chi ^2\)) and skew–Student (Sk) copulas stand out among the seven models considered. To assess whether these models adequately reproduce the observed data, artificial samples of size \(T=410\) have been simulated from both copula models at the estimated parameter values. The corresponding estimates of \(\varSigma \) are
Note that \(\widehat{\varSigma }^\textrm{Sk}\) results from a transformation due to Higham (2002) to make it positive definite. The resulting samples have then been put on the same scales as the raw data by taking the empirical percentiles. Their scatterplots (raw data and standardized ranks) are found in Fig. 3 and Fig. 4, respectively. Whereas the chi-square copula is better at reproducing Kendall’s matrix, the skew–Student copula seems better at reproducing the upper tail behaviour of the dependence structure.
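The positive-definite adjustment mentioned above can be sketched with the alternating-projections idea behind Higham (2002); the following is a simplified illustration of that idea, not the exact algorithm of the paper:

```python
import numpy as np

def nearest_correlation(A, n_iter=500):
    """Alternating-projections sketch in the spirit of Higham (2002):
    alternate projections onto the PSD cone and the unit-diagonal set,
    with Dykstra's correction applied to the PSD projection."""
    Y = A.copy()
    dS = np.zeros_like(A)
    for _ in range(n_iter):
        R = Y - dS
        w, V = np.linalg.eigh((R + R.T) / 2.0)
        X = (V * np.maximum(w, 0.0)) @ V.T      # projection onto PSD matrices
        dS = X - R
        Y = X.copy()
        np.fill_diagonal(Y, 1.0)                # projection onto unit diagonal
    return Y

# an indefinite "pseudo-correlation" matrix, e.g. built from pairwise estimates
A = np.array([[1.0, 0.9, 0.7],
              [0.9, 1.0, -0.3],
              [0.7, -0.3, 1.0]])
C = nearest_correlation(A)
```

Such an adjustment is needed whenever \(\varSigma\) is assembled from pairwise estimates, since nothing guarantees that the assembled matrix is positive definite.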
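The model-selection criterion used in this section, the Frobenius distance between the sample Kendall matrix of the data and that of a sample simulated from a fitted model, can be sketched as follows (our own helper names; a simple \(O(T^2)\) tau suffices for continuous data):

```python
import numpy as np

def kendall_tau(x, y):
    """Sample Kendall's tau via all pairwise sign comparisons, O(T^2)."""
    dx = np.sign(x[:, None] - x[None, :])
    dy = np.sign(y[:, None] - y[None, :])
    T = len(x)
    return np.sum(dx * dy) / (T * (T - 1))

def kendall_matrix(X):
    """Matrix of pairwise sample Kendall's taus."""
    d = X.shape[1]
    K = np.eye(d)
    for j in range(d):
        for k in range(j + 1, d):
            K[j, k] = K[k, j] = kendall_tau(X[:, j], X[:, k])
    return K

def frobenius_criterion(X_data, X_model):
    """Frobenius distance between data and model Kendall matrices."""
    return np.linalg.norm(kendall_matrix(X_data) - kendall_matrix(X_model))

rng = np.random.default_rng(0)
L = np.linalg.cholesky(np.array([[1.0, 0.5], [0.5, 1.0]]))
X_data = rng.standard_normal((500, 2)) @ L.T
X_model = rng.standard_normal((500, 2))     # a (wrong) independence model
score = frobenius_criterion(X_data, X_model)
```

A small score means the fitted model reproduces the pairwise concordance pattern of the data; being rank-based, the criterion is unaffected by the marginal rescaling of the simulated samples.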
6 Conclusion
This paper developed a general parameter estimation procedure for multivariate copula models. The proposed estimators are based on the moments of the multivariate probability integral transformation (MPIT) and thus generalize the inversion of Kendall's tau estimator. On the one hand, moments of order greater than one are considered, which makes estimation possible in multi-parameter models. On the other hand, the use of simulated moments makes the methodology applicable as soon as it is possible to simulate from a given parametric model. Compared to Brahimi and Necir (2012), the proposed estimators are not restricted to the (few) cases where the mapping induced by the theoretical moments is explicitly invertible. Oh and Patton (2013) also developed simulated method-of-moments estimators for copula models. Two advantages over their method are that (i) it is no longer necessary to base the estimation on pairwise dependence measures and (ii) the number of moments can match the number of parameters of the assumed parametric copula model.
Knowing how to estimate parameters in dimensions \(d>2\) and in multi-parameter models is important, especially in the increasingly prevalent context of big data. However, the applicability of the pseudo-maximum likelihood estimator (i.e., any dimension d, any number of parameters to estimate) is more theoretical than practical: even in cases where an explicit (and numerically tractable) copula density is available, the PML estimator can be computationally very intensive, especially when d is large. The flexibility of the method introduced in this work therefore appears to be of great interest for complex multi-parameter dependence models, such as the skew–Student copulas introduced by Demarta and McNeil (2005), the factor copulas investigated for instance by Krupskii and Joe (2013), Krupskii and Joe (2015) and Mazo et al. (2016), as well as vine copulas (see Czado 2019).
References
Andrews DWK (1994) Asymptotics for semiparametric econometric models via stochastic equicontinuity. Econometrica 62:43–72
Bahraoui T, Quessy J-F (2017) Tests of radial symmetry for multivariate copulas based on the copula characteristic function. Electron J Stat 11:2066–2096
Bárdossy A (2006) Copula-based geostatistical models for groundwater quality parameters. Water Resour Res 42:1–12
Biau G, Wegkamp M (2005) A note on minimum distance estimation of copula densities. Stat Probab Lett 73:105–114
Brahimi B, Necir A (2012) A semiparametric estimation of copula models based on the method of moments. Stat Methodol 9:467–477
Cambanis S, Huang S, Simons G (1981) On the theory of elliptically contoured distributions. J Multivar Anal 11:368–385
Cherubini U, Luciano E, Vecchiato W (2004) Copula methods in finance. Wiley, Chichester
Czado C (2019) Analyzing dependent data with vine copulas: a practical guide with R. Lecture notes in statistics, vol 222. Springer, Cham
Demarta S, McNeil AJ (2005) The t copula and related copulas. Int Stat Rev 73:111–129
Fang H-B, Fang K-T, Kotz S (2002) The meta-elliptical distributions with given marginals. J Multivar Anal 82:1–16
Favre A-C, Quessy J-F, Toupin M-H (2018) The new family of Fisher copulas to model upper tail dependence and radial asymmetry: properties and application to high-dimensional rainfall data. Environmetrics 29(e2494):17
Genest C, Ghoudi K, Rivest L-P (1995) A semiparametric estimation procedure of dependence parameters in multivariate families of distributions. Biometrika 82:543–552
Genest C, Nešlehová J, Ben Ghorbal N (2011) Estimators based on Kendall’s tau in multivariate copula models. Aust N Z J Stat 53:157–177
Genest C, Quessy J-F, Rémillard B (2006) Goodness-of-fit procedures for copula models based on the probability integral transformation. Scand J Stat 33:337–366
Genest C, Rivest L-P (1993) Statistical inference procedures for bivariate Archimedean copulas. J Am Stat Assoc 88:1034–1043
Gouriéroux C, Monfort A, Renault E (1996) Two-stage generalized moment method with applications to regressions with heteroscedasticity of unknown form. J Stat Plann Inference 50:37–63
Higham NJ (2002) Computing the nearest correlation matrix-a problem from finance. IMA J Numer Anal 22:329–343
Joe H (1990) Multivariate concordance. J Multivar Anal 35:12–30
Joe H (2015) Dependence modeling with copulas. Monographs on statistics and applied probability. CRC Press, Boca Raton
Kendall MG, Smith BB (1940) On the method of paired comparisons. Biometrika 31:324–345
Kozubowski TJ, Podgórski K, Rychlik I (2013) Multivariate generalized Laplace distribution and related random fields. J Multivar Anal 113:59–72
Krupskii P, Joe H (2013) Factor copula models for multivariate data. J Multivar Anal 120:85–101
Krupskii P, Joe H (2015) Structured factor copula models: theory, inference and computation. J Multivar Anal 138:53–73
Lee AJ (1990) U-statistics: theory and practice. Statistics: textbooks and monographs. Marcel Dekker Inc, New York
Mazo G, Girard S, Forbes F (2016) A flexible and tractable class of one-factor copulas. Stat Comput 26:965–979
McFadden D (1989) A method of simulated moments for estimation of discrete response models without numerical integration. Econometrica 57:995–1026
McNeil AJ, Nešlehová J (2009) Multivariate Archimedean copulas, \(d\)-monotone functions and \(l_1\)-norm symmetric distributions. Ann Stat 37:3059–3097
Nelsen RB (2007) Extremes of nonexchangeability. Stat Pap 48:329–336
Newey WK, McFadden D (1994) Large sample estimation and hypothesis testing. In: Handbook of econometrics, vol IV. North-Holland, Amsterdam, pp 2111–2245
Oakes D (1994) Multivariate survival distributions. J Nonparametr Stat 3:343–354
Oh DH, Patton AJ (2013) Simulated method of moments estimation for copula-based multivariate models. J Am Stat Assoc 108:689–700
Pakes A, Pollard D (1989) Simulation and the asymptotics of optimization estimators. Econometrica 57:1027–1057
Quessy J-F (2009) Theoretical efficiency comparisons of independence tests based on multivariate versions of Spearman’s rho. Metrika 70:315–338
Quessy J-F, Durocher M (2019) The class of copulas arising from squared distributions: properties and inference. Econom Stat 12:148–166
Quessy J-F, Rivest L-P, Toupin M-H (2016) On the family of multivariate chi-square copulas. J Multivar Anal 152:40–60
Shih JH, Louis TA (1995) Inferences on the association parameter in copula models for bivariate survival data. Biometrics 51:1384–1399
Sklar A (1959) Fonctions de répartition à \(n\) dimensions et leurs marges. Publ Inst Stat Univ Paris 8:229–231
Tsukahara H (2005) Semiparametric estimation in copula models. Can J Stat 33:357–375
van der Vaart AW (1998) Asymptotic statistics, Cambridge series in statistical and probabilistic mathematics, vol 34. Cambridge University Press, Cambridge
Weiß G (2011) Copula parameter estimation by maximum-likelihood and minimum-distance estimators: a simulation study. Comput Stat 26:31–54
Acknowledgements
The authors acknowledge financial support by individual grants from the Natural Sciences and Engineering Research Council of Canada (NSERC).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
A Proofs
1.1 A.1 Proof of Proposition 1
The proof uses standard arguments, such as those in the statement of Theorem 4.1, p. 36, of van der Vaart (1998). By the definition of \({\varvec{\theta }}_T^{\varvec{\kappa }}\) and a first-order Taylor expansion, \({\varvec{\mu }}_T^{\varvec{\kappa }}= {\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }}_T^{\varvec{\kappa }}) = {\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }}_0) + ({\varvec{\mu }}^{\varvec{\kappa }})'({\varvec{\theta }}^\star ) ({\varvec{\theta }}_T^{\varvec{\kappa }}- {\varvec{\theta }}_0)\), where \({\varvec{\theta }}^\star \) lies between \({\varvec{\theta }}_T^{\varvec{\kappa }}\) and \({\varvec{\theta }}_0\). One can then write \(({\varvec{\mu }}^{\varvec{\kappa }})'({\varvec{\theta }}^\star ) \sqrt{T} ({\varvec{\theta }}_T^{\varvec{\kappa }}- {\varvec{\theta }}_0) = \sqrt{T} ( {\varvec{\mu }}_T^{\varvec{\kappa }}- {\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }}_0) )\). The stated result holds in view of Lemma 1 and since \(({\varvec{\mu }}^{\varvec{\kappa }})'({\varvec{\theta }}^\star )\) converges in probability to \(({\varvec{\mu }}^{\varvec{\kappa }})'({\varvec{\theta }}_0) = {\varvec{\nu }}_0^{\varvec{\kappa }}\).
\(\square \)
1.2 A.2 Proof of Proposition 2
The difference here compared to standard method-of-moments techniques is the fact that the objective function is not continuous as a function of \({\varvec{\theta }}\) due to the estimated function \({\varvec{\mu }}_S({\varvec{\theta }})\). Basically, one has to appeal to the notion of stochastic equicontinuity in order to have some sort of uniform convergence for the objective function. Specifically, the proof consists in verifying the conditions of Theorem 2.1 of Newey and McFadden (1994) that ensure the consistency of minimum distance estimators of the form
as long as \(\varTheta \) is compact and the objective function \(\widehat{Q}({\varvec{\theta }})\) converges uniformly in probability to a function \(Q_0({\varvec{\theta }})\) that is continuous and uniquely minimized at \({\varvec{\theta }}_0\).
In the case of the proposed simulated method-of-moments estimator, one has \(\widehat{Q}:= g_{T,S}\, M_T \, g_{T,S}^\top \), where \(g_{T,S}({\varvec{\theta }}) = {\varvec{\mu }}_T^{\varvec{\kappa }}- {\varvec{\mu }}_S^{\varvec{\kappa }}({\varvec{\theta }})\). Since \({\mathbb {E}}\{ |{\varvec{\kappa }}(\textbf{X}_1,\ldots ,\textbf{X}_m)| \} < \infty \), \(g_{T,S}({\varvec{\theta }})\) converges in probability to \(g_0({\varvec{\theta }}) = {\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }}_0) - {\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }})\) as \(T,S\rightarrow \infty \) for each \({\varvec{\theta }}\in \varTheta \). Also, the assumption that \(\mu ^{\varvec{\kappa }}\) is continuous entails the continuity of \(Q_0:= g_0 \, M_0 \, g_0^\top \). Moreover, since \({\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }}_0) \ne {\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }})\) as soon as \({\varvec{\theta }}\ne {\varvec{\theta }}_0\), the function \(Q_0\) is uniquely minimized at \({\varvec{\theta }}_0\). It then remains to establish the uniform convergence in probability of \(\widehat{Q}\) to \(Q_0\). To this end, it will first be shown that \(g_{T,S}\) is stochastically equicontinuous. Because \({\varvec{\mu }}_S^{\varvec{\kappa }}({\varvec{\theta }})\) converges in probability to \(\mu ^{\varvec{\kappa }}({\varvec{\theta }})\) for any fixed \({\varvec{\theta }}\in \varTheta \), one has for any \({\varvec{\theta }}_1, {\varvec{\theta }}_2 \in \varTheta \) that
where \((R_S({\varvec{\theta }}_1), R_S({\varvec{\theta }}_2))\) converges in probability to (0, 0). Hence,
Since \({\varvec{\mu }}^{\varvec{\kappa }}\) is Lipschitz continuous, there exists \(\zeta >0\) such that \(\Vert {\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }}_2) - {\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }}_1) \Vert \le \zeta \Vert {\varvec{\theta }}_2-{\varvec{\theta }}_1\Vert \), and then one can write
where \(\varLambda _S = \zeta + \Vert R_S({\varvec{\theta }}_2)-R_S({\varvec{\theta }}_1)\Vert / \Vert {\varvec{\theta }}_2-{\varvec{\theta }}_1\Vert \). Because \(\mu _S^{\varvec{\kappa }}\) is bounded, it follows that one can find \(\xi > 0\) such that
This means that \(g_{T,S}\) is asymptotically Lipschitz continuous. Hence, \(g_{T,S}\) is stochastically equicontinuous, i.e., for all \(\epsilon ,\eta > 0\), there exists \(\delta > 0\) such that
Invoking Lemma 2.8 of Newey and McFadden (1994), the conditions are met so that, as \(S, T \rightarrow \infty \),
Since \(g_0({\varvec{\theta }})\) is bounded, \(M_T = O_p(1)\) and \(M_T\) converges in probability to \(M_0\), an application of the triangle and Cauchy–Schwarz inequalities yields
where \(K \in (0,\infty )\), \(\zeta _T = O_{\mathbb {P}}(1)\) and \(\kappa _T = o_{\mathbb {P}}(1)\). In view of (6), one can conclude that \(\widehat{Q}\) converges uniformly in probability to \(Q_0\). \(\square \)
1.3 A.3 Proof of Lemma 2
Because \(\prod _{j=1}^m b_j - \prod _{j=1}^m a_j = \sum _{j=1}^m (b_j-a_j) \prod _{k<j} a_k \prod _{k>j} b_k\), one can write
Since \({\mathbb {E}}\{ \Vert {\varvec{\kappa }}({\textbf{u}}_1,{\textbf{U}}_2,\ldots ,{\textbf{U}}_m) \Vert \} < \infty \), one can conclude that \(\mu ^{\varvec{\kappa }}\) is Lipschitz continuous. \(\square \)
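As a quick numerical sanity check (not part of the proof), the telescoping product identity used at the start of the argument can be verified on random inputs:

```python
import numpy as np

# check: prod(b) - prod(a) = sum_j (b_j - a_j) * prod_{k<j} a_k * prod_{k>j} b_k
rng = np.random.default_rng(0)
a, b = rng.random(6), rng.random(6)
lhs = np.prod(b) - np.prod(a)
rhs = sum((b[j] - a[j]) * np.prod(a[:j]) * np.prod(b[j + 1:]) for j in range(6))
```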
1.4 A.4 Proof of Proposition 3
The proof consists in verifying the conditions of Theorem 7.2 of Newey and McFadden (1994) that ensure the asymptotic normality of minimum distance estimators of the form
Specifically, \(\sqrt{T}(\widehat{\varvec{\theta }}- {\varvec{\theta }}_0)\) converges in distribution to the Normal law with mean zero and variance-covariance matrix \(\varOmega _0\) as long as
(\(\mathcal C_1\)) \(\widehat{M}\) converges in probability to a positive definite matrix \(M_0\) and \(\widehat{\varvec{\theta }}\) converges in probability to an interior point \({\varvec{\theta }}_0\) of \(\varTheta \);
(\(\mathcal C_2\)) \(\sqrt{T} \, \widehat{g}({\varvec{\theta }}_0)\) converges to the mean zero normal law with variance-covariance matrix \(\varSigma \);
(\(\mathcal C_3\)) \(\displaystyle \widehat{g}(\widehat{\varvec{\theta }})^\top \widehat{M} \widehat{g}(\widehat{\varvec{\theta }}) \le \inf _{{\varvec{\theta }}\in \varTheta } \widehat{g}({\varvec{\theta }})^\top \widehat{M} \widehat{g}({\varvec{\theta }}) + o_{\mathbb {P}}(T^{-1})\);
(\(\mathcal C_4\)) There exists a function \(g_0\) such that \(g_0({\varvec{\theta }}_0) = 0\), \(g_0\) is differentiable at \({\varvec{\theta }}_0\) with derivative \(G_0\) such that \(G_0 \, M_0 \, G_0^\top \) is nonsingular, and for any \(\delta _T \rightarrow 0\) as \(T \rightarrow \infty \),
In the case of the proposed simulated method-of-moments estimator, one has \(\widehat{M}:= M_T\) and \(\widehat{g}({\varvec{\theta }}):= {\varvec{\mu }}_T^{\varvec{\kappa }}- {\varvec{\mu }}_S^{\varvec{\kappa }}({\varvec{\theta }})\). First, (\(\mathcal C_1\)) holds by the assumption on \(M_T\) and the fact that \({\varvec{\theta }}_{T,S}^{\varvec{\kappa }}\) converges in probability to \({\varvec{\theta }}_0\), as ensured by Proposition 2. To establish (\(\mathcal C_2\)), define \(Z_{1T} = \sqrt{T} \{ {\varvec{\mu }}_T^{\varvec{\kappa }}- {\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }}_0) \}\) and \(Z_{2S} = \sqrt{S} \{ {\varvec{\mu }}_S^{\varvec{\kappa }}({\varvec{\theta }}_0) - {\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }}_0) \}\), so that \(\sqrt{T} \, g_{T,S}^{\varvec{\kappa }}({\varvec{\theta }}_0) = Z_{1T} - \sqrt{T/S} \, Z_{2S}\). From an application of Proposition 1, \(Z_{1T} \rightsquigarrow Z_1\) and \(Z_{2S} \rightsquigarrow Z_2\) for \(Z_1, Z_2\) i.i.d. \(\mathcal {N}(0,\varSigma _0)\), and then \(\sqrt{T} \, g_{T,S}^{\varvec{\kappa }}({\varvec{\theta }}_0)\) converges in distribution to \(\mathcal {N} \left( 0, (1+\zeta ) \varSigma _0 \right) \), where \(\zeta = \lim _{T,S\rightarrow \infty } T/S \in [0,\infty )\). Hence, (\(\mathcal C_2\)) holds. Condition (\(\mathcal C_3\)) holds by assumption.
Now to establish (\(\mathcal C_4\)), it will be shown that the function \(\varUpsilon _{T,S}^{\varvec{\kappa }}({\varvec{\theta }}) = \sqrt{T} \{ g_{T,S}^{\varvec{\kappa }}({\varvec{\theta }}) - g_0^{\varvec{\kappa }}({\varvec{\theta }}) \}\), where \(g_0^{\varvec{\kappa }}({\varvec{\theta }}) = {\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }}_0) - {\varvec{\mu }}^{\varvec{\kappa }}({\varvec{\theta }})\), is stochastically equicontinuous. As formalized by Andrews (1994), a function \(\nu _T\) is stochastically equicontinuous at \(\tau _0\) if for all \(\epsilon , \eta > 0\), there exists \(\delta > 0\) such that
To establish the stochastic equicontinuity of \(\varUpsilon _{T,S}^{\varvec{\kappa }}\), Assumptions A–B of Theorem 1 of Andrews (1994) will be shown to hold. Firstly, Assumption A holds true since \(\{ g_{T,S}^{\varvec{\kappa }}({\varvec{\theta }}): {\varvec{\theta }}\in \varTheta \}\) is a type II class of functions and satisfies Pollard’s entropy condition with envelope
where referring to (6), \(\varLambda _S = \zeta + \Vert R_S({\varvec{\theta }}_2)-R_S({\varvec{\theta }}_1)\Vert / \Vert {\varvec{\theta }}_2-{\varvec{\theta }}_1\Vert \) for \((R_S({\varvec{\theta }}_1),R_S({\varvec{\theta }}_2))\) that converges to (0, 0) in probability. Assumption B holds as well because \(g_{T,S}^{\varvec{\kappa }}\) is bounded and there exists \(\xi >0\) such that
As a consequence, \(\varUpsilon _{T,S}^{\varvec{\kappa }}\) satisfies (7) at \(\tau _0 = {\varvec{\theta }}_0\). Then, since
one can conclude that for any \(\epsilon > 0\),
This establishes (\(\mathcal C_4\)) and concludes the proof. \(\square \)
1.5 A.5 Proof of Proposition 4
Letting \(F({\textbf{x}}) = {\mathbb {P}}(\textbf{X}\le {\textbf{x}})\),
Since by definition, \(W = F(\textbf{X})\), where \(\textbf{X}\sim F\), it follows readily that
Hence, the U-statistic with kernel \(\mathcal {K}_a\) is an unbiased estimator of \(\mu _a\). To conclude the proof, simply observe that because \(W = F(\textbf{X}) = C({\textbf{U}})\), where \({\textbf{U}}= (F_1(X_1), \ldots , F_d(X_d)) \sim C\), it is clear that \({\mathbb {E}}\left\{ \mathcal {K}_a(\textbf{X}_1, \ldots , \textbf{X}_{a+1}) \right\} = {\mathbb {E}}(W^a) = {\mathbb {E}}\left\{ \mathcal {K}_a({\textbf{U}}_1, \ldots , {\textbf{U}}_{a+1}) \right\} \), which establishes that \(\mathcal {K}_a\) is marginal-free. \(\square \)
1.6 A.6 Proof of Proposition 5
The result is a special case of Lemma 1 and mainly consists in deriving an expression for \(\mathcal {K}_a^\star ({\textbf{u}}) = (a+1) [ {\mathbb {E}}\{ \mathcal {K}_a({\textbf{u}}, {\textbf{U}}_2, \ldots , {\textbf{U}}_{a+1}) \} - \mu _a ]\). Upon recalling that
one computes
From an application of Theorem 2, p. 76 of Lee (1990), \(\sqrt{T} (\widehat{\varvec{\mu }}- {\varvec{\mu }})\) is asymptotically L-variate Normal with vector of means \(( {\mathbb {E}}\{\mathcal {K}_1({\textbf{U}})\}, \ldots , {\mathbb {E}}\{\mathcal {K}_L({\textbf{U}})\} ) = (0, \ldots , 0)\) and variance-covariance matrix \(\varSigma \in \mathbb {R}^{L\times L}\) such that for any \(a,a' \in \{ 1, \ldots , L \}\),
which completes the proof. \(\square \)
1.7 A.7 Proof of Proposition 6
First note that \({\mathbb {E}}(W^a) = {\mathbb {P}}\left( \textbf{X}^\star - \textbf{X}_1> 0, \ldots , \textbf{X}^\star - \textbf{X}_a > 0 \right) \), where \(\textbf{X}^\star , \textbf{X}_1, \ldots , \textbf{X}_a\) are i.i.d. from the elliptical distribution characterized by the radial variable R. For each \(j \in \{1, \ldots , a \}\), the distribution of \(\textbf{Y}_j = \textbf{X}^\star - \textbf{X}_j\) is elliptically contoured, so that it admits the stochastic representation \(\textbf{Y}_j = \mathcal G_j \, {\textbf{Z}}_j\) for some positive random variable \(\mathcal G_j\), where \({\textbf{Z}}_j\) is d-variate normal with some covariance matrix \(\varSigma \). As a consequence, one has the stochastic representation \(( \textbf{X}^\star - \textbf{X}_1, \ldots , \textbf{X}^\star - \textbf{X}_a) = (\mathcal G_1 \, {\textbf{Z}}_1, \ldots , \mathcal G_a \, {\textbf{Z}}_a)\), where \({\textbf{Z}}_1, \ldots , {\textbf{Z}}_a\) are dependent normal vectors. One can then write \({\mathbb {E}}(W^a) = {\mathbb {P}}(\mathcal G_1 \, {\textbf{Z}}_1> 0, \ldots , \mathcal G_a \, {\textbf{Z}}_a> 0) = {\mathbb {P}}({\textbf{Z}}_1>0, \ldots , {\textbf{Z}}_a>0)\), which ends the proof. \(\square \)
Cite this article
Belalia, M., Quessy, JF. Generalized simulated method-of-moments estimators for multivariate copulas. Stat Papers (2024). https://doi.org/10.1007/s00362-024-01574-w