Abstract
We present the asymptotic joint distribution of the sample central moments and the standardized sample central moments of multivariate random variables. Sample central moments and standardized sample central moments are quantities of interest for statistical inference as the variance and the coefficients of skewness and kurtosis are particular cases. The results described here are known for univariate random variables; now, we extend them to random vectors. After presenting our results, we apply them to multivariate elliptical distributions and the multivariate skew-normal distribution, showing that these expressions can be simplified considerably in specific cases.
1 Introduction
Statistical analyses frequently make use of functions of the sample mean and sample covariance matrix for multivariate inference. In the exponential family, for instance, such statistics are sufficient to estimate the parameters of distributions. In other families, the third and fourth standardized sample moments, respectively, known as the coefficients of skewness and kurtosis, may be of interest. Here, we present the asymptotic joint distribution for multivariate sample moments and apply it to both multivariate elliptical distributions and the multivariate skew-normal family.
Sample moments are used in the method of moments, an estimation technique based on the assumption that unknown parameters can be computed by matching the sample moments with the theoretical ones and solving a system of p equations in p unknown parameters. The p parameters may be over-identified by the system of equations, and the Generalized Method of Moments (GMM) was developed to tackle this situation. As noted by Harris and Mátyás (1999), estimation via moments requires fewer assumptions than maximum likelihood estimation, which needs specification of the whole distribution. Therefore, estimation via moments may be convenient in many situations. The sample moments can also be used for optimization of the likelihood, according to Lehmann and Casella (1998, pp. 456–457).
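The moment-matching step can be sketched numerically. The following is a minimal illustration of ours (the Gamma model and all variable names are our own, not from this chapter), matching the first two sample moments of a Gamma(a, b) sample to the theoretical mean ab and variance ab², which gives closed-form estimators:

```python
import numpy as np

# Method-of-moments sketch (ours): Gamma(shape=a, scale=b) has mean a*b and
# variance a*b^2, so matching the first two sample moments solves for (a, b).
rng = np.random.default_rng(0)
x = rng.gamma(shape=2.0, scale=3.0, size=100_000)

m1 = x.mean()        # first sample moment
s2 = x.var()         # second sample central moment
b_hat = s2 / m1      # scale estimate: Var / E
a_hat = m1 / b_hat   # shape estimate: E^2 / Var

print(a_hat, b_hat)  # close to (2, 3) for large n
```

With p = 2 moments and 2 parameters the system is exactly identified; with more moment conditions than parameters, GMM would minimize a weighted quadratic form in the moment discrepancies instead.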
As the sample moments have numerous applications, these measures and their asymptotic distributions have been vastly explored in the literature. As one of the first in this field, Cramér (1946) dealt with moments, functions of moments, and their asymptotic normality using a technique that later became known as the delta method. Pewsey (2005) derived a general result for the large-sample joint distribution of the mean, the standard deviation, and the coefficients of skewness and kurtosis of a general distribution by employing the Central Limit Theorem (CLT), the Taylor expansion of functions of the moments, and extensive algebraic manipulations. Both these works referred to the univariate context only.
An interesting property of Pewsey’s result is that he isolated the asymptotic bias of the coefficients of skewness and kurtosis, so his formulation can be applied in bias corrections of estimators. However, in the author’s simulations, bias correction through subtraction or ratios performed poorly. Bao (2013) derived analytical results for finite-sample biases of the skewness and kurtosis coefficients in a different way. He achieved good performance using his asymptotic results for bias correction in an AR(1) process. He also claimed that applying the results to hypothesis tests for normality increased the power of the tests. In the multivariate context, Kollo and von Rosen (2005) presented the asymptotic distribution of the sample mean and the sample covariance matrix, using as a background the law of large numbers and the CLT.
Asymptotic results may be applied to the multivariate skew-normal distribution, a more general class than the normal distribution, as shown by Arnold and Beaver (2002). The authors also exposed different causes yielding skewed distributions, for example, the hidden truncation mechanism. Arnold et al. (1993), motivated by practical problems, such as “selective reporting,” i.e., when, intentionally or not, only random vectors related to a truncated variable are recorded, developed these ideas and provided a direct relationship with Azzalini’s (1985) skew-normal distribution. As selective reporting is generated by common procedures, this hidden truncation mechanism may be frequent in data analyses and was addressed by a series of papers that Prof. Arnold pioneered.
Here, we apply asymptotic results to multivariate elliptical distributions and the multivariate skew-normal distribution developed by Azzalini and Dalla Valle (1996). In this last scenario, we show that expressions simplify considerably, depending on the parameters. Two key advantages of our results are that we address the higher-order moments, unlike previous works, and we employ intuitive and straightforward notation.
The structure of this paper is as follows. In Sect. 2, we provide the notation and terminology used throughout the paper. In Sect. 3, we present the main results about the asymptotic joint distribution of multivariate sample moments and multivariate standardized sample moments and describe several examples for illustration. In Sect. 4, we apply the results to multivariate elliptical distributions, and in Sect. 5, we evaluate the asymptotic behavior for the skew-normal distribution.
2 Notation and Terminology
To derive the asymptotic joint distribution of central moments from multivariate random variables, we consider a non-degenerate random vector X = (X 1, …, X d)⊤∼ f(x;θ), \(\boldsymbol {x}~\in ~\mathcal {X}~\subseteq ~\mathbb {R}^d\), \(\boldsymbol {\theta }~\in ~\Theta \subset \mathbb {R}^q\), where f is a parametric joint probability density function. We also consider the following theoretical quantities, provided they exist:
-
\(\mu _{kr} = \mathbb {E}(X_k^r), k=1,\ldots ,d,\,\,r=1,\ldots ,p\), is the rth theoretical moment of X k, and μ k1 = μ k is the mean of the kth variable;
-
\(\kappa _{kr} = \mathbb {E}\{(X_{k}-\mu _{k})^r\}, k=1,\ldots ,d,\,\,r=1,\ldots ,p\), is the rth theoretical central moment of X k about the mean μ k, where κ k1 = 0 and \(\kappa _{k2}=\sigma _k^2\) is the variance;
-
\(\kappa _{kl,rs} =\mathbb {E}\{(X_{k}-\mu _{k})^r(X_{l}-\mu _{l})^s\}, k,l=1,\ldots ,d,\,\,r,s=1,\ldots ,p\), represents the theoretical central cross-moments of orders r and s between the kth and lth variables, κ kl,11 = σ kl is the covariance between the kth and lth variables, and κ kk,rs = κ k,r+s;
-
\(\rho _{kr} = \frac {\kappa _{kr}}{\kappa _{k2}^{r/2}}, k=1,\ldots ,d,\,\,r=1,\ldots ,p\), is the standardized rth theoretical moment of X k with ρ k1 = 0, ρ k2 = 1, ρ k3 = γ k1 and ρ k4 − 3 = γ k2, where γ k1 is the skewness coefficient and γ k2 is the excess kurtosis;
-
\(\rho _{kl,rs} = \frac {\kappa _{kl,rs}}{\kappa _{k2}^{r/2}\kappa _{l2}^{s/2}}\), k, l = 1, …, d, r, s = 1, …, p, and ρ kk,rs = ρ k,r+s, ρ kk,11 = ρ k2 = 1;
-
\(\bar \rho _{kl,rs} = \frac {\kappa _{kl,rs}-\kappa _{kr}\kappa _{ls}}{\kappa _{k2}^{r/2}\kappa _{l2}^{s/2}} = \rho _{kl,rs}-\rho _{kr}\rho _{ls}\), k, l = 1, …, d, r, s = 1, …, p, and \(\bar {\rho }_{kk,rs}=\bar {\rho }_{k,r+s}\), \(\bar {\rho }_{kl,1s} = \rho _{kl,1s}\) and \(\bar {\rho }_{kl,r1} = \rho _{kl,r1}\).
We also define D kr, S kr, and R kr, which are, respectively, the rth sample central moment about the mean, the rth sample central moment about the sample mean, and the rth standardized sample central moment about the sample mean, for a random sample X i = (X i1, …, X id)⊤, i = 1, …, n, from the random vector X = (X 1, …, X d)⊤∼ f(x;θ), as follows: \(D_{kr}=\frac {1}{n}\sum _{i=1}^n(X_{ik}-\mu _k)^r\), \(S_{kr}=\frac {1}{n}\sum _{i=1}^n(X_{ik}-\bar X_k)^r\), and \(R_{kr}=S_{kr}/S_{k2}^{r/2}\), where \(\bar X_k=\frac {1}{n}\sum _{i=1}^n X_{ik}\), k = 1, …, d, r = 1, …, p.
The sample central moments (S kr) are strongly consistent estimators of the respective theoretical central moments (κ kr) for each k = 1, …, d and r = 2, …, p. Therefore, the standardized sample central moments (R kr) are also strongly consistent estimators of the respective standardized theoretical central moments (ρ kr) for each k = 1, …, d and r = 2, …, p, i.e., each univariate marginal. Besides, if the (2r)th theoretical moments are finite, then the asymptotic normality of these central statistics is known. In the next section, we deliver the basic elements needed to study the asymptotic distribution in the multivariate context and give some illustrative examples of how to apply the proposed results.
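These estimators can be computed directly; the brief numerical sketch below is our own (the helper `central_moments` is not from the chapter), using a bivariate standard normal sample, for which each margin has κ k2 = 1, ρ k3 = 0, and ρ k4 = 3:

```python
import numpy as np

# Sketch (ours): S[k, r-1] holds S_kr and R[k, r-1] holds R_kr = S_kr / S_k2^{r/2}
# for each margin k of an n x d sample, here d = 2 with N(0, I_2) data.
rng = np.random.default_rng(1)
X = rng.standard_normal((200_000, 2))

def central_moments(X, p):
    """Return S with S[k, r-1] = (1/n) sum_i (X_ik - Xbar_k)^r, r = 1..p."""
    Xc = X - X.mean(axis=0)
    return np.stack([(Xc ** r).mean(axis=0) for r in range(1, p + 1)], axis=1)

S = central_moments(X, 4)                      # shape (d, p)
R = S / S[:, [1]] ** (np.arange(1, 5) / 2)     # standardize by S_k2^{r/2}

# For a standard normal margin: S_k2 ~ 1, R_k3 ~ 0, R_k4 ~ 3.
print(S[:, 1], R[:, 2], R[:, 3])
```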
3 Main Results
We let \(\boldsymbol {D}=(\boldsymbol {D}_1^{\top },\ldots ,\boldsymbol {D}_d^{\top })^{\top }\), D k = (D k1, …, D kp)⊤, and \(\boldsymbol {D}_k=\frac {1}{n}\sum _{i=1}^n\boldsymbol {W}_{ik}\), where W ik = ((X ik − μ k)1, (X ik − μ k)2, …, (X ik − μ k)p)⊤, k = 1, …, d, i = 1, …, n. If the mean vector and the variance–covariance matrix of W ik exist, they are, respectively, defined as \(\boldsymbol {\kappa }_k=\mathbb {E}(\boldsymbol {W}_{ik})=(\kappa _{k1},\ldots ,\kappa _{kp})^{\top }\) and \(\boldsymbol {\mathcal {K}}_{kk}=\mbox{Var}(\boldsymbol {W}_{ik})\).
Thus, \(\boldsymbol {D}=\frac {1}{n}\sum _{i=1}^n \boldsymbol {W}_i\), where \(\boldsymbol {W}_i=(\boldsymbol {W}_{i1}^{\top },\ldots ,\boldsymbol {W}_{id}^{\top })^{\top }\), i = 1, …, n, are i.i.d. random vectors, with a mean vector \(\boldsymbol {\kappa }=(\boldsymbol {\kappa }_1^{\top },\ldots ,\boldsymbol {\kappa }_d^{\top })^{\top }\) and a variance–covariance matrix \(\boldsymbol {\mathcal {K}}=(\boldsymbol {\mathcal {K}}_{kl}), k,l=1,\ldots ,d\), where the block is
With this, we make use of the multivariate Central Limit Theorem (CLT) to obtain the results in Proposition 1:
Proposition 1
Let \(\boldsymbol {D}=(\boldsymbol {D}_1^{\top },\ldots ,\boldsymbol {D}_d^{\top })^{\top }\) , and \(\boldsymbol {\kappa }=(\boldsymbol {\kappa }_1^{\top },\ldots ,\boldsymbol {\kappa }_d^{\top })^{\top }\) , where D k = (D k1, …, D kp)⊤, \(D_{k1}=\bar X_k-\mu _k\), κ k = (κ k1, …, κ kp)⊤ , κ k1 = 0, and \(\kappa _{k2}=\sigma _k^2\) , k = 1, …, d. If κ k,2p < ∞ for all k = 1, …, d, then \(\sqrt {n}\,(\boldsymbol {D}-\boldsymbol {\kappa })\overset { d}\longrightarrow \mathcal {N}_{dp}(\boldsymbol {0},\boldsymbol {\mathcal {K}}),\)
where \(\boldsymbol {\mathcal {K}}\) has block elements \(\boldsymbol {\mathcal {K}}_{kl}\) given by (1). In particular,
Example 1 We illustrate this result with the case in which p = 4. Assuming that κ k,8 < ∞, then for all k = 1, …, d,
If the distribution of X k − μ k is symmetric around zero, then the result reduces to
indicating asymptotic independence between the random vectors \(\sqrt {n}~(D_{k1},D_{k3})^{\top }\) and \(\sqrt {n}~(D_{k2}~-~\kappa _{k2},D_{k4}-~\kappa _{k4})^{\top }\).
In analogy with the sample central moments about the true mean vector, we derive the asymptotic distribution of the sample central moments about the sample mean, as stated below in Proposition 2. As noted by Afendras et al. (2020), when investigating the limiting behavior of sample central moments in the univariate context, two general assumptions about each of the components of the random vector X = (X 1, …, X d)⊤ are required: first, \(\mathbb {E}(|X_k|{ }^{2r})<\infty \); second, non-singularity of order r, that is, \(\tau _{kr}^2\ne 0\), for r = 2, 3, …. These conditions guarantee the marginal \(\sqrt {n}\)-convergence of the sample central moments, i.e., each marginal sample central moment \(\sqrt {n}\,(S_{kr}-\kappa _{kr})\) converges in distribution to a non-degenerate \(\mathcal {N}_1(0,\tau _{kr}^2)\), with \(\tau _{kr}^2>0\). Under singularity of order r, that is, whenever \(\tau _{kr}^2=0\), Afendras et al. (2020) verified that n (S kr − κ kr) converges in distribution to a non-normal law of probability.
Proposition 2
Let \(\boldsymbol {S}=(\boldsymbol {S}_1^{\top },\ldots ,\boldsymbol {S}_d^{\top })^{\top }\) and \(\boldsymbol {\kappa }=(\boldsymbol {\kappa }_1^{\top },\ldots ,\boldsymbol {\kappa }_d^{\top })^{\top }\) , where S k = (D k1, S k2, …, S kp)⊤ and κ k = (κ k1, κ k2, …, κ kp)⊤ , k = 1, …, d. If κ k,2p < ∞ for all k = 1, …, d, then \(\sqrt {n}\,(\boldsymbol {S}-\boldsymbol {\kappa })\overset { d}\longrightarrow \mathcal {N}_{dp}(\boldsymbol {0},\boldsymbol {C\mathcal {K}C}^{\top }),\)
where C = diag(C 1, …, C d), and
where κ k1 = 0 and \(\kappa _{k2}=\sigma _k^2\) . In particular,
where the asymptotic variance–covariance matrix \(\boldsymbol {C}_k\boldsymbol {\mathcal {K}}_{kk}\boldsymbol {C}_k^{\top }\) has entries τ k,rs , where \(\tau _{k,rr}=\tau _{k,r}^2\) , and
Proof of Proposition 2
Since \(\bar X_k-\mu _k=D_{k1}\) and, for r = 2, …, p,
we have
By Proposition 1, \(\sqrt {n}\,(D_{ks}-\kappa _{ks})=O_p(1)\) as n →∞, for all k = 1, …, d and s = 1, …, p, implying that:
-
\(D_{k1}=O_p(n^{-1/2})=o_p(1)\) and \(D_{k1}^{r-s}=o_p(1)\), for all r − s > 0;
-
\(\sqrt {n}\,(D_{ks}-\kappa _{ks})D_{k1}^{r-s}=O_p(1)o_p(1)=o_p(1)\), for s = 2, …, r − 1 and r = 3, …, p; and
-
\(\sqrt {n}\,D_{k1}^{r-s}=n^{-(r-s-1)/2}(\sqrt {n}\,D_{k1})^{r-s}=o_p(1)O_p(1)=o_p(1)\), for all r − s ≥ 2.
These facts imply that:
which holds for all k = 1, …, d and all r = 2, …, p.
Hence, we obtain \(\sqrt {n}\,(\boldsymbol {S}_k-\boldsymbol {\kappa }_k)=\boldsymbol {C}_k\sqrt {n}(\boldsymbol {D}_k-\boldsymbol {\kappa }_k)+o_p(1)\), for all k = 1, …, d, and thus, \(\sqrt {n}\,(\boldsymbol {S}-\boldsymbol {\kappa })=\boldsymbol {C}\sqrt {n}(\boldsymbol {D}-\boldsymbol {\kappa })+o_p(1)\). The proof is concluded by applying Proposition 1 and Slutsky’s theorem. □
Example 2 Similar to Example 1, for p = 4, we suppose that κ k8 < ∞. Then, for all k = 1, …, d,
where
with κ k1 = 0 and \(\kappa _{k2}=\sigma _k^2\). In particular, if the marginal distribution of X k − μ k is symmetric around zero, then κ kr = 0 for odd r, and the asymptotic multivariate normal distribution of \(\sqrt {n}\,(\bar X_{k}-\mu _k,S_{k2}-\kappa _{k2},S_{k3},S_{k4}-\kappa _{k4})^{\top }\) reduces to
which indicates that there is asymptotic independence between the sample central moments of odd and even orders. This is a general result valid for higher-order sample central moments.
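This symmetric case can be checked with a small Monte Carlo experiment; the sketch below is ours, assuming standard normal margins, for which the asymptotic variance of \(\sqrt {n}\,S_{k3}\) is κ k6 − 6κ k2 κ k4 + 9κ k2 3 = 15 − 18 + 9 = 6 and the odd- and even-order moments decorrelate:

```python
import numpy as np

# Monte Carlo sketch (ours) of the symmetric case with N(0,1) margins:
# avar of sqrt(n)*S_k3 is kappa_6 - 6*kappa_2*kappa_4 + 9*kappa_2^3 = 6,
# and even- and odd-order sample central moments are asymptotically independent.
rng = np.random.default_rng(2)
n, reps = 400, 15_000
X = rng.standard_normal((reps, n))
Xc = X - X.mean(axis=1, keepdims=True)

S2 = (Xc ** 2).mean(axis=1)   # second sample central moment per replicate
S3 = (Xc ** 3).mean(axis=1)   # third sample central moment per replicate

v3 = n * S3.var()                # should approach 6
c23 = n * np.cov(S2, S3)[0, 1]   # should approach 0
print(v3, c23)
```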
The next proposition shows the asymptotic joint distribution of multivariate standardized sample central moments.
Proposition 3
Let \(\boldsymbol {R}=(\boldsymbol {R}_1^{\top },\ldots ,\boldsymbol {R}_d^{\top })^{\top }\) and \(\boldsymbol {\rho }=(\boldsymbol {\rho }_1^{\top },\ldots ,\boldsymbol {\rho }_d^{\top })^{\top }\) , where R k = (D k1, S k2, R k3, …, R kp)⊤ and ρ k = (0, κ k2, ρ k3, …, ρ kp)⊤, k = 1, …, d. If κ k,2p < ∞ for all k = 1, …, d, then \(\sqrt {n}\,(\boldsymbol {R}-\boldsymbol {\rho })\overset { d}\longrightarrow \mathcal {N}_{dp}(\boldsymbol {0},\boldsymbol {GC\mathcal {K}C}^{\top }\boldsymbol {G}^{\top }),\)
where GC = diag(G 1 C 1, …, G d C d), with
In particular,
Proof of Proposition 3
We let g(x) = (g 1(x 1), …, g d(x d))⊤, where \(\boldsymbol {x}=(\boldsymbol {x}_1^{\top },\ldots ,\boldsymbol {x}_d^{\top })^{\top }\), x k = (x k1, …, x kp)⊤, g k = (g k1, …, g kp)⊤, and
The Jacobian matrix is \(\dot {\boldsymbol {G}}(\boldsymbol {x})=\text{diag}(\boldsymbol {G}_1(\boldsymbol {x}_1),\ldots ,\boldsymbol {G}_d(\boldsymbol {x}_d))\), with \(\boldsymbol {G}_k(\boldsymbol {x}_k)=\left ( \frac {\partial \boldsymbol {g}_k(\boldsymbol {x}_k)}{\partial \boldsymbol {x}_{k}}\right ) \) given by
Thus, from the delta method, we have \(\sqrt {n}\,(\boldsymbol {R}-\boldsymbol {\rho })=\sqrt {n}\,(\boldsymbol {G}(\boldsymbol {S})-\boldsymbol {G}(\boldsymbol {\rho }))\overset { d}\longrightarrow \mathcal {N}_{dp}(\boldsymbol {0},\boldsymbol {GC\mathcal {K} C}^{\top } \boldsymbol {G}^{\top })\), where G = G(ρ) = diag(G 1(ρ 1), …, G d(ρ d)) and GC = diag(G 1 C 1, …, G d C d), concluding the proof. □
Example 3 For p = 4, we have
Hence, as in Example 2, if κ k8 < ∞, then for all k = 1, …, d,
where υ k,ij = υ k,ji and \(\upsilon _{k,ii}=\upsilon _{k,i}^2\), with
In this paper, we developed all the calculations considering S k2, i.e., the second sample central moment (the sample variance). Pewsey (2005), on the other hand, built his results with \(S_k=\sqrt {S_{k2}}\), the sample standard deviation, and only for the univariate case. Therefore, Example 3 corresponds to Pewsey’s result, if d = 1, and we make use of another Jacobian matrix P k:
Hence, the variance–covariance matrix for the asymptotic distribution for the kth marginal univariate example, considering Pewsey’s approach, is given by the expression \(\boldsymbol {P}_k (\boldsymbol {G}_k \boldsymbol {C}_k \boldsymbol {\mathcal {K} C}_k^{{ }^{\top }}\boldsymbol {G}_k^{{ }^{\top }}) \boldsymbol {P}_k^{\top }\).
With Proposition 3, we derive the following corollary:
Corollary 1
Let R 3⋅ = (R 31, …, R 3d)⊤ and ρ 3⋅ = (ρ 31, …, ρ 3d)⊤ . Under the conditions of Proposition 3 , we have
where
with \(\boldsymbol {e}_3=(0,0,1,0,\ldots ,0)^{\top }\in \mathbb {R}^{p}\) , i.e., Υ 3 has entries \(\upsilon ^{\{3\}}_{kl}, k,l=1,\ldots ,d\) , given by
In particular,
with
Example 4 For symmetric distributions, we have ρ k5 = ρ k3 = ρ k1 = 0. Therefore,
where \(\rho _{kr}=\frac {\kappa _{kr}}{\kappa _{k2}^{r/2}}\). For the normal model, 9 − 6ρ k4 + ρ k6 = 9 − 6 × 3 + 15 = 6, so
Focusing on the fourth standardized sample central moment, we derive the next corollary:
Corollary 2
Let R 4⋅ = (R 41, …, R 4d)⊤ and ρ 4⋅ = (ρ 41, …, ρ 4d)⊤ . Under the conditions of Proposition 3 , we have
where
with \(\boldsymbol {e}_4=(0,0,0,1,0,\ldots ,0)^{\top }\in \mathbb {R}^{p}\) , i.e., Υ 4 has entries \(\upsilon ^{\{4\}}_{kl}=\boldsymbol {e}_4^{\top } \boldsymbol {G}_k\boldsymbol {C}_k\boldsymbol {\mathcal {K}}_{kl}\boldsymbol {C}_l^{\top } \boldsymbol {G}_l^{\top } \boldsymbol {e}_4\) , k, l = 1, …, d, where
\(\boldsymbol {e}_4^{\top } \boldsymbol {G}_k\boldsymbol {C}_k = \begin {pmatrix}-\frac {4\rho _{k3}}{\kappa _{k2}^{1/2}},&-\frac {2\rho _{k4}}{\kappa _{k2}}, &0, &\frac {1}{\kappa _{k2}^{2}},&0, &\ldots ,&0\end {pmatrix}.\)
In particular,
with
Example 5 When working with symmetric distributions, we have ρ k5 = ρ k3 = ρ k1 = 0.
Therefore,
For the standard normal model, we have ρ k4 = 3, ρ k6 = 15, ρ k8 = 105, and
so \(\sqrt {n}\,R_{k4}\overset { d}\longrightarrow \mathcal {N}\left (0,24\right ),\quad k=1,\ldots ,d\).
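These two normal-model limits, \(\sqrt {n}\,R_{k3}\overset { d}\longrightarrow \mathcal {N}(0,6)\) and \(\sqrt {n}\,(R_{k4}-3)\overset { d}\longrightarrow \mathcal {N}(0,24)\), can be verified by simulation; the Monte Carlo sketch below is ours:

```python
import numpy as np

# Monte Carlo sketch (ours) of Examples 4 and 5: for standard normal data,
# sqrt(n)*R_k3 -> N(0, 6) and sqrt(n)*(R_k4 - 3) -> N(0, 24).
rng = np.random.default_rng(3)
n, reps = 400, 15_000
X = rng.standard_normal((reps, n))
Xc = X - X.mean(axis=1, keepdims=True)
S2 = (Xc ** 2).mean(axis=1)

R3 = (Xc ** 3).mean(axis=1) / S2 ** 1.5   # sample skewness per replicate
R4 = (Xc ** 4).mean(axis=1) / S2 ** 2     # sample kurtosis per replicate

v_skew = n * R3.var()   # near 6
v_kurt = n * R4.var()   # near 24
print(v_skew, v_kurt)
```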
4 Application to Multivariate Elliptical Distributions
In this section, we apply the previous results to a d-dimensional elliptical random vector X ∼ El d(μ, Ω;h) with the density function | Ω|−1∕2 h{(x −μ)⊤ Ω −1(x −μ)}, where μ is a d × 1 location vector, Ω is a d × d positive definite scale matrix, and h is the density generator function.
The central moments of X can be obtained from the moments of R and U because X −μ = R Ω 1∕2 U, where R and U are independent random quantities, with \(R\overset { d}=\|\boldsymbol {Z}\|\), a radial variable, and \(\boldsymbol {U}\overset { d}=\frac {\boldsymbol {Z}}{\|\boldsymbol {Z}\|}\), a uniform vector on the unit sphere \(\{\boldsymbol {x}\in \mathbb {R}^d: \|\boldsymbol {x}\|=1\}\), where Z = Ω −1∕2(X −μ) is the spherical version of X. The existence of these moments depends on the existence of the associated moments of R. For instance, as we know,
-
if \(\mathbb {E}(R)<\infty \), then \(\mathbb {E}(\boldsymbol {X})=\boldsymbol {\mu }\), and
-
if \(\mathbb {E}(R^2)<\infty \), then \(\mbox{Var}(\boldsymbol {X})=\sigma _h^2\boldsymbol {\Omega }\),
where \(\sigma _h^2=\frac {1}{d}\,\mathbb {E}(R^2)\) becomes the marginal variance induced by the density generator function h.
By symmetry, the odd moments of X −μ are zero, and its even moments can be computed using the results in Berkane and Bentler (1986); see also Lemmas 1 and 2 in Maruyama and Seo (2003). Thus, for r + s = 2m (even), we have
where \(\sigma _{kl}=\mbox{Cov}(X_k,X_l)=\sigma _h^2\omega _{kl}\) coincides with κ kl,11, \(\nu _{2m}=\frac {(2m)!}{2^{m}m!}\) is the (2m)th moment of \(Z~\sim ~\mathcal {N}(0,1)\), and \(\kappa _{(m)}+1=\frac {d^m}{d_{(m)}}\, \frac {\mathbb {E}(R^{2m})}{(\mathbb {E}(R^2))^m}\), with d (m) = d(d + 2)⋯(d + 2(m − 1)) being the mth moment of the chi-square distribution with d degrees of freedom. We note that κ (1) = 0 and κ (2) = κ is the kurtosis parameter, which is related to the multivariate kurtosis index of Mardia (1970) for X ∼ El d(μ, Ω;h). The \(\boldsymbol {\mathcal {K}}_{kl}\) matrix for elliptical distributions can be simplified due to symmetry, which makes the odd central moments equal to zero. As mentioned before, this result implies asymptotic independence between the even and odd sample central moments. The \(\boldsymbol {\mathcal {K}}_{kl}\) matrix is given by
That is, if r + s = 2m (even) with r and s both odd, then the elements of \(\boldsymbol {\mathcal {K}}_{kl}\) are κ kl,rs; if r + s = 2m (even) with r and s both even, then the elements of \(\boldsymbol {\mathcal {K}}_{kl}\) are of the form κ kl,rs − κ kr κ ls. When r + s = 2m − 1 (odd), the element in row r and column s of \(\boldsymbol {\mathcal {K}}_{kl}\) is zero. If p = 4 and we are interested in \(\boldsymbol {\mathcal {K}}_{kk}\), then the expression reduces to the following, as κ kk,rs = κ k,r+s:
Therefore, we see that there is asymptotic independence between the pairs \((\bar X_k, R_{k3})^{\top }\) and \((S_{k2}, R_{k4})^{\top }\). Also, by Proposition 3, we have
For elliptical distributions, with κ k8 < ∞, for all k = 1, …, d,
where υ k,ij = υ k,ji, and
Using the formula for computing κ kl,rs, we have the following elements:
For the multivariate normal distribution, according to Maruyama and Seo (2003), κ (i) = 0, i = 2, 3, 4. Considering the standard normal distribution, \(\sigma _h^2=\omega _{kk} = 1\), υ k,11 = 1, υ k,13 = 0, υ k,22 = 2, υ k,24 = 0, υ k,33 = 6, and υ k,44 = 24. For other elliptical distributions, the values for the asymptotic variance–covariance matrix depend on the computation of \(\sigma _h^2\) and κ (i), i = 2, 3, 4. For a multivariate Student-t distribution, we have \(\sigma _h^2=\frac {\nu }{\nu -2}\), \(\kappa _{(2)}=\frac {2}{\nu -4}, \kappa _{(3)}=\frac {6\nu -20}{(\nu -6)(\nu -4)}\) and \(\kappa _{(4)}=\frac {12\nu ^2-92\nu +184}{(\nu -8)(\nu -6)(\nu -4)}\).
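The Student-t value of κ (2) can be cross-checked with exact rational arithmetic: each margin of an elliptical vector has excess kurtosis 3κ (2), which for the multivariate Student-t gives 6∕(ν − 4). The sketch below (ours) verifies this against the raw moments of a univariate t ν, \(\mathbb {E}(T^2)=\frac {\nu }{\nu -2}\) and \(\mathbb {E}(T^4)=\frac {3\nu ^2}{(\nu -2)(\nu -4)}\):

```python
from fractions import Fraction

# Exact check (ours): the marginal excess kurtosis of a t_nu margin equals
# 3 * kappa_(2) = 3 * 2/(nu - 4) = 6/(nu - 4).
def excess_kurtosis_t(nu):
    m2 = Fraction(nu, nu - 2)                      # E(T^2)
    m4 = Fraction(3 * nu * nu, (nu - 2) * (nu - 4))  # E(T^4)
    return m4 / m2 ** 2 - 3                        # = 6/(nu - 4)

for nu in (5, 7, 10, 25):
    assert excess_kurtosis_t(nu) == 3 * Fraction(2, nu - 4)
print("checked")
```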
5 Application to Multivariate Skew-Normal Distributions
In this section, we apply the previous results to the multivariate skew-normal distributions of Azzalini and Dalla Valle (1996); see also the books by Genton (2004) and Azzalini and Capitanio (2014). We let \(\boldsymbol {X}\sim \mathcal {S}\mathcal {N}_d(\boldsymbol {\xi },\boldsymbol {\Omega },\boldsymbol {\alpha })\), with the density given by \(2\phi _d(\boldsymbol {x}-\boldsymbol {\xi };\boldsymbol {\Omega })\Phi \{\boldsymbol {\alpha }^{\top }\boldsymbol {\omega }^{-1}(\boldsymbol {x}-\boldsymbol {\xi })\}, \boldsymbol {x}\in \mathbb {R}^d\), where ϕ d(x; Ω) is the pdf of the \(\mathcal {N}_d(\boldsymbol {0},\boldsymbol {\Omega })\) distribution, and Φ(⋅) is the cdf of the univariate standard normal distribution. We know that \(\boldsymbol {\mu }=\mathbb {E}(\boldsymbol {X})=\boldsymbol {\xi }+\boldsymbol {\mu }_0\) and Σ = Var\((\boldsymbol {X})=\boldsymbol {\Omega }-\boldsymbol {\mu }_0\boldsymbol {\mu }_0^{\top },\) where \(\boldsymbol {\mu }_0=\mathbb {E}(\boldsymbol {X}-\boldsymbol {\xi })=\sqrt {2/\pi }\boldsymbol {\omega \delta }\), \(\boldsymbol {\delta }=\boldsymbol {\bar \Omega \alpha /}\sqrt {\boldsymbol {1}+\boldsymbol {\alpha }^{\top }\boldsymbol {\bar \Omega \alpha }}\), \(\boldsymbol {\bar \Omega }=\boldsymbol {\omega }^{-1}\boldsymbol {\Omega }\boldsymbol {\omega }^{-1}\), and \(\boldsymbol {\omega }=\text{ diag}(\boldsymbol {\Omega })^{1/2}=\boldsymbol {\sigma }\{\boldsymbol {I}_d+\text{ diag}(\boldsymbol {\bar \mu }_0\boldsymbol {\bar \mu }_0^{\top })\}^{1/2}\), with \(\boldsymbol {\bar \mu }_0=\boldsymbol {\sigma }^{-1}\boldsymbol {\mu }_0\) and σ = diag( Σ)1∕2.
Here, for any d × d matrix A = (a ij) ≥ 0, diag(A)1∕2 is the diagonal matrix whose diagonal elements are \(a_{11}^{1/2},\ldots ,a_{dd}^{1/2}\). We know that
where \(\boldsymbol {\bar \Sigma }=\boldsymbol {\sigma }^{-1}\boldsymbol {\Sigma \sigma }^{-1}\) and \(\boldsymbol {\omega \sigma }^{-1}=\boldsymbol {\sigma }^{-1}\boldsymbol {\omega }=\{\boldsymbol {I}_d+\text{ diag}(\boldsymbol {\bar \mu }_0\boldsymbol {\bar \mu }_0^{\top })\}^{1/2}\).
We let Z = σ −1(X −μ) = σ −1(X 0 −μ 0), where X 0 = X −ξ. Its density and moment-generating functions are, respectively, given by
and
where \(\boldsymbol {\omega }^{-1}\boldsymbol {\sigma }=\left \{\boldsymbol {I}_d+\text{ diag}(\boldsymbol {\bar \mu }_0\boldsymbol {\bar \mu }_0^{\top })\right \}^{-1/2}=\text{ diag}\left (\boldsymbol {\bar \Sigma }+\boldsymbol {\bar \mu }_0\boldsymbol {\bar \mu }_0^{\top }\right )^{-1/2}\quad \text{ and}\\ \quad \boldsymbol {\sigma }^{-1}\boldsymbol {\omega \delta }=\sqrt {\frac {\pi }{2}}\,\boldsymbol {\bar \mu }_0.\)
Hence, we obtain
Moreover, Z has univariate and bivariate marginals given by
and
where \(\boldsymbol {\bar \Sigma }=(\bar \sigma _{kl})\), with \(\bar \sigma _{kk}=1\). To identify the skewness parameters \((\alpha _k^{\prime },\alpha _l^{\prime })^{\top }\), we use the fact that for a d B × d matrix B of rank d B, \(\boldsymbol {BZ}\sim \mathcal {S}\mathcal {N}_{d_B}(\boldsymbol {B\bar \mu }_0,\boldsymbol {B\bar \Sigma B}^{\top }+\boldsymbol {B\bar \mu }_0\boldsymbol {\bar \mu }_0^{\top }\boldsymbol {B}^{\top },\boldsymbol {\alpha _B})\), with
and
We also note that
Thus, for B = (e k, e l)⊤, we have
and δ B = (δ 0k, δ 0l)⊤, where
and
We compute \(\rho _{kr}(Z_k)=\mathbb {E}(Z_k^r)\), \(\rho _{kl,rs}(Z_k,Z_l)=\mathbb {E}(Z_k^rZ_l^s)\), and \(\bar \rho _{kl,rs}(Z_k,Z_l)=\mathbb {E}(Z_k^rZ_l^s)-\mathbb {E}(Z_k^r)\mathbb {E}(Z_l^s)=\rho _{kl,rs}(Z_k,Z_l)-\rho _{kr}(Z_k)\rho _{ls}(Z_l)\).
Now, we let \(M(t_k,t_l)=M_{Z_k,Z_l}(t_k,t_l)\), and the cumulant function
We also denote the derivatives as follows:
and
Then, we have
with
Here, ζ k(x) is the kth derivative of \(\zeta _0(x)=\log \{2\Phi (x)\}\), for which
Hence,
Thus, considering that M(0) = 1, ρ k1 = M k(0) = K k(0) = 0, and \(\rho _{kl,11}=M_{kl}(0)=K_{kl}(0)=\bar \sigma _{kl}\), with \(\bar \sigma _{kk}=\bar \sigma _{ll}=1\), we have
and
Finally, with the purpose of illustrating the application of some of the previous results to the multivariate skew-normal distribution, we present two examples below.
Example 6 From Corollary 1, we have that \(\sqrt {n}(\boldsymbol {R}_{3\cdot }-\boldsymbol {\rho }_{3\cdot })\overset { d}\longrightarrow \mathcal {N}(\boldsymbol {0},\boldsymbol {\Upsilon }_3),\) k = 1, …, d, where Υ 3 has elements given by
In particular, \(\sqrt {n}\,(R_{k3}-\rho _{k3})\overset { d}\longrightarrow \mathcal {N}(0,\upsilon _{kk}^{\{3\}}),~\) k = 1, …, d, with
Moreover, for \(\bar \mu _{0k}=0\) (k = 1, …, d), we have
where \(\bar \sigma _{kl}\) (k, l = 1, …, d) are the entries of the correlation matrix \(\boldsymbol {\bar \Sigma }\).
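The univariate skew-normal moments entering these expressions can be checked by simulation through the stochastic representation \(Z=\delta |U_0|+\sqrt {1-\delta ^2}\,U_1\), with U 0, U 1 i.i.d. standard normal and \(\delta =\alpha /\sqrt {1+\alpha ^2}\); the sketch below is ours (the variate names are our own), using the known mean \(\sqrt {2/\pi }\,\delta \) and skewness coefficient \(\gamma _1=\frac {4-\pi }{2}\,\frac {m^3}{(1-m^2)^{3/2}}\), \(m=\sqrt {2/\pi }\,\delta \):

```python
import numpy as np

# Monte Carlo sketch (ours): generate SN(alpha) with zero location and unit
# scale via delta*|U0| + sqrt(1 - delta^2)*U1 and compare the empirical mean
# and skewness with the closed-form skew-normal values.
rng = np.random.default_rng(4)
alpha = 3.0
delta = alpha / np.sqrt(1 + alpha ** 2)

u0, u1 = rng.standard_normal((2, 1_000_000))
z = delta * np.abs(u0) + np.sqrt(1 - delta ** 2) * u1

m = np.sqrt(2 / np.pi) * delta                       # E(Z)
gamma1 = (4 - np.pi) / 2 * m ** 3 / (1 - m ** 2) ** 1.5  # skewness of Z

zc = z - z.mean()
skew_hat = (zc ** 3).mean() / (zc ** 2).mean() ** 1.5

print(z.mean(), m)        # should agree
print(skew_hat, gamma1)   # should agree
```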
In a similar way, from Corollary 2, we can find the asymptotic variance–covariance matrix Υ 4 of \(\sqrt {n}(\boldsymbol {R}_{4\cdot }-\boldsymbol {\rho }_{4\cdot })\).
The following example provides for each marginal k the joint asymptotic distribution for its sample mean, sample variance, and sample skewness, and from which, we can also find the joint asymptotic distribution of the moment estimators of the respective marginal parameters, namely, \((\xi _k,\omega _k^2,\alpha _{k})\), k = 1, …, d.
Example 7 From Example 3, we have for all k = 1, …, d:
where
with
For \(\bar \mu _{0k}=0\), we have
6 Final Remarks
We used standard tools to obtain our results, hence facilitating the comprehension of the derivations. We illustrated the practical capabilities of the developed techniques through several simple examples. Derivations similar to the ones we presented can be carried out for multivariate skew-t and skew-elliptical distributions. Some references on these distributions include Branco and Dey (2001), Azzalini and Capitanio (2003), Gupta (2003), and Genton and Loperfido (2005).
As a by-product of the derivations, we found that in the context of symmetric distributions, such as the elliptical ones, the known fact of asymptotic independence between the sample mean and the sample variance extends to all the sample central moments of both even and odd orders.
References
Afendras, G., Papadatos, N., & Piperigou, V. E. (2020). On the limiting distribution of sample central moments. Annals of the Institute of Statistical Mathematics, 72(2), 399–425.
Arnold, B. C., & Beaver, R. J. (2002). Skewed multivariate models related to hidden truncation and/or selective reporting. TEST, 11(1), 7–54.
Arnold, B. C., Beaver, R. J., Groeneveld, R. A., & Meeker, W. Q. (1993). The nontruncated marginal of a truncated bivariate normal distribution. Psychometrika, 58(3), 471–488.
Azzalini, A. (1985). A class of distributions which includes the normal ones. Scandinavian Journal of Statistics, 12(2), 171–178.
Azzalini, A., & Capitanio, A. (2003). Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 65(2), 367–389.
Azzalini, A., & Capitanio, A. (2014). The skew-normal and related families. IMS monographs. Cambridge: Cambridge University Press.
Azzalini, A., & Dalla Valle, A. (1996). The multivariate skew-normal distribution. Biometrika, 83(4), 715–726.
Bao, Y. (2013). On sample skewness and kurtosis. Econometric Reviews, 32(4), 415–448.
Berkane, M., & Bentler, P. (1986). Moments of elliptically distributed random variates. Statistics & Probability Letters, 4(6), 333–335.
Branco, M. D., & Dey, D. K. (2001). A general class of multivariate skew-elliptical distributions. Journal of Multivariate Analysis, 79(1), 99–113.
Cramér, H. (1946). Mathematical methods of statistics. Princeton: Princeton University Press.
Genton, M. G. (Ed.) (2004). Skew-elliptical distributions and their applications: A journey beyond normality. Boca Raton: CRC Press.
Genton, M. G., & Loperfido, N. M. (2005). Generalized skew-elliptical distributions and their quadratic forms. Annals of the Institute of Statistical Mathematics, 57(2), 389–401.
Gupta, A. K. (2003). Multivariate skew t-distribution. Statistics: A Journal of Theoretical and Applied Statistics, 37(4), 359–363.
Harris, D., & Mátyás, L. (1999). Introduction to the generalized method of moments estimation. In L. Mátyás (Ed.), Generalized method of moments estimation (pp. 3–30). Cambridge: Cambridge University Press.
Kollo, T., & von Rosen, D. (2005). Distribution expansions. In M. Hazewinkel (Ed.), Advanced multivariate statistics with matrices (Vol. 579, Chap. III). Berlin: Springer.
Lehmann, E. L., & Casella, G. (1998). Theory of point estimation (2nd ed.). Springer texts in statistics. New York: Springer.
Mardia, K. V. (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika, 57(3), 519–530.
Maruyama, Y., & Seo, T. (2003). Estimation of moment parameter in elliptical distributions. Journal of the Japan Statistical Society, 33(2), 215–229.
Pewsey, A. (2005). The large-sample distribution of the most fundamental of statistical summaries. Journal of Statistical Planning and Inference, 134(2), 434–444.
Acknowledgements
The authors would like to thank Professor Adelchi Azzalini and a reviewer for their useful suggestions and comments. The second author was financed by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brasil (CAPES)—Finance Code 001. The work of the last author was supported by the King Abdullah University of Science and Technology (KAUST).
© 2021 Springer Nature Switzerland AG
Cite this chapter
Arellano-Valle, R.B., Harnik, S.B., Genton, M.G. (2021). On the Asymptotic Joint Distribution of Multivariate Sample Moments. In: Ghosh, I., Balakrishnan, N., Ng, H.K.T. (eds) Advances in Statistics - Theory and Applications. Emerging Topics in Statistics and Biostatistics . Springer, Cham. https://doi.org/10.1007/978-3-030-62900-7_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62899-4
Online ISBN: 978-3-030-62900-7