Abstract
The problem of estimating the variance of a multivariate normal distribution is considered under quadratic loss. A large class of generalized Bayes minimax estimators for the variance is found. This class includes the estimators obtained by Ghosh (1994). A simulation study shows the superior performance of our estimators.
1 Introduction
Let X and S be independent random variables having
$$ \begin{array}{@{}rcl@{}} \displaystyle X \sim N_{p} \left( \theta, \sigma^{2} I_{p} \right), \qquad S \sim \sigma^{2} {\chi_{n}^{2}}, \end{array} $$(1)
where \( N_{p} \left (\theta , \sigma ^{2} I_{p} \right )\) denotes the p-variate normal distribution with unknown mean vector 𝜃 and covariance matrix σ2Ip, Ip denotes the p × p identity matrix, and \({\chi _{n}^{2}}\) denotes a chi-square variable with n degrees of freedom. We consider the problem of estimating σ2 when the loss function is
$$ \begin{array}{@{}rcl@{}} \displaystyle L \left( \delta, \sigma^{2} \right) = \left( \frac{\delta}{\sigma^{2}} - 1 \right)^{2}, \end{array} $$(2)
where δ = δ(X,S) is an estimator of σ2.
The best affine equivariant estimator is δ0 = (n + 2)− 1S which is a minimax estimator with constant risk 2(n + 2)− 1. Stein (1964) showed that δ0 can be improved by considering a class of scale equivariant estimators \(\delta = \left ({n + 2} \right )^{-1} \left (1 - \phi (F) \right )S\) for \(F = \frac {X^{\prime }X}{S}\). He found a specific better estimator \(\delta ^{S} = \left (n + 2 \right )^{-1} \left (1 - \phi ^{S} (F) \right )S\), where \(\phi ^{S} (F) = {\max \limits } \left \{ 0, \frac {p - (n + 2)F}{p + n + 2} \right \}\). Brewster and Zidek (1974) obtained an improved generalized Bayes estimator \(\delta ^{BZ} = \left (n + 2 \right )^{-1} \left (1 - \phi ^{BZ} (F) \right )S\), where
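As an illustration, both estimators above can be evaluated from a single draw of (X, S). The following Python sketch (the values of p, n, 𝜃 and σ2, and the variable names, are chosen here purely for illustration) computes δ0 and Stein's δS:

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, sigma2 = 7, 20, 1.0          # illustrative values
theta = np.zeros(p)

# One draw from model (1): X ~ N_p(theta, sigma2 I_p), S ~ sigma2 * chi^2_n
X = rng.normal(theta, np.sqrt(sigma2), size=p)
S = sigma2 * rng.chisquare(n)

F = X @ X / S                                    # F = X'X / S
delta0 = S / (n + 2)                             # best affine equivariant estimator
phiS = max(0.0, (p - (n + 2) * F) / (p + n + 2)) # Stein's shrinkage factor
deltaS = (1 - phiS) * S / (n + 2)                # Stein's estimator
```

Since 0 ≤ ϕS(F) < 1, Stein's estimator shrinks δ0 toward zero when F is small (that is, when X′X is small relative to S) and coincides with δ0 otherwise.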
They also gave a general sufficient condition for minimaxity, using an integral expression for the difference in risks between δ0 and δ. Strawderman (1974) derived another sufficient condition for minimaxity. Using the conditions of Brewster and Zidek (1974), Ghosh (1994) obtained a class of generalized Bayes estimators of σ2. Maruyama and Strawderman (2006) proposed another class of improved generalized Bayes estimators.
In this paper, we derive a large class of generalized Bayes minimax estimators of σ2 which contains estimators of Ghosh (1994) as special cases. To do so, we use techniques of Wells and Zhou (2008) and Brewster and Zidek (1974). Section 3 considers some examples of classes of generalized Bayes minimax estimators. In particular, Example 1 demonstrates that a result in Ghosh (1994) follows from our main theorem. Section 4 compares the minimax estimators of Sections 2, 3 and the equivariant estimator δ0 by simulation.
2 A class of generalized Bayes minimax estimators
In this section, we consider the problem of estimating σ2 in (1) under the loss function (2). Our main result is Theorem 2.2. Before stating and proving this theorem, we state a theorem due to Brewster and Zidek (1974) and Kubokawa (1994), introduce a class of priors, borrow notation from Wells and Zhou (2008), and state and prove two technical lemmas (Lemmas 2.1 and 2.2).
Brewster and Zidek (1974) derived general sufficient conditions for minimaxity of estimators having the form \(\delta = \left (n + 2 \right )^{- 1} \left (1 - \phi (F) \right )S\), where ϕ(F) is a function of \(F=\frac {X^{\prime }X}{S}\). For the purpose of verifying the minimaxity of a generalized Bayes estimator, we use the following specialized result.
Theorem 2.1.
The estimator δ(X,S) given by
is minimax for σ2 under the loss function (2) provided that the following conditions hold:
-
(I) ϕ(F) is nonincreasing;
-
(II) 0 ≤ ϕ(F) ≤ ϕBZ(F), where ϕBZ(F) is given by (3).
Proof.
See Brewster and Zidek (1974) and Kubokawa (1994). □
Now, we construct generalized Bayes minimax estimators of σ2 under the loss function (2). To do so, we consider the following class of prior distributions.
For η = σ− 2, let the conditional distribution of 𝜃 given ν and η be normal with zero mean vector and covariance matrix νη− 1Ip, and let the generalized density of (ν,η) be given by h(ν,η) = ηbg(ν), ν > 0, η > 0, where \(b > -\frac {n+p}{2}-1\) and g(ν) is a continuously differentiable positive function on \([0, \infty )\) such that the following conditions hold:
-
(C1) \({{\int \limits }_{0}^{1}} \lambda ^{\frac {p}{2} - 2} g\left (\frac {1 - \lambda }{\lambda } \right ) d\lambda < \infty \);
-
(C2) \(\mathop {\lim }\limits _{\nu \to \infty } \frac {g(\nu )}{(1 + \nu )^{\frac {p}{2} - 1}} = 0\).
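For a concrete g, the two conditions can be checked numerically. The sketch below assumes the choice g(ν) = e−ν (the same case used in the simulations of Section 4); it approximates the C1 integral by a Riemann sum and evaluates the C2 ratio at a large value of ν:

```python
import numpy as np

p = 7
g = lambda nu: np.exp(-nu)   # candidate mixing density, as in Section 4

# C1: integral of lambda^(p/2 - 2) * g((1 - lambda)/lambda) over (0, 1) is finite;
# the exponential factor vanishes rapidly as lambda -> 0, so the integrand is bounded
lam = np.linspace(1e-6, 1.0, 200_000)
dlam = lam[1] - lam[0]
c1 = (lam ** (p / 2 - 2) * g((1 - lam) / lam)).sum() * dlam

# C2: g(nu) / (1 + nu)^(p/2 - 1) -> 0 as nu -> infinity
tail = g(1e3) / (1 + 1e3) ** (p / 2 - 1)
```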
In the following discussion, we obtain conditions on g and b such that the generalized Bayes estimators satisfy the conditions of Theorem 2.1, and hence are minimax. Note that the joint density function f(η,x,s) of η, X, S is
where \(\left \| . \right \|\) denotes the Euclidean norm. Therefore, the generalized Bayes estimator of σ2 with respect to the loss function (2) is
Using the change of variables \(\lambda = \frac {1}{1+\nu }\), we have
This estimator is of the form (4) with
where \(d = \frac {n + 2}{n + p + 2b + 4}\) and
where \(A=\frac {n+p}{2}+b+3\).
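This representation can be evaluated numerically. The following Python sketch defines a hypothetical helper `delta_B` for the particular case g(ν) = e−ν, b = − 2; the weight λp/2+b g((1 − λ)/λ) and the exponents A − 1 and A are chosen to mirror the integral form used in the R code of Appendix A, and the values of n and p are illustrative:

```python
import numpy as np

p, n, b = 7, 20, -2                      # illustrative values; g(nu) = exp(-nu)
A = (n + p) / 2 + b + 3
d = (n + 2) / (n + p + 2 * b + 4)

def delta_B(S, F, m=100_000):
    """Generalized Bayes estimator, evaluated by a simple Riemann sum."""
    lam = np.linspace(1e-6, 1.0, m)
    dlam = lam[1] - lam[0]
    w = lam ** (p / 2 + b) * np.exp(-(1 - lam) / lam)
    num = (w * (1 + lam * F) ** (-(A - 1))).sum() * dlam
    den = (w * (1 + lam * F) ** (-A)).sum() * dlam
    return S / (n + p + 2 * b + 4) * num / den
```

Because (1 + λF)−(A − 1) ≥ (1 + λF)−A on (0,1), the ratio of the two integrals is at least one, so δB ≥ dδ0; and since ϕ(F) is nonincreasing for this choice of g, δB/S increases with F.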
To continue the discussion, we need the following notation borrowed from Wells and Zhou (2008). Define the function Iα,A,g(F) as
Using integration by parts, we obtain
Also, we define the functions \(J_{a} \left (g\left (\frac {F - u}{u} \right ) \right ) \) and \( J_{a} \left (\frac {Au}{1 + u} g\left (\frac {F - u}{u} \right ) \right )\) as
and
respectively. By integration by parts, we have
To show ϕ(F) is a decreasing function in F, it is sufficient to show that r(F) is an increasing function in F. The following lemma gives conditions under which \(\widetilde r(F)=F^{c} r(F)\) is an increasing function in F.
Lemma 2.1.
If \(\psi \left (\nu \right ) = -\left (1 + \nu \right ) \frac {g^{\prime }\left (\nu \right )}{g\left (\nu \right )}\) can be decomposed as l1(ν) + l2(ν), where l1(ν) is increasing in ν and 0 ≤ l2(ν) ≤ c, a constant, then \(\widetilde r(F) = F^{c} r(F)\) is nondecreasing.
Proof.
The proof is similar to the proof of Lemma 3.2 in Wells and Zhou (2008). Differentiating \(\widetilde r(F) = F^{c} r(F)\) with respect to F, we have
where \(R(F) = \frac {r(F)}{F}\). Therefore, \(\frac {\partial \widetilde r(F)}{\partial F} \ge 0\) is equivalent to
which in turn is equivalent to
Now, we see
Using (9) and (10), (12) can be written as
By applying (11), (13) is equivalent to
which in turn is equivalent to
Since \(I_{\frac {p}{2} - 1, A, g} \left (F \right ) \le I_{\frac {p}{2} - 2, A, g} \left (F \right )\), we have
Note also that since l1(ν) is increasing in ν, for each fixed F, \(l_{1} \left (\frac {F - u}{u} \right )\) is decreasing in u. When t < u, we have
By a monotone likelihood ratio argument, we have
and
Thus, we have established the inequality (14) and the proof is complete. □
The next lemma gives conditions under which a lower bound of r(F) can be determined.
Lemma 2.2.
Under the regularity conditions C1 and C2, assume that \(\psi \left (\nu \right ) = -\left (1 + \nu \right ) \frac {g^{\prime }\left (\nu \right )}{g\left (\nu \right )} \ge M\), where M is a finite real number. Then the function r(F) satisfies
Proof.
The proof is similar to the proof of Lemma 3.1 in Wells and Zhou (2008). According to (6), we have
and
Combining all the terms, we obtain the following inequality
implying
Thus, we have the needed bound on the r(F) function. □
Now we use Lemmas 2.1 and 2.2 to show minimaxity of the generalized Bayes estimator δB in (5). In fact, this is our main result.
Theorem 2.2.
If \(\psi \left (\nu \right ) = -\left (1 + \nu \right )\frac {g^{\prime }\left (\nu \right )}{g\left (\nu \right )}\) is increasing in ν and bounded below by a finite real number M, and if \(d \left (1 + \frac {\frac {p}{2} - 1 + M}{A - \frac {p}{2} - M} \right ) \ge 1\), then δB in (5) is minimax under the loss function (2).
Proof.
First, assume l2(ν) = 0 and \(l_{1}(\nu )=\psi \left (\nu \right )\). By using Lemma 2.1 with c = 0, we see that r(F) is an increasing function in F, and hence ϕ(F) is decreasing in F. By (2), we have
where
Using the change of variables u = λF, we have
Since r1(F) is increasing in F, we have
By Lemma 2.2, we obtain
Also we have
Therefore, if \(1 - d\left [ 1 + \left (\frac {\frac {p}{2} - 1 + M}{A - \frac {p}{2} - M} \right ) \right ]\le 0\) is satisfied, then condition (II) of Theorem 2.1 holds and hence δB in (5) is minimax under the loss function (2). □
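For the concrete case used later in Section 4 (p = 7, b = − 2, g(ν) = e−ν, for which ψ(ν) = 1 + ν, so one may take M = 1), the sufficient condition of Theorem 2.2 can be checked directly; the sketch below verifies that in this case the quantity d(1 + (p/2 − 1 + M)/(A − p/2 − M)) simplifies to (n + 2)/n, which exceeds one for every n:

```python
p, b, M = 7, -2, 1   # Section 4 case: g(nu) = exp(-nu), psi(nu) = 1 + nu >= 1

for n in range(2, 101):
    A = (n + p) / 2 + b + 3
    d = (n + 2) / (n + p + 2 * b + 4)
    cond = d * (1 + (p / 2 - 1 + M) / (A - p / 2 - M))
    # algebra: A - p/2 - M = n/2 here, so cond = (n + 2)/n and minimaxity holds
    assert abs(cond - (n + 2) / n) < 1e-12
    assert cond >= 1
```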
3 Examples
In this section, we give three examples to which our results can be applied. We also make some connections to existing literature (Ghosh, 1994).
Example 1.
The class of priors studied by Ghosh (1994) (in the setup and notation of Section 2) corresponds to
where k is a positive constant. If M = b + 2 and \(-\frac {p}{2} - 1 < b \le - 1\), then we can show that the class of priors of Ghosh (1994) satisfies the conditions and hence our results include the results of Ghosh (1994).
Example 2.
Another class of prior distributions is
where k is a positive constant. If M = 1, p ≥ 2 and \(-\frac {p + n}{2} - 1 < b \le - 1\), then the conditions of Theorem 2.2 are satisfied and the corresponding generalized Bayes estimator is minimax under the loss function (2).
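As a quick numerical sanity check for this family (with the illustrative choice k = 1, so g(ν) = e−ν), ψ(ν) = −(1 + ν)g′(ν)/g(ν) = k(1 + ν), which is increasing in ν and bounded below by k. The sketch below verifies both properties with a finite-difference derivative:

```python
import numpy as np

k = 1.0
nu = np.linspace(0.0, 50.0, 1001)
g = np.exp(-k * nu)
gp = np.gradient(g, nu)          # finite-difference approximation of g'
psi = -(1 + nu) * gp / g         # should be close to k * (1 + nu)
```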
Example 3.
Consider the following prior distribution
where k is a positive constant. If M = a + c, c ≥ 0 and \(-\frac {p + n}{2} - 1 < b \le 2 + a + c\), then the conditions of Theorem 2.2 are satisfied, so the corresponding generalized Bayes estimators are minimax.
4 Simulation study
In this section, we compare the performance of the affine equivariant estimator δ0 with the generalized Bayes estimator in Theorem 2.2. The comparison uses the following simulation scheme to compute bias and mean squared error:
a) set values for n, 𝜃 and σ2;
b) simulate a random sample of size n from a seven-dimensional normal distribution with mean vector 𝜃 and covariance matrix σ2Ip;
c) compute
$$ \begin{array}{@{}rcl@{}} \displaystyle S = \sum\limits_{i = 1}^{n} \sum\limits_{j = 1}^{7} \left( X_{ij} - \overline{X} \right)^{2}, \end{array} $$
where
$$ \begin{array}{@{}rcl@{}} \displaystyle \overline{X} = \frac {\sum\limits_{i = 1}^{n} \sum\limits_{j = 1}^{7} X_{ij}}{7n}; \end{array} $$
d) compute the equivariant estimator δ0 by \(\delta _{0} = \frac {S}{n + 2}\);
e) compute the estimator δB given by Theorem 2.2 with b = − 2 and g(ν) = e−ν;
f) repeat steps b) to e) one thousand times;
g) compute the biases of the estimators as
$$ \begin{array}{@{}rcl@{}} \displaystyle \text{bias} \left( \delta_{0} \right) = \frac {1}{1000} \sum\limits_{i = 1}^{1000} \left( \delta_{0, i} - \sigma^{2} \right) \end{array} $$
and
$$ \begin{array}{@{}rcl@{}} \displaystyle \text{bias} \left( \delta_{B} \right) = \frac {1}{1000} \sum\limits_{i = 1}^{1000} \left( \delta_{B, i} - \sigma^{2} \right), \end{array} $$
where δ0,i and δB,i denote the estimates of δ0 and δB, respectively, in the ith iteration;
h) compute the mean squared errors of the estimators as
$$ \begin{array}{@{}rcl@{}} \displaystyle \text{mse} \left( \delta_{0} \right) = \frac {1}{1000} \sum\limits_{i = 1}^{1000} \left( \delta_{0, i} - \sigma^{2} \right)^{2} \end{array} $$
and
$$ \begin{array}{@{}rcl@{}} \displaystyle \text{mse} \left( \delta_{B} \right) = \frac {1}{1000} \sum\limits_{i = 1}^{1000} \left( \delta_{B, i} - \sigma^{2} \right)^{2}. \end{array} $$
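A minimal Python analogue of steps a) to h) is sketched below. It simulates directly from model (1) rather than from n replicated seven-variate vectors, and the settings (n, seed, grid size) are illustrative; the integral representation of δB matches the R code in Appendix A:

```python
import numpy as np

rng = np.random.default_rng(1)
p, n, sigma2, nsim = 7, 30, 1.0, 1000     # illustrative settings
theta = np.zeros(p)

b = -2                                    # prior settings from step e): g(nu) = exp(-nu)
A = (n + p) / 2 + b + 3
lam = np.linspace(1e-6, 1.0, 20_000)
dlam = lam[1] - lam[0]
w = lam ** (p / 2 + b) * np.exp(-(1 - lam) / lam)

d0 = np.empty(nsim)
dB = np.empty(nsim)
for i in range(nsim):
    X = rng.normal(theta, np.sqrt(sigma2), size=p)  # X ~ N_p(theta, sigma2 I_p)
    S = sigma2 * rng.chisquare(n)                   # S ~ sigma2 chi^2_n
    F = X @ X / S
    num = (w * (1 + lam * F) ** (-(A - 1))).sum() * dlam
    den = (w * (1 + lam * F) ** (-A)).sum() * dlam
    d0[i] = S / (n + 2)
    dB[i] = S / (n + p + 2 * b + 4) * num / den

bias0 = (d0 - sigma2).mean()
biasB = (dB - sigma2).mean()
mse0 = ((d0 - sigma2) ** 2).mean()
mseB = ((dB - sigma2) ** 2).mean()
```

Since E(δ0) = nσ2/(n + 2), a small negative bias of δ0 is expected here; computing both estimators from the same draws makes the comparison between δ0 and δB low-variance.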
Plots of the biases and mean squared errors versus n = 10,11,…,100 when 𝜃 = (0,0,0,0,0,0,0) and σ2 = 1 are shown in Figs. 1 and 2. Plots of the biases and mean squared errors versus 𝜃0 = − 50,− 49,…,50 when \(\theta = \left (\theta _{0}, \theta _{0}, \theta _{0}, \theta _{0}, \theta _{0}, \theta _{0}, \theta _{0} \right )\), n = 100 and σ2 = 1 are shown in Figs. 3 and 4. Plots of the biases and mean squared errors versus σ2 = 1,2,…,100 when 𝜃 = (0,0,0,0,0,0,0) and n = 100 are shown in Figs. 5 and 6.
We can observe the following from Figs. 1 to 6. The biases are generally negative for both estimators and approach zero in magnitude as n increases; δB has smaller bias for every n. The mean squared errors approach zero as n increases, and δB has smaller mean squared error for every n. As 𝜃0 increases from − 50 to 50, the biases decrease from positive to negative and are smallest in magnitude at 𝜃0 = 0; the mean squared errors are parabolic in 𝜃0 and smallest at 𝜃0 = 0. As σ2 increases from 1 to 100, the biases are negative and decreasing, being smallest in magnitude at σ2 = 1; the mean squared errors increase with σ2 and are smallest at σ2 = 1.
The computations were performed using the R software (R Development Core Team, 2023) and the package mvtnorm (Genz et al., 2021). The code used is given in Appendix A.
Code Availability
The code is given in Appendix A.
References
Brewster, J. F. and Zidek, J. V. (1974). Improving on equivariant estimators. Ann. Stat. 2, 21–38.
Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F., Bornkamp, B., Maechler, M. and Hothorn, T. (2021). mvtnorm: Multivariate normal and t distributions. R package version 1.1-3.
Ghosh, M. (1994). On some Bayesian solutions of the Neyman–Scott problem.
Kubokawa, T. (1994). A unified approach to improving equivariant estimators. Ann. Stat. 22, 290–299.
Maruyama, Y. and Strawderman, W. E. (2006). A new class of minimax generalized Bayes estimators of a normal variance. J. Stat. Plan. Inference 136, 3822–3836.
R Development Core Team (2023). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
Stein, C. (1964). Inadmissibility of the usual estimator for the variance of a normal distribution with unknown mean. Ann. Inst. Stat. Math. 16, 155–160.
Strawderman, W. E. (1974). Minimax estimation of powers of the variance of a normal population under squared error loss. Ann. Stat. 2, 190–198.
Wells, M. T. and Zhou, G. (2008). Generalized Bayes minimax estimators of the mean of a multivariate normal distribution with unknown variance. J. Multivar. Anal. 99, 2208–2220.
Acknowledgments
The authors would like to thank the Editor, the Associate Editor and the referee for careful reading and comments which greatly improved the paper.
Ethics declarations
Conflict of Interest
The authors declare no conflicts of interest.
Consent for Publication
All authors gave explicit consent to publish this manuscript.
Appendix A: R codes
library(mvtnorm)

###############################################################
# computes the bias and mean squared error with respect to n  #
###############################################################
nn = seq(10, 100)
bias1 = nn; bias2 = nn; mse1 = nn; mse2 = nn
nsim = 1000
est1 = rep(0, nsim); est2 = est1
for (n in seq(10, 100)) {
  for (i in 1:nsim) {
    x = rmvnorm(n, mean = rep(0, 7), sigma = diag(7))
    mm = mean(x)
    S = sum((x - mm)**2)
    tt = 0
    for (j in seq(1, n)) tt = tt + sum((x[j, ])**2)  # inner index renamed so it does not clobber i
    F = tt / S
    f1 = function(x) {x**(3/2) * exp((x - 1)/x) * (1 + x*F)**(-(n + 7)/2)}
    f2 = function(x) {x**(3/2) * exp((x - 1)/x) * (1 + x*F)**(-(n + 7)/2 - 1)}
    est1[i] = S / (n + 2)
    est2[i] = S * integrate(f1, lower = 0, upper = 1)$value /
      ((n + 7) * integrate(f2, lower = 0, upper = 1)$value)
  }
  bias1[n - 9] = mean(est1 - 1)   # sigma^2 = 1 in this experiment
  bias2[n - 9] = mean(est2 - 1)
  mse1[n - 9] = mean((est1 - 1)**2)
  mse2[n - 9] = mean((est2 - 1)**2)
}
##################################################################
# computes the bias and mean squared error with respect to sigma #
##################################################################
nn = seq(1, 100)
bias1 = nn; bias2 = nn; mse1 = nn; mse2 = nn
nsim = 1000
est1 = rep(0, nsim); est2 = est1
n = 100
for (s in seq(1, 100)) {
  for (i in 1:nsim) {
    x = rmvnorm(n, mean = rep(0, 7), sigma = s * diag(7))
    mm = mean(x)
    S = sum((x - mm)**2)
    tt = 0
    for (j in seq(1, n)) tt = tt + sum((x[j, ])**2)
    F = tt / S
    f1 = function(x) {x**(3/2) * exp((x - 1)/x) * (1 + x*F)**(-(n + 7)/2)}
    f2 = function(x) {x**(3/2) * exp((x - 1)/x) * (1 + x*F)**(-(n + 7)/2 - 1)}
    est1[i] = S / (n + 2)
    est2[i] = S * integrate(f1, lower = 0, upper = 1)$value /
      ((n + 7) * integrate(f2, lower = 0, upper = 1)$value)
  }
  bias1[s] = mean(est1 - s)
  bias2[s] = mean(est2 - s)
  mse1[s] = mean((est1 - s)**2)
  mse2[s] = mean((est2 - s)**2)
}

###################################################################
# computes the bias and mean squared error with respect to theta  #
###################################################################
nn = seq(1, 101)
bias1 = nn; bias2 = nn; mse1 = nn; mse2 = nn
nsim = 1000
est1 = rep(0, nsim); est2 = est1
n = 100
for (mu in seq(-50, 50)) {
  for (i in 1:nsim) {
    x = rmvnorm(n, mean = rep(mu, 7), sigma = diag(7))
    mm = mean(x)
    S = sum((x - mm)**2)
    tt = 0
    for (j in seq(1, n)) tt = tt + sum((x[j, ])**2)
    F = tt / S
    f1 = function(x) {x**(3/2) * exp((x - 1)/x) * (1 + x*F)**(-(n + 7)/2)}
    f2 = function(x) {x**(3/2) * exp((x - 1)/x) * (1 + x*F)**(-(n + 7)/2 - 1)}
    est1[i] = S / (n + 2)
    est2[i] = S * integrate(f1, lower = 0, upper = 1)$value /
      ((n + 7) * integrate(f2, lower = 0, upper = 1)$value)
  }
  bias1[mu + 51] = mean(est1 - 1)   # sigma^2 = 1 in this experiment
  bias2[mu + 51] = mean(est2 - 1)
  mse1[mu + 51] = mean((est1 - 1)**2)
  mse2[mu + 51] = mean((est2 - 1)**2)
}
Zinodiny, S., Nadarajah, S. Generalized Bayes Minimax Estimators of the Variance of a Multivariate Normal Distribution. Sankhya A 85, 1667–1683 (2023). https://doi.org/10.1007/s13171-023-00311-z