Generalized Bayes Minimax Estimators of the Variance of a Multivariate Normal Distribution

Zinodiny, Shokofeh; Nadarajah, Saralees

doi:10.1007/s13171-023-00311-z

Generalized Bayes Minimax Estimators of the Variance of a Multivariate Normal Distribution

Published: 17 April 2023

Volume 85, pages 1667–1683, (2023)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Sankhya A Aims and scope Submit manuscript

Generalized Bayes Minimax Estimators of the Variance of a Multivariate Normal Distribution

Download PDF

102 Accesses
Explore all metrics

Abstract

The problem of estimating the variance of a multivariate normal distribution is considered under quadratic loss. A large class of generalized Bayes minimax estimators for the variance is found. This class include estimators obtained by Ghosh (1994). A simulation study shows superior performance of our estimators.

Regression analysis: likelihood, error and entropy

Article 23 March 2018

A family of the adjusted estimators maximizing the asymptotic predictive expected log-likelihood

Article 09 December 2016

Maximum a posteriori estimators as a limit of Bayes estimators

Article 30 January 2018

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Let X and S be independent random variables having

$$ \begin{array}{@{}rcl@{}} \displaystyle X \sim N_{p} \left( \theta, \sigma^{2} I_{p} \right), \ \displaystyle S \sim \sigma^{2} {\chi_{n}^{2}}, \end{array} $$

(1)

where $ N_{p} \left (\theta , \sigma ^{2} I_{p} \right )$ denotes the p-variate normal distribution with unknown mean vector 𝜃 and covariance matrix σ²I_p, where I_p denotes the p × p identity matrix, and ${\chi _{n}^{2}}$ denotes a chisquare variable with n degree of freedom. We consider the problem of estimating σ² when the loss function is

$$ \begin{array}{@{}rcl@{}} \displaystyle L\left( \delta; \sigma^{2} \right) = \left( \frac {\delta}{\sigma^{2}} - 1 \right)^{2}, \end{array} $$

(2)

where δ = δ(X,S) is an estimator of σ².

The best affine equivariant estimator is δ₀ = (n + 2)^− 1S which is a minimax estimator with constant risk 2(n + 2)^− 1. Stein (1964) showed that δ₀ can be improved by considering a class of scale equivariant estimators $\delta = \left ({n + 2} \right )^{-1} \left (1 - \phi (F) \right )S$ for $F = \frac {X^{\prime }X}{S}$. He found a specific better estimator $\delta ^{S} = \left (n + 2 \right )^{-1} \left (1 - \phi ^{S} (F) \right )S$, where $\phi ^{S} (F) = {\max \limits } \left \{ 0, \frac {p - (n + 2)F}{p + n + 2} \right \}$. Brewster and Zidek (1974) obtained an improved generalized Bayes estimator $\delta ^{BZ} = \left (n + 2 \right )^{-1} \left (1 - \phi ^{BZ} (F) \right )S$, where

$$ \begin{array}{@{}rcl@{}} \displaystyle \phi^{BZ} (F) = 1 - \frac {n + 2}{p + n + 2} \frac {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 1} (1 + \lambda F)^{-\frac {p + n}{2} - 1} d\lambda} {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 1} (1 + \lambda F)^{-\frac {p + n}{2} - 2} d\lambda}. \end{array} $$

(3)

They also gave a general sufficient condition for minimaxity, using an integral expression of the difference in risks between δ₀ and δ. Strawderman (1974) derived another sufficient condition for minimaxity. Using conditions of Brewster and Zidek (1974), Ghosh (1994) obtained a class of generalized Bayes estimators for σ². Maruyama and Strawderman (2006) proposed another class of improved generalized Bayes estimators.

In this paper, we derive a large class of generalized Bayes minimax estimators of σ² which contains estimators of Ghosh (1994) as special cases. To do so, we use techniques of Wells and Zhou (2008) and Brewster and Zidek (1974). Section 3 considers some examples of classes of generalized Bayes minimax estimators. In particular, Example 1 demonstrates that a result in Ghosh (1994) follows from our main theorem. Section 4 compares the minimax estimators of Sections 2, 3 and the equivariant estimator δ₀ by simulation.

2 A class of generalized Bayes minimax estimators

In this section, we consider the problem of estimating σ² in (1) under the loss function (2). Our main result is Theorem 2.2. Before stating and proving this theorem, we state a theorem due to Brewster and Zidek (1974) and Kubokawa (1994), discuss a class of priors, borrow some notations from Wells and Zhou (2008) and state and prove two technical lemmas (Lemmas 2.1 and 2.2).

Brewster and Zidek (1974) derived general sufficient conditions for minimaxity of estimators having the form $\delta = \left (n + 2 \right )^{- 1} \left (1 - \phi (F) \right )S$, where ϕ(F) is a function of $F=\frac {X^{\prime }X}{S}$. For the purpose of verifying the minimaxity of a generalized Bayes estimator, we use the following specialized result.

Theorem 2.1.

The estimator δ(X,S) given by

$$ \begin{array}{@{}rcl@{}} \displaystyle \delta = \left( n + 2 \right)^{-1} \left( 1 - \phi (F) \right)S \end{array} $$

(4)

is minimax for σ² under the loss function (2) provided that the following conditions hold

ϕ(F) is nonincreasing,
0 ≤ ϕ(F) ≤ ϕ^BZ(F), where ϕ^BZ(F) is given by (3).

Proof 1.

See Brewster and Zidek (1974) and Kubokawa (1994). □

Now, we construct generalized Bayes minimax estimators of σ² under the loss function (2). To do so, we consider the following class of prior distributions.

For η = σ², let the conditional distribution of 𝜃 given ν and η be normal with zero mean vector and covariance matrix νη^− 1I_p and let the generalized density of (ν,η) given by h(ν,η) = η^bg(ν), ν > 0, η > 0, where $b > -\frac {n+p}{2}-1$ and g(ν) is a continuously differentiable positive function on $[0, \infty )$ such that the following conditions hold

${{\int \limits }_{0}^{1}} \lambda ^{\frac {p}{2} - 2} g\left (\frac {1 - \lambda }{\lambda } \right ) d\lambda < \infty $,
$\mathop {\lim }\limits _{\nu \to \infty } \frac {g(\nu )}{(1 + \nu )^{\frac {p}{2} - 1}} = 0$.

In the following discussion, we obtain conditions on g and b such that the generalized Bayes estimators satisfy the conditions of Theorem 2.1, and hence are minimax. Note that the joint density function f(η,x,s) of η, X, S is

$$ \begin{array}{@{}rcl@{}} \displaystyle f(\eta, x, s) &\propto& \displaystyle {\int}_{0}^{\infty} {\int}_{R^{p}} \eta^{\frac {p}{2}} e^{-\frac {\eta \left\| x - \theta \right\|^{2}}{2}} \nu^{-\frac {p}{2}} \eta^{\frac {p}{2}} e^{-\frac {\eta \left\| \theta \right\|^{2}}{2\nu}} g\left( \nu \right)\eta^{b} \eta^{\frac {n}{2}} e^{-\frac {\eta s}{2\nu}} d\theta d\nu \\ &\propto& \displaystyle {\int}_{0}^{\infty} {\int}_{R^{p}} \eta^{\frac {2p + n}{2} + b} e^{-\frac {\eta}{2}\left[ \left\| x - \theta \right\|^{2} + \frac {\left\| \theta \right\|^{2}}{\nu } \right]} \nu^{-\frac {p}{2}} g\left( \nu \right)e^{-\frac {\eta s}{2\nu}} d\theta d\nu \\ &\propto& \displaystyle {\int}_{0}^{\infty} \eta^{\frac {p + n}{2} + b} g\left( \nu \right)\left( 1 + \nu \right)^{-\frac {p}{2}} e^{-\frac {\eta }{2} \left[ s + \frac {\left\| x \right\|^{2}}{1 + \nu} \right]} d\nu, \end{array} $$

where $\left \| . \right \|$ denotes the Euclidean norm. Therefore, the generalized Bayes estimator of η = σ^− 2 with respect to the loss function (2) is

$$ \begin{array}{@{}rcl@{}} \displaystyle \delta_{B} &=& \displaystyle \frac {E\left[ \left. \eta \right|X,S \right]} {E\left[ \left. \eta^{2} \right|X,S \right]} \\ &=& \displaystyle \frac {{\int}_{0}^{\infty} {\int}_{0}^{\infty} \eta^{\frac {p + n}{2} + b + 1} g\left( \nu \right)\left( 1 + \nu \right)^{-\frac {p}{2}} e^{-\frac {\eta S}{2}\left( 1 + \frac {F}{1 + \nu} \right)} d\eta d\nu} {{\int}_{0}^{\infty} {\int}_{0}^{\infty} \eta^{\frac {p + n}{2} + b+ 2} g\left( \nu \right)\left( 1 + \nu \right)^{-\frac {p}{2}} e^{-\frac {\eta S}{2}\left( 1 + \frac {F}{1 + \nu } \right)} d\eta d\nu} \\ &=& \displaystyle \frac {S}{n + p + 2b+ 4} \frac {{\int}_{0}^{\infty} g\left( \nu \right)\left( 1 + \nu \right)^{-\frac {p}{2}} \left( 1 + \frac {F}{1 + \nu} \right)^{-\frac {n + p}{2} - b - 2} d\nu} {{\int}_{0}^{\infty} g\left( \nu \right)\left( 1 + \nu \right)^{-\frac {p}{2}} \left( 1 + \frac {F}{1 + \nu} \right)^{-\frac {n + p}{2} - b - 3} d\nu}. \end{array} $$

Using the change of variables $\lambda = \frac {1}{1+\nu }$, we have

$$ \begin{array}{@{}rcl@{}} \displaystyle \delta_{B} = \frac {S}{n + p + 2b+ 4} \frac {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 2} g\left( \frac {1 - \lambda}{\lambda} \right) \left( 1 + \lambda F \right)^{-\frac {n + p}{2} - b - 2} d\lambda} {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 2} g\left( \frac {1 - \lambda}{\lambda} \right) \left( 1 + \lambda F \right)^{-\frac {n + p}{2} - b - 3} d\lambda}. \end{array} $$

(5)

This estimator is of the form (4) with

$$ \begin{array}{@{}rcl@{}} \displaystyle \phi \left( F \right) = 1 - d\left( 1 + r(F)\right), \end{array} $$

where $d = \frac {n + 2}{n + p + 2b + 4}$ and

$$ \begin{array}{@{}rcl@{}} \displaystyle r(F) = F \frac {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 1} g\left( \frac {1 - \lambda}{\lambda} \right) \left( 1 + \lambda F \right)^{-A} d\lambda} {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 2} g\left( \frac {1 - \lambda}{\lambda} \right) \left( 1 + \lambda F \right)^{-A} d\lambda}, \end{array} $$

(6)

where $A=\frac {n+p}{2}+b+3$.

To continue discussion, we need the following notations borrowed from Wells and Zhou (2008). Define the function I_α,A,g(F) as

$$ \begin{array}{@{}rcl@{}} \displaystyle I_{\alpha, A, g} \left( F \right) = {{\int}_{0}^{1}} \lambda^{\alpha} \left( 1 + \lambda F \right)^{-A} g\left( \frac {1 - \lambda}{\lambda} \right)d\lambda. \end{array} $$

(7)

Using integration by parts, we obtain

$$ \begin{array}{@{}rcl@{}} \displaystyle FI_{\frac {p}{2} - 1, A, g} \left( F \right) &=& \displaystyle {{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 1} g\left( \frac {1 - \lambda}{\lambda} \right) d\left[ \frac {\left( 1 + \lambda F \right)^{1 - A}}{1 - A} \right] \\ &=& \displaystyle g(0)\frac {\left( 1 + F \right)^{1 - A}}{1 - A} \\ && \displaystyle +\frac {\frac {p}{2} - 1}{A - 1} {{\int}_{0}^{1}} \left( 1 + \lambda F \right)^{-A} (1 + \lambda F) \lambda^{\frac {p}{2} - 2} g\left( \frac {1 - \lambda}{\lambda} \right) d\lambda \\ && \displaystyle -\frac {1}{A - 1} {{\int}_{0}^{1}} \left( 1 + \lambda F \right)^{-A} (1 + \lambda F) \frac {1}{\lambda^{2}}\lambda^{\frac {p}{2} - 1} g\left( \frac {1 - \lambda}{\lambda} \right) d\lambda. \end{array} $$

(8)

Also, we define the functions $J_{a} \left (g\left (\frac {F - u}{u} \right ) \right ) $ and $ J_{a} \left (\frac {Au}{1 + u} g\left (\frac {F - u}{u} \right ) \right )$ as

$$ \begin{array}{@{}rcl@{}} \displaystyle J_{a} \left( g\left( \frac {F - u}{u} \right) \right) = {{\int}_{0}^{F}} u^{a} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right)du = F^{a + 1} I_{a, A, g} \left( F \right) \end{array} $$

(9)

and

$$ \begin{array}{@{}rcl@{}} \displaystyle J_{a} \left( \frac {Au}{1 + u} g\left( \frac {F - u}{u} \right) \right) &=& \displaystyle {{\int}_{0}^{F}} u^{a} \left( 1 + u \right)^{-A} \frac {Au}{1 + u}g\left( \frac {F - u}{u} \right)du \\ &=& \displaystyle F^{a + 1} {{\int}_{0}^{1}} \lambda^{a} \left( 1 + \lambda F \right)^{-A} g\left( \frac {1 - \lambda}{\lambda} \right) \frac {A\lambda F}{1 + \lambda F}d\lambda,\\ \end{array} $$

(10)

respectively. By integration by parts, we have

$$ \begin{array}{@{}rcl@{}} J_{a} \left( \frac {Au}{1 + u} g\left( \frac {F - u}{u} \right) \right) &=& {{\int}_{0}^{F}} u^{a} \left( 1 + u \right)^{-A} \frac {Au}{1 + u}g\left( \frac {F - u}{u} \right)du \\ &=& -{{\int}_{0}^{F}} u^{a + 1} g\left( \frac {F - u}{u} \right)d \left( {1 + u} \right)^{-A} \\ &=& -F^{a + 1} g(0)\left( 1 + F \right)^{-A} + (a + 1) {{\int}_{0}^{F}} \left( 1 + u \right)^{-A} u^{a} g\left( \frac {F - u}{u} \right) du \\ && +{{\int}_{0}^{F}} \left( 1 + u \right)^{-A} u^{a + 1} g^{\prime}\left( \frac {F - u}{u} \right) \left( -\frac {F}{u^{2}} \right) du. \end{array} $$

(11)

To show ϕ(F) is a decreasing function in F, it is sufficient to show that r(F) is an increasing function in F. The following lemma gives conditions under which $\widetilde r(F)=F^{c} r(F)$ is an increasing function in F.

Lemma 2.1.

If $\psi \left (\nu \right ) = -\left (1 + \nu \right ) \frac {g^{\prime }\left (\nu \right )}{g\left (\nu \right )}$ can be decomposed as l₁(ν) + l₂(ν), where l₁(ν) is increasing in ν and 0 ≤ l₂(ν) ≤ c, a constant, then $\widetilde r(F) = F^{c} r(F)$ is nondecreasing.

Proof 2.

The proof is similar to the proof of Lemma 3.2 in Wells and Zhou (2008). Differentiating $\widetilde r(F) = F^{c} r(F)$ with respect to F, we have

$$ \begin{array}{@{}rcl@{}} \displaystyle \frac {\partial \widetilde r(F)}{\partial F} = F^{c} \left( c\frac {r(F)}{F} + r^{\prime}(F) \right) = F^{c} \left( (1 + c)R(F) + FR^{\prime}(F) \right), \end{array} $$

where $R(F) = \frac {r(F)}{F}$. Therefore, $\frac {\partial \widetilde r(F)}{\partial F} \ge 0$ is equivalent to

$$ \begin{array}{@{}rcl@{}} \displaystyle (1 + c)\frac {I_{\frac {p}{2} - 1, A, g} \left( F \right)} {I_{\frac {p}{2} - 2, A, g} \left( F \right)} + F\frac {\left\{ I^{\prime}_{\frac {p}{2} - 1, A, g} \left( F \right) I_{\frac {p}{2} - 2, A, g} \left( F \right) - I^{\prime}_{\frac {p}{2} - 2, A, g} \left( F \right) I_{\frac {p}{2} - 1,A,g} \left( F \right) \right\}}{I_{\frac {p}{2} - 2, A, g}^{2} \left( F \right)} \ge 0, \end{array} $$

which in turn is equivalent to

$$ \begin{array}{@{}rcl@{}} \displaystyle &&-FI^{\prime}_{\frac {p}{2} - 1, A, g} \left( F \right)I_{\frac {p}{2} - 2, A, g} \left( F \right) \le (1 + c)I_{\frac {p}{2} - 2, A, g} \left( F \right)I_{\frac {p}{2} - 1, A, g} \left( F \right) \\&&- FI^{\prime}_{\frac {p}{2} - 2, A, g} \left( F \right)I_{\frac {p}{2} - 1, A, g} \left( F \right). \end{array} $$

(12)

Now, we see

$$ \begin{array}{@{}rcl@{}} \displaystyle -FI^{\prime}_{a,A,g} \left( F \right) = {{\int}_{0}^{1}} \lambda^{a} \left( 1 + \lambda F \right)^{-A} g\left( \frac {1 - \lambda}{\lambda} \right) \frac {A\lambda F}{1 + \lambda F}d\lambda. \end{array} $$

Using (9) and (10), (12) can be written as

$$ \begin{array}{@{}rcl@{}} \displaystyle \frac {J_{\frac {p}{2} - 1} \left( \frac {Au}{1 + u} g\left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) \right)} \le 1 + c + \frac {J_{\frac {p}{2} - 2} \left( \frac {Au}{1 + u} g\left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) \right)}. \end{array} $$

(13)

By applying (11), (13) is equivalent to

$$ \begin{array}{@{}rcl@{}} && \displaystyle \frac {-F^{\frac {p}{2}} g(0)(1 + F)^{-A}} {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) \right)} + \left( \frac {p}{2} \right) + \frac {{{\int}_{0}^{F}} u^{\frac {p}{2} - 1} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right) \left[ \frac {g^{\prime}\left( \frac {F - u}{u} \right)}{g\left( \frac {F - u}{u} \right)} \left( -\frac {F}{u} \right) \right]du} {{{\int}_{0}^{F}} u^{\frac {p}{2} - 1} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right)du} \\ &\le& \displaystyle \!\!1 + c + \frac {-F^{\frac {p}{2} - 1} g(0)(1 + F)^{-A}} {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) \right)} + \frac {p}{2} - 1 + \frac {{{\int}_{0}^{F}} u^{\frac {p}{2} - 2} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right) \left[ \frac {g^{\prime}\left( \frac {F - u}{u} \right)}{g\left( \frac {F - u}{u} \right)} \left( -\frac {F}{u} \right) \right]du} {{{\int}_{0}^{F}} u^{\frac {p}{2} - 2} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right)du}, \end{array} $$

which in turn is equivalent to

$$ \begin{array}{@{}rcl@{}} && \displaystyle \frac {-g(0)(1 + F)^{-A}}{I_{\frac {p}{2} - 1, A, g} \left( F \right)} + \frac {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) l_{1} \left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) \right)} + \frac {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right)l_{2} \left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) \right)} \\ &\le& \displaystyle c + \frac {-g(0)(1 + F)^{-A}}{I_{\frac {p}{2} - 2, A, g} \left( F \right)} + \frac {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) l_{1} \left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) \right)} + \frac {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) l_{2} \left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) \right)}. \\ \end{array} $$

(14)

Since $I_{\frac {p}{2} - 1, A, g} \left (F \right ) \le I_{\frac {p}{2} - 2, A, g} \left (F \right )$, we have

$$ \begin{array}{@{}rcl@{}} \frac {- g(0)(1 + F)^{-A}}{I_{\frac {p}{2} - 1, A, g} \left( F \right)} \le \frac {-g(0)(1 + F)^{-A}}{I_{\frac {p}{2} - 2, A, g} \left( F \right)}. \end{array} $$

Note also that l₁(ν) is increasing in ν implies that for all F fixed, $l_{1} \left (\frac {F - u}{u} \right )$ is decreasing in u. When t < u, we have

$$ \begin{array}{@{}rcl@{}} \frac {u^{\frac {p}{2} - 2} \left( {1 + u} \right)^{-A} g\left( \frac {F - u}{u} \right) 1\left( u \le F \right)} {t^{\frac {p}{2} - 2} \left( 1 + t \right)^{-A} g\left( \frac {F - t}{t} \right) 1\left( {t \le F} \right)} \le \frac {u^{\frac {p}{2} - 1} \left( {1 + u} \right)^{-A} g\left( \frac {F - u}{u} \right) 1\left( u \le F \right)} {t^{\frac {p}{2} - 1} \left( 1 + t \right)^{-A} g\left( \frac {F - t}{t} \right)1\left( t \le F \right)}. \end{array} $$

By a monotone likelihood argument, we have

$$ \begin{array}{@{}rcl@{}} && \frac {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) l_{1} \left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) \right)} = \frac {{{\int}_{0}^{F}} u^{\frac {p}{2} - 1} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right) l_{1} \left( \frac {F - u}{u} \right)du} {{{\int}_{0}^{F}} u^{\frac {p}{2} - 1} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right)du} \\ &\le& \frac {{{\int}_{0}^{F}} u^{\frac {p}{2} - 2} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right)l_{1} \left( \frac {F - u}{u} \right)du} {{{\int}_{0}^{F}} u^{\frac {p}{2} - 2} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right)du} = \frac {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) l_{1} \left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) \right)} \end{array} $$

and

$$ \begin{array}{@{}rcl@{}} 0 \le \frac {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) l_{2} \left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) \right)} \le c, \ 0 \le \frac {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) l_{2} \left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) \right)} \le c. \end{array} $$

Thus, we have established the inequality (14) and the proof is complete. □

The next lemma gives conditions under which a lower bound of r(F) can be determined.

Lemma 2.2.

With the regularity conditions C1 and C2, assume that $\psi \left (\nu \right ) = -\left (1 + \nu \right ) \frac {g^{\prime }\left (\nu \right )}{g\left (\nu \right )} \ge M$, where M is a finite real number. For the r(F) function, we have

$$ \begin{array}{@{}rcl@{}} \displaystyle r(F) \ge \frac {\frac {p}{2} - 1 + M}{A - \frac {p}{2} - M}. \end{array} $$

Proof 3.

The proof is similar to the proof of Lemma 3.1 in Wells and Zhou (2008). According to (6), we have

$$ \begin{array}{@{}rcl@{}} \displaystyle r(F) = F\frac {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 1} g\left( \frac {1 - \lambda}{\lambda} \right) \left( 1 + \lambda F \right)^{-A} d\lambda} {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 2} g\left( \frac {1 - \lambda}{\lambda} \right) \left( 1 + \lambda F \right)^{-A} d\lambda} = F \frac {I_{\frac {p}{2} - 1, A, g} \left( F \right)}{I_{\frac {p}{2} - 2, A, g} \left( F \right)}. \end{array} $$

Using (7) and (8), we obtain

$$ \begin{array}{@{}rcl@{}} \displaystyle N_{1} = \frac {1}{A - 1}{{\int}_{0}^{1}}\! \left( 1 + \lambda F \right)^{-A} \left( \frac {p}{2} - 1 \right) \lambda^{\frac {p}{2} - 2} g\left( \frac{1 - \lambda}{\lambda} \right)d\lambda = \frac {\frac {p}{2} - 1}{A - 1} I_{\frac {p}{2} - 2, A, g} \left( F \right), \end{array} $$

$$ \begin{array}{@{}rcl@{}} \displaystyle N_{2} &=& \displaystyle \frac {1}{A - 1} {{\int}_{0}^{1}} \left( 1 + \lambda F \right)^{-A} \lambda^{\frac {p}{2} - 2} g^{\prime}\left( \frac {1 - \lambda}{\lambda} \right) \left( \frac {-\lambda}{\lambda^{2}} \right)d\lambda \\ &=& \displaystyle \frac {1}{A - 1} {{\int}_{0}^{1}} \left( 1 + \lambda F \right)^{-A} \lambda^{\frac {p}{2} - 2} g\left( \frac {1 - \lambda}{\lambda} \right) \left[ \frac {g^{\prime}\left( \frac {1 - \lambda}{\lambda} \right)} {g\left( \frac {1 - \lambda}{\lambda} \right)} \left( -\frac {1 - \lambda}{\lambda } - 1 \right) \right]d\lambda \\ &=& \displaystyle \frac {I_{\frac {p}{2} - 2, A, g} \left( F \right)}{A - 1} \frac {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 2} \left( 1 + \lambda F \right)^{-A} g\left( \frac {1 - \lambda}{\lambda} \right) \psi \left( \frac {1 - \lambda}{\lambda} \right)d\lambda} {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 2} \left( 1 + \lambda F \right)^{-A} g\left( \frac {1 - \lambda}{\lambda} \right)d\lambda} \\ &\ge& \displaystyle \frac {M}{A - 1} I_{\frac {p}{2} - 2, A, g} \left( F \right), \end{array} $$

$$ \begin{array}{@{}rcl@{}} \displaystyle N_{3} = \frac {\frac {p}{2} - 1}{A - 1}FI_{\frac {p}{2} - 1, A, g} \left( F \right) = \frac {\left( \frac {p}{2} - 1 \right)r(F)}{A - 1} I_{\frac {p}{2} - 2, A, g} \left( F \right) \end{array} $$

and

$$ \begin{array}{@{}rcl@{}} \displaystyle N_{4} &=& \displaystyle \frac {I_{\frac {p}{2} - 2, A, g} \left( F \right)}{A - 1} \frac {F{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 1} \left( 1 + \lambda F \right)^{-A} g^{\prime}\left( \frac {1 - \lambda}{\lambda} \right) \left( \frac {-1}{\lambda} \right)d\lambda}{I_{\frac {p}{2} - 2, A, g} \left( F \right)} \\ &=& \displaystyle \frac {I_{\frac {p}{2} - 2, A, g} \left( F \right)}{A - 1} \frac {F{{\int}_{0}^{1}} \left( 1 + \lambda F \right)^{-A} \lambda^{\frac {p}{2} - 1} g\left( \frac {1 - \lambda}{\lambda} \right) \left[ \frac {g^{\prime}\left( \frac {1 - \lambda}{\lambda} \right)} {g\left( \frac {1 - \lambda}{\lambda} \right)} \left( -\frac {1 - \lambda}{\lambda} - 1 \right) \right]d\lambda} {I_{\frac {p}{2} - 2, A, g} \left( F \right)} \\ &=& \displaystyle \frac {I_{\frac {p}{2} - 2, A, g} \left( F \right)}{A - 1} \frac {F{{\int}_{0}^{1}} \left( 1 + \lambda F \right)^{-A} \lambda^{\frac {p}{2} - 1} g\left( \frac {1 - \lambda}{\lambda} \right) \psi \left( \frac {1 - \lambda}{\lambda} \right)d\lambda}{I_{\frac {p}{2} - 2, A, g} \left( F \right)} \\ &\ge& \displaystyle \frac {M r(F)}{A - 1} I_{\frac {p}{2} - 2, A, g} \left( F \right). \end{array} $$

Combining all the terms, we obtain the following inequality

$$ \begin{array}{@{}rcl@{}} \displaystyle (A - 1)r(F) \ge \left( \frac {p}{2} - 1 \right) + M + \left( \frac {p}{2} - 1 \right)r(F) + Mr(F), \end{array} $$

implying

$$ \begin{array}{@{}rcl@{}} \displaystyle r(F) \ge \frac {\frac {p}{2} - 1 + M}{A - \frac {p}{2} - M}. \end{array} $$

Thus, we have the needed bound on the r(F) function. □

Now we use Lemmas 2.1 and 2.2 to show minimaxity of the generalized Bayes estimator δ_B in (5). In fact, this is our main result.

Theorem 2.2.

If $\psi \left (\nu \right ) = -\left (1 + \nu \right )\frac {g^{\prime }\left (\nu \right )}{g\left (\nu \right )}$ is increasing in ν and $\psi \left (\nu \right ) = -\left (1 + \nu \right )\frac {g^{\prime }\left (\nu \right )}{g\left (\nu \right )} \ge M$, where M is a finite real number and also 1 ≤ d $\left (1 + \left (\frac {\frac {p}{2} - 1 + M}{A - \frac {p}{2} - M} \right ) \right )$, then δ_B in (5) is minimax under the loss function (2).

Proof 4.

First, assume l₂(ν) = 0 and $l_{1}(\nu )=\psi \left (\nu \right )$. By using Lemma 2.1 for the case c = 0, we see that r(F) is an increasing function in F, hence ϕ(F) is deceasing in F. By (2), we have

$$ \begin{array}{@{}rcl@{}} \displaystyle \varphi^{BZ} \left( F \right) = 1 - \frac {n + 2}{p + n + 2} r_{1} (F), \end{array} $$

where

$$ \begin{array}{@{}rcl@{}} \displaystyle r_{1} (F) = \frac {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 1} \left( 1 + \lambda F \right)^{-\frac {n + p}{2} - 1} d\lambda} {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 1} \left( 1 + \lambda F \right)^{-\frac {n + p}{2} - 2} d\lambda}. \end{array} $$

Using the change of variables u = λF, we have

$$ \begin{array}{@{}rcl@{}} \displaystyle r_{1} (F) = \frac {{{\int}_{0}^{F}} u^{\frac {p}{2} - 1} \left( 1 + u \right)^{-\frac {n + p}{2} - 1} du} {{{\int}_{0}^{F}} u^{\frac {p}{2} - 1} \left( 1 + u \right)^{-\frac {n + p}{2} - 2} du}. \end{array} $$

Since r₁(F) is increasing in F, we have

$$ \begin{array}{@{}rcl@{}} \displaystyle r_{1} (F) \le \frac {{\int}_{0}^{\infty} u^{\frac {p}{2} - 1} \left( 1 + u \right)^{-\frac {n + p}{2} - 1} du} {{\int}_{0}^{\infty} u^{\frac {p}{2} - 1} \left( 1 + u \right)^{-\frac {n + p}{2} - 2} du} = \frac {p + n + 2}{n + 2}. \end{array} $$

By Lemma 2.2, we obtain

$$ \begin{array}{@{}rcl@{}} \displaystyle \varphi \left( F \right) = 1 - d \left( 1 + r(F)\right) \le 1 - d\left[ 1 + \left( \frac {\frac {p}{2} - 1 + M}{A - \frac {p}{2} - M} \right) \right]. \end{array} $$

Also we have

$$ \begin{array}{@{}rcl@{}} \displaystyle \varphi^{BZ} \left( F \right) = 1 - \frac {n + 2}{p + n + 2} r_{1} (F) \ge 1 - \frac {n + 2}{p + n + 2} \frac {p + n + 2}{n + 2} = 0. \end{array} $$

Therefore, if $1 - d\left [ 1 + \left (\frac {\frac {p}{2} - 1 + M}{A - \frac {p}{2} - M} \right ) \right ]\le 0$ is satisfied, then condition (II) in Theorem 2.2 is satisfied and hence δ_B in (5) is minimax under the loss function (2). □

3 Examples

In this section, we give three examples to which our results can be applied. We also make some connections to existing literature (Ghosh, 1994).

Example 1.

The class of priors studied by Ghosh (1994) (in the setup and notation of Section 2) corresponds to

$$ \begin{array}{@{}rcl@{}} \displaystyle h\left( \nu, \eta \right) = k\eta^{b} \left( 1 + \nu \right)^{-b - 2}, \ \displaystyle \nu > 0, \ \displaystyle \eta > 0, \end{array} $$

where k is a positive constant. If M = b + 2 and $-\frac {p}{2} - 1 < b \le - 1$, then we can show that the class of priors of Ghosh (1994) satisfies the conditions and hence our results include the results of Ghosh (1994).

Example 2.

Another class of prior distributions is

$$ \begin{array}{@{}rcl@{}} \displaystyle h\left( \nu, \eta \right) = k\eta^{b} e^{-\nu}, \ \displaystyle \nu > 0, \ \displaystyle \eta > 0, \end{array} $$

where k is a positive constant. If M = 1, p ≥ 2 and $-\frac {p + n}{2} - 1 < b \le - 1$ then Theorem 2.2 is satisfied and the generalized Bayes estimator will be minimax under the loss function (2).

Example 3.

Consider the following prior distribution

$$ \begin{array}{@{}rcl@{}} \displaystyle h\left( \nu, \eta \right) = k\eta^{b} \left( 1 + \nu \right)^{-a - c - 2} \nu^{c}, \ \displaystyle \nu > 0, \ \displaystyle \eta > 0, \end{array} $$

where k is a positive constant. If M = a + c, c ≥ 0 and $-\frac {p + n}{2} - 1 < b \le 2 + a + c$, then the conditions of Theorem 2.2 are satisfied, so the corresponding generalized Bayes estimators are minimax.

4 Simulation study

In this section, we compare the performance of the affine equivariant estimator δ₀ with the generalized Bayes estimator in Theorem 2.2. The comparison uses the following simulation scheme computing bias and mean squared error:

a)
set values for n, 𝜃 and σ²;
b)
simulate a random sample of size n from a seven-dimensional normal distribution with mean vector 𝜃 and covariance matrix σ²I_p;
c)
compute
$$ \begin{array}{@{}rcl@{}} \displaystyle S = \sum\limits_{i = 1}^{n} \sum\limits_{j = 1}^{7} \left( X_{ij} - \overline{X} \right)^{2}, \end{array} $$

where
$$ \begin{array}{@{}rcl@{}} \displaystyle \overline{X} = \frac {\sum\limits_{i = 1}^{n} \sum\limits_{j = 1}^{7} X_{ij}}{7n}; \end{array} $$
d)
compute the equivariant estimator δ₀ by $\delta _{0} = \frac {S}{n + 2}$;
e)
compute the estimator δ_B given by Theorem 2.2 with b = − 2 and g(ν) = e^−ν;
f)
repeat steps b) to e) one thousand times;
g)
compute the biases of the estimators as
$$ \begin{array}{@{}rcl@{}} \displaystyle \text{bias} \left( \delta_{0} \right) = \frac {1}{1000} \sum\limits_{i = 1}^{1000} \left( \delta_{0, i} - \sigma^{2} \right) \end{array} $$

and
$$ \begin{array}{@{}rcl@{}} \displaystyle \text{bias} \left( \delta_{B} \right) = \frac {1}{1000} \sum\limits_{i = 1}^{1000} \left( \delta_{B, i} - \sigma^{2} \right), \end{array} $$

where δ_0,i and δ_B,i denote the estimates of δ₀ and δ_B, respectively, in the i th iteration;
h)
compute the mean squared errors of the estimators as
$$ \begin{array}{@{}rcl@{}} \displaystyle \text{mse} \left( \delta_{0} \right) = \frac {1}{1000} \sum\limits_{i = 1}^{1000} \left( \delta_{0, i} - \sigma^{2} \right)^{2} \end{array} $$

and
$$ \begin{array}{@{}rcl@{}} \displaystyle \text{mse} \left( \delta_{B} \right) = \frac {1}{1000} \sum\limits_{i = 1}^{1000} \left( \delta_{B, i} - \sigma^{2} \right)^{2}. \end{array} $$

Plots of the biases and mean squared errors versus n = 10,11,…,100 when 𝜃 = (0,0,0,0,0,0,0) and σ² = 1 are shown in Figs. 1 and 2. Plots of the biases and mean squared errors versus 𝜃₀ = − 50,49,…,50 when $\theta = \left (\theta _{0}, \theta _{0}, \theta _{0}, \theta _{0}, \theta _{0}, \theta _{0}, \theta _{0} \right )$, n = 100 and σ² = 1 are shown in Figs. 3 and 4. Plots of the biases and mean squared errors versus σ² = 1,2,…,100 when 𝜃 = (0,0,0,0,0,0,0) and n = 100 are shown in Figs. 5 and 6.

We can observe the following from Figs. 1 to 6. The biases are generally negative for both estimators. The biases approach zero in magnitude as n increases. δ_B has smaller bias for every n. The mean squared errors approach zero as n increases. δ_B has smaller mean squared error for every n. The biases decrease from being positive to negative as 𝜃₀ increases from − 50 to 50. The biases are smallest in magnitude when 𝜃₀ = 0. The mean squared errors take a parabolic shape as 𝜃₀ increases from − 50 to 50. The mean squared errors are smallest when 𝜃₀ = 0. The biases are negative and decrease as σ² increases from 1 to 100. The biases are smallest in magnitude when σ² = 1. The mean squared errors increase as σ² increases from 1 to 100. The mean squared errors are smallest when σ² = 1.

The computations were performed using the R software (R Development Core Team, 2023) and the package m vtnorm (Genz et al. 2021). The codes used are given in the Appendix A.

Code Availability

The code are in the Appendix A.

References

Brewster, J. F. and Zidek, J. C. (1974). Improving on equivariant estimators. Ann. Stat. 2, 21–38.
Article MathSciNet MATH Google Scholar
Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F., Bornkamp, B., Maechler, M. and Hothorn, T. (2021). mvtnorm: Multivariate normal and t distributions. R package version 1, 1–3.
Google Scholar
Ghosh, M. (1994). On some Bayesian solutions of the Neyman-Scott problem,.
Kubokawa, T. (1994). An unified approach to improving equivariant estimators. Ann. Stat. 22, 290–299.
Article MathSciNet MATH Google Scholar
Maruyama, Y. and Strawderman, W. E. (2006). A new class of minimax generalized Bayes estimators of a normal variance. Journal of Statistical Planning and Inference 136, 3822–3836.
Article MathSciNet MATH Google Scholar
R Development Core Team (2023). A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
Stein, C. (1964). Inadmissibility of the usual estimator for the variance of a normal distribution with unknown mean. Ann. Inst. Stat. Math. 16, 155–160.
Article MathSciNet MATH Google Scholar
Strawderman, W. E. (1974). Minimax estimation of powers of the variance of a normal population under squared error loss. Ann. Stat. 2, 190–198.
Article MathSciNet MATH Google Scholar
Wells, T. M. and Zhou, G. (2008). Generalized Bayes minimax estimators of the mean of multivariate normal distribution with unknown variance. J. Multivar. Anal. 99, 2208–2220.
Article MathSciNet MATH Google Scholar

Download references

Acknowledgments

The authors would like to thank the Editor, the Associate Editor and the referee for careful reading and comments which greatly improved the paper.

Author information

Authors and Affiliations

Institute for Research in Fundamental Sciences, Tehran, Iran
Shokofeh Zinodiny
Howard University, Washington DC, MD, 20059, USA
Saralees Nadarajah

Authors

Shokofeh Zinodiny
View author publications
You can also search for this author in PubMed Google Scholar
Saralees Nadarajah
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Saralees Nadarajah.

Ethics declarations

Conflict of Interest

Authors declare no conflicts of interest.

Consent for Publication

All authors gave explicit consent to publish this manuscript.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix : : R codes

############################################################### # computes the bias and mean squared error with respect to n # ############################################################### nn=seq(10,100) bias1=nn bias2=nn mse1=nn mse2=nn nsim=1000 est1=rep(0,nsim) est2=est1 for (n in seq(10,100)) {for (i in 1:nsim) {x=rmvnorm(n,mean=rep(0,7),sigma=diag(7)) mm=mean(x) S=sum((x-mm)**2) tt=0 for (i in seq(1,n)) tt=tt+sum((x[i,])**2) F=tt/S f1=function (x) {x**(3/2)*exp((x-1)/x)*(1+x*F)**(-(n+7)/2)} f2=function (x) {x**(3/2)*exp((x-1)/x)*(1+x*F)**(-(n+7)/2-1)} est1[i]=S/(n+2) est2[i]=S*integrate(f1,lower=0,upper=1)$value/ ((n+7)*integrate(f2,lower=0,upper=1)$value)} bias1[n-9]=mean(est1-9) bias2[n-9]=mean(est2-9) mse1[n-9]=mean((est1-9)**2) mse2[n-9]=mean((est2-9)**2) }

################################################################## # computes the bias and mean squared error with respect to sigma # ################################################################## nn=seq(1,100) bias1=nn bias2=nn mse1=nn mse2=nn nsim=1000 est1=rep(0,nsim) est2=est1 n=100 for (s in seq(1,100)) {for (i in 1:nsim) {x=rmvnorm(n,mean=rep(0,7),sigma=(s*diag(7))) mm=mean(x) S=sum((x-mm)**2) tt=0 for (i in seq(1,n)) tt=tt+sum((x[i,])**2) F=tt/S f1=function (x) {x**(3/2)*exp((x-1)/x)*(1+x*F)**(-(n+7)/2)} f2=function (x) {x**(3/2)*exp((x-1)/x)*(1+x*F)**(-(n+7)/2-1)} est1[i]=S/(n+2) est2[i]=S*integrate(f1,lower=0,upper=1)$value/ ((n+7)*integrate(f2,lower=0,upper=1)$value)} bias1[s]=mean(est1-s) bias2[s]=mean(est2-s) mse1[s]=mean((est1-s)**2) mse2[s]=mean((est2-s)**2) } ################################################################### # computes the bias and mean squared error with respect to theta # ################################################################### nn=seq(1,101) bias1=nn bias2=nn mse1=nn mse2=nn nsim=1000 est1=rep(0,nsim) est2=est1 n=100 for (mu in seq(-50,50)) {for (i in 1:nsim) {x=rmvnorm(n,mean=rep(mu,7),sigma=(diag(7))) mm=mean(x) S=sum((x-mm)**2) tt=0 for (i in seq(1,n)) tt=tt+sum((x[i,])**2) F=tt/S f1=function (x) {x**(3/2)*exp((x-1)/x)*(1+x*F)**(-(n+7)/2)} f2=function (x) {x**(3/2)*exp((x-1)/x)*(1+x*F)**(-(n+7)/2-1)} est1[i]=S/(n+2) est2[i]=S*integrate(f1,lower=0,upper=1)$value/ ((n+7)*integrate(f2,lower=0,upper=1)$value)} bias1[mu+51]=mean(est1-mu) bias2[mu+51]=mean(est2-mu) mse1[mu+51]=mean((est1-mu)**2) mse2[mu+51]=mean((est2-mu)**2) }

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Zinodiny, S., Nadarajah, S. Generalized Bayes Minimax Estimators of the Variance of a Multivariate Normal Distribution. Sankhya A 85, 1667–1683 (2023). https://doi.org/10.1007/s13171-023-00311-z

Download citation

Received: 23 October 2022
Accepted: 20 March 2023
Published: 17 April 2023
Issue Date: August 2023
DOI: https://doi.org/10.1007/s13171-023-00311-z

Keywords

PACS Nos

Primary 62E99

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Generalized Bayes Minimax Estimators of the Variance of a Multivariate Normal Distribution

Abstract

Similar content being viewed by others

Regression analysis: likelihood, error and entropy

A family of the adjusted estimators maximizing the asymptotic predictive expected log-likelihood

Maximum a posteriori estimators as a limit of Bayes estimators

1 Introduction