1 Introduction

Let X and S be independent random variables having

$$ \begin{array}{@{}rcl@{}} \displaystyle X \sim N_{p} \left( \theta, \sigma^{2} I_{p} \right), \ \displaystyle S \sim \sigma^{2} {\chi_{n}^{2}}, \end{array} $$
(1)

where \( N_{p} \left (\theta , \sigma ^{2} I_{p} \right )\) denotes the p-variate normal distribution with unknown mean vector 𝜃 and covariance matrix σ2Ip, Ip is the p × p identity matrix, and \({\chi _{n}^{2}}\) denotes a chi-square variable with n degrees of freedom. We consider the problem of estimating σ2 when the loss function is

$$ \begin{array}{@{}rcl@{}} \displaystyle L\left( \delta; \sigma^{2} \right) = \left( \frac {\delta}{\sigma^{2}} - 1 \right)^{2}, \end{array} $$
(2)

where δ = δ(X,S) is an estimator of σ2.
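In R notation, a draw from model (1) together with the loss (2) can be sketched as follows (a minimal illustration of our own; the variable names are ours):

    # A minimal R illustration (ours) of model (1) and the loss (2).
    p <- 7; n <- 20; theta <- rep(0, p); sigma2 <- 2
    X <- rnorm(p, mean = theta, sd = sqrt(sigma2))  # X ~ N_p(theta, sigma2 * I_p)
    S <- sigma2 * rchisq(1, df = n)                 # S ~ sigma2 * chi^2_n, independent of X
    loss <- function(delta, sigma2) (delta / sigma2 - 1)^2
    loss(S / (n + 2), sigma2)  # realized loss of the estimator S/(n + 2) discussed below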

The best affine equivariant estimator is \(\delta _{0} = \left (n + 2 \right )^{-1} S\), which is a minimax estimator with constant risk \(2\left (n + 2 \right )^{-1}\). Stein (1964) showed that δ0 can be improved by considering a class of scale equivariant estimators \(\delta = \left ({n + 2} \right )^{-1} \left (1 - \phi (F) \right )S\) for \(F = \frac {X^{\prime }X}{S}\). He found a specific better estimator \(\delta ^{S} = \left (n + 2 \right )^{-1} \left (1 - \phi ^{S} (F) \right )S\), where \(\phi ^{S} (F) = {\max \limits } \left \{ 0, \frac {p - (n + 2)F}{p + n + 2} \right \}\). Brewster and Zidek (1974) obtained an improved generalized Bayes estimator \(\delta ^{BZ} = \left (n + 2 \right )^{-1} \left (1 - \phi ^{BZ} (F) \right )S\), where

$$ \begin{array}{@{}rcl@{}} \displaystyle \phi^{BZ} (F) = 1 - \frac {n + 2}{p + n + 2} \frac {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 1} (1 + \lambda F)^{-\frac {p + n}{2} - 1} d\lambda} {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 1} (1 + \lambda F)^{-\frac {p + n}{2} - 2} d\lambda}. \end{array} $$
(3)

They also gave a general sufficient condition for minimaxity, using an integral expression for the difference in risks between δ0 and δ. Strawderman (1974) derived another sufficient condition for minimaxity. Using the conditions of Brewster and Zidek (1974), Ghosh (1994) obtained a class of generalized Bayes estimators for σ2. Maruyama and Strawderman (2006) proposed another class of improved generalized Bayes estimators.
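As a concrete point of comparison, the following R sketch (our own, not code from the cited papers; the helper names phi_S and phi_BZ are ours) evaluates Stein's correction ϕS and the Brewster–Zidek correction ϕBZ in (3) by one-dimensional numerical integration.

    # Hedged sketch: phi^S and phi^BZ evaluated numerically for given p and n.
    phi_S <- function(F, p, n) pmax(0, (p - (n + 2) * F) / (p + n + 2))

    phi_BZ <- function(F, p, n) {
      num <- integrate(function(l) l^(p/2 - 1) * (1 + l * F)^(-(p + n)/2 - 1), 0, 1)$value
      den <- integrate(function(l) l^(p/2 - 1) * (1 + l * F)^(-(p + n)/2 - 2), 0, 1)$value
      1 - (n + 2) / (p + n + 2) * num / den
    }

    p <- 7; n <- 20
    sapply(c(0, 0.5, 2, 10), function(F) c(stein = phi_S(F, p, n), bz = phi_BZ(F, p, n)))

Both corrections equal p/(p + n + 2) at F = 0 and tend to zero as F grows.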

In this paper, we derive a large class of generalized Bayes minimax estimators of σ2 which contains the estimators of Ghosh (1994) as special cases. To do so, we use techniques of Wells and Zhou (2008) and Brewster and Zidek (1974). Section 3 considers some examples of classes of generalized Bayes minimax estimators. In particular, Example 1 demonstrates that a result in Ghosh (1994) follows from our main theorem. Section 4 compares the minimax estimators of Sections 2 and 3 and the equivariant estimator δ0 by simulation.

2 A class of generalized Bayes minimax estimators

In this section, we consider the problem of estimating σ2 in (1) under the loss function (2). Our main result is Theorem 2.2. Before stating and proving this theorem, we state a theorem due to Brewster and Zidek (1974) and Kubokawa (1994), discuss a class of priors, borrow some notation from Wells and Zhou (2008), and state and prove two technical lemmas (Lemmas 2.1 and 2.2).

Brewster and Zidek (1974) derived general sufficient conditions for minimaxity of estimators having the form \(\delta = \left (n + 2 \right )^{- 1} \left (1 - \phi (F) \right )S\), where ϕ(F) is a function of \(F=\frac {X^{\prime }X}{S}\). For the purpose of verifying the minimaxity of a generalized Bayes estimator, we use the following specialized result.

Theorem 2.1.

The estimator δ(X,S) given by

$$ \begin{array}{@{}rcl@{}} \displaystyle \delta = \left( n + 2 \right)^{-1} \left( 1 - \phi (F) \right)S \end{array} $$
(4)

is minimax for σ2 under the loss function (2) provided that the following conditions hold:

  • ϕ(F) is nonincreasing,

  • 0 ≤ ϕ(F) ≤ ϕBZ(F), where ϕBZ(F) is given by (3).

Proof 1.

See Brewster and Zidek (1974) and Kubokawa (1994). □

Now, we construct generalized Bayes minimax estimators of σ2 under the loss function (2). To do so, we consider the following class of prior distributions.

For \(\eta = \sigma ^{-2}\), let the conditional distribution of 𝜃 given ν and η be normal with zero mean vector and covariance matrix νη− 1Ip, and let the generalized density of (ν,η) be given by h(ν,η) = ηbg(ν), ν > 0, η > 0, where \(b > -\frac {n+p}{2}-1\) and g(ν) is a continuously differentiable positive function on \([0, \infty )\) such that the following two conditions hold:

  • (C1) \({{\int \limits }_{0}^{1}} \lambda ^{\frac {p}{2} - 2} g\left (\frac {1 - \lambda }{\lambda } \right ) d\lambda < \infty \),

  • (C2) \(\mathop {\lim }\limits _{\nu \to \infty } \frac {g(\nu )}{(1 + \nu )^{\frac {p}{2} - 1}} = 0\).
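Conditions (C1) and (C2) are easy to check numerically for a candidate g. The following R sketch (ours) does so for g(ν) = e−ν, the choice used in Example 2 below, with p = 7.

    # (C1): the integral of lambda^(p/2 - 2) * g((1 - lambda)/lambda) over (0, 1) is finite.
    p <- 7
    g <- function(nu) exp(-nu)
    C1 <- integrate(function(l) l^(p/2 - 2) * g((1 - l) / l), 0, 1)$value
    # (C2): g(nu) / (1 + nu)^(p/2 - 1) vanishes as nu grows.
    C2_tail <- g(1e3) / (1 + 1e3)^(p/2 - 1)
    c(C1 = C1, C2_tail = C2_tail)  # C1 finite, C2_tail numerically zero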

In the following discussion, we obtain conditions on g and b such that the generalized Bayes estimators satisfy the conditions of Theorem 2.1, and hence are minimax. Note that the joint density function f(η,x,s) of η, X, S is

$$ \begin{array}{@{}rcl@{}} \displaystyle f(\eta, x, s) &\propto& \displaystyle {\int}_{0}^{\infty} {\int}_{R^{p}} \eta^{\frac {p}{2}} e^{-\frac {\eta \left\| x - \theta \right\|^{2}}{2}} \nu^{-\frac {p}{2}} \eta^{\frac {p}{2}} e^{-\frac {\eta \left\| \theta \right\|^{2}}{2\nu}} g\left( \nu \right)\eta^{b} \eta^{\frac {n}{2}} e^{-\frac {\eta s}{2}} d\theta d\nu \\ &\propto& \displaystyle {\int}_{0}^{\infty} {\int}_{R^{p}} \eta^{\frac {2p + n}{2} + b} e^{-\frac {\eta}{2}\left[ \left\| x - \theta \right\|^{2} + \frac {\left\| \theta \right\|^{2}}{\nu } \right]} \nu^{-\frac {p}{2}} g\left( \nu \right)e^{-\frac {\eta s}{2}} d\theta d\nu \\ &\propto& \displaystyle {\int}_{0}^{\infty} \eta^{\frac {p + n}{2} + b} g\left( \nu \right)\left( 1 + \nu \right)^{-\frac {p}{2}} e^{-\frac {\eta }{2} \left[ s + \frac {\left\| x \right\|^{2}}{1 + \nu} \right]} d\nu, \end{array} $$

where \(\left \| \cdot \right \|\) denotes the Euclidean norm. Therefore, the generalized Bayes estimator of σ2 under the loss function (2) is

$$ \begin{array}{@{}rcl@{}} \displaystyle \delta_{B} &=& \displaystyle \frac {E\left[ \left. \eta \right|X,S \right]} {E\left[ \left. \eta^{2} \right|X,S \right]} \\ &=& \displaystyle \frac {{\int}_{0}^{\infty} {\int}_{0}^{\infty} \eta^{\frac {p + n}{2} + b + 1} g\left( \nu \right)\left( 1 + \nu \right)^{-\frac {p}{2}} e^{-\frac {\eta S}{2}\left( 1 + \frac {F}{1 + \nu} \right)} d\eta d\nu} {{\int}_{0}^{\infty} {\int}_{0}^{\infty} \eta^{\frac {p + n}{2} + b+ 2} g\left( \nu \right)\left( 1 + \nu \right)^{-\frac {p}{2}} e^{-\frac {\eta S}{2}\left( 1 + \frac {F}{1 + \nu } \right)} d\eta d\nu} \\ &=& \displaystyle \frac {S}{n + p + 2b+ 4} \frac {{\int}_{0}^{\infty} g\left( \nu \right)\left( 1 + \nu \right)^{-\frac {p}{2}} \left( 1 + \frac {F}{1 + \nu} \right)^{-\frac {n + p}{2} - b - 2} d\nu} {{\int}_{0}^{\infty} g\left( \nu \right)\left( 1 + \nu \right)^{-\frac {p}{2}} \left( 1 + \frac {F}{1 + \nu} \right)^{-\frac {n + p}{2} - b - 3} d\nu}. \end{array} $$

Using the change of variables \(\lambda = \frac {1}{1+\nu }\), we have

$$ \begin{array}{@{}rcl@{}} \displaystyle \delta_{B} = \frac {S}{n + p + 2b+ 4} \frac {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 2} g\left( \frac {1 - \lambda}{\lambda} \right) \left( 1 + \lambda F \right)^{-\frac {n + p}{2} - b - 2} d\lambda} {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 2} g\left( \frac {1 - \lambda}{\lambda} \right) \left( 1 + \lambda F \right)^{-\frac {n + p}{2} - b - 3} d\lambda}. \end{array} $$
(5)

This estimator is of the form (4) with

$$ \begin{array}{@{}rcl@{}} \displaystyle \phi \left( F \right) = 1 - d\left( 1 + r(F)\right), \end{array} $$

where \(d = \frac {n + 2}{n + p + 2b + 4}\) and

$$ \begin{array}{@{}rcl@{}} \displaystyle r(F) = F \frac {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 1} g\left( \frac {1 - \lambda}{\lambda} \right) \left( 1 + \lambda F \right)^{-A} d\lambda} {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 2} g\left( \frac {1 - \lambda}{\lambda} \right) \left( 1 + \lambda F \right)^{-A} d\lambda}, \end{array} $$
(6)

where \(A=\frac {n+p}{2}+b+3\).

To continue the discussion, we need the following notation borrowed from Wells and Zhou (2008). Define the function Iα,A,g(F) as

$$ \begin{array}{@{}rcl@{}} \displaystyle I_{\alpha, A, g} \left( F \right) = {{\int}_{0}^{1}} \lambda^{\alpha} \left( 1 + \lambda F \right)^{-A} g\left( \frac {1 - \lambda}{\lambda} \right)d\lambda. \end{array} $$
(7)
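For later reference, the following R helpers (ours) evaluate Iα,A,g in (7), the ratio r(F) in (6) and the generalized Bayes estimator δB in (5) by numerical integration; they are reused in the numerical checks below.

    # I_a(alpha, A, g, F) approximates the integral I_{alpha, A, g}(F) in (7).
    I_a <- function(alpha, A, g, F) {
      integrate(function(l) l^alpha * (1 + l * F)^(-A) * g((1 - l) / l), 0, 1)$value
    }

    # r(F) in (6), with A = (n + p)/2 + b + 3.
    r_F <- function(F, p, n, b, g) {
      A <- (n + p) / 2 + b + 3
      F * I_a(p/2 - 1, A, g, F) / I_a(p/2 - 2, A, g, F)
    }

    # delta_B in (5); the numerator exponent -(n + p)/2 - b - 2 equals -(A - 1).
    delta_B <- function(S, F, p, n, b, g) {
      A <- (n + p) / 2 + b + 3
      S / (n + p + 2 * b + 4) * I_a(p/2 - 2, A - 1, g, F) / I_a(p/2 - 2, A, g, F)
    }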

Using integration by parts, we obtain

$$ \begin{array}{@{}rcl@{}} \displaystyle FI_{\frac {p}{2} - 1, A, g} \left( F \right) &=& \displaystyle {{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 1} g\left( \frac {1 - \lambda}{\lambda} \right) d\left[ \frac {\left( 1 + \lambda F \right)^{1 - A}}{1 - A} \right] \\ &=& \displaystyle g(0)\frac {\left( 1 + F \right)^{1 - A}}{1 - A} \\ && \displaystyle +\frac {\frac {p}{2} - 1}{A - 1} {{\int}_{0}^{1}} \left( 1 + \lambda F \right)^{-A} (1 + \lambda F) \lambda^{\frac {p}{2} - 2} g\left( \frac {1 - \lambda}{\lambda} \right) d\lambda \\ && \displaystyle -\frac {1}{A - 1} {{\int}_{0}^{1}} \left( 1 + \lambda F \right)^{-A} (1 + \lambda F) \frac {1}{\lambda^{2}}\lambda^{\frac {p}{2} - 1} g^{\prime}\left( \frac {1 - \lambda}{\lambda} \right) d\lambda. \end{array} $$
(8)

Also, we define the functions \(J_{a} \left (g\left (\frac {F - u}{u} \right ) \right ) \) and \( J_{a} \left (\frac {Au}{1 + u} g\left (\frac {F - u}{u} \right ) \right )\) as

$$ \begin{array}{@{}rcl@{}} \displaystyle J_{a} \left( g\left( \frac {F - u}{u} \right) \right) = {{\int}_{0}^{F}} u^{a} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right)du = F^{a + 1} I_{a, A, g} \left( F \right) \end{array} $$
(9)

and

$$ \begin{array}{@{}rcl@{}} \displaystyle J_{a} \left( \frac {Au}{1 + u} g\left( \frac {F - u}{u} \right) \right) &=& \displaystyle {{\int}_{0}^{F}} u^{a} \left( 1 + u \right)^{-A} \frac {Au}{1 + u}g\left( \frac {F - u}{u} \right)du \\ &=& \displaystyle F^{a + 1} {{\int}_{0}^{1}} \lambda^{a} \left( 1 + \lambda F \right)^{-A} g\left( \frac {1 - \lambda}{\lambda} \right) \frac {A\lambda F}{1 + \lambda F}d\lambda,\\ \end{array} $$
(10)

respectively. By integration by parts, we have

$$ \begin{array}{@{}rcl@{}} J_{a} \left( \frac {Au}{1 + u} g\left( \frac {F - u}{u} \right) \right) &=& {{\int}_{0}^{F}} u^{a} \left( 1 + u \right)^{-A} \frac {Au}{1 + u}g\left( \frac {F - u}{u} \right)du \\ &=& -{{\int}_{0}^{F}} u^{a + 1} g\left( \frac {F - u}{u} \right)d \left( {1 + u} \right)^{-A} \\ &=& -F^{a + 1} g(0)\left( 1 + F \right)^{-A} + (a + 1) {{\int}_{0}^{F}} \left( 1 + u \right)^{-A} u^{a} g\left( \frac {F - u}{u} \right) du \\ && +{{\int}_{0}^{F}} \left( 1 + u \right)^{-A} u^{a + 1} g^{\prime}\left( \frac {F - u}{u} \right) \left( -\frac {F}{u^{2}} \right) du. \end{array} $$
(11)

To show that ϕ(F) is nonincreasing in F, it is sufficient to show that r(F) is nondecreasing in F. The following lemma gives conditions under which \(\widetilde r(F)=F^{c} r(F)\) is nondecreasing in F.

Lemma 2.1.

If \(\psi \left (\nu \right ) = -\left (1 + \nu \right ) \frac {g^{\prime }\left (\nu \right )}{g\left (\nu \right )}\) can be decomposed as l1(ν) + l2(ν), where l1(ν) is increasing in ν and 0 ≤ l2(ν) ≤ c, a constant, then \(\widetilde r(F) = F^{c} r(F)\) is nondecreasing.

Proof 2.

The proof is similar to the proof of Lemma 3.2 in Wells and Zhou (2008). Differentiating \(\widetilde r(F) = F^{c} r(F)\) with respect to F, we have

$$ \begin{array}{@{}rcl@{}} \displaystyle \frac {\partial \widetilde r(F)}{\partial F} = F^{c} \left( c\frac {r(F)}{F} + r^{\prime}(F) \right) = F^{c} \left( (1 + c)R(F) + FR^{\prime}(F) \right), \end{array} $$

where \(R(F) = \frac {r(F)}{F}\). Therefore, \(\frac {\partial \widetilde r(F)}{\partial F} \ge 0\) is equivalent to

$$ \begin{array}{@{}rcl@{}} \displaystyle (1 + c)\frac {I_{\frac {p}{2} - 1, A, g} \left( F \right)} {I_{\frac {p}{2} - 2, A, g} \left( F \right)} + F\frac {\left\{ I^{\prime}_{\frac {p}{2} - 1, A, g} \left( F \right) I_{\frac {p}{2} - 2, A, g} \left( F \right) - I^{\prime}_{\frac {p}{2} - 2, A, g} \left( F \right) I_{\frac {p}{2} - 1,A,g} \left( F \right) \right\}}{I_{\frac {p}{2} - 2, A, g}^{2} \left( F \right)} \ge 0, \end{array} $$

which in turn is equivalent to

$$ \begin{array}{@{}rcl@{}} \displaystyle &&-FI^{\prime}_{\frac {p}{2} - 1, A, g} \left( F \right)I_{\frac {p}{2} - 2, A, g} \left( F \right) \le (1 + c)I_{\frac {p}{2} - 2, A, g} \left( F \right)I_{\frac {p}{2} - 1, A, g} \left( F \right) \\&&- FI^{\prime}_{\frac {p}{2} - 2, A, g} \left( F \right)I_{\frac {p}{2} - 1, A, g} \left( F \right). \end{array} $$
(12)

Now, we see

$$ \begin{array}{@{}rcl@{}} \displaystyle -FI^{\prime}_{a,A,g} \left( F \right) = {{\int}_{0}^{1}} \lambda^{a} \left( 1 + \lambda F \right)^{-A} g\left( \frac {1 - \lambda}{\lambda} \right) \frac {A\lambda F}{1 + \lambda F}d\lambda. \end{array} $$

Using (9) and (10), (12) can be written as

$$ \begin{array}{@{}rcl@{}} \displaystyle \frac {J_{\frac {p}{2} - 1} \left( \frac {Au}{1 + u} g\left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) \right)} \le 1 + c + \frac {J_{\frac {p}{2} - 2} \left( \frac {Au}{1 + u} g\left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) \right)}. \end{array} $$
(13)

By applying (11), (13) is equivalent to

$$ \begin{array}{@{}rcl@{}} && \displaystyle \frac {-F^{\frac {p}{2}} g(0)(1 + F)^{-A}} {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) \right)} + \left( \frac {p}{2} \right) + \frac {{{\int}_{0}^{F}} u^{\frac {p}{2} - 1} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right) \left[ \frac {g^{\prime}\left( \frac {F - u}{u} \right)}{g\left( \frac {F - u}{u} \right)} \left( -\frac {F}{u} \right) \right]du} {{{\int}_{0}^{F}} u^{\frac {p}{2} - 1} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right)du} \\ &\le& \displaystyle \!\!1 + c + \frac {-F^{\frac {p}{2} - 1} g(0)(1 + F)^{-A}} {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) \right)} + \frac {p}{2} - 1 + \frac {{{\int}_{0}^{F}} u^{\frac {p}{2} - 2} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right) \left[ \frac {g^{\prime}\left( \frac {F - u}{u} \right)}{g\left( \frac {F - u}{u} \right)} \left( -\frac {F}{u} \right) \right]du} {{{\int}_{0}^{F}} u^{\frac {p}{2} - 2} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right)du}, \end{array} $$

which in turn is equivalent to

$$ \begin{array}{@{}rcl@{}} && \displaystyle \frac {-g(0)(1 + F)^{-A}}{I_{\frac {p}{2} - 1, A, g} \left( F \right)} + \frac {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) l_{1} \left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) \right)} + \frac {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right)l_{2} \left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) \right)} \\ &\le& \displaystyle c + \frac {-g(0)(1 + F)^{-A}}{I_{\frac {p}{2} - 2, A, g} \left( F \right)} + \frac {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) l_{1} \left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) \right)} + \frac {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) l_{2} \left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) \right)}. \\ \end{array} $$
(14)

Since \(I_{\frac {p}{2} - 1, A, g} \left (F \right ) \le I_{\frac {p}{2} - 2, A, g} \left (F \right )\), we have

$$ \begin{array}{@{}rcl@{}} \frac {- g(0)(1 + F)^{-A}}{I_{\frac {p}{2} - 1, A, g} \left( F \right)} \le \frac {-g(0)(1 + F)^{-A}}{I_{\frac {p}{2} - 2, A, g} \left( F \right)}. \end{array} $$

Note also that, since l1(ν) is increasing in ν, for each fixed F the function \(l_{1} \left (\frac {F - u}{u} \right )\) is decreasing in u. When t < u, we have

$$ \begin{array}{@{}rcl@{}} \frac {u^{\frac {p}{2} - 2} \left( {1 + u} \right)^{-A} g\left( \frac {F - u}{u} \right) 1\left( u \le F \right)} {t^{\frac {p}{2} - 2} \left( 1 + t \right)^{-A} g\left( \frac {F - t}{t} \right) 1\left( {t \le F} \right)} \le \frac {u^{\frac {p}{2} - 1} \left( {1 + u} \right)^{-A} g\left( \frac {F - u}{u} \right) 1\left( u \le F \right)} {t^{\frac {p}{2} - 1} \left( 1 + t \right)^{-A} g\left( \frac {F - t}{t} \right)1\left( t \le F \right)}. \end{array} $$

By a monotone likelihood ratio argument, we have

$$ \begin{array}{@{}rcl@{}} && \frac {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) l_{1} \left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) \right)} = \frac {{{\int}_{0}^{F}} u^{\frac {p}{2} - 1} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right) l_{1} \left( \frac {F - u}{u} \right)du} {{{\int}_{0}^{F}} u^{\frac {p}{2} - 1} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right)du} \\ &\le& \frac {{{\int}_{0}^{F}} u^{\frac {p}{2} - 2} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right)l_{1} \left( \frac {F - u}{u} \right)du} {{{\int}_{0}^{F}} u^{\frac {p}{2} - 2} \left( 1 + u \right)^{-A} g\left( \frac {F - u}{u} \right)du} = \frac {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) l_{1} \left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) \right)} \end{array} $$

and

$$ \begin{array}{@{}rcl@{}} 0 \le \frac {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) l_{2} \left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 2} \left( g\left( \frac {F - u}{u} \right) \right)} \le c, \ 0 \le \frac {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) l_{2} \left( \frac {F - u}{u} \right) \right)} {J_{\frac {p}{2} - 1} \left( g\left( \frac {F - u}{u} \right) \right)} \le c. \end{array} $$

Thus, we have established the inequality (14) and the proof is complete. □
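As a numerical illustration of Lemma 2.1 (ours, reusing the helper r_F defined after (7)): for g(ν) = e−ν we have ψ(ν) = 1 + ν, which is increasing, so one may take l1 = ψ, l2 = 0 and c = 0, and the lemma then says that r(F) itself is nondecreasing.

    # Check that r(F) is nondecreasing on a grid for g(nu) = exp(-nu), c = 0.
    p <- 7; n <- 20; b <- -2
    g <- function(nu) exp(-nu)
    rs <- sapply(seq(0.1, 50, by = 0.1), function(F) r_F(F, p, n, b, g))
    all(diff(rs) > -1e-6)  # should return TRUE up to quadrature noise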

The next lemma gives conditions under which a lower bound of r(F) can be determined.

Lemma 2.2.

Assume that the regularity conditions (C1) and (C2) hold and that \(\psi \left (\nu \right ) = -\left (1 + \nu \right ) \frac {g^{\prime }\left (\nu \right )}{g\left (\nu \right )} \ge M\), where M is a finite real number. Then, for the function r(F) in (6), we have

$$ \begin{array}{@{}rcl@{}} \displaystyle r(F) \ge \frac {\frac {p}{2} - 1 + M}{A - \frac {p}{2} - M}. \end{array} $$

Proof 3.

The proof is similar to the proof of Lemma 3.1 in Wells and Zhou (2008). According to (6), we have

$$ \begin{array}{@{}rcl@{}} \displaystyle r(F) = F\frac {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 1} g\left( \frac {1 - \lambda}{\lambda} \right) \left( 1 + \lambda F \right)^{-A} d\lambda} {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 2} g\left( \frac {1 - \lambda}{\lambda} \right) \left( 1 + \lambda F \right)^{-A} d\lambda} = F \frac {I_{\frac {p}{2} - 1, A, g} \left( F \right)}{I_{\frac {p}{2} - 2, A, g} \left( F \right)}. \end{array} $$

Using (7) and (8), we split the two integrals on the right-hand side of (8) into the four terms N1, N2, N3 and N4, which satisfy

$$ \begin{array}{@{}rcl@{}} \displaystyle N_{1} = \frac {1}{A - 1}{{\int}_{0}^{1}}\! \left( 1 + \lambda F \right)^{-A} \left( \frac {p}{2} - 1 \right) \lambda^{\frac {p}{2} - 2} g\left( \frac{1 - \lambda}{\lambda} \right)d\lambda = \frac {\frac {p}{2} - 1}{A - 1} I_{\frac {p}{2} - 2, A, g} \left( F \right), \end{array} $$
$$ \begin{array}{@{}rcl@{}} \displaystyle N_{2} &=& \displaystyle \frac {1}{A - 1} {{\int}_{0}^{1}} \left( 1 + \lambda F \right)^{-A} \lambda^{\frac {p}{2} - 2} g^{\prime}\left( \frac {1 - \lambda}{\lambda} \right) \left( \frac {-\lambda}{\lambda^{2}} \right)d\lambda \\ &=& \displaystyle \frac {1}{A - 1} {{\int}_{0}^{1}} \left( 1 + \lambda F \right)^{-A} \lambda^{\frac {p}{2} - 2} g\left( \frac {1 - \lambda}{\lambda} \right) \left[ \frac {g^{\prime}\left( \frac {1 - \lambda}{\lambda} \right)} {g\left( \frac {1 - \lambda}{\lambda} \right)} \left( -\frac {1 - \lambda}{\lambda } - 1 \right) \right]d\lambda \\ &=& \displaystyle \frac {I_{\frac {p}{2} - 2, A, g} \left( F \right)}{A - 1} \frac {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 2} \left( 1 + \lambda F \right)^{-A} g\left( \frac {1 - \lambda}{\lambda} \right) \psi \left( \frac {1 - \lambda}{\lambda} \right)d\lambda} {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 2} \left( 1 + \lambda F \right)^{-A} g\left( \frac {1 - \lambda}{\lambda} \right)d\lambda} \\ &\ge& \displaystyle \frac {M}{A - 1} I_{\frac {p}{2} - 2, A, g} \left( F \right), \end{array} $$
$$ \begin{array}{@{}rcl@{}} \displaystyle N_{3} = \frac {\frac {p}{2} - 1}{A - 1}FI_{\frac {p}{2} - 1, A, g} \left( F \right) = \frac {\left( \frac {p}{2} - 1 \right)r(F)}{A - 1} I_{\frac {p}{2} - 2, A, g} \left( F \right) \end{array} $$

and

$$ \begin{array}{@{}rcl@{}} \displaystyle N_{4} &=& \displaystyle \frac {I_{\frac {p}{2} - 2, A, g} \left( F \right)}{A - 1} \frac {F{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 1} \left( 1 + \lambda F \right)^{-A} g^{\prime}\left( \frac {1 - \lambda}{\lambda} \right) \left( \frac {-1}{\lambda} \right)d\lambda}{I_{\frac {p}{2} - 2, A, g} \left( F \right)} \\ &=& \displaystyle \frac {I_{\frac {p}{2} - 2, A, g} \left( F \right)}{A - 1} \frac {F{{\int}_{0}^{1}} \left( 1 + \lambda F \right)^{-A} \lambda^{\frac {p}{2} - 1} g\left( \frac {1 - \lambda}{\lambda} \right) \left[ \frac {g^{\prime}\left( \frac {1 - \lambda}{\lambda} \right)} {g\left( \frac {1 - \lambda}{\lambda} \right)} \left( -\frac {1 - \lambda}{\lambda} - 1 \right) \right]d\lambda} {I_{\frac {p}{2} - 2, A, g} \left( F \right)} \\ &=& \displaystyle \frac {I_{\frac {p}{2} - 2, A, g} \left( F \right)}{A - 1} \frac {F{{\int}_{0}^{1}} \left( 1 + \lambda F \right)^{-A} \lambda^{\frac {p}{2} - 1} g\left( \frac {1 - \lambda}{\lambda} \right) \psi \left( \frac {1 - \lambda}{\lambda} \right)d\lambda}{I_{\frac {p}{2} - 2, A, g} \left( F \right)} \\ &\ge& \displaystyle \frac {M r(F)}{A - 1} I_{\frac {p}{2} - 2, A, g} \left( F \right). \end{array} $$

Combining all the terms, we obtain the following inequality

$$ \begin{array}{@{}rcl@{}} \displaystyle (A - 1)r(F) \ge \left( \frac {p}{2} - 1 \right) + M + \left( \frac {p}{2} - 1 \right)r(F) + Mr(F), \end{array} $$

implying

$$ \begin{array}{@{}rcl@{}} \displaystyle r(F) \ge \frac {\frac {p}{2} - 1 + M}{A - \frac {p}{2} - M}. \end{array} $$

Thus, we have the needed bound on the r(F) function. □
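The bound of Lemma 2.2 can be tabulated against r(F) numerically. The R sketch below (ours, again reusing r_F from after (7)) does this for g(ν) = e−ν, for which ψ(ν) = 1 + ν, so that ψ(ν) ≥ M holds with M = 1.

    # Tabulate r(F) from (6) next to the lower bound of Lemma 2.2.
    p <- 7; n <- 20; b <- -2; M <- 1
    g <- function(nu) exp(-nu)
    A <- (n + p) / 2 + b + 3
    bound <- (p/2 - 1 + M) / (A - p/2 - M)
    Fs <- c(0.5, 1, 5, 20, 100)
    cbind(F = Fs, r = sapply(Fs, function(F) r_F(F, p, n, b, g)), bound = bound)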

Now we use Lemmas 2.1 and 2.2 to show minimaxity of the generalized Bayes estimator δB in (5). In fact, this is our main result.

Theorem 2.2.

If \(\psi \left (\nu \right ) = -\left (1 + \nu \right )\frac {g^{\prime }\left (\nu \right )}{g\left (\nu \right )}\) is increasing in ν with ψ(ν) ≥ M for some finite real number M, and if \(1 \le d \left (1 + \frac {\frac {p}{2} - 1 + M}{A - \frac {p}{2} - M} \right )\), then δB in (5) is minimax under the loss function (2).

Proof 4.

First, take l2(ν) = 0 and \(l_{1}(\nu )=\psi \left (\nu \right )\). By Lemma 2.1 with c = 0, r(F) is an increasing function of F, and hence ϕ(F) is decreasing in F. By (3), we have

$$ \begin{array}{@{}rcl@{}} \displaystyle \phi^{BZ} \left( F \right) = 1 - \frac {n + 2}{p + n + 2} r_{1} (F), \end{array} $$

where

$$ \begin{array}{@{}rcl@{}} \displaystyle r_{1} (F) = \frac {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 1} \left( 1 + \lambda F \right)^{-\frac {n + p}{2} - 1} d\lambda} {{{\int}_{0}^{1}} \lambda^{\frac {p}{2} - 1} \left( 1 + \lambda F \right)^{-\frac {n + p}{2} - 2} d\lambda}. \end{array} $$

Using the change of variables u = λF, we have

$$ \begin{array}{@{}rcl@{}} \displaystyle r_{1} (F) = \frac {{{\int}_{0}^{F}} u^{\frac {p}{2} - 1} \left( 1 + u \right)^{-\frac {n + p}{2} - 1} du} {{{\int}_{0}^{F}} u^{\frac {p}{2} - 1} \left( 1 + u \right)^{-\frac {n + p}{2} - 2} du}. \end{array} $$

Since r1(F) is increasing in F, we have

$$ \begin{array}{@{}rcl@{}} \displaystyle r_{1} (F) \le \frac {{\int}_{0}^{\infty} u^{\frac {p}{2} - 1} \left( 1 + u \right)^{-\frac {n + p}{2} - 1} du} {{\int}_{0}^{\infty} u^{\frac {p}{2} - 1} \left( 1 + u \right)^{-\frac {n + p}{2} - 2} du} = \frac {p + n + 2}{n + 2}. \end{array} $$

By Lemma 2.2, we obtain

$$ \begin{array}{@{}rcl@{}} \displaystyle \phi \left( F \right) = 1 - d \left( 1 + r(F)\right) \le 1 - d\left[ 1 + \left( \frac {\frac {p}{2} - 1 + M}{A - \frac {p}{2} - M} \right) \right]. \end{array} $$

Also we have

$$ \begin{array}{@{}rcl@{}} \displaystyle \phi^{BZ} \left( F \right) = 1 - \frac {n + 2}{p + n + 2} r_{1} (F) \ge 1 - \frac {n + 2}{p + n + 2} \frac {p + n + 2}{n + 2} = 0. \end{array} $$

Therefore, if \(1 - d\left [ 1 + \left (\frac {\frac {p}{2} - 1 + M}{A - \frac {p}{2} - M} \right ) \right ]\le 0\) is satisfied, then the second condition of Theorem 2.1 holds, and hence δB in (5) is minimax under the loss function (2). □

3 Examples

In this section, we give three examples to which our results can be applied. We also make some connections to existing literature (Ghosh, 1994).

Example 1.

The class of priors studied by Ghosh (1994) (in the setup and notation of Section 2) corresponds to

$$ \begin{array}{@{}rcl@{}} \displaystyle h\left( \nu, \eta \right) = k\eta^{b} \left( 1 + \nu \right)^{-b - 2}, \ \displaystyle \nu > 0, \ \displaystyle \eta > 0, \end{array} $$

where k is a positive constant. If M = b + 2 and \(-\frac {p}{2} - 1 < b \le - 1\), then the class of priors of Ghosh (1994) satisfies the conditions of Theorem 2.2, and hence our results include those of Ghosh (1994).

Example 2.

Another class of prior distributions is

$$ \begin{array}{@{}rcl@{}} \displaystyle h\left( \nu, \eta \right) = k\eta^{b} e^{-\nu}, \ \displaystyle \nu > 0, \ \displaystyle \eta > 0, \end{array} $$

where k is a positive constant. If M = 1, p ≥ 2 and \(-\frac {p + n}{2} - 1 < b \le - 1\), then the conditions of Theorem 2.2 are satisfied and the generalized Bayes estimator is minimax under the loss function (2).
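The condition of Theorem 2.2 then reduces to a one-line numerical check. The following R snippet (ours) verifies it for the illustrative values p = 7, n = 20 and b = −2:

    # Check 1 <= d * (1 + (p/2 - 1 + M) / (A - p/2 - M)) for Example 2.
    p <- 7; n <- 20; b <- -2; M <- 1   # psi(nu) = 1 + nu >= 1 for g(nu) = exp(-nu)
    A <- (n + p) / 2 + b + 3
    d <- (n + 2) / (n + p + 2 * b + 4)
    d * (1 + (p/2 - 1 + M) / (A - p/2 - M)) >= 1  # TRUE for these values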

Example 3.

Consider the following prior distribution

$$ \begin{array}{@{}rcl@{}} \displaystyle h\left( \nu, \eta \right) = k\eta^{b} \left( 1 + \nu \right)^{-a - c - 2} \nu^{c}, \ \displaystyle \nu > 0, \ \displaystyle \eta > 0, \end{array} $$

where k is a positive constant. If M = a + c, c ≥ 0 and \(-\frac {p + n}{2} - 1 < b \le 2 + a + c\), then the conditions of Theorem 2.2 are satisfied, so the corresponding generalized Bayes estimators are minimax.

4 Simulation study

In this section, we compare the performance of the affine equivariant estimator δ0 with that of the generalized Bayes estimator of Theorem 2.2. The comparison is based on the following simulation scheme for computing bias and mean squared error (an R sketch of the scheme is given after the list):

  a)

    set values for n, 𝜃 and σ2;

  b)

    simulate a random sample of size n from a seven-dimensional normal distribution (so p = 7) with mean vector 𝜃 and covariance matrix σ2I7;

  c)

    compute

    $$ \begin{array}{@{}rcl@{}} \displaystyle S = \sum\limits_{i = 1}^{n} \sum\limits_{j = 1}^{7} \left( X_{ij} - \overline{X} \right)^{2}, \end{array} $$

    where

    $$ \begin{array}{@{}rcl@{}} \displaystyle \overline{X} = \frac {\sum\limits_{i = 1}^{n} \sum\limits_{j = 1}^{7} X_{ij}}{7n}; \end{array} $$
  d)

    compute the equivariant estimator δ0 by \(\delta _{0} = \frac {S}{n + 2}\);

  e)

    compute the estimator δB given by Theorem 2.2 with b = − 2 and \(g(\nu ) = e^{-\nu }\);

  f)

    repeat steps b) to e) one thousand times;

  g)

    compute the biases of the estimators as

    $$ \begin{array}{@{}rcl@{}} \displaystyle \text{bias} \left( \delta_{0} \right) = \frac {1}{1000} \sum\limits_{i = 1}^{1000} \left( \delta_{0, i} - \sigma^{2} \right) \end{array} $$

    and

    $$ \begin{array}{@{}rcl@{}} \displaystyle \text{bias} \left( \delta_{B} \right) = \frac {1}{1000} \sum\limits_{i = 1}^{1000} \left( \delta_{B, i} - \sigma^{2} \right), \end{array} $$

    where δ0,i and δB,i denote the estimates of δ0 and δB, respectively, in the i th iteration;

  h)

    compute the mean squared errors of the estimators as

    $$ \begin{array}{@{}rcl@{}} \displaystyle \text{mse} \left( \delta_{0} \right) = \frac {1}{1000} \sum\limits_{i = 1}^{1000} \left( \delta_{0, i} - \sigma^{2} \right)^{2} \end{array} $$

    and

    $$ \begin{array}{@{}rcl@{}} \displaystyle \text{mse} \left( \delta_{B} \right) = \frac {1}{1000} \sum\limits_{i = 1}^{1000} \left( \delta_{B, i} - \sigma^{2} \right)^{2}. \end{array} $$
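The following self-contained R sketch (ours, not the authors' Appendix A code) implements steps a) to h), reusing delta_B and I_a from Section 2. The scheme above does not spell out how the vector X entering F = X′X/S is formed from the sample, so, as an assumption on our part, we take X = √n (X̄1,…,X̄7), the scaled vector of coordinatewise sample means.

    # Hedged implementation of steps a)-h) with p = 7, b = -2, g(nu) = exp(-nu).
    set.seed(1)
    p <- 7; b <- -2
    g <- function(nu) exp(-nu)

    one_replication <- function(n, theta, sigma2) {
      X  <- matrix(rnorm(n * p, mean = rep(theta, each = n), sd = sqrt(sigma2)), n, p)
      S  <- sum((X - mean(X))^2)              # step c): sum of squares about the grand mean
      d0 <- S / (n + 2)                       # step d): equivariant estimator
      F  <- n * sum(colMeans(X)^2) / S        # assumed form of F = X'X / S (see text)
      dB <- delta_B(S, F, p, n, b, g)         # step e): generalized Bayes estimator via (5)
      c(d0 = d0, dB = dB)
    }

    # Steps f)-h): 1000 replications, then bias and mean squared error.
    n <- 30; theta <- rep(0, p); sigma2 <- 1
    est  <- replicate(1000, one_replication(n, theta, sigma2))
    bias <- rowMeans(est - sigma2)
    mse  <- rowMeans((est - sigma2)^2)
    round(rbind(bias, mse), 4)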

Plots of the biases and mean squared errors versus n = 10,11,…,100 when 𝜃 = (0,0,0,0,0,0,0) and σ2 = 1 are shown in Figs. 1 and 2. Plots of the biases and mean squared errors versus 𝜃0 = −50,−49,…,50 when \(\theta = \left (\theta _{0}, \theta _{0}, \theta _{0}, \theta _{0}, \theta _{0}, \theta _{0}, \theta _{0} \right )\), n = 100 and σ2 = 1 are shown in Figs. 3 and 4. Plots of the biases and mean squared errors versus σ2 = 1,2,…,100 when 𝜃 = (0,0,0,0,0,0,0) and n = 100 are shown in Figs. 5 and 6.

Figure 1. The biases of δ0 (solid line) and δB (broken line) versus n = 10,11,…,100 when 𝜃 = (0,0,0,0,0,0,0) and σ2 = 1.

Figure 2. The mean squared errors of δ0 (solid line) and δB (broken line) versus n = 10,11,…,100 when 𝜃 = (0,0,0,0,0,0,0) and σ2 = 1.

Figure 3. The biases of δ0 (solid line) and δB (broken line) versus 𝜃0 = −50,−49,…,50 when \(\theta = \left (\theta _{0}, \theta _{0}, \theta _{0}, \theta _{0}, \theta _{0}, \theta _{0}, \theta _{0} \right )\), n = 100 and σ2 = 1.

Figure 4. The mean squared errors of δ0 (solid line) and δB (broken line) versus 𝜃0 = −50,−49,…,50 when \(\theta = \left (\theta _{0}, \theta _{0}, \theta _{0}, \theta _{0}, \theta _{0}, \theta _{0}, \theta _{0} \right )\), n = 100 and σ2 = 1.

Figure 5. The biases of δ0 (solid line) and δB (broken line) versus σ2 = 1,2,…,100 when 𝜃 = (0,0,0,0,0,0,0) and n = 100.

Figure 6. The mean squared errors of δ0 (solid line) and δB (broken line) versus σ2 = 1,2,…,100 when 𝜃 = (0,0,0,0,0,0,0) and n = 100.

We can observe the following from Figs. 1 to 6. In Figs. 1 and 2, the biases are generally negative for both estimators and approach zero in magnitude as n increases; the mean squared errors also approach zero as n increases, and δB has smaller bias and smaller mean squared error than δ0 for every n. In Figs. 3 and 4, the biases decrease from positive to negative as 𝜃0 increases from −50 to 50 and are smallest in magnitude when 𝜃0 = 0; the mean squared errors take a parabolic shape in 𝜃0 and are smallest when 𝜃0 = 0. In Figs. 5 and 6, the biases are negative and decrease as σ2 increases from 1 to 100, being smallest in magnitude when σ2 = 1; the mean squared errors increase with σ2 and are smallest when σ2 = 1.

The computations were performed using the R software (R Development Core Team, 2023) and the package mvtnorm (Genz et al., 2021). The code used is given in Appendix A.