1 Introduction

Estimation of normal means under sparsity has a long history, and it is central to the analysis of high-dimensional data. For example, in microarray experiments there is a multitude of genes, but only a few have an impact on a given disease. A foundational article is Donoho et al. (1992), who provided the asymptotic minimax rate for estimation of normal means when a large majority of the means are zero but a few depart significantly from zero. The idea was pursued in a Bayesian framework by Castillo and van der Vaart (2012), who obtained the same asymptotic minimax rate under a class of priors \(\Pi _n\) with “exponential decay” (see (2.2) of their paper for its definition).

The present work addresses the same problem, but stems from several excellent recent articles by Bhattacharya et al. (2015), van der Pas et al. (2014), van der Pas et al. (2016) and Ghosh and Chakrabarti (2017). In particular, our paper has a more direct structural connection with the work of Ghosh and Chakrabarti (2017), but extends their work in certain directions.

It may be pointed out that the priors of Bhattacharya et al. (2015) or Ghosh and Chakrabarti (2017) can be brought under the general framework of van der Pas et al. (2014), but each has its own salient features which enable one to derive a more concrete set of results. In particular, these priors, now commonly referred to as “global-local” priors following Carvalho et al. (2009, 2010), are scale mixtures of normal priors in which the scale parameters involve both global and local components. The global component shrinks the normal means towards zero, while the local parameters counterbalance this shrinkage with the aim of identifying and distinguishing the true signals from the noise. While the work of Ghosh and Chakrabarti (2017) considers a single global parameter and utilizes it as a tuning parameter, Bhattacharya et al. (2015) considered essentially multiple global parameters and assigned certain priors to them. These ideas will be made more specific in the following sections.

We first find the asymptotic minimax error for estimation of multivariate normal means under sparsity in the nearly-black sense (Castillo and van der Vaart, 2012). It coincides with the asymptotic minimax error in the univariate case, established by Donoho et al. (1992).

We then consider estimation of multivariate normal means under global-local priors. Like Ghosh and Chakrabarti (2017), we obtain exact asymptotic minimaxity results in this situation as well. Further, in the framework of Bhattacharya et al. (2015), where priors are placed on the global parameters, we obtain asymptotic minimaxity results in the multivariate case.

The final contribution of this paper is finding credible sets for multivariate normal means following the framework of van der Pas et al. (2017), who considered the univariate case. We have considered a general class of global-local priors which includes the now famous horseshoe prior, as well as a more specific class of priors which is in the framework of Bhattacharya et al. (2015). Like van der Pas et al. (2017), we have been able to identify parameter vectors for which the posteriors give good coverage, and others for which they do not.

The outline of the remaining sections is as follows. In Section 2.1, we find the asymptotic minimax error in the multivariate setting. In Section 2.2, we consider estimation of multivariate normal means under global-local priors and obtain the exact asymptotic minimax error. We also derive the corresponding posterior contraction rates around both the estimators and the true means. Section 3 addresses results related to credible sets of multivariate normal means. Some final remarks are made in Section 4. The proofs of some technical lemmas are given in Appendix A. The proofs of the main theorems are given in Appendix B.

2 Point Estimation of Multivariate Normal Means

2.1 Asymptotic Minimax Error under Nearly-Black Sparsity

Suppose \(X_i \overset{ind}{\sim }\ N(\theta _i, 1), i = 1, \dots , n\). To estimate multiple normal means, Ghosh and Chakrabarti (2017) used a general global-local prior in which the global parameter is treated as a tuning parameter, and established exact asymptotic minimaxity under this prior. There, the true means \({\theta }_{0i}\) \((1 \le i \le n)\) are assumed to be sparse in the nearly-black sense (Donoho et al., 1992; Castillo and van der Vaart, 2012), meaning that the number of non-zero \({\theta }_{0i}\)’s, say \(q_n\), is \(o(n)\) as \(n \rightarrow \infty \). The set of nearly-black mean vectors is denoted by \(l_0[q_n] = \{\varvec{\theta } \in \mathbb {R}^n: \sum _{i=1}^n \mathbbm {1}(\theta _{i} \ne 0) \le q_n\}\) with \(q_n = o(n)\). Donoho et al. (1992) provided the asymptotic minimax error,

$$\begin{aligned} \inf _{\widehat{\varvec{\theta }}} \sup _{\varvec{\theta }_0 \in l_0[q_n]} \sum _{i=1}^n E_{\theta _{0i}} \left( \widehat{\theta }_i - \theta _{0i}\right) ^2 = 2 q_n \log \left( \frac{n}{q_n}\right) (1+o(1)), \text { as } n \rightarrow \infty . \end{aligned}$$
(1)
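
As a quick univariate sanity check of (1), the following sketch (with purely illustrative values of n and \(q_n\)) applies coordinatewise soft thresholding at the universal threshold \(\sqrt{2\log (n/q_n)}\), an estimator known to attain this rate; the realized risk is comparable to the benchmark \(2 q_n \log (n/q_n)\).

```python
import numpy as np

rng = np.random.default_rng(0)
n, q_n = 100_000, 100
t = np.sqrt(2 * np.log(n / q_n))       # universal threshold sqrt(2 log(n/q_n))

theta0 = np.zeros(n)
theta0[:q_n] = 8.0                     # q_n strong signals; the rest exactly zero
x = theta0 + rng.standard_normal(n)

# coordinatewise soft thresholding at t
theta_hat = np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

risk = np.sum((theta_hat - theta0) ** 2)
benchmark = 2 * q_n * np.log(n / q_n)  # the right-hand side of (1)
print(f"empirical risk: {risk:.0f}, benchmark: {benchmark:.0f}")
```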

In the multivariate situation, sparsity of the true means \(\varvec{\theta }_{0i}\) \((1 \le i \le n)\) in the nearly-black sense analogously means that \(\sum _{i=1}^n \mathbbm {1}(\varvec{\theta }_{0i} \ne \varvec{0}) \le q_n\) with \(q_n = o(n)\). We denote the set of nearly-black multivariate means by \(L_0[q_n] = \{ \{\varvec{\theta }_{0i}\}_{i=1}^{n} : \varvec{\theta }_{0i} \in \mathbb {R}^k, i=1,\dots ,n, \sum _{i=1}^n \mathbbm {1}(\varvec{\theta }_{0i} \ne \varvec{0}) \le q_n\}\). One can prove that, in the multivariate setting, the asymptotic minimax error under the Mahalanobis distance loss is the same as the asymptotic minimax error under the squared error loss in the univariate setting. We use \(\Vert \cdot \Vert _{\Sigma }\) to denote the Mahalanobis norm, i.e., \(\Vert \varvec{X}_i \Vert _{\Sigma }^2 = \varvec{X}_i^T\varvec{\Sigma }^{-1}\varvec{X}_i\), where \(\varvec{\Sigma }\) is the positive definite population covariance matrix.
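
For concreteness, the Mahalanobis norm can be computed stably through a Cholesky solve rather than an explicit matrix inverse; a minimal Python sketch (the example \(\varvec{\Sigma }\) and x are arbitrary):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def mahalanobis_sq(x, Sigma):
    """||x||_Sigma^2 = x^T Sigma^{-1} x, computed via a Cholesky solve
    rather than an explicit inverse of Sigma."""
    c, low = cho_factor(Sigma)
    return float(x @ cho_solve((c, low), x))

Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
x = np.array([1.0, -1.0])
print(mahalanobis_sq(x, Sigma))   # x^T Sigma^{-1} x, here about 2.286
```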

Theorem 1

Suppose that \(\varvec{X}_i \sim N_k(\varvec{\theta }_i, \varvec{\Sigma })\), independently, for \(i=1,\dots ,n\), with a positive definite covariance matrix \(\varvec{\Sigma }\), and that the true mean vectors \(\{\varvec{\theta }_i\}_{i=1}^n\) are sparse in the nearly-black sense. If we measure the error of an estimator using the Mahalanobis distance loss, then, as \(n \rightarrow \infty \),

$$\begin{aligned} \inf _{\{\widehat{\varvec{\theta }_i}\}} \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]} \sum _{i=1}^n E_{\varvec{\theta }_{0i}} \Vert \widehat{\varvec{\theta }}_i - \varvec{\theta }_{0i}\Vert _{\Sigma }^2 = 2 q_n \log \left( \frac{n}{q_n}\right) (1+o(1)). \end{aligned}$$
(2)

Remark 1

When \(\varvec{\Sigma }\) is positive definite, the Mahalanobis norm \(\Vert \cdot \Vert _{\Sigma }\) is equivalent to the \(l_2\)-norm \(\Vert \cdot \Vert _2\), in the sense that there exist positive real constants \(c \le C\) such that \(c \Vert \varvec{X} \Vert _2 \le \Vert \varvec{X} \Vert _{\Sigma } \le C \Vert \varvec{X} \Vert _2\) for every \(\varvec{X} \in \mathbb {R}^k\). Specifically, \(c = \lambda _{\max }^{-1/2}(\varvec{\Sigma })\), the inverse square root of the largest eigenvalue of \(\varvec{\Sigma }\), and \(C = \lambda _{\min }^{-1/2}(\varvec{\Sigma })\), the inverse square root of the smallest eigenvalue. So, Theorem 1 will not give us an exact asymptotic minimax error under the \(l_2\)-norm unless \(\varvec{\Sigma }\) satisfies certain eigenvalue conditions. It does, however, give both lower and upper bounds on the minimax error under the \(l_2\)-norm, namely \(\lambda _{\min }(\varvec{\Sigma })\) and \(\lambda _{\max }(\varvec{\Sigma })\) times \(2 q_n \log \left( n/q_n\right) (1+o(1))\), respectively. Since both bounds are of the same rate, the minimax error under the \(l_2\)-norm must be of that rate as well, differing from \(2 q_n \log \left( n/q_n\right)\) by at most a constant factor.
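
These constants follow from the Rayleigh-quotient bounds for \(\varvec{\Sigma }^{-1}\):

$$\begin{aligned} \lambda _{\max }^{-1}(\varvec{\Sigma }) \Vert \varvec{X} \Vert _2^2 = \lambda _{\min }(\varvec{\Sigma }^{-1}) \Vert \varvec{X} \Vert _2^2 \le \varvec{X}^T\varvec{\Sigma }^{-1}\varvec{X} \le \lambda _{\max }(\varvec{\Sigma }^{-1}) \Vert \varvec{X} \Vert _2^2 = \lambda _{\min }^{-1}(\varvec{\Sigma }) \Vert \varvec{X} \Vert _2^2, \end{aligned}$$

and taking square roots gives the stated values of c and C.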

2.2 Minimax Estimation of Multivariate Normal Means

We first extend the results of Ghosh and Chakrabarti (2017) to the multivariate case, beginning with the following general global-local prior model:

(i) \(\varvec{X}_{i} \vert \varvec{\theta }_{i} \overset{ind}{\sim }\ N_k(\varvec{\theta }_{i}, \varvec{\Sigma })\), \(i = 1, \dots ,n \), where \(\varvec{\Sigma }\) is a known positive definite matrix;

(ii) \(\varvec{\theta }_{i} \vert \lambda _i^2 \overset{ind}{\sim }\ N_k(\varvec{0}, \lambda _i^2 \tau _n\varvec{\Sigma }) \), \(i = 1, \dots ,n \), where \(\tau _n \in (0,1)\) is a sequence of positive constants to be chosen later, with \(\tau _n \rightarrow 0\) as \(n \rightarrow \infty \);

(iii) \(\pi (\lambda _i^2) = K (\lambda _i^2) ^{-a-1} L(\lambda _i^2)\), \(i = 1, \dots ,n \), where \(a > 0\) and L is a slowly varying function.

In this model, the global parameter \(\tau _n\) is assumed to be a tuning parameter. Note that the horseshoe prior (Carvalho et al., 2009, 2010) is a special case of this prior in the univariate setup with \(a=1/2\).
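
To fix ideas, the following sketch draws from the model (i)-(ii) with this horseshoe-type choice of (iii); the values of n, k, \(\tau _n\) and \(\varvec{\Sigma }\) are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, tau_n = 500, 3, 0.01          # illustrative sizes and tuning parameter
Sigma = np.eye(k)                   # any known positive definite matrix works
chol = np.linalg.cholesky(Sigma)

# (iii) with a = 1/2: lambda_i ~ half-Cauchy(0, 1), so that pi(lambda_i^2) is
# proportional to (lambda_i^2)^{-3/2} L(lambda_i^2) with L(u) = u/(1+u)
lam = np.abs(rng.standard_cauchy(n))

# (ii): theta_i | lambda_i^2 ~ N_k(0, lambda_i^2 tau_n Sigma)
z = rng.standard_normal((n, k)) @ chol.T
theta = (lam * np.sqrt(tau_n))[:, None] * z

# (i): X_i | theta_i ~ N_k(theta_i, Sigma)
x = theta + rng.standard_normal((n, k)) @ chol.T
```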

The following regularity assumptions are made:

(I) L is non-decreasing in its argument, with \(0< m \le L(u) \le M < \infty \);

(II) \(0< \lambda _{\min } (\varvec{\Sigma }) \le \lambda _{\max } (\varvec{\Sigma }) < \infty \), where \(\lambda _{\min } (\varvec{\Sigma })\) and \(\lambda _{\max } (\varvec{\Sigma })\) denote the minimum and maximum eigenvalues of \(\varvec{\Sigma }\).

We estimate \(\varvec{\theta }_i\) using the posterior means under the global-local prior, i.e.,

$$\begin{aligned} \widehat{\varvec{\theta }}_i = E(\varvec{\theta }_i \mid \varvec{X}_i) = E(1-\kappa _i \mid \varvec{X}_i)\varvec{X}_i, \text { where } \kappa _i = (1+\lambda _i^2 \tau _n)^{-1}, \end{aligned}$$
(3)

and \(\kappa _i\) is the shrinkage factor. The estimators under the prior (iii) are specifically denoted by \(\widehat{\varvec{\theta }}^R_i\).
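
For a concrete instance, the posterior shrinkage weight \(E(1-\kappa _i \mid \varvec{X}_i)\) depends on \(\varvec{X}_i\) only through \(\Vert \varvec{X}_i \Vert _{\Sigma }^2\) and can be computed by one-dimensional quadrature. A sketch, assuming the horseshoe-type local prior (\(a = 1/2\), i.e., \(\lambda _i \sim \) half-Cauchy(0, 1)); the function name and parameter values are illustrative:

```python
import numpy as np
from scipy.integrate import quad

def shrinkage_weight(x_maha_sq, tau_n, k):
    """E(1 - kappa_i | X_i) in (3), for a horseshoe-type local prior
    (a = 1/2, lambda_i ~ half-Cauchy(0, 1)).  x_maha_sq is ||X_i||_Sigma^2;
    marginally, X_i | lambda_i ~ N_k(0, (1 + lambda_i^2 tau_n) Sigma)."""
    def integrand(lam, weighted):
        s = 1.0 + lam**2 * tau_n
        marginal = s**(-k / 2.0) * np.exp(-x_maha_sq / (2.0 * s))
        prior = 2.0 / (np.pi * (1.0 + lam**2))        # half-Cauchy density
        w = lam**2 * tau_n / s if weighted else 1.0   # w = 1 - kappa_i
        return w * marginal * prior
    num, _ = quad(integrand, 0.0, np.inf, args=(True,))
    den, _ = quad(integrand, 0.0, np.inf, args=(False,))
    return num / den

# small ||X_i||_Sigma^2 is shrunk hard towards zero; large values are kept
for m in (1.0, 10.0, 50.0):
    print(m, shrinkage_weight(m, tau_n=0.01, k=3))
```

The Bayes estimate (3) is then the returned weight times \(\varvec{X}_i\).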

We prove the following theorem under this model, in which (4) and (5) bound the error contributed by the zero and non-zero true means, respectively, and (6) follows immediately from the two. In particular, when \(0 < a \le 1\), (6) gives an upper bound for the error that matches the minimax lower bound provided by Theorem 1; this completes the proofs of both Theorem 1 and (7). As Theorem 2 shows, this general class of global-local priors attains the asymptotic minimax rate in the multivariate setting, and when \(0 < a \le 1\), it attains the exact asymptotic minimax error.

Theorem 2

Assume that the true means are sparse in the nearly-black sense. Under the regularity assumptions (I) and (II), using the global-local prior with a tuning parameter, i.e., a model satisfying (i), (ii) and (iii), if \(\tau _n = \left( q_n/n\right) ^{\frac{1+\epsilon }{\eta }}\), where \(\epsilon > 0\) and \(0< \eta < \min (1,a)\), then, for any valid choice of \(\epsilon \) and \(\eta \),

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } { \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]}} {\sum \limits _{i: \varvec{\theta }_{0i} = \varvec{0}} E_0 \Vert \widehat{\varvec{\theta }}^R_i \Vert _{\Sigma }^2} / {\left( q_n \log \left( \frac{n}{q_n}\right) \right) } = 0, \end{aligned}$$
(4)

and

$$\begin{aligned} \limsup \limits _{n \rightarrow \infty } { \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]}} \frac{\sum \limits _{i: \varvec{\theta }_{0i} \ne \varvec{0}} E_{\varvec{\theta }_{0i}} \Vert \widehat{\varvec{\theta }}^R_i - \varvec{\theta }_{0i}\Vert _{\Sigma }^2}{2 q_n \log (n/q_n)} \le a/\min (1,a). \end{aligned}$$
(5)

Consequently,

$$\begin{aligned} \limsup \limits _{n \rightarrow \infty } { \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]}} \frac{\sum _{i=1}^{n} E_{\varvec{\theta }_{0i}} \Vert \widehat{\varvec{\theta }}^R_i - \varvec{\theta }_{0i}\Vert _{\Sigma }^2}{2 q_n \log (n/q_n)} \le a/\min (1,a). \end{aligned}$$
(6)

In particular, since the minimax error (2) provides a lower bound, when \(0 < a \le 1\), one gets the result

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } { \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]}} \frac{\sum _{i=1}^{n} E_{\varvec{\theta }_{0i}} \Vert \widehat{\varvec{\theta }}^R_i - \varvec{\theta }_{0i}\Vert _{\Sigma }^2}{2 q_n \log (n/q_n)} = 1. \end{aligned}$$
(7)

The following theorem provides the rates of posterior contraction for this prior around both the Bayes estimators and the true means. By (8), the posterior distributions contract around the Bayes estimators at least as fast as the minimax rate. However, by (9), the rate of posterior contraction around the true means may be slower than the minimax rate by a factor \(M_n\) that can grow arbitrarily slowly.

Theorem 3

Under the assumptions of Theorem 2, we have

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]} E_{\{\varvec{\theta }_{0i}\}} \Pi (\sum _{i=1}^{n} \Vert \varvec{\theta }_i - \widehat{\varvec{\theta }}^R_i \Vert ^2 > q_n \log (\frac{n}{q_n}) \mid \{\varvec{X}_i\}) = 0, \end{aligned}$$
(8)

and

$$\begin{aligned} \limsup \limits _{n \rightarrow \infty } \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]} E_{\{\varvec{\theta }_{0i}\}} \Pi (\sum _{i=1}^{n} \Vert \varvec{\theta }_i - {\varvec{\theta }}_{0i} \Vert ^2 > M_n q_n \log (\frac{n}{q_n}) \mid \{\varvec{X}_i\}) = 0, \end{aligned}$$
(9)

for any \(\{M_n\}\) such that \(\lim _{n\rightarrow \infty } M_n = \infty \).

Next we extend the work of Bhattacharya et al. (2015) to the present multivariate framework. We consider the following prior in which (i) remains the same as in our earlier formulation, while (ii) and (iii) are replaced respectively by

(ii’) \(\varvec{\theta }_{i} \vert \lambda _i^2, \tau _i \overset{ind}{\sim }\ N_k(\varvec{0}, \lambda _i^2 \tau _i\varvec{\Sigma }) \), \(i = 1, \dots ,n \);

(iii’) \(\lambda _i^2\) and \(\tau _i\) are mutually independent. The \(\lambda _i^2\)’s are independent with \(\pi (\lambda _i^2) \propto \exp (-\lambda _i^2/2)\), \(i = 1, \dots ,n \), while the \(\tau _i\)’s are also independent with \(\pi (\tau _i) \propto \exp (-c_n/(2\tau _i)) \tau _i^{-d-1}\), where \(0<d<1\) and \(c_n \rightarrow 0\) is a sequence to be chosen later.

As also noted by Bhattacharya et al. (2015), the Dirichlet-Laplace priors can be recast in the above form, except that those authors put a Gamma prior on \(\tau _i\) whereas we put an Inverse-Gamma prior on it. Due to this difference, we refer to the prior defined by (ii’) and (iii’) as the Exponential-Inverse-Gamma prior.

It is worth mentioning that writing \(u_i = \lambda _i^2 \tau _i\), one gets \(\pi (u_i) \propto (u_i + c_n)^{-d-1}\), and one can directly use the \(u_i\) for inferential purposes. Further, this particular formulation is a special case of van der Pas et al. (2016), who provided a very general result concerning asymptotic minimaxity for univariate normal means. However, it is more convenient to work with separate priors for \(\lambda _i^2\) and \(\tau _i\), since the explicit form of these priors keeps the calculations smooth. As an aside, \(u_i/c_n\) has a beta prime prior with parameters \(a=1\) and \(b=d\), which is the prior considered in Armagan et al. (2011) and Griffin and Brown (2017).
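
The claimed marginal of \(u_i\) is easy to check by simulation; below is a minimal sketch, with purely illustrative values \(d = 0.5\) and \(c_n = 10^{-3}\).

```python
import numpy as np

rng = np.random.default_rng(2)
d, c_n, N = 0.5, 1e-3, 1_000_000   # d and c_n are purely illustrative

# pi(lambda^2) proportional to exp(-lambda^2/2): Exponential with scale 2
lam_sq = rng.exponential(scale=2.0, size=N)
# pi(tau) proportional to exp(-c_n/(2 tau)) tau^{-d-1}: Inverse-Gamma(d, c_n/2)
tau = 1.0 / rng.gamma(shape=d, scale=2.0 / c_n, size=N)
u = lam_sq * tau

# the ratio of the empirical density of u to (u + c_n)^{-d-1} should be
# roughly constant over the grid if the claimed marginal is correct
grid = np.array([0.01, 0.1, 1.0, 10.0])
emp = np.array([np.mean(np.abs(u - g) < 0.05 * g) / (0.1 * g) for g in grid])
print(emp / (grid + c_n) ** (-d - 1))
```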

With the above formulation, the estimators of the \(\varvec{\theta }_i\) are denoted by \(\widehat{\varvec{\theta }}^{EIG}_i\). We prove Theorem 4 in this setup; it shows that this prior also attains the exact asymptotic minimax error.

Theorem 4

Assume that the true means are sparse in the nearly-black sense. Under the regularity assumption (II),  using the Exponential-Inverse-Gamma prior, i.e., a model satisfying (i), (ii’) and (iii’) above, if \(c_n = \left( q_n/n\right) ^{\frac{1+\epsilon }{d}}\), where \(\epsilon > 0\), then, for any valid choice of \(\epsilon \),

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } { \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]}}{\sum \limits _{i: \varvec{\theta }_{0i} = \varvec{0}} E_0 \Vert \widehat{\varvec{\theta }}^{EIG}_i \Vert ^2} / {\left( q_n \log \left( \frac{n}{q_n}\right) \right) } = 0, \end{aligned}$$
(10)

and

$$\begin{aligned} \limsup \limits _{n \rightarrow \infty } { \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]}} \frac{\sum \limits _{i: \varvec{\theta }_{0i} \ne \varvec{0}} E_{\varvec{\theta }_{0i}} \Vert \widehat{\varvec{\theta }}^{EIG}_i - \varvec{\theta }_{0i}\Vert ^2}{2 q_n \log (n/q_n)} \le 1. \end{aligned}$$
(11)

Consequently,

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } { \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]}} \frac{\sum _{i=1}^{n} E_{\varvec{\theta }_{0i}} \Vert \widehat{\varvec{\theta }}^{EIG}_i - \varvec{\theta }_{0i}\Vert ^2}{2 q_n \log (n/q_n)} = 1. \end{aligned}$$
(12)

We also have the following results on the posterior contraction rates around the Bayes estimators and the true means; the contraction rates are the same as under the tuning-parameter model.

Theorem 5

Under the assumptions of Theorem 4, we have

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]} E_{\{\varvec{\theta }_{0i}\}} \Pi (\sum _{i=1}^{n} \Vert \varvec{\theta }_i - \widehat{\varvec{\theta }}^{EIG}_i \Vert ^2 > q_n \log (\frac{n}{q_n}) \mid \{\varvec{X}_i\}) = 0, \end{aligned}$$
(13)

and

$$\begin{aligned} \limsup \limits _{n \rightarrow \infty } \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]} E_{\{\varvec{\theta }_{0i}\}} \Pi (\sum _{i=1}^{n} \Vert \varvec{\theta }_i - {\varvec{\theta }}_{0i} \Vert ^2 > M_n q_n \log (\frac{n}{q_n}) \mid \{\varvec{X}_i\}) = 0, \end{aligned}$$
(14)

for any \(\{M_n\}\) such that \(\lim _{n\rightarrow \infty } M_n = \infty \).

3 Credible Sets of Multivariate Normal Means

In this section, we first study coverage probabilities of credible sets constructed under the global-local priors defined by (ii) and (iii), with the global parameter treated as a tuning parameter. We consider credible sets of the form:

$$\begin{aligned} \widehat{C}^{R}_{n,i} = \{\varvec{\theta }_i: \Vert \varvec{\theta }_i - \widehat{\varvec{\theta }}^R_{i} \Vert _{\Sigma }^2 \le L \widehat{r}_{n,i}^{a/(1+\rho )}(\alpha , \tau _n) \}, \end{aligned}$$
(15)

for some \(\rho > 0\) to be chosen later, where \(\widehat{r}_{n,i}(\alpha , \tau _n)\) is determined from

$$\begin{aligned} \Pi ( \Vert \varvec{\theta }_i - \widehat{\varvec{\theta }}^R_{i} \Vert _{\Sigma }^2 \le \widehat{r}_{n,i}(\alpha , \tau _n) \mid \varvec{X}_i) = 1 - \alpha . \end{aligned}$$

In the following, we will omit the subscript n in \(\widehat{C}^{R}_{n,i}\) and \(\widehat{r}_{n,i}\) for notational simplicity.
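
Computationally, \(\widehat{r}_{i}(\alpha , \tau _n)\) is simply the \((1-\alpha )\) posterior quantile of \(\Vert \varvec{\theta }_i - \widehat{\varvec{\theta }}^R_{i} \Vert _{\Sigma }^2\). Given draws from the posterior of \(\varvec{\theta }_i\) (we assume a posterior sampler is available and do not detail it), the set (15) can be formed as in the following sketch; the function names and the default values of L, a and \(\rho \) are illustrative placeholders, and L must satisfy the lower bound in Theorem 6 below.

```python
import numpy as np

def credible_set_radius_sq(theta_draws, theta_hat, Sigma_inv,
                           alpha=0.05, L=1.2, a=0.5, rho=1.0):
    """Squared radius L * r_hat^{a/(1+rho)} of the set (15) for one mean vector.
    theta_draws: (S, k) array of posterior draws of theta_i;
    r_hat is the (1 - alpha) posterior quantile of ||theta_i - theta_hat||_Sigma^2."""
    diff = theta_draws - theta_hat
    maha_sq = np.einsum('sk,kj,sj->s', diff, Sigma_inv, diff)
    r_hat = np.quantile(maha_sq, 1.0 - alpha)
    return L * r_hat ** (a / (1.0 + rho))

def covers(theta_true, theta_hat, Sigma_inv, radius_sq):
    """Check whether theta_true lies in the credible set."""
    dev = theta_true - theta_hat
    return float(dev @ Sigma_inv @ dev) <= radius_sq
```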

Following van der Pas et al. (2017), we classify the true mean vectors into three categories:

$$\begin{aligned} \begin{aligned} \mathcal {S}&:= \{\varvec{\theta }_{0i}: \Vert \varvec{\theta }_{0i} \Vert _{\Sigma }^2 \le K_S \tau _n\} \\ \mathcal {M}&:= \{\varvec{\theta }_{0i}: f_{\tau _n} \tau _n \le \Vert \varvec{\theta }_{0i} \Vert _{\Sigma }^2 \le K_M \log \frac{1}{\tau _n}\}\\ \mathcal {L}&:= \{\varvec{\theta }_{0i}: \Vert \varvec{\theta }_{0i} \Vert _{\Sigma }^2 \ge K_L \log \frac{1}{\tau _n}\} \end{aligned} \end{aligned}$$

for some positive constants \(K_S\), \(K_M\) and \(K_L\), and some \(f_{\tau _n}\) that goes to infinity as \(\tau _n\) goes to zero. We will show that the proposed credible sets cover the true means in either \(\mathcal {S}\) or \(\mathcal {L}\) with the desired probability, while the true means in \(\mathcal {M}\) fail to be covered with probability tending to one. The results are summarized in the following theorem.

Theorem 6

Consider the global-local prior with a tuning parameter \(\tau _n\), i.e., a model satisfying (i), (ii) and (iii), with \(a < 1\), under the regularity assumptions (I) and (II). Suppose that \(K_S > 0\), \(K_M < 2a\) and \(K_L > 2a\), and that \(f_{\tau _n} \rightarrow \infty \) and \(f_{\tau _n} \tau _n \rightarrow 0\) as \(\tau _n \rightarrow 0\). Then, given \(\alpha \), for the credible sets of form (15) with \(L > \chi ^2_{k,\alpha }(\chi ^2_{k,\beta })^{-a/(1+\rho )}\) for some fixed \(\beta > \alpha \) and \(\rho > 0\),

$$\begin{aligned} P_{\varvec{\theta }_{0i}} (\varvec{\theta }_{0i} \in \widehat{C}^{R}_{i}) \ge 1-\alpha , \text { if } \varvec{\theta }_{0i} \in \mathcal {S}, \end{aligned}$$
(16)
$$\begin{aligned} P_{\varvec{\theta }_{0i}} (\varvec{\theta }_{0i} \notin \widehat{C}^{R}_{i}) \rightarrow 1, \text { if } \varvec{\theta }_{0i} \in \mathcal {M}, \end{aligned}$$
(17)
$$\begin{aligned} P_{\varvec{\theta }_{0i}} (\varvec{\theta }_{0i} \in \widehat{C}^{R}_{i}) \ge 1-\alpha , \text { if } \varvec{\theta }_{0i} \in \mathcal {L}, \end{aligned}$$
(18)

as \(\tau _n \rightarrow 0\).

Remark 2

From the proof of the theorem, the conclusions for \(\varvec{\theta }_{0i}\) in either \(\mathcal {S}\) or \(\mathcal {M}\) do not rely on any specific choice of L; only the conclusion for \(\mathcal {L}\) places a requirement on L. To make the credible sets as narrow as possible, since \(L > \chi ^2_{k,\alpha }(\chi ^2_{k,\beta })^{-a/(1+\rho )}\), we should choose \(\beta \) as close to \(\alpha \) as possible. As for \(\rho \), noticing that

$$L \widehat{r}_i^{a/(1+\rho )}(\alpha , \tau _n) > \chi ^2_{k,\alpha }(\widehat{r}_i(\alpha , \tau _n)/\chi ^2_{k,\beta })^{a/(1+\rho )},$$

the choice should depend on \(\widehat{r}_i(\alpha , \tau _n)/\chi ^2_{k,\beta }\). For instance, when \(\widehat{r}_i(\alpha , \tau _n)/\chi ^2_{k,\beta } > 1\), we can choose \(\rho \) as large as possible, so that \(L \widehat{r}_i^{a/(1+\rho )}(\alpha , \tau _n)\) essentially reduces to \(\chi ^2_{k,\alpha }\). On the other hand, when \(\widehat{r}_i(\alpha , \tau _n)/\chi ^2_{k,\beta } < 1\), it is preferable to choose \(\rho \) close to 0. This observation also motivates an individualized choice \(L_i\) instead of a common L for all i, so that each credible set can be made as narrow as possible while maintaining the theoretical coverage probability.
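
A small numerical illustration of this trade-off (all values are purely illustrative): when \(\widehat{r}_i > \chi ^2_{k,\beta }\), a large \(\rho \) gives the smaller radius, while for \(\widehat{r}_i < \chi ^2_{k,\beta }\), a small \(\rho \) does.

```python
import numpy as np
from scipy.stats import chi2

k, alpha, beta, a = 3, 0.05, 0.10, 0.5
chi_a, chi_b = chi2.ppf(1 - alpha, k), chi2.ppf(1 - beta, k)

for r_hat in (0.5 * chi_b, 2.0 * chi_b):    # r_hat below / above chi^2_{k,beta}
    for rho in (0.1, 10.0):
        L = 1.001 * chi_a * chi_b ** (-a / (1 + rho))  # just above the bound
        radius_sq = L * r_hat ** (a / (1 + rho))
        print(f"r_hat/chi2_b={r_hat/chi_b:.1f}  rho={rho:4.1f}  "
              f"radius^2={radius_sq:.2f}")
```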

Assuming sparsity in the nearly-black sense, most true means lie in the set \(\mathcal {S}\): all but at most \(q_n = o(n)\) of them are exactly \(\varvec{0}\) and hence belong to \(\mathcal {S}\). This fact immediately leads to a high overall coverage probability, i.e., the following corollary.

Corollary 1

Under the setup of Theorem 6, further assume that the true means \(\varvec{\theta }_{0i}\) are sparse in the nearly-black sense. Then, for all but at most \(q_n = o(n)\) of the indices \(i=1,\dots ,n\), as \(\tau _n \rightarrow 0\),

$$\begin{aligned} P_{\varvec{\theta }_{0i}} (\varvec{\theta }_{0i} \in \widehat{C}^{R}_{i}) \ge 1-\alpha . \end{aligned}$$

Next, we study coverage probabilities of credible sets constructed under the Exponential-Inverse-Gamma prior defined by (ii’) and (iii’). We consider credible sets of the same form as in the previous setup:

$$\begin{aligned} \widehat{C}^{EIG}_{i} = \{\varvec{\theta }_i: \Vert \varvec{\theta }_i - \widehat{\varvec{\theta }}^{EIG}_{i} \Vert _{\Sigma }^2 \le L \widehat{r}_{i}^{d/(1+\rho )}(\alpha , c_n) \}, \end{aligned}$$
(19)

for some \(\rho > 0\) to be chosen later, where \(\widehat{r}_{i}(\alpha , c_n)\) is determined from

$$\begin{aligned} \Pi ( \Vert \varvec{\theta }_i - \widehat{\varvec{\theta }}^{EIG}_{i} \Vert _{\Sigma }^2 \le \widehat{r}_{i}(\alpha , c_n) \mid \varvec{X}_i) = 1 - \alpha . \end{aligned}$$

Here, we divide the true mean vectors into the following three categories:

$$\begin{aligned} \begin{aligned} \mathcal {S}'&:= \{\varvec{\theta }_{0i}: \Vert \varvec{\theta }_{0i} \Vert _{\Sigma }^2 \le K'_S c_n\} \\ \mathcal {M}'&:= \{\varvec{\theta }_{0i}: f'_{c_n} c_n \le \Vert \varvec{\theta }_{0i} \Vert _{\Sigma }^2 \le K'_M \log \frac{1}{c_n}\}\\ \mathcal {L}'&:= \{\varvec{\theta }_{0i}: \Vert \varvec{\theta }_{0i} \Vert _{\Sigma }^2 \ge K'_L \log \frac{1}{c_n}\} \end{aligned} \end{aligned}$$

for some positive constants \(K'_S\), \(K'_M\) and \(K'_L\), and some \(f'_{c_n}\) that goes to infinity as \(c_n\) goes to zero. Similar results on the coverage probabilities hold under this prior.

Theorem 7

Consider the Exponential-Inverse-Gamma prior, i.e., a model satisfying (i), (ii’) and (iii’), under the regularity assumption (II). Suppose that \(K'_S > 0\), \(K'_M < 2d\) and \(K'_L > 2d\), and that \(f'_{c_n} \rightarrow \infty \) and \(f'_{c_n} c_n \rightarrow 0\) as \(c_n \rightarrow 0\). Then, given \(\alpha < 1/2\), for the credible sets of form (19) with \(L > \chi ^2_{k,\alpha }(\chi ^2_{k,\beta })^{-d/(1+\rho )}\) for some fixed \(\beta > \alpha \) and \(\rho > 0\),

$$\begin{aligned} P_{\varvec{\theta }_{0i}} (\varvec{\theta }_{0i} \in \widehat{C}^{EIG}_{i}) \ge 1-\alpha , \text { if } \varvec{\theta }_{0i} \in \mathcal {S}', \end{aligned}$$
(20)
$$\begin{aligned} P_{\varvec{\theta }_{0i}} (\varvec{\theta }_{0i} \notin \widehat{C}^{EIG}_{i}) \rightarrow 1, \text { if } \varvec{\theta }_{0i} \in \mathcal {M}', \end{aligned}$$
(21)
$$\begin{aligned} P_{\varvec{\theta }_{0i}} (\varvec{\theta }_{0i} \in \widehat{C}^{EIG}_{i}) \ge 1-\alpha , \text { if } \varvec{\theta }_{0i} \in \mathcal {L}', \end{aligned}$$
(22)

as \(c_n \rightarrow 0\).

The following corollary is also immediate due to the nearly-black sparsity.

Corollary 2

Under the setup of Theorem 7, further assume that the true means \(\varvec{\theta }_{0i}\) are sparse in the nearly-black sense. Then, for all but at most \(q_n = o(n)\) of the indices \(i=1,\dots ,n\), as \(c_n \rightarrow 0\),

$$\begin{aligned} P_{\varvec{\theta }_{0i}} (\varvec{\theta }_{0i} \in \widehat{C}^{EIG}_{i}) \ge 1-\alpha . \end{aligned}$$

4 Final Remarks

This paper addresses asymptotic estimation of multivariate normal means under global-local priors. We first derive the asymptotic minimax error in the multivariate setup. We then show that this error is attained exactly when the global parameter is treated as a tuning parameter, and also under the Exponential-Inverse-Gamma priors, which place priors on the global parameters in the framework of Bhattacharya et al. (2015). Finally, credible sets are constructed under global-local priors, extending the ideas of van der Pas et al. (2017) to the multivariate case.