1 Introduction

Estimation of normal means under sparsity has a long history, and it is central to the analysis of high-dimensional data. For example, in microarray experiments there is a multitude of genes, but only a few have an impact on a given disease. A foundational article is Donoho et al. (1992), who provided the asymptotic minimax rate for estimation of normal means when a large majority of the means are zero but a few depart significantly from zero. The idea was pursued in a Bayesian framework by Castillo and van der Vaart (2012), who obtained the same asymptotic minimax rate under a class of priors \(\Pi _n\) with “exponential decay” (see (2.2) of their paper for its definition).

The present work addresses the same problem, but stems from several excellent recent articles by Bhattacharya et al. (2015), van der Pas et al. (2014), van der Pas et al. (2016) and Ghosh and Chakrabarti (2017). In particular, our paper has a more direct structural connection with the work of Ghosh and Chakrabarti (2017), but extends their work in certain directions.

It may be pointed out that the priors of Bhattacharya et al. (2015) or Ghosh and Chakrabarti (2017) can be brought under the general framework of van der Pas et al. (2014), but each has its own salient features which enable one to derive a more concrete set of results. In particular, these priors, now commonly referred to as “global-local” priors following Carvalho et al. (2009, 2010), are scale mixtures of normal priors in which the scale parameters involve both global and local components. The global component shrinks the normal means towards zero, while the local parameters counterbalance this shrinkage with the aim of identifying and distinguishing the true signals from the noise. While the work of Ghosh and Chakrabarti (2017) considers a single global parameter and utilizes it as a tuning parameter, Bhattacharya et al. (2015) considered essentially multiple global parameters and assigned certain priors to them. These ideas will be made more specific in the following sections.

We first find the asymptotic minimax error for estimation of multivariate normal means under sparsity in the nearly-black sense (Castillo and van der Vaart, 2012). It coincides with the asymptotic minimax error in the univariate case, established by Donoho et al. (1992).

We then consider estimation of multivariate normal means under global-local priors. Like Ghosh and Chakrabarti (2017), we obtain exact asymptotic minimaxity results in this situation as well. Further, in the framework of Bhattacharya et al. (2015), where priors are placed on the global parameters, we obtain asymptotic minimaxity results in the multivariate case.

The final contribution of this paper is finding credible sets for multivariate normal means following the framework of van der Pas et al. (2017), who considered the univariate case. We have considered a general class of global-local priors which includes the now famous horseshoe prior, as well as a more specific class of priors which is in the framework of Bhattacharya et al. (2015). Like van der Pas et al. (2017), we have been able to identify parameter vectors for which the posteriors give good coverage, and others for which they do not.

The outline of the remaining sections is as follows. In Section 2.1, we find the asymptotic minimax error in the multivariate setting. In Section 2.2, we consider estimation of multivariate normal means under global-local priors and obtain the exact asymptotic minimax error. We also derive the corresponding posterior contraction rates around both the estimators and the true means. Section 3 addresses results related to credible sets of multivariate normal means. Some final remarks are made in Section 4. The proofs of some technical lemmas are given in Appendix A. The proofs of the main theorems are given in Appendix B.

2 Point Estimation of Multivariate Normal Means

2.1 Asymptotic Minimax Error under Nearly-Black Sparsity

Suppose \(X_i \overset{ind}{\sim }\ N(\theta _i, 1), i = 1, \dots , n\). To estimate multiple normal means, Ghosh and Chakrabarti (2017) used a general global-local prior in which the global parameter is treated as a tuning parameter, and established exact asymptotic minimaxity under this prior. There, the true means \({\theta }_{0i}\) \((1 \le i \le n)\) are assumed to be sparse in the nearly-black sense (Donoho et al., 1992; Castillo and van der Vaart, 2012), meaning that the number of non-zero \({\theta }_{0i}\)’s, say \(q_n\), is \(o(n)\) as \(n \rightarrow \infty \). The set of nearly-black mean vectors is denoted by \(l_0[q_n] = \{\varvec{\theta } \in \mathbb {R}^n: \sum _{i=1}^n \mathbbm {1}(\theta _{i} \ne 0) \le q_n\}\) with \(q_n = o(n)\). Donoho et al. (1992) provided the asymptotic minimax error,

$$\begin{aligned} \inf _{\widehat{\varvec{\theta }}} \sup _{\varvec{\theta }_0 \in l_0[q_n]} \sum _{i=1}^n E_{\theta _{0i}} \left( \widehat{\theta }_i - \theta _{0i}\right) ^2 = 2 q_n \log \left( \frac{n}{q_n}\right) (1+o(1)), \text { as } n \rightarrow \infty . \end{aligned}$$
(1)
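
As a quick univariate sanity check of (1), the following sketch (with purely illustrative values of n and \(q_n\)) applies coordinatewise soft thresholding at the universal threshold \(\sqrt{2\log (n/q_n)}\), an estimator known to attain this rate; the realized risk is comparable to the benchmark \(2 q_n \log (n/q_n)\).

```python
import numpy as np

rng = np.random.default_rng(0)
n, q_n = 100_000, 100
t = np.sqrt(2 * np.log(n / q_n))       # universal threshold sqrt(2 log(n/q_n))

theta0 = np.zeros(n)
theta0[:q_n] = 8.0                     # q_n strong signals; the rest exactly zero
x = theta0 + rng.standard_normal(n)

# coordinatewise soft thresholding at t
theta_hat = np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

risk = np.sum((theta_hat - theta0) ** 2)
benchmark = 2 * q_n * np.log(n / q_n)  # the right-hand side of (1)
print(f"empirical risk: {risk:.0f}, benchmark: {benchmark:.0f}")
```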

In the multivariate situation, sparsity of the true means \(\varvec{\theta }_{0i}\) \((1 \le i \le n)\) in the nearly-black sense analogously means that \(\sum _{i=1}^n \mathbbm {1}(\varvec{\theta }_{0i} \ne \varvec{0}) \le q_n\) with \(q_n = o(n)\). We denote the set of nearly-black multivariate means by \(L_0[q_n] = \{ \{\varvec{\theta }_{0i}\}_{i=1}^{n} : \varvec{\theta }_{0i} \in \mathbb {R}^k, i=1,\dots ,n, \sum _{i=1}^n \mathbbm {1}(\varvec{\theta }_{0i} \ne \varvec{0}) \le q_n\}\). One can prove that, in the multivariate setting, the asymptotic minimax error under the Mahalanobis distance loss is the same as the asymptotic minimax error under the squared error loss in the univariate setting. We use \(\Vert \cdot \Vert _{\Sigma }\) to denote the Mahalanobis norm, i.e., \(\Vert \varvec{X}_i \Vert _{\Sigma }^2 = \varvec{X}_i^T\varvec{\Sigma }^{-1}\varvec{X}_i\), where \(\varvec{\Sigma }\) is the positive definite population covariance matrix.
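
For concreteness, the Mahalanobis norm can be computed stably through a Cholesky solve rather than an explicit matrix inverse; a minimal Python sketch (the example \(\varvec{\Sigma }\) and x are arbitrary):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def mahalanobis_sq(x, Sigma):
    """||x||_Sigma^2 = x^T Sigma^{-1} x, computed via a Cholesky solve
    rather than an explicit inverse of Sigma."""
    c, low = cho_factor(Sigma)
    return float(x @ cho_solve((c, low), x))

Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
x = np.array([1.0, -1.0])
print(mahalanobis_sq(x, Sigma))   # x^T Sigma^{-1} x, here about 2.286
```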

Theorem 1

Suppose that \(\varvec{X}_i \sim N_k(\varvec{\theta }_i, \varvec{\Sigma })\), independently, for \(i=1,\dots ,n\), with a positive definite covariance matrix \(\varvec{\Sigma }\), and that the true mean vectors \(\{\varvec{\theta }_i\}_{i=1}^n\) are sparse in the nearly-black sense. If we measure the error of an estimator using the Mahalanobis distance loss, then, as \(n \rightarrow \infty \),

$$\begin{aligned} \inf _{\{\widehat{\varvec{\theta }_i}\}} \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]} \sum _{i=1}^n E_{\varvec{\theta }_{0i}} \Vert \widehat{\varvec{\theta }}_i - \varvec{\theta }_{0i}\Vert _{\Sigma }^2 = 2 q_n \log \left( \frac{n}{q_n}\right) (1+o(1)). \end{aligned}$$
(2)

Remark 1

When \(\varvec{\Sigma }\) is positive definite, the Mahalanobis norm \(\Vert \cdot \Vert _{\Sigma }\) is equivalent to the \(l_2\)-norm \(\Vert \cdot \Vert _2\), in the sense that there exist positive real constants \(c \le C\) such that \(c \Vert \varvec{X} \Vert _2 \le \Vert \varvec{X} \Vert _{\Sigma } \le C \Vert \varvec{X} \Vert _2\) for every \(\varvec{X} \in \mathbb {R}^k\). Specifically, \(c = \lambda _{\max }^{-1/2}(\varvec{\Sigma })\), the inverse square root of the largest eigenvalue of \(\varvec{\Sigma }\), and \(C = \lambda _{\min }^{-1/2}(\varvec{\Sigma })\), the inverse square root of the smallest eigenvalue. So, Theorem 1 will not give us an exact asymptotic minimax error under the \(l_2\)-norm unless \(\varvec{\Sigma }\) satisfies certain eigenvalue conditions. It does, however, give both lower and upper bounds on the minimax error under the \(l_2\)-norm, namely \(\lambda _{\min }(\varvec{\Sigma })\) and \(\lambda _{\max }(\varvec{\Sigma })\) times \(2 q_n \log \left( n/q_n\right) (1+o(1))\), respectively. Since both bounds are of the same rate, the minimax error under the \(l_2\)-norm must be of that rate as well, differing from \(2 q_n \log \left( n/q_n\right)\) by at most a constant factor.
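
These constants follow from the Rayleigh-quotient bounds for \(\varvec{\Sigma }^{-1}\):

$$\begin{aligned} \lambda _{\max }^{-1}(\varvec{\Sigma }) \Vert \varvec{X} \Vert _2^2 = \lambda _{\min }(\varvec{\Sigma }^{-1}) \Vert \varvec{X} \Vert _2^2 \le \varvec{X}^T\varvec{\Sigma }^{-1}\varvec{X} \le \lambda _{\max }(\varvec{\Sigma }^{-1}) \Vert \varvec{X} \Vert _2^2 = \lambda _{\min }^{-1}(\varvec{\Sigma }) \Vert \varvec{X} \Vert _2^2, \end{aligned}$$

and taking square roots gives the stated values of c and C.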

2.2 Minimax Estimation of Multivariate Normal Means

We first extend the results of Ghosh and Chakrabarti (2017) to the multivariate case, beginning with the following general global-local prior model:

(i) \(\varvec{X}_{i} \vert \varvec{\theta }_{i} \overset{ind}{\sim }\ N_k(\varvec{\theta }_{i}, \varvec{\Sigma })\), \(i = 1, \dots ,n \), where \(\varvec{\Sigma }\) is a known positive definite matrix;

(ii) \(\varvec{\theta }_{i} \vert \lambda _i^2 \overset{ind}{\sim }\ N_k(\varvec{0}, \lambda _i^2 \tau _n\varvec{\Sigma }) \), \(i = 1, \dots ,n \), where \(\tau _n \in (0,1)\) is a sequence of positive constants to be chosen later, with \(\tau _n \rightarrow 0\) as \(n \rightarrow \infty \);

(iii) \(\pi (\lambda _i^2) = K (\lambda _i^2) ^{-a-1} L(\lambda _i^2)\), \(i = 1, \dots ,n \), where \(a > 0\) and L is a slowly varying function.

In this model, the global parameter \(\tau _n\) is assumed to be a tuning parameter. Note that the horseshoe prior (Carvalho et al., 2009, 2010) is a special case of this prior in the univariate setup with \(a=1/2\).
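
To fix ideas, the following sketch draws from the model (i)-(ii) with this horseshoe-type choice of (iii); the values of n, k, \(\tau _n\) and \(\varvec{\Sigma }\) are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, tau_n = 500, 3, 0.01          # illustrative sizes and tuning parameter
Sigma = np.eye(k)                   # any known positive definite matrix works
chol = np.linalg.cholesky(Sigma)

# (iii) with a = 1/2: lambda_i ~ half-Cauchy(0, 1), so that pi(lambda_i^2) is
# proportional to (lambda_i^2)^{-3/2} L(lambda_i^2) with L(u) = u/(1+u)
lam = np.abs(rng.standard_cauchy(n))

# (ii): theta_i | lambda_i^2 ~ N_k(0, lambda_i^2 tau_n Sigma)
z = rng.standard_normal((n, k)) @ chol.T
theta = (lam * np.sqrt(tau_n))[:, None] * z

# (i): X_i | theta_i ~ N_k(theta_i, Sigma)
x = theta + rng.standard_normal((n, k)) @ chol.T
```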

The following regularity assumptions are made:

(I) L is non-decreasing in its argument, with \(0< m \le L(u) \le M < \infty \);

(II) \(0< \lambda _{\min } (\varvec{\Sigma }) \le \lambda _{\max } (\varvec{\Sigma }) < \infty \), where \(\lambda _{\min } (\varvec{\Sigma })\) and \(\lambda _{\max } (\varvec{\Sigma })\) denote the minimum and maximum eigenvalues of \(\varvec{\Sigma }\).

We estimate \(\varvec{\theta }_i\) using the posterior means under the global-local prior, i.e.,

$$\begin{aligned} \widehat{\varvec{\theta }}_i = E(\varvec{\theta }_i \mid \varvec{X}_i) = E(1-\kappa _i \mid \varvec{X}_i)\varvec{X}_i, \text { where } \kappa _i = (1+\lambda _i^2 \tau _n)^{-1}, \end{aligned}$$
(3)

and \(\kappa _i\) is the shrinkage factor. The estimators under the prior (iii) are specifically denoted by \(\widehat{\varvec{\theta }}^R_i\).
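
For a concrete instance, the posterior shrinkage weight \(E(1-\kappa _i \mid \varvec{X}_i)\) depends on \(\varvec{X}_i\) only through \(\Vert \varvec{X}_i \Vert _{\Sigma }^2\) and can be computed by one-dimensional quadrature. A sketch, assuming the horseshoe-type local prior (\(a = 1/2\), i.e., \(\lambda _i \sim \) half-Cauchy(0, 1)); the function name and parameter values are illustrative:

```python
import numpy as np
from scipy.integrate import quad

def shrinkage_weight(x_maha_sq, tau_n, k):
    """E(1 - kappa_i | X_i) in (3), for a horseshoe-type local prior
    (a = 1/2, lambda_i ~ half-Cauchy(0, 1)).  x_maha_sq is ||X_i||_Sigma^2;
    marginally, X_i | lambda_i ~ N_k(0, (1 + lambda_i^2 tau_n) Sigma)."""
    def integrand(lam, weighted):
        s = 1.0 + lam**2 * tau_n
        marginal = s**(-k / 2.0) * np.exp(-x_maha_sq / (2.0 * s))
        prior = 2.0 / (np.pi * (1.0 + lam**2))        # half-Cauchy density
        w = lam**2 * tau_n / s if weighted else 1.0   # w = 1 - kappa_i
        return w * marginal * prior
    num, _ = quad(integrand, 0.0, np.inf, args=(True,))
    den, _ = quad(integrand, 0.0, np.inf, args=(False,))
    return num / den

# small ||X_i||_Sigma^2 is shrunk hard towards zero; large values are kept
for m in (1.0, 10.0, 50.0):
    print(m, shrinkage_weight(m, tau_n=0.01, k=3))
```

The Bayes estimate (3) is then the returned weight times \(\varvec{X}_i\).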

We prove the following theorem under this model, in which (4) and (5) bound the error contributed by the zero and non-zero true means, respectively, and (6) follows immediately from the two. In particular, when \(0 < a \le 1\), (6) gives an upper bound for the error that matches the minimax lower bound provided by Theorem 1; this completes the proofs of both Theorem 1 and (7). As Theorem 2 shows, this general class of global-local priors attains the asymptotic minimax rate in the multivariate setting, and when \(0 < a \le 1\), it attains the exact asymptotic minimax error.

Theorem 2

Assume that the true means are sparse in the nearly-black sense. Under the regularity assumptions (I) and (II), using the global-local prior with a tuning parameter, i.e., a model satisfying (i), (ii) and (iii), if \(\tau _n = \left( q_n/n\right) ^{\frac{1+\epsilon }{\eta }}\), where \(\epsilon > 0\) and \(0< \eta < \min (1,a)\), then, for any valid choice of \(\epsilon \) and \(\eta \),

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } { \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]}} {\sum \limits _{i: \varvec{\theta }_{0i} = \varvec{0}} E_0 \Vert \widehat{\varvec{\theta }}^R_i \Vert _{\Sigma }^2} / {\left( q_n \log \left( \frac{n}{q_n}\right) \right) } = 0, \end{aligned}$$
(4)

and

$$\begin{aligned} \limsup \limits _{n \rightarrow \infty } { \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]}} \frac{\sum \limits _{i: \varvec{\theta }_{0i} \ne \varvec{0}} E_{\varvec{\theta }_{0i}} \Vert \widehat{\varvec{\theta }}^R_i - \varvec{\theta }_{0i}\Vert _{\Sigma }^2}{2 q_n \log (n/q_n)} \le a/\min (1,a). \end{aligned}$$
(5)

Consequently,

$$\begin{aligned} \limsup \limits _{n \rightarrow \infty } { \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]}} \frac{\sum _{i=1}^{n} E_{\varvec{\theta }_{0i}} \Vert \widehat{\varvec{\theta }}^R_i - \varvec{\theta }_{0i}\Vert _{\Sigma }^2}{2 q_n \log (n/q_n)} \le a/\min (1,a). \end{aligned}$$
(6)

In particular, since the minimax error (2) provides a lower bound, when \(0 < a \le 1\), one gets the result

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } { \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]}} \frac{\sum _{i=1}^{n} E_{\varvec{\theta }_{0i}} \Vert \widehat{\varvec{\theta }}^R_i - \varvec{\theta }_{0i}\Vert _{\Sigma }^2}{2 q_n \log (n/q_n)} = 1. \end{aligned}$$
(7)

The following theorem provides the rates of posterior contraction for this prior around both the Bayes estimators and the true means. By (8), the posterior distributions contract around the Bayes estimators at least as fast as the minimax rate. However, by (9), the rate of posterior contraction around the true means may be slower than the minimax rate by a factor \(M_n\) that can grow arbitrarily slowly.

Theorem 3

Under the assumptions of Theorem 2, we have

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]} E_{\{\varvec{\theta }_{0i}\}} \Pi (\sum _{i=1}^{n} \Vert \varvec{\theta }_i - \widehat{\varvec{\theta }}^R_i \Vert ^2 > q_n \log (\frac{n}{q_n}) \mid \{\varvec{X}_i\}) = 0, \end{aligned}$$
(8)

and

$$\begin{aligned} \limsup \limits _{n \rightarrow \infty } \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]} E_{\{\varvec{\theta }_{0i}\}} \Pi (\sum _{i=1}^{n} \Vert \varvec{\theta }_i - {\varvec{\theta }}_{0i} \Vert ^2 > M_n q_n \log (\frac{n}{q_n}) \mid \{\varvec{X}_i\}) = 0, \end{aligned}$$
(9)

for any \(\{M_n\}\) such that \(\lim _{n\rightarrow \infty } M_n = \infty \).

Next we extend the work of Bhattacharya et al. (2015) to the present multivariate framework. We consider the following prior in which (i) remains the same as in our earlier formulation, while (ii) and (iii) are replaced respectively by

(ii’) \(\varvec{\theta }_{i} \vert \lambda _i^2, \tau _i \overset{ind}{\sim }\ N_k(\varvec{0}, \lambda _i^2 \tau _i\varvec{\Sigma }) \), \(i = 1, \dots ,n \);

(iii’) \(\lambda _i^2\) and \(\tau _i\) are mutually independent. The \(\lambda _i^2\)’s are independent with \(\pi (\lambda _i^2) \propto \exp (-\lambda _i^2/2)\), \(i = 1, \dots ,n \), while the \(\tau _i\)’s are also independent with \(\pi (\tau _i) \propto \exp (-c_n/(2\tau _i)) \tau _i^{-d-1}\), where \(0<d<1\) and \(c_n \rightarrow 0\) is a sequence to be chosen later.

As also noted by Bhattacharya et al. (2015), the Dirichlet-Laplace priors can be recast in the above form, except that those authors put a Gamma prior on \(\tau _i\) whereas we put an Inverse-Gamma prior on it. Due to this difference, we refer to the prior defined by (ii’) and (iii’) as the Exponential-Inverse-Gamma prior.

It is worth mentioning that writing \(u_i = \lambda _i^2 \tau _i\), one gets \(\pi (u_i) \propto (u_i + c_n)^{-d-1}\), and one can directly use the \(u_i\) for inferential purposes. Further, this particular formulation is a special case of van der Pas et al. (2016), who provided a very general result concerning asymptotic minimaxity for univariate normal means. However, it is more convenient to work with separate priors for \(\lambda _i^2\) and \(\tau _i\), since the explicit form of these priors keeps the calculations smooth. As an aside, \(u_i/c_n\) has a beta prime prior with parameters \(a=1\) and \(b=d\), which is the prior considered in Armagan et al. (2011) and Griffin and Brown (2017).
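
The claimed marginal of \(u_i\) is easy to check by simulation; below is a minimal sketch, with purely illustrative values \(d = 0.5\) and \(c_n = 10^{-3}\).

```python
import numpy as np

rng = np.random.default_rng(2)
d, c_n, N = 0.5, 1e-3, 1_000_000   # d and c_n are purely illustrative

# pi(lambda^2) proportional to exp(-lambda^2/2): Exponential with scale 2
lam_sq = rng.exponential(scale=2.0, size=N)
# pi(tau) proportional to exp(-c_n/(2 tau)) tau^{-d-1}: Inverse-Gamma(d, c_n/2)
tau = 1.0 / rng.gamma(shape=d, scale=2.0 / c_n, size=N)
u = lam_sq * tau

# the ratio of the empirical density of u to (u + c_n)^{-d-1} should be
# roughly constant over the grid if the claimed marginal is correct
grid = np.array([0.01, 0.1, 1.0, 10.0])
emp = np.array([np.mean(np.abs(u - g) < 0.05 * g) / (0.1 * g) for g in grid])
print(emp / (grid + c_n) ** (-d - 1))
```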

With the above formulation, the estimators of the \(\varvec{\theta }_i\) are denoted by \(\widehat{\varvec{\theta }}^{EIG}_i\). We prove Theorem 4 in this setup; it shows that this prior also attains the exact asymptotic minimax error.

Theorem 4

Assume that the true means are sparse in the nearly-black sense. Under the regularity assumption (II),  using the Exponential-Inverse-Gamma prior, i.e., a model satisfying (i), (ii’) and (iii’) above, if \(c_n = \left( q_n/n\right) ^{\frac{1+\epsilon }{d}}\), where \(\epsilon > 0\), then, for any valid choice of \(\epsilon \),

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } { \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]}}{\sum \limits _{i: \varvec{\theta }_{0i} = \varvec{0}} E_0 \Vert \widehat{\varvec{\theta }}^{EIG}_i \Vert ^2} / {\left( q_n \log \left( \frac{n}{q_n}\right) \right) } = 0, \end{aligned}$$
(10)

and

$$\begin{aligned} \limsup \limits _{n \rightarrow \infty } { \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]}} \frac{\sum \limits _{i: \varvec{\theta }_{0i} \ne \varvec{0}} E_{\varvec{\theta }_{0i}} \Vert \widehat{\varvec{\theta }}^{EIG}_i - \varvec{\theta }_{0i}\Vert ^2}{2 q_n \log (n/q_n)} \le 1. \end{aligned}$$
(11)

Consequently,

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } { \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]}} \frac{\sum _{i=1}^{n} E_{\varvec{\theta }_{0i}} \Vert \widehat{\varvec{\theta }}^{EIG}_i - \varvec{\theta }_{0i}\Vert ^2}{2 q_n \log (n/q_n)} = 1. \end{aligned}$$
(12)

We also have the following results on the posterior contraction rates around the Bayes estimators and the true means; the contraction rates are the same as under the tuning-parameter model.

Theorem 5

Under the assumptions of Theorem 4, we have

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]} E_{\{\varvec{\theta }_{0i}\}} \Pi (\sum _{i=1}^{n} \Vert \varvec{\theta }_i - \widehat{\varvec{\theta }}^{EIG}_i \Vert ^2 > q_n \log (\frac{n}{q_n}) \mid \{\varvec{X}_i\}) = 0, \end{aligned}$$
(13)

and

$$\begin{aligned} \limsup \limits _{n \rightarrow \infty } \sup _{\{\varvec{\theta }_{0i}\} \in L_0[q_n]} E_{\{\varvec{\theta }_{0i}\}} \Pi (\sum _{i=1}^{n} \Vert \varvec{\theta }_i - {\varvec{\theta }}_{0i} \Vert ^2 > M_n q_n \log (\frac{n}{q_n}) \mid \{\varvec{X}_i\}) = 0, \end{aligned}$$
(14)

for any \(\{M_n\}\) such that \(\lim _{n\rightarrow \infty } M_n = \infty \).

3 Credible Sets of Multivariate Normal Means

In this section, we first study coverage probabilities of credible sets constructed under the global-local priors defined by (ii) and (iii), with the global parameter treated as a tuning parameter. We consider credible sets of the form:

$$\begin{aligned} \widehat{C}^{R}_{n,i} = \{\varvec{\theta }_i: \Vert \varvec{\theta }_i - \widehat{\varvec{\theta }}^R_{i} \Vert _{\Sigma }^2 \le L \widehat{r}_{n,i}^{a/(1+\rho )}(\alpha , \tau _n) \}, \end{aligned}$$
(15)

for some \(\rho > 0\) to be chosen later, where \(\widehat{r}_{n,i}(\alpha , \tau _n)\) is determined from

$$\begin{aligned} \Pi ( \Vert \varvec{\theta }_i - \widehat{\varvec{\theta }}^R_{i} \Vert _{\Sigma }^2 \le \widehat{r}_{n,i}(\alpha , \tau _n) \mid \varvec{X}_i) = 1 - \alpha . \end{aligned}$$

In the following, we will omit the subscript n in \(\widehat{C}^{R}_{n,i}\) and \(\widehat{r}_{n,i}\) for notational simplicity.
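
Computationally, \(\widehat{r}_{i}(\alpha , \tau _n)\) is simply the \((1-\alpha )\) posterior quantile of \(\Vert \varvec{\theta }_i - \widehat{\varvec{\theta }}^R_{i} \Vert _{\Sigma }^2\). Given draws from the posterior of \(\varvec{\theta }_i\) (we assume a posterior sampler is available and do not detail it), the set (15) can be formed as in the following sketch; the function names and the default values of L, a and \(\rho \) are illustrative placeholders, and L must satisfy the lower bound in Theorem 6 below.

```python
import numpy as np

def credible_set_radius_sq(theta_draws, theta_hat, Sigma_inv,
                           alpha=0.05, L=1.2, a=0.5, rho=1.0):
    """Squared radius L * r_hat^{a/(1+rho)} of the set (15) for one mean vector.
    theta_draws: (S, k) array of posterior draws of theta_i;
    r_hat is the (1 - alpha) posterior quantile of ||theta_i - theta_hat||_Sigma^2."""
    diff = theta_draws - theta_hat
    maha_sq = np.einsum('sk,kj,sj->s', diff, Sigma_inv, diff)
    r_hat = np.quantile(maha_sq, 1.0 - alpha)
    return L * r_hat ** (a / (1.0 + rho))

def covers(theta_true, theta_hat, Sigma_inv, radius_sq):
    """Check whether theta_true lies in the credible set."""
    dev = theta_true - theta_hat
    return float(dev @ Sigma_inv @ dev) <= radius_sq
```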

Following van der Pas et al. (2017), we classify the true mean vectors into three categories:

$$\begin{aligned} \begin{aligned} \mathcal {S}&:= \{\varvec{\theta }_{0i}: \Vert \varvec{\theta }_{0i} \Vert _{\Sigma }^2 \le K_S \tau _n\} \\ \mathcal {M}&:= \{\varvec{\theta }_{0i}: f_{\tau _n} \tau _n \le \Vert \varvec{\theta }_{0i} \Vert _{\Sigma }^2 \le K_M \log \frac{1}{\tau _n}\}\\ \mathcal {L}&:= \{\varvec{\theta }_{0i}: \Vert \varvec{\theta }_{0i} \Vert _{\Sigma }^2 \ge K_L \log \frac{1}{\tau _n}\} \end{aligned} \end{aligned}$$

for some positive constants \(K_S\), \(K_M\) and \(K_L\), and some \(f_{\tau _n}\) that goes to infinity as \(\tau _n\) goes to zero. We will show that the proposed credible sets cover the true means in either \(\mathcal {S}\) or \(\mathcal {L}\) with the desired probability, while the true means in \(\mathcal {M}\) fail to be covered with probability tending to one. The results are summarized in the following theorem.

Theorem 6

Consider the global-local prior with a tuning parameter \(\tau _n\), i.e., a model satisfying (i), (ii) and (iii), with \(a < 1\), under the regularity assumptions (I) and (II). Suppose that \(K_S > 0\), \(K_M < 2a\) and \(K_L > 2a\), and that \(f_{\tau _n} \rightarrow \infty \) and \(f_{\tau _n} \tau _n \rightarrow 0\) as \(\tau _n \rightarrow 0\). Then, given \(\alpha \), for the credible sets of form (15) with \(L > \chi ^2_{k,\alpha }(\chi ^2_{k,\beta })^{-a/(1+\rho )}\) for some fixed \(\beta > \alpha \) and \(\rho > 0\),

$$\begin{aligned} P_{\varvec{\theta }_{0i}} (\varvec{\theta }_{0i} \in \widehat{C}^{R}_{i}) \ge 1-\alpha , \text { if } \varvec{\theta }_{0i} \in \mathcal {S}, \end{aligned}$$
(16)
$$\begin{aligned} P_{\varvec{\theta }_{0i}} (\varvec{\theta }_{0i} \notin \widehat{C}^{R}_{i}) \rightarrow 1, \text { if } \varvec{\theta }_{0i} \in \mathcal {M}, \end{aligned}$$
(17)
$$\begin{aligned} P_{\varvec{\theta }_{0i}} (\varvec{\theta }_{0i} \in \widehat{C}^{R}_{i}) \ge 1-\alpha , \text { if } \varvec{\theta }_{0i} \in \mathcal {L}, \end{aligned}$$
(18)

as \(\tau _n \rightarrow 0\).

Remark 2

From the proof of the theorem, the conclusions for \(\varvec{\theta }_{0i}\) in either \(\mathcal {S}\) or \(\mathcal {M}\) do not rely on any specific choice of L; only the conclusion for \(\mathcal {L}\) places a requirement on L. To make the credible sets as narrow as possible, since \(L > \chi ^2_{k,\alpha }(\chi ^2_{k,\beta })^{-a/(1+\rho )}\), we should choose \(\beta \) as close to \(\alpha \) as possible. As for \(\rho \), noticing that

$$L \widehat{r}_i^{a/(1+\rho )}(\alpha , \tau _n) > \chi ^2_{k,\alpha }(\widehat{r}_i(\alpha , \tau _n)/\chi ^2_{k,\beta })^{a/(1+\rho )},$$

the choice should depend on \(\widehat{r}_i(\alpha , \tau _n)/\chi ^2_{k,\beta }\). For instance, when \(\widehat{r}_i(\alpha , \tau _n)/\chi ^2_{k,\beta } > 1\), we can choose \(\rho \) as large as possible, so that \(L \widehat{r}_i^{a/(1+\rho )}(\alpha , \tau _n)\) essentially reduces to \(\chi ^2_{k,\alpha }\). On the other hand, when \(\widehat{r}_i(\alpha , \tau _n)/\chi ^2_{k,\beta } < 1\), it is preferable to choose \(\rho \) close to 0. This observation also motivates an individualized choice \(L_i\) instead of a common L for all i, so that each credible set can be made as narrow as possible while maintaining the theoretical coverage probability.
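
A small numerical illustration of this trade-off (all values are purely illustrative): when \(\widehat{r}_i > \chi ^2_{k,\beta }\), a large \(\rho \) gives the smaller radius, while for \(\widehat{r}_i < \chi ^2_{k,\beta }\), a small \(\rho \) does.

```python
import numpy as np
from scipy.stats import chi2

k, alpha, beta, a = 3, 0.05, 0.10, 0.5
chi_a, chi_b = chi2.ppf(1 - alpha, k), chi2.ppf(1 - beta, k)

for r_hat in (0.5 * chi_b, 2.0 * chi_b):    # r_hat below / above chi^2_{k,beta}
    for rho in (0.1, 10.0):
        L = 1.001 * chi_a * chi_b ** (-a / (1 + rho))  # just above the bound
        radius_sq = L * r_hat ** (a / (1 + rho))
        print(f"r_hat/chi2_b={r_hat/chi_b:.1f}  rho={rho:4.1f}  "
              f"radius^2={radius_sq:.2f}")
```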

Assuming sparsity in the nearly-black sense, most true means lie in the set \(\mathcal {S}\): all but at most \(q_n = o(n)\) of them are exactly \(\varvec{0}\) and hence belong to \(\mathcal {S}\). This fact immediately leads to a high overall coverage probability, i.e., the following corollary.

Corollary 1

Under the setup of Theorem 6, further assume that the true means \(\varvec{\theta }_{0i}\) are sparse in the nearly-black sense. Then, for all but at most \(q_n = o(n)\) of the indices \(i=1,\dots ,n\), as \(\tau _n \rightarrow 0\),

$$\begin{aligned} P_{\varvec{\theta }_{0i}} (\varvec{\theta }_{0i} \in \widehat{C}^{R}_{i}) \ge 1-\alpha . \end{aligned}$$

Next, we study coverage probabilities of credible sets constructed under the Exponential-Inverse-Gamma prior defined by (ii’) and (iii’). We consider credible sets of the same form as in the previous setup:

$$\begin{aligned} \widehat{C}^{EIG}_{i} = \{\varvec{\theta }_i: \Vert \varvec{\theta }_i - \widehat{\varvec{\theta }}^{EIG}_{i} \Vert _{\Sigma }^2 \le L \widehat{r}_{i}^{d/(1+\rho )}(\alpha , c_n) \}, \end{aligned}$$
(19)

for some \(\rho > 0\) to be chosen later, where \(\widehat{r}_{i}(\alpha , c_n)\) is determined from

$$\begin{aligned} \Pi ( \Vert \varvec{\theta }_i - \widehat{\varvec{\theta }}^{EIG}_{i} \Vert _{\Sigma }^2 \le \widehat{r}_{i}(\alpha , c_n) \mid \varvec{X}_i) = 1 - \alpha . \end{aligned}$$

Here, we divide the true mean vectors into the following three categories:

$$\begin{aligned} \begin{aligned} \mathcal {S}'&:= \{\varvec{\theta }_{0i}: \Vert \varvec{\theta }_{0i} \Vert _{\Sigma }^2 \le K'_S c_n\} \\ \mathcal {M}'&:= \{\varvec{\theta }_{0i}: f'_{c_n} c_n \le \Vert \varvec{\theta }_{0i} \Vert _{\Sigma }^2 \le K'_M \log \frac{1}{c_n}\}\\ \mathcal {L}'&:= \{\varvec{\theta }_{0i}: \Vert \varvec{\theta }_{0i} \Vert _{\Sigma }^2 \ge K'_L \log \frac{1}{c_n}\} \end{aligned} \end{aligned}$$

for some positive constants \(K'_S\), \(K'_M\) and \(K'_L\), and some \(f'_{c_n}\) that goes to infinity as \(c_n\) goes to zero. Similar results on the coverage probabilities hold under this prior.

Theorem 7

Consider the Exponential-Inverse-Gamma prior, i.e., a model satisfying (i), (ii’) and (iii’), under the regularity assumption (II). Suppose that \(K'_S > 0\), \(K'_M < 2d\) and \(K'_L > 2d\), and that \(f'_{c_n} \rightarrow \infty \) and \(f'_{c_n} c_n \rightarrow 0\) as \(c_n \rightarrow 0\). Then, given \(\alpha < 1/2\), for the credible sets of form (19) with \(L > \chi ^2_{k,\alpha }(\chi ^2_{k,\beta })^{-d/(1+\rho )}\) for some fixed \(\beta > \alpha \) and \(\rho > 0\),

$$\begin{aligned} P_{\varvec{\theta }_{0i}} (\varvec{\theta }_{0i} \in \widehat{C}^{EIG}_{i}) \ge 1-\alpha , \text { if } \varvec{\theta }_{0i} \in \mathcal {S}', \end{aligned}$$
(20)
$$\begin{aligned} P_{\varvec{\theta }_{0i}} (\varvec{\theta }_{0i} \notin \widehat{C}^{EIG}_{i}) \rightarrow 1, \text { if } \varvec{\theta }_{0i} \in \mathcal {M}', \end{aligned}$$
(21)
$$\begin{aligned} P_{\varvec{\theta }_{0i}} (\varvec{\theta }_{0i} \in \widehat{C}^{EIG}_{i}) \ge 1-\alpha , \text { if } \varvec{\theta }_{0i} \in \mathcal {L}', \end{aligned}$$
(22)

as \(c_n \rightarrow 0\).

The following corollary is also immediate due to the nearly-black sparsity.

Corollary 2

Under the setup of Theorem 7, further assume that the true means \(\varvec{\theta }_{0i}\) are sparse in the nearly-black sense. Then, for all but at most \(q_n = o(n)\) of the indices \(i=1,\dots ,n\), as \(c_n \rightarrow 0\),

$$\begin{aligned} P_{\varvec{\theta }_{0i}} (\varvec{\theta }_{0i} \in \widehat{C}^{EIG}_{i}) \ge 1-\alpha . \end{aligned}$$

4 Final Remarks

This paper addresses asymptotic estimation of multivariate normal means under global-local priors. We first derive the asymptotic minimax error in the multivariate setup. We then show that this error is attained exactly when the global parameter is treated as a tuning parameter, and also under the Exponential-Inverse-Gamma priors, which place priors on the global parameters in the framework of Bhattacharya et al. (2015). Finally, credible sets are constructed under global-local priors, extending the ideas of van der Pas et al. (2017) to the multivariate case.