1 Introduction

In this paper, we are interested in the one-sample problem for high-dimensional data. Here “high dimension” means that the data dimension is close to or even much larger than the sample size. High-dimensional data are encountered when many measurements are taken on only a few subjects. For example, in DNA microarray data, thousands of gene expression levels are often measured on relatively few subjects. With the rapid development of data-collecting technologies, high-dimensional data have become rather common and nowadays attract much research effort. Many new methods have been proposed in recent years for high-dimensional hypothesis testing problems about mean vectors or covariance matrices; see, for example, Li et al. (2020), Bai et al. (2021), Zhang et al. (2021), and Silva et al. (2021), among others. The canonical one-sample problem aims to test whether the population mean vector of a sample is the zero vector, and many interesting and more complicated hypotheses can be converted to it by simple transformations, such as those arising in one-group repeated measures designs (Ahmad et al. 2008), in the mean matrix structure of transposable data (Touloumis et al. 2015), in the two-sample problem (Chen and Qin 2010), and in the multi-sample problem (Schott 2007).

The classical solution to the multivariate one-sample problem is Hotelling’s \(T^{2}\) test. However, Hotelling’s \(T^{2}\) test does not apply to high-dimensional data when the data dimension is larger than the sample size because in this case the sample covariance matrix is not invertible. To overcome this problem, many alternative tests have been proposed for the one-sample hypothesis in high-dimensional settings. Srivastava and Du (2008) proposed a scale-invariant test. Park and Ayyala (2013) proposed a leave-one-out scale-invariant test. Wang et al. (2015) proposed a nonparametric one-sample test based on the multivariate spatial sign transformation for elliptically distributed data. Feng and Sun (2016) proposed a scale-invariant nonparametric test based on spatial ranks and inner standardization which can also take the scale differences of variables into account. Other tests include the random-permutation-based test of Shen and Lin (2015), the randomization test of Wang and Xu (2019), the block-diagonal test of Zhao (2017), the diagonal likelihood ratio test of Hu et al. (2019), the sign test of Paindaveine and Verdebout (2016), the composite \(T^{2}\) test of Feng et al. (2017), the shrinkage-based regularization tests of Chen et al. (2011), Shen et al. (2011) and Dong et al. (2016), and the empirical likelihood test of Peng et al. (2014), among others.

Many existing tests, such as those of Srivastava and Du (2008) and Wang et al. (2015), use the normal approximation for their null distributions. However, for most tests, the normal approximation is valid only under very strong conditions on the underlying covariance matrix, as noted by Katayama et al. (2013). One of the key conditions requires that the high-dimensional data be weakly or nearly uncorrelated. To relax the assumptions on the underlying covariance matrix, Zhang and Xu (2009) proposed an \(L^{2}\)-norm one-sample test for normal data based on the two-cumulant (2-c) matched Welch–Satterthwaite \(\chi ^{2}\)-approximation. For one-group normally distributed repeated measures designs, Ahmad et al. (2008) proposed a test with the 2-c matched \(\chi ^{2}\)-approximation, and Pauly et al. (2015) proposed a test with the three-cumulant (3-c) matched \(\chi ^{2}\)-approximation of Zhang (2005).

In this paper, we propose and study a normal reference test with the 3-c matched \(\chi ^{2}\)-approximation for a general one-sample problem with non-normal high-dimensional data. We show that under some regularity conditions, when the null hypothesis is true, the proposed test statistic and a \(\chi ^2\)-type mixture have the same normal or non-normal limiting distributions. It is then justifiable to approximate the null distribution of the test statistic using that of the \(\chi ^2\)-type mixture. The distribution of the \(\chi ^2\)-type mixture, which has both positive and negative unknown coefficients, can be well approximated by the 3-c matched \(\chi ^2\)-approximation with the approximation parameters consistently estimated from the data. Since the \(\chi ^2\)-type mixture is obtained from the test statistic when the null hypothesis holds and the data are normally distributed, the resulting test is termed a normal reference test with the 3-c matched \(\chi ^2\)-approximation.

The proposed test is closely related to the test of Pauly et al. (2015), but the two tests differ in several aspects. First, our test is investigated for general non-normal data, and one-sample tests with other types of data can be reduced to our one-sample test via some simple transformations, while their test is studied only for normally distributed repeated measures designs. Second, the test of Pauly et al. (2015) is based on a nonnegative squared \(L^2\)-norm statistic, and their approximation essentially follows Hall (1983) by matching the third moment of normalized variables; our statistic, on the other hand, is a centered squared \(L^2\)-norm statistic, and our approximation is formulated as in Zhang (2005) for a \(\chi ^2\)-type mixture with both positive and negative coefficients. Third, our approximation parameter estimators are constructed directly without using U-statistics and are ratio-consistent under the null or any alternative hypothesis, while their approximation parameter estimators are constructed using U-statistics, which are often time- and space-consuming, and are ratio-consistent only under the null hypothesis; in practice, one does not know whether the null hypothesis holds. Fourth, the asymptotic power of our test is established, the effect of data non-normality on our test is discussed, and a sufficient and necessary condition is found for the asymptotic normality of our test statistic; none of these issues is discussed in Pauly et al. (2015).

The rest of the paper is organized as follows. Our main results are presented in Sect. 2. A simulation study is presented in Sect. 3. Applications of our test to one-sample problems with other types of data are presented in Sect. 4. Some concluding remarks are given in Sect. 5. The technical proofs of the main results are outlined in the Appendix.

2 Main results

Our study is motivated by a multivariate analysis of variance (MANOVA) problem for dependent samples. Suppose we have n independent, identically distributed (i.i.d.) \(q\times k\) matrix-variate observations \(\varvec{{X}}_{i}=(\varvec{{x}}_{i1},\ldots ,\varvec{{x}}_{ik}),\ i=1,\ldots ,n\). The k columns of the observation matrix \(\varvec{{X}}_{i}\) correspond to matched multivariate observations from k different samples. Unlike the usual MANOVA problem for independent samples, we assume the observations of the k samples are matched, and we allow possible dependence between matched observations from different samples. Moreover, as frequently encountered in many practical problems, such as the analysis of time profiles (Ahmad et al. 2008; Pauly et al. 2015), we allow k (or q) to be large, even proportional to the sample size n. The problem of interest is whether the mean vectors of the k samples are the same, i.e., to test

$$\begin{aligned} H_{0}:{\text {E}}(\varvec{{x}}_{11})=\cdots ={\text {E}}(\varvec{{x}}_{1k}) \text { versus }H_{1}:H_{0}\text { is not true}. \end{aligned}$$
(1)

In this paper, instead of trying to solve the above specific problem directly, we treat it as a special case of the following one-sample problem. Suppose we have one high-dimensional sample:

$$\begin{aligned} \varvec{{y}}_{1},\ldots ,\varvec{{y}}_{n} \text{ are } \text{ i.i.d. } p \text{-dimensional } \text{ random } \text{ vectors }, \end{aligned}$$
(2)

with \({\text {E}}(\varvec{{y}}_1)=\varvec{{\mu }}\) and \({\text {Cov}}(\varvec{{y}}_1)=\varvec{{\varSigma }}\), where the dimension p is large and may be much larger than the sample size n. Consider the following hypotheses:

$$\begin{aligned} H_{0}:\ \ \varvec{{\mu }}=\varvec{{0}}\ \ \text{ versus } \ \ H_{1}:\ \ \varvec{{\mu }}\ne \varvec{{0}}. \end{aligned}$$
(3)

In many situations, one may be interested in testing the hypotheses \(H_0: \varvec{{\mu }}=\varvec{{\mu }}_0\) versus \(H_1: \varvec{{\mu }}\ne \varvec{{\mu }}_0\) for some known constant vector \(\varvec{{\mu }}_0\). This general one-sample problem can be reduced to the one-sample problem (3) based on the induced sample \(\varvec{{y}}_i-\varvec{{\mu }}_0,\ i=1,\ldots ,n\), with \(\varvec{{\mu }}\) replaced by \(\varvec{{\mu }}-\varvec{{\mu }}_0\). To see the connection between hypotheses (3) and (1), let \(\varvec{{P}}=\varvec{{I}}_{k}-k^{-1}\varvec{{J}}_{k}\), where \(\varvec{{I}}_{k}\) is a \(k\times k\) identity matrix and \(\varvec{{J}}_{k}\) is a \(k\times k\) matrix of ones. The hypothesis \(H_{0}\) in (1) is equivalent to \({\text {vec}}[{\text {E}}(\varvec{{X}}_{1})\varvec{{P}}]={\text {E}}[{\text {vec}}(\varvec{{X}}_{1}\varvec{{P}})]=\varvec{{0}}\), where \({\text {vec}}\) denotes the matrix vectorization by column operator. Thus, to test the hypothesis \(H_{0}\) in (1) for the original sample \(\varvec{{X}}_{i},\ i=1,\ldots ,n\), we can simply test the hypothesis \(H_{0}\) in (3) for the induced sample \(\varvec{{y}}_{i}={\text {vec}}(\varvec{{X}}_{i}\varvec{{P}})\), \(i=1,\ldots ,n\).
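In code, this reduction is a one-liner. The following minimal Python sketch (the function name is ours) maps a matrix observation to the induced vector; note that `vec` stacks columns, which corresponds to column-major flattening.

```python
import numpy as np

def vec_centered(X):
    """Map a q x k matrix observation X to y = vec(X P) with
    P = I_k - J_k / k, so that testing (1) reduces to testing (3).
    A minimal sketch; the function name is ours."""
    q, k = X.shape
    P = np.eye(k) - np.ones((k, k)) / k   # column-centering matrix
    return (X @ P).flatten(order="F")     # vec() stacks columns
```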

2.1 Asymptotic null distribution

Let

$$\begin{aligned} \bar{\varvec{{y}}}=n^{-1}\sum _{i=1}^n \varvec{{y}}_i, \; \text{ and } \; \hat{\varvec{{\varSigma }}}=(n-1)^{-1}\sum _{i=1}^n (\varvec{{y}}_i-\bar{\varvec{{y}}})(\varvec{{y}}_i-\bar{\varvec{{y}}})^{\top } \end{aligned}$$
(4)

denote the sample mean vector and covariance matrix, respectively. Inspired by the two-sample test of Bai and Saranadasa (1996), the test statistic for testing the one-sample problem (3) can be constructed as

$$\begin{aligned} T_{n,p}=n\Vert \bar{\varvec{{y}}}\Vert ^{2}-{\text {tr}}(\hat{\varvec{{\varSigma }}}), \end{aligned}$$
(5)

where \(\Vert \cdot \Vert \) denotes the usual \(L^{2}\)-norm of a vector. We can write

$$\begin{aligned} T_{n,p}=T_{n,p,0}+2S_{n,p}+n\Vert \varvec{{\mu }}\Vert ^{2}, \end{aligned}$$
(6)

where

$$\begin{aligned} T_{n,p,0}=n\Vert \bar{\varvec{{y}}}-\varvec{{\mu }}\Vert ^{2}-{\text {tr}}(\hat{\varvec{{\varSigma }}}),\;\;S_{n,p}=n\varvec{{\mu }}^{\top }(\bar{\varvec{{y}}}-\varvec{{\mu }}). \end{aligned}$$
(7)

Note that \(T_{n,p,0}\) has the same distribution as \(T_{n,p}\) under the null hypothesis.

When the sample (2) is normally distributed, it is easy to see that for any given n and p, \(T_{n,p,0}\) has the same distribution as the following \(\chi ^2\)-type mixture

$$\begin{aligned} T_{n,p,0}^*=\sum _{r=1}^p \lambda _{p,r} [A_r-B_r/(n-1)],\; A_r{\mathop {\sim }\limits ^{\text {i.i.d.}}}\chi _1^2,\; B_r{\mathop {\sim }\limits ^{\text {i.i.d.}}}\chi _{n-1}^2, \end{aligned}$$
(8)

where \(\chi _{v}^{2}\) denotes the central chi-square distribution with v degrees of freedom and \(\lambda _{p,r},\ r=1,\ldots , p\), are the eigenvalues of the covariance matrix \(\varvec{{\varSigma }}\). The first three cumulants of \(T_{n,p,0}^*\) are given by \({\text {E}}(T_{n,p,0}^*)=0\),

$$\begin{aligned} {\text {Var}}(T_{n,p,0}^*)=\frac{2n}{n-1}{\text {tr}}(\varvec{{\varSigma }}^2),\;\; \text{ and } {\text {E}}(T_{n,p,0}^{*3})=\frac{8n(n-2)}{(n-1)^{2}}{\text {tr}}(\varvec{{\varSigma }}^3). \end{aligned}$$
(9)
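As a quick numerical sanity check, the mixture (8) and the cumulant formulas (9) can be simulated directly. The following Python sketch (the eigenvalues and Monte Carlo size are arbitrary choices of ours) draws from \(T_{n,p,0}^*\) and compares the empirical moments with (9).

```python
import numpy as np

def simulate_T_star(eigvals, n, n_rep=100_000, seed=0):
    """Draw n_rep realizations of the chi^2-type mixture (8):
    T* = sum_r lambda_r [A_r - B_r/(n-1)], A_r ~ chi^2_1, B_r ~ chi^2_{n-1}."""
    rng = np.random.default_rng(seed)
    p = len(eigvals)
    A = rng.chisquare(1, size=(n_rep, p))
    B = rng.chisquare(n - 1, size=(n_rep, p))
    return (A - B / (n - 1)) @ np.asarray(eigvals)

# Check the cumulant formulas (9) for some hypothetical eigenvalues:
lam = 0.8 ** np.arange(20)
n = 30
T = simulate_T_star(lam, n)
print(T.mean())                                        # ~ 0
print(T.var(), 2*n/(n - 1) * np.sum(lam**2))           # variance in (9)
m3 = ((T - T.mean())**3).mean()
print(m3, 8*n*(n - 2)/(n - 1)**2 * np.sum(lam**3))     # third cumulant in (9)
```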

Now we study the asymptotic properties of \(T_{n,p}\) as both n and p tend to infinity. Although the limiting regime described by this kind of high-dimensional asymptotics is never attained in reality, the high-dimensional properties of \(T_{n,p}\) give a hint about how it behaves in the practical scenario where both the sample size and the data dimension are large, or where the data dimension is comparable to the sample size. More importantly, the limiting behavior of \(T_{n,p}\) provides guidance for properly approximating its null distribution and the p value of the corresponding test when both n and p are large.

Set \(\rho _{p,r}=\lambda _{p,r}/\sqrt{{\text {tr}}(\varvec{{\varSigma }}^2)},\ r=1,\ldots ,p\). The following conditions are convenient for the theoretical study:

  1. C1

    We have \(\varvec{{y}}_{i}=\varvec{{\mu }}+\varvec{{\varGamma }}\varvec{{z}}_{i},\ i=1,\ldots ,n\), where \(\varvec{{\varGamma }}\) is a \(p\times p\) matrix such that \(\varvec{{\varGamma }}\varvec{{\varGamma }}^{\top }=\varvec{{\varSigma }}\) and \(\varvec{{z}}_{i}\)’s are i.i.d. p-vectors with \({\text {E}}(\varvec{{z}}_{i})=\varvec{{0}}\) and \({\text {Cov}}(\varvec{{z}}_{i})=\varvec{{I}}_{p}\), the \(p\times p\) identity matrix.

  2. C2

    We have \({\text {E}}(z_{ir}^{4})=3+\varDelta <\infty \) where \(z_{ir}\) is the r-th component of \(\varvec{{z}}_{i}\), \(\varDelta \) is some constant, and \({\text {E}}(z_{ir_1}^{\alpha _{1}}\cdots z_{ir_q}^{\alpha _{q}})={\text {E}}(z_{ir_1}^{\alpha _1})\cdots {\text {E}}(z_{ir_q}^{\alpha _q})\) for a positive integer q such that \(\sum _{r=1}^q\alpha _r\le 8\) and \(r_1\ne \cdots \ne r_q\).

  3. C3

    We have \(\lim _{p\rightarrow \infty } \rho _{p,r}=\rho _{r},\ r=1,2,\ldots \), uniformly and \(\lim _{p\rightarrow \infty } \sum _{r=1}^p \rho _{p,r}=\sum _{r=1}^{\infty } \rho _r<\infty \).

  4. C4

    As \(n,p\rightarrow \infty \), we have \(p/n^2\longrightarrow 0\).

  5. C5

    As \(p\rightarrow \infty \), we have \(\rho _{p,\max }\rightarrow 0\) where \(\rho _{p,\max }=\max _{r=1}^p \rho _{p,r}\).

Conditions C1 and C2 are also imposed by Bai and Saranadasa (1996) and Chen and Qin (2010), respectively. They specify a factor model for high-dimensional data analysis. Condition C3 is also imposed by Zhang et al. (2020); it ensures the existence of the limits of \(\rho _{p,r}\) as \(p\rightarrow \infty \) and the exchangeability of the limit and summation operations in the expression \(\lim _{p\rightarrow \infty }\sum _{r=1}^p\rho _{p,r}\). Condition C3 implies that \(\sum _{r=q+1}^{p}\rho _{p,r} \longrightarrow \sum _{r=q+1}^{\infty }\rho _{r}\) as \(p\rightarrow \infty \) for any fixed \(q<p\), and \(\sum _{r=q+1}^{\infty }\rho _{r}\longrightarrow 0\) as \(q\rightarrow \infty \). It is used to ensure that the limiting distributions of the normalized versions of \(T_{n,p,0}\) and \(T_{n,p,0}^*\), namely,

$$\begin{aligned} \tilde{T}_{n,p,0}=\frac{T_{n,p,0}}{\sqrt{\frac{2n}{(n-1)}{\text {tr}}(\varvec{{\varSigma }}^2)}}, \text{ and } \tilde{T}_{n,p,0}^*=\frac{T_{n,p,0}^*}{\sqrt{\frac{2n}{(n-1)}{\text {tr}}(\varvec{{\varSigma }}^2)}}, \end{aligned}$$
(10)

are non-normal. Condition C4 is needed by Lemma 1 in the Appendix, which establishes the ratio-consistency of the estimator (20) of \({\text {tr}}(\varvec{{\varSigma }}^3)\); it is also needed by Theorems 4 and 5. This condition is weaker than the condition “\(p/n\longrightarrow c\in (0,\infty )\) as \(n, p\rightarrow \infty \)” imposed by Bai and Saranadasa (1996): it allows \(p/n\longrightarrow \infty \) as \(n, p\rightarrow \infty \) but requires p to diverge at a slower rate than \(n^2\). Condition C5 is also imposed by Bai and Saranadasa (1996) and is used to ensure that the limiting distributions of \(\tilde{T}_{n,p,0}\) and \(\tilde{T}_{n,p,0}^*\) are normal. Conditions C3 and C5 impose two mutually exclusive constraints on the eigenvalues of the covariance matrix \(\varvec{{\varSigma }}\), under which the limiting distributions of \(\tilde{T}_{n,p,0}\) and \(\tilde{T}_{n,p,0}^*\) are non-normal and normal, respectively. Theoretically speaking, when the eigenvalues of \(\varvec{{\varSigma }}\) are of the same order (e.g., under a non-spiked covariance model where no eigenvalue of \(\varvec{{\varSigma }}\) dominates the others), Condition C5 is satisfied, so that \(\tilde{T}_{n,p,0}\) and \(\tilde{T}_{n,p,0}^*\) are asymptotically normally distributed; when the decreasingly ordered eigenvalues of \(\varvec{{\varSigma }}\) tend to 0 quickly (e.g., under a spiked covariance model where a finite number of eigenvalues asymptotically dominate the remaining ones) such that \({\text {tr}}^2(\varvec{{\varSigma }})/{\text {tr}}(\varvec{{\varSigma }}^2)\) tends to a finite limit, Condition C3 is satisfied. Broadly speaking, in real data analysis, Condition C5 is approximately satisfied when the p components of an observation are nearly uncorrelated, and Condition C3 is approximately satisfied when they are moderately or highly correlated. Let \({\mathop {=}\limits ^{d}}\) denote equality in distribution and \({\mathop {\longrightarrow }\limits ^{\mathcal {L}}}\) denote convergence in distribution. We have the following useful theorem.

Theorem 1

  1. (a)

    Under Conditions C1, C2 and C3, as \(n, p\rightarrow \infty \), we have

    $$\begin{aligned} \tilde{T}_{n,p,0}{\mathop {\longrightarrow }\limits ^{\mathcal {L}}}\zeta , \quad \text{ and }\quad \tilde{T}_{n,p,0}^*{\mathop {\longrightarrow }\limits ^{\mathcal {L}}}\zeta , \end{aligned}$$
    (11)

    where \(\zeta {\mathop {=}\limits ^{d}}\sum _{r=1}^{\infty } \rho _{r} (A_r-1)/\sqrt{2}, \; A_r{\mathop {\sim }\limits ^{\text {i.i.d.}}}\chi _1^2\).

  2. (b)

    Under Conditions C1, C2 and C5, as \(n, p\rightarrow \infty \), we have

    $$\begin{aligned} \tilde{T}_{n,p,0}{\mathop {\longrightarrow }\limits ^{\mathcal {L}}}\mathcal {N}(0,1), \quad \text{ and } \quad \tilde{T}_{n,p,0}^*{\mathop {\longrightarrow }\limits ^{\mathcal {L}}}\mathcal {N}(0,1). \end{aligned}$$
    (12)

    Then under the conditions of (a) or (b), we always have

    $$\begin{aligned} \sup _{x}|\Pr (T_{n,p,0}\le x)-\Pr (T_{n,p,0}^*\le x )|\longrightarrow 0. \end{aligned}$$
    (13)

For the one-sample test in normally distributed repeated measures designs, a theorem comparable to Theorem 1 was proved by Pauly et al. (2015); however, the authors did not extend it to non-normal repeated measures designs in their paper. Theorem 1 provides a theoretical justification for using the distribution of \(T_{n,p,0}^*\) to approximate that of \(T_{n,p,0}\). Notice that \(T_{n,p,0}^*\) is obtained when the data (2) are normally distributed. Thus, we term the distribution of \(T_{n,p,0}^*\) the normal-reference distribution of \(T_{n,p,0}\).

2.2 Implementation

To implement the proposed test, we approximate the null distribution of \(T_{n,p}\) using that of \(T_{n,p,0}^*\). Unlike the \(L^{2}\)-norm test studied in Zhang and Xu (2009), whose null distribution coincides with that of a \(\chi ^{2}\)-type mixture with only positive coefficients, \(T_{n,p,0}^*\) is distributed as a \(\chi ^{2}\)-type mixture with both positive and negative coefficients. For such a \(\chi ^{2}\)-type mixture, Zhang (2013) showed, with some simulation studies, that the 2-c matched \(\chi ^{2}\)-approximation method (Welch 1947; Satterthwaite 1946; Box 1954) adopted by Zhang and Xu (2009) should not be used to approximate the distribution of \(T_{n,p,0}^*\); rather, the 3-c matched \(\chi ^{2}\)-approximation method of Zhang (2005) should be used.

One obvious advantage of the 3-c matched \(\chi ^{2}\)-approximation for the distribution of \(T_{n,p,0}^*\), over the normal approximation suggested by Bai and Saranadasa (1996) and the 2-c matched \(\chi ^{2}\)-approximation used by Zhang and Xu (2009), is that the former matches the first three cumulants while the latter two match only the first two. It is therefore expected that, in terms of size control, the 3-c matched \(\chi ^{2}\)-approximation is more accurate than the normal approximation and the 2-c matched \(\chi ^{2}\)-approximation. In fact, Zhang (2005) showed, theoretically via an upper bound on the density approximation error and via simulation studies, that the 3-c matched \(\chi ^{2}\)-approximation has much better accuracy than the normal approximation even when the normal approximation is adequate.

By the 3-c matched \(\chi ^{2}\)-approximation method of Zhang (2005), we approximate the distribution of \(T_{n,p,0}^*\) using the distribution of the random variable

$$\begin{aligned} R=\beta _{0}+\beta _{1}\chi _{d}^{2}, \end{aligned}$$
(14)

where the parameters \(\beta _{0},\beta _{1}\) and d are determined via matching the first three cumulants of \(T_{n,p,0}^*\) and R. The first three cumulants of \(T_{n,p,0}^*\) are given in (9) while by (14), the first three cumulants of R are given by \(\beta _{0}+\beta _{1}d,\; 2\beta _{1}^{2}d,\) and \(8\beta _{1}^{3}d\), respectively. Matching the first three cumulants of \(T_{n,p,0}^*\) and R then leads to

$$\begin{aligned} \beta _{0}=-\frac{n{\text {tr}}^{2}(\varvec{{\varSigma }}^{2})}{(n-2){\text {tr}}(\varvec{{\varSigma }}^{3})},\;\;\beta _{1} =\frac{(n-2){\text {tr}}(\varvec{{\varSigma }}^{3})}{(n-1){\text {tr}}(\varvec{{\varSigma }}^{2})},\;\; d=\frac{n(n-1)}{(n-2)^{2}} \frac{{\text {tr}}^{3}(\varvec{{\varSigma }}^{2})}{{\text {tr}}^{2}(\varvec{{\varSigma }}^{3})}. \end{aligned}$$
(15)

The parameter d is usually called the approximate degrees of freedom of the 3-c matched \(\chi ^{2}\)-approximation to \(T_{n,p,0}^*\). Note that since \(\varvec{{\varSigma }}\) is always nonnegative definite, we always have \(\beta _0<0\), \(\beta _1>0\), and \(d>0\). This is reasonable since \(T_{n,p,0}^*\) is a \(\chi ^2\)-type mixture with both positive and negative coefficients. In terms of d defined above, the skewness of \(T_{n,p,0}^*\) is given by

$$\begin{aligned} {\text {E}}(T_{n,p,0}^{*3})/{\text {Var}}^{3/2}(T_{n,p,0}^*)=\left( 8/d\right) ^{1/2}. \end{aligned}$$
(16)

To implement the proposed test in real data analysis, we need to estimate \({\text {tr}}(\varvec{{\varSigma }}^{2})\) and \({\text {tr}}(\varvec{{\varSigma }}^{3})\) consistently. Let their ratio-consistent estimators be denoted by \(\widehat{{\text {tr}}(\varvec{{\varSigma }}^{2})}\) and \(\widehat{{\text {tr}}(\varvec{{\varSigma }}^{3})}\), respectively. Then the ratio-consistent estimators of \(\beta _{0},\beta _{1}\) and d are given by

$$\begin{aligned} \hat{\beta }_{0}=-\frac{n[\widehat{{\text {tr}}(\varvec{{\varSigma }}^{2})}]^{2}}{(n-2)\widehat{{\text {tr}}(\varvec{{\varSigma }}^{3})}},\; \hat{\beta }_{1}=\frac{(n-2)\widehat{{\text {tr}}(\varvec{{\varSigma }}^{3})}}{(n-1)\widehat{{\text {tr}}(\varvec{{\varSigma }}^{2})}},\; \hat{d}=\frac{n(n-1)}{(n-2)^{2}}\frac{[\widehat{{\text {tr}}(\varvec{{\varSigma }}^{2})}]^{3}}{[\widehat{{\text {tr}}(\varvec{{\varSigma }}^{3})}]^{2}}. \end{aligned}$$
(17)

For any nominal significance level \(\alpha >0\), let \(\chi _{v}^{2}(\alpha )\) denote the upper \(100\alpha \) percentile of \(\chi _{v}^{2}\). Then, by (17), the proposed test for the one-sample problem (3) using \(T_{n,p}\) with the 3-c matched \(\chi ^{2}\)-approximation is conducted using the approximate critical value \(\hat{\beta }_{0}+\hat{\beta }_{1}\chi _{\hat{d}}^{2}(\alpha )\) or the approximate p value \(\Pr \left[ \chi _{\hat{d}}^{2}\ge (T_{n,p}-\hat{\beta }_{0})/\hat{\beta }_{1}\right] \).

In practice, one often uses the following normalized version of \(T_{n,p}\):

$$\begin{aligned} \tilde{T}_{n,p}=\frac{T_{n,p}}{\sqrt{\frac{2n}{n-1} \widehat{{\text {tr}}(\varvec{{\varSigma }}^{2})}}}. \end{aligned}$$
(18)

Then approximating the distribution of \(T_{n,p}\) by that of \(\hat{\beta }_0+\hat{\beta }_1\chi _{\hat{d}}^2\) is equivalent to approximating the distribution of \(\tilde{T}_{n,p}\) by that of \((\chi _{\hat{d}}^2-\hat{d})/\sqrt{2\hat{d}}\). In this case, the proposed test for the one-sample problem (3) using \(\tilde{T}_{n,p}\) with the 3-c matched \(\chi ^{2}\)-approximation can also be conducted using the approximate critical value \([\chi _{\hat{d}}^{2}(\alpha )-\hat{d}]/\sqrt{2\hat{d}}\) or the approximate p value \(\Pr \left( \chi _{\hat{d}}^{2}\ge \hat{d}+\sqrt{2\hat{d}}\tilde{T}_{n,p}\right) \).

We now consider the ratio-consistent estimators of \({\text {tr}}(\varvec{{\varSigma }}^2)\) and \({\text {tr}}(\varvec{{\varSigma }}^3)\). By Lemma S.3 of Zhang et al. (2020), a ratio-consistent estimator of \({\text {tr}}(\varvec{{\varSigma }}^{2})\) is given by

$$\begin{aligned} \widehat{{\text {tr}}(\varvec{{\varSigma }}^{2})}=\frac{(n-1)^{2}}{(n-2)(n+1)}\left[ {\text {tr}}(\hat{\varvec{{\varSigma }}}^{2})-\frac{{\text {tr}}^{2}(\hat{\varvec{{\varSigma }}})}{n-1}\right] , \end{aligned}$$
(19)

where \(\hat{\varvec{{\varSigma }}}\) is the sample covariance estimator of \(\varvec{{\varSigma }}\) as given in (4). When the data (2) are normally distributed, we have \(\hat{\varvec{{\varSigma }}}\sim W_p(n-1,\varvec{{\varSigma }}/(n-1))\), a Wishart distribution with \(n-1\) degrees of freedom and covariance matrix \(\varvec{{\varSigma }}/(n-1)\). Then under Condition C4, by Lemma 1 given in the Appendix, an unbiased and ratio-consistent estimator of \({\text {tr}}(\varvec{{\varSigma }}^{3})\) is given by

$$\begin{aligned} \widehat{{\text {tr}}(\varvec{{\varSigma }}^{3})}=\frac{(n-1)^{4}}{(n^2+n-6)(n^2-2n-3)}\left[ {\text {tr}}(\hat{\varvec{{\varSigma }}}^{3})-\frac{3{\text {tr}}(\hat{\varvec{{\varSigma }}}){\text {tr}}(\hat{\varvec{{\varSigma }}}^{2})}{(n-1)}+\frac{2{\text {tr}}^{3}(\hat{\varvec{{\varSigma }}})}{(n-1)^{2}}\right] . \end{aligned}$$
(20)

We conjecture that when Conditions C1, C2 and C4 are satisfied, \(\widehat{{\text {tr}}(\varvec{{\varSigma }}^{3})}\) is also ratio-consistent for \({\text {tr}}(\varvec{{\varSigma }}^{3})\) for non-normal data. This is partially confirmed by the simulation results presented in Sect. 3 and in the Supplementary Material, where the proposed test works well in terms of size control regardless of whether the data are nearly uncorrelated, moderately correlated or highly correlated, and regardless of whether the data are normally or non-normally distributed. A theoretical justification of the ratio-consistency of \(\widehat{{\text {tr}}(\varvec{{\varSigma }}^{3})}\) without the normality assumption, like the one given in Lemma 1 for normal data, is theoretically interesting and mathematically possible, but is expected to be rather laborious because the evaluation of the mean and variance of \(\widehat{{\text {tr}}(\varvec{{\varSigma }}^{3})}\) for non-normal data is much more involved than in the proof of Lemma 1 for normal data. Further research in this direction is interesting and warranted. It is worth mentioning that a U-statistic-based estimator of \({\text {tr}}(\varvec{{\varSigma }}^3)\) is given in Theorem 8.2 of Pauly et al. (2015). However, that estimator is often time-consuming to compute, especially when both n and p are large; moreover, its ratio-consistency is proved only under the null hypothesis and the normality assumption.
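To make the implementation concrete, the following self-contained Python sketch assembles the statistic (5), the plug-in estimators (19) and (20), the parameter estimators (17), and the approximate p value. Function and variable names are our own choices; this is a sketch of the procedure described above, not the authors’ software.

```python
import numpy as np
from scipy import stats

def nr_one_sample_test(Y):
    """Normal reference one-sample test of H0: mu = 0 with the 3-c matched
    chi^2-approximation, assembling (5), (17), (19) and (20).
    Y is an (n x p) data matrix; a sketch with hypothetical names."""
    Y = np.asarray(Y, dtype=float)
    n, p = Y.shape
    ybar = Y.mean(axis=0)
    S = np.cov(Y, rowvar=False)                  # hat{Sigma} in (4)

    T = n * np.sum(ybar**2) - np.trace(S)        # test statistic (5)

    tr1 = np.trace(S)
    S2 = S @ S
    tr2 = np.trace(S2)                           # tr(hat{Sigma}^2)
    tr3 = np.trace(S2 @ S)                       # tr(hat{Sigma}^3)

    # Ratio-consistent trace estimators (19) and (20)
    trS2 = (n - 1)**2 / ((n - 2) * (n + 1)) * (tr2 - tr1**2 / (n - 1))
    trS3 = (n - 1)**4 / ((n**2 + n - 6) * (n**2 - 2*n - 3)) * (
        tr3 - 3 * tr1 * tr2 / (n - 1) + 2 * tr1**3 / (n - 1)**2)

    # Parameter estimators (17)
    beta0 = -n * trS2**2 / ((n - 2) * trS3)
    beta1 = (n - 2) * trS3 / ((n - 1) * trS2)
    d = n * (n - 1) / (n - 2)**2 * trS2**3 / trS3**2

    # Approximate p value: P[chi^2_d >= (T - beta0) / beta1]
    pval = stats.chi2.sf((T - beta0) / beta1, df=d)
    return T, d, pval
```

For instance, `nr_one_sample_test(Y - mu0)` would test \(H_0: \varvec{{\mu }}=\varvec{{\mu }}_0\) via the reduction described at the beginning of Sect. 2.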

2.3 Asymptotic power

In this subsection, we investigate the asymptotic power of \(T_{n,p}\). By (6), we have the expansion \(T_{n,p}{\mathop {=}\limits ^{d}}T_{n,p,0}+2S_{n,p}+n\Vert \varvec{{\mu }}\Vert ^{2}\), where \(T_{n,p,0}\) has the same distribution as \(T_{n,p}\) under the null hypothesis and \({\text {Var}}(S_{n,p})=n\varvec{{\mu }}^{\top }\varvec{{\varSigma }}\varvec{{\mu }}\). Following Bai and Saranadasa (1996), let us consider the power of \(T_{n,p}\) under the following local alternative:

$$\begin{aligned} \text{ as } {n,p\rightarrow \infty },\;\;n\varvec{{\mu }}^{\top } \varvec{{\varSigma }}\varvec{{\mu }}=o[{\text {tr}}(\varvec{{\varSigma }}^{2})]. \end{aligned}$$
(21)

This is the case when \({\text {Var}}(S_{n,p})=o[{\text {Var}}(T_{n,p,0})]\) so that \(T_{n,p}=T_{n,p,0}+n\Vert \varvec{{\mu }}\Vert ^{2}+o_{p}\left[ \sqrt{{\text {Var}}(T_{n,p,0})}\right] \) since \({\text {E}}(S_{n,p})=0\).

Theorem 2

Assume that \(\hat{\beta }_0, \hat{\beta }_1\) and \(\hat{d}\) are the ratio-consistent estimators of \(\beta _0,\beta _1\) and d as \(n,p\rightarrow \infty \), respectively. Then, (a) Under Conditions C1, C2, C3, and the local alternative (21), as \(n,p\rightarrow \infty \), we have

$$\begin{aligned} \Pr \left[ T_{n,p}>\hat{\beta }_0+\hat{\beta }_1\chi _{\hat{d}}^{2}(\alpha )\right] =\Pr \left[ \zeta \ge \frac{\chi _{d}^{2}(\alpha )-d}{\sqrt{2d}}-\frac{n\Vert \varvec{{\mu }}\Vert ^2}{\sqrt{2{\text {tr}}(\varvec{{\varSigma }}^2)}}\right] [1+o(1)], \end{aligned}$$

where \(\zeta \) is defined in Theorem 1(a).

(b) Under Conditions C1, C2, C4, C5 and the local alternative (21), as \(n,p\rightarrow \infty \), we have

$$\begin{aligned} \Pr \left[ T_{n,p}>\hat{\beta }_0+\hat{\beta }_1\chi _{\hat{d}}^{2}(\alpha )\right] =\varPhi \left[ -z_{\alpha }+\frac{n\Vert \varvec{{\mu }}\Vert ^2}{\sqrt{2{\text {tr}}(\varvec{{\varSigma }}^2)}}\right] [1+o(1)], \end{aligned}$$

where \(z_{\alpha }\) denotes the upper \(100\alpha \)-percentile of \(\mathcal {N}(0,1)\) and \(\varPhi (\cdot )\) denotes the cumulative distribution function of \(\mathcal {N}(0,1)\).

For any \(d\ge 1\) and small \(\alpha \), it is easy to check that we always have \(z_{\alpha }< [\chi _d^2(\alpha )-d]/\sqrt{2d}\). This shows that under Conditions C1–C3 and the local alternative (21), the asymptotic size and power of the proposed test with the normal approximation are expected to be “artificially” larger than those of the proposed test with the 3-c matched \(\chi ^2\)-approximation. This is consistent with what we observe from the simulation results presented in Sect. 3.
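The inequality \(z_{\alpha }< [\chi _d^2(\alpha )-d]/\sqrt{2d}\) is also easy to verify numerically; a minimal Python sketch for \(\alpha =0.05\) and a few values of d:

```python
import numpy as np
from scipy import stats

# Check z_alpha < [chi^2_d(alpha) - d] / sqrt(2 d) for alpha = 0.05:
alpha = 0.05
z_alpha = stats.norm.isf(alpha)          # upper 100*alpha percentile of N(0,1)
for d in (1, 2, 5, 10, 50, 200):
    q = (stats.chi2.isf(alpha, df=d) - d) / np.sqrt(2 * d)
    print(d, round(q, 3), z_alpha < q)   # True in every case
```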

2.4 Effect of data non-normality

The validity of the proposed normal reference test is guaranteed by Theorem 1. In this subsection, we further investigate the effect of data non-normality on the proposed test: how does data non-normality affect its performance? To answer this question, we study how to approximate the distribution of \(T_{n,p,0}\) directly using the 3-c matched \(\chi ^2\)-approximation. To this end, we compute the first three cumulants of \(T_{n,p,0}\) in the following theorem.

Theorem 3

The first three cumulants of \(T_{n,p,0}\) are given by \({\text {E}}(T_{n,p,0})=0\),

$$\begin{aligned} \quad {\text {Var}}(T_{n,p,0})=\frac{2n}{n-1}{\text {tr}}(\varvec{{\varSigma }}^{2}),\;\text{ and }\; {\text {E}}(T_{n,p,0}^3)=\frac{8n(n-2)}{(n-1)^{2}}{\text {tr}}(\varvec{{\varSigma }}^{3})+\frac{4n\varUpsilon }{(n-1)^{2}}, \end{aligned}$$

where \(\varUpsilon ={\text {E}}[(\varvec{{y}}_{1}-\varvec{{\mu }})^{\top }(\varvec{{y}}_{2}-\varvec{{\mu }})]^{3}\).

It is seen from Theorem 3 that data non-normality affects only the third moment of \(T_{n,p,0}\). To approximate the distribution of \(T_{n,p,0}\) directly using that of \(W=b_0+b_1\chi _f^2\) via matching the first three cumulants of \(T_{n,p,0}\) and W, the parameters \(b_{0},b_{1}\) and f are obtained as

$$\begin{aligned} b_{0}=\beta _0/\delta ,\;\;b_1=\beta _1\delta ,\; \text{ and } \;f=d/\delta ^2,\; \text{ where } \delta =1+\varUpsilon /[2(n-2){\text {tr}}(\varvec{{\varSigma }}^{3})], \end{aligned}$$
(22)

and \(\beta _0, \beta _1\) and d are given in (15). Note that the skewness of \(T_{n,p,0}\) is given by

$$\begin{aligned} {\text {E}}(T_{n,p,0}^{3})/{\text {Var}}^{3/2}(T_{n,p,0})=\left( 8/f\right) ^{1/2}. \end{aligned}$$
(23)

The quantity \(\varUpsilon \) can be seen as a non-invariant measure of multivariate normality based on skewness (see, e.g., Sect. 3.1 of Henze 2002). When the data (2) are normal, it is easy to show that \(\varUpsilon =0\), so that \(\delta =1\), \(b_0=\beta _0\), \(b_1=\beta _1\), \(f=d\), and the skewness (23) of \(T_{n,p,0}\) reduces to the skewness (16) of \(T_{n,p,0}^*\), as expected. However, when the data (2) are non-normal, we may not have \(\varUpsilon =0\), and hence the approximation parameters \(b_0, b_1\), f and the skewness of \(T_{n,p,0}\) are all affected by the data non-normality. Fortunately, we can show the following result.

Theorem 4

(a) Under Conditions C1 and C2, we have \(\varUpsilon \le (\varDelta ^2+6\varDelta +9)^{3/4}{\text {tr}}^{3/2}(\varvec{{\varSigma }}^2)\) where \(\varDelta \) is given in Condition C2; and (b) Under either Conditions C1, C2 and C3 or Conditions C1, C2 and C4, we have \(\delta =1+o(1)\) as \(n,p\rightarrow \infty \).

Theorem 4 says that under Conditions C1, C2 and C3, or Conditions C1, C2 and C4, the effect of data non-normality on the proposed normal reference test is asymptotically ignorable, so that \(b_0=\beta _0[1+o(1)]\), \(b_1=\beta _1[1+o(1)]\), \(f=d[1+o(1)]\), and the skewness of \(T_{n,p,0}\) and that of \(T_{n,p,0}^*\) are asymptotically equal. The following theorem gives a sufficient and necessary condition for the asymptotic normality of \(\tilde{T}_{n,p,0}\).

Theorem 5

Under Conditions C1, C2 and C4, as \(n,p\rightarrow \infty \), \(\tilde{T}_{n,p,0}{\mathop {\longrightarrow }\limits ^{\mathcal {L}}}\mathcal {N}(0,1) \) if and only if \(d\longrightarrow \infty \) where d is given in (15).

Theorem 5 indicates that when d is small, the normal approximation to the distribution of \(\tilde{T}_{n,p,0}\) is unlikely to be adequate.

3 Simulation study

In this section, we conduct a simulation study to compare the proposed normal reference test with the 3-c matched \(\chi ^{2}\)-approximation (denoted as \(T_{new}\)) against the \(L^{2}\)-norm based test with the 2-c matched \(\chi ^{2}\)-approximation proposed by Zhang and Xu (2009) (denoted as \(T_{ZX}\)), and the tests proposed by Bai and Saranadasa (1996), Chen and Qin (2010) and Srivastava and Du (2008) (denoted as \(T_{BS}\), \(T_{CQ}\) and \(T_{SD}\), respectively). The original \(T_{BS}\) and \(T_{CQ}\) are two-sample tests, and the corresponding one-sample versions adopted here are given by (1.2) and (1.5) of Zhou et al. (2019), respectively. Note that the null distributions of \(T_{BS}\), \(T_{CQ}\) and \(T_{SD}\) are all approximated using the normal approximation.

In each run, we generate the high-dimensional data (2) using \(\varvec{{y}}_{i}=\varvec{{\mu }}+\varvec{{\varSigma }}^{1/2}\varvec{{z}}_{i},\ i=1,\ldots ,n\), where \(\varvec{{\mu }}=\delta \varvec{{h}}\) and the components of \(\varvec{{z}}_{i}\) are generated i.i.d. from the following three models:

  • Model 1: \(z_{ir},\ r=1,\ldots ,p{\mathop {\sim }\limits ^{\text {i.i.d.}}}\mathcal {N}(0,1)\).

  • Model 2: \(z_{ir}=w_{ir}/\sqrt{2},\ r=1,\ldots ,p\) with \(w_{ir},\ r=1,\ldots ,p{\mathop {\sim }\limits ^{\text {i.i.d.}}}{\mathrm{t}}_{4}\).

  • Model 3: \(z_{ir}=(w_{ir}-1)/\sqrt{2},\ r=1,\ldots ,p\) with \(w_{ir},\ r=1,\ldots ,p{\mathop {\sim }\limits ^{\text {i.i.d.}}}\chi _{1}^{2}\).

Under the above three models, the resulting data are normal, symmetric but non-normal, and skewed and non-normal, respectively. The covariance matrix is specified as \(\varvec{{\varSigma }}=\sigma ^{2}\left[ (1-\rho )\varvec{{I}}_{p}+\rho \varvec{{J}}_{p}\right] \). Some additional simulation results with different covariance structures are presented in the Supplementary Material; the conclusions are similar to those presented in this section.

Note that the tuning parameters \(\delta \) and \(\varvec{{h}}\) control the mean vector, while \(\rho \) controls the data correlation. Note also that the power of a test increases with \(\delta \), and the data correlation increases with \(\rho \). For simplicity and without loss of generality, we set \(\varvec{{h}}=\varvec{{u}}/\Vert \varvec{{u}}\Vert \) with \(\varvec{{u}}=(1,\ldots ,p)^{\top }\) and set \(\sigma ^{2}=1\). To compare the performance of the tests under various settings, we consider three dimensions with \(p=50,500,1000\), three sample sizes with \(n=30,60,120\), and three levels of data correlation with \(\rho =0.1,0.5\) and 0.9.
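The following Python sketch of this data-generating scheme (function and argument names are ours) exploits the closed-form square root of the compound-symmetry matrix \(\varvec{{\varSigma }}=\sigma ^2[(1-\rho )\varvec{{I}}_p+\rho \varvec{{J}}_p]\), which avoids forming and factorizing a \(p\times p\) matrix.

```python
import numpy as np

def generate_sample(n, p, rho, delta, model=1, sigma2=1.0, rng=None):
    """Generate y_i = mu + Sigma^{1/2} z_i under Models 1-3 with
    Sigma = sigma^2 [(1-rho) I_p + rho J_p]; a sketch of the design
    in this section, with hypothetical function/argument names."""
    rng = rng or np.random.default_rng()
    if model == 1:
        Z = rng.standard_normal((n, p))                       # Model 1: N(0,1)
    elif model == 2:
        Z = rng.standard_t(4, size=(n, p)) / np.sqrt(2)       # Model 2: t_4 / sqrt(2)
    else:
        Z = (rng.chisquare(1, size=(n, p)) - 1) / np.sqrt(2)  # Model 3: centered chi^2_1
    u = np.arange(1, p + 1.0)
    mu = delta * u / np.linalg.norm(u)                        # mu = delta * h
    # Sigma^{1/2} = sqrt(a) I_p + [(sqrt(a + p b) - sqrt(a)) / p] J_p,
    # with a = sigma^2 (1 - rho) and b = sigma^2 rho.
    a, b = sigma2 * (1 - rho), sigma2 * rho
    rowsum = Z.sum(axis=1, keepdims=True)                     # Z J_p has equal columns
    return mu + np.sqrt(a) * Z + (np.sqrt(a + p * b) - np.sqrt(a)) / p * rowsum
```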

In the simulations, the empirical size and power of a test are calculated as the proportion of rejections (i.e., the proportion of runs in which the p value of the associated test is smaller than the nominal level \(\alpha =5\%\)) out of 10,000 runs. The empirical sizes are calculated with \(\delta =0\) so that the null hypothesis \(H_0\) in (3) is true, and the empirical powers are calculated with \(\delta >0\). Different values of \(\delta \) (see Table 2) are carefully selected for different combinations of n and p so that all the tests largely have non-trivial powers for \(\rho =0.1, 0.5\) and 0.9. To assess the performance of a test in maintaining the type I error, we define the average relative error as \(\text{ ARE }=100M^{-1}\sum _{j=1}^{M}|\hat{\alpha }_{j}-\alpha |/\alpha \), where \(\alpha \) is the nominal size (\(5\%\) here) and \(\hat{\alpha }_{j},\ j=1,\ldots ,M\), denote the empirical sizes under consideration. A smaller ARE value indicates an overall better performance of the associated test in terms of maintaining the nominal size.

Table 1 Empirical sizes (in \(\%\)) of the tests under various settings
Table 2 Empirical powers (in \(\%\)) of the tests under various settings
Table 3 Estimated approximate degrees of freedom of \(T_{new}\) and \(T_{ZX}\) under various settings

Table 1 displays the empirical sizes of the tests under various settings, with the last row presenting the ARE values of the tests for the three values of \(\rho \). It is seen that under each setting, the empirical size of \(T_{new}\) is generally much closer to \(5\%\) than those of the other tests, showing that in terms of size control, our new test significantly outperforms the others. This conclusion is also supported by the ARE values: from the last row of the table, the ARE values of \(T_{new}\) are much smaller than those of the other tests for \(\rho =0.1, 0.5\) and 0.9. From Table 1, we also see that in terms of size control, (a) \(T_{ZX}\) generally outperforms \(T_{BS}\), \(T_{CQ}\) and \(T_{SD}\); (b) \(T_{BS}\) and \(T_{CQ}\) are generally comparable, and they are generally very liberal, with most of their empirical sizes close to \(7\%\); and (c) \(T_{SD}\) performs quite well under Models 1 and 2 for \(\rho =0.1\) but is very conservative for \(\rho =0.5\) and 0.9, with most of its empirical sizes much smaller than \(5\%\). This implies that \(T_{SD}\) cannot work well for highly skewed or highly correlated high-dimensional data.

Table 2 displays the empirical powers of the tests under various settings. First of all, it is seen that \(T_{new}\) and \(T_{ZX}\) have comparable empirical powers, with those of \(T_{ZX}\) slightly larger than those of \(T_{new}\). This is possibly because, as shown in Table 1, the empirical sizes of \(T_{ZX}\) are generally larger than those of \(T_{new}\); this observation is consistent with the conclusion drawn from Theorem 2. Second, \(T_{BS}\) and \(T_{CQ}\) have comparable empirical powers which are slightly larger than those of \(T_{new}\) and \(T_{ZX}\); again, this is because the former tests generally have larger empirical sizes than the latter. Third, \(T_{SD}\) has empirical powers comparable with the other tests when \(\rho =0.1\) under Models 1 and 2, but lower empirical powers when \(\rho =0.5,0.9\) or under Model 3. This again shows that \(T_{SD}\) does not work well for highly skewed or highly correlated high-dimensional data. Finally, under the various settings, the empirical powers of all the tests decrease as \(\rho \) increases. This is reasonable since the data variation increases with \(\rho \).

Table 3 displays the estimated approximate degrees of freedom of \(T_{new}\) and \(T_{ZX}\) under various settings. First of all, it is seen that under the same setting, the estimated approximate degrees of freedom of \(T_{new}\) is smaller than that of \(T_{ZX}\) in most cases. Secondly, it is seen that as \(\rho \) increases, the estimated approximate degrees of freedom of \(T_{new}\) and \(T_{ZX}\) become smaller, showing that the normal approximation becomes less adequate as the data correlation increases. This explains why, in terms of size control, \(T_{BS}\) and \(T_{CQ}\) perform worse as the data correlation increases.

In summary, the simulation results presented in this section show that in terms of size control, \(T_{new}\) outperforms the other tests significantly; \(T_{ZX}\) outperforms \(T_{BS}\), \(T_{CQ}\) and \(T_{SD}\); \(T_{BS}\) and \(T_{CQ}\) are generally comparable and generally liberal; and \(T_{SD}\) performs well for symmetric and weakly correlated high-dimensional data but is very conservative when the high-dimensional data are highly skewed or highly correlated.

4 Some interesting applications

4.1 Paired two-sample problem

One important application of the one-sample test considered in this paper is testing the mean difference of two paired samples. Suppose we have n i.i.d. paired observations \((\varvec{{x}}_{11},\varvec{{x}}_{12}),\ldots ,(\varvec{{x}}_{n1},\varvec{{x}}_{n2})\). We are interested in testing the following hypotheses:

$$\begin{aligned} H_{0}:\ {\text {E}}(\varvec{{x}}_{11})={\text {E}}(\varvec{{x}}_{12}), \text{ versus } H_{1}:\ {\text {E}}(\varvec{{x}}_{11})\ne {\text {E}}(\varvec{{x}}_{12}). \end{aligned}$$
(24)

Testing (24) is equivalent to testing (3) based on the induced i.i.d. sample \(\varvec{{y}}_{i}=\varvec{{x}}_{i1}-\varvec{{x}}_{i2}\), \(i=1,\ldots ,n\), with \(\varvec{{\mu }}={\text {E}}(\varvec{{y}}_1)\). Therefore, the one-sample tests discussed previously can be used to test the hypotheses (24).
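In code, the reduction is immediate; a minimal sketch (the array and function names are ours):

```python
import numpy as np

def paired_differences(X1, X2):
    """Paired two-sample reduction: the induced one-sample data for
    testing (24) are the row-wise differences y_i = x_{i1} - x_{i2}.
    X1 and X2 are matched (n x p) arrays of paired observations."""
    return np.asarray(X1) - np.asarray(X2)
```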

As a real data example, we consider the colon dataset provided by Alon et al. (1999). The colon dataset contains 22 normal colon tissues and 40 tumor colon tissues from 40 colon-cancer patients, with each observation consisting of 2000 gene expression levels. It is of interest to check whether the mean gene expression levels of the normal and tumor colon tissues are the same. For simplicity, we remove the unpaired colon tissues and keep only the \(n=22\) paired colon tissues.

As an application, we apply the tests \(T_{new}\), \(T_{ZX}\), \(T_{BS}\), \(T_{CQ}\) and \(T_{SD}\) to the colon dataset to test whether the normal colon tissues and the tumor colon tissues have significantly different mean gene expression levels.

Table 4 Results for testing if the mean gene expression levels of the normal colon tissues and the tumor colon tissues are the same

Table 4 presents the results based on the 22 paired colon tissues only. It is seen that all the tests except \(T_{SD}\) strongly reject the null hypothesis. The estimated degrees of freedom of \(T_{new}\) and \(T_{ZX}\) are small, showing that the normal approximation used in \(T_{BS}\), \(T_{CQ}\) and \(T_{SD}\) is not adequate for the respective null distributions; therefore, the p values of \(T_{BS}\), \(T_{CQ}\) and \(T_{SD}\) are less reliable. The p value of \(T_{SD}\) indicates that \(T_{SD}\) fails to detect the difference between the gene expression levels of the normal and tumor colon tissues at the \(5\%\) significance level, showing that \(T_{SD}\) is conservative in this example. This result is consistent with what we observed from the simulation results presented in Sect. 3.

4.2 One-sample problem for transposable data

In many applications, the measurements of a subject can be naturally organized in a matrix, especially when the rows and columns correspond to two different sets of variables. Such data are called transposable data in Allen and Tibshirani (2010). Given n i.i.d. transposable \(q\times k\) random matrices \(\varvec{{X}}_{1},\ldots ,\varvec{{X}}_{n}\), Touloumis et al. (2015) considered the following testing problem on the structure of the mean matrix:

$$\begin{aligned} H_{0}:\ \varvec{{M}}=\left( \varvec{{\mu }}_{1}\varvec{{1}}_{k_{1}}^{\top },\ldots ,\varvec{{\mu }}_{g}\varvec{{1}}_{k_{g}}^{\top }\right) , \text{ versus } H_{1}:\ \varvec{{M}}\ne \left( \varvec{{\mu }}_{1}\varvec{{1}}_{k_{1}}^{\top },\ldots ,\varvec{{\mu }}_{g}\varvec{{1}}_{k_{g}}^{\top }\right) , \end{aligned}$$
(25)

where \(\varvec{{M}}={\text {E}}(\varvec{{X}}_{1})\), \(k_{1},\ldots ,k_{g}\) are positive integers such that \(\sum _{i=1}^{g}k_{i}=k\) with at least one \(k_{i}\ge 2\), and \(\varvec{{\mu }}_{1},\ldots ,\varvec{{\mu }}_{g}\) are g unknown \(q\times 1\) vectors. For each \(i=1,\ldots ,g\), set \(\varvec{{P}}_{k_i}=\varvec{{I}}_{k_i}-\varvec{{J}}_{k_i}/k_i\), a centering matrix of size \(k_i\times k_i\). Note that the MANOVA hypothesis (1) for dependent samples can be seen as a special case of (25). Set \(\varvec{{P}}={\text {diag}}(\varvec{{P}}_{k_{1}},\ldots ,\varvec{{P}}_{k_{g}})\), a \(k\times k\) block-diagonal matrix. Then testing the null hypothesis in (25) is equivalent to testing \(\text{ vec }(\varvec{{M}}\varvec{{P}})=\varvec{{0}}\). Set

$$\begin{aligned} \varvec{{y}}_{i}=\text{ vec }(\varvec{{X}}_{i}\varvec{{P}}),\ i=1,\ldots ,n, \end{aligned}$$
(26)

which are i.i.d. \((qk)\times 1\) random vectors. Then testing (25) based on the i.i.d. random matrices \(\varvec{{X}}_{1},\ldots ,\varvec{{X}}_{n}\) is equivalent to testing (3) with the induced i.i.d. random vectors (26) and with \(\varvec{{\mu }}={\text {E}}(\varvec{{y}}_1)=\text{ vec }(\varvec{{M}}\varvec{{P}})\). Therefore, our normal reference one-sample test described in Sect. 2 can be applied to test (25) via the induced i.i.d. random vectors (26); a sketch of this reduction is given below. Similar structural hypotheses on the rows of the mean matrix can be tested accordingly. Besides, the technical Conditions C1–C5 can be easily adapted to the original transposable data as in Touloumis et al. (2015), so the asymptotic results derived in Sect. 2 also apply here. To test (25), Touloumis et al. (2015) constructed a test using U-statistics as in the \(T_{CQ}\) test of Chen and Qin (2010). Like \(T_{CQ}\), their test requires some strong assumptions so that a normal approximation to the null distribution of the test statistic is valid.
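The following Python sketch (names are ours) builds the block-diagonal centering matrix \(\varvec{{P}}\) and the induced sample (26) from a list of matrix observations.

```python
import numpy as np
from scipy.linalg import block_diag

def transposable_to_one_sample(X_list, group_sizes):
    """Reduce the hypothesis (25) to the one-sample problem (3):
    y_i = vec(X_i P) with P = diag(P_{k_1}, ..., P_{k_g}).
    X_list holds (q x k) arrays and sum(group_sizes) = k."""
    blocks = [np.eye(k) - np.ones((k, k)) / k for k in group_sizes]
    P = block_diag(*blocks)                          # k x k block-diagonal matrix
    return np.vstack([(X @ P).flatten(order="F") for X in X_list])

# For the hypothesis (27): group_sizes = [1, 1, 5]; for (28): [2, 5].
```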

As a real data example, we consider the following mean matrix structure hypothesis studied by Touloumis et al. (2015) on the glioblastoma (GB) transposable dataset provided by Sottoriva et al. (2013):

$$\begin{aligned} H_{0}:\ \varvec{{M}}=(\varvec{{\mu }}_{1},\varvec{{\mu }}_{2},\varvec{{\mu }}_{3}\varvec{{1}}_{5}^{\top }), \; \text {versus}\; H_{1}: H_{0}\; \text { is not true,} \end{aligned}$$
(27)

where the columns of \(\varvec{{M}}\) represent the mean gene expression patterns of different brain compartments, with \(\varvec{{\mu }}_{1}\) corresponding to the tumor margin (MA), \(\varvec{{\mu }}_{2}\) to the sub-ventricular zone (SVZ, normal brain tissue that surrounds the tumor mass), and \(\varvec{{\mu }}_{3},\ldots ,\varvec{{\mu }}_{7}\) to 5 different fragments in the tumor mass such that earlier fragments are closer to MA and later fragments closer to SVZ. The null hypothesis in (27) corresponds to the biological hypothesis of the conservation of the mean vectors of gene expression levels across the tumor mass. The GB dataset consists of \(n=8\) patients and \(k=7\) mRNA samples (column variables), with each sample having \(q=16,810\) gene expression levels (row variables) measured. We apply the test \(T_{TTM}\) proposed by Touloumis et al. (2015), together with \(T_{new}\), \(T_{ZX}\), \(T_{BS}\), \(T_{CQ}\) and \(T_{SD}\), to the transformed data (26) to test the null hypothesis in (27). The associated p values are given in the left panel of Table 5. It is seen that all the p values are comparable and suggest that there is not enough evidence to reject the null hypothesis in (27). Since we do not reject the null hypothesis in (27), it is of interest to further test the following hypotheses:

$$\begin{aligned} H_{0}:\ \varvec{{M}}=(\varvec{{\mu }}_{1}\varvec{{1}}_{2}^{\top },\varvec{{\mu }}_{2}\varvec{{1}}_{5}^{\top }), \ \text {versus}\ H_{1}: \; H_{0}\ \text {is not true,} \end{aligned}$$
(28)

where the null hypothesis corresponds to the biological hypothesis that MA and SVZ have a common mean gene expression pattern and that the 5 different fragments in the tumor mass also have a common mean gene expression pattern. The testing results are given in the right panel of Table 5. It is seen that all the tests reject the null hypothesis in (28). From Table 5, it is also seen that the estimated degrees of freedom of \(T_{new}\) and \(T_{ZX}\) are quite large, showing that the normal approximation to the respective null distributions of \(T_{BS}\), \(T_{CQ}\), \(T_{TTM}\) and \(T_{SD}\) may be adequate.

Table 5 Testing the null hypotheses in (27) and (28) for mean gene expression levels of the glioblastoma data

4.3 Two-sample problem and MANOVA

In this subsection, we show how to use the proposed one-sample test to solve problems with two or more independent samples, e.g., the two-sample problem and MANOVA, by transforming them into a one-sample problem. There is an abundant literature on the high-dimensional two-sample problem and MANOVA; see Dempster (1958), Bai and Saranadasa (1996), Srivastava and Du (2008), Chen and Qin (2010), Schott (2007), Yamada and Himeno (2015), Hu et al. (2017) and the references therein. One advantage of using the transformation method to solve k-sample problems as a one-sample problem is that heteroscedasticity is automatically overcome, so there is no need to assume a common covariance matrix for the different samples (Zhang and Xu 2009; Nishiyama et al. 2013).

Given k independent normal samples \(\varvec{{x}}_{ij},\ i=1,\ldots ,n_{j}{\mathop {\sim }\limits ^{\text {i.i.d.}}}\mathcal {N}(\varvec{{\mu }}_{j},\varvec{{\varSigma }}_{j}),\ j=1,\ldots ,k\), where we suppose \(n_{1}\le \cdots \le n_{k}\), we first consider testing the simple linear hypotheses

$$\begin{aligned} H_{0}:\sum _{j=1}^{k}c_{j}\varvec{{\mu }}_{j}=\varvec{{0}},\; \text {versus}\; H_{1}: \sum _{j=1}^{k}c_{j}\varvec{{\mu }}_{j}\ne \varvec{{0}}, \end{aligned}$$
(29)

where \(c_{1},\ldots ,c_{k}\) are some given scalars. To apply the proposed one-sample test to the above problem, we can transform the k samples into one sample by the following transformation (Anderson 2003, Sect. 5.5): for \(i=1,\ldots ,n_{1}\),

$$\begin{aligned} \varvec{{y}}_{i}=c_{1}\varvec{{x}}_{i1}+\sum _{j=2}^{k}c_{j}(n_{1}/n_{j})^{1/2} \left[ \varvec{{x}}_{ij}-n_{1}^{-1}\sum _{\ell =1}^{n_{1}}\varvec{{x}}_{\ell j}+(n_{1}n_{j})^{-1/2}\sum _{\ell =1}^{n_{j}}\varvec{{x}}_{\ell j}\right] . \end{aligned}$$
(30)

Then we have \( \varvec{{y}}_{1},\ldots ,\varvec{{y}}_{n_{1}}{\mathop {\sim }\limits ^{\text {i.i.d.}}}\mathcal {N}(\sum _{i=1}^{k}c_{i}\varvec{{\mu }}_{i}, \sum _{i=1}^{k}c_{i}^{2}n_{1}n_{i}^{-1}\varvec{{\varSigma }}_{i})\), and applying the proposed one-sample test to the induced sample tests the hypotheses (29); a sketch of this transformation in code is given below. In particular, with \(k=2\), \(c_{1}=1\) and \(c_{2}=-1\), the hypotheses (29) reduce to the two-sample problem studied in Chen and Qin (2010), and the transformation (30) reduces to the multivariate version of Scheffé’s (1943) transformation, also known as Bennett’s (1950) transformation.
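The following Python sketch implements (30) term by term; the function and argument names are our own choices.

```python
import numpy as np

def linear_combination_sample(samples, c):
    """Transformation (30): reduce k independent samples to one sample of
    size n_1 for testing (29). samples[j] is an (n_j x p) array with
    n_1 <= ... <= n_k, and c holds the scalars c_1, ..., c_k."""
    n1 = samples[0].shape[0]
    Y = c[0] * np.asarray(samples[0], dtype=float)
    for j in range(1, len(samples)):
        Xj = np.asarray(samples[j], dtype=float)
        nj = Xj.shape[0]
        head_mean = Xj[:n1].mean(axis=0)   # n_1^{-1} sum_{l=1}^{n_1} x_{lj}
        total = Xj.sum(axis=0)             # sum_{l=1}^{n_j} x_{lj}
        Y += c[j] * np.sqrt(n1 / nj) * (Xj[:n1] - head_mean
                                        + total / np.sqrt(n1 * nj))
    return Y
```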

For the MANOVA problem, i.e., testing

$$\begin{aligned} H_{0}:\varvec{{\mu }}_{1}=\cdots =\varvec{{\mu }}_{k},\; \text {versus}\; H_{1}: H_{0}\; \text { is not true,} \end{aligned}$$
(31)

where \(k\ge 3\), we can use the “dimension stacking” trick described by Anderson (1963). By applying the transformation (30) \(k-1\) times, where in the j-th application we set \(c_{1}=1\), \(c_{j+1}=-1\) and all other coefficients to zero, we obtain \(k-1\) samples \(\varvec{{y}}_{ij},\ i=1,\ldots ,n_{1};\ j=1,\ldots ,k-1\). By stacking the \(k-1\) observations from the different samples into a single observation, i.e., defining \(\varvec{{y}}_{i}=(\varvec{{y}}_{i1}^{\top },\ldots ,\varvec{{y}}_{i(k-1)}^{\top })^{\top },\ i=1,\ldots ,n_{1}\), the original k samples are transformed into one sample with mean vector \((\varvec{{\mu }}_{1}^{\top }-\varvec{{\mu }}_{2}^{\top },\ldots ,\varvec{{\mu }}_{1}^{\top }-\varvec{{\mu }}_{k}^{\top })^{\top }\), and the MANOVA problem (31) for the original k samples is converted to the one-sample problem for the induced sample; a sketch is given below. See also Zhang and Xu (2009) for more details of this approach.
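A minimal Python sketch of this stacking step, reusing the hypothetical `linear_combination_sample` helper from the sketch after (30):

```python
import numpy as np

def manova_to_one_sample(samples):
    """MANOVA reduction for (31): apply (30) k-1 times with c_1 = 1,
    c_{j+1} = -1, then stack the induced observations; reuses the
    linear_combination_sample sketch given after (30)."""
    k = len(samples)
    parts = []
    for j in range(1, k):
        c = [0.0] * k
        c[0], c[j] = 1.0, -1.0
        parts.append(linear_combination_sample(samples, c))
    # y_i = (y_{i1}^T, ..., y_{i(k-1)}^T)^T, i = 1, ..., n_1
    return np.hstack(parts)
```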

As a real data example, we consider the peripheral blood mononuclear cells (PBMC) data provided by Burczynski et al. (2006), a microarray dataset containing 22,283 gene expression levels of 42 normal, 26 ulcerative colitis (UC), and 59 Crohn’s disease (CD) tissues. We apply the different one-sample tests based on the transformation method to check whether the three PBMC tissue types have the same mean expression levels. The testing results are given in Table 6, where all the tests reject the null hypothesis that the mean gene expression levels of the three PBMC tissue types are the same. This result is consistent with the result reported in Table 7 of Zhang et al. (2017), and the testing result given by the one-sample test \(T_{ZX}\) is very similar to that given by the MANOVA test proposed by Zhang et al. (2017). It is seen that the estimated degrees of freedom of \(T_{new}\) and \(T_{ZX}\) are quite small, indicating that the normal approximation to the respective null distributions of \(T_{BS}\), \(T_{CQ}\) and \(T_{SD}\) is not adequate.

Table 6 Testing if the mean gene expression levels of the three PBMC tissues are the same

4.4 One-sample problem for heavy tailed data

Direct applications of \(T_{new}\), \(T_{ZX}\), \(T_{BS}\), \(T_{CQ}\) and \(T_{SD}\) to the one-sample problem for heavy-tailed high-dimensional data are often less powerful. To overcome this difficulty, one may apply these tests to the induced sample yielded by the following multivariate spatial sign transformation:

$$\begin{aligned} \varvec{{u}}_{i}=U(\varvec{{y}}_{i})={\left\{ \begin{array}{ll} \frac{\varvec{{y}}_{i}}{\Vert \varvec{{y}}_{i}\Vert }, &{} \varvec{{y}}_{i}\ne \varvec{{0}},\\ \varvec{{0}}, &{} \varvec{{y}}_{i}=\varvec{{0}}, \end{array}\right. }\quad i=1,\ldots ,n. \end{aligned}$$
(32)

For example, Wang et al. (2015) and Zhou et al. (2019) successfully applied \(T_{CQ}\) and \(T_{ZX}\), respectively, to the induced sample (32) for elliptically distributed high-dimensional data.
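A minimal Python sketch of the transformation (32), applied row-wise to a data matrix:

```python
import numpy as np

def spatial_sign(Y):
    """Multivariate spatial sign transformation (32): each row y_i is
    mapped to y_i / ||y_i||, with the zero vector mapped to itself."""
    Y = np.asarray(Y, dtype=float)
    norms = np.linalg.norm(Y, axis=1, keepdims=True)
    U = np.zeros_like(Y)
    nonzero = norms[:, 0] > 0
    U[nonzero] = Y[nonzero] / norms[nonzero]
    return U
```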

To compare the performance of \(T_{new}\), \(T_{ZX}\), \(T_{BS}\), \(T_{CQ}\) and \(T_{SD}\) on the induced sample (32) for heavy-tailed high-dimensional data, we conduct the following simulation study. We generate a heavy-tailed high-dimensional sample using \(\varvec{{y}}_i=\varvec{{\mu }}+\varvec{{\varSigma }}^{1/2}\varvec{{z}}_i,\ i=1,\ldots , n\), where \(\varvec{{\mu }}\) and \(\varvec{{\varSigma }}\) are specified as in Sect. 3 and \(\varvec{{z}}_{i},\ i=1,\ldots ,n\), are generated using the following two models:

  • Model 4: \(z_{ir},\ r=1,\ldots ,p\) i.i.d. follow the Gaussian mixture \(0.9\mathcal {N}(0,1)+0.1\mathcal {N}(0,9)\).

  • Model 5: \(\varvec{{z}}_{i}=\varvec{{w}}_{i}/\sqrt{0.3}\), with \(\varvec{{w}}_{i}\) following a p-dimensional multivariate \(\mathrm{t}\)-distribution with 3 degrees of freedom, mean \(\varvec{{0}}\) and covariance matrix \(\varvec{{I}}_p\).

Table 7 Empirical sizes (in \(\%\)) of the tests for heavy-tailed distributions
Table 8 Empirical powers (in \(\%\)) of the tests for heavy-tailed distributions
Table 9 Estimated approximate degrees of freedom when \(\delta =0\) for heavy-tailed distributions

Tables 7, 8 and 9 present the empirical sizes, the empirical powers, and the estimated approximate degrees of freedom of \(T_{new}\) and \(T_{ZX}\), respectively. As expected, the conclusions drawn from these three tables are similar to those drawn from Tables 1, 2 and 3 in Sect. 3. In particular, in terms of size control, \(T_{new}\) again outperforms the other tests significantly.

5 Concluding remarks

In this paper, we propose and study a normal reference test with the three-cumulant matched \(\chi ^2\)-approximation for the one-sample problem with high-dimensional data. A simulation study shows that in terms of size control, the proposed test outperforms several existing competitors. The proposed test can also be applied to one-sample problems with other types of data via some simple transformations. When the data are normally distributed, the estimated approximation parameters are known to be ratio-consistent; whether they are also ratio-consistent for non-normal high-dimensional data is an interesting open question that warrants further research. Since the normal reference test with the 3-c matched \(\chi ^2\)-approximation for one-sample problems with high-dimensional data has much better size control than several existing tests, it is also interesting and warranted to extend this normal reference test to other high-dimensional hypothesis testing problems.