1 Introduction

In this paper, we consider simultaneous tests of the mean vectors and the covariance matrices under three-step monotone missing data for one-sample and multi-sample problems. Jinadasa and Tracy [4] and Kanda and Fujikoshi [5] discussed the maximum likelihood estimators (MLEs) for general k-step monotone missing data. For simultaneous tests with complete data, the likelihood ratio test (LRT) statistic and the modified LRT statistic with Bartlett correction were discussed by Muirhead [6] and Srivastava [7]. Furthermore, for three-step monotone missing data, the LRT statistic and test statistics that improve the accuracy of the \(\chi ^2\) approximation were proposed by Hao and Krishnamoorthy [1] and Hosoya and Seo [2, 3]. In particular, Hosoya and Seo [2, 3] presented test statistics obtained by decomposing the likelihood ratio (LR); this paper is an extension of that work. An LRT statistic and test statistics for general k-step monotone missing data, obtained by correcting only a part of the missing data, were given by Yagi et al. [9].

The remainder of this paper is organized as follows. Sect. 2 describes the MLEs of the mean vector and the covariance matrix and the LRT statistic in the case of three-step monotone missing data for a one-sample problem. Furthermore, we propose three test statistics that improve the accuracy of the \(\chi ^2\) approximation using the coefficients of the modified LRT statistics for complete data. In addition, we derive an approximate upper percentile of the LRT statistic. Using Monte Carlo simulation, we investigate the accuracy of the \(\chi ^2\) approximation for the test statistics and the accuracy of the approximate upper percentiles of the LRT statistic. A numerical power comparison of the test statistics for selected parameters is also presented. Sect. 3 describes the test statistics and the approximate upper percentile for a multi-sample problem. Furthermore, via Monte Carlo simulation, we investigate the asymptotic behavior of the upper percentiles of these test statistics and the approximate upper percentiles of the LRT statistic. The results are illustrated with an example. Finally, Sect. 4 states our conclusions.

2 One-Sample Problem

In this section, we consider the problem of simultaneously testing the mean vector and the covariance matrix under three-step monotone missing data for a one-sample problem.

2.1 LR with Three-Step Monotone Missing Data

We suppose that the data are normally distributed as follows:

$$\begin{aligned} {\varvec{x}}_1,{\varvec{x}}_2,\ldots ,{\varvec{x}}_{N_1}&{\mathop {\sim }\limits ^{i.i.d.}} N_p({\varvec{\mu }},{\varvec{\varSigma }}), \nonumber \\ {\varvec{x}}_{(12),N_1+1},{\varvec{x}}_{(12),N_1+2},\ldots ,{\varvec{x}}_{(12),N_1+N_2}&{\mathop {\sim }\limits ^{i.i.d.}} N_{p_1+p_2}({\varvec{\mu }}_{(12)}, {\varvec{\varSigma }}_{(12)(12)}),\nonumber \\ {\varvec{x}}_{1,N_1+N_2+1},{\varvec{x}}_{1,N_1+N_2+2},\ldots ,{\varvec{x}}_{1N}&{\mathop {\sim }\limits ^{i.i.d.}} N_{p_1}({\varvec{\mu }}_{1},{\varvec{\varSigma }}_{11}), \end{aligned}$$
(1)

where

$$\begin{aligned} {\varvec{\mu }}&= \left( \begin{array}{c} {\varvec{\mu }}_1 \\ {\varvec{\mu }}_2 \\ {\varvec{\mu }}_3 \end{array} \right) =\left( \begin{array}{c} {\varvec{\mu }}_{(12)} \\ {\varvec{\mu }}_3 \end{array} \right) ,\\ {\varvec{\varSigma }}&= \left( \begin{array}{cc|c} {\varvec{\varSigma }}_{11} &{} {\varvec{\varSigma }}_{12} &{} {\varvec{\varSigma }}_{13} \\ {\varvec{\varSigma }}_{21} &{} {\varvec{\varSigma }}_{22} &{} {\varvec{\varSigma }}_{23} \\ \hline {\varvec{\varSigma }}_{31} &{} {\varvec{\varSigma }}_{32} &{} {\varvec{\varSigma }}_{33} \end{array} \right) =\left( \begin{array}{c|c} {\varvec{\varSigma }}_{(12)(12)} &{} {\varvec{\varSigma }}_{(12)3} \\ \hline {\varvec{\varSigma }}_{3(12)} &{} {\varvec{\varSigma }}_{33} \end{array} \right) . \end{aligned}$$

We partition \({\varvec{x}}_j\) into a \(p_1 \times 1\) random vector, a \(p_2 \times 1\) random vector, and a \(p_3 \times 1\) random vector as \({\varvec{x}}_j = ({\varvec{x}}'_{1j},{\varvec{x}}'_{2j},{\varvec{x}}'_{3j})'(j=1,\ldots ,N_1)\). In addition, let \({\varvec{x}}_{(12),j} = ({\varvec{x}}_{1j}',{\varvec{x}}_{2j}')'(j=N_1+1,\ldots ,N_1+N_2)\).

Such a dataset has a three-step monotone missing pattern for a one-sample problem, where \(N = N_1+N_2+N_3\), \(p =p_1+p_2+p_3\), and “\(*\)” indicates a missing observation.
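This layout can be generated numerically. The following sketch (the function name and seed are ours, not from the paper) draws a sample of the form (1) and marks the missing blocks with NaN:

```python
import numpy as np

rng = np.random.default_rng(0)

def monotone_missing_sample(N1, N2, N3, p1, p2, p3, mu=None, Sigma=None):
    """Draw an (N x p) sample with a three-step monotone missing pattern.

    The first N1 rows are complete, the next N2 rows observe only the
    first p1 + p2 variables, and the last N3 rows observe only the first
    p1 variables; NaN plays the role of "*" in the text.
    """
    p = p1 + p2 + p3
    N = N1 + N2 + N3
    mu = np.zeros(p) if mu is None else mu
    Sigma = np.eye(p) if Sigma is None else Sigma
    X = rng.multivariate_normal(mu, Sigma, size=N)
    X[N1:N1 + N2, p1 + p2:] = np.nan   # step 2: last p3 variables missing
    X[N1 + N2:, p1:] = np.nan          # step 3: last p2 + p3 variables missing
    return X

X = monotone_missing_sample(10, 5, 5, 3, 3, 3)
print(np.isnan(X).sum(axis=0))  # per-column missing counts: [0 0 0 5 5 5 10 10 10]
```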

Now, we consider the following hypothesis test when the dataset has a three-step monotone pattern.

$$\begin{aligned} H_{0}:{\varvec{\mu }}={\varvec{\mu }}_0,{\varvec{\varSigma }}={\varvec{\varSigma }}_0 \;\, \mathrm{vs.} \;\, H_{1}:\textrm{not} \;\, H_{0}, \end{aligned}$$
(2)

where \({\varvec{\mu }}_0\) is a known vector and \({\varvec{\varSigma }}_0\) is a known matrix. Without loss of generality, we can assume that \({\varvec{\mu }}_0={\varvec{0}}\) and \({\varvec{\varSigma }}_0={\varvec{I}}_p\). Then, we have the following theorem.

Theorem 1

Suppose the data have a three-step monotone pattern of missing observations and are normally distributed as (1). Then, the LR of the hypothesis test (2) can be given by

$$\begin{aligned} \lambda _{1}&= |\widehat{{\varvec{\varDelta }}}_{11}|^{\frac{N}{2}} |\widehat{{\varvec{\varDelta }}}_{22}|^{\frac{N_1+N_2}{2}} |\widehat{{\varvec{\varDelta }}}_{33}|^{\frac{N_1}{2}} \nonumber \\&\quad \times \frac{\textrm{etr}\left( -{\frac{1}{2}\sum _{j=1}^{N}}{\varvec{x}}_{j}{\varvec{x}}_{j}'\right) \textrm{etr}\left( -{\frac{1}{2}\sum _{j=1}^{N_1+N_2}}{\varvec{x}}_{2j}{\varvec{x}}_{2j}'\right) \textrm{etr}\left( -{\frac{1}{2}\sum _{j=1}^{N_1}}{\varvec{x}}_{3j}{\varvec{x}}_{3j}'\right) }{\exp \left( -{\frac{1}{2}Np_1} \right) \exp \left( -{\frac{1}{2}(N_1+N_2)p_2} \right) \exp \left( -{\frac{1}{2}N_1p_3} \right) }, \end{aligned}$$
(3)

where

$$\begin{aligned} \widehat{{\varvec{\varDelta }}}_{11}=\frac{1}{N}{\varvec{B}}, \; \, \widehat{{\varvec{\varDelta }}}_{22}=\frac{1}{N_1+N_2}{\varvec{A}}_{22 \cdot 1}, \; \, \widehat{{\varvec{\varDelta }}}_{33}=\frac{1}{N_1}{\varvec{W}}_{(1)33 \cdot 12}, \end{aligned}$$

and

$$\begin{aligned} {\varvec{W}}_{(1)}&= \sum _{j=1}^{N_1}({\varvec{x}}_j-\overline{{\varvec{x}}}_{(1)})({\varvec{x}}_j-\overline{{\varvec{x}}}_{(1)})' \\&= \left( \begin{array}{cc|c} {\varvec{W}}_{(1)11} &{} {\varvec{W}}_{(1)12} &{} {\varvec{W}}_{(1)13} \\ {\varvec{W}}_{(1)21} &{} {\varvec{W}}_{(1)22} &{} {\varvec{W}}_{(1)23} \\ \hline {\varvec{W}}_{(1)31} &{} {\varvec{W}}_{(1)32} &{} {\varvec{W}}_{(1)33} \end{array} \right) = \left( \begin{array}{c|c} {\varvec{W}}_{(1),(12)(12)} &{} {\varvec{W}}_{(1),(12)3} \\ \hline {\varvec{W}}_{(1),3(12)} &{} {\varvec{W}}_{(1)33} \end{array} \right) , \\ {\varvec{W}}_{(2)}&= \sum _{j=N_1+1}^{N_1+N_2}\left( \begin{array}{c} {\varvec{x}}_{1j}-\overline{{\varvec{x}}}_{(2)1} \\ {\varvec{x}}_{2j}-\overline{{\varvec{x}}}_{(2)2} \end{array} \right) \left( \begin{array}{c} {\varvec{x}}_{1j}-\overline{{\varvec{x}}}_{(2)1} \\ {\varvec{x}}_{2j}-\overline{{\varvec{x}}}_{(2)2} \end{array} \right) '\\&\quad +\frac{N_1N_2}{N_1+N_2}\left( \begin{array}{c} \overline{{\varvec{x}}}_{(1)1}-\overline{{\varvec{x}}}_{(2)1} \\ \overline{{\varvec{x}}}_{(1)2}-\overline{{\varvec{x}}}_{(2)2} \end{array} \right) \left( \begin{array}{c} \overline{{\varvec{x}}}_{(1)1}-\overline{{\varvec{x}}}_{(2)1} \\ \overline{{\varvec{x}}}_{(1)2}-\overline{{\varvec{x}}}_{(2)2} \end{array} \right) '\\&= \left( \begin{array}{cc} {\varvec{W}}_{(2)11} &{} {\varvec{W}}_{(2)12} \\ {\varvec{W}}_{(2)21} &{} {\varvec{W}}_{(2)22} \end{array} \right) ,\\ {\varvec{W}}_{(3)}&= \sum _{j=N_1+N_2+1}^{N}({\varvec{x}}_{1j}-\overline{{\varvec{x}}}_{(3)})({\varvec{x}}_{1j}-\overline{{\varvec{x}}}_{(3)})'\\&\quad + \frac{(N_1+N_2)N_3}{N} \left( \overline{{\varvec{x}}}_{(3)}-\frac{1}{N_1+N_2}(N_1\overline{{\varvec{x}}}_{(1)1}+N_2\overline{{\varvec{x}}}_{(2)1})\right) \\&\quad \times \left( \overline{{\varvec{x}}}_{(3)}-\frac{1}{N_1+N_2}(N_1\overline{{\varvec{x}}}_{(1)1}+N_2\overline{{\varvec{x}}}_{(2)1})\right) ',\\ {\varvec{W}}_{(1)33 \cdot 12}&= 
{\varvec{W}}_{(1)33}-{\varvec{W}}_{(1),3(12)}{\varvec{W}}_{(1),(12)(12)}^{-1}{\varvec{W}}_{(1),(12)3},\\ {\varvec{A}}&= {\varvec{W}}_{(1),(12)(12)}+{\varvec{W}}_{(2)}, \; \, {\varvec{A}}_{22 \cdot 1} ={\varvec{A}}_{22}-{\varvec{A}}_{21}{\varvec{A}}_{11}^{-1}{\varvec{A}}_{12},\\ {\varvec{B}}&= {\varvec{W}}_{(1)11}+{\varvec{W}}_{(2)11}+{\varvec{W}}_{(3)},\\ \overline{{\varvec{x}}}_{(1)}&= \left( \begin{array}{c} \overline{{\varvec{x}}}_{(1)1} \\ \overline{{\varvec{x}}}_{(1)2} \\ \overline{{\varvec{x}}}_{(1)3} \end{array} \right) , \; \overline{{\varvec{x}}}_{(1)1}=\frac{1}{N_1}\sum _{j=1}^{N_1}{\varvec{x}}_{1j},\; \overline{{\varvec{x}}}_{(1)2}=\frac{1}{N_1}\sum _{j=1}^{N_1}{\varvec{x}}_{2j}, \\ \overline{{\varvec{x}}}_{(1)3}&=\frac{1}{N_1}\sum _{j=1}^{N_1}{\varvec{x}}_{3j},\; \overline{{\varvec{x}}}_{(2)} = \left( \begin{array}{c} \overline{{\varvec{x}}}_{(2)1} \\ \overline{{\varvec{x}}}_{(2)2} \end{array} \right) ,\; \overline{{\varvec{x}}}_{(2)1}=\frac{1}{N_2}\sum _{j=N_1+1}^{N_1+N_2}{\varvec{x}}_{1j},\\ \overline{{\varvec{x}}}_{(2)2}&=\frac{1}{N_2}\sum _{j=N_1+1}^{N_1+N_2}{\varvec{x}}_{2j},\; \overline{{\varvec{x}}}_{(3)} = \frac{1}{N_3}\sum _{j=N_1+N_2+1}^{N}{\varvec{x}}_{1j}.\\ \end{aligned}$$
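The matrices above translate directly into code. The sketch below (`delta_hats` is a hypothetical helper name; it assumes the three data blocks are stored as separate arrays) computes \({\varvec{B}}\), \({\varvec{A}}_{22\cdot 1}\), and \({\varvec{W}}_{(1)33\cdot 12}\) and returns \(\widehat{{\varvec{\varDelta }}}_{11},\widehat{{\varvec{\varDelta }}}_{22},\widehat{{\varvec{\varDelta }}}_{33}\):

```python
import numpy as np

def delta_hats(X1, X2, X3, p1, p2):
    """Building blocks of Theorem 1 (a sketch).

    X1: (N1, p) complete rows; X2: (N2, p1+p2); X3: (N3, p1).
    Returns (Delta11_hat, Delta22_hat, Delta33_hat).
    """
    N1, p = X1.shape
    N2, N3 = X2.shape[0], X3.shape[0]
    N, q = N1 + N2 + N3, p1 + p2

    xb1, xb2, xb3 = X1.mean(0), X2.mean(0), X3.mean(0)

    # W_(1): centered cross-products of the complete rows
    C1 = X1 - xb1
    W1 = C1.T @ C1

    # W_(2): within-block cross-products plus the between-means rank-1 term
    C2 = X2 - xb2
    d12 = xb1[:q] - xb2
    W2 = C2.T @ C2 + (N1 * N2 / (N1 + N2)) * np.outer(d12, d12)

    # W_(3): same structure, comparing xbar_(3) with the pooled p1-mean
    C3 = X3 - xb3
    pooled1 = (N1 * xb1[:p1] + N2 * xb2[:p1]) / (N1 + N2)
    d3 = xb3 - pooled1
    W3 = C3.T @ C3 + ((N1 + N2) * N3 / N) * np.outer(d3, d3)

    # A, its Schur complement A_{22.1}, W_(1)33.12, and B
    A = W1[:q, :q] + W2
    A22_1 = A[p1:, p1:] - A[p1:, :p1] @ np.linalg.solve(A[:p1, :p1], A[:p1, p1:])
    W1_33_12 = W1[q:, q:] - W1[q:, :q] @ np.linalg.solve(W1[:q, :q], W1[:q, q:])
    B = W1[:p1, :p1] + W2[:p1, :p1] + W3

    return B / N, A22_1 / (N1 + N2), W1_33_12 / N1
```

Under the null \({\varvec{\mu }}={\varvec{0}}\), \({\varvec{\varSigma }}={\varvec{I}}_p\), each returned matrix is close to an identity matrix for large samples.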

For the derivation of Theorem 1, see the Appendix. After some calculation, we obtain

$$\begin{aligned} \lambda _{1}&=\left( \frac{e}{N} \right) ^{\frac{Np_1}{2}}\left| {\varvec{B}}\right| ^{\frac{N}{2}} \left( \frac{e}{N_1+N_2} \right) ^{\frac{(N_1+N_2)p_2}{2}}\left| {\varvec{A}}_{22 \cdot 1}\right| ^{\frac{N_1+N_2}{2}} \left( \frac{e}{N_1} \right) ^{\frac{N_1p_3}{2}}\left| {\varvec{W}}_{(1)33 \cdot 12}\right| ^{\frac{N_1}{2}}\\&\quad \times \textrm{etr}\left\{ -\frac{1}{2}\left( {\varvec{B}}+\frac{1}{N}(N_1\overline{{\varvec{x}}}_{(1)1} +N_2\overline{{\varvec{x}}}_{(2)1}+N_3\overline{{\varvec{x}}}_{(3)})(N_1\overline{{\varvec{x}}}_{(1)1} +N_2\overline{{\varvec{x}}}_{(2)1}+N_3\overline{{\varvec{x}}}_{(3)})'\right) \right\} \\&\quad \times \textrm{etr} \left\{ -\frac{1}{2}\big ({\varvec{A}}_{22}+\frac{1}{N_1+N_2}(N_1\overline{{\varvec{x}}}_{(1)2}+N_2\overline{{\varvec{x}}}_{(2)2})(N_1\overline{{\varvec{x}}}_{(1)2}+N_2\overline{{\varvec{x}}}_{(2)2})'\big )\right\} \\&\quad \times \textrm{etr} \left\{ -\frac{1}{2}({\varvec{W}}_{(1)33}+N_1\overline{{\varvec{x}}}_{(1)3}\overline{{\varvec{x}}}_{(1)3}') \right\} . \end{aligned}$$

This LR \(\lambda _1\) is essentially the same as that obtained by Yagi, Yamaguchi, and Seo [9]. Thus, we obtain the LRT statistic \(-2\log \lambda _{1}\). In the complete data case (Sect. 2.2), the LRT statistic for (2) is asymptotically distributed as a \(\chi ^2\) distribution with \(f_1\) degrees of freedom, where \(f_1=p(p+3)/2\) (see Muirhead [6, p. 370]). For example, Table 1 presents the simulated values of the upper \(100\alpha \) percentiles of \(-2\log \lambda _{1}\) and the type I error rate, \(\alpha _{1} =\textrm{Pr}\{ -2\log \lambda _{1} > \chi _{f_1{;1-\alpha }}^2 \}\), for the three-step monotone missing data case, where \(\chi _{f_1{;1-\alpha }}^2\) is the upper \(100\alpha \) percentile of the \(\chi ^2\) distribution with \(f_1\) degrees of freedom.

Table 1 The upper percentile of \(-2\log \lambda _{1}\) and type I error rates when \((p_1,p_2,p_3)=(3,3,3)\)

As demonstrated in Table 1, the accuracy of the \(\chi ^2\) approximation is poor unless the sample size is large; therefore, a test statistic that improves the accuracy of the \(\chi ^2\) approximation is needed. We propose test statistics that improve the \(\chi ^2\) approximation using the modified LRT statistics of the simultaneous test and the covariance test for the complete data case described in Sect. 2.2.

2.2 Complete Data

We consider the LRT statistic and modified LRT statistics with Bartlett correction in the case of complete data for a one-sample problem. These results are used in the next subsection. We first consider a simultaneous test for complete data as follows:

$$\begin{aligned} H_{01}:{\varvec{\mu }}={\varvec{0}}, {\varvec{\varSigma }}={\varvec{I}} \; \, \mathrm{vs.} \;\, H_{11}:\textrm{not} \; \, H_{01} \end{aligned}$$

In this case, the LR can be expressed as follows. Let \({\varvec{x}}_1,{\varvec{x}}_2,\ldots ,{\varvec{x}}_N\) be independently distributed as \(N_p ({\varvec{\mu }},{\varvec{\varSigma }})\), and let \(\lambda _{S_1}\) denote the LR for the complete data. Then,

$$\begin{aligned} \lambda _{S_1}= e^{\frac{Np}{2}}\left| \frac{1}{N}{\varvec{W}} \right| ^{\frac{N}{2}} \textrm{etr}\left( -\frac{1}{2}{\varvec{W}} \right) \exp \left( -\frac{1}{2}N \overline{{\varvec{x}}}'\overline{{\varvec{x}}} \right) , \end{aligned}$$

where

$$\begin{aligned} {\varvec{W}} =\sum _{j=1}^{N}({\varvec{x}}_{j}-\overline{{\varvec{x}}})({\varvec{x}}_{j}-\overline{{\varvec{x}}})',\; \overline{{\varvec{x}}} = \frac{1}{N}\sum _{j=1}^{N}{\varvec{x}}_{j}. \end{aligned}$$

It is known that \(-2 \log \lambda _{S_1}\) is asymptotically distributed as a \(\chi ^2\) distribution with \(f_1\,(=p(p+3)/2)\) degrees of freedom. Furthermore, the modified LRT statistic with Bartlett correction is given by \(-2\rho _1 \log \lambda _{S_1}\) (Muirhead [6, p. 370]), where

$$\begin{aligned} \rho _1 = 1-\frac{2p^2+9p+11}{6N(p+3)}. \end{aligned}$$

Next, we consider a covariance test for complete data as follows:

$$\begin{aligned} H_{02}:{\varvec{\varSigma }}={\varvec{I}} \; \, \mathrm{vs.} \; \, H_{12}:\textrm{not} \; \, H_{02} \end{aligned}$$

In this case, the LR, which yields an unbiased test, can be expressed as follows:

$$\begin{aligned} \lambda _{V_1}= e^{\frac{(N-1)p}{2}} \left| \frac{1}{N-1}{\varvec{W}} \right| ^{\frac{N-1}{2}} \textrm{etr}\left( -\frac{1}{2}{\varvec{W}} \right) . \end{aligned}$$

The modified LRT statistic with Bartlett correction \(-2\rho _2 \log \lambda _{V_1}\) was provided by Muirhead [6, p. 359], where

$$\begin{aligned} \rho _2 = 1-\frac{2p^2+3p-1}{6(N-1)(p+1)}. \end{aligned}$$
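Both correction factors are closed-form expressions; a minimal sketch (the function names are ours):

```python
def bartlett_rho1(N, p):
    """Bartlett factor for the simultaneous test H_01 (Muirhead, p. 370)."""
    return 1 - (2 * p**2 + 9 * p + 11) / (6 * N * (p + 3))

def bartlett_rho2(N, p):
    """Bartlett factor for the covariance test H_02 (Muirhead, p. 359)."""
    return 1 - (2 * p**2 + 3 * p - 1) / (6 * (N - 1) * (p + 1))
```

Each factor tends to 1 as \(N\rightarrow \infty \); multiplying the LRT statistic by it brings the null distribution closer to the limiting \(\chi ^2\) distribution.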

2.3 Test Statistics

We now decompose the LR to derive the test statistic for improving the accuracy of the \(\chi ^2\) approximation. Let

$$\begin{aligned} \omega _1&= \exp \left( -\frac{1}{2N}(N_1\overline{{\varvec{x}}}_{(1)1}+N_2\overline{{\varvec{x}}}_{(2)1}+N_3\overline{{\varvec{x}}}_{(3)})'(N_1\overline{{\varvec{x}}}_{(1)1}+N_2\overline{{\varvec{x}}}_{(2)1}+N_3\overline{{\varvec{x}}}_{(3)}) \right) ,\\ \omega _2&= \exp \left( -\frac{1}{2(N_1+N_2)}(N_1\overline{{\varvec{x}}}_{(1)2}+N_2\overline{{\varvec{x}}}_{(2)2})'(N_1\overline{{\varvec{x}}}_{(1)2}+N_2\overline{{\varvec{x}}}_{(2)2})\right) ,\\ \omega _3&= \exp \left( -\frac{N_1}{2}\overline{{\varvec{x}}}_{(1)3}'\overline{{\varvec{x}}}_{(1)3} \right) ,\\ \omega _4&= \left( \frac{e}{N} \right) ^{\frac{1}{2}Np_1}\left| {\varvec{B}}\right| ^{\frac{N}{2}}\textrm{etr}\left( -\frac{1}{2}{\varvec{B}} \right) ,\\ \omega _5&= \left( \frac{e}{N_1+N_2} \right) ^{\frac{1}{2}(N_1+N_2)p_2}\left| {\varvec{A}}_{22 \cdot 1}\right| ^{\frac{N_1+N_2}{2}}\textrm{etr}\left( -\frac{1}{2}{\varvec{A}}_{22 \cdot 1} \right) ,\\ \omega _6&= \left( \frac{e}{N_1} \right) ^{\frac{1}{2}N_1p_3}\left| {\varvec{W}}_{(1)33 \cdot 12}\right| ^{\frac{N_1}{2}}\textrm{etr}\left( -\frac{1}{2}{\varvec{W}}_{(1)33 \cdot 12} \right) ,\\ \omega _7&= \textrm{etr}\left( -\frac{1}{2}{\varvec{A}}_{21}{\varvec{A}}_{11}^{-1}{\varvec{A}}_{12} \right) \textrm{etr}\left( -\frac{1}{2}{\varvec{W}}_{(1),3(12)}{\varvec{W}}_{(1),(12)(12)}^{-1}{\varvec{W}}_{(1),(12)3} \right) . \end{aligned}$$

Therefore,

$$\begin{aligned} \lambda _{1} = \prod _{i=1}^7 \omega _i. \end{aligned}$$

Then, \(\omega _1\omega _4\), \(\omega _2\omega _5\), and \(\omega _3\omega _6\) each have the form of the LR for \(H_{01}\) in the complete (non-missing) data case. Hence, we can obtain the modified LRT statistics \(-2\rho _{14}\log \omega _1\omega _4\), \(-2\rho _{25}\log \omega _2\omega _5\), and \(-2\rho _{36}\log \omega _3\omega _6\), where

$$\begin{aligned} \rho _{14}&=1-\frac{2p_1^2+9p_1+11}{6N(p_1+3)},\\ \rho _{25}&=1-\frac{2p_2^2+9p_2+11}{6(N_1+N_2)(p_2+3)},\\ \rho _{36}&=1-\frac{2p_3^2+9p_3+11}{6N_1(p_3+3)}. \end{aligned}$$
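Note that \(\rho _{14},\rho _{25},\rho _{36}\) are simply the complete-data factor \(\rho _1\) of Sect. 2.2 evaluated at the sample size and dimension relevant to each step; a sketch (the function names are ours):

```python
def rho1(N, p):
    """Complete-data Bartlett factor of Sect. 2.2."""
    return 1 - (2 * p**2 + 9 * p + 11) / (6 * N * (p + 3))

def step_factors(N1, N2, N3, p1, p2, p3):
    """rho_14, rho_25, rho_36: rho1 evaluated step by step."""
    N = N1 + N2 + N3
    # step 1 uses all N observations of the first p1 variables,
    # step 2 uses N1 + N2 observations of the next p2 variables,
    # step 3 uses N1 observations of the last p3 variables
    return rho1(N, p1), rho1(N1 + N2, p2), rho1(N1, p3)
```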

Thus, we propose a new test statistic given by \(-2\log \tau _{1}\), where

$$\begin{aligned} \tau _{1} = (\omega _1\omega _4)^{\rho _{14}}(\omega _2\omega _5)^{\rho _{25}}(\omega _3\omega _6)^{\rho _{36}}\omega _7 . \end{aligned}$$

In addition, we denote

$$\begin{aligned} \omega _4^*&= \left( \frac{e}{n} \right) ^{\frac{1}{2}np_1}\left| {\varvec{B}}\right| ^{\frac{n}{2}}\textrm{etr}\left( -\frac{1}{2}{\varvec{B}} \right) ,\\ \omega _5^*&= \left( \frac{e}{n_1+n_2} \right) ^{\frac{1}{2}(n_1+n_2)p_2}\left| {\varvec{A}}_{22 \cdot 1}\right| ^{\frac{n_1+n_2}{2}}\textrm{etr}\left( -\frac{1}{2}{\varvec{A}}_{22 \cdot 1} \right) ,\\ \omega _6^*&= \left( \frac{e}{n_1} \right) ^{\frac{1}{2}n_1p_3}\left| {\varvec{W}}_{(1)33 \cdot 12}\right| ^{\frac{n_1}{2}}\textrm{etr}\left( -\frac{1}{2}{\varvec{W}}_{(1)33 \cdot 12} \right) , \end{aligned}$$

where

$$\begin{aligned} n=N-1,\; n_1=N_1-(p_1+p_2)-1,\; n_1+n_2=N_1+N_2-p_1-1. \end{aligned}$$

Subsequently, since \(\omega _4^*\), \(\omega _5^*\), and \(\omega _6^*\) have the form of the LR for \(H_{02}\) in the complete data case, we can propose the test statistic \(-2\log \phi _{1}\), where

$$\begin{aligned} \phi _{1} = \omega _1\omega _2\omega _3(\omega _4^*)^{\rho _4^*}(\omega _5^*)^{\rho _5^*}(\omega _6^*)^{\rho _6^*}\omega _7 \end{aligned}$$

and

$$\begin{aligned} \rho _{4}^*&= 1-\frac{2p_1^2+3p_1-1}{6n(p_1+1)},\; \rho _{5}^*=1-\frac{2p_2^2+3p_2-1}{6(n_1+n_2)(p_2+1)},\\ \rho _{6}^*&=1-\frac{2p_3^2+3p_3-1}{6n_1(p_3+1)}. \end{aligned}$$

Now, we propose the modified LRT statistic \(-2\rho _{L_{1}} \log \lambda _{1}\) via linear interpolation, where

$$\begin{aligned} \rho _{L_{1}}&= \left\{ 1-\frac{(p_1+p_2)N_2+p_1N_3}{p(N_2+N_3)}\right\} \rho _{N_1,1}+\frac{(p_1+p_2)N_2+p_1N_3}{p(N_2+N_3)}\rho _{N,1} , \end{aligned}$$

and

$$\begin{aligned} \rho _{N_1,1}= 1-\frac{2p^2+9p+11}{6N_1(p+3)}, \; \rho _{N,1}= 1-\frac{2p^2+9p+11}{6N(p+3)}. \end{aligned}$$
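The interpolated factor is elementary to compute; a sketch (the function name is ours):

```python
def rho_L1(N1, N2, N3, p1, p2, p3):
    """Interpolated Bartlett factor rho_{L_1} (a sketch)."""
    p, N = p1 + p2 + p3, N1 + N2 + N3
    # interpolation weight: the observed fraction of the missing part
    w = ((p1 + p2) * N2 + p1 * N3) / (p * (N2 + N3))
    rho = lambda n: 1 - (2 * p**2 + 9 * p + 11) / (6 * n * (p + 3))
    return (1 - w) * rho(N1) + w * rho(N)
```

The weight lies in \((0,1)\), so \(\rho _{L_1}\) lies between the complete-data factors evaluated at \(N_1\) and at \(N\).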

2.4 Asymptotic Expansion Approximation

In this subsection, we give an approximate upper percentile of \(-2\log \lambda _{1}\) when the data have a three-step monotone pattern for a one-sample problem. The upper \(100\alpha \) percentile of \(-2\log \lambda _{S_1}\) can be expanded as

$$\begin{aligned} q_{c_1}^*(\alpha )&= \chi _{f_1{;1-\alpha }}^2+\frac{\nu }{N}\chi _{f_1{;1-\alpha }}^2 +\frac{1}{N^2}\chi _{f_1{;1-\alpha }}^2\left\{ \nu ^2+\frac{2\nu }{f_1} +\frac{2\nu }{f_1(f_1+2)}\chi _{f_1{;1-\alpha }}^2 \right\} \\&\quad + O(N^{-3}), \end{aligned}$$

where \(\nu =(2p^2+9p+11)/\{6(p+3)\}\) (Hosoya and Seo [2]). Based on linear interpolation, and letting \(q_{1}^*(\alpha )\) be the upper \(100\alpha \) percentile of \(-2\log \lambda _{1}\), the following can be obtained:

$$\begin{aligned} q_{1}^*(\alpha )=\left\{ 1-\frac{(p_1+p_2)N_2+p_1N_3}{p(N_2+N_3)}\right\} q_{N_1,1}^*(\alpha )+\frac{(p_1+p_2)N_2+p_1N_3}{p(N_2+N_3)}q_{N,1}^*(\alpha ), \end{aligned}$$

where

$$\begin{aligned} q_{N_1,1}^*(\alpha )&= \chi _{f_1{;1-\alpha }}^2+\frac{\nu }{N_1}\chi _{f_1{;1-\alpha }}^2 +\frac{1}{N_1^2}\chi _{f_1{;1-\alpha }}^2\left\{ \nu ^2+\frac{2\nu }{f_1}+\frac{2\nu }{f_1(f_1+2)} \chi _{f_1{;1-\alpha }}^2 \right\} , \\ q_{N,1}^*(\alpha )&= \chi _{f_1{;1-\alpha }}^2+\frac{\nu }{N}\chi _{f_1{;1-\alpha }}^2 +\frac{1}{N^2}\chi _{f_1{;1-\alpha }}^2\left\{ \nu ^2+\frac{2\nu }{f_1} +\frac{2\nu }{f_1(f_1+2)}\chi _{f_1{;1-\alpha }}^2 \right\} . \end{aligned}$$
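The approximation \(q_{1}^*(\alpha )\) is straightforward to evaluate; the sketch below assumes SciPy for the \(\chi ^2\) quantile (the function name is ours):

```python
from scipy.stats import chi2

def q1_star(alpha, N1, N2, N3, p1, p2, p3):
    """Approximate upper 100*alpha percentile of -2 log lambda_1
    via the asymptotic expansion and linear interpolation (a sketch)."""
    p, N = p1 + p2 + p3, N1 + N2 + N3
    f1 = p * (p + 3) // 2                         # degrees of freedom
    nu = (2 * p**2 + 9 * p + 11) / (6 * (p + 3))
    c = chi2.ppf(1 - alpha, f1)                   # chi-square upper percentile

    def q(n):  # the expansion q_{n,1}^*(alpha) at sample size n
        return c + (nu / n) * c + (c / n**2) * (
            nu**2 + 2 * nu / f1 + 2 * nu / (f1 * (f1 + 2)) * c)

    w = ((p1 + p2) * N2 + p1 * N3) / (p * (N2 + N3))
    return (1 - w) * q(N1) + w * q(N)
```

As the sample sizes grow, \(q_{1}^*(\alpha )\) approaches \(\chi _{f_1{;1-\alpha }}^2\) from above.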

2.5 Simulation Studies

We evaluate the accuracy and asymptotic behavior of the \(\chi ^2\) approximations via Monte Carlo simulation (\(10^6\) runs). Let

$$\begin{aligned} \alpha _{1}&= \textrm{Pr}\{ -2\log \lambda _{1}> \chi _{f_1{;1-\alpha }}^2 \},\\ \alpha _{\rho _{L_1}}&= \textrm{Pr}\{ -2\rho _{L_{1}}\log \lambda _{1}> \chi _{f_1{;1-\alpha }}^2 \},\\ \alpha _{\tau _{1}}&= \textrm{Pr}\{ -2\log \tau _{1}> \chi _{f_1{;1-\alpha }}^2 \},\\ \alpha _{\phi _{1}}&= \textrm{Pr}\{ -2\log \phi _{1}> \chi _{f_1{;1-\alpha }}^2 \}, \\ \alpha _{q_{1}^*}&= \textrm{Pr}\{ -2\log \lambda _{1} > q_{1}^*(\alpha ) \}. \end{aligned}$$
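Each of these rates is estimated as the fraction of null-simulated statistics exceeding the critical value; a sketch (assuming NumPy and SciPy; the function name is ours):

```python
import numpy as np
from scipy.stats import chi2

def type1_rate(stats, f, alpha=0.05):
    """Empirical type I error rate: the fraction of statistics, simulated
    under the null hypothesis, that exceed the chi-square critical value."""
    stats = np.asarray(stats)
    return float(np.mean(stats > chi2.ppf(1 - alpha, f)))
```

If the \(\chi ^2\) approximation were exact, the returned rate would be close to \(\alpha \); the tabulated \(\alpha _{1}, \alpha _{\rho _{L_1}}, \alpha _{\tau _{1}}, \alpha _{\phi _{1}}\) are of this form with the corresponding statistics.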

In Tables 2 and 3, we provide the simulated upper \(100\alpha \) percentiles of \(-2\log \lambda _{1}\), \(-2\rho _{L_{1}}\log \lambda _{1}\), \(-2\log \tau _{1}\), and \(-2\log \phi _{1}\), the approximate upper percentiles of \(-2\log \lambda _{1}\) (\(q_{1}^*(\alpha )\)), and the actual type I error rates \(\alpha _{1}, \alpha _{\rho _{L_1}}, \alpha _{\tau _{1}}, \alpha _{\phi _{1}}\), and \(\alpha _{q_{1}^*}\) for \(\alpha =0.05\) and the following cases (Case I),

$$\begin{aligned} (N_1,N_2,N_3)= \left\{ \begin{array}{ll} (t,10,10),&{} \\ (t,20,20),&{} t=10,20,30,40,80,200,400,\\ (t,50,50),&{} \end{array} \right. \end{aligned}$$

where \((p_1,p_2,p_3)\) in Tables 2 and 3 is (3, 3, 3) and (6, 6, 6), respectively.

Similarly, Tables 4 and 5 exhibit the results for the following cases (Case II),

$$\begin{aligned} (N_1,N_2,N_3)= \left\{ \begin{array}{ll} (t,t/2,t/2),&{} \\ (t,t,t),&{} t=10,20,30,40,80,200,400,\\ (t,2t,2t),&{} \end{array} \right. \end{aligned}$$

where \((p_1,p_2,p_3)\) in Tables 4 and 5 is (3, 3, 3) and (6, 6, 6), respectively.

These tables show that the simulated values approach the upper percentile of the \(\chi ^2\) distribution as the sample size increases. In addition, the upper percentile of \(-2\log \phi _1\) is considerably closer to the \(\chi ^2\) percentile than that of \(-2\log \lambda _1\), even for small sample sizes, whereas \(-2\rho _{L_{1}}\log \lambda _{1}\) and \(q_{1}^*(\alpha )\) are not as accurate as \(-2\log \phi _1\).

Table 2 Simulated values for \(-2\log \lambda _{1}, -2\rho _{L_{1}}\log \lambda _{1}, -2\log \tau _{1}\), and \(-2\log \phi _{1}\) and the approximate values for \(-2\log \lambda _{1}\) and the actual type I error rates \(\alpha _{1}, \alpha _{\rho _{L_1}}, \alpha _{\tau _{1}}, \alpha _{\phi _{1}}\), and \(\alpha _{q_{1}^*}\) when \((p_1,p_2,p_3)=(3,3,3)\) for Case I
Table 3 The simulated values for \(-2\log \lambda _{1}, -2\rho _{L_{1}}\log \lambda _{1}, -2\log \tau _{1}\), and \(-2\log \phi _{1}\) and the approximate values for \(-2\log \lambda _{1}\) and the actual type I error rates \(\alpha _{1}, \alpha _{\rho _{L_1}}, \alpha _{\tau _{1}}, \alpha _{\phi _{1}}\), and \(\alpha _{q_{1}^*}\) when \((p_1,p_2,p_3)=(6,6,6)\) for Case I
Table 4 The simulated values for \(-2\log \lambda _{1}, -2\rho _{L_{1}}\log \lambda _{1}, -2\log \tau _{1}\), and \(-2\log \phi _{1}\) and the approximate values for \(-2\log \lambda _{1}\) and the actual type I error rates \(\alpha _{1}, \alpha _{\rho _{L_1}}, \alpha _{\tau _{1}}, \alpha _{\phi _{1}}\), and \(\alpha _{q_{1}^*}\) when \((p_1,p_2,p_3)=(3,3,3)\) for Case II
Table 5 The simulated values for \(-2\log \lambda _{1}, -2\rho _{L_{1}}\log \lambda _{1}, -2\log \tau _{1}\), and \(-2\log \phi _{1}\) and the approximate values for \(-2\log \lambda _{1}\) and the actual type I error rates \(\alpha _{1}, \alpha _{\rho _{L_1}}, \alpha _{\tau _{1}}, \alpha _{\phi _{1}}\), and \(\alpha _{q_{1}^*}\) when \((p_1,p_2,p_3)=(6,6,6)\) for Case II

2.6 Numerical Power

We compare the power of (I) the LR test using \(-2\log \lambda _1\) given in Sect. 2.1, (II) the test using the statistic \(-2\log \tau _1\) given in Sect. 2.3, and (III) the test using the statistic \(-2\log \phi _1\) given in Sect. 2.3. Under several parameter settings, the powers of (I), (II), and (III) are compared using the corresponding simulated upper \(100\alpha \) percentiles under the null distribution, where \(\alpha =0.05\). The simulation was executed \(10^6\) times using normal random vectors. When \({\varvec{\varSigma }}={\varvec{I}}_p\), the powers are computed for various values of \(\delta _i={\varvec{\mu }}_i'{\varvec{\mu }}_i\), \(i=1,2,3\); this follows the power computation for the test of a mean vector in Krishnamoorthy and Pannala [10]. On the other hand, when \({\varvec{\mu }}={\varvec{0}}\), we put \({\varvec{\varSigma }}={\varvec{I}}_p+(1/\sqrt{N_1}){\varvec{\varOmega }}\), where \({\varvec{\varOmega }}=\textrm{diag}(\omega _1,\, \omega _2,\ldots ,\omega _p)\), and the powers are computed for various values of \(\omega _j=\omega \), \(j=1,2,\ldots ,p\). Table 6 shows the power of the three tests when \((p_1,p_2,p_3)=(3,3,3)\) and \((N_1,N_2,N_3)=(20,10,10)\). We note from Table 6 that the three tests have natural power properties. In addition, comparing the powers of tests (I), (II), and (III), test (III) has the highest power, while tests (I) and (II) have almost the same power.

Further, Fig. 1 shows the power plots of (III), the test using the statistic \(-2\log \phi _1\), when (a) \((N_1,N_2,N_3)=(10,10,10)\), (b) \((N_1,N_2,N_3)=(20,10,10)\), and (c) \((N_1,N_2,N_3)=(40,10,10)\) with \((p_1,p_2, p_3)=(3,3,3)\) and \(\delta _2=\delta _3=\omega =0\). Fig. 1 illustrates that the power increases with the sample size. Power studies for other sample sizes and dimensions show similar trends; therefore, those results are not listed here.

Fig. 1

The power plots of the test using statistic \(-2\log \phi _1\): (a) \((N_1,N_2,N_3)=(10,10,10)\), (b) \((N_1,N_2,N_3)=(20,10,10)\), (c) \((N_1,N_2,N_3)=(40,10,10)\)

Table 6 The power comparison of (I), (II), and (III)

3 Multi-Sample Problem

In this section, we consider simultaneous tests of the mean vectors and the covariance matrices under three-step monotone missing data for a multi-sample problem.

3.1 LR with Three-Step Monotone Missing Data

Let \({\varvec{x}}_1^{(\ell )},{\varvec{x}}_2^{(\ell )},\ldots ,{\varvec{x}}_{N_1^{(\ell )}}^{(\ell )}\) be independent p-dimensional sample vectors, \({\varvec{x}}_{(12),N_1^{(\ell )}+1}^{(\ell )}\), \({\varvec{x}}_{(12),N_1^{(\ell )}+2}^{(\ell )},\ldots ,{\varvec{x}}_{(12),N_1^{(\ell )}+N_2^{(\ell )}}^{(\ell )}\) be independent \((p_1+p_2)\)-dimensional sample vectors, and \({\varvec{x}}_{1,N_1^{(\ell )}+N_2^{(\ell )}+1}^{(\ell )},{\varvec{x}}_{1,N_1^{(\ell )}+N_2^{(\ell )}+2}^{(\ell )}\), \(\ldots ,{\varvec{x}}_{1N^{(\ell )}}^{(\ell )}\) be independent \(p_1\)-dimensional sample vectors from the \(\ell \)th population (\(\ell =1,\ldots ,m\)). We suppose that the data are normally distributed as follows:

$$\begin{aligned} {\varvec{x}}_1^{(\ell )},{\varvec{x}}_2^{(\ell )},\ldots ,{\varvec{x}}_{N_1^{(\ell )}}^{(\ell )}&{\mathop {\sim }\limits ^{i.i.d.}} N_p({\varvec{\mu }}^{(\ell )},{\varvec{\varSigma }}^{(\ell )}),\nonumber \\ {\varvec{x}}_{(12),N_1^{(\ell )}+1}^{(\ell )},{\varvec{x}}_{(12),N_1^{(\ell )}+2}^{(\ell )},\ldots ,{\varvec{x}}_{(12),N_1^{(\ell )}+N_2^{(\ell )}}^{(\ell )}&{\mathop {\sim }\limits ^{i.i.d.}} N_{p_1+p_2}({\varvec{\mu }}_{(12)}^{(\ell )},{\varvec{\varSigma }}_{(12)(12)}^{(\ell )}),\nonumber \\ {\varvec{x}}_{1,N_1^{(\ell )}+N_2^{(\ell )}+1}^{(\ell )},{\varvec{x}}_{1,N_1^{(\ell )}+N_2^{(\ell )}+2}^{(\ell )},\ldots ,{\varvec{x}}_{1N^{(\ell )}}^{(\ell )}&{\mathop {\sim }\limits ^{i.i.d.}} N_{p_1}({\varvec{\mu }}_{1}^{(\ell )},{\varvec{\varSigma }}_{11}^{(\ell )}), \end{aligned}$$
(4)

where

$$\begin{aligned} {\varvec{\mu }}^{(\ell )}&= \left( \begin{array}{c} {\varvec{\mu }}_1^{(\ell )} \\ {\varvec{\mu }}_2^{(\ell )} \\ {\varvec{\mu }}_3^{(\ell )} \end{array} \right) =\left( \begin{array}{c} {\varvec{\mu }}_{(12)}^{(\ell )} \\ {\varvec{\mu }}_3^{(\ell )} \end{array} \right) ,\\ {\varvec{\varSigma }}^{(\ell )}&= \left( \begin{array}{cc|c} {\varvec{\varSigma }}_{11}^{(\ell )} &{} {\varvec{\varSigma }}_{12}^{(\ell )} &{} {\varvec{\varSigma }}_{13}^{(\ell )} \\ {\varvec{\varSigma }}_{21}^{(\ell )} &{} {\varvec{\varSigma }}_{22}^{(\ell )} &{} {\varvec{\varSigma }}_{23}^{(\ell )} \\ \hline {\varvec{\varSigma }}_{31}^{(\ell )} &{} {\varvec{\varSigma }}_{32}^{(\ell )} &{} {\varvec{\varSigma }}_{33}^{(\ell )} \end{array} \right) =\left( \begin{array}{c|c} {\varvec{\varSigma }}_{(12)(12)}^{(\ell )} &{} {\varvec{\varSigma }}_{(12)3}^{(\ell )} \\ \hline {\varvec{\varSigma }}_{3(12)}^{(\ell )} &{} {\varvec{\varSigma }}_{33}^{(\ell )} \end{array} \right) , \end{aligned}$$

and

$$\begin{aligned} {\varvec{x}}_j^{(\ell )}&= ({\varvec{x}}_{1j}^{(\ell )'},{\varvec{x}}_{2j}^{(\ell )'},{\varvec{x}}_{3j}^{(\ell )'})' ,j=1,\ldots ,N_1^{(\ell )},\\ {\varvec{x}}_{(12),j}^{(\ell )}&= ({\varvec{x}}_{1j}^{(\ell )'},{\varvec{x}}_{2j}^{(\ell )'})', j=N_1^{(\ell )}+1,\ldots ,N_1^{(\ell )}+N_2^{(\ell )},\\ N^{(\ell )}&= N_1^{(\ell )}+N_2^{(\ell )}+N_3^{(\ell )}, \; p=p_1+p_2+p_3. \end{aligned}$$

Such a dataset has a three-step monotone missing pattern for the \(\ell \)th population in a multi-sample problem, where “\(*\)” indicates a missing observation.

We consider the following hypothesis:

$$\begin{aligned} H_{m0}:{\varvec{\mu }}^{(1)}={\varvec{\mu }}^{(2)}=\cdots ={\varvec{\mu }}^{(m)}, \; {\varvec{\varSigma }}^{(1)} ={\varvec{\varSigma }}^{(2)}=\cdots ={\varvec{\varSigma }}^{(m)} \; \, \mathrm{vs.} \; \, H_{m1}:\textrm{not} \; \, H_{m0} \end{aligned}$$
(5)

To derive the MLEs of the mean vectors and the covariance matrices, we consider the following transformation matrix \({\varvec{Z}}^{(\ell )}\):

$$\begin{aligned} {\varvec{Z}}^{(\ell )} = \left( \begin{array}{c|c} \begin{matrix} {\varvec{I}}_{p_1} &{} {\varvec{O}} \\ -{\varvec{\varSigma }}_{21}^{(\ell )}{\varvec{\varSigma }}_{11}^{(\ell )-1} &{} {\varvec{I}}_{p_2} \end{matrix} &{} {\varvec{O}} \\ \hline \\ -{\varvec{\varSigma }}_{3(12)}^{(\ell )}{\varvec{\varSigma }}_{(12)(12)}^{(\ell )-1} &{} {\varvec{I}}_{p_3} \end{array} \right) . \end{aligned}$$

The transformed vector \({\varvec{y}}_j^{(\ell )}=({\varvec{y}}_{1j}^{(\ell )'},{\varvec{y}}_{2j}^{(\ell )'},{\varvec{y}}_{3j}^{(\ell )'})'\) is

$$\begin{aligned} {\varvec{y}}_j^{(\ell )}&= {\varvec{Z}}^{(\ell )}{\varvec{x}}_j^{(\ell )}\\&= \left( \begin{array}{c} {\varvec{x}}_{1j}^{(\ell )} \\ -{\varvec{\varSigma }}_{21}^{(\ell )}{\varvec{\varSigma }}_{11}^{(\ell )-1}{\varvec{x}}_{1j}^{(\ell )}+{\varvec{x}}_{2j}^{(\ell )} \\ -{\varvec{\varSigma }}_{3(12)}^{(\ell )}{\varvec{\varSigma }}_{(12)(12)}^{(\ell )-1}\left( \begin{array}{c} {\varvec{x}}_{1j}^{(\ell )} \\ {\varvec{x}}_{2j}^{(\ell )} \end{array}\right) +{\varvec{x}}_{3j}^{(\ell )} \end{array} \right) . \end{aligned}$$
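The purpose of \({\varvec{Z}}^{(\ell )}\) is to decorrelate the three blocks: \({\varvec{Z}}^{(\ell )}{\varvec{\varSigma }}^{(\ell )}{\varvec{Z}}^{(\ell )'}\) is block-diagonal with blocks \({\varvec{\varSigma }}_{11}^{(\ell )}\), \({\varvec{\varSigma }}_{22\cdot 1}^{(\ell )}\), and \({\varvec{\varSigma }}_{33\cdot 12}^{(\ell )}\). This can be checked numerically (a sketch; the helper name is ours):

```python
import numpy as np

def transform_blocks(Sigma, p1, p2):
    """Build the block transformation Z of Sect. 3.1 and return (Z, Z Sigma Z').

    Z Sigma Z' should be block-diagonal: Sigma_11, the Schur complement
    Sigma_{22.1}, and the Schur complement Sigma_{33.12}.
    """
    p = Sigma.shape[0]
    q = p1 + p2
    Z = np.eye(p)
    Z[p1:q, :p1] = -Sigma[p1:q, :p1] @ np.linalg.inv(Sigma[:p1, :p1])
    Z[q:, :q] = -Sigma[q:, :q] @ np.linalg.inv(Sigma[:q, :q])
    return Z, Z @ Sigma @ Z.T
```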

The transformed parameters \(({\varvec{\eta }}^{(\ell )},{\varvec{\varDelta }}^{(\ell )})\) are defined as

$$\begin{aligned} {\varvec{\eta }}^{(\ell )}&= \left( \begin{array}{c} {\varvec{\eta }}_1^{(\ell )} \\ {\varvec{\eta }}_2^{(\ell )} \\ {\varvec{\eta }}_3^{(\ell )} \end{array} \right) =\left( \begin{array}{c} {\varvec{\mu }}_1^{(\ell )} \\ -{\varvec{\varSigma }}_{21}^{(\ell )}{\varvec{\varSigma }}_{11}^{(\ell )-1}{\varvec{\mu }}_1^{(\ell )}+{\varvec{\mu }}_2^{(\ell )} \\ -{\varvec{\varSigma }}_{3(12)}^{(\ell )}{\varvec{\varSigma }}_{(12)(12)}^{(\ell )-1}\left( \begin{array}{c} {\varvec{\mu }}_1^{(\ell )} \\ {\varvec{\mu }}_2^{(\ell )} \end{array} \right) + {\varvec{\mu }}_3^{(\ell )} \end{array} \right) ,\\ {\varvec{\varDelta }}^{(\ell )}&= \left( \begin{array}{cc|c} {\varvec{\varDelta }}_{11}^{(\ell )} &{} {\varvec{\varDelta }}_{12}^{(\ell )} &{} {\varvec{\varDelta }}_{13}^{(\ell )} \\ {\varvec{\varDelta }}_{21}^{(\ell )} &{} {\varvec{\varDelta }}_{22}^{(\ell )} &{} {\varvec{\varDelta }}_{23}^{(\ell )} \\ \hline {\varvec{\varDelta }}_{31}^{(\ell )} &{} {\varvec{\varDelta }}_{32}^{(\ell )} &{} {\varvec{\varDelta }}_{33}^{(\ell )} \end{array} \right) =\left( \begin{array}{c|c} {\varvec{\varDelta }}_{(12)(12)}^{(\ell )} &{} {\varvec{\varDelta }}_{(12)3}^{(\ell )} \\ \hline {\varvec{\varDelta }}_{3(12)}^{(\ell )} &{} {\varvec{\varDelta }}_{33}^{(\ell )} \end{array} \right) , \end{aligned}$$

where

$$\begin{aligned} {\varvec{\varDelta }}_{11}^{(\ell )}&= {\varvec{\varSigma }}_{11}^{(\ell )},\\ {\varvec{\varDelta }}_{12}^{(\ell )}&= {\varvec{\varDelta }}_{21}^{(\ell )'}={\varvec{\varSigma }}_{11}^{(\ell )-1}{\varvec{\varSigma }}_{12}^{(\ell )},\\ {\varvec{\varDelta }}_{22}^{(\ell )}&= {\varvec{\varSigma }}_{22\cdot 1}^{(\ell )}={\varvec{\varSigma }}_{22}^{(\ell )}-{\varvec{\varSigma }}_{21}^{(\ell )}{\varvec{\varSigma }}_{11}^{(\ell )-1}{\varvec{\varSigma }}_{12}^{(\ell )},\\ {\varvec{\varDelta }}_{(12)3}^{(\ell )}&= {\varvec{\varDelta }}_{3(12)}^{(\ell )'}={\varvec{\varSigma }}_{(12)(12)}^{(\ell )-1}{\varvec{\varSigma }}_{(12)3}^{(\ell )},\\ {\varvec{\varDelta }}_{33}^{(\ell )}&= {\varvec{\varSigma }}_{33\cdot 12}^{(\ell )}={\varvec{\varSigma }}_{33}^{(\ell )}-{\varvec{\varSigma }}_{3(12)}^{(\ell )}{\varvec{\varSigma }}_{(12)(12)}^{(\ell )-1}{\varvec{\varSigma }}_{(12)3}^{(\ell )}. \end{aligned}$$

We note that the pair \(({\varvec{\mu }}^{(\ell )},{\varvec{\varSigma }}^{(\ell )})\) is in one-to-one correspondence with \(({\varvec{\eta }}^{(\ell )},{\varvec{\varDelta }}^{(\ell )})\). Under \(H_1\), the MLEs \((\widehat{{\varvec{\eta }}}^{(\ell )},\widehat{{\varvec{\varDelta }}}^{(\ell )})\) of \(({\varvec{\eta }}^{(\ell )},{\varvec{\varDelta }}^{(\ell )})\) are given by

$$\begin{aligned} \widehat{{\varvec{\eta }}}_1^{(\ell )}&= \frac{1}{N^{(\ell )}}(N_1^{(\ell )}\overline{{\varvec{x}}}_{(1)1}^{(\ell )} +N_2^{(\ell )}\overline{{\varvec{x}}}_{(2)1}^{(\ell )}+N_3^{(\ell )}\overline{{\varvec{x}}}_{(3)}^{(\ell )}),\nonumber \\ \widehat{{\varvec{\eta }}}_2^{(\ell )}&= \frac{1}{N_1^{(\ell )}+N_2^{(\ell )}}\{ N_1^{(\ell )}\overline{{\varvec{x}}}_{(1)2}^{(\ell )} +N_2^{(\ell )}\overline{{\varvec{x}}}_{(2)2}^{(\ell )}-\widehat{{\varvec{\varDelta }}}_{21}^{(\ell )}(N_1^{(\ell )}\overline{{\varvec{x}}}_{(1)1}^{(\ell )} +N_2^{(\ell )}\overline{{\varvec{x}}}_{(2)1}^{(\ell )}) \},\nonumber \\ \widehat{{\varvec{\eta }}}_3^{(\ell )}&= \overline{{\varvec{x}}}_{(1)3}^{(\ell )}-\widehat{{\varvec{\varDelta }}}_{3(12)}^{(\ell )} \overline{{\varvec{x}}}_{(1)(12)}^{(\ell )}, \nonumber \\ \widehat{{\varvec{\varDelta }}}_{11}^{(\ell )}&= \frac{1}{N^{(\ell )}}({\varvec{W}}_{(1)11}^{(\ell )}+{\varvec{W}}_{(2)11}^{(\ell )}+{\varvec{W}}_{(3)}^{(\ell )}), \end{aligned}$$
(6)
$$\begin{aligned} \widehat{{\varvec{\varDelta }}}_{22}^{(\ell )}&= \frac{1}{N_1^{(\ell )}+N_2^{(\ell )}}({\varvec{W}}_{(1),(12)(12)}^{(\ell )}+{\varvec{W}}_{(2)}^{(\ell )})_{22\cdot 1}, \; \widehat{{\varvec{\varDelta }}}_{33}^{(\ell )} = \frac{1}{N_1^{(\ell )}}({\varvec{W}}_{(1)33\cdot 12}^{(\ell )}),\nonumber \\ \widehat{{\varvec{\varDelta }}}_{12}^{(\ell )}&= \widehat{{\varvec{\varDelta }}}_{21}^{(\ell )'} = ({\varvec{W}}_{(1)11}^{(\ell )} +{\varvec{W}}_{(2)11}^{(\ell )}) ^{-1}({\varvec{W}}_{(1)12}^{(\ell )}+{\varvec{W}}_{(2)12}^{(\ell )}),\nonumber \\ \widehat{{\varvec{\varDelta }}}_{(12)3}^{(\ell )}&= \widehat{{\varvec{\varDelta }}}_{3(12)}^{(\ell )'} = ({\varvec{W}}_{(1),(12)(12)}^{(\ell )}) ^{-1}{\varvec{W}}_{(1),(12)3}^{(\ell )}, \end{aligned}$$
(7)

where

$$\begin{aligned} \overline{{\varvec{x}}}_{(1)}^{(\ell )}&= \left( \begin{array}{c} \overline{{\varvec{x}}}_{(1)1}^{(\ell )} \\ \overline{{\varvec{x}}}_{(1)2}^{(\ell )} \\ \overline{{\varvec{x}}}_{(1)3}^{(\ell )} \end{array} \right) = \left( \begin{array}{c} \overline{{\varvec{x}}}_{(1)(12)}^{(\ell )} \\ \overline{{\varvec{x}}}_{(1)3}^{(\ell )} \end{array} \right) ,\\ \overline{{\varvec{x}}}_{(1)1}^{(\ell )}&=\frac{1}{N_1^{(\ell )}}\sum _{j=1}^{N_1^{(\ell )}}{\varvec{x}}_{1j}^{(\ell )},\ \overline{{\varvec{x}}}_{(1)2}^{(\ell )}=\frac{1}{N_1^{(\ell )}}\sum _{j=1}^{N_1^{(\ell )}}{\varvec{x}}_{2j}^{(\ell )},\ \overline{{\varvec{x}}}_{(1)3}^{(\ell )}=\frac{1}{N_1^{(\ell )}}\sum _{j=1}^{N_1^{(\ell )}}{\varvec{x}}_{3j}^{(\ell )},\\ \overline{{\varvec{x}}}_{(2)}^{(\ell )}&= \left( \begin{array}{c} \overline{{\varvec{x}}}_{(2)1}^{(\ell )} \\ \overline{{\varvec{x}}}_{(2)2}^{(\ell )} \end{array} \right) ,\; \overline{{\varvec{x}}}_{(2)1}^{(\ell )}=\frac{1}{N_2^{(\ell )}}\sum _{j=N_1^{(\ell )}+1}^{N_1^{(\ell )}+N_2^{(\ell )}}{\varvec{x}}_{1j}^{(\ell )},\ \overline{{\varvec{x}}}_{(2)2}^{(\ell )}=\frac{1}{N_2^{(\ell )}}\sum _{j=N_1^{(\ell )}+1}^{N_1^{(\ell )}+N_2^{(\ell )}}{\varvec{x}}_{2j}^{(\ell )},\\ \overline{{\varvec{x}}}_{(3)}^{(\ell )}&= \frac{1}{N_3^{(\ell )}}\sum _{j=N_1^{(\ell )}+N_2^{(\ell )}+1}^{N^{(\ell )}}{\varvec{x}}_{1j}^{(\ell )},\\ {\varvec{W}}_{(1)}^{(\ell )}&= \sum _{j=1}^{N_1^{(\ell )}}({\varvec{x}}_j^{(\ell )}-\overline{{\varvec{x}}}_{(1)}^{(\ell )})({\varvec{x}}_j^{(\ell )}-\overline{{\varvec{x}}}_{(1)}^{(\ell )})'\\&= \left( \begin{array}{cc|c} {\varvec{W}}_{(1)11}^{(\ell )} &{} {\varvec{W}}_{(1)12}^{(\ell )} &{} {\varvec{W}}_{(1)13}^{(\ell )} \\ {\varvec{W}}_{(1)21}^{(\ell )} &{} {\varvec{W}}_{(1)22}^{(\ell )} &{} {\varvec{W}}_{(1)23}^{(\ell )} \\ \hline {\varvec{W}}_{(1)31}^{(\ell )} &{} {\varvec{W}}_{(1)32}^{(\ell )} &{} {\varvec{W}}_{(1)33}^{(\ell )} \end{array} \right) = \left( \begin{array}{c|c} {\varvec{W}}_{(1),(12)(12)}^{(\ell )} &{} 
{\varvec{W}}_{(1),(12)3}^{(\ell )} \\ \hline {\varvec{W}}_{(1),3(12)}^{(\ell )} &{} {\varvec{W}}_{(1)33}^{(\ell )} \end{array} \right) , \\ {\varvec{W}}_{(2)}^{(\ell )}&= \sum _{j=N_1^{(\ell )}+1}^{N_1^{(\ell )}+N_2^{(\ell )}}\left( \begin{array}{c} {\varvec{x}}_{1j}^{(\ell )}-\overline{{\varvec{x}}}_{(2)1}^{(\ell )} \\ {\varvec{x}}_{2j}^{(\ell )}-\overline{{\varvec{x}}}_{(2)2}^{(\ell )} \end{array} \right) \left( \begin{array}{c} {\varvec{x}}_{1j}^{(\ell )}-\overline{{\varvec{x}}}_{(2)1}^{(\ell )} \\ {\varvec{x}}_{2j}^{(\ell )}-\overline{{\varvec{x}}}_{(2)2}^{(\ell )} \end{array} \right) '\\&\quad +\frac{N_1^{(\ell )}N_2^{(\ell )}}{N_1^{(\ell )}+N_2^{(\ell )}}\left( \begin{array}{c} \overline{{\varvec{x}}}_{(1)1}^{(\ell )}-\overline{{\varvec{x}}}_{(2)1}^{(\ell )} \\ \overline{{\varvec{x}}}_{(1)2}^{(\ell )}-\overline{{\varvec{x}}}_{(2)2}^{(\ell )} \end{array} \right) \left( \begin{array}{c} \overline{{\varvec{x}}}_{(1)1}^{(\ell )}-\overline{{\varvec{x}}}_{(2)1}^{(\ell )} \\ \overline{{\varvec{x}}}_{(1)2}^{(\ell )}-\overline{{\varvec{x}}}_{(2)2}^{(\ell )} \end{array} \right) '\\&= \left( \begin{array}{cc} {\varvec{W}}_{(2)11}^{(\ell )} &{} {\varvec{W}}_{(2)12}^{(\ell )} \\ {\varvec{W}}_{(2)21}^{(\ell )} &{} {\varvec{W}}_{(2)22}^{(\ell )} \end{array} \right) ,\\ {\varvec{W}}_{(3)}^{(\ell )}&= \sum _{j=N_1^{(\ell )}+N_2^{(\ell )}+1}^{N^{(\ell )}}({\varvec{x}}_{1j}^{(\ell )}-\overline{{\varvec{x}}}_{(3)}^{(\ell )})({\varvec{x}}_{1j}^{(\ell )}-\overline{{\varvec{x}}}_{(3)}^{(\ell )})'\\&\quad + \frac{(N_1^{(\ell )}+N_2^{(\ell )})N_3^{(\ell )}}{N^{(\ell )}} \left( \overline{{\varvec{x}}}_{(3)}^{(\ell )}-\frac{1}{N_1^{(\ell )}+N_2^{(\ell )}}(N_1^{(\ell )}\overline{{\varvec{x}}}_{(1)1}^{(\ell )}+N_2^{(\ell )}\overline{{\varvec{x}}}_{(2)1}^{(\ell )})\right) \\&\quad \times \left( \overline{{\varvec{x}}}_{(3)}^{(\ell )}-\frac{1}{N_1^{(\ell )}+N_2^{(\ell )}}(N_1^{(\ell )}\overline{{\varvec{x}}}_{(1)1}^{(\ell )}+N_2^{(\ell )}\overline{{\varvec{x}}}_{(2)1}^{(\ell )})\right) '. 
\end{aligned}$$

On the other hand, under \(H_{m0}\), we define the MLEs of \({\varvec{\eta }}(={\varvec{\eta }}^{(1)}=\cdots ={\varvec{\eta }}^{(m)})\) and \({\varvec{\varDelta }}(={\varvec{\varDelta }}^{(1)}=\cdots ={\varvec{\varDelta }}^{(m)})\) as \((\widetilde{{\varvec{\eta }}},\widetilde{{\varvec{\varDelta }}})\). We then obtain

$$\begin{aligned} \widetilde{{\varvec{\eta }}}_1&= \frac{1}{N}\sum _{\ell =1}^m (N_1^{(\ell )}\overline{{\varvec{x}}}_{(1)1}^{(\ell )} +N_2^{(\ell )}\overline{{\varvec{x}}}_{(2)1}^{(\ell )}+N_3^{(\ell )}\overline{{\varvec{x}}}_{(3)}^{(\ell )}),\nonumber \\ \widetilde{{\varvec{\eta }}}_2&= \frac{1}{N_1+N_2} \sum _{\ell =1}^m \{ N_1^{(\ell )}\overline{{\varvec{x}}}_{(1)2}^{(\ell )} +N_2^{(\ell )}\overline{{\varvec{x}}}_{(2)2}^{(\ell )}-\widetilde{{\varvec{\varDelta }}}_{21}(N_1^{(\ell )}\overline{{\varvec{x}}}_{(1)1}^{(\ell )} +N_2^{(\ell )}\overline{{\varvec{x}}}_{(2)1}^{(\ell )}) \},\nonumber \\ \widetilde{{\varvec{\eta }}}_3&= \frac{1}{N_1} \sum _{\ell =1}^m N_1^{(\ell )} \{\overline{{\varvec{x}}}_{(1)3}^{(\ell )}-\widetilde{{\varvec{\varDelta }}}_{3(12)}\overline{{\varvec{x}}}_{(1)(12)}^{(\ell )} \}, \nonumber \\ \widetilde{{\varvec{\varDelta }}}_{11}&= \frac{1}{N}\sum _{\ell =1}^m \sum _{j=1}^{N^{(\ell )}}({\varvec{x}}_{1j}^{(\ell )} -\widetilde{{\varvec{\eta }}}_1)({\varvec{x}}_{1j}^{(\ell )}-\widetilde{{\varvec{\eta }}}_1)', \end{aligned}$$
(8)
$$\begin{aligned} \widetilde{{\varvec{\varDelta }}}_{22}&= \frac{1}{N_1+N_2}\sum _{\ell =1}^m \sum _{j=1}^{N_1^{(\ell )}+N_2^{(\ell )}} (-\widetilde{{\varvec{\varDelta }}}_{21}{\varvec{x}}_{1j}^{(\ell )}+{\varvec{x}}_{2j}^{(\ell )}-\widetilde{{\varvec{\eta }}}_2) (-\widetilde{{\varvec{\varDelta }}}_{21}{\varvec{x}}_{1j}^{(\ell )}+{\varvec{x}}_{2j}^{(\ell )}-\widetilde{{\varvec{\eta }}}_2)', \end{aligned}$$
(9)
$$\begin{aligned} \widetilde{{\varvec{\varDelta }}}_{33}&= \frac{1}{N_1}\sum _{\ell =1}^m \sum _{j=1}^{N_1^{(\ell )}}(-\widetilde{{\varvec{\varDelta }}}_{3(12)} {\varvec{x}}_{(12)j}^{(\ell )}+{\varvec{x}}_{3j}^{(\ell )}-\widetilde{{\varvec{\eta }}}_3)(-\widetilde{{\varvec{\varDelta }}}_{3(12)}{\varvec{x}}_{(12)j}^{(\ell )} +{\varvec{x}}_{3j}^{(\ell )}-\widetilde{{\varvec{\eta }}}_3)',\nonumber \\ \widetilde{{\varvec{\varDelta }}}_{21}&= \widetilde{{\varvec{\varDelta }}}_{12}' \nonumber \\&= \sum _{\ell =1}^{m} \left[ \sum _{j=1}^{N_1^{(\ell )}+N_2^{(\ell )}}{\varvec{x}}_{2j}^{(\ell )}{\varvec{x}}_{1j}^{(\ell )'} -\frac{1}{N_1+N_2} \left\{ \sum _{k=1}^m (N_1^{(k)}\overline{{\varvec{x}}}_{(1)2}^{(k)} +N_2^{(k)}\overline{{\varvec{x}}}_{(2)2}^{(k)})\right\} \right. \nonumber \\&\quad \times \left. (N_1^{(\ell )}\overline{{\varvec{x}}}_{(1)1}^{(\ell )}+N_2^{(\ell )}\overline{{\varvec{x}}}_{(2)1}^{(\ell )})' \right] \sum _{\ell =1}^{m} \left[ \sum _{j=1}^{N_1^{(\ell )}+N_2^{(\ell )}}{\varvec{x}}_{1j}^{(\ell )}{\varvec{x}}_{1j}^{(\ell )'}\right. \nonumber \\&\quad \left. 
-\frac{1}{N_1+N_2} \left\{ \sum _{k=1}^m (N_1^{(k)}\overline{{\varvec{x}}}_{(1)1}^{(k)}+N_2^{(k)}\overline{{\varvec{x}}}_{(2)1}^{(k)}) \right\} (N_1^{(\ell )}\overline{{\varvec{x}}}_{(1)1}^{(\ell )}+N_2^{(\ell )} \overline{{\varvec{x}}}_{(2)1}^{(\ell )})' \right] ^{-1}, \nonumber \\ \widetilde{{\varvec{\varDelta }}}_{3(12)}&= \widetilde{{\varvec{\varDelta }}}_{(12)3}' \nonumber \\&= \sum _{\ell =1}^{m} \left\{ \sum _{j=1}^{N_1^{(\ell )}}{\varvec{x}}_{3j}^{(\ell )}{\varvec{x}}_{(12),j}^{(\ell )'} -N_1^{(\ell )}\left( \frac{1}{N_1}\sum _{k=1}^m N_1^{(k)}\overline{{\varvec{x}}}_{(1)3}^{(k)} \right) \overline{{\varvec{x}}}_{(1),(12)}^{(\ell )'} \right\} \nonumber \\&\quad \times \sum _{\ell =1}^{m} \left\{ \sum _{j=1}^{N_1^{(\ell )}}{\varvec{x}}_{(12),j}^{(\ell )}{\varvec{x}}_{(12),j}^{(\ell )'} -N_1^{(\ell )}\left( \frac{1}{N_1}\sum _{k=1}^m N_1^{(k)}\overline{{\varvec{x}}}_{(1),(12)}^{(k)} \right) \overline{{\varvec{x}}}_{(1),(12)}^{(\ell )'} \right\} ^{-1}, \end{aligned}$$
(10)

where \(N=\sum _{\ell =1}^{m} N^{(\ell )}, \; N_1=\sum _{\ell =1}^{m} N_1^{(\ell )}, \; N_2=\sum _{\ell =1}^{m} N_2^{(\ell )}\). From the preceding MLEs, we obtain the following theorem.

Theorem 2

Suppose that the datasets have a three-step monotone pattern of missing observations and are normally distributed as in (4). Then, the LR for testing (5) is given by

$$\begin{aligned} \lambda _{m}&= \frac{\prod _{\ell =1}^m |\widehat{{\varvec{\varDelta }}}_{11}^{(\ell )}|^{\frac{N^{(\ell )}}{2}} |\widehat{{\varvec{\varDelta }}}_{22}^{(\ell )}|^{\frac{N_1^{(\ell )}+N_2^{(\ell )}}{2}} |\widehat{{\varvec{\varDelta }}}_{33}^{(\ell )}|^{\frac{N_1^{(\ell )}}{2}} }{|\widetilde{{\varvec{\varDelta }}}_{11}|^{\frac{N}{2}} |\widetilde{{\varvec{\varDelta }}}_{22}|^{\frac{N_1+N_2}{2}} |\widetilde{{\varvec{\varDelta }}}_{33}|^{\frac{N_1}{2}} }, \end{aligned}$$

where \(\widehat{{\varvec{\varDelta }}}_{ii}^{(\ell )}\) and \(\widetilde{{\varvec{\varDelta }}}_{ii}\) \((i=1,2,3)\) are given in (6)–(10).

Thus, we obtain the LRT statistic \(-2\log \lambda _{m}\), which is asymptotically distributed as a \(\chi ^2\) distribution with \(f_m=p(p+3)(m-1)/2\) degrees of freedom. However, this approximation is known to be poor for small samples. Therefore, drawing on the complete-data results described in Sect. 3.2, we propose test statistics that improve the accuracy of the \(\chi ^2\) approximation.

3.2 Complete Data

In this subsection, we discuss the LRT statistic in the case of complete data and the modified LRT statistics with Bartlett correction. The results will be used to propose the test statistics in the next subsection. First, we consider a simultaneous test with complete data as follows:

$$\begin{aligned} H_{03}:{\varvec{\mu }}^{(1)}={\varvec{\mu }}^{(2)}=\cdots ={\varvec{\mu }}^{(m)}, \; {\varvec{\varSigma }}^{(1)}={\varvec{\varSigma }}^{(2)}=\cdots ={\varvec{\varSigma }}^{(m)} \; \, \mathrm{vs.} \; \, H_{13}:\textrm{not} \; \, H_{03} \end{aligned}$$

Let \({\varvec{x}}_1^{(\ell )},{\varvec{x}}_2^{(\ell )},\ldots ,{\varvec{x}}_{N^{(\ell )}}^{(\ell )}\) be independently distributed as \(N_p({\varvec{\mu }}^{(\ell )},{\varvec{\varSigma }}^{(\ell )})\), and let \(\lambda _{S_m}\) be the LR for the complete data. Then, it is given by

$$\begin{aligned} \lambda _{S_m} = \frac{\prod _{\ell =1}^m \left| \frac{1}{N^{(\ell )}}{\varvec{V}}^{(\ell )} \right| ^{\frac{1}{2}N^{(\ell )}}}{\left| \frac{1}{N}({\varvec{V}}+{\varvec{B}}) \right| ^{\frac{1}{2}N}}, \end{aligned}$$

where

$$\begin{aligned} {\varvec{V}}^{(\ell )}&= \sum _{j=1}^{N^{(\ell )}} ({\varvec{x}}_j^{(\ell )}-\overline{{\varvec{x}}}^{(\ell )})({\varvec{x}}_j^{(\ell )}-\overline{{\varvec{x}}}^{(\ell )})', \; \, {\varvec{V}} = \sum _{\ell =1}^m {\varvec{V}}^{(\ell )},\\ {\varvec{B}}&= \sum _{\ell =1}^m N^{(\ell )}(\overline{{\varvec{x}}}^{(\ell )}-\overline{{\varvec{x}}})(\overline{{\varvec{x}}}^{(\ell )}-\overline{{\varvec{x}}})',\\ \overline{{\varvec{x}}}^{(\ell )}&= \frac{1}{N^{(\ell )}} \sum _{j=1}^{N^{(\ell )}} {\varvec{x}}_j^{(\ell )}, \; \, \overline{{\varvec{x}}} = \frac{1}{N} \sum _{\ell =1}^m N^{(\ell )}\overline{{\varvec{x}}}^{(\ell )}, \; N = \sum _{\ell =1}^m N^{(\ell )}. \end{aligned}$$

Furthermore, the modified LRT statistic with Bartlett correction can be given by \(-2\rho _3\log \lambda _{S_m}\) (Muirhead [6, p. 513]), where

$$\begin{aligned} \rho _3 = 1-\frac{2p^2+9p+11}{6N(p+3)(m-1)}\left( \sum _{\ell =1}^{m}\frac{N}{N^{(\ell )}}-1 \right) . \end{aligned}$$

Next, we consider the covariance test in the case of complete data as follows:

$$\begin{aligned} H_{04}:{\varvec{\varSigma }}^{(1)}={\varvec{\varSigma }}^{(2)}=\cdots ={\varvec{\varSigma }}^{(m)} \; \, \mathrm{vs.} \; \, H_{14}:\textrm{not} \; \, H_{04} \end{aligned}$$

The modified LRT statistic \(-2\rho _4\log \lambda _{V_m}\) was provided by Muirhead [6, p. 308], where

$$\begin{aligned} \rho _4&= 1-\frac{2p^2+3p-1}{6(p+1)(m-1)}\left( \sum _{\ell =1}^{m}\frac{1}{n^{(\ell )}}-\frac{1}{n} \right) , \; \lambda _{V_m} = \frac{\prod _{\ell =1}^{m} \left| \frac{1}{n^{(\ell )}}{\varvec{V}}^{(\ell )} \right| ^{\frac{n^{(\ell )}}{2}} }{\left| \frac{1}{n}{\varvec{V}} \right| ^{\frac{n}{2}} }, \end{aligned}$$

and

$$\begin{aligned} n^{(\ell )}=N^{(\ell )}-1, \quad n=\sum _{\ell =1}^{m} n^{(\ell )}. \end{aligned}$$
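The factor \(\rho _4\) can be evaluated in the same way; again, the dimension and sample sizes below are hypothetical illustrations.

```python
# Bartlett correction factor rho_4 for the complete-data covariance test
# (Muirhead [6, p. 308]); the inputs below are hypothetical.
def rho4(p, N_l):
    """p: dimension; N_l: list of per-group sample sizes N^(l)."""
    m = len(N_l)
    n_l = [Nl - 1 for Nl in N_l]          # n^(l) = N^(l) - 1
    n = sum(n_l)
    c = sum(1 / nl for nl in n_l) - 1 / n  # sum_l 1/n^(l) - 1/n
    return 1 - (2 * p**2 + 3 * p - 1) / (6 * (p + 1) * (m - 1)) * c

print(round(rho4(5, [20, 20]), 6))  # 0.859649
```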

3.3 Test Statistics

Using the LR of the simultaneous test with complete data from the previous subsection, we propose test statistics by decomposing the LR \(\lambda _m\) with three-step monotone missing data. First, the LR can be decomposed as \(\lambda _m=\xi _1\xi _2\xi _3\), where

$$\begin{aligned} \xi _1 = \frac{\prod _{\ell =1}^m \left| \widehat{{\varvec{\varDelta }}}_{11}^{(\ell )} \right| ^{\frac{N^{(\ell )}}{2}}}{\left| \widetilde{{\varvec{\varDelta }}}_{11} \right| ^{\frac{N}{2}}}, \;\, \xi _2 = \frac{\prod _{\ell =1}^m \left| \widehat{{\varvec{\varDelta }}}_{22}^{(\ell )} \right| ^{\frac{N_1^{(\ell )}+N_2^{(\ell )}}{2}}}{\left| \widetilde{{\varvec{\varDelta }}}_{22} \right| ^{\frac{N_1+N_2}{2}}}, \;\, \xi _3 = \frac{\prod _{\ell =1}^m \left| \widehat{{\varvec{\varDelta }}}_{33}^{(\ell )} \right| ^{\frac{N_1^{(\ell )}}{2}}}{\left| \widetilde{{\varvec{\varDelta }}}_{33} \right| ^{\frac{N_1}{2}}}. \end{aligned}$$

Because \(\xi _1\) has the same form as the LR for \(H_{03}\) in the case of complete data, we apply the modified LRT statistic \(-2\rho _{\xi _1}\log \xi _1\), where

$$\begin{aligned} \rho _{\xi _1} = 1-\frac{2p_1^2+9p_1+11}{6N(p_1+3)(m-1)}\left( \sum _{\ell =1}^{m}\frac{N}{N^{(\ell )}}-1 \right) . \end{aligned}$$

Next, \(\xi _2\) can be decomposed as \(\xi _2=\xi _2^{\dagger }\xi _2^{\ddagger }\), where

$$\begin{aligned} \xi _2^{\dagger }&= \frac{\prod _{\ell =1}^m \left| \widehat{{\varvec{\varDelta }}}_{22}^{(\ell )} \right| ^{\frac{N_1^{(\ell )}+N_2^{(\ell )}}{2}}}{\left| \frac{1}{N_1+N_2}({\varvec{V}}_{p_2}+{\varvec{B}}_{p_2}) \right| ^{\frac{N_1+N_2}{2}}}, \quad \xi _2^{\ddagger } = \frac{\left| \frac{1}{N_1+N_2}({\varvec{V}}_{p_2}+{\varvec{B}}_{p_2}) \right| ^{\frac{N_1+N_2}{2}}}{\left| \widetilde{{\varvec{\varDelta }}}_{22} \right| ^{\frac{N_1+N_2}{2}}}, \end{aligned}$$

and

$$\begin{aligned} {\varvec{V}}_{p_2}^{(\ell )}&= \sum _{j=1}^{N_1^{(\ell )}+N_2^{(\ell )}}({\varvec{x}}_{2j}^{(\ell )}-\widehat{{\varvec{\varDelta }}}_{21}^{(\ell )}{\varvec{x}}_{1j}^{(\ell )}-\widehat{{\varvec{\eta }}}_2^{(\ell )})({\varvec{x}}_{2j}^{(\ell )}-\widehat{{\varvec{\varDelta }}}_{21}^{(\ell )}{\varvec{x}}_{1j}^{(\ell )}-\widehat{{\varvec{\eta }}}_2^{(\ell )})',\\ {\varvec{V}}_{p_2}&= \sum _{\ell =1}^m {\varvec{V}}_{p_2}^{(\ell )}, \; {\varvec{B}}_{p_2} = \sum _{\ell =1}^m (N_1^{(\ell )}+N_2^{(\ell )})(\widehat{{\varvec{\eta }}}_2^{(\ell )}-\widetilde{{\varvec{\eta }}}_2)(\widehat{{\varvec{\eta }}}_2^{(\ell )}-\widetilde{{\varvec{\eta }}}_2)'. \end{aligned}$$

Since \(\xi _2^{\dagger }\) has the same form as the LR for \(H_{03}\) in the case of complete data, we apply the modified LRT statistic \(-2\rho _{\xi _2}\log \xi _2^{\dagger }\), where

$$\begin{aligned} \rho _{\xi _2} = 1-\frac{2p_2^2+9p_2+11}{6(N_1+N_2)(p_2+3)(m-1)}\left( \sum _{\ell =1}^{m}\frac{N_1+N_2}{N_1^{(\ell )}+N_2^{(\ell )}}-1 \right) . \end{aligned}$$

Similarly, \(\xi _3\) can be decomposed as \(\xi _3=\xi _3^{\dagger }\xi _3^{\ddagger }\), where

$$\begin{aligned} \xi _3^{\dagger }&= \frac{\prod _{\ell =1}^m \left| \widehat{{\varvec{\varDelta }}}_{33}^{(\ell )} \right| ^{\frac{N_1^{(\ell )}}{2}}}{\left| \frac{1}{N_1}({\varvec{V}}_{p_3}+{\varvec{B}}_{p_3}) \right| ^{\frac{N_1}{2}}}, \quad \xi _3^{\ddagger } = \frac{\left| \frac{1}{N_1}({\varvec{V}}_{p_3}+{\varvec{B}}_{p_3}) \right| ^{\frac{N_1}{2}}}{\left| \widetilde{{\varvec{\varDelta }}}_{33} \right| ^{\frac{N_1}{2}}}, \end{aligned}$$

and

$$\begin{aligned} {\varvec{V}}_{p_3}^{(\ell )}&= \sum _{j=1}^{N_1^{(\ell )}} ({\varvec{x}}_{3j}^{(\ell )}-\widehat{{\varvec{\varDelta }}}_{3(12)}^{(\ell )}{\varvec{x}}_{(12)j}^{(\ell )}-\widehat{{\varvec{\eta }}}_3^{(\ell )})({\varvec{x}}_{3j}^{(\ell )}-\widehat{{\varvec{\varDelta }}}_{3(12)}^{(\ell )}{\varvec{x}}_{(12)j}^{(\ell )}-\widehat{{\varvec{\eta }}}_3^{(\ell )})',\\ {\varvec{V}}_{p_3}&= \sum _{\ell =1}^m {\varvec{V}}_{p_3}^{(\ell )}, \; {\varvec{B}}_{p_3} = \sum _{\ell =1}^m N_1^{(\ell )}(\widehat{{\varvec{\eta }}}_3^{(\ell )}-\widetilde{{\varvec{\eta }}}_3)(\widehat{{\varvec{\eta }}}_3^{(\ell )}-\widetilde{{\varvec{\eta }}}_3)'. \end{aligned}$$

Since \(\xi _3^{\dagger }\) has the same form as the LR for \(H_{03}\) in the case of complete data, we apply the modified LRT statistic \(-2\rho _{\xi _3}\log \xi _3^{\dagger }\), where

$$\begin{aligned} \rho _{\xi _3} = 1-\frac{2p_3^2+9p_3+11}{6N_1(p_3+3)(m-1)}\left( \sum _{\ell =1}^{m}\frac{N_1}{N_1^{(\ell )}}-1 \right) . \end{aligned}$$

Therefore, \(\lambda _m=\xi _1\xi _2\xi _3\) can be written in the fully decomposed form \(\lambda _m=\xi _1\xi _2^{\dagger }\xi _2^{\ddagger }\xi _3^{\dagger }\xi _3^{\ddagger }\). Applying a correction only to \(\xi _1,\xi _2^{\dagger }\), and \(\xi _3^{\dagger }\), we propose the test statistic \(-2\log \tau _m\) for improving the accuracy of the \(\chi ^2\) approximation, where

$$\begin{aligned} \tau _m = (\xi _1)^{\rho _{\xi _1}}(\xi _2^{\dagger })^{\rho _{\xi _2}}\xi _2^{\ddagger }(\xi _3^{\dagger })^{\rho _{\xi _3}}\xi _3^{\ddagger }. \end{aligned}$$
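Equivalently, writing the statistic on the log scale makes explicit that each Bartlett factor rescales only its corrected component:

$$\begin{aligned} -2\log \tau _m = -2\rho _{\xi _1}\log \xi _1-2\rho _{\xi _2}\log \xi _2^{\dagger }-2\log \xi _2^{\ddagger }-2\rho _{\xi _3}\log \xi _3^{\dagger }-2\log \xi _3^{\ddagger }. \end{aligned}$$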

Next, using the LR of the covariance test with complete data from the previous subsection, we propose a further test statistic by decomposing the LR \(\lambda _m\) with three-step monotone missing data. Let

$$\begin{aligned} \xi _{11}^{*}&= \frac{\prod _{\ell =1}^{m} \left| \frac{{\varvec{V}}_{p_1}^{(\ell )}}{n^{(\ell )}} \right| ^{\frac{n^{(\ell )}}{2}} }{\left| \frac{{\varvec{V}}_{p_1}}{n} \right| ^{\frac{n}{2}} }, \; \xi _{21}^{*} = \frac{\prod _{\ell =1}^{m} \left| \frac{{\varvec{V}}_{p_2}^{(\ell )}}{n_1^{(\ell )}+n_2^{(\ell )}} \right| ^{\frac{n_1^{(\ell )}+n_2^{(\ell )}}{2}} }{\left| \frac{{\varvec{V}}_{p_2}}{n_1+n_2} \right| ^{\frac{n_1+n_2}{2}} }, \; \xi _{31}^{*} = \frac{\prod _{\ell =1}^{m} \left| \frac{{\varvec{V}}_{p_3}^{(\ell )}}{n_1^{(\ell )}} \right| ^{\frac{n_1^{(\ell )}}{2}} }{\left| \frac{{\varvec{V}}_{p_3}}{n_1} \right| ^{\frac{n_1}{2}} }, \end{aligned}$$

where

$$\begin{aligned} {\varvec{V}}_{p_1}^{(\ell )}&= \sum _{j=1}^{N^{(\ell )}} ({\varvec{x}}_{1j}^{(\ell )}-\widehat{{\varvec{\eta }}}_1^{(\ell )}) ({\varvec{x}}_{1j}^{(\ell )}-\widehat{{\varvec{\eta }}}_1^{(\ell )})', \quad {\varvec{V}}_{p_1} = \sum _{\ell =1}^m {\varvec{V}}_{p_1}^{(\ell )}, \\ n_1^{(\ell )}&= N_1^{(\ell )}-(p_1+p_2)-1, \quad n_1= \sum _{\ell =1}^m n_1^{(\ell )}, \\ n_1^{(\ell )}+n_2^{(\ell )}&= N_1^{(\ell )}+N_2^{(\ell )}-p_1-1, \quad n_1+n_2= \sum _{\ell =1}^m (n_1^{(\ell )}+n_2^{(\ell )}). \end{aligned}$$

Because \(\xi _{11}^{*},\xi _{21}^{*}\), and \(\xi _{31}^{*}\) have the same form as the LR for \(H_{04}\) in the case of complete data, we apply the modified LRT statistics \(-2\rho _{\xi _{11}^{*}}\log \xi _{11}^{*},-2\rho _{\xi _{21}^{*}}\log \xi _{21}^{*}\), and \(-2\rho _{\xi _{31}^{*}}\log \xi _{31}^{*}\), where

$$\begin{aligned} \rho _{\xi _{11}^{*}}&= 1-\frac{2p_1^2+3p_1-1}{6(p_1+1)(m-1)}\left( \sum _{\ell =1}^{m}\frac{1}{n^{(\ell )}}-\frac{1}{n} \right) , \\ \rho _{\xi _{21}^{*}}&= 1-\frac{2p_2^2+3p_2-1}{6(p_2+1)(m-1)}\left( \sum _{\ell =1}^{m}\frac{1}{n_1^{(\ell )}+n_2^{(\ell )}}-\frac{1}{n_1+n_2} \right) , \\ \rho _{\xi _{31}^{*}}&= 1-\frac{2p_3^2+3p_3-1}{6(p_3+1)(m-1)}\left( \sum _{\ell =1}^{m}\frac{1}{n_1^{(\ell )}}-\frac{1}{n_1} \right) . \end{aligned}$$

Therefore, we propose the test statistic \(-2\log \phi _m\) to improve the accuracy of the \(\chi ^2\) approximation, where

$$\begin{aligned} \phi _m = (\xi _{11}^{*})^{\rho _{\xi _{11}^{*}}}(\xi _{21}^{*})^{\rho _{\xi _{21}^{*}}}(\xi _{31}^{*})^{\rho _{\xi _{31}^{*}}} \frac{\lambda _m}{\xi _{11}\xi _{21}\xi _{31}}, \end{aligned}$$

and

$$\begin{aligned} \xi _{11}&= \frac{\prod _{\ell =1}^{m} \left| \frac{{\varvec{V}}_{p_1}^{(\ell )}}{N^{(\ell )}} \right| ^{\frac{N^{(\ell )}}{2}} }{\left| \frac{{\varvec{V}}_{p_1}}{N} \right| ^{\frac{N}{2}} }, \ \xi _{21} = \frac{\prod _{\ell =1}^{m} \left| \frac{{\varvec{V}}_{p_2}^{(\ell )}}{N_1^{(\ell )}+N_2^{(\ell )}} \right| ^{\frac{N_1^{(\ell )}+N_2^{(\ell )}}{2}} }{\left| \frac{{\varvec{V}}_{p_2}}{N_1+N_2} \right| ^{\frac{N_1+N_2}{2}} }, \; \xi _{31} = \frac{\prod _{\ell =1}^{m} \left| \frac{{\varvec{V}}_{p_3}^{(\ell )}}{N_1^{(\ell )}} \right| ^{\frac{N_1^{(\ell )}}{2}} }{\left| \frac{{\varvec{V}}_{p_3}}{N_1} \right| ^{\frac{N_1}{2}} }. \end{aligned}$$
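On the log scale, \(-2\log \phi _m\) replaces the uncorrected covariance-test components \(\xi _{11},\xi _{21}\), and \(\xi _{31}\) of \(-2\log \lambda _m\) with their Bartlett-corrected counterparts:

$$\begin{aligned} -2\log \phi _m = -2\rho _{\xi _{11}^{*}}\log \xi _{11}^{*}-2\rho _{\xi _{21}^{*}}\log \xi _{21}^{*}-2\rho _{\xi _{31}^{*}}\log \xi _{31}^{*}-2\log \lambda _m+2\log (\xi _{11}\xi _{21}\xi _{31}). \end{aligned}$$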

Next, we propose the test statistic \(-2\rho _{L_m}\log \lambda _m\) via linear interpolation, where

$$\begin{aligned} \rho _{L_m} = \left\{ 1-\frac{(p_1+p_2)N_2+p_1N_3}{p(N_2+N_3)}\right\} \rho _{N_1,m}+\frac{(p_1+p_2)N_2+p_1N_3}{p(N_2+N_3)}\rho _{N,m} \end{aligned}$$

and

$$\begin{aligned} \rho _{N,m}&= 1-\frac{2p^2+9p+11}{6N(p+3)(m-1)}\left( \sum _{\ell =1}^{m}\frac{N}{N^{(\ell )}}-1 \right) ,\\ \rho _{N_1,m}&= 1-\frac{2p^2+9p+11}{6N_1(p+3)(m-1)}\left( \sum _{\ell =1}^{m}\frac{N_1}{N_1^{(\ell )}}-1 \right) . \end{aligned}$$
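A sketch of the computation of \(\rho _{L_m}\) may clarify how the two Bartlett-type factors are blended; the dimensions \((p_1,p_2,p_3)=(4,4,4)\) and the equal group sizes below are hypothetical example inputs.

```python
# Linear-interpolation correction factor rho_{L_m}; the dimensions and
# sample sizes in the example call are hypothetical.
def rho_bartlett(p, m, Ntot, ratio_sum):
    """Generic factor 1 - (2p^2+9p+11) / {6*Ntot*(p+3)(m-1)} * (ratio_sum - 1)."""
    return 1 - (2 * p**2 + 9 * p + 11) / (6 * Ntot * (p + 3) * (m - 1)) * (ratio_sum - 1)

def rho_L(p1, p2, p3, N1_l, N2_l, N3_l):
    m = len(N1_l)
    p = p1 + p2 + p3
    N1, N2, N3 = sum(N1_l), sum(N2_l), sum(N3_l)
    N_l = [a + b + c for a, b, c in zip(N1_l, N2_l, N3_l)]  # N^(l)
    N = sum(N_l)
    w = ((p1 + p2) * N2 + p1 * N3) / (p * (N2 + N3))        # interpolation weight
    rho_N = rho_bartlett(p, m, N, sum(N / Nl for Nl in N_l))
    rho_N1 = rho_bartlett(p, m, N1, sum(N1 / Nl for Nl in N1_l))
    return (1 - w) * rho_N1 + w * rho_N

# Two groups, (N1^(l), N2^(l), N3^(l)) = (20, 20, 20) in each group:
print(round(rho_L(4, 4, 4, [20, 20], [20, 20], [20, 20]), 6))  # 0.773889
```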

3.4 Asymptotic Expansion Approximation

In this subsection, we give an approximate upper \(100\alpha \) percentile of \(-2\log \lambda _{m}\) with three-step monotone missing data for a multi-sample problem. The upper \(100\alpha \) percentile of \(-2\log \lambda _{S_m}\) can be expanded as

$$\begin{aligned} q_{c_m}^*(\alpha )&= \chi _{f_m{;1-\alpha }}^2+\frac{\nu }{N} \left( \sum _{\ell =1}^m \frac{1}{k_1^{(\ell )}}-1 \right) \chi _{f_m{;1-\alpha }}^2 \\&\quad + \frac{\chi _{f_m{;1-\alpha }}^2}{N^2}\left\{ \nu ^2 \left( \sum _{\ell =1}^m \frac{1}{k_1^{(\ell )}}-1 \right) ^2 + \frac{2\gamma _1}{f_m}+\frac{2\gamma _1 \chi _{f_m{;1-\alpha }}^2}{f_m(f_m+2)} \right\} +O(N^{-3}), \end{aligned}$$

where

$$\begin{aligned} \nu&= \frac{2p^2+9p+11}{6(p+3)(m-1)}, \; k_1^{(\ell )} = \frac{N^{(\ell )}}{N}, \\ \gamma _1&= \frac{1}{288}\left[ 6p(p + 1)(p + 2)(p + 3) \left( \sum _{\ell =1}^m \frac{1}{\{k_1^{(\ell )}\}^2}-1 \right) \right. \\&\quad \left. -\frac{(2p^2 + 9p + 11)^2(2p - 1)}{p(p + 3)(m - 1)}\left( \sum _{\ell =1}^m \frac{1}{k_1^{(\ell )}} - 1 \right) ^2\right] , \end{aligned}$$

where \(\chi _{f_m{;1-\alpha }}^2\) is the upper \(100\alpha \) percentile of the \(\chi ^2\) distribution with \(f_m\) degrees of freedom (Hosoya and Seo [3]). Letting \(q_m^*(\alpha )\) denote the approximate upper \(100\alpha \) percentile of \(-2\log \lambda _m\) obtained by linear interpolation, we have

$$\begin{aligned} q_m^*(\alpha ) = \left\{ 1 - \frac{(p_1 + p_2)N_2 + p_1N_3}{p(N_2 + N_3)}\right\} q_{N_1,m}(\alpha ) + \frac{(p_1 + p_2)N_2 + p_1N_3}{p(N_2 + N_3)}q_{N,m}(\alpha ) \end{aligned}$$

where

$$\begin{aligned} q_{N,m}(\alpha )&= \chi _{f_m{;1-\alpha }}^2+\frac{\nu }{N}\left( \sum _{\ell =1}^m \frac{1}{k_1^{(\ell )}}-1 \right) \chi _{f_m{;1-\alpha }}^2\\&\quad +\frac{1}{N^2}\chi _{f_m{;1-\alpha }}^2\left\{ \nu ^2\left( \sum _{\ell =1}^m \frac{1}{k_1^{(\ell )}}-1 \right) ^{2}+ \frac{2\gamma _1}{f_m}+\frac{2\gamma _1}{f_m(f_m+2)}\chi _{f_m{;1-\alpha }}^2\right\} ,\\ q_{N_1,m}(\alpha )&= \chi _{f_m{;1-\alpha }}^2+\frac{\nu }{N_{1}}\left( \sum _{\ell =1}^m \frac{1}{k_2^{(\ell )}}-1 \right) \chi _{f_m{;1-\alpha }}^2 \\&\quad + \frac{1}{N_{1}^2}\chi _{f_m{;1-\alpha }}^2\left\{ \nu ^2\left( \sum _{\ell =1}^m \frac{1}{k_2^{(\ell )}}-1 \right) ^{2} +\frac{2\gamma _2}{f_m}+\frac{2\gamma _2}{f_m(f_m+2)}\chi _{f_m{;1-\alpha }}^2 \right\} ,\\ k_2^{(\ell )}&= \frac{N_1^{(\ell )}}{N_1}, \\ \gamma _2&= \frac{1}{288}\left[ 6p(p+1)(p+2)(p+3)\left( \sum _{\ell =1}^m \frac{1}{\{k_2^{(\ell )}\}^2}-1 \right) \right. \\&\quad \left. -\frac{(2p^2+9p+11)^2(2p-1)}{p(p+3)(m-1)}\left( \sum _{\ell =1}^m \frac{1}{k_2^{(\ell )}}-1 \right) ^2\right] . \end{aligned}$$
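The expansion for \(q_{N,m}(\alpha )\) (and, with \(N_1\) and \(k_2^{(\ell )}\) in place of \(N\) and \(k_1^{(\ell )}\), for \(q_{N_1,m}(\alpha )\)) can be sketched as follows. To keep the sketch dependency-free, the \(\chi ^2\) quantile is supplied by the caller (e.g., from tables or `scipy.stats.chi2.ppf`); the sample sizes in the example call are hypothetical.

```python
# Asymptotic-expansion approximation of the upper percentile of the LRT
# statistic. x = chi2_{f_m;1-alpha} is supplied by the caller; the sample
# sizes in the example call are hypothetical.
def q_expansion(x, p, N_l):
    """x: chi2 upper quantile with f_m df; p: dimension; N_l: sample sizes."""
    m = len(N_l)
    N = sum(N_l)
    f = p * (p + 3) * (m - 1) / 2
    inv_k = [N / Nl for Nl in N_l]              # 1 / k^(l)
    c1 = sum(inv_k) - 1
    c2 = sum(v * v for v in inv_k) - 1
    nu = (2 * p**2 + 9 * p + 11) / (6 * (p + 3) * (m - 1))
    gamma = (6 * p * (p + 1) * (p + 2) * (p + 3) * c2
             - (2 * p**2 + 9 * p + 11) ** 2 * (2 * p - 1)
             / (p * (p + 3) * (m - 1)) * c1**2) / 288
    return (x + nu / N * c1 * x
            + x / N**2 * (nu**2 * c1**2 + 2 * gamma / f
                          + 2 * gamma * x / (f * (f + 2))))

# p = 5, m = 2 (so f_m = 20), chi2_{20;0.95} = 31.410, N^(l) = 60 per group:
print(round(q_expansion(31.410, 5, [60, 60]), 3))  # 33.328
```

The correction terms shrink at rates \(N^{-1}\) and \(N^{-2}\), so the value returns to the plain \(\chi ^2\) percentile for large samples.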

3.5 Simulation Studies

We evaluate the accuracy and the asymptotic behaviors of the \(\chi ^2\) approximations via Monte Carlo simulation (\(10^6\) runs). Now let

$$\begin{aligned} \alpha _{m}&=\textrm{Pr}\{ -2\log \lambda _m> \chi _{f_m{;1-\alpha }}^2 \},\\ \alpha _{\rho _{L_m}}&=\textrm{Pr}\{ -2\rho _{L_m}\log \lambda _{m}> \chi _{f_m{;1-\alpha }}^2 \},\\ \alpha _{\tau _{m}}&=\textrm{Pr}\{ -2\log \tau _{m}> \chi _{f_m{;1-\alpha }}^2 \},\\ \alpha _{\phi _{m}}&=\textrm{Pr}\{ -2\log \phi _{m}> \chi _{f_m{;1-\alpha }}^2 \}, \\ \alpha _{q_{m}^*}&=\textrm{Pr}\{ -2\log \lambda _{m} > q_{m}^*(\alpha ) \}. \end{aligned}$$

In Tables 7, 8, and 9, we provide the simulated upper \(100\alpha \) percentiles of \(-2\log \lambda _{m},\) \(-2\rho _{L_{m}}\log \lambda _{m},-2\log \tau _{m}\), and \(-2\log \phi _{m}\), the approximate upper percentiles of \(-2\log \lambda _{m}\) \((q_{m}^*(\alpha ))\), and the actual type I error rates \(\alpha _{m}, \alpha _{\rho _{L_m}}, \alpha _{\tau _{m}}, \alpha _{\phi _{m}}\), and \(\alpha _{q_{m}^*}\) for \(\alpha =0.05\) and

$$\begin{aligned} (N_1^{(\ell )},N_2^{(\ell )},N_3^{(\ell )})= \left\{ \begin{array}{ll} (t,t,t),&{} \\ (t,t/2,t/2),&{} t=20,40,80,160,320,\\ (t,2t,2t),&{} \end{array} \right. \end{aligned}$$

where \((p_1,p_2,p_3)=(4,4,4)\).
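The simulation design can be illustrated with a toy check that uses only the standard library: estimating the upper tail probability of a \(\chi ^2_{20}\) variate at its tabulated 95% point, in the same way the rejection rates such as \(\alpha _m\) are estimated (the run count here is reduced from the paper's \(10^6\) for speed).

```python
import random

# Toy version of the Monte Carlo design: estimate Pr{X > x} for a chi-square
# variate with f = 20 degrees of freedom at x = chi2_{20;0.95} = 31.410 by
# simulating X as a sum of f squared standard normals. The run count (10^5)
# is smaller than the paper's 10^6.
random.seed(0)
f, x, runs = 20, 31.410, 100_000
hits = 0
for _ in range(runs):
    chi2 = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(f))  # one chi2_f draw
    hits += chi2 > x
print(hits / runs)  # close to the nominal level 0.05
```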

The simulated values approach the upper percentile of the \(\chi ^2\) distribution as the sample size increases. However, the approximation is less accurate than in the one-sample case, even when the sample size is quite large. In addition, a comparison of the type I error rates \(\alpha _{m}, \alpha _{\rho _{L_m}}, \alpha _{\tau _{m}}, \alpha _{\phi _{m}}\), and \(\alpha _{q_{m}^*}\) shows that the approximate percentile \(q_{m}^*(\alpha )\) is the most accurate.

Table 7 Simulated values for \(-2\log \lambda _{m}, -2\rho _{L_{m}}\log \lambda _{m}, -2\log \tau _{m}\), and \(-2\log \phi _{m}\) and the approximate values for \(-2\log \lambda _{m}\) and the actual type I error rates \(\alpha _{m}, \alpha _{\rho _{L_m}}, \alpha _{\tau _{m}}, \alpha _{\phi _{m}}\), and \(\alpha _{q_{m}^*}\) for \((p_1,p_2,p_3)=(4,4,4), m=2\)
Table 8 Simulated values for \(-2\log \lambda _{m}, -2\rho _{L_{m}}\log \lambda _{m}, -2\log \tau _{m}\), and \(-2\log \phi _{m}\) and the approximate values for \(-2\log \lambda _{m}\) and the actual type I error rates \(\alpha _{m}, \alpha _{\rho _{L_m}}, \alpha _{\tau _{m}}, \alpha _{\phi _{m}}\), and \(\alpha _{q_{m}^*}\) for \((p_1,p_2,p_3)=(4,4,4), m=3\)
Table 9 Simulated values for \(-2\log \lambda _{m}, -2\rho _{L_{m}}\log \lambda _{m}, -2\log \tau _{m}\), and \(-2\log \phi _{m}\) and the approximate values for \(-2\log \lambda _{m}\) and the actual type I error rates \(\alpha _{m}, \alpha _{\rho _{L_m}}, \alpha _{\tau _{m}}, \alpha _{\phi _{m}}\), and \(\alpha _{q_{m}^*}\) for \((p_1,p_2,p_3)=(4,4,4), m=4\)

3.6 Numerical Example

In this section, we give an example of the test statistics and approximate upper percentiles proposed in this paper. The data consist of cholesterol values measured at five time points during treatment (baseline, 6, 12, 20, and 24 months) for a placebo group and a high-dose group (Wei and Lachin [8]). To construct the three-step monotone missing data, we used the observations with values available up to 24 months, up to 20 months, and up to 12 months; that is, \(m=2, p_1=3, p_2=1, p_3=1\). For the placebo group (\(\ell =1\)), \(N_1^{(1)}=31, N_2^{(1)}=4, N_3^{(1)}=3\), and for the high-dose group (\(\ell =2\)), \(N_1^{(2)}=36, N_2^{(2)}=7, N_3^{(2)}=12\). Then, the LRT statistic and the test statistics are

$$\begin{aligned} -2\log \lambda _m&= 82.201, \; -2\rho _{L_m}\log \lambda _m = 75.425, \\ -2\log \tau _m&= 66.542, \; -2\log \phi _m = 80.527. \end{aligned}$$

The approximate upper percentiles are

$$\begin{aligned} q_m^*(0.05) = 34.422, \; q_m^*(0.01) = 41.196, \end{aligned}$$

and \(\chi ^2_{20{;0.95}}=31.410, \chi ^2_{20{;0.99}}=37.566\). Thus, the null hypothesis is rejected by every test statistic and by the approximate upper percentiles.
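As a small arithmetic check using only the dimensions and sample sizes reported above, the linear-interpolation weight entering \(\rho _{L_m}\) and \(q_m^*(\alpha )\) for this example can be reproduced as follows.

```python
# Interpolation weight w = {(p1+p2)*N2 + p1*N3} / {p*(N2+N3)} for the
# cholesterol example: p1=3, p2=1, p3=1 and the group sizes given above.
p1, p2, p3 = 3, 1, 1
p = p1 + p2 + p3
N2 = 4 + 7    # N2^(1) + N2^(2)
N3 = 3 + 12   # N3^(1) + N3^(2)
w = ((p1 + p2) * N2 + p1 * N3) / (p * (N2 + N3))
print(round(w, 4))  # 0.6846
```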

4 Conclusions

We discussed simultaneous tests for mean vectors and covariance matrices with three-step monotone missing data for a one-sample and a multi-sample problem. For the one-sample problem, we proposed two test statistics (\(-2\log \tau _{1},\) \(-2\log \phi _{1}\)) by decomposing the LR and correcting it using the LRs of the simultaneous test and the covariance test in the case of complete data. We also proposed a test statistic (\(-2\rho _{L_{1}}\log \lambda _{1}\)) via linear interpolation. In addition, we provided an approximate upper \(100\alpha \) percentile (\(q_{1}^*(\alpha )\)). Based on the simulation results, the test statistic \(-2\log \phi _{1}\), which corrects only the covariance-test part of the LR for complete data, gave the most accurate results. Similarly, for the multi-sample problem, we proposed three test statistics (\(-2\rho _{L_{m}}\log \lambda _{m}, -2\log \tau _{m}, -2\log \phi _{m}\)) and an approximate upper percentile (\(q_{m}^*(\alpha )\)), and the simulation results showed that the approximate upper \(100\alpha \) percentile \(q_{m}^*(\alpha )\) is the most accurate. Finally, we illustrated the proposed test statistics with a numerical example. The results of this paper can be extended to general \(k\)-step monotone missing data; we are currently investigating this problem.