Abstract
In this paper, we consider simultaneous tests of the mean vectors and the covariance matrices under three-step monotone missing data for a one-sample and a multi-sample problem. We provide the likelihood ratio test (LRT) statistic and propose statistics for improving the accuracy of the \(\chi ^2\) approximation. These test statistics are derived by decomposing the likelihood ratio (LR) using the coefficients of the modified LRT statistics with complete data. As an alternative approach, we derive an approximate upper percentile of the LRT statistic with three-step monotone missing data using linear interpolation based on an asymptotic expansion of the LRT statistic with complete data. Finally, we investigate the asymptotic behavior of the upper percentiles of these test statistics and the accuracy of approximate upper percentiles via Monte Carlo simulation. In addition, we give an example of test statistics and approximate upper percentiles proposed in this paper.
1 Introduction
In this paper, we consider simultaneous tests of the mean vectors and the covariance matrices under three-step monotone missing data for a one-sample and a multi-sample problem. Jinadasa and Tracy [4] and Kanda and Fujikoshi [5] discussed MLEs for general k-step monotone missing data. For simultaneous tests, the LRT statistic and the modified LRT statistic with Bartlett correction for the case of complete data were discussed by Muirhead [6] and Srivastava [7]. Furthermore, the LRT statistic and the test statistics for improving the accuracy of the \(\chi ^2\) approximation for monotone missing data were proposed by Hao and Krishnamoorthy [1] and, for the two-step case, by Hosoya and Seo [2, 3]. In particular, Hosoya and Seo [2, 3] presented test statistics by decomposing the LR; this paper is an extension of the work presented by Hosoya and Seo [2, 3]. An LRT statistic and test statistics for general k-step monotone missing data, which are obtained by correcting only a part of the missing data, were given by Yagi et al. [9].
The remainder of this paper is organized as follows. Sect. 2 describes the MLEs of the mean vector and covariance matrix and its LRT statistic in the case of three-step monotone missing data for a one-sample problem. Furthermore, we propose three test statistics for improving the accuracy of the \(\chi ^2\) approximation using the coefficients of the modified LRT statistics with complete data. In addition, we derive an approximate upper percentile of the LRT statistic. Using Monte Carlo simulation, we investigate the \(\chi ^2\) approximation accuracy of the test statistics and the accuracy of approximate upper percentiles of the LRT statistic. Numerical power comparison of the test statistics for some selected parameters is also presented. Sect. 3 describes the test statistics and approximate upper percentile for a multi-sample problem. Furthermore, via Monte Carlo simulation, we investigate the asymptotic behavior of the upper percentiles of these test statistics and approximate upper percentiles of the LRT statistic. The results are illustrated using an example. Finally, Sect. 4 states our conclusions.
2 One-Sample Problem
In this section, we consider the simultaneous test of the mean vector and the covariance matrix under three-step monotone missing data for a one-sample problem.
2.1 LR with Three-Step Monotone Missing Data
We suppose that the data is normally distributed as follows:
where
We partition \({\varvec{x}}_j\) into a \(p_1 \times 1\) random vector, a \(p_2 \times 1\) random vector, and a \(p_3 \times 1\) random vector as \({\varvec{x}}_j = ({\varvec{x}}'_{1j},{\varvec{x}}'_{2j},{\varvec{x}}'_{3j})'(j=1,\ldots ,N_1)\). In addition, let \({\varvec{x}}_{(12),j} = ({\varvec{x}}_{1j}',{\varvec{x}}_{2j}')'(j=N_1+1,\ldots ,N_1+N_2)\).
Such a dataset has three-step monotone missing data for a one-sample problem:
where \(N = N_1+N_2+N_3,p =p_1+p_2+p_3\) and “\(*\)” indicates a missing observation.
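To make the missingness pattern concrete, the following minimal sketch builds an \(N \times p\) grid of markers, assuming (as in the description above) that the first \(N_1\) observations are complete, the next \(N_2\) lack the last \(p_3\) components, and the last \(N_3\) retain only the first \(p_1\) components. The function name `monotone_pattern` and the marker `x` for an observed value are illustrative choices, not notation from the paper.

```python
def monotone_pattern(N1, N2, N3, p1, p2, p3):
    """Return an N x p grid with 'x' for observed and '*' for missing entries."""
    p = p1 + p2 + p3
    rows = []
    # First N1 rows: all p components are observed.
    rows += [["x"] * p for _ in range(N1)]
    # Next N2 rows: only the first p1 + p2 components are observed.
    rows += [["x"] * (p1 + p2) + ["*"] * p3 for _ in range(N2)]
    # Last N3 rows: only the first p1 components are observed.
    rows += [["x"] * p1 + ["*"] * (p2 + p3) for _ in range(N3)]
    return rows


# Small illustration with (N1, N2, N3) = (2, 1, 1) and (p1, p2, p3) = (1, 1, 1).
for row in monotone_pattern(2, 1, 1, 1, 1, 1):
    print(" ".join(row))
```

Each later block of rows loses a further block of columns, which is what makes the pattern monotone.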
Now, we consider the following hypothesis test when the dataset has a three-step monotone pattern.
where \({\varvec{\mu }}_0\) is a known vector, and \({\varvec{\varSigma }}_0\) is a known matrix. Without loss of generality, we can assume that \({\varvec{\mu }}_0={\varvec{0}}\) and \({\varvec{\varSigma }}_0={\varvec{I}}_p\). Then, we have the following theorem.
Theorem 1
Suppose the data have a three-step monotone pattern of missing observations and are normally distributed as (1). Then, the LR of the hypothesis test (2) can be given by
where
and
For the derivation of Theorem 1, see the Appendix. After calculations, we get
This LR \(\lambda _1\) is essentially the same as that obtained by Yagi, Yamaguchi, and Seo [9]. Thus, we obtain the LRT statistic \(-2\log \lambda _{1}\). In the complete data case (Sect. 2.2), the LRT statistic for (2) is asymptotically distributed as a \(\chi ^2\) distribution with \(f_1\) degrees of freedom, where \(f_1=p(p+3)/2\) (see Muirhead [6, p. 370]). For example, Table 1 presents the simulated values of the upper 100\(\alpha \) percentiles of \(-2\log \lambda _{1}\) and the Type I error rate, \(\alpha _{1} =\textrm{Pr}\{ -2\log \lambda _{1} > \chi _{f_1{;1-\alpha }}^2 \}\), for the three-step monotone missing data case, where \(\chi _{f_1{;1-\alpha }}^2\) is the upper \(100\alpha \) percentile of the \(\chi ^2\) distribution with \(f_1\) degrees of freedom.
As demonstrated in Table 1, the \(\chi ^2\) approximation is inaccurate in this case unless the sample size is large; therefore, a test statistic that improves the accuracy of the \(\chi ^2\) approximation is needed. We propose such test statistics using the modified LRT statistics of the simultaneous test and of the covariance test for the complete data case described in Sect. 2.2.
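As a hedged illustration of the quantities in this paragraph, the sketch below computes \(f_1=p(p+3)/2\) and the upper \(100\alpha \) percentile of the \(\chi ^2\) distribution using only the Python standard library. The series for the incomplete gamma function and the bisection search are implementation choices made here for self-containment; the function names are hypothetical and not part of the paper.

```python
import math


def simultaneous_df(p):
    """Degrees of freedom f_1 = p(p+3)/2 of the one-sample simultaneous test."""
    return p * (p + 3) // 2


def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    for n in range(1, 10000):
        term *= z / (a + n)
        total += term
        if term < 1e-17 * total:
            break
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_upper_percentile(alpha, df):
    """Return x with Pr{chi2_df > x} = alpha, located by bisection."""
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < 1.0 - alpha:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Critical value for p = 9, i.e. (p1, p2, p3) = (3, 3, 3).
f1 = simultaneous_df(9)
print(f1, chi2_upper_percentile(0.05, f1))
```

The Type I error rate \(\alpha _1\) is then the proportion of simulated values of \(-2\log \lambda _1\) that exceed this critical value.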
2.2 Complete Data
We consider the LRT statistic and modified LRT statistics with Bartlett correction in the case of complete data for a one-sample problem. These results are used in the next subsection. We first consider a simultaneous test for complete data as follows:
In this case, the LR can be expressed as follows. Let \({\varvec{x}}_1,{\varvec{x}}_2,\ldots ,{\varvec{x}}_N\) be independently distributed as \(N_p ({\varvec{\mu }},{\varvec{\varSigma }})\), and let \(\lambda _{S_1}\) be the LR for the complete data. Then, the LR is given by
where
It is known that \(-2 \log \lambda _{S_1}\) is asymptotically distributed as \(\chi ^2\) distribution with \(f_1(=p(p+3)/2)\) degrees of freedom. Furthermore, the modified LRT statistic with Bartlett correction can be given by \(-2\rho _1 \log \lambda _{S_1}\) (Muirhead [6, p. 370]), where
Next, we consider a covariance test for complete data as follows:
In this case, the LR, which is an unbiased test, can be expressed as follows:
The modified LRT statistic with Bartlett correction \(-2\rho _2 \log \lambda _{V_1}\) was provided by Muirhead [6, p. 359], where
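The displayed correction factors are not reproduced in this text, so the following sketch rests on stated assumptions: that \(\rho _1 = 1-\nu /N\) with \(\nu =(2p^2+9p+11)/\{6(p+3)\}\) as quoted in Sect. 2.4, and that \(\rho _2 = 1-(2p^2+3p-1)/\{6(N-1)(p+1)\}\), which is our reading of the factors cited from Muirhead [6]. The function names are hypothetical.

```python
def nu_simultaneous(p):
    """nu = (2p^2 + 9p + 11) / {6(p + 3)}, as quoted in Sect. 2.4."""
    return (2 * p**2 + 9 * p + 11) / (6 * (p + 3))


def rho_simultaneous(N, p):
    """Assumed Bartlett factor for the simultaneous test: rho_1 = 1 - nu / N."""
    return 1.0 - nu_simultaneous(p) / N


def rho_covariance(N, p):
    """Assumed Bartlett factor for the covariance test (uses n = N - 1)."""
    return 1.0 - (2 * p**2 + 3 * p - 1) / (6 * (N - 1) * (p + 1))


# Both factors are below 1 and approach 1 as N grows.
for N in (20, 50, 200):
    print(N, rho_simultaneous(N, 3), rho_covariance(N, 3))
```

Multiplying \(-2\log \lambda \) by such a factor shrinks the statistic slightly, which is how the correction counteracts the inflated Type I error of the uncorrected test in small samples.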
2.3 Test Statistics
We now decompose the LR to derive the test statistic for improving the accuracy of the \(\chi ^2\) approximation. Let
Therefore,
Then, \(\omega _1\omega _4, \omega _2\omega _5, \omega _3\omega _6\) each have the form of the LR for \(H_{01}\) with complete normal data. Hence, we can obtain the modified LRT statistics, \(-2\rho _{14}\log \omega _1\omega _4,\) \(-2\rho _{25}\log \omega _2\omega _5,-2\rho _{36}\log \omega _3\omega _6\), where
Thus, we propose a new test statistic given by \(-2\log \tau _{1}\), where
In addition, we denote
where
Subsequently, since \(\omega _4^*,\omega _5^*,\omega _6^*\) each have the form of the LR for \(H_{02}\) with complete normal data, we can propose the test statistic \(-2\log \phi _{1}\), where
and
Now, we propose the modified LRT statistic \(-2\rho _{L_{1}} \log \lambda _{1}\) via linear interpolation, where
and
2.4 Asymptotic Expansion Approximation
In this subsection, we give an approximate upper percentile of \(-2\log \lambda _{1}\) when the data have a three-step monotone pattern for a one-sample problem. The upper \(100\alpha \) percentile of \(-2\log \lambda _{S_1}\) can be expanded as
where \(\nu =(2p^2+9p+11)/\{6(p+3)\}\) (Hosoya and Seo [2]). Based on linear interpolation and letting \(q_{1}^*(\alpha )\) be the upper \(100\alpha \) percentile of \(-2\log \lambda _{1}\), the following can be obtained:
where
2.5 Simulation Studies
We evaluate the accuracy and the asymptotic behaviors of the \(\chi ^2\) approximations via Monte Carlo simulation (\(10^6\) runs). Let
In Tables 2 and 3, we provide the simulated upper 100\(\alpha \) percentiles of \(-2\log \lambda _{1}, -2\rho _{L_{1}}\log \lambda _{1}\), \(-2\log \tau _{1}\), and \(-2\log \phi _{1}\) and the approximate upper percentiles of \(-2\log \lambda _{1}\) (\(q_{1}^*(\alpha )\)) and the actual type I error rates \(\alpha _{1}, \alpha _{\rho _{L_1}}, \alpha _{\tau _{1}}, \alpha _{\phi _{1}}\), and \(\alpha _{q_{1}^*}\); \(\alpha =0.05\); and for the following cases (Case I),
where \((p_1,p_2,p_3)\) in Tables 2 and 3 are (3, 3, 3), (6, 6, 6), respectively.
Similarly, Tables 4 and 5 exhibit the results for the following cases (Case II),
where \((p_1,p_2,p_3)\) in Tables 4 and 5 are (3, 3, 3), (6, 6, 6) respectively.
The above tables show that the simulated values approach the upper percentile of the \(\chi ^2\) distribution as the sample size increases. In addition, the upper percentile of \(-2\log \phi _1\) is considerably closer to the \(\chi ^2\) percentile than that of \(-2\log \lambda _1\) even for small sample sizes, while \(-2\rho _{L_{1}}\log \lambda _{1}\) and \(q_{1}^*(\alpha )\) are less accurate than \(-2\log \phi _1\).
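The simulations above concern three-step monotone missing data. As a scaled-down illustration of the same procedure, the sketch below treats only the complete data case with \(p=2\), so that the determinant can be written out by hand. It assumes the standard complete-data form \(-2\log \lambda _{S_1}=N\bar{{\varvec{x}}}'\bar{{\varvec{x}}}+\textrm{tr}\,{\varvec{A}}-N\log |{\varvec{A}}/N|-pN\) for \(H_0: {\varvec{\mu }}={\varvec{0}}, {\varvec{\varSigma }}={\varvec{I}}_p\), and the Bartlett factor \(\rho _1=1-\nu /N\). The number of runs, the seed, and the function names are arbitrary choices for illustration.

```python
import math
import random


def lrt_statistic(sample):
    """-2 log(lambda) for H0: mu = 0, Sigma = I_2, from complete bivariate data."""
    n = len(sample)
    p = 2
    m1 = sum(x for x, _ in sample) / n
    m2 = sum(y for _, y in sample) / n
    a11 = sum((x - m1) ** 2 for x, _ in sample)
    a22 = sum((y - m2) ** 2 for _, y in sample)
    a12 = sum((x - m1) * (y - m2) for x, y in sample)
    det_a = a11 * a22 - a12 ** 2
    return (n * (m1 ** 2 + m2 ** 2) + (a11 + a22)
            - n * math.log(det_a / n ** p) - p * n)


def simulate(n, runs, seed=0):
    """Empirical Type I error rates of the raw and Bartlett-corrected tests."""
    rng = random.Random(seed)
    p = 2
    crit = 11.0705  # upper 5% point of chi-square with f_1 = p(p+3)/2 = 5 d.f.
    rho = 1.0 - (2 * p**2 + 9 * p + 11) / (6 * n * (p + 3))  # assumed rho_1
    raw = corrected = 0
    for _ in range(runs):
        sample = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
        stat = lrt_statistic(sample)
        raw += stat > crit
        corrected += rho * stat > crit
    return raw / runs, corrected / runs


raw_rate, corrected_rate = simulate(n=10, runs=2000, seed=1)
print(raw_rate, corrected_rate)
```

Because \(\rho _1<1\) and the statistic is non-negative, the corrected test can never reject more often than the raw test at the same critical value. How close each rate comes to the nominal 0.05 depends on \(N\) and on the number of runs.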
2.6 Numerical Power
We compare the power of (I) the LR test using \(-2\log \lambda _1\) given in Sect. 2.1, (II) the test using statistic \(-2\log \tau _1\) given in Sect. 2.3, and (III) the test using statistic \(-2\log \phi _1\) given in Sect. 2.3. Under some parameter settings, the powers of (I), (II), and (III) are compared using the corresponding simulated upper \(100\alpha \) percentiles under the null distribution, where \(\alpha =0.05\). The simulation was executed \(10^6\) times using normal random vectors. When \({\varvec{\varSigma }}={\varvec{I}}_p\), the powers are computed for various values of \(\delta _i={\varvec{\mu }}_i'{\varvec{\mu }}_i\), \(i=1,2,3\). This follows the power computation for the test of a mean vector in Krishnamoorthy and Pannala [10]. On the other hand, when \({\varvec{\mu }}={\varvec{0}}\), we put \({\varvec{\varSigma }}={\varvec{I}}_p+(1/\sqrt{N_1}){\varvec{\varOmega }},\) where \({\varvec{\varOmega }}=\textrm{diag}(\omega _1,\, \omega _2,\ldots ,\omega _p)\), and the powers are computed for various values of \(\omega _j=\omega \), \(j=1,2,\ldots ,p\). Table 6 shows the power of the three tests where \((p_1,p_2,p_3)=(3,3,3)\) and \((N_1,N_2,N_3)=(20,10,10)\). We note from Table 6 that all three tests have natural power properties. In addition, comparing tests (I), (II), and (III), it can be seen that test (III) has the highest power, while tests (I) and (II) have almost the same power.
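The alternatives used in the power study can be written down directly. The sketch below builds a mean vector with prescribed \(\delta _i={\varvec{\mu }}_i'{\varvec{\mu }}_i\) and the diagonal of \({\varvec{\varSigma }}={\varvec{I}}_p+(1/\sqrt{N_1}){\varvec{\varOmega }}\) with \(\omega _j=\omega \). The text fixes only \(\delta _i\), so spreading it equally over the components of \({\varvec{\mu }}_i\) is an assumption made here, and the function names are hypothetical.

```python
import math


def alternative_mean(p_sizes, deltas):
    """Mean vector whose i-th block mu_i satisfies mu_i' mu_i = delta_i.

    Assumption: each block has equal components sqrt(delta_i / p_i).
    """
    mu = []
    for p_i, delta_i in zip(p_sizes, deltas):
        mu += [math.sqrt(delta_i / p_i)] * p_i
    return mu


def alternative_cov_diag(p, N1, omega):
    """Diagonal of Sigma = I_p + Omega / sqrt(N1) with Omega = omega * I_p."""
    return [1.0 + omega / math.sqrt(N1)] * p


# Mean alternative with delta_1 = 1, delta_2 = delta_3 = 0 and (p1, p2, p3) = (3, 3, 3).
print(alternative_mean((3, 3, 3), (1.0, 0.0, 0.0)))
# Covariance alternative with omega = 1 and N1 = 20.
print(alternative_cov_diag(9, 20, 1.0))
```

Setting every \(\delta _i\) and \(\omega \) to zero recovers the null hypothesis, under which the rejection rate equals the size of the test.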
Further, Fig. 1 shows the power plots of (III) the test using statistic \(-2\log \phi _1\) when (a) \((N_1,N_2,N_3)=(10,10,10)\), (b) \((N_1,N_2,N_3)=(20,10,10)\) and (c) \((N_1,N_2,N_3)=(40,10,10)\) with \((p_1,p_2, p_3)=(3,3,3)\) and \(\delta _2=\delta _3=\omega =0\). Fig. 1 illustrates that the power is an increasing function of the sample size. The power studies are performed for other sample sizes and dimensions, and similar trends are observed. Therefore, the results are not listed here.
3 Multi-Sample Problem
In this section, we will consider simultaneous tests of the mean vector and the covariance matrix under three-step monotone missing data for a multi-sample problem.
3.1 LR with Three-Step Monotone Missing Data
Let \({\varvec{x}}_1^{(\ell )},{\varvec{x}}_2^{(\ell )},\ldots ,{\varvec{x}}_{N_1^{(\ell )}}^{(\ell )}\) be independent p-dimensional sample vectors, \({\varvec{x}}_{(12),N_1^{(\ell )}+1}^{(\ell )}\), \({\varvec{x}}_{(12),N_1^{(\ell )}+2}^{(\ell )},\ldots ,{\varvec{x}}_{(12),N_1^{(\ell )}+N_2^{(\ell )}}^{(\ell )}\) be independent \((p_1+p_2)\)-dimensional sample vectors and \({\varvec{x}}_{1,N_1^{(\ell )}+N_2^{(\ell )}+1}^{(\ell )},{\varvec{x}}_{1,N_1^{(\ell )}+N_2^{(\ell )}+2}^{(\ell )}\), \(\ldots ,{\varvec{x}}_{1N^{(\ell )}}^{(\ell )}\) be independent \(p_1\)-dimensional sample vectors from the \(\ell \)th population (\(\ell =1,\ldots ,m\)). We suppose that the data is normally distributed as follows:
where
and
Such a dataset has three-step monotone missing data for a multi-sample problem for the \(\ell \)th population:
where “\(*\)” indicates a missing observation.
We consider the following hypothesis:
To derive the MLEs of the mean vectors and the covariance matrices, we consider the following transformation matrix \({\varvec{Z}}^{(\ell )}\):
The transformed vector \({\varvec{y}}_j^{(\ell )}=({\varvec{y}}_{1j}^{(\ell )'},{\varvec{y}}_{2j}^{(\ell )'},{\varvec{y}}_{3j}^{(\ell )'})'\) is
The transformed parameters \(({\varvec{\eta }}^{(\ell )},{\varvec{\varDelta }}^{(\ell )})\) are defined as
where
We note that the pair \(({\varvec{\mu }}^{(\ell )},{\varvec{\varSigma }}^{(\ell )})\) is in one-to-one correspondence with \(({\varvec{\eta }}^{(\ell )},{\varvec{\varDelta }}^{(\ell )})\). Under \(H_1\), we define the MLEs of \(({\varvec{\eta }}^{(\ell )},{\varvec{\varDelta }}^{(\ell )})\) as \((\widehat{{\varvec{\eta }}}^{(\ell )},\widehat{{\varvec{\varDelta }}}^{(\ell )})\),
where
Conversely, under \(H_{m0}\), we define MLEs of \({\varvec{\eta }}(={\varvec{\eta }}^{(1)}=\cdots ={\varvec{\eta }}^{(m)}),{\varvec{\varDelta }}(={\varvec{\varDelta }}^{(1)}=\cdots ={\varvec{\varDelta }}^{(m)})\) as \((\widetilde{{\varvec{\eta }}},\widetilde{{\varvec{\varDelta }}})\). Subsequently, we obtain
where \(N=\sum _{\ell =1}^{m} N^{(\ell )}, \; N_1=\sum _{\ell =1}^{m} N_1^{(\ell )}, \; N_2=\sum _{\ell =1}^{m} N_2^{(\ell )}\). From the preceding MLEs, we get the following theorem.
Theorem 2
Suppose that the datasets have a three-step monotone pattern of missing observations and are normally distributed as (4). Then, the LR for (5) can be given by
where \(\widehat{{\varvec{\varDelta }}}_{ii}\) and \(\widetilde{{\varvec{\varDelta }}}_{ii}\) \((i=1,2,3)\) are given in (6)–(10).
Thus, we obtain the LRT statistic \(-2\log \lambda _{m}\), which is asymptotically distributed as a \(\chi ^2\) distribution with \(f_m=p(p+3)(m-1)/2\) degrees of freedom. However, it is known that this approximation is inaccurate for small samples. Therefore, we propose test statistics whose distributions are better approximated by the \(\chi ^2\) distribution, using several methods based on the complete data case in Sect. 3.2.
3.2 Complete Data
In this subsection, we discuss the LRT statistic in the case of complete data and the modified LRT statistics with Bartlett correction. The results will be used to propose the test statistics in the next subsection. First, we consider a simultaneous test with complete data as follows:
Let \({\varvec{x}}_1^{(\ell )},{\varvec{x}}_2^{(\ell )},\ldots ,{\varvec{x}}_{N^{(\ell )}}^{(\ell )}\) be independently distributed as \(N_p({\varvec{\mu }}^{(\ell )},{\varvec{\varSigma }}^{(\ell )})\), and let \(\lambda _{S_m}\) be the LR for the complete data. Then, the LR is given by
where
Furthermore, the modified LRT statistic with Bartlett correction can be given by \(-2\rho _3\log \lambda _{S_m}\) (Muirhead [6, p. 513]), where
Next, we consider the covariance test in the case of complete data as follows:
The modified LRT statistic \(-2\rho _4\log \lambda _{V_m}\) was provided by Muirhead [6, p. 308], where
and
3.3 Test Statistics
Using LR of simultaneous test with complete data from the previous subsection, we propose test statistics by decomposing the LR \(\lambda _m\) with three-step monotone missing data. First, the LR can be decomposed as \(\lambda _m=\xi _1\xi _2\xi _3\), where
Because \(\xi _1\) is of the form of the LR for \(H_{03}\) in the case of complete data, the modified LRT statistic \(-2\rho _{\xi _1}\log \xi _1\) is given, where
Next, \(\xi _2\) can be decomposed as \(\xi _2=\xi _2^{\dagger }\xi _2^{\ddagger }\), where
and
Since \(\xi _2^{\dagger }\) is of the form of LR for \(H_{03}\) in the case of complete data, the modified LRT statistic \(-2\rho _{\xi _2}\log \xi _2^{\dagger }\) is given, where
Similarly, \(\xi _3\) can be decomposed as \(\xi _3=\xi _3^{\dagger }\xi _3^{\ddagger }\), where
and
Since \(\xi _3^{\dagger }\) is of the form of LR for \(H_{03}\) in the case of complete data, the modified LRT statistic \(-2\rho _{\xi _3}\log \xi _3^{\dagger }\) is given, where
Therefore, the decomposed form of \(\lambda _m=\xi _1\xi _2\xi _3\) is \(\lambda _m=\xi _1\xi _2^{\dagger }\xi _2^{\ddagger }\xi _3^{\dagger }\xi _3^{\ddagger }\). We give a correction only for \(\xi _1,\xi _2^{\dagger }\), and \(\xi _3^{\dagger }\). Thus, we give the test statistic \(-2\log \tau _m\) for improving the accuracy of the \(\chi ^2\) approximation, where
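A numerical sketch of this construction follows. It assumes that correcting only \(\xi _1,\xi _2^{\dagger }\), and \(\xi _3^{\dagger }\) means raising those three factors to their Bartlett factors and leaving \(\xi _2^{\ddagger }\) and \(\xi _3^{\ddagger }\) unchanged, so that \(-2\log \tau _m\) is a weighted sum of the log-components. The function name and all numerical values are hypothetical and serve only to show the arithmetic.

```python
def corrected_statistic(log_parts, rhos):
    """Return -2 * sum(rho_k * log xi_k); use rho_k = 1.0 for uncorrected factors."""
    if len(log_parts) != len(rhos):
        raise ValueError("one correction factor is needed per component")
    return -2.0 * sum(r * lp for r, lp in zip(rhos, log_parts))


# Hypothetical log-components, in the order xi_1, xi_2^dagger, xi_2^ddagger,
# xi_3^dagger, xi_3^ddagger (illustrative numbers, not taken from the paper).
log_parts = [-1.0, -0.5, -0.2, -0.3, -0.1]

# With all factors equal to 1 the result is the uncorrected -2 log(lambda_m).
plain = corrected_statistic(log_parts, [1.0] * 5)

# Hypothetical Bartlett factors on the three corrected components only.
adjusted = corrected_statistic(log_parts, [0.9, 0.8, 1.0, 0.7, 1.0])
print(plain, adjusted)
```

Under this reading, the adjusted statistic never exceeds the uncorrected one whenever every factor lies in \((0,1]\) and every log-component is non-positive.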
Next, using the LR of the covariance test with complete data from the previous subsection, we propose test statistics by decomposing the LR \(\lambda _m\) with three-step monotone missing data. Let
where
Because \(\xi _{11}^{*},\xi _{21}^{*}\), and \(\xi _{31}^{*}\) are of the form of LR for \(H_{04}\) in the case of complete data, the modified LRT statistics \(-2\rho _{\xi _{11}^{*}}\log \xi _{11}^{*},-2\rho _{\xi _{21}^{*}}\log \xi _{21}^{*}\), and \(-2\rho _{\xi _{31}^{*}}\log \xi _{31}^{*}\) are given, where
Therefore, we propose the test statistic \(-2\log \phi _m\) to improve the accuracy of the \(\chi ^2\) approximation, where
and
Next, we propose the test statistic \(-2\rho _{L_m}\log \lambda _m\) via linear interpolation, where
and
3.4 Asymptotic Expansion Approximation
In this subsection, we give an approximate upper \(100\alpha \) percentile of \(-2\log \lambda _{m}\) with three-step monotone missing data for a multi-sample problem. The upper \(100\alpha \) percentile of \(-2\log \lambda _{S_m}\) can be expanded as
where
where \(\chi _{f_m{;1-\alpha }}^2\) is the upper \(100\alpha \) percentile of the \(\chi ^2\) distribution with \(f_m\) degrees of freedom (Hosoya and Seo [3]). Based on linear interpolation and letting \(q_m^*(\alpha )\) be the upper \(100\alpha \) percentile of \(-2\log \lambda _m\), the following can be obtained:
where
3.5 Simulation Studies
We evaluate the accuracy and the asymptotic behaviors of the \(\chi ^2\) approximations via Monte Carlo simulation (\(10^6\) runs). Now let
In Tables 7, 8, and 9, we provide the simulated upper 100\(\alpha \) percentiles of \(-2\log \lambda _{m},\) \(-2\rho _{L_{m}}\log \lambda _{m},-2\log \tau _{m}\), and \(-2\log \phi _{m}\) and the approximate upper percentiles of \(-2\log \lambda _{m}\) (\(q_{m}^*(\alpha )\)) and the actual type I error rates \(\alpha _{m}, \alpha _{\rho _{L_m}}, \alpha _{\tau _{m}}, \alpha _{\phi _{m}}\), and \(\alpha _{q_{m}^*}\); \(\alpha =0.05\);
where \((p_1,p_2,p_3)\) is (4, 4, 4).
The simulated values approach the upper percentile of the \(\chi ^2\) distribution as the sample size increases. However, the simulated values are less accurate than in the one-sample case, even if the sample size is quite large. In addition, a comparison of the Type I error rates \(\alpha _{m}, \alpha _{\rho _{L_m}}, \alpha _{\tau _{m}}, \alpha _{\phi _{m}}\), and \(\alpha _{q_{m}^*}\) shows that the approximate percentile \(q_{m}^*(\alpha )\) is the most accurate.
3.6 Numerical Example
In this section, we give an example of the test statistics and approximate upper percentiles proposed in this paper. The data consist of cholesterol values measured during treatment at five time points (baseline, 6 months, 12 months, 20 months, and 24 months) for a placebo group and a high dose group (Wei and Lachin [8]). We used data with values available for up to 24 months, data with values available for up to 20 months, and data with values available for up to 12 months to construct the three-step monotone missing data. That is, \(m=2, p_1=3, p_2=1, p_3=1\). For the placebo group (\(\ell =1\)), \(N_1^{(1)}=31, N_2^{(1)}=4, N_3^{(1)}=3\), and for the high dose group (\(\ell =2\)), \(N_1^{(2)}=36, N_2^{(2)}=7, N_3^{(2)}=12\). Then, the LRT statistic and the test statistics are
The approximate upper percentile is
and \(\chi ^2_{20{;0.95}}=31.410, \chi ^2_{20{;0.99}}=37.566\). Thus, the null hypothesis is rejected by all test statistics and also when the approximate upper percentile is used as the critical value.
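The bookkeeping behind this example can be checked with a few lines. The sketch below recomputes the degrees of freedom \(f_m=p(p+3)(m-1)/2\) and the group sample sizes from the values quoted above, and applies the rejection rule with the quoted \(\chi ^2\) critical values. The observed statistics are not reproduced in this text, so the value passed to `reject` is a placeholder, and the function names are hypothetical.

```python
def multi_sample_df(p, m):
    """Degrees of freedom f_m = p(p+3)(m-1)/2 of the multi-sample simultaneous test."""
    return p * (p + 3) * (m - 1) // 2


# (N1, N2, N3) for each group, as quoted in the example.
groups = {"placebo": (31, 4, 3), "high dose": (36, 7, 12)}
p1, p2, p3 = 3, 1, 1

p = p1 + p2 + p3
m = len(groups)
f_m = multi_sample_df(p, m)
totals = {name: sum(sizes) for name, sizes in groups.items()}

# Critical values of the chi-square distribution with 20 d.f., as quoted in the text.
critical = {0.05: 31.410, 0.01: 37.566}


def reject(statistic, alpha):
    """Reject the null hypothesis when the statistic exceeds the critical value."""
    return statistic > critical[alpha]


print(f_m, totals)
# Placeholder statistic value, for illustration only.
print(reject(40.0, 0.05))
```

The same comparison applies with \(q_m^*(\alpha )\) in place of the \(\chi ^2\) critical value when the uncorrected statistic \(-2\log \lambda _m\) is used.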
4 Conclusions
We discussed simultaneous tests for mean vectors and covariance matrices with three-step monotone missing data for a one-sample and a multi-sample problem. For a one-sample problem, we proposed two test statistics (\(-2\log \tau _{1},\) \(-2\log \phi _{1}\)) by decomposing the LR and correcting it by extracting the LR of the simultaneous test and the test of the variance in the case of complete data. We also proposed a test statistic (\(-2\rho _{L_{1}}\log \lambda _{1}\)) via linear interpolation. In addition, we provided an approximate upper \(100\alpha \) percentile (\(q_{1}^*(\alpha )\)). Based on the simulation results, the test statistic \(-2\log \phi _{1}\), which was modified only for the LR part of the test of the variance for the complete data, gave the most accurate results. Similarly, for a multi-sample problem, we proposed three test statistics (\(-2\rho _{L_{m}}\log \lambda _{m}, -2\log \tau _{m}, -2\log \phi _{m}\)) and an approximate upper percentile (\(q_{m}^*(\alpha )\)). Furthermore, based on the simulation results, the approximate upper \(100\alpha \) percentile \(q_{m}^*(\alpha )\) is the most accurate. Finally, we gave an example of the proposed test statistics. The results of this paper can be extended to the k-step monotone missing data. We are currently investigating this problem.
References
Hao J, Krishnamoorthy K (2001) Inferences on a normal covariance matrix and generalized variance with monotone missing data. J Multivar Anal 78:62–82
Hosoya M, Seo T (2015) Simultaneous testing of the mean vector and the covariance matrix with two-step monotone missing data. SUT J Math 51:83–98
Hosoya M, Seo T (2016) On the likelihood ratio test for the equality of multivariate normal populations with two-step monotone missing data. J Stat Theory Pract 10:673–692
Jinadasa KG, Tracy DS (1992) Maximum likelihood estimation for multivariate normal distribution with monotone sample. Commun Stat Theory Methods 21:41–50
Kanda T, Fujikoshi Y (1998) Some basic properties of the MLE’s for a multivariate normal distribution with monotone missing data. Am J Math Manag Sci 18:161–190
Muirhead RJ (2005) Aspects of multivariate statistical theory. John Wiley & Sons Inc, Hoboken
Srivastava MS (2002) Methods of multivariate statistics. John Wiley & Sons Inc, New York
Wei LJ, Lachin JM (1984) Two-sample asymptotically distribution-free tests for incomplete multivariate observations. J Am Stat Assoc 79:653–661
Yagi A, Yamaguchi R, Seo T (2016) Simultaneous testing of mean vectors and covariance matrices with monotone missing data, Technical Report No.16-02, Statistical Research Group, Hiroshima University, Hiroshima, Japan
Krishnamoorthy K, Pannala MK (1999) Confidence estimation of a normal mean vector with incomplete data. Can J Stat 27:395–407. https://doi.org/10.2307/3315648
Acknowledgements
The authors would like to thank the referee for the helpful comments and suggestions. The second author’s research is partly supported by a Grant-in-Aid for Early-Career Scientists (19K20225, 22K13961).
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Appendix
1.1 Derivation of Theorem 1
Following the derivation of MLEs of the mean vector and the covariance matrix with two-step monotone missing data in Kanda and Fujikoshi [5], we consider the transformation matrix
for the three-step monotone missing data case. In this case, the transformed vector \({\varvec{y}}_j\) is
The transformed parameters are defined as
where
Then, the likelihood function after the transformation can be written as
We note that the pair \(({\varvec{\eta }},{\varvec{\varDelta }})\) is in one-to-one correspondence with \(({\varvec{\mu }},{\varvec{\varSigma }})\). The MLEs of \({\varvec{\eta }}\) and \({\varvec{\varDelta }}\) (\(\widehat{{\varvec{\eta }}}\) and \(\widehat{{\varvec{\varDelta }}}\)) can be obtained by differentiating the log likelihood function \(\log L({\varvec{\eta }}, {\varvec{\varDelta }})\) with respect to \({\varvec{\eta }}\) and \({\varvec{\varDelta }}\), respectively. Calculating
yields (3).
About this article
Cite this article
Sakai, R., Yagi, A. & Seo, T. Simultaneous Tests for Mean Vectors and Covariance Matrices with Three-Step Monotone Missing Data. J Stat Theory Pract 18, 3 (2024). https://doi.org/10.1007/s42519-023-00355-2
DOI: https://doi.org/10.1007/s42519-023-00355-2
Keywords
- Asymptotic expansion
- Likelihood ratio test
- Linear interpolation
- Maximum likelihood estimator
- Modified likelihood ratio test statistic