7.1 Introduction

The Extended Growth Curve Model (EGCM) is an extension of the Growth Curve Model (GCM) and arises in situations where there are linear restrictions on the mean parameters of the model [22, 26, 27, 30, 31]. Both the GCM and the EGCM are also referred to as Generalized Multivariate Analysis of Variance (GMANOVA) models, since they are generalizations of the classical Multivariate Analysis of Variance (MANOVA) model [1, 9,10,11,12, 26]. The term bilinear regression will be used frequently in describing the GCM and its extensions, to indicate the presence of two design matrices, the within- and between-individual design matrices, and to describe the bilinear nature of the projections corresponding to these two design matrices. The linear restrictions on MANOVA models lead to a bilinear model, where the model involves projections with respect to two design matrices, as already mentioned [3, 4, 16, 30].

The GCM has been demonstrated to be useful in the analysis of longitudinal data, especially for short to moderate time series, where the analysis is performed assuming that the mean for each of the groups follows a polynomial of degree q − 1 [11, 21, 23]. There is extensive literature in the area, and various distributions and covariance structures have been considered [6, 13, 14, 18,19,20, 28]. High-dimensional extensions have also been considered in recent years [5, 8, 24, 25]. In practical applications, however, the means for different groups might be represented by polynomials of different degrees. This happens, for instance, when we have clustered longitudinal data, where the response over time follows different shapes for different clusters or groups. In such situations, the EGCM is useful since it allows polynomials of different degrees to be fitted within one modeling framework [3, 7, 15, 17, 26, 31].

Consider a GCM with m groups, where measurements are taken from each of the n individuals at p different time points. Suppose also that the mean (across time) for the ith (i = 1, 2, …, m) group can be represented by a polynomial function of degree q − 1 and can be described as

$$\displaystyle \begin{aligned}b_{0,i}+b_{1,i}t+b_{2,i}t^2+ \cdots +b_{q-1,i}t^{q-1},\quad t=t_1, t_2, \ldots, t_p.\end{aligned}$$

The GCM, in matrix format, is given as

$$\displaystyle \begin{aligned} \mathbf{Y} = \mathbf{Z} \mathbf{B} \mathbf{X} + \mathbf{E}, \end{aligned}$$

where Y : p × n represents the observation (outcome) matrix, Z : p × q and X : m × n are the within- and between-individual design matrices, respectively, and B : q × m represents the parameter matrix with the coefficients of the polynomials. The columns of the error matrix E : p × n are assumed to be distributed as a p-variate normal distribution with mean zero and positive definite covariance matrix Σ. The various matrices in the model above are given explicitly by

$$\displaystyle \begin{aligned} {\mathbf{Z}}^\prime=\left( \begin{array}{ccccc} 1 & 1 & 1 & \cdots & 1 \\ t_1 & t_2 & t_3 & \cdots & t_p \\ t_1^2 & t_2^2 & t_3^2 & \cdots & t_p^2 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ t_1^{q-1} & t_2^{q-1} & t_3^{q-1} & \cdots & t_p^{q-1} \end{array}\right),\quad \mathbf{X}=\left( \begin{array}{ccccc} {\mathbf{1}}_{n_1} & {\mathbf{0}}_{n_2} & {\mathbf{0}}_{n_3} & \cdots & {\mathbf{0}}_{n_m} \\ {\mathbf{0}}_{n_1} & {\mathbf{1}}_{n_2} & {\mathbf{0}}_{n_3} & \cdots & {\mathbf{0}}_{n_m} \\ {\mathbf{0}}_{n_1} & {\mathbf{0}}_{n_2} & {\mathbf{1}}_{n_3} & \cdots & {\mathbf{0}}_{n_m} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ {\mathbf{0}}_{n_1} & {\mathbf{0}}_{n_2} & {\mathbf{0}}_{n_3} & \cdots & {\mathbf{1}}_{n_m} \\ \end{array} \right), \end{aligned} $$
$$\displaystyle \begin{aligned} \mathbf{Y}=\left( \begin{array}{ccccc} y_{11} & y_{12} & y_{13} & \cdots & y_{1n}\\ y_{21} & y_{22} & y_{23} & \cdots & y_{2n}\\ y_{31} & y_{32} & y_{33} & \cdots & y_{3n}\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ y_{p1} & y_{p2} & y_{p3} & \cdots & y_{pn}\\ \end{array} \right),\quad \mathbf{B}=\left( \begin{array}{ccccc} b_{01} & b_{02} & b_{03} & \cdots & b_{0m}\\ b_{11} & b_{12} & b_{13} & \cdots & b_{1m}\\ b_{21} & b_{22} & b_{23} & \cdots & b_{2m}\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ b_{(q-1)1} & b_{(q-1)2} & b_{(q-1)3} & \cdots & b_{(q-1)m}\\ \end{array} \right) \end{aligned} $$

and it is assumed that \(q \leq p\), \(\operatorname{rank}(\mathbf{X}) + p \leq n\) and \(n = n_1 + n_2 + \cdots + n_m\).
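To make the design concrete, the following minimal sketch builds Z and X from a set of time points and group sizes. The function name gcm_design_matrices and the use of NumPy are our own choices for illustration, not part of the model literature.

```python
import numpy as np

def gcm_design_matrices(time_points, group_sizes, q):
    """Build the within-individual Z (p x q) and between-individual
    X (m x n) design matrices of the GCM described above."""
    t = np.asarray(time_points, dtype=float)
    m, n = len(group_sizes), int(sum(group_sizes))
    # Columns of Z are 1, t, t^2, ..., t^(q-1), evaluated at t_1, ..., t_p
    Z = np.vander(t, N=q, increasing=True)
    # Row i of X indicates membership in group i (blocks of ones)
    X = np.zeros((m, n))
    start = 0
    for i, n_i in enumerate(group_sizes):
        X[i, start:start + n_i] = 1.0
        start += n_i
    return Z, X

# Example: p = 5 time points, m = 2 groups (n1 = 4, n2 = 6), cubic mean (q = 4)
Z, X = gcm_design_matrices([1, 2, 3, 4, 5], [4, 6], q=4)
```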

Inferences for the GCM have been considered by many authors, and maximum likelihood estimators for both the mean parameter and the covariance matrix are available [4, 6, 11]. The likelihood ratio test for the general linear hypothesis was also derived by Khatri [11], under some full rank restrictions. Residuals in the GCM were considered, and mathematical decompositions were performed to provide insight into the characteristics of the various components [29, 30]. This work was further extended to the EGCM, where a better understanding of the design space as well as the residual space was achieved [3, 30].

Consider now the decomposed residuals in the GCM [29] and the simple hypothesis B = 0 (the mean is zero); an extension of the well-known Lawley-Hotelling trace test was previously derived for testing this hypothesis [4]. The test turned out to be a function of the decomposed residuals, which in turn facilitated appropriate interpretation of the decomposed residuals as well as a better understanding of the distribution of the test statistic. In their paper, Hamid and colleagues showed that the distributions of the test statistics for the simple hypothesis B = 0 as well as the general linear hypothesis GBF = 0 are free of the unknown covariance matrix Σ [4]. This means that the distributions can be generated empirically, and the critical values for the tests can be calculated through simulation or parametric bootstrapping. Furthermore, the authors also showed that the distribution of the test statistic can be represented as a weighted sum of chi-square random variables, which allowed them to provide appropriate approximations.

In this paper, we consider the EGCM and provide tests for a special case of the model, where we assume the mean growth curves to be clustered in two categories. The means (over time) for the two clusters follow polynomials of different degrees, e.g., one cluster consisting of groups with linear growth curves and the other consisting of groups with quadratic curves. We first present and discuss the decomposed residuals in the EGCM [3, 30], and formulate two hypotheses based on their practical relevance in the analysis of clustered longitudinal data requiring the EGCM. We propose potential statistical tests motivated by the residuals, using a formulation similar to the trace test in the GCM, which, as mentioned above, is an extension of the well-known Lawley-Hotelling trace test in the MANOVA model. We then provide a formal mathematical derivation using a restricted maximum likelihood approach and give some distributional properties of the test statistics. We evaluate performance using extensive simulations and provide a real data illustration.

7.2 Residuals in the Extended Growth Curve Model

Suppose we have m groups and a total of n individuals from whom measurements are taken at p time points. Without loss of generality, consider the EGCM, where groups can be categorized into two clusters. Suppose, again without loss of generality, that the mean for the first cluster, consisting of \(m_1\) groups, follows a polynomial of degree \(q_1 - 1\) and the mean for the second cluster (consisting of \(m_2\) groups) follows a polynomial of degree \(q_1 + q_2 - 1\). For presentation purposes, let \(m = m_1 + m_2\) and \(q = q_1 + q_2\). The EGCM, in matrix formulation, can be represented as

$$\displaystyle \begin{aligned} \mathbf{Y}={\mathbf{Z}}_{1}{\mathbf{B}}_{1}{\mathbf{X}}_{1}+{\mathbf{Z}}_{2}{\mathbf{B}}_{2}{\mathbf{X}}_{2}+\mathbf{E}, {} \end{aligned} $$
(7.1)

where \(q_1, q_2 \leq p\), \(\operatorname{rank}({\mathbf{X}}_1) + p \leq n\) and \(\mathcal {C}({\mathbf {X}}_{2}^\prime ) \subseteq \mathcal {C}({\mathbf {X}}_{1}^{\prime })\), where \(\mathcal {C}(\cdot)\) denotes the column space of a matrix. The observation matrix Y has the same dimension and representation as described above for the GCM, and the n columns of the error matrix E are assumed to be distributed as a p-variate normal random variable with mean 0 and covariance Σ. The within-individual design matrix \({\mathbf{Z}}_1\), now of dimension \(p \times q_1\), also has the same representation as Z in the GCM, but with \(q_1\) replacing q; the parameter matrix \({\mathbf{B}}_1\), of dimension \(q_1 \times m\), contains the coefficients of the \(q_1\) lower-order polynomial terms that are common to all m groups in both clusters. The between-individual design matrix \({\mathbf{X}}_1\), of dimension \(m \times n\), has the same representation as X in the GCM. The remaining matrices involved in the EGCM are given by

$$\displaystyle \begin{aligned} {{\mathbf{Z}}_2}^\prime :q_2 \times p & = \left( \begin{array}{ccccc} t_1^{q_{1}} & t_2^{q_{1}} & t_3^{q_{1}} & \cdots & t_p^{q_{1}} \\ t_1^{q_{1}+1} & t_2^{q_{1}+1} & t_3^{q_{1}+1} & \cdots & t_p^{q_{1}+1} \\ t_1^{q_{1}+2} & t_2^{q_{1}+2} & t_3^{q_{1}+2} & \cdots & t_p^{q_{1}+2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ t_1^{q_{1}+q_{2}-1} & t_2^{q_{1}+q_{2}-1} & t_3^{q_{1}+q_{2}-1} & \cdots & t_p^{q_{1}+q_{2}-1} \\ \end{array} \right),\\ {\mathbf{X}}_{2}:m_{2} \times n & = \left( \begin{array}{ccccccc} {\mathbf{0}}_{n_1} & {\mathbf{0}}_{n_2} & \cdots & {\mathbf{1}}_{n_{m_{1}+1}}&{\mathbf{0}}_{n_{m_{1}+2}}&\cdots&{\mathbf{0}}_{n_m} \\ {\mathbf{0}}_{n_1} & {\mathbf{0}}_{n_2} & \cdots & {\mathbf{0}}_{n_{m_{1}+1}}&{\mathbf{1}}_{n_{m_{1}+2}}&\cdots&{\mathbf{0}}_{n_m}\\ \vdots & \vdots & \vdots & \vdots & \vdots& \ddots & \vdots\\ {\mathbf{0}}_{n_1} & {\mathbf{0}}_{n_2} & \cdots & {\mathbf{0}}_{n_{m_{1}+1}}&{\mathbf{0}}_{n_{m_{1}+2}}&\cdots&{\mathbf{1}}_{n_m}\\ \end{array} \right),\\ {\mathbf{B}}_{2}:q_{2} \times m_{2} & =\left( \begin{array}{ccccc} b_{q_{1}1} & b_{q_{1}2} & b_{q_{1}3} & \cdots & b_{q_{1}m_{2}}\\ b_{(q_{1}+1)1} & b_{(q_{1}+1)2} & b_{(q_{1}+1)3} & \cdots & b_{(q_{1}+1)m_{2}}\\ b_{(q_{1}+2)1} & b_{(q_{1}+2)2} & b_{(q_{1}+2)3} & \cdots & b_{(q_{1}+2)m_{2}}\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ b_{(q_{1}+q_{2}-1)1} & b_{(q_{1}+q_{2}-1)2} & b_{(q_{1}+q_{2}-1)3} & \cdots & b_{(q_{1}+q_{2}-1)m_{2}}\\ \end{array} \right). \end{aligned} $$

Considerable literature is available on GMANOVA models in general, and the EGCM in particular [3, 7, 12, 15, 26, 30, 31]. Solutions to the likelihood equations are available, and explicit formulae for the estimators have been provided [12, 26]. Nevertheless, there is limited work on hypothesis testing for the mean parameters of the EGCM, and hence limited applications of the model, despite the fact that longitudinal data in practice often follow different functions over time across the different groups. The focus of this paper is, therefore, to contribute towards hypothesis testing for the mean parameters of the EGCM. In doing so, we are particularly interested in the decomposed residuals, and we consider a special case of the model, without loss of generality. We use the residuals as motivation to formulate hypotheses relevant to real-world applications, and we then mathematically derive the corresponding test statistics. As such, let us first consider the estimated model, derived using the maximum likelihood (ML) approach, which is always unique [3, 26, 27]. The mathematical expression for the estimated model is given by

$$\displaystyle \begin{aligned} \hat{\mathbf{Y}} & = {\mathbf{Z}}_{1}\widehat{\mathbf{B}}_{1}{\mathbf{X}}_{1}+{\mathbf{Z}}_{2}\widehat{\mathbf{B}}_{2}{\mathbf{X}}_{2} \\ & = (\mathbf{I}-{\mathbf{T}}_{1})\mathbf{Y}{\mathbf{X}}_{1}^{\prime}({\mathbf{X}}_{1}{\mathbf{X}}_{1}^{\prime})^{-}{\mathbf{X}}_{1}+(\mathbf{I}-\mathbf{ T}_{2})\mathbf{Y}{\mathbf{X}}_{2}^{\prime}({\mathbf{X}}_{2}{\mathbf{X}}_{2}^{\prime})^{-} {\mathbf{X}}_{2}, {} \end{aligned} $$
(7.2)

where \({\mathbf{A}}^{-}\) represents a generalized inverse of a matrix A and

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{T}}_{1}&\displaystyle =&\displaystyle {\mathbf{I}}-{\mathbf{Z}}_{1}({\mathbf{Z}}_{1}^{\prime}{\mathbf{S}}_{1}^{-1}{\mathbf{Z}}_{1})^{-}{\mathbf{Z}}_{1}^{\prime}{\mathbf{S}}_{1}^{-1},\\ {\mathbf{T}}_{2}&\displaystyle =&\displaystyle {\mathbf{I}}-{\mathbf{T}}_{1}{\mathbf{Z}}_{2}({\mathbf{Z}}_{2}^{\prime}{\mathbf{T}}_{1}^{\prime}{\mathbf{S}}_{2}^{-1}{\mathbf{T}}_{1}{\mathbf{Z}}_{2})^{-}{\mathbf{Z}}_{2}^{\prime}{\mathbf{T}}_{1}^{\prime}{\mathbf{S}}_{2}^{-1};\\ {\mathbf{S}}_{1}&\displaystyle =&\displaystyle {\mathbf{Y}}({\mathbf{I}}-{\mathbf{X}}_{1}^{\prime}({\mathbf{X}}_{1}{\mathbf{X}}_{1}^{\prime})^{-}{\mathbf{X}}_{1}){\mathbf{Y}}';\\ {\mathbf{S}}_{2}&\displaystyle =&\displaystyle {\mathbf{S}}_{1}+{\mathbf{T}}_{1}{\mathbf{Y}}{\mathbf{X}}_{1}^{\prime}({\mathbf{X}}_{1}{\mathbf{X}}_{1}^{\prime})^{-}{\mathbf{X}}_{1}({\mathbf{I}}-{\mathbf{X}}_{2}^{\prime}({\mathbf{X}}_{2}{\mathbf{X}}_{2}^{\prime})^{-}{\mathbf{X}}_{2})\\ &\displaystyle &\displaystyle \times\, {\mathbf{X}}_{1}^{\prime}({\mathbf{X}}_{1}{\mathbf{X}}_{1}^{\prime})^{-}{\mathbf{X}}_{1}{\mathbf{Y}}'{\mathbf{T}}_{1}^{\prime}. \end{array} \end{aligned} $$
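As a numerical companion to these formulas, the sketch below computes S1, S2, T1, T2 and the fitted values in (7.2). The Moore-Penrose pseudoinverse is used as one convenient choice of generalized inverse, and the function names proj and egcm_fit are ours; this is a sketch under the stated rank assumptions, not a definitive implementation.

```python
import numpy as np

def proj(X):
    """Projection X'(XX')^- X onto the row space of X
    (Moore-Penrose pinv used as the generalized inverse)."""
    return X.T @ np.linalg.pinv(X @ X.T) @ X

def egcm_fit(Y, Z1, Z2, X1, X2):
    """S1, S2, T1, T2 and the ML fitted values (7.2) of the EGCM.
    Assumes S1 and S2 are invertible (rank(X1) + p <= n)."""
    p, n = Y.shape
    P1, P2 = proj(X1), proj(X2)
    S1 = Y @ (np.eye(n) - P1) @ Y.T
    S1i = np.linalg.inv(S1)
    T1 = np.eye(p) - Z1 @ np.linalg.pinv(Z1.T @ S1i @ Z1) @ Z1.T @ S1i
    S2 = S1 + T1 @ Y @ P1 @ (np.eye(n) - P2) @ P1 @ Y.T @ T1.T
    S2i = np.linalg.inv(S2)
    T1Z2 = T1 @ Z2
    T2 = np.eye(p) - T1Z2 @ np.linalg.pinv(T1Z2.T @ S2i @ T1Z2) @ T1Z2.T @ S2i
    Yhat = (np.eye(p) - T1) @ Y @ P1 + (np.eye(p) - T2) @ Y @ P2
    return Yhat, S1, S2, T1, T2
```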

In vectorized form, this can be re-written as

$$\displaystyle \begin{aligned}\textit{Vec}\hat{\mathbf{Y}} = [{\mathbf{P}}_{{\mathbf{X}}_{1}}\otimes{\mathbf{P}}_{{\mathbf{Z}}_{1}}]\textit{Vec}\mathbf{Y}+[\mathbf{ P}_{{\mathbf{X}}_{2}} \otimes{\mathbf{P}}_{{\mathbf{Z}}_{2}}]\textit{Vec}\mathbf{Y} = \mathbf{P}\textit{Vec}\mathbf{Y},\end{aligned}$$

where Vec represents the vectorized form of a matrix, ⊗ represents the Kronecker product and

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{P}}_{{\mathbf{X}}_{1}} &\displaystyle =&\displaystyle {\mathbf{X}}_{1}^{\prime}({\mathbf{X}}_{1}{\mathbf{X}}_{1}^{\prime})^{-}{\mathbf{X}}_{1},\\ {\mathbf{P}}_{{\mathbf{X}}_{2}} &\displaystyle =&\displaystyle {\mathbf{X}}_{2}^{\prime}({\mathbf{X}}_{2}{\mathbf{X}}_{2}^{\prime})^{-}{\mathbf{X}}_{2},\\ {\mathbf{P}}_{{\mathbf{Z}}_{1}} &\displaystyle =&\displaystyle {\mathbf{Z}}_{1}({\mathbf{Z}}_{1}^{\prime}{\mathbf{S}}_{1}^{-1}{\mathbf{Z}}_{1})^{-}{\mathbf{Z}}_{1}^{\prime}\mathbf{ S}_{1}^{-1},\\ {\mathbf{P}}_{{\mathbf{Z}}_{2}} &\displaystyle =&\displaystyle {\mathbf{T}}_{1}{\mathbf{Z}}_{2}({\mathbf{Z}}_{2}^{\prime}{\mathbf{T}}_{1}^{\prime}{\mathbf{S}}_{2}^{-1}{\mathbf{T}}_{1}\mathbf{ Z}_{2})^{-}{\mathbf{Z}}_{2}^{\prime}\mathbf{ T}_{1}^{\prime}{\mathbf{S}}_{2}^{-1}. \end{array} \end{aligned} $$

It is evident that the estimated model is a bilinear projection with respect to the four design matrices; the corresponding design space equals the column space of the matrix P [3, 30]. This column space can be represented as a sum of two tensor product spaces

$$\displaystyle \begin{aligned}\mathcal{C}(\mathbf{P}) = \mathcal{C}({\mathbf{X}}_{1}^{\prime})\otimes\mathcal{C}_{{\mathbf{S}}_{1}}({\mathbf{Z}}_{1})+\mathcal{C}(\mathbf{ X}_{2}^{\prime}) \otimes\mathcal{C}_{{\mathbf{S}}_{2}}({\mathbf{T}}_{1}{\mathbf{Z}}_{2}),\end{aligned}$$

where ⊗ represents the tensor product between two spaces. The relationship between the Kronecker product of two matrices and the tensor product of spaces is presented in more detail in Kollo and von Rosen [12], and its application to the decomposition of residuals in GMANOVA models is provided in Hamid and von Rosen [3].
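The Kronecker representation can be checked numerically. The snippet below, which reuses proj and egcm_fit from the earlier sketch, verifies that Vec Ŷ = P Vec Y for simulated data with a linear first cluster (q1 = 2) and a quadratic second cluster (q2 = 1); the toy dimensions and names are our own choices.

```python
import numpy as np

rng = np.random.default_rng(1)
p, q1, q2, n1, n2 = 6, 2, 1, 8, 9
n = n1 + n2
t = np.arange(1.0, p + 1.0)
Z1 = np.vander(t, N=q1, increasing=True)      # columns 1, t
Z2 = t[:, None] ** np.arange(q1, q1 + q2)     # column t^2
X1 = np.zeros((2, n)); X1[0, :n1] = 1.0; X1[1, n1:] = 1.0
X2 = X1[1:, :]                                # C(X2') contained in C(X1')
Y = rng.standard_normal((p, n))

Yhat, S1, S2, T1, _ = egcm_fit(Y, Z1, Z2, X1, X2)
S1i, S2i = np.linalg.inv(S1), np.linalg.inv(S2)
PZ1 = Z1 @ np.linalg.pinv(Z1.T @ S1i @ Z1) @ Z1.T @ S1i
T1Z2 = T1 @ Z2
PZ2 = T1Z2 @ np.linalg.pinv(T1Z2.T @ S2i @ T1Z2) @ T1Z2.T @ S2i
P = np.kron(proj(X1), PZ1) + np.kron(proj(X2), PZ2)

vec = lambda M: M.reshape(-1, order="F")      # column-stacking Vec
assert np.allclose(P @ vec(Y), vec(Yhat))
```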

Now consider the residuals in the EGCM, which are defined on the orthogonal complement of the design space, that is, a projection onto \((\mathcal {C}({\mathbf {X}}_{1}^{\prime })\otimes \mathcal {C}_{{\mathbf {S}}_{1}}({\mathbf {Z}}_{1})+\mathcal {C}({\mathbf {X}}_{2}^{\prime })\otimes \mathcal {C}_{{\mathbf {S}}_{2}}({\mathbf {T}}_{1}{\mathbf {Z}}_{2}))^{\perp }\) [3, 30]. This space was mathematically decomposed, and four residuals were provided in previous work. A graphical elucidation of the design and residual spaces, as well as the corresponding formulas for the residuals, is provided below (Fig. 7.1). More mathematical details about the residuals in the EGCM can be found in Hamid and von Rosen [3].

Fig. 7.1 The spaces representing the fitted model (design space) and the residuals in the EGCM

Note that some of the spaces in Fig. 7.1 can be decomposed further (e.g., note the broken lines across \({\mathbf{R}}_2\) and \({\mathbf{R}}_3\)). Although further decompositions can also be interpreted in terms of what the model was not able to explain (i.e., residuals), we focus on these four residuals, mainly because of their practical relevance in applications of GMANOVA models:

$$\displaystyle \begin{aligned} { \mathbf{R}}_{1}&=({ \mathbf{I}}-{ \mathbf{T}}_{1}){ \mathbf{Y}}({ \mathbf{I}}-{ \mathbf{X}}_{1}^{\prime}({ \mathbf{X}}_{1}{ \mathbf{ X}}_{1}^{\prime})^{-}{ \mathbf{X}}_{1}),\\ { \mathbf{R}}_{2}&={ \mathbf{T}}_{1}{ \mathbf{Y}}({ \mathbf{I}}-{ \mathbf{X}}_{1}^{\prime}({ \mathbf{X}}_{1}{ \mathbf{X}}_{1}^{\prime})^{-}{ \mathbf{X}}_{1}),\\ { \mathbf{R}}_{3}&={ \mathbf{T}}_{1}{ \mathbf{Y}}({ \mathbf{X}}_{1}^{\prime}({ \mathbf{X}}_{1}{ \mathbf{X}}_{1}^{\prime})^{-}{ \mathbf{ X}}_{1}-{ \mathbf{X}}_{2}^{\prime}({ \mathbf{X}}_{2}{ \mathbf{X}}_{2}^{\prime})^{-}{ \mathbf{X}}_{2}),\\ { \mathbf{R}}_{4}&=({ \mathbf{T}}_{1}+{ \mathbf{T}}_{2}-{ \mathbf{I}}){ \mathbf{YX}}_{2}^{\prime}({ \mathbf{X}}_{2}{ \mathbf{X}}_{2}^{\prime})^{-}{ \mathbf{X}}_{2}. \end{aligned} $$

As we can see from the figure and the mathematical characteristics of the residuals, \({\mathbf{R}}_{12}={\mathbf{R}}_{1}+{\mathbf{R}}_{2}\) is equivalent to the residual in MANOVA models and appears in the GCM as a sum of two of the decomposed residuals [29, 30]. This residual represents the difference between the observations and the group means, \((\mathbf {Y}-\mathbf {Y}{ \mathbf { X}}_{1}^{\prime }({ \mathbf {X}}_{1}{ \mathbf {X}}_{1}^{\prime })^{-}{ \mathbf {X}}_{1})=\mathbf {Y}({ \mathbf {I}}-{ \mathbf {X}}_{1}^{\prime }({ \mathbf { X}}_{1}{ \mathbf {X}}_{1}^{\prime })^{-}{ \mathbf {X}}_{1})\), and is distributed as a multivariate normal random variable; it is hence useful for assessing between-individual assumptions as well as distributional assumptions. Note further that \({\mathbf {R}}_{12}{\mathbf {R}}_{12}^{\prime } = {\mathbf {S}}_{1}\), and that the sample covariance matrix is \(\frac {1}{n-1}{\mathbf {S}}_{1}\). On the other hand, \({\mathbf{R}}_{34}={\mathbf{R}}_{3}+{\mathbf{R}}_{4}\) provides information about overall model fit, while \({\mathbf{R}}_{3}\) and \({\mathbf{R}}_{4}\) individually provide information on different components of the polynomial fit.

For instance, consider two clusters, where the mean for one cluster can be represented by a linear function over time and the mean of the second cluster follows a quadratic curve. For this scenario, \({\mathbf{R}}_{3}\) provides information on the linear component of the model (for both clusters) and \({\mathbf{R}}_{4}\) provides information on the quadratic component of the second cluster. More on this will follow when we derive and evaluate the proposed tests; a small numerical illustration is given below.
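Continuing the numerical sketch above (proj, egcm_fit and the simulated Y, Z1, Z2, X1, X2 are assumed), the following snippet computes R1 through R4 and checks two identities from the text: R12 R12' = S1, and that the fitted values plus the four residuals recover the data.

```python
import numpy as np

def egcm_residuals(Y, T1, T2, X1, X2):
    """The four decomposed residuals R1-R4 displayed above."""
    p, n = Y.shape
    P1, P2 = proj(X1), proj(X2)
    R1 = (np.eye(p) - T1) @ Y @ (np.eye(n) - P1)
    R2 = T1 @ Y @ (np.eye(n) - P1)
    R3 = T1 @ Y @ (P1 - P2)
    R4 = (T1 + T2 - np.eye(p)) @ Y @ P2
    return R1, R2, R3, R4

Yhat, S1, S2, T1, T2 = egcm_fit(Y, Z1, Z2, X1, X2)
R1, R2, R3, R4 = egcm_residuals(Y, T1, T2, X1, X2)
R12 = R1 + R2
assert np.allclose(R12 @ R12.T, S1)               # R12 R12' = S1
assert np.allclose(Yhat + R1 + R2 + R3 + R4, Y)   # fit + residuals = data
```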

7.3 Some Tests for the Extended Growth Curve Model

Consider the EGCM defined in (7.1). Without loss of generality, consider only two groups, where each group is considered a cluster. Suppose measurements from each group are taken at p different time points, the mean for the first group can be represented by a linear function over time, and the mean for the second group follows a quadratic curve. Note that this setup is just to simplify the presentation in this paper; the tests are derived under general assumptions and are not restricted to two groups or linear/quadratic polynomials.

Consider now the simple hypothesis that the mean is zero (testing overall significance of the model), which can be formulated as

$$\displaystyle \begin{aligned} \begin{array}{rcl} H_{o}&\displaystyle :&\displaystyle { \mathbf{B}}_{1}={ \mathbf{0}},\ { \mathbf{B}}_{2}={ \mathbf{0}}, \\ H_{1}&\displaystyle :&\displaystyle { \mathbf{B}}_{1}\neq{ \mathbf{0}} \text{ and/or } { \mathbf{B}}_{2}\neq{ \mathbf{0}}. {} \end{array} \end{aligned} $$
(7.3)

Recall the Lawley-Hotelling trace test for the MANOVA and the GCM [4, 5]. Both tests are weighted functions of the observed mean and the corresponding residuals (the part of the observed mean left unexplained by the fitted model), where the weight is the between-individual variation represented, in the GCM, by the sample variance-covariance matrix, which is a function of S. Using analogous arguments, one can suggest that a test statistic for the simple hypothesis presented above, in the EGCM framework, will be a weighted function of \({ \mathbf {YX}}_{1}^{\prime }({ \mathbf {X}}_{1}{ \mathbf {X}}_{1}^{\prime })^{-}{ \mathbf {X}}_{1}{ \mathbf {Y}'}\) and \({ \mathbf {R}}_{34}{ \mathbf {R}}_{34}^{\prime }\), and will have the format

$$\displaystyle \begin{aligned} \begin{array}{rcl} \frac{f( \mathbf{YX}_{1}^{\prime}({ \mathbf{X}}_{1}{ \mathbf{X}}_{1}^{\prime})^{-}{ \mathbf{X}}_{1}{ \mathbf{Y}'})}{f( {\mathbf{R}}_{34}{ \mathbf{ R}}_{34}^{\prime})}. \end{array} \end{aligned} $$

Below, we provide a mathematical derivation of the test. In doing so, we write the likelihood function as a product of three independent terms. We maximize one part of the likelihood to obtain an estimator for the unknown covariance matrix, which then replaces the covariance matrix in the likelihood function to give the estimated likelihood.

Consider first the EGCM in (7.1), which we can rewrite as

$$\displaystyle \begin{aligned} \begin{array}{rcl} { \mathbf{Y}}&=&{ \mathbf{Z}}_{1}({ \mathbf{B}}_{11}:{ \mathbf{B}}_{12})\left( \begin{array}{c} { \mathbf{X}}_{11} \\ { \mathbf{X}}_{12}\\ \end{array} \right)+{ \mathbf{Z}}_{2}{ \mathbf{B}}_{2}{ \mathbf{X}}_{2}+{ \mathbf{E}} \\ &=&{ \mathbf{Z}}_{1}{ \mathbf{B}}_{11}{ \mathbf{X}}_{11}+{ \mathbf{Z}}_{1}{ \mathbf{B}}_{12}{ \mathbf{X}}_{12}+{ \mathbf{Z}}_{2}{ \mathbf{B}}_{2}{ \mathbf{X}}_{2}+{ \mathbf{E}}, {} \end{array} \end{aligned} $$
(7.4)

where \({\mathbf{B}}_{1} = ({\mathbf{B}}_{11} : {\mathbf{B}}_{12})\) and \({ \mathbf {X}}_{1}^{\prime }=({ \mathbf {X}}_{11}^{\prime }:{ \mathbf {X}}_{12}^{\prime })\). Considering the two clusters (in our case, two groups as well) separately, the model reduces to

$$\displaystyle \begin{aligned} { \mathbf{Y}}_{1}={ \mathbf{Z}}_{1}{ \mathbf{B}}_{11}{ \mathbf{X}}^{1}_{11}+{ \mathbf{E}}_{1},\end{aligned} $$

and

$$\displaystyle \begin{aligned} { \mathbf{Y}}_{2}={ \mathbf{Z}}_{1}{ \mathbf{B}}_{12}{ \mathbf{X}}^{2}_{12}+{ \mathbf{Z}}_{2}{ \mathbf{B}}_{2}{ \mathbf{X}}_{22}+{ \mathbf{E}}_{2},\end{aligned} $$

for Group I and Group II, respectively. Here \({\mathbf{Y}}_{1}\), \({ \mathbf {X}}^{1}_{11}\) and \({\mathbf{E}}_{1}\) are matrices consisting of the first \(n_1\) columns of Y, \({\mathbf{X}}_{11}\) and E, respectively. The matrices \({\mathbf{Y}}_{2}\), \({ \mathbf {X}}^{2}_{12}\), \({\mathbf{X}}_{22}\) and \({\mathbf{E}}_{2}\) consist of the last \(n_2\) columns of Y, \({\mathbf{X}}_{12}\), \({\mathbf{X}}_{2}\) and E, respectively. Observe that \({\mathbf{X}}_{12} = {\mathbf{X}}_{2}\). Moreover, it is possible to show that

$$\displaystyle \begin{aligned} { \mathbf{X}}_{11}^{\prime}({ \mathbf{X}}_{11}{ \mathbf{X}}_{11}^{\prime})^{-}{ \mathbf{X}}_{11}={ \mathbf{X}}_{1}^{\prime}({ \mathbf{X}}_{1}{ \mathbf{ X}}_{1}^{\prime})^{-}{ \mathbf{X}}_{1}-{ \mathbf{X}}_{2}^{\prime}({ \mathbf{X}}_{2}{ \mathbf{X}}_{2}^{\prime})^{-}{ \mathbf{X}}_{2}.\end{aligned} $$
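This projection identity is easy to confirm numerically for the block designs used here. In the two-group toy setup from the earlier sketch, X11 is simply the first row of X1 (a hypothetical choice with m1 = 1).

```python
import numpy as np

# Numerical check of the identity above (proj, X1, X2 as in the earlier sketch)
X11 = X1[:1, :]   # indicator rows of the cluster-1 groups (here m1 = 1)
assert np.allclose(proj(X11), proj(X1) - proj(X2))
```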

Now consider the likelihood function for the EGCM

$$\displaystyle \begin{aligned} L=\gamma|{ \boldsymbol{\varSigma}}|{}^{-\frac{n}{2}}e^{-\frac{1}{2}tr\{{ \boldsymbol{\varSigma}}^{-1} ({ \mathbf{Y}}-({ \mathbf{Z}}_{1}{ \mathbf{B}}_{1}{ \mathbf{X}}_{1}+{ \mathbf{Z}}_{2}{ \mathbf{B}}_{2}{ \mathbf{X}}_{2}))({ \mathbf{Y}}-({ \mathbf{Z}}_{1}{ \mathbf{B}}_{1}{ \mathbf{X}}_{1}+{ \mathbf{Z}}_{2}{ \mathbf{B}}_{2}{ \mathbf{X}}_{2}))'\}}, {}\end{aligned} $$
(7.5)

where \(\gamma =(2\pi )^{-\frac {1}{2}np}\). It can be rewritten as a product of three terms,

$$\displaystyle \begin{aligned} L=L_{1}\quad \times\quad L_{2}\quad \times\quad L_{3}, \end{aligned}$$

where

$$\displaystyle \begin{aligned} \begin{array}{rcl} L_{1}&\displaystyle =&\displaystyle \gamma\,\text{exp}\{-\frac{1}{2}tr\{{ \boldsymbol{\varSigma}}^{-1} ({ \mathbf{YX}}_{2}^{\prime}({ \mathbf{X}}_{2}{ \mathbf{X}}_{2}^{\prime})^{-}{ \mathbf{X}}_{2}-({ \mathbf{Z}}_{1}{ \mathbf{B}}_{12}{ \mathbf{X}}_{12}+{ \mathbf{Z}}_{2}{ \mathbf{B}}_{2}{ \mathbf{X}}_{2}))\\ &\displaystyle &\displaystyle \times ({ \mathbf{YX}}_{2}^{\prime}({ \mathbf{X}}_{2}{ \mathbf{X}}_{2}^{\prime})^{-}{ \mathbf{X}}_{2}-({ \mathbf{Z}}_{1}{ \mathbf{B}}_{12}{ \mathbf{X}}_{12}+{ \mathbf{Z}}_{2}{ \mathbf{B}}_{2}{ \mathbf{X}}_{2}))'\}\},\\ L_{2}&\displaystyle =&\displaystyle \text{exp}\{-\frac{1}{2}tr\{{ \boldsymbol{\varSigma}}^{-1}({ \mathbf{Y}}({ \mathbf{X}}_{1}^{\prime}({ \mathbf{X}}_{1}{ \mathbf{X}}_{1}^{\prime})^{-}{ \mathbf{X}}_{1}-{ \mathbf{X}}_{2}^{\prime} ({ \mathbf{X}}_{2}{ \mathbf{X}}_{2}^{\prime})^{-}{ \mathbf{X}}_{2})-{ \mathbf{Z}}_{1}{ \mathbf{B}}_{11}{ \mathbf{X}}_{11})\\ &\displaystyle &\displaystyle \times ({ \mathbf{Y}}({ \mathbf{X}}_{1}^{\prime}({ \mathbf{X}}_{1}{ \mathbf{X}}_{1}^{\prime})^{-}{ \mathbf{X}}_{1}-{ \mathbf{X}}_{2}^{\prime}({ \mathbf{ X}}_{2}{ \mathbf{X}}_{2}^{\prime})^{-}{ \mathbf{X}}_{2})-{ \mathbf{Z}}_{1}{ \mathbf{B}}_{11}{ \mathbf{X}}_{11})'\}\},\\ L_{3}&\displaystyle =&\displaystyle |{ \boldsymbol{\varSigma}}|{}^{-\frac{n}{2}}\text{exp}\{-\frac{1}{2}tr\{{ \boldsymbol{\varSigma}}^{-1}{ \mathbf{Y}}({ \mathbf{I}}-{ \mathbf{X}}_{1}^{\prime}({ \mathbf{X}}_{1}{ \mathbf{X}}_{1}^{\prime})^{-}{ \mathbf{X}}_{1}){ \mathbf{Y}}'\}\}. \end{array} \end{aligned} $$

Let us now consider \(L_3\), which is free of the parameters specified in the hypothesis, and maximize it to obtain an estimator for the unknown covariance matrix Σ. It is possible to show that the estimator maximizing \(L_3\) is given by \(n\hat {\boldsymbol {\varSigma }}={\mathbf {S}}_{1}\). We now replace Σ in (7.5) by its estimator to get the estimated likelihood, denoted by EL, and then maximize EL under \(H_o\) and \(H_o \cup H_1\) to get the desired test. The maxima of the EL under \(H_o\) and \(H_o \cup H_1\) are, respectively, given by

$$\displaystyle \begin{aligned} \gamma_{1}|{ \mathbf{S}}_{1}|{}^{-\frac{n}{2}}e^{-\frac{1}{2}tr\{n{ \mathbf{S}}_{1}^{-1} { \mathbf{YX}}_{1}^{\prime}({ \mathbf{X}}_{1}{ \mathbf{X}}_{1}^{\prime})^{-}{ \mathbf{X}}_{1}{ \mathbf{Y}}'\}} {} \end{aligned} $$
(7.6)

and

$$\displaystyle \begin{aligned} \gamma_{1}|{ \mathbf{S}}_{1}|{}^{-\frac{n}{2}}\text{exp}&\{-\frac{1}{2}ntr\{{ \mathbf{S}}_{1}^{-1} ({ \mathbf{YX}}_{1}^{\prime}({ \mathbf{X}}_{1}{ \mathbf{X}}_{1}^{\prime})^{-}{ \mathbf{X}}_{1}-({ \mathbf{Z}}_{1}\hat{{ \mathbf{B}}}_{1}{ \mathbf{X}}_{1} +{ \mathbf{Z}}_{2}\hat{{ \mathbf{B}}}_{2}{ \mathbf{X}}_{2})) \\ &\times({ \mathbf{YX}}_{1}^{\prime}({ \mathbf{X}}_{1}{ \mathbf{X}}_{1}^{\prime})^{-}{ \mathbf{X}}_{1}-({ \mathbf{Z}}_{1}\hat{{ \mathbf{B}}}_{1}{ \mathbf{X}}_{1} +{ \mathbf{Z}}_{2}\hat{{ \mathbf{B}}}_{2}{ \mathbf{X}}_{2}))'\}\}, {} \end{aligned} $$
(7.7)

where \(\gamma _{1}=n^{\frac {np}{2}}(2\pi )^{-\frac {1}{2}np}e^{-\frac {1}{2}np}\). Note that we can rewrite \({\mathbf{R}}_{3}\) and \({\mathbf{R}}_{4}\) as follows:

$$\displaystyle \begin{aligned} { \mathbf{R}}_{3}={ \mathbf{S}}_{1}{ \mathbf{Z}}_{1}^{o}({\mathbf{Z}}_{1}^{o \prime} { \mathbf{S}}_{1}{ \mathbf{Z}}_{1}^{o})^{-}{\mathbf{Z}}_{1}^{o \prime} {\mathbf{Y}}({\mathbf{X}}_{1}^{\prime}({\mathbf{X}}_{1}{\mathbf{X}}_{1}^{\prime})^{-}{ \mathbf{X}}_{1} -{\mathbf{X}}_{2}^{\prime}({\mathbf{X}}_{2}{\mathbf{X}}_{2}^{\prime})^{-}{ \mathbf{X}}_{2}), {} \end{aligned} $$
(7.8)
$$\displaystyle \begin{aligned} { \mathbf{R}}_{4}={ \mathbf{S}}_{2}{ \mathbf{Z}}^{o}({\mathbf{Z}}^{o \prime} { \mathbf{S}}_{2}{\mathbf{Z}}^{o})^{-}{\mathbf{Z}}^{o \prime}{\mathbf{YX}}_{2}^{\prime}({ \mathbf{X}}_{2}{\mathbf{X}}_{2}^{\prime})^{-}{ \mathbf{X}}_{2}, {} \end{aligned}$$

where \({ \mathbf {Z}}_{1}^{o}\) and \({\mathbf{Z}}^{o}\) are matrices of full rank spanning the orthogonal complements of the column spaces of \({\mathbf{Z}}_{1}\) and \({\mathbf{Z}} = ({\mathbf{Z}}_{1} : {\mathbf{T}}_{1}{\mathbf{Z}}_{2})\), respectively. It is possible to show that \({\mathbf{R}}_{34}\), the sum of the residuals \({\mathbf{R}}_{3}\) and \({\mathbf{R}}_{4}\), can be written as the difference between the observed and estimated means, i.e.,

$$\displaystyle \begin{aligned}{ \mathbf{R}}_{34}={ \mathbf{YX}}_{1}^{\prime}({ \mathbf{X}}_{1}{ \mathbf{X}}_{1}^{\prime})^{-}{ \mathbf{X}}_{1}-({ \mathbf{Z}}_{1}\hat{{ \mathbf{B}}}_{1}{ \mathbf{X}}_{1}+{ \mathbf{Z}}_{2}\hat{{ \mathbf{B}}}_{2}{ \mathbf{X}}_{2}).\end{aligned}$$

A test statistic is defined by taking the ratio between (7.6) and (7.7), which can be simplified as

$$\displaystyle \begin{aligned} \frac{e^{-\frac{1}{2}ntr\{{ \mathbf{S}}_{1}^{-1}{ \mathbf{YX}}_{1}^{\prime}({ \mathbf{X}}_{1}{ \mathbf{X}}_{1}^{\prime})^{-}{ \mathbf{X}}_{1}{ \mathbf{Y}}'\}}}{e^{-\frac{1}{2}ntr\{{ \mathbf{S}}_{1}^{-1}{ \mathbf{R}}_{34}{ \mathbf{R}}_{34}^{\prime}\}}}, {} \end{aligned} $$
(7.9)

where the hypothesis is rejected when the value of the ratio is small, i.e., close to zero; note that the ratio takes values between zero and one. One can also define an equivalent test via the logarithm of the test statistic which, apart from the factor \(-\frac{n}{2}\), can be re-written as

$$\displaystyle \begin{aligned} tr\{{ \mathbf{S}}_{1}^{-1}{ \mathbf{YX}}_{1}^{\prime}({ \mathbf{X}}_{1}{ \mathbf{X}}_{1}^{\prime})^{-}{ \mathbf{X}}_{1}{ \mathbf{Y}}'\}-tr\{{ \mathbf{S}}_{1}^{-1}{ \mathbf{R}}_{34}{ \mathbf{R}}_{34}^{\prime}\}, {} \end{aligned} $$
(7.10)

and the hypothesis will be rejected for large values of (7.10). This formulation makes some distributional characteristics of the proposed test easier to understand. As such, consider the first term in (7.10) and write it as a sum of two terms as follows:

$$\displaystyle \begin{aligned} \begin{array}{rcl} tr\{{ \mathbf{S}}_{1}^{-1}{ \mathbf{YX}}_{1}^{\prime}({ \mathbf{X}}_{1}{ \mathbf{X}}_{1}^{\prime})^{-}{ \mathbf{X}}_{1}{ \mathbf{Y}}'\}&\displaystyle =&\displaystyle tr\{{ \mathbf{S}}_{1}^{-1}{ \mathbf{YX}}_{11}^{\prime}({ \mathbf{X}}_{11}{ \mathbf{X}}_{11}^{\prime})^{-}{ \mathbf{X}}_{11}{ \mathbf{Y}}'\} \\ &\displaystyle &\displaystyle +tr\{{ \mathbf{S}}_{1}^{-1}{ \mathbf{YX}}_{2}^{\prime}({ \mathbf{X}}_{2}{ \mathbf{X}}_{2}^{\prime})^{-}{ \mathbf{X}}_{2}{ \mathbf{Y}}'\}, {} \end{array} \end{aligned} $$
(7.11)

where \({\mathbf{X}}_{11}\) is as in (7.4). Similarly, using the expressions for \({\mathbf{R}}_{3}\) and \({\mathbf{R}}_{4}\), the second term in (7.10) can be written as

$$\displaystyle \begin{aligned} tr\{{ \mathbf{S}}_{1}^{-1}{ \mathbf{R}}_{34}{ \mathbf{R}}_{34}^{\prime}\}&=&tr\{{ \mathbf{YX}}_{11}^{\prime}({ \mathbf{X}}_{11}{ \mathbf{X}}_{11}^{\prime})^{-}{ \mathbf{X}}_{11}{ \mathbf{Y}}'{ \mathbf{Z}}_{1}^{o}({{\mathbf{Z}}_{1}^{o}}^\prime{ \mathbf{S}}_{1}{ \mathbf{Z}}_{1}^{o})^{-}{{\mathbf{Z}}_{1}^{o}}^\prime\} \\ &&+tr\{{ \mathbf{YX}}_{2}^{\prime}({ \mathbf{X}}_{2}{ \mathbf{X}}_{2}^{\prime})^{-}{ \mathbf{X}}_{2}{ \mathbf{Y}}'{ \mathbf{Z}}^{o}({{\mathbf{Z}}^{o}}^\prime{ \mathbf{S}}_{2}{ \mathbf{Z}}^{o})^{-}{{\mathbf{Z}}^{o}}^\prime\}. {} \end{aligned} $$
(7.12)

By subtracting (7.12) from (7.11), the test statistic reduces to

$$\displaystyle \begin{aligned} \phi_{1}({ \mathbf{Y}})&=&tr\{{ \mathbf{YX}}_{11}^{\prime}({ \mathbf{X}}_{11}{ \mathbf{X}}_{11}^{\prime})^{-}{ \mathbf{X}}_{11}{ \mathbf{Y}}'{ \mathbf{S}}_{1}^{-1}{ \mathbf{Z}}_{1} ({ \mathbf{Z}}_{1}^{\prime}{ \mathbf{S}}_{1}^{-1}{ \mathbf{Z}}_{1})^{-}{ \mathbf{Z}}_{1}^{\prime}{ \mathbf{S}}_{1}^{-1}\} \\ &&+tr\{{ \mathbf{YX}}_{2}^{\prime}({ \mathbf{X}}_{2}{ \mathbf{X}}_{2}^{\prime})^{-}{ \mathbf{X}}_{2}{ \mathbf{Y}}'{ \mathbf{S}}_{2}^{-1}{ \mathbf{Z}} ({ \mathbf{Z}}'{ \mathbf{S}}_{1}^{-1}{ \mathbf{Z}})^{-}{ \mathbf{Z}}'{ \mathbf{S}}_{1}^{-1}\}, {} \end{aligned} $$
(7.13)

where \({\mathbf{Z}} = ({\mathbf{Z}}_{1} : {\mathbf{T}}_{1}{\mathbf{Z}}_{2})\). The hypothesis is rejected when the value of \(\phi_1(\mathbf{Y})\) is large. Here it is important to note that the column spaces of \(({\mathbf{Z}}_{1} : {\mathbf{Z}}_{2})\) and \(({\mathbf{Z}}_{1} : {\mathbf{T}}_{1}{\mathbf{Z}}_{2})\) are identical [2, 3].
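For concreteness, a minimal sketch of this statistic, computed directly from (7.13), is given below. It reuses proj and egcm_fit from the earlier sketch, and the argument m1 (the number of cluster-1 groups, i.e., the rows of X1 forming X11) is our own parameterization.

```python
import numpy as np

def phi1(Y, Z1, Z2, X1, X2, m1):
    """Test statistic phi_1(Y) of (7.13); a sketch, not reference code."""
    _, S1, S2, T1, _ = egcm_fit(Y, Z1, Z2, X1, X2)
    S1i, S2i = np.linalg.inv(S1), np.linalg.inv(S2)
    X11 = X1[:m1, :]
    Z = np.hstack([Z1, T1 @ Z2])                  # Z = (Z1 : T1 Z2)
    term1 = np.trace(Y @ proj(X11) @ Y.T @ S1i @ Z1
                     @ np.linalg.pinv(Z1.T @ S1i @ Z1) @ Z1.T @ S1i)
    term2 = np.trace(Y @ proj(X2) @ Y.T @ S2i @ Z
                     @ np.linalg.pinv(Z.T @ S1i @ Z) @ Z.T @ S1i)
    return term1 + term2
```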

The test statistic given in (7.13) is always greater than or equal to zero. Moreover, it is possible to see from the expression in (7.9) that the numerator is a function of \({ \mathbf {YX}}_{1}^{\prime }({ \mathbf {X}}_{1}{ \mathbf {X}}_{1}^{\prime })^{-}{ \mathbf {X}}_{1}\), which is the observed mean. In the denominator, on the other hand, we have a function of \({\mathbf{R}}_{34}\), the residual obtained by subtracting the estimated mean from the observed mean. This shows that the test compares the observed and estimated means, in some weighted fashion, and rejects the hypothesis when the difference between them is “small”, which is quite intuitive and in agreement with the formulation of the Lawley-Hotelling trace tests in the MANOVA and GMANOVA models.

Below, we show that the distribution of \(\phi_1(\mathbf{Y})\) under the null hypothesis is independent of the unknown covariance matrix Σ. This is extremely important in applications, since the empirical distribution can be generated without knowledge of the covariance matrix, and hence significance testing can be performed and p-values provided. Performance evaluation through simulations can also be carried out by assuming, without loss of generality, that Σ = I. On the other hand, it is possible to show that the distribution under the alternative depends on Σ, and hence the power of the test depends on the variance-covariance matrix. We show this empirically through simulations.
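Because the null distribution is free of Σ, a critical value can be simulated with Σ = I, exactly as described. The sketch below does this for the phi1 function defined earlier; the replication count and seed are arbitrary choices.

```python
import numpy as np

def phi1_critical_value(Z1, Z2, X1, X2, m1, alpha=0.05, reps=5000, seed=0):
    """Simulate the null distribution of phi_1 (Sigma = I, B1 = B2 = 0)
    and return the upper-alpha quantile; reject when phi_1 is large."""
    rng = np.random.default_rng(seed)
    p, n = Z1.shape[0], X1.shape[1]
    stats = [phi1(rng.standard_normal((p, n)), Z1, Z2, X1, X2, m1)
             for _ in range(reps)]
    return np.quantile(stats, 1.0 - alpha)
```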

Now, to show that the distribution of the test statistic in (7.13) under the null is independent of Σ, consider the first part of the expression and let \({\mathbf {Z}}_{1}^{o}\) be a matrix of full rank spanning the orthogonal complement of the space generated by the columns of \({\mathbf{Z}}_{1}\). We can write the first expression in (7.13) as

$$\displaystyle \begin{aligned} & tr\{\mathbf{Y}{\mathbf{X}}_{11}^{\prime}({\mathbf{X}}_{11}{\mathbf{X}}_{11}^{\prime})^{-}{\mathbf{X}}_{11}\mathbf{Y}'{\mathbf{S}}_{1}^{-1}\}\\ &\qquad -tr\{\mathbf{Y}{\mathbf{X}}_{11}^{\prime}({\mathbf{X}}_{11}{\mathbf{X}}_{11}^{\prime})^{-} {\mathbf{X}}_{11}\mathbf{Y}'{\mathbf{Z}}_{1}^{o}({{\mathbf{Z}}_{1}^{o}}^\prime {\mathbf{S}}_{1}{\mathbf{Z}}_{1}^{o})^{-1} {\mathbf{ Z}_{1}^{o}}^\prime \}. {} \end{aligned} $$
(7.14)

The first term in (7.14) is invariant under the transformation \(\mathbf{Y} \mapsto \boldsymbol {\varSigma }^{-\frac {1}{2}}\mathbf {Y}\). It is, therefore, possible to replace Y by \(\boldsymbol {\varSigma }^{-\frac {1}{2}}\mathbf {Y}\), which shows that the distribution of this term is independent of Σ. For the second term, using the cyclic property of the trace function, we can rewrite it as

$$\displaystyle \begin{aligned}tr\{{\mathbf{X}}_{11}^{\prime}({\mathbf{X}}_{11}{\mathbf{X}}_{11}^{\prime})^{-}{\mathbf{X}}_{11}\mathbf{Y}'{\mathbf{Z}}_{1}^{o}({\mathbf{ Z}_{1}^{o}}^\prime{\mathbf{S}}_{1}{\mathbf{Z}}_{1}^{o})^{-1}{{\mathbf{Z}}_{1}^{o}}^\prime\mathbf{Y}\}.\end{aligned}$$

Now, write \({{\mathbf {Z}}_{1}^{o}}^\prime \mathbf {Y}\) as

$$\displaystyle \begin{aligned}({{\mathbf{Z}}_{1}^{o}}^\prime \boldsymbol{\varSigma}{\mathbf{Z}}_{1}^{o})^{\frac{1}{2}}({{\mathbf{Z}}_{1}^{o}}^\prime\boldsymbol{\varSigma}\mathbf{ Z}_{1}^{o})^{-\frac{1}{2}}{{\mathbf{Z}}_{1}^{o}}^\prime\mathbf{Y}\end{aligned}$$

and observe that we can rewrite \(({{\mathbf {Z}}_{1}^{o}}^\prime \boldsymbol {\varSigma }{\mathbf {Z}}_{1}^{o})^{\frac {1}{2}}({\mathbf { Z}_{1}^{o}}^\prime {\mathbf {S}}_{1}{\mathbf {Z}}_{1}^{o})^{-1} ({{\mathbf {Z}}_{1}^{o}}^\prime \boldsymbol {\varSigma }{\mathbf {Z}}_{1}^{o})^{\frac {1}{2}}\) as

$$\displaystyle \begin{aligned}(({{\mathbf{Z}}_{1}^{o}}^\prime\boldsymbol{\varSigma}{\mathbf{Z}}_{1}^{o})^{-\frac{1}{2}}{{\mathbf{Z}}_{1}^{o}}^\prime\mathbf{Y}(\mathbf{I}-{{\mathbf{X}}_{1}}^\prime({\mathbf{X}}_{1}{{\mathbf{X}}_{1}}^\prime)^{-} {\mathbf{X}}_{1})\mathbf{Y}'{\mathbf{Z}}_{1}^{o}({{\mathbf{Z}}_{1}^{o}}^\prime\boldsymbol{\varSigma}{\mathbf{Z}}_{1}^{o})^{-\frac{1}{2}})^{-1}.\end{aligned}$$

Consequently, it remains to show that the distribution of \(({{ {\mathbf {Z}}_{1}}^{o}}^\prime { \boldsymbol {\varSigma }} { {\mathbf {Z}}_{1}}^{o})^{-\frac {1}{2}}{{ {\mathbf {Z}}_{1}}^{o}}^\prime { \mathbf {Y}}\) is independent of Σ. Note that the expression is a linear function of a multivariate normal random variable. As a result, it is enough to show that the mean and dispersion matrices are independent of Σ, as shown below.

Under the null hypothesis, \(\boldsymbol{E}[{\mathbf{Y}}] = {\mathbf{Z}}_{1}{\mathbf{B}}_{1}{\mathbf{X}}_{1}+{\mathbf{Z}}_{2}{\mathbf{B}}_{2}{\mathbf{X}}_{2}={\mathbf{0}}\), which implies

$$\displaystyle \begin{aligned}\boldsymbol{E}[({ {{\mathbf{Z}}_{1}}^{o}}^\prime{ \boldsymbol{\varSigma}} { {\mathbf{Z}}_{1}}^{o})^{-\frac{1}{2}}{{ \mathbf{ Z}_{1}}^{o}}^\prime{ \mathbf{Y}}]={ \mathbf{0}}.\end{aligned}$$

The dispersion matrix, D, is given by

$$\displaystyle \begin{aligned}\boldsymbol{D}[({{ {\mathbf{Z}}_{1}}^{o}}^\prime{ \boldsymbol{\varSigma}} { {\mathbf{Z}}_{1}}^{o})^{-\frac{1}{2}}{{ \mathbf{ Z}_{1}}^{o}}^\prime{ \mathbf{Y}}]=({{ {\mathbf{Z}}_{1}}^{o}}^\prime{ \boldsymbol{\varSigma}} { {\mathbf{Z}}_{1}}^{o})^{-\frac{1}{2}} {{ {\mathbf{Z}}_{1}}^{o}}^\prime{ \boldsymbol{\varSigma}} { {\mathbf{Z}}_{1}}^{o}({{ {\mathbf{Z}}_{1}}^{o}}^\prime{ \boldsymbol{\varSigma}} { {\mathbf{Z}}_{1}}^{o})^{-\frac{1}{2}}={ \mathbf{I}}.\end{aligned}$$

Suppose now that we want to check whether the quadratic term in the growth curves of the individuals in the second group (cluster) is significantly different from zero. We can formulate the hypotheses (in terms of the mean parameters) as

$$\displaystyle \begin{aligned} H_{o}& : {\mathbf{B}}_{2}=\mathbf{0} \\ H_{1}& : {\mathbf{B}}_{2} \neq \mathbf{0}. {} \end{aligned} $$
(7.15)

Note that this hypothesis is associated with the residual \({\mathbf{R}}_{4}\). Using a motivation similar to that of the Lawley-Hotelling trace test in MANOVA, the trace test in the GCM and the test provided in (7.13), one can argue that the statistic for testing the hypothesis in (7.15) should have the format

$$\displaystyle \begin{aligned}\frac{f( \mathbf{YX}_{2}^{\prime}({ \mathbf{X}}_{2}{ \mathbf{X}}_{2}^{\prime})^{-}{ \mathbf{X}}_{2}{ \mathbf{Y}}')}{f( {\mathbf{R}}_{4}{ \mathbf{ R}}_{4}^{\prime})}.\end{aligned}$$

In order to provide a formal derivation, recall the likelihood function of the EGCM given in (7.5) and maximize the product \(L_2 \times L_3\), which is free of \({\mathbf{B}}_2\), the parameter specified in the hypothesis. This gives us the estimator \({\mathbf{S}}_2\) for nΣ. Update the likelihood using this estimator and proceed by taking the ratio of the maxima of the estimated likelihood under \(H_o\) and \(H_o \cup H_1\). The test statistic provided in (7.17) can then be obtained by taking the logarithm of the ratio and performing algebraic manipulations similar to those for \(\phi_1(\mathbf{Y})\), including using the cyclic property of the trace function (e.g., tr(AB) = tr(BA) for any two conformable matrices) as well as the fact that

$$\displaystyle \begin{aligned} \mathbf{I}-{\mathbf{Z}}_{1}({\mathbf{Z}}_{1}^{\prime}{{\mathbf{S}}_{2}}^{-1}{\mathbf{Z}}_{1})^{-}{\mathbf{Z}}_{1}^{\prime}{\mathbf{S}}_{2}^{-1} & =& {\mathbf{T}}_{1}{\mathbf{Z}}_{2}({{\mathbf{Z}}_{2}}^\prime {{\mathbf{T}}_{1}}^\prime {{\mathbf{S}}_{2}}^{-1} {\mathbf{T}}_{1}{\mathbf{Z}}_{2})^{-} {{\mathbf{Z}}_{2}}^\prime {{\mathbf{T}}_{1}}^\prime {{\mathbf{S}}_{2}}^{-1} \\ && + {\mathbf{S}}_{2}{\mathbf{Z}}^{o}({{\mathbf{Z}}^{o}}^\prime {\mathbf{S}}_{2}{\mathbf{Z}}^{o})^{-}{{\mathbf{Z}}^{o}}^\prime, {} \end{aligned} $$
(7.16)

where \({\mathbf{T}}_{1}\) and \({\mathbf{Z}}^{o}\) are as presented in (7.2) and (7.8), respectively. Note that the two terms on the right-hand side of (7.16) are orthogonal to each other. The test statistic for testing the hypothesis in (7.15) can be formally written as

$$\displaystyle \begin{aligned} \phi_{2}({ \mathbf{Y}})=tr\{{ \mathbf{YX}}_{2}^{\prime}({ \mathbf{X}}_{2}{ \mathbf{X}}_{2}^{\prime})^{-}{ \mathbf{X}}_{2}{ \mathbf{Y}}'{ \mathbf{S}}_{2}^{-1} { \mathbf{T}}_{1}{ \mathbf{Z}}_{2}({ \mathbf{Z}}_{2}^{\prime}{ \mathbf{T}}_{1}^{\prime}{ \mathbf{S}}_{2}^{-1}{ \mathbf{T}}_{1}{ \mathbf{Z}}_{2})^{-}{ \mathbf{Z}}_{2}^{\prime}{ \mathbf{T}}_{1}^{\prime}{ \mathbf{S}}_{2}^{-1}\}, {} \end{aligned} $$
(7.17)

and the hypothesis is rejected when the value of \(\phi_2(\mathbf{Y})\) is large. The above test statistic is always greater than or equal to zero. Similar to \(\phi_1(\mathbf{Y})\), the distribution of \(\phi_2(\mathbf{Y})\) under the null hypothesis is independent of the unknown covariance matrix Σ. To show this, note that the test statistic in (7.17) can be written as the difference between two terms as

$$\displaystyle \begin{aligned} & tr\{{ \mathbf{YX}}_{2}^{\prime}({\mathbf{X}}_{2}{\mathbf{X}}_{2}^{\prime})^{-}{\mathbf{X}}_{2}\mathbf{Y}' {\mathbf{G}}_{1}({ \mathbf{G}}_{1}^{\prime}\mathbf{ W}_{2}{\mathbf{G}}_{1})^{-1}{\mathbf{G}}_{1}^{\prime}\}\\ &\qquad -tr\{\mathbf{YX}_{2}^{\prime}({\mathbf{X}}_{2}{\mathbf{X}}_{2}^{\prime})^{-} {\mathbf{X}}_{2}\mathbf{Y}'{\mathbf{G}}_{2}({\mathbf{G}}_{2}^{\prime}{\mathbf{W}}_{2}{\mathbf{G}}_{2})^{-1}{\mathbf{G}}_{2}^{\prime}\}, {} \end{aligned} $$
(7.18)

where

$$\displaystyle \begin{aligned} { \mathbf{G}}_{r+1}&={ \mathbf{G}}_{r}({ \mathbf{G}}_{r}^{\prime}{ \mathbf{Z}}_{r+1})^{o},\quad { \mathbf{G}}_{0}={ \mathbf{I}},\\ { \mathbf{W}}_{r+1}&={ \mathbf{Y}}({ \mathbf{I}}-{ \mathbf{X}}_{r}^{\prime}({ \mathbf{X}}_{r}{ \mathbf{X}}_{r}^{\prime})^{-}{ \mathbf{X}}_{r}){ \mathbf{Y}}',\quad r=0,1,2,\ldots, m-1. \end{aligned} $$

Such approaches have been discussed in a more general form in von Rosen [27]. For special cases, we refer to Hamid [2] and Hamid and von Rosen [3]. The two terms in (7.18) can be, respectively, rewritten as

$$\displaystyle \begin{aligned}tr\{{ \mathbf{X}}_{2}^{\prime}({ \mathbf{X}}_{2}{ \mathbf{X}}_{2}^{\prime})^{-}{ \mathbf{X}}_{2}{ \mathbf{Y}}'{ \mathbf{G}}_{1}({ \mathbf{ G}}_{1}^{\prime}{ \mathbf{W}}_{2}{ \mathbf{G}}_{1})^{-1}{ \mathbf{G}}_{1}^{\prime}{ \mathbf{Y}}\}\end{aligned}$$

and

$$\displaystyle \begin{aligned}tr\{{ \mathbf{X}}_{2}^{\prime}({ \mathbf{X}}_{2}{ \mathbf{X}}_{2}^{\prime})^{-}{ \mathbf{X}}_{2}{ \mathbf{Y}}'{ \mathbf{G}}_{2}({ \mathbf{ G}}_{2}^{\prime}{ \mathbf{W}}_{2}{ \mathbf{G}}_{2})^{-1}{ \mathbf{G}}_{2}^{\prime}{ \mathbf{Y}}\}.\end{aligned}$$

In order to show that the distribution of ϕ 2(Y) under the null hypothesis is independent of Σ, we want to show that the distributions of the above two expressions under the null hypothesis are independent of Σ, which is equivalent to showing that the distributions of \({ \mathbf {G}}_{1}^{\prime }{ \mathbf {Y}}\) and \({ \mathbf {G}}_{2}^{\prime }{ \mathbf {Y}}\) under the null hypothesis are independent of Σ. As such, we write \({ \mathbf {G}}_{1}^{\prime }{ \mathbf {Y}}\) as

$$\displaystyle \begin{aligned}({ \mathbf{G}}_{1}^{\prime}{ \boldsymbol{\varSigma}} { \mathbf{G}}_{1})^{\frac{1}{2}}({ \mathbf{G}}_{1}^{\prime}{ \boldsymbol{\varSigma}} { \mathbf{G}}_{1})^{-\frac{1}{2}}{ \mathbf{G}}_{1}^{\prime}{ \mathbf{Y}},\end{aligned}$$

and it remains to show that the distribution of \(({ \mathbf {G}}_{1}^{\prime}{ \boldsymbol {\varSigma }} { \mathbf {G}}_{1})^{-\frac {1}{2}}{ \mathbf {G}}_{1}^{\prime }{ \mathbf {Y}}\), which is a linear function of a multivariate normal random variable, is independent of Σ. Once again, because of normality, it is enough to show that the mean and dispersion matrices are independent of Σ.

Under the null hypothesis, \({\mathbf{B}}_{2} = {\mathbf{0}}\), and, by construction, \({ \mathbf {G}}_{1}^{\prime }{ \mathbf {Z}}_{1}={ {\mathbf {Z}}_{1}^{o}}^\prime { \mathbf {Z}}_{1}={ \mathbf {0}}\). Consequently,

$$\displaystyle \begin{aligned}\boldsymbol{E}[({ \mathbf{G}}_{1}^{\prime}{ \boldsymbol{\varSigma}} { \mathbf{G}}_{1})^{-\frac{1}{2}}{ \mathbf{G}}_{1}^{\prime}{ \mathbf{Y}}]=({ \mathbf{G}}_{1}^{\prime}{ \boldsymbol{\varSigma}} { \mathbf{G}}_{1})^{-\frac{1}{2}}{ \mathbf{G}}_{1}^{\prime}({ \mathbf{Z}}_{1}{ \mathbf{B}}_{1}{ \mathbf{X}}_{1}+{ \mathbf{Z}}_{2}{ \mathbf{B}}_{2}{ \mathbf{X}}_{2})={ \mathbf{0}}.\end{aligned}$$

Furthermore,

$$\displaystyle \begin{aligned}\boldsymbol{D}[({ \mathbf{G}}_{1}^{\prime}{ \boldsymbol{\varSigma}} { \mathbf{G}}_{1})^{-\frac{1}{2}}{ \mathbf{G}}_{1}^{\prime}{ \mathbf{Y}}]=({ \mathbf{G}}_{1}^{\prime}{ \boldsymbol{\varSigma}} { \mathbf{G}}_{1})^{-\frac{1}{2}}{ \mathbf{G}}_{1}^{\prime} { \boldsymbol{\varSigma}} { \mathbf{G}}_{1}({ \mathbf{G}}_{1}^{\prime}{ \boldsymbol{\varSigma}} { \mathbf{G}}_{1})^{-\frac{1}{2}}={ \mathbf{I}}.\end{aligned}$$

Similar calculations can show that the distribution of \({ \mathbf {G}}_{2}^{\prime }{ \mathbf {Y}}\) is independent of Σ.
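A sketch of this second statistic, computed directly from (7.17) with the helper functions from the earlier sketches, is given below; as before, the Moore-Penrose pseudoinverse stands in for the generalized inverse.

```python
import numpy as np

def phi2(Y, Z1, Z2, X1, X2):
    """Test statistic phi_2(Y) of (7.17) for H_o: B2 = 0 (a sketch)."""
    _, S1, S2, T1, _ = egcm_fit(Y, Z1, Z2, X1, X2)
    S2i = np.linalg.inv(S2)
    T1Z2 = T1 @ Z2
    M = T1Z2 @ np.linalg.pinv(T1Z2.T @ S2i @ T1Z2) @ T1Z2.T @ S2i
    return np.trace(Y @ proj(X2) @ Y.T @ S2i @ M)
```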

7.4 Simulations

We performed simulations to evaluate the performance of the tests. We first generated the empirical distributions of both tests under the corresponding null hypotheses and calculated the critical values based on 50,000 simulations. The empirical level and power of the tests were then calculated from a second set of 10,000 simulations. Several scenarios were considered in terms of sample size (n), departure from the null hypotheses and number of time points p. We also considered several scenarios in terms of the covariance matrix Σ.
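The following sketch illustrates the level/power computation just described for the first test (phi1 and phi1_critical_value are the earlier sketch functions; Σ = I is used, as justified in the previous section, and the hypothetical B1, B2 set the departure from the null).

```python
import numpy as np

def rejection_rate(B1, B2, Z1, Z2, X1, X2, m1, crit, reps=10_000, seed=2):
    """Fraction of simulated data sets with phi_1 above the critical value:
    the empirical level when B1 = B2 = 0, the empirical power otherwise."""
    rng = np.random.default_rng(seed)
    p, n = Z1.shape[0], X1.shape[1]
    mean = Z1 @ B1 @ X1 + Z2 @ B2 @ X2     # B1: q1 x m, B2: q2 x m2
    hits = sum(phi1(mean + rng.standard_normal((p, n)),
                    Z1, Z2, X1, X2, m1) > crit for _ in range(reps))
    return hits / reps
```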

For the null hypothesis in (7.3), the distribution of the test statistic \(\phi_1(\mathbf{Y})\) under the null hypothesis is skewed to the left (Fig. 7.2); this is consistent across different sample sizes, values of p, degrees of polynomial q and magnitudes of Σ.

Fig. 7.2 The null distribution of \(\phi_1(\mathbf{Y})\), where p = 8, and (a) n = 20, (b) n = 30, (c) n = 40 and (d) n = 50. The arrows show the critical values corresponding to the tests

The simulation results show that \(\phi_1(\mathbf{Y})\) performs well and possesses all the desirable properties of a test statistic. Similar to the trace test in the GCM, the test maintains the nominal level α = 0.05 (Fig. 7.3) for all the scenarios considered.

Fig. 7.3 Empirical level of \(\phi_1(\mathbf{Y})\) and the corresponding 95% confidence bands

The results also show that the test is unbiased, symmetric, and monotone with respect to both n and departures from the null hypothesis (Fig. 7.4). As in our previous studies on the GCM and EGCM, departure from the null hypothesis is measured using the Euclidean norm of the parameter matrix [5,6,7,8]. The results further show that the test has reasonably good statistical power, detecting very small departures from the null hypothesis even with relatively small sample sizes (Fig. 7.4 and Table 7.1). The results are consistent across all the scenarios we considered.

Fig. 7.4 Empirical power of \(\phi_1(\mathbf{Y})\) with the x-axis depicting (a) positive and negative scenarios to show symmetry, and (b) the Euclidean norm of the parameter matrix, used to measure departures from the null

Table 7.1 Empirical power of \(\phi_1(\mathbf{Y})\) for multiple n, p and B

Similar results were obtained for \(\phi_2(\mathbf{Y})\): the simulations show that the distribution of the test statistic is skewed to the left, and that the test maintains the nominal level and has all the other desirable properties, such as unbiasedness, symmetry and monotonicity (Figs. 7.5 and 7.6). Furthermore, our simulations demonstrate that the test performs very well, detecting extremely small departures from the null hypothesis with a reasonably small sample size (Fig. 7.6).

Fig. 7.5 (a) Empirical level and (b) empirical power of \(\phi_2(\mathbf{Y})\)

Fig. 7.6 Empirical power of \(\phi_2(\mathbf{Y})\) with respect to the Euclidean norm of the parameter matrix across four sample sizes

7.5 Empirical Example

As an empirical illustration, we consider the glucose data [32]. The data consist of measurements taken at 8 time points from 13 control and 20 obese patients. The mean profile plots for the two groups are provided in Fig. 7.7 below. For illustration purposes, we assume two clusters, where the mean of one cluster (consisting of individuals in the obese group) follows a quadratic growth curve and the mean of the second cluster (consisting of individuals in the control group) follows a cubic growth curve.

Fig. 7.7 Mean profile plots for the glucose data

Results show that the null hypothesis of zero mean for both groups (i.e., \({\mathbf{B}}_1 = \mathbf{0}\), \({\mathbf{B}}_2 = \mathbf{0}\)) is rejected, with observed test statistic 62.76, critical value 0.80 and p-value < 0.0001. For the hypothesis that the coefficient of the cubic term is zero, the observed test statistic is 0.142 and the critical value is 0.173 (Fig. 7.8), indicating that there is no evidence to reject the null hypothesis (p-value = 0.07328) at the 5% level of significance.

Fig. 7.8 The null distribution of \(\phi_2(\mathbf{Y})\) for the glucose data

7.6 Summary

We considered the Extended Growth Curve Model, which can be applied in situations where longitudinal measurements from groups of individuals can be clustered into several groups. Using the decomposed residuals as motivation, we derived test statistics for two hypotheses related to the mean parameters of the model. The two corresponding tests can be shown to be extensions of the trace test in the Growth Curve Model, which in turn is an extension of the Lawley-Hotelling trace test.

Simulation results showed that the two tests possess all the desirable properties, such as unbiasedness, symmetry and monotonicity with respect to both sample size and departures from the null hypotheses, where departure is measured using the Euclidean norm. The results also demonstrate that the tests perform well, detecting very small departures from the null hypotheses. The results were consistent under all scenarios considered (e.g., different covariance matrices, different values of p and q).

Although the tests are derived under special scenarios, where we assumed two clusters, the approach used in our manuscript allows extensions to several clusters (the general EGCM). Moreover, the formulation provided in this study can also be extended to allow more general linear hypotheses to be tested.