1 Introduction

Advances in computing power over the past few decades have greatly encouraged the collection of multi-level multivariate data in all fields of science: biomedical, medical, environmental, social sciences and engineering. With these data sets, complex multivariate testing problems occur frequently. In clinical trial studies it is common to collect measurements on more than one response variable, at several locations, taken repeatedly over time on one experimental unit, in order to test the effectiveness of some medication, diet or treatment. Such data are called three-level multivariate data, and the doubly exchangeable covariance structure (defined below) is a suitable variance-covariance matrix for this kind of data.

Good examples of such data may be found in osteopenia and osteoporosis studies: it is estimated that one of every four post-menopausal women suffers from osteoporosis. Although it is more common in white or Asian women older than 50 years, osteoporosis can occur in almost any person at any age; in fact, more than 2 million American men are estimated to have osteoporosis, and the national cost of osteoporosis and related injuries is assessed at $14 billion each year in the United States.

Let \(\varvec{y}\) be the \(muv\)-variate real-valued random vector of all measurements. We partition \(\varvec{y}\) as follows:

$$\begin{aligned} \varvec{y}=\left( \begin{array}{c} \varvec{y}_{1} \\ \vdots \\ \varvec{y}_{v} \end{array} \right) ,\qquad \text {where }\quad \varvec{y}_{t}=\left( \begin{array}{c} \varvec{y}_{t1} \\ \vdots \\ \varvec{y}_{tu} \end{array} \right) ,\qquad \text {with }\quad \varvec{y}_{ts}=\left( \begin{array}{c} \varvec{y}_{ts1} \\ \vdots \\ \varvec{y}_{tsm} \end{array} \right) , \end{aligned}$$

for \(t=1,\ldots ,v\), \(s=1,\ldots ,u\). The m-dimensional vector of measurements \(\varvec{y}_{ts}\) represents the replicate at the sth location and the tth time point.

Let \({\varvec{\varTheta }=\text{ Cov }\left[ \varvec{y}\right] }\) be the \((muv \times muv)-\)dimensional partitioned covariance matrix. We say that the covariance matrix \({\varvec{\varTheta }}\) has a doubly exchangeable covariance structure (Roy and Leiva 2007) if it can be written as

$$\begin{aligned} \varvec{\varTheta }= & {} \varvec{I}_{uv} \otimes \varvec{U}_0 + \left[ \varvec{I}_{v}\otimes (\varvec{J}_{u} -\varvec{I}_u)\right] \otimes \varvec{U}_{1} + \left[ \varvec{J}_{uv}- (\varvec{I}_v \otimes \varvec{J}_u)\right] \otimes \varvec{W} \nonumber \\= & {} \varvec{I}_{uv}\otimes \left( \varvec{U}_{0}-\varvec{U}_{1}\right) +\varvec{I}_{v}\otimes \varvec{J}_{u} \otimes \left( \varvec{U}_{1}-\varvec{W}\right) +\varvec{J}_{uv} \otimes \varvec{W} \nonumber \\= & {} \varvec{I}_{v} \otimes \varvec{U} + \left( \varvec{J}_{v}-\varvec{I}_{v}\right) \otimes \varvec{W}^* \end{aligned}$$
(1)

where

$$\begin{aligned} \varvec{U} = \varvec{I}_{u} \otimes \varvec{U}_0 + \left( \varvec{J}_{u}-\varvec{I}_{u}\right) \otimes \varvec{U}_1 \end{aligned}$$

and

$$\begin{aligned} \varvec{W}^* = \varvec{J}_u\otimes \varvec{W}, \end{aligned}$$

with \(\varvec{U}_{0}\) a positive definite symmetric \(m\times m\) matrix, and \(\varvec{U}_{1}\) and \(\varvec{W}\) symmetric \(m\times m\) matrices. The matrices \(\varvec{U}_{0}\), \(\varvec{U}_{1}\) and \(\varvec{W}\) are all unstructured. That is, the doubly exchangeable covariance structure can be written as a matrix whose diagonal blocks are all equal to the same matrix \(\varvec{U}\), itself built from the component matrices \(\varvec{U}_0\) and \(\varvec{U}_1\), and whose remaining blocks are all filled with the matrix \(\varvec{W}\).

Thus, the vectors \(\varvec{y}_{11},\ldots , \varvec{y}_{1u},\ldots , \varvec{y}_{v1},\ldots , \varvec{y}_{vu}\) are doubly exchangeable if

$$\begin{aligned} \text{ Cov }\left[ \varvec{y}_{ts}; \varvec{y}_{t^{*}s^{*}}\right] =\left\{ \begin{array}{lllll} \varvec{U}_{0} &{}\quad \text {if} &{} t=t^{*} &{} \text {and} &{} s=s^{*}, \\ \varvec{U}_{1} &{}\quad \text {if} &{} t=t^{*} &{} \text {and} &{} s\ne s^{*}, \\ \varvec{W} &{}\quad \text {if} &{} t\ne t^{*}, &{} &{} \end{array} \right. \end{aligned}$$

with \(\text{ Cov }\left[ \varvec{y}_{ts}; \varvec{y}_{t^{*}s^{*}}\right] =\text{ Cov }\left[ \varvec{y}_{t^*s^*}; \varvec{y}_{ts}\right] \).

The \(m \times m\) diagonal blocks \(\varvec{U}_{0}\) in (1) represent the variance-covariance matrix of the m response variables at any given location and at any given time point, whereas the \(m \times m\) off-diagonal blocks \(\varvec{U}_{1}\) in (1) represent the covariance matrix of the m response variables between any two different locations, at any given time point. We assume \(\varvec{U}_{0}\) is the same for all locations and time points, and \(\varvec{U}_{1}\) is the same for all time points. The \(m \times m\) off-diagonal blocks \(\varvec{W}\) represent the covariance matrix of the m response variables between any two different time points. It is assumed to be the same for any pair of different time points, irrespective of location.
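As a concrete illustration of (1), the following minimal numpy sketch (our own illustration, not code from any package; the function name and the example matrices are merely illustrative) assembles \(\varvec{\varTheta }\) from \(\varvec{U}_0\), \(\varvec{U}_1\) and \(\varvec{W}\):

```python
import numpy as np

def doubly_exchangeable(U0, U1, W, u, v):
    """Assemble Theta per Eq. (1):
    Theta = I_uv x (U0 - U1) + I_v x J_u x (U1 - W) + J_uv x W,
    where 'x' denotes the Kronecker product and J_k is the k x k matrix of ones."""
    J = lambda k: np.ones((k, k))
    return (np.kron(np.eye(u * v), U0 - U1)
            + np.kron(np.kron(np.eye(v), J(u)), U1 - W)
            + np.kron(J(u * v), W))

# Illustration with m = 3 variables, u = 2 locations, v = 2 time points
m, u, v = 3, 2, 2
U0 = np.eye(m) + 0.5 * np.ones((m, m))   # positive definite symmetric
U1 = 0.3 * np.ones((m, m))               # symmetric
W  = 0.1 * np.ones((m, m))               # symmetric
Theta = doubly_exchangeable(U0, U1, W, u, v)   # (muv x muv) = (12 x 12)
```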

A good and simple example of a three-level multivariate dataset on osteoporosis may be obtained from Johnson and Wichern (2007), who report data on a study in which an investigator measures the mineral content of three bones, radius, humerus and ulna (\({m=3}\)), by photon absorptiometry, to examine whether a particular dietary supplement increases bone mineral content and mass in older women. All three measurements are recorded on the dominant and non-dominant sides (\({u=2}\)) of each woman. These two-level multivariate measurements are then taken again one year after the women's first participation in the experimental program. Thus, the whole dataset has a three-level multivariate structure, with \({m=3}\) variables, \({u=2}\) locations and \({v=2}\) time points, and given what the variables involved represent, testing for a doubly exchangeable covariance structure may be an adequate goal. Another simple example comes from a bone densitometry study on twelve patients, for which the interest may be to test the doubly exchangeable covariance structure in an osteopenia study. Bone mineral density measurements were obtained from the femoral neck and trochanter \({(m=2)}\), for the right and left femur (\({u=2}\)), taken at two different times (\({v=2}\)) separated by about two years. This latter example is addressed in Sect. 6, while the example of the osteoporosis data from Johnson and Wichern (2007) is placed in the supplementary material.

Roy and Fonseca (2012) fitted a general linear model to three-level multivariate data with a doubly exchangeable covariance structure for the error vector, and Leiva and Roy (2011, 2012) used this doubly exchangeable covariance structure for the classification of three-level multivariate data. Hypothesis testing on three-level multivariate data was first studied by Roy and Leiva (2008), who introduced parametrically parsimonious models for hypothesis testing of Kronecker product covariance structures.

However, none of these authors addressed directly the problem of testing the doubly exchangeable covariance structure.

The doubly exchangeable covariance structure is actually a very general and rich covariance structure which generalizes both the compound symmetry and the sphericity structures. As such, it has many interesting covariance structures as particular cases, some of which are depicted in Fig. 1. While some of these structures are rather well known, such as the above-mentioned compound symmetry and sphericity structures, as well as the block compound symmetry (Votaw 1948; Szatrowski 1976, 1982; Coelho and Roy 2017) and the block-matrix sphericity structure (Moschopoulos 1992; Cardeño and Nagar 2001; Marques and Coelho 2012), others have not yet appeared in the literature, as is the case of the structure derived from the doubly exchangeable covariance structure when \({\varvec{U}_1=\varvec{0}}\) but \({\varvec{W}\ne \varvec{0}}\), which we call multi-block-matrix sphericity (see Fig. 1). We may note that for either \(u=1\) or \(v=1\) we have the block compound symmetric structure, for which a likelihood ratio test (l.r.t.) was developed in Coelho and Roy (2017); the test in the present paper is thus a generalization of the test in that reference.

Fig. 1

Web of relations of covariance structures which are particular cases of the doubly exchangeable structure. Covariance structures depicted and their acronyms: D.E.—Doubly exchangeable, D.C.Sym—Double compound symmetry, B.C.Sym—Block compound symmetry (or block exchangeability), D-B.C.Sym—Diagonal-block compound symmetry, M.B-M.Sph—Multi-block-matrix sphericity, C.Sym—Compound symmetry, B-M.Sph—Block-matrix sphericity, Sph—Sphericity

Therefore, once the test for double exchangeability is developed, it may be used to test for (see Fig. 1)

  (i) block compound symmetry or block exchangeability, for \({u=1}\) or \({v=1}\),

  (ii) double compound symmetry, for \({m=1}\), or

  (iii) compound symmetry, for \({m=1}\) and either \({u=1}\) or \({v=1}\).

As such, in case we reject the doubly exchangeable structure for a given covariance matrix, we may take our testing procedure one step further by testing, for example, for block compound symmetry or block exchangeability of submatrices, in order to evaluate the reason why the doubly exchangeable structure was rejected; this can be done just by using one of the particular cases of this same test.

The use of the doubly exchangeable structure also enables the use of a much smaller number of parameters to model the covariance structure: an unstructured covariance matrix of dimensions \(muv{\scriptstyle \times }muv\) has \(muv(muv+1)/2\) unknown parameters, whereas the doubly exchangeable covariance matrix has only \(3m(m+1)/2\) unknown parameters, a number which does not depend on either u or v. For instance, for \({m=3}\), \({u=2}\) and \({v=2}\), these counts are respectively 78 and 18. Perlman (1987) stresses the importance of carrying out tests for symmetry structures by saying that “if symmetries are known to be present, then sharper statistical inferences can be obtained”.

These facts motivated the authors to develop a l.r.t. for the doubly exchangeable covariance structure for three-level multivariate data. The method used allows for an easy way to obtain the l.r.t. statistic as well as the characterization of its exact distribution, which is done in Sects. 2 and 3 of the paper. Given the extremely complicated structure of the exact distribution of the l.r.t. statistic, the development of sharp but manageable approximations stands out as a most desirable goal, in order to enable the practical application of the test. It happens that the approach undertaken also enables us to obtain an extremely useful factorization of the characteristic function of the logarithm of the l.r.t. statistic, which in turn opens the way for the development of extremely sharp and manageable near-exact approximations for the distribution of the likelihood ratio statistic. This work is done in Sect. 4 of the paper. In Sect. 5 the authors carry out a number of numerical studies which show the extremely good performance of the near-exact distributions developed, even for very small samples and/or very large numbers of variables involved. In Sect. 6 two real-data examples are used to illustrate the practical application of the l.r.t. and of the near-exact approximations developed. Conclusions are drawn in Sect. 7.

2 Formulation of the hypothesis and the likelihood ratio test

Let \(\varvec{y}\sim N_{muv}(\varvec{\mu },\varvec{\varSigma })\). We are interested in testing the hypothesis

$$\begin{aligned} H_0:\varvec{\varSigma }=\varvec{\varTheta }\,, \end{aligned}$$
(2)

where \(\varvec{\varTheta }\) is defined in (1).

In Lemma 3.1 in Roy and Fonseca (2012), it is shown that for \(\varvec{\varGamma }^*=\underset{{v \times v}}{\varvec{C}^\prime } \otimes \varvec{I}_{mu}\) and \(\varvec{\varGamma }^{\bullet }= \varvec{I}_{v}\otimes (\underset{u\times u}{\varvec{C}^{*\prime }} \otimes \varvec{I}_{m})\), where \(\varvec{C}\) and \(\varvec{C}^*\) are orthogonal Helmert matrices whose first columns are proportional to \(\varvec{1}\)’s,

$$\begin{aligned} \varvec{\varGamma }^{\bullet }\varvec{\varGamma }^{*} \varvec{\varTheta } \varvec{\varGamma }^{*\prime } \varvec{\varGamma }^{\bullet \prime }= \mathrm{diag}(\varvec{\Delta }_{3}, \varvec{I}_{u-1}\otimes \varvec{\Delta }_{1}, \varvec{\Delta }_{2}, \varvec{I}_{u-1}\otimes \varvec{\Delta }_{1}, \varvec{\Delta }_{2}, \dots , \varvec{I}_{u-1}\otimes \varvec{\Delta }_{1}) \end{aligned}$$

where

$$\begin{aligned} \varvec{\Delta }_{1}= & {} \varvec{U}_{0}-\varvec{U}_{1}, \\ \varvec{\Delta }_{2}= & {} \varvec{U}_{0}+\left( u-1\right) \varvec{U}_{1}-u \varvec{W}=\left( \varvec{U}_{0}-\varvec{U}_{1}\right) +u\left( \varvec{U}_{1}-\varvec{W}\right) , \\ \text {and} ~~ \varvec{\Delta }_{3}= & {} \varvec{U}_{0}+\left( u-1\right) \varvec{U}_{1}+u\left( v-1\right) \varvec{W}=\left( \varvec{U}_{0}-\varvec{U}_{1}\right) +u\left( \varvec{U}_{1}-\varvec{W}\right) +uv \varvec{W}. \end{aligned}$$
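This block-diagonalization is easy to verify numerically. The sketch below is our own illustration, with hand-rolled orthogonal Helmert matrices and arbitrary (merely illustrative) component matrices:

```python
import numpy as np

def helmert_orth(k):
    """Orthogonal k x k Helmert-type matrix whose first column is (1/sqrt(k)) * 1."""
    H = np.zeros((k, k))
    H[:, 0] = 1.0 / np.sqrt(k)
    for j in range(1, k):
        H[:j, j] = 1.0 / np.sqrt(j * (j + 1))
        H[j, j] = -j / np.sqrt(j * (j + 1))
    return H

m, u, v = 3, 2, 2
rng = np.random.default_rng(0)
B = rng.standard_normal((m, 3 * m))
U0 = B @ B.T / (3 * m)          # positive definite
U1, W = 0.3 * U0, 0.1 * U0      # symmetric, illustrative choices only

J = lambda k: np.ones((k, k))   # Theta per Eq. (1)
Theta = (np.kron(np.eye(u * v), U0 - U1)
         + np.kron(np.kron(np.eye(v), J(u)), U1 - W)
         + np.kron(J(u * v), W))

C, C_star = helmert_orth(v), helmert_orth(u)
G_star   = np.kron(C.T, np.eye(m * u))                        # Gamma* = C' x I_mu
G_bullet = np.kron(np.eye(v), np.kron(C_star.T, np.eye(m)))   # Gamma• = I_v x (C*' x I_m)

D = G_bullet @ G_star @ Theta @ G_star.T @ G_bullet.T
# For u = v = 2, D = diag(Delta_3, Delta_1, Delta_2, Delta_1); check off-diagonal blocks:
off = D.copy()
for i in range(u * v):
    off[i*m:(i+1)*m, i*m:(i+1)*m] = 0.0
assert np.allclose(off, 0.0)
```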

Since \(\varvec{\varGamma }^{\bullet }\) and \(\varvec{\varGamma }^{*}\) are not functions of \({\varvec{U}}_0\), \({\varvec{U}}_1\) or \({\varvec{W}}\), testing \(H_0\) in (2) is equivalent to testing

$$\begin{aligned} H_0: \varvec{\varSigma }^{*}=\varvec{\varOmega } \end{aligned}$$
(3)

where

$$\begin{aligned} \varvec{\varSigma }^{*}=\varvec{\varGamma }^{\bullet } \varvec{\varGamma }^{*} \varvec{\varSigma } \varvec{\varGamma }^{*\prime }\varvec{\varGamma }^{\bullet \prime } ~~~~~~\mathrm{and}~~~~~~ \varvec{\varOmega }=\varvec{\varGamma }^{\bullet } \varvec{\varGamma }^{*} \varvec{\varTheta } \varvec{\varGamma }^{*\prime } \varvec{\varGamma }^{\bullet \prime }\,. \end{aligned}$$

We may split the null hypothesis in (3) as

$$\begin{aligned} H_0\,\equiv \,\left( H_{0b|a}\,\big \Vert \,H_{0c|a}\right) \,\circ \,H_{0a} \end{aligned}$$

(4)

where ‘\(\circ \)’ means ‘after’ and ‘\(\Vert \)’ means ‘parallel’, meaning ‘either after or before’.

In (4),

$$\begin{aligned} H_{0a}:\varvec{\varSigma }^*=\hbox {block-diag}(\varvec{\varSigma }^*_i,\,i=1,\dots ,uv)\,, \end{aligned}$$
(5)

is the hypothesis of independence of the uv diagonal blocks \(\varvec{\varSigma }^*_i\)\({(i=1,\dots ,uv)}\) of size \(m{\scriptstyle \times }m\) of \(\varvec{\varSigma }^*\);

$$\begin{aligned} \begin{array}{ll} H_{0b|a}: &{} \underbrace{\varvec{\varSigma }^*_2=\dots =\varvec{\varSigma }^*_u}_{u-1}=\underbrace{\varvec{\varSigma }^*_{u+2}=\dots =\varvec{\varSigma }^*_{2u}}_{u-1}=\dots =\underbrace{\varvec{\varSigma }^*_{(v-1)u+2}=\dots =\varvec{\varSigma }^*_{vu}}_{u-1},\\ &{}\hbox {assuming } H_{0a}, \end{array} \nonumber \\ \end{aligned}$$
(6)

is the hypothesis of equality of \(v(u-1)\) covariance matrices of dimension \(m{\scriptstyle \times }m\), assuming \(H_{0a}\), and

$$\begin{aligned} \begin{array}{ll} H_{0c|a}: &{} \varvec{\varSigma }^*_{u+1}=\varvec{\varSigma }^*_{2u+1}=\dots =\varvec{\varSigma }^*_{(v-1)u+1}\\ &{} \hbox {assuming } H_{0a}, \end{array} \end{aligned}$$
(7)

is the hypothesis of equality of the covariance matrices \(\varvec{\varSigma }^*_{u+1},\varvec{\varSigma }^*_{2u+1},\dots ,\)\(\varvec{\varSigma }^*_{(v-1)u+1}\), assuming \(H_{0a}\).

The l.r.t. statistic to test \(H_{0a}\) in (5) is (Anderson 2003, Sect. 9.2)

$$\begin{aligned} \varLambda _a=\left( \frac{|\varvec{A}|}{\prod _{j=1}^{uv} |\varvec{A}_j|}\right) ^{n/2} \end{aligned}$$

where \(\varvec{A}=\varvec{\varGamma }^\bullet \varvec{\varGamma }^*\varvec{A}^{\scriptscriptstyle +}\varvec{\varGamma }^{*\prime }\varvec{\varGamma }^{\bullet \prime }\) is the maximum likelihood estimator (m.l.e.) of \(\varvec{\varSigma }^*\), \(\varvec{A}_j\) is its j-th diagonal \(m{\scriptstyle \times }m\) block, and \(\varvec{A}^{\scriptscriptstyle +}\) is the m.l.e. of \(\varvec{\varSigma }\).

The l.r.t. statistic to test \(H_{0b|a}\) in (6) is (Anderson 2003, Sect. 10.2)

$$\begin{aligned} \varLambda _b=\left( \left( v(u-1)\right) ^{mv(u-1)}\frac{\prod _{\ell =1}^v\prod _{k=1}^{u-1}\left| \varvec{A}_{(\ell -1)u+1+k}\right| }{|\varvec{A}^*|^{v(u-1)}}\right) ^{n/2}, \end{aligned}$$
(8)

where

$$\begin{aligned} \varvec{A}^*=\sum _{\ell =1}^v\sum _{k=1}^{u-1} \varvec{A}_{(\ell -1)u+1+k}\,. \end{aligned}$$

The l.r.t. statistic to test \(H_{0c|a}\) in (7) is (Anderson 2003, Sect. 10.2)

$$\begin{aligned} \varLambda _c=\left( (v-1)^{m(v-1)}\frac{\prod _{k=1}^{v-1} |\varvec{A}_{ku+1}|}{|\varvec{A}^{**}|^{v-1}}\right) ^{n/2} \end{aligned}$$
(9)

where

$$\begin{aligned} \varvec{A}^{**}=\sum _{k=1}^{v-1} \varvec{A}_{ku+1}\,. \end{aligned}$$

Then, through an extension of Lemma 10.3.1 in (Anderson 2003, Sect. 10.3), the l.r.t. statistic to test \(H_0\) in (3) will be

$$\begin{aligned} \varLambda= & {} \varLambda _a\varLambda _b\varLambda _c \nonumber \\= & {} \displaystyle \left( \left( v(u-1)\right) ^{mv(u-1)}(v-1)^{m(v-1)}\frac{|\varvec{A}|}{|\varvec{A}_1| |\varvec{A}^*|^{v(u-1)}\,|\varvec{A}^{**}|^{v-1}}\right) ^{n/2}, \end{aligned}$$
(10)

with

$$\begin{aligned} E\left( \varLambda ^h\right) =E\left( \varLambda _a^h\right) E\left( \varLambda _b^h\right) E\left( \varLambda _c^h\right) , \end{aligned}$$
(11)

since, on one hand, under \(H_{0a}\), \(\varLambda _a\) is independent of \(\prod _{j=1}^{uv}|\varvec{A}_j|\) (Marques and Coelho 2012; Coelho and Marques 2012b), which makes \(\varLambda _a\) independent of \(\varLambda _b\) and \(\varLambda _c\), while, on the other hand, the \(\varvec{A}_j\) \({(j=1,\dots ,uv)}\) are, under \(H_{0a}\), independent among themselves, which makes \(\varLambda _b\) and \(\varLambda _c\) independent, since they are built on different \(\varvec{A}_j\)’s.
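As an illustration of how the statistic in (10) may be evaluated in practice, the following is a minimal sketch (our own; it assumes \(u\ge 2\) and \(v\ge 2\), so that both \(\varvec{A}^*\) and \(\varvec{A}^{**}\) are non-empty sums, and takes as input the matrix \(\varvec{A}\)):

```python
import numpy as np

def lrt_statistic(A, m, u, v, n):
    """Overall l.r.t. statistic Lambda in (10); assumes u >= 2 and v >= 2.
    A is the m.l.e. of Sigma* (the transformed sample covariance matrix)."""
    blk = [A[i*m:(i+1)*m, i*m:(i+1)*m] for i in range(u * v)]   # A_1, ..., A_uv
    # A*  = sum over l = 1..v, k = 1..u-1 of A_{(l-1)u+1+k}  (0-based: blk[(l-1)*u + k])
    A_star = sum(blk[l*u + k] for l in range(v) for k in range(1, u))
    # A** = sum over k = 1..v-1 of A_{ku+1}                  (0-based: blk[k*u])
    A_2star = sum(blk[k*u] for k in range(1, v))
    c = (v*(u - 1))**(m*v*(u - 1)) * (v - 1)**(m*(v - 1))
    det = np.linalg.det
    ratio = c * det(A) / (det(blk[0]) * det(A_star)**(v*(u - 1)) * det(A_2star)**(v - 1))
    return ratio**(n / 2)
```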

In the following section we obtain the expressions for the moments of all three l.r.t. statistics \(\varLambda _a\), \(\varLambda _b\) and \(\varLambda _c\), as well as their distributions.

3 On the exact distribution of the l.r.t. statistic

Using the results in Coelho (2004), Coelho et al. (2010) and Marques et al. (2011) we may write the h-th moment of \(\varLambda _a\) as

$$\begin{aligned} E\left( \varLambda _a^h\right)= & {} \displaystyle \prod _{k=1}^{uv-1}\prod _{j=1}^m \frac{\varGamma \left( \frac{n-j}{2}\right) \varGamma \left( \frac{n-(uv-k)m-j}{2}+\frac{n}{2}h\right) }{\varGamma \left( \frac{n-(uv-k)m-j}{2}\right) \varGamma \left( \frac{n-j}{2}+\frac{n}{2}h\right) } \nonumber \\= & {} \displaystyle \underbrace{\left\{ \prod _{j=3}^{muv}\left( \frac{n-j}{n}\right) ^{r_j}\left( \frac{n-j}{n}+h\right) ^{-r_j}\right\} }_{\varPhi ^{}_{a,1}(h)} \underbrace{\left( \frac{\varGamma \left( \frac{n-1}{2}\right) \varGamma \left( \frac{n-2}{2}+\frac{n}{2}h\right) }{\varGamma \left( \frac{n-1}{2}+\frac{n}{2}h\right) \varGamma \left( \frac{n-2}{2}\right) }\right) ^{k^*}}_{\varPhi ^{}_{a,2}(h)}\nonumber \\ \end{aligned}$$
(12)

with

$$\begin{aligned} k^*=\left\{ \begin{array}{lll} \displaystyle \left\lfloor \frac{uv}{2}\right\rfloor , &{}&{}\quad m \hbox { odd}\\ 0, &{}&{}\quad m \hbox { even}, \end{array} \right. \end{aligned}$$
(13)

where \(\lfloor \,\cdot \,\rfloor \) represents the floor function, that is, the function which gives the largest integer that does not exceed its argument, and

$$\begin{aligned} r_j=\left\{ \begin{array}{lll} h_{j-2}+(-1)^j k^*, &{}&{} j=3,4\\ r_{j-2}+h_{j-2}, &{}&{} j=5,\dots ,muv, \end{array} \right. \end{aligned}$$

with

$$\begin{aligned} h_j=\left\{ \begin{array}{lll} uv-1, &{}&{} j=1,\dots ,m\\ -1, &{}&{} j=m+1,\dots ,muv -2. \end{array} \right. \end{aligned}$$
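For numerical work, the first expression in (12) may be evaluated directly in log-gamma arithmetic to avoid overflow; a minimal sketch of this computation (our own illustration) follows:

```python
import numpy as np
from scipy.special import gammaln

def E_lambda_a_h(h, m, u, v, n):
    """h-th moment of Lambda_a, first expression in (12), via log-gamma arithmetic."""
    log_val = 0.0
    for k in range(1, u * v):
        for j in range(1, m + 1):
            a = (n - (u * v - k) * m - j) / 2
            log_val += (gammaln((n - j) / 2) + gammaln(a + n * h / 2)
                        - gammaln(a) - gammaln((n - j) / 2 + n * h / 2))
    return np.exp(log_val)

print(E_lambda_a_h(1, 2, 2, 2, 12))   # first moment for m = u = v = 2, n = 12
```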

Now, using the results in (Coelho and Marques 2012a; Coelho et al. 2010; Marques et al. 2011) we obtain the expression for the h-th moment of \(\varLambda _b\) as

$$\begin{aligned} E\left( \varLambda _b^h\right)= & {} \displaystyle \prod _{j=1}^m\prod _{k=1}^{v(u-1)}\frac{\varGamma \left( \frac{n-1}{2}-\frac{j-1}{2v(u-1)}+\frac{k-1}{v(u-1)}\right) \varGamma \left( \frac{n-j}{2}+\frac{n}{2}h\right) }{\varGamma \left( \frac{n-1}{2}-\frac{j-1}{2v(u-1)}+\frac{k-1}{v(u-1)}+\frac{n}{2}h\right) \varGamma \left( \frac{n-j}{2}\right) } \nonumber \\= & {} \displaystyle \underbrace{\left\{ \prod _{j=2}^m\left( \frac{n-j}{n}\right) ^{s_j}\left( \frac{n-j}{n}+h\right) ^{-s_j}\right\} }_{\varPhi ^{}_{b,1}(h)}\nonumber \\&\displaystyle \times \left\{ \prod _{j=1}^{\lfloor m/2\rfloor }\prod _{k=1}^{v(u-1)}\frac{\varGamma \left( n-1+\frac{k-2j}{v(u-1)}\right) \varGamma \left( n-1+\left\lfloor \frac{k-2j}{v(u-1)}\right\rfloor +nh\right) }{\varGamma \left( n-1+\frac{k-2j}{v(u-1)}+nh\right) \varGamma \left( n-1+\left\lfloor \frac{k-2j}{v(u-1)}\right\rfloor \right) }\right\} \nonumber \\&\displaystyle \underbrace{\times \left\{ \prod _{k=1}^{v(u-1)}\frac{\varGamma \left( \frac{n-m}{2}+\frac{m-1}{2}+\frac{2k-m-1}{2v(u-1)}\right) \varGamma \left( \frac{n-m}{2}+\left\lfloor \frac{m-1}{2}+\frac{2k-m-1}{2v(u-1)}\right\rfloor +\frac{n}{2}h\right) }{\varGamma \left( \frac{n-m}{2}+\frac{m-1}{2}+\frac{2k-m-1}{2v(u-1)}+\frac{n}{2}h\right) \varGamma \left( \frac{n-m}{2}+\left\lfloor \frac{m-1}{2}+\frac{2k-m-1}{2v(u-1)}\right\rfloor \right) }\right\} ^{\widetilde{m}}}_{\varPhi ^{}_{b,2}(h)}\nonumber \\ \end{aligned}$$
(14)

where \(\widetilde{m}=m-2\lfloor m/2\rfloor \) and \(s_j\)\({(j=2,\dots ,m)}\) are given in Appendix A in the supplementary material.

By looking at (8) and (9) we may see that the h-th moment of \(\varLambda _c\) may be obtained, as follows, from the h-th moment of \(\varLambda _b\) by first replacing v by 1 and then replacing u by v,

$$\begin{aligned} E\left( \varLambda _c^h\right)= & {} \displaystyle \prod _{j=1}^m\prod _{k=1}^{v-1}\frac{\varGamma \left( \frac{n-1}{2}-\frac{j-1}{2(v-1)}+\frac{k-1}{v-1}\right) \varGamma \left( \frac{n-j}{2}+\frac{n}{2}h\right) }{\varGamma \left( \frac{n-1}{2}-\frac{j-1}{2(v-1)}+\frac{k-1}{v-1}+\frac{n}{2}h\right) \varGamma \left( \frac{n-j}{2}\right) } \nonumber \\= & {} \displaystyle \underbrace{\left\{ \prod _{j=2}^m\left( \frac{n-j}{n}\right) ^{\delta _j}\left( \frac{n-j}{n}+h\right) ^{-\delta _j}\right\} }_{\varPhi ^{}_{c,1}(h)}\nonumber \\&\displaystyle \times \left\{ \prod _{j=1}^{\lfloor m/2\rfloor }\prod _{k=1}^{v-1}\frac{\varGamma \left( n-1+\frac{k-2j}{v-1}\right) \varGamma \left( n-1+\left\lfloor \frac{k-2j}{v-1}\right\rfloor +nh\right) }{\varGamma \left( n-1+\frac{k-2j}{v-1}+nh\right) \varGamma \left( n-1+\left\lfloor \frac{k-2j}{v-1}\right\rfloor \right) }\right\} \nonumber \\&\displaystyle \underbrace{\times \left\{ \prod _{k=1}^{v-1}\frac{\varGamma \left( \frac{n-m}{2}+\frac{m-1}{2}+\frac{2k-m-1}{2(v-1)}\right) \varGamma \left( \frac{n-m}{2}+\left\lfloor \frac{m-1}{2}+\frac{2k-m-1}{2(v-1)}\right\rfloor +\frac{n}{2}h\right) }{\varGamma \left( \frac{n-m}{2}+\frac{m-1}{2}+\frac{2k-m-1}{2(v-1)}+\frac{n}{2}h\right) \varGamma \left( \frac{n-m}{2}+\left\lfloor \frac{m-1}{2}+\frac{2k-m-1}{2(v-1)}\right\rfloor \right) }\right\} ^{\widetilde{m}}}_{\varPhi ^{}_{c,2}(h)}\nonumber \\ \end{aligned}$$
(15)

where \(\widetilde{m}=m-2\lfloor m/2\rfloor \) and the shape parameters \(\delta _j\)\({(j=2,\dots ,m)}\) are given in Appendix A in the supplementary material.

Since the supports of \(\varLambda _a\), \(\varLambda _b\) and \(\varLambda _c\) are bounded, their distributions are determined by their moments, and as such, from the first expression in (12) we may write

$$\begin{aligned} \varLambda _a\buildrel {st}\over {\sim }\prod _{j=1}^m\prod _{k=1}^{uv-1} \left( X_{jk}\right) ^{n/2}\,,~~~~\mathrm{where}~~~~X_{jk}\sim Beta\left( \frac{n-(uv-k)m-j}{2},\frac{(uv-k)m}{2}\right) ,\nonumber \\ \end{aligned}$$
(16)

where ‘\(\buildrel {st}\over {\sim }\)’ means ‘stochastically equivalent to’ and \(X_{jk}\)\({(j=1,\dots ,m;}\)\({k=1,\dots ,uv-1)}\) are independent random variables, while from the first expression in (14) we may write

$$\begin{aligned} \varLambda _b\buildrel {st}\over {\sim }\prod _{j=1}^m\prod _{k=1}^{v(u-1)} \left( X^*_{jk}\right) ^{n/2}\,,~~~~\mathrm{where}~~~~X^*_{jk}\sim Beta\left( \frac{n-j}{2},\frac{j-1}{2}+\frac{2k-j-1}{2v(u-1)}\right) ,\nonumber \\ \end{aligned}$$
(17)

where \(X^*_{jk}\)\({(j=1,\dots ,m;k=1,\dots ,v(u-1))}\) are independent, and from the first expression in (15) we may write

$$\begin{aligned} \varLambda _c\buildrel {st}\over {\sim }\prod _{j=1}^m\prod _{k=1}^{v-1} \left( X^{**}_{jk}\right) ^{n/2}\,,~~~~\mathrm{where}~~~~X^{**}_{jk}\sim Beta\left( \frac{n-j}{2},\frac{j-1}{2}+\frac{2k-j-1}{2(v-1)}\right) ,\nonumber \\ \end{aligned}$$
(18)

where \(X^{**}_{jk}\)\({(j=1,\dots ,m;k=1,\dots ,v-1)}\) are independent, so that we may write for the overall l.r.t. statistic for \(H_0\) in (4)

$$\begin{aligned} \varLambda \,\buildrel {st}\over {\sim }\,\prod _{j=1}^m\left\{ \left( \prod _{k=1}^{uv-1}X_{jk}\right) ^{n/2}{\scriptstyle \times }\left( \prod _{k=1}^{v(u-1)}X^*_{jk}\right) ^{n/2}{\scriptstyle \times }\left( \prod _{k=1}^{v-1}X_{jk}^{**}\right) ^{n/2}\right\} , \end{aligned}$$
(19)

where all random variables are independent.

On the other hand, based on the results in Appendix B in the supplementary material and from the second expressions in (12), (14) and (15) we may respectively write,

$$\begin{aligned} \varLambda _a\buildrel {st}\over {\sim }\left( \prod _{j=3}^{muv}e^{-Z_j}\right) {\scriptstyle \times }\left( \prod _{j=1}^{k^*}\left( Y_j\right) ^{n/2}\right) \end{aligned}$$
(20)

where

$$\begin{aligned} Z_j\sim \varGamma \left( r_j,\frac{n-j}{n}\right) ~~\mathrm{and}~~Y_j\sim Beta\left( \frac{n-2}{2},\frac{1}{2}\right) \end{aligned}$$
(21)

are all independent random variables, while for \(\varLambda _b\) we may write

$$\begin{aligned} \varLambda _b\buildrel {st}\over {\sim }\left( \prod _{j=2}^{m}e^{-Z^*_j}\right) {\scriptstyle \times }\left( \prod _{j=1}^{\lfloor m/2\rfloor }\prod _{k=1}^{v(u-1)} Y^*_{1jk}\right) ^{n}{\scriptstyle \times }\left\{ \left( \prod _{k=1}^{v(u-1)}Y^*_{2k}\right) ^{n/2}\right\} ^{\widetilde{m}} \end{aligned}$$

(22)

where

$$\begin{aligned} Z_j^*\sim \varGamma \left( s_j,\frac{n-j}{n}\right) ,~~Y_{1jk}^*\sim Beta\left( n-1+\left\lfloor \frac{k-2j}{v(u-1)}\right\rfloor ,\frac{k-2j}{v(u-1)}-\left\lfloor \frac{k-2j}{v(u-1)}\right\rfloor \right) ,\nonumber \\ \end{aligned}$$
(23)

and

$$\begin{aligned}&\displaystyle Y_{2k}^* \sim Beta\left( \frac{n-m}{2}+\left\lfloor \frac{m-1}{2}+\frac{2k-m-1}{2v(u-1)}\right\rfloor ,\right. \nonumber \\&\qquad \qquad \qquad \qquad \qquad \left. \frac{m-1}{2}+\frac{2k-m-1}{2v(u-1)}-\left\lfloor \frac{m-1}{2}+\frac{2k-m-1}{2v(u-1)}\right\rfloor \right) \end{aligned}$$
(24)

are all independent random variables, while for \(\varLambda _c\) we may write

$$\begin{aligned} \varLambda _c\buildrel {st}\over {\sim }\left( \prod _{j=2}^{m}e^{-Z^{**}_j}\right) {\scriptstyle \times }\left( \prod _{j=1}^{\lfloor m/2\rfloor }\prod _{k=1}^{v-1} Y^{**}_{1jk}\right) ^{n}{\scriptstyle \times }\left\{ \left( \prod _{k=1}^{v-1}Y^{**}_{2k}\right) ^{n/2}\right\} ^{\widetilde{m}} \end{aligned}$$

(25)

where

$$\begin{aligned} Z_j^{**}\sim \varGamma \left( \delta _j,\frac{n-j}{n}\right) ,~~Y_{1jk}^{**}\sim Beta\left( n-1+\left\lfloor \frac{k-2j}{v-1}\right\rfloor ,\frac{k-2j}{v-1}-\left\lfloor \frac{k-2j}{v-1}\right\rfloor \right) ,\nonumber \\ \end{aligned}$$
(26)

and

$$\begin{aligned}&\displaystyle Y_{2k}^{**}\sim Beta\left( \frac{n-m}{2}+\left\lfloor \frac{m-1}{2}+\frac{2k-m-1}{2(v-1)}\right\rfloor , \right. \nonumber \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \displaystyle \left. \frac{m-1}{2}+\frac{2k-m-1}{2(v-1)}-\left\lfloor \frac{m-1}{2}+\frac{2k-m-1}{2(v-1)}\right\rfloor \right) \end{aligned}$$
(27)

are all independent random variables.

Thus, we have the following Theorem.

Theorem 1

The exact distribution of the overall l.r.t. statistic \(\varLambda \) in (10), to test \(H_0\) in (3) or (4) is, for general u, v and m, the same as that of

$$\begin{aligned}&\left( \prod _{j=2}^{muv}e^{-T_j}\right) {\scriptstyle \times }\left( \prod _{j=1}^{k^*}Y_j\right) ^{n/2}{\scriptstyle \times }\left( \prod _{j=1}^{\lfloor m/2\rfloor }\prod _{k=1}^{v(u-1)} Y^*_{1jk}\right) ^{n}{\scriptstyle \times }\left( \prod _{j=1}^{\lfloor m/2\rfloor }\prod _{k=1}^{v-1} Y^{**}_{1jk}\right) ^{n}\nonumber \\&\quad {\scriptstyle \times }\left\{ \left( \prod _{k=1}^{v(u-1)}Y_{2k}^*\right) ^{n/2} {\scriptstyle \times }\left( \prod _{k=1}^{v-1}Y_{2k}^{**}\right) ^{n/2}\right\} ^{\widetilde{m}} \end{aligned}$$
(28)

where \({{\widetilde{m}}}=m-2\lfloor m/2\rfloor \) and, for \(j=2,\dots ,muv\),

$$\begin{aligned} T_j\sim \varGamma \left( \mu _j,\frac{n-j}{n}\right) , \end{aligned}$$

with

$$\begin{aligned} \mu _j=r_j^++s_j^++\delta _j^+ \end{aligned}$$
(29)

where

$$\begin{aligned} r_j^+=\left\{ \begin{array}{ll} 0, &{} j=2\\ r_j, &{} j=3,\dots ,muv \end{array} \right. \end{aligned}$$

and

$$\begin{aligned} s_j^+=\left\{ \begin{array}{lll} s_j, &{}&{} j=2,\dots ,m\\ 0, &{}&{} j=m+1,\dots ,muv\,, \end{array} \right. ~~~~~~~~ \delta _j^+=\left\{ \begin{array}{lll} \delta _j, &{}&{} j=2,\dots ,m\\ 0, &{}&{} j=m+1,\dots ,muv\,, \end{array} \right. \end{aligned}$$

where the shape parameters \(s_j\) and \(\delta _j\) are defined in Appendix A in the supplementary material. The distributions of \(Y_j\), \(Y_{1jk}^*\), \(Y_{2k}^{*}\), \(Y_{1jk}^{**}\) and \(Y_{2k}^{**}\) are defined in (21), (23), (24), (26) and (27).

Proof

The proof follows rather directly from the previously established results. We may only remark that each random variable \(T_j\) is the sum of the random variables \(Z_j\), \(Z_j^*\) and \(Z_j^{**}\), which are independent Gamma distributed random variables, all with the same rate parameter \(\frac{n-j}{n}\) for a given j. As such, for a given j, their sum is a Gamma distributed random variable with that same rate parameter and a shape parameter which is the sum of the original shape parameters. \(\square \)

The following Corollary addresses the cases \({v=1}\) and \({v=2}\), which are particular cases of special interest.

Corollary 1

For both \(v=1\) and \(v=2\), the hypothesis \(H_{0c|a}\) vanishes and as such the distribution of \(\varLambda \) is in this case the same as that of

$$\begin{aligned} \left( \prod _{j=2}^{muv}e^{-T_j}\right) {\scriptstyle \times }\left( \prod _{j=1}^{k^*}Y_j\right) ^{n/2}{\scriptstyle \times }\left( \prod _{j=1}^{\lfloor m/2\rfloor }\prod _{k=1}^{v(u-1)} Y^*_{1jk}\right) ^{n}{\scriptstyle \times }\left\{ \left( \prod _{k=1}^{v(u-1)}Y^*_{2k}\right) ^{n/2}\right\} ^{\widetilde{m}} \end{aligned}$$

(30)

where, for \(j=2,\dots ,muv\),

$$\begin{aligned} T_j\sim \varGamma \left( \mu ^*_j,\frac{n-j}{n}\right) , \end{aligned}$$

with

$$\begin{aligned} \mu ^*_j=r_j^++s_j^+ \end{aligned}$$
(31)

where

$$\begin{aligned} r_j^+=\left\{ \begin{array}{ll} 0, &{}\quad j=2\\ r_j, &{}\quad j=3,\dots ,muv\,, \end{array} \right. ~~~~~~\mathrm{and}~~~~~~ s_j^+=\left\{ \begin{array}{ll} s_j, &{}\quad j=2,\dots ,m\\ 0, &{}\quad j=m+1,\dots ,muv\,, \end{array} \right. \end{aligned}$$

where the shape parameters \(s_j\) are defined in Appendix A in the supplementary material. The distributions of \(Y_j\), \(Y_{1jk}^*\) and \(Y_{2k}^{*}\), are defined in (21), (23) and (24).

The case for \(v=1\) is equivalent to the test for block compound symmetric covariance structure, addressed in Coelho and Roy (2017).

4 The characteristic function of \(W=-\log \,\varvec{\varLambda }\) and the development of near-exact distributions

4.1 The characteristic function of \(W=-\log \,\varvec{\varLambda }\)

The reason why, in the previous section, we obtained two equivalent representations for the exact distribution of \(\varLambda \), one from (19) and the other in Theorem 1, is that the first of these representations neither yields a manageable cumulative distribution function nor leads to a sharp approximation to the exact distribution of \(\varLambda \).

In order to obtain a very sharp and manageable approximation to the exact distribution of \(\varLambda \), we will base our developments on the representations given by Theorem 1 and Corollary 1.

From Theorem 1 and expressions (12)–(15) we may write the characteristic function (c.f.) of \({W=-\log \,\varLambda }\) as

$$\begin{aligned} \varPhi ^{}_W(t)&= \displaystyle E\left( e^{\mathrm {i}tW}\right) \,=\,E\left( \varLambda ^{-\mathrm {i}t}\right) \nonumber \\&= \displaystyle \underbrace{\left\{ \prod _{j\,=\,2}^{muv}\left( \frac{n-j}{n}\right) ^{\mu _j}\left( \frac{n-j}{n}-\mathrm {i}t\right) ^{-\mu _j}\right\} }_{\varPhi ^{}_{W,1}(t)} \times \underbrace{\varPhi ^{}_{a,2}(-\mathrm {i}t)\ \varPhi ^{}_{b,2}(-\mathrm {i}t)\ \varPhi ^{}_{c,2}(-\mathrm {i}t)}_{\varPhi ^{}_{W,2}(t)} \end{aligned}$$
(32)

where \(\mu _j\) is given by (29), \(\varPhi ^{}_{a,2}(\,\cdot \,)\), \(\varPhi ^{}_{b,2}(\,\cdot \,)\) and \(\varPhi ^{}_{c,2}(\,\cdot \,)\) are defined in (12)–(15), and \(\varPhi ^{}_{W,1}(t)\) is actually equal to \(\varPhi ^{}_{a,1}(-\mathrm {i}t)\varPhi ^{}_{b,1}(-\mathrm {i}t)\varPhi ^{}_{c,1}(-\mathrm {i}t)\).
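The factor \(\varPhi ^{}_{W,1}(t)\), being a product of Gamma characteristic functions, is straightforward to evaluate numerically; the sketch below (our own illustration, here fed with the shape parameters of Example 1 in Sect. 6, assumed to have been computed beforehand) shows this:

```python
import numpy as np

def phi_W1(t, mu, n):
    """First factor of (32): product over j of Gamma(mu_j, (n-j)/n)
    characteristic functions, with mu = {j: mu_j} for j = 2, ..., muv."""
    t = np.asarray(t, dtype=complex)
    val = np.ones_like(t)
    for j, mu_j in mu.items():
        rate = (n - j) / n
        val *= (rate / (rate - 1j * t)) ** mu_j
    return val

# Shape parameters of Example 1 in Sect. 6 (m = u = v = 2, n = 12)
mu = dict(zip(range(2, 9), [1, 3, 3, 2, 2, 1, 1]))
print(phi_W1(np.array([0.0, 1.0]), mu, 12))   # equals 1 at t = 0
```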

For \({v=1}\) and \({v=2}\), according to Corollary 1 in the previous section \(\varPhi ^{}_{W}(t)\) reduces to

$$\begin{aligned} \begin{array}{rcl} \varPhi ^{}_W(t)= & {} \displaystyle \underbrace{\left\{ \prod _{j=2}^{muv}\left( \frac{n-j}{n}\right) ^{\mu ^*_j}\left( \frac{n-j}{n}-\mathrm {i}t\right) ^{-\mu ^*_j}\right\} }_{\varPhi ^{}_{W,1}(t)}\,\times \, \underbrace{\varPhi ^{}_{a,2}(-\mathrm {i}t)\ \varPhi ^{}_{b,2}(-\mathrm {i}t)}_{\varPhi ^{}_{W,2}(t)} \end{array} \end{aligned}$$
(33)

for \(\mu ^*_j\) given by (31) and where now \(\varPhi ^{}_{W,1}(t)\) is equal to \(\varPhi ^{}_{a,1}(-\mathrm {i}t)\varPhi ^{}_{b,1}(-\mathrm {i}t)\).

Expressions (32) and (33), together with expressions (28) and (30), show that the exact distribution of \({W=-\log \,\varLambda }\) is the same as that of the sum of \(muv-1\) independent Gamma random variables and of an independent random variable which is itself distributed as a sum of independent Logbeta random variables.

Then, in building the near-exact distributions in the next subsection we will keep \(\varPhi ^{}_{W,1}(t)\) untouched and will approximate \(\varPhi ^{}_{W,2}(t)\) asymptotically by the c.f. of a finite mixture of Gamma distributions.

4.2 Development of near-exact distributions

Based on the result in Section 5 of Tricomi and Erdélyi (1951), we know that we may, for increasing values of a, asymptotically replace the distribution of any \(Logbeta(a,b)\) distributed random variable by an infinite mixture of \(\varGamma (b+\ell ,a)\) distributions \({(\ell =0,1,\dots )}\). As such, we could replace \(\varPhi _{W,2}(t)\) in either (32) or (33) by the c.f. of a sum of infinite mixtures of Gamma distributions, which would be the same as the c.f. of an infinite mixture of sums of Gamma distributions. Although these Gamma distributions would have different rate parameters, those parameters would anyway be of comparable magnitude. As such, in building our near-exact distributions for \({W=-\log \,\varLambda }\) and \(\varLambda \), while we leave \(\varPhi _{W,1}(t)\) unchanged, we replace \(\varPhi _{W,2}(t)\) in either (32) or (33) by

$$\begin{aligned} \varPhi ^*(t)=\sum _{\ell =0}^{m^*}\pi _\ell \,\nu ^{r+\ell }(\nu -\mathrm {i}t)^{-(r+\ell )} \end{aligned}$$
(34)

which is the c.f. of a finite mixture of \(\varGamma (r+\ell ,\nu )\) distributions, where for the general case in (32) and for \(k^*\) in (13) and \(v>1\), we will take

$$\begin{aligned} r= & {} \frac{k^*}{2}+\sum _{j=1}^{\lfloor m/2\rfloor }\sum _{k=1}^{v(u-1)}\left( \frac{k-2j}{v(u-1)}-\left\lfloor \frac{k-2j}{v(u-1)}\right\rfloor \right) +\sum _{j=1}^{\lfloor m/2\rfloor }\sum _{k=1}^{v-1}\left( \frac{k-2j}{v-1}-\left\lfloor \frac{k-2j}{v-1}\right\rfloor \right) \nonumber \\&+\,\widetilde{m}\left\{ \sum _{k=1}^{v(u-1)}\left( \frac{m-1}{2}+\frac{2k-m-1}{2v(u-1)}-\left\lfloor \frac{m-1}{2}+\frac{2k-m-1}{2v(u-1)}\right\rfloor \right) \right. \nonumber \\&\left. \quad +\sum _{k=1}^{v-1}\left( \frac{m-1}{2}+\frac{2k-m-1}{2(v-1)}-\left\lfloor \frac{m-1}{2}+\frac{2k-m-1}{2(v-1)}\right\rfloor \right) \right\} \end{aligned}$$

(35)

which is the sum of all the second parameters of the Logbeta distributions in \(\varPhi _{W,2}(t)\) in (32), while for the particular case in (33) we will take

$$\begin{aligned} r=\frac{k^*}{2}+\sum _{j=1}^{\lfloor m/2\rfloor }\sum _{k=1}^{v(u-1)}\left( \frac{k-2j}{v(u-1)}-\left\lfloor \frac{k-2j}{v(u-1)}\right\rfloor \right) +\widetilde{m}\sum _{k=1}^{v(u-1)}\left( \frac{m-1}{2}+\frac{2k-m-1}{2v(u-1)}-\left\lfloor \frac{m-1}{2}+\frac{2k-m-1}{2v(u-1)}\right\rfloor \right) \end{aligned}$$

(36)

which is the sum of all the second parameters of the Logbeta distributions in \(\varPhi _{W,2}(t)\) in (33).

The parameter \(\nu \) in (34) is then taken as the rate parameter in

$$\begin{aligned} \varPhi ^{**}(t)=\theta \,\nu ^{s_1}(\nu -\mathrm {i}t)^{-s_1}+(1-\theta )\nu ^{s_2}(\nu -\mathrm {i}t)^{-s_2} \end{aligned}$$

where \(\theta \), \(\nu \), \(s_1\) and \(s_2\) are determined in such a way that

$$\begin{aligned} \left. \frac{d^h\varPhi ^{}_{W,2}(t)}{dt^h}\right| _{t=0}=\left. \frac{d^h\varPhi ^{**}(t)}{dt^h}\right| _{t=0}~~~~~~\mathrm{for } \,h=1,\dots ,4\,, \end{aligned}$$

while the weights \(\pi _\ell \)\({(\ell =0,\dots ,m^*-1)}\) in (34) will then be determined in such a way that

$$\begin{aligned} \left. \frac{d^h\varPhi ^{}_{W,2}(t)}{dt^h}\right| _{t=0}=\left. \frac{d^h\varPhi ^{*}(t)}{dt^h}\right| _{t=0}~~~~~~\mathrm{for }\, h=1,\dots ,m^*\,, \end{aligned}$$

with \({\pi _{m^*}=1-\sum _{\ell =0}^{m^*-1}\pi _\ell }\).
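Once r and \(\nu \) have been determined, finding the weights \(\pi _\ell \) amounts to solving a linear system, since the moments of a finite mixture are linear in its weights. The following is a minimal sketch (our own; it assumes the moments of the part of W being approximated, as well as the value of \(\nu \), have been obtained beforehand):

```python
import numpy as np
from scipy.special import gammaln

def mixture_weights(moments, r, nu):
    """Weights pi_0, ..., pi_{m*} of the Gamma(r + l, nu) mixture in (34).
    moments[h-1] is the h-th moment of the random variable with c.f. Phi_{W,2}(t);
    row 0 of the system imposes sum_l pi_l = 1."""
    ms = len(moments)                        # m*
    M = np.ones((ms + 1, ms + 1))
    b = np.ones(ms + 1)
    for h in range(1, ms + 1):
        for l in range(ms + 1):              # h-th moment of Gamma(r + l, nu)
            M[h, l] = np.exp(gammaln(r + l + h) - gammaln(r + l)) / nu**h
        b[h] = moments[h - 1]
    return np.linalg.solve(M, b)
```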

This procedure yields near-exact distributions for W which will match the first \(m^*\) exact moments of W and which have c.f.

$$\begin{aligned} \varPhi ^{}_{W,1}(t)\varPhi ^*(t)\,, \end{aligned}$$

with \(\varPhi ^{}_{W,1}(t)\) given by (32) or (33) and \(\varPhi ^*(t)\) by (34), where r, given by (35) or (36) is always either an integer or a half-integer.

As such, the near-exact distributions developed yield, for W, distributions which, for non-integer r, are mixtures, with weights \(\pi _\ell \)\({(\ell =0,\dots ,m^*)}\), of \({m^*+1}\) Generalized Near-Integer Gamma (GNIG) distributions of depth muv with integer shape parameters \(\mu _j\)\({(j=2,\dots ,muv)}\) and real shape parameter r, in the general case, or shape parameters \(\mu _j^*\)\({(j=2,\dots ,muv)}\) for the case of \(v=1\) or \(v=2\), and corresponding rate parameters \((n-j)/n\)\({(j=2,\dots ,muv)}\) and \(\nu \), and which, for integer r, are similar mixtures but of Generalized Integer Gamma (GIG) distributions, with the same shape and rate parameters. See Coelho (1998, 2004) and Appendix C in the supplementary material for further details on the GIG and GNIG distributions and their probability density and cumulative distribution functions.

Using the notation in Appendix C in the supplementary material for the probability density and cumulative distribution functions of the GNIG distribution, the near-exact distributions obtained for W, for the general case of \({v>2}\) and for the case of non-integer r, will have, for \({w>0}\), probability density and cumulative distribution functions respectively of the form

$$\begin{aligned} f^{*}_{W}(w)=\sum ^{m^*}_{\ell =0} \pi _\ell \, f^{\mathrm {\tiny GNIG}}\left( w\Bigl |\underline{s}; \underline{\lambda };g\right) , ~~~\mathrm{and}~~~ F^{*}_{W}(w)=\sum ^{m^*}_{\ell =0} \pi _\ell \, F^{\mathrm {\tiny GNIG}}\left( w\Bigl |\underline{s}; \underline{\lambda };g\right) , \end{aligned}$$

where

$$\begin{aligned} \underline{s}=\left\{ \mu _2,\dots ,\mu _{muv},r+\ell \right\} ~~~~\mathrm{and}~~~~\underline{\lambda }=\left\{ \frac{n-2}{n},\dots ,\frac{n-muv}{n},\nu \right\} \end{aligned}$$

are respectively the sets of shape and rate parameters and \(g=muv\) is the depth of the GNIG distributions, while, for \({0<\lambda <1}\) and \(z=-\log \lambda \), the near-exact probability density and cumulative distribution functions of \(\varLambda \) are respectively given by

$$\begin{aligned} f^{*}_{\varLambda }(\lambda )=\sum ^{m^*}_{\ell =0} \pi _\ell f^{\mathrm {\tiny GNIG}}\left( z\Bigl |\underline{s}; \underline{\lambda };g\right) \frac{1}{\lambda }, ~~~\mathrm{and}~~~ F^{*}_{\varLambda }(\lambda )=\sum ^{m^*}_{\ell =0} \pi _\ell \left( 1-F^{\mathrm {\tiny GNIG}}\left( z\Bigl |\underline{s}; \underline{\lambda };g\right) \right) .\nonumber \\ \end{aligned}$$
(37)

For the case \({v=1}\) or \({v=2}\) all we have to do is to replace the shape parameters \(\mu _j\) given by (29) with the shape parameters \(\mu _j^*\) given by (31) and use r given by (36), instead of r given by (35).

For integer r, all we have to do is to replace the GNIG probability density and cumulative distribution functions with their GIG counterparts (see Appendix C in the supplementary material).

5 Numerical studies

In order to assess the performance of the near-exact distributions developed in the previous section we will use

$$\begin{aligned} \varDelta =\frac{1}{2\pi }\int _{-\infty }^{+\infty }\left| \frac{\varPhi ^{}_W(t)-\varPhi ^{}_{W,1}(t)\varPhi ^*(t)}{t}\right| \,dt \end{aligned}$$
(38)

with

$$\begin{aligned} \varDelta \ge \max _{w>0}\left| F^{}_W(w)-F^*_W(w)\right| =\max _{0<\lambda <1}\left| F^{}_\varLambda (\lambda )-F^*_\varLambda (\lambda )\right| , \end{aligned}$$

as a measure of proximity between the exact and the near-exact distributions, where \(\varPhi ^{}_W(t)\) is the exact c.f. of W in (32) or (33), \(F^{}_W(\,\cdot \,)\) and \(F^*_W(\,\cdot \,)\) represent respectively the exact and near-exact cumulative distribution functions of W, corresponding respectively to \(\varPhi ^{}_W(t)\) and \(\varPhi ^{}_{W,1}(t)\varPhi ^*(t)\), and \(F^{}_\varLambda (\,\cdot \,)\) and \(F^*_\varLambda (\,\cdot \,)\) are the corresponding exact and near-exact cumulative distribution functions of \(\varLambda \).
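In practice, \(\varDelta \) may be approximated by standard numerical quadrature over a suitably truncated range; a minimal sketch (our own, with illustrative truncation and grid choices) is the following:

```python
import numpy as np
from scipy.integrate import trapezoid

def delta_measure(phi_exact, phi_nearexact, T=1000.0, N=2_000_001):
    """Trapezoidal approximation of Delta in (38) on the truncated range (-T, T)."""
    t = np.linspace(-T, T, N)
    t = t[np.abs(t) > 1e-12]   # skip t = 0, where the integrand has a removable singularity
    integrand = np.abs((phi_exact(t) - phi_nearexact(t)) / t)
    return trapezoid(integrand, t) / (2.0 * np.pi)
```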

Table 1 Values of the measure \(\varDelta \) for the near-exact distributions for \({m=2}\)

In Tables 1, 2 and 3 we may analyze the values of \(\varDelta \) for different combinations of values of m, u and v and different sample sizes. For each combination of values of m, u and v, at least three different sample sizes n, exceeding the total number of variables muv by 2, 30 and 100, are used. For the larger combinations of values of m, u and v, some larger values of n are also used to illustrate the asymptotic behavior of the near-exact distributions with respect to the sample size. For all near-exact distributions, values of \(m^*\) equal to 4, 6 and 10 are used, that is, we use for each case near-exact distributions matching 4, 6 and 10 exact moments of W. Smaller values of \(\varDelta \) indicate a closer agreement with the exact distribution and, as such, a better performance of the corresponding near-exact distribution.

Table 2 Values of the measure \(\varDelta \) for the near-exact distributions for \({m=5}\)
Table 3 Values of the measure \(\varDelta \) for the near-exact distributions for \({m=10}\)

We may see how the near-exact distributions developed provide very sharp approximations to the exact distribution even for very small samples, that is, for sample sizes hardly exceeding the total number of variables involved. Moreover, they also exhibit clear asymptotic behaviors not only for increasing sample sizes but also for increasing values of m, u and v, with the asymptotic behavior in terms of sample size becoming apparent at larger sample sizes as the values of m, u and v get larger.

A Box asymptotic distribution (Box 1949) is developed for \(W=-\log \,\varLambda \) in Appendix D in the supplementary material. This approach yields a mixture of two Gamma distributions as the asymptotic distribution for \(W=-\log \,\varLambda \). As such, it should be compared with the near-exact distribution that matches only the first exact moment, that is, the near-exact distribution with \(m^*=1\), since it is for this near-exact distribution that the approximated part of the distribution is also asymptotically replaced by a mixture of two Gamma distributions.

Tables D.1–D.3 in Appendix D in the supplementary material display the values of the measure \(\varDelta \) in (38) for the Box asymptotic approximation and for the near-exact distributions that match the first \(m^*=1\) and \(m^*=2\) exact moments. We may see how the Box asymptotic distribution cannot match the quality of the approximation provided even by the near-exact distribution that matches only the first exact moment.

The values used for n, m, u and v in these Tables are the same as those used in Tables 1, 2 and 3. We may notice that some of the values of the measure \(\varDelta \) for the Box asymptotic approximation even go above 1, particularly in cases where the number of variables involved is quite large and, simultaneously, the sample size is rather small. Although this may seem out of the norm, it happens because in these cases the distribution yielded by the Box approximation is not a legitimate distribution, with its ‘p.d.f.’ and ‘c.d.f.’ going below zero, so that the values reported for the measure \(\varDelta \), which remains an upper bound on the difference between the exact and the asymptotic c.d.f.’s, are indeed correct.

We may also note how the quality of the approximation given by the asymptotic Box distribution quickly degrades as the overall number of variables involved increases, in contrast to what happens with the near-exact distributions.

6 Two real data examples

Example 1: Osteopenia data

To illustrate our proposed testing method, we test the hypothesis (2) on a real data set. The original data consist of bone mineral density values obtained by a technique known as dual X-ray absorptiometry using a GE Lunar Prodigy machine. Measurements were obtained for the femoral neck and the trochanter (\({m=2}\)), for both the right and left femur \({(u=2)}\). These four measurements were observed a second time, approximately two years later (\({v=2}\)). The sample covariance matrix obtained from Roy and Leiva (2011) is matrix \(\varvec{A}^{+}\) in Appendix E.1 in the supplementary material, which we take as the m.l.e. of \(\varvec{\varSigma }\).

We see that the variance-covariance matrices \((\varvec{U}_0)\) of the two mineral contents for the femoral neck and trochanter appear very similar for the first as well as for the second year. Also, the covariance matrices \((\varvec{U}_1)\) of the left and right femurs seem to be fairly similar for both years. Finally, the covariance matrices \((\varvec{W})\) of the two parts of the femur between the two years seem to be similar too. Thus, we will not be much surprised if the hypothesis that the population covariance matrix has a doubly exchangeable covariance structure is not rejected. In fact, as stated in Sect. 2, to test this hypothesis is equivalent to testing the hypothesis in (3). We thus compute the m.l.e. of \(\varvec{\varSigma }^*\), which is the matrix \(\varvec{A}\) in Appendix E.1 in the supplementary material where the orthogonal matrices \(\varvec{\varGamma }^{\bullet }\) and \(\varvec{\varGamma }^{*}\) are

$$\begin{aligned} \varvec{\varGamma }^\bullet =\varvec{I}_v\otimes (\underset{u\times u}{\varvec{C}^{*\prime }} \otimes \varvec{I}_{m}) \qquad \mathrm{and}\qquad \varvec{\varGamma }^*=\underset{{v \times v}}{\varvec{C}^\prime } \otimes \varvec{I}_{mu}=[\gamma _{ij}] \end{aligned}$$

where, for \(u=2\),

$$\begin{aligned} \varvec{C}^*= \left( \begin{array}{rr} \frac{1}{\sqrt{2}} &{} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} &{} ~-\frac{1}{\sqrt{2}} \end{array} \right) , \end{aligned}$$
(39)

and \(\varvec{C}=\varvec{C}^*\), given that \({v=u=2}\). In particular for \({u=2}\) and \({m=2}\) the elements \(\gamma _{ij}\) of the symmetric matrix \(\varvec{\varGamma }^*\) are given by

$$\begin{aligned} \gamma _{ii} = (-1)^{I(i>mu)} 1/\sqrt{2} \quad \text{ and } \quad \gamma _{ij}=I(j-i=mu) 1/\sqrt{2} \quad i<j\,, \end{aligned}$$

for all \(i,j=1,\ldots ,muv\), where \(I(\,\cdot \,)\) represents the indicator function.

Now, from (10) we have

$$\begin{aligned} \varLambda =\left( 2^4\,\frac{|\varvec{A}|}{|\varvec{A}_1|\,|\varvec{A}_2+\varvec{A}_4|^2\,|\varvec{A}_3|}\right) ^{n/2}, \end{aligned}$$

where \(\varvec{A}_1,\dots ,\varvec{A}_4\) denote the four diagonal blocks of dimension \(2{\scriptstyle \times }2\) of \({\varvec{A}}\) and where \(n=12\). The computed value of \(\varLambda \) is \(\lambda =4.46637{\scriptstyle \times }10^{-12}\).

Using then the near-exact distribution for \(\varLambda \) which matches \(m^*=4\) exact moments, with probability density and cumulative distribution functions given by (37), with \({r=1/2}\), as given by (36), and the other shape parameters \(\mu _j^*\) \((j=2,\dots ,muv=8)\) given by (31), with

$$\begin{aligned} \begin{array}{rcl} r_j^+ &{} = &{} \{0,3,3,2,2,1,1\}\\ s_j^+ &{} = &{} \{1,0,0,0,0,0,0\}\\ \mu _j^* &{} = &{} \{1,3,3,2,2,1,1\}\,=\,r_j^+ +s_j^+\,, \end{array} \end{aligned}$$

we obtain a near-exact p-value of 0.29468. Thus, we should not reject the null hypothesis that the covariance structure is of the doubly exchangeable type.

In case we had used the common chi-square approximation for the distribution of l.r.t. statistics, we would have \(-2\log \,\varLambda {\mathop {\sim }\limits ^{a}}\chi ^2_{\{muv(muv+1)/2\}-\{3m(m+1)/2\}}\equiv \chi ^2_{27}\), which gives a p-value of 0.00246. These results show how the chi-square approximation may be completely inadequate for small samples; indeed, even for quite large samples the chi-square approximation may lead to completely inadequate p-values.
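This chi-square p-value is simple to reproduce (a small scipy-based sketch, using the value of \(\lambda \) computed above):

```python
import numpy as np
from scipy.stats import chi2

lam = 4.46637e-12                      # computed value of Lambda (n = 12)
df = 27                                # muv(muv+1)/2 - 3m(m+1)/2 = 36 - 9
print(chi2.sf(-2 * np.log(lam), df))   # ~= 0.00246
```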

We should note that while the near-exact distribution that we used to compute the near-exact p-value yields a value of \(7.73{\scriptstyle \times }10^{-12}\) for the measure \(\varDelta \) in (38), the chi-square approximation yields for this same measure the value 0.70. These results clearly show that the common chi-square approximation, in contrast to the near-exact approach, leads to too many rejections of the null hypothesis or, equivalently, to p-values that are in general far too low, and clearly inadequate for any practical purpose.

In Fig. 2 we may analyze the plots of the cumulative distribution function for the near-exact distribution for \(-\log \,\varLambda \) and for the \(\varGamma (27/2,1)\) distribution which corresponds to the \(\chi ^2_{27}\) approximation for \(-2\log \,\varLambda \), to see how much they differ.

Fig. 2

Plots of the cumulative distribution function for the near-exact distribution for \(-\log \,\varLambda \) and for the \(\varGamma (27/2,1)\) distribution which corresponds to the \(\chi ^2_{27}\) approximation for \(-2\log \,\varLambda \)

Example 2: Osteoporosis data

See the supplementary material.

7 Discussion and conclusions

We may see how the techniques used to handle the null hypothesis in (2), by first bringing it to the form in (3) and then using the decomposition in (4), enabled the development of very accurate near-exact distributions for the l.r.t. statistic.

From the results of the numerical studies carried out, we see that the near-exact distributions developed show an interesting set of nice features. They not only have a good asymptotic behavior for increasing sample sizes, but also an extraordinary performance for very small sample sizes, as for example sample sizes exceeding by only two the overall number of variables. Furthermore, in contrast to common asymptotic distributions, these near-exact distributions also display a marked asymptotic behavior for increasing values of m, u and v. All these features add up to make the developed near-exact approximations the best choice for practical applications of the test studied.

Moreover, given the discussion in Sections 9.11 and 10.11 of Anderson (2003), the results presented concerning the exact distribution of the l.r.t. statistic, as well as the near-exact distributions developed, may be extended to cases where the vector \({\varvec{y}}\) has an elliptically contoured distribution.

As future research stemming from the work developed in the present paper, the authors point out the tests of the hypotheses

$$\begin{aligned} \begin{array}{rl} &{} H_0: \varvec{W}=\varvec{0}, \hbox { assuming } \varvec{\varSigma }=\varvec{\varTheta } \text { (that is, } \varvec{\varSigma }\text { is doubly-exchangeable)}\\ \mathrm{vs} &{} \\ &{} H_1: \varvec{\varSigma }=\varvec{\varTheta }\,, \end{array} \end{aligned}$$

which is the test between the doubly exchangeable covariance structure, in \(H_1\), and the diagonal-block compound symmetry or diagonal-block exchangeable covariance structure, in \(H_0\), as well as the test to the hypotheses

$$\begin{aligned} \begin{array}{rl} &{} H_0: \varvec{U}_1=\varvec{0}, \hbox { assuming } \varvec{\varSigma }=\varvec{\varTheta }\\ \mathrm{vs} &{} \\ &{} H_1: \varvec{\varSigma }=\varvec{\varTheta }\,, \end{array} \end{aligned}$$

which is the test between the doubly exchangeable covariance structure, in \(H_1\), and the multi-block-matrix sphericity covariance structure, in \(H_0\).

Also the tests to the hypotheses

$$\begin{aligned} \begin{array}{rl} &{} H_0: \varvec{W}=\varvec{0},\varvec{U}_1=\varvec{0}, \hbox { assuming }\varvec{\varSigma }=\varvec{\varTheta }\\ \mathrm{vs} &{} \\ &{} H_1: \varvec{\varSigma }=\varvec{\varTheta }, \hbox { with }\varvec{U}_1=\varvec{0} \end{array} \end{aligned}$$

which is the test between the multi-block-matrix sphericity covariance structure, in \(H_1\), and the block-matrix sphericity covariance structure, in \(H_0\), and yet the test to the hypotheses

$$\begin{aligned} \begin{array}{rl} &{} H_0: \varvec{W}=\varvec{0},\varvec{U}_1=\varvec{0}, \hbox { assuming } \varvec{\varSigma }=\varvec{\varTheta }\\ \mathrm{vs} &{} \\ &{} H_1: \varvec{\varSigma }=\varvec{\varTheta }, \hbox { with }\varvec{W}=\varvec{0} \end{array} \end{aligned}$$

which is the test between the multi-block compound symmetry covariance structure, in \(H_1\), and the block-matrix sphericity covariance structure, in \(H_0\), are tests whose development is of interest.

Since all these tests rest upon the assumption of a doubly exchangeable structure, these are a few more reasons why developing a test for the doubly exchangeable structure was an imperative goal. Once these four tests are developed, we will be able to test between any two of the covariance structures in Fig. 1.