Multi-Response Permutation Procedures (MRPP) were introduced in Chap. 2 and applied to interval-level, completely randomized data in Chap. 3. While multi-response permutation procedures are generally thought of as providing tests of differences among g treatment groups as demonstrated in Chap. 3, they also have applications in ordinary least squares (OLS) linear regression analyses with v = 2 and least absolute deviations (LAD) linear regression analyses with v = 1. In this fourth chapter of Permutation Statistical Methods, MRPP analyses of LAD regression residuals are illustrated with a variety of experimental designs, including one-way completely randomized with and without a covariate, one-way and two-way randomized-block, two-way factorial, Latin square, and two-factor nested analysis-of-variance designs. Also considered are multivariate multiple regression designs.

4.1 LAD Linear Regression

OLS linear regression has long been recognized as a useful tool in many fields of research. The optimal properties of OLS regression are well known when the errors are normally distributed. However, in practice the assumption of multivariate normality is rarely justified. LAD linear regression is an attractive alternative to OLS regression as it is extremely robust to deviations from normality as well as to the presence of extreme values [297, p. 172].

It is widely recognized that estimators of OLS regression parameters can be severely affected by unusual values in either the criterion variable or in one or more of the predictor variables. This is due in large part to the weight given to each data point when minimizing the sum of squared errors. In contrast, LAD regression is much less sensitive to the effects of unusual values because the errors are not squared. Moreover, LAD regression has been shown to be superior to OLS regression when errors are generated from heavy-tailed or outlier-producing distributions, such as the Cauchy and double-exponential distributions; see, for example, articles by Blattberg and Sargent [46], Dielman [94, 95], Dielman and Pfaffenberger [96], Dielman and Rose [97], Mathew and Nordström [264], Mielke, Berry, Landsea, and Gray [303], Pfaffenberger and Dinkel [337], Rice and White [346], Rosenberg and Carlson [352], Rousseeuw [355], Taylor [394], and Wilson [432].

As described by Sheynin, the initial known use of regression by Daniel Bernoulli (c. 1734) for astronomical prediction problems involved LAD regression based on ordinary Euclidean distances between the observed and predicted response values [372]. Further developments in LAD regression were due to Roger Joseph (Rogerius Josephus) Boscovich (c. 1755), Pierre-Simon Laplace (c. 1789), and Carl Friedrich Gauss (c. 1809). The American mathematician and astronomer Nathaniel Bowditch (c. 1809) was highly critical of OLS regression because, as he argued, squared regression residuals unduly emphasized questionable observations in comparison with the absolute regression residuals associated with LAD regression [372].

Consider the general multivariate regression model given by

$$\displaystyle{ \mathbf{y}_{i} =\boldsymbol{ h}\left (\boldsymbol{\beta },\mathbf{x}_{i}\right ) + \mathbf{e}_{i}\;, }$$

where \(\mathbf{y}_{i}^{\,{\prime}} = (y_{1i},\,\ldots,\,y_{ri})\) denotes the row vector of r observed response measurements for the ith of N objects, \(\mathbf{x}_{i}^{\,{\prime}} = (x_{1i},\,\ldots,\,x_{si})\) is the row vector of s predictor values for the ith object, \(\boldsymbol{\beta }^{{\prime}} = (\beta _{1},\,\ldots,\,\beta _{t})\) is the row vector of t parameters, \(\boldsymbol{h}^{{\prime}} = (h_{1},\,\ldots,\,h_{r})\) is the row vector of r model functions of \(\boldsymbol{\beta }\) and \(\mathbf{x}_{i}\) for the ith object, and \(\mathbf{e}_{i}^{\,{\prime}} = (e_{1i},\,\ldots,\,e_{ri})\) denotes the r errors between the response variables and model functions for the ith object, \(i = 1,\,\ldots,\,N\). The special case of a multivariate linear regression model is given by

$$\displaystyle{ \mathbf{y}_{i} = \mathbf{B}\boldsymbol{f}(\mathbf{x}_{i}) + \mathbf{e}_{i}\;, }$$

where \(\boldsymbol{f}(\mathbf{x}_{i})\) denotes a column vector of p distinct functions of the s predictors (\(\mathbf{x}_{i}\)) for the ith object, \(i = 1,\,\ldots,\,N\), and \(\mathbf{B}\) is an r×p matrix of parameters in which \((B_{j1},\,\ldots,\,B_{jp})\) is the row vector of p parameters associated with the jth response measurement, \(j = 1,\,\ldots,\,r\).

Let \(\mathbf{y}_{i}\) denote a column vector of r observed response measurement scores and let \(\tilde{\mathbf{y}}_{i}\) denote a column vector of r predicted response values for the ith object, \(i = 1,\,\ldots,\,N\). Thus, the general and linear predicted multivariate regression models are given by

$$\displaystyle\begin{array}{rcl} \qquad \qquad \qquad \tilde{\mathbf{y}}_{i}& =& \boldsymbol{h}\left (\boldsymbol{\tilde{\beta }},\mathbf{x}_{i}\right ) {}\\ \text{and}\qquad \qquad \qquad \qquad \qquad \qquad & & {}\\ \qquad \qquad \qquad \tilde{\mathbf{y}}_{i}& =& \tilde{\mathbf{B}}\boldsymbol{f}\left (\mathbf{x}_{i}\right )\;, {}\\ \end{array}$$

respectively, where \(\boldsymbol{\tilde{\beta }}\) and \(\tilde{\mathbf{B}}\) are estimated parameters that are intended to provide good fits between the \(\mathbf{y}_{i}\) and \(\tilde{\mathbf{y}}_{i}\) values relative to a selected goodness-of-fit criterion. The null hypothesis (\(H_{0}\)) underlying each criterion dictates that each of the N! possible pairings of the predicted sequential ordering (\(\tilde{\mathbf{y}}_{1},\,\ldots,\,\tilde{\mathbf{y}}_{N}\)) with the fixed observed sequential ordering (\(\mathbf{y}_{1},\,\ldots,\,\mathbf{y}_{N}\)) occurs with equal probability, i.e., \(1/N!\).

Let \(\Delta (\tilde{\mathbf{y}}_{i},\mathbf{y}_{i})\) for \(i = 1,\,\ldots,\,N\) denote the distance function between the predicted and observed response measurement values and consider the generalized Minkowski distance function given by

$$\displaystyle{ \Delta (\tilde{\mathbf{y}}_{i},\mathbf{y}_{i}) = \left (\,\sum _{j=1}^{r}\big\vert \tilde{y}_{ ij} - y_{ij}\big\vert ^{w}\right )^{\!v/w}\;, }$$

where w ≥ 1 and v > 0. The choice of v = 1 is preferred because it yields the Minkowski metric [12], whereas v > 1 yields distance functions that do not satisfy the triangle inequality property of a metric. Consequently, the distance function of choice utilizes v = 1 and w = 2, i.e., an ordinary Euclidean distance function.

Let the average distance function between \((\tilde{\mathbf{y}}_{1},\,\ldots,\,\tilde{\mathbf{y}}_{N})\) and \((\mathbf{y}_{1},\,\ldots,\,\mathbf{y}_{N})\) be given by

$$\displaystyle{ \delta = \frac{1} {N}\sum _{i=1}^{N}\Delta \big(\tilde{\mathbf{y}}_{ i},\mathbf{y}_{i}\big)\;. }$$
(4.1)

As noted previously, a distance function with v > 1 is not a metric function. If the distance function associated with LAD regression is squared (i.e., v = 2), then the estimated parameters that minimize δ yield an OLS regression model.

The criterion for fitting multivariate regression models based on δ is the chance-corrected measure of agreement between the observed and predicted response measurement values given by

$$\displaystyle{ \mathfrak{R} = 1 -\frac{\delta } {\mu _{\delta }}\;, }$$
(4.2)

where \(\mu_{\delta}\) is the expected value of δ over the N! possible pairings under the null hypothesis. An efficient computational expression for obtaining \(\mu_{\delta}\) that involves a sum of \(N^{2}\) rather than N! terms is given by

$$\displaystyle{ \mu _{\delta } = \frac{1} {N^{2}}\sum _{i=1}^{N}\,\sum _{ j=1}^{N}\Delta \big(\tilde{\mathbf{y}}_{ i},\mathbf{y}_{j}\big)\;. }$$
(4.3)
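The three quantities in Eqs. (4.1)–(4.3) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `distance` and `agreement` and the small data set are assumptions introduced here, not part of the original analyses.

```python
def distance(pred, obs, v=1.0, w=2.0):
    """Generalized Minkowski distance between two r-dimensional points."""
    return sum(abs(p - o) ** w for p, o in zip(pred, obs)) ** (v / w)


def agreement(pred, obs, v=1.0, w=2.0):
    """Return (delta, mu_delta, R) following Eqs. (4.1), (4.3), and (4.2)."""
    n = len(obs)
    # Eq. (4.1): average distance between paired predicted and observed values
    delta = sum(distance(p, o, v, w) for p, o in zip(pred, obs)) / n
    # Eq. (4.3): average distance over all N^2 (predicted, observed) pairings
    mu_delta = sum(distance(p, o, v, w) for p in pred for o in obs) / n ** 2
    # Eq. (4.2): chance-corrected agreement
    return delta, mu_delta, 1.0 - delta / mu_delta


# Hypothetical data with r = 2 responses on N = 3 objects
observed = [(1.0, 2.0), (3.0, 1.0), (4.0, 5.0)]
perfect = agreement(observed, observed)

# Hypothetical univariate data in which the predictions are reversed
reversed_case = agreement([(2.0,), (0.0,)], [(0.0,), (2.0,)])
```

With identical predicted and observed values the sketch returns δ = 0 and an agreement of 1, while the reversed univariate pairing gives a negative agreement value, i.e., agreement below that expected by chance.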

4.1.1 Linear Regression and Agreement

A simple interpretation of \(\mathfrak{R}\) can be described for \(r = s = 1\) since the same interpretation holds for any r and s. In the case involving perfect agreement, \(\tilde{y}_{i} = y_{i}\) for \(i = 1,\,\ldots,\,N\), δ = 0.00, and \(\mathfrak{R} = 1.00\). This implies that the functional relationship between \(\tilde{y}\) and y can be described by a straight line that passes through the origin with a slope of 45°, as depicted in Fig. 4.1 with N = 5 bivariate \((y,\,\tilde{y})\) values: (2, 2), (4, 4), (6, 6), (8, 8), and (10, 10). For the N = 5 data points depicted in Fig. 4.1, the intercept is \(\tilde{\beta }_{0} = 0.00\), the unstandardized slope is \(\tilde{\beta }_{1} = +1.00\), the squared Pearson product-moment correlation coefficient is \(r_{y\tilde{y}}^{2} = +1.00\), and the agreement percentage is also 1.00, i.e., all five of the y and \(\tilde{y}\) paired values agree.

Fig. 4.1 Graphic depicting a regression line with perfect agreement between y and \(\tilde{y}\) with intercept equal to 0.00 and slope equal to +1.00

In this context, the squared Pearson product-moment correlation coefficient, \(r_{y\tilde{y}}^{2}\), has also been used as a measure of agreement. However, \(r_{y\tilde{y}}^{2} = +1.00\) implies a linear relationship between y and \(\tilde{y}\), where both the intercept and slope are arbitrary. While perfect agreement is described by \(\mathfrak{R} = +1.00\), \(r_{y\tilde{y}}^{2} = +1.00\) describes a linear relationship that may or may not reflect perfect agreement as depicted in Fig. 4.2 with N = 5 \((y,\,\tilde{y})\) values: (2, 4), (4, 5), (6, 6), (8, 7), and (10, 8). For the N = 5 bivariate data points depicted in Fig. 4.2, the intercept is \(\tilde{\beta }_{0} = +3.00\), the unstandardized slope is \(\tilde{\beta }_{1} = +0.50\), the squared Pearson product-moment correlation coefficient is \(r_{y\tilde{y}}^{2} = +1.00\), and the agreement percentage is 0.20, i.e., only one (6, 6) of the N = 5 y and \(\tilde{y}\) paired values agrees.

Fig. 4.2 Graphic depicting a regression line with perfect correlation between y and \(\tilde{y}\) with intercept equal to +3.00 and slope equal to +0.50
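The contrast between correlation and agreement can be checked numerically for the two sets of N = 5 points given above. The following sketch assumes v = 1 and univariate values; the helper names `pearson_r2` and `agreement_R` are illustrative and not taken from the text.

```python
def pearson_r2(y, y_pred):
    """Squared Pearson product-moment correlation between y and y_pred."""
    n = len(y)
    mean_y, mean_p = sum(y) / n, sum(y_pred) / n
    sxy = sum((a - mean_y) * (b - mean_p) for a, b in zip(y, y_pred))
    sxx = sum((a - mean_y) ** 2 for a in y)
    syy = sum((b - mean_p) ** 2 for b in y_pred)
    return sxy ** 2 / (sxx * syy)


def agreement_R(y, y_pred):
    """Chance-corrected agreement of Eq. (4.2) with v = 1 and r = 1."""
    n = len(y)
    delta = sum(abs(p - o) for p, o in zip(y_pred, y)) / n          # Eq. (4.1)
    mu_delta = sum(abs(p - o) for p in y_pred for o in y) / n ** 2  # Eq. (4.3)
    return 1.0 - delta / mu_delta


y = [2.0, 4.0, 6.0, 8.0, 10.0]
pred_fig_41 = [2.0, 4.0, 6.0, 8.0, 10.0]  # points of Fig. 4.1
pred_fig_42 = [4.0, 5.0, 6.0, 7.0, 8.0]   # points of Fig. 4.2

r2_41, R_41 = pearson_r2(y, pred_fig_41), agreement_R(y, pred_fig_41)
r2_42, R_42 = pearson_r2(y, pred_fig_42), agreement_R(y, pred_fig_42)
# Proportion of exactly matching pairs for the Fig. 4.2 points
exact_42 = sum(a == b for a, b in zip(y, pred_fig_42)) / len(y)
```

For the Fig. 4.2 points the squared correlation is 1, yet δ = 1.2 and μ_δ = 2.64 under these assumptions, so the chance-corrected agreement is roughly 0.55 rather than 1.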

Comparisons of \(\mathfrak{R}\) with other measures of agreement and the advantages of \(\mathfrak{R}\) relative to the other agreement measures were detailed in a 1996 article by Watterson [416].

While the agreement measure \(\mathfrak{R}\) provides a description of the functional relationship between \((\tilde{\mathbf{y}}_{1},\,\ldots,\,\tilde{\mathbf{y}}_{N})\) and \((\mathbf{y}_{1},\,\ldots,\,\mathbf{y}_{N})\), it does not indicate how extreme an observed value of \(\mathfrak{R}\), say \(\mathfrak{R}_{\text{o}}\), is relative to the N! possible values of \(\mathfrak{R}\) under the null hypothesis. Since \(\mu_{\delta}\) is invariant under the null hypothesis and the observed value of δ is given by

$$\displaystyle{ \delta _{\text{o}} =\mu _{\delta }(1 -\mathfrak{R}_{\text{o}})\;, }$$

the exact probability value for \(\mathfrak{R}_{\text{o}}\) is given by

$$\displaystyle{ P\big(\mathfrak{R}\geq \mathfrak{R}_{\text{o}}\vert H_{0}\big) = P\big(\delta \leq \delta _{\text{o}}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{\text{o}}$}} {M} \;, }$$

where M = N!. Because an exact probability value requires generating N! arrangements of the observed data, calculation of an exact value is prohibitive even for small values of N, e.g., \(M = N! = 15! = 1,307,674,368,000\).

When M is very large, an approximate probability value for δ may be obtained from a resampling permutation procedure. Let L denote the number of arrangements in a random sample drawn from all M possible arrangements of the observed data, where L is typically a large number, e.g., L = 1,000,000. Then, an approximate resampling probability value is given by

$$\displaystyle{ P\big(\mathfrak{R}\geq \mathfrak{R}_{\text{o}}\vert H_{0}\big) = P\big(\delta \leq \delta _{\text{o}}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{\text{o}}$}} {L} \;. }$$
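The exact and resampling probability values above can be sketched as follows for univariate responses with v = 1. The function names, the seed, the reduced value of L, and the small floating-point tolerance are assumptions made for illustration; the example data are the perfectly agreeing values of Fig. 4.1.

```python
import random
from itertools import permutations


def delta(pred, obs):
    """Average absolute distance between paired values, Eq. (4.1) with v = 1."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def exact_p_value(pred, obs):
    """Enumerate all N! pairings; feasible only for very small N."""
    delta_o = delta(pred, obs)
    values = [delta(perm, obs) for perm in permutations(pred)]
    # A tiny tolerance guards against floating-point ties
    return sum(d <= delta_o + 1e-12 for d in values) / len(values)


def resampling_p_value(pred, obs, L=10_000, seed=0):
    """Approximate P(delta <= delta_o | H0) from L random pairings."""
    rng = random.Random(seed)
    delta_o = delta(pred, obs)
    shuffled = list(pred)
    hits = 0
    for _ in range(L):
        rng.shuffle(shuffled)  # one random pairing of predicted with observed
        if delta(shuffled, obs) <= delta_o + 1e-12:
            hits += 1
    return hits / L


y = [2.0, 4.0, 6.0, 8.0, 10.0]
p_exact = exact_p_value(y, y)
p_resampled = resampling_p_value(y, y)
```

Because only the identity pairing reproduces δ = 0 for these five distinct values, the exact probability is 1/5! = 1/120, and the resampling estimate should fall close to that value.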

Also, when M is very large and P is exceedingly small, a moment-approximation permutation procedure based on fitting the first three exact moments of the discrete permutation distribution to a Pearson type III distribution provides approximate probability values, as detailed in Chap. 1, Sect. 1.2.2; see also references [284] and [300].

4.2 Example LAD Regression Analyses

In this section, example analyses illustrate the permutation approach to typical multiple regression problems. The first example analyzes a small set of multivariate response measurement scores using LAD regression and generates a resampling permutation probability value; the second example analyzes the same small set of multivariate response measurement scores using OLS regression and also generates a resampling permutation probability value; the third example analyzes the same set of multivariate response measurement scores using OLS regression, but provides a conventional approximate probability value based on Snedecor’s F distribution.

4.2.1 Example Analysis 1

Consider the multiple regression data listed in Fig. 4.3 where values of s = 2 predictor variables have been obtained for each of N = 12 objects, \(y_{1},\,\ldots,\,y_{N}\) denotes the observed response measurement scores for the N objects, and \(\mathbf{x}_{i}^{\,{\prime}} = (x_{1i},\,x_{2i})\) is the row vector of s = 2 predictor variables for the ith of N objects.

Fig. 4.3 Example data with s = 2 independent variables on N = 12 objects

Because there are \(M = 12! = 479,001,600\) possible, equally-likely arrangements of the N = 12 multivariate response measurement scores in Fig. 4.3, an exact permutation approach is impractical and a resampling procedure is mandated.

A LAD regression analysis of the multivariate response measurement scores listed in Fig. 4.3 yields estimated regression coefficients of

$$\displaystyle{ \tilde{\beta }_{0} = +3.8571\;,\quad \tilde{\beta }_{1} = +0.4286\;,\mbox{ and}\quad \tilde{\beta }_{2} = +0.1429\;. }$$

Figure 4.4 lists the observed \(y_{i}\) values, LAD predicted \(\tilde{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,12\).

Fig. 4.4 Observed, predicted, and residual LAD regression values for the example data listed in Fig. 4.3

Following Eq. (4.1) on p. 117 with v = 1, the observed value of the MRPP test statistic calculated on the LAD regression residuals listed in Fig. 4.4 is \(\delta_{\text{o}} = 1.50\).

If all M possible arrangements of the N = 12 observed LAD regression residuals listed in Fig. 4.4 occur with equal chance, the approximate resampling probability value of \(\delta_{\text{o}} = 1.50\) calculated on L = 1,000,000 random arrangements of the observed LAD regression residuals is

$$\displaystyle{ P\big(\delta \leq \delta _{\text{o}}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{\text{o}}$}} {L} = \frac{191,128} {1,000,000} = 0.0191\;. }$$

Following Eq. (4.3) on p. 117, the exact expected value of the M = 479,001,600 δ values is \(\mu_{\delta} = 1.8294\) and, following Eq. (4.2) on p. 117, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{\text{o}} = 1 -\frac{\delta _{\text{o}}} {\mu _{\delta }} = 1 - \frac{1.50} {1.8294} = +0.1800\;, }$$

indicating 18 % agreement between the observed and predicted y values above that expected by chance.

4.2.2 Example Analysis 2

For a second example analysis of the multivariate response measurement scores listed in Fig. 4.3 on p. 120, consider an OLS regression analysis based on a resampling permutation procedure. An OLS regression analysis of the multivariate response measurement scores listed in Fig. 4.3 yields estimated regression coefficients of

$$\displaystyle{ \hat{\beta }_{0} = +6.8198\;,\quad \hat{\beta }_{1} = +0.6356\;,\mbox{ and}\quad \hat{\beta }_{2} = -0.0649\;. }$$

Figure 4.5 lists the observed \(y_{i}\) values, OLS predicted \(\hat{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,12\).

Fig. 4.5 Observed, predicted, and residual OLS regression values for the example data listed in Fig. 4.3

Following Eq. (4.1) on p. 117 with v = 2, the observed value of the MRPP test statistic computed on the OLS regression residuals listed in Fig. 4.5 is \(\delta_{\text{o}} = 3.1502\). If all M possible arrangements of the N = 12 observed OLS regression residuals listed in Fig. 4.5 occur with equal chance, the approximate resampling probability value of \(\delta_{\text{o}} = 3.1502\) computed on L = 1,000,000 random arrangements of the observed OLS regression residuals is

$$\displaystyle{ P\big(\delta \leq \delta _{\text{o}}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{\text{o}}$}} {L} = \frac{96,104} {1,000,000} = 0.0961\;. }$$

For comparison, the approximate resampling probability value based on LAD regression in Example 1 is P = 0.0191.

Following Eq. (4.3) on p. 117, the exact expected value of the M = 479,001,600 δ values is \(\mu_{\delta} = 5.2942\) and, following Eq. (4.2) on p. 117, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\hat{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{\text{o}} = 1 -\frac{\delta _{\text{o}}} {\mu _{\delta }} = 1 -\frac{3.1502} {5.2942} = +0.4050\;, }$$

indicating approximately 41 % agreement between the observed and predicted y values above that expected by chance.

4.2.3 Example Analysis 3

Finally, consider a conventional OLS regression analysis of the multivariate response measurement scores listed in Fig. 4.3 on p. 120. An OLS regression analysis yields estimated regression coefficients of

$$\displaystyle{ \hat{\beta }_{0} = +6.8198\;,\quad \hat{\beta }_{1} = +0.6356\;,\mbox{ and}\quad \hat{\beta }_{2} = -0.0649\;, }$$

the regression residuals are listed in Fig. 4.5, and the observed squared multiple correlation coefficient is \(R_{y.x_{1},x_{2}}^{2} = 0.2539\). \(R_{y.x_{1},x_{2}}^{2}\) may be transformed into an F-ratio by

$$\displaystyle{ F = \frac{(N - s - 1)R_{y.x_{1},x_{2}}^{2}} {s(1 - R_{y.x_{1},x_{2}}^{2})} = \frac{(12 - 2 - 1)(0.2539)} {(2)(1 - 0.2539)} = 1.5313\;. }$$

Assuming independence, normality, and homogeneity of variance, F is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = s = 2\) and \(\nu _{2} = N - s - 1 = 12 - 2 - 1 = 9\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{\text{o}} = 1.5313\) yields an approximate probability value of P = 0.2677.
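The conversion from the squared multiple correlation to F, and from F to a probability value, can be sketched as below. The sketch relies on the closed-form upper tail of Snedecor's F when the numerator degrees of freedom equal 2, namely \(P(F \geq f) = (1 + 2f/\nu_{2})^{-\nu_{2}/2}\); this shortcut and the function names are choices made here for a dependency-free illustration and apply only when \(\nu_{1} = 2\).

```python
def f_from_r2(r2, n, s):
    """F-ratio from a squared multiple correlation with s predictors, N = n."""
    return (n - s - 1) * r2 / (s * (1.0 - r2))


def f_upper_tail_nu1_is_2(f, nu2):
    """Upper-tail probability of F with 2 and nu2 degrees of freedom.

    Valid only for nu1 = 2, where the tail has the closed form
    (1 + 2 f / nu2) ** (-nu2 / 2).
    """
    return (1.0 + 2.0 * f / nu2) ** (-nu2 / 2.0)


# Values reported for Example 3: R^2 = 0.2539, N = 12, s = 2
f_obs = f_from_r2(0.2539, n=12, s=2)
p_obs = f_upper_tail_nu1_is_2(f_obs, nu2=12 - 2 - 1)
```

Because the reported \(R^{2}\) is rounded to four places, the computed F may differ from the reported value in the fourth decimal place.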

Note that the asymptotic probability value based on OLS regression in Example 3 is P = 0.2677, while a resampling analysis of the same data in Example 2 yielded a probability value, again based on OLS regression, of P = 0.0961, a marked difference. Moreover, a LAD regression analysis of the same data in Example 1 yielded an approximate resampling probability value of P = 0.0191, once again demonstrating the different results possible with v = 1 and v = 2, both with and without a permutation analysis.

4.3 LAD Regression and Analysis of Variance Designs

It is well known that experimental designs that would ordinarily be analyzed by some form of analysis of variance can also be analyzed by OLS multiple regression using either dummy- or effect-coding schemes. The same is true of LAD regression. In this section a variety of analysis-of-variance designs are analyzed using MRPP, LAD regression, and either dummy or effect coding of treatment groups; included are one-way randomized, one-way randomized with a covariate, one-way randomized-block, two-way randomized-block, two-way factorial, Latin square, split-plot, and two-factor nested analysis-of-variance designs.

4.3.1 One-Way Randomized Design

Consider a one-way completely randomized experimental design with fixed effects in which N = 26 objects have been randomly assigned to one of g = 3 treatment groups with \(n_{1} = 8\) and \(n_{2} = n_{3} = 9\). The design and data are adapted from Stevens [387, p. 70] and are given in Fig. 4.6.

Fig. 4.6 Example data for a one-way randomized design with g = 3 treatment groups and univariate response measurement scores on N = 26 objects

For a one-way randomized experimental design, the appropriate regression model is given by

$$\displaystyle{ y_{i} =\sum _{ j=1}^{m}x_{ ij}\beta _{j} + e_{i}\;, }$$

where \(y_{i}\) denotes the ith of N responses possibly affected by a treatment; \(x_{ij}\) is the jth of m covariates associated with the ith response, where \(x_{i1} = 1\) if the model includes an intercept; \(\beta_{j}\) denotes the jth of m regression parameters; and \(e_{i}\) designates the error associated with the ith of N responses. If the estimates of \(\beta_{1},\,\ldots,\,\beta_{m}\) that minimize

$$\displaystyle{ \sum _{i=1}^{N}\vert e_{ i}\vert }$$

are denoted by \(\tilde{\beta }_{1},\,\ldots,\,\tilde{\beta }_{m}\), then the N residuals of the LAD regression model are given by \(e_{i} = y_{i} -\tilde{ y}_{i}\) for \(i = 1,\,\ldots,\,N\), where the predicted value of \(y_{i}\) is given by

$$\displaystyle{ \tilde{y}_{i} =\sum _{ j=1}^{m}x_{ ij}\tilde{\beta }_{j}\;,\qquad i = 1,\,\ldots,\,N\;. }$$

In contrast, OLS regression estimators of \(\beta_{1},\,\ldots,\,\beta_{m}\) minimize

$$\displaystyle{ \sum _{i=1}^{N}e_{ i}^{2}\;, }$$

the N residuals of the OLS regression model are given by \(e_{i} = y_{i} -\hat{ y}_{i}\) for \(i = 1,\,\ldots,\,N\), and the predicted value of \(y_{i}\) is given by

$$\displaystyle{ \hat{y}_{i} =\sum _{ j=1}^{m}x_{ ij}\hat{\beta }_{j}\;,\qquad i = 1,\,\ldots,\,N\;. }$$
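The difference between the two criteria is easiest to see in the simplest case of an intercept-only model (m = 1 with \(x_{i1} = 1\)), where the LAD estimate is a sample median and the OLS estimate is the sample mean. The sketch below uses hypothetical data containing one extreme value; the data and helper names are assumptions for illustration only.

```python
from statistics import mean, median


def sum_abs_errors(y, b0):
    """LAD criterion: sum of absolute residuals for an intercept-only model."""
    return sum(abs(v - b0) for v in y)


def sum_sq_errors(y, b0):
    """OLS criterion: sum of squared residuals for an intercept-only model."""
    return sum((v - b0) ** 2 for v in y)


# Hypothetical responses with one extreme value
y = [3.0, 4.0, 5.0, 6.0, 40.0]

b0_lad = median(y)  # minimizes the sum of absolute residuals
b0_ols = mean(y)    # minimizes the sum of squared residuals

# Check both minimizers against a coarse grid of candidate intercepts
candidates = [c / 10.0 for c in range(0, 401)]
lad_is_minimal = all(
    sum_abs_errors(y, b0_lad) <= sum_abs_errors(y, c) + 1e-9 for c in candidates
)
ols_is_minimal = all(
    sum_sq_errors(y, b0_ols) <= sum_sq_errors(y, c) + 1e-9 for c in candidates
)
```

In this hypothetical example the extreme value pulls the OLS intercept to 11.6 while the LAD intercept stays at 5.0, which illustrates the robustness property discussed in Sect. 4.1.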

If the N regression residuals are partitioned into g disjoint treatment groups of sizes \(n_{1},\,\ldots,\,n_{g}\), where \(n_{i} \geq 2\) for \(i = 1,\,\ldots,\,g\) and

$$\displaystyle{ N =\sum _{ i=1}^{g}n_{ i}\;, }$$

then the permutation test depends on the test statistic

$$\displaystyle{ \delta =\sum _{ i=1}^{g}C_{ i}\xi _{i}\;, }$$
(4.4)

where

$$\displaystyle{ C_{i} = \frac{n_{i}} {N}\;,\qquad i = 1,\,\ldots,\,g\;, }$$

is a positive weight for the ith of g treatment groups that minimizes the variability of δ,

$$\displaystyle{ \sum _{i=1}^{g}C_{ i} = 1\;, }$$

and \(\xi _{i}\) is the average pairwise Euclidean difference among the \(n_{i}\) residuals in the ith of g treatment groups defined by

$$\displaystyle{ \xi _{i} = \binom{n_{i}}{2}^{\!-1}\,\sum _{ j=1}^{N-1}\,\sum _{ k=j+1}^{N}\Big[\big(e_{ j} - e_{k}\big)^{2}\Big]^{v/2}\Psi _{ ji}\,\Psi _{ki}\;, }$$
(4.5)

where v = 1 for LAD regression and

$$\displaystyle{ \Psi _{ji} = \left \{\begin{array}{@{}l@{\quad }l@{}} \,1 \quad &\mbox{ if $e_{j}$ is in the $i$th treatment group}\;, \\ [6pt]\,0\quad &\text{otherwise}\;. \end{array} \right. }$$
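Equations (4.4) and (4.5) can be sketched directly, noting that \([(e_{j} - e_{k})^{2}]^{v/2} = \vert e_{j} - e_{k}\vert^{v}\) and that the indicator functions simply restrict the double sum to pairs within the same group. The function names and the small set of residuals are hypothetical, and the weights are \(C_{i} = n_{i}/N\) as defined above.

```python
from itertools import combinations


def xi(residuals, v=1.0):
    """Eq. (4.5): average pairwise distance among residuals in one group."""
    pairs = list(combinations(residuals, 2))  # n_i choose 2 within-group pairs
    return sum(abs(a - b) ** v for a, b in pairs) / len(pairs)


def mrpp_delta(groups, v=1.0):
    """Eq. (4.4): weighted average of the xi values with C_i = n_i / N."""
    n_total = sum(len(g) for g in groups)
    return sum((len(g) / n_total) * xi(g, v) for g in groups)


# Hypothetical residuals in g = 2 treatment groups
groups = [[1.0, 2.0, 4.0], [0.0, 5.0]]
xi_values = [xi(g) for g in groups]
delta_obs = mrpp_delta(groups)
```

For these hypothetical residuals the within-group averages are 2 and 5, and the weighted statistic is (3/5)(2) + (2/5)(5) = 3.2.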

The null hypothesis specifies that each of the

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{g}n_{ i}!} }$$

allocations of the N residuals to the g treatment groups is equally likely with \(n_{i}\), \(i = 1,\,\ldots,\,g\), residuals preserved for each arrangement of the observed data. The exact probability value of an observed value of δ, \(\delta_{\text{o}}\), is given by

$$\displaystyle{ P\big(\delta \leq \delta _{\text{o}}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{\text{o}}$}} {M} \;. }$$

As previously, when M is large, an approximate probability value of δ may be obtained from a resampling procedure, where

$$\displaystyle{ P\big(\delta \leq \delta _{\text{o}}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{\text{o}}$}} {L} }$$

and L denotes the number of resampled test statistic values. Typically, L is set to a large number to ensure accuracy, e.g., L = 1,000,000. When M is very large and P is exceedingly small, a resampling-approximation permutation procedure may produce no δ values equal to or less than \(\delta_{\text{o}}\), even with L = 1,000,000, yielding an approximate resampling probability value of P = 0.00. In such cases, moment-approximation permutation procedures based on fitting the first three exact moments of the discrete permutation distribution to a Pearson type III distribution provide approximate probability values, as detailed in Chap. 1, Sect. 1.2.2 [284, 300].
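A resampling version of the treatment-group test can be sketched as follows: the pooled residuals are shuffled, cut into groups of the original sizes, and δ is recomputed for each of the L random allocations. The data, seed, reduced value of L, and function names are assumptions for illustration; the weights are \(C_{i} = n_{i}/N\) with v = 1.

```python
import random
from itertools import combinations


def xi(values, v=1.0):
    """Average pairwise distance among the residuals in one group."""
    pairs = list(combinations(values, 2))
    return sum(abs(a - b) ** v for a, b in pairs) / len(pairs)


def mrpp_delta(values, sizes, v=1.0):
    """Weighted statistic with C_i = n_i / N; groups are consecutive slices."""
    n_total = len(values)
    total, start = 0.0, 0
    for n in sizes:
        total += (n / n_total) * xi(values[start:start + n], v)
        start += n
    return total


def resampling_p_value(values, sizes, L=20_000, v=1.0, seed=1):
    """Approximate P(delta <= delta_o | H0) from L random allocations."""
    rng = random.Random(seed)
    delta_o = mrpp_delta(values, sizes, v)
    pooled = list(values)
    hits = 0
    for _ in range(L):
        rng.shuffle(pooled)  # random allocation preserving the group sizes
        if mrpp_delta(pooled, sizes, v) <= delta_o + 1e-12:
            hits += 1
    return delta_o, hits / L


# Hypothetical residuals: two well-separated groups of four
residuals = [1.0, 2.0, 3.0, 4.0, 11.0, 12.0, 13.0, 14.0]
delta_o, p_value = resampling_p_value(residuals, sizes=(4, 4))
```

For this hypothetical example only the observed split and its mirror image attain the observed δ among the 70 possible allocations, so the estimate should lie near 2/70 ≈ 0.029.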

An index of the effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is given by the chance-corrected measure

$$\displaystyle{ \mathfrak{R} = 1 -\frac{\delta } {\mu _{\delta }}\;, }$$
(4.6)

where \(\mu_{\delta}\) is the arithmetic average of the δ values calculated on all M equally-likely arrangements of the observed response measurements, i.e.,

$$\displaystyle{ \mu _{\delta } = \frac{1} {M}\sum _{i=1}^{M}\delta _{ i}\;. }$$
(4.7)
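For very small N, Eq. (4.7) can be evaluated by brute force. The sketch below does so for hypothetical residuals and compares the result with the average distance over all \(\binom{N}{2}\) pairs of residuals; that the two coincide when the weights sum to one is a standard property of this statistic, stated here as an aside rather than taken from the text above. Enumerating all N! orderings instead of the M distinct allocations leaves the mean unchanged because every allocation is repeated equally often.

```python
from itertools import combinations, permutations


def xi(values, v=1.0):
    """Average pairwise distance among a set of residuals."""
    pairs = list(combinations(values, 2))
    return sum(abs(a - b) ** v for a, b in pairs) / len(pairs)


def mrpp_delta(values, sizes, v=1.0):
    """Weighted statistic with C_i = n_i / N; groups are consecutive slices."""
    n_total = len(values)
    total, start = 0.0, 0
    for n in sizes:
        total += (n / n_total) * xi(values[start:start + n], v)
        start += n
    return total


# Hypothetical residuals with group sizes n_1 = 2 and n_2 = 3
residuals = [-3.0, -1.0, 0.5, 2.0, 6.0]
sizes = (2, 3)

# Eq. (4.7) by enumeration over all orderings of the residuals
all_deltas = [mrpp_delta(p, sizes) for p in permutations(residuals)]
mu_enumerated = sum(all_deltas) / len(all_deltas)

# Shortcut: average distance over all pairs of the N residuals
mu_shortcut = xi(residuals)
```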

A design matrix of dummy codes for an MRPP regression analysis of the N = 26 response measurement scores in Fig. 4.6 is given in Fig. 4.7 where the first columns of 1 values provide for an intercept. The second columns contain the N = 26 univariate response measurement scores listed according to the original random assignment of the N = 26 objects to the g = 3 treatment groups with the first \(n_{1} = 8\) scores, the next \(n_{2} = 9\) scores, and the last \(n_{3} = 9\) scores associated with the first, second, and third treatment groups, respectively.

Fig. 4.7 Design matrix and data for a one-way randomized design with g = 3 treatment groups and univariate response measurement scores on N = 26 objects

Because the purpose of the analysis is to test for possible differences among the g = 3 treatment groups, a reduced regression model is constructed without a variate for treatments. Therefore, for a single-factor experiment the design matrix for the reduced model is composed solely of a code for the intercept. The MRPP regression analysis examines the N = 26 regression residuals for possible differences among the g = 3 treatment levels; consequently, no dummy codes for treatments are included in Fig. 4.7 as this information is implicit in the ordering of the g = 3 treatment groups in the three columns labeled “Score” with \(n_{1} = 8\) and \(n_{2} = n_{3} = 9\) values.

An exact permutation solution is impractical for the univariate response measurements listed in Fig. 4.7 since there are

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{g}n_{ i}!} = \frac{26!} {8!\;9!\;9!} = 75,957,810,500 }$$

possible, equally-likely arrangements of the N = 26 univariate response measurement scores; consequently, a resampling procedure is the default in this case.
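The number of arrangements can be checked with exact integer arithmetic. The helper name `n_arrangements` is an illustrative assumption.

```python
from math import factorial, prod


def n_arrangements(sizes):
    """Number of distinct allocations of N objects to groups of given sizes."""
    n_total = sum(sizes)
    return factorial(n_total) // prod(factorial(n) for n in sizes)


m_one_way = n_arrangements([8, 9, 9])  # design of Fig. 4.7
m_pairings_12 = factorial(12)          # pairings in the N = 12 examples
```

Python integers have arbitrary precision, so the same helper also handles much larger designs without overflow.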

LAD Regression Analysis

An MRPP resampling analysis of the LAD regression residuals calculated on the univariate response measurement scores listed in Fig. 4.7 yields an estimated LAD regression coefficient of \(\tilde{\beta }_{0} = +12.00\). Figure 4.8 lists the observed \(y_{i}\) values, LAD predicted \(\tilde{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,26\).

Fig. 4.8 Observed, predicted, and residual LAD regression values for the example one-way randomized data listed in Fig. 4.7

Following Eq. (4.5) on p. 125 and employing ordinary Euclidean distance between residuals with v = 1, the N = 26 LAD regression residuals listed in Fig. 4.8 yield g = 3 average distance-function values of

$$\displaystyle{ \xi _{1} = 4.50\;,\quad \xi _{2} = 4.2222\;,\mbox{ and}\quad \xi _{3} = 6.8889\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the LAD regression residuals listed in Fig. 4.8 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{i}} {N} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{\text{o}} =\sum _{ i=1}^{g}C_{ i}\xi _{i} = \frac{1} {26}\big[(8)(4.50) + (9)(4.2222) + (9)(6.8889)\big] = 5.2308\;. }$$

If all M possible arrangements of the N = 26 observed LAD regression residuals listed in Fig. 4.8 occur with equal chance, the approximate resampling probability value of \(\delta_{\text{o}} = 5.2308\) computed on L = 1,000,000 random arrangements of the observed LAD regression residuals with \(n_{1} = 8\) and \(n_{2} = n_{3} = 9\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{\text{o}}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{\text{o}}$}} {L} = \frac{12,062} {1,000,000} = 0.0121\;. }$$

Following Eq. (4.7) on p. 126, the exact expected value of the M = 75,957,810,500 δ values is \(\mu_{\delta} = 6.1262\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{\text{o}} = 1 -\frac{\delta _{\text{o}}} {\mu _{\delta }} = 1 -\frac{5.2308} {6.1262} = +0.1462\;, }$$

indicating approximately 15 % agreement between the observed and predicted y values above that expected by chance.

OLS Regression Analysis

For comparison, consider an MRPP resampling analysis of OLS regression residuals calculated on the N = 26 univariate response measurement scores listed in Fig. 4.7 on p. 127. The MRPP regression analysis yields an estimated OLS regression coefficient of \(\hat{\beta }_{0} = +14.2692\). Figure 4.9 lists the observed \(y_{i}\) values, OLS predicted \(\hat{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,26\).

Fig. 4.9 Observed, predicted, and residual OLS regression values for the example one-way randomized data listed in Fig. 4.7

Following Eq. (4.5) on p. 125 and employing squared Euclidean distance between residuals with v = 2, the N = 26 OLS regression residuals listed in Fig. 4.9 yield g = 3 average distance-function values of

$$\displaystyle{ \xi _{1} = 29.7143\;,\quad \xi _{2} = 25.00\;,\mbox{ and}\quad \xi _{3} = 103.2222\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the OLS regression residuals listed in Fig. 4.9 with v = 2 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{i} - 1} {N - g}\;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle\begin{array}{rcl} \delta _{\text{o}} =\sum _{ i=1}^{g}C_{ i}\xi _{i} = \frac{1} {26 - 3}\big[(8 - 1)(29.7143)& +& (9 - 1)(25.00) {}\\ & +& (9 - 1)(103.2222)\big] = 53.6425\;. {}\\ \end{array}$$
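The two observed test statistics can be reproduced from the reported \(\xi_{i}\) values, which makes the effect of the two weighting schemes explicit. This is a plain arithmetic check; the function name `weighted_delta` and the `scheme` labels are illustrative assumptions.

```python
def weighted_delta(xi_values, sizes, scheme):
    """Combine group xi values using C_i = n_i/N or C_i = (n_i - 1)/(N - g)."""
    n_total, g = sum(sizes), len(sizes)
    if scheme == "n/N":
        weights = [n / n_total for n in sizes]
    elif scheme == "(n-1)/(N-g)":
        weights = [(n - 1) / (n_total - g) for n in sizes]
    else:
        raise ValueError("unknown weighting scheme")
    return sum(c * x for c, x in zip(weights, xi_values))


sizes = [8, 9, 9]
# Reported xi values for the LAD residuals (v = 1) and OLS residuals (v = 2)
delta_lad = weighted_delta([4.50, 4.2222, 6.8889], sizes, "n/N")
delta_ols = weighted_delta([29.7143, 25.00, 103.2222], sizes, "(n-1)/(N-g)")
```

Up to rounding of the reported \(\xi_{i}\) values, the results agree with the values 5.2308 and 53.6425 given in the text.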

If all M possible arrangements of the N = 26 observed OLS regression residuals listed in Fig. 4.9 occur with equal chance, the approximate resampling probability value of \(\delta_{\text{o}} = 53.6425\) computed on L = 1,000,000 random arrangements of the observed OLS regression residuals with \(n_{1} = 8\) and \(n_{2} = n_{3} = 9\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{\text{o}}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{\text{o}}$}} {L} = \frac{91,842} {1,000,000} = 0.0918\;. }$$

For comparison, the approximate resampling probability value based on LAD regression, v = 1, L = 1,000,000, and \(C_{i} = n_{i}/N\) for i = 1, 2, 3 is P = 0.0121.

Following Eq. (4.7) on p. 126, the exact expected value of the M = 75,957,810,500 δ values is \(\mu_{\delta} = 60.5692\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\hat{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{\text{o}} = 1 -\frac{\delta _{\text{o}}} {\mu _{\delta }} = 1 -\frac{53.6425} {60.5692} = +0.1144\;, }$$

indicating approximately 11 % agreement between the observed and predicted y values above that expected by chance.

Conventional ANOVA Analysis

A conventional fixed-effects one-way analysis of variance calculated on the N = 26 univariate response measurement scores listed in Fig. 4.6 on p. 124 yields an observed F-ratio of \(F_{\text{o}} = 2.6141\). Assuming independence, normality, and homogeneity of variance, F is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = g - 1 = 3 - 1 = 2\) and \(\nu _{2} = N - g = 26 - 3 = 23\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{\text{o}} = 2.6141\) yields an approximate probability value of P = 0.0948, which is similar to that produced by the MRPP resampling analysis of the OLS regression residuals.

4.3.2 One-Way Randomized Design with a Covariate

A covariate experimental design permits the testing of differences among the treatment groups after the effect of the covariate has been removed from the analysis. Consider a one-way completely randomized design with a covariate in which N = 47 objects are randomly assigned to one of g = 5 treatment groups. The experimental data are listed in Table 4.1 and are adapted from a 1984 study by Conti and Musty [78].

Table 4.1 Example data for a one-way randomized design with a covariate, consisting of pre-test (Pre) and post-test (Post) response measurement scores on N = 47 randomly assigned objects to g = 5 treatment groups

A design matrix of dummy codes for analyzing treatments is given in Fig. 4.10, where the first column of 1 values provides for an intercept, the second column contains the covariate (Pre-test) values, and the third column contains the (Post-test) scores listed according to the original random assignment of the N = 47 objects to the g = 5 treatment groups with the first \(n_{1} = 10\) scores, the next \(n_{2} = 10\) scores, the next \(n_{3} = 9\) scores, the next \(n_{4} = 8\) scores, and the last \(n_{5} = 10\) scores associated with the g = 5 treatment groups, respectively.

Fig. 4.10 Design matrix and data, consisting of an intercept and pre- and post-test measurement scores for a one-way randomized design with a covariate

The MRPP regression analysis examines the N = 47 regression residuals for possible differences among the g = 5 treatment levels; consequently, no dummy codes for treatments are included in Fig. 4.10 as this information is implicit in the ordering of the g = 5 treatment groups in the two paired columns labeled “Pre” and “Post.”

Because there are

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{g}n_{ i}!} = \frac{47!} {10!\;10!\;9!\;8!\;10!} = 369,908,998,147,203,213,613,129,815,600 }$$

possible, equally-likely arrangements of the N = 47 univariate response measurement scores listed in Table 4.1, an exact permutation approach is not possible and a resampling analysis is mandated.
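The count M is a multinomial coefficient and can be checked with exact integer arithmetic. The following is a minimal sketch; the function name `count_arrangements` is an illustrative choice and not part of the original analysis.

```python
from math import factorial

def count_arrangements(group_sizes):
    """Number of distinct assignments of N objects to groups of the given sizes."""
    total = factorial(sum(group_sizes))
    for n in group_sizes:
        # Each partial quotient is an integer, so floor division is exact.
        total //= factorial(n)
    return total

# Group sizes for the covariate example: n1 = n2 = n5 = 10, n3 = 9, n4 = 8.
M = count_arrangements([10, 10, 9, 8, 10])
print(f"{M:,}")
```

Because Python integers have arbitrary precision, the same helper applies unchanged to the smaller balanced designs considered later in this section.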

LAD Regression Analysis

An MRPP resampling analysis of the LAD regression residuals calculated on the N = 47 response measurement scores listed in Fig. 4.10 yields estimated LAD regression coefficients of

$$\displaystyle{ \tilde{\beta }_{0} = -0.1282\quad \mbox{ and}\quad \tilde{\beta }_{1} = +0.4956\;. }$$

Table 4.2 lists the observed \(y_{i}\) values, LAD predicted \(\tilde{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,47\).

Table 4.2 Observed, predicted, and residual LAD regression values for the example covariate data listed in Fig. 4.10
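To illustrate how LAD coefficients for a single predictor can be obtained, the sketch below uses a brute-force search rather than the algorithm used for the analyses in this chapter, which is not specified here. It relies on the known property that some minimizer of the sum of absolute residuals passes through two data points with distinct predictor values. The function name and the data are hypothetical.

```python
from itertools import combinations

def lad_simple_fit(x, y):
    """Brute-force LAD fit of y = b0 + b1 * x.

    Tries the line through every pair of points with distinct x values
    and keeps the one with the smallest sum of absolute residuals.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        if x1 == x2:
            continue  # a vertical line is not a regression line
        b1 = (y2 - y1) / (x2 - x1)
        b0 = y1 - b1 * x1
        loss = sum(abs(yi - (b0 + b1 * xi)) for xi, yi in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, b0, b1)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1], best[2]

# Hypothetical data: four points on the line y = x plus one gross outlier.
b0, b1 = lad_simple_fit([0, 1, 2, 3, 4], [0, 1, 2, 3, 100])
print(b0, b1)
```

In this hypothetical data set the fitted line ignores the outlier entirely, which is the robustness property that motivates LAD regression. The search examines every pair of points, so it is only suitable for small N such as the N = 47 objects considered here.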

Following Eq. (4.5) on p. 125 and employing ordinary Euclidean distance between residuals with v = 1, the LAD regression residuals listed in Table 4.2 yield g = 5 average distance-function values of

$$\displaystyle{ \xi _{1} = 0.7072\;,\;\;\xi _{2} = 0.6335\;,\;\;\xi _{3} = 0.7213\;,\;\;\xi _{4} = 1.3409\;,\;\mbox{ and}\;\;\xi _{5} = 0.6795\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic \(\delta\) calculated on the LAD regression residuals listed in Table 4.2 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{i}} {N}\;,\qquad i = 1,\,\ldots,\,5\;, }$$

is

$$\displaystyle\begin{array}{rcl} \delta _{\text{o}} =\sum _{ i=1}^{g}C_{ i}\xi _{i} = \frac{1} {47}\big[(10)(0.7072)& +& (10)(0.6335) + (9)(0.7213) {}\\ & +& (8)(1.3409) + (10)(0.6795)\big] = 0.7962\;. {}\\ \end{array}$$
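The weighted sum can be checked numerically. The sketch below is a minimal illustration; the function name `mrpp_delta` and the `scheme` labels are hypothetical. It covers the weights \(C_{i} = n_{i}/N\) used here with v = 1 and the weights \(C_{i} = (n_{i} - 1)/(N - g)\) used with v = 2 in the OLS analysis of the same data.

```python
def mrpp_delta(xi, sizes, scheme="size"):
    """Weighted mean of the group average distance-function values xi.

    scheme="size": C_i = n_i / N
    scheme="df":   C_i = (n_i - 1) / (N - g)
    """
    N, g = sum(sizes), len(sizes)
    if scheme == "size":
        weights = [n / N for n in sizes]
    elif scheme == "df":
        weights = [(n - 1) / (N - g) for n in sizes]
    else:
        raise ValueError("scheme must be 'size' or 'df'")
    return sum(c * x for c, x in zip(weights, xi))

sizes = [10, 10, 9, 8, 10]
delta_lad = mrpp_delta([0.7072, 0.6335, 0.7213, 1.3409, 0.6795], sizes, "size")
delta_ols = mrpp_delta([0.8067, 0.7407, 0.7073, 2.6035, 0.6906], sizes, "df")
print(round(delta_lad, 4), round(delta_ols, 4))
```

Both weighting schemes sum to one over the g groups, so in each case the statistic is a weighted average of the within-group average distances.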

If all M possible arrangements of the observed LAD regression residuals listed in Table 4.2 occur with equal chance, the approximate resampling probability value of \(\delta_{\text{o}} = 0.7962\) computed on L = 1,000,000 random arrangements of the observed LAD regression residuals with \(n_{1} = n_{2} = n_{5} = 10\), \(n_{3} = 9\), and \(n_{4} = 8\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{\text{o}}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{\text{o}}$}} {L} = \frac{4,095} {1,000,000} = 0.0041\;. }$$
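The resampling step can be sketched as follows. The sketch assumes that Eq. (4.5) has the usual MRPP form, in which each group's value is the average of \(\vert e_{j} - e_{k}\vert^{v}\) over all pairs of residuals within the group, and it uses the weights \(C_{i} = n_{i}/N\). All function names, the seed, and the data are hypothetical, and the default number of resamples is far smaller than the L = 1,000,000 used in the text.

```python
import random
from itertools import combinations

def group_xi(values, v):
    """Average pairwise distance |e_j - e_k|**v within one group (n >= 2)."""
    pairs = list(combinations(values, 2))
    return sum(abs(a - b) ** v for a, b in pairs) / len(pairs)

def delta_stat(residuals, sizes, v=1):
    """MRPP delta for consecutive groups of the given sizes, C_i = n_i / N."""
    N = len(residuals)
    start, total = 0, 0.0
    for n in sizes:
        total += (n / N) * group_xi(residuals[start:start + n], v)
        start += n
    return total

def resampling_p_value(residuals, sizes, v=1, L=10_000, seed=0):
    """Proportion of L random arrangements with delta <= the observed delta."""
    rng = random.Random(seed)
    observed = delta_stat(residuals, sizes, v)
    shuffled = list(residuals)
    hits = 0
    for _ in range(L):
        rng.shuffle(shuffled)
        # A small tolerance guards against floating-point ties.
        if delta_stat(shuffled, sizes, v) <= observed + 1e-12:
            hits += 1
    return hits / L

# Hypothetical residuals: two tight, well-separated groups of four.
p = resampling_p_value([0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3], [4, 4])
print(p)
```

Small values of \(\delta\) indicate that residuals within groups are close together, which is why the probability value counts arrangements with \(\delta\) less than or equal to the observed value.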

Following Eq. (4.7) on p. 126, the exact expected value of the M \(\delta\) values is \(\mu_{\delta} = 0.9178\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{\text{o}} = 1 -\frac{\delta _{\text{o}}} {\mu _{\delta }} = 1 -\frac{0.7962} {0.9178} = +0.1326\;, }$$

indicating approximately 13 % agreement between the observed and predicted y values above that expected by chance.
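The effect-size calculation can be sketched in a few lines. The sketch assumes that, because the weights \(C_{i}\) sum to one, the expected value in Eq. (4.7) equals the average of \(\vert e_{j} - e_{k}\vert^{v}\) over all pairs of the N residuals; this is a standard MRPP result but is an assumption about the exact form of Eq. (4.7), which is not reproduced here. The function names are hypothetical.

```python
from itertools import combinations

def expected_delta(residuals, v=1):
    """Mean of |e_j - e_k|**v over all pairs of the N residuals."""
    pairs = list(combinations(residuals, 2))
    return sum(abs(a - b) ** v for a, b in pairs) / len(pairs)

def effect_size(delta_obs, mu_delta):
    """Chance-corrected effect size: 1 - delta / mu."""
    return 1.0 - delta_obs / mu_delta

# Rounded values reported for the LAD analysis of the covariate example.
print(round(effect_size(0.7962, 0.9178), 4))
```

With the rounded four-decimal inputs the result is about 0.1325; the small difference from the reported +0.1326 is presumably due to the reported value being computed from unrounded quantities.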

OLS Regression Analysis

For comparison, consider an MRPP resampling analysis of the OLS regression residuals calculated on the N = 47 univariate response measurement scores listed in Fig. 4.10 on p. 132. The MRPP regression analysis yields estimated OLS regression coefficients of

$$\displaystyle{ \hat{\beta }_{0} = -0.2667\quad \mbox{ and}\quad \hat{\beta }_{1} = +0.5311\;. }$$

Table 4.3 lists the observed \(y_{i}\) values, OLS predicted \(\hat{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,47\).

Table 4.3 Observed, predicted, and residual OLS regression values for the example covariate data listed in Fig. 4.10

Following Eq. (4.5) on p. 125 and employing squared Euclidean distance between residuals with v = 2, the OLS regression residuals listed in Table 4.3 yield g = 5 average distance-function values of

$$\displaystyle{ \xi _{1} = 0.8067\;,\;\;\xi _{2} = 0.7407\;,\;\;\xi _{3} = 0.7073\;,\;\;\xi _{4} = 2.6035\;,\;\mbox{ and}\;\;\xi _{5} = 0.6906\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic \(\delta\) calculated on the OLS regression residuals listed in Table 4.3 with v = 2 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{i} - 1} {N - g}\;,\qquad i = 1,\,\ldots,\,5\;, }$$

is

$$\displaystyle\begin{array}{rcl} \delta _{\text{o}}& =& \sum _{i=1}^{g}C_{ i}\xi _{i} = \frac{1} {47 - 5}\big[(10 - 1)(0.8067) + (10 - 1)(0.7407) {}\\ & & +(9 - 1)(0.7073) + (8 - 1)(2.6035) + (10 - 1)(0.6906)\big] = 1.0482\;. {}\\ \end{array}$$

If all M possible arrangements of the N = 47 observed OLS regression residuals listed in Table 4.3 occur with equal chance, the approximate resampling probability value of \(\delta_{\text{o}} = 1.0482\) computed on L = 1,000,000 random arrangements of the observed OLS regression residuals with \(n_{1} = n_{2} = n_{5} = 10\), \(n_{3} = 9\), and \(n_{4} = 8\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{\text{o}}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{\text{o}}$}} {L} = \frac{15,301} {1,000,000} = 0.0153\;. }$$

For comparison, the approximate resampling probability value based on LAD regression, v = 1, L = 1,000,000, and \(C_{i} = n_{i}/N\) for \(i = 1,\,\ldots,\,5\) is P = 0.0041.

Following Eq. (4.7) on p. 126, the exact expected value of the M \(\delta\) values is \(\mu_{\delta} = 1.2761\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\hat{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{\text{o}} = 1 -\frac{\delta _{\text{o}}} {\mu _{\delta }} = 1 -\frac{1.0482} {1.2761} = +0.1785\;, }$$

indicating approximately 18 % agreement between the observed and predicted y values above that expected by chance.

Conventional ANOVA Analysis

A conventional fixed-effects one-way analysis of covariance calculated on the N = 47 univariate response measurement scores listed in Table 4.1 on p. 131 yields an observed F-ratio of \(F_{\text{o}} = 4.6978\). Assuming independence, normality, and homogeneity of variance, F is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = g - 1 = 5 - 1 = 4\) and \(\nu _{2} = N - g - 1 = 47 - 5 - 1 = 41\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{\text{o}} = 4.6978\) yields an approximate probability value of P = 0.0033.

4.3.3 One-Way Randomized-Block Design

One-way randomized-block designs are common in experimental research and have long been valuable statistical tools in such fields as agriculture and genetics. E.J.G. Pitman, for example, developed a permutation approach for one-way randomized-block designs in 1938 [342]. With modern developments in embryo transplants and cloning where subjects can be genetically matched on a large number of important characteristics, randomized-block designs have become very practical and efficient.

Consider a one-way randomized-block design where b = 6 objects (blocks) are evaluated over a = 3 treatments with r = 1 response measurement. The design and data are adapted from a study by Anderson, Sweeney, and Williams [9, p. 471] and are given in Fig. 4.11.

Fig. 4.11 Example data for a one-way randomized-block design with b = 6 blocks, a = 3 treatments, and r = 1 response measurement

A design matrix of dummy codes for an MRPP regression analysis is given in Fig. 4.12, where the first column of 1 values provides for an intercept, the next five columns contain dummy codes for the b = 6 blocks, and the last column contains the univariate response measurement scores listed according to the original random assignment of the N = 18 objects to the a = 3 treatment levels of Factor A with the first \(n_{A_{1}} = 6\) objects, the next \(n_{A_{2}} = 6\) objects, and the last \(n_{A_{3}} = 6\) objects associated with treatment levels \(A_{1}\), \(A_{2}\), and \(A_{3}\), respectively. The MRPP regression analysis examines the N = 18 regression residuals for possible differences in the a = 3 treatment levels; consequently, there are no dummy codes for treatments in Fig. 4.12 as this information is implicit in the ordering of the a = 3 treatment levels of Factor A in the last column.

Fig. 4.12 Design matrix and data for a one-way randomized-block design with b = 6 blocks, a = 3 treatments, and r = 1 response measurement
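The layout of such a design matrix can be sketched as follows. The sketch assumes that the last block serves as the reference level and that rows are ordered treatment by treatment with the blocks in the same order within each treatment; the figure may code a different block as the reference. The function name is hypothetical and the response column is omitted.

```python
def block_design_matrix(n_blocks, n_treatments):
    """Rows of an intercept followed by n_blocks - 1 block dummy codes.

    Rows are ordered treatment by treatment, with blocks 1..n_blocks
    within each treatment. The last block is the reference level
    (an assumption made for this illustration).
    """
    rows = []
    for _ in range(n_treatments):
        for block in range(n_blocks):
            dummies = [1 if block == j else 0 for j in range(n_blocks - 1)]
            rows.append([1] + dummies)
    return rows

X = block_design_matrix(n_blocks=6, n_treatments=3)
for row in X[:6]:
    print(row)
```

No treatment columns appear because the treatment structure is carried by the row order, which is exactly what the permutation of residuals later exploits.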

Because there are

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{a}n_{ A_{i}}!} = \frac{18!} {(6!)^{3}} = 17,153,136 }$$

possible, equally-likely arrangements of the N = 18 univariate response measurement scores listed in Fig. 4.11, an exact permutation approach is not practical.

LAD Regression Analysis

An MRPP resampling analysis of the LAD regression residuals calculated on the univariate response measurement scores listed in Fig. 4.12 yields estimated LAD regression coefficients of

$$\displaystyle\begin{array}{rcl} \tilde{\beta }_{0}& =& +15.00\;,\quad \tilde{\beta }_{1} = -1.00\;,\quad \tilde{\beta }_{2} = -4.00\;,\quad \tilde{\beta }_{3} = -2.00\;, {}\\ \tilde{\beta }_{4}& =& +1.00\;,\mbox{ and}\quad \tilde{\beta }_{5} = -2.00 {}\\ \end{array}$$

for Factor A. Figure 4.13 lists the observed \(y_{i}\) values, LAD predicted \(\tilde{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,18\).

Fig. 4.13 Observed, predicted, and residual LAD regression values for the example randomized-block data listed in Fig. 4.12

Following Eq. (4.5) on p. 125 and employing ordinary Euclidean distance between residuals with v = 1, the N = 18 LAD regression residuals listed in Fig. 4.13 yield a = 3 average distance-function values of

$$\displaystyle{ \xi _{A_{1}} = 0.3333\;,\quad \xi _{A_{2}} = 1.20\;,\mbox{ and}\quad \xi _{A_{3}} = 2.3333\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic \(\delta\) calculated on the LAD regression residuals listed in Fig. 4.13 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{A_{i}}} {N} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{A} =\sum _{ i=1}^{a}C_{ i}\xi _{i} = \frac{6} {18}\big(0.3333 + 1.20 + 2.3333\big) = 1.2889\;. }$$

If all M possible arrangements of the N = 18 observed LAD regression residuals listed in Fig. 4.13 occur with equal chance, the approximate resampling probability value of \(\delta_{A} = 1.2889\) computed on L = 1,000,000 random arrangements of the observed LAD regression residuals with \(n_{A_{1}} = n_{A_{2}} = n_{A_{3}} = 6\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A}$}} {L} = \frac{56,035} {1,000,000} = 0.0560\;. }$$

Following Eq. (4.7) on p. 126, the exact expected value of the M = 17,153,136 \(\delta\) values is \(\mu_{\delta} = 1.6078\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{A} = 1 -\frac{\delta _{A}} {\mu _{\delta }} = 1 -\frac{1.2889} {1.6078} = +0.1984\;, }$$

indicating approximately 20 % agreement between the observed and predicted y values above that expected by chance.

An Exact Test

Although an exact permutation analysis of the N = 18 LAD regression residuals listed in Fig. 4.13 is impractical, it is not impossible. In fact, exact permutation methods are oftentimes more efficient than resampling permutation methods because the L = 1,000,000 calls to a pseudorandom number generator, necessary for a resampling test, are not required by an exact test.

Following Eq. (4.5) on p. 125, an exact permutation analysis of the N = 18 LAD regression residuals listed in Fig. 4.13 yields a = 3 average distance-function values of

$$\displaystyle{ \xi _{A_{1}} = 0.3333\;,\quad \xi _{A_{2}} = 1.20\;,\mbox{ and}\quad \xi _{A_{3}} = 2.3333\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic \(\delta\) based on v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{A_{i}}} {N} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{A} =\sum _{ i=1}^{a}C_{ i}\xi _{i} = \frac{6} {18}\big(0.3333 + 1.20 + 2.3333\big) = 1.2889\;. }$$

If all arrangements of the N = 18 observed LAD regression residuals listed in Fig. 4.13 occur with equal chance, the exact probability value of \(\delta_{A} = 1.2889\) computed on the M = 17,153,136 possible arrangements of the observed LAD regression residuals with \(n_{A_{1}} = n_{A_{2}} = n_{A_{3}} = 6\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A}$}} {M} = \frac{961,884} {17,153,136} = 0.0561\;. }$$

For comparison, the resampling probability value computed on L = 1,000,000 random arrangements of the observed LAD regression residuals listed in Fig. 4.13 is P = 0.0560.
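An exact test enumerates every arrangement instead of sampling them. The following is a minimal sketch under the same assumptions about Eq. (4.5) as elsewhere (average pairwise distance within each group) with weights \(C_{i} = n_{i}/N\); the function names and data are hypothetical, and a pure-Python loop over the M = 17,153,136 arrangements of the present example would be slow, so the illustration uses a much smaller case.

```python
from itertools import combinations

def group_xi(values, v):
    """Average pairwise distance |e_j - e_k|**v within one group (n >= 2)."""
    pairs = list(combinations(values, 2))
    return sum(abs(a - b) ** v for a, b in pairs) / len(pairs)

def arrangements(indices, sizes):
    """Yield every assignment of indices to ordered groups of the given sizes."""
    if not sizes:
        yield []
        return
    for chosen in combinations(indices, sizes[0]):
        chosen_set = set(chosen)
        rest = [i for i in indices if i not in chosen_set]
        for tail in arrangements(rest, sizes[1:]):
            yield [chosen] + tail

def exact_p_value(residuals, sizes, v=1):
    """Return (count of arrangements with delta <= observed, total M)."""
    N = len(residuals)

    def delta(groups):
        return sum(len(g) / N * group_xi([residuals[i] for i in g], v)
                   for g in groups)

    observed_groups, start = [], 0
    for n in sizes:
        observed_groups.append(tuple(range(start, start + n)))
        start += n
    observed = delta(observed_groups)

    count = total = 0
    for groups in arrangements(list(range(N)), sizes):
        total += 1
        if delta(groups) <= observed + 1e-12:  # tolerance for float ties
            count += 1
    return count, total

# Hypothetical residuals: two tight, well-separated groups of four.
count, total = exact_p_value([0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3], [4, 4])
print(count, total, count / total)
```

The exact probability value is the ratio of the two returned integers, in the same way that the exact value above is the ratio of 961,884 to 17,153,136.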

OLS Regression Analysis

For comparison, consider an MRPP resampling analysis of OLS regression residuals calculated on the N = 18 univariate response measurement scores listed in Fig. 4.12 on p. 137. The MRPP regression analysis yields estimated OLS regression coefficients of

$$\displaystyle\begin{array}{rcl} \hat{\beta }_{0}& =& +16.00\;,\quad \hat{\beta }_{1} = -2.00\;,\quad \hat{\beta }_{2} = -4.00\;,\quad \hat{\beta }_{3} = -2.00\;, {}\\ \hat{\beta }_{4}& =& -1.00\;,\mbox{ and}\quad \hat{\beta }_{5} = -3.00 {}\\ \end{array}$$

for Factor A. Figure 4.14 lists the observed \(y_{i}\) values, OLS predicted \(\hat{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,18\).

Fig. 4.14 Observed, predicted, and residual OLS regression values for the example randomized-block data listed in Fig. 4.12

Following Eq. (4.5) on p. 125 and employing squared Euclidean distance between residuals with v = 2, the N = 18 OLS regression residuals listed in Fig. 4.14 yield a = 3 average distance-function values of

$$\displaystyle{ \xi _{A_{1}} = 2.20\;,\quad \xi _{A_{2}} = 1.60\;,\mbox{ and}\quad \xi _{A_{3}} = 3.80\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic \(\delta\) calculated on the OLS regression residuals listed in Fig. 4.14 with v = 2 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{A_{i}} - 1} {N - a} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{A} =\sum _{ i=1}^{a}C_{ i}\xi _{i} = \frac{6 - 1} {18 - 3}\big(2.20 + 1.60 + 3.80\big) = 2.5333\;. }$$

If all M possible arrangements of the N = 18 observed OLS regression residuals listed in Fig. 4.14 occur with equal chance, the approximate resampling probability value of \(\delta_{A} = 2.5333\) computed on L = 1,000,000 random arrangements of the observed OLS regression residuals with \(n_{A_{1}} = n_{A_{2}} = n_{A_{3}} = 6\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A}$}} {L} = \frac{4,974} {1,000,000} = 0.0050\;. }$$

For comparison, the approximate resampling probability value based on LAD regression, v = 1, L = 1,000,000, and \(C_{i} = n_{A_{i}}/N\) for i = 1, 2, 3 is P = 0.0560.

Following Eq. (4.7) on p. 126, the exact expected value of the M = 17,153,136 \(\delta\) values is \(\mu_{\delta} = 5.5556\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\hat{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{A} = 1 -\frac{\delta _{A}} {\mu _{\delta }} = 1 -\frac{2.5333} {5.5556} = +0.5440\;, }$$

indicating approximately 54 % agreement between the observed and predicted y values above that expected by chance.

An Exact Test

Although an exact permutation analysis of the N = 18 OLS regression residuals listed in Fig. 4.14 is impractical, it is not impossible. Following Eq. (4.5) on p. 125, an exact permutation analysis of the N = 18 OLS regression residuals listed in Fig. 4.14 yields a = 3 average distance-function values of

$$\displaystyle{ \xi _{A_{1}} = 2.20\;,\quad \xi _{A_{2}} = 1.60\;,\mbox{ and}\quad \xi _{A_{3}} = 3.80\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic \(\delta\) based on v = 2 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{A_{i}} - 1} {N - a} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{A} =\sum _{ i=1}^{a}C_{ i}\xi _{i} = \frac{6 - 1} {18 - 3}\big(2.20 + 1.60 + 3.80\big) = 2.5333\;. }$$

If all arrangements of the N = 18 observed OLS regression residuals listed in Fig. 4.14 occur with equal chance, the exact probability value of \(\delta_{A} = 2.5333\) computed on the M = 17,153,136 possible arrangements of the observed OLS regression residuals with \(n_{A_{1}} = n_{A_{2}} = n_{A_{3}} = 6\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A}$}} {M} = \frac{85,188} {17,153,136} = 0.0050\;. }$$

For comparison, the approximate resampling probability value computed on L = 1,000,000 random arrangements of the observed OLS regression residuals listed in Fig. 4.14 is also P = 0.0050.

Conventional ANOVA Analysis

A conventional randomized-block analysis of variance calculated on the N = 18 univariate response measurement scores listed in Fig. 4.11 on p. 137 yields an observed F-ratio of \(F_{A} = 5.5263\). Assuming independence and normality, \(F_{A}\) is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = a - 1 = 3 - 1 = 2\) and \(\nu _{2} = (b - 1)(a - 1) = (6 - 1)(3 - 1) = 10\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{A} = 5.5263\) yields an approximate probability value of P = 0.0242.
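When the numerator degrees of freedom are \(\nu_{1} = 2\), the upper-tail probability of Snedecor’s F has the closed form \((1 + 2F/\nu_{2})^{-\nu_{2}/2}\), so the probability value can be checked without statistical tables. The sketch below applies only to \(\nu_{1} = 2\) and is not a general F-distribution routine; the function name is hypothetical.

```python
def f_tail_nu1_equals_2(f_obs, nu2):
    """Upper-tail probability P(F >= f_obs) for F with (2, nu2) degrees of freedom."""
    return (1.0 + 2.0 * f_obs / nu2) ** (-nu2 / 2.0)

# Randomized-block example: F_A = 5.5263 with nu1 = 2 and nu2 = 10.
p_value = f_tail_nu1_equals_2(5.5263, 10)
print(round(p_value, 4))
```

The result agrees to four decimal places with the probability value reported above. Designs with other numerator degrees of freedom, such as the covariate example with \(\nu_{1} = 4\), require the incomplete beta function instead.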

4.3.4 Two-Way Randomized-Block Design

Consider a balanced two-way randomized-block design in which n = 3 subjects (S) are tested over a = 3 levels of Factor A and the experiment is repeated b = 3 times for Factor B. The design and data are adapted from Myers and Well [315, p. 260] and are given in Table 4.4. A complete permutation analysis of a two-way randomized-block design requires three separate analyses comprised of (1) the main effect of Factor A, (2) the main effect of Factor B, and (3) the A×B interaction effect.

Table 4.4 Example univariate data for a balanced two-way randomized-block design with n = 3 subjects, a = 3 levels of Factor A, and b = 3 levels of Factor B

Analysis of Factor A

A design matrix of dummy codes for analyzing Factor A is given on the left side of Table 4.5, where the first column of 1 values provides for an intercept and the second and third columns contain dummy codes for Factor B. The last column on the left side of Table 4.5 lists the N = 9 response measurement summations over the b = 3 levels of Factor B (e.g., \(3.10 + 1.90 + 1.60 = 6.60\)) and ordered by the a = 3 treatment levels of Factor A with the first \(n_{A_{1}} = 3\) summations, the next \(n_{A_{2}} = 3\) summations, and the last \(n_{A_{3}} = 3\) summations associated with treatment levels \(A_{1}\), \(A_{2}\), and \(A_{3}\), respectively. The MRPP regression analysis examines the N = 9 regression residuals for possible differences in the a = 3 treatment levels of Factor A; consequently, no dummy codes are provided for Factor A as this information is implicit in the ordering of the a = 3 treatment levels of Factor A in the last column on the left side of Table 4.5.

Table 4.5 Design matrices and summation data for Factors A and B in a two-way analysis of variance randomized-block design

An exact permutation solution is reasonable for the response measurement summations listed on the left side of Table 4.5 since there are only

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{a}n_{ A_{i}}!} = \frac{9!} {(3!)^{3}} = 1,680 }$$

possible, equally-likely arrangements of the N = 9 response measurement summations for Factor A with \(n_{A_{1}} = n_{A_{2}} = n_{A_{3}} = 3\) response measurement summations preserved for each arrangement of the observed data.

LAD Regression Analysis

An MRPP analysis of the LAD regression residuals calculated on the N = 9 response measurement summations on the left side of Table 4.5 yields estimated LAD regression coefficients of

$$\displaystyle{ \tilde{\beta }_{0} = +6.60\;,\quad \tilde{\beta }_{1} = +8.00\;,\mbox{ and}\quad \tilde{\beta }_{2} = +17.40 }$$

for Factor A. Figure 4.15 lists the observed \(y_{i}\) values, LAD predicted \(\tilde{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,9\).

Fig. 4.15 Observed, predicted, and residual LAD regression values for the summations over Factor B on the left side of Table 4.5

Following Eq. (4.5) on p. 125 and employing ordinary Euclidean distance between residuals with v = 1, the N = 9 LAD regression residuals listed in Fig. 4.15 yield a = 3 average distance-function values of

$$\displaystyle{ \xi _{A_{1}} = 0.00\;,\quad \xi _{A_{2}} = 4.0667\;,\mbox{ and}\quad \xi _{A_{3}} = 1.60\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic \(\delta\) calculated on the LAD regression residuals listed in Fig. 4.15 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{A_{i}}} {N} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{A} =\sum _{ i=1}^{a}C_{ i}\xi _{i} = \frac{3} {9}\big(0.00 + 4.0667 + 1.60\big) = 1.8889\;. }$$

If all arrangements of the N = 9 observed LAD regression residuals listed in Fig. 4.15 occur with equal chance, the exact probability value of \(\delta_{A} = 1.8889\) computed on the M = 1,680 possible arrangements of the observed LAD regression residuals with \(n_{A_{1}} = n_{A_{2}} = n_{A_{3}} = 3\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A}$}} {M} = \frac{6} {1,680} = 0.0036\;. }$$

Following Eq. (4.7) on p. 126, the exact expected value of the M = 1,680 \(\delta\) values is \(\mu_{\delta} = 2.9889\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{A} = 1 -\frac{\delta _{A}} {\mu _{\delta }} = 1 -\frac{1.8889} {2.9889} = +0.3680\;, }$$

indicating approximately 37 % agreement between the observed and predicted y values above that expected by chance.

OLS Regression Analysis

For comparison, consider an MRPP analysis of OLS regression residuals calculated on the N = 9 response measurement summations for Factor A listed on the left side of Table 4.5. Again, since there are only M = 1,680 possible arrangements of the response measurement summations, an exact permutation test is selected. The MRPP regression analysis yields estimated OLS regression coefficients of

$$\displaystyle{ \hat{\beta }_{0} = +6.3333\;,\quad \hat{\beta }_{1} = +9.00\;,\mbox{ and}\quad \hat{\beta }_{2} = +18.6333 }$$

for Factor A. Figure 4.16 lists the observed \(y_{i}\) values, OLS predicted \(\hat{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,9\).

Fig. 4.16 Observed, predicted, and residual OLS regression values for the summations over Factor B on the left side of Table 4.5

Following Eq. (4.5) on p. 125 and employing squared Euclidean distance between residuals with v = 2, the N = 9 OLS regression residuals listed in Fig. 4.16 yield a = 3 average distance-function values of

$$\displaystyle{ \xi _{A_{1}} = 0.8585\;,\quad \xi _{A_{2}} = 11.9674\;,\mbox{ and}\quad \xi _{A_{3}} = 7.0452\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic \(\delta\) calculated on the OLS regression residuals listed in Fig. 4.16 with v = 2 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{A_{i}} - 1} {N - a} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{A} =\sum _{ i=1}^{a}C_{ i}\xi _{i} = \frac{3 - 1} {9 - 3}\big(0.8585 + 11.9674 + 7.0452\big) = 6.6237\;. }$$

If all arrangements of the N = 9 observed OLS regression residuals listed in Fig. 4.16 occur with equal chance, the exact probability value of \(\delta_{A} = 6.6237\) computed on the M = 1,680 possible arrangements of the observed OLS regression residuals with \(n_{A_{1}} = n_{A_{2}} = n_{A_{3}} = 3\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A}$}} {M} = \frac{18} {1,680} = 0.0107\;. }$$

For comparison, the exact probability value based on LAD regression, v = 1, M = 1,680, and \(C_{i} = n_{A_{i}}/N\) for i = 1, 2, 3 is P = 0.0036.

Following Eq. (4.7) on p. 126, the exact expected value of the M = 1,680 \(\delta\) values is \(\mu_{\delta} = 14.7250\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\hat{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{A} = 1 -\frac{\delta _{A}} {\mu _{\delta }} = 1 - \frac{6.6237} {14.7250} = +0.5502\;, }$$

indicating approximately 55 % agreement between the observed and predicted y values above that expected by chance.

Conventional ANOVA Analysis

A conventional randomized-block analysis of variance calculated on the N = 27 univariate response measurement scores for Factor A in Table 4.4 on p. 143 yields an observed F-ratio of \(F_{A} = 3.9282\). Assuming independence and normality, \(F_{A}\) is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = a - 1 = 3 - 1 = 2\) and \(\nu _{2} = (n - 1)(a - 1) = (3 - 1)(3 - 1) = 4\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{A} = 3.9282\) yields an approximate probability value of P = 0.1138.

Analysis of Factor B

The right side of Table 4.5 on p. 144 contains a design matrix of dummy codes for analyzing Factor B, where the first column of 1 values provides for an intercept and the next two columns contain dummy codes for Factor A. The last column on the right side of Table 4.5 lists the N = 9 response measurement summations over the a = 3 levels of Factor A (e.g., \(3.10 + 2.90 + 2.40 = 8.40\)) and ordered by the b = 3 treatment levels with the first \(n_{B_{1}} = 3\) summations, the next \(n_{B_{2}} = 3\) summations, and the last \(n_{B_{3}} = 3\) summations associated with treatment levels \(B_{1}\), \(B_{2}\), and \(B_{3}\), respectively. The MRPP regression analysis examines the N = 9 regression residuals for possible differences among the b = 3 treatment levels of Factor B; consequently, no dummy codes are provided for Factor B as this information is implicit in the ordering of the b = 3 treatment levels of Factor B in the last column on the right side of Table 4.5.

An exact permutation solution is ideal for the response measurement summations on the right side of Table 4.5 since there are only

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{b}n_{ B_{i}}!} = \frac{9!} {(3!)^{3}} = 1,680 }$$

possible, equally-likely arrangements of the N = 9 response measurement summations for Factor B with \(n_{B_{1}} = n_{B_{2}} = n_{B_{3}} = 3\) response measurement summations preserved for each arrangement of the observed data.

LAD Regression Analysis

An MRPP analysis of the LAD regression residuals calculated on the N = 9 response measurement summations on the right side of Table 4.5 on p. 144 yields estimated LAD regression coefficients of

$$\displaystyle{ \tilde{\beta }_{0} = +5.60\;,\quad \tilde{\beta }_{1} = +9.00\;,\mbox{ and}\quad \tilde{\beta }_{2} = +18.90 }$$

for Factor B. Figure 4.17 lists the observed \(y_{i}\) values, LAD predicted \(\tilde{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,9\).

Fig. 4.17 Observed, predicted, and residual LAD regression values for the summations over Factor A on the right side of Table 4.5

Following Eq. (4.5) on p. 125 and employing ordinary Euclidean distance between residuals with v = 1, the N = 9 LAD regression residuals listed in Fig. 4.17 yield b = 3 average distance-function values of

$$\displaystyle{ \xi _{B_{1}} = 0.8667\;,\quad \xi _{B_{2}} = 0.00\;,\mbox{ and}\quad \xi _{B_{3}} = 1.40\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic \(\delta\) calculated on the LAD regression residuals listed in Fig. 4.17 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{B_{i}}} {N} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{B} =\sum _{ i=1}^{b}C_{ i}\xi _{i} = \frac{3} {9}\big(0.8667 + 0.00 + 1.40\big) = 0.7556\;. }$$

If all arrangements of the N = 9 observed LAD regression residuals listed in Fig. 4.17 occur with equal chance, the exact probability value of \(\delta_{B} = 0.7556\) computed on the M = 1,680 possible arrangements of the observed LAD regression residuals with \(n_{B_{1}} = n_{B_{2}} = n_{B_{3}} = 3\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{B}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{B}$}} {M} = \frac{6} {1,680} = 0.0036\;. }$$

Following Eq. (4.7) on p. 126, the exact expected value of the M = 1,680 \(\delta\) values is \(\mu_{\delta} = 2.5889\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{B} = 1 -\frac{\delta _{B}} {\mu _{\delta }} = 1 -\frac{0.7556} {2.5889} = +0.7082\;, }$$

indicating approximately 71 % agreement between the observed and predicted y values above that expected by chance.

OLS Regression Analysis

For comparison, consider an MRPP analysis of OLS regression residuals calculated on the N = 9 response measurement summations for Factor B listed on the right side of Table 4.5 on p. 144. Again, since there are only M = 1,680 possible arrangements of the response measurement summations, an exact permutation test is preferred. The MRPP regression analysis yields estimated OLS regression coefficients of

$$\displaystyle{ \hat{\beta }_{0} = +6.3333\;,\quad \hat{\beta }_{1} = +9.00\;,\mbox{ and}\quad \hat{\beta }_{2} = +18.6333 }$$

for Factor B. Figure 4.18 lists the observed \(y_{i}\) values, OLS predicted \(\hat{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,9\).

Fig. 4.18 Observed, predicted, and residual OLS regression values for the summations over Factor A on the right side of Table 4.5

Following Eq. (4.5) on p. 125 and employing squared Euclidean distance between residuals with v = 2, the N = 9 OLS regression residuals listed in Fig. 4.18 yield b = 3 average distance-function values of

$$\displaystyle{ \xi _{B_{1}} = 1.3252\;,\quad \xi _{B_{2}} = 0.0474\;,\mbox{ and}\quad \xi _{B_{3}} = 1.8585\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic \(\delta\) calculated on the OLS regression residuals listed in Fig. 4.18 with v = 2 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{B_{i}} - 1} {N - b} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{B} =\sum _{ i=1}^{b}C_{ i}\xi _{i} = \frac{3 - 1} {9 - 3}\big(1.3252 + 0.0474 + 1.8585\big) = 1.0770\;. }$$

If all arrangements of the N = 9 observed OLS regression residuals listed in Fig. 4.18 occur with equal chance, the exact probability value of \(\delta_{B} = 1.0770\) computed on the M = 1,680 possible arrangements of the observed OLS regression residuals with \(n_{B_{1}} = n_{B_{2}} = n_{B_{3}} = 3\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{B}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{B}$}} {M} = \frac{6} {1,680} = 0.0036\;. }$$

For comparison, the exact probability value based on LAD regression, v = 1, M = 1,680, and \(C_{i} = n_{B_{i}}/N\) for i = 1, 2, 3 is also P = 0.0036.

Following Eq. (4.7) on p. 126, the exact expected value of the M = 1,680 \(\delta\) values is \(\mu_{\delta} = 9.9150\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\hat{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{B} = 1 -\frac{\delta _{B}} {\mu _{\delta }} = 1 -\frac{1.0770} {9.9150} = +0.8914\;, }$$

indicating approximately 89 % agreement between the observed and predicted y values above that expected by chance.
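The effect-size calculation can be sketched as follows. Equation (4.7) is not reproduced in this section; the sketch assumes the standard MRPP result that, when the weights \(C_{i}\) sum to one, \(\mu_{\delta}\) equals the mean of all \(N(N-1)/2\) pairwise distances among the pooled residuals. The function names are illustrative only.

```python
from itertools import combinations

def mu_delta(pooled, v):
    """Mean of |e_i - e_j|**v over all pairs of pooled residuals.

    Assumed here to equal the exact expected value of delta when the
    group weights sum to one.
    """
    pairs = list(combinations(pooled, 2))
    return sum(abs(x - y) ** v for x, y in pairs) / len(pairs)

def effect_size(delta, mu):
    """Chance-corrected measure of effect size, 1 - delta / mu."""
    return 1.0 - delta / mu

# Check against the values reported above for Factor B (OLS, v = 2).
print(round(effect_size(1.0770, 9.9150), 4))  # 0.8914
```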

Conventional ANOVA Analysis

A conventional randomized-block analysis of variance calculated on the N = 27 univariate response measurement scores for Factor B listed in Table 4.4 on p. 143 yields an observed F-ratio of \(F_{B} = 22.5488\). Assuming independence and normality, \(F_{B}\) is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = b - 1 = 3 - 1 = 2\) and \(\nu _{2} = (n - 1)(b - 1) = (3 - 1)(3 - 1) = 4\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{B} = 22.5488\) yields an approximate probability value of P = 0.0066, which is similar to the LAD and OLS regression probability value of P = 0.0036.
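For the special case of \(\nu_{1} = 2\) numerator degrees of freedom, the upper-tail probability of Snedecor's F has the closed form \(P(F \geq f) = (1 + 2f/\nu_{2})^{-\nu_{2}/2}\), so the reported probability value can be checked without tables. The short sketch below relies only on that identity; the function name is illustrative, and the formula does not apply when \(\nu_{1} \neq 2\).

```python
def f_upper_tail_nu1_2(f, nu2):
    """P(F >= f) for an F distribution with (2, nu2) degrees of freedom."""
    return (1.0 + 2.0 * f / nu2) ** (-nu2 / 2.0)

p = f_upper_tail_nu1_2(22.5488, 4)
print(round(p, 4))  # 0.0066
```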

Analysis of the A×B Interaction

A design matrix of dummy codes for analyzing the interaction of Factors A and B is given in Table 4.6, where the first column of 1 values provides for an intercept and the second and third columns contain dummy codes for subjects (S). The fourth and fifth columns contain dummy codes for Factor A, the sixth and seventh columns contain dummy codes for Factor B, and the next eight columns contain dummy codes for the S×A and S×B interactions. The last column in Table 4.6 lists the response measurement scores ordered by the \(ab = (3)(3) = 9\) levels of the A×B interaction.

The MRPP regression analysis examines the N = 27 regression residuals for possible differences among the nine treatment levels of the A×B interaction; consequently, no dummy codes are provided for the A×B interaction as this information is implicit in the ordering of the treatment levels of the A×B interaction in the last column of Table 4.6.

Table 4.6 Design matrix and univariate response measurement scores for the interaction of Factors A and B in a two-way randomized-block design with N = 27 objects

Because there are

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{ab}n_{ (A\times B)_{i}}!} = \frac{27!} {(3!)^{9}} = 1,080,491,954,750,208,000,000 }$$

possible, equally-likely arrangements of the N = 27 univariate response measurement scores for the A×B interaction listed in Table 4.6, an exact permutation solution is not possible.
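The number of arrangements M is a multinomial coefficient and can be computed exactly with integer arithmetic. The following sketch is illustrative; the function name `n_arrangements` is an assumption introduced here.

```python
from math import factorial

def n_arrangements(group_sizes):
    """Number of distinct arrangements M = N! / (n_1! n_2! ... n_g!)."""
    m = factorial(sum(group_sizes))
    for n in group_sizes:
        m //= factorial(n)  # exact: every partial quotient is an integer
    return m

print(n_arrangements([3, 3, 3]))  # 1680
print(n_arrangements([3] * 9))    # 1080491954750208000000
```

A count of this size makes clear why a resampling approximation based on a random subset of arrangements is used in place of complete enumeration.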

LAD Regression Analysis

An MRPP resampling analysis of the LAD regression residuals calculated on the N = 27 univariate response measurement scores listed in Table 4.6 yields estimated LAD regression coefficients of

$$\displaystyle\begin{array}{rcl} \tilde{\beta }_{0}& =& +2.70\;,\quad \tilde{\beta }_{1} = +3.00\;,\quad \tilde{\beta }_{2} = +6.20\;,\quad \tilde{\beta }_{3} = +0.20\;, {}\\ \tilde{\beta }_{4}& =& -0.20\;,\quad \tilde{\beta }_{5} = -0.80\;,\quad \tilde{\beta }_{6} = -1.00\;,\quad \tilde{\beta }_{7} = +0.90\;, {}\\ \tilde{\beta }_{8}& =& -0.20\;,\quad \tilde{\beta }_{9} = +1.80\;,\quad \tilde{\beta }_{10} = -0.70\;,\quad \tilde{\beta }_{11} = -0.30\;, {}\\ \tilde{\beta }_{12}& =& -0.40\;,\quad \tilde{\beta }_{13} = -0.60\;,\ \mbox{ and }\tilde{\beta }_{14} = -1.00 {}\\ \end{array}$$

for the interaction of Factors A and B. Figure 4.19 lists the observed \(y_{i}\) values, LAD predicted \(\tilde{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,27\).

Fig. 4.19 Observed, predicted, and residual LAD regression values for the univariate response measurement scores listed in Table 4.6

Following Eq. (4.5) on p. 125 and employing ordinary Euclidean distance between residuals with v = 1, the N = 27 LAD regression residuals listed in Fig. 4.19 yield \(ab = (3)(3) = 9\) average distance-function values of

$$\displaystyle\begin{array}{rcl} & & \xi _{(A\times B)_{1}} = 0.5333\;,\quad \xi _{(A\times B)_{2}} = 0.00\;,\quad \xi _{(A\times B)_{3}} =\xi _{(A\times B)_{4}} = 0.0667\;, {}\\ & & \xi _{(A\times B)_{5}} = 0.7333\;,\quad \xi _{(A\times B)_{6}} =\xi _{(A\times B)_{7}} = 0.1333\;,\quad \xi _{(A\times B)_{8}} = 0.0667\;, {}\\ & & \mbox{ and}\;\;\xi _{(A\times B)_{9}} = 0.00\;. {}\\ \end{array}$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the LAD regression residuals listed in Fig. 4.19 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{(A\times B)_{i}}} {N} \;,\qquad i = 1,\,\ldots,\,9\;, }$$

is

$$\displaystyle{ \delta _{A\times B} =\sum _{ i=1}^{ab}C_{ i}\xi _{i} = \frac{3} {27}\big(0.5333 + 0.00 + \cdots + 0.0667 + 0.00\big) = 0.1926\;. }$$

If all M possible arrangements of the N = 27 observed LAD regression residuals listed in Fig. 4.19 occur with equal chance, the approximate resampling probability value of \(\delta_{A\times B} = 0.1926\) computed on L = 1,000,000 random arrangements of the observed LAD regression residuals with \(n_{(A\times B)_{1}} = \cdots = n_{(A\times B)_{9}} = 3\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A\times B}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A\times B}$}} {L} = \frac{235,542} {1,000,000} = 0.2355\;. }$$
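A resampling probability value of this form can be approximated by shuffling the pooled residuals L times and recording how often the shuffled δ is no larger than the observed δ. The sketch below is a minimal illustration under the distance and weights given in the text; the residuals, the seed, and the function names are assumptions, and a much smaller L than 1,000,000 is used to keep the example quick.

```python
import random
from itertools import combinations

def mrpp_delta(groups, v):
    """MRPP statistic: weighted sum of average within-group distances."""
    N = sum(len(grp) for grp in groups)
    k = len(groups)
    total = 0.0
    for grp in groups:
        n = len(grp)  # each group needs at least two residuals
        pairs = list(combinations(grp, 2))
        xi = sum(abs(x - y) ** v for x, y in pairs) / len(pairs)
        weight = (n - 1) / (N - k) if v == 2 else n / N
        total += weight * xi
    return total

def resampling_p_value(groups, v, L, seed=0):
    """Proportion of L random arrangements with delta <= observed delta."""
    rng = random.Random(seed)
    observed = mrpp_delta(groups, v)
    pooled = [e for grp in groups for e in grp]
    sizes = [len(grp) for grp in groups]
    hits = 0
    for _ in range(L):
        rng.shuffle(pooled)
        start, shuffled = 0, []
        for n in sizes:  # cut the shuffled list back into the original group sizes
            shuffled.append(pooled[start:start + n])
            start += n
        if mrpp_delta(shuffled, v) <= observed + 1e-12:
            hits += 1
    return hits / L

# Hypothetical residuals, not those of Fig. 4.19.
toy = [[-2.1, -1.9, -2.0], [0.1, -0.1, 0.0], [2.0, 1.9, 2.1]]
p = resampling_p_value(toy, v=1, L=5000)
print(p)
```

The estimate varies with the seed and with L; larger L reduces the Monte Carlo error at the cost of computing time.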

Following Eq. (4.7) on p. 126, the exact expected value of the δ values is \(\mu_{\delta} = 0.2063\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{A\times B} = 1 -\frac{\delta _{A\times B}} {\mu _{\delta }} = 1 -\frac{0.1926} {0.2063} = +0.0663\;, }$$

indicating approximately 7 % agreement between the observed and predicted y values above that expected by chance.

OLS Regression Analysis

For comparison, consider an MRPP analysis of OLS regression residuals calculated on the N = 27 univariate response measurement scores for the A×B interaction listed in Table 4.6. The MRPP regression analysis yields estimated OLS regression coefficients of

$$\displaystyle{\begin{array}{lllllll} \hat{\beta }_{0} & = +2.8889\;,\quad &\quad \hat{\beta }_{1} & = +2.80\;,\quad &\quad \hat{\beta }_{2} = +6.3222\;,\quad &\hat{\beta }_{3} & = +0.0667\;, \\ \hat{\beta }_{4} & = -0.3333\;,\quad &\quad \hat{\beta }_{5} & = -0.9333\;,\quad &\quad \hat{\beta }_{6} = -1.1333\;,\quad &\hat{\beta }_{7} & = +1.00\;, \\ \hat{\beta }_{8} & = 0.00\;,\quad &\quad \hat{\beta }_{9} & = +2.0333\;,\quad &\quad \hat{\beta }_{10} = -0.80\;,\quad &\hat{\beta }_{11} & = -0.1333\;, \\ \hat{\beta }_{12} & = -0.2667\;,\quad &\quad \hat{\beta }_{13} & = -0.4333\;, &\quad \mbox{ and }\hat{\beta }_{14} = -1.1333\end{array} }$$

for the interaction of Factors A and B. Figure 4.20 lists the observed \(y_{i}\) values, OLS predicted \(\hat{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,27\).

Fig. 4.20 Observed, predicted, and residual OLS regression values for the univariate response measurement scores listed in Table 4.6

Following Eq. (4.5) on p. 125 and employing squared Euclidean distance between residuals with v = 2, the N = 27 OLS regression residuals listed in Fig. 4.20 yield \(ab = (3)(3) = 9\) average distance-function values of

$$\displaystyle\begin{array}{rcl} \xi _{(A\times B)_{1}}& =& 0.1151\;,\quad \xi _{(A\times B)_{2}} = 0.1147\;,\quad \xi _{(A\times B)_{3}} = 0.0055\;, {}\\ \xi _{(A\times B)_{4}}& =& 0.0865\;,\quad \xi _{(A\times B)_{5}} = 0.2105\;,\quad \xi _{(A\times B)_{6}} = 0.0287\;, {}\\ \xi _{(A\times B)_{7}}& =& 0.0359\;,\quad \xi _{(A\times B)_{8}} = 0.0250\;,\quad \mbox{ and }\xi _{(A\times B)_{9}} = 0.0300\;. {}\\ \end{array}$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the OLS regression residuals listed in Fig. 4.20 with v = 2 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{(A\times B)_{i}} - 1} {N - ab} \;,\qquad i = 1,\,\ldots,\,9\;, }$$

is

$$\displaystyle\begin{array}{rcl} & & \delta _{A\times B} =\sum _{ i=1}^{ab}C_{ i}\xi _{i} = \frac{3 - 1} {27 - 9}\big(0.1151 + 0.1147 + 0.0055 {}\\ & & \phantom{\delta _{A\times B} =\sum _{ i=1}^{ab}C_{ i}\xi _{i} = \frac{3 - 1} {27 - 9}(0.1151 + 0.11} + \cdots + 0.0250 + 0.0300\big) = 0.0724\;. {}\\ \end{array}$$

If all M possible arrangements of the N = 27 observed OLS regression residuals listed in Fig. 4.20 occur with equal chance, the approximate resampling probability value of \(\delta_{A\times B} = 0.0724\) computed on L = 1,000,000 random arrangements of the observed OLS regression residuals with \(n_{(A\times B)_{1}} = \cdots = n_{(A\times B)_{9}} = 3\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A\times B}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A\times B}$}} {L} = \frac{141,960} {1,000,000} = 0.1420\;. }$$

For comparison, the approximate resampling probability value based on LAD regression, v = 1, L = 1,000,000, and \(C_{i} = n_{(A\times B)_{i}}/N\) for \(i = 1,\,\ldots,\,9\) is P = 0.2355.

Following Eq. (4.7) on p. 126, the exact expected value of the δ values is \(\mu_{\delta} = 0.0892\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\hat{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{A\times B} = 1 -\frac{\delta _{A\times B}} {\mu _{\delta }} = 1 -\frac{0.0724} {0.0892} = +0.1883\;, }$$

indicating approximately 19 % agreement between the observed and predicted y values above that expected by chance.

Conventional ANOVA Analysis

A conventional randomized-block analysis of variance calculated on the N = 27 response measurement scores for the A×B interaction listed in Table 4.4 on p. 143 yields an observed F-ratio of \(F_{A\times B} = 1.5591\). Assuming independence and normality, \(F_{A\times B}\) is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = (a - 1)(b - 1) = (3 - 1)(3 - 1) = 4\) and \(\nu _{2} = (n - 1)(a - 1)(b - 1) = (3 - 1)(3 - 1)(3 - 1) = 8\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{A\times B} = 1.5591\) yields an approximate probability value of P = 0.2744.

4.3.5 Two-Way Factorial Design

Consider a 2×3 fixed-effects factorial design with n = 4 subjects in each treatment combination for a total of N = 24 subjects. The univariate response measurement scores for Factors A and B are listed in Fig. 4.21, and the design matrices and data for Factors A and B are given in Table 4.7; the design and data are adapted from Keppel [214, p. 197]. While design matrices of either dummy or effect codes are appropriate for one-way completely randomized and randomized-block designs, the main effects of factorial designs are best analyzed with effect codes when estimation of the effects of each factor is adjusted for all other factors in the model to obtain the unique contribution of each factor [31, 37, 294]. A permutation analysis of factorial designs requires three separate analyses comprising (1) the main effect of Factor A, (2) the main effect of Factor B, and (3) the A×B interaction effect.

Fig. 4.21 Example univariate response measurement scores for Factors A and B in a two-way factorial design

Table 4.7 Design matrices and univariate response measurement scores for the main effects of Factors A and B in a two-way factorial design with N = 24 subjects

Analysis of Factor A

A design matrix of effect codes for analyzing Factor A is given on the left side of Table 4.7, where the first column of 1 values provides for an intercept. The second and third columns contain effect codes for Factor B, the fourth and fifth columns contain effect codes for the A×B interaction, and the last column on the left side of Table 4.7 contains the N = 24 univariate response measurement scores listed according to the original random assignment of the subjects to the a = 2 levels of Factor A with the first \(n_{A_{1}} = 12\) scores and the last \(n_{A_{2}} = 12\) scores associated with treatment levels \(A_{1}\) and \(A_{2}\), respectively. The MRPP regression analysis examines the N = 24 regression residuals for possible differences between the a = 2 treatment levels of Factor A; consequently, no effect codes are provided for Factor A as this information is implicit in the ordering of the a = 2 treatment levels of Factor A in the last column on the left side of Table 4.7.
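The structure of such a reduced design matrix can be sketched in Python. The sketch assumes the usual effect-coding convention in which the last level of a factor is coded −1 on every column and the interaction codes are products of the Factor A and Factor B codes; the choice of reference level and the ordering of rows within each level of Factor A may differ from Table 4.7, and the function names are hypothetical.

```python
def effect_codes(level, n_levels):
    """Effect codes for one factor level; the last level is coded -1 throughout."""
    if level == n_levels - 1:
        return [-1] * (n_levels - 1)
    return [1 if j == level else 0 for j in range(n_levels - 1)]

def reduced_design_for_a(a, b, n):
    """Rows of [1, B codes, AxB codes], ordered by level of A.

    No columns are included for Factor A itself, since group membership
    on A is what the MRPP analysis of the residuals examines.
    """
    rows = []
    for i in range(a):
        for j in range(b):
            a_codes = effect_codes(i, a)
            b_codes = effect_codes(j, b)
            ab_codes = [x * y for x in a_codes for y in b_codes]
            for _ in range(n):
                rows.append([1] + b_codes + ab_codes)
    return rows

X = reduced_design_for_a(a=2, b=3, n=4)
print(len(X), len(X[0]))  # 24 5
```

The five columns correspond to the intercept, two Factor B codes, and two A×B codes described above.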

An exact permutation solution is feasible for the univariate response measurement scores listed on the left side of Table 4.7 since there are only

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{a}n_{ A_{i}}!} = \frac{24!} {(12!)^{2}} = 2,704,156 }$$

possible, equally-likely arrangements of the N = 24 response measurement scores for Factor A.

LAD Regression Analysis

An MRPP analysis of the LAD regression residuals calculated on the N = 24 univariate response measurement scores on the left side of Table 4.7 yields estimated LAD regression coefficients of

$$\displaystyle\begin{array}{rcl} \tilde{\beta }_{0}& =& +9.6667\;,\quad \tilde{\beta }_{1} = -1.1667\;,\quad \tilde{\beta }_{2} = +0.8333\;,\quad \tilde{\beta }_{3} = -4.50\;,\mbox{ and} {}\\ \tilde{\beta }_{4}& =& +1.50 {}\\ \end{array}$$

for Factor A. Figure 4.22 lists the observed \(y_{i}\) values, LAD predicted \(\tilde{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,24\).

Fig. 4.22 Observed, predicted, and residual LAD regression values for the univariate response measurement scores listed on the left side of Table 4.7

Following Eq. (4.5) on p. 125 and employing ordinary Euclidean distance between residuals with v = 1, the N = 24 LAD regression residuals listed in Fig. 4.22 yield a = 2 average distance-function values of

$$\displaystyle{ \xi _{A_{1}} = 4.5455\quad \mbox{ and}\quad \xi _{A_{2}} = 5.6061\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the LAD regression residuals listed in Fig. 4.22 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{A_{i}}} {N} \;,\qquad i = 1,2\;, }$$

is

$$\displaystyle{ \delta _{A} =\sum _{ i=1}^{a}C_{ i}\xi _{i} = \frac{12} {24}\big(4.5455 + 5.6061\big) = 5.0758\;. }$$

If all arrangements of the N = 24 observed LAD regression residuals listed in Fig. 4.22 occur with equal chance, the exact probability value of \(\delta_{A} = 5.0758\) computed on the M = 2,704,156 possible arrangements of the observed LAD regression residuals with \(n_{A_{1}} = n_{A_{2}} = 12\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A}$}} {M} = \frac{1,039,084} {2,704,156} = 0.3843\;. }$$

Following Eq. (4.7) on p. 126, the exact expected value of the M = 2,704,156 δ values is \(\mu_{\delta} = 5.0725\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{A} = 1 -\frac{\delta _{A}} {\mu _{\delta }} = 1 -\frac{5.0758} {5.0725} = -0.6494\times 10^{-3}\;, }$$

indicating slightly less than chance agreement between the observed and predicted y values.

OLS Regression Analysis

For comparison, consider an MRPP analysis of OLS regression residuals calculated on the N = 24 univariate response measurement scores for Factor A on the left side of Table 4.7. The MRPP regression analysis yields estimated OLS regression coefficients of

$$\displaystyle\begin{array}{rcl} \hat{\beta }_{0}& =& +10.00\;,\;\;\hat{\beta }_{1} = -3.00\;,\;\;\hat{\beta }_{2} = +1.00\;,\;\;\hat{\beta }_{3} = -3.00\;,\mbox{ and}\;\;\hat{\beta }_{4} = 0.00 {}\\ \end{array}$$

for Factor A. Figure 4.23 lists the observed \(y_{i}\) values, OLS predicted \(\hat{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,24\).

Fig. 4.23 Observed, predicted, and residual OLS regression values for the univariate response measurement scores listed on the left side of Table 4.7

Following Eq. (4.5) on p. 125 and employing squared Euclidean distance between residuals with v = 2, the N = 24 OLS regression residuals listed in Fig. 4.23 yield a = 2 average distance-function values of

$$\displaystyle{ \xi _{A_{1}} = 26.1818\quad \mbox{ and}\quad \xi _{A_{2}} = 33.8182\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the OLS regression residuals listed in Fig. 4.23 with v = 2 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{A_{i}} - 1} {N - a} \;,\qquad i = 1,2\;, }$$

is

$$\displaystyle{ \delta _{A} =\sum _{ i=1}^{a}C_{ i}\xi _{i} = \frac{12 - 1} {24 - 2}\big(26.1818 + 33.8182\big) = 30.00\;. }$$

If all arrangements of the N = 24 observed OLS regression residuals listed in Fig. 4.23 occur with equal chance, the exact probability value of \(\delta_{A} = 30.00\) computed on the M = 2,704,156 possible arrangements of the observed OLS regression residuals with \(n_{A_{1}} = n_{A_{2}} = 12\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A}$}} {M} = \frac{637,454} {2,704,156} = 0.2357\;. }$$

For comparison, the exact probability value based on LAD regression, v = 1, M = 2,704,156, and \(C_{i} = n_{A_{i}}/N\) for i = 1, 2 is P = 0.3843.

Following Eq. (4.7) on p. 126, the exact expected value of the M = 2,704,156 δ values is \(\mu_{\delta} = 30.7826\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\hat{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{A} = 1 -\frac{\delta _{A}} {\mu _{\delta }} = 1 - \frac{30.00} {30.7826} = +0.0254\;, }$$

indicating approximately 3 % agreement between the observed and predicted y values above that expected by chance.

Conventional ANOVA Analysis

A conventional fixed-effects factorial analysis of variance calculated on the N = 24 Factor A response measurement scores listed in Fig. 4.21 on p. 156 yields an observed F-ratio of \(F_{A} = 1.3091\). Assuming independence, normality, and homogeneity of variance, \(F_{A}\) is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = a - 1 = 2 - 1 = 1\) and \(\nu _{2} = N - ab = 24 - (2)(3) = 18\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{A} = 1.3091\) yields an approximate probability value of P = 0.2675, which is similar to the OLS regression probability value of P = 0.2357.

Analysis of Factor B

The right side of Table 4.7 on p. 157 contains a design matrix of effect codes for analyzing Factor B, where the first column of 1 values provides for an intercept. The second column contains effect codes for Factor A, the third and fourth columns contain effect codes for the A×B interaction, and the last column on the right side of Table 4.7 contains the N = 24 univariate response measurement scores listed according to the original random assignment of the subjects to the b = 3 levels of Factor B with the first \(n_{B_{1}} = 8\) scores, the next \(n_{B_{2}} = 8\) scores, and the last \(n_{B_{3}} = 8\) scores associated with treatment levels \(B_{1}\), \(B_{2}\), and \(B_{3}\), respectively. The MRPP regression analysis examines the N = 24 regression residuals for possible differences among the b = 3 treatment levels of Factor B; consequently, no effect codes are provided for Factor B as this information is implicit in the ordering of the b = 3 treatment levels of Factor B in the last column on the right side of Table 4.7.

Because there are

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{b}n_{ B_{i}}!} = \frac{24!} {(8!)^{3}} = 9,465,511,770 }$$

possible, equally-likely arrangements of the N = 24 response measurement scores for Factor B listed on the right side of Table 4.7, an exact permutation approach is not practical.

LAD Regression Analysis

An MRPP resampling analysis of the N = 24 LAD regression residuals calculated on the univariate response measurement scores on the right side of Table 4.7 on p. 157 yields estimated LAD regression coefficients of

$$\displaystyle{ \tilde{\beta }_{0} = +9.50\;,\quad \tilde{\beta }_{1} = +0.1667\;,\quad \tilde{\beta }_{2} = -3.6667\;,\mbox{ and}\quad \tilde{\beta }_{3} = +0.3333 }$$

for Factor B. Figure 4.24 lists the observed \(y_{i}\) values, LAD predicted \(\tilde{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,24\).

Fig. 4.24 Observed, predicted, and residual LAD regression values for the univariate response measurement scores listed on the right side of Table 4.7

Following Eq. (4.5) on p. 125 and employing ordinary Euclidean distance between residuals with v = 1, the N = 24 LAD regression residuals listed in Fig. 4.24 yield b = 3 average distance-function values of

$$\displaystyle{ \xi _{B_{1}} = 4.0714\;,\quad \xi _{B_{2}} = 6.0714\;,\mbox{ and}\quad \xi _{B_{3}} = 4.8571\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the LAD regression residuals listed in Fig. 4.24 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{B_{i}}} {N} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{B} =\sum _{ i=1}^{b}C_{ i}\xi _{i} = \frac{8} {24}\big(4.0714 + 6.0714 + 4.8571\big) = 5.00\;. }$$

If all M possible arrangements of the N = 24 observed LAD regression residuals listed in Fig. 4.24 occur with equal chance, the approximate resampling probability value of \(\delta_{B} = 5.00\) computed on L = 1,000,000 random arrangements of the observed LAD regression residuals with \(n_{B_{1}} = n_{B_{2}} = n_{B_{3}} = 8\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{B}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{B}$}} {L} = \frac{125,031} {1,000,000} = 0.1250\;. }$$

Following Eq. (4.7) on p. 126, the exact expected value of the δ values is \(\mu_{\delta} = 5.3333\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{B} = 1 -\frac{\delta _{B}} {\mu _{\delta }} = 1 - \frac{5.00} {5.3333} = +0.0625\;, }$$

indicating approximately 6 % agreement between the observed and predicted y values above that expected by chance.

OLS Regression Analysis

For comparison, consider an MRPP analysis of OLS regression residuals calculated on the N = 24 univariate response measurement scores for Factor B listed on the right side of Table 4.7 on p. 157. The MRPP regression analysis yields estimated OLS regression coefficients of

$$\displaystyle{ \hat{\beta }_{0} = +10.00\;,\quad \hat{\beta }_{1} = -1.00\;,\quad \hat{\beta }_{2} = -3.00\;,\mbox{ and}\quad \hat{\beta }_{3} = 0.00 }$$

for Factor B. Figure 4.25 lists the observed \(y_{i}\) values, OLS predicted \(\hat{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,24\).

Fig. 4.25 Observed, predicted, and residual OLS regression values for the univariate response measurement scores listed on the right side of Table 4.7

Following Eq. (4.5) on p. 125 and employing squared Euclidean distance between residuals with v = 2, the N = 24 OLS regression residuals listed in Fig. 4.25 yield b = 3 average distance-function values of

$$\displaystyle{ \xi _{B_{1}} = 21.7143\;,\quad \xi _{B_{2}} = 45.1429\;,\mbox{ and}\quad \xi _{B_{3}} = 27.4286\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the OLS regression residuals listed in Fig. 4.25 with v = 2 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{B_{i}} - 1} {N - b} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{B} =\sum _{ i=1}^{b}C_{ i}\xi _{i} = \frac{8 - 1} {24 - 3}\big(21.7143 + 45.1429 + 27.4286\big) = 31.4286\;. }$$

If all M possible arrangements of the N = 24 observed OLS regression residuals listed in Fig. 4.25 occur with equal chance, the approximate resampling probability value of \(\delta_{B} = 31.4286\) computed on L = 1,000,000 random arrangements of the observed OLS regression residuals with \(n_{B_{1}} = n_{B_{2}} = n_{B_{3}} = 8\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{B}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{B}$}} {L} = \frac{49,168} {1,000,000} = 0.0492\;. }$$

For comparison, the approximate resampling probability value based on LAD regression, v = 1, L = 1,000,000, and \(C_{i} = n_{B_{i}}/N\) for i = 1, 2, 3 is P = 0.1250.

Following Eq. (4.7) on p. 126, the exact expected value of the M = 9,465,511,770 δ values is \(\mu_{\delta} = 38.4348\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\hat{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{B} = 1 -\frac{\delta _{B}} {\mu _{\delta }} = 1 -\frac{31.4286} {38.4348} = +0.1823\;, }$$

indicating approximately 18 % agreement between the observed and predicted y values above that expected by chance.

Conventional ANOVA Analysis

A conventional fixed-effects factorial analysis of variance calculated on the N = 24 Factor B response measurement scores listed in Fig. 4.21 on p. 156 yields an observed F-ratio of \(F_{B} = 3.0545\). Assuming independence, normality, and homogeneity of variance, \(F_{B}\) is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = b - 1 = 3 - 1 = 2\) and \(\nu _{2} = N - ab = 24 - (2)(3) = 18\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{B} = 3.0545\) yields an approximate probability value of P = 0.0721.

Analysis of the A×B Interaction

A design matrix of effect codes for analyzing the A×B interaction of the data listed in Fig. 4.21 on p. 156 is given in Fig. 4.26, where the first column of 1 values provides for an intercept, the second column contains effect codes for Factor A, the third and fourth columns contain effect codes for Factor B, and the last column lists the N = 24 univariate response measurement scores listed according to the original random assignment of the subjects to the \(ab = (2)(3) = 6\) levels of the A×B interaction. The MRPP regression analysis examines the N = 24 regression residuals for possible differences among the six treatment levels of the A×B interaction; consequently, no effect codes are provided for the A×B interaction as this information is implicit in the ordering of the treatment levels of the A×B interaction in the last column of Fig. 4.26.

Fig. 4.26 Design matrix and univariate response measurement scores for the A×B interaction in a 2×3 factorial design with N = 24 subjects

Because there are

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{ab}n_{ (A\times B)_{i}}!} = \frac{24!} {(4!)^{6}} = 3,246,670,537,110,000 }$$

possible, equally-likely arrangements of the N = 24 univariate response measurement scores for the A×B interaction listed in Fig. 4.26, an exact permutation approach is clearly not possible.

LAD Regression Analysis

An MRPP resampling analysis of the LAD regression residuals calculated on the univariate response measurement scores in Fig. 4.26 yields estimated LAD regression coefficients of

$$\displaystyle{ \tilde{\beta }_{0} = +8.3333\;,\quad \tilde{\beta }_{1} = -1.00\;,\quad \tilde{\beta }_{2} = -3.3333\;,\mbox{ and}\quad \tilde{\beta }_{3} = -0.3333 }$$

for the interaction of Factors A and B. Figure 4.27 lists the observed \(y_{i}\) values, LAD predicted \(\tilde{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,24\).

Fig. 4.27 Observed, predicted, and residual LAD regression values for the univariate response measurement scores listed in Fig. 4.26

Following Eq. (4.5) on p. 125 and employing ordinary Euclidean distance between residuals with v = 1, the N = 24 LAD regression residuals listed in Fig. 4.27 yield \(ab = (2)(3) = 6\) average distance-function values of

$$\displaystyle\begin{array}{rcl} & & \xi _{(A\times B)_{1}} = 4.00\;,\quad \xi _{(A\times B)_{2}} = 5.00\;,\quad \xi _{(A\times B)_{3}} = 6.00\;,\quad \xi _{(A\times B)_{4}} = 7.00\;, {}\\ & & \mbox{ and}\quad \xi _{(A\times B)_{5}} =\xi _{(A\times B)_{6}} = 5.00\;. {}\\ \end{array}$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the LAD regression residuals listed in Fig. 4.27 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{(A\times B)_{i}}} {N} \;,\qquad i = 1,\,\ldots,\,6\;, }$$

is

$$\displaystyle{ \delta _{A\times B} =\sum _{ i=1}^{ab}C_{ i}\xi _{i} = \frac{4} {24}\big(4.00 + 5.00 + 6.00 + 7.00 + 5.00 + 5.00\big) = 5.3333\;. }$$

If all M possible arrangements of the N = 24 observed LAD regression residuals listed in Fig. 4.27 occur with equal chance, the approximate resampling probability value of \(\delta_{A\times B} = 5.3333\) computed on L = 1,000,000 random arrangements of the observed LAD regression residuals with \(n_{(A\times B)_{1}} = \cdots = n_{(A\times B)_{6}} = 4\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A\times B}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A\times B}$}} {L} = \frac{347,675} {1,000,000} = 0.3477\;. }$$

Following Eq. (4.7) on p. 126, the exact expected value of the δ values is \(\mu_{\delta} = 5.50\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{A\times B} = 1 -\frac{\delta _{A\times B}} {\mu _{\delta }} = 1 -\frac{5.3333} {5.50} = +0.0303\;, }$$

indicating approximately 3 % agreement between the observed and predicted y values above that expected by chance.

OLS Regression Analysis

For comparison, consider an MRPP analysis of OLS regression residuals calculated on the N = 24 univariate response measurement scores of the A×B interaction listed in Fig. 4.26. The MRPP regression analysis yields estimated OLS regression coefficients of

$$\displaystyle{ \hat{\beta }_{0} = +10.00\;,\quad \hat{\beta }_{1} = -1.00\;,\quad \hat{\beta }_{2} = -3.00\;,\mbox{ and}\quad \hat{\beta }_{3} = +1.00 }$$

for the interaction of Factors A and B. Figure 4.28 lists the observed \(y_{i}\) values, OLS predicted \(\hat{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,24\).

Fig. 4.28 Observed, predicted, and residual OLS regression values for the univariate response measurement scores listed in Fig. 4.26

Following Eq. (4.5) on p. 125 and employing squared Euclidean distance between residuals with v = 2, the N = 24 OLS regression residuals listed in Fig. 4.28 yield \(ab = (2)(3) = 6\) average distance-function values of

$$\displaystyle\begin{array}{rcl} \xi _{(A\times B)_{1}}& =& 20.00\;,\quad \xi _{(A\times B)_{2}} = 30.6667\;,\quad \xi _{(A\times B)_{3}} = 45.3333\;, {}\\ \xi _{(A\times B)_{4}}& =& 60.00\;,\quad \xi _{(A\times B)_{5}} = 30.6667\;,\quad \mbox{ and }\xi _{(A\times B)_{6}} = 33.3333\;. {}\\ \end{array}$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the OLS regression residuals listed in Fig. 4.28 with v = 2 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{(A\times B)_{i}} - 1} {N - ab} \;,\qquad i = 1,\,\ldots,\,6\;, }$$

is

$$\displaystyle\begin{array}{rcl} \delta _{A\times B}& =& \sum _{i=1}^{ab}C_{ i}\xi _{i} = \frac{4 - 1} {24 - 6}\big(20.00 + 30.6667 + 45.3333 + 60.00 {}\\ & & \phantom{\sum _{i=1}^{ab}C_{ i}\xi _{i} = \frac{4 - 1} {24 - 6}(20.00 + 30.66} + 30.6667 + 33.3333\big) = 36.6667\;. {}\\ \end{array}$$

If all M possible arrangements of the observed OLS regression residuals listed in Fig. 4.28 occur with equal chance, the approximate resampling probability value of \(\delta_{A\times B} = 36.6667\) computed on L = 1,000,000 random arrangements of the observed OLS regression residuals with \(n_{(A\times B)_{1}} = \cdots = n_{(A\times B)_{6}} = 4\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A\times B}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A\times B}$}} {L} = \frac{224,204} {1,000,000} = 0.2242\;. }$$

For comparison, the approximate resampling probability value based on LAD regression, v = 1, L = 1,000,000, and \(C_{i} = n_{(A\times B)_{i}}/N\) for \(i = 1,\,\ldots,\,6\) is P = 0.3477.

Following Eq. (4.7) on p. 126, the exact expected value of the δ values is \(\mu_{\delta} = 41.2174\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\hat{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{A\times B} = 1 -\frac{\delta _{A\times B}} {\mu _{\delta }} = 1 -\frac{36.6667} {41.2174} = +0.1104\;, }$$

indicating approximately 11 % agreement between the observed and predicted y values above that expected by chance.
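The arithmetic behind δ and ℜ can be checked with a few lines of Python. The sketch below is illustrative only: the function names `mrpp_delta` and `effect_size` and the `weighting` argument are hypothetical, and the inputs are the ξ and μ_δ values reported above rather than values recomputed from the residuals in Fig. 4.28.

```python
def mrpp_delta(xi, sizes, weighting):
    """Weighted sum of the group average distance-function values xi.

    weighting="df":   C_i = (n_i - 1) / (N - g), the weights used in the text with v = 2
    weighting="size": C_i = n_i / N,             the weights used in the text with v = 1
    """
    N, g = sum(sizes), len(sizes)
    if weighting == "df":
        weights = [(n - 1) / (N - g) for n in sizes]
    elif weighting == "size":
        weights = [n / N for n in sizes]
    else:
        raise ValueError("weighting must be 'df' or 'size'")
    return sum(c * x for c, x in zip(weights, xi))


def effect_size(delta, mu_delta):
    """Chance-corrected effect size: 1 - delta / mu_delta."""
    return 1.0 - delta / mu_delta


# Values reported in the text for the A x B interaction (OLS residuals, v = 2).
xi_ab = [20.00, 30.6667, 45.3333, 60.00, 30.6667, 33.3333]
delta_ab = mrpp_delta(xi_ab, [4] * 6, weighting="df")
r_ab = effect_size(delta_ab, 41.2174)
print(round(delta_ab, 4), round(r_ab, 4))
```

Up to rounding of the reported ξ values, the two printed numbers should agree with δ = 36.6667 and ℜ = +0.1104.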

Conventional ANOVA Analysis

A conventional fixed-effects factorial analysis of variance calculated on the N = 24 univariate response measurement scores listed in Fig. 4.21 on p. 156 yields an observed F-ratio of \(F_{A\times B} = 3.9273\). Assuming independence, normality, and homogeneity of variance, \(F_{A\times B}\) is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = (a - 1)(b - 1) = (2 - 1)(3 - 1) = 2\) and \(\nu _{2} = ab(n - 1) = (2)(3)(4 - 1) = 18\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{A\times B} = 3.9273\) yields an approximate probability value of P = 0.0384, which differs greatly from the LAD and OLS regression probability values of P = 0.3477 and P = 0.2242, respectively.
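When the numerator degrees of freedom are exactly 2, the upper-tail probability of Snedecor's F has a simple closed form, \(P(F \geq f) = (1 + 2f/\nu _{2})^{-\nu _{2}/2}\), so the probability value quoted above can be checked without statistical tables. The following Python sketch applies that special case; the function name is hypothetical, and the formula does not hold for other values of \(\nu _{1}\).

```python
def f_upper_tail_nu1_is_2(f, nu2):
    """Upper-tail probability P(F >= f) for an F distribution with
    nu1 = 2 numerator and nu2 denominator degrees of freedom.

    Valid only for nu1 = 2, where the survival function reduces to
    (1 + 2 f / nu2) ** (-nu2 / 2).
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    return (1.0 + 2.0 * f / nu2) ** (-nu2 / 2.0)


# Interaction test from the text: F = 3.9273 with nu1 = 2 and nu2 = 18.
p_ab = f_upper_tail_nu1_is_2(3.9273, 18)
print(round(p_ab, 4))
```

The printed value should be close to the reported P = 0.0384.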

4.3.6 Latin Square Design

A Latin square experimental design assigns treatments to subjects so the treatments occur in a balanced fashion within a square block or field; thus, n treatments appear once in each of n rows and n columns. The Latin square is the design of choice when controlling for two blocking factors. Consider an ordinary balanced Latin square experiment involving repeated measurements in which n = 4 subjects (S) are each tested b = 4 times on Factor A. The design and data are adapted from Ferguson [115, p. 349] and are given in Table 4.8, where B refers to the ordinal position in which the levels of Factor A are administered. Thus, the first subject, \(S_{1}\), receives the b = 4 treatments in the order \(A_{2}\), \(A_{4}\), \(A_{1}\), \(A_{3}\), and so on. Due to the balanced nature of Latin square designs, the assumption is that there is no interaction between blocking Factors A and B, or between either blocking factor and the treatments.

Table 4.8 Design and data for a Latin square design with four subjects (S), four treatments (A), and four orders (B)

Analysis of Factor A

A design matrix of dummy codes for analyzing Factor A is given in Fig. 4.29, where the first column of 1 values provides for an intercept, the second through fourth columns contain dummy codes for Subjects, the fifth through seventh columns contain dummy codes for Factor B, and the last column lists the univariate response measurement scores ordered by the a = 4 levels of Factor A, with the first \(n_{A_{1}} = 4\) scores, the next \(n_{A_{2}} = 4\) scores, the next \(n_{A_{3}} = 4\) scores, and the last \(n_{A_{4}} = 4\) scores associated with treatment levels \(A_{1}\), \(A_{2}\), \(A_{3}\), and \(A_{4}\), respectively. The MRPP regression analysis examines the N = 16 regression residuals for possible differences among the a = 4 treatment levels of Factor A; consequently, no dummy codes are provided for Factor A as this information is implicit in the ordering of the a = 4 treatment levels of Factor A in the last column of Fig. 4.29.

Fig. 4.29
figure 29

Design matrix and univariate response measurement scores for treatment (A) in a Latin square design

Because there are

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{a}n_{ A_{i}}!} = \frac{16!} {(4!)^{4}} = 63,063,000 }$$

possible, equally-likely arrangements of the N = 16 univariate response measurement scores listed in Fig. 4.29, an exact permutation approach is not practical.
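The count M is a multinomial coefficient and can be verified with exact integer arithmetic. A minimal Python sketch follows; the function name `n_arrangements` is hypothetical.

```python
from math import factorial


def n_arrangements(group_sizes):
    """Number of distinct assignments of N = sum(group_sizes) residuals
    to groups of the given sizes: N! / (n_1! n_2! ... n_g!)."""
    m = factorial(sum(group_sizes))
    for n in group_sizes:
        m //= factorial(n)  # each partial quotient is an exact integer
    return m


# Four treatment levels with four residuals each, as in the Latin square example.
print(n_arrangements([4, 4, 4, 4]))
```

The same function applies to any of the balanced or unbalanced designs in this chapter by changing the list of group sizes.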

LAD Regression Analysis

An MRPP resampling analysis of the N = 16 LAD regression residuals calculated on the univariate response measurement scores listed in Fig. 4.29 yields estimated LAD regression coefficients of

$$\displaystyle{\begin{array}{llll} \tilde{\beta }_{0} & = +10.00\;,\quad &\quad \tilde{\beta }_{1} & = +2.00\;,\quad \tilde{\beta }_{2} = -2.00\;,\quad \tilde{\beta }_{3} = -5.00\;, \\ \tilde{\beta }_{4} & = +8.00\;,\quad &\quad \tilde{\beta }_{5} & = +12.00\;,\mbox{ and }\tilde{\beta }_{6} = +4.00\end{array} }$$

for Factor A. Figure 4.30 lists the observed \(y_{i}\) values, LAD predicted \(\tilde{y}_{i}\) values, and residual \(e_{i}\) values for i = 1, …, 16.

Fig. 4.30
figure 30

Observed, predicted, and residual LAD regression values for the univariate response measurement scores listed in Fig. 4.29

Following Eq. (4.5) on p. 125 and employing ordinary Euclidean distance between residuals with v = 1, the N = 16 LAD regression residuals listed in Fig. 4.30 yield a = 4 average distance-function values of

$$\displaystyle{ \xi _{A_{1}} = 10.3333\;,\quad \xi _{A_{2}} = 7.3333\;,\quad \xi _{A_{3}} = 0.00\;,\mbox{ and}\quad \xi _{A_{4}} = 7.1667\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the LAD regression residuals listed in Fig. 4.30 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{A_{i}}} {N} \;,\qquad i = 1,\,\ldots,\,4\;, }$$

is

$$\displaystyle{ \delta _{A} =\sum _{ i=1}^{a}C_{ i}\xi _{i} = \frac{4} {16}\big(10.3333 + 7.3333 + 0.00 + 7.1667\big) = 6.2083\;. }$$

If all M possible arrangements of the N = 16 observed LAD regression residuals listed in Fig. 4.30 occur with equal chance, the approximate resampling probability value of \(\delta _{A} = 6.2083\) computed on L = 1,000,000 random arrangements of the observed LAD regression residuals with \(n_{A_{1}} = \cdots = n_{A_{4}} = 4\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A}$}} {L} = \frac{27,289} {1,000,000} = 0.0273\;. }$$

Following Eq. (4.7) on p. 126, the exact expected value of the M = 63,063,000 δ values is \(\mu _{\delta } = 8.2750\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, i = 1, …, N, is

$$\displaystyle{ \mathfrak{R}_{A} = 1 -\frac{\delta _{A}} {\mu _{\delta }} = 1 -\frac{6.2083} {8.2750} = +0.2497\;, }$$

indicating approximately 25 % agreement between the observed and predicted y values above that expected by chance.
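The resampling procedure used above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' program: the function names are hypothetical, the residuals are made-up numbers (the values in Fig. 4.30 are not reproduced here), the distance between two univariate residuals is taken to be \(\vert e_{j} - e_{k}\vert ^{v}\), the weights are \(C_{i} = n_{i}/N\), and L is kept small so the example runs quickly.

```python
import random
from itertools import combinations


def avg_distance(group, v=1):
    """Average of |e_j - e_k| ** v over all pairs of residuals in one group."""
    pairs = list(combinations(group, 2))
    return sum(abs(a - b) ** v for a, b in pairs) / len(pairs)


def mrpp_delta(residuals, sizes, v=1):
    """delta = sum of C_i * xi_i with C_i = n_i / N; groups are consecutive slices."""
    N = len(residuals)
    delta, start = 0.0, 0
    for n in sizes:
        delta += (n / N) * avg_distance(residuals[start:start + n], v)
        start += n
    return delta


def resampling_p_value(residuals, sizes, v=1, L=10_000, seed=0):
    """Proportion of L random arrangements with delta <= the observed delta."""
    rng = random.Random(seed)
    observed = mrpp_delta(residuals, sizes, v)
    pool = list(residuals)
    count = 0
    for _ in range(L):
        rng.shuffle(pool)  # random reassignment of residuals to groups
        if mrpp_delta(pool, sizes, v) <= observed + 1e-12:
            count += 1
    return observed, count / L


# Hypothetical residuals for four groups of four (NOT the values in Fig. 4.30).
example = [-6, -5, -4, -5, 0, 0, 1, 0, 4, 5, 6, 5, 9, 10, 11, 10]
delta_obs, p_hat = resampling_p_value(example, [4, 4, 4, 4], v=1, L=2_000)
print(delta_obs, p_hat)
```

Because the groups in this made-up example are tightly clustered, the estimated probability value should be small; with real residuals the estimate stabilizes as L grows.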

OLS Regression Analysis

For comparison, consider an MRPP resampling analysis of the OLS regression residuals calculated on the N = 16 univariate response measurement scores listed in Fig. 4.29 on p. 171. The MRPP regression analysis yields estimated OLS regression coefficients of

$$\displaystyle{\begin{array}{llll} \hat{\beta }_{0} & = +11.6875\;,\quad &\quad \hat{\beta }_{1} & = -0.2500\;,\quad \hat{\beta }_{2} = +2.00\;,\quad \hat{\beta }_{3} = +1.50\;, \\ \hat{\beta }_{4} & = +0.50\;,\quad &\quad \hat{\beta }_{5} & = +1.7500\;,\quad \mbox{ and }\hat{\beta }_{6} = +1.00\end{array} }$$

for Factor A. Figure 4.31 lists the observed \(y_{i}\) values, OLS predicted \(\hat{y}_{i}\) values, and residual \(e_{i}\) values for i = 1, …, 16.

Fig. 4.31
figure 31

Observed, predicted, and residual OLS regression values for the univariate response measurement scores listed in Fig. 4.29

Following Eq. (4.5) on p. 125 and employing squared Euclidean distance between residuals with v = 2, the N = 16 OLS regression residuals listed in Fig. 4.31 yield a = 4 average distance-function values of

$$\displaystyle{ \xi _{A_{1}} = 6.2083\;,\quad \xi _{A_{2}} = 6.4583\;,\quad \xi _{A_{3}} = 0.8750\;,\mbox{ and}\quad \xi _{A_{4}} = 2.3750\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the OLS regression residuals listed in Fig. 4.31 with v = 2 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{A_{i}} - 1} {N - a} \;,\qquad i = 1,\,\ldots,\,4\;, }$$

is

$$\displaystyle{ \delta _{A} =\sum _{ i=1}^{a}C_{ i}\xi _{i} = \frac{4 - 1} {16 - 4}\big(6.2083 + 6.4583 + 0.8750 + 2.3750\big) = 3.9792\;. }$$

If all M possible arrangements of the N = 16 observed OLS regression residuals listed in Fig. 4.31 occur with equal chance, the approximate resampling probability value of \(\delta _{A} = 3.9792\) computed on L = 1,000,000 random arrangements of the observed OLS regression residuals with \(n_{A_{1}} = \cdots = n_{A_{4}} = 4\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A}$}} {L} = \frac{1} {1,000,000} = 0.10\times 10^{-5}\;. }$$

For comparison, the approximate resampling probability value based on LAD regression, v = 1, L = 1,000,000, and \(C_{i} = n_{A_{i}}/N\) for i = 1, …, 4 is P = 0.0273.

Following Eq. (4.7) on p. 126, the exact expected value of the M = 63,063,000 δ values is \(\mu _{\delta } = 68.0083\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\hat{y}_{i}\) values, i = 1, …, N, is

$$\displaystyle{ \mathfrak{R}_{A} = 1 -\frac{\delta _{A}} {\mu _{\delta }} = 1 - \frac{3.9792} {68.0083} = +0.9415\;, }$$

indicating approximately 94 % agreement between the observed and predicted y values above that expected by chance.

Conventional ANOVA Analysis

A conventional Latin square analysis of variance calculated on the N = 16 univariate response measurement scores listed in Table 4.8 on p. 170 yields an observed F-ratio of \(F_{A} = 40.7277\). Assuming independence and normality, \(F_{A}\) is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = a - 1 = 4 - 1 = 3\) and \(\nu _{2} = (a - 2)(a - 1) = (4 - 2)(4 - 1) = 6\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{A} = 40.7277\) yields an approximate probability value of \(P = 0.2204\times 10^{-3}\).

Analysis of Factor B

A design matrix of dummy codes for analyzing Factor B is given in Fig. 4.32, where the first column of 1 values provides for an intercept, the second through fourth columns contain dummy codes for Subjects, the fifth through seventh columns contain dummy codes for Factor A, and the last column lists the univariate response measurement scores ordered by the b = 4 treatment levels of Factor B, with the first \(n_{B_{1}} = 4\) scores, the next \(n_{B_{2}} = 4\) scores, the next \(n_{B_{3}} = 4\) scores, and the last \(n_{B_{4}} = 4\) scores associated with treatment levels \(B_{1}\), \(B_{2}\), \(B_{3}\), and \(B_{4}\), respectively. The MRPP regression analysis examines the N = 16 regression residuals for possible differences among the b = 4 treatment levels of Factor B; consequently, no dummy codes are provided for Factor B as this information is implicit in the ordering of the b = 4 treatment levels of Factor B in the last column of Fig. 4.32.

Fig. 4.32
figure 32

Design matrix and univariate response measurement scores for order (B) in a Latin square design

Because there are

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{b}n_{ B_{i}}!} = \frac{16!} {(4!)^{4}} = 63,063,000 }$$

possible, equally-likely arrangements of the N = 16 univariate response measurement scores listed in Fig. 4.32, an exact permutation approach is not practical.

LAD Regression Analysis

An MRPP resampling analysis of the N = 16 LAD regression residuals calculated on the univariate response measurement scores in Fig. 4.32 yields estimated LAD regression coefficients of

$$\displaystyle\begin{array}{rcl} \tilde{\beta }_{0}& =& +21.00\;,\quad \tilde{\beta }_{1} = -2.00\;,\quad \tilde{\beta }_{2} = +2.00\;,\quad \tilde{\beta }_{3} = +1.00\;, {}\\ \tilde{\beta }_{4}& =& -13.00\;,\quad \tilde{\beta }_{5} = -11.00\;,\mbox{ and}\quad \tilde{\beta }_{6} = -7.00 {}\\ \end{array}$$

for Factor B. Figure 4.33 lists the observed \(y_{i}\) values, LAD predicted \(\tilde{y}_{i}\) values, and residual \(e_{i}\) values for i = 1, …, 16.

Fig. 4.33
figure 33

Observed, predicted, and residual LAD regression values for the univariate response measurement scores listed in Fig. 4.32

Following Eq. (4.5) on p. 125 and employing ordinary Euclidean distance between residuals with v = 1, the N = 16 LAD regression residuals listed in Fig. 4.33 yield b = 4 average distance-function values of

$$\displaystyle{ \xi _{B_{1}} = 2.00\;,\quad \xi _{B_{2}} = 2.00\;,\quad \xi _{B_{3}} = 3.1667\;,\mbox{ and}\quad \xi _{B_{4}} = 0.00\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the LAD regression residuals listed in Fig. 4.33 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{B_{i}}} {N} \;,\qquad i = 1,\,\ldots,\,4\;, }$$

is

$$\displaystyle{ \delta _{B} =\sum _{ i=1}^{b}C_{ i}\xi _{i} = \frac{4} {16}\big(2.00 + 2.00 + 3.1667 + 0.00\big) = 1.7917\;. }$$

If all M possible arrangements of the N = 16 observed LAD regression residuals listed in Fig. 4.33 occur with equal chance, the approximate resampling probability value of \(\delta _{B} = 1.7917\) computed on L = 1,000,000 random arrangements of the observed LAD regression residuals with \(n_{B_{1}} = \cdots = n_{B_{4}} = 4\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{B}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{B}$}} {L} = \frac{495,269} {1,000,000} = 0.4953\;. }$$

Following Eq. (4.7) on p. 126, the exact expected value of the M = 63,063,000 δ values is \(\mu _{\delta } = 1.8583\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, i = 1, …, N, is

$$\displaystyle{ \mathfrak{R}_{B} = 1 -\frac{\delta _{B}} {\mu _{\delta }} = 1 -\frac{1.7917} {1.8583} = +0.0359\;, }$$

indicating approximately 4 % agreement between the observed and predicted y values above that expected by chance.

OLS Regression Analysis

For comparison, consider an MRPP resampling analysis of the OLS regression residuals calculated on the N = 16 univariate response measurement scores listed in Fig. 4.32. The MRPP regression analysis yields estimated OLS regression coefficients of

$$\displaystyle\begin{array}{rcl} \hat{\beta }_{0}& =& +20.6875\;,\quad \hat{\beta }_{1} = -0.2500\;,\quad \hat{\beta }_{2} = +2.00\;,\quad \hat{\beta }_{3} = +1.50\;, {}\\ \hat{\beta }_{4}& =& -14.7500\;,\quad \hat{\beta }_{5} = -11.2500\;,\mbox{ and}\quad \hat{\beta }_{6} = -6.7500 {}\\ \end{array}$$

for Factor B. Figure 4.34 lists the observed \(y_{i}\) values, OLS predicted \(\hat{y}_{i}\) values, and residual \(e_{i}\) values for i = 1, …, 16.

Fig. 4.34
figure 34

Observed, predicted, and residual OLS regression values for the univariate response measurement scores listed in Fig. 4.32

Following Eq. (4.5) on p. 125 and employing squared Euclidean distance between residuals with v = 2, the N = 16 OLS regression residuals listed in Fig. 4.34 yield b = 4 average distance-function values of

$$\displaystyle{ \xi _{B_{1}} = 2.8750\;,\quad \xi _{B_{2}} = 6.7083\;,\quad \xi _{B_{3}} = 3.2083\;,\mbox{ and}\quad \xi _{B_{4}} = 3.1250\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the OLS regression residuals listed in Fig. 4.34 with v = 2 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{B_{i}} - 1} {N - b} \;,\qquad i = 1,\,\ldots,\,4\;, }$$

is

$$\displaystyle{ \delta _{B} =\sum _{ i=1}^{b}C_{ i}\xi _{i} = \frac{4 - 1} {16 - 4}\big(2.8750 + 6.7083 + 3.2083 + 3.1250\big) = 3.9792\;. }$$

If all M possible arrangements of the N = 16 observed OLS regression residuals listed in Fig. 4.34 occur with equal chance, the approximate resampling probability value of \(\delta _{B} = 3.9792\) computed on L = 1,000,000 random arrangements of the observed OLS regression residuals with \(n_{B_{1}} = n_{B_{2}} = n_{B_{3}} = n_{B_{4}} = 4\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{B}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{B}$}} {L} = \frac{378,875} {1,000,000} = 0.3789\;. }$$

For comparison, the approximate resampling probability value based on LAD regression, v = 1, L = 1,000,000, and \(C_{i} = n_{B_{i}}/N\) for i = 1, …, 4 is P = 0.4953.

Following Eq. (4.7) on p. 126, the exact expected value of the M = 63,063,000 δ values is \(\mu _{\delta } = 4.0750\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\hat{y}_{i}\) values, i = 1, …, N, is

$$\displaystyle{ \mathfrak{R}_{B} = 1 -\frac{\delta _{B}} {\mu _{\delta }} = 1 -\frac{3.9792} {4.0750} = +0.0235\;, }$$

indicating only approximately 2 % agreement between the observed and predicted y values above that expected by chance.

Conventional ANOVA Analysis

A conventional Latin square analysis of variance calculated on the N = 16 univariate response measurement scores listed in Table 4.8 on p. 170 yields an observed F-ratio of \(F_{B} = 0.5602\). Assuming independence and normality, \(F_{B}\) is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = b - 1 = 4 - 1 = 3\) and \(\nu _{2} = (b - 2)(b - 1) = (4 - 2)(4 - 1) = 6\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{B} = 0.5602\) yields an approximate probability value of P = 0.6606. The LAD regression, OLS regression, and F-ratio probability values of P = 0.4953, P = 0.3789, and P = 0.6606, respectively, all indicate that the order in which the treatments were administered did not matter.

4.3.7 Split-Plot Design

Imagine a testing experiment with two treatment factors, A and B, with a and b treatment levels, respectively, so that there are ab treatment combinations. If each testing session requires h hours of a subject’s time and every subject is to be treated under all treatment conditions, each subject will require ab testing sessions and abh hours of testing time. When this is unreasonable and S subjects are available, an alternative is to assign \(n = S/a\) subjects to each level of Factor A and test each subject under all levels of Factor B. The design is a repeated-measures split-plot design in which subjects are randomly assigned to the a treatment levels of Factor A (i.e., plots), and each subject is then tested under all b levels of Factor B (i.e., subplots). The design is also called a mixed factorial design with one between-subjects factor (A) and one within-subjects factor (B), or an A×(B×S) design [214].

Consider a split-plot experiment in which Factor A has a = 3 treatment levels, Factor B has b = 3 treatment levels, S = 12 subjects are randomly assigned so that n = 4 subjects receive each of the a = 3 levels of Factor A, and each subject is tested at all b = 3 levels of Factor B. The design and data are adapted from Keppel and Zedeck [215, p. 303] and are given in Fig. 4.35.

Fig. 4.35
figure 35

Example univariate response measurements for a split-plot design

Analysis of Factor A

A design matrix of effect codes for an MRPP regression analysis of Factor A is given in Fig. 4.36, where the first column of 1 values provides for an intercept and the second column lists, for each subject, the summation of the response measurements over the b = 3 levels of Factor B (e.g., \(53 + 51 + 35 = 139\)). The summations are ordered by the a = 3 treatment levels of Factor A with the first \(n_{A_{1}} = 4\) summations, the next \(n_{A_{2}} = 4\) summations, and the last \(n_{A_{3}} = 4\) summations associated with treatment levels \(A_{1}\), \(A_{2}\), and \(A_{3}\), respectively. The MRPP regression analysis examines the N = 12 regression residuals for possible differences among the a = 3 treatment levels of Factor A; consequently, no effect codes are provided for Factor A as this information is implicit in the ordering of the a = 3 treatment levels of Factor A in the second column of Fig. 4.36.

Fig. 4.36
figure 36

Design matrix and response measurement summations for the main effects of Factor A in a split-plot design

An exact permutation solution is feasible for the response measurement summations listed in Fig. 4.36 since there are only

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{a}n_{ A_{i}}!} = \frac{12!} {(4!)^{3}} = 34,650 }$$

possible, equally-likely arrangements of the N = 12 response measurement summations for Factor A.

LAD Regression Analysis

An MRPP analysis of the N = 12 LAD regression residuals calculated on the response measurement summations in Fig. 4.36 yields an estimated LAD regression coefficient of \(\tilde{\beta }_{0} = +101.00\) for Factor A. Figure 4.37 lists the observed \(y_{i}\) values, LAD predicted \(\tilde{y}_{i}\) values, and residual \(e_{i}\) values for i = 1, …, 12.

Fig. 4.37
figure 37

Observed, predicted, and residual LAD regression values for the response measurement summations listed in Fig. 4.36
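With an intercept-only design matrix, as in this analysis of Factor A, the two fitting criteria reduce to familiar summaries: the LAD estimate of the intercept is a median of the y values and the OLS estimate is their mean. The short Python sketch below illustrates this with made-up totals (not the values in Fig. 4.36); the helper names are hypothetical. With an even number of observations, any value between the two middle order statistics minimizes the sum of absolute deviations, so the midpoint returned by `statistics.median` is only one of several LAD solutions.

```python
from statistics import mean, median


def sum_abs_dev(y, b0):
    """LAD criterion for an intercept-only model."""
    return sum(abs(yi - b0) for yi in y)


def sum_sq_dev(y, b0):
    """OLS criterion for an intercept-only model."""
    return sum((yi - b0) ** 2 for yi in y)


# Hypothetical subject totals, used only to illustrate the two criteria.
y = [140, 128, 117, 109, 104, 101, 99, 96, 92, 88, 81, 72]

b0_lad = median(y)  # one minimizer of the sum of absolute deviations
b0_ols = mean(y)    # the unique minimizer of the sum of squared deviations

# Compare against a grid of candidate intercepts from 70.0 to 140.0.
grid = [70 + 0.5 * k for k in range(141)]
best_lad = min(sum_abs_dev(y, b) for b in grid)
best_ols = min(sum_sq_dev(y, b) for b in grid)
print(b0_lad, sum_abs_dev(y, b0_lad), best_lad)
print(b0_ols, sum_sq_dev(y, b0_ols), best_ols)
```

No grid value gives a smaller criterion than the median does for LAD or than the mean does for OLS.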

Following Eq. (4.5) on p. 125 and employing ordinary Euclidean distance between residuals with v = 1, the N = 12 LAD regression residuals listed in Fig. 4.37 yield a = 3 average distance-function values of

$$\displaystyle{ \xi _{A_{1}} = 20.00\;,\quad \xi _{A_{2}} = 26.6667\;,\mbox{ and}\quad \xi _{A_{3}} = 11.6667\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the LAD regression residuals listed in Fig. 4.37 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{A_{i}}} {N} \;,\quad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{A} =\sum _{ i=1}^{a}C_{ i}\xi _{i} = \frac{4} {12}\big(20.00 + 26.6667 + 11.6667\big) = 19.4444\;. }$$

If all arrangements of the N = 12 observed LAD regression residuals listed in Fig. 4.37 occur with equal chance, the exact probability value of \(\delta _{A} = 19.4444\) calculated on the M = 34,650 possible arrangements of the observed LAD regression residuals with \(n_{A_{1}} = n_{A_{2}} = n_{A_{3}} = 4\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A}$}} {M} = \frac{672} {34,650} = 0.0194\;. }$$
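An exact analysis of this size can be carried out by direct enumeration. The Python sketch below generates every assignment of N residuals to groups of fixed sizes and counts those with δ no larger than the observed value. It is a minimal illustration: the function names are hypothetical, the residuals in the usage lines are made up (Fig. 4.37 is not reproduced here), and the distance and weights follow the v = 1 conventions used above, \(\vert e_{j} - e_{k}\vert ^{v}\) and \(C_{i} = n_{i}/N\).

```python
from itertools import combinations


def partitions(indices, sizes):
    """Yield every way to split `indices` into ordered groups of the given sizes."""
    if len(sizes) == 1:
        yield (tuple(indices),)
        return
    for first in combinations(indices, sizes[0]):
        chosen = set(first)
        rest = [i for i in indices if i not in chosen]
        for tail in partitions(rest, sizes[1:]):
            yield (first,) + tail


def exact_p_value(residuals, sizes, v=1):
    """Return (observed delta, number of arrangements with delta <= observed, M)."""
    N = len(residuals)

    def delta(groups):
        total = 0.0
        for g in groups:
            pairs = list(combinations(g, 2))
            xi = sum(abs(residuals[j] - residuals[k]) ** v for j, k in pairs) / len(pairs)
            total += (len(g) / N) * xi
        return total

    # The observed arrangement takes the residuals in consecutive blocks.
    observed_groups, start = [], 0
    for n in sizes:
        observed_groups.append(tuple(range(start, start + n)))
        start += n
    observed = delta(observed_groups)

    M = count = 0
    for groups in partitions(list(range(N)), sizes):
        M += 1
        if delta(groups) <= observed + 1e-9:  # tolerance for floating-point ties
            count += 1
    return observed, count, M


# Hypothetical residuals for three groups of four (NOT the values in Fig. 4.37).
example = [38, 29, 14, 20, -3, 1, -25, -9, -12, -20, -29, -4]
d_obs, n_le, M = exact_p_value(example, [4, 4, 4], v=1)
print(d_obs, n_le, M, n_le / M)
```

For three groups of four the loop visits all M = 34,650 arrangements, which takes well under a minute in pure Python; the same code becomes impractical for the much larger reference sets encountered elsewhere in this section.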

Following Eq. (4.7) on p. 126, the exact expected value of the M = 34,650 δ values is \(\mu _{\delta } = 27.00\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, i = 1, …, N, is

$$\displaystyle{ \mathfrak{R}_{A} = 1 -\frac{\delta _{A}} {\mu _{\delta }} = 1 -\frac{19.4444} {27.00} = +0.2798\;, }$$

indicating approximately 28 % agreement between the observed and predicted y values above that expected by chance.

OLS Regression Analysis

For comparison, consider an MRPP analysis of OLS regression residuals calculated on the response measurement summations for Factor A in Fig. 4.36. The MRPP regression analysis yields an estimated OLS regression coefficient of \(\hat{\beta }_{0} = +100.50\) for Factor A. Figure 4.38 lists the observed \(y_{i}\) values, OLS predicted \(\hat{y}_{i}\) values, and residual \(e_{i}\) values for i = 1, …, 12.

Fig. 4.38
figure 38

Observed, predicted, and residual OLS regression values for the response measurement summations listed in Fig. 4.36

Following Eq. (4.5) on p. 125 and employing squared Euclidean distance between residuals with v = 2, the N = 12 OLS regression residuals listed in Fig. 4.38 yield a = 3 average distance-function values of

$$\displaystyle{ \xi _{A_{1}} = 493.3333\;,\quad \xi _{A_{2}} = 909.3333\;,\mbox{ and}\quad \xi _{A_{3}} = 179.3333\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the OLS regression residuals listed in Fig. 4.38 with v = 2 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{A_{i}} - 1} {N - a} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{A} =\sum _{ i=1}^{a}C_{ i}\xi _{i} = \frac{4 - 1} {12 - 3}\big(493.3333 + 909.3333 + 179.3333\big) = 527.3333\;. }$$

If all arrangements of the N = 12 observed OLS regression residuals listed in Fig. 4.38 occur with equal chance, the exact probability value of \(\delta _{A} = 527.3333\) computed on the M = 34,650 possible arrangements of the observed OLS regression residuals with \(n_{A_{1}} = n_{A_{2}} = n_{A_{3}} = 4\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A}$}} {M} = \frac{564} {34,650} = 0.0163\;. }$$

For comparison, the exact probability value based on LAD regression, v = 1, M = 34,650, and \(C_{i} = n_{A_{i}}/N\) for i = 1, 2, 3 is P = 0.0194.

Following Eq. (4.7) on p. 126, the exact expected value of the M = 34,650 δ values is \(\mu _{\delta } = 1,082.7273\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\hat{y}_{i}\) values, i = 1, …, N, is

$$\displaystyle{ \mathfrak{R}_{A} = 1 -\frac{\delta _{A}} {\mu _{\delta }} = 1 - \frac{527.3333} {1,082.7273} = +0.5130\;, }$$

indicating approximately 51 % agreement between the observed and predicted y values above that expected by chance.

Conventional ANOVA Analysis

A conventional split-plot analysis of variance calculated on the N = 12 univariate response measurement scores listed in Fig. 4.35 on p. 180 yields an observed F-ratio of \(F_{A} = 6.7927\). Assuming independence, normality, and homogeneity of variance, \(F_{A}\) is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = a - 1 = 3 - 1 = 2\) and \(\nu _{2} = a(n - 1) = 3(4 - 1) = 9\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{A} = 6.7927\) yields an approximate probability value of P = 0.0159.

Analysis of Factor B

A design matrix of effect codes for an MRPP regression analysis of Factor B is given in Table 4.9, where the first column of 1 values provides for an intercept, the next 11 columns contain effect codes for Subjects nested within Factor A, and the next four columns contain effect codes for the A×B interaction. The last column lists the N = 36 univariate response measurement scores ordered by the b = 3 treatment levels of Factor B, with the first \(n_{B_{1}} = 12\) scores, the next \(n_{B_{2}} = 12\) scores, and the last \(n_{B_{3}} = 12\) scores associated with treatment levels \(B_{1}\), \(B_{2}\), and \(B_{3}\), respectively. The MRPP regression analysis examines the N = 36 regression residuals for possible differences among the b = 3 treatment levels of Factor B; consequently, no effect codes are provided for Factor B as this information is implicit in the ordering of the b = 3 treatment levels of Factor B in the last column of Table 4.9.

Table 4.9 Design matrix and univariate response measurement scores for the main effects of Factor B

Because there are

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{b}n_{ B_{i}}!} = \frac{36!} {(12!)^{3}} = 3,384,731,762,521,200 }$$

possible, equally-likely arrangements of the N = 36 univariate response measurement scores listed in Table 4.9, an exact permutation approach is not practical.

LAD Regression Analysis

An MRPP resampling analysis of the LAD regression residuals calculated on the N = 36 univariate response measurement scores in Table 4.9 yields estimated LAD regression coefficients of

$$\displaystyle{\begin{array}{llllllll} \tilde{\beta }_{0} & = +35.50\;,\quad &\quad \tilde{\beta }_{1} & = +9.8333\;,\quad &\quad \tilde{\beta }_{2} & = -7.1667\;,\quad &\quad \tilde{\beta }_{3} & = +2.8333\;, \\ \tilde{\beta }_{4} & = +5.8333\;,\quad &\quad \tilde{\beta }_{5} & = +4.8333\;,\quad &\quad \tilde{\beta }_{6} & = -0.1667\;,\quad &\quad \tilde{\beta }_{7} & = -20.1667\;, \\ \tilde{\beta }_{8} & = -17.1667\;,\quad &\quad \tilde{\beta }_{9} & = +2.8333\;,\quad &\quad \tilde{\beta }_{10} & = +0.8333\;,\quad &\quad \tilde{\beta }_{11} & = +9.8333\;, \\ \tilde{\beta }_{12} & = +0.6667\;,\quad &\quad \tilde{\beta }_{13} & = +6.6667\;,\quad &\quad \tilde{\beta }_{14} & = +5.6667\;,\quad &\quad \mbox{ and} \\ \tilde{\beta }_{15} & = -2.3333\end{array} }$$

for Factor B. Figure 4.39 lists the observed \(y_{i}\) values, LAD predicted \(\tilde{y}_{i}\) values, and residual \(e_{i}\) values for i = 1, …, 36.

Fig. 4.39
figure 39

Observed, predicted, and residual LAD regression values for the univariate response measurement scores listed in Table 4.9

Following Eq. (4.5) on p. 125 and employing ordinary Euclidean distance between residuals with v = 1, the N = 36 LAD regression residuals listed in Fig. 4.39 yield b = 3 average distance-function values of

$$\displaystyle{ \xi _{B_{1}} = 8.6061\;,\quad \xi _{B_{2}} = 1.3182\;,\mbox{ and}\quad \xi _{B_{3}} = 13.5606\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the LAD regression residuals listed in Fig. 4.39 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{B_{i}}} {N} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{B} =\sum _{ i=1}^{b}C_{ i}\xi _{i} = \frac{12} {36}\big(8.6061 + 1.3182 + 13.5606\big) = 7.8283\;. }$$

If all M possible arrangements of the N = 36 observed LAD regression residuals listed in Fig. 4.39 occur with equal chance, the approximate resampling probability value of \(\delta _{B} = 7.8283\) computed on L = 1,000,000 random arrangements of the observed LAD regression residuals with \(n_{B_{1}} = n_{B_{2}} = n_{B_{3}} = 12\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{B}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{B}$}} {L} = \frac{0} {1,000,000} = 0.00\;, }$$

which may be interpreted as a probability of less than one in a million.

When M is very large and the probability of an observed δ is extremely small, as in this case, resampling permutation procedures sometimes yield an estimated probability of zero, even with L = 1,000,000 random arrangements of the observed regression residuals. A reanalysis of Factor B using L = 10,000,000 random arrangements of the observed data yielded an identical resampling probability value of P = 0.00. Moment-approximation permutation procedures, described briefly in Chap. 1, Sect. 1.2.2, can often provide results in these extreme situations. The moment approximation of a test statistic requires computation of the exact moments of the test statistic, assuming equally-likely arrangements of the observed regression residuals [284, 300]. Usually, the first three exact moments are used: the exact mean, \(\mu _{\delta }\), the exact variance, \(\sigma _{\delta }^{2}\), and the exact skewness, \(\gamma _{\delta }\), of δ. The three moments are then used to fit a specified distribution, such as a Pearson type III distribution, that approximates the underlying discrete permutation distribution and provides an approximate probability value. For Factor B, a moment-approximation procedure yields \(\delta _{B} = 7.8283\), \(\mu _{\delta } = 12.5460\), \(\sigma _{\delta }^{2} = 0.1675\), \(\gamma _{\delta } = -1.3580\), a standardized test statistic of

$$\displaystyle{ T_{B} = \frac{\delta _{B} -\mu _{\delta }} {\sigma _{\delta }} = \frac{7.8283 - 12.5460} {\sqrt{0.1675}} = -11.5272\;, }$$

and a Pearson type III approximate probability value of \(P = 0.1495\times 10^{-6}\).
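The final step of the moment approximation can be sketched numerically. The Python code below assumes that the Pearson type III distribution is parameterized as a standardized gamma variate reflected about zero, with shape \(k = 4/\gamma _{\delta }^{2}\), so that for negative skewness \(P(T \leq t)\) equals the upper-tail probability of a Gamma(k) variate at \(k - t\sqrt{k}\). That parameterization, the function names, and the use of Simpson's rule in place of a library incomplete-gamma routine are choices made for this illustration; the exact moments themselves are taken from the text rather than computed.

```python
import math


def gamma_upper_tail(shape, x, width=60.0, steps=20_000):
    """Simpson's-rule approximation to P(G >= x) for G ~ Gamma(shape, scale 1), x > 0.

    The integral is truncated at x + width, where the integrand is negligible.
    """
    def pdf(t):
        return math.exp((shape - 1.0) * math.log(t) - t - math.lgamma(shape))

    h = width / steps  # steps must be even for Simpson's rule
    total = pdf(x) + pdf(x + width)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(x + i * h)
    return total * h / 3.0


def pearson3_lower_tail(t, skew):
    """P(T <= t) for a standardized Pearson type III variate with negative skewness."""
    if skew >= 0:
        raise ValueError("this sketch handles negative skewness only")
    shape = 4.0 / skew ** 2
    x = shape - math.sqrt(shape) * t  # T <= t  is equivalent to  G >= x
    if x <= 0:
        return 1.0
    return gamma_upper_tail(shape, x)


# Moments reported in the text for Factor B (LAD residuals).
delta_b, mu, var, skew = 7.8283, 12.5460, 0.1675, -1.3580
t_b = (delta_b - mu) / math.sqrt(var)
p_b = pearson3_lower_tail(t_b, skew)
print(round(t_b, 4), p_b)
```

Under these assumptions the result should be close to the reported standardized statistic and to a probability value of roughly \(0.15\times 10^{-6}\); small differences in the last digit can arise from rounding of the reported moments.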

Following Eq. (4.7) on p. 126, the exact expected value of the M δ values is \(\mu _{\delta } = 12.5460\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, i = 1, …, N, is

$$\displaystyle{ \mathfrak{R}_{B} = 1 -\frac{\delta _{B}} {\mu _{\delta }} = 1 - \frac{7.8283} {12.5460} = +0.3760\;, }$$

indicating approximately 38 % agreement between the observed and predicted y values above that expected by chance.

OLS Regression Analysis

For comparison, consider an MRPP analysis of OLS regression residuals calculated on the N = 36 univariate response measurement scores for Factor B in Table 4.9 on p. 184. The MRPP regression analysis yields estimated OLS regression coefficients of

$$\displaystyle{\begin{array}{llllllll} \hat{\beta }_{0} & = +33.50\;,\quad &\quad \hat{\beta }_{1} & = +12.8333\;,\quad &\quad \hat{\beta }_{2} & = +0.1667\;,\quad &\quad \hat{\beta }_{3} & = +7.50\;, \\ \hat{\beta }_{4} & = +5.50\;,\quad &\quad \hat{\beta }_{5} & = +1.50\;,\quad &\quad \hat{\beta }_{6} & = -5.1667\;,\quad &\quad \hat{\beta }_{7} & = -12.50\;, \\ \hat{\beta }_{8} & = -13.8333\;,\quad &\quad \hat{\beta }_{9} & = +2.8333\;,\quad &\quad \hat{\beta }_{10} & = -1.8333\;,\quad &\quad \hat{\beta }_{11} & = +4.50\;, \\ \hat{\beta }_{12} & = -1.7500\;,\quad &\quad \hat{\beta }_{13} & = +5.7500\;,\quad &\quad \hat{\beta }_{14} & = +1.50\;,\quad &\quad \mbox{ and} \\ \hat{\beta }_{15} & = -2.7500 \end{array} }$$

for Factor B. Figure 4.40 lists the observed \(y_{i}\) values, OLS predicted \(\hat{y}_{i}\) values, and residual \(e_{i}\) values for i = 1, …, 36.

Fig. 4.40
figure 40

Observed, predicted, and residual OLS regression values for the response measurement scores listed in Table 4.9

Following Eq. (4.5) on p. 125 and employing squared Euclidean distance between residuals with v = 2, the N = 36 OLS regression residuals listed in Fig. 4.40 yield b = 3 average distance-function values of

$$\displaystyle{ \xi _{B_{1}} = 30.2727\;,\quad \xi _{B_{2}} = 46.4394\;,\mbox{ and}\quad \xi _{B_{3}} = 16.5606\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the OLS regression residuals listed in Fig. 4.40 with v = 2 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{B_{i}} - 1} {N - b} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{B} =\sum _{ i=1}^{b}C_{ i}\xi _{i} = \frac{12 - 1} {36 - 3}\big(30.2727 + 46.4394 + 16.5606\big) = 31.0909\;. }$$

If all M possible arrangements of the N = 36 observed OLS regression residuals listed in Fig. 4.40 occur with equal chance, the approximate resampling probability value of \(\delta_{B} = 31.0909\) computed on L = 1,000,000 random arrangements of the observed OLS regression residuals with \(n_{B_{1}} = n_{B_{2}} = n_{B_{3}} = 12\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{B}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{B}$}} {L} = \frac{0} {1,000,000} = 0.00\;, }$$

i.e., a probability of less than one in a million. For comparison, the approximate resampling probability value based on LAD regression, v = 1, L = 1,000,000, and \(C_{i} = n_{B_{i}}/N\) for i = 1, 2, 3 is also P = 0.00.
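The resampling procedure itself is straightforward to sketch. The Python code below is a minimal illustration, not the program used to produce the results in this chapter: the function names, the example residuals, and the reduced value of L are all assumptions introduced here, since the residuals of Fig. 4.40 are not reproduced in the text. The residuals are assumed to be ordered group by group, and the two weighting schemes correspond to \(C_{i} = n_{i}/N\) (used with v = 1) and \(C_{i} = (n_{i} - 1)/(N - g)\) (used with v = 2).

```python
import random
from itertools import combinations


def mrpp_delta(residuals, sizes, v=1, weighting="n/N"):
    """Weighted mean within-group distance for residuals ordered group by group."""
    n_total, g = sum(sizes), len(sizes)
    delta, start = 0.0, 0
    for n in sizes:
        group = residuals[start:start + n]
        start += n
        pairs = list(combinations(group, 2))
        # Average of |e_k - e_l|^v over all pairs within the group.
        xi = sum(abs(a - b) ** v for a, b in pairs) / len(pairs)
        if weighting == "n/N":
            c = n / n_total
        elif weighting == "(n-1)/(N-g)":
            c = (n - 1) / (n_total - g)
        else:
            raise ValueError("unknown weighting scheme")
        delta += c * xi
    return delta


def resampling_p_value(residuals, sizes, v=1, weighting="n/N", L=10_000, seed=0):
    """Proportion of L random arrangements with delta <= the observed delta."""
    rng = random.Random(seed)
    observed = mrpp_delta(residuals, sizes, v, weighting)
    shuffled = list(residuals)
    count = 0
    for _ in range(L):
        rng.shuffle(shuffled)  # random reallocation, group sizes preserved
        if mrpp_delta(shuffled, sizes, v, weighting) <= observed + 1e-12:
            count += 1
    return observed, count / L


# Hypothetical residuals for three groups of four (not the data of Fig. 4.40).
example = [-5.0, -4.0, -6.0, -5.5, 0.0, 1.0, -1.0, 0.5, 5.0, 6.0, 4.0, 5.5]
observed, p_value = resampling_p_value(example, (4, 4, 4), v=2,
                                       weighting="(n-1)/(N-g)")
print(f"delta = {observed:.4f}, resampling P = {p_value:.4f}")
```

When the groups are as well separated as in this hypothetical example, very few random arrangements yield a δ as small as the observed value, which is the situation in which a resampling probability value of zero can arise.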

As with the analysis of the LAD regression residuals listed in Fig. 4.39 on p. 186, when M is large and the probability of an observed δ is very small, an alternative moment procedure based on the exact mean, \(\mu_{\delta}\), exact variance, \(\sigma _{\delta }^{2}\), and exact skewness, \(\gamma_{\delta}\), of δ can be employed to obtain approximate probability values; see Chap. 1, Sect. 1.2.2. For Factor B, a moment-approximation procedure yields \(\delta_{B} = 31.0909\), \(\mu_{\delta} = 199.2857\), \(\sigma _{\delta }^{2} = 134.8578\), \(\gamma _{\delta } = -1.7697\), a standardized test statistic of

$$\displaystyle{ T_{B} = \frac{\delta _{B} -\mu _{\delta }} {\sigma _{\delta }} = \frac{31.0909 - 199.2857} {\sqrt{134.8578}} = -14.4835\;, }$$

and a Pearson type III approximate probability value of \(P = 0.5420\times 10^{-7}\).

Following Eq. (4.7) on p. 126, the exact expected value of the δ values is \(\mu_{\delta} = 199.2857\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\hat{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{B} = 1 -\frac{\delta _{B}} {\mu _{\delta }} = 1 - \frac{31.0909} {199.2857} = +0.8440\;, }$$

indicating approximately 84 % agreement between the observed and predicted y values above that expected by chance.

Conventional ANOVA Analysis

A conventional split-plot analysis of variance calculated on the N = 36 univariate response measurement scores listed in Fig. 4.35 on p. 180 yields an observed F-ratio of \(F_{B} = 52.1842\). Assuming independence and normality, \(F_{B}\) is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = b - 1 = 3 - 1 = 2\) and \(\nu _{2} = a(n - 1)(b - 1) = 3(4 - 1)(3 - 1) = 18\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{B} = 52.1842\) yields an approximate probability value of \(P = 0.3224\times 10^{-7}\).

Analysis of the A×B Interaction

A design matrix of effect codes for an MRPP regression analysis of the A×B interaction is given in Table 4.10, where the first column of 1 values provides for an intercept, the next 11 columns contain effect codes for Subjects nested within Factor A, and the next two columns contain effect codes for Factor B. The last column lists the N = 36 univariate response measurement scores ordered by the \(ab = (3)(3) = 9\) levels of the A×B interaction. The MRPP regression analysis examines the N = 36 regression residuals for possible differences among the nine treatment levels of the A×B interaction; consequently, no effect codes are provided for the A×B interaction as this information is implicit in the ordering of the treatment levels of the A×B interaction in the last column of Table 4.10.

Table 4.10 Design matrix and univariate response measurement scores for the interaction effects of Factors A and B

Because there are

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{ab}n_{ (A\times B)_{i}}!} = \frac{36!} {(4!)^{9}} = 140,810,154,080,474,667,338,550,000,000 }$$

possible, equally-likely arrangements of the N = 36 univariate response measurement scores listed in Table 4.10, an exact permutation approach is not possible.
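The number of possible arrangements can be computed exactly with integer arithmetic, which makes it easy to see why complete enumeration is impractical. The following Python sketch uses a hypothetical helper, `number_of_arrangements`, to evaluate \(M = N!/\prod n_{i}!\) for the nine interaction cells of size 4 and, for comparison, for three groups of size 12.

```python
from math import factorial, prod


def number_of_arrangements(sizes):
    """Return M = N! / (n_1! * n_2! * ... * n_g!) as an exact integer."""
    n_total = sum(sizes)
    return factorial(n_total) // prod(factorial(n) for n in sizes)


m_interaction = number_of_arrangements([4] * 9)     # 36! / (4!)^9
m_factor = number_of_arrangements([12, 12, 12])     # 36! / (12!)^3

print(f"{m_interaction:,}")
print(f"{m_factor:,}")
```

Even at a billion arrangements per second, enumerating a set of this size is out of reach, which is why resampling and moment approximations are used instead.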

LAD Regression Analysis

An MRPP resampling analysis of the N = 36 LAD regression residuals calculated on the univariate response measurement scores in Table 4.10 yields estimated LAD regression coefficients of

$$\displaystyle{\begin{array}{llllllll} \tilde{\beta }_{0} & = +34.00\;,\;\; &\quad \tilde{\beta }_{1} = +12.6667\;,\;\; &\tilde{\beta }_{2} & = -3.3333\;,\;\;&\quad \tilde{\beta }_{3} & = +6.6667\;, \\ \tilde{\beta }_{4} & = +4.6667\;,\;\; &\quad \tilde{\beta }_{5} = +4.6667\;,\;\; &\tilde{\beta }_{6} & = -4.3333\;,\;\;&\quad \tilde{\beta }_{7} & = -11.3333\;, \\ \tilde{\beta }_{8} & = -16.3333\;,\;\;&\quad \tilde{\beta }_{9} = +2.6667\;,\;\; &\tilde{\beta }_{10} & = -1.3333\;,\;\;&\quad \tilde{\beta }_{11} & = +7.6667\;, \\ \tilde{\beta }_{12} & = +8.3333\;, &\ \ \mbox{ and }\tilde{\beta }_{13} = +3.3333\end{array} }$$

for the interaction of Factors A and B. Figure 4.41 lists the observed \(y_{i}\) values, LAD predicted \(\tilde{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,36\).

Fig. 4.41 Observed, predicted, and residual LAD regression values for the univariate response measurement scores listed in Table 4.10

Following Eq. (4.5) on p. 125 and employing ordinary Euclidean distance between residuals with v = 1, the N = 36 LAD regression residuals listed in Fig. 4.41 yield \(ab = (3)(3) = 9\) average distance-function values of

$$\displaystyle{\begin{array}{lllll} \xi _{(A\times B)_{1}} & = 7.50\;,\quad &\quad \xi _{(A\times B)_{2}} & = 6.1667\;,\quad &\quad \xi _{(A\times B)_{3}} = 6.6667\;, \\ \xi _{(A\times B)_{4}} & = 3.1667\;,\quad &\quad \xi _{(A\times B)_{5}} & = 7.3333\;,\quad &\quad \xi _{(A\times B)_{6}} = 5.6667\;, \\ \xi _{(A\times B)_{7}} & = 2.00\;,\quad &\quad \xi _{(A\times B)_{8}} & = 6.8333\;,\quad &\quad \mbox{ and }\xi _{(A\times B)_{9}} = 2.00\;.\end{array} }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the LAD regression residuals listed in Fig. 4.41 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{(A\times B)_{i}}} {N} \;,\qquad i = 1,\,\ldots,\,9\;, }$$

is

$$\displaystyle\begin{array}{rcl} \delta _{A\times B}& =& \sum _{i=1}^{ab}C_{ i}\xi _{i} = \frac{4} {36}\big(7.50 + 6.1667 + 6.6667 {}\\ & & \phantom{\sum _{i=1}^{ab}C_{ i}\xi _{i} = \frac{4} {36}(7.50 + 6.1667 + 6} + \cdots + 6.8333 + 2.00\big) = 5.2593\;. {}\\ \end{array}$$

If all M possible arrangements of the N = 36 observed LAD regression residuals listed in Fig. 4.41 occur with equal chance, the approximate resampling probability value of \(\delta_{A\times B} = 5.2593\) computed on L = 1,000,000 random arrangements of the observed LAD regression residuals with \(n_{(A\times B)_{1}} = \cdots = n_{(A\times B)_{9}} = 4\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A\times B}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A\times B}$}} {L} = \frac{140,219} {1,000,000} = 0.1402\;. }$$

Following Eq. (4.7) on p. 126, the exact expected value of the δ values is \(\mu_{\delta} = 5.6825\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{A\times B} = 1 -\frac{\delta _{A\times B}} {\mu _{\delta }} = 1 -\frac{5.2593} {5.6825} = +0.0745\;, }$$

indicating approximately 7 % agreement between the observed and predicted y values above that expected by chance.

OLS Regression Analysis

For comparison, consider an MRPP analysis of OLS regression residuals calculated on the N = 36 response measurement scores for the A×B interaction in Table 4.10. The MRPP regression analysis yields estimated OLS regression coefficients of

$$\displaystyle{\begin{array}{llllllll} \hat{\beta }_{0} & = +33.50\;,\;\; &\quad \hat{\beta }_{1} = +12.8333\;,\;\; &\hat{\beta }_{2} & = +0.1667\;,\;\;\;\;&\hat{\beta }_{3} & = +7.50\;, \\ \hat{\beta }_{4} & = +5.50\;,\;\; &\quad \hat{\beta }_{5} = +1.50\;,\;\; &\hat{\beta }_{6} & = -5.1667\;,\;\;\;\;&\hat{\beta }_{7} & = -12.50\;, \\ \hat{\beta }_{8} & = -13.8333\;,\;\;&\quad \hat{\beta }_{9} = +2.8333\;,\;\; &\hat{\beta }_{10} & = -1.8333\;,\;\;\;\;&\hat{\beta }_{11} & = +4.50\;, \\ \hat{\beta }_{12} & = +9.50\;, &\ \mbox{ and }\hat{\beta }_{13} = +2.7500 \end{array} }$$

for the interaction of Factors A and B. Figure 4.42 lists the observed \(y_{i}\) values, OLS predicted \(\hat{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,36\).

Fig. 4.42 Observed, predicted, and residual OLS regression values for the univariate response measurement scores listed in Table 4.10

Following Eq. (4.5) on p. 125 and employing squared Euclidean distance between residuals with v = 2, the N = 36 OLS regression residuals listed in Fig. 4.42 yield \(ab = (3)(3) = 9\) average distance-function values of

$$\displaystyle\begin{array}{rcl} \xi _{(A\times B)_{1}}& =& 56.2037\;,\quad \xi _{(A\times B)_{2}} = 16.6481\;,\quad \xi _{(A\times B)_{3}} = 38.1481\;, {}\\ \xi _{(A\times B)_{4}}& =& 26.4259\;,\quad \xi _{(A\times B)_{5}} = 98.8148\;,\quad \xi _{(A\times B)_{6}} = 45.0370\;, {}\\ \xi _{(A\times B)_{7}}& =& 15.2593\;,\quad \xi _{(A\times B)_{8}} = 35.7593\;,\quad \mbox{ and }\xi _{(A\times B)_{9}} =\phantom{ 1}9.7037\;. {}\\ \end{array}$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the OLS regression residuals listed in Fig. 4.42 with v = 2 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{(A\times B)_{i}} - 1} {N - ab} \;,\qquad i = 1,\,\ldots,\,9\;, }$$

is

$$\displaystyle\begin{array}{rcl} \delta _{A\times B}& =& \sum _{i=1}^{ab}C_{ i}\xi _{i} = \frac{4 - 1} {36 - 9}\big(56.2037 + 16.6481 + 38.1481 {}\\ & & \phantom{\sum _{i=1}^{ab}C_{ i}\xi _{i} = \frac{4 - 1} {36 - 9}()56.2037 +} +\cdots + 35.7593 + 9.7037\big) = 38.00\;. {}\\ \end{array}$$

If all M possible arrangements of the N = 36 observed OLS regression residuals listed in Fig. 4.42 occur with equal chance, the approximate resampling probability value of \(\delta_{A\times B} = 38.00\) calculated on L = 1,000,000 random arrangements of the observed OLS regression residuals with \(n_{(A\times B)_{1}} = \cdots = n_{(A\times B)_{9}} = 4\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A\times B}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A\times B}$}} {L} = \frac{72,276} {1,000,000} = 0.0723\;. }$$

For comparison, the approximate resampling probability value based on LAD regression, v = 1, L = 1,000,000, and \(C_{i} = n_{(A\times B)_{i}}/N\) for \(i = 1,\,\ldots,\,9\) is P = 0.1402.

Following Eq. (4.7) on p. 126, the exact expected value of the δ values is \(\mu_{\delta} = 47.6286\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\hat{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{A\times B} = 1 -\frac{\delta _{A\times B}} {\mu _{\delta }} = 1 - \frac{38.00} {47.6286} = +0.2022\;, }$$

indicating approximately 20 % agreement between the observed and predicted y values above that expected by chance.

Conventional ANOVA Analysis

A conventional split-plot analysis of variance calculated on the N = 36 univariate response measurement scores listed in Fig. 4.35 on p. 180 yields an observed F-ratio of \(F_{A\times B} = 2.8114\). Assuming independence, normality, and homogeneity of variance, \(F_{A\times B}\) is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = (a - 1)(b - 1) = (3 - 1)(3 - 1) = 4\) and \(\nu _{2} = a(n - 1)(b - 1) = 3(4 - 1)(3 - 1) = 18\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{A\times B} = 2.8114\) yields an approximate probability value of P = 0.0565, which is similar to the probability value of P = 0.0723 obtained with the OLS regression analysis.

4.3.8 Nested Design

It is sometimes necessary to compare treatment groups when one independent variable is nested under a second independent variable. Two-factor nested analysis-of-variance designs occur whenever one factor is not completely crossed with the second factor. Consider a nested design to compare a = 3 levels of Factor A on scores obtained from b = 3 levels of Factor B, with \(B_{1}\), \(B_{2}\), and \(B_{3}\) of Factor B in level \(A_{1}\); \(B_{4}\), \(B_{5}\), and \(B_{6}\) of Factor B in level \(A_{2}\); and \(B_{7}\), \(B_{8}\), and \(B_{9}\) of Factor B in level \(A_{3}\). Thus, Factor B is said to be nested under Factor A. The univariate data for this example are listed in Table 4.11 for a sample of n = 4 objects randomly chosen from each of the \(ab = (3)(3) = 9\) levels of Factors A and B.

Table 4.11 Example univariate response measurement scores for a nested design with b = 3 levels of Factor B nested under a = 3 levels of Factor A

Analysis of Factor A

A design matrix of effect codes for an MRPP regression analysis of Factor A is given in Table 4.12, where the first column of 1 values provides for an intercept, the next two columns contain the effect codes for Factor B, and the last column contains the univariate response measurement scores listed according to the original random assignment of the N = 36 objects to the a = 3 levels of Factor A with the first \(n_{A_{1}} = 12\) scores, the next \(n_{A_{2}} = 12\) scores, and the last \(n_{A_{3}} = 12\) scores associated with the a = 3 levels of Factor A, respectively. The MRPP regression analysis examines the N = 36 regression residuals for possible differences among the a = 3 treatment levels of Factor A; consequently, no effect codes are provided for Factor A as this information is implicit in the ordering of the a = 3 levels of Factor A in the last column of Table 4.12.

Table 4.12 Design matrix and univariate response measurement scores for an analysis of Factor A with Factor B nested under Factor A

Because there are

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{a}n_{ A_{i}}!} = \frac{36!} {(12!)^{3}} = 3,384,731,762,521,200 }$$

possible, equally-likely arrangements of the N = 36 univariate response measurement scores listed in Table 4.11, an exact permutation approach is not possible.

LAD Regression Analysis

An MRPP resampling analysis of the N = 36 LAD regression residuals calculated on the univariate response measurement scores listed in Table 4.12 yields estimated LAD regression coefficients of

$$\displaystyle{ \tilde{\beta }_{0} = +32.00\;,\quad \tilde{\beta }_{1} = -1.00\;,\mbox{ and}\quad \tilde{\beta }_{2} = 0.00 }$$

for Factor A. Figure 4.43 lists the observed \(y_{i}\) values, LAD predicted \(\tilde{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,36\).

Fig. 4.43 Observed, predicted, and residual LAD regression values for the nested response measurement scores listed in Table 4.12

Following Eq. (4.5) on p. 125 and employing ordinary Euclidean distance between residuals with v = 1, the N = 36 LAD regression residuals listed in Fig. 4.43 yield a = 3 average distance-function values of

$$\displaystyle{ \xi _{A_{1}} = 2.00\;,\quad \xi _{A_{2}} = 3.5152\;,\mbox{ and}\quad \xi _{A_{3}} = 4.4242\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the LAD regression residuals listed in Fig. 4.43 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{A_{i}}} {N} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{A} =\sum _{ i=1}^{a}C_{ i}\xi _{i} = \frac{12} {36}\big(2.00 + 3.5152 + 4.4242\big) = 3.3131\;. }$$

If all M possible arrangements of the N = 36 observed LAD regression residuals listed in Fig. 4.43 occur with equal chance, the approximate resampling probability value of \(\delta_{A} = 3.3131\) computed on L = 1,000,000 random arrangements of the observed LAD regression residuals with \(n_{A_{1}} = n_{A_{2}} = n_{A_{3}} = 12\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A}$}} {L} = \frac{704,848} {1,000,000} = 0.7048\;. }$$

Following Eq. (4.7) on p. 126, the exact expected value of the δ values is \(\mu_{\delta} = 3.2508\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{A} = 1 -\frac{\delta _{A}} {\mu _{\delta }} = 1 -\frac{3.3131} {3.2508} = -0.0192\;, }$$

indicating slightly less than chance agreement between the observed and predicted y values.

OLS Regression Analysis

For comparison, consider an MRPP resampling analysis of OLS regression residuals calculated on the N = 36 univariate response measurement scores listed in Table 4.12. The MRPP regression analysis yields estimated OLS regression coefficients of

$$\displaystyle{ \hat{\beta }_{0} = +32.00\;,\quad \hat{\beta }_{1} = -1.00\;,\mbox{ and}\quad \hat{\beta }_{2} = 0.00 }$$

for Factor A. Figure 4.44 lists the observed \(y_{i}\) values, OLS predicted \(\hat{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,36\). Footnote 4

Fig. 4.44 Observed, predicted, and residual OLS regression values for the nested response measurement scores listed in Table 4.12

Following Eq. (4.5) on p. 125 and employing squared Euclidean distance between residuals with v = 2, the N = 36 OLS regression residuals listed in Fig. 4.44 yield a = 3 average distance-function values of

$$\displaystyle{ \xi _{A_{1}} = 5.8182\;,\quad \xi _{A_{2}} = 17.4545\;,\mbox{ and}\quad \xi _{A_{3}} = 27.6364\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the OLS regression residuals listed in Fig. 4.44 with v = 2 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{A_{i}} - 1} {N - a} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{A} =\sum _{ i=1}^{a}C_{ i}\xi _{i} = \frac{12 - 1} {36 - 3}(5.8182 + 17.4545 + 27.6364) = 16.9697. }$$

If all M possible arrangements of the N = 36 observed OLS regression residuals listed in Fig. 4.44 occur with equal chance, the approximate resampling probability value of \(\delta_{A} = 16.9697\) computed on L = 1,000,000 random arrangements of the observed OLS regression residuals with \(n_{A_{1}} = n_{A_{2}} = n_{A_{3}} = 12\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A}$}} {L} = \frac{1,000,000} {1,000,000} = 1.00\;. }$$

A reanalysis of the data based on L = 10,000,000 random arrangements of the N = 36 observed regression residuals listed in Fig. 4.44 with \(n_{A_{1}} = n_{A_{2}} = n_{A_{3}} = 12\) residuals preserved for each arrangement also yields an approximate resampling probability value of P = 1.00.

A probability value of P = 1.00 is not very informative. In such cases, an alternative moment procedure based on the exact mean, \(\mu_{\delta}\), exact variance, \(\sigma _{\delta }^{2}\), and exact skewness, \(\gamma_{\delta}\), of δ can be employed to obtain approximate probability values; see Chap. 1, Sect. 1.2.2. For Factor A, a moment-approximation procedure yields \(\delta_{A} = 16.9697\), \(\mu_{\delta} = 16.00\), \(\sigma _{\delta }^{2} = 0.8472\), \(\gamma _{\delta } = -1.7012\), an observed standardized test statistic of

$$\displaystyle{ T_{A} = \frac{\delta _{A} -\mu _{\delta }} {\sigma _{\delta }} = \frac{16.9697 - 16.00} {\sqrt{0.8472}} = +1.0535\;, }$$

and a Pearson type III approximate probability value of P = 0.9487.

For comparison, the approximate resampling probability value based on LAD regression, v = 1, L = 1,000,000, and \(C_{i} = n_{A_{i}}/N\) for \(i = 1,\,\ldots,\,a\) is P = 0.7048.

Following Eq. (4.7) on p. 126, the exact expected value of the δ values is \(\mu_{\delta} = 16.00\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\hat{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{A} = 1 -\frac{\delta _{A}} {\mu _{\delta }} = 1 -\frac{16.9697} {16.00} = -0.0606\;, }$$

indicating slightly less than chance agreement between the observed and predicted y values.

Conventional ANOVA Analysis

A conventional fixed-effects nested analysis of variance calculated on the N = 36 response measurement scores for Factor A listed in Table 4.11 on p. 196 yields an observed F-ratio of \(F_{A} = 3.6818\). Assuming independence, normality, and homogeneity of variance, \(F_{A}\) is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = a - 1 = 3 - 1 = 2\) and \(\nu _{2} = ab(n - 1) = (3)(3)(4 - 1) = 27\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{A} = 3.6818\) yields an approximate probability value of P = 0.0386.

Analysis of Factor B | A

A design matrix of effect codes for an MRPP regression analysis of Factor B, nested under Factor A, is given in Fig. 4.45, where the first column of 1 values provides for an intercept, the next two columns contain effect codes for Factor A, the next four columns contain effect codes for the A×B interaction, and the last column contains the univariate response measurement scores listed according to the b = 3 levels of Factor B with the first \(n_{B\vert A_{1}} = 12\) scores, the next \(n_{B\vert A_{2}} = 12\) scores, and the last \(n_{B\vert A_{3}} = 12\) scores associated with the b = 3 levels of Factor B, respectively. The MRPP regression analysis examines the N = 36 regression residuals for possible differences among the b = 3 treatment levels of Factor B; consequently, no effect codes are provided for Factor B as this information is implicit in the ordering of the b = 3 levels of Factor B in the last column of Fig. 4.45.

Fig. 4.45 Design matrix and univariate response measurement scores for an analysis of Factor B with Factor B nested under Factor A

LAD Regression Analysis

Again, because there are

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{b}n_{ B\vert A_{i}}!} = \frac{36!} {(12!)^{3}} = 3,384,731,762,521,200 }$$

possible, equally-likely arrangements of the N = 36 response measurement scores listed in Fig. 4.45, an exact permutation approach is not possible. An MRPP resampling analysis of the N = 36 LAD regression residuals calculated on the univariate response measurement scores in Fig. 4.45 yields estimated LAD regression coefficients of

$$\displaystyle{\begin{array}{llll} \tilde{\beta }_{0} & = +32.00\;,\quad &\quad \tilde{\beta }_{1} & = -1.00\;,\quad \tilde{\beta }_{2} = 0.6667\;,\quad \tilde{\beta }_{3} = +1.00\;, \\ \tilde{\beta }_{4} & = -1.6667\;,\quad &\quad \tilde{\beta }_{5} & = +1.00\;,\quad \mbox{ and }\tilde{\beta }_{6} = +2.3333\end{array} }$$

for Factor B | A. Figure 4.46 lists the observed \(y_{i}\) values, LAD predicted \(\tilde{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,36\).

Fig. 4.46 Observed, predicted, and residual LAD regression values for the nested response measurement scores listed in Table 4.12

Following Eq. (4.5) on p. 125 and employing ordinary Euclidean distance between residuals with v = 1, the N = 36 LAD regression residuals listed in Fig. 4.46 yield b = 3 average distance-function values of

$$\displaystyle{ \xi _{B\vert A_{1}} = 2.00\;,\quad \xi _{B\vert A_{2}} = 2.00\;,\mbox{ and}\quad \xi _{B\vert A_{3}} = 2.00\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the LAD regression residuals listed in Fig. 4.46 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{B\vert A_{i}}} {N} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{B\vert A} =\sum _{ i=1}^{b}C_{ i}\xi _{i} = \frac{12} {36}\big(2.00 + 2.00 + 2.00\big) = 2.00\;. }$$

If all M possible arrangements of the N = 36 observed LAD regression residuals listed in Fig. 4.46 occur with equal chance, the approximate resampling probability value of \(\delta_{B\vert A} = 2.00\) computed on L = 1,000,000 random arrangements of the observed LAD regression residuals with \(n_{B\vert A_{1}} = n_{B\vert A_{2}} = n_{B\vert A_{3}} = 12\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{B\vert A}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{B\vert A}$}} {L} = \frac{361,575} {1,000,000} = 0.3616\;. }$$

Following Eq. (4.7) on p. 126, the exact expected value of the δ values is \(\mu_{\delta} = 2.0127\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{B\vert A} = 1 -\frac{\delta _{B\vert A}} {\mu _{\delta }} = 1 - \frac{2.00} {2.0127} = +0.0063\;, }$$

indicating approximately chance agreement between the observed and predicted y values.

OLS Regression Analysis

For comparison, consider an MRPP resampling analysis of OLS regression residuals calculated on the N = 36 response measurement scores listed in Table 4.12 on p. 197. The MRPP regression analysis yields estimated OLS regression coefficients of

$$\displaystyle{\begin{array}{llll} \hat{\beta }_{0} & = +32.00\;,\quad &\quad \hat{\beta }_{1} & = -1.00\;,\quad \hat{\beta }_{2} = 0.00\;,\quad \hat{\beta }_{3} = +1.00\;, \\ \hat{\beta }_{4} & = -2.00\;,\quad &\quad \hat{\beta }_{5} & = +1.00\;,\quad \mbox{ and }\hat{\beta }_{6} = +3.00\end{array} }$$

for Factor B | A. Figure 4.47 lists the observed \(y_{i}\) values, OLS predicted \(\hat{y}_{i}\) values, and residual \(e_{i}\) values for \(i = 1,\,\ldots,\,36\).

Fig. 4.47 Observed, predicted, and residual OLS regression values for the nested response measurement scores listed in Table 4.12

Following Eq. (4.5) on p. 125 and employing squared Euclidean distance between residuals with v = 2, the N = 36 OLS regression residuals listed in Fig. 4.47 yield b = 3 average distance-function values of

$$\displaystyle{ \xi _{B\vert A_{1}} = 5.8182\;,\quad \xi _{B\vert A_{2}} = 5.8182\;,\mbox{ and}\quad \xi _{B\vert A_{3}} = 5.8182\;. }$$

Following Eq. (4.4) on p. 125, the observed value of the MRPP test statistic  calculated on the OLS regression residuals listed in Fig. 4.47 with v = 2 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{B\vert A_{i}} - 1} {N - b} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{B\vert A} =\sum _{ i=1}^{b}C_{ i}\xi _{i} = \frac{12 - 1} {36 - 3}\big(5.8182 + 5.8182 + 5.8182\big) = 5.8182\;. }$$

If all M possible arrangements of the N = 36 observed OLS regression residuals listed in Fig. 4.47 occur with equal chance, the approximate resampling probability value of \(\delta_{B\vert A} = 5.8182\) computed on L = 1,000,000 random arrangements of the observed OLS regression residuals with \(n_{B\vert A_{1}} = n_{B\vert A_{2}} = n_{B\vert A_{3}} = 12\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{B\vert A}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{B\vert A}$}} {L} = \frac{7,600} {1,000,000} = 0.0076\;. }$$

For comparison, the approximate resampling probability value based on LAD regression, v = 1, L = 1,000,000, and \(C_{i} = n_{B\vert A_{i}}/N\) for i = 1, 2, 3 is P = 0.3616.

Following Eq. (4.7) on p. 126, the exact expected value of the δ values is \(\mu_{\delta} = 5.4857\) and, following Eq. (4.6) on p. 126, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\hat{y}_{i}\) values, \(i = 1,\,\ldots,\,N\), is

$$\displaystyle{ \mathfrak{R}_{B\vert A} = 1 -\frac{\delta _{B\vert A}} {\mu _{\delta }} = 1 -\frac{5.8182} {5.4857} = -0.0606\;, }$$

indicating slightly less than chance agreement between the observed and predicted y values.

Conventional ANOVA Analysis

A conventional fixed-effects nested analysis of variance calculated on the N = 36 univariate response measurement scores for Factor B | A listed in Table 4.11 on p. 196 yields an observed F-ratio of \(F_{B\vert A} = 10.6362\). Assuming independence, normality, and homogeneity of variance, \(F_{B\vert A}\) is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = a(b - 1) = 3(3 - 1) = 6\) and \(\nu _{2} = ab(n - 1) = (3)(3)(4 - 1) = 27\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{B\vert A} = 10.6362\) yields an approximate probability value of \(P = 4.5461\times 10^{-6}\).

4.4 Multivariate Multiple Regression Designs

An extension of LAD multiple regression to include multiple dependent variables, as well as multiple independent variables, i.e., multivariate multiple LAD regression, is developed in this section. The extension was prompted by a multivariate Least Sum (of) Euclidean Distances (LSED) algorithm developed by Kaufman, Taylor, Mielke, and Berry in 2002 [198].

Consider the multivariate multiple regression model given by

$$\displaystyle{ y_{ik} =\sum _{ j=1}^{m}x_{ ij}\beta _{jk} + e_{ik} }$$

for \(i = 1,\,\ldots,\,N\) and \(k = 1,\,\ldots,\,r\), where \(y_{ik}\) represents the ith of N measurements for the kth of r response variables, possibly affected by a treatment; \(x_{ij}\) is the jth of m covariates associated with the ith response, where \(x_{i1} = 1\) if the model includes an intercept; \(\beta_{jk}\) denotes the jth of m regression parameters for the kth of r response variables; and \(e_{ik}\) designates the error associated with the ith of N measurements for the kth of r response variables.

If estimates of \(\beta_{jk}\) that minimize

$$\displaystyle{ \sum _{i=1}^{N}\left (\,\sum _{ k=1}^{r}e_{ ik}^{2}\right )^{\!1/2} }$$

are denoted by \(\tilde{\beta }_{jk}\) for \(j = 1,\,\ldots,\,m\) and \(k = 1,\,\ldots,\,r\), then the N r-dimensional residuals of the LSED multivariate multiple regression model are given by

$$\displaystyle{ e_{ik} = y_{ik} -\sum _{j=1}^{m}x_{ ij}\tilde{\beta }_{jk} }$$

for \(i = 1,\,\ldots,\,N\) and \(k = 1,\,\ldots,\,r\).
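To make the LSED criterion concrete, the following Python sketch evaluates the residuals and the sum of Euclidean residual lengths for a given set of coefficients. It only evaluates the objective; it does not carry out the minimization, and it is not the algorithm of Kaufman et al. The function names and the small data set are hypothetical and introduced here purely for illustration, with `beta` stored as m rows of r coefficients.

```python
import math


def lsed_residuals(y, x, beta):
    """Return the N r-dimensional residuals e_ik = y_ik - sum_j x_ij * beta_jk."""
    r = len(y[0])
    return [
        [y_i[k] - sum(x_ij * beta_j[k] for x_ij, beta_j in zip(x_i, beta))
         for k in range(r)]
        for y_i, x_i in zip(y, x)
    ]


def lsed_objective(y, x, beta):
    """Sum over the N cases of the Euclidean length of each residual vector."""
    return sum(math.sqrt(sum(e * e for e in e_i))
               for e_i in lsed_residuals(y, x, beta))


# Hypothetical data: N = 3 cases, m = 2 predictors (intercept first), r = 2 responses.
x = [[1, 0], [1, 1], [1, 0]]
y = [[3.0, 4.0], [1.0, 1.0], [0.0, 0.0]]
beta = [[0.0, 0.0], [1.0, 1.0]]   # rows index j = 1..m, columns index k = 1..r

print(lsed_residuals(y, x, beta))
print(lsed_objective(y, x, beta))
```

Because each case contributes the length of its residual vector rather than the squared length, a single aberrant case influences the fit less than it would under a least-squares criterion.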

Let the N r-dimensional residuals, \((e_{i1},\,\ldots,\,e_{ir})\) for \(i = 1,\,\ldots,\,N\), obtained from an LSED multivariate multiple regression model, be partitioned into g treatment groups of sizes \(n_{1},\,\ldots,\,n_{g}\), where \(n_{i} \geq 2\) for \(i = 1,\,\ldots,\,g\) and

$$\displaystyle{ N =\sum _{ i=1}^{g}n_{ i}\;. }$$

The MRPP analysis of the multivariate multiple regression residuals depends on statistic

$$\displaystyle{ \delta =\sum _{ i=1}^{g}C_{ i}\xi _{i}\;, }$$
(4.8)

where \(C_{i} = n_{i}/N\) is a positive weight for the ith of g treatment groups and \(\xi _{i}\) is the average pairwise Euclidean distance among the \(n_{i}\) r-dimensional residuals in the ith of g treatment groups defined by

$$\displaystyle{ \xi _{i} = \binom{n_{i}}{2}^{\!-1}\,\sum _{ k=1}^{N-1}\,\sum _{ l=k+1}^{N}\left [\,\sum _{ j=1}^{r}\big(e_{ kj} - e_{lj}\big)^{2}\right ]^{\!1/2}\Psi _{ ki}\,\Psi _{li}\;, }$$
(4.9)

where

$$\displaystyle{ \Psi _{ki} = \left \{\begin{array}{@{}l@{\quad }l@{}} \,1 \quad &\mbox{ if ($e_{k1},\,\ldots,\,e_{kr}$) is in the $i$th treatment group}\;, \\ [6pt]\,0\quad &\mbox{ otherwise}\;. \end{array} \right. }$$
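Equations (4.8) and (4.9) translate directly into code. The sketch below, in Python, computes δ for r-dimensional residuals with group labels playing the role of the indicator \(\Psi_{ki}\); the function name and the four bivariate residuals are hypothetical, and each group is assumed to contain at least two residuals.

```python
import math
from itertools import combinations


def mrpp_delta_multivariate(residuals, labels):
    """delta = sum_i (n_i / N) * xi_i, with xi_i the mean pairwise Euclidean
    distance among the r-dimensional residuals in the ith treatment group."""
    n_total = len(residuals)
    delta = 0.0
    for group in sorted(set(labels)):
        members = [e for e, lab in zip(residuals, labels) if lab == group]
        pairs = list(combinations(members, 2))   # requires n_i >= 2
        xi = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
        delta += (len(members) / n_total) * xi
    return delta


# Hypothetical bivariate residuals (r = 2) in g = 2 groups of size 2.
residuals = [(0.0, 0.0), (3.0, 4.0), (0.0, 0.0), (6.0, 8.0)]
labels = [1, 1, 2, 2]
print(mrpp_delta_multivariate(residuals, labels))
```

With r = 1 the Euclidean distance reduces to the absolute difference between residuals, so the same function covers the univariate LAD case with v = 1.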

The null hypothesis specifies that each of the

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{g}n_{ i}!} }$$

possible allocations of the N r-dimensional residuals to the g treatment groups is equally likely. An exact MRPP probability value associated with the observed value of δ, \(\delta_{\text{o}}\), is given by

$$\displaystyle{ P\big(\delta \leq \delta _{\text{o}}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{\text{o}}$}} {M} \;. }$$

As previously, when M is large an approximate probability value may be obtained from a resampling permutation procedure. Let L denote the number of arrangements in a large random sample drawn from all M possible arrangements of the observed data; then an approximate resampling probability value is given by

$$\displaystyle{ P\big(\delta \leq \delta _{\text{o}}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{\text{o}}$}} {L} \;. }$$
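The two probability values can be sketched as follows, assuming a tiny hypothetical data set so that all M arrangements can be enumerated. The helper names (`delta`, `distinct_allocations`, `exact_p`, `resample_p`), the tolerance used for floating-point ties, and the random seed are assumptions made for illustration.

```python
import random
from itertools import combinations
from math import dist

def delta(residuals, labels):
    """MRPP statistic with weights n_i / N and Euclidean distance."""
    n_total, total = len(residuals), 0.0
    for lab in set(labels):
        group = [e for e, l in zip(residuals, labels) if l == lab]
        pairs = list(combinations(group, 2))
        total += len(group) / n_total * sum(dist(p, q) for p, q in pairs) / len(pairs)
    return total

def distinct_allocations(counts):
    """Yield each distinct label sequence with the given group sizes exactly once."""
    if all(c == 0 for c in counts.values()):
        yield ()
        return
    for lab in counts:
        if counts[lab] > 0:
            counts[lab] -= 1
            for rest in distinct_allocations(counts):
                yield (lab,) + rest
            counts[lab] += 1

def exact_p(residuals, labels, tol=1e-12):
    """Proportion of all M arrangements with delta <= observed delta; also returns M."""
    observed = delta(residuals, labels)
    counts = {lab: labels.count(lab) for lab in set(labels)}
    deltas = [delta(residuals, alloc) for alloc in distinct_allocations(counts)]
    return sum(d <= observed + tol for d in deltas) / len(deltas), len(deltas)

def resample_p(residuals, labels, n_draws, seed=0, tol=1e-12):
    """Proportion of L = n_draws random arrangements with delta <= observed delta."""
    rng = random.Random(seed)
    observed = delta(residuals, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_draws):
        rng.shuffle(shuffled)
        hits += delta(residuals, shuffled) <= observed + tol
    return hits / n_draws

# Hypothetical residuals: two well-separated clusters, group sizes 2 and 3.
toy = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0), (12.0, 0.0)]
toy_labels = ["a", "a", "b", "b", "b"]

p_exact, n_arrangements = exact_p(toy, toy_labels)
p_resample = resample_p(toy, toy_labels, n_draws=5000)
print(n_arrangements, p_exact, p_resample)
```

For these toy values M = 5!/(2! 3!) = 10 and only the observed arrangement attains the smallest δ, so the exact probability value is 1/10; the resampling value fluctuates around that figure.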

As with univariate multiple regression models, the criterion for fitting multivariate multiple regression models based on δ is the chance-corrected measure of effect size between the observed and predicted response measurement values given by

$$\displaystyle{ \mathfrak{R} = 1 -\frac{\delta } {\mu _{\delta }}\;, }$$
(4.10)

where \(\mu _{\delta }\) is the expected value of δ over the M possible, equally-likely arrangements under the null hypothesis, given by

$$\displaystyle{ \mu _{\delta } = \frac{1} {M}\sum _{i=1}^{M}\delta _{ i}\;. }$$
(4.11)
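Equations (4.10) and (4.11) can be illustrated with a short Python sketch for the two-group case, again on hypothetical residuals. Because every within-group pair is equally likely to be any of the \(\binom{N}{2}\) pairs under the null hypothesis and the weights \(C_{i}\) sum to one, \(\mu _{\delta }\) also equals the average of all pairwise distances; the sketch checks this numerically. The function names and data are assumptions for illustration only.

```python
from itertools import combinations
from math import dist
from statistics import mean

def xi(points):
    """Average pairwise Euclidean distance among a set of residuals."""
    return mean(dist(p, q) for p, q in combinations(points, 2))

def delta_two_groups(points, first_idx):
    """delta for a two-group split, where first_idx indexes the first group."""
    first = [points[i] for i in first_idx]
    second = [p for i, p in enumerate(points) if i not in first_idx]
    n_total = len(points)
    return len(first) / n_total * xi(first) + len(second) / n_total * xi(second)

# Hypothetical residuals; the observed first group holds the first two of them.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0), (12.0, 0.0)]
observed = delta_two_groups(points, (0, 1))

# Eq. (4.11): average delta over all M = C(5, 2) = 10 arrangements.
all_deltas = [delta_two_groups(points, idx) for idx in combinations(range(5), 2)]
mu_delta = mean(all_deltas)

# Eq. (4.10): chance-corrected effect size.
effect_size = 1 - observed / mu_delta

print(mu_delta, xi(points), effect_size)
```

In practice the shortcut through the overall mean pairwise distance avoids enumerating all M arrangements when only \(\mu _{\delta }\) is needed.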

4.4.1 Example Analysis

To illustrate a multivariate LSED multiple regression analysis, consider an unbalanced two-way randomized-block experimental design in which N = 16 subjects (S) are tested over a = 3 levels of Factor A, the experiment is repeated b = 2 times for Factor B, and there are r = 2 response measurement scores for each subject. The design and data are adapted from Mielke and Berry [297, p. 184] and are given in Fig. 4.48. The design is intentionally kept small to illustrate the multivariate multiple regression procedure.

Fig. 4.48 Example data for a two-way randomized-block design with a = 3 blocks and b = 2 treatments

Analysis of Factor A

A design matrix of dummy codes for an MRPP regression analysis of Factor A is given in Fig. 4.49, where the first column of 1 values provides for an intercept, the next column contains the dummy codes for Factor B, and the third and fourth columns contain the bivariate response measurement scores listed according to the original random assignment of the N = 16 subjects to the a = 3 levels of Factor A with the first \(n_{A_{1}} = 5\) scores, the next \(n_{A_{2}} = 7\) scores, and the last \(n_{A_{3}} = 4\) scores associated with the a = 3 levels of Factor A, respectively. The MRPP regression analysis examines the N = 16 regression residuals for possible differences among the a = 3 treatment levels of Factor A; consequently, no dummy codes are provided for Factor A as this information is implicit in the ordering of the a = 3 levels of Factor A in the last two columns of Fig. 4.49.

Fig. 4.49 Example design matrix and bivariate response measurement scores for a multivariate LSED multiple regression analysis of Factor A with N = 16

Because there are only

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{a}n_{ A_{i}}!} = \frac{16!} {5!\;7!\;4!} = 1,441,440 }$$

possible, equally-likely arrangements of the N = 16 bivariate response measurement scores listed in Fig. 4.49, an exact permutation approach is feasible. An MRPP analysis of the N = 16 LAD regression residuals calculated on the bivariate response measurements for Factor A in Fig. 4.49 yields estimated LAD regression coefficients of

$$\displaystyle{ \tilde{\beta }_{1,1} = +58.00\;,\;\tilde{\beta }_{2,1} = -9.00\;,\;\tilde{\beta }_{1,2} = +94.00\;,\;\mbox{ and}\;\;\tilde{\beta }_{2,2} = +8.00 }$$

for Factor A. Figure 4.50 lists the observed \(y_{ik}\) values, LAD predicted \(\tilde{y}_{ik}\) values, and residual \(e_{ik}\) values for \(i = 1,\ldots,16\) and k = 1, 2.

Fig. 4.50 Observed, predicted, and residual values for a multivariate LSED multiple regression analysis of Factor A with N = 16

Following Eq. (4.9) on p. 208 and employing ordinary Euclidean distance between residuals with v = 1, the N = 16 LAD regression residuals listed in Fig. 4.50 yield a = 3 average distance-function values of

$$\displaystyle{ \xi _{A_{1}} = 7.2294\;,\quad \xi _{A_{2}} = 20.0289\;,\mbox{ and}\quad \xi _{A_{3}} = 7.3475\;. }$$

Following Eq. (4.8) on p. 208, the observed value of the MRPP test statistic calculated on the LAD regression residuals listed in Fig. 4.50 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{A_{i}}} {N} \;,\qquad i = 1,2,3\;, }$$

is

$$\displaystyle{ \delta _{A} =\sum _{ i=1}^{a}C_{ i}\xi _{i} = \frac{1} {16}\big[(5)(7.2294) + (7)(20.0289) + (4)(7.3475)\big] = 12.8587\;. }$$

If all arrangements of the N = 16 observed LAD regression residuals listed in Fig. 4.50 occur with equal chance, the exact probability value of \(\delta _{A} = 12.8587\) computed on the M = 1,441,440 possible arrangements of the observed LAD regression residuals with \(n_{A_{1}} = 5\), \(n_{A_{2}} = 7\), and \(n_{A_{3}} = 4\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{A}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{A}$}} {M} = \frac{6,676} {1,441,440} = 0.0046\;. }$$

Following Eq. (4.11) on p. 209, the exact expected value of the M = 1,441,440 δ values is \(\mu _{\delta } = 18.1020\) and, following Eq. (4.10) on p. 209, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, \(i = 1,\ldots,N\), is

$$\displaystyle{ \mathfrak{R}_{A} = 1 -\frac{\delta _{A}} {\mu _{\delta }} = 1 -\frac{12.8587} {18.1020} = +0.2897\;, }$$

indicating approximately 29 % agreement between the observed and predicted values above that expected by chance.
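The arithmetic for Factor A can be re-traced with a few lines of Python. The sketch uses only figures quoted in the text (the group sizes, the three ξ values, the count of 6,676 arrangements, and \(\mu _{\delta } = 18.1020\)); it does not recompute the residuals or the permutation distribution, and the variable names are illustrative.

```python
from math import factorial

sizes = [5, 7, 4]                      # n_A1, n_A2, n_A3
xis = [7.2294, 20.0289, 7.3475]        # reported average within-group distances
n_total = sum(sizes)                   # N = 16

# M = N! / (n_A1! n_A2! n_A3!)
m = factorial(n_total)
for n in sizes:
    m //= factorial(n)

# Eq. (4.8) with weights C_i = n_i / N
delta_a = sum(n * x for n, x in zip(sizes, xis)) / n_total

# Exact probability value from the reported count of delta values <= delta_A
p_exact = 6676 / m

# Eq. (4.10) with the reported expected value of delta
effect_a = 1 - delta_a / 18.1020

print(m, round(delta_a, 4), round(p_exact, 4), round(effect_a, 4))
```

Rounded to four decimal places, the results agree with the reported M = 1,441,440, \(\delta _{A} = 12.8587\), probability value 0.0046, and \(\mathfrak{R}_{A} = +0.2897\).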

Analysis of Factor B

A design matrix of dummy codes for an MRPP regression analysis of Factor B is given in Fig. 4.51, where the first column of 1 values provides for an intercept, the next two columns contain the dummy codes for Factor A, and the fourth and fifth columns contain the bivariate response measurement scores listed according to the original random assignment of the N = 16 subjects to the b = 2 levels of Factor B with the first \(n_{B_{1}} = 7\) scores and the last \(n_{B_{2}} = 9\) scores associated with the b = 2 levels of Factor B, respectively. The MRPP regression analysis examines the N = 16 regression residuals for possible differences between the b = 2 treatment levels of Factor B; consequently, no dummy codes are provided for Factor B as this information is implicit in the ordering of the b = 2 levels of Factor B in the last two columns of Fig. 4.51.

Fig. 4.51 Example design matrix and bivariate response measurement scores for a multivariate LSED multiple regression analysis of Factor B with N = 16

Because there are only

$$\displaystyle{ M = \frac{N!} {\prod _{i=1}^{b}n_{ B_{i}}!} = \frac{16!} {7!\;9!} = 11,440 }$$

possible, equally-likely arrangements of the N = 16 response measurement scores listed in Fig. 4.51, an exact permutation approach is feasible. An MRPP analysis of the N = 16 LAD regression residuals calculated on the bivariate response measurements for Factor B in Fig. 4.51 yields estimated LAD regression coefficients of

$$\displaystyle\begin{array}{rcl} \tilde{\beta }_{1,1}& =& +46.00\;,\quad \tilde{\beta }_{2,1} = +5.00\;,\quad \tilde{\beta }_{3,1} = +20.00\;,\quad \tilde{\beta }_{1,2} = +104.00\;, {}\\ \tilde{\beta }_{2,2}& =& -4.00\;,\quad \mbox{ and }\tilde{\beta }_{3,2} = -20.00 {}\\ \end{array}$$

for Factor B. Figure 4.52 lists the observed \(y_{ik}\) values, LAD predicted \(\tilde{y}_{ik}\) values, and residual \(e_{ik}\) values for \(i = 1,\ldots,16\) and k = 1, 2.

Fig. 4.52 Observed, predicted, and residual values for a multivariate LSED multiple regression analysis of Factor B with N = 16

Following Eq. (4.9) on p. 208 and employing ordinary Euclidean distance between residuals with v = 1, the N = 16 LAD regression residuals listed in Fig. 4.52 yield b = 2 average distance-function values of

$$\displaystyle{ \xi _{B_{1}} = 6.0229\quad \mbox{ and}\quad \xi _{B_{2}} = 16.7440\;. }$$

Following Eq. (4.8) on p. 208, the observed value of the MRPP test statistic calculated on the LAD regression residuals listed in Fig. 4.52 with v = 1 and treatment-group weights

$$\displaystyle{ C_{i} = \frac{n_{B_{i}}} {N} \;,\qquad i = 1,2\;, }$$

is

$$\displaystyle{ \delta _{B} =\sum _{ i=1}^{b}C_{ i}\xi _{i} = \frac{1} {16}\big[(7)(6.0229) + (9)(16.7440)\big] = 12.0535\;. }$$

If all arrangements of the N = 16 observed LAD regression residuals listed in Fig. 4.52 occur with equal chance, the exact probability value of \(\delta _{B} = 12.0535\) computed on the M = 11,440 possible arrangements of the observed LAD regression residuals with \(n_{B_{1}} = 7\) and \(n_{B_{2}} = 9\) residuals preserved for each arrangement is

$$\displaystyle{ P\big(\delta \leq \delta _{B}\vert H_{0}\big) = \frac{\mbox{ number of $\delta $ values $ \leq \delta _{B}$}} {M} = \frac{2,090} {11,440} = 0.1827\;. }$$

Following Eq. (4.11) on p. 209, the exact expected value of the M = 11,440 δ values is \(\mu _{\delta } = 12.2923\) and, following Eq. (4.10) on p. 209, the observed chance-corrected measure of effect size for the \(y_{i}\) and \(\tilde{y}_{i}\) values, \(i = 1,\ldots,N\), is

$$\displaystyle{ \mathfrak{R}_{B} = 1 -\frac{\delta _{B}} {\mu _{\delta }} = 1 -\frac{12.0535} {12.2923} = +0.0194\;, }$$

indicating approximately 2 % agreement between the observed and predicted values above that expected by chance.
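A compact check of the Factor B figures, written as a small helper for the two-group case, is sketched below. The function name `two_group_summary` is hypothetical; the inputs are the values quoted in the text (group sizes 7 and 9, the two ξ values, the count of 2,090 arrangements, and \(\mu _{\delta } = 12.2923\)).

```python
from math import comb

def two_group_summary(sizes, xis, n_at_or_below, mu_delta):
    """Return (M, delta, exact probability value, effect size) for two groups."""
    n_total = sum(sizes)
    m = comb(n_total, sizes[0])                      # N! / (n_1! n_2!)
    delta = sum(n * x for n, x in zip(sizes, xis)) / n_total
    return m, delta, n_at_or_below / m, 1 - delta / mu_delta

m_b, delta_b, p_b, effect_b = two_group_summary(
    sizes=[7, 9],
    xis=[6.0229, 16.7440],
    n_at_or_below=2090,
    mu_delta=12.2923,
)
print(m_b, round(delta_b, 4), round(p_b, 4), round(effect_b, 4))
```

To four decimal places this agrees with the reported M = 11,440, \(\delta _{B} = 12.0535\), probability value 0.1827, and \(\mathfrak{R}_{B} = +0.0194\).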

4.5 Coda

Chapter 4 applied the Multi-Response Permutation Procedures (MRPP) developed in Chap. 2 to interval-level response measurements, utilizing dummy and effect coding of treatment groups to generate regression residuals from LAD regression models, subsequently analyzed with MRPP. Considered in this chapter were one-way randomized, one-way randomized with a covariate, one-way randomized-block, two-way randomized-block, two-way factorial, Latin square, split-plot, and two-factor nested designs. Chapter 4 concluded with example multivariate multiple regression designs.

Comparisons in Chap. 4 of permutation-based LAD regression with ordinary Euclidean distance between response measurements, permutation-based OLS regression with squared Euclidean distance between response measurements, and conventional OLS regression with squared Euclidean distance between response measurements revealed that considerable, non-systematic differences can exist among the three approaches. Oftentimes, one of the three approaches yielded the lowest of the three probability values, while other times the same approach yielded the highest probability value. Sometimes the three approaches yielded the same, or nearly the same, probability value, as was the case with the analysis of Factor B in the two-way randomized-block design example, and other times the three probability values were markedly different, as was the case with the analysis of the A×B interaction in the two-way factorial design example. In general, permutation-based LAD regression, coupled with MRPP and ordinary Euclidean distance between response measurements, is recommended because it requires no restrictive distributional assumptions and is robust to extreme values.

Chapter 5

Chapter 5 establishes the relationships between the MRPP test statistics, δ and \(\mathfrak{R}\), and selected conventional tests and measures designed for the analysis of completely randomized data at the ordinal level of measurement. Considered in Chap. 5 are the Wilcoxon two-sample rank-sum test, the Kruskal–Wallis multiple-sample rank-sum test, the Mood rank-sum test for dispersion, the Brown–Mood median test, the Mielke power-of-rank functions, the Whitfield two-sample rank-sum test, and the Cureton rank-biserial test.