Abstract
Fisher (The design of experiments, Oliver & Boyd, London, 1935) described exact permutation and randomization tests for comparative experiments that assume neither normality nor any particular probability distribution. This distribution-free property is attractive, but the associated computational burden was a serious obstacle in Fisher's time; with modern computers it no longer is. This paper introduces permutation/randomization data algorithms that generate the permutation/randomization distributions under the null hypotheses for calculating P values. The properties of permutation/randomization data matrices developed by algorithms following the proposed mathematical processes are derived. Two illustrative examples demonstrate the usefulness of the proposed computational methods.
1 Introduction
The randomization test for paired comparison of two population distributions shares a rich history starting from Fisher [15], along with the permutation test for comparison of two populations by using the data obtained from two independent samples (see Yates [34]). The work of Pitman [26,27,28]; Welch [33]; Wald and Wolfowitz [32]; Hoeffding [17]; Kempthorne [18]; Box and Andersen [4]; Tukey [30, 31]; Kempthorne and Doerfler [19]; Rao [29]; Lehmann [20]; Basu [1]; Boik [3]; Edgington and Onghena [11]; Ludbrook and Dudley [21]; Calinski and Kageyama [6, 7]; Ernst [13]; Good [16]; Manly [22]; Mielke and Berry [23]; David [8]; Pesarin and Salmaso [25]; Ferron and Levin [14]; Dugard [9]; Efron and Hastie [12]; Berry, Johnston, and Mielke [2]; Onghena [24]; and many others enriched the understanding of randomization/permutation tests. The application areas include biostatistics and bioinformatics, computer science and engineering, economics, education, psychology, and sociology.
Section 2 provides the background information on the permutation matrices and the exchangeable condition on a vector of random variables.
The main idea behind the algorithm for the paired comparison randomization test in Sect. 3 is to start with the vector \({\varvec{Y}}_d\) of n paired differences. Then, the task is to develop a randomization matrix \({\varvec{R}}^{(n)}\) with n rows and \(2^n\) columns to obtain the randomization distribution of \({\varvec{Y}}_d\), satisfying the exchangeable condition that the probability distributions of the columns of \({\varvec{R}}^{(n)}{\odot }{\varvec{Y}}_d\) are the same under the null hypothesis, where the Schur product \({\odot }\) is defined in Definition 3. The algorithm identifies the randomization matrix \({\varvec{R}}^{(n)}\) to calculate the P value. Section 4 illustrates the algorithm for the boys’ shoes experiment data.
The main idea behind the algorithm for the two independent sample permutation test in Sect. 5 is to consider first the n! permutations of an \(n \times 1\) random vector \({\varvec{Y}}\), obtained as \({\varvec{P}}_n{\varvec{Y}}\) by using the n! permutation matrices \({\varvec{P}}_n\). Then, the task is to identify a specific subset of s, \(s < n!\), permutation matrices satisfying the exchangeable condition that the probability distribution of \({\varvec{Y}}\) is the same as the probability distribution of \({\varvec{P}}_n {\varvec{Y}}\) for each of the s selected permutation matrices \({\varvec{P}}_n\) under the null hypothesis. The algorithm identifies the s such permutation matrices to calculate the P value. Section 6 illustrates the algorithm for the tomato fertilizer experiment data.
Section 7 makes concluding remarks. The Appendix presents Tables 3, 4, 5, and 6.
2 Permutations of Observations
An \(m \times n\) matrix \(\mathbf{A}\) with \(a_{ij}\) as its element in row i and column j, \(i = 1, \ldots , m\) and \(j = 1, \ldots ,n\), is denoted by \({\varvec{A}} = [a_{ij}]\). The elements \(a_{ij}\) are members of the set of real numbers \({\mathbb {R}}\). The set of all \(m \times n\) matrices with real elements is denoted by \({\varvec{M}}_{m \times n}({\mathbb {R}})\), or by \(\mathbf{M}_n({\mathbb {R}})\) if m = n. For simplicity, we write \({\varvec{M}}_n({\mathbb {R}}) = {\varvec{M}}_n\) and \({\varvec{M}}_{m \times n}({\mathbb {R}}) = {\varvec{M}}_{m \times n}\). The identity matrix \({\varvec{I}}_n\), whose element in row i and column i is 1 for \(i = 1, \ldots , n\) and whose remaining elements are 0, belongs to \({\varvec{M}}_n\). The columns of \({\varvec{I}}_n\) form the standard basis vectors for the n-dimensional Euclidean space \({\mathbb {R}}^n\).
Definition 1
A square matrix \({\varvec{A}}\) is a permutation matrix if exactly one entry in each row and in each column is 1; all other entries are 0.
We denote a permutation matrix by \({\varvec{P}}_n\). Multiplying a matrix on the left by a permutation matrix permutes its rows, and multiplying a matrix on the right by a permutation matrix permutes its columns. The columns of an \(n \times n\) permutation matrix are a permutation of the columns of \({\varvec{I}}_n\). The transpose of a permutation matrix is a permutation matrix. A permutation matrix \({\varvec{P}}_n\) satisfies \({\varvec{P}}_n{\varvec{P}}^{\top }_n = {\varvec{P}}^{\top }_n{\varvec{P}}_n = {\varvec{I}}_n\).
This implies that \({\varvec{P}}^{\top }_n\) is the inverse of \({\varvec{P}}_n\). There are n! permutation matrices in \({\varvec{M}}_n\). The n! permutations of an \(n \times 1\) observation vector \({\varvec{y}} = (y_1, \ldots , y_n)^{\top }\) are obtained from \({\varvec{P}}_n{\varvec{y}}\) by using the n! permutation matrices \({\varvec{P}}_n\). For \({\varvec{P}}_n = {\varvec{I}}_n\), \({\varvec{P}}_n{\varvec{y}} = {\varvec{y}}\), but the remaining \(( n! - 1)\) choices of \({\varvec{P}}_n\) produce changes in \({\varvec{y}}\). To be more specific, we denote by \({\varvec{P}}_n (i_1, i_2, \ldots , i_n)\) the permutation matrix having 1 in column 1 and row \(i_1\), column 2 and row \(i_2\), ..., and column n and row \(i_n\). By this notation, \({\varvec{P}}_n (1, 2, \ldots , n) = {\varvec{I}}_n\), and \({\varvec{P}}_n (n, n - 1, \ldots , 2, 1)\), denoted by \({\varvec{P}}^{(r)}_n\), is equal to the anti-diagonal matrix
\({\varvec{P}}^{(r)}_n = [\delta _{i,\, n + 1 - j}],\) \(\quad (1)\)
with 1 in the positions \((n + 1 - j,\, j)\), \(j = 1, \ldots , n\), and 0 elsewhere.
From (1), we have \({\varvec{P}}^{(r)}_n {\varvec{y}} = (y_{n}, y_{n - 1}, \ldots , y_2, y_1)^{\top }\).
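To make the notation concrete, here is a small Python sketch (ours, not from the paper) that builds \({\varvec{P}}_n (i_1, \ldots , i_n)\) as a 0/1 array and checks the properties stated above:

```python
def perm_matrix(rows):
    """P_n(i_1, ..., i_n): the 1 of column j sits in row i_j (1-based indices)."""
    n = len(rows)
    P = [[0] * n for _ in range(n)]
    for j, i in enumerate(rows):
        P[i - 1][j] = 1
    return P

def matmul(A, B):
    """Plain matrix product of two n x n matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(P, y):
    """P y for an n x n matrix P and a length-n vector y."""
    return [sum(P[i][j] * y[j] for j in range(len(y))) for i in range(len(y))]

n = 4
I = perm_matrix(list(range(1, n + 1)))        # P_n(1, ..., n) = I_n
R = perm_matrix(list(range(n, 0, -1)))        # P_n^(r) = P_n(n, ..., 2, 1)
Rt = [list(row) for row in zip(*R)]           # transpose of R

y = [10, 20, 30, 40]
print(matvec(R, y))                           # [40, 30, 20, 10]: y reversed
print(matmul(R, Rt) == I)                     # True: P_n P_n^T = I_n
```

Left-multiplying by \({\varvec{P}}^{(r)}_n\) reverses the vector, and \({\varvec{P}}_n{\varvec{P}}^{\top }_n = {\varvec{I}}_n\) holds, as claimed.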
Definition 2
A vector of random variables \({\varvec{Y}} = (Y_1, \ldots , Y_n)^{\top }\) is said to be exchangeable if the probability distribution of \({\varvec{Y}}\) is the same as the probability distribution of \({\varvec{P}}_n {\varvec{Y}}\) for all n! permutation matrices \({\varvec{P}}_n\).
3 An Algorithm for Performing Permutation Randomization Test for Paired Data
In comparing the effects of two treatments A and B by using a randomized complete block design (RCBD) in n blocks of two experimental units each, the observations are \((y_{Aj}, y_{Bj}), j = 1, \ldots , n\), which are n pairs of realizations of \((Y_A, Y_B)\) for two random variables \(Y_A\) and \(Y_B\). The comparison is based on the difference between the two observations \(y_{Aj}\) and \(y_{Bj}\) in block j, denoted by \(y_{dj} = y_{Aj} - y_{Bj}\), \(j = 1, \ldots , n\). If there is no difference between the effects of the two treatments A and B, then the probability distribution of the random variable \(Y_d = Y_A - Y_B\) is symmetric about 0. The column vector of paired differences \({\varvec{y}}_d = (y_{d1}, \ldots , y_{dn})^{\top }\) is used for testing the null hypothesis \(H_0\): the probability distribution of \(Y_d\) is symmetric about zero, against a left-sided, right-sided, or two-sided alternative hypothesis. Under the null hypothesis, there are no differences between the treatment effects of A and B, and consequently the probability distribution of \(Y_A - Y_B = Y_d\) is the same as the probability distribution of \(Y_B - Y_A = - Y_d\). We now present an algorithm to introduce the randomization matrix \({\varvec{R}}^{(n)}\) in \({\varvec{M}}_{n \times 2^n}\) such that the \(2^n\) column vectors of \({\varvec{R}}^{(n)}{\odot }{\varvec{y}}_d\) are independent identically distributed samples from the probability distribution of \(Y_d\) under the null hypothesis \(H_0\).
Let \({\varvec{r}}_0\) in \({\varvec{M}}_{n \times 1}\) be the vector with all n elements equal to 1, and let \({\varvec{r}}_i\) in \(\mathbf{M}_{n \times 1}\) be the vector having \(- 1\) at position i and 1 at the remaining \((n - 1)\) positions, \(i = 1, \ldots , n\).
Definition 3
Let \({\varvec{A}} = [a_{ij}]\) and \({\varvec{B}} = [b_{ij}] \in {\varvec{M}}_{m \times n}\). The Hadamard product or the Schur product or the entry-wise product of \({\varvec{A}}\) and \({\varvec{B}}\) is \({\varvec{A}}{\odot }{\varvec{B}} = [a_{ij} b_{ij}] \in {\varvec{M}}_{m \times n}\).
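Definition 3 translates into a one-line Python helper (our naming, for illustration):

```python
def hadamard(A, B):
    """Hadamard (Schur, entry-wise) product of two equally sized m x n matrices."""
    assert len(A) == len(B) and all(len(a) == len(b) for a, b in zip(A, B))
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

print(hadamard([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # [[5, 12], [21, 32]]
```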
We define the matrices \({\varvec{R}}^{(i)}\) in \(\mathbf{M}_{n \times 2^i}\) recursively as
\({\varvec{R}}^{(0)} = {\varvec{r}}_0, \qquad {\varvec{R}}^{(i)} = \left[ {\varvec{R}}^{(i - 1)} \quad {\varvec{R}}^{(i - 1)}{\odot }{\varvec{r}}_i \right], \quad i = 1, \ldots , n,\)
where the matrix \({\varvec{R}}^{(i - 1)}{\odot }{\varvec{r}}_i\) is the Hadamard product or the Schur product or the entry-wise product of each column of \({\varvec{R}}^{(i - 1)}\) with \({\varvec{r}}_i\). The matrix \({\varvec{R}}^{(i)}\) has n rows and \(2^{i}\) columns; its columns are the \(2^i\) distinct vectors with entries \(\pm 1\) whose last \((n - i)\) entries equal 1.
The randomization matrix \({\varvec{R}}^{(n)}\) has n rows and \(2^{n}\) columns, and from (2) its columns are the \(2^n\) distinct vectors with all entries \(\pm 1\). The above algorithmic process generates \({\varvec{R}}^{(1)}, {\varvec{R}}^{(2)}, \ldots , {\varvec{R}}^{(n)}\) sequentially.
It can be seen that
\({\varvec{R}}^{(n)}{\varvec{P}}^{(r)}_{2^n} = - {\varvec{R}}^{(n)},\)
where the permutation matrix \({\varvec{P}}^{(r)}_n\) is defined in (1). From a column vector \({\varvec{y}}_d = (y_{d1}, \ldots , y_{dn})^{\top } \in \mathbf{M}_{n \times 1}\), we generate \(2^n\) vectors in \(\mathbf{M}_{n \times 1}\) as the columns of the matrix \({\varvec{R}}^{(n)}{\odot }{\varvec{y}}_d\). The first column of \({\varvec{R}}^{(n)}{\odot }{\varvec{y}}_d\) is in fact \({\varvec{y}}_d\). Denote by \({\bar{y}}^{(i)}_d\) the mean of the elements of column i of \({\varvec{R}}^{(n)}{\odot }{\varvec{y}}_d\), \(i = 1, \ldots , 2^n\); in particular, \({\bar{y}}^{(1)}_d\) is the mean \({\bar{y}}^{obs}_{d}\) of the observed paired differences. It can be seen that \({\bar{y}}^{(2^n + 1 - i)}_d = - {\bar{y}}^{(i)}_d\), \(i = 1, \ldots , 2^n\).
The P value of Fisher’s randomization test for testing the null hypothesis against the left-sided alternative hypothesis is given by
\(P\,{\hbox {value}} = \frac{\#\{\, i : {\bar{y}}^{(i)}_d \le {\bar{y}}^{obs}_{d},\; i = 1, \ldots , 2^n \,\}}{2^n}.\)
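The recursion for \({\varvec{R}}^{(n)}\) and the P-value count translate directly into code. The following is a minimal Python sketch (our function names; the paper's appendix uses R code that is not reproduced here):

```python
def randomization_matrix(n):
    """Columns of R^(n), built by the recursion R^(i) = [R^(i-1), R^(i-1) (.) r_i]:
    start from r_0 (all ones) and, at step i, append a copy of every existing
    column with the sign of entry i flipped (the Schur product with r_i)."""
    cols = [[1] * n]                               # R^(0) = r_0
    for i in range(n):
        cols = cols + [c[:i] + [-c[i]] + c[i + 1:] for c in cols]
    return cols                                    # 2^n columns; column 1 is r_0

def randomization_pvalue(y_d):
    """Left-sided P value: the fraction of the 2^n column means of
    R^(n) (.) y_d that are <= the observed mean of y_d."""
    n = len(y_d)
    obs = sum(y_d) / n
    means = [sum(s * y for s, y in zip(col, y_d)) / n
             for col in randomization_matrix(n)]
    return sum(m <= obs + 1e-12 for m in means) / len(means)
```

For example, for \({\varvec{y}}_d = (-1, 2)^{\top }\) the four column means are 0.5, 1.5, \(-1.5\), \(-0.5\), and the left-sided P value is 3/4.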
4 The Boys’ Shoes Experiment
The boys’ shoes experiment was presented and analyzed on pages 99−102 and 107−115 of the book by Box, Hunter and Hunter [5], where the design used was called a randomized paired comparison design, a special randomized complete block design (RCBD). The same experiment was discussed lucidly and analyzed on pages 30−55 of the book by Easterling [10]. The purpose of this experiment was to compare two treatments, namely two shoe materials A and B. Material A was standard, and material B was a cheaper alternative; naturally, it was expected that B would more likely result in an increased amount of wear. Ten boys participated in the experiment. The two shoe materials were randomly assigned to the right and left shoe soles of each boy. The boys were blocks, and their two feet, or equivalently their right and left shoes, were experimental units. The experimental units were naturally homogeneous for each boy. Each treatment was replicated once within each block and hence ten times in the design. The boys wore their shoes for a specified period of time, after which the percent of wear was measured. Denote the observation for Treatment i in Block j by \(y_{ij}\), \(i = A, B; j = 1, 2,\ldots , 10\), and \(y_{Aj} - y_{Bj} = y_{dj}\). The two-way layout in Table 1 presents balanced data because there is an equal number of observations (one) for every combination of block and treatment. The data are given in Table 1. A positive value of \(y_{dj}\) in Table 1 indicates that material A wore more than material B, and a negative value of \(y_{dj}\), \(j = 1, \ldots , 10\), indicates that material A wore less than B. In Table 1, there are two positive values and eight negative values of \(y_{dj}\): in 80\(\%\) of the pairs, material A showed the lesser amount of wear.
The five negative values of \(y_{dj}\) resulted when the boys wore material A on their left foot and the three negative values of \(y_{dj}\) resulted when the boys wore material A on their right foot. The two positive values of \(y_{dj}\) resulted when the boys wore material A on their left foot.
The material A could be on either the left shoe or the right shoe; when it is on the right shoe, the material B is on the left shoe, and vice versa. The total number of possible random assignments of shoe materials A and B for the 10 boys is \(2^{10} = 1024\), and each of these 1024 possible random assignments is a design. Assuming the null hypothesis \(H_0\): the probability distribution of \(Y_d\) is symmetric about 0, to be true, the possible \(y_{dj}\) values for the \(j{\hbox {th}}\) boy are \(+ y_{dj}\) and \(- y_{dj}\), \(j = 1,\ldots ,10\). The randomization matrix \({\varvec{R}}^{(10)}\) of \({\varvec{y}}_d\) has 10 rows and \(2^{10} = 1024\) columns. From (6) and Table 3 (Appendix), we obtain the 1024 column means of \({\varvec{R}}^{(10)}{\odot }{\varvec{y}}_d\).
The observed \(y_{dj}\) values in Table 1 have the mean \({\bar{y}}^{obs}_{d} = - 0.41 = {\bar{y}}^{(1)}_d\) in (7). The 1024 column means are displayed in Table 3 where 14 mean values are highlighted:
\(-\,0.41 (4 \,{\hbox {times}}), -\,0.43, -\,0.45, -\,0.47, 0.47, 0.45, 0.43, 0.41 (4 \,{\hbox {times}})\).
The 7 out of 1024 mean values less than or equal to \(-0.41\) are
\(-0.41\) (4 times), \(-0.43\), \(-0.45\), \(-0.47\).
The P value for testing the null hypothesis against the alternative hypothesis, obtained by using (8), is \(7/1024 = 0.0068\).
The data provide sufficient evidence at the \(1 \%\) level of significance against the null hypothesis and in favor of the alternative hypothesis. Table 4 (Appendix) displays the R code and output for this computation, and our result matches the R output exactly.
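As a cross-check of the count of 7 above, the randomization distribution can be enumerated directly. The sketch below uses the ten paired differences \(y_{dj} = y_{Aj} - y_{Bj}\) from the published boys' shoes data (an assumption on our part, since Table 1 is not reproduced here; the values are consistent with the mean \(-0.41\) and the two positive, eight negative signs stated above):

```python
from itertools import product

# Paired differences y_dj = y_Aj - y_Bj for the ten boys (assumed from the
# published boys' shoes data; Table 1 is not reproduced in this text).
y_d = [-0.8, -0.6, -0.3, 0.1, -1.1, 0.2, -0.3, -0.5, -0.5, -0.3]
obs_mean = sum(y_d) / len(y_d)            # -0.41

# All 2^10 = 1024 column means of R^(10) (.) y_d, one per sign pattern.
means = [sum(s * y for s, y in zip(signs, y_d)) / len(y_d)
         for signs in product([1, -1], repeat=len(y_d))]

# Left-sided count and P value; a tiny tolerance absorbs float round-off.
count = sum(m <= obs_mean + 1e-9 for m in means)
p_value = count / len(means)
print(round(obs_mean, 2), count, p_value)  # -0.41 7 0.0068359375
```

The seven qualifying means are the four \(-0.41\) values together with \(-0.43\), \(-0.45\), \(-0.47\), reproducing the P value \(7/1024\).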
5 An Algorithm for Performing Permutation Test based on Two-Sample Data
In comparing the effects of two treatments A and B by using a completely randomized design (CRD) or a randomized control trial (RCT), the observations are \(y_{Aj}\), \(j = 1, \ldots , n_1\), and \(y_{Bj}\), \(j = 1, \ldots , n_2\), \(n = n_1 + n_2\). If there is no difference between the effects of the two treatments A and B, then the probability distribution of the observations on A is the same as the probability distribution of the observations on B. We now present an algorithm to introduce the permutation data matrix \({\varvec{D}_p}\) in \({\varvec{M}}_{n \times s}\) such that its s column vectors are independent identically distributed samples from the common probability distribution of the observations on A and B under the null hypothesis \(H_0\).
We denote \({\varvec{y}}_A = (y_{A1}, \ldots , y_{An_1})^{\top }\), \({\varvec{y}}_B = (y_{B1}, \ldots , y_{Bn_2})^{\top }\), and \({\varvec{y}} = ({\varvec{y}}_A^{\top }, {\varvec{y}}_B^{\top })^{\top } \in {\varvec{M}}_{n \times 1}\).
From the n! permutation matrices in \({\varvec{M}}_n\), we select the s permutation matrices \({\varvec{P}}^{(1)}, \ldots ,{\varvec{P}}^{(s)}\) such that \({\varvec{P}}^{(1)} = {\varvec{I}}_n\). We define the matrix \({\varvec{P}}^{(i)}_1\) in \({\varvec{M}}_{n_1 \times n}\) as the sub-matrix of \({\varvec{P}}^{(i)}\) having one element 1 and the other \((n - 1)\) elements 0 in every row, with the 1's in the specific columns corresponding to u observations in \({\varvec{y}}_A\) and \(v\) observations in \({\varvec{y}}_B\), \(u = 0, \ldots , n_1\) and \(v = 0, \ldots , n_2\), satisfying the condition \(u + v = n_1\). We define similarly the matrix \({\varvec{P}}^{(i)}_2\) in \({\varvec{M}}_{n_2 \times n}\) as the sub-matrix of \({\varvec{P}}^{(i)}\) having one element 1 and the other \((n - 1)\) elements 0 in every row, with the 1's in the specific columns corresponding to \((n_1 - u)\) observations in \({\varvec{y}}_A\) and \((n_2 - v)\) observations in \({\varvec{y}}_B\), satisfying the conditions \(u + v = n_1\) and \((n_1 - u) + (n_2 - v) = n_2\). We define the vectors \({\varvec{r}}_{01}\) in \({\varvec{M}}_{n_1 \times 1}\) and \({\varvec{r}}_{02}\) in \({\varvec{M}}_{n_2 \times 1}\) having all elements equal to 1. We write \({\varvec{D}_p} = [{\varvec{P}}^{(1)}{\varvec{y}}\; \cdots \;{\varvec{P}}^{(s)}{\varvec{y}}]\).
It can be checked that \({\varvec{P}}^{(i)}{\varvec{y}} = (({\varvec{P}}^{(i)}_1{\varvec{y}})^{\top }, ({\varvec{P}}^{(i)}_2{\varvec{y}})^{\top })^{\top }\), \(i = 1, \ldots , s\). We define
\(d^{(i)} = \frac{1}{n_1}\,{\varvec{r}}_{01}^{\top }{\varvec{P}}^{(i)}_1{\varvec{y}} - \frac{1}{n_2}\,{\varvec{r}}_{02}^{\top }{\varvec{P}}^{(i)}_2{\varvec{y}}, \quad i = 1, \ldots , s,\)
the difference between the means of the observations relabeled A and B by the permutation \({\varvec{P}}^{(i)}\).
The P value of Fisher’s permutation test based on two-sample data for testing the null hypothesis against the left-sided alternative hypothesis is given by
\(P\,{\hbox {value}} = \frac{\#\{\, i : d^{(i)} \le d^{obs},\; i = 1, \ldots , s \,\}}{s},\)
where \({d}^{obs} = d^{(1)}\).
6 The Tomato Fertilizer Experiment
The tomato fertilizer experiment was presented and analyzed by Box, Hunter, and Hunter [5] and by Easterling [10] in his Chapter 3. The design used for this experiment was a completely randomized design (CRD). The purpose was to compare the effects of two treatments, fertilizers A and B, on the yield from 11 tomato plants in terms of the total weight of tomatoes per plant. For our illustration of the permutation test algorithm, we consider the data for the first 5 plants in Table 2. The vectors are \({\varvec{y}}_A = (29.9, 11.4, 25.3)^\top\) and \({\varvec{y}}_B = (26.6, 23.7)^\top\), with \(n_1 = 3, n_2 = 2, n = 5\), and \(s = \left( {\begin{array}{c}n\\ n_1\end{array}}\right) = 10\). In Table 5 (Appendix), we observe that \(d^{(i)} \le d^{\hbox {obs}}\) for \(i = 1, 3, 6, 7, 9\). Therefore, the \(P {\hbox {value}}\) is equal to \(5/10 = 0.5\). This is exactly the number obtained by the R code in Table 6.
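A minimal Python sketch of Section 5's procedure applied to these data (our variable names; the statistic is taken to be the difference of treatment means, consistent with the P value of 0.5 reported above):

```python
from itertools import combinations

# Tomato data for the first five plants (Table 2): three observations on
# fertilizer A, two on fertilizer B.
y_A = [29.9, 11.4, 25.3]
y_B = [26.6, 23.7]
pooled = y_A + y_B
n, n1 = len(pooled), len(y_A)

d_obs = sum(y_A) / n1 - sum(y_B) / (n - n1)   # observed difference of means

# The s = C(5, 3) = 10 statistics d^(i): every relabeling of three pooled
# observations as treatment A; the first relabeling reproduces d_obs.
ds = []
for idx in combinations(range(n), n1):
    a = [pooled[i] for i in idx]
    b = [pooled[i] for i in range(n) if i not in idx]
    ds.append(sum(a) / n1 - sum(b) / (n - n1))

# Left-sided P value; a tiny tolerance absorbs float round-off.
p_value = sum(d <= d_obs + 1e-9 for d in ds) / len(ds)
print(len(ds), round(d_obs, 2), p_value)      # 10 -2.95 0.5
```

Five of the ten relabelings give \(d^{(i)} \le d^{obs} = -2.95\), reproducing the P value 0.5.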
7 Concluding Remarks
The paper presents the method of finding the randomization matrix \({\varvec{R}}^{(n)}\) for the paired comparison under a randomized complete block design. It also gives the method of selecting the s permutation matrices \({\varvec{P}}^{(i)}, i = 1, \ldots , s\), \(s < n!\), from the set of all n! permutation matrices for performing the two-independent-sample permutation test under a completely randomized design or a randomized control trial. The paper establishes the properties of such matrices. Two illustrative examples explain the proposed methods, whose results are compared with the corresponding R output.
References
Basu D (1980) Randomization analysis of experimental data: the Fisher randomization test. J Am Stat Assoc 75:575–582
Berry KJ, Johnston JE, Mielke PW (2018) The measurement of association: a permutation statistical approach. Springer, Switzerland
Boik RJ (1987) The Fisher–Pitman permutation test: a non-robust alternative to the normal theory F test when variances are heterogeneous. British J Math Statist Psychol 40:26–42
Box GEP, Andersen SL (1955) Permutation theory in the derivation of robust criteria and the study of departures from assumption. J R Stat Soc Ser B 17:1–34
Box GEP, Hunter WG, Hunter JS (1978) Statistics for experimenters: an introduction to design, data analysis, and model building. Wiley, New York
Calinski T, Kageyama S (2000) Block designs: a randomization approach. Vol 1: Analysis. Springer, New York
Calinski T, Kageyama S (2003) Block designs: a randomization approach. Vol 2: Design. Springer, New York
David HA (2008) The beginnings of randomization tests. Am Stat 62:70–72
Dugard P (2014) Randomization tests: a new gold standard? J Context Behav Sci 3:65–68
Easterling RG (2015) Fundamentals of statistical experimental design and analysis. Wiley, New York
Edgington ES, Onghena P (2007) Randomization tests, 4th edn. Chapman & Hall, New York
Efron B, Hastie T (2016) Computer age statistical inference: algorithms, evidence, and data science. Cambridge University Press, New York
Ernst MD (2004) Permutation methods: a basis for exact inference. Stat Sci 19:676–685
Ferron JM, Levin JR (2014) Single-case permutation and randomization statistical tests: Present status, promising new developments. In: Kratochwill TR, Levin JR (eds) Single-case intervention research: statistical and methodological advances. American Psychological Association, Washington, pp 153–183
Fisher RA (1935) The design of experiments. Oliver & Boyd, London
Good PI (2005) Permutation, parametric and bootstrap tests of hypotheses, 3rd edn. Springer, New York
Hoeffding W (1952) The large-sample power of tests based on permutations of observations. Ann Math Stat 23:169–192
Kempthorne O (1955) The randomization theory of experimental inference. J Am Stat Assoc 50:946–967
Kempthorne O, Doerfler TE (1969) The behaviour of some significance tests under experimental randomization. Biometrika 56:231–248
Lehmann EL (1975) Non-parametrics: statistical methods based on ranks. Holden-Day, San Francisco
Ludbrook J, Dudley H (1998) Why permutation tests are superior to t and F tests in biomedical research. Am Stat 52:127–132
Manly BFJ (2007) Randomization, bootstrap and Monte Carlo methods in biology, 3rd edn. Chapman & Hall/CRC, Boca Raton
Mielke PW, Berry KJ (2007) Permutation methods: a distance function approach, 2nd edn. Springer, New York
Onghena P (2018) Randomization and the randomization test: two sides of the same coin. In: Berger V (ed) Randomization, masking, and allocation concealment. Chapman & Hall/CRC Press, Boca Raton, pp 185–207
Pesarin F, Salmaso L (2010) Permutation tests for complex data: theory, applications and software. Wiley, Chichester
Pitman EJG (1937a) Significance tests which may be applied to samples from any populations, I. Suppl J R Stat Soc 4:119–130
Pitman EJG (1937b) Significance tests which may be applied to samples from any populations, II, the correlation coefficient test. Suppl J R Stat Soc 4:225–232
Pitman EJG (1938) Significance tests which may be applied to samples from any populations, III. The analysis of variance test. Biometrika 29:322–335
Rao CR (1973) Linear statistical inference and its applications. Wiley, New York
Tukey JW (1962) The future of data analysis. Ann Math Stat 33:1–67
Tukey JW (1988) Randomization and rerandomization: the wave of the past in the future. Ciminera Symposium, Philadelphia
Wald A, Wolfowitz J (1944) Statistical tests based on permutations of the observations. Ann Math Stat 15:358–372
Welch BL (1937) On the z-test in randomized blocks and Latin squares. Biometrika 29:21–52
Yates F (1964) Sir Ronald Fisher and the design of experiments. Biometrics 20:307–321
Part of special issue guest edited by Dieter Rasch, Jürgen Pilz, and Subir Ghosh—Advances in Statistical and Simulation Methods.
Ghosh, S. Exact Permutation/Randomization Tests Algorithms. J Stat Theory Pract 14, 65 (2020). https://doi.org/10.1007/s42519-020-00131-6