Abstract
Chapter 2 introduces a generalized Minkowski distance function that is the basis for a set of multi-response permutation procedures for univariate and multivariate completely randomized data. Multi-response permutation procedures constitute a class of permutation methods for one or more response measurements that are designed to distinguish possible differences among two or more groups. The multi-response permutation procedures provide a synthesizing foundation for a variety of statistical tests and measures developed in successive chapters.
This second chapter of Permutation Statistical Methods introduces a generalized distance function that provides the foundation for a set of multi-response permutation procedures specifically designed for univariate and multivariate completely randomized data. Multi-Response Permutation Procedures (MRPP) were introduced by Mielke, Berry, and Johnson in 1976 and constitute a class of permutation methods for one or more response measurements on each object that were initially developed to distinguish possible differences among two or more groups of objects [300]. The multi-response permutation procedures presented here are based on a generalized Minkowski distance function and provide a synthesizing foundation for a variety of statistical tests and measures for completely randomized data that are further developed in Chaps. 3–7.
2.1 Minkowski Distance Function
Hermann Minkowski (1864–1909), German mathematician and creator of the geometry of numbers, utilized geometrical methods to solve problems in number theory, mathematical physics, and the theory of relativity. Minkowski was a close friend of David Hilbert while teaching at Königsberg University and taught Albert Einstein while employed at the Eidgenössische Polytechnikum in Zürich (now, ETH Zürich). In 1891 Minkowski introduced a measure of metric distance between two points in Crelle’s Journal [310]. The Minkowski metric distance of order p between two points in an r-dimensional Euclidean space, \(x^{{\prime}} = (x_{1},x_{2},\,\ldots,\,x_{r})\) and \(y^{{\prime}} = (y_{1},y_{2},\,\ldots,\,y_{r}) \in \mathbb{R}^{r}\), is given by
\[d_{p}(x,y) = \left (\sum _{i=1}^{r}\left \vert x_{i} - y_{i}\right \vert ^{p}\right )^{1/p},\]
where p ≥ 1.
The Minkowski distance function is typically used with p = 1, 2, or \(\infty \). When p = 1, the distance is a first-order Minkowski metric, often called a city-block, Manhattan [231], rectilinear [54], or taxicab [222] metric, the latter named for the distance between two points that a car or taxicab would drive in a city laid out in square blocks. When p = 2, the distance is a second-order Minkowski metric and is the ordinary Euclidean distance between points, a generalization of the Pythagorean theorem to more than two coordinates. When \(p = \infty \), the Minkowski metric is known as the Tchebycheff (Chebyshev), von Neumann, or, in the two-dimensional case, the chess-board Minkowski distance [167].
Conventional statistical tests and measures, such as t tests, F tests, and ordinary least-squares (OLS) regression and correlation, are based on squared Euclidean distances between response measurement scores, which are not metric. The Minkowski distance function, however, is limited to metric distances and, under its standard definition, cannot accommodate most conventional statistical tests. Therefore, consider a generalized Minkowski distance function given by
\[\Delta (x,y) = \left (\sum _{i=1}^{r}\left \vert x_{i} - y_{i}\right \vert ^{p}\right )^{v/p}, \tag{2.1}\]
where p ≥ 1 and v > 0 [297, p. 5]. When r ≥ 2, p = 2, and v = 1, \(\Delta (x,y)\) is rotationally invariant. When \(v = p = 1\), \(\Delta (x,y)\) is a city-block metric, which is not rotationally invariant. When v = 1 and p = 2, \(\Delta (x,y)\) is an ordinary Euclidean distance metric. And when \(v = p = 2\), \(\Delta (x,y)\) is a squared Euclidean distance, which is not a metric distance function since the triangle inequality is not satisfied.
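The generalized distance function can be sketched in a few lines of Python. The function name `generalized_minkowski` and the sample points are illustrative choices, not part of the original text; the points are borrowed from the bivariate example in Sect. 2.2.4. The last lines show, for three collinear points, why squared Euclidean distance (\(v = p = 2\)) fails the triangle inequality.

```python
def generalized_minkowski(x, y, p=2.0, v=1.0):
    """Return (sum_i |x_i - y_i|**p) ** (v / p) for p >= 1 and v > 0."""
    if p < 1 or v <= 0:
        raise ValueError("requires p >= 1 and v > 0")
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of coordinates")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (v / p)


x, y = (5, 1), (4, 6)  # illustrative points in r = 2 dimensions
city_block = generalized_minkowski(x, y, p=1, v=1)         # |5-4| + |1-6|
euclidean = generalized_minkowski(x, y, p=2, v=1)          # square root of 26
squared_euclidean = generalized_minkowski(x, y, p=2, v=2)  # 1 + 25

# Squared Euclidean distance is not a metric: for the collinear points
# 0, 1, 2 the direct distance exceeds the sum of the two shorter legs.
a, b, c = (0,), (1,), (2,)
direct = generalized_minkowski(a, c, p=2, v=2)
detour = (generalized_minkowski(a, b, p=2, v=2)
          + generalized_minkowski(b, c, p=2, v=2))
print(city_block, euclidean, squared_euclidean, direct > detour)
```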
2.2 Multi-response Permutation Procedures
Multi-Response Permutation Procedures (MRPP) were originally designed to statistically determine possible differences in one or more response measurement scores among two or more groups of objects or subjects [300]. Let \(\Omega =\{\omega _{1},\,\ldots,\,\omega _{N}\}\) denote a finite sample of N objects that represents a target population, let \(x_{i}^{\,{\prime}} = (x_{1i},\,\ldots,\,x_{ri})\) be a transposed vector of r commensurate response measurement scores for object \(\omega _{i}\), i = 1, …, N, and let \(S_{1},\,\ldots,\,S_{g}\) designate an exhaustive partitioning of the N objects into g disjoint treatment groups. The MRPP test statistic is a weighted mean given by
\[\delta =\sum _{i=1}^{g}C_{i}\,\xi _{i}, \tag{2.2}\]
where \(C_{i} > 0\) is a positive weight for treatment group \(S_{i}\), i = 1, …, g, \(\sum _{i=1}^{g}C_{i} = 1\),
\[\xi _{i} = \binom{n_{i}}{2}^{-1}\sum _{j<k}\Delta (x_{j},x_{k})\,\Psi _{i}(\omega _{j})\,\Psi _{i}(\omega _{k}) \tag{2.3}\]
is the average distance-function value for all distinct pairs of objects in treatment group \(S_{i}\), i = 1, …, g, \(n_{i} \geq 2\) is the number of a priori objects classified into treatment group \(S_{i}\), i = 1, …, g,
\[N =\sum _{i=1}^{g}n_{i},\]
\(\sum _{j<k}\) is the sum over all j and k such that 1 ≤ j < k ≤ N, and \(\Psi (\cdot )\) is an indicator function given by
\[\Psi _{i}(\omega _{j}) = \begin{cases} 1 & \text{if }\omega _{j} \in S_{i}, \\ 0 & \text{otherwise}. \end{cases}\]
The choice of the treatment-group weights, \(C_{1},\,\ldots,\,C_{g}\), and the generalized Minkowski distance function given in Eq. (2.1) on p. 30 specify the structure of MRPP. The original choice of \(C_{i}\) given by Mielke, Berry, and Johnson in 1976 was
for i = 1, …, g [300]. However, a variety of other treatment-group weights can be considered; for example,
for i = 1, …, g. The efficient choice of \(C_{i} = n_{i}/N\), i = 1, …, g, forces the population variance, \(\sigma _{x}^{2}\), to be proportional to \(N^{-2}\) and eliminates all terms of order \(1/N\) in the variance of δ [297, pp. 26, 30].
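The null hypothesis (\(H_{0}\)) states that equal probabilities are assigned to each of the
\[M = \frac{N!}{\prod _{i=1}^{g}n_{i}!}\]
possible, equally-likely allocations of the N objects to the g treatment groups, \(S_{1},\,\ldots,\,S_{g}\). Under \(H_{0}\) the N multi-response measurements are exchangeable multivariate random variables. The probability value associated with an observed value of δ, \(\delta _{\text{o}}\), is the probability under the null hypothesis (\(H_{0}\)) of observing a value of δ as extreme or more extreme than \(\delta _{\text{o}}\). Thus, an exact probability value for \(\delta _{\text{o}}\) may be expressed as
\[P(\delta \leq \delta _{\text{o}}\vert H_{0}) = \frac{\text{number of }\delta \text{ values} \leq \delta _{\text{o}}}{M}.\]
When M is very large, an approximate probability value for δ may be obtained from a resampling procedure, where
\[P(\delta \leq \delta _{\text{o}}\vert H_{0}) = \frac{\text{number of }\delta \text{ values} \leq \delta _{\text{o}}}{L}\]
and L denotes the number of randomly sampled test statistic values. Typically, L is set to a large number, such as the L = 1,000,000 recommended in the next subsection, to ensure accuracy.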
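The resampling procedure can be sketched as follows. This is a minimal illustration, assuming the weights \(C_{i} = n_{i}/N\) and a simple shuffle-and-split scheme; the function names, the seed, and the tolerance used for tied δ values are hypothetical choices, and the data are the univariate scores analyzed in Sect. 2.2.2.

```python
import random
from itertools import combinations


def distance(x, y, p=2.0, v=1.0):
    """Generalized Minkowski distance between two coordinate tuples."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (v / p)


def mrpp_delta(groups, p=2.0, v=1.0):
    """Weighted mean of within-group average distances, with C_i = n_i / N."""
    n_total = sum(len(g) for g in groups)
    delta = 0.0
    for g in groups:
        pairs = list(combinations(g, 2))
        xi = sum(distance(a, b, p, v) for a, b in pairs) / len(pairs)
        delta += len(g) / n_total * xi
    return delta


def resampling_p_value(groups, n_resamples=10_000, p=2.0, v=1.0, seed=12345):
    """Proportion of L random allocations with delta <= observed delta."""
    rng = random.Random(seed)
    observed = mrpp_delta(groups, p, v)
    pooled = [obj for g in groups for obj in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:  # cut the shuffled list into groups of fixed size
            shuffled.append(pooled[start:start + size])
            start += size
        # small tolerance so that tied delta values count as "equal"
        if mrpp_delta(shuffled, p, v) <= observed + 1e-9:
            hits += 1
    return hits / n_resamples


groups = [[(5,), (4,)], [(2,), (3,), (7,), (9,)]]
p_hat = resampling_p_value(groups, n_resamples=20_000, v=2.0)
print(p_hat)  # should lie close to the exact value 10/15
```

With only M = 15 possible arrangements an exact enumeration is preferable for these data; the resampling estimate is shown only so that it can be compared with a known exact value.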
Number of Resamplings Necessary
Exact permutation tests are restricted to relatively small samples, given the large number of possible permutations. On the other hand, resampling permutation tests are not limited by the size of the samples. Resampling permutation tests also have been shown to provide good approximations to exact probability values as a function of the number of resamplings considered. An early concern regarding the systematic use of resampling permutation tests was the speed of the computers used for calculating the probability values. Given modern high-speed computers, the question of computational speed is moot when probability values are not too small. The remaining question is: how many resamplings are required for a specified accuracy?
The number of resamplings suggested in books and articles on permutation methods varies widely and is likely dated, reflecting earlier limitations of computer speed and memory. Proposals range from as few as 100 to as many as 5,000 resamplings; for example, see discussions by Dwass in 1957 [100]; Hope in 1968 [180]; Edwards in 1985 [110]; Jockel in 1986 [193]; Keller-McNulty and Higgins in 1987 [199]; Bailer in 1989 [16]; Kim, Nelson, and Startz in 1991 [216]; Manly in 1991 [258, pp. 32–35]; McQueen in 1992 [274]; Rickerts and Berry in 1994 [347]; Kennedy in 1995 [212]; Maxim in 1999 [265, p. 356]; Lunneborg in 2000 [256, pp. 210–213]; Good in 2001 [149, p. 47]; Higgins in 2004 [176]; and Edgington and Onghena in 2007 [109, pp. 40–41]. On the other hand, examples provided by Howell as recently as 2007 utilized as many as 10,000 resamplings [184, pp. 642–646]. Resampling computing packages such as Resampling Stats [14] and StatXact [15] typically use 10,000 resamplings as the default value.
The accuracy of a resampling probability value depends on both the probability value (P) and the number of resamplings (L). Confidence limits on the probability value can be obtained from the binomial distribution when L is large. The 1 −α confidence limits of the binomial distribution are given by
where P is the probability value in question and \(\hat{P }\) denotes the estimated value of P. Define
for i = 1, …, L, where \(\hat{P }_{\text{o}}\) denotes the observed value of \(\hat{P }\). Then \(\hat{P }\), the expected value of \(\hat{P }\), the variance of \(\hat{P }\), and the skewness of \(\hat{P }\) are given by
and
respectively [195, p. 916]. If L is small and P is close to either 0 or 1, the skewness term \(\gamma _{\hat{P } }\) becomes large and Eq. (2.4) may not be appropriate. For example, if L = 100 and P = 0.01,
Table 2.1 lists a selected number of probability values (P = 0.50, 0.25, 0.10, 0.05, and 0.01), a range of resampling sizes (L = 100, 1000, 10,000, 1,000,000, and 100,000,000), computed skewness values, errors on the 95% confidence limits determined from Eq. (2.4), and the simulated lower and upper errors on the 95% confidence limits based on L resamplings and determined from the smallest value for which the cumulative binomial distribution is equal to or less than 0.025 and equal to or greater than 0.975, respectively. In general, as can be seen from Table 2.1, two additional orders of magnitude are required to increase accuracy by just one decimal place, because the width of the confidence interval shrinks only in proportion to \(1/\sqrt{L}\).
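The dependence of accuracy on P and L can be explored numerically. The sketch below assumes the standard moments of a binomial proportion, namely variance \(P(1 - P)/L\) and skewness \((1 - 2P)/\sqrt{LP(1 - P)}\), together with a normal-approximation half-width; these textbook formulas are assumed to correspond to Eq. (2.4) and the accompanying moments, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def resampling_accuracy(p, n_resamples, alpha=0.05):
    """Return (half_width, skewness) for an estimated probability value.

    half_width: normal-approximation error on the 1 - alpha confidence limits.
    skewness:   skewness of the binomial proportion; a large value signals
                that the symmetric normal approximation is questionable.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    variance = p * (1 - p) / n_resamples
    half_width = z * math.sqrt(variance)
    skewness = (1 - 2 * p) / math.sqrt(n_resamples * p * (1 - p))
    return half_width, skewness


# Small L with P near zero: the skewness is close to 1.
half_width_small, skew_small = resampling_accuracy(0.01, 100)

# Raising L by two orders of magnitude shrinks the error by a factor of 10.
half_width_1e4, _ = resampling_accuracy(0.05, 10_000)
half_width_1e6, _ = resampling_accuracy(0.05, 1_000_000)
print(skew_small, half_width_1e4 / half_width_1e6)
```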
To illustrate the number of resamplings required to yield a predetermined number of decimal places of accuracy, given a known probability value, consider the interval-level data listed in Fig. 2.1.
The data listed in Fig. 2.1 are adapted from Berry, Mielke, and Mielke [38] and represent soil lead (Pb) quantities from two school districts in metropolitan New Orleans. Elevated Pb levels have been linked to a number of physiological, neurological, and endocrine effects in children, including difficulties in learning, perception, social behavior, and fine motor skills. The \(n_{1} = 20\) soil lead samples collected in District 1 yielded a mean value of \(\bar{x}_{1} = 203.9350\) mg/kg and the \(n_{2} = 20\) soil lead samples collected in District 2 yielded a mean value of \(\bar{x}_{2} = 1,661.7800\) mg/kg. There are
\[M = \frac{(n_{1} + n_{2})!}{n_{1}!\,n_{2}!} = \frac{40!}{20!\,20!} = 137{,}846{,}528{,}820\]
possible permutations of the soil lead data listed in Fig. 2.1 to be considered. Under the null hypothesis of no difference between the two group means in the population, a Fisher–Pitman permutation F test [38] yields an exact two-sided probability value of
\[P = 0.0149182123\]
for the soil lead data listed in Fig. 2.1. Figure 2.2 summarizes the results for eight different resamplings of the data listed in Fig. 2.1 and the associated two-sided resampling probability values with α = 0.05. Each of the probability values was generated using a common seed and the same pseudorandom number generator [197]. The last row of Fig. 2.2 contains the exact probability value based on all M = 137,846,528,820 possible permutations of the soil lead data listed in Fig. 2.1.
Given the results of the resampling probability analyses listed in Fig. 2.2, L = 1,000,000 is recommended whenever three decimal places of accuracy are required. There are four reasons for promoting L = 1,000,000 resamplings: accuracy, practicality, error, and consistency. First, inspection of Fig. 2.2 indicates that with an exact probability value of P = 0.0149182123 and α = 0.05, L = 1,000,000 resamplings is the minimum number of resamplings necessary to ensure three decimal places of accuracy. Second, given the speed of modern computers and the efficiency of pseudorandom number generators such as the Mersenne Twister, L = 1,000,000 resamplings can be used on a routine basis. Third, there is the potential for additional type I error, the magnitude of which is of concern when the number of resamplings (L) is very small. Fourth, some researchers object to the use of resampling statistics because different pseudorandom number generators and different seeds can produce widely varying results. This is certainly true when L is very small. For example, in Fig. 2.2, L = 100 yields a probability value of P = 0.06. Varying the seed with L = 100 and the same pseudorandom number generator produced observed probability values ranging from P = 0.01 to P = 0.11. However, with L = 1,000,000, varying the seed produced no differences in the third decimal place.
When the number of possible arrangements (M) is very large and the exact probability value (P) is exceedingly small, a resampling permutation procedure may produce no δ values equal to or less than \(\delta _{\text{o}}\), even with L = 1,000,000, yielding an approximate resampling probability value of P = 0.00. In such cases, moment-approximation permutation procedures based on fitting the first three exact moments of the discrete permutation distribution to a Pearson type III distribution provide approximate probability values, as detailed in Chap. 1, Sect. 1.2.2; see also references [284] and [300].
An Index of Agreement
It is oftentimes desirable to have an index of the amount of agreement among response measurement scores within g treatment groups. A useful measure for this purpose is a chance-corrected within-group coefficient of agreement given by
\[\mathfrak{R} = 1 - \frac{\delta }{\mu _{\delta }}, \tag{2.5}\]
where \(\mu _{\delta }\) is the arithmetic average of the M δ values calculated on all possible arrangements of the observed response measurement scores, given by
\[\mu _{\delta } = \frac{1}{M}\sum _{i=1}^{M}\delta _{i}. \tag{2.6}\]
\(\mathfrak{R}\) is a chance-corrected measure of agreement since \(\mathrm{E}[\mathfrak{R}\vert H_{0}] = 0\). Because \(\mu _{\delta }\) is a constant under \(H_{0}\), the permutation distributions of δ and \(\mathfrak{R}\) are equivalent, viz.,
\[P(\delta \leq \delta _{\text{o}}\vert H_{0}) = P(\mathfrak{R}\geq \mathfrak{R}_{\text{o}}\vert H_{0}),\]
where
\[\mathfrak{R}_{\text{o}} = 1 - \frac{\delta _{\text{o}}}{\mu _{\delta }}\]
and \(\delta _{\text{o}}\) and \(\mathfrak{R}_{\text{o}}\) denote the observed values of δ and \(\mathfrak{R}\), respectively. Possible values of \(\mathfrak{R}\) range from slightly negative values to a maximum of \(\mathfrak{R} = +1\) for the extreme case when all response measurements on objects within each of the g classified treatment groups are identical, i.e., δ = 0.
The generalized Minkowski distance function, \(\Delta (x,y)\), as defined in Eq. (2.1) on p. 30, determines the analysis space of the MRPP test statistic, δ. The data space in question for almost all statistical analyses is an ordinary Euclidean distance space. If the distance function of the MRPP test statistic is based on p = 2 and v = 1, then the data and analysis spaces are congruent, so that the resulting statistical analyses represent the data in question. Unfortunately, commonly used statistical analyses based on the arithmetic mean, such as Student’s two-sample t test and Fisher’s one-way analysis of variance, are based on \(p = v = 2\), yielding a non-metric squared-distance analysis space that is not congruent with the data space. The difference between the data and analysis spaces associated with the most popular statistical analyses is a reason that problems occur with what should be routine analyses. Examples illustrating this problem are given elsewhere; see, for example, references [41, pp. 404–410] and [297, pp. 50–53]. Any statistical analysis is questionable when the data and analysis spaces are not congruent.
2.2.1 Chance-Corrected Agreement Measures
Chance-corrected measures yield values that are interpreted as a proportion above that expected by chance alone. Chance-corrected agreement measures provide clear and meaningful interpretations of the amount of, or lack of, agreement present in the data. In general, chance-corrected measures of agreement, such as \(\mathfrak{R}\), are equal to +1 when perfect agreement among the response measurement scores occurs, 0 when agreement is equal to that expected under independence, and negative when agreement among the response measurement scores is less than that expected by chance. For example, define a chance-corrected measure such that
\[\text{Grade}_{i} = 100 \times \frac{O_{i} - E_{i}}{N - E_{i}},\]
where \(O_{i}\) and \(E_{i}\) denote, respectively, the observed (earned) score and the score expected by chance from purely guessing on a multiple-choice examination with N questions for the ith student in a class of m students [175, p. 912].
Thus, on a 50-question multiple-choice examination with five choices per question, chance would indicate that a student could answer 50 × 0.20 = 10 questions correctly simply by guessing. If a student answered only eight questions correctly, then a chance-corrected measure of agreement would yield a grade of
\[100 \times \frac{8 - 10}{50 - 10} = -5\]
since the score was less than expected by chance, i.e., only eight of 50 questions were answered correctly. The lowest grade would occur when a student answered all 50 questions incorrectly, yielding a score of
\[100 \times \frac{0 - 10}{50 - 10} = -25.\]
Note that while a student with the highest possible score of 50 correct answers would score
\[100 \times \frac{50 - 10}{50 - 10} = +100,\]
the lowest possible score is −25, not −100. Thus, the distributions of chance-corrected measures are usually asymmetric.
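The examination example is easy to check in code. The sketch below assumes the chance-corrected grade has the form 100 × (observed − expected)/(N − expected), which reproduces the bounds of −25 and +100 stated in the text; the function name is a hypothetical choice.

```python
def chance_corrected_grade(n_correct, n_questions, n_choices):
    """Grade on a scale where 0 is pure guessing and +100 is a perfect score."""
    expected = n_questions / n_choices  # questions answered correctly by chance
    return 100.0 * (n_correct - expected) / (n_questions - expected)


below_chance = chance_corrected_grade(8, 50, 5)   # two fewer than chance
all_wrong = chance_corrected_grade(0, 50, 5)      # lower bound of the scale
all_right = chance_corrected_grade(50, 50, 5)     # upper bound of the scale
print(below_chance, all_wrong, all_right)
```

The lower bound depends on the number of choices per question, which is why the scale is asymmetric around zero.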
Since the mean value of \(\mathfrak{R}\) under H 0 is 0, homogeneity of within-classified-group response measurements is associated with \(\mathfrak{R} > 0\), and heterogeneity of within-classified-group response measurements is associated with \(\mathfrak{R}\leq 0\) [28]. The distribution of \(\mathfrak{R}\) is usually asymmetric and the upper and lower bounds depend on both the nature of the data and the structure of δ. The degree of homogeneity or heterogeneity depends on the discrete permutation distribution of \(\mathfrak{R}\). If large values of n 1, …, n g and N are involved, a very small value of \(P(\delta \leq \delta _{\text{o}}\vert H_{0})\) may be associated with a small positive observed value of \(\mathfrak{R}\), say \(\mathfrak{R}_{\text{o}}\). Conversely, with small values of n 1, …, n g and N, a large value of \(\mathfrak{R}_{\text{o}}\) may be associated with a relatively large value of \(P(\delta \leq \delta _{\text{o}}\vert H_{0})\).
2.2.2 Example Univariate MRPP Analysis with v = 2
Although multi-response permutation procedures were originally designed for analyzing multivariate response measurement scores, they can also be used for analyzing univariate data. Consider a comparison between two mutually exclusive groups of objects, S 1 and S 2, where a single response measurement, x, has been obtained from each object. For this example, there is r = 1 response measurement score for each object, g = 2 disjoint groups, and a total of N = 6 objects with n 1 = 2 and n 2 = 4 in treatment groups S 1 and S 2, respectively. Suppose that the n 1 = 2 observed response measurement scores for treatment group S 1 are {5, 4} and the n 2 = 4 response measurement scores for treatment group S 2 are {2, 3, 7, 9}. The treatment-group sizes and the response measurement scores are deliberately kept small to simplify the example analysis. The treatment-group sizes and the univariate response measurement scores are listed in Fig. 2.3.
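For this example analysis, let v = 2, p = 2, r = 1,
\[C_{1} = \frac{n_{1}}{N} = \frac{2}{6}\quad \text{and}\quad C_{2} = \frac{n_{2}}{N} = \frac{4}{6},\]
so that the \(S_{1}\) and \(S_{2}\) treatment groups are weighted proportional to their group sizes of \(n_{1} = 2\) and \(n_{2} = 4\), respectively. For univariate response measurement scores with r = 1, Eq. (2.1) on p. 30 reduces to
\[\Delta (x,y) = \left \vert x - y\right \vert ^{v}. \tag{2.7}\]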
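Thus, for treatment group \(S_{1}\) with \(n_{1} = 2\) objects, p = 2, and v = 2, the generalized Minkowski distance function yields
\[\Delta (5,4) = \left \vert 5 - 4\right \vert ^{2} = 1,\]
and for treatment group \(S_{2}\) with \(n_{2} = 4\) objects, the generalized Minkowski distance function yields
\[\begin{aligned} \Delta (2,3) & = \left \vert 2 - 3\right \vert ^{2} = 1, &\Delta (2,7) & = \left \vert 2 - 7\right \vert ^{2} = 25, &\Delta (2,9) & = \left \vert 2 - 9\right \vert ^{2} = 49, \\ \Delta (3,7) & = \left \vert 3 - 7\right \vert ^{2} = 16, &\Delta (3,9) & = \left \vert 3 - 9\right \vert ^{2} = 36, &\Delta (7,9) & = \left \vert 7 - 9\right \vert ^{2} = 4. \end{aligned}\]
Then following Eq. (2.3) on p. 31, the average distance-function values for all distinct pairs of objects in treatment groups \(S_{i}\), i = 1, 2, are
\[\xi _{1} = \binom{2}{2}^{-1}(1) = 1.0000\]
and
\[\xi _{2} = \binom{4}{2}^{-1}(1 + 25 + 49 + 16 + 36 + 4) = \frac{131}{6} = 21.8333.\]
Following Eq. (2.2) on p. 31, the observed weighted mean of the \(\xi _{1}\) and \(\xi _{2}\) values, based on v = 2 and \(C_{i} = n_{i}/N\) for i = 1, 2 is
\[\delta _{\text{o}} = C_{1}\xi _{1} + C_{2}\xi _{2} = \frac{2}{6}(1.0000) + \frac{4}{6}(21.8333) = 14.8889.\]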
Smaller values of \(\delta _{\text{o}}\) indicate a concentration of response measurement scores within the g treatment groups, whereas larger values of \(\delta _{\text{o}}\) indicate a lack of concentration between response measurement scores among the g treatment groups [301]. The N = 6 objects can be partitioned into g = 2 treatment groups, \(S_{1}\) and \(S_{2}\), with \(n_{1} = 2\) and \(n_{2} = 4\) response measurement scores preserved for each arrangement of the observed data in
\[M = \frac{N!}{n_{1}!\,n_{2}!} = \frac{6!}{2!\,4!} = 15\]
possible, equally-likely ways. The M = 15 possible arrangements of the observed data in Fig. 2.3, along with the corresponding \(\xi _{1}\), \(\xi _{2}\), and δ values, are listed in Table 2.2 and ordered by the δ values from lowest to highest. The observed MRPP test statistic, \(\delta _{\text{o}} = 14.8889\), obtained from the realized arrangement,
\[S_{1} =\{ 5,\,4\}\quad \text{and}\quad S_{2} =\{ 2,\,3,\,7,\,9\}\]
(Order 9 in Table 2.2) is not unusual since five of the remaining δ values (\(\delta _{11}\mbox{ to }\delta _{15}\)) exceed the observed value of \(\delta _{\text{o}} = 14.8889\) and 10 values of δ (\(\delta _{1}\) to \(\delta _{10}\)) are equal to or less than the observed value. If all arrangements of the N = 6 observed response measurement scores listed in Fig. 2.3 occur with equal chance, the exact probability value of \(\delta _{\text{o}} = 14.8889\) computed on the M = 15 possible arrangements of the observed data with \(n_{1} = 2\) and \(n_{2} = 4\) response measurement scores preserved for each arrangement is
\[P(\delta \leq \delta _{\text{o}}\vert H_{0}) = \frac{\text{number of }\delta \text{ values} \leq \delta _{\text{o}}}{M} = \frac{10}{15} = 0.6667.\]
For comparison, a conventional Student two-sample pooled t test calculated on the N = 6 response measurement scores listed in Fig. 2.3 yields an observed value of \(t_{\text{o}} = -0.3004\). Assuming independence, normality, and homogeneity of variance, t is approximately distributed as Student’s t under the null hypothesis with \(N - 2 = 6 - 2 = 4\) degrees of freedom. Under the null hypothesis, the observed value of \(t_{\text{o}} = -0.3004\) yields an approximate two-sided probability value of P = 0.7789.
Following Eq. (2.6) on p. 37, the exact average value of the M = 15 δ values listed in Table 2.2 is \(\mu _{\delta } = 13.60\). Thus, the observed chance-corrected coefficient of agreement, following Eq. (2.5) on p. 37, is
\[\mathfrak{R}_{\text{o}} = 1 - \frac{\delta _{\text{o}}}{\mu _{\delta }} = 1 - \frac{14.8889}{13.60} = -0.0948,\]
indicating that within-group agreement is slightly below that expected by chance.
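The univariate analysis with v = 2 can be reproduced by enumerating all M = 15 arrangements. The following is a minimal sketch for two treatment groups, assuming \(C_{i} = n_{i}/N\); the function names and the small tolerance used to treat tied δ values as equal are illustrative choices rather than part of the original procedure.

```python
from itertools import combinations


def distance(x, y, p=2.0, v=1.0):
    """Generalized Minkowski distance between two coordinate tuples."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (v / p)


def xi(points, p, v):
    """Average distance over all distinct pairs of objects in one group."""
    pairs = list(combinations(points, 2))
    return sum(distance(a, b, p, v) for a, b in pairs) / len(pairs)


def delta(group1, group2, p=2.0, v=1.0):
    """MRPP statistic for two groups with weights C_i = n_i / N."""
    n = len(group1) + len(group2)
    return (len(group1) / n * xi(group1, p, v)
            + len(group2) / n * xi(group2, p, v))


def exact_mrpp(group1, group2, p=2.0, v=1.0, tol=1e-9):
    """Return observed delta, exact P, mean delta, and agreement index."""
    pooled = list(group1) + list(group2)
    observed = delta(group1, group2, p, v)
    deltas = []
    for idx in combinations(range(len(pooled)), len(group1)):
        g1 = [pooled[i] for i in idx]
        g2 = [pooled[i] for i in range(len(pooled)) if i not in idx]
        deltas.append(delta(g1, g2, p, v))
    p_value = sum(d <= observed + tol for d in deltas) / len(deltas)
    mu = sum(deltas) / len(deltas)
    return observed, p_value, mu, 1 - observed / mu


s1 = [(5,), (4,)]
s2 = [(2,), (3,), (7,), (9,)]
observed, p_value, mu, agreement = exact_mrpp(s1, s2, p=2.0, v=2.0)
print(round(observed, 4), round(p_value, 4), round(mu, 4), round(agreement, 4))
```

The tolerance matters here because two of the 15 arrangements share the observed value of δ, and floating-point summation order could otherwise split the tie.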
2.2.3 Example Univariate MRPP Analysis with v = 1
Permutation statistical tests and measures are data-dependent, distribution-free, and non-parametric; consequently, they require no distributional assumptions and make no estimates of population parameters. Thus, it is not necessary to set v = 2 and to square the response-measurement differences between objects. While conventional tests and measures that assume normality must estimate the mean and variance, μ x and \(\sigma _{x}^{2}\), of the normal distribution, both of which are based on squared deviations from the mean, permutation tests and measures do not assume normality and are not restricted to v = 2, which is not a metric distance function. A distance function based on v = 1 is an attractive alternative to v = 2 as it is a metric distance function, satisfies the triangle inequality, is robust to extreme values, provides an easy-to-understand ordinary Euclidean distance between objects, and ensures that the data and analysis spaces are congruent [284–287, 289, 295]. In addition, choosing v = 1 over v = 2 can make a substantial difference in the results of an MRPP analysis; see, for example, a discussion by Mielke and Berry in 2007 [297, pp. 45–50].
To illustrate the computation of δ with v = 1, consider the same finite sample of N = 6 objects listed in Fig. 2.3 on p. 40 and let S 1 and S 2 denote an exhaustive partitioning of the N = 6 objects into g = 2 disjoint treatment groups. As previously, let S 1 consist of n 1 = 2 objects, each with a single response measurement, and let S 2 consist of n 2 = 4 objects, each with a single response measurement.
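Given the univariate data listed in Fig. 2.3, let r = 1, p = 2,
\[C_{1} = \frac{n_{1}}{N} = \frac{2}{6}\quad \text{and}\quad C_{2} = \frac{n_{2}}{N} = \frac{4}{6},\]
but in this case set v = 1 instead of v = 2, employing ordinary Euclidean distance instead of squared Euclidean distance between objects. Following Eq. (2.7) on p. 40 for treatment group \(S_{1}\) with \(n_{1} = 2\) objects, p = 2, and v = 1, the generalized Minkowski distance function yields
\[\Delta (5,4) = \left \vert 5 - 4\right \vert ^{1} = 1,\]
and for treatment group \(S_{2}\) with \(n_{2} = 4\) objects, the generalized Minkowski distance function yields
\[\begin{aligned} \Delta (2,3) & = \left \vert 2 - 3\right \vert ^{1} = 1, &\Delta (2,7) & = \left \vert 2 - 7\right \vert ^{1} = 5, &\Delta (2,9) & = \left \vert 2 - 9\right \vert ^{1} = 7, \\ \Delta (3,7) & = \left \vert 3 - 7\right \vert ^{1} = 4, &\Delta (3,9) & = \left \vert 3 - 9\right \vert ^{1} = 6, &\Delta (7,9) & = \left \vert 7 - 9\right \vert ^{1} = 2. \end{aligned}\]
Then following Eq. (2.3) on p. 31, the average distance-function values for all distinct pairs of objects in treatment group \(S_{i}\), i = 1, 2, are
\[\xi _{1} = \binom{2}{2}^{-1}(1) = 1.0000\]
and
\[\xi _{2} = \binom{4}{2}^{-1}(1 + 5 + 7 + 4 + 6 + 2) = \frac{25}{6} = 4.1667.\]
Following Eq. (2.2) on p. 31, the observed weighted mean of the \(\xi _{1}\) and \(\xi _{2}\) values, based on \(C_{i} = n_{i}/N\) for i = 1, 2 is
\[\delta _{\text{o}} = C_{1}\xi _{1} + C_{2}\xi _{2} = \frac{2}{6}(1.0000) + \frac{4}{6}(4.1667) = 3.1111.\]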
As in the previous MRPP example with v = 2, the N = 6 objects can be partitioned into g = 2 treatment groups, \(S_{1}\) and \(S_{2}\), with \(n_{1} = 2\) and \(n_{2} = 4\) response measurement scores preserved for each arrangement of the observed data in
\[M = \frac{N!}{n_{1}!\,n_{2}!} = \frac{6!}{2!\,4!} = 15\]
possible, equally-likely ways. The M = 15 possible arrangements of the observed data in Fig. 2.3, along with the corresponding \(\xi _{1}\), \(\xi _{2}\), and δ values, are listed in Table 2.3 and ordered by the δ values from lowest to highest. The observed MRPP test statistic, \(\delta _{\text{o}} = 3.1111\), obtained from the realized arrangement,
\[S_{1} =\{ 5,\,4\}\quad \text{and}\quad S_{2} =\{ 2,\,3,\,7,\,9\}\]
(Order 5 in Table 2.3) is not unusual since eight of the remaining δ values (\(\delta _{8}\) to \(\delta _{15}\)) exceed the observed value of \(\delta _{\text{o}} = 3.1111\) and seven values of δ (\(\delta _{1}\) to \(\delta _{7}\)) are equal to or less than the observed value. If all arrangements of the N = 6 observed response measurement scores listed in Fig. 2.3 occur with equal chance, the exact probability value of \(\delta _{\text{o}} = 3.1111\) computed on the M = 15 possible arrangements of the observed data with \(n_{1} = 2\) and \(n_{2} = 4\) response measurement scores preserved for each arrangement is
\[P(\delta \leq \delta _{\text{o}}\vert H_{0}) = \frac{\text{number of }\delta \text{ values} \leq \delta _{\text{o}}}{M} = \frac{7}{15} = 0.4667.\]
For comparison, for the univariate data listed in Fig. 2.3 the exact probability value based on v = 2, M = 15, and \(C_{i} = n_{i}/N\) for i = 1, 2 in the previous example is P = 0.6667. No comparison is made with the conventional Student two-sample t test as Student’s t test is undefined for v = 1.
Following Eq. (2.6) on p. 37, the exact average value of the M = 15 δ values listed in Table 2.3 is \(\mu _{\delta } = 3.20\). Thus, the observed chance-corrected coefficient of agreement, following Eq. (2.5) on p. 37, is
\[\mathfrak{R}_{\text{o}} = 1 - \frac{\delta _{\text{o}}}{\mu _{\delta }} = 1 - \frac{3.1111}{3.20} = +0.0278,\]
indicating very little within-group agreement above that expected by chance.
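A scalar-only sketch of the same enumeration with v = 1 is given below. It is written directly for univariate scores, assumes \(C_{i} = n_{i}/N\), and uses hypothetical names throughout; the count of δ values at or below the observed value uses a small tolerance because three arrangements are tied at the observed δ.

```python
from itertools import combinations

scores = [5, 4, 2, 3, 7, 9]  # the first n1 scores form S1 as observed
n1 = 2


def mean_pair_distance(values, v=1):
    """Average of |a - b|**v over all distinct pairs of values."""
    pairs = list(combinations(values, 2))
    return sum(abs(a - b) ** v for a, b in pairs) / len(pairs)


def delta_for(idx1, v=1):
    """MRPP delta when the objects at positions idx1 are assigned to S1."""
    g1 = [scores[i] for i in idx1]
    g2 = [scores[i] for i in range(len(scores)) if i not in idx1]
    n = len(scores)
    return (len(g1) / n * mean_pair_distance(g1, v)
            + len(g2) / n * mean_pair_distance(g2, v))


# All M = 15 arrangements, ordered from lowest to highest delta.
deltas = sorted(delta_for(idx) for idx in combinations(range(len(scores)), n1))
delta_obs = delta_for((0, 1))
n_at_or_below = sum(d <= delta_obs + 1e-9 for d in deltas)
p_exact = n_at_or_below / len(deltas)
mu_delta = sum(deltas) / len(deltas)
agreement = 1 - delta_obs / mu_delta
print(round(delta_obs, 4), n_at_or_below, round(p_exact, 4), round(agreement, 4))
```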
2.2.4 Example Bivariate MRPP Analysis with v = 2
In this second example, bivariate response measurement scores are used for simplicity to demonstrate a multivariate MRPP analysis. To illustrate the computation of MRPP with bivariate response measurement scores for each object, consider a finite sample of N = 7 objects and let S 1 and S 2 denote an exhaustive partitioning of the N objects into g = 2 disjoint treatment groups. Further, let S 1 consist of n 1 = 4 objects with r = 2 commensurate response measurement scores (x 1i and x 2i ) on each object for i = 1, …, 4, with \(x_{1}^{\,{\prime}} = (5,\,1)\), \(x_{2}^{\,{\prime}} = (4,\,6)\), \(x_{3}^{\,{\prime}} = (5,\,2)\), and \(x_{4}^{\,{\prime}} = (6,\,3)\), and let S 2 consist of n 2 = 3 objects with r = 2 commensurate response measurement scores (x 1i and x 2i ) on each object for i = 1, 2, 3 with \(x_{5}^{\,{\prime}} = (2,\,3)\), \(x_{6}^{\,{\prime}} = (3,\,4)\), and \(x_{7}^{\,{\prime}} = (2,\,4)\). The treatment group sizes and the response measurement scores are deliberately kept small to simplify the example analysis. The bivariate response measurement scores for the N = 7 objects are listed in Fig. 2.4.
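For this example analysis, let v = 2, p = 2, r = 2,
\[C_{1} = \frac{n_{1}}{N} = \frac{4}{7}\quad \text{and}\quad C_{2} = \frac{n_{2}}{N} = \frac{3}{7},\]
so that the \(S_{1}\) and \(S_{2}\) treatment groups are weighted proportional to their group sizes of \(n_{1} = 4\) and \(n_{2} = 3\), respectively. Following Eq. (2.1) on p. 30 for treatment group \(S_{1}\) with \(n_{1} = 4\) objects, p = 2, and v = 2, the generalized Minkowski distance function, with \(\Delta _{j,k} = \Delta (x_{j},x_{k})\), yields
\[\begin{aligned} \Delta _{1,2} & = \left [(5 - 4)^{2} + (1 - 6)^{2}\right ]^{2/2} = 26, &\Delta _{1,3} & = \left [(5 - 5)^{2} + (1 - 2)^{2}\right ]^{2/2} = 1, \\ \Delta _{1,4} & = \left [(5 - 6)^{2} + (1 - 3)^{2}\right ]^{2/2} = 5, &\Delta _{2,3} & = \left [(4 - 5)^{2} + (6 - 2)^{2}\right ]^{2/2} = 17, \\ \Delta _{2,4} & = \left [(4 - 6)^{2} + (6 - 3)^{2}\right ]^{2/2} = 13, &\Delta _{3,4} & = \left [(5 - 6)^{2} + (2 - 3)^{2}\right ]^{2/2} = 2, \end{aligned}\]
and for treatment group \(S_{2}\) with \(n_{2} = 3\) objects, the generalized Minkowski distance function yields
\[\begin{aligned} \Delta _{5,6} & = \left [(2 - 3)^{2} + (3 - 4)^{2}\right ]^{2/2} = 2, \\ \Delta _{5,7} & = \left [(2 - 2)^{2} + (3 - 4)^{2}\right ]^{2/2} = 1, \\ \Delta _{6,7} & = \left [(3 - 2)^{2} + (4 - 4)^{2}\right ]^{2/2} = 1. \end{aligned}\]
Then following Eq. (2.3) on p. 31, the average distance-function values for all distinct pairs of objects in treatment group \(S_{i}\), i = 1, 2, are
\[\xi _{1} = \binom{4}{2}^{-1}(26 + 1 + 5 + 17 + 13 + 2) = \frac{64}{6} = 10.6667\]
and
\[\xi _{2} = \binom{3}{2}^{-1}(2 + 1 + 1) = \frac{4}{3} = 1.3333.\]
Following Eq. (2.2) on p. 31, the observed weighted mean of the \(\xi _{1}\) and \(\xi _{2}\) values, based on v = 2 and \(C_{i} = n_{i}/N\) for i = 1, 2 is
\[\delta _{\text{o}} = C_{1}\xi _{1} + C_{2}\xi _{2} = \frac{4}{7}(10.6667) + \frac{3}{7}(1.3333) = 6.6667.\]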
The N = 7 objects can be partitioned into g = 2 treatment groups, \(S_{1}\) and \(S_{2}\), with \(n_{1} = 4\) and \(n_{2} = 3\) response measurement scores preserved for each arrangement of the observed data in
\[M = \frac{N!}{n_{1}!\,n_{2}!} = \frac{7!}{4!\,3!} = 35\]
possible, equally-likely ways. The M = 35 possible arrangements of the observed bivariate data in Fig. 2.4, along with the corresponding \(\xi _{1}\), \(\xi _{2}\), and δ values, are listed in Table 2.4 and ordered by the δ values from lowest to highest.
The observed MRPP test statistic, \(\delta _{\text{o}} = 6.6667\), obtained from the realized arrangement,
\[S_{1} =\{ (5,1),\,(4,6),\,(5,2),\,(6,3)\}\quad \text{and}\quad S_{2} =\{ (2,3),\,(3,4),\,(2,4)\}\]
(Order 3 in Table 2.4) is unusual since 32 of the remaining δ values (\(\delta _{4}\) to \(\delta _{35}\)) exceed the observed value of \(\delta _{\text{o}} = 6.6667\) and only two values of δ are less than the observed value: \(\delta _{1} = 4.0000\) and \(\delta _{2} = 6.4762\). If all arrangements of the N = 7 observed bivariate response measurement scores listed in Fig. 2.4 occur with equal chance, the exact probability value of \(\delta _{\text{o}} = 6.6667\) computed on the M = 35 possible arrangements of the observed data with \(n_{1} = 4\) and \(n_{2} = 3\) response measurement scores preserved for each arrangement is
\[P(\delta \leq \delta _{\text{o}}\vert H_{0}) = \frac{\text{number of }\delta \text{ values} \leq \delta _{\text{o}}}{M} = \frac{3}{35} = 0.0857.\]
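A conventional Hotelling two-sample \(T^{2}\) test is given by
\[T^{2} = \frac{n_{1}n_{2}}{N}\begin{bmatrix}\bar{\mathbf{y}}_{1} & \bar{\mathbf{y}}_{2}\end{bmatrix}\mathbf{S}^{-1}\begin{bmatrix}\bar{\mathbf{y}}_{1} \\ \bar{\mathbf{y}}_{2}\end{bmatrix}, \tag{2.8}\]
where \(\bar{\mathbf{y}}_{1}\) and \(\bar{\mathbf{y}}_{2}\) denote the mean differences between treatment groups \(S_{1}\) and \(S_{2}\) on the first and second response measurements, \(n_{1}\) and \(n_{2}\) are the number of interval-level multivariate response measurement scores in treatment groups \(S_{1}\) and \(S_{2}\), \(N = n_{1} + n_{2}\), and \(\mathbf{S}\) is a pooled variance–covariance matrix.
For the example data listed in Fig. 2.4, \(\bar{y}_{11} = 5.00\), \(s_{11}^{2} = 0.6667\), \(\bar{y}_{12} = 3.00\), \(s_{12}^{2} = 4.6667\), \(\mathrm{cov}(1,2)_{1} = -1.00\), \(\bar{y}_{21} = 2.3333\), \(s_{21}^{2} = 0.3333\), \(\bar{y}_{22} = 3.6667\), \(s_{22}^{2} = 0.3333\), and \(\mathrm{cov}(1,2)_{2} = +0.1667\). Then, \(\bar{\mathbf{y}}_{1} =\bar{ y}_{11} -\bar{ y}_{21} = 5.00 - 2.3333 = +2.6667\) and \(\bar{\mathbf{y}}_{2} =\bar{ y}_{12} -\bar{ y}_{22} = 3.00 - 3.6667 = -0.6667\).
The variance–covariance matrices for treatment groups \(S_{1}\) and \(S_{2}\) in Fig. 2.4 are
\[\mathbf{S}_{1} = \begin{bmatrix} 0.6667 & -1.0000 \\ -1.0000 & 4.6667\end{bmatrix}\quad \text{and}\quad \mathbf{S}_{2} = \begin{bmatrix} 0.3333 & 0.1667 \\ 0.1667 & 0.3333\end{bmatrix},\]
respectively, and the pooled variance–covariance matrix and its inverse are
\[\mathbf{S} = \frac{(n_{1} - 1)\mathbf{S}_{1} + (n_{2} - 1)\mathbf{S}_{2}}{N - 2} = \begin{bmatrix} 0.5333 & -0.5333 \\ -0.5333 & 2.9333\end{bmatrix}\quad \text{and}\quad \mathbf{S}^{-1} = \begin{bmatrix} 2.2917 & 0.4167 \\ 0.4167 & 0.4167\end{bmatrix},\]
respectively.
Following Eq. (2.8), the observed value of Hotelling’s \(T^{2}\) is
\[T_{\text{o}}^{2} = \frac{(4)(3)}{7}\begin{bmatrix} 2.6667 & -0.6667\end{bmatrix}\begin{bmatrix} 2.2917 & 0.4167 \\ 0.4167 & 0.4167\end{bmatrix}\begin{bmatrix} 2.6667 \\ -0.6667\end{bmatrix} = \frac{12}{7}(15.0000) = 25.7143\]
and the observed F-ratio for Hotelling’s \(T^{2}\) is
\[F_{\text{o}} = \frac{N - r - 1}{r(N - 2)}\,T_{\text{o}}^{2} = \frac{4}{(2)(5)}(25.7143) = 10.2857.\]
Assuming independence, normality, and homogeneity of variance, F is approximately distributed as Snedecor’s F under the null hypothesis with \(\nu _{1} = r = 2\) and \(\nu _{2} = N - r - 1 = 7 - 2 - 1 = 4\) degrees of freedom. Under the null hypothesis, the observed value of \(F_{\text{o}} = 10.2857\) yields an approximate probability value of P = 0.0265. While there is a considerable difference between the exact probability value of P = 0.0857 and the approximate probability value of P = 0.0265, it is not surprising, as Hotelling’s \(T^{2}\) test was not designed for samples as small as \(n_{1} = 4\) and \(n_{2} = 3\).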
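The Hotelling calculation can be checked with a short pure-Python sketch for the r = 2 case. The helper names are hypothetical, the 2 × 2 inverse is written out by hand, and the probability value uses the closed-form upper tail of the F distribution that holds only when the numerator degrees of freedom equal 2, namely \((1 + 2F/\nu _{2})^{-\nu _{2}/2}\).

```python
group1 = [(5, 1), (4, 6), (5, 2), (6, 3)]
group2 = [(2, 3), (3, 4), (2, 4)]


def mean(values):
    return sum(values) / len(values)


def sample_cov(a, b):
    """Unbiased sample covariance (variance when a is b)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def cov_matrix(group):
    xs, ys = zip(*group)
    return [[sample_cov(xs, xs), sample_cov(xs, ys)],
            [sample_cov(xs, ys), sample_cov(ys, ys)]]


n1, n2, r = len(group1), len(group2), 2
n_total = n1 + n2
s1, s2 = cov_matrix(group1), cov_matrix(group2)

# Pooled variance-covariance matrix and its inverse (2 x 2 case only).
pooled = [[((n1 - 1) * s1[i][j] + (n2 - 1) * s2[i][j]) / (n_total - 2)
           for j in range(2)] for i in range(2)]
det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
inverse = [[pooled[1][1] / det, -pooled[0][1] / det],
           [-pooled[1][0] / det, pooled[0][0] / det]]

# Mean differences between the two groups on each response measurement.
diffs = [mean(c1) - mean(c2) for c1, c2 in zip(zip(*group1), zip(*group2))]
quad = sum(diffs[i] * inverse[i][j] * diffs[j]
           for i in range(2) for j in range(2))

t_squared = n1 * n2 / n_total * quad
f_ratio = (n_total - r - 1) / (r * (n_total - 2)) * t_squared
nu2 = n_total - r - 1
p_value = (1 + 2 * f_ratio / nu2) ** (-nu2 / 2)  # valid because nu1 = 2
print(round(t_squared, 4), round(f_ratio, 4), round(p_value, 4))
```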
Following Eq. (2.6) on p. 37, the exact average value of the M = 35 δ values listed in Table 2.4 is \(\mu _{\delta } = 10.0952\). Thus, the observed chance-corrected coefficient of agreement, following Eq. (2.5) on p. 37, is
\[\mathfrak{R}_{\text{o}} = 1 - \frac{\delta _{\text{o}}}{\mu _{\delta }} = 1 - \frac{6.6667}{10.0952} = +0.3396,\]
indicating approximately 34% within-group agreement above that expected by chance.
2.2.5 Example Bivariate MRPP Analysis with v = 1
As mentioned in the univariate example on p. 43, the choice of v can make a substantial difference in the results of an MRPP analysis. To illustrate the computation of MRPP with bivariate data and v = 1, consider the same finite sample of N = 7 objects listed in Fig. 2.4 on p. 46 and let S 1 and S 2 denote an exhaustive partitioning of the N objects into g = 2 disjoint treatment groups. As previously, let S 1 consist of n 1 = 4 objects with r = 2 commensurate response measurement scores (x 1i and x 2i ) on each object for i = 1, …, 4, with \(x_{1}^{\,{\prime}} = (5,\,1)\), \(x_{2}^{\,{\prime}} = (4,\,6)\), \(x_{3}^{\,{\prime}} = (5,\,2)\), and \(x_{4}^{\,{\prime}} = (6,\,3)\), and let S 2 consist of n 2 = 3 objects with r = 2 commensurate response measurement scores (x 1i and x 2i ) on each object for i = 1, 2, 3 with \(x_{5}^{\,{\prime}} = (2,\,3)\), \(x_{6}^{\,{\prime}} = (3,\,4)\), and \(x_{7}^{\,{\prime}} = (2,\,4)\).
The bivariate response measurement scores for the N = 7 objects are listed in Fig. 2.4 on p. 46 and are replicated in Fig. 2.5 for convenience.
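For this example analysis, let r = 2, \(C_{1} = n_{1}/N = 4/7\), \(C_{2} = n_{2}/N = 3/7\), and p = 2, but in this case set v = 1 instead of v = 2, thereby employing ordinary Euclidean distance between objects. Following Eq. (2.1) on p. 30 for treatment group \(S_{1}\) with \(n_{1} = 4\) objects, p = 2, and v = 1, the generalized Minkowski distance function yields
\[\begin{aligned}
\Delta_{1,2} &= \left[(5 - 4)^{2} + (1 - 6)^{2}\right]^{1/2} = 5.0990, \\
\Delta_{1,3} &= \left[(5 - 5)^{2} + (1 - 2)^{2}\right]^{1/2} = 1.0000, \\
\Delta_{1,4} &= \left[(5 - 6)^{2} + (1 - 3)^{2}\right]^{1/2} = 2.2361, \\
\Delta_{2,3} &= \left[(4 - 5)^{2} + (6 - 2)^{2}\right]^{1/2} = 4.1231, \\
\Delta_{2,4} &= \left[(4 - 6)^{2} + (6 - 3)^{2}\right]^{1/2} = 3.6056, \\
\Delta_{3,4} &= \left[(5 - 6)^{2} + (2 - 3)^{2}\right]^{1/2} = 1.4142,
\end{aligned}\]
and for treatment group \(S_{2}\) with \(n_{2} = 3\) objects, the generalized Minkowski distance function yields
\[\begin{aligned}
\Delta_{5,6} &= \left[(2 - 3)^{2} + (3 - 4)^{2}\right]^{1/2} = 1.4142, \\
\Delta_{5,7} &= \left[(2 - 2)^{2} + (3 - 4)^{2}\right]^{1/2} = 1.0000, \\
\Delta_{6,7} &= \left[(3 - 2)^{2} + (4 - 4)^{2}\right]^{1/2} = 1.0000.
\end{aligned}\]
Then, following Eq. (2.3) on p. 31, the average distance-function values for all distinct pairs of objects in treatment group \(S_{i}\), i = 1, 2, are
\[\xi_{1} = \binom{4}{2}^{-1}\left(5.0990 + 1.0000 + 2.2361 + 4.1231 + 3.6056 + 1.4142\right) = \frac{17.4780}{6} = 2.9130\]
and
\[\xi_{2} = \binom{3}{2}^{-1}\left(1.4142 + 1.0000 + 1.0000\right) = \frac{3.4142}{3} = 1.1381.\]
Following Eq. (2.2) on p. 31, the observed weighted mean of the \(\xi _{1}\) and \(\xi _{2}\) values, based on v = 1 and \(C_{i} = n_{i}/N\) for i = 1, 2 is
\[\delta_{\text{o}} = C_{1}\xi_{1} + C_{2}\xi_{2} = \frac{4}{7}(2.9130) + \frac{3}{7}(1.1381) = 2.1523.\]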
The N = 7 objects listed in Fig. 2.5 can be partitioned into g = 2 treatment groups, \(S_{1}\) and \(S_{2}\), with \(n_{1} = 4\) and \(n_{2} = 3\) response measurement scores preserved for each arrangement of the observed data in
\[M = \frac{N!}{n_{1}!\,n_{2}!} = \frac{7!}{4!\,3!} = 35\]
possible, equally-likely ways. The M = 35 possible arrangements of the observed data in Fig. 2.5, along with the corresponding \(\xi _{1}\), \(\xi _{2}\), and δ values, are listed in Table 2.5 and ordered by the δ values from lowest to highest. The observed MRPP test statistic, \(\delta_{\text{o}} = 2.1523\), obtained from the realized arrangement,
\[S_{1} = \{(5,1),\,(4,6),\,(5,2),\,(6,3)\}\quad \text{and}\quad S_{2} = \{(2,3),\,(3,4),\,(2,4)\}\]
(Order 2 in Table 2.5) is unusual since 33 of the remaining δ values (\(\delta_{3}\) to \(\delta_{35}\)) exceed the observed value of \(\delta_{\text{o}} = 2.1523\) and only one value is less than the observed value: \(\delta_{1} = 1.8152\). If all arrangements of the N = 7 observed bivariate response measurement scores listed in Fig. 2.5 occur with equal chance, the exact probability value of \(\delta_{\text{o}} = 2.1523\) computed on the M = 35 possible arrangements of the observed data with \(n_{1} = 4\) and \(n_{2} = 3\) response measurement scores preserved for each arrangement is
\[P\left(\delta \leq \delta_{\text{o}}\mid H_{0}\right) = \frac{\text{number of }\delta \text{ values} \leq \delta_{\text{o}}}{M} = \frac{2}{35} = 0.0571.\]
For comparison, for the bivariate response measurement scores listed in Fig. 2.5 the exact probability value based on v = 2 and \(C_{i} = n_{i}/N\) for i = 1, 2 in the first example is P = 0.0857. No comparison is made with the conventional Hotelling \(T^{2}\) test as Hotelling’s \(T^{2}\) is undefined for v = 1.
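The exact probability values for v = 1 and v = 2 can be checked by listing every arrangement and counting the δ values that do not exceed the observed one. This is a sketch under the assumption that the generalized Minkowski distance is \(\left(\sum_{k}\vert x_{k} - y_{k}\vert^{p}\right)^{v/p}\) with p = 2; the function names are illustrative, and the small tolerance guards the comparison against floating-point round-off.

```python
from itertools import combinations
from math import comb, isclose

points = [(5, 1), (4, 6), (5, 2), (6, 3), (2, 3), (3, 4), (2, 4)]
N, n1 = len(points), 4


def minkowski(x, y, p=2, v=1):
    """Generalized Minkowski distance: (sum |x_k - y_k|^p)^(v/p)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (v / p)


def xi(group, v):
    """Average distance over all distinct pairs of objects in one group."""
    pairs = list(combinations(group, 2))
    return sum(minkowski(x, y, v=v) for x, y in pairs) / len(pairs)


def delta_for(idx, v):
    """MRPP statistic for the split that places the indexed objects in S1."""
    s1 = [points[i] for i in idx]
    s2 = [points[i] for i in range(N) if i not in idx]
    return len(s1) / N * xi(s1, v) + len(s2) / N * xi(s2, v)


def exact_probability(v):
    """Return (delta_o, count of delta <= delta_o, M) for a given v."""
    observed = delta_for((0, 1, 2, 3), v)
    deltas = [delta_for(idx, v) for idx in combinations(range(N), n1)]
    hits = sum(1 for d in deltas
               if d < observed or isclose(d, observed, rel_tol=1e-12))
    return observed, hits, len(deltas)


obs_v1, hits_v1, m_v1 = exact_probability(v=1)
obs_v2, hits_v2, m_v2 = exact_probability(v=2)

print(f"v = 1: delta_o = {obs_v1:.4f}, P = {hits_v1}/{m_v1} = {hits_v1 / m_v1:.4f}")
print(f"v = 2: delta_o = {obs_v2:.4f}, P = {hits_v2}/{m_v2} = {hits_v2 / m_v2:.4f}")
```

For these seven objects the enumeration gives 2 of 35 arrangements (P = 0.0571) with v = 1 and 3 of 35 arrangements (P = 0.0857) with v = 2. Exhaustive enumeration is practical here only because M is small; larger samples would need a resampling approximation instead.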
Following Eq. (2.6) on p. 37, the exact average value of the M = 35 δ values listed in Table 2.5 is \(\mu_{\delta} = 2.9475\). Thus, the observed chance-corrected coefficient of agreement, following Eq. (2.5) on p. 37, is
\[\mathfrak{R} = 1 - \frac{\delta_{\text{o}}}{\mu_{\delta}} = 1 - \frac{2.1523}{2.9475} = +0.2698,\]
indicating approximately 27% within-group agreement above that expected by chance.
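The value of \(\mu_{\delta}\) need not be obtained by listing all arrangements. The sketch below relies on an observation that is not stated in the text above: when \(C_{i} = n_{i}/N\), the weights sum to one and every pair of objects is equally likely to share a group, so \(\mu_{\delta}\) equals the mean of all \(N(N-1)/2\) pairwise distances. The code checks this shortcut against full enumeration for v = 1; the helper names are illustrative.

```python
from itertools import combinations

points = [(5, 1), (4, 6), (5, 2), (6, 3), (2, 3), (3, 4), (2, 4)]


def minkowski(x, y, p=2, v=1):
    """Generalized Minkowski distance: (sum |x_k - y_k|^p)^(v/p)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (v / p)


def xi(group, v):
    """Average distance over all distinct pairs of objects in a group."""
    pairs = list(combinations(group, 2))
    return sum(minkowski(x, y, v=v) for x, y in pairs) / len(pairs)


def delta(s1, s2, v):
    """Weighted mean of the group averages with weights C_i = n_i / N."""
    n = len(s1) + len(s2)
    return len(s1) / n * xi(s1, v) + len(s2) / n * xi(s2, v)


v = 1
delta_obs = delta(points[:4], points[4:], v)

# Shortcut: mean of all 21 pairwise distances (valid for C_i = n_i / N).
mu_shortcut = xi(points, v)

# Brute force: average delta over all 35 arrangements, for comparison.
all_deltas = []
for idx in combinations(range(len(points)), 4):
    s1 = [points[i] for i in idx]
    s2 = [points[i] for i in range(len(points)) if i not in idx]
    all_deltas.append(delta(s1, s2, v))
mu_enumerated = sum(all_deltas) / len(all_deltas)

agreement = 1 - delta_obs / mu_shortcut

print(f"mu_delta (shortcut)   = {mu_shortcut:.4f}")
print(f"mu_delta (enumerated) = {mu_enumerated:.4f}")
print(f"R = {agreement:.4f}")
```

Both routes give \(\mu_{\delta} = 2.9475\) and hence \(\mathfrak{R} = 0.2698\) for these data. The shortcut matters for larger samples, where M grows too quickly for enumeration.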
2.3 Coda
Chapter 2 provided the foundation for Multi-Response Permutation Procedures (MRPP), with special emphasis on the generalized Minkowski distance function, \(\Delta (x,y)\), as defined in Eq. (2.1) on p. 30; δ, the weighted mean of the specified distance function values for all distinct pairs of objects in treatment group \(S_{i}\) for i = 1, …, g, as defined in Eq. (2.2) on p. 31; and \(\mathfrak{R}\), the chance-corrected within-group coefficient of agreement, as defined in Eq. (2.4) on p. 33. Chapters 3 and 4 provide applications of MRPP for completely randomized data at the interval level of measurement, Chaps. 5 and 6 provide applications of MRPP for completely randomized data at the ordinal (ranked) level of measurement, and Chap. 7 provides applications of MRPP for completely randomized data at the nominal (categorical) level of measurement.
Chapter 3
Chapter 3 establishes the relationship between the MRPP test statistics, δ and \(\mathfrak{R}\), and selected conventional tests and measures designed for the analysis of completely randomized data at the interval level of measurement. Considered in Chap. 3 are Student’s two-sample t test with interval-level univariate response measurement scores, Hotelling’s two-sample \(T^{2}\) test with interval-level multivariate response measurement scores, one-way fixed-effects analysis of variance (ANOVA) with interval-level univariate response measurement scores, and one-way multivariate analysis of variance (MANOVA) with interval-level multivariate response measurement scores.
Notes
- 1.
The 1976 paper by Mielke, Berry, and Johnson was the first published account of MRPP [300]. Previously, Mielke utilized MRPP in a study sponsored by the National Communicable Disease Center that involved comparisons of proportional contributions of five plague organism protein bands based on electrophoresis measurements obtained from samples of organisms associated with distinct geographical regions.
- 2.
The Journal für die Reine und Angewandte Mathematik was founded by August Leopold Crelle in 1826. It continues today, although it is more popularly known as Crelle’s Journal.
- 3.
A distance function is a metric if it satisfies three properties given by (1) \(\Delta (x,y) \geq 0\) and \(\Delta (x,x) = 0\), i.e., the distance is positive between two different points and is equal to zero from any point to itself; (2) the distance is symmetric: \(\Delta (x,y) = \Delta (y,x)\), i.e., the distance between points x and y is the same in either direction; and (3) the triangle inequality is satisfied: \(\Delta (x,y) \leq \Delta (x,z) + \Delta (z,y)\), i.e., the distance between any two points is the shortest distance along any path.
- 4.
Multi-response permutation procedures also provide for a group of unclassified response measurement scores such as might result from a survey with question choices that include “none of the above” or “not applicable.” See, for example, a 1983 article on lead concentrations in inner-city soils by Mielke, Anderson, Berry, Mielke, Chaney, and Leech [302] and a discussion by Mielke and Berry in 2007 [297, pp. 35–40].
- 5.
A sufficient condition for a permutation statistical test is the exchangeability of the random variables. Sequences that are independent and identically distributed (i.i.d.) are always exchangeable, but so is sampling without replacement from a finite population. However, while i.i.d. implies exchangeability, exchangeability does not imply i.i.d. [150, 168, 217].
- 6.
As will be shown in Chap. 3, \(\mathfrak{R}\) may also be interpreted as a chance-corrected measure of effect size.
- 7.
Each element of the S matrix is constructed from two corresponding elements in the \(\hat{\boldsymbol{\Sigma }}\) matrices, weighted by the degrees of freedom, i.e., n − 1. For example, the first element of the S matrix is \(0.5333 = [(4 - 1)(0.6667) + (3 - 1)(0.3333)]/(4 + 3 - 2)\).
References
Agresti, A.: Measures of nominal-ordinal association. J. Am. Stat. Assoc. 76, 524–529 (1981)
Agresti, A.: Categorical Data Analysis, 2nd edn. Wiley, New York (2002)
Agresti, A., Finley, B.: Statistical Methods for the Social Sciences. Prentice-Hall, Upper Saddle River (1997)
Agresti, A., Liu, I.: Modeling a categorical variable allowing arbitrarily many category choices. Biometrics 55, 936–943 (1999)
Agresti, A., Liu, I.: Strategies for modeling a categorical variable allowing multiple category choices. Sociol. Methods Res. 29, 403–434 (2001)
Altman, D.G., Bland, J.M.: Measurement in medicine: the analysis of method comparison studies. Statistician 32, 307–317 (1983)
Anderson, T.W.: An Introduction to Multivariate Statistical Analysis, 2nd edn. Wiley, New York (1984)
Anderson, T.W.: Two of Harold Hotelling’s contributions to multivariate analysis. Tech. Rep. 40, Stanford University, Stanford (1990)
Anderson, D.R., Sweeney, D.J., Williams, T.A.: Introduction to Statistics: Concepts and Applications. West, New York (1994)
Ansari, A.R., Bradley, R.A.: Rank-sum tests for dispersion. Ann. Math. Stat. 31, 1174–1189 (1960)
Anscombe, F.J.: Rejection of outliers. Technometrics 2, 123–147 (1960)
Arabie, P.: Was Euclid an unnecessarily sophisticated psychologist? Psychometrika 56, 567–587 (1991)
Arbuckle, J., Aiken, L.S.: A program for Pitman’s permutation test for differences in location. Behav. Res. Methods Instrum. 7, 381 (1975)
Author: Resampling Stats User’s Guide. Resampling Stats, Arlington (1999)
Author: StatXact for Windows. Cytel Software, Cambridge (2000)
Bailer, A.J.: Testing variance equality with randomization tests. J. Stat. Comput. Simul. 31, 1–8 (1989)
Bakan, D.: The test of significance in psychological research. Psychol. Bull. 66, 423–437 (1966)
Bakeman, R., Robinson, B.F., Quera, V.: Testing sequential association: estimating exact p values using sampled permutations. Psychol. Methods 1, 4–15 (1996)
Bartko, J.J.: On various intraclass correlation reliability coefficients. Psychol. Bull. 83, 762–765 (1976)
Bartko, J.J.: Measurement and reliability: statistical thinking considerations. Schizophr. Bull. 17, 483–489 (1991)
Bartlett, M.S.: A note on tests of significance in multivariate analysis. Proc. Camb. Philos. Soc. 34, 33–40 (1939)
Bernardin, H.J., Beatty, R.W.: Performance Appraisal: Assessing Human Behavior at Work. Kent, Boston (1984)
Berry, K.J., Mielke, P.W.: Moment approximations as an alternative to the F test in analysis of variance. Br. J. Math. Stat. Psychol. 36, 202–206 (1983)
Berry, K.J., Mielke, P.W.: An APL function for Radlow and Alf’s exact chi-square test. Behav. Res. Methods Instrum. Comput. 17, 131–132 (1985)
Berry, K.J., Mielke, P.W.: Goodman and Kruskal’s tau-b statistic: a nonasymptotic test of significance. Sociol. Methods Res. 13, 543–550 (1985)
Berry, K.J., Mielke, P.W.: Subroutines for computing exact chi-square and Fisher’s exact probability tests. Educ. Psychol. Meas. 45, 153–159 (1985)
Berry, K.J., Mielke, P.W.: A generalization of Cohen’s kappa agreement measure to interval measurement and multiple raters. Educ. Psychol. Meas. 48, 921–933 (1988)
Berry, K.J., Mielke, P.W.: A family of multivariate measures of association for nominal independent variables. Educ. Psychol. Meas. 52, 41–55 (1992)
Berry, K.J., Mielke, P.W.: Spearman’s footrule as a measure of agreement. Psychol. Rep. 80, 839–846 (1997)
Berry, K.J., Mielke, P.W.: Extension of Spearman’s footrule to multiple rankings. Psychol. Rep. 82, 376–378 (1998)
Berry, K.J., Mielke, P.W.: Least absolute regression residuals: analyses of block designs. Psychol. Rep. 83, 923–929 (1998)
Berry, K.J., Mielke, P.W.: Least sum of absolute deviations regression: distance, leverage, and influence. Percept. Mot. Skills 86, 1063–1070 (1998)
Berry, K.J., Mielke, P.W.: Least sum of Euclidean regression residuals: estimation of effect size. Psychol. Rep. 91, 955–962 (2002)
Berry, K.J., Mielke, P.W.: Longitudinal analysis of data with multiple binary category choices. Psychol. Rep. 93, 127–131 (2003)
Berry, K.J., Martin, T.W., Olson, K.F.: Testing theoretical hypotheses: a PRE statistic. Soc. Forces 53, 190–196 (1974)
Berry, K.J., Martin, T.W., Olson, K.F.: A note on fourfold point correlation. Educ. Psychol. Meas. 34, 53–56 (1974)
Berry, K.J., Mielke, P.W., Iyer, H.K.: Factorial designs and dummy coding. Percept. Mot. Skills 87, 919–927 (1998)
Berry, K.J., Mielke, P.W., Mielke, H.W.: The Fisher–Pitman permutation test: an attractive alternative to the F test. Psychol. Rep. 90, 495–502 (2002)
Berry, K.J., Johnston, J.E., Mielke, P.W.: Exact and resampling probability values for measures associated with ordered R by C contingency tables. Psychol. Rep. 99, 231–238 (2006)
Berry, K.J., Johnston, J.E., Mielke, P.W.: An alternative measure of effect size for Cochran’s Q test for related proportions. Percept. Mot. Skills 104, 1236–1242 (2007)
Berry, K.J., Johnston, J.E., Mielke, P.W.: A Chronicle of Permutation Statistical Methods: 1920–2000 and Beyond. Springer, Cham (2014)
Bilder, C.R., Loughin, T.M.: On the first-order Rao–Scott correction of the Umesh–Loughin–Scherer statistic. Biometrics 57, 1253–1255 (2001)
Bilder, C.R., Loughin, T.M., Nettleton, D.: Multiple marginal independence-testing for pick any/c variables. Commun. Stat. Simul. Comput. 29, 1285–1316 (2000)
Biondini, M.E., Mielke, P.W., Berry, K.J.: Data-dependent permutation techniques for the analysis of ecological data. Vegetatio 75, 161–168 (1988). [The name of the journal was changed to Plant Ecology in 1997]
Blalock, H.M.: A double standard in measuring degree of association. Am. Sociol. Rev. 28, 988–989 (1963)
Blattberg, R., Sargent, T.: Regression with non-Gaussian stable disturbances. Econometrica 39, 501–510 (1971)
Borgatta, E.F.: My student, the purist: a lament. Soc. Q. 9, 29–34 (1968)
Box, G.E.P.: Science and statistics. J. Am. Stat. Assoc. 71, 791–799 (1976)
Box, J.F.: R. A. Fisher: The Life of a Scientist. Wiley, New York (1978)
Box, G.E.P.: An Accidental Statistician: The Life and Memories of George E. P. Box. Wiley, New York (2013). [Inscribed “With a little help from my friend, Judith L. Allen”]
Bradbury, I.: Analysis of variance versus randomization: a comparison. Br. J. Math. Stat. Psychol. 40, 177–187 (1987)
Bradley, J.V.: Distribution-free Statistical Tests. Prentice-Hall, Englewood Cliffs (1968)
Bradley, J.V.: A common situation conducive to bizarre distribution shapes. Am. Stat. 31, 147–150 (1977)
Brandeau, M.L., Chiu, S.S.: Parametric facility location on a tree network with an \(L_{p}\) norm cost function. Transp. Sci. 22, 59–69 (1988)
Brennan, P.F., Hays, B.J.: The kappa statistic for establishing interrater reliability in the secondary analysis of qualitative clinical data. Res. Nurs. Health 15, 153–158 (1992)
Brennan, R.L., Prediger, D.J.: Coefficient kappa: some uses, misuses, and alternatives. Educ. Psychol. Meas. 41, 687–699 (1981)
Brillinger, D.R., Jones, L.V., Tukey, J.W.: The role of statistics in weather resources management. Tech. Rep. II, Weather Modification Advisory Board, United States Department of Commerce, Washington, DC (1978)
Bross, I.D.J.: Is there an increased risk? Fed. Proc. 13, 815–819 (1954)
Brown, G.W., Mood, A.M.: On median tests for linear hypotheses. In: Neyman, J. (ed.) Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, vol. II, pp. 159–166. University of California Press, Berkeley (1951)
Burr, E.J.: The distribution of Kendall’s score S for a pair of tied rankings. Biometrika 47, 151–171 (1960)
Burry-Stock, J.A., Laurie, D.G., Chissom, B.S.: Rater agreement indexes for performance assessment. Educ. Psychol. Meas. 56, 251–262 (1996)
Campbell, M.J., Gardner, M.J.: Calculating confidence intervals for some non-parametric analyses. Br. Med. J. 296, 1454–1456 (1988)
Capraro, R.M., Capraro, M.M.: Treatments of effect sizes and statistical significance tests in textbooks. Educ. Psychol. Meas. 62, 771–782 (2002)
Capraro, R.M., Capraro, M.M.: Exploring the APA fifth edition Publication Manual’s impact of the analytic preferences of journal editorial board members. Educ. Psychol. Meas. 63, 554–565 (2003)
Carroll, R.M., Nordholm, L.A.: Sampling characteristics of Kelley’s \(\varepsilon^{2}\) and Hays’ \(\hat{\omega }^{2}\). Educ. Psychol. Meas. 35, 541–554 (1975)
Carver, R.P.: The case against statistical significance testing. Harv. Educ. Rev. 48, 378–399 (1978)
Carver, R.P.: The case against statistical significance testing, revisited. J. Exp. Educ. 61, 287–292 (1993)
Chesterton, G.K.: The Complete Father Brown Stories: “The Head of Caesar”. Star Books, Vancouver (2003)
Cochran, W.G.: The comparison of percentages in matched samples. Biometrika 37, 256–266 (1950)
Cohen, J.: A coefficient of agreement for nominal scales. Educ. Psychol. Meas. 20, 37–46 (1960)
Cohen, J.: Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol. Bull. 70, 213–220 (1968)
Cohen, J.: Statistical Power Analysis for the Behavioral Sciences. Academic Press, New York (1969)
Cohen, J.: Things I have learned (so far). Am. Psychol. 45, 1304–1312 (1990)
Cohen, J.: The earth is round (p < .05). Am. Psychol. 49, 997–1003 (1994)
Cohen, J., Cohen, P.: Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Erlbaum, Hillsdale (1975)
Colwell, D.J., Gillett, J.R.: Spearman versus Kendall. Math. Gaz. 66, 307–309 (1982)
Conover, W.J.: Practical Nonparametric Statistics, 3rd edn. Wiley, New York (1999)
Conti, L.H., Musty, R.E.: The effects of delta-9-tetrahydrocannabinol injections to the nucleus accumbens on the locomotor activity of rats. In: Agurell, S., Dewey, W.L., Willette, R.E. (eds.) The Cannabinoids: Chemical, Pharmacologic, and Therapeutic Aspects, pp. 649–655. Academic Press, New York (1984)
Coombs, C.H.: A Theory of Data. Wiley, New York (1964)
Costner, H.L.: Criteria for measures of association. Am. Sociol. Rev. 30, 341–353 (1965)
Cramér, H.: Mathematical Methods of Statistics. Princeton University Press, Princeton (1946)
Crittenden, K.S., Montgomery, A.C.: A system of paired asymmetric measures of association for use with ordinal dependent variables. Soc. Forces 58, 1178–1194 (1980)
Cureton, E.E.: Rank-biserial correlation. Psychometrika 21, 287–290 (1956)
Cureton, E.E.: Rank-biserial correlation when ties are present. Educ. Psychol. Meas. 28, 77–79 (1968)
Curran-Everett, D.: Explorations in statistics: standard deviations and standard errors. Adv. Physiol. Educ. 32, 203–208 (2008)
Daniel, W.W.: Statistical significance versus practical significance. Sci. Educ. 61, 423–427 (1977)
Daniels, H.E.: Rank correlation and population models (with discussion). J. R. Stat. Soc. Ser. B Methodol. 12, 171–191 (1950)
Daniels, H.E.: Note on Durbin and Stuart’s formula for \(E(r_{s})\). J. R. Stat. Soc. Ser. B Methodol. 13, 310 (1951)
Darwin, C.R.: The Effects of Cross and Self Fertilization in the Vegetable Kingdom. John Murray, London (1876)
David, F.N.: Review of “Rank Correlation Methods” by M. G. Kendall. Biometrika 37, 190 (1950)
de Mast, J., Akkerhuis, T., Erdmann, T.: The statistical evaluation of categorical measurements: simple scales, but treacherous complexity underneath (2014). [Originally a paper presented at the First Stu Hunter Research Conference in Heemskerk, Netherlands, March, 2013]
Decady, Y.R., Thomas, D.R.: A simple test of association for contingency tables with multiple column responses. Biometrics 56, 893–896 (2000)
Diekhoff, G.: Statistics for the Social and Behavioral Sciences: Univariate, Bivariate, Multivariate. Brown, Dubuque (1992)
Dielman, T.E.: A comparison of forecasts from least absolute and least squares regression. J. Forecast. 5, 189–195 (1986)
Dielman, T.E.: Corrections to a comparison of forecasts from least absolute and least squares regression. J. Forecast. 8, 419–420 (1989)
Dielman, T.E., Pfaffenberger, R.: Least absolute value regression: necessary sample sizes to use normal theory inference procedures. Decis. Sci. 19, 734–743 (1988)
Dielman, T.E., Rose, E.L.: Forecasting in least absolute value regression with autocorrelated errors: a small-sample study. Int. J. Forecast. 10, 539–547 (1994)
Dodd, D.H., Schultz, R.F.: Computational procedures for estimating magnitude of effects for some analysis of variance designs. Psychol. Bull. 79, 391–395 (1973)
Durbin, J., Stuart, A.: Inversions and rank correlation coefficients. J. R. Stat. Soc. Ser. B Methodol. 13, 303–309 (1951)
Dwass, M.: Modified randomization tests for nonparametric hypotheses. Ann. Math. Stat. 28, 181–187 (1957)
Dwyer, J.H.: Analysis of variance and the magnitude of effect: a general approach. Psychol. Bull. 81, 731–737 (1974)
Dyson, G.: Turing’s Cathedral: The Origins of the Digital Universe. Pantheon/Vintage, New York (2012)
Eden, T., Yates, F.: On the validity of Fisher’s z test when applied to an actual example of non-normal data. J. Agric. Sci. 23, 6–17 (1933)
Edgington, E.S.: Randomization tests. J. Psychol. 57, 445–449 (1964)
Edgington, E.S.: Statistical inference and nonrandom samples. Psychol. Bull. 66, 485–487 (1966)
Edgington, E.S.: Approximate randomization tests. J. Psychol. 72, 143–149 (1969)
Edgington, E.S.: Statistical Inference: The Distribution-Free Approach. McGraw-Hill, New York (1969)
Edgington, E.S.: Randomization Tests. Marcel Dekker, New York (1980)
Edgington, E.S., Onghena, P.: Randomization Tests, 4th edn. Chapman & Hall/CRC, Boca Raton (2007)
Edwards, D.: Exact simulation based inference: a survey, with additions. J. Stat. Comput. Simul. 22, 307–326 (1985)
Everitt, B.S.: Moments of the statistics kappa and weighted kappa. Br. J. Math. Stat. Psychol. 21, 97–103 (1968)
Ezekiel, M.J.B.: Methods of Correlation Analysis. Wiley, New York (1930)
Feinstein, A.R.: Clinical biostatistics XXIII: the role of randomization in sampling, testing, allocation, and credulous idolatry (Part 2). Clin. Pharmacol. Ther. 14, 898–915 (1973)
Feinstein, A.R.: Clinical Biostatistics. C.V. Mosby, St. Louis (1977)
Ferguson, G.A.: Statistical Analysis in Psychology and Education, 5th edn. McGraw-Hill, New York (1981)
Festinger, L.: The significance of differences between means without reference to the frequency distribution function. Psychometrika 11, 97–105 (1946)
Fidler, F., Thompson, B.: Computing correct confidence intervals for ANOVA fixed- and random-effects effect sizes. Educ. Psychol. Meas. 61, 575–604 (2001)
Fisher, R.A.: Statistical Methods for Research Workers. Oliver and Boyd, Edinburgh (1925)
Fisher, R.A.: The Design of Experiments. Oliver and Boyd, Edinburgh (1935)
Fisher, R.A.: The logic of inductive inference (with discussion). J. R. Stat. Soc. 98, 39–82 (1935)
Fisher, R.A.: Mathematics of a lady tasting tea. In: Newman, J.R. (ed.) The World of Mathematics, vol. III, section VIII, pp. 1512–1521. Simon & Schuster, New York (1956)
Fisher, R.A.: The Design of Experiments, 7th edn. Hafner, New York (1960)
Fleiss, J.L.: Estimating the magnitude of experimental effects. Psychol. Bull. 72, 273–276 (1969)
Fleiss, J.L., Cohen, J., Everitt, B.S.: Large sample standard errors of kappa and weighted kappa. Psychol. Bull. 72, 323–327 (1969)
Franklin, L.A.: Exact tables of Spearman’s footrule for n = 11(1)18 with estimate of convergence and errors for the normal approximation. Stat. Probab. Lett. 6, 399–406 (1988)
Freeman, L.C.: Elementary Applied Statistics. Wiley, New York (1965)
Frick, R.W.: Interpreting statistical testing: process and propensity, not population and random sampling. Behav. Res. Methods Instrum. Comput. 30, 527–535 (1998)
Friedman, M.: The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J. Am. Stat. Assoc. 32, 675–701 (1937)
Friedman, M.: A comparison of alternative tests of significance for the problem of m rankings. Ann. Math. Stat. 11, 86–92 (1940)
Friedman, H.: Magnitude of experimental effect and a table for its rapid estimation. Psychol. Bull. 70, 245–251 (1968)
Gaebelein, J.W., Soderquist, J.A., Powers, W.A.: A note on the variance explained in the mixed analysis of variance model. Psychol. Bull. 83, 1110–1112 (1976)
Gail, M., Mantel, N.: Counting the number of r × c contingency tables with fixed margins. J. Am. Stat. Assoc. 72, 859–862 (1977)
Gardner, M.J., Altman, D.G.: Statistics with Confidence: Confidence Intervals and Statistical Guidelines. British Medical Journal, London (1989)
Geary, R.C.: Some properties of correlation and regression in a limited universe. Metron 7, 83–119 (1927)
Geary, R.C.: Testing for normality. Biometrika 34, 209–242 (1947)
Gebhard, J., Schmitz, N.: Permutation tests: a revival? I. Optimum properties. Stat. Pap. 39, 75–85 (1998)
Glass, G.V.: Note on rank-biserial correlation. Educ. Psychol. Meas. 26, 623–631 (1966)
Glass, G.V.: Primary, secondary, and meta-analysis of research. Educ. Res. 5, 3–8 (1976)
Glass, G.V.: Statistical Methods in Education and Psychology, 2nd edn. Prentice-Hall, Englewood Cliffs (1984)
Glass, G.V., Hakstian, A.R.: Measures of association in comparative experiments: their development and interpretation. Am. Educ. Res. J. 6, 403–414 (1969)
Glass, G.V., Peckham, P.D., Sanders, J.R.: Consequences of failure to meet assumptions underlying the fixed effects analysis of variance and covariance. Rev. Educ. Res. 42, 237–288 (1972)
Glass, G.V., McGaw, B., Smith, M.L.: Meta-Analysis in Social Research: Individual and Neighbourhood Reactions. Sage, Beverly Hills (1981)
Golding, S.L.: Flies in the ointment: methodological problems in the analysis of the percentage of variance due to persons and situations. Psychol. Bull. 82, 278–289 (1975)
Good, I.J.: Further comments concerning the lady tasting tea or beer: P-values and restricted randomization. J. Stat. Comput. Simul. 40, 263–267 (1992)
Good, P.I.: Permutation, Parametric and Bootstrap Tests of Hypotheses. Springer, New York (1994)
Good, P.I.: Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses. Springer, New York (1994)
Good, P.I.: Resampling Methods: A Practical Guide to Data Analysis. Birkhäuser, Boston (1999)
Good, P.I.: Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses, 2nd edn. Springer, New York (2000)
Good, P.I.: Resampling Methods: A Practical Guide to Data Analysis, 2nd edn. Birkhäuser, Boston (2001)
Good, P.I.: Extensions of the concept of exchangeability and their applications. J. Mod. Appl. Stat. Methods 1, 243–247 (2002)
Goodman, L.A., Kruskal, W.H.: Measures of association for cross classifications. J. Am. Stat. Assoc. 49, 732–764 (1954)
Goodman, L.A., Kruskal, W.H.: Measures of association for cross classifications, III: approximate sampling theory. J. Am. Stat. Assoc. 58, 310–364 (1963)
Gravetter, F.J., Wallnau, L.B.: Essentials of Statistics for the Behavioral Sciences, 8th edn. Wadsworth, Belmont (2014)
Greenhouse, S.W., Geisser, S.: On methods in the analysis of profile data. Psychometrika 24, 95–112 (1959)
Gridgeman, N.T.: The lady tasting tea, and allied topics. J. Am. Stat. Assoc. 54, 776–783 (1959)
Grier, D.A.: Statistical laboratories and the origins of computing. Chance 12, 14–20 (1999)
Grissom, R.J., Kim, J.J.: Effect Sizes for Research: A Broad Practical Approach. Lawrence Erlbaum, Mahwah (2005)
Grissom, R.J., Kim, J.J.: Effect Sizes for Research: Univariate and Multivariate Applications. Routledge, New York (2012)
Guggenmoos-Holzmann, I.: How reliable are chance-corrected measures of agreement? Stat. Med. 12, 2191–2205 (1993)
Guggenmoos-Holzmann, I.: Comment on “Modeling covariate effects in observer agreement studies: the case of nominal scale agreement” by P. Graham. Stat. Med. 14, 2285–2286 (1995)
Guilford, J.P.: Fundamental Statistics in Psychology and Education. McGraw-Hill, New York (1950)
Hald, A.: History of Probability and Statistics and Their Applications Before 1750. Wiley, New York (1990)
Hald, A.: A History of Mathematical Statistics from 1750 to 1930. Wiley, New York (1998)
Haldane, J.B.S., Smith, C.A.B.: A simple exact test for birth-order effect. Ann. Eugen. 14, 117–124 (1948)
Hall, N.S.: R. A. Fisher and his advocacy of randomization. J. Hist. Biol. 40, 295–325 (2007)
Hanley, J.A.: Standard error of the kappa statistic. Psychol. Bull. 102, 315–321 (1987)
Harding, E.F.: An efficient, minimal-storage procedure for calculating the Mann–Whitney U, generalized U and similar distributions. J. R. Stat. Soc.: Ser. C: Appl. Stat. 33, 1–6 (1984)
Hayes, A.F.: Permutation test is not distribution-free: testing \(H_{0}\colon \rho = 0\). Psychol. Methods 1, 184–198 (1996)
Hays, W.L.: Statistics. Holt, Rinehart and Winston, New York (1963)
Hedges, L.V.: Estimation of effect size from a series of independent experiments. Psychol. Bull. 92, 490–499 (1982)
Heiser, W.J.: Geometric representation of association between categories. Psychometrika 69, 513–545 (2004)
Hellman, M.: A study of some etiological factors of malocclusion. Dent. Cosmos 56, 1017–1032 (1914)
Hemelrijk, J.: Note on Wilcoxon’s two-sample test when ties are present. Ann. Math. Stat. 23, 133–135 (1952)
Henson, R.K., Smith, A.D.: State of the art in statistical significance and effect size reporting: a review of the APA task force report and current trends. J. Res. Dev. Educ. 33, 285–296 (2000)
Hess, B., Olejnik, S., Huberty, C.J.: The efficacy of two improvement-over-chance effect sizes for two-group univariate comparisons. Educ. Psychol. Meas. 61, 909–936 (2001)
Higgins, J.J.: Introduction to Modern Nonparametric Tests. Brooks/Cole, Pacific Grove (2004)
Hitchcock, D.B.: Yates and contingency tables: 75 years later. Electron. J. Hist. Probab. Stat. 5, 1–14 (2009)
Hodges, J.L., Lehmann, E.L.: Rank methods for combination of independent experiments in analysis of variance. Ann. Math. Stat. 33, 482–497 (1962)
Hodges, J.L., Lehmann, E.L.: Estimates of location based on rank tests. Ann. Math. Stat. 34, 598–611 (1963)
Hope, A.C.A.: A simplified Monte Carlo significance test procedure. J. R. Stat. Soc. Ser. B Methodol. 30, 582–598 (1968)
Hotelling, H.: The generalization of Student’s ratio. Ann. Math. Stat. 2, 360–378 (1931)
Hotelling, H.: A generalized T test and measure of multivariate dispersion. In: Neyman, J. (ed.) Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, vol. II, pp. 23–41. University of California Press, Berkeley (1951)
Hotelling, H., Pabst, M.R.: Rank correlation and tests of significance involving no assumption of normality. Ann. Math. Stat. 7, 29–43 (1936)
Howell, D.C.: Statistical Methods for Psychology, 6th edn. Wadsworth, Belmont (2007)
Howell, D.C.: Statistical Methods for Psychology, 8th edn. Wadsworth, Belmont (2013)
Hubbard, R.: Alphabet soup: Blurring the distinctions between p’s and α’s in psychological research. Theor. Psychol. 14, 295–327 (2004)
Hubert, L.J.: A note on Freeman’s measure of association for relating an ordered to an unordered factor. Psychometrika 39, 517–520 (1974)
Hunter, A.A.: On the validity of measures of association: the nominal-nominal two-by-two case. Am. J. Sociol. 79, 99–109 (1973)
Hutchinson, T.P.: Kappa muddles together two sources of disagreement: Tetrachoric correlation is preferable. Res. Nurs. Health 16, 313–315 (1993)
Huynh, H., Feldt, L.S.: Conditions under which mean square ratios in repeated measurements designs have exact F distributions. J. Am. Stat. Assoc. 65, 1582–1589 (1970)
Irwin, J.O.: Tests of significance for differences between percentages based on small numbers. Metron 12, 83–94 (1935)
Isaacson, W.: The Innovators. Simon & Schuster, New York (2014)
Jockel, K.H.: Finite sample properties and asymptotic efficiency of Monte Carlo tests. J. Stat. Comput. Simul. 14, 336–347 (1986)
Johnston, J.E., Berry, K.J., Mielke, P.W.: A measure of effect size for experimental designs with heterogeneous variances. Percept. Mot. Skills 98, 3–18 (2004)
Johnston, J.E., Berry, K.J., Mielke, P.W.: Permutation tests: precision in estimating probability values. Percept. Mot. Skills 105, 915–920 (2007)
Jonckheere, A.R.: A distribution-free k-sample test against ordered alternatives. Biometrika 41, 133–145 (1954)
Kahaner, D., Moler, C., Nash, S.: Numerical Methods and Software. Prentice-Hall, Englewood Cliffs (1988)
Kaufman, E.H., Taylor, G.D., Mielke, P.W., Berry, K.J.: An algorithm and FORTRAN program for multivariate LAD (\(\ell_{1}\) of \(\ell_{2}\)) regression. Computing 68, 275–287 (2002)
Keller-McNulty, S., Higgins, J.J.: Effect of tail weight and outliers and power and type-I error of robust permutation tests for location. Commun. Stat. Simul. Comput. 16, 17–35 (1987)
Kelley, T.L.: An unbiased correlation ratio measure. Proc. Natl. Acad. Sci. 21, 554–559 (1935)
Kempthorne, O.: The Design and Analysis of Experiments. Wiley, New York (1952)
Kempthorne, O.: The randomization theory of experimental inference. J. Am. Stat. Assoc. 50, 946–967 (1955)
Kempthorne, O.: Some aspects of experimental inference. J. Am. Stat. Assoc. 61, 11–34 (1966)
Kempthorne, O.: Why randomize? J. Stat. Plan. Inference 1, 1–25 (1977)
Kendall, M.G.: A new measure of rank correlation. Biometrika 30, 81–93 (1938)
Kendall, M.G.: The treatment of ties in ranking problems. Biometrika 33, 239–251 (1945)
Kendall, M.G.: Rank Correlation Methods. Griffin, London (1948)
Kendall, M.G.: Rank Correlation Methods, 3rd edn. Griffin, London (1962)
Kendall, M.G., Babington Smith, B.: The problem of m rankings. Ann. Math. Stat. 10, 275–287 (1939)
Kendall, M.G., Babington Smith, B.: On the method of paired comparisons. Biometrika 31, 324–345 (1940)
Kendall, M.G., Kendall, S.F.H., Babington Smith, B.: The distribution of Spearman’s coefficient of rank correlation in a universe in which all rankings occur an equal number of times. Biometrika 30, 251–273 (1939)
Kennedy, P.E.: Randomization tests in econometrics. J. Bus. Econ. Stat. 13, 85–94 (1995)
Kenny, D.A.: Statistics for the Social and Behavioral Sciences. Little Brown, Boston (1987)
Keppel, G.: Design and Analysis: A Researcher’s Handbook, 2nd edn. Prentice-Hall, Englewood Cliffs (1982)
Keppel, G., Zedeck, S.: Data Analysis for Research Designs: Analysis of Variance and Multiple Regression/Correlation Approaches. Freeman, New York (1989)
Kim, M.J., Nelson, C.R., Startz, R.: Mean reversion in stock prices? a reappraisal of the empirical evidence. Rev. Econ. Stud. 58, 515–528 (1991)
Kingman, J.F.C.: Uses of exchangeability. Ann. Probab. 6, 183–197 (1978). [Abraham Wald memorial lecture delivered in Aug 1977 in Seattle, Washington]
Kirk, R.E.: Experimental Design: Procedures for the Behavioral Sciences. Brooks/Cole, Belmont (1968)
Kirk, R.E.: Practical significance: a concept whose time has come. Educ. Psychol. Meas. 56, 746–759 (1996)
Kirk, R.E.: Effect magnitude: a different focus. J. Stat. Plan. Inference 137, 1634–1646 (2006). [Keynote address delivered at the 2003 International Conference on Statistics, Combinatorics, and Related Areas, held at the University of Southern Maine]
Kraft, C.A., van Eeden, C.: A Nonparametric Introduction to Statistics. Macmillan, New York (1968)
Krause, E.F.: Taxicab Geometry. Addison-Wesley, Menlo Park (1975)
Krippendorff, K.: Bivariate agreement coefficients for reliability of data. In: Borgatta, E.G. (ed.) Sociological Methodology, pp. 139–150. Jossey-Bass, San Francisco (1970)
Kruskal, W.H.: Historical notes on the Wilcoxon unpaired two-sample test. J. Am. Stat. Assoc. 52, 356–360 (1957)
Kruskal, W.H., Wallis, W.A.: Use of ranks in one-criterion variance analysis. J. Am. Stat. Assoc. 47, 583–621 (1952). [Erratum: J. Am. Stat. Assoc. 48, 907–911 (1953)]
Lachin, J.M.: Statistical properties of randomization in clinical trials. Control. Clin. Trials 9, 289–311 (1988)
LaFleur, B.J., Greevy, R.A.: Introduction to permutation and resampling-based hypothesis tests. J. Clin. Child Adolesc. Psychol. 38, 286–294 (2009)
Lance, C.E.: More statistical and methodological myths and urban legends. Organ. Res. Methods 14, 279–286 (2011)
Lange, J.: Crime as Destiny: A Study of Criminal Twins. Allen & Unwin, London (1931). [Translated by C. Haldane]
Larson, S.C.: The shrinkage of the coefficient of multiple correlation. J. Educ. Psychol. 22, 45–55 (1931)
Larson, R.C., Sadiq, G.: Facility locations with the Manhattan metric in the presence of barriers to travel. Oper. Res. 31, 652–669 (1983)
Lawley, D.N.: A generalization of Fisher’s z test. Biometrika 30, 180–187 (1938)
Lawley, D.N.: Corrections to “A generalization of Fisher’s z test”. Biometrika 30, 467–469 (1939)
Leach, C.: Introduction to Statistics: A Nonparametric Approach for the Social Sciences. Wiley, New York (1979)
Lehmann, E.L.: Parametrics vs. nonparametrics: two alternative methodologies. J. Nonparametr. Stat. 21, 397–405 (2009)
Lehmann, E.L.: Fisher, Neyman, and the Creation of Classical Statistics. Springer, New York (2011)
Lehmann, E.L., Stein, C.M.: On the theory of some non-parametric hypotheses. Ann. Math. Stat. 20, 28–45 (1949)
Levine, J.H.: Joint-space analysis of “pick-any” data: analysis of choices from an unconstrained set of alternatives. Psychometrika 44, 85–92 (1979)
Levine, T.R., Hullett, C.R.: Eta squared, partial eta squared, and misreporting of effect size in communication research. Hum. Commun. Res. 28, 612–625 (2002)
Levine, T.R., Weber, R., Hullett, C.R., Park, H.S., Massi Lindsey, L.L.: A critical assessment of null hypothesis significance testing in quantitative communication research. Hum. Commun. Res. 34, 171–187 (2008)
Levine, T.R., Weber, R., Park, H.S., Hullett, C.R.: A communication researchers’ guide to null hypothesis significance testing and alternatives. Hum. Commun. Res. 34, 188–209 (2008)
Light, R.J.: Measures of response agreement for qualitative data: some generalizations and alternatives. Psychol. Bull. 76, 365–377 (1971)
Light, R.J., Margolin, B.H.: An analysis of variance for categorical data. J. Am. Stat. Assoc. 66, 534–544 (1971)
Linn, R.L., Baker, E.L., Dunbar, S.B.: Complex, performance-based assessment: expectations and validation criteria. Educ. Res. 20, 15–21 (1991)
Loether, H.J., McTavish, D.G.: Descriptive and Inferential Statistics: An Introduction, 4th edn. Allyn and Bacon, Boston (1993)
Loughin, T.M., Scherer, P.N.: Testing for association in contingency tables with multiple column responses. Biometrics 54, 630–637 (1998)
Ludbrook, J.: Advantages of permutation (randomization) tests in clinical and experimental pharmacology and physiology. Clin. Exp. Pharmacol. Physiol. 21, 673–686 (1994)
Ludbrook, J.: Issues in biomedical statistics: comparing means by computer-intensive tests. Aust. N. Z. J. Surg. 65, 812–819 (1995)
Ludbrook, J.: The Wilcoxon–Mann–Whitney test condemned. Br. J. Surg. 83, 136–137 (1996)
Ludbrook, J.: Statistical techniques for comparing measures and methods of measurement: a critical review. Clin. Exp. Pharmacol. Physiol. 29, 527–536 (2002)
Ludbrook, J.: Outlying observations and missing values: how should they be handled? Clin. Exp. Pharmacol. Physiol. 35, 670–678 (2008)
Ludbrook, J., Dudley, H.A.F.: Issues in biomedical statistics: analyzing 2 × 2 tables of frequencies. Aust. N. Z. J. Surg. 64, 780–787 (1994)
Ludbrook, J., Dudley, H.A.F.: Issues in biomedical statistics: statistical inference. Aust. N. Z. J. Surg. 64, 630–636 (1994)
Ludbrook, J., Dudley, H.A.F.: Why permutation tests are superior to t and F tests in biomedical research. Am. Stat. 52, 127–132 (1998)
Ludbrook, J., Dudley, H.A.F.: Discussion of “Why permutation tests are superior to t and F tests in biomedical research” by J. Ludbrook and H.A.F. Dudley. Am. Stat. 54, 87 (2000)
Lunneborg, C.E.: Data Analysis by Resampling: Concepts and Applications. Duxbury, Pacific Grove (2000)
Maclure, M., Willett, W.C.: Misinterpretation and misuse of the kappa statistic. Am. J. Epidemiol. 126, 161–169 (1987)
Manly, B.F.J.: Randomization and Monte Carlo Methods in Biology. Chapman & Hall, London (1991)
Manly, B.F.J.: Randomization and Monte Carlo Methods in Biology, 2nd edn. Chapman & Hall, London (1997)
Manly, B.F.J.: Randomization, Bootstrap and Monte Carlo Methods in Biology, 3rd edn. Chapman & Hall/CRC, Boca Raton (2007)
Manly, B.F.J., Francis, R.I.C.: Analysis of variance by randomization when variances are unequal. Aust. N. Z. J. Stat. 41, 411–429 (1999)
Mann, H.B., Whitney, D.R.: On a test of whether one of two random variables is stochastically larger than the other. Ann. Math. Stat. 18, 50–60 (1947)
Margolin, B.H., Light, R.J.: An analysis of variance for categorical data, II: small sample comparisons with chi square and other competitors. J. Am. Stat. Assoc. 69, 755–764 (1974)
Mathew, T., Nordström, K.: Least squares and least absolute deviation procedures in approximately linear models. Stat. Probab. Lett. 16, 153–158 (1993)
Maxim, P.S.: Quantitative Research Methods in the Social Sciences. Oxford, New York (1999)
Maxwell, S.E., Camp, C.J., Arvey, R.D.: Measures of strength of association: a comparative examination. J. Appl. Psychol. 66, 525–534 (1981)
May, R.B., Hunter, M.A.: Some advantages of permutation tests. Can. Psychol. 34, 401–407 (1993)
May, S.M.: Modelling observer agreement: an alternative to kappa. J. Clin. Epidemiol. 47, 1315–1324 (1994)
McCarthy, M.D.: On the application of the z-test to randomized blocks. Ann. Math. Stat. 10, 337–359 (1939)
McGrath, R.E., Meyer, G.J.: When effect sizes disagree: the case of r and d. Psychol. Methods 11, 386–401 (2006)
McHugh, R.B., Mielke, P.W.: Negative variance estimates and statistical dependence in nested sampling. J. Am. Stat. Assoc. 63, 1000–1003 (1968)
McLean, J.E., Ernest, J.M.: The role of statistical significance testing in educational research. Res. Schools 5, 15–22 (1998)
McNemar, Q.: Note on the sampling error of the differences between correlated proportions and percentages. Psychometrika 12, 153–157 (1947)
McQueen, G.: Long-horizon mean-reverting stock prices revisited. J. Financ. Quant. Anal. 27, 1–17 (1992)
Mehta, C.R., Patel, N.R.: Algorithm 643: FEXACT. A FORTRAN subroutine for Fisher’s exact test on unordered r × c contingency tables. ACM Trans. Math. Softw. 12, 154–161 (1986)
Mehta, C.R., Patel, N.R.: A hybrid algorithm for Fisher’s exact test in unordered r × c contingency tables. Commun. Stat. Theory Methods 15, 387–403 (1986)
Mehta, C.R., Patel, N.R., Gray, R.: On computing an exact confidence interval for the common odds ratio in several 2 × 2 contingency tables. J. Am. Stat. Assoc. 80, 969–973 (1985)
Metropolis, N., Ulam, S.: The Monte Carlo method. J. Am. Stat. Assoc. 44, 335–341 (1949)
Meyer, G.J.: Assessing reliability: critical corrections for a critical examination of the Rorschach comprehensive system. Psychol. Assess. 9, 480–489 (1997)
Micceri, T.: The unicorn, the normal curve, and other improbable creatures. Psychol. Bull. 105, 156–166 (1989)
Mielke, P.W.: Asymptotic behavior of two-sample tests based on powers of ranks for detecting scale and location alternatives. J. Am. Stat. Assoc. 67, 850–854 (1972)
Mielke, P.W.: Squared rank test appropriate to weather modification cross-over design. Technometrics 16, 13–16 (1974)
Mielke, P.W.: Convenient beta distribution likelihood techniques for describing and comparing meteorological data. J. Appl. Meteorol. 14, 985–990 (1975)
Mielke, P.W.: Meteorological applications of permutation techniques based on distance functions. In: Krishnaiah, P.R., Sen, P.K. (eds.) Handbook of Statistics, vol. IV, pp. 813–830. North-Holland, Amsterdam (1984)
Mielke, P.W.: Geometric concerns pertaining to applications of statistical tests in the atmospheric sciences. J. Atmos. Sci. 42, 1209–1212 (1985)
Mielke, P.W.: Non-metric statistical analyses: some metric alternatives. J. Stat. Plan. Inference 13, 377–387 (1986)
Mielke, P.W.: The application of multivariate permutation methods based on distance functions in the earth sciences. Earth Sci. Rev. 31, 55–71 (1991)
Mielke, P.W., Berry, K.J.: An extended class of permutation techniques for matched pairs. Commun. Stat. Theory Methods 11, 1197–1207 (1982)
Mielke, P.W., Berry, K.J.: Asymptotic clarifications, generalizations, and concerns regarding an extended class of matched pairs tests based on powers of ranks. Psychometrika 48, 483–485 (1983)
Mielke, P.W., Berry, K.J.: Cumulant methods for analyzing independence of r-way contingency tables and goodness-of-fit frequency data. Biometrika 75, 790–793 (1988)
Mielke, P.W., Berry, K.J.: Permutation tests for common locations among samples with unequal variances. J. Educ. Behav. Stat. 19, 217–236 (1994)
Mielke, P.W., Berry, K.J.: Nonasymptotic inferences based on Cochran’s Q test. Percept. Mot. Skills 81, 319–322 (1995)
Mielke, P.W., Berry, K.J.: Permutation-based multivariate regression analysis: the case for least sum of absolute deviations regression. Ann. Oper. Res. 74, 259–268 (1997)
Mielke, P.W., Berry, K.J.: Permutation covariate analyses of residuals based on Euclidean distance. Psychol. Rep. 81, 795–802 (1997)
Mielke, P.W., Berry, K.J.: Euclidean distance based permutation methods in atmospheric science. Data Min. Knowl. Disc. 4, 7–27 (2000)
Mielke, P.W., Berry, K.J.: Data-dependent analyses in psychological research. Psychol. Rep. 91, 1225–1234 (2002)
Mielke, P.W., Berry, K.J.: Permutation Methods: A Distance Function Approach, 2nd edn. Springer, New York (2007)
Mielke, P.W., Berry, K.J.: A note on Cohen’s weighted kappa coefficient of agreement with linear weights. Stat. Methodol. 6, 439–446 (2009)
Mielke, P.W., Iyer, H.K.: Permutation techniques for analyzing multi-response data from randomized block experiments. Commun. Stat. Theory Methods 11, 1427–1437 (1982)
Mielke, P.W., Berry, K.J., Johnson, E.S.: Multi-response permutation procedures for a priori classifications. Commun. Stat. Theory Methods 5, 1409–1424 (1976)
Mielke, P.W., Berry, K.J., Brier, G.W.: Application of multi-response permutation procedures for examining seasonal changes in monthly mean sea-level pressure patterns. Mon. Weather Rev. 109, 120–126 (1981)
Mielke, H.W., Anderson, J.C., Berry, K.J., Mielke, P.W., Chaney, R.L., Leech, M.: Lead concentrations in inner-city soils as a factor in the child lead problem. Am. J. Public Health 73, 1366–1369 (1983)
Mielke, P.W., Berry, K.J., Landsea, C.W., Gray, W.M.: Artificial skill and validation in meteorological forecasting. Weather Forecast. 11, 153–169 (1996)
Mielke, P.W., Berry, K.J., Neidt, C.O.: A permutation test for multivariate matched-pairs analyses: comparisons with Hotelling’s multivariate matched-pairs \(T^{2}\) test. Psychol. Rep. 78, 1003–1008 (1996)
Mielke, P.W., Berry, K.J., Johnston, J.E.: A FORTRAN program for computing the exact variance of weighted kappa. Percept. Mot. Skills 101, 468–472 (2005)
Mielke, P.W., Berry, K.J., Johnston, J.E.: The exact variance of weighted kappa with multiple raters. Psychol. Rep. 101, 655–660 (2007)
Mielke, P.W., Berry, K.J., Johnston, J.E.: Resampling programs for multiway contingency tables with fixed marginal frequency totals. Psychol. Rep. 101, 18–24 (2007)
Mielke, P.W., Berry, K.J., Johnston, J.E.: Resampling probability values for weighted kappa with multiple raters. Psychol. Rep. 102, 606–613 (2008)
Mielke, P.W., Berry, K.J., Johnston, J.E.: Robustness without rank order statistics. J. Appl. Stat. 38, 207–214 (2011)
Minkowski, H.: Über die positiven quadratischen Formen und über kettenbruchähnliche Algorithmen. Crelle’s J (J. Reine Angew. Math.) 107, 278–297 (1891). [Also available in H. Minkowski, Gesammelte Abhandlungen, vol. 1, AMS Chelsea, New York, 1967]
Mitchell, C., Hartmann, D.P.: A cautionary note on the use of omega squared to evaluate the effectiveness of behavioral treatments. Behav. Assess. 3, 93–100 (1981)
Mood, A.M.: On the asymptotic efficiency of certain nonparametric two-sample tests. Ann. Math. Stat. 25, 514–522 (1954)
Moses, L.E.: Statistical theory and research design. Ann. Rev. Psychol. 7, 233–258 (1956)
Murphy, K.R., Cleveland, J.: Understanding Performance Appraisal: Social, Organizational, and Goal-Based Perspectives. Sage, Thousand Oaks (1995)
Myers, J.L., Well, A.D.: Research Design and Statistical Analysis. HarperCollins, New York (1991)
Nanda, D.N.: Distribution of the sum of roots of a determinantal equation. Ann. Math. Stat. 21, 432–439 (1950)
Neave, H.R., Worthington, P.L.: Distribution-Free Tests. Unwin Hyman, London (1988)
Newson, R.: Parameters behind “nonparametric” statistics: Kendall’s tau, Somers’ D and median differences. Stata J. 2, 45–64 (2002)
Neyman, J., Pearson, E.S.: On the use and interpretation of certain test criteria for purposes of statistical inference: part I. Biometrika 20A, 175–240 (1928)
Neyman, J., Pearson, E.S.: On the use and interpretation of certain test criteria for purposes of statistical inference: part II. Biometrika 20A, 263–294 (1928)
Nix, T.W., Barnette, J.J.: The data analysis dilemma: Ban or abandon. A review of null hypothesis significance testing. Res. Schools 5, 3–14 (1998)
Nix, T.W., Barnette, J.J.: A review of hypothesis testing revisited: Rejoinder to Thompson, Knapp, and Levin. Res. Schools 5, 55–57 (1998)
O’Boyle, Jr., E., Aguinis, H.: The best and the rest: revisiting the norm of normality of individual performance. Pers. Psychol. 65, 79–119 (2012)
Okamoto, D.: Letter to the editor: does it work for coffee? Significance 10, 45–46 (2013)
Olds, E.G.: Distribution of sums of squares of rank differences for small numbers of individuals. Ann. Math. Stat. 9, 133–148 (1938)
Olejnik, S., Algina, J.: Measures of effect size for comparative studies: applications, interpretations, and limitations. Contemp. Educ. Psychol. 25, 241–286 (2000)
Olson, C.L.: On choosing a test statistic in multivariate analysis of variance. Psychol. Bull. 83, 579–586 (1976)
Olson, C.L.: Practical considerations in choosing a MANOVA test statistic: a rejoinder to Stevens. Psychol. Bull. 86, 1350–1352 (1979)
Osgood, C.E., Suci, G., Tannenbaum, P.: The Measurement of Meaning. University of Illinois Press, Urbana (1957)
Overall, J.E., Spiegel, D.K.: Concerning least squares analysis of experimental data. Psychol. Bull. 72, 311–322 (1969)
Pagano, R.R.: Understanding Statistics in the Behavioral Sciences, 6th edn. Wadsworth, Pacific Grove (2001)
Pearson, K.: Contributions to the mathematical theory of evolution. Proc. R. Soc. Lond. 54, 329–333 (1893)
Pearson, K.: Contributions to the mathematical theory of evolution, II. Skew variation in homogeneous material. Philos. Trans. R. Soc. Lond. A 186, 343–414 (1895)
Pearson, K.: Mathematical contributions to the theory of evolution, XIII. On the theory of contingency and its relation to association and normal correlation. In: Drapers’ Company Research Memoirs, Biometric Series I, pp. 1–35. Cambridge University Press, Cambridge (1904)
Pearson, E.S.: Untitled. Nature 123, 866–867 (1929). [Review by E.S. Pearson of the second edition of R.A. Fisher’s Statistical Methods for Research Workers]
Pearson, K., Heron, D.: On theories of association. Biometrika 9, 159–315 (1913)
Pfaffenberger, R., Dinkel, J.: Absolute deviations curve-fitting: an alternative to least squares. In: David, H.A. (ed.) Contributions to Survey Sampling and Applied Statistics, pp. 279–294. Academic Press, New York (1978)
Picard, R.: Randomization and design: II. In: Fienberg, S.E., Hinkley, D.V. (eds.) R. A. Fisher: An Appreciation, pp. 46–58. Springer, Heidelberg (1980)
Pillai, K.C.S.: Some new test criteria in multivariate analysis. Ann. Math. Stat. 26, 117–121 (1955)
Pitman, E.J.G.: Significance tests which may be applied to samples from any populations. Suppl. J. R. Stat. Soc. 4, 119–130 (1937)
Pitman, E.J.G.: Significance tests which may be applied to samples from any populations: II. The correlation coefficient test. Suppl. J. R. Stat. Soc. 4, 225–232 (1937)
Pitman, E.J.G.: Significance tests which may be applied to samples from any populations: III. The analysis of variance test. Biometrika 29, 322–335 (1938)
Randles, R.H., Wolfe, D.A.: Introduction to the Theory of Nonparametric Statistics. Wiley, New York (1979)
Raveh, A.: On measures of monotone association. Am. Stat. 40, 117–123 (1986)
Reinhart, A.: Statistics Done Wrong: The Woefully Complete Guide. No Starch Press, San Francisco (2015)
Rice, J., White, J.: Norms for smoothing and estimation. SIAM Rev. 6, 243–256 (1964)
Ricketts, C., Berry, J.S.: Teaching statistics through resampling. Teach. Stat. 16, 41–44 (1994)
Roberts, J.K., Henson, R.K.: Correcting for bias in estimating effect sizes. Educ. Psychol. Meas. 62, 241–253 (2002)
Robinson, W.S.: Ecological correlations and the behavior of individuals. Am. Sociol. Rev. 15, 351–357 (1950). [Reprinted in Int. J. Epidemiol. 38, 337–341 (2009)]
Robinson, W.S.: The statistical measurement of agreement. Am. Sociol. Rev. 22, 17–25 (1957)
Robinson, W.S.: The geometric interpretation of agreement. Am. Sociol. Rev. 24, 338–345 (1959)
Rosenberg, B., Carlson, D.: A simple approximation of the sampling distribution of least absolute residuals regression estimates. Commun. Stat. Simul. Comput. 6, 421–438 (1977)
Rosenthal, R., Rosnow, R.L., Rubin, D.B.: Contrasts and Effect Sizes in Behavioral Research: A Correlational Approach. Cambridge University Press, Cambridge (2000)
Rouanet, H., Lépine, D.: Comparison between treatments in a repeated measures design: ANOVA and multivariate methods. Br. J. Math. Stat. Psychol. 23, 147–164 (1970)
Rousseeuw, P.J.: Least median of squares regression. J. Am. Stat. Assoc. 79, 421–438 (1984)
Routledge, R.D.: Resolving the conflict over Fisher’s exact test. Can. J. Stat. 20, 201–209 (1992)
Roy, S.N.: On a heuristic method of test construction and its use in multivariate analysis. Ann. Math. Stat. 24, 220–238 (1953)
Roy, S.N.: Some Aspects of Multivariate Analysis. Wiley, New York (1957)
Saal, F.E., Downey, R.G., Lahey, M.A.: Rating the ratings: assessing the quality of rating data. Psychol. Bull. 88, 413–428 (1980)
Salama, I.A., Quade, D.: A note on Spearman’s footrule. Commun. Stat. Simul. Comput. 19, 591–601 (1990)
Salsburg, D.: The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century. Holt, New York (2001)
Särndal, C.E.: A comparative study of association measures. Psychometrika 39, 165–187 (1974)
Satterthwaite, F.E.: An approximate distribution of estimates of variance components. Biom. Bull. 2, 110–114 (1946)
Scheffé, H.: Statistical inference in the non-parametric case. Ann. Math. Stat. 14, 305–332 (1943)
Scheffé, H.: The Analysis of Variance. Wiley, New York (1959)
Schmidt, F.L., Johnson, R.H.: Effect of race on peer ratings in an industrial situation. J. Appl. Psychol. 57, 237–241 (1973)
Schuster, C.: A note on the interpretation of weighted kappa and its relations to other rater agreement statistics for metric scales. Educ. Psychol. Meas. 64, 243–253 (2004)
Scott, W.A.: Reliability of content analysis: the case of nominal scale coding. Public Opin. Q. 19, 321–325 (1955)
Senn, S.: Fisher’s game with the devil. Stat. Med. 13, 217–230 (1994). [Publication of a paper presented at the Statisticians in the Pharmaceutical Industry (PSI) annual conference held in Sept 1991 in Bristol, England]
Senn, S.: Tea for three: of infusions and inferences and milk in first. Significance 9, 30–33 (2012)
Senn, S.: Response to “Tea break” by S. Springate. Significance 10, 46 (2013)
Sheynin, O.B.: R. J. Boscovich’s work on probability. Arch. Hist. Exact Sci. 9, 306–324 (1973)
Shrout, P.E., Fleiss, J.L.: Intraclass correlations: uses in assessing rater reliability. Psychol. Bull. 86, 420–428 (1979)
Shrout, P.E., Spitzer, R.L., Fleiss, J.L.: Quantification of agreement in psychiatric diagnosis revisited. Arch. Gen. Psychiatry 44, 172–177 (1987)
Siegel, S., Castellan, N.J.: Nonparametric Statistics for the Behavioral Sciences, 2nd edn. McGraw-Hill, New York (1988)
Siegel, S., Tukey, J.W.: A nonparametric sum of ranks procedure for relative spread in unpaired samples. J. Am. Stat. Assoc. 55, 429–445 (1960). [Corrigendum: J. Am. Stat. Assoc. 56, 1005 (1961)]
Siegfried, T.: Odds are, it’s wrong. Sci. News 177, 26–29 (2010)
Snedecor, G.W.: Calculation and Interpretation of Analysis of Variance and Covariance. Collegiate Press, Ames (1934)
Snyder, P., Lawson, S.: Evaluating results using corrected and uncorrected effect size estimates. J. Exp. Educ. 61, 334–349 (1993)
Somers, R.H.: A new asymmetric measure of association for ordinal variables. Am. Sociol. Rev. 27, 799–811 (1962)
Spearman, C.E.: The proof and measurement of association between two things. Am. J. Psychol. 15, 72–101 (1904)
Spearman, C.E.: ‘Footrule’ for measuring correlation. Br. J. Psychol. 2, 89–108 (1906)
Spitznagel, E.L., Helzer, J.E.: A proposed solution to the base rate problem in the kappa statistic. Arch. Gen. Psychiatry 42, 725–728 (1985)
Springate, S.: Tea break. Significance 10, 45–46 (2013)
Stark, R., Roberts, I.: Contemporary Social Research Methods. Micro-Case, Bellevue (1996)
Stevens, J.P.: Applied Multivariate Statistics for the Social Sciences. Erlbaum, Hillsdale (1986)
Stevens, J.P.: Intermediate Statistics: A Modern Approach. Erlbaum, Hillsdale (1990)
Still, A.W., White, A.P.: The approximate randomization test as an alternative to the F test in analysis of variance. Br. J. Math. Stat. Psychol. 34, 243–252 (1981)
Stuart, A.: The estimation and comparison of strengths of association in contingency tables. Biometrika 40, 105–110 (1953)
“Student”: The probable error of a mean. Biometrika 6, 1–25 (1908). [“Student” is a nom de plume for William Sealy Gosset]
Susskind, E.C., Howland, E.W.: Measuring effect magnitude in repeated measures ANOVA designs: implications for gerontological research. J. Gerontol. 35, 867–876 (1980)
Tabachnick, B.G., Fidell, L.S.: Using Multivariate Statistics, 5th edn. Pearson, Boston (2007)
Taha, M.A.H.: Rank test for scale parameter for asymmetrical one-sided distributions. Publ. Inst. Stat. Univ. Paris 13, 169–180 (1964)
Taylor, L.D.: Estimation by minimizing the sum of absolute errors. In: Zarembka, P. (ed.) Frontiers in Econometrics, pp. 169–190. Academic Press, New York (1974)
Tedin, O.: The influence of systematic plot arrangements upon the estimate of error in field experiments. J. Agric. Sci. 21, 191–208 (1931)
Thompson, D.W.: On Growth and Form: The Complete Revised Edition. Dover, New York (1992)
Thompson, W.L.: 402 citations questioning the indiscriminate use of null hypothesis significance tests in observational studies. http://www.warnercnr.colostate.edu/~anderson/thompson1.html (2001). Accessed 18 June 2015
Thompson, W.L.: Problems with the hypothesis testing approach. http://www.warnercnr.colostate.edu/~gwhite/fw663/testing.pdf (2001). Accessed 18 June 2015
Thompson, W.D., Walter, S.D.: A reappraisal of the kappa coefficient. J. Clin. Epidemiol. 41, 949–958 (1988)
Trafimow, D.: Editorial. Basic Appl. Soc. Psychol. 36, 1–2 (2014)
Trafimow, D., Marks, M.: Editorial. Basic Appl. Soc. Psychol. 37, 1–2 (2015)
Tschuprov, A.A.: Principles of the Mathematical Theory of Correlation. Hodge, London (1939). [Translated by M. Kantorowitsch]
Tukey, J.W.: Data analysis and behavioral science (1962). [Unpublished manuscript]
Tukey, J.W.: The future of data analysis. Ann. Math. Stat. 33, 1–67 (1962)
Tukey, J.W.: Randomization and re-randomization: the wave of the past in the future. In: Statistics in the Pharmaceutical Industry: Past, Present and Future. Philadelphia Chapter of the American Statistical Association (1988). [Presented at a Symposium in Honor of Joseph L. Ciminera held in June 1988 at Philadelphia, Pennsylvania]
Umesh, U.N.: Predicting nominal variable relationships with multiple response. J. Forecast. 14, 585–596 (1995)
Umesh, U.N., Peterson, R.A., Sauber, M.H.: Interjudge agreement and the maximum value of kappa. Educ. Psychol. Meas. 49, 835–850 (1989)
Ury, H.K., Kleinecke, D.C.: Tables of the distribution of Spearman’s footrule. J. R. Stat. Soc.: Ser. C: Appl. Stat. 28, 271–275 (1979)
van der Reyden, D.: A simple statistical significance test. Rhod. Agric. J. 49, 96–104 (1952)
Vanbelle, S., Albert, A.: A note on the linearly weighted kappa coefficient for ordinal scales. Stat. Methodol. 6, 157–163 (2008)
Vaughan, G.M., Corballis, M.C.: Beyond tests of significance: estimating strength of effects in selected ANOVA designs. Psychol. Bull. 79, 391–395 (1969)
von Eye, A., von Eye, M.: On the marginal dependency of Cohen’s κ. Eur. Psychol. 13, 305–315 (2008)
Wald, A., Wolfowitz, J.: An exact test for randomness in the non-parametric case based on serial correlation. Ann. Math. Stat. 14, 378–388 (1943)
Wallis, W.A.: The correlation ratio for ranked data. J. Am. Stat. Assoc. 34, 533–538 (1939)
Watnik, M.: Early computational statistics. J. Comput. Graph. Stat. 20, 811–817 (2011)
Watterson, I.G.: Nondimensional measures of climate model performance. Int. J. Climatol. 16, 379–391 (1996)
Welch, B.L.: The specification of rules for rejecting too variable a product, with particular reference to an electric lamp problem. Suppl. J. R. Stat. Soc. 3, 29–48 (1936)
Welch, B.L.: On the z-test in randomized blocks and Latin squares. Biometrika 29, 21–52 (1937)
Welch, B.L.: The significance of the difference between two means when the population variances are unequal. Biometrika 29, 350–362 (1938)
Welch, B.L.: On the comparison of several mean values: an alternative approach. Biometrika 38, 330–336 (1951)
Welkowitz, J., Ewen, R.B., Cohen, J.: Introductory Statistics for the Behavioral Sciences, 5th edn. Harcourt Brace, Orlando (2000)
Wherry, R.J.: A new formula for predicting the shrinkage of the coefficient of multiple correlation. Ann. Math. Stat. 2, 440–457 (1931)
Whitehurst, G.J.: Interrater agreement for journal manuscript reviews. Am. Psychol. 39, 22–28 (1984)
Whitfield, J.W.: Rank correlation between two variables, one of which is ranked, the other dichotomous. Biometrika 34, 292–296 (1947)
Wickens, T.D.: Multiway Contingency Tables Analysis for the Social Sciences. Erlbaum, Hillsdale (1989)
Wilcox, R.R.: Statistics for the Social Sciences. Academic Press, San Diego (1996)
Wilcox, R.R.: Applying Contemporary Statistical Techniques. Academic Press, San Diego (2003)
Wilcox, R.R., Muska, J.: Measuring effect size: a non-parametric analogue of \(\hat{\omega }^{2}\). Br. J. Math. Stat. Psychol. 52, 93–110 (1999)
Wilcoxon, F.: Individual comparisons by ranking methods. Biom. Bull. 1, 80–83 (1945)
Wilkinson, L.: Statistical methods in psychology journals: guidelines and explanations. Am. Psychol. 54, 594–604 (1999)
Wilks, S.S.: Certain generalizations in the analysis of variance. Biometrika 24, 471–494 (1932)
Wilson, H.G.: Least squares versus minimum absolute deviations estimation in linear models. Decis. Sci. 9, 322–325 (1978)
Yates, F.: Contingency tables involving small numbers and the \(\chi^{2}\) test. Suppl. J. R. Stat. Soc. 1, 217–235 (1934)
Yule, G.U.: On the association of attributes in statistics: with illustrations from the material of the childhood society. Philos. Trans. R. Soc. Lond. 194, 257–319 (1900)
Yule, G.U.: On the methods of measuring association between two attributes. J. R. Stat. Soc. 75, 579–652 (1912). [Originally a paper read before the Royal Statistical Society on 23 April 1912]
Zwick, R.: Another look at interrater agreement. Psychol. Bull. 103, 374–378 (1988)
© 2016 Springer International Publishing Switzerland
Berry, K.J., Mielke, P.W., Johnston, J.E. (2016). Completely Randomized Data. In: Permutation Statistical Methods. Springer, Cham. https://doi.org/10.1007/978-3-319-28770-6_2
DOI: https://doi.org/10.1007/978-3-319-28770-6_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28768-3
Online ISBN: 978-3-319-28770-6
eBook Packages: Mathematics and Statistics (R0)