12.1 Ancillarity
Conditioning arguments are at the center of many disputes regarding the foundations of statistical inference. We present here only some simple arguments and examples.
Definition 12.1.1.
A statistic A is ancillary (for θ) if its distribution does not depend on θ.
The basic idea in conditioning is that, since A provides no information on θ, we should use the conditional distribution, given A, for inference on θ. The idea originated with R.A. Fisher and has been discussed and disputed for decades. In some problems most statisticians condition on A, in other problems they do not.
In most problems the sample size is considered fixed, i.e., ancillary, even though it may be determined by the availability of funds or by other considerations unrelated to the parameter θ of interest. Similarly, in regression-type problems (linear models, generalized linear models, etc.) most statisticians condition on the covariates (design matrix). There seem to be no definitive guidelines for when to condition and when not to condition.
Example.
Cox introduced the following example. Consider two measuring devices. Device P produces measurements which are normal with mean θ and variance \(\sigma ^{2}\), and device I produces measurements which are normal with mean θ and variance \(k^{2}\sigma ^{2}\), where k is much larger than 1. Which instrument is used is decided by the flip of a fair coin, so that the precision of the measurement (i.e., which instrument is used) is ancillary.
Thus we would report the value of the measurement and the associated variance, \(\sigma ^{2}\) or \(k^{2}\sigma ^{2}\), depending on the instrument actually used. However, if we do not condition, the true variance of X is
\[\mathrm{Var}(X) = \frac{1}{2}\sigma ^{2} + \frac{1}{2}k^{2}\sigma ^{2} = \frac{(1 + k^{2})\sigma ^{2}}{2}.\]
Note that
\[\sigma ^{2} < \frac{(1 + k^{2})\sigma ^{2}}{2} < k^{2}\sigma ^{2},\]
so that the reported standard error will be either too large (if device P was used) or too small (if device I was used).
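The effect of ignoring the ancillary coin flip can be seen in a small Monte Carlo sketch; the values of θ, σ, and k below are hypothetical, chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, sigma, k = 10.0, 1.0, 10.0   # hypothetical parameter values for illustration
reps = 200_000

# Fair coin decides the instrument: 0 -> precise device P, 1 -> imprecise device I
device = rng.integers(0, 2, size=reps)
sd = np.where(device == 0, sigma, k * sigma)
x = rng.normal(theta, sd)

print(x.var())                # unconditional: near (1 + k^2) sigma^2 / 2 = 50.5
print(x[device == 0].var())   # conditional on P: near sigma^2 = 1
print(x[device == 1].var())   # conditional on I: near k^2 sigma^2 = 100
```

The unconditional variance describes neither measurement actually made: it overstates the error of device P and understates that of device I.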
Example (Valliant, Dorfman and Royall).
There is a population of size \(N = 1{,}000\) from which we have selected a random sample of size \(n = 100\) without replacement. The population mean is estimated by the sample mean \(\bar{y}_{s}\), which has variance estimated by
\[\widehat{\mathrm{Var}}(\bar{y}_{s}) = \left (1 -\frac{n}{N}\right )\frac{s_{y}^{2}}{n},\quad s_{y}^{2} = \frac{1}{n - 1}\sum _{i\in s}(y_{i} -\bar{y}_{s})^{2},\]
where s denotes the set of items selected.
Before we drew the sample, we considered doing a complete census of all 1,000 objects, but we had another study of interest. To decide whether to do the complete census or a sample of size 100 and the other study we flipped a coin. If the result was a head we did the complete census; if the result was a tail we took the sample of size 100.
Taking the coin flip into account, the unconditional variance of the resulting estimator of the population mean is
\[\frac{1}{2} \times 0 + \frac{1}{2}\left (1 -\frac{n}{N}\right )\frac{S^{2}}{n} = \frac{1}{2}\left (1 -\frac{n}{N}\right )\frac{S^{2}}{n},\]
since a complete census determines the population mean without error. Using this as an estimate of variability is clearly wrong, yet it is correct from a frequentist point of view. Note that the same variance would be required even if we had done the complete census. In that case any confidence interval would consist of a set of points, whereas we would know the population mean exactly! Clearly there is a need for conditioning in situations like this.
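A simulation sketch of the coin-flip design, with a hypothetical population, makes the point concrete: the frequentist variance averages over the census branch, where the error is exactly zero:

```python
import numpy as np

rng = np.random.default_rng(1)
N, n = 1000, 100
pop = rng.normal(50.0, 10.0, size=N)   # hypothetical population values
mu = pop.mean()

reps = 20_000
errors = np.empty(reps)
census = rng.random(reps) < 0.5        # head -> complete census
for r in range(reps):
    if census[r]:
        errors[r] = pop.mean() - mu    # census: the mean is known exactly
    else:
        sample = rng.choice(pop, size=n, replace=False)
        errors[r] = sample.mean() - mu

S2 = pop.var(ddof=1)
print(np.var(errors))              # near the unconditional (1/2)(1 - n/N) S^2 / n
print(0.5 * (1 - n / N) * S2 / n)  # theoretical unconditional variance
print(np.var(errors[census]))      # conditional on the census: exactly 0
```

Reporting the unconditional variance after a census would attach a positive standard error to a quantity that is known exactly.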
12.2 Problems with Conditioning
Examples in the previous section indicate that we should condition whenever there is an ancillary statistic. Unfortunately, this is not always so easy. An excellent review article by Ghosh et al. [19] provides many examples and extensions. In particular, examples are given in which there is no unique ancillary statistic.
Some authors have suggested that there are really two major types of ancillarity:
1. Experimental
2. Mathematical
Experimental ancillaries are those such as sample size, covariates, etc., i.e., situations where most statisticians routinely condition. Mathematical ancillaries are those that arise because of the specific nature of the statistical model.
Example (Continuous uniform).
Let \(X_{1},X_{2},\ldots,X_{n}\) be iid with pdf
\[f(x;\theta _{1},\theta _{2}) = \frac{1}{\Delta },\quad \theta _{1} \leq x \leq \theta _{2},\]
where \(\Delta =\theta _{2} -\theta _{1}\).
The joint density is given by
\[f(x_{1},\ldots,x_{n};\theta _{1},\theta _{2}) = \frac{1}{\Delta ^{n}},\quad \theta _{1} \leq x_{(1)} \leq x_{(n)} \leq \theta _{2}.\]
It follows that the minimum and maximum of \(X_{1},X_{2},\ldots,X_{n}\) are minimal sufficient statistics for \(\theta _{1}\) and \(\theta _{2}\).
The joint distribution of the minimum and maximum from a random sample with distribution function F and density function f is easily shown to be
\[f_{Y _{1},Y _{n}}(y_{1},y_{n}) = n(n - 1)\left [F(y_{n}) - F(y_{1})\right ]^{n-2}f(y_{1})f(y_{n}),\quad y_{1} < y_{n},\]
where \(Y _{1}\) is the minimum of the \(X_{i}\)'s and \(Y _{n}\) is the maximum.
For the uniform distribution, we have that
\[F(y) = \frac{y -\theta _{1}}{\Delta },\quad f(y) = \frac{1}{\Delta },\quad \theta _{1} \leq y \leq \theta _{2},\]
so that the joint pdf of \(Y _{1}\) and \(Y _{n}\) is given by
\[f_{Y _{1},Y _{n}}(y_{1},y_{n}) = \frac{n(n - 1)(y_{n} - y_{1})^{n-2}}{\Delta ^{n}},\quad \theta _{1} \leq y_{1} < y_{n} \leq \theta _{2}.\]
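As a quick sanity check on this density, a simulation sketch (with hypothetical values \(\theta _{1} = 0\), \(\theta _{2} = 1\), \(n = 5\)) compares a Monte Carlo estimate of \(P(Y _{n} - Y _{1} \leq 1/2)\) with the value obtained by integrating the joint pdf:

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 5, 500_000

# Hypothetical special case for the check: theta1 = 0, theta2 = 1, so Delta = 1
x = rng.uniform(0.0, 1.0, size=(reps, n))
r = x.max(axis=1) - x.min(axis=1)

# Integrating the joint pdf over {y_n - y_1 <= t} gives P(R <= t) = n t^(n-1) - (n-1) t^n
t = 0.5
exact = n * t**(n - 1) - (n - 1) * t**n
print((r <= t).mean(), exact)   # both near 0.1875
```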
Let \(\theta _{1} =\theta -\rho\) and \(\theta _{2} =\theta +\rho\); then \(\Delta = 2\rho\) and hence the joint density is
\[f_{Y _{1},Y _{n}}(y_{1},y_{n}) = \frac{n(n - 1)(y_{n} - y_{1})^{n-2}}{(2\rho )^{n}},\quad \theta -\rho \leq y_{1} < y_{n} \leq \theta +\rho.\]
If we assume that ρ is known, then the likelihood function for θ is
\[L(\theta ) \propto \mathbf{1}\left (y_{n} -\rho \leq \theta \leq y_{1} +\rho \right ),\]
i.e., it is constant for \(y_{n} -\rho \leq \theta \leq y_{1}+\rho\) and zero otherwise.
For the special case where \(\rho = 1/2\) it is easy to show that \([Y _{1},Y _{n}]\) is a \(100\left (1 - \frac{1}{2^{n-1}}\right )\,\%\) confidence interval for θ.
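This coverage statement follows from a short calculation: with \(\rho = 1/2\), θ is the median, so each observation falls below θ with probability \(1/2\), and
\[P(Y _{1} \leq \theta \leq Y _{n}) = 1 - P(\text{all }X_{i} > \theta ) - P(\text{all }X_{i} < \theta ) = 1 - 2\left (\frac{1}{2}\right )^{n} = 1 - \frac{1}{2^{n-1}}.\]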
Suppose now that \(n = 5\) and we observe \(y_{1} = 0.01\) and \(y_{n} = 0.99\). Then the \(100(1 - \frac{1}{16})\,\% = 93.75\,\%\) confidence interval for θ is 0.01 to 0.99. But since every observation must satisfy \(\theta -\frac{1}{2} \leq x \leq \theta +\frac{1}{2}\), we have
\[\theta -\frac{1}{2} \leq y_{1}\quad \text{and}\quad y_{n} \leq \theta +\frac{1}{2}\]
if and only if
\[y_{n} -\frac{1}{2} \leq \theta \leq y_{1} + \frac{1}{2},\]
and these bounds hold with certainty. Thus with these observed values of \(y_{1}\) and \(y_{n}\) we are certain that
\[0.49 \leq \theta \leq 0.51,\]
and yet our 93.75 % confidence interval is \([0.01,\,0.99]\). This is silly.
As Cox points out, it is imperative to condition on the ancillary statistic in this example, which is the range \(R = Y _{n} - Y _{1}\).
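A simulation sketch shows what conditioning on the range accomplishes here. Unconditionally, \([Y _{1},Y _{n}]\) covers θ 93.75 % of the time for \(n = 5\), but the coverage given the observed range varies from near 0 (small ranges) to exactly 1 (ranges of \(1/2\) or more); the parameter values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
theta, n, reps = 0.5, 5, 400_000   # n = 5 matches the 93.75 % interval above

x = rng.uniform(theta - 0.5, theta + 0.5, size=(reps, n))
y1, yn = x.min(axis=1), x.max(axis=1)
r = yn - y1
covered = (y1 <= theta) & (theta <= yn)

print(covered.mean())   # unconditional coverage, near 1 - 1/2^4 = 0.9375

# Conditional coverage given the ancillary range: min(r / (1 - r), 1)
for lo, hi in [(0.0, 0.1), (0.3, 0.4), (0.6, 1.0)]:
    sel = (r >= lo) & (r < hi)
    print((lo, hi), covered[sel].mean())
```

In particular, whenever the observed range is at least \(1/2\) (as with \(y_{1} = 0.01\), \(y_{n} = 0.99\)) the interval covers θ with certainty, so quoting 93.75 % is misleading in both directions.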
References
Ghosh, M., Reid, N., Fraser, D.A.S.: Ancillary statistics: a review. Statistica Sinica 20, 1309–1332 (2010)
© 2014 Springer International Publishing Switzerland

Rohde, C.A. (2014). Conditionality. In: Introductory Statistical Inference with the Likelihood Function. Springer, Cham. https://doi.org/10.1007/978-3-319-10461-4_12