Abstract
The goals of this chapter are to consider the problem of parameter estimation by the method of minimal distances and to study the properties of the resulting estimators.
The goals of this chapter are to:

- Consider the problem of parameter estimation by the method of minimal distances,
- Study the properties of the estimators.
Notation introduced in this chapter:

| Notation | Description |
|---|---|
| \(w^{\circ}\) | Brownian bridge |
| \(F_{\theta}(x) = F(x, \theta)\) | Distribution function with parameter θ |
| \(p_{\theta}(x) = p(x, \theta)\) | Density of \(F_{\theta}(x)\) |
1 Introduction
In this chapter, we consider minimal distance estimators resulting from using the \(\mathfrak{N}\)-metrics and compare them with classical M-estimators. This chapter, like Chap. 22, is not directly related to quantitative convergence criteria, although it does demonstrate the importance of \(\mathfrak{N}\)-metrics.
2 Estimating a Location Parameter: First Approach
Let us begin by considering the simple case of estimating a one-dimensional location parameter. Assume that

\[ \mathcal{L}(x, y) = \mathcal{L}(x - y) \]

is a strongly negative definite kernel and

\[ \mathcal{N}(F, G) = 2\iint \mathcal{L}(x - y)\,dF(x)\,dG(y) - \iint \mathcal{L}(x - y)\,dF(x)\,dF(y) - \iint \mathcal{L}(x - y)\,dG(x)\,dG(y) \]

is the corresponding kernel defined on the class of distribution functions (DFs). As we noted in Chap. 22, \(\mathfrak{N}(F,G) = {\mathcal{N}}^{1/2}(F,G)\) is a distance on the class \(\mathbf{B}(\mathcal{L})\) of DFs under the condition

\[ \iint \mathcal{L}(x - y)\,dF(x)\,dF(y) < \infty \quad \text{for all } F \in \mathbf{B}(\mathcal{L}). \]
Suppose that x 1, …, x n is a random sample from a population with DF \(F_{\theta }(x) = F(x - \theta )\), where θ ∈ Θ ⊂ ℝ1 is an unknown parameter (Θ is some interval, which may be infinite). Assume that there exists a density p(x) of F(x) (with respect to the Lebesgue measure). Let F n ∗ (x) be the empirical distribution based on the random sample, and let θ ∗ be a minimum distance estimator of θ, so that

\[ \mathfrak{N}(F_{n}^{*}, F_{\theta^{*}}) = \min_{\theta\in\Theta}\,\mathfrak{N}(F_{n}^{*}, F_{\theta}), \]

or

\[ \theta^{*} = \operatorname*{arg\,min}_{\theta\in\Theta}\,\mathcal{N}(F_{n}^{*}, F_{\theta}). \]
We have

\[ \mathcal{N}(F_{n}^{*}, F_{\theta}) = \frac{2}{n}\sum_{i=1}^{n}\int \mathcal{L}(x_{i} - \theta - u)\,p(u)\,du - \frac{1}{n^{2}}\sum_{i=1}^{n}\sum_{j=1}^{n}\mathcal{L}(x_{i} - x_{j}) - \iint \mathcal{L}(u - v)\,p(u)\,p(v)\,du\,dv. \qquad (23.2.2) \]
Suppose that \(\mathcal{L}(u)\) is differentiable and \(\mathcal{L}\) and p are such that

\[ \iint \big|\mathcal{L}'(x - y)\big|\,p(x)\,p(y)\,dx\,dy < \infty, \qquad (23.2.3) \]

so that differentiation under the integral sign is justified. Then, (23.2.2) implies that θ ∗ is the root of

\[ \frac{d}{d\theta}\sum_{i=1}^{n}\int \mathcal{L}(x_{i} - \theta - u)\,p(u)\,du = 0, \qquad (23.2.4) \]

or

\[ \sum_{i=1}^{n}\int \mathcal{L}'(x_{i} - \theta - u)\,p(u)\,du = 0. \]
Since the estimator θ ∗ satisfies the equation

\[ \sum_{i=1}^{n}\psi(x_{i} - \theta) = 0, \qquad (23.2.5) \]

where

\[ \psi(t) = \int \mathcal{L}'(t - u)\,p(u)\,du, \]

it is an M-estimator.Footnote 1 It is well known [see, e.g., Huber [1981]] that (23.2.4) [or (23.2.5)] determines a consistent estimator only if

\[ \mathbb{E}_{\theta}\,\psi(x_{1} - \theta) = \int \psi(x)\,p(x)\,dx = 0, \]

that is,

\[ \iint \mathcal{L}'(x - u)\,p(x)\,p(u)\,dx\,du = 0. \qquad (23.2.6) \]
We show that if (23.2.3) holds, then (23.2.6) does as well. The integral

\[ \iint \mathcal{L}(x - y)\,p(x - \theta)\,p(y - \theta)\,dx\,dy \]

does not depend on θ. Therefore,

\[ \frac{d}{d\theta}\iint \mathcal{L}(x - y)\,p(x - \theta)\,p(y - \theta)\,dx\,dy = 0. \qquad (23.2.7) \]

On the other hand,

\[ \frac{d}{d\theta}\iint \mathcal{L}(x - y)\,p(x - \theta)\,p(y - \theta)\,dx\,dy = -2\iint \mathcal{L}(x - y)\,p'(x - \theta)\,p(y - \theta)\,dx\,dy. \]

Here, we used the equality \(\mathcal{L}(u - v) = \mathcal{L}(v - u)\). Comparing this with (23.2.7), we find that for θ = 0

\[ \iint \mathcal{L}(x - y)\,p'(x)\,p(y)\,dx\,dy = 0. \qquad (23.2.8) \]

However, integration by parts gives

\[ \int \mathcal{L}(x - y)\,p'(x)\,dx = -\int \mathcal{L}'(x - y)\,p(x)\,dx. \]

Consequently [see (23.2.8)],

\[ \iint \mathcal{L}'(x - y)\,p(x)\,p(y)\,dx\,dy = 0, \]

which proves (23.2.6).
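This symmetry argument is easy to check numerically. The sketch below (an illustration, not from the chapter) takes the kernel \(\mathcal{L}(u) = |u|\), so that \(\mathcal{L}'=\operatorname{sign}\) is odd, together with a deliberately asymmetric density p, and verifies that the double integral in (23.2.6) vanishes:

```python
import numpy as np

# Since L is even, L' is odd, so the integrand L'(x - y) p(x) p(y)
# is antisymmetric under swapping x and y; the double integral is
# therefore zero for ANY density p, symmetric or not.  Here
# L(u) = |u| (L' = sign) and p is the standard exponential density.
x = np.linspace(0.0, 20.0, 1201)      # support of Exp(1), truncated
p = np.exp(-x)                        # density values on the grid
dx = x[1] - x[0]

integrand = np.sign(x[:, None] - x[None, :]) * np.outer(p, p)
I = integrand.sum() * dx * dx         # crude 2-D Riemann sum

print("double integral ≈", I)
```

On a symmetric grid the antisymmetry is exact, so the Riemann sum vanishes up to floating-point rounding regardless of the truncation of the support.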
We see that the minimum \(\mathfrak{N}\)-distance estimator is an M-estimator, and the necessary condition for its consistency is automatically fulfilled.
The standard theory of M-estimators shows that the asymptotic variance of θ ∗ [i.e., the variance of the limiting random variable of \(\sqrt{n}({\theta }^{{_\ast} }- \theta )\) as n → ∞] is

\[ V = \frac{\int \psi^{2}(x)\,p(x)\,dx}{\Big(\int \psi'(x)\,p(x)\,dx\Big)^{2}}, \]

where we assumed the existence of \(\mathcal{L}^{\prime\prime}\) and that the differentiation can be carried out under the integral. Note that when the parameter space Θ is compact, it is clear from geometric considerations that θ ∗ = argminθ ∈ Θ N(F n ∗ , F θ) is unique for sufficiently large n.
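As an illustration of this first approach, the following sketch (the kernel \(\mathcal{L}(u) = |u|\) and the family \(F_\theta = N(\theta, 1)\) are assumptions made here for concreteness; the closed form \(E|c - Z| = c(2\Phi(c) - 1) + 2\phi(c)\) for \(Z \sim N(0,1)\) is standard) minimizes the θ-dependent part of \(\mathcal{N}(F_n^*, F_\theta)\) numerically:

```python
import numpy as np
from math import erf, exp, pi, sqrt
from scipy.optimize import minimize_scalar

# Up to terms not depending on theta,
#   N(F_n*, F_theta) = (2/n) * sum_i E|x_i - theta - Z| + const,
# with Z ~ N(0,1).  We minimize this profile over theta.

def phi(c):  # standard normal density
    return exp(-0.5 * c * c) / sqrt(2.0 * pi)

def Phi(c):  # standard normal CDF
    return 0.5 * (1.0 + erf(c / sqrt(2.0)))

def mean_abs_dev(c):  # E|c - Z| for Z ~ N(0,1), closed form
    return c * (2.0 * Phi(c) - 1.0) + 2.0 * phi(c)

def n_distance_profile(theta, xs):
    return 2.0 * np.mean([mean_abs_dev(x - theta) for x in xs])

rng = np.random.default_rng(0)
xs = rng.normal(loc=2.0, scale=1.0, size=400)   # true theta = 2

res = minimize_scalar(lambda t: n_distance_profile(t, xs),
                      bounds=(xs.min(), xs.max()), method="bounded")
theta_star = res.x
print(f"theta* = {theta_star:.3f}")
```

Since \(E|c - Z|\) is convex in c, the profile is convex in θ, so the bounded scalar minimization is well posed and the estimate lands near the true location parameter.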
3 Estimating a Location Parameter: Second Approach
We now consider another method for estimating a location parameter θ. Let

\[ \theta' = \operatorname*{arg\,min}_{\theta\in\Theta}\,\mathcal{N}(F_{n}^{*}, \delta_{\theta}), \qquad (23.3.1) \]

where δθ is the distribution concentrated at the point θ and F n ∗ is an empirical DF. Proceeding as in Sect. 23.2, it is easy to verify that θ′ is a root of

\[ \sum_{i=1}^{n}\mathcal{L}'(x_{i} - \theta) = 0, \qquad (23.3.2) \]

and so it is a classic M-estimator. A consistent solution of (23.3.2) exists only if

\[ \int \mathcal{L}'(x)\,p(x)\,dx = 0. \qquad (23.3.3) \]
What is the geometric interpretation of (23.3.3)? More precisely, how is the parameter of the measure δθ related to the parameter of the family, that is, to the DF F θ? It must be the same parameter; that is, for all θ1 we must have

\[ \mathcal{N}(F_{\theta}, \delta_{\theta}) \le \mathcal{N}(F_{\theta}, \delta_{\theta_{1}}). \]

Otherwise, the δ-measure closest to F θ would correspond to some value θ1 ≠ θ. It is easy to verify that the last condition is equivalent to (23.3.3). Thus, (23.3.3) has to do with the accuracy of parameterization and has the following geometric interpretation. The space of measures with metric \(\mathfrak{N}\) is isometric to some simplex in a Hilbert space. In this case, δ-measures correspond to the extreme points (vertices) of the simplex. Consequently, (23.3.3) signifies that the vertex closest to the measure with DF F θ corresponds to the same value of the parameter θ (and not to some other value θ1).
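A concrete illustration (again assuming the kernel \(\mathcal{L}(u) = |u|\), which the text does not fix): since \(\mathcal{N}(F_n^*, \delta_\theta) = \frac{2}{n}\sum_i \mathcal{L}(x_i - \theta)\) plus θ-free terms, the second-approach estimator reduces to minimizing the mean absolute deviation, i.e., to the sample median:

```python
import numpy as np

# With L(u) = |u|, minimizing N(F_n*, delta_theta) over theta is,
# up to theta-free terms, minimizing (1/n) * sum_i |x_i - theta|,
# whose minimizer is the sample median.  A brute-force grid check:

rng = np.random.default_rng(1)
xs = np.sort(rng.normal(loc=5.0, scale=2.0, size=201))  # odd n: unique median

grid = np.linspace(xs[0], xs[-1], 20001)
obj = np.abs(xs[:, None] - grid[None, :]).mean(axis=0)  # (1/n) sum |x_i - t|
theta_prime = grid[np.argmin(obj)]

print("grid minimizer:", theta_prime, " sample median:", np.median(xs))
```

The grid minimizer agrees with `np.median` up to the grid spacing, confirming that for this kernel the second approach yields the median.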
4 Estimating a General Parameter
We now consider the case of an arbitrary one-dimensional parameter, which is treated in much the same way as the location parameter. We carry out only formal computations, assuming that all necessary regularity conditions are satisfied.
Let x 1, …, x n be a random sample from a population with DF F(x, θ), θ ∈ Θ ⊂ ℝ1. Assume that p(x, θ) = p θ(x) is the density of F(x, θ). The estimator

\[ \theta^{*} = \operatorname*{arg\,min}_{\theta\in\Theta}\,\mathcal{N}(F_{n}^{*}, F_{\theta}) \]

is an M-estimator defined by the equation

\[ \sum_{i=1}^{n}\psi(x_{i}, \theta) = 0, \qquad (23.4.1) \]

where

\[ \psi(x, \theta) = \int \mathcal{L}(x, y)\,p_{\theta}'(y)\,dy - \iint \mathcal{L}(u, y)\,p_{\theta}(u)\,p_{\theta}'(y)\,du\,dy. \]

Here, \(\mathcal{L}(u,v)\) is a negative definite kernel, which does not necessarily depend on the difference of arguments, and the prime ′ denotes the derivative with respect to θ. As in Sect. 23.2, the necessary condition for consistency,

\[ \int \psi(x, \theta)\,p_{\theta}(x)\,dx = 0, \]

is automatically fulfilled. The asymptotic variance of θ ∗ is given by

\[ V(\theta) = \frac{\int \psi^{2}(x, \theta)\,p_{\theta}(x)\,dx}{\Big(\int \frac{\partial\psi}{\partial\theta}(x, \theta)\,p_{\theta}(x)\,dx\Big)^{2}}. \qquad (23.4.2) \]
We can proceed similarly to Sect. 23.3 to obtain the corresponding results in this case. Since the calculations are quite similar, we do not state these results explicitly. Note that to obtain the existence and uniqueness of θ ∗ for sufficiently large n, we do not need standard regularity conditions such as the existence of variance, differentiability of the density with respect to θ, and so on. These are used only to obtain the estimating equation and to express the asymptotic variance of the estimator.
In general, from the construction of θ ∗ we have

\[ \mathcal{N}(F_{n}^{*}, F_{\theta^{*}}) \le \mathcal{N}(F_{n}^{*}, F_{\theta}) \longrightarrow 0 \quad \text{a.s. as } n\to\infty, \]

and hence, by the triangle inequality,

\[ \mathfrak{N}(F_{\theta}, F_{\theta^{*}}) \le \mathfrak{N}(F_{\theta}, F_{n}^{*}) + \mathfrak{N}(F_{n}^{*}, F_{\theta^{*}}) \le 2\,\mathfrak{N}(F_{\theta}, F_{n}^{*}) \longrightarrow 0 \quad \text{a.s.} \]

In the case of a bounded kernel \(\mathcal{L}\), the convergence is uniform with respect to θ. In this case it is easy to verify that nN(F n ∗ , F θ) converges in distribution to

\[ -\iint \mathcal{L}(x, y)\,dw^{\circ}(F_{\theta}(x))\,dw^{\circ}(F_{\theta}(y)) \]

as n → ∞, where w ∘ is the Brownian bridge.
5 Estimating a Location Parameter: Third Approach
Let us return to the case of estimating a location parameter. We present an example of an estimator obtained by minimizing the \(\mathfrak{N}\)-distance that has good robustness properties. Let

\[ \mathcal{L}_{r}(t) = \min(|t|,\, r), \qquad (23.5.1) \]

where r > 0 is a fixed number. The famous Pólya criterionFootnote 2 implies that the function \(f(t) = 1 -\frac{1} {r}\mathcal{L}_{r}(t) = \max\big(1 - |t|/r,\, 0\big)\) is the characteristic function of some probability distribution. Consequently, \(\mathcal{L}_{r}(t)\) is a negative definite function. This implies that for a sufficiently large sample size n there exists an estimator θ ∗ of minimal \({\mathfrak{N}}^{r}\) distance, where \({\mathcal{N}}^{r}\) is the kernel constructed from \(\mathcal{L}_{r}(x - y)\). If the distribution function F(x − θ) has a symmetric unimodal density p(x − θ) that is absolutely continuous and has finite Fisher information

\[ I(p) = \int \frac{\big(p'(x)\big)^{2}}{p(x)}\,dx < \infty, \]

then we conclude by (23.4.2) that θ ∗ is consistent and asymptotically normal. The estimator θ ∗ satisfies (23.2.5), where

\[ \psi(t) = \int \mathcal{L}_{r}'(t - u)\,p(u)\,du = \int \operatorname{sign}(t - u)\,\mathbf{1}\{|t - u| \le r\}\,p(u)\,du, \]

and hence

\[ \psi(t) = 2F(t) - F(t - r) - F(t + r). \]

Since \(|\psi(t)| \le 1\), the estimator θ ∗ has a bounded influence function and, hence, is B-robust.Footnote 3
Consider now the estimator θ′ obtained by the method discussed in Sect. 23.3. It is easy to verify that this estimator is consistent under the same assumptions. However, θ′ satisfies the equation

\[ \sum_{i=1}^{n}\operatorname{sign}(x_{i} - \theta)\,\mathbf{1}\{|x_{i} - \theta| \le r\} = 0, \]

so that it is a trimmed median. It is well known that the trimmed median is the most B-robust estimator in the corresponding class of M-estimators.Footnote 4
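A sketch of this trimmed median (with the assumed kernel \(\mathcal{L}_r(t) = \min(|t|, r)\)): the estimating function balances the observations inside a window of half-width r around θ and ignores everything farther away, so gross outliers have no influence.

```python
import numpy as np

# Trimmed median: solve  sum_i sign(x_i - theta) * 1{|x_i - theta| <= r} = 0
# by bisecting between two points where the score has opposite signs.

def score(theta, xs, r):
    d = xs - theta
    return float(np.sum(np.sign(d) * (np.abs(d) <= r)))

def trimmed_median(xs, r, tol=1e-8):
    lo, hi = np.percentile(xs, 10), np.percentile(xs, 75)
    assert score(lo, xs, r) > 0 > score(hi, xs, r)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid, xs, r) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = np.random.default_rng(2)
clean = rng.normal(loc=0.0, scale=1.0, size=200)        # true theta = 0
xs = np.concatenate([clean, np.full(20, 50.0)])         # gross outliers

est = trimmed_median(xs, r=2.0)
print("trimmed median =", est)
```

Because every observation with \(|x_i - \theta| > r\) contributes zero to the score, the 20 contaminating points at 50 are simply invisible to the estimator, which stays near the true location 0.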
6 Semiparametric Estimation
Let us now briefly discuss semiparametric estimation. This problem is similar to that considered in Sect. 23.4, except that here we do not assume that the sample comes from a parametric family. Let x 1, …, x n be a random sample from a population given by DF F(x), which belongs to some class of distributions \(\mathcal{P}\). Suppose that the metric \(\mathfrak{N}\) is generated by the negative definite kernel \(\mathcal{L}(x,y)\) and that \(\mathcal{P} \subset \mathbf{B}(\mathcal{L})\). The class \(\mathbf{B}(\mathcal{L})\) is isometric to some subset of a Hilbert space \(\mathfrak{H}\). Moreover, Aronszajn’s theorem implies that \(\mathfrak{H}\) can be chosen to be minimal in some sense. In this case, the definition of \(\mathfrak{N}\) extends to the entire \(\mathfrak{H}\).
We assume that the distributions under consideration lie on some “nonparametric curve.” In other words, there exists a nonlinear functional φ on \(\mathfrak{H}\) such that the distributions F satisfy the condition

\[ \varphi(F) = C. \]

The functional φ is assumed to be smooth: for any \(H \in \mathfrak{H}\),

\[ \varphi(H) = \varphi(G) + \big\langle \varphi'(G),\, H - G\big\rangle + o\big(\|H - G\|\big), \]

where G is fixed.
Under the parametric formulation of Sect. 23.4, the equation for θ has the form

\[ \frac{d}{d\theta}\,\mathcal{N}(F_{\theta}, F_{n}^{*}) = 0, \]

that is,

\[ \Big\langle \mathcal{N}'(F_{\theta}, F_{n}^{*}),\, \frac{dF_{\theta}}{d\theta} \Big\rangle = 0. \]

Here, the equation explicitly depends on the gradient of the functional N(F, F n ∗ ). However, under the nonparametric formulation, we work with the conditional minimum of the functional N(F, F n ∗ ), assuming that F lies on the surface φ(F) = C. Here, our estimator is

\[ \hat{F} = \operatorname*{arg\,min}_{F:\ \varphi(F) = C}\,\mathcal{N}(F, F_{n}^{*}). \]
According to the general rules for finding conditional critical points, we have

\[ \mathcal{N}'(F, F_{n}^{*}) = \lambda\,\varphi'(F), \qquad (23.6.1) \]

where λ is a number. Thus, in the general case, (23.6.1) is an eigenvalue problem. This is the general framework of semiparametric estimation.
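The conditional critical-point condition can be made concrete in a finite-dimensional sketch (the quadratic N, the constraint φ, and all names below are illustrative choices, not from the chapter): minimize a squared distance under a smooth nonlinear constraint and check that the two gradients are proportional, as in (23.6.1).

```python
import numpy as np
from scipy.optimize import minimize

# Finite-dimensional analogue of (23.6.1): minimize N(f) = ||f - g||^2
# subject to phi(f) = C.  At the constrained minimum the gradients are
# proportional: grad N(f) = lambda * grad phi(f).

g = np.array([2.0, 1.0, 0.5])
C = 4.0

def N(f):        return float(np.sum((f - g) ** 2))
def gradN(f):    return 2.0 * (f - g)
def phi(f):      return float(np.sum(f ** 2))   # smooth nonlinear functional
def gradphi(f):  return 2.0 * f

res = minimize(N, x0=np.ones(3), jac=gradN, method="SLSQP",
               constraints=[{"type": "eq",
                             "fun": lambda f: phi(f) - C,
                             "jac": gradphi}])
f_hat = res.x

# Recover lambda by projecting one gradient onto the other.
gN, gP = gradN(f_hat), gradphi(f_hat)
lam = gN @ gP / (gP @ gP)
print("lambda =", lam, " residual =", np.max(np.abs(gN - lam * gP)))
```

The multiplier λ is recovered by least squares, and the residual confirms that at the constrained optimum the gradient of the distance functional is a scalar multiple of the gradient of the constraint.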
References
Hampel FR, Ronchetti EM, Rousseeuw PJ, Stahel WA (1986) Robust statistics: the approach based on influence functions. Wiley, New York
Huber P (1981) Robust statistics. Wiley, New York
Lukacs E (1969) Characteristic functions. Griffin, London
© 2013 Springer Science+Business Media, LLC
Rachev, S.T., Klebanov, L.B., Stoyanov, S.V., Fabozzi, F.J. (2013). Statistical Estimates Obtained by the Minimal Distances Method. In: The Methods of Distances in the Theory of Probability and Statistics. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4869-3_23
Print ISBN: 978-1-4614-4868-6
Online ISBN: 978-1-4614-4869-3