Abstract
In this paper, we present a robust version of the empirical likelihood estimator for semiparametric moment condition models. This estimator is obtained by minimizing the modified Kullback-Leibler divergence, in its dual form, using truncated orthogonality functions. Some asymptotic properties regarding the limit laws of the estimators are stated.
This work was supported by a grant of the Ministry of Research, Innovation and Digitization, CNCS/CCCDI – UEFISCDI, project number PN-III-P4-ID-PCE-2020-1112, within PNCDI III.
1 Introduction
We consider a moment condition model, namely a family \(\mathcal {M}\) of probability measures Q, all defined on the same measurable space \(({\mathbb {R}^m},\mathcal {B}({\mathbb {R}^m}))\), such that \(\int _{\mathbb {R}^m} g(x,\theta )\,dQ(x) = 0\). The unknown parameter \(\theta \) belongs to the interior of a compact set \(\varTheta \subset \mathbb {R}^{d}\), and the function \(g:=(g_{1},\ldots ,g_{\ell })^{\top }\), with \(\ell \ge d\), is defined on the set \({\mathbb {R}^m}\times \varTheta \), each \(g_{i}\) being a real-valued function. Denote by M the set of all probability measures on \(({\mathbb {R}^m},\mathcal {B}(\mathbb {R}^m))\) and define the sets
\[\mathcal {M}_{\theta } := \left\{ Q\in M \text{ such that } \int _{\mathbb {R}^m} g(x,\theta )\,dQ(x) = 0\right\} , \quad \theta \in \varTheta .\]
Then the moment condition model \(\mathcal {M}\) can be written in the form
\[\mathcal {M} = \bigcup _{\theta \in \varTheta } \mathcal {M}_{\theta }.\]
Let \(X_1,\ldots ,X_n\) be an i.i.d. sample with unknown probability measure \(P_{0}\). We assume that the equation \(\int _{\mathbb {R}^m} g(x,\theta )\,dP_0(x) = 0\) has a unique solution (in \(\theta \)) which will be denoted \(\theta _0\). We consider the estimation problem of the true unknown value \(\theta _{0}\).
Among the best-known methods for estimating the parameter \(\theta _{0}\), we recall the Generalized Method of Moments (GMM) of [6], the Continuous Updating (CU) estimator of [7], the Empirical Likelihood (EL) estimator of [8, 14, 15], the Exponential Tilting (ET) estimator of [11], as well as the Generalized Empirical Likelihood (GEL) class of estimators of [13], which contains the EL, ET and CU estimators as particular cases. Some alternative methods have been proposed in order to improve the finite sample accuracy or the robustness under misspecification of the model, for example in [4, 8, 11, 12, 16].
The authors of [3] developed a general methodology for estimation and testing in moment condition models. Their approach is based on minimizing divergences in dual form and allows the asymptotic study of the estimators (called minimum empirical divergence estimators) and of the associated test statistics, both under the model and under misspecification of the model. Using the approach based on the influence function, [18] studied robustness properties for these classes of estimators and test statistics, showing that the minimum empirical divergence estimators of the parameter \(\theta _{0}\) of the model are generally not robust. This approach based on divergences and duality was initially used in the case of parametric models, the results being published in [2, 19, 20].
The classical EL estimator is a particular case of the class of estimators from [3], namely the one obtained with the modified Kullback-Leibler divergence. Although the EL estimator is superior to the other estimators mentioned above in terms of higher-order asymptotic efficiency, this property holds only when the moment conditions are correctly specified. It is known that the EL estimator and the EL ratio test for moment condition models are not robust with respect to the presence of outliers in the sample. Also, [17] showed that, when the support of the probability measure corresponding to the model and the orthogonality functions are not bounded, the EL estimator is not \(\sqrt{n}\)-consistent under misspecification.
In this paper, we present a robust version of the EL estimator for moment condition models. This estimator is defined by minimizing an empirical version of the modified Kullback-Leibler divergence in dual form, using truncated orthogonality functions. For this estimator, we present some asymptotic properties regarding both consistency and limit laws. The robust EL estimator is \(\sqrt{n}\)-consistent, even under misspecification, which gives a solution to the problem noted by [17] for the EL estimator.
2 A Robust Version of the Empirical Likelihood Estimator
Let \(\left\{ P_{\theta }; \theta \in \varTheta \right\} \) be a reference identifiable model, containing probability measures such that, for each \(\theta \in \varTheta \), \(P_{\theta }\in \mathcal {M}_{\theta }\), meaning that \(\int _{\mathbb {R}^m} g(x,\theta )\,dP_{\theta } (x)=0\), and \(\theta \) is the unique solution of the equation. We assume that the probability measure \(P_{0}\) of the data, corresponding to the true unknown value \(\theta _{0}\) of the parameter to be estimated, belongs to this reference model. The reference model will be associated with the truncated orthogonality function \(g_c\), defined hereafter, that will be used in the definition of the robust version of the EL estimator of the parameter \(\theta _{0}\). We use the notation \(\Vert \cdot \Vert \) for the Euclidean norm. As in [16], using the reference model \(\left\{ P_{\theta };\, \theta \in \varTheta \right\} \), define the function \(g_{c}: \mathbb {R}^{m}\times \varTheta \rightarrow \mathbb {R}^{\ell }\),
\[g_{c}(x,\theta ) := H_{c}\left( A_{\theta }\,[g(x,\theta )-\tau _{\theta }]\right) ,\]
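where \(H_{c}:\mathbb {R}^{\ell }\rightarrow \mathbb {R}^{\ell }\) is Huber's function
\[H_{c}(y) := \begin{cases} y\,\min \left( 1, \dfrac{c}{\Vert y\Vert }\right) & \text{if } y\ne 0,\\ 0 & \text{if } y=0, \end{cases}\]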
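and \(A_\theta \), \(\tau _\theta \) are determined by the solutions of the system of implicit equations
\[\int _{\mathbb {R}^m} g_{c}(x,\theta )\,dP_{\theta }(x) = 0, \qquad \int _{\mathbb {R}^m} g_{c}(x,\theta )\,g_{c}(x,\theta )^{\top }\,dP_{\theta }(x) = I_{\ell }, \tag{4}\]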
where \(I_\ell \) is the \(\ell \times \ell \) identity matrix and \(c >0\) is a given constant. Therefore, we have \(\Vert g_{c}(x,\theta )\Vert \le c\), for all x and \(\theta \). We also use the function
\[g_{c}(x,\theta ,A,\tau ) := H_{c}\left( A\,[g(x,\theta )-\tau ]\right) ,\]
when we need to make explicit the dependence on the \(\ell \times \ell \) matrix A and on the \(\ell \)-dimensional vector \(\tau \). Then,
\[g_{c}(x,\theta ) = g_{c}(x,\theta ,A_{\theta },\tau _{\theta }),\]
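where \(A_\theta \) and \(\tau _\theta \) are the solution of (4). For given \(P_{\theta }\) from the reference model, the triplet \((\theta , A_\theta , \tau _\theta )\) is the unique solution of the system
\[\int _{\mathbb {R}^m} g(x,\theta )\,dP_{\theta }(x) = 0, \qquad \int _{\mathbb {R}^m} g_{c}(x,\theta ,A,\tau )\,dP_{\theta }(x) = 0, \qquad \int _{\mathbb {R}^m} g_{c}(x,\theta ,A,\tau )\,g_{c}(x,\theta ,A,\tau )^{\top }\,dP_{\theta }(x) = I_{\ell }.\]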
The uniqueness is justified in [16], p. 48.
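As an illustration of the truncation step, the following minimal Python sketch evaluates a multivariate Huber function and the resulting bounded orthogonality function. It assumes the standard form \(H_c(y) = y\min (1, c/\Vert y\Vert )\) with \(H_c(0)=0\), and it takes the matrix `A` and the vector `tau` as given inputs rather than solving the implicit system (4). The names `huber_truncate` and `g_c` are hypothetical helpers introduced here, not notation from the paper.

```python
import numpy as np

def huber_truncate(y, c):
    """Assumed multivariate Huber function: y * min(1, c / ||y||), with H_c(0) = 0.

    `y` holds one vector of R^l per row; a 1-D input is treated as a single vector.
    """
    y = np.atleast_2d(np.asarray(y, dtype=float))
    norms = np.linalg.norm(y, axis=1)
    # The scale factor is 1 inside the ball of radius c and c / ||y|| outside it.
    scale = np.ones_like(norms)
    outside = norms > c
    scale[outside] = c / norms[outside]
    return y * scale[:, None]

def g_c(g_values, A, tau, c):
    """Truncated orthogonality function H_c(A [g - tau]), applied row by row.

    `A` and `tau` are supplied by the caller (hypothetical inputs); this sketch
    does not compute them from the reference model.
    """
    g_values = np.atleast_2d(np.asarray(g_values, dtype=float))
    return huber_truncate((g_values - tau) @ np.asarray(A, dtype=float).T, c)

# Usage: the first row lies inside the ball of radius c and is unchanged,
# the second row is rescaled so that its Euclidean norm equals c.
G = np.array([[0.3, -0.4], [30.0, 40.0]])
truncated = g_c(G, np.eye(2), np.zeros(2), c=2.0)
```

By construction, every row of the output has Euclidean norm at most `c`, which is the boundedness property used throughout this section.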
In what follows, we will use the so-called modified Kullback-Leibler divergence between probability measures, say Q and P, defined by
\[KL_m(Q,P) := \int _{\mathbb {R}^m} \varphi \left( \frac{dQ}{dP}(x)\right) dP(x),\]
if Q is absolutely continuous with respect to P, and \(KL_m(Q,P):=+\infty \) otherwise. The strictly convex function \(\varphi \) is defined by \(\varphi (x):=-\log x+x-1\), if \(x>0\), and \(\varphi (x):=+\infty \), if \(x\le 0\). A straightforward calculation shows that the convex conjugate (see Note 1) of the convex function \(\varphi \) is \(\psi (u)=-\log (1-u)\) if \(u<1\), and \(\psi (u)=+\infty \) if \(u\ge 1\). Recall that the Kullback-Leibler divergence, between any probability measures Q and P, is defined by
\[KL(Q,P) := \int _{\mathbb {R}^m} \varphi \left( \frac{dQ}{dP}(x)\right) dP(x),\]
if Q is absolutely continuous with respect to P, and \(KL(Q,P):=+\infty \) otherwise. Here, the strictly convex function \(\varphi \) is defined by \(\varphi (x)=x\log x -x +1\), if \(x\ge 0\), and \(\varphi (x) = +\infty \), if \(x<0\). Notice also that \(KL_m(Q,P)=KL(P,Q)\), for all probability measures Q and P. For any subset \(\varOmega \) of M, we define the \(KL_m\)-divergence, between \(\varOmega \) and any probability measure P, by
\[KL_m(\varOmega ,P) := \inf _{Q\in \varOmega } KL_m(Q,P).\]
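The two facts stated above, the closed form of the convex conjugate \(\psi (u)=-\log (1-u)\) and the identity \(KL_m(Q,P)=KL(P,Q)\), can be checked numerically. The sketch below is only a sanity check under simplifying assumptions: the supremum defining the conjugate is approximated on a finite grid, and the divergences are computed for discrete distributions with strictly positive weights. All function names are hypothetical.

```python
import math

def phi_klm(x):
    """phi(x) = -log x + x - 1 for x > 0, +inf otherwise (modified KL)."""
    return -math.log(x) + x - 1.0 if x > 0 else math.inf

def psi_klm(u):
    """Closed form stated in the text: psi(u) = -log(1 - u) for u < 1."""
    return -math.log(1.0 - u) if u < 1 else math.inf

def phi_kl(x):
    """phi(x) = x log x - x + 1 for x >= 0 (with 0 log 0 = 0), +inf otherwise."""
    if x < 0:
        return math.inf
    return x * math.log(x) - x + 1.0 if x > 0 else 1.0

def conjugate_on_grid(u, lo=1e-3, hi=50.0, steps=50_000):
    """Crude approximation of sup_x {u x - phi(x)} on a grid in [lo, hi].

    Only meaningful when the maximizer 1 / (1 - u) falls inside [lo, hi].
    """
    h = (hi - lo) / steps
    return max(u * (lo + k * h) - phi_klm(lo + k * h) for k in range(steps + 1))

def divergence(q, p, phi):
    """sum_i p_i * phi(q_i / p_i) for discrete Q and P with p_i > 0."""
    return sum(pi * phi(qi / pi) for qi, pi in zip(q, p))

# Usage: compare the grid supremum with the closed form, then check the identity.
gap = psi_klm(0.5) - conjugate_on_grid(0.5)      # small and nonnegative
q, p = [0.2, 0.3, 0.5], [0.5, 0.25, 0.25]
klm_qp = divergence(q, p, phi_klm)               # KL_m(Q, P)
kl_pq = divergence(p, q, phi_kl)                 # KL(P, Q)
```

The grid supremum can only underestimate the true supremum, so `gap` is a small nonnegative number for the values of `u` tested here.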
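Define the moment condition model
\[\mathcal {M}_{c} := \bigcup _{\theta \in \varTheta } \mathcal {M}_{c,\theta } := \bigcup _{\theta \in \varTheta } \left\{ Q\in M \text{ such that } \int _{\mathbb {R}^m} g_{c}(x,\theta )\,dQ(x) = 0\right\} .\]
For any \(\theta \in \varTheta \), define the set
\[\varLambda _{c,\theta } := \left\{ t\in \mathbb {R}^{1+\ell } \text{ such that } \int _{\mathbb {R}^m} \Big| \psi \Big( t_0+\sum _{j=1}^{\ell } t_j\, g_{c,j}(x,\theta )\Big) \Big| \,dP_0(x) < \infty \right\} .\]
Since \(g_{c}(x,\theta )\) is bounded (with respect to x), Theorem 1.1 in [1] and Proposition 4.2 in [3] yield the following dual representation of the divergence
\[KL_m(\mathcal {M}_{c,\theta },P_0) = \sup _{t\in \varLambda _{c,\theta }} \int _{\mathbb {R}^m} m_{c}(x,\theta ,t)\,dP_0(x), \tag{10}\]
where
\[m_{c}(x,\theta ,t) := t_0 - \psi \Big( t_0+\sum _{j=1}^{\ell } t_j\, g_{c,j}(x,\theta )\Big) = t_0 + \log \Big( 1-t_0-\sum _{j=1}^{\ell } t_j\, g_{c,j}(x,\theta )\Big) ,\]
and the supremum in (10) is reached, provided that \(KL_m(\mathcal {M}_{c,\theta },P_0)\) is finite. Moreover, the supremum in (10) is unique under the following assumption
\[P_0\left( \left\{ x\in \mathbb {R}^m \text{ such that } \overline{t}^{\top }\overline{g}(x,\theta )\ne 0\right\} \right) > 0, \quad \text{for all } \overline{t}\in \mathbb {R}^{1+\ell }\setminus \{0\},\]
where \(\overline{t}:=(t_0,t_1,\ldots ,t_\ell )^\top \) and \(\overline{g}:=(g_0,g_{c,1},\ldots , g_{c,\ell })^\top \). This last condition is satisfied if the functions \(g_0(\cdot ):=\mathbf {1}_{\mathbb {R}^{m}}(\cdot ),g_{c,1}(\cdot ,\theta ),\dots ,g_{c,\ell }(\cdot ,\theta )\) are linearly independent and \(P_0\) is not degenerate. The empirical measure, associated with the sample \(X_1,\ldots , X_n\), is defined by
\[P_n(\cdot ) := \frac{1}{n}\sum _{i=1}^{n} \delta _{X_i}(\cdot ),\]
\(\delta _{x}(\cdot )\) being the Dirac measure at the point x. Denote
\[\varLambda _{c,\theta }^{(n)} := \left\{ t\in \mathbb {R}^{1+\ell } \text{ such that } \int _{\mathbb {R}^m} \Big| \psi \Big( t_0+\sum _{j=1}^{\ell } t_j\, g_{c,j}(x,\theta )\Big) \Big| \,dP_n(x) < \infty \right\} .\]
In view of relation (10), for given \(\theta \in \varTheta \), a natural estimator of
\[t_{c,\theta } := \underset{t\in \varLambda _{c,\theta }}{\arg \sup } \int _{\mathbb {R}^m} m_{c}(x,\theta ,t)\,dP_0(x)\]
can be defined by “plug-in” as follows
\[\widehat{t}_{c,\theta } := \underset{t\in \varLambda _{c,\theta }^{(n)}}{\arg \sup } \int _{\mathbb {R}^m} m_{c}(x,\theta ,t)\,dP_n(x). \tag{15}\]
A “dual” plug-in estimator of the modified Kullback-Leibler divergence, between \(\mathcal {M}_{c,\theta }\) and \(P_0\), can then be defined by
\[\widehat{KL_m}(\mathcal {M}_{c,\theta },P_0) := \sup _{t\in \varLambda _{c,\theta }^{(n)}} \int _{\mathbb {R}^m} m_{c}(x,\theta ,t)\,dP_n(x) = \sup _{(t_0,t_1,\ldots ,t_\ell )\in \mathbb {R}^{1+\ell }} \left\{ t_0 + \frac{1}{n}\sum _{i=1}^{n} \overline{\log }\Big( 1-t_0-\sum _{j=1}^{\ell } t_j\, g_{c,j}(X_i,\theta )\Big) \right\} , \tag{16}\]
where \(\overline{\log }(\cdot )\) is the extended logarithm function, i.e., the function defined by \(\overline{\log }(u) = \log (u)\) if \(u>0\), and \(\overline{\log }(u) = -\infty \) if \(u\le 0\). Hence,
\[KL_m(\mathcal {M}_{c},P_0) = \inf _{\theta \in \varTheta } KL_m(\mathcal {M}_{c,\theta },P_0)\]
can be estimated by
\[\widehat{KL_m}(\mathcal {M}_{c},P_0) := \inf _{\theta \in \varTheta } \widehat{KL_m}(\mathcal {M}_{c,\theta },P_0) = \inf _{\theta \in \varTheta }\, \sup _{(t_0,t_1,\ldots ,t_\ell )\in \mathbb {R}^{1+\ell }} \left\{ t_0 + \frac{1}{n}\sum _{i=1}^{n} \overline{\log }\Big( 1-t_0-\sum _{j=1}^{\ell } t_j\, g_{c,j}(X_i,\theta )\Big) \right\} . \tag{18}\]
Since \(\theta _0 = \underset{\theta \in \varTheta }{\arg \inf }\, KL_m(\mathcal {M}_{c,\theta }, P_0)\), where the infimum is unique, we propose to estimate \(\theta _0\) by
\[\widehat{\theta }_{c} := \underset{\theta \in \varTheta }{\arg \inf }\, \sup _{(t_0,t_1,\ldots ,t_\ell )\in \mathbb {R}^{1+\ell }} \left\{ t_0 + \frac{1}{n}\sum _{i=1}^{n} \overline{\log }\Big( 1-t_0-\sum _{j=1}^{\ell } t_j\, g_{c,j}(X_i,\theta )\Big) \right\} , \tag{19}\]
which can be seen as a “robust” version of the classical EL estimator. Recall that the EL estimator, see e.g. [14], can be written as
\[\widehat{\theta } := \underset{\theta \in \varTheta }{\arg \inf }\, \sup _{(t_0,t_1,\ldots ,t_\ell )\in \mathbb {R}^{1+\ell }} \left\{ t_0 + \frac{1}{n}\sum _{i=1}^{n} \overline{\log }\Big( 1-t_0-\sum _{j=1}^{\ell } t_j\, g_{j}(X_i,\theta )\Big) \right\} .\]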
A slightly different definition of an estimator for the parameter \(\theta _{0}\) was introduced in [10], where robustness and consistency properties are also stated. However, the limiting distribution of the estimator in [10] is not standard and not easy to obtain, because the bounded orthogonality functions used there depend on both \(\theta \) and the data. The present version is simpler and does not present this difficulty. In the following subsections, we give the influence function of the estimator (19) and state both consistency and the limiting distributions of all the proposed estimators (15), (16), (19) and (18).
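For a fixed \(\theta \), the inner supremum over t is a concave maximization problem and can be solved numerically. The following Python sketch is one possible way to do this, not the authors' implementation. It assumes that the dual objective has the form \(t_0 + \frac{1}{n}\sum _{i} \log (1-t_0-\sum _{j} t_j\, g_{c,j}(X_i,\theta ))\), which is what the conjugate \(\psi (u)=-\log (1-u)\) gives, and it uses a damped Newton iteration. The function name `dual_kl_m` and the input matrix `G`, whose rows are the already truncated values \(g_c(X_i,\theta )\), are hypothetical.

```python
import numpy as np

def dual_kl_m(G, tol=1e-10, max_iter=100):
    """Maximize t0 + mean(log(1 - t0 - G @ t)) over (t0, t) by damped Newton steps.

    G: array of shape (n, l) holding g_c(X_i, theta) for a fixed theta
       (a 1-D array is read as n observations with l = 1).
    Returns (value, t_bar) where t_bar = (t0, t1, ..., tl).
    """
    G = np.asarray(G, dtype=float)
    if G.ndim == 1:
        G = G[:, None]
    n = G.shape[0]
    Z = np.hstack([np.ones((n, 1)), G])      # rows are (1, g_c(X_i, theta))
    e0 = np.zeros(Z.shape[1])
    e0[0] = 1.0
    t = np.zeros(Z.shape[1])                 # t = 0 is always feasible

    def objective(t_bar):
        u = 1.0 - Z @ t_bar
        if np.any(u <= 0):                   # extended logarithm: -inf outside the domain
            return -np.inf
        return t_bar[0] + np.mean(np.log(u))

    value = objective(t)
    for _ in range(max_iter):
        u = 1.0 - Z @ t
        grad = e0 - (Z / u[:, None]).mean(axis=0)
        if np.linalg.norm(grad) < tol:
            break
        hess = -(Z / u[:, None] ** 2).T @ Z / n   # negative definite if Z has full rank
        step = np.linalg.solve(hess, -grad)
        s = 1.0
        # Halve the step until the iterate is feasible and does not decrease the objective.
        while s > 1e-12 and objective(t + s * step) < value:
            s *= 0.5
        if s <= 1e-12:
            break
        t = t + s * step
        value = objective(t)
    return value, t

# Usage on a toy sample with l = 1 (values chosen only for illustration).
value, t_bar = dual_kl_m(np.array([-1.0, 2.0]))
```

The returned `value` plays the role of the dual plug-in divergence estimate for the given \(\theta \), and `2 * n * value` is the statistic appearing in the asymptotic results below. An estimate of \(\theta _0\) would then be obtained by minimizing this value over \(\theta \), for instance on a grid, recomputing `G` for each candidate \(\theta \).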
2.1 Robustness Property
The classical EL estimator of the parameter \(\theta _{0}\) of a moment condition model can be obtained as a particular case of the class of minimum empirical divergence estimators introduced by [3]. [18] showed that the influence functions of the estimators from this class, and in particular the influence function of the EL estimator, are each proportional to the orthogonality function \(g(x,\theta _{0})\) of the model. These influence functions also coincide with the influence function of the GMM estimator obtained by [16]. Therefore, when \(g(x,\theta )\) is not bounded in x, all these estimators, and in particular the EL estimator of \(\theta _{0}\), are not robust.
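Denote by \(T_c(\cdot )\) the statistical functional associated with the estimator \(\widehat{\theta }_c\), so that \(\widehat{\theta }_c = T_c(P_n)\). The influence function \(\mathrm {IF}(x;T_{c},P_{0})\) of \(T_c\) at \(P_0\) is defined by
\[\mathrm {IF}(x;T_{c},P_{0}) := \lim _{\varepsilon \downarrow 0} \frac{T_c(P_{\varepsilon ,x})-T_c(P_0)}{\varepsilon },\]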
where \(P_{\varepsilon ,x}(\cdot ) = (1-\varepsilon )\, P_0(\cdot ) + \varepsilon \, \delta _x(\cdot )\), \(\varepsilon \in \,]0,1[\); see e.g. [5]. The influence function \(\mathrm {IF}(x;T_{c},P_{0})\) of the estimator \(\widehat{\theta }_{c}\) presented in this paper is linearly related to the bounded function \(g_{c}(x,\theta )\), more precisely, the following result holds
which implies the robustness of the estimator \(\widehat{\theta }_{c}\) of the parameter \(\theta _{0}\). The proof of this result is similar to the one presented in [10].
2.2 Asymptotic Properties
In this subsection, we give the limiting distributions of the proposed estimators, under some regularity assumptions similar to those used by [3].
Proposition 1
For any fixed \(\theta \in \varTheta \), under some regularity assumptions, we have
(1) \(\sqrt{n}(\widehat{t}_{c,\theta } - t_{c,\theta })\) converges in distribution to a centered normal random vector;
(2) If \(P_0\not \in \mathcal {M}_{c,\theta }\), then \(\sqrt{n}(\widehat{KL_m} (\mathcal {M}_{c,\theta }, P_0)-KL_m (\mathcal {M}_{c,\theta }, P_0))\) converges in distribution to a centered normal random variable;
(3) If \(P_0\in \mathcal {M}_{c,\theta }\), then \(2n\, \widehat{KL_m} (\mathcal {M}_{c,\theta }, P_0)\) converges in distribution to a \(\chi ^2(\ell )\) random variable.
Proposition 2
Under some regularity assumptions, we have
(1) \(\sqrt{n}\left( \widehat{\theta }_c - \theta _0\right) \) converges in distribution to a centered normal random vector;
(2) If \(\ell >d\), then \(2n \, \widehat{KL_m}(\mathcal {M}_c,P_0)\) converges in distribution to a \(\chi ^2(\ell - d)\) random variable.
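The chi-square limits in Propositions 1 and 2 suggest calibrating the statistic \(2n\,\widehat{KL_m}\) against a chi-square distribution with \(\ell \) or \(\ell - d\) degrees of freedom. The Python sketch below shows how an asymptotic p-value could be computed from such a statistic. It is an illustration only: the function names are hypothetical, and the chi-square survival function is evaluated with the recurrence \(Q(a+1,z)=Q(a,z)+z^{a}e^{-z}/\varGamma (a+1)\) for the regularized upper incomplete gamma function, so that only the standard library is needed.

```python
import math

def chi2_sf(x, df):
    """Survival function P(chi2(df) > x) for an integer df >= 1.

    Starts from the closed forms for df = 1 (erfc) and df = 2 (exponential)
    and adds two degrees of freedom at a time via the recurrence in the lead-in.
    """
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))   # df = 1
    else:
        a, q = 1.0, math.exp(-z)              # df = 2
    while 2 * a < df:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def el_chi2_pvalue(kl_hat, n, df):
    """Asymptotic p-value of the statistic 2 * n * kl_hat under a chi2(df) limit.

    kl_hat: value of the dual plug-in divergence estimate (hypothetical input),
    n: sample size, df: l for a fixed theta, or l - d for the estimated model.
    """
    return chi2_sf(2.0 * n * kl_hat, df)

# Usage: a statistic equal to the 95% quantile of chi2(2) gives a p-value of 0.05.
p_value = el_chi2_pvalue(kl_hat=5.991464547107979 / 200.0, n=100, df=2)
```

A small p-value would indicate evidence against the truncated moment conditions at the chosen level, subject to the regularity assumptions under which the limits hold.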
Notes
1. The convex conjugate, also called the Fenchel-Legendre transform, of \(\varphi \) is the function defined on \(\mathbb {R}\) by \(\psi (u) := \sup _{x\in \mathbb {R}}\{u x - \varphi (x) \} , \, \forall u\in \mathbb {R}\).
References
1. Broniatowski, M., Keziou, A.: Minimization of \(\phi \)-divergences on sets of signed measures. Stud. Sci. Math. Hung. 43(4), 403–442 (2006)
2. Broniatowski, M., Keziou, A.: Parametric estimation and tests through divergences and the duality technique. J. Multivar. Anal. 100(1), 16–36 (2009)
3. Broniatowski, M., Keziou, A.: Divergences and duality for estimation and test under moment condition models. J. Stat. Plann. Infer. 142(9), 2554–2573 (2012)
4. Felipe, A., Martin, N., Miranda, P., Pardo, L.: Testing with exponentially tilted empirical likelihood. Methodol. Comput. Appl. Probab. 20, 1–40 (2018)
5. Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., Stahel, W.A.: Robust Statistics: The Approach Based on Influence Functions. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics. Wiley, New York (1986)
6. Hansen, L.P.: Large sample properties of generalized method of moments estimators. Econometrica 50(4), 1029–1054 (1982)
7. Hansen, L.P., Heaton, J., Yaron, A.: Finite-sample properties of some alternative generalized method of moments estimators. J. Bus. Econ. Stat. 14, 262–280 (1996)
8. Imbens, G.W.: One-step estimators for over-identified generalized method of moments models. Rev. Econ. Stud. 64(3), 359–383 (1997)
9. Imbens, G.W.: Generalized method of moments and empirical likelihood. J. Bus. Econ. Stat. 20(4), 493–506 (2002)
10. Keziou, A., Toma, A.: A robust version of the empirical likelihood estimator. Mathematics 9(8), 829 (2021)
11. Kitamura, Y., Stutzer, M.: An information-theoretic alternative to generalized method of moments estimation. Econometrica 65(4), 861–874 (1997)
12. Lô, S.N., Ronchetti, E.: Robust small sample accurate inference in moment condition models. Comput. Stat. Data Anal. 56(11), 3182–3197 (2012)
13. Newey, W.K., Smith, R.J.: Higher order properties of GMM and generalized empirical likelihood estimators. Econometrica 72(1), 219–255 (2004)
14. Owen, A.: Empirical likelihood. Chapman and Hall, New York (2001)
15. Qin, J., Lawless, J.: Empirical likelihood and general estimating equations. Ann. Stat. 22(1), 300–325 (1994)
16. Ronchetti, E., Trojani, F.: Robust inference with GMM estimators. J. Econ. 101(1), 37–69 (2001)
17. Schennach, S.M.: Point estimation with exponentially tilted empirical likelihood. Ann. Stat. 35(2), 634–672 (2007)
18. Toma, A.: Robustness of dual divergence estimators for models satisfying linear constraints. Comptes Rendus Mathématiques 351(7–8), 311–316 (2013)
19. Toma, A., Leoni-Aubin, S.: Robust tests based on dual divergence estimators and saddlepoint approximations. J. Multivar. Anal. 101(5), 1143–1155 (2010)
20. Toma, A., Broniatowski, M.: Dual divergence estimators and tests: robustness results. J. Multivar. Anal. 102(1), 20–36 (2011)
© 2021 Springer Nature Switzerland AG
Cite this paper
Keziou, A., Toma, A. (2021). Robust Empirical Likelihood. In: Nielsen, F., Barbaresco, F. (eds) Geometric Science of Information. GSI 2021. Lecture Notes in Computer Science(), vol 12829. Springer, Cham. https://doi.org/10.1007/978-3-030-80209-7_90
DOI: https://doi.org/10.1007/978-3-030-80209-7_90
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-80208-0
Online ISBN: 978-3-030-80209-7