Panel or longitudinal data are becoming increasingly popular in applied work as they offer a number of advantages over pure cross-sectional or pure time-series data. They allow researchers to model unobserved heterogeneity at the level of the observational unit, where the latter may be an individual, a household, a firm or a country. This article describes several estimation methods that are available for nonlinear panel data models, that is, models which are nonlinear in the parameters of interest and which include models that arise frequently in applied work, such as discrete choice models and limited dependent variable models, among others.

Introduction

Panel or longitudinal data are becoming increasingly popular in applied work, as they offer a number of advantages over pure cross-sectional or pure time-series data. A particularly useful feature is that they allow researchers to model unobserved heterogeneity at the level of the observational unit, where the latter may be an individual, a household, a firm or a country. Standard practice in the econometric literature is to model this heterogeneity as an individual-specific effect that enters additively in the model, typically assumed to be linear, which captures the statistical relationship between the dependent and the independent variables. The presence of these individual effects may cause problems in estimation. In particular, in short panels, that is, panels where the time-series dimension is of smaller order than the cross-sectional dimension, their estimation in conjunction with the other parameters of interest usually yields inconsistent estimators for both. (Notable exceptions are the static linear and the Poisson count panel data models, where estimation of the individual effects along with the finite-dimensional coefficient vector yields consistent estimators of the latter.) This is the well-known incidental parameters problem (Neyman and Scott 1948). In linear regression models, this problem may be dealt with by taking transformations of the model, such as first differences or differences from time averages (the ‘within transformation’), which remove the individual effect from the equation under consideration. Such transformations, however, are not available for nonlinear econometric models, that is, models which are nonlinear in the parameters of interest and which include models that arise frequently in applied work, such as discrete choice models, limited dependent variable models, and duration models, among others.

This article describes several estimation methods that are available for nonlinear panel data models. An approach that is available for estimating certain linear and nonlinear parametric models with individual effects is the conditional maximum likelihood approach. This is described in section “The Conditional Maximum Likelihood (CML) Approach”. Section “The Fixed Effects Approach” describes estimation techniques that have been recently developed for several semiparametric nonlinear panel data models. A common feature in the methods discussed in that section is that we do not make any assumptions about the nature of these individual effects, that is, whether they are fixed constants or random variables. Thus, we do not make any assumptions about whether they are related to the conditioning variables and, if so, in what manner. This approach is typically referred to as the fixed effects approach. Section “The Random Effects Approach” describes the so-called random effects approach in estimating nonlinear panel data models. In contrast to the fixed effects approach, the random effects approach does make assumptions about the individual effects.

The discussion distinguishes between two types of models, static and dynamic. In static models, the conditioning set includes past, present and future values of the variables. In this case the conditioning variables are said to be strictly exogenous. In dynamic models, the conditioning set may also include lags of the dependent variable and other endogenous variables, that is, variables that are only weakly exogenous or predetermined.

Our discussion is limited in several aspects. First, we focus only on the case when the time series dimension of the panel (T) is short so that it makes sense to consider the asymptotic properties of the estimators when the cross-sectional dimension (N) is large while T remains fixed. Second, we do not consider estimation of random coefficient models, that is, models where all the parameters are varying at the individual level. Finally, we do not discuss the Bayesian approach to estimating panel data models.

The Conditional Maximum Likelihood (CML) Approach

Suppose that a random variable yit has density f (·,θ,αi), where θ is the parameter of interest which is common across all units i, whereas αi is a nuisance parameter which is allowed to differ across i. A sufficient statistic Si for αi is a function of the data such that the conditional distribution of the data given Si does not depend on αi. However, the conditional distribution may depend on θ. In this case, one can estimate θ by maximizing the conditional likelihood function, which conditions on the sufficient statistic(s). Such sufficient statistics are readily available for the exponential family that includes the normal, Poisson, gamma, logistic, and binomial distributions. The CML approach, when a sufficient statistic exists, yields consistent and asymptotically normal estimators for parametric panel data models with individual effects (Andersen 1970). We will next demonstrate how the CML approach works in the case of a static and a dynamic logit model with individual effects.

The Static Panel Data Logit Model

Consider the binary choice logit model with individual effects

$$ {y}_{it}=1\left\{{x}_{it}{\beta}_0+{\alpha}_i+{\varepsilon}_{it}\ge 0\right\}\;i=1,\dots, N;t=1,\dots, T $$

where 1{A} = 1 if A occurs and is 0 otherwise. Let xi ≡ (xi1,…, xiT). Here the error term εit is distributed i.i.d. over t with a logistic distribution conditional on (xi, αi). Note that this assumption implies that εit is in fact independent of αi and xit for all t. We can easily calculate that

$$ \Pr \left({y}_{it}=1|{x}_i,{\alpha}_i\right)=\frac{\exp \left({x}_{it}{\beta}_0+{\alpha}_i\right)}{1+\exp \left({x}_{it}{\beta}_0+{\alpha}_i\right)}. $$

In this model it turns out that ∑tyit is a sufficient statistic for αi. Indeed, let T = 2. Note that

$$ \Pr \left({y}_{it}=1|{y}_{i1}+{y}_{i2}=0,{x}_i,{\alpha}_i\right)=0,\qquad \Pr \left({y}_{it}=1|{y}_{i1}+{y}_{i2}=2,{x}_i,{\alpha}_i\right)=1 $$

that is, individuals who do not switch states (i.e. who are 0 or 1 in both periods) do not offer any information about β0. But it can be easily shown that

$$ \Pr \left({y}_{i1}=1|{y}_{i1}+{y}_{i2}=1,{x}_i,{\alpha}_i\right)=\frac{1}{1+\exp \left(\left({x}_{i2}-{x}_{i1}\right){\beta}_0\right)} $$

and

$$ \Pr \left({y}_{i1}=0|{y}_{i1}+{y}_{i2}=1,{x}_i,{\alpha}_i\right)=\frac{\exp \left(\left({x}_{i2}-{x}_{i1}\right){\beta}_0\right)}{1+\exp \left(\left({x}_{i2}-{x}_{i1}\right){\beta}_0\right)}. $$

In other words, conditional on the individual switching states (from 0 to 1 or from 1 to 0), the probability that yit is 1 or 0 depends on β0 (that is, contains information about β0) but is independent of αi.

The conditional log-likelihood is

$$ \mathscr{L}_C\left(\beta \right)=\sum \limits_{i=1}^{N}1\left\{{y}_{i1}+{y}_{i2}=1\right\}\times \ln \left(\frac{\exp {\left(\left({x}_{i2}-{x}_{i1}\right)\beta \right)}^{\left(1-{y}_{i1}\right)}}{1+\exp \left(\left({x}_{i2}-{x}_{i1}\right)\beta \right)}\right) $$

and may be maximized over β to produce a consistent and root-N asymptotically normal estimator of β0. Note that the approach uses a subset of the data, since only individuals who switch states enter the likelihood. For the expression of the conditional log-likelihood in the general T case, see Chamberlain (1984).
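To make the mechanics concrete, the following sketch evaluates and maximizes the T = 2 conditional log-likelihood above numerically. It is a minimal illustration, not code from the literature: the array layout (y of shape (N, 2), x of shape (N, 2, K)) and the use of scipy's BFGS optimizer are assumptions of the example.

```python
import numpy as np
from scipy.optimize import minimize

def neg_cond_loglik(beta, y, x):
    """Negative conditional log-likelihood using only 'switchers' (y_i1 + y_i2 = 1)."""
    switch = (y[:, 0] + y[:, 1] == 1)
    dx = x[switch, 1, :] - x[switch, 0, :]      # x_i2 - x_i1
    y1 = y[switch, 0]                           # y_i1
    idx = dx @ beta
    # log Pr(y_i1 | y_i1 + y_i2 = 1, x_i) = (1 - y_i1)(x_i2 - x_i1)β - log(1 + exp((x_i2 - x_i1)β))
    return -np.sum((1 - y1) * idx - np.log1p(np.exp(idx)))

def fit_conditional_logit(y, x):
    k = x.shape[2]
    return minimize(neg_cond_loglik, np.zeros(k), args=(y, x), method="BFGS").x
```

Individuals with yi1 = yi2 drop out automatically, mirroring the observation that non-switchers carry no information about β0.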

The Dynamic Panel Data Logit Model

Chamberlain (1985) noticed that the conditional maximum likelihood approach also applies to the ‘AR(1)’ logit model with individual effects:

$$ {y}_{it}=1\left\{{\upgamma}_0{y}_{it-1}+{\alpha}_i+{\varepsilon}_{it}\ge 0\right\}\;i=1,\dots, N;t=1,\dots, T $$

where the error term εit is distributed i.i.d. with a logistic distribution conditional on αi and the initial observation of the sample yi0. Note that we are not making any assumption about the distribution of the initial yi0. As we will see, the approach requires at least four observations for each individual (including the initial observation). In fact, let that be the case and consider the events:

$$ A=\left\{{y}_{i0}={d}_0,{y}_{i1}=0,{y}_{i2}=1,{y}_{i3}={d}_3\right\},\qquad B=\left\{{y}_{i0}={d}_0,{y}_{i1}=1,{y}_{i2}=0,{y}_{i3}={d}_3\right\} $$

where d0 and d3 are either 0 or 1. It is rather easy to derive the following probabilities which condition on the individual switching states in the two middle periods

$$ \Pr \left(A|A\cup B,{\alpha}_i\right)=\frac{1}{1+\exp \left({\upgamma}_0\left({d}_0-{d}_3\right)\right)},\qquad \Pr \left(B|A\cup B,{\alpha}_i\right)=\frac{\exp \left({\upgamma}_0\left({d}_0-{d}_3\right)\right)}{1+\exp \left({\upgamma}_0\left({d}_0-{d}_3\right)\right)}. $$

Note that these depend on γ0 but are independent of αi. The conditional log-likelihood of the model for four periods is:

$$ \mathscr{L}_C\left(\upgamma \right)=\sum \limits_{i}1\left\{{y}_{i1}+{y}_{i2}=1\right\}\times \ln \left(\frac{\exp {\left(\upgamma \left({y}_{i0}-{y}_{i3}\right)\right)}^{y_{i1}}}{1+\exp \left(\upgamma \left({y}_{i0}-{y}_{i3}\right)\right)}\right) $$

and maximizing it with respect to γ produces a consistent and root-N asymptotically normal estimator. The approach generalizes to logit models with more than one lag of yit (see Magnac 2000).

It is important to note that the CML approach described above does not work in the logit model

$$ {y}_{it}=1\left\{{\upgamma}_0{y}_{it-1}+{x}_{it}{\beta}_0+{\alpha}_i+{\varepsilon}_{it}\ge 0\right\}\;i=1,\dots, N;t=1,\dots, T $$

that is, when the conditioning set also includes exogenous variables. Honoré and Kyriazidou (2000a) show that β0 and γ0 in the model above are in fact identified both for the case when the errors εit are logistic and when they are only assumed to have the same distribution over time conditional on (xi, yi0) (see below). In the logistic case identification is based on the fact that the following probabilities

$$ \Pr \left(A|A\cup B,{x}_{i2}={x}_{i3},{x}_i,{\alpha}_i\right)=\frac{1}{1+\exp \left(\left({x}_{i1}-{x}_{i2}\right){\beta}_0+{\upgamma}_0\left({d}_0-{d}_3\right)\right)} $$
$$ \Pr \left(B|A\cup B,{x}_{i2}={x}_{i3},{x}_i,{\alpha}_i\right)=\frac{\exp \left(\left({x}_{i1}-{x}_{i2}\right){\beta}_0+{\upgamma}_0\left({d}_0-{d}_3\right)\right)}{1+\exp \left(\left({x}_{i1}-{x}_{i2}\right){\beta}_0+{\upgamma}_0\left({d}_0-{d}_3\right)\right)} $$

are independent of αi. Note that the probabilities above condition not only on the individual switching states in the middle two periods so that yi1 + yi2 = 1 but also on the event that xi2 = xi3. Honoré and Kyriazidou (2000a) propose estimating β0 and γ0 by maximizing

$$ \sum \limits_i1\left\{{x}_{i2}-{x}_{i3}=0\right\}1\left\{{y}_{i1}+{y}_{i2}=1\right\}\times \ln \left(\frac{\exp {\left(\left({x}_{i1}-{x}_{i2}\right)\beta +\upgamma \left({y}_{i0}-{y}_{i3}\right)\right)}^{y_{i1}}}{1+\exp \left(\left({x}_{i1}-{x}_{i2}\right)\beta +\upgamma \left({y}_{i0}-{y}_{i3}\right)\right)}\right) $$

when Pr(xi2 = xi3) > 0. When xi2xi3 is continuously distributed with support around 0, β0 and γ0 can be obtained by maximizing

$$ \sum \limits_iK\left(\frac{x_{i2}-{x}_{i3}}{h_N}\right)1\left\{{y}_{i1}+{y}_{i2}=1\right\}\times \ln \left(\frac{\exp {\left(\left({x}_{i1}-{x}_{i2}\right)\beta +\upgamma \left({y}_{i0}-{y}_{i3}\right)\right)}^{y_{i1}}}{1+\exp \left(\left({x}_{i1}-{x}_{i2}\right)\beta +\upgamma \left({y}_{i0}-{y}_{i3}\right)\right)}\right) $$

where K () is a kernel density function and hN is a bandwidth sequence, chosen so as to satisfy certain assumptions that guarantee consistency and asymptotic normality of the proposed estimators.
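The kernel-weighted objective can be coded in the same way. The sketch below is only illustrative of the estimator's structure: it assumes four observations per individual (t = 0, 1, 2, 3), stacks β and γ in a single parameter vector, and uses a Gaussian product kernel; the bandwidth h and the array layout are choices of the example, not of Honoré and Kyriazidou (2000a).

```python
import numpy as np
from scipy.optimize import minimize

def neg_hk_objective(theta, y, x, h):
    """Kernel-weighted conditional logit objective; y is (N, 4), x is (N, 4, K)."""
    k = x.shape[2]
    beta, gamma = theta[:k], theta[k]
    switch = (y[:, 1] + y[:, 2] == 1).astype(float)
    dx23 = x[:, 2, :] - x[:, 3, :]                       # x_i2 - x_i3
    # Gaussian product kernel; multiplicative constants do not affect the maximizer
    w = np.prod(np.exp(-0.5 * (dx23 / h) ** 2), axis=1)
    idx = (x[:, 1, :] - x[:, 2, :]) @ beta + gamma * (y[:, 0] - y[:, 3])
    loglik = y[:, 1] * idx - np.log1p(np.exp(idx))       # y_i1 exponent as in the formula above
    return -np.sum(w * switch * loglik)

def fit_hk(y, x, h):
    k = x.shape[2]
    return minimize(neg_hk_objective, np.zeros(k + 1), args=(y, x, h), method="BFGS").x
```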

The Fixed Effects Approach

The conditional maximum likelihood approach is not always available. For example, there are no sufficient statistics for the binary choice model with individual effects when the errors are normally distributed. Furthermore, like all ML approaches, the approach suffers from the fact that the distribution of the unobserved idiosyncratic errors needs to be parametrically specified. There do exist, however, methods for some semiparametric nonlinear panel data models with individual effects where the distribution of the underlying idiosyncratic errors is left unspecified. These include the binary choice model, the censored and truncated regression models, and the sample selection model.

The Semiparametric Panel Data Binary Choice Model

Manski (1987) considers the model

$$ {y}_{it}=1\left\{{x}_{it}{\beta}_0+{\alpha}_i-{\varepsilon}_{it}\ge 0\right\}\;i=1,\dots, N;t=1,\dots, T $$

where εit is identically distributed over time conditional on (xi, αi), with distribution function F that is continuous and strictly increasing on \( \mathcal{R} \). Note that, in contrast to the models considered above, F here is not assumed to have a specific functional form, hence the characterization of the model as semiparametric.

He observes that for T = 2 the time invariance of F implies that

$$ {\displaystyle \begin{array}{l}\Pr \left({y}_{i1}=1|{x}_i\right)\hfill \\ {}\lesseqgtr \Pr \left({y}_{i2}=1|{x}_i\right)\;\mathrm{if}\;\mathrm{and}\;\mathrm{only}\kern0.17em \mathrm{if}\;{x}_{i1}{\beta}_0\lesseqgtr {x}_{i2}{\beta}_0\hfill \end{array}} $$

or equivalently that

$$ \mathit{\operatorname{sgn}}\left(\Pr \left({y}_{i2}=1|{x}_i,{\alpha}_i\right)-\Pr \left({y}_{i1}=1|{x}_i,{\alpha}_i\right)\right)=\mathit{\operatorname{sgn}}\left(\left({x}_{i2}-{x}_{i1}\right){\beta}_0\right). $$

In fact it can be shown that, under appropriate regularity conditions on the joint distribution of Δxi ≡(xi2−xi1), β0 uniquely (up to scale) maximizes the so-called population ‘score function’

$$ E\left[\Delta {y}_i\cdot \mathit{\operatorname{sgn}}\left(\Delta {x}_i\beta \right)\right] $$

where sgn(x) equals 1 if x > 0, equals − 1 if x < 0 and is equal to 0 if x = 0. This suggests estimating β0 by the so-called conditional maximum score estimator which maximizes the sample analog of the population score function

$$ \widehat{\beta}=\arg \max \limits_{\beta } \sum \limits_i\Delta {y}_i\cdot \mathit{\operatorname{sgn}}\left(\Delta {x}_i\beta \right). $$

Note that only observations for which yi1 ≠ yi2 are used here, similarly to the conditional logit. The estimator is consistent under some additional assumptions, but it is not asymptotically normal and its rate of convergence is not root-N.
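Because the score function is a step function in β, gradient-based optimizers are not applicable. The sketch below, assuming the same (N, 2, K) data layout as before, simply searches over random directions on the unit sphere, which imposes the scale normalization ‖β‖ = 1; it is a crude illustrative device rather than the search method used in practice.

```python
import numpy as np

def conditional_max_score(y, x, n_draws=100_000, seed=0):
    """Maximize sum_i Δy_i * sgn(Δx_i b) over directions b with ||b|| = 1."""
    dy = y[:, 1] - y[:, 0]                     # Δy_i
    dx = x[:, 1, :] - x[:, 0, :]               # Δx_i
    rng = np.random.default_rng(seed)
    best_b, best_score = None, -np.inf
    for _ in range(n_draws):
        b = rng.standard_normal(dx.shape[1])
        b /= np.linalg.norm(b)                 # scale normalization
        score = np.sum(dy * np.sign(dx @ b))
        if score > best_score:
            best_score, best_b = score, b
    return best_b
```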

Honoré and Kyriazidou (2000a) show that it is possible to extend the conditional maximum score approach to the dynamic binary choice model:

$$ \Pr \left({y}_{i0}=1|{x}_i,{\alpha}_i\right)={p}_0\left({x}_i,{\alpha}_i\right) $$
$$ \Pr \left({y}_{it}=1|{x}_i,{\alpha}_i,{y}_{i0},\dots, {y}_{it-1}\right)=F\left({x}_{it}{\beta}_0+{\upgamma}_0{y}_{it-1}+{\alpha}_i\right),\quad t=1,\dots, T $$

where yi0 is assumed to be observed and F is strictly increasing.

We will next demonstrate their identification scheme. Assume T = 3 and define the events A and B as above. Then

$$ {\displaystyle \begin{array}{ll}\Pr \left(A|{x}_i,{\alpha}_i,{x}_{i2}={x}_{i3}\right)=& {p}_0{\left({x}_i,{\alpha}_i\right)}^{d_0}{\left(1-{p}_0\left({x}_i,{\alpha}_i\right)\right)}^{1-{d}_0}\hfill \\ {}& \times \left(1-F\left({x}_{i1}{\beta}_0+{\upgamma}_0{d}_0+{\alpha}_i\right)\right)\times F\left({x}_{i2}{\beta}_0+{\alpha}_i\right)\hfill \\ {}& \times {\left(1-F\left({x}_{i2}{\beta}_0+{\upgamma}_0+{\alpha}_i\right)\right)}^{1-{d}_3}\times F{\left({x}_{i2}{\beta}_0+{\upgamma}_0+{\alpha}_i\right)}^{d_3}\hfill \end{array}} $$
$$ {\displaystyle \begin{array}{ll}\Pr \left(B|{x}_i,{\alpha}_i,{x}_{i2}={x}_{i3}\right)=& {p}_0{\left({x}_i,{\alpha}_i\right)}^{d_0}{\left(1-{p}_0\left({x}_i,{\alpha}_i\right)\right)}^{1-{d}_0}\hfill \\ {}& \times F\left({x}_{i1}{\beta}_0+{\upgamma}_0{d}_0+{\alpha}_i\right)\times \left(1-F\left({x}_{i2}{\beta}_0+{\upgamma}_0+{\alpha}_i\right)\right)\hfill \\ {}& \times {\left(1-F\left({x}_{i2}{\beta}_0+{\alpha}_i\right)\right)}^{1-{d}_3}\times F{\left({x}_{i2}{\beta}_0+{\alpha}_i\right)}^{d_3}.\hfill \end{array}} $$

If d3 = 0, then,

$$ {\displaystyle \begin{array}{ll}\frac{\Pr \left(A|{x}_i,{\alpha}_i,{x}_{i2}={x}_{i3}\right)}{\Pr \left(B|{x}_i,{\alpha}_i,{x}_{i2}={x}_{i3}\right)}& =\frac{1-F\left({x}_{i1}{\beta}_0+{\upgamma}_0{d}_0+{\alpha}_i\right)}{1-F\left({x}_{i2}{\beta}_0+{\alpha}_i\right)}\times \frac{F\left({x}_{i2}{\beta}_0+{\alpha}_i\right)}{F\left({x}_{i1}{\beta}_0+{\upgamma}_0{d}_0+{\alpha}_i\right)}\hfill \\ {}& =\frac{1-F\left({x}_{i1}{\beta}_0+{\upgamma}_0{d}_0+{\alpha}_i\right)}{1-F\left({x}_{i2}{\beta}_0+{\upgamma}_0{d}_3+{\alpha}_i\right)}\times \frac{F\left({x}_{i2}{\beta}_0+{\upgamma}_0{d}_3+{\alpha}_i\right)}{F\left({x}_{i1}{\beta}_0+{\upgamma}_0{d}_0+{\alpha}_i\right)}\hfill \end{array}} $$

while if d3 = 1, then,

$$ {\displaystyle \begin{array}{ll}\frac{\Pr \left(A|{x}_i,{\alpha}_i,{x}_{i2}={x}_{i3}\right)}{\Pr \left(B|{x}_i,{\alpha}_i,{x}_{i2}={x}_{i3}\right)}& =\frac{1-F\left({x}_{i1}{\beta}_0+{\upgamma}_0{d}_0+{\alpha}_i\right)}{1-F\left({x}_{i2}{\beta}_0+{\upgamma}_0+{\alpha}_i\right)}\times \frac{F\left({x}_{i2}{\beta}_0+{\upgamma}_0+{\alpha}_i\right)}{F\left({x}_{i1}{\beta}_0+{\upgamma}_0{d}_0+{\alpha}_i\right)}\hfill \\ {}& =\frac{1-F\left({x}_{i1}{\beta}_0+{\upgamma}_0{d}_0+{\alpha}_i\right)}{1-F\left({x}_{i2}{\beta}_0+{\upgamma}_0{d}_3+{\alpha}_i\right)}\times \frac{F\left({x}_{i2}{\beta}_0+{\upgamma}_0{d}_3+{\alpha}_i\right)}{F\left({x}_{i1}{\beta}_0+{\upgamma}_0{d}_0+{\alpha}_i\right)}.\hfill \end{array}} $$

Monotonicity of F implies that

$$ \mathit{\operatorname{sgn}}\left(\Pr \left(A|{x}_i,{\alpha}_i,{x}_{i2}={x}_{i3}\right)-\Pr \left(B|{x}_i,{\alpha}_i,{x}_{i2}={x}_{i3}\right)\right)=\mathit{\operatorname{sgn}}\left(\left({x}_{i2}-{x}_{i1}\right){\beta}_0+{\upgamma}_0\left({d}_3-{d}_0\right)\right). $$

This last equation suggests that β0 and γ0 can be estimated by conditional maximum score using only the observations satisfying yi1 + yi2 = 1 and xi2 = xi3, that is, by maximizing

$$ \sum \limits_i1\left\{{x}_{i2}-{x}_{i3}=0\right\}\left({y}_{i2}-{y}_{i1}\right)\,\mathit{\operatorname{sgn}}\left(\left({x}_{i2}-{x}_{i1}\right)\beta +\upgamma \left({y}_{i3}-{y}_{i0}\right)\right). $$

As in the logit case, when xi2 − xi3 is continuously distributed with support around 0, β0 and γ0 can be estimated by maximizing

$$ \sum \limits_iK\left(\frac{x_{i2}-{x}_{i3}}{h_N}\right)\left({y}_{i2}-{y}_{i1}\right)\,\mathit{\operatorname{sgn}}\left(\left({x}_{i2}-{x}_{i1}\right)\beta +\upgamma \left({y}_{i3}-{y}_{i0}\right)\right). $$
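A sketch of this kernel-weighted score, combining the two previous sketches, is given below; as before, the data layout (y of shape (N, 4), x of shape (N, 4, K)), the Gaussian product kernel, and the bandwidth h are assumptions of the example.

```python
import numpy as np

def dynamic_max_score(theta, y, x, h):
    """Kernel-weighted maximum score objective for the dynamic binary choice model."""
    k = x.shape[2]
    beta, gamma = theta[:k], theta[k]
    dx23 = x[:, 2, :] - x[:, 3, :]                         # x_i2 - x_i3
    w = np.prod(np.exp(-0.5 * (dx23 / h) ** 2), axis=1)    # kernel weight K((x_i2 - x_i3)/h)
    idx = (x[:, 2, :] - x[:, 1, :]) @ beta + gamma * (y[:, 3] - y[:, 0])
    return np.sum(w * (y[:, 2] - y[:, 1]) * np.sign(idx))  # to be maximized over (β, γ) up to scale
```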

The Semiparametric Panel Data Censored Regression Model

The standard censored panel data (or Type 1 Tobit) model with individual effects is given by

$$ {y}_{it}=\max \left\{{x}_{it}{\beta}_0+{\alpha}_i+{\varepsilon}_{it},0\right\}\;i=1,\dots, N;t=1,\dots, T. $$

Estimation of this model was first considered by Honoré (1992) and later by Honoré and Kyriazidou (2000b), who extend the results of the former paper. We will present here Honoré (1992), who assumes that (εit, εis) are pairwise exchangeable conditional on (xi, αi). This implies that εit and εis are identically distributed conditional on (xi, αi), although it does not require (conditional) independence over time. (Fristedt and Gray (1997) give the following definition of exchangeability: let \( \mathscr{I} \) be a countable set. A sequence \(( X_i : i \in \mathscr{I}) \), finite or infinite, of random variables on a probability space (Ω, F, P) is exchangeable if, for every permutation ρ of \( \mathscr{I} \), the distributions of \(( X_{\rho(i)} : i \in \mathscr{I}) \) and \(( X_{i} : i \in \mathscr{I}) \) are identical. Note that a finite or infinite i.i.d. sequence is exchangeable and that exchangeability allows for certain types of serial correlation. Furthermore, exchangeability implies strict stationarity, although the converse is not true.)

Consider the ‘pseudo-error’:

$$ {e}_{is t}\left(\beta \right)=\max \left\{{y}_{is},\left({x}_{is}-{x}_{it}\right)\beta \right\}-{x}_{is}\beta . $$

With this definition, at the true β0

$$ {\displaystyle \begin{array}{ll}{e}_{ist}\left({\beta}_0\right)& =\max \left\{{y}_{is},\left({x}_{is}-{x}_{it}\right){\beta}_0\right\}-{x}_{is}{\beta}_0\hfill \\ {}& =\max \left\{\max \left\{{x}_{is}{\beta}_0+{\alpha}_i+{\varepsilon}_{is},0\right\},\left({x}_{is}-{x}_{it}\right){\beta}_0\right\}-{x}_{is}{\beta}_0\hfill \\ {}& =\max \left\{\max \left\{{\alpha}_i+{\varepsilon}_{is},-{x}_{is}{\beta}_0\right\},-{x}_{it}{\beta}_0\right\}\hfill \\ {}& =\max \left\{{\alpha}_i+{\varepsilon}_{is},-{x}_{is}{\beta}_0,-{x}_{it}{\beta}_0\right\}.\hfill \end{array}} $$

The conditional exchangeability assumption implies that (eist(β0), eits(β0)) is distributed like (eits(β0), eist(β0)) conditional on (xit, xis, αi), and hence the difference eist(β0) − eits(β0) is distributed symmetrically around 0 conditional on (xit, xis, αi). Since this is true for any αi, the symmetry also holds conditional on (xit, xis) alone. Therefore, for any odd function ξ (that is, a function ξ that satisfies ξ(−d) = −ξ(d)) we have

$$ E\left[\xi \left({e}_{ist}\left({\beta}_0\right)-{e}_{its}\left({\beta}_0\right)\right)|{x}_{it},{x}_{is}\right]=0 $$

which also implies the following moment restriction:

$$ E\left[\xi \left({e}_{ist}\left({\beta}_0\right)-{e}_{its}\left({\beta}_0\right)\right){\left({x}_{is}-{x}_{it}\right)}^{\prime }|{x}_{it},{x}_{is}\right]=0. $$

The moment condition above may be thought of as the first-order condition of the following population minimization problem

$$ \min \limits_{\beta }E\left[q\left({y}_{is},{y}_{it},\left({x}_{is}-{x}_{it}\right)\beta \right)|{x}_{it},{x}_{is}\right] $$

where

$$ q\left({y}_i,{y}_j,\delta \right)=\left\{\begin{array}{lll}\Xi \left({y}_i\right)-\left({y}_j+\delta \right)\xi \left({y}_i\right)& \mathrm{if}& \delta \le -{y}_j\\ {}\Xi \left({y}_i-{y}_j-\delta \right)& \mathrm{if}& -{y}_j<\delta <{y}_i\\ {}\Xi \left(-{y}_j\right)-\left(\delta -{y}_i\right)\xi \left(-{y}_j\right)& \mathrm{if}& {y}_i\le \delta \end{array}\right. $$

and Ξ(d): \( \mathcal{R}\to {\mathcal{R}}^{+} \) is an even function (that is, Ξ(−d) = Ξ(d)) which is convex, strictly increasing for d > 0, and satisfies Ξ(0) = 0 and Ξ′(d) = ξ(d) with ξ(0) = 0. Note that for Ξ to be convex, ξ has to be monotone. Obvious choices for Ξ are Ξ(d) = d² (which corresponds to ξ(d) = 2d) and Ξ(d) = |d| (which corresponds to ξ(d) = sgn(d)).

The fact that the true β0 solves the population minimization problem above suggests the following estimator for β0:

$$ \widehat{\beta}=\arg \min \limits_{\beta } \sum \limits_i\sum \limits_{s<t}q\left({y}_{is},{y}_{it},\left({x}_{is}-{x}_{it}\right)\beta \right). $$

Honoré (1992) shows that the estimators corresponding to Ξ (d) = d2 and Ξ (d) = |d| are root-N consistent and asymptotically normal.
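As an illustration, the sketch below implements the quadratic case Ξ(d) = d², ξ(d) = 2d for T = 2, using the three-branch function q defined above; the (N, 2, K) data layout and the optimizer are, again, assumptions of the example.

```python
import numpy as np
from scipy.optimize import minimize

def q_quadratic(y1, y2, delta):
    """q(y_i, y_j, δ) with Ξ(d) = d², ξ(d) = 2d, evaluated elementwise."""
    low = delta <= -y2                       # branch δ ≤ -y_j
    high = delta >= y1                       # branch y_i ≤ δ
    mid = ~(low | high)
    out = np.empty_like(delta, dtype=float)
    out[low] = y1[low] ** 2 - (y2[low] + delta[low]) * 2.0 * y1[low]
    out[mid] = (y1[mid] - y2[mid] - delta[mid]) ** 2
    out[high] = y2[high] ** 2 + (delta[high] - y1[high]) * 2.0 * y2[high]
    return out

def fit_honore(y, x):
    """Minimize sum_i q(y_i1, y_i2, (x_i1 - x_i2)β) over β; y is (N, 2), x is (N, 2, K)."""
    def objective(beta):
        delta = (x[:, 0, :] - x[:, 1, :]) @ beta
        return np.sum(q_quadratic(y[:, 0], y[:, 1], delta))
    return minimize(objective, np.zeros(x.shape[2]), method="BFGS").x
```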

Honoré (1993) considers a dynamic version of the model where the lag of the observed (censored) dependent variable appears in the model instead of the latent one. Hu (2002) considers the case where one lag of the latent (unobserved) dependent variable is included along with the set of exogenous variables xit.

The Semiparametric Panel Data Sample Selection Model

The standard panel data sample selection (or Type 2 Tobit) model is defined as:

$$ {y}_{it}^{\ast }={x}_{it}^{\ast }{\beta}_0+{\alpha}_i^{\ast }+{\varepsilon}_{it}^{\ast },\qquad {y}_{it}={d}_{it}\cdot {y}_{it}^{\ast },\qquad {d}_{it}=1\left\{{z}_{it}{\upgamma}_0+{\eta}_i-{u}_{it}\ge 0\right\} $$

where i = 1,2,…,N; t = 1,…,T. Kyriazidou (1997) considers estimation without any parametric assumptions on the form of the joint distribution of \( \left({\varepsilon}_{it}^{\ast },{u}_{it}\right) \) or on the individual effects (αi,ηi).

Consider the case where T = 2 and focus on those individuals for whom di1 = di2 = 1. Let \( {\xi}_i=\left({z}_{i1},{z}_{i2},{x}_{i1}^{\ast },{x}_{i2}^{\ast },{\alpha}_i,{\eta}_i\right) \) denote all the information about individual i. Note that

$$ E\left({y}_{i1}-{y}_{i2}|{d}_{i1}={d}_{i2}=1,{\xi}_i\right)=\left({x}_{i1}^{\ast }-{x}_{i2}^{\ast}\right){\beta}_0+E\left({\varepsilon}_{i1}^{\ast }-{\varepsilon}_{i2}^{\ast }|{d}_{i1}={d}_{i2}=1,{\xi}_i\right) $$

and hence OLS estimation of the first differenced model will not yield consistent estimation of β0 since in general the so-called ‘sample selection bias term’

$$ {\lambda}_{it}\equiv E\left({\varepsilon}_{it}^{\ast }|{d}_{i1}={d}_{i2}=1,{\xi}_i\right)=E\left({\varepsilon}_{it}^{\ast }|{u}_{i1}\le {z}_{i1}{\upgamma}_0+{\eta}_i,{u}_{i2}\le {\mathrm{z}}_{i2}{\upgamma}_0+{\eta}_i,{\xi}_i\right) $$

is not zero. Nor do we have in general that λi1 = λi2, so that first differencing removes the sample selection bias along with the individual effects. Kyriazidou (1997) makes a conditional exchangeability assumption that \( \left({\varepsilon}_{i1}^{\ast },{\varepsilon}_{i2}^{\ast },{u}_{i1},{u}_{i2}\right)\;\mathrm{and}\;\left({\varepsilon}_{i2}^{\ast },{\varepsilon}_{i1}^{\ast },{u}_{i2},{u}_{i1}\right) \) are identically distributed conditional on ξi. Under this assumption, it is easy to see that if zi1γ0 = zi2γ0 then

$$ {\displaystyle \begin{array}{ll}\hfill & {\lambda}_{i1}\\ {}& =E\left({\varepsilon}_{i1}^{\ast }|{u}_{i1}\le {z}_{i1}{\upgamma}_0+{\eta}_i,{u}_{i2}\le {z}_{i2}{\upgamma}_0+{\eta}_i,{\xi}_i\right)\hfill \\ {}& =E\left({\varepsilon}_{i2}^{\ast }|{u}_{i1}\le {z}_{i1}{\upgamma}_0+{\eta}_i,{u}_{i2}\le {z}_{i2}{\upgamma}_0+{\eta}_i,{\xi}_i\right)\hfill \\ {}& ={\lambda}_{i2}\hfill \end{array}} $$

so that first differencing will eliminate both αi and λit simultaneously. So β0 can be estimated by first-difference OLS for the subsample of individuals that are observed in both periods (that is, that have di1 = di2 = 1) and also have the selection index, zitγ0, constant over time (that is, zi1γ0 = zi2γ0). Of course, this estimation scheme cannot be directly implemented, since γ0 is unknown. And it is quite possible that no observation has zi1γ0 = zi2γ0 if ziγ0 is continuously distributed. If, however, λit is a sufficiently smooth function and \( \widehat{\upgamma} \) is a consistent estimator of γ0, then zi1γ0 ≈ zi2γ0 implies λi1 ≈ λi2, and the preceding argument holds approximately. Kyriazidou proposes a two-step estimation procedure, in the spirit of Powell (2001) and Ahn and Powell (1993), who consider estimation of cross-section versions of the sample selection model. In the first step, γ0 is consistently estimated based on the selection equation. In the second step, the estimate \( \widehat{\upgamma} \) is used to estimate β0 based on those pairs of observations for which zi1\( \widehat{\upgamma} \) and zi2\( \widehat{\upgamma} \) are ‘close’. To this end define

$$ {\widehat{\psi}}_i=\frac{1}{h_N}K\left(\frac{\Delta {z}_i\widehat{\upgamma}}{h_N}\right) $$

where K () is a kernel density function and hN is a bandwidth sequence. The proposed estimator takes the form:

$$ \widehat{\beta}={\left[\sum \limits_{i=1}^N{\widehat{\psi}}_i\Delta {x}_i^{\prime}\Delta {x}_i{d}_{i1}{d}_{i2}\right]}^{-1}\sum \limits_{i=1}^N{\widehat{\psi}}_i\Delta {x}_i^{\prime}\Delta {y}_i{d}_{i1}{d}_{i2}. $$

Under some assumptions and by appropriately choosing hN, the estimator can be shown to be asymptotically normal, although the rate of convergence is slower than the parametric \( \sqrt{N} \) rate. Apart from the conditional exchangeability assumption, another important assumption that underlies the approach is that there is at least one variable in zit not contained in xit, which is an exclusion restriction common in semiparametric sample selection models.
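Given a first-step estimate \( \widehat{\upgamma} \) of γ0 (obtained, for example, from the selection equation by conditional logit or smoothed maximum score), the second step is a kernel-weighted first-difference OLS regression. The sketch below mirrors the formula for \( \widehat{\beta} \) above; the Gaussian kernel, the bandwidth h, and the array layout are assumptions of the example.

```python
import numpy as np

def kyriazidou_second_step(y, x, z, d, gamma_hat, h):
    """y, d: (N, 2); x: (N, 2, K); z: (N, 2, L); gamma_hat: (L,); returns beta_hat."""
    dz = (z[:, 0, :] - z[:, 1, :]) @ gamma_hat                       # Δz_i γ̂
    psi = np.exp(-0.5 * (dz / h) ** 2) / (np.sqrt(2 * np.pi) * h)    # ψ̂_i = K(Δz_i γ̂ / h) / h
    w = psi * d[:, 0] * d[:, 1]                                      # keep d_i1 = d_i2 = 1 only
    dx = x[:, 0, :] - x[:, 1, :]                                     # Δx_i
    dy = y[:, 0] - y[:, 1]                                           # Δy_i
    xtx = (dx * w[:, None]).T @ dx
    xty = (dx * w[:, None]).T @ dy
    return np.linalg.solve(xtx, xty)
```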

A dynamic version of the panel data sample selection model, with the own lagged dependent variable appearing in each equation, is considered by Kyriazidou (2001).

The Random Effects Approach

Fixed effects methods and conditional maximum likelihood methods (when they exist) estimate the coefficients of time-varying regressors consistently without making any assumptions on how the individual effects are related to the observed covariates or to the time-varying errors or to the initial observations of the sample. However, these methods do not deliver estimates of coefficients of time-invariant regressors and of the individual effects, and hence cannot be used for prediction, or for computation of marginal effects and elasticities which are often the quantities of interest. Furthermore, none of these approaches allows for non-stationary errors and hence for time-series heteroskedasticity.

These problems do not arise in the random effects approach. The approach essentially consists of treating (αi + εit) as a two-component error term and making assumptions about its relationship with the observed covariates and, in the case of dynamic models, with the initial conditions as well. A downside of the approach is that misspecification of any part of the model typically yields inconsistent estimates.

Static Case

In the static panel data linear regression model, the traditional random effects approach (sometimes also called the uncorrelated random effects approach) assumes that the individual effects αi, along with the time-varying errors εit, are uncorrelated with the observed covariates xit. Then the coefficients of both time-varying and time-invariant regressors may be estimated consistently (albeit not efficiently) by pooled OLS. In static nonlinear models, the traditional random effects approach, apart from parameterizing the conditional distribution of εit given xit, also assumes that αi is independent of xit and εit for all t, and has a distribution, say H, that depends on a finite set of unknown parameters, say δ0. For example, in the binary choice model,

$$ {y}_{it}=1\left\{{x}_{it}{\beta}_0+{\alpha}_i+{\varepsilon}_{it}\ge 0\right\}\;i=1,\dots, N;t=1,\dots, T $$
(1)

assuming that εit are i.i.d. over time and independent of xi and αi with known distribution F (say, standard normal or logistic), we may estimate the unknown parameters (β0,δ0) via ML. The log-likelihood is

$$ \ln L\left(\beta, \delta \right)=\sum \limits_i\ln \int \prod \limits_{t=1}^TF{\left({x}_{it}\beta +\alpha \right)}^{y_{it}}{\left(1-F\left({x}_{it}\beta +\alpha \right)\right)}^{1-{y}_{it}} dH\left(\alpha, \delta \right) $$

and involves a one-dimensional integral which may be calculated numerically, for example, by quadrature procedures (see Butler and Moffitt 1982).
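For concreteness, the following sketch evaluates this one-dimensional integral by Gauss–Hermite quadrature for a random effects probit, assuming αi ~ N(0, σ²) independent of xi; the parameterization (σ entered in logs) and the number of nodes are choices of the example.

```python
import numpy as np
from scipy.stats import norm
from scipy.special import roots_hermite

def re_probit_loglik(params, y, x, n_nodes=12):
    """Random effects probit log-likelihood; y is (N, T), x is (N, T, K)."""
    k = x.shape[2]
    beta, sigma = params[:k], np.exp(params[k])
    nodes, weights = roots_hermite(n_nodes)
    alpha = np.sqrt(2.0) * sigma * nodes            # change of variables for N(0, σ²)
    idx = x @ beta                                  # (N, T)
    ll = 0.0
    for i in range(y.shape[0]):
        p = norm.cdf(idx[i][:, None] + alpha[None, :])          # (T, n_nodes)
        lik_t = np.where(y[i][:, None] == 1, p, 1 - p)          # F(·)^y (1 - F(·))^(1-y)
        ll += np.log(weights @ np.prod(lik_t, axis=0) / np.sqrt(np.pi))
    return ll
```

The log-likelihood can then be maximized over (β, log σ) with a standard numerical optimizer.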

However, things become quite complicated if we want to allow for arbitrary serial correlation in the εit’s. Consider the binary choice model

$$ {y}_{it}=1\left\{{x}_{it}{\beta}_0-{u}_{it}\ge 0\right\} $$

where uit = αi + εit is the composite error term. For T = 3 there are 2³ = 8 possible sequences of 0’s and 1’s. The likelihood for an individual for whom the sequence of observed yit’s is (0,1,0) takes the form

$$ {\int}_{x_{i1}\beta}^{\infty }{\int}_{-\infty}^{x_{i2}\beta }{\int}_{x_{i3}\beta}^{\infty }f\left({u}_1,{u}_2,{u}_3\right){du}_3{du}_2{du}_1 $$

where f is the trivariate density of (u1,u2,u3) conditional on xi. The log-likelihood is

$$ {\displaystyle \begin{array}{ll}\ln L\left(\beta, \delta \right)=& \sum \limits_{i:\left(0,0,0\right)}\ln {\int}_{x_{i1}\beta}^{\infty }{\int}_{x_{i2}\beta}^{\infty }{\int}_{x_{i3}\beta}^{\infty }f\left({u}_1,{u}_2,{u}_3\right){du}_3{du}_2{du}_1\hfill \\ {}& +\sum \limits_{i:\left(0,0,1\right)}\ln {\int}_{x_{i1}\beta}^{\infty }{\int}_{x_{i2}\beta}^{\infty }{\int}_{-\infty}^{x_{i3}\beta }f\left({u}_1,{u}_2,{u}_3\right){du}_3{du}_2{du}_1+\dots \hfill \end{array}} $$

which requires the computation of multiple trivariate integrals. Multivariate integration is basically infeasible for large T. This is where simulation methods come in very handy.
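As an illustration of the idea, the crude frequency simulator below approximates the likelihood contribution of an individual with observed sequence (0, 1, 0), assuming (u1, u2, u3) is multivariate normal with covariance Sigma; in practice a smooth simulator such as GHK would be used, but the principle is the same.

```python
import numpy as np

def simulated_prob_010(x_i, beta, Sigma, n_draws=10_000, seed=0):
    """Approximate Pr(y_i = (0, 1, 0) | x_i) by simulation; x_i is (3, K)."""
    rng = np.random.default_rng(seed)
    u = rng.multivariate_normal(np.zeros(3), Sigma, size=n_draws)   # draws of (u1, u2, u3)
    idx = x_i @ beta                                                # (x_i1β, x_i2β, x_i3β)
    # y_it = 1{x_itβ - u_it ≥ 0}, so (0, 1, 0) means u1 > x_i1β, u2 ≤ x_i2β, u3 > x_i3β
    event = (u[:, 0] > idx[0]) & (u[:, 1] <= idx[1]) & (u[:, 2] > idx[2])
    return event.mean()
```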

The assumption that αi is independent of xi is often found unsatisfactory. A possible solution is to assume a specific functional form for the relationship of αi with xi. This approach (recently also called the correlated random effects approach) was first proposed by Chamberlain (1984). Suppose that

$$ {\alpha}_i=\sum \limits_{t=1}^T{x}_{it}{\upgamma}_{0,t}+{v}_i $$

where vi is independent of xi, similarly to the time-varying error component εit, and the composite new error term vi + εit follows a specific distribution, say normal. In the case of the binary choice model, for example, assuming that εit + vi conditional on xi is \( N\left(0,{\sigma}_{0,t}^2\right) \) implies that

$$ \Pr \left({y}_{it}=1|{x}_i\right)=\Phi \left(\frac{x_{it}{\beta}_0+{\sum}_{s=1}^T{x}_{is}{\upgamma}_{0,s}}{\sigma_{0,t}}\right)=\Phi \left({x}_{i}{\theta}_{0,t}\right). $$

For computational simplicity, Chamberlain proposes to estimate the unknown parameters θ0,t via period-by-period probit. The ‘structural parameters’ \( {\beta}_0,{\left\{{\sigma}_{0,t}^2\right\}}_{t=1}^T,\mathrm{and}\;{\left\{{\upgamma}_{0,t}\right\}}_{t=1}^T \) can then be recovered by minimum distance estimation. Note that the approach allows for time-series heteroskedasticity and requires only one normalization, for example that \( {\sigma}_{0,t}^2=1 \) for some period t.
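A sketch of the first (reduced-form) step is given below: a separate probit of yit on the full regressor vector xi = (xi1, …, xiT) for each period, which delivers the θ̂0,t. The use of statsmodels and the data layout are assumptions of the example, and the minimum distance step that recovers the structural parameters is omitted.

```python
import numpy as np
import statsmodels.api as sm

def period_by_period_probit(y, x):
    """y: (N, T) binary outcomes; x: (N, T, K) regressors; returns the T reduced-form θ̂_t."""
    n, t_periods, k = x.shape
    x_full = sm.add_constant(x.reshape(n, t_periods * k))   # stack x_i1, ..., x_iT
    thetas = []
    for t in range(t_periods):
        res = sm.Probit(y[:, t], x_full).fit(disp=0)        # probit of y_it on the full x_i
        thetas.append(res.params)
    return np.vstack(thetas)
```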

Newey (1994) generalizes Chamberlain’s approach by postulating that

$$ {\alpha}_i=\rho \left({x}_{i1},\dots, {x}_{iT}\right)+{v}_i $$

where ρ () is an unknown function of xi. Assuming again that vi and εit are independent of xi and that the composite new error term vi + εit follows a specific distribution, say Ft, we obtain

$$ {\pi}_t=\Pr \left({y}_{it}=1|{x}_i\right)={F}_t\left(\rho \left({x}_i\right)+{x}_{it}{\beta}_0\right) $$

which for a strictly monotonic Ft implies that

$$ {F}_t^{-1}\left({\pi}_t\right)=\rho \left({x}_i\right)+{x}_{it}{\beta}_0. $$

For example in the normal case

$$ {\Phi}^{-1}\left({\pi}_t\right)=\frac{\rho \left({x}_i\right)+{x}_{it}{\beta}_0}{\sigma_{0,t}}. $$

Thus for two periods t and s we obtain

$$ {\Phi}^{-1}\left({\pi}_t\right)=\frac{\sigma_{0,s}}{\sigma_{0,t}}{\Phi}^{-1}\left({\pi}_s\right)+\frac{1}{\sigma_{0,t}}\left({x}_{it}-{x}_{is}\right){\beta}_0. $$

Normalizing σ0,t = 1 and estimating πt and πs nonparametrically, we can recover σ0,s and β0 from the regression of \( {\Phi}^{-1}\left({\widehat{\pi}}_t\right)\;\mathrm{on}\;{\Phi}^{-1}\left({\widehat{\pi}}_s\right)\;\mathrm{and}\;\left({x}_{it}-{x}_{is}\right). \)
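The final step is then an ordinary regression, as the brief sketch below illustrates; it assumes that nonparametric estimates of πt and πs (for example, kernel regressions of yit and yis on xi) are already available, and that σ0,t is normalized to 1.

```python
import numpy as np
from scipy.stats import norm

def newey_final_regression(pi_t_hat, pi_s_hat, x_t, x_s):
    """Regress Φ⁻¹(π̂_t) on Φ⁻¹(π̂_s) and (x_it - x_is); returns (σ̂_0,s, β̂_0)."""
    lhs = norm.ppf(pi_t_hat)
    rhs = np.column_stack([norm.ppf(pi_s_hat), x_t - x_s])
    coef, *_ = np.linalg.lstsq(rhs, lhs, rcond=None)
    return coef[0], coef[1:]
```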

A criticism of all these correlated random effects approaches is the following. In the linear model, writing \( {\alpha}_i={\sum}_{t=1}^T{x}_{it}{\upgamma}_{0,t}+{u}_i \) with E(uixit) = 0 for all t does not impose any restrictions on the joint distribution of αi and xi (apart from the requirement that it has second moments), since this is just the best linear projection of αi on xi. In a nonlinear model, by contrast, assuming αi = ρ(xi1,…,xiT) + ui, even without specifying the functional form of ρ, imposes implausible restrictions, in the sense that, if this relationship holds for the T observations, a similar one will not in general hold for T + 1.

Dynamic Case

In the case where there are genuine dynamics in the model in the form of lags of the dependent variable or other endogenous regressors, random effects methods become even more complicated and require additional assumptions about the relationship of the individual effects with the initial observations. We next describe a general approach for estimating dynamic random effects models suggested by Wooldridge (2000). For simplicity we will drop the subscripts i.

We are interested in the conditional distribution of yt given a vector of strictly exogenous variables z^T ≡ (z1, …, zT), own lags and lags of other endogenous variables x^{t−1} ≡ (yt−1, wt−1, yt−2, wt−2, …, y0, w0), and an unobserved scalar or vector random effect α. Here zt is strictly exogenous in the sense that

$$ F\left({w}_t|{z}^T,{x}^{t-1},\alpha \right)=F\left({w}_t|{z}_t,{x}^{t-1},\alpha \right). $$

The conditional density of xt ≡ (yt, wt) is

$$ {f}_t\left({x}_t|{z}^T,{x}^{t-1},\alpha \right)={f}_t\left({x}_t|{z}_t,{x}^{t-1},\alpha \right)={f}_t\left({y}_t|{w}_t,{z}_t,{x}^{t-1},\alpha \right)\cdot {f}_t\left({w}_t|{z}_t,{x}^{t-1},\alpha \right) $$

and the joint density for all T periods is

$$ f\left({x}_1,{x}_2,\dots, {x}_T|{z}^T,{x}_0,\alpha \right)=\prod \limits_{t=1}^T{f}_t\left({x}_t|{z}_t,{x}^{t-1},\alpha \right). $$

But α is unobserved, and we need to integrate it out. One solution is to parameterize the distribution of α conditional on z^T and x0, say h(α|z^T, x0). Then

$$ f\left({x}_1,{x}_2,\dots, {x}_T|{z}^T,{x}_0\right)=\int \prod \limits_{t=1}^T{f}_t\left({x}_t|{z}_t,{x}^{t-1},\alpha \right)h\left(\alpha |{z}^T,{x}_0\right) d\alpha . $$
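As one concrete (and commonly used) parameterization of h, one may take the model to be a dynamic probit and specify α conditional on (z^T, x0) as normal with mean linear in the initial observation and in the time averages of the exogenous variables. The sketch below is written under exactly these auxiliary assumptions, which are an illustration rather than part of the general framework described here.

```python
import numpy as np
from scipy.stats import norm
from scipy.special import roots_hermite

def dyn_re_probit_loglik(params, y, z, y0, n_nodes=12):
    """y: (N, T) outcomes; z: (N, T, K) exogenous variables; y0: (N,) initial observations."""
    n, t_periods, k = z.shape
    rho, beta = params[0], params[1:1 + k]
    a0, a1, a2 = params[1 + k], params[2 + k], params[3 + k:3 + 2 * k]
    sigma_a = np.exp(params[3 + 2 * k])
    nodes, weights = roots_hermite(n_nodes)
    mean_a = a0 + a1 * y0 + z.mean(axis=1) @ a2          # E(α | z^T, y_0): auxiliary assumption
    ll = 0.0
    for i in range(n):
        alpha = mean_a[i] + np.sqrt(2.0) * sigma_a * nodes
        ylag = np.concatenate(([y0[i]], y[i, :-1]))      # y_{t-1}, including the initial observation
        idx = z[i] @ beta + rho * ylag
        p = norm.cdf(idx[:, None] + alpha[None, :])
        lik_t = np.where(y[i][:, None] == 1, p, 1 - p)
        ll += np.log(weights @ np.prod(lik_t, axis=0) / np.sqrt(np.pi))
    return ll
```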

Notice that in the traditional random effects approach (in the line of Anderson and Hsiao 1981) we would have to make assumptions about the distribution of x0 conditional on α and z^T.

See Also