Abstract
A semiparametric technique that has been gaining considerable popularity in economics, the quantile regression model has a number of attractive features. For example, it can be used to characterize the entire conditional distribution of a dependent variable given a set of regressors; it has a linear programming representation which makes estimation easy; and it gives a robust measure of location. Concentrating on cross-section applications, this article presents the basic structure of the quantile regression model, highlights the most important features, and provides the elementary tools for using quantile regressions in empirical applications.
Keywords
- Asymptotic covariance matrix
- Bootstrap
- Censored quantile regression model
- Design matrix bootstrapping
- Equivariance
- Generalized method of moments
- Heteroskedasticity
- Kernel estimators in econometrics
- Least absolute deviation
- Linear programming
- Optimal minimum distance estimator
- Quantile regression
- Semiparametric estimation
- Tobit model
Introduction
The quantile regression is a semiparametric technique that has been gaining considerable popularity in economics (for example, Buchinsky 1994). It was introduced by Koenker and Bassett (1978b) as an extension to ordinary quantiles in a location model. In this model, the conditional quantiles have linear forms. A well-known special case of quantile regression is the least absolute deviation (LAD) estimator of Koenker and Bassett (1978a), which fits medians to a linear function of covariates. In an important generalization of the quantile regression model, Powell (1984, 1986) introduced the censored quantile regression model. This model is an extension of the ‘Tobit’ model and is designed to handle situations in which some of the observations on the dependent variable are censored.
The quantile regression model has some very attractive features: (a) it can be used to characterize the entire conditional distribution of a dependent variable given a set of regressors; (b) it has a linear programming representation which makes estimation easy; (c) it gives a robust measure of location; (d) typically, the quantile regression estimator is more efficient than the least squares estimator when the error term is non-normal; and (e) L-estimators, based on a linear combination of quantile estimators (for example, Portnoy and Koenker 1989) are, in general, more efficient than least squares estimators.
This article presents the basic structure of the quantile regression model. It highlights the most important features and provides the elementary tools for using quantile regressions in empirical applications. The article concentrates on cross-section applications, where the observations are assumed to be independently and identically distributed (i.i.d.).
The Model
Definitions and Estimator
Any real-valued random variable z is completely characterized by its (right continuous) distribution function F(a) = Pr (z ≤ a). For any 0 < θ < 1, the quantity Qθ(z) ≡ F−1(θ) = inf {a : F(a) ≥ θ} is called the θth quantile of z. This quantile is obtained as a solution to a minimization problem of a particular objective function, the check function, given by ρθ(λ) = λ(θ − I(λ < 0)), where I(·) denotes the usual indicator function. That is,
\( {Q}_{\theta }(z)=\arg {\min}_aE\left[{\rho}_{\theta}\left(z-a\right)\right]. \)
An estimate for the θth quantile of z can be obtained from i.i.d. data zi, i=1,…n, by minimizing the sample analogue of the population objective function defined above. That is,
\( {\widehat{q}}_{\theta }=\arg {\min}_q\frac{1}{n}{\sum}_{i=1}^n{\rho}_{\theta}\left({z}_i-q\right), \)
or alternatively
\( {\widehat{q}}_{\theta }=\arg {\min}_q\frac{1}{n}\left[\theta {\sum}_{i:{z}_i\ge q}\left|{z}_i-q\right|+\left(1-\theta \right){\sum}_{i:{z}_i<q}\left|{z}_i-q\right|\right]. \)
The last equation provides a clear intuition for the quantile estimates. The θth quantile estimate is obtained by weighting the positive residuals by θ, while the negative residuals are weighted by the complement of θ, namely, 1 − θ.
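A minimal numerical sketch of this idea (in Python, with illustrative variable names; assuming NumPy is available): the sample θth quantile is the minimizer of the empirical check-function objective, and since the minimizer can always be taken to be an order statistic, a search over the sorted data recovers it exactly.

```python
import numpy as np

def check_loss(q, z, theta):
    """Average check-function loss: mean of rho_theta(z_i - q)."""
    u = z - q
    return np.mean(u * (theta - (u < 0)))

rng = np.random.default_rng(0)
z = rng.normal(size=1001)
theta = 0.25

# The minimizer can be taken to be one of the order statistics,
# so a grid search over the sorted data finds the exact sample quantile.
grid = np.sort(z)
losses = np.array([check_loss(q, z, theta) for q in grid])
q_hat = grid[int(np.argmin(losses))]

# With n*theta non-integer, the unique minimizer is the ceil(n*theta)-th
# order statistic.
assert q_hat == grid[int(np.ceil(len(z) * theta)) - 1]
```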
The extension of this idea to the case of a conditional quantile is straightforward. Suppose that the θth conditional quantile of y, conditional on a K × 1 vector of regressors x = (1, x2, … , xK), is
\( {Q}_{\theta}\left(y|x\right)={x}^{\prime }{\beta}_{\theta }. \) (1)
This implies that the model can be written as
\( y={x}^{\prime }{\beta}_{\theta }+{u}_{\theta }, \)
and, by construction, it follows that Qθ(uθ|x) = 0.
This model, which was first introduced by Koenker and Bassett (1978b), can be viewed as a location model. That is,
\( y={x}^{\prime }{\beta}_{\theta }+{u}_{\theta }, \)
where uθ has the (right continuous) conditional distribution function \( {F}_{u_{\theta }}\left(\cdot |x\right), \) satisfying Qθ(uθ| x) = 0.
Similar to the unconditional case presented above, the population parameter vector βθ is defined by
\( {\beta}_{\theta }=\arg {\min}_{\beta }E\left[{\rho}_{\theta}\left(y-{x}^{\prime}\beta \right)\right]. \)
The sample analogue for the θth conditional quantile is defined in a similar manner. Let (yi, xi), i =1, …, n, be an i.i.d. sample from the population. Then \( {\widehat{\beta}}_{\theta }, \) the estimator for βθ, is defined by
\( {\widehat{\beta}}_{\theta }=\arg {\min}_{\beta}\frac{1}{n}{\sum}_{i=1}^n{\rho}_{\theta}\left({y}_i-{x}_i^{\prime}\beta \right). \) (2)
The θth quantile regression problem in (2) can also be rewritten as
\( {\widehat{\beta}}_{\theta }=\arg {\min}_{\beta}\frac{1}{n}{\sum}_{i=1}^n\left[\theta -\frac{1}{2}+\frac{1}{2}\operatorname{sgn}\left({y}_i-{x}_i^{\prime}\beta \right)\right]\left({y}_i-{x}_i^{\prime}\beta \right), \)
where sgn(λ) = I(λ ≥ 0) − I(λ < 0). The last equation gives, in turn, the K × 1 vector of first-order conditions (F.O.C.):
\( \frac{1}{n}{\sum}_{i=1}^n\psi \left({x}_i,{y}_i,{\widehat{\beta}}_{\theta}\right)=0, \) (3)
where ψ(x, y, β) = (θ − 1/2 + 1/2sgn(y − x ′ β))x. It is straightforward to show that under the quantile restriction Qθ(uθi| xi) = 0 the moment function ψ(·) satisfies \( E\left[\psi \left({x}_i,{y}_i,{\beta}_{\theta}\right)\right]=0 \). In the jargon of the generalized method of moments (GMM) framework, this establishes the validity of ψ(·) as a moment function. Consequently, using the methodology of Huber (1967), one can establish consistency and asymptotic normality of \( {\widehat{\beta}}_{\theta } \).
For illustration and discussion below, it is convenient to define the following: Let y denote the stacked vector of yi, i = 1, … n, and let X denote the stacked matrix of the row vectors \( {x}_i^{\prime },i=1,\dots, n \).
Linear Programming and Quantile Regression
The problem in (2) can be shown to have a linear programming (LP) representation. This feature has some important consequences from both theoretical and practical standpoints.
Let the K × 1 vector β be written as a difference of two non-negative vectors β+ and β− , that is, β = β+ − β−, for β+, β− ≥ 0. Similarly let the n × 1 residuals vector u be written as a difference of two non-negative vectors u+ and u−, that is, u = u+ − u−, for u+, u− ≥ 0. Furthermore, define the following quantities: A = (X, −X, In, −In), where In is an n-dimensional identity matrix, \( z={\left({\beta}^{+\prime },{\beta}^{-\prime },{u}^{+\prime },{u}^{-\prime}\right)}^{\prime } \), \( c={\left({0}^{\prime },{0}^{\prime },\theta \cdot {l}^{\prime },\left(1-\theta \right)\cdot {l}^{\prime}\right)}^{\prime } \), 0 is a K × 1 vector of zeros, and l is an n × 1 vector of ones.
When written in matrix notation, the problem in (2) takes the familiar primal form of an LP problem:
\( {\min}_z\left\{{c}^{\prime }z\;|\; Az=y,\;z\ge 0\right\}. \)
Furthermore, the dual problem of this LP is (approximately) the same as the F.O.C. given above, namely
\( {\max}_d\left\{{y}^{\prime }d\;|\;{X}^{\prime }d=\left(1-\theta \right){X}^{\prime }l,\; d\in {\left[0,1\right]}^n\right\}. \)
The duality theorem of LP implies that feasible solutions exist for both the primal and the dual problems if the design matrix X is of full column rank, that is, rank (X) = K. The equilibrium theorem of LP then guarantees that this solution is optimal.
The LP representation of the quantile regression problem has several important implications from both computational and conceptual standpoints. First, it is guaranteed that an estimate will be obtained in a finite number of simplex iterations.
Second, the parameter estimate is robust to outliers. That is, for \( {y}_i-{x}_i^{\prime }{\widehat{\beta}}_{\theta }>0,{y}_i \) can be increased toward ∞, and for all \( {y}_i-{x}_i^{\prime }{\widehat{\beta}}_{\theta }<0,{y}_i \) can be decreased toward − ∞, without altering the solution \( {\widehat{\beta}}_{\theta } \). In other words, what matters is not the exact value of yi but the side of the estimated hyperplane on which it lies. This is important for many economic applications in which yi might be censored, at say \( {y}_i^0 \). For example, for the right-censored model \( {\widehat{\beta}}_{\theta } \) will not be affected as long as for all i we have \( {y}_i^0-{x}_i^{\prime }{\widehat{\beta}}_{\theta }>0 \).
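The primal LP above can be handed to any LP solver. The following sketch (in Python, assuming SciPy is available; variable names are illustrative) builds the quantities A, c, and z exactly as defined above, solves the LP with `scipy.optimize.linprog`, and then verifies the robustness property numerically: moving a positive-residual observation far upward leaves the fit unchanged.

```python
import numpy as np
from scipy.optimize import linprog

def quantreg_lp(y, X, theta):
    """Quantile regression via the primal LP: min c'z s.t. Az = y, z >= 0,
    with z = (beta+, beta-, u+, u-)."""
    n, K = X.shape
    c = np.concatenate([np.zeros(2 * K),
                        theta * np.ones(n), (1 - theta) * np.ones(n)])
    A = np.hstack([X, -X, np.eye(n), -np.eye(n)])
    res = linprog(c, A_eq=A, b_eq=y, bounds=(0, None), method="highs")
    z = res.x
    return z[:K] - z[K:2 * K]  # beta = beta+ - beta-

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0, 10, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(size=n)

beta_med = quantreg_lp(y, X, 0.5)  # LAD (median) fit

# Robustness: push the largest positive-residual observation toward infinity;
# the solution is unchanged because only the residual's sign matters.
resid = y - X @ beta_med
y2 = y.copy()
y2[int(np.argmax(resid))] += 1e6
beta_med2 = quantreg_lp(y2, X, 0.5)
assert np.allclose(beta_med, beta_med2, atol=1e-5)
```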
Equivariance Properties
The quantile regression estimator has several important equivariance properties which help facilitate the computation procedure. That is, data-sets that are based on certain transformations of the original data set lead to estimators which are simple transformations of the original estimator. Denote the set of feasible solutions to the problem defined in (2) by B(θ, y, X). Then for every \( {\widehat{\beta}}_{\theta}\equiv \widehat{\beta}\left(\theta, y,X\right)\in B\left(\theta, y,X\right) \) we have (see Koenker and Bassett 1978b: Theorem 3.2):
- (i) \( \widehat{\beta}\left(\theta, \lambda y,X\right)=\lambda \widehat{\beta}\left(\theta, y,X\right) \) for λ ≥ 0;
- (ii) \( \widehat{\beta}\left(1-\theta, \lambda y,X\right)=\lambda \widehat{\beta}\left(\theta, y,X\right) \) for λ ≤ 0;
- (iii) \( \widehat{\beta}\left(\theta, y+X\gamma, X\right)=\widehat{\beta}\left(\theta, y,X\right)+\gamma \) for any K × 1 vector γ; and
- (iv) \( \widehat{\beta}\left(\theta, y, XA\right)={A}^{-1}\widehat{\beta}\left(\theta, y,X\right) \) for any nonsingular K × K matrix A.
These properties help in reducing the number of simplex iterations (of any LP algorithm) required for obtaining \( {\widehat{\beta}}_{\theta } \). For example, suppose that \( {\widehat{\beta}}_{\theta}^0 \) is a good starting value for \( {\widehat{\beta}}_{\theta } \) (for example, the least-squares estimate from the regression of y on x, or an estimate obtained from only a small subset of the data available). Let \( {\widehat{\beta}}_{\theta}^R \) denote the quantile regression estimate from the θth quantile regression of \( {y}^R=y-X{\widehat{\beta}}_{\theta}^0 \) on x. Then \( {\widehat{\beta}}_{\theta }={\widehat{\beta}}_{\theta}^R+{\widehat{\beta}}_{\theta}^0 \). In many cases it is faster to obtain the two estimates \( {\widehat{\beta}}_{\theta}^R \) and \( {\widehat{\beta}}_{\theta}^0 \) than to estimate \( {\widehat{\beta}}_{\theta } \) directly.
Efficient Estimation
The quantile regression estimator described above is not the efficient estimator for βθ. An efficient estimator can be obtained by solving
\( {\min}_{\beta}\frac{1}{n}{\sum}_{i=1}^n{f}_{u_{\theta }}\left(0|{x}_i\right){\rho}_{\theta}\left({y}_i-{x}_i^{\prime}\beta \right). \)
That is, each observation is weighted by the conditional density of its error evaluated at zero. This estimation procedure requires the use of an estimate for the unknown density \( {f}_{u_{\theta }}\left(0|x\right) \). Below we provide details about the estimation of the asymptotic covariance matrix, which, in turn, also provides information about possible estimates for \( {f}_{u_{\theta }}\left(0|x\right) \). (For a more complete discussion of this estimator, see Newey and Powell 1990.)
Interpretation of the Quantile Regression
How can the quantile regression coefficients be interpreted? Consider the partial derivative of the conditional quantile of y with respect to one of the regressors, say j, that is, ∂Qθ(y| x)/∂xj. This derivative may be interpreted as the marginal change in the θth conditional quantile due to a marginal change in the jth element of x. If x contains K distinct variables, then this derivative is given simply by βθj, the coefficient on the jth variable. One should be cautious, however, not to confuse this result with the location of an individual in the conditional distribution. In general, it need not be the case that an observation that happened to be in the θth quantile of one conditional distribution will also be at the same quantile if x had changed. The above derivative reflects changes in the conditional distribution but has nothing to say about the location of an observation within the conditional distribution.
Note that an estimate for the θth conditional quantile of y given x is given by \( {\widehat{Q}}_{\theta}\left(y|x\right)={x}^{\prime }{\widehat{\beta}}_{\theta } \). Hence, by varying θ between 0 and 1 and estimating a different quantile regression for each θ, one can trace the entire conditional distribution of y, conditional on x.
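The point about tracing the conditional distribution can be sketched numerically (Python, assuming SciPy; the LP-based fitting routine and the data-generating process are illustrative, not from the original). Under the heteroskedastic design y = 1 + 2x + xε with ε standard normal, the true θth conditional quantile is 1 + (2 + Φ−1(θ))x, so the estimated slope should increase across quantiles:

```python
import numpy as np
from scipy.optimize import linprog
from scipy.stats import norm

def quantreg(y, X, theta):
    """theta-th quantile regression via its primal LP representation."""
    n, K = X.shape
    c = np.concatenate([np.zeros(2 * K),
                        theta * np.ones(n), (1 - theta) * np.ones(n)])
    A = np.hstack([X, -X, np.eye(n), -np.eye(n)])
    res = linprog(c, A_eq=A, b_eq=y, method="highs")
    return res.x[:K] - res.x[K:2 * K]

# Heteroskedastic model: the theta-quantile slope is 2 + Phi^{-1}(theta).
rng = np.random.default_rng(2)
n = 500
x = rng.uniform(0.5, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1 + 2 * x + x * rng.normal(size=n)

slopes = {th: quantreg(y, X, th)[1] for th in (0.25, 0.5, 0.75)}
# Slopes increase across quantiles and track 2 + Phi^{-1}(theta)
assert slopes[0.25] < slopes[0.5] < slopes[0.75]
for th, s in slopes.items():
    assert abs(s - (2 + norm.ppf(th))) < 0.4
```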
Large Sample Properties of \( {\widehat{\boldsymbol{\beta}}}_{\boldsymbol{\theta}} \)
We denote the conditional distribution function of uθ by \( {F}_{u_{\theta }}\left(\cdot |x\right) \) and the corresponding density function by \( {f}_{u_{\theta }}\left(\cdot |x\right) \).
Assumption A.1
The distribution functions \( \left\{{F}_{u_{\theta i}}\left(\cdot |{x}_i\right)\right\} \) are absolutely continuous, with continuous density functions \( {f}_{u_{\theta i}}\left(\cdot |{x}_i\right) \) uniformly bounded away from 0 and ∞ at the point 0, for i = 1 , 2 , …
Assumption A.2
There exist positive definite matrices Δθ and Λ0 such that
- (i)
\( {lim}_{n\to \infty}\frac{1}{n}{\sum}_{i=1}^n{x}_i{x}_i^{\prime }={\Lambda}_0; \)
- (ii)
\( {lim}_{n\to \infty}\frac{1}{n}{\sum}_{i=1}^n{f}_{u_{\theta i}}\left(0|{x}_i\right){x}_i{x}_i^{\prime }={\Delta}_{\theta };\mathrm{and} \)
- (iii)
\( {\max}_{i=1,\dots, n}\left\Vert {x}_i\right\Vert /\sqrt{n}\to 0. \)
Assumption A.3
The parameter vector βθ is in the interior of the parameter space \( {\mathcal{B}}_{\theta } \).
Assumption A.1 requires that the conditional density of uθi, conditional on xi, be bounded and that there be no mass point at the conditional θth quantile at which βθ is estimated.
Assumptions A.2 and A.3 provide regularity conditions very similar to those used for the usual least-squares estimator. Assumptions A.1 and A.2 are sufficient for establishing that \( {\widehat{\beta}}_{\theta}\to {\beta}_{\theta } \) as n → ∞, while Assumption A.3 is needed in addition for establishing the asymptotic normality of \( {\widehat{\beta}}_{\theta } \) in the following theorem.
Theorem 1
Under Assumptions A.1, A.2, and A.3
- (i)
\( \sqrt{n}\left({\widehat{\beta}}_{\theta }-{\beta}_{\theta}\right){\to}^{\mathrm{L}}N\left(0,\theta \left(1-\theta \right){\Delta}_{\theta}^{-1}{\Lambda}_0{\Delta}_{\theta}^{-1}\right); \)
- (ii)
if in addition \( {f}_{u_{\theta }}\left(0|x\right)={f}_{u_{\theta }}(0) \) with probability 1, then
\( \sqrt{n}\left({\widehat{\beta}}_{\theta }-{\beta}_{\theta}\right){\to}^{\mathrm{L}}N\left(0,{w}_{\theta}^2{\Lambda}_0^{-1}\right) \), where \( {w}_{\theta}^2=\theta \left(1-\theta \right){f}_{u_{\theta}}^2(0) \).
The result in (i) uses the fact that the (yi,xi) are independent, but need not be identically distributed. This is the case when \( {f}_{u_{\theta }}\left(\cdot |x\right) \) depends on x, as is the case, for example, with heteroskedasticity. The result in (ii) simplifies the result in (i) when \( \left({y}_i,{x}_i^{\prime}\right) \) are i.i.d.
Estimation of the Asymptotic Covariance Matrix
Several estimators for the asymptotic covariance matrix are readily available. Some of the estimators are valid under Theorem 1(i), while others are valid only under the independence assumption of Theorem 1(ii). In the following, we refer to the former as the general case, while the latter is referred to as the i.i.d. case. Note that in either case Λ0 can be easily estimated by its sample analogue, namely,
\( {\widehat{\Lambda}}_0=\frac{1}{n}{\sum}_{i=1}^n{x}_i{x}_i^{\prime }. \)
The i.i.d. Case
In this case the problem centers around estimating \( {\omega}_{\theta}^2 \), or more specifically around estimating \( 1/{f}_{u_{\theta }}(0) \). Let \( {\widehat{u}}_{\theta (1)},\dots, {\widehat{u}}_{\theta (n)} \) be the ordered residuals from the θth quantile regression.
Order estimator: Following Siddiqui (1960), an estimator for \( 1/{f}_{u_{\theta }}(0) \) is provided by the difference quotient
\( \frac{{\widehat{u}}_{\theta \left(\left\lceil \left(\theta +{h}_n\right)n\right\rceil \right)}-{\widehat{u}}_{\theta \left(\left\lfloor \left(\theta -{h}_n\right)n\right\rfloor \right)}}{2{h}_n}, \)
for some bandwidth hn = op(1). Bofinger (1975) provides an optimal choice for the bandwidth, minimizing the mean squared error, based on the normal approximation for the true \( {f}_{u_{\theta }}\left(\cdot \right) \):
\( {h}_n={n}^{-1/5}{\left[\frac{4.5\;{\varphi}^4\left({\Phi}^{-1}\left(\theta \right)\right)}{{\left(2{\Phi}^{-1}{\left(\theta \right)}^2+1\right)}^2}\right]}^{1/5}, \)
where Φ and φ denote the distribution function and density function of a standard normal variable, respectively.
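The Bofinger bandwidth and the order-statistic (Siddiqui) sparsity estimate are short formulas; a sketch in Python (assuming SciPy; function names are illustrative) implements both and checks them on a case with a known answer, standard normal residuals at the median, where \( 1/{f}_{u_{\theta }}(0)=\sqrt{2\pi}\approx 2.507 \):

```python
import numpy as np
from scipy.stats import norm

def bofinger_bandwidth(n, theta):
    """Bofinger (1975) MSE-optimal bandwidth under a normal reference density."""
    z = norm.ppf(theta)
    return n ** (-0.2) * (4.5 * norm.pdf(z) ** 4 / (2 * z ** 2 + 1) ** 2) ** 0.2

def sparsity_siddiqui(resid, theta, h):
    """Difference-quotient (order-statistic) estimate of 1/f_{u_theta}(0)."""
    u = np.sort(resid)
    n = len(u)
    lo = max(int(np.floor((theta - h) * n)), 0)
    hi = min(int(np.ceil((theta + h) * n)) - 1, n - 1)
    return (u[hi] - u[lo]) / (2 * h)

# Known case: standard normal "residuals" at theta = 0.5.
rng = np.random.default_rng(3)
u = rng.normal(size=20000)
h = bofinger_bandwidth(len(u), 0.5)
s_hat = sparsity_siddiqui(u, 0.5, h)
assert abs(s_hat - np.sqrt(2 * np.pi)) < 0.15
```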
Kernel estimator
The density \( {f}_{u_{\theta }}(0) \) can be estimated directly by
\( {\widehat{f}}_{u_{\theta }}(0)=\frac{1}{n{c}_n}{\sum}_{i=1}^n\kappa \left({\widehat{u}}_{\theta i}/{c}_n\right), \)
where κ(·) is some kernel function and cn = op(1) is the kernel bandwidth. The bandwidth can be optimally chosen using a variety of cross-validation methods (for example, least-squares, log likelihood, and so on).
Bootstrap estimator for \( {\omega}_{\theta}^2 \): This estimator relies on bootstrapping the residual series \( {\widehat{u}}_{\theta i},i=1,\dots, n. \) Specifically, one can obtain B bootstrap estimates for qθ, the θth quantile of uθ, say \( {\widehat{q}}_{\theta 1}^{\ast },\dots, {\widehat{q}}_{\theta B}^{\ast } \), from B bootstrap samples drawn from the empirical distribution \( {\widehat{F}}_{u_{\theta }}. \) An estimator for \( {\omega}_{\theta}^2 \) is then obtained by
\( {\widehat{\omega}}_{\theta}^2=\frac{n}{B}{\sum}_{j=1}^B{\left({\widehat{q}}_{\theta j}^{\ast }-{\overline{q}}_{\theta}^{\ast}\right)}^2, \)
where \( {\overline{q}}_{\theta}^{\ast }=\frac{1}{B}{\sum}_{j=1}^B{\widehat{q}}_{\theta j}^{\ast } \).
The General Case
There are several alternative estimators for the general case. Here we provide two possible estimators that have been proven accurate in a variety of Monte Carlo studies (for example, Buchinsky 1995).
Kernel estimator
Powell (1986) considered the following kernel estimator for Δθ:
\( {\widehat{\Delta}}_{\theta }=\frac{1}{n{c}_n}{\sum}_{i=1}^n\kappa \left({\widehat{u}}_{\theta i}/{c}_n\right){x}_i{x}_i^{\prime }, \)
where κ(⋅) is some kernel function and cn = op(1) is the kernel bandwidth. Note that the top left-hand element of the matrix \( {\widehat{\Delta}}_{\theta } \) is an estimate of the density\( {f}_{u_{\theta }}(0). \)Hence, the same cross-validation methods discussed before can be used to optimally choose cn.
Design matrix bootstrap estimators
There are several alternative ways for employing the bootstrap method of Efron (1979). The most general method is what is termed design matrix bootstrapping, whereby one re-samples from the joint distribution of (y, x). Specifically, let \( \left({y}_i^{\ast },{x}_i^{\ast}\right) \), i = 1 , … , n be a randomly drawn sample from the empirical distribution of (y, x), denoted \( {\widehat{F}}_{xy} \). Let \( {\widehat{\beta}}_{\theta}^{\ast } \) denote the quantile regression estimate based on the bootstrap sample. If we repeat this process B times, then an estimate for \( {V}_{\theta }=\theta \left(1-\theta \right){\Delta}_{\theta}^{-1}{\Lambda}_0{\Delta}_{\theta}^{-1} \) is given by
\( {\widehat{V}}_{\theta }=\frac{n}{B}{\sum}_{j=1}^B\left({\widehat{\beta}}_{\theta j}^{\ast }-{\overline{\beta}}_{\theta}^{\ast}\right){\left({\widehat{\beta}}_{\theta j}^{\ast }-{\overline{\beta}}_{\theta}^{\ast}\right)}^{\prime }, \)
where \( {\overline{\beta}}_{\theta}^{\ast }=\frac{1}{B}{\sum}_{j=1}^B{\widehat{\beta}}_{\theta j}^{\ast } \). The estimate \( {\widehat{V}}_{\theta } \) is a consistent estimator for Vθ in the sense that the conditional distribution of \( \sqrt{n}\left({\widehat{\beta}}_{\theta}^{\ast }-{\widehat{\beta}}_{\theta}\right) \), conditional on the data, weakly converges to the unconditional distribution of \( \sqrt{n}\left({\widehat{\beta}}_{\theta }-{\beta}_{\theta}\right) \).
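The design matrix (pairs) bootstrap can be sketched as follows (Python, assuming SciPy; the LP-based fitting routine and names are illustrative): resample the pairs (yi, xi) jointly, re-fit the quantile regression on each bootstrap sample, and scale the covariance of the bootstrap estimates by n.

```python
import numpy as np
from scipy.optimize import linprog

def quantreg(y, X, theta):
    """Quantile regression via its primal LP representation."""
    n, K = X.shape
    c = np.concatenate([np.zeros(2 * K),
                        theta * np.ones(n), (1 - theta) * np.ones(n)])
    A = np.hstack([X, -X, np.eye(n), -np.eye(n)])
    res = linprog(c, A_eq=A, b_eq=y, method="highs")
    return res.x[:K] - res.x[K:2 * K]

def design_bootstrap_cov(y, X, theta, B=100, seed=0):
    """Design-matrix (pairs) bootstrap estimate of V_theta: resample
    (y_i, x_i) jointly and re-fit the quantile regression B times."""
    rng = np.random.default_rng(seed)
    n = len(y)
    betas = np.empty((B, X.shape[1]))
    for b in range(B):
        idx = rng.integers(0, n, size=n)
        betas[b] = quantreg(y[idx], X[idx], theta)
    return n * np.cov(betas, rowvar=False)

rng = np.random.default_rng(5)
n = 100
x = rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), x])
y = 1 + 2 * x + rng.normal(size=n)

V = design_bootstrap_cov(y, X, 0.5, B=100)
assert V.shape == (2, 2) and V[0, 0] > 0 and V[1, 1] > 0
```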
One important caveat about bootstrapping is in order. If one uses the bootstrap method at all, it can be used more efficiently and effectively by taking advantage of its higher-order refinement properties. For example, one can directly construct confidence intervals, test statistics, and so forth, based on the bootstrap estimates, without first having to compute an estimate for Vθ. The number of bootstrap repetitions required may differ across applications; the appropriate number can be computed using the method proposed by Andrews and Buchinsky (2000).
Set of Quantile Regressions
The model presented in (1) considered only the estimation for a single quantile θ. In practice one would like to estimate several quantile regressions at distinct points of the conditional distribution of the dependent variable. This section outlines the estimation of a finite sequence of quantile regressions and provides its asymptotic distribution.
Estimation and Large Sample Properties
Consider the model given in (1) (dropping the i subscript) for p alternative θ’s:
\( y={x}^{\prime }{\beta}_{\theta_j}+{u}_{\theta_j},\kern1em {Q}_{\theta_j}\left(y|x\right)={x}^{\prime }{\beta}_{\theta_j}, \)
for j = 1 , … , p. Without loss of generality assume that 0 < θ1 < θ2 < ⋯ < θp < 1. Estimating the p quantile regressions amounts to running p separate regressions for θ1 through θp. Let \( {\beta}_{\theta }={\left({\beta}_{\theta_1}^{\prime },\dots, {\beta}_{\theta_p}^{\prime}\right)}^{\prime } \) denote the stacked vector of the population’s true parameter vectors and let \( {\widehat{\beta}}_{\theta }={\left({\widehat{\beta}}_{\theta_1}^{\prime },\dots, {\widehat{\beta}}_{\theta_p}^{\prime}\right)}^{\prime } \) denote its corresponding estimate.
Theorem 2
Under Assumptions A.1, A.2, and A.3
- (i)
\( \sqrt{n}\left({\widehat{\beta}}_{\theta }-{\beta}_{\theta}\right){\to}^{\mathrm{L}}N\left(0,{\Lambda}_{\theta}\right) \), where \( {\Lambda}_{\theta }={\left\{{\Lambda}_{\theta_{jk}}\right\}}_{j,k=1,\dots, p} \) and \( {\Lambda}_{\theta_{jk}}=\left(\min \left\{{\theta}_j,{\theta}_k\right\}-{\theta}_j{\theta}_k\right){\Delta}_{\theta_j}^{-1}{\Lambda}_0{\Delta}_{\theta_k}^{-1} \);
- (ii)
if in addition \( {f}_{u_{\theta_j}}\left(0|x\right)={f}_{u_{\theta_j}}(0) \) for j = 1, …, p with probability 1, then \( \sqrt{n}\left({\widehat{\beta}}_{\theta }-{\beta}_{\theta}\right){\to}^{\mathrm{L}}N\left(0,{\Lambda}_{\theta}\right) \), where \( {\Lambda}_{\theta }={\Omega}_{\theta}\otimes {\Lambda}_0^{-1} \) and \( {\Omega}_{\theta_{jk}}=\left[\min \left\{{\theta}_j,{\theta}_k\right\}-{\theta}_j{\theta}_k\right]/\left[{f}_{u_{\theta_j}}(0){f}_{u_{\theta_k}}(0)\right] \).
Crossing of Quantiles
Note that the estimated conditional quantiles, conditional on x, are given by \( {x}^{\prime }{\widehat{\beta}}_{\theta_1},\dots, {x}^{\prime }{\widehat{\beta}}_{\theta_p} \). Since the estimates \( {\widehat{\beta}}_{\theta_j}\;\left(j=1,\dots, p\right) \) for the p quantiles are obtained from separate quantile regressions, it is possible that for some vector x0, \( {x}_0^{\prime }{\widehat{\beta}}_{\theta_j}>{x}_0^{\prime }{\widehat{\beta}}_{\theta_k} \) even though θj < θk; that is, estimated conditional quantiles may cross each other. This may be of no practical consequence, since there may be no such vector within the relevant range of plausible x’s. Nevertheless, in any empirical application these potential crossings need to be examined.
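A simple numerical check for crossings can be sketched as follows (Python, assuming NumPy; the coefficient values are hypothetical, constructed so that the fitted quantiles cross for large values of the regressor):

```python
import numpy as np

def crossing_points(betas, x_grid):
    """Given a dict {theta: beta} and design rows x_grid, return the rows
    where the fitted quantiles fail to be monotone in theta."""
    thetas = sorted(betas)
    Q = np.column_stack([x_grid @ betas[th] for th in thetas])  # (m, p)
    bad = np.any(np.diff(Q, axis=1) < 0, axis=1)
    return x_grid[bad]

# Hypothetical estimates whose fitted quantiles cross:
# Q_.25 = x2 and Q_.75 = 1 + 0.5*x2 cross where x2 > 2.
betas = {0.25: np.array([0.0, 1.0]), 0.75: np.array([1.0, 0.5])}
grid = np.column_stack([np.ones(11), np.linspace(0, 10, 11)])
bad = crossing_points(betas, grid)
assert np.all(bad[:, 1] > 2)  # crossings occur exactly for x2 in {3,...,10}
assert len(bad) == 8
```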
Testing for Equality of Slope Coefficients
Under the i.i.d. assumption the p coefficient vectors \( {\beta}_{\theta_1},\dots, {\beta}_{\theta_p} \) should be the same, except for the intercept coefficients. There are a number of ways for testing the null hypothesis of i.i.d. errors. Only two testing procedures are provided here. For other alternative methods see Koenker (2005).
Wald-Type Testing
This testing procedure is based on the optimal minimum distance (MD) estimator under the null hypothesis. Denote the parameter vector under the null by \( {\beta}_{\theta}^R \) and note that \( {\beta}_{\theta}^R={\left({\beta}_{\theta_11},\dots, {\beta}_{\theta_p1},{\beta}_2,\dots, {\beta}_K\right)}^{\prime } \) is a (p + K − 1) × 1 vector, with p distinct intercepts \( {\beta}_{\theta_11},\dots, {\beta}_{\theta_p1} \) and K − 1 common slope parameters β2, … , βK.
An optimal estimate for the restricted coefficient vector \( {\beta}_{\theta}^R \) is defined by
\( {\widehat{\beta}}_{\theta}^R=\arg {\min}_{\beta^R}{\left({\widehat{\beta}}_{\theta }-R{\beta}^R\right)}^{\prime }{\widehat{V}}_{\theta}^{-1}\left({\widehat{\beta}}_{\theta }-R{\beta}^R\right), \)
where \( {\widehat{V}}_{\theta } \) is a consistent estimate for the covariance matrix of \( {\widehat{\beta}}_{\theta } \), the unrestricted parameter estimate from the p quantile regressions, estimated under the null (that is, under Theorem 2(ii)). The matrix R is simply a pK × (p + K − 1) restriction matrix which imposes the restrictions implied by the i.i.d. assumption. A test statistic for equality of the slope coefficients is then provided by
\( {W}_n=n{\left({\widehat{\beta}}_{\theta }-R{\widehat{\beta}}_{\theta}^R\right)}^{\prime }{\widehat{V}}_{\theta}^{-1}\left({\widehat{\beta}}_{\theta }-R{\widehat{\beta}}_{\theta}^R\right). \)
Under the null hypothesis \( {W}_n\overset{D}{\to }{\upchi}^2\left( pK-p-K+1\right) \) as n → ∞, so the null hypothesis is rejected if \( {W}_n>{\upchi}_{1-\alpha}^2\left( pK-p-K+1\right), \) where \( {\upchi}_{1-\alpha}^2(m) \) denotes the 1 − α quantile of a χ2-distribution with m degrees of freedom.
GMM-Type Testing
An alternative testing procedure can be applied using Hansen’s (1982) GMM method. Define a moment function ψ(x, y, β) by stacking the p individual moment functions as defined in (3). While this moment function is a pK × 1 vector, under the null there are only p + K − 1 parameters to be estimated. Hansen’s GMM framework provides an estimator for \( {\beta}_{\theta}^R \), say \( {\widehat{\beta}}_{\theta}^R \), defined by
\( {\widehat{\beta}}_{\theta}^R=\arg {\min}_{\beta^R}{\left[\frac{1}{n}{\sum}_{i=1}^n\psi \left({x}_i,{y}_i,R{\beta}^R\right)\right]}^{\prime }A\left[\frac{1}{n}{\sum}_{i=1}^n\psi \left({x}_i,{y}_i,R{\beta}^R\right)\right], \)
for some positive definite weight matrix A.
An efficient estimator can be obtained if A is chosen so that \( A\overset{p}{\to }{\left\{E\left[\psi \left(x,y,{\beta}_{\theta}\right)\psi {\left(x,y,{\beta}_{\theta}\right)}^{\prime}\right]\right\}}^{-1} \) as n → ∞. This framework provides us with a straightforward testing procedure. Under the null hypothesis,
\( n{\left[\frac{1}{n}{\sum}_{i=1}^n\psi \left({x}_i,{y}_i,R{\widehat{\beta}}_{\theta}^R\right)\right]}^{\prime }A\left[\frac{1}{n}{\sum}_{i=1}^n\psi \left({x}_i,{y}_i,R{\widehat{\beta}}_{\theta}^R\right)\right]\overset{D}{\to }{\upchi}^2\left( pK-p-K+1\right) \)
as n → ∞.
Note that, because of the linearity of the conditional quantiles, the GMM testing provides a test statistic that is (asymptotically) equivalent to that provided by the MD testing.
Censored Quantile Regression
An important extension to the quantile regression model was suggested by Powell (1984, 1986). This extension considers the case in which some of the observations are censored. This model is essentially a semiparametric extension to the well-known ‘Tobit’ model and can be written as
\( {y}_i=\min \left\{{y}_i^0,{x}_i^{\prime }{\beta}_{\theta }+{u}_{\theta i}\right\}, \)
for i = 1 , … , n, where \( {y}_i^0 \) is the (known) top-coding value of yi in the sample. (For simplicity of presentation it will be assumed that \( {y}_i^0={y}^0 \) for all i = 1 , … , n.)
This model can be written as a latent variable model. That is, we have \( {y}_i^{\ast }={x}_i^{\prime }{\beta}_{\theta }+{u}_{\theta i} \), where Qθ(uθi| xi) = 0 and \( {y}_i={y}_i^{\ast }I\left({y}_i^{\ast}\le {y}^0\right) \). It is easy to see that the observed conditional θth quantile of yi, conditional on xi, is given by
\( {Q}_{\theta}\left({y}_i|{x}_i\right)=\min \left\{{y}^0,{x}_i^{\prime }{\beta}_{\theta}\right\} \).
Hence, Powell suggested the following estimator for βθ:
\( {\widehat{\beta}}_{\theta }=\arg {\min}_{\beta}\frac{1}{n}{\sum}_{i=1}^n{\rho}_{\theta}\left({y}_i-\min \left\{{y}^0,{x}_i^{\prime}\beta \right\}\right), \)
where ρθ(λ) is the same check function as defined above. Note that in order to obtain a consistent estimator of βθ it is necessary that \( {x}_i^{\prime }{\beta}_{\theta }<{y}^0 \) for at least a fraction of the sample. Intuitively, the larger the fraction, the more precise the estimator will be.
Powell (1986) showed that, under certain regularity conditions, similar to those established by Huber (1967), the estimator is asymptotically normal. That is, \( \sqrt{n}\left({\widehat{\beta}}_{\theta }-{\beta}_{\theta}\right)\overset{D}{\to }N\left(0,{V}_{\theta}^C\right) \) as n → ∞, where
\( {V}_{\theta}^C=\theta \left(1-\theta \right){\Delta}_{C\theta}^{-1}{\Lambda}_{C\theta }{\Delta}_{C\theta}^{-1}, \)
with \( {\Delta}_{C\theta }={\lim}_{n\to \infty}\frac{1}{n}{\sum}_{i=1}^nI\left({x}_i^{\prime }{\beta}_{\theta}\le {y}^0\right){f}_{u_{\theta }}\left(0|{x}_i\right){x}_i{x}_i^{\prime } \) and \( {\Lambda}_{C\theta }={\lim}_{n\to \infty}\frac{1}{n}{\sum}_{i=1}^nI\left({x}_i^{\prime }{\beta}_{\theta}\le {y}^0\right){x}_i{x}_i^{\prime } \).
As in the basic quantile regression model, if \( {f}_{u_{\theta }}\left(0|x\right)={f}_{u_{\theta }}(0) \) with probability 1, then \( {V}_{\theta}^C \) simplifies to \( {V}_{\theta}^C={\omega}_{\theta}^2{\Lambda}_{C\theta}^{-1}, \) where \( {\omega}_{\theta}^2=\theta \left(1-\theta \right){f}_{u_{\theta}}^2(0) \).
It is important to note that if \( {x}_i^{\prime }{\widehat{\beta}}_{\theta}\le {y}^0 \) for all observations, then the censored quantile regression estimate coincides with the basic quantile regression.
The simple intuition for this estimation procedure is that βθ can be estimated only from that part of the sample for which the latent variable is observed, that is, for that fraction of the sample for which \( y={y}^{\ast }={x}^{\prime }{\beta}_{\theta }+{u}_{\theta}\le {y}^0 \). As a result, the asymptotic covariance is ‘adjusted’ for that fact: the term I(x ′ βθ ≤ y0) is included in both ΔCθ and ΛCθ.
A considerable drawback of the censored quantile regression model is that it does not have the attractive LP representation and the objective function is not globally convex in β.
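In practice Powell's estimator is therefore often approximated by an iterative scheme: fit a quantile regression, drop the observations whose current fit lies at or above the censoring point, re-fit, and repeat. The following sketch (Python, assuming SciPy; names and the data-generating process are illustrative, and the iteration is only a heuristic that need not find the global minimum of the non-convex objective):

```python
import numpy as np
from scipy.optimize import linprog

def quantreg(y, X, theta):
    """Standard quantile regression via its primal LP representation."""
    n, K = X.shape
    c = np.concatenate([np.zeros(2 * K),
                        theta * np.ones(n), (1 - theta) * np.ones(n)])
    A = np.hstack([X, -X, np.eye(n), -np.eye(n)])
    res = linprog(c, A_eq=A, b_eq=y, method="highs")
    return res.x[:K] - res.x[K:2 * K]

def censored_quantreg(y, X, theta, y0, n_iter=25):
    """Heuristic iteration for the censored model: refit on the subsample
    where the current fitted quantile lies below the censoring point y0."""
    beta = quantreg(y, X, theta)  # start from the plain quantile regression
    for _ in range(n_iter):
        keep = X @ beta < y0
        if keep.sum() <= X.shape[1]:
            break  # too few effectively uncensored observations
        beta_new = quantreg(y[keep], X[keep], theta)
        if np.allclose(beta_new, beta, atol=1e-8):
            break  # converged
        beta = beta_new
    return beta

# Top-coded data: y* = 1 + 2x + eps, observed y = min(y*, 8).
rng = np.random.default_rng(6)
n = 400
x = rng.uniform(0, 5, n)
X = np.column_stack([np.ones(n), x])
y = np.minimum(1 + 2 * x + rng.normal(size=n), 8.0)

beta_c = censored_quantreg(y, X, 0.5, 8.0)
assert abs(beta_c[1] - 2.0) < 0.3  # slope recovered despite top coding
```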
Concluding Remarks
The main goal of this article is to provide the basic structure of the quantile regression model. Versions of this model have been widely used in the empirical literature in a variety of situations not covered by this article. Furthermore, there have been substantial advancements in the theoretical literature as well. This literature includes quantile regression for nonlinear models, time-series models, and others. There are also a number of empirical studies that have used quantile regression extensively, in a variety of data configurations and economic contexts. For a brilliant in-depth exposition of a wide variety of topics related to quantile regression, interested readers should refer to Koenker (2005).
Bibliography
Andrews, D., and M. Buchinsky. 2000. A three-step method for choosing the number of bootstrap repetitions. Econometrica 68: 23–51.
Bofinger, E. 1975. Estimation of density function using order statistics. Australian Journal of Statistics 17: 1–7.
Buchinsky, M. 1994. Changes in the U.S. wage structure 1963–1987: Application of quantile regression. Econometrica 62: 405–458.
Buchinsky, M. 1995. Estimating the asymptotic covariance matrix for quantile regression models: A Monte Carlo study. Journal of Econometrics 68: 303–338.
Efron, B. 1979. Bootstrap methods: Another look at the jackknife. Annals of Statistics 7: 1–26.
Hansen, L. 1982. Large sample properties of generalized method of moments estimators. Econometrica 50: 1029–1054.
Huber, P. 1967. The behavior of maximum likelihood estimates under nonstandard conditions. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability Vol. 1. Berkeley: University of California Press.
Koenker, R. 2005. Quantile regression. Econometric Society Monograph. New York: Cambridge University Press.
Koenker, R., and G. Bassett. 1978a. The asymptotic distribution of the least absolute error estimator. Journal of the American Statistical Association 73: 618–622.
Koenker, R., and G. Bassett. 1978b. Regression quantiles. Econometrica 46: 33–50.
Newey, W., and J. Powell. 1990. Efficient estimation of linear and type I censored regression models under conditional quantile restrictions. Econometric Theory 6: 295–317.
Portnoy, S., and R. Koenker. 1989. Adaptive L-estimation for linear models. Annals of Statistics 17: 362–381.
Powell, J. 1984. Least absolute deviation estimation for the censored regression model. Journal of Econometrics 25: 303–325.
Powell, J. 1986. Censored regression quantiles. Journal of Econometrics 32: 143–155.
Siddiqui, M. 1960. Distribution of quantile from a bivariate population. Journal of Research of the National Bureau of Standards 64: 145–150.
© 2018 Macmillan Publishers Ltd.
Buchinsky, M. (2018). Quantile Regression. In: The New Palgrave Dictionary of Economics. Palgrave Macmillan, London. https://doi.org/10.1057/978-1-349-95189-5_2795