1 Introduction

Wage and income distributions change slowly, reflecting the relative stability of the Mincerian wage equation and slow demographic change. However, profound institutional changes and macroeconomic instability accelerate the pace of transformation. Income inequality in Latin America fell notably over the last decade, a fact documented by Gasparini and Lustig (2011), Lustig et al. (2013) and Cornia (2014), among others. According to Lustig et al. (2013), the Gini coefficient for the region as a whole declined from an average of 0.52 in 2000 to 0.50 in 2009. Recent studies have pointed out that the drop in wage inequality, in turn, explains most of this decline.

Given the strong correlation between education and wages, most of the literature has focused on the role of the returns to education in inequality changes. According to Barro and Lee (2013), the proportion of workers with completed tertiary education in the region went from 7.2 to 8.0 from 2000 to 2010. Such educational upgrading has an unequalizing effect known as the “paradox of progress.” That is, a more equal distribution of education increases wage inequality due to the convexity of the returns to education (Legovini et al. 2005). Recent evidence on the inequality-increasing compositional effect of education in the region can be found in Beccaria et al. (2015), Wang et al. (2016), and Ferreira et al. (2016b).

However, despite this effect, inequality in the region fell mainly as a result of the drop in the returns to education. According to SEDLAC (2016), the wage premium of tertiary education in Latin America dropped from 0.82 to 0.75 on average. The overall equalizing effect of education in recent years has been suggested by Beccaria et al. (2015) in the case of Argentina and by Wang et al. (2016) in the case of Brazil. This reduction in the returns to education has been linked, in turn, to the boom in commodity prices, which would have favored less educated workers (Gasparini and Lustig 2011).

The decreasing wage inequality in the region has also been consistent with lower levels of job informality (see ILO 2011; Berg 2010; Amarante and Arim 2015). This informality pattern is a result of programs aimed at better labor conditions and of the reduction in labor market bureaucracy (Berg 2010; Maurizio 2014). The relationship between informality and inequality has attracted growing interest and has recently been studied by Maurizio (2012), Amarante and Arim (2015), Beccaria et al. (2015), Ferreira et al. (2016b) and Binelli (2016).

This paper develops an extensive empirical framework to test the distributional effects of both education and informality on wage inequality using several quantile regression (QR) decomposition techniques. QR is a general way of modeling heterogeneity in an outcome variable in terms of the effect of certain covariates (see Koenker 2005 for a comprehensive analysis of QR). The QR framework has been used to decompose changes in inequality on the basis of the Oaxaca–Blinder (OA) methodology (Oaxaca 1973; Blinder 1973; Oaxaca and Ransom 1994) by defining an “endowment effect,” i.e., differences in an outcome variable because of differences in explanatory variables (e.g., differences in education), and a “pricing effect,” i.e., differences across groups because of differences in the coefficients (e.g., differences in returns to schooling). Variation among quantiles is seen as additional heterogeneity in unobservables that cannot be captured by the mean-based OA method.

Therefore, some of the questions that we try to address here are: Have educational upgrading and job formalization had equalizing effects during the period? How do changes in worker characteristics and in the returns to such characteristics explain the drop in inequality? Did the effect of job formalization on wage inequality remain after the latest financial crisis? To decompose the wage gap into price and endowment effects, we use microdata from household surveys from Argentina, Brazil, Colombia and Mexico for 2002–2014. This is a key time period to study both inequality changes and the stability of their determinants because it is a period with sustained economic growth as well as major global changes (e.g., the worldwide financial crisis and the commodity price boom). We do not consider the 1990s because methodological changes make the surveys non-comparable across time.

We use different inequality measures and distributional features based on the OA QR framework. All methods depict a similar picture, thus suggesting that our analysis is robust to the chosen econometric methodology. First, we follow Machado and Mata (2005) (see also Autor et al. 2005; Melly 2005) who proposed techniques to disentangle the effect of the changes of the distribution of the covariates from the effect of changes in the distribution of coefficients—or returns—in accounting for inequality changes. Such decomposition has some limitations in terms of the detailed contributions of each covariate to the total change (see the discussion in Fortin et al. 2011).

Second, we follow Firpo et al. (2009), who develop a useful framework to account for the particular effect of covariates using the recentered influence function (RIF) model. This can be applied to different statistics such as quantiles, variance and Gini or Theil coefficients within an OA framework to disentangle endowment and pricing effects. When this is applied to quantiles, the model is called the unconditional QR model. This method has been applied to study inequality in Argentina by Fabris and Montes-Rojas (2014) for different statistics.

Third, we provide a simple and intuitive formalization of OA decompositions in QR, the random-coefficients decomposition (RCD), based on a random-coefficients representation of QR. We define the pricing effect as the distance between QR and ordinary least-squares (OLS) coefficients, and the endowment effect as the differences in average covariate values evaluated at the corresponding QR coefficients. The proposed methodology is thus centered on the mean-based OA, that is, when we integrate over all the quantiles, we obtain the OLS OA decomposition. Within this framework, quantiles can be seen as the sources of the mean-based OA decomposition. The RCD methodology provides the contribution of each covariate to inequality at different points of the conditional distribution. We argue that an extension of the OA decomposition using the distance of quantile coefficients to the mean-based coefficients adds information not captured by the RIF and Machado–Mata decompositions. In particular, the distance between the mean- and quantile-based OA decompositions can be explained by the corresponding heterogeneity in QR coefficients.

While we focus on changes in individual endowments and characteristics, mainly education and formality, there may be other important factors affecting the wage distribution. In particular, changes in minimum wages and unionization may be additional potential factors with significant effects on inequality.

Regarding minimum wages, many studies have found a significant effect in reducing inequality, although the effect varies depending on the actual real value of the minimum wage, how much it increases, the extent of noncompliance, and whether it is binding (see Belser and Rani 2015, for a discussion). Maurizio and Vázquez (2016) and Messina and Silva (2018, ch.5) provide an extensive comparative review for Latin America. Some examples illustrate the caveats of the analysis. For Brazil, Ferreira et al. (2016a) show that rising minimum wages during the 2003–2012 period contributed in an important way to the drop in wage inequality. However, if they consider the period 1995–2012 the impact on inequality was small. Neumark et al. (2006), on the contrary, find no substantial effect of minimum wages on employment in Brazil. Bosch and Manacorda (2010) focus on income effects and find that income distribution has been substantially improved by minimum wage changes in Mexico.

Messina and Silva (2018, ch.5) provide a review of the effect of unionization in Latin America, which is characterized by a sharp decline in unionization levels. They found that “[m]ost studies find that the effects are pro-equality (except in Brazil) and small (except in Uruguay) and that their magnitude is dwarfed by supply-and-demand factors, transfers, or even the minimum wage” (p. 175). Some evidence also points out that what is important is the collective bargaining process rather than unions themselves, as union-negotiated wages may affect workers who are not unionized through various spillover effects (see also Hayter 2015, for a discussion). The effect of such institutions is not studied in this paper.

The paper is organized as follows. Section 2 applies the quantile regression model to the Oaxaca–Blinder decomposition. Section 3 explains the data. Section 4 applies the proposed method to inequality in Latin America. Section 5 concludes.

2 Quantile regression Oaxaca–Blinder decompositions

2.1 General framework

Let y be an outcome variable (e.g., wages) and X be a \(p \times 1\)-dimensional vector of covariates. The mean and quantile linear regression models are two well-known models to estimate the effect of certain covariates on a response variable.

Mean regression, i.e., OLS, considers the effect of X on y through the conditional mean model

$$\begin{aligned} E(y|X)=X' \beta _M, \end{aligned}$$
(1)

where \(\beta _M\) is a \(p \times 1\)-dimensional vector of coefficients.

In quantile regression (QR), the conditional quantiles of y are of interest through the models

$$\begin{aligned} Q_{y}(\tau |X)=X' \beta (\tau )\;\;\mathrm{for}\;\;\tau \in (0,1), \end{aligned}$$
(2)

where \(Q_{y}(\tau |X)\) is the \(\tau \)-quantile of y|X and \(\beta (\tau )\) the vector of coefficients.

Note that Eq. (2) implies that the right-hand side is monotone increasing in \(\tau \). As stated in Koenker and Xiao (2006), monotonicity in QR determines that a random-coefficients (RC) notation can be introduced by considering a uniform random variable \(u\sim U(0,1)\) in the role of the fixed \(\tau \) and writing

$$\begin{aligned} y=X' \beta (u). \end{aligned}$$
(3)

Thus, the process \(\{y,X\}\) can be partially recovered from the marginal distributions, that is, the conditional distribution y|X can be described by its conditional quantiles based on \(\tau \in (0,1)\).

The QR analysis constructs a model \(y^*=y(X,\tau )\) in which \(y^*\) depends on endowments X and its location in the conditional distribution given by \(\tau \). The linear QR model determines that the coefficients \(\beta (\tau )\) are the pricings of those endowments in the market. As explained in Machado and Mata (2005) in the context of explaining changes in the distribution of wages, “the estimated QR coefficients are also quite interesting as they can be interpreted as rates of return (or ‘prices’) of the labor market skills at different points of the conditional wage distribution” (p. 447).

This decomposition can be extended for comparing two groups, A and B, with realizations \(y_A(X_A,\tau )\) and \(y_B(X_B,\tau )\). Differences across individuals in y are due to differences in X valued at the mean coefficients or to differences in \(\beta \). Following the traditional Oaxaca–Blinder (OA) decomposition analysis (Oaxaca 1973; Blinder 1973; Oaxaca and Ransom 1994), all changes in y can be expressed as the combined effect of X and \(\beta \). Then, differences in y due only to X are defined as the endowment effect. Changes in y due only to \(\beta \) are defined as the pricing effect.

The usual OA decomposition for the mean differences (i.e., OLS based) is:

$$\begin{aligned} \mathrm{OAM}\equiv & {} (\overline{X}_A-\overline{X}_B)' \beta _{MB} + \overline{X}_A'(\beta _{MA}-\beta _{MB}) \nonumber \\\equiv & {} \mathrm{OA1}+\mathrm{OA2}. \end{aligned}$$
(4)

The mean OA decomposition, however, is not informative about the underlying inequality of the outcome variable y. For our purposes, groups A and B correspond to two points in time. OAM, OA1 and OA2 could be 0 even though inequality changed over the period of analysis. The application of QR techniques to the OA decomposition helps in understanding the sources of inequality.
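As an illustration of Eq. (4), the following minimal Python sketch computes OA1 and OA2 from two samples. It is not the estimation code used in the paper: the data are simulated, and the names `ols`, `oaxaca_mean` and `simulate`, as well as all parameter values, are hypothetical choices made for the example. Group B is taken as the reference group, as in Eq. (4).

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on X (X already contains a constant)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def oaxaca_mean(X_a, y_a, X_b, y_b):
    """Mean-based decomposition of Eq. (4) with group B as reference."""
    beta_a, beta_b = ols(X_a, y_a), ols(X_b, y_b)
    xbar_a, xbar_b = X_a.mean(axis=0), X_b.mean(axis=0)
    oa1 = (xbar_a - xbar_b) @ beta_b   # endowment effect
    oa2 = xbar_a @ (beta_a - beta_b)   # pricing effect
    return oa1, oa2

# Simulated samples (assumption): a constant and years of schooling
rng = np.random.default_rng(0)

def simulate(n, mean_school, ret):
    school = rng.normal(mean_school, 3.0, n)
    X = np.column_stack([np.ones(n), school])
    y = 1.0 + ret * school + rng.normal(0.0, 0.5, n)
    return X, y

X_b, y_b = simulate(5000, 9.0, 0.10)    # initial year (group B)
X_a, y_a = simulate(5000, 10.0, 0.08)   # final year (group A)

oa1, oa2 = oaxaca_mean(X_a, y_a, X_b, y_b)
oam = y_a.mean() - y_b.mean()           # mean log-wage gap
```

Because each regression includes a constant, the fitted value at the mean covariates equals the sample mean of y, so `oa1 + oa2` reproduces `oam` up to floating-point error. In this simulated design more schooling yields a positive endowment effect while the lower return yields a negative pricing effect.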

2.2 Machado and Mata (2005)

The Machado and Mata (2005) methodology is based on the estimation of marginal wage distributions consistent with a conditional distribution estimated by QR and with hypothesized distributions for the covariates. The methodology estimates both marginal densities based on QR by year and country, and their counterfactual marginal densities. To obtain such counterfactual marginal densities, the wage density that would have prevailed in the final year (group A) is estimated if all of the covariates had been distributed as in the first year (group B) and were paid as in the final year. Equation (5) shows the wage gap decomposition to estimate. The terms f(y), \(f^*(y)\) and \(f^{**}(y)\) refer to estimators of the marginal density of wages from the observed sample, wages from the estimated sample and wages from the counterfactual scenario, respectively. Here \(\alpha \) represents the summary statistics (such as quantile, scale measure or concentration index) to decompose.

$$\begin{aligned} \alpha (f(y(A))) - \alpha (f(y(B)))= & {} \alpha (f^*(y(A))) - \alpha (f^{**}(y(B);X(A))) \nonumber \\&\quad +\, \alpha (f^{**} (y (B); X (A))) - \alpha (f^*(y (B))) \nonumber \\&\quad +\, \mathrm{Residual} \end{aligned}$$
(5)

where the first term is the endowment effect, while the second one is the pricing effect.

The main drawback of this methodology is that it is not possible to perform a detailed decomposition of the composition effect. As argued by Fortin et al. (2011), “[this] approach does not provide a way of performing the detailed decomposition for the composition effect. This is a major drawback since the detailed decomposition of the composition effect is always interpretable, while the detailed decomposition of the wage structure effect arbitrarily depends on the choice of the omitted group” (p. 63).
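The simulation step behind the counterfactual densities can be sketched as follows. This is a simplified illustration, not the procedure as implemented in the paper: instead of estimating QR coefficients on a grid of quantiles, it assumes that the coefficient process of the final year is already available through a hypothetical function `beta_a(u)`, and the covariate samples are simulated. The statistic decomposed here, the difference between the 90th and 10th percentiles, is one possible choice of \(\alpha \).

```python
import numpy as np

rng = np.random.default_rng(1)

def beta_a(u):
    """Hypothetical QR coefficient process for the final year (group A).
    In an application these would be QR estimates at the drawn quantiles."""
    # Intercept and schooling return both increase with the quantile index u
    return np.column_stack([0.5 + 1.0 * u, 0.06 + 0.04 * u])

def simulate_marginal(X, beta_fn, m, rng):
    """Draw m wages: resample covariate rows, draw u ~ U(0,1), set y = x'beta(u)."""
    rows = rng.integers(0, X.shape[0], size=m)
    u = rng.uniform(0.0, 1.0, size=m)
    return np.sum(X[rows] * beta_fn(u), axis=1)

def covariates(n, mean_school):
    """Simulated covariates (assumption): constant and non-negative schooling."""
    school = np.clip(rng.normal(mean_school, 3.0, n), 0.0, None)
    return np.column_stack([np.ones(n), school])

X_a = covariates(4000, 10.0)   # final-year covariates
X_b = covariates(4000, 9.0)    # first-year covariates

y_fitted = simulate_marginal(X_a, beta_a, 20000, rng)    # A covariates, A prices
y_counter = simulate_marginal(X_b, beta_a, 20000, rng)   # B covariates, A prices

def p90_minus_p10(y):
    q90, q10 = np.quantile(y, [0.9, 0.1])
    return q90 - q10

# Part of the change in the statistic attributed to the covariate distribution
covariate_term = p90_minus_p10(y_fitted) - p90_minus_p10(y_counter)
```

Both simulated samples use the same coefficient process, so any difference between them comes only from the distribution of the covariates. The remaining term of the decomposition would compare `y_counter` with a sample simulated from first-year covariates and first-year coefficients.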

2.3 RIF regression

The RIF regression methods allow one to perform detailed decompositions for any distributional statistic for which an influence function can be estimated, such as quantiles of the unconditional distribution, the variance of wages and the Gini coefficient. According to Firpo et al. (2009), a RIF regression is similar to a standard regression except that the dependent variable, y, is replaced by the recentered influence function (RIF) of the statistic of interest. Formally, \(IF(y;v)\) represents the influence function corresponding to an observed y for the distributional statistic of interest \(v(F_y)\). So, the RIF can be written as \(RIF(y;v)=v(F_y)+IF(y;v)\) and the conditional expectation of the RIF is given by \(E[RIF (y;v)|X]=X'\gamma \).

In the QR framework, the influence function \(IF(y; Q_\tau )\) is given by \((\tau -1[y\le Q_\tau ])/f_y(Q_\tau )\), where 1[.] is an indicator function, \(f_y(.)\) is the density of the marginal distribution of y and \(Q_\tau \) is the population \(\tau \)-quantile of the unconditional distribution of y. The RIF regression allows for an OA decomposition with a corresponding endowment and pricing effect:

$$\begin{aligned} \mathrm{OA}_{\mathrm{RIF}}(\tau )\equiv (\overline{X}_A-\overline{X}_B)' \gamma _{B}(\tau ) + \overline{X}_A'(\gamma _{A}(\tau )-\gamma _{B}(\tau )). \end{aligned}$$
(6)

Note that while the influence function has zero mean, so that the RIF averages to the statistic of interest, \(E_\tau [\gamma (\tau )],\tau \sim U(0,1)\) is not necessarily equal to \(\beta _M\). The distributional statistics of interest here are the unconditional quantiles of y. As such, the RIF methodology does not allow for a comparison of the mean-based OA and the quantile-based decompositions.
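A RIF regression for an unconditional quantile can be sketched in a few lines of Python. The sketch uses the standard quantile influence function of Firpo et al. (2009), \((\tau -1[y\le Q_\tau ])/f_y(Q_\tau )\). The Gaussian kernel, the rule-of-thumb bandwidth, the simulated data and the function name `rif_quantile` are assumptions made for the example and need not match the choices behind the reported estimates.

```python
import numpy as np

def rif_quantile(y, tau, bandwidth=None):
    """RIF(y; Q_tau) = Q_tau + (tau - 1[y <= Q_tau]) / f_y(Q_tau)."""
    q = np.quantile(y, tau)
    n = y.size
    if bandwidth is None:
        # Rule-of-thumb bandwidth (assumption; other consistent choices work)
        bandwidth = 1.06 * y.std(ddof=1) * n ** (-0.2)
    # Gaussian kernel density estimate of f_y evaluated at the sample quantile
    z = (q - y) / bandwidth
    f_q = np.exp(-0.5 * z**2).sum() / (n * bandwidth * np.sqrt(2.0 * np.pi))
    return q + (tau - (y <= q)) / f_q

# Simulated log wages (assumption): constant plus years of schooling
rng = np.random.default_rng(2)
n = 5000
school = rng.normal(10.0, 3.0, n)
X = np.column_stack([np.ones(n), school])
y = 1.0 + 0.09 * school + rng.normal(0.0, 0.5, n)

tau = 0.9
rif = rif_quantile(y, tau)
# RIF regression: OLS of the RIF on the covariates gives gamma(tau)
gamma = np.linalg.lstsq(X, rif, rcond=None)[0]
```

Since the regression includes a constant, \(\overline{X}'\gamma (\tau )\) equals the sample mean of the RIF, which in turn is approximately the sample \(\tau \)-quantile. This is the property that allows the OA-type decomposition in Eq. (6) to be applied to the change in an unconditional quantile.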

2.4 Random-coefficients endowment and pricing effects

A general RC formulation for y can be obtained by defining a random vector \({\beta }\in \mathcal {B}\), where \(\mathcal {B}\) is the space of \(p\times 1\) random real-valued vectors (see Montes-Rojas et al. 2017, for a discussion). Then,

$$\begin{aligned} y=X' \beta . \end{aligned}$$
(7)

Model (2) is a special case of (7) in which \(X'{\beta }\) is monotone increasing on some common index. Note that because of monotonicity of \(\beta (\tau )\), the order statistics of \({\beta }\) correspond to the QR coefficients. That is, \(Q_{\beta }(\tau )=\beta (\tau ),\;\tau \in (0,1)\). This means that, for instance, the “median” value of \(\beta \) is exactly that of \(\beta (0.5)\). Thus, analyzing the marginal effects of covariates on different quantiles of the conditional distribution of y|X is equivalent to analyzing the quantiles of the linear effects of X on y.

Following Koenker and Xiao (2006), we rewrite (2) as

$$\begin{aligned} y=\bar{\beta }_0 +x_{1} \beta _1+\cdots +x_{p} \beta _p+w, \end{aligned}$$
(8)

where \(\bar{\beta }_0=E[\beta _0]\) and \(w=\beta _0-\bar{\beta }_0\). This means that for the OLS model we have

$$\begin{aligned} E(y|X)=\bar{\beta }_0 +x_{1} E[\beta _1]+...+x_{p} E[\beta _p], \end{aligned}$$
(9)

and then the mean regression coefficients can be obtained by taking the “mean” of the quantile coefficients. Thus, \(\beta _M=E[\beta ]\).

Then, combining mean and QR models we could rewrite (7) as

$$\begin{aligned}&y-\overline{X}' \beta _M=(X-\overline{X})' \beta _M + X' (\beta -\beta _M)\nonumber \\&\quad \equiv (\text{ Endowment } \text{ effect })+(\text{ Pricing } \text{ effect }). \end{aligned}$$
(10)

The pricing effect can be analyzed by studying the quantile process \(\{\beta (\tau ),\tau \in (0,1)\}\) together with the mean regression coefficients \(\beta _M\). Then, we can apply an endowment and pricing decomposition at particular quantiles of the conditional distribution of y. Then, for different values of X and quantiles \(\tau \), we can evaluate \(y(X,\tau )\) as

$$\begin{aligned}&y(X,\tau )-\overline{X}' \beta _M=(X-\overline{X})' \beta _M + X' (\beta (\tau )-\beta _M)\nonumber \\&\quad \equiv (\text{ Endowment } \text{ effect })+(\text{ Pricing } \text{ effect }(\tau )). \end{aligned}$$
(11)

In this case, the difference in pricing at \(\tau \) with respect to the mean pricing, i.e., \((\beta (\tau )-\beta _{M})\), is of interest. Inference on this object requires the joint consideration of the OLS and QR estimators. This decomposition was first proposed by Autor et al. (2005) following the analysis of Machado and Mata (2005), replacing \(\beta _M\) with \(\beta (0.50)\), where the second term was referred to as within-group inequality and also as prices. Equation (11) was thus proposed as a tool to evaluate changes in inequality.

Consider the evaluation at \(X_A=\overline{X}_A,X_B=\overline{X}_B\). Furthermore, assume that groups A and B are realizations of the same process with differences in endowments and pricings. Then, after some algebra,

$$\begin{aligned} \mathrm{OAq}\equiv & {} y_A(\overline{X}_A,\tau )-y_B(\overline{X}_B,\tau ) \nonumber \\= & {} \overline{X}_A'\beta _A(\tau )-\overline{X}_B'\beta _B(\tau ) \nonumber \\= & {} \overline{X}_A'\beta _A(\tau )-\overline{X}_B'\beta _B(\tau )+\overline{X}_A'\beta _B(\tau )-\overline{X}_A'\beta _B(\tau ) \nonumber \\= & {} (\overline{X}_A-\overline{X}_B)'\beta _B(\tau ) + \overline{X}_A'(\beta _A(\tau )-\beta _B(\tau )) \nonumber \\\equiv & {} \mathrm{OAq}1 +\mathrm{OAq}2, \end{aligned}$$
(12)

which corresponds to an Oaxaca decomposition for a particular quantile \(\tau \). OAq1 corresponds to the effect that differences in endowments have on that particular \(\tau \)-quantile. OAq2 represents differences in prices at that particular quantile. This is the random-coefficients decomposition (RCD).

Note that if we integrate out \(\tau \), we obtain the usual OA decomposition at the mean, that is,

$$\begin{aligned} \mathrm{OAM}\equiv & {} E_\tau [y_A(\overline{X}_A,\tau )-y_B(\overline{X}_B,\tau )] \nonumber \\= & {} (\overline{X}_A-\overline{X}_B)' \beta _{MB} + \overline{X}_A'(\beta _{MA}-\beta _{MB}) \nonumber \\\equiv & {} \mathrm{OA}1+\mathrm{OA}2, \end{aligned}$$
(13)

where \(\beta _{MB}\) is the average pricing of group B. This determines that the OA decomposition can be seen as arising from differences between the groups at different quantiles of the conditional distribution of y.

Furthermore, adding and subtracting \(\overline{X}_A'\beta _{MA}\), \(\overline{X}_B'\beta _{MB}\), \(\overline{X}_A'\beta _{MB}\), and rearranging, we obtain

$$\begin{aligned} y_A(\overline{X}_A,\tau )-y_B(\overline{X}_B,\tau )= & {} \mathrm{OAM}+\overline{X}_A' (\beta _A(\tau )-\beta _{MA})- \overline{X}_B' (\beta _B(\tau )-\beta _{MB}) \nonumber \\= & {} \mathrm{OAM}+\mathrm{OAq}3-\mathrm{OAq}4. \end{aligned}$$
(14)

The first term corresponds to the usual (mean) OA decomposition in Eq. (4), OAM. The second term is the pricing effect of group A for that particular quantile, evaluated at the mean endowments of that group (OAq3). The third term is the pricing effect of group B for that particular quantile, also evaluated at the mean endowments of that group (OAq4).
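The two ways of writing the quantile wage gap, Eqs. (12) and (14), can be checked numerically. The Python sketch below uses purely hypothetical mean endowments and coefficient vectors (a constant, years of schooling and a formal-job dummy); none of these numbers are estimates from the paper. In an application the coefficients would come from OLS and QR fits for each year.

```python
import numpy as np

# Hypothetical inputs: [constant, years of schooling, formal-job dummy]
xbar_a = np.array([1.0, 10.2, 0.65])         # mean endowments, final year (A)
xbar_b = np.array([1.0, 9.4, 0.55])          # mean endowments, initial year (B)
beta_ma = np.array([0.60, 0.080, 0.20])      # OLS coefficients, group A
beta_mb = np.array([0.40, 0.095, 0.25])      # OLS coefficients, group B
beta_a_tau = np.array([0.30, 0.070, 0.28])   # QR coefficients at tau, group A
beta_b_tau = np.array([0.05, 0.082, 0.36])   # QR coefficients at tau, group B

# Eq. (12): wage gap at quantile tau and its endowment and pricing effects
oaq = xbar_a @ beta_a_tau - xbar_b @ beta_b_tau
oaq1 = (xbar_a - xbar_b) @ beta_b_tau
oaq2 = xbar_a @ (beta_a_tau - beta_b_tau)

# Eqs. (4) and (13): mean-based decomposition
oam = (xbar_a - xbar_b) @ beta_mb + xbar_a @ (beta_ma - beta_mb)

# Eq. (14): deviation of quantile pricing from mean pricing, group by group
oaq3 = xbar_a @ (beta_a_tau - beta_ma)
oaq4 = xbar_b @ (beta_b_tau - beta_mb)

# Per-covariate contributions to the quantile endowment and pricing effects
oaq1_by_covariate = (xbar_a - xbar_b) * beta_b_tau
oaq2_by_covariate = xbar_a * (beta_a_tau - beta_b_tau)
```

Both `oaq1 + oaq2` and `oam + oaq3 - oaq4` reproduce `oaq`, since each is an algebraic rearrangement of the same difference. The element-wise products show how the contribution of a single covariate, such as education, is read off the decomposition.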

3 Data

The results in this paper are based on data from household surveys. For Argentina, we use the Permanent Household Survey (EPH) implemented by the National Institute of Statistics and Censuses (INDEC). For Brazil, we consider the National Household Survey (PNAD) conducted by the Brazilian Institute of Geography and Statistics (IBGE). For Colombia, we use the Continuous Household Survey (ECH) and, since 2006, its successor, the Integrated Household Survey (GEIH), both carried out by the National Institute of Statistics (DANE). For Mexico, we consider the Urban Employment Household Survey (ENEU) and, since 2005, its successor, the Occupation and Employment Household Survey (ENOE), implemented by the Official Statistics Office (INEGI).

Our sample covers the years 2002, 2008 and 2014 for all countries except Argentina, where we use the EPH from 2003 (due to a substantial methodological change in 2003). We do not consider previous years because household surveys changed their methodology at the beginning of the 2000s, i.e., changes in definitions, coverage and data collection methods made the surveys not strictly comparable. We select 2008 in order to decompose the changes in inequality before and after the worldwide financial crisis. The sample of workers for econometric estimation is selected according to the following criteria: salaried male urban workers aged 18–65, not in agricultural jobs. The real hourly wage is defined as the nominal wage obtained in the last month divided by the number of hours worked, deflated by the consumer price index in each country. As covariates, we consider age, educational attainment, potential experience, marital status, firm size, city of residence, affiliation to social security and written contract.

Educational attainment corresponds to years of formal education. Potential experience is a proxy variable obtained by subtracting the years of education plus 6 from the age of the individual. Marital status and firm size refer to dummy variables indicating whether the individual is married and whether the worker belongs to a firm that employs more than eleven employees, respectively. Affiliation to social security and written contract are the two informality measures, which indicate whether the individual is covered by a health affiliation and whether the worker has a written contract, respectively. We study informal employment according to the “legal” approach stated by the ILO. In this case, the ILO defines that “employees are considered to have informal jobs if their employment relationship is, in law or in practice, not subject to national labor legislation, income taxation, social protection, or entitlement to certain employment benefits for specific reason.” We consider the following urban areas: Buenos Aires, Córdoba, Rosario, Mendoza and Tucumán in Argentina, São Paulo and Rio de Janeiro in Brazil, Bogotá, Medellín, Barranquilla, and Cali in Colombia, and Mexico D.F., Guadalajara, Monterrey and Puebla in Mexico. For the econometric estimation, we discard observations of individuals whose information on key variables is reported as missing or outside the coding provided by the INDEC, IBGE, DANE and INEGI.

Table 1 Hourly wage inequality

4 Inequality in Latin America 2002–2014

4.1 Stylized facts

Table 1 presents the recent evolution of wage inequality using three standard measures for hourly wages: the Gini coefficient, the Theil index and the ratio of the 90th to the 10th percentile of the hourly wage distribution. According to the results, Brazil and Colombia appear as the most unequal countries in the sample, followed by Mexico and Argentina. The table reports a fall in wage inequality for all countries and periods except for Brazil in 2008–2014. During the first subperiod, the greatest decline in inequality took place in Argentina (change in Gini − 0.043) and Brazil (− 0.043), with small changes in Mexico (− 0.002) and Colombia (− 0.016). The last two are not statistically significant.

In the next subperiod (2008–2014), Colombia has the highest inequality decline (− 0.039) followed by Argentina (− 0.025), for which we have the largest inequality change over the whole period of analysis. Brazil seems to have a large reversal of its inequality trend (0.052), which actually erases all gains in inequality reduction. Finally, Mexico shows sustained but small changes across periods. These inequality trends are confirmed by the Theil index and P9010 measures too.
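For reference, the three inequality measures can be computed as in the following Python sketch. The formulas are the standard sample versions of the Gini coefficient, the Theil T index and the percentile ratio; the function names and the simulated log-normal wages are assumptions made for the example, and the sketch ignores survey weights, which an application to household surveys would normally use.

```python
import numpy as np

def gini(w):
    """Gini coefficient of a positive wage vector (sorted-rank formula)."""
    w = np.sort(np.asarray(w, dtype=float))
    n = w.size
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * w) / (n * w.sum()) - (n + 1.0) / n

def theil(w):
    """Theil T index: mean of (w / mu) * log(w / mu)."""
    w = np.asarray(w, dtype=float)
    r = w / w.mean()
    return np.mean(r * np.log(r))

def p90_p10(w):
    """Ratio of the 90th to the 10th percentile."""
    q90, q10 = np.quantile(w, [0.9, 0.1])
    return q90 / q10

# Simulated hourly wages (assumption): log-normal with log-sd 0.6
rng = np.random.default_rng(3)
wages = np.exp(rng.normal(1.0, 0.6, 10000))

measures = {"gini": gini(wages), "theil": theil(wages), "p9010": p90_p10(wages)}
```

The Gini and Theil measures are zero when all wages are equal and invariant to rescaling wages by a constant, so the choice of deflator base year does not affect them within a given survey year.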

Fig. 1 Wage change by percentiles

In order to explore the distributional features associated with the inequality decline, we estimate the median wage by percentile at the beginning and at the end of the period and we plot the wage gap in Fig. 1. According to the results, wages improved at the lower part (relative to the upper part) of the wage distribution for all countries. These patterns imply a decline in wage inequality for the four countries, since the range between the lower and upper ends of the wage distribution narrows. Beccaria et al. (2015), Wang et al. (2016) and Esquivel et al. (2010) report this pattern for Argentina, Brazil and Mexico until 2012, respectively. The improvement of wages at the lower part could be related to several factors. We explore these factors using Oaxaca–Blinder decomposition methods.

Table 2 Oaxaca mean-based decompositions

4.2 Aggregate Oaxaca–Blinder decomposition

Table 2 reports the aggregate and the detailed OA decomposition comparing 2008 with 2002 (2003 for Argentina) and then 2014 with 2008. These estimates provide a measure of the overall changes but not on the distributional effects. Following the definition above, the term OAM refers to the mean wage gap, OA1 to the endowment effect and OA2 to the pricing effect.

According to the aggregate decomposition (first panel), Argentina and Brazil show the largest real hourly wage increment over the whole period and in each subperiod. Argentina’s wage increments are striking: 59.2% in 2002–2008 and 88% for 2008–2014. As mentioned above, the case of Argentina cannot be taken at face value because official prices were systematically undervalued in the second half of the 2000s. Although there is no consensus on the adequate price level adjustment, the figures certainly point to a decade of sustained growth in hourly wages. Brazil’s values are also significant: a 14.9% increase in the first subperiod and 27.5% in the second. Colombia also has positive but modest increments in hourly wages, 16.3% for 2002–2008 and 13.1% for 2008–2014. Mexico, on the other hand, has a negative overall effect, with a positive increment in 2002–2008 (6.5%) and a negative change in 2008–2014 (− 12.75%). These changes in average wages were mainly a result of the pricing effect during the whole period in Argentina and Mexico and only after the financial crisis in Brazil and Colombia.

The OA decomposition of the wage gap along the distribution is presented in Tables 3 and 4 for Machado and Mata, RIF and RCD, respectively. In each case, the tables report the wage gap at a particular \(\tau \)-quantile together with the endowment and pricing effects. All three methodologies show similar patterns. The reduction in inequality is the result of relative improvements in lower quantiles as compared to higher quantiles. This is the case of Argentina, Colombia and Mexico (both periods) and Brazil (2002–2008). For Brazil, for which we observed an increase in inequality for 2008–2014, we have relative improvements in central quantiles as compared to extreme ones. Results of the overall decomposition in the first subperiod are consistent with previous findings for Argentina and Brazil in Beccaria et al. (2015), Wang et al. (2016), and Ferreira et al. (2016b).

Table 3 Machado and Mata and RIF aggregate decomposition

The monotone improvement in the distribution of wages is mostly the result of the pricing effect for Argentina and Brazil during the whole period and for Colombia only during 2008–2014. The endowment effect has a differential impact across countries: while for Argentina lower quantiles have larger endowment effects, this is not the case for Brazil, Colombia and Mexico. Table 4 also reports the decomposition terms OAq3 and OAq4. These terms are the pricing effects in a particular \(\tau \)-quantile evaluated at the mean endowments of each group. By adding the term OAq3 and subtracting the term OAq4 from the OAM, we can recover the wage gap at quantile \(\tau \) (OAq). OAq3 reports coefficient differences at the end of the period (either 2008 or 2014), and OAq4 those at the beginning (either 2002 or 2008). The decomposition reveals that the 2002–2008 comparison was characterized by a uniform reduction in the coefficients’ dispersion, a pattern that goes in the same direction for the 2008–2014 comparison. Thus, the improvement in overall inequality given by the aggregate OA decompositions observed for Argentina, Colombia and Brazil can be seen as a result of a reduction in pricing heterogeneity.

Table 4 Oaxaca random-coefficients quantile aggregate decompositions

4.3 Covariate decomposition

In order to study the role played by covariates in the wage gap change, we present in the lower panels of Table 2 the contribution of education, health coverage and written labor contract using Oaxaca–Blinder mean decompositions. This shows the share of each covariate in explaining aggregate changes. While we cannot infer from it their effect on inequality, these shares inform us about the magnitude of the changes, whose impact might differ along the distribution of wages. According to the decomposition at the mean, both education and health coverage have a positive composition effect on average wages for all periods in almost all countries. On the other hand, the wage structure effect seems to reduce mean wages.

To study the detailed decomposition of covariates along the distribution, we explore two methods: the RIF regression decomposition applied to the unconditional quantiles and the Gini coefficient (Tables 5 and 6, respectively), and RC OA decompositions (Tables 7 and 8 for education and health coverage). Results for the written contract are presented in the supplementary material, available from the authors upon request.

4.3.1 Education

As for the OA decomposition, education has an overall negative and significant effect for Argentina and Brazil, not statistically significant for Colombia and Mexico. In all cases, however, the composition effect is positive and the pricing effect is negative. While the former reveals the improvement in schooling level in the region, the latter shows that returns to education have fallen on average for all countries. The first supports evidence in favor of the paradox of progress previously documented by Bourguignon et al. (2005), Wang et al. (2016) and Ferreira et al. (2016b).

Table 5 Oaxaca RIF quantile decompositions—covariates

This fact is confirmed when we plot QR education coefficients along the quantile process. Education returns are increasing along the quantiles, but they decreased over time for all percentiles and countries (see Fig. 2). The QR analysis in Tables 5 and 7 reveals, however, that (except for Mexico) most of the composition effect benefits the upper quantiles more than the lower quantiles, thus increasing wage inequality. The pricing effects, in contrast, differ across countries.

In Argentina, for the first subperiod, all quantiles show a negative effect, larger (in absolute value) for \(\tau =0.10\) and \(\tau =0.90\) than for central quantiles; for the second subperiod it is positive for lower quantiles and negative for higher quantiles, thus reducing inequality. In fact, Argentina 2008–2014 is the only case in which the education pricing effect explains the reduction in inequality.

In Brazil, Colombia and Mexico, for both subperiods, both the endowment and the pricing effects increase inequality. An analysis of the terms OAq3 and OAq4 also shows that for these countries, the gap between the quantile coefficient and the mean coefficient decreased below the median but increased above the median. That is, \(\mathrm{OAq3}(\tau )<\mathrm{OAq4}(\tau )\) for \(\tau <0.50\) but \(\mathrm{OAq3}(\tau )>\mathrm{OAq4}(\tau )\) for \(\tau >0.50\). As a result, overall dispersion in pricing effects increased, and this has enhanced inequality.

The RIF regression decomposition of the effect of education on the Gini coefficient shows a different picture (Table 6). While pricing effects are negative and significant for Argentina (thus reducing inequality), their effect is not clear for the other countries. In fact, the education pricing effect goes in the opposite direction to the total change for Brazil (2002–2008 and 2008–2014). In all cases, endowment effects increase inequality.

Overall, while education plays a significant role in explaining wage changes across time, its effect on inequality seems to go in the opposite direction to the overall inequality reduction over the period of analysis. If anything, the explanation for that reduction lies elsewhere.
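The inequality-increasing endowment effect of education (the paradox of progress) can be illustrated with a stylized numerical example. The numbers below are entirely hypothetical: two equally sized groups of workers, a convex log-wage schedule with an assumed curvature of 0.01, and an educational upgrading that both raises schooling and narrows its dispersion.

```python
import numpy as np

def log_wage(schooling, curvature=0.01):
    """Hypothetical convex returns: log wage grows with the square of schooling."""
    return curvature * schooling ** 2

# Two equally sized groups of workers, years of schooling (hypothetical).
school_before = np.array([0.0, 8.0])
school_after = np.array([6.0, 12.0])   # everyone gains; the gap narrows from 8 to 6 years

sd_school_before = school_before.std()
sd_school_after = school_after.std()

var_wage_before = log_wage(school_before).var()
var_wage_after = log_wage(school_after).var()

print("schooling sd:", sd_school_before, "->", sd_school_after)
print("log-wage variance:", var_wage_before, "->", var_wage_after)
```

In this example schooling becomes more equally distributed while the variance of log wages rises, because with convex returns the same year of schooling is worth more at higher education levels. Returns are held fixed here, so the increase is a pure composition effect.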

Table 6 Oaxaca RIF quantile decompositions—covariates—Gini
Table 7 Oaxaca random-coefficients quantile decompositions—education
Table 8 Oaxaca random-coefficients quantile decompositions—health coverage
Fig. 2 Returns to education, health coverage and written contract. Quantile regression coefficients

4.3.2 Formality: health coverage and written contract

Returns to the two broad measures of formality, health coverage and written contract, are decreasing along the quantiles, thus indicating that they favor relatively more the workers in the lower part of the distribution. That is, although labor market institutions in Latin American countries, a region of high informality rates, are a privilege of the highly skilled and highly paid workers, they are most likely to have a positive effect on (conditionally) underprivileged workers. There is, however, no clear pattern across time, so their effect on inequality needs to be studied.

The RIF decomposition provides a useful picture. For 2002–2008, both health coverage and written contract reduce inequality. The largest effect corresponds to the pricing factor, which is in fact larger than the total effect, except for Mexico. Results are in line with Beccaria et al. (2015), Wang et al. (2016) and Ferreira et al. (2016b). For 2008–2014, the formality effect continues to reduce inequality only for Argentina, while it is large but inequality enhancing for Brazil, Colombia and Mexico.

The same effects are also studied using the quantile decompositions. Table 8 shows the quantile effects decomposition for health coverage. The results reveal large heterogeneity across formality measures and countries. For health coverage, the decomposition suggests a large effect of the pricing factor for 2002–2008 that reduces inequality. After 2008, the pricing factor reduces inequality only for Argentina.

For Argentina, the aggregate decomposition OAq shows that health coverage reduces inequality in the 2002–2008 period when comparing \(\tau =0.10\) with \(\tau =0.90\), but \(\tau =0.50\) has the lowest effect. The OAq3 and OAq4 show that this is related to a general reduction in the pricing heterogeneity across quantiles. For 2008–2014, however, upper and central quantiles have a positive aggregate OAq effect, but \(\tau =0.10\) shows a negative effect. Written contract, on the contrary, increases inequality for 2002–2008, but reduces inequality in 2008–2014. For Brazil, both health coverage and written contract reduce inequality in both subperiods. In this case, we do not observe a reduction in price heterogeneity, but a consistent negative effect in the upper quantiles. For Colombia, health coverage increases inequality but written contract reduces it. Mexico shows the smallest effects, but in general, both measures of formality increase inequality.

According to the decomposition of the Gini coefficient in Table 6, the net effect of the composition and structure terms of health coverage tends to reduce inequality in Argentina during the whole period and in Colombia during 2002–2008. Written contract seems to reduce inequality in Argentina, Brazil and Colombia mainly during the first period. In the last period, it tends to reduce inequality only in Argentina and Mexico.

5 Conclusions

This study provides a systematic analysis of inequality patterns in Latin America for the period 2002 to 2014. We focus on Argentina, Brazil, Colombia and Mexico by studying the distributional effects of schooling and job formality on the wage distribution. Based on three methodologies derived from the quantile regression framework (i.e., Machado and Mata, RIF regression and random-coefficients model), we perform an aggregate and a detailed decomposition of the pricing and endowment effects of education and job formality on the wage gap at several quantiles.

First, our results show that during the 2002–2008 period, wage inequality drops for all countries, with the strongest decreases in Argentina and Brazil. For the most recent period (2008–2014), inequality also falls for all countries except for Brazil, where there is a strong reversal. These patterns are robust to different inequality measures. Changes in the wage gap at quantiles reveal that the reduction in inequality is explained by the improvements in the lower part of the wage distribution for all countries. This pattern persists, with a smaller effect, in the period after the world financial crisis for almost all countries.

Second, the decomposition results at the mean suggest that pricing effects explain most of the wage gap during the whole period in the case of Argentina and Mexico, but only in the last period for Brazil and Colombia. Results from the decomposition of the wage gap along the hourly wage distribution by the three methodologies show that the monotone improvement in the distribution of wages is mostly the result of the pricing effect during 2002–2014 for Argentina and Brazil and only during 2008–2014 in Colombia. The endowment effect, however, has a differential impact across countries. The terms OAq3 and OAq4 of the RCD reveal a uniform reduction in the dispersion of coefficients, suggesting an important role of the reduction in pricing heterogeneity in overall inequality.

Finally, decomposition results for education reveal that most of the composition effect benefits the upper quantiles more than the lower quantiles, increasing wage inequality (evidence in favor of the so-called paradox of progress), while pricing effects differ across countries. Results for the two broad measures of formality, health coverage and written contract, indicate that they favor relatively more the workers in the lower part of the distribution. For both measures, the decomposition suggests a large effect of the pricing factor for 2002–2008 that reduces inequality. After 2008, the pricing factor reduces inequality only for Argentina.

As mentioned in the Introduction, we do not study the potential impact of minimum wages and unionization on inequality, which have been noted in the literature as significant factors; rather, we focus on changes in individual endowments and characteristics, mainly education and formalization status. Further empirical analysis would be required to evaluate how these factors interact with our microlevel quantile regression decompositions.

The main policy implication is that governments should pay more attention to the implementation of economic policies toward job formalization. According to the results, an improvement in the labor conditions of the most disadvantaged workers, e.g., affiliation to social security, reduces inequality in a scenario of economic growth. Such policies generally yield not only better incomes but also greater stability and well-being. Therefore, the design of such policies should consider an economic scenario where the negative effects on inequality of the paradox of progress would be offset by the benefits of formalization.