1 Introduction

Though the thought of sacrificing the well-being of the least fortunate in a group is repugnant to most, the response to The Lottery (Jackson 1948) being a prominent example, developing a rigorous method for determining a desirable degree of redistribution is not trivial. The introduction of heterogeneity into macroeconomic models raises important questions about the proper welfare criterion and the definition of efficiency. The goal of the present paper is to develop a simple model of redistributive taxation and examine the associated political economy issues, particularly Rawls’ conceptions of the Original Position and max-min welfare criterion.

The present work examines an environment where wages are heterogeneous due to differences in worker productivity, and the government collects income taxes via a flat marginal rate and redistributes lump sum transfers to all agents, i.e. a negative income tax or basic income plan. There is an extensive margin in labor supply so the income tax may drive some low productivity workers out of the labor force.

The primary contribution of the paper is the interpretation of a formal comparison of optimal redistribution regimes under the utilitarian (Benthamite) welfare criterion and the Rawlsian max-min criterion with minimal assumptions about household preferences. Rawls (1971) advocates for the max-min criterion, where the outcome of the least fortunate member of society is maximized. He argues that the max-min criterion is the rational response of a person behind the “veil of ignorance,” a key aspect of the Original Position,Footnote 1 meaning she does not know the realization of personal circumstances when deciding fair and just social relations. In the present environment, the question becomes one of the optimal tax regime when households do not know their level of productivity.

While most of the preliminary results have counterparts in the existing literature, the most important result is Proposition 6, which shows that the degree of redistribution satisfying the max-min criterion is necessarily greater than the one that maximizes aggregate utility in an environment with minimal assumptions about preferences. The model formalizes the objection of some philosophers, such as Nagel (1973) and Scanlon (1973), to the max-min criterion that rational agents behind the veil might be willing to risk a bad outcome. Rawls’ Original Position is a valuable concept, but his argument about the rationality of the max-min welfare criterion is suspect.

It is common to define welfare using a concave transformation across utilities, where the concavity represents a societal preference for equality above that given by the utilitarian criterion, for example Stiglitz (1987). With such a transformation, the max-min criterion coincides with welfare maximization in a limiting case with respect to the concavity of the transformation.

We eschew such transformations representing a desire for equality beyond what is implied by diminishing marginal utility. Whether society-wide agreement about a preference for equality is possible or meaningful is questionable, and the associated transformations are necessarily arbitrary. Though some undoubtedly have such a preference, subjective elements of a model limit its relevance. In the present environment without such a transformation, where welfare is simply the sum or expected value of the utilities across the population, the max-min criterion coincides with the tax rate at the peak of the Laffer Curve.

The basic income regime is not subject to some common criticisms of redistribution. Under the linear (flat rate) tax, workers have no incentive to take lower wage jobs to avoid taxes. Also, the lump sum transfer is given to all so there is no distortion of labor supply decisions. Under decreasing marginal utility, redistribution is desirable since the gain to low wage workers outweighs the loss for high wage workers.

The presence of an extensive margin in labor supply breaks the equivalence between welfare maximization and the usual allocational efficiency condition. In the present model, the usual allocational efficiency condition is obtained when the tax rate is zero and there is no redistribution, as long as households choose a positive labor supply at any wage, a questionable assumption.

Individually, the elements of the present framework are not novel, but the combination and the focus on the interpretation of the welfare criteria are. These results agree with those found in the standard optimal linear tax model, summarized in Piketty and Saez (2013), but the model here relies only on standard household utility preferences, not on diminishing marginal welfare across utilities on the part of the social planner.Footnote 2 Optimal flat rate or linear tax regimes are studied in Stiglitz (1985) and Dixit and Sandmo (1977). The extensive labor supply margin is included in the models in Mirrlees (1971) and Diamond (1980). Atkinson (1973) analyzes optimal taxation under the max-min criterion. Manski (2014) also considers simple tax regimes in a model with government production that affects consumption and productivity. He discusses informally some of the issues around efficiency in the present work.

The paper is organized as follows. Section 2 describes the elements of the model and some of the desirable properties of the basic income regime. Section 3 discusses the formalization of the welfare criteria. Section 4 reports the primary formal results of the paper related to the optimal tax rate. It also introduces a particular calibration of the model to show the quantitative relevance of the results and their relation to the literature on income distribution. Section 5 concludes.

2 The model

The key elements of the model are heterogeneous productivity and the extensive margin in labor supply. A unit mass of households indexed by \(i\) have heterogeneous productivity \(a_{i}\) with single-peaked density \( a_{i}\sim f\left (a_{i}\right ) \in \left (0,\infty \right ) \) and associated distribution \(F\left (a_{i}\right ) \). Labor markets are competitive so real wages \(w_{i}\) are determined by productivity such that \(w_{i}=a_{i}\). Labor is the only input in the model. Each household supplies quantity of labor \(h_{i}\) which produces output \(a_{i}h_{i}\), the production function for a single input with constant returns to scale. Aggregate production \(Y\) is therefore

$$ Y=\int\nolimits_{0}^{\infty }a_{i}h\left( a_{i}\right) dF\left( a_{i}\right) . $$
(1)

Households receive wage income, pay taxes at a marginal rate of τ and receive a lump sum transfer T = τ Y. The household budget constraint is

$$ c\left( a_{i}\right) =a_{i}h\left( a_{i}\right) \left( 1-\tau \right) +\tau Y $$
(2)

where \(c\left (a_{i}\right ) \) is household consumption. Both consumption and hours worked are functions of an individual’s productivity \(a_{i}\). Since there is a unit mass of agents, the values \(c\left (a_{i}\right ) \), \(h\left (a_{i}\right ) \), \(Y\) and \(T\) can be interpreted as per capita values. For positive tax rates τ > 0, all households have a minimum income τ Y, which requires redistribution in that taxes reduce income for workers with sufficiently high productivity, implementing a basic income plan or negative income tax, advocated by Milton FriedmanFootnote 3 among others.

Households maximize utility \(U^{i}\left (c\left (a_{i}\right ), h\left (a_{i}\right ) \right ) \) that is a function of consumption and leisure. We make minimal assumptions about the form of the utility function: \({U_{1}^{i}}>0\), \(U_{11}^{i}\leq 0\), \({U_{2}^{i}}\leq 0\), \(U_{22}^{i}\geq 0\).

The pair \(\left (c^{\ast }\left (a_{i}\right), h^{\ast }\left (a_{i}\right ) \right ) \) satisfies the optimality condition determined by maximizing \( U^{i}\left (c\left (a_{i}\right), h\left (a_{i}\right ) \right ) \) subject to the household budget (2).

$$ {U_{1}^{i}}\left( c^{\ast }\left( a_{i}\right) ,h^{\ast }\left( a_{i}\right) \right) a_{i}\left( 1-\tau \right) =-{U_{2}^{i}}\left( c^{\ast }\left( a_{i}\right) ,h^{\ast }\left( a_{i}\right) \right) $$
(3)

This condition shows the standard labor-leisure trade-off. Under a representative agent, allocational efficiency, where utility is maximized subject to the production constraints, is equivalent to the labor-leisure optimality condition without redistribution.

Lemma 1

Given a representative agent with productivity \(\overline {a}\), the efficiency condition where utility \(U\left (c\left (\overline {a}\right ) ,h\left (\overline {a}\right ) \right ) \) is maximized subject to the production constraints \(Y=\overline {a}h\) and c = Y, is equivalent to the household optimality condition (3) where τ = 0.

Proof

The first order conditions for the Lagrangian

$$L=U\left( c\left( \overline{a}\right) ,h\left( \overline{a}\right) \right) +\lambda \left( \overline{a}h-c\right) $$

are \(U_{1}=\lambda \) and \(U_{2}=-\lambda \overline {a}\). Solving out the Lagrange multiplier λ yields the household optimality condition (3) for \(a_{i}=\overline {a}\) and τ = 0. □

However, extending the above result to heterogeneous productivity and wages requires an interior solution for all \(a_{i}\in \left (0,\infty \right ) \),Footnote 4 which means that workers work for any positive wage. Hence, for the above condition (3) to apply in the neighborhood of \(a_{i}=0\), it must also be the case that the marginal disutility of labor is zero at that point, \({U_{2}^{i}}\left (c\left (0\right ), 0\right ) =0\), and strictly negative, \({U_{2}^{i}}\left (c\left (a_{i}\right), h\left (a_{i}\right ) \right ) <0\), for \(a_{i}>0\).

Remark 2

A necessary condition for efficiency with heterogeneous wages is a positive labor supply for any positive wage.

Naturally, the assumption that workers would never drop out of the labor market is questionable. If that assumption does not hold, there is an extensive margin in labor supply. Working negative hours makes little sense, so let the labor supply function be

$$ h\left( a_{i}\right) =\max \left( 0,h^{\ast }\left( a_{i}\right) \right) , $$
(4)

which, along with the budget constraint (2), determines the consumption function \(c\left (a_{i}\right ) \). The bound on hours implies that there is a productivity threshold below which workers drop out of the labor market and their only income is the transfer from the government.

Definition 3

The productivity threshold \(\widetilde {a}\) is such that \(h^{\ast }\left (a_{i}\right ) \leq 0\) for all \(a_{i}\leq \widetilde {a}.\)
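As an illustration only, the threshold can be written in closed form under one particular utility function. The functional form \(U^{i}\left (c,h\right ) =\ln c-\psi h\) with \(\psi >0\) is an assumption made here for the example and is not part of the model above; it has \({U_{1}^{i}}>0\), \(U_{11}^{i}<0\), \({U_{2}^{i}}<0\) and \(U_{22}^{i}=0\). Condition (3) then gives \(c^{\ast }\left (a_{i}\right ) =a_{i}\left (1-\tau \right ) /\psi \), and substituting into the budget constraint (2) yields

$$ h^{\ast }\left( a_{i}\right) =\frac{1}{\psi }-\frac{\tau Y}{a_{i}\left( 1-\tau \right) },\qquad \widetilde{a}=\frac{\psi \tau Y}{1-\tau }. $$

In this hypothetical case desired hours are non-positive exactly when \(a_{i}\leq \widetilde {a}\), and the threshold rises with the size of the transfer τ Y relative to the net-of-tax rate.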

There are a number of additional reasons to focus on a tax and redistribution system of the present form. The transfer applies to all agents so there is no need for means-testing, which is costly and raises the potential for some to misrepresent their level of productivity. It also avoids problems of envy found in the optimal tax literature and most prominently in the model with two labor types in Stiglitz (1982).Footnote 5 This problem of envy does not arise in the flat-tax, lump-sum redistribution regime.

Proposition 4

Household utility \(U^{i}\left (c\left (a_{i}\right), h\left (a_{i}\right) \right ) \) is non-decreasing in \(a_{i}\) for any tax rate τ.

Proof

See Appendix. □

The tax-transfer regime under consideration here does not lead to envy among high productivity (wage) workers, and the lump sum transfer does not distort labor markets as noted above. Furthermore, it does not require much sophistication on the part of the government to implement. Hence, this tax-transfer specification is one of the least objectionable redistribution schemes.

This is not to say that taxes have no adverse effects. On the upward sloping portion of the labor supply curve the threshold \(\widetilde {a}\) is increasing in τ, so as the tax rate increases workers drop out of the labor market.

3 Welfare

Welfare is defined without an explicit preference for equality. Aggregate utility is one obvious choice of welfare criterion to judge the effect of redistribution, following most of the optimal tax literature including Mirrlees (1971), Manski (2014) and Werning (2007). The Rawlsian approach to welfare is treated separately.

Aggregate utility \(U^{e}\) is specified by aggregating household utility over the distribution of productivity \(F\left (a_{i}\right ) \), which is equivalent to expected utility.

$$\begin{array}{@{}rcl@{}} U^{e} &=&\int\nolimits_{0}^{\infty }U^{i}\left( c\left( a_{i}\right) ,h\left( a_{i}\right) \right) dF\left( a_{i}\right) \\ &=&\int\nolimits_{0}^{\widetilde{a}}U^{i}\left( \tau Y,0\right) dF\left( a_{i}\right) +\int\nolimits_{\widetilde{a}}^{\infty }U^{i}\left( c\left( a_{i}\right) ,h\left( a_{i}\right) \right) dF\left( a_{i}\right) \end{array} $$
(5)

The second line expresses separately the utility of those with productivity below \(\widetilde {a}\) who do not work and who consume only the transfer τ Y.
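To make expression (5) concrete, the following minimal sketch evaluates aggregate utility numerically. Everything specific in it is an assumption made for illustration and not part of the model: the utility function \(U=\ln c-\psi h\), an Exp(1) productivity distribution approximated by an equal-weight quantile grid, and the hypothetical helper names `productivity_grid`, `hours`, `equilibrium_output` and `aggregate_utility`.

```python
import math

def productivity_grid(n=400):
    # Assumption: Exp(1) productivity, approximated by n equal-weight quantile midpoints.
    return [-math.log(1.0 - (k + 0.5) / n) for k in range(n)]

def hours(a, tau, transfer, psi=1.0):
    # Labor supply (4) under the assumed utility U = ln(c) - psi*h:
    # h* = 1/psi - T / (a * (1 - tau)), truncated at zero.
    if tau >= 1.0:
        return 0.0
    return max(0.0, 1.0 / psi - transfer / (a * (1.0 - tau)))

def equilibrium_output(tau, grid, psi=1.0, tol=1e-10):
    # Solve Y = E[a * h(a)] with transfer T = tau * Y by bisection.
    def supply(y):
        return sum(a * hours(a, tau, tau * y, psi) for a in grid) / len(grid)
    lo, hi = 0.0, supply(0.0)  # supply(y) is non-increasing in y
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if supply(mid) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def aggregate_utility(tau, grid, psi=1.0):
    # Discrete analogue of (5); valid for 0 <= tau < 1 under the assumed utility.
    y = equilibrium_output(tau, grid, psi)
    transfer = tau * y
    total = 0.0
    for a in grid:
        h = hours(a, tau, transfer, psi)
        c = a * h * (1.0 - tau) + transfer  # budget constraint (2)
        total += math.log(c) - psi * h
    return total / len(grid)

grid = productivity_grid()
for tau in (0.0, 0.3, 0.9):
    print(tau, round(aggregate_utility(tau, grid), 3))
```

Households below the threshold enter the average with utility from consuming only the transfer, which mirrors the first integral in the second line of (5). The bisection is used because output and the transfer are determined jointly.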

Unlike many models in this literature, we do not include an arbitrary weighting function representing the preferences of the government or social planner. Such models introduce a concave function \(G\left (\cdot \right ) \) and define welfare as \(\int \nolimits _{0}^{\infty }G\left (U^{i}\right ) dF\left (a_{i}\right ) \). Concavity of the weighting function \(G\left (\cdot \right ) \) indicates diminishing marginal social welfare across the utilities of the population with the limiting case \(G^{\prime }\left (0\right ) =\infty \) representing the Rawlsian preference to maximize the utility of the least productive agent. While such an approach is mathematically elegant, the weighting function is arbitrary, and the results that follow could be questioned.

One could also consider aggregate utility \(U^{e}\) to be the utility of a representative agent behind the “veil of ignorance,” meaning he does not know his own draw from the distribution \(F\left (a_{i}\right ) \). Rawls uses this concept in his development of principles of justice from the “Original Position”, which would be agreed upon by citizens behind the veil. Werning (2007) considers aggregate utility to be a representation of insurance for households against poor outcomes.

Rawls’ concept is a “thick veil”, where agents have no knowledge of the structure of the economy, which leads him to argue in favor of the stronger max-min criterion, which requires that the utility of the least advantaged agent is maximized. In the present context with the extensive margin in labor supply, the max-min principle is satisfied by the tax rate at the peak of the Laffer curve,Footnote 6 which relates the transfers (government revenue) \(T\left (\tau \right ) \) and the tax rate τ. The transfer is also the only income of the least productive worker at a ≈ 0, so the tax rate that maximizes \( T\left (\tau \right ) \) satisfies the max-min principle.
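The link between the max-min criterion and the Laffer curve can be illustrated numerically. The sketch below traces \(T\left (\tau \right ) =\tau Y\left (\tau \right ) \) on a grid of tax rates and reports the rate with the largest transfer. It rests on assumptions that are not in the model: the utility function \(U=\ln c-\psi h\), an Exp(1) productivity distribution on an equal-weight quantile grid, and the hypothetical function names used here.

```python
import math

def productivity_grid(n=400):
    # Assumption: Exp(1) productivity, approximated by n equal-weight quantile midpoints.
    return [-math.log(1.0 - (k + 0.5) / n) for k in range(n)]

def hours(a, tau, transfer, psi=1.0):
    # Labor supply (4) under the assumed utility U = ln(c) - psi*h.
    if tau >= 1.0:
        return 0.0
    return max(0.0, 1.0 / psi - transfer / (a * (1.0 - tau)))

def equilibrium_output(tau, grid, psi=1.0, tol=1e-10):
    # Solve Y = E[a * h(a)] with transfer T = tau * Y by bisection.
    def supply(y):
        return sum(a * hours(a, tau, tau * y, psi) for a in grid) / len(grid)
    lo, hi = 0.0, supply(0.0)  # supply(y) is non-increasing in y
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if supply(mid) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def laffer_curve(grid, psi=1.0, steps=99):
    # Transfer T(tau) = tau * Y(tau) on an interior grid of tax rates 0.01, ..., 0.99.
    taus = [(k + 1) / (steps + 1) for k in range(steps)]
    return [(tau, tau * equilibrium_output(tau, grid, psi)) for tau in taus]

grid = productivity_grid()
curve = laffer_curve(grid)
tau_bar, t_max = max(curve, key=lambda point: point[1])
print("approximate Laffer peak:", tau_bar, "transfer:", round(t_max, 4))
```

Since a household with productivity near zero consumes only the transfer, the rate returned as `tau_bar` is also the rate that maximizes the utility of the least productive household in this example.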

The present work considers the role of redistribution on welfare for both the max-min criterion and aggregate utility. The following derivative shows the effect of redistributive taxation on aggregate utility. The calculation involves multiple applications of the Leibniz rule, since the threshold \(\widetilde {a}\) depends on the tax rate τ. Also note that \( F^{\prime }\left (a\right ) =f\left (a\right ) \).

$$\begin{array}{@{}rcl@{}} \frac{dU^{e}}{d\tau } &\,=\,&{U_{1}^{i}}\left( \tau Y,0\right) \left( Y\,+\,\tau \frac{ dY}{d\tau }\right) F\left( \widetilde{a}\right) \,+\,U^{i}\left( \tau Y,0\right) f\left( \widetilde{a}\right) \frac{d\widetilde{a}}{d\tau } \\ &&\,+\,\int\nolimits_{\widetilde{a}}^{\infty }\left[ {U_{1}^{i}}\left( c\left( a_{i}\right) ,h\left( a_{i}\right) \right) \frac{dc_{i}}{d\tau } \,+\,{U_{2}^{i}}\left( c\left( a_{i}\right) ,h\left( a_{i}\right) \right) \frac{ dh_{i}}{d\tau }\right] dF\left( a_{i}\right) \,-\,U^{i}\left( c\left( \widetilde{ a}\right) ,h\left( \widetilde{a}\right) \right) f\left( \widetilde{a}\right) \frac{d \widetilde{a}}{d\tau } \end{array} $$

The effects of an increase in the marginal tax rate τ expressed above are intuitive. The \(\frac {d\widetilde {a}}{d\tau }\) terms show the effect of workers exiting the workforce, the first term shows the increased utility of those receiving only the transfer, and the integral term shows the change in utility for those in the workforce.

The definition of \(\widetilde {a}\) (Definition 3) implies that for \(a_{i}\leq \widetilde {a}\), it is the case that \(h\left (a_{i}\right ) =0\) and \(c\left (a_{i}\right ) =\tau Y.\) Hence, the two \(\frac {d\widetilde {a}}{d\tau }\) terms are equal in magnitude and opposite in sign, so they cancel and the above expression simplifies.

$$\frac{dU^{e}}{d\tau }\,=\,{U_{1}^{i}}\left( \tau Y,0\right) \left( Y\,+\,\tau \frac{dY }{d\tau }\right) F\left( \widetilde{a}\right) +\int\nolimits_{\widetilde{a} }^{\infty }\left[ {U_{1}^{i}}\left( c\left( a_{i}\right) ,h\left( a_{i}\right) \right) \frac{dc_{i}}{d\tau }\,+\,{U_{2}^{i}}\left( c\left( a_{i}\right) ,h\left( a_{i}\right) \right) \frac{dh_{i}}{d\tau }\right] dF\left( a_{i}\right) $$

Using the household optimality condition (3) and differentiating the household budget constraint (2) with respect to τ and substituting yields an alternative form for \(\frac {dU^{e}}{d\tau }\).

$$ \frac{dU^{e}}{d\tau }\,=\,{U_{1}^{i}}\left( \tau Y,0\right) \left( Y\,+\,\tau \frac{dY }{d\tau }\right) F\left( \widetilde{a}\right) +\int\nolimits_{\widetilde{a} }^{\infty }{U_{1}^{i}}\left( c\left( a_{i}\right) ,h\left( a_{i}\right) \right) \left[ Y\,-\,a_{i}h\left( a_{i}\right) \,+\,\tau \frac{dY}{d\tau }\right] dF\left( a_{i}\right) $$
(6)
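The substitution behind (6) can be spelled out as follows; this is a sketch of the intermediate algebra using only (2) and (3). Differentiating the budget constraint (2) with respect to τ for a working household gives

$$ \frac{dc_{i}}{d\tau }=a_{i}\left( 1-\tau \right) \frac{dh_{i}}{d\tau }-a_{i}h\left( a_{i}\right) +Y+\tau \frac{dY}{d\tau }. $$

Substituting into the integrand and collecting the \(\frac {dh_{i}}{d\tau }\) terms,

$$ {U_{1}^{i}}\frac{dc_{i}}{d\tau }+{U_{2}^{i}}\frac{dh_{i}}{d\tau }={U_{1}^{i}}\left[ Y-a_{i}h\left( a_{i}\right) +\tau \frac{dY}{d\tau }\right] +\left[ {U_{1}^{i}}a_{i}\left( 1-\tau \right) +{U_{2}^{i}}\right] \frac{dh_{i}}{d\tau }, $$

where the last bracket is zero by the optimality condition (3). The response of hours therefore drops out of the welfare derivative, which is the usual envelope argument.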

4 Redistribution

Some degree of redistribution maximizes aggregate utility given minimal assumptions. However, the aggregate utility maximizing tax rate is less than the rate that satisfies the Rawlsian welfare criterion.

Another form of the derivative \(\frac {dU^{e}}{d\tau }\) is obtained by combining the two terms in the relation (6), using the fact that the first term does not depend on \(a_{i}\) for \(a_{i}\in \left (0,\widetilde {a }\right ) \).

$$ \frac{dU^{e}}{d\tau }=\int\nolimits_{0}^{\infty }{U_{1}^{i}}\left( c\left( a_{i}\right) ,h\left( a_{i}\right) \right) \left[ Y-a_{i}h\left( a_{i}\right) +\tau \frac{dY}{d\tau }\right] dF\left( a_{i}\right) $$
(7)

A mathematically trivial but important special case is that of constant marginal utility \({U_{1}^{i}}\left (\cdot \right ) =\overline {c}\), where the above reduces to \(\frac {dU^{e}}{d\tau }=\overline {c}\tau \frac {dY}{d\tau }.\) Output is decreasing in the tax rate, so \(\frac {dU^{e}}{d\tau }\leq 0\). Further, at the zero tax rate it is the case that \(\frac {dU^{e}}{d\tau } |_{\tau =0}=0\) under constant marginal utility, so the introduction of redistribution does not improve welfare. Hence, any argument in favor of redistribution depends on decreasing marginal utility. Of course, assuming constant marginal utility across all income levels is highly questionable.
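The reduction in this special case follows directly from (7) and the definition of aggregate production (1). With \({U_{1}^{i}}\left (\cdot \right ) =\overline {c}\) constant, it can be taken outside the integral:

$$ \frac{dU^{e}}{d\tau }=\overline{c}\left[ Y-\int\nolimits_{0}^{\infty }a_{i}h\left( a_{i}\right) dF\left( a_{i}\right) +\tau \frac{dY}{d\tau }\right] =\overline{c}\,\tau \frac{dY}{d\tau }, $$

since the integral equals \(Y\) by (1) and \(dF\left (a_{i}\right ) \) integrates to one.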

For realistic specifications of the utility function, some degree of redistribution is desirable. Using the equilibrium conditionFootnote 7 \(Y=\int \nolimits _{0}^{\infty }c\left (a_{i}\right ) dF\left (a_{i}\right ) \) gives the following.

Proposition 5

Given that marginal utility is decreasing in \(a_{i}\in \left (0,\infty \right ) \) , \(U_{11}^{i}\left (c\left (a_{i}\right ) ,h\left (a_{i}\right ) \right ) <0\) , consumption is increasing in productivity, \(c^{\prime }\left (a_{i}\right ) >0\) and there is an extensive labor supply margin \(\widetilde {a} >0\) then some degree of redistribution increases aggregate utility such that \(\frac {dU^{e}}{d\tau }|_{\tau =0}>0\) .

Proof

See Appendix. □

For redistributive taxation to have any benefit, the marginal utilities and mass of agents whose productivity (and wage) places them below mean income must be greater than those above. Even with the restrictions of a fixed marginal tax and lump sum transfer for all, redistribution is beneficial, as in Mirrlees (1971). The assumption that consumption is increasing in productivity is equivalent to the assumption that it is a normal good. It might be violated for an extremely backward bending labor supply curve at higher levels of productivity, but the case for redistribution would be even stronger here. Workers on the backward bending portion of the labor supply curve would supply more labor at a higher tax rate.

The existence of an extensive margin in labor supply \(\widetilde {a}>0\) means that the result in Lemma 1 where efficiency is achieved without redistribution does not apply. This result clouds all arguments that rely on such a definition of efficiency. While some concept of efficiency properly defined may be useful, the role of heterogeneity is crucial.

Now that it has been established that a positive tax rate maximizes aggregate utility, the primary result, that Rawls’ max-min criterion calls for an even greater degree of redistribution, can be shown.

Proposition 6

Given that the unique tax rate that maximizes aggregate utility \(U^{e}\) is \(\tau ^{\ast }\), and that the rate that satisfies the max-min criterion is \( \overline {\tau }\), then

$$\tau^{\ast }<\overline{\tau }. $$

Proof

The household optimality condition (3) and labor supply specification (4) imply that if τ = 1, \(h\left (a_{i}\right ) =0\) for all \(a_{i}\). Hence, for τ sufficiently large, output \(Y\) is decreasing with τ, and the Laffer curve has its usual shape where \(T\left (\tau \right ) \) has a single maximum and \(T\left (0\right ) =T\left (1\right ) =0.\)

The proof proceeds by contradiction. Assume \(\tau ^{\ast }>\overline {\tau } .\) Since both functions of the tax rate \(U^{e}\left (\tau \right ) \) and \( T\left (\tau \right ) \) have a unique maximum on \(\tau \in \left [ 0,1\right ] \), it follows that \(\frac {dU^{e}}{d\tau }|_{\tau =\overline {\tau }}>0\) and \( \frac {dT}{d\tau }|_{\tau =\overline {\tau }}=0\). The latter equality implies \(\left (Y+\tau \frac {dY}{d\tau }\right ) |_{\tau =\overline {\tau }}=0\). Combining these relations with the expression (7) yields

$$\int\nolimits_{0}^{\infty }{U_{1}^{i}}\left( c\left( a_{i}\right) ,h\left( a_{i}\right) \right) \left[ Y-a_{i}h\left( a_{i}\right) +\tau \frac{dY}{d\tau }\right] dF\left( a_{i}\right) |_{\tau =\overline{\tau }}>0, $$

but this can be written

$$-\int\nolimits_{0}^{\infty }{U_{1}^{i}}\left( c\left( a_{i}\right) ,h\left( a_{i}\right) \right) a_{i}h\left( a_{i}\right) dF\left( a_{i}\right) |_{\tau =\overline{\tau } }+\int\nolimits_{0}^{\infty }{U_{1}^{i}}\left( c\left( a_{i}\right) ,h\left( a_{i}\right) \right) \left[ Y\,+\,\tau \frac{dY}{d\tau }\right] dF\left( a_{i}\right) |_{\tau =\overline{ \tau }}>0. $$

Since \(\left (Y+\tau \frac {dY}{d\tau }\right ) |_{\tau =\overline {\tau }}=0\), the second integral vanishes. Since \({U_{1}^{i}}\left (\cdot \right ) >0\) and the remaining terms in the first integral are non-negative, the left-hand side is non-positive, so the above relation cannot be true. Therefore, it must be the case that \(\tau ^{\ast }<\overline {\tau }\) as required. □
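The ordering in Proposition 6 can be checked numerically in a simple case. The sketch below is an illustration under assumptions that are not part of the model (utility \(U=\ln c-\psi h\), Exp(1) productivity on an equal-weight quantile grid, hypothetical function names). It computes the transfer and aggregate utility on a grid of tax rates and compares the two maximizers; it is not a substitute for the proof.

```python
import math

def productivity_grid(n=400):
    # Assumption: Exp(1) productivity, approximated by n equal-weight quantile midpoints.
    return [-math.log(1.0 - (k + 0.5) / n) for k in range(n)]

def hours(a, tau, transfer, psi=1.0):
    # Labor supply (4) under the assumed utility U = ln(c) - psi*h.
    return max(0.0, 1.0 / psi - transfer / (a * (1.0 - tau)))

def equilibrium_output(tau, grid, psi=1.0, tol=1e-10):
    # Solve Y = E[a * h(a)] with transfer T = tau * Y by bisection.
    def supply(y):
        return sum(a * hours(a, tau, tau * y, psi) for a in grid) / len(grid)
    lo, hi = 0.0, supply(0.0)  # supply(y) is non-increasing in y
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if supply(mid) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def outcomes(tau, grid, psi=1.0):
    # Return (transfer, aggregate utility) at tax rate tau, for 0 < tau < 1.
    transfer = tau * equilibrium_output(tau, grid, psi)
    total = 0.0
    for a in grid:
        h = hours(a, tau, transfer, psi)
        c = a * h * (1.0 - tau) + transfer  # budget constraint (2)
        total += math.log(c) - psi * h
    return transfer, total / len(grid)

grid = productivity_grid()
taus = [k / 100 for k in range(1, 100)]
results = {tau: outcomes(tau, grid) for tau in taus}
tau_bar = max(taus, key=lambda t: results[t][0])   # max-min rate (Laffer peak)
tau_star = max(taus, key=lambda t: results[t][1])  # utilitarian rate
print("utilitarian rate:", tau_star, "max-min rate:", tau_bar)
```

In this example the utilitarian rate lies strictly below the revenue-maximizing rate, and aggregate utility at `tau_bar` is lower than at `tau_star`, in line with the proposition.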

In an economic environment with minimal restrictions, the aggregate utility maximizing degree of redistribution is less than that determined by the max-min principle.Footnote 8 Alternatively, satisfying the max-min principle would require a reduction in aggregate utility as compared to the optimum. The proof makes the reason clear. Considering the expression for aggregate utility (5), the max-min principle requires maximizing the first integral over \(\left [ 0,\widetilde {a}\right ] \) while ignoring the second integral over \(\left [ \widetilde {a},\infty \right ] \), meaning the utility of the least productive is maximized without regard to the effect on the rest of the population.

The result that the Rawlsian marginal tax is revenue maximizing but not welfare maximizing agrees with results from the standard optimal linear tax model in Piketty and Saez (2013) that includes a social welfare weighting function. Proposition 6 also parallels results in Werning (2007), which derives conditions for Pareto efficient income tax regimes. The resulting income distribution for such a regime must stochastically dominate the distribution that satisfies the Rawlsian (max-min) criterion for social welfare.

Proposition 6 formalizes the objection to the max-min criterion that even a person deciding behind the veil of ignorance might be willing to risk a poor outcome if its probability is sufficiently small. Rawls (1971) argues that taking such a risk would not be rational, but the argument is not wholly convincing, as noted by Nagel (1973) and Scanlon (1973). The view embodied in the present model is that the Original Position is a useful concept for evaluating policy and societal norms. The max-min criterion is interesting as a benchmark, but relying on it in arguments about optimal policy is suspect.

A defender of the max-min criterion would argue that the veil in the above model is not sufficiently thick in that agents have knowledge about their preferences and the structure of production in the economy. Though the assumptions about the utility function are quite general, it is true that the production sector is stylized. However, there is no obvious extension that would overturn the result in Proposition 6, though this is an area for future research. More importantly, if the goal is to provide a rigorous framework for fairness, redistribution can be justified on utilitarian grounds with a thin veil, as in the model presented here, and there is no need to discuss the thickness of the veil, which is inherently vague and unlikely to be convincing for many.

5 Conclusion

There is no welfare criterion that is above criticism. It is more important to select a criterion that is reasonable across a wide range of preferences than one that fits closely with a particular viewpoint. The utilitarian criterion (maximizing aggregate utility) does not capture all aspects of well-being, but has the advantage of simplicity. Diminishing marginal utility is not arbitrary. In identifying an appropriate welfare criterion, Rawls’ concept of the Original Position is useful in that it provides a foundation for the meaning of aggregate utility as expected utility behind the veil. However, his assertion that the max-min criterion is rational is suspect. Under minimal assumptions about household preferences, in an environment that includes an extensive margin for labor supply, the marginal tax rate that satisfies the max-min criterion is higher than that which maximizes aggregate utility. The max-min criterion ignores the possibility that a rational agent behind the veil might be willing to risk an outcome worse than the maximum possible transfer under the basic income policy regime. Furthermore, with a heterogeneous population, the aggregate utility maximizing tax-transfer regime does not correspond to common formulations of allocational efficiency. Researchers studying heterogeneous populations should continue to focus on aggregate utility, making sure that any efficiency concept is rigorously defined.