1. Introduction

In the unitary model, household behavior is treated as resulting from the decisions of a single unit, concealing the fact that most households consist of several members. Because it is flexible and easily identified, this representation of household decision processes is very popular when estimating household preferences, particularly in the presence of a complex budget constraint (see for instance van Soest, 1995). In this respect, it is a convenient model when studying the impact of a real or topical tax reform on household behavior (see e.g., Blundell, Duncan, McCrae, & Meghir, 2000). Yet the unitary model treats the family as a black box, and consequently the within-family income redistribution resulting from a policy reform cannot be identified. By contrast, the collective approach specifies the individual preferences of family members and simply assumes that the household decision-making process leads to Pareto-efficient allocations.

The present paper proposes a quantification of the distortions in tax reform analysis entailed by the use of a unitary model when the data are collective. The evaluation of tax reforms based on both unitary and collective models has been suggested by Beninger and Laisney (2002). Using purely synthetic data, they find important discrepancies in the incentive and distribution effects of revenue-neutral reforms based on unitary estimates rather than on the collective parameters. The aim of the present paper is to check the robustness of these results when the collective baseline situation is generated using real world data. This is done by means of a collective model calibrated on real data as described in Vermeulen et al. (2006), and by estimating a unitary model on this ‘collective data set’. We focus on German data, and a revenue-neutral tax reform that consists of replacing the smooth progressive tax schedule and the current means-tested social benefits by a simple two-parameter linear taxation system involving a basic income and a flat tax. The same exercise is replicated for other European countries and other topical reforms.

The outline of the paper is as follows. The second section briefly introduces the baseline German tax and benefit system (corresponding to the year 1998). Section 3 gives statistical evidence on the data used, the 1998 wave of the GSOEP. Estimation results for the unitary model are given in Section 4. Section 5 compares normative and positive analyses of the tax reform on the basis of the two models. Section 6 concludes.

2. The 1998 German tax-benefit system

The German tax system is characterized by comprehensive taxation of various income sources through a smooth nonlinear tax schedule, and by joint taxation of couples. After deduction of income-related expenses, child allowances and maintenance payments to ex-partners, the individual marginal tax rate starts off at 25.9% (applicable from an annual personal allowance of 6,322 euro). The top rate is 53% for yearly earnings in excess of 61,376 euro. The tax schedule is the same for singles and for couples, but for couples, the ‘splitting’ system is applied: the tax liability per person is assessed on the basis of half of the joint taxable income, and the outcome is doubled to obtain the total income tax liability of the spouses. Employees are also subject to social security contributions on earnings at a more or less constant rate of about 20%. As these contributions are related to claims toward the social security system, they are treated here as consumption expenses that do not reduce disposable income, and are thus excluded from our narrow definition of the tax-benefit system. Footnote 1
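The splitting rule described above can be written in a few lines. The sketch below is illustrative only: `toy_schedule` is a hypothetical piecewise-linear stand-in that reuses the allowance and the two rates quoted in the text, whereas the actual 1998 schedule is smooth, and all function names are assumptions.

```python
def toy_schedule(taxable_income):
    """Hypothetical individual tax schedule (euro per year), NOT the 1998 formula.

    Zero tax up to the 6,322 euro allowance, a 25.9% marginal rate up to
    61,376 euro and 53% above; the real schedule rises smoothly in between.
    """
    allowance, top_threshold = 6322.0, 61376.0
    if taxable_income <= allowance:
        return 0.0
    if taxable_income <= top_threshold:
        return 0.259 * (taxable_income - allowance)
    return 0.259 * (top_threshold - allowance) + 0.53 * (taxable_income - top_threshold)


def splitting_tax(joint_taxable_income, schedule=toy_schedule):
    """Apply the schedule to half of the joint income and double the result."""
    return 2.0 * schedule(joint_taxable_income / 2.0)


# One-earner couple: 40,000 euro earned by one spouse, nothing by the other.
tax_if_assessed_separately = toy_schedule(40000.0) + toy_schedule(0.0)
tax_with_splitting = splitting_tax(40000.0)
```

Under a convex schedule such as this toy one, splitting cannot raise the liability relative to separate assessment, and it lowers the liability when the spouses' incomes are unequal.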

We consider means-tested social benefits of a maximum of 511 euro per individual per month plus age-dependent supplementary payments for children. Footnote 2 In addition, there is a universal child benefit, but parents may opt instead for an alternative tax deduction. Disposable income is defined in the study as net income minus net maintenance payments to or from parents, children and ex-partners.

The German tax-benefit system results in nonconvex budget sets for a large proportion of households (this concerns more than half of the couples in the sample used). A detailed description of the tax-benefit system and an illustration of the potential nonconvexity of the budget constraint are provided in Appendix A.

3. Data

Our exercise consists in estimating a unitary model on the basis of realistically simulated collective data. For this purpose, we use data from the 1998 wave of the German Socio-Economic Panel (GSOEP) and then simulate individual behavior according to collective rationality. This section briefly describes the basic data selection, and recalls the two-step construction of the collective data base. More detailed information on the sample chosen and some descriptive statistics are given in Appendix B.

The GSOEP is a representative panel data sample of households and individuals living in Germany. It started in 1984 with annual interviews in the Federal Republic of Germany and was extended to the former German Democratic Republic from 1990 on. The panel gives a wealth of information on the labour market status of individuals and on the various income sources of families.

We selected German nationals aged between 25 and 55 years. All are employees with a contractual labour supply of at least 10 h per week or individuals who are voluntarily out of employment. The restriction on hours is introduced to avoid extraordinarily high wage rates, computed as the ratio of earnings to hours, that result from measurement error for people working less than 10 h. As usual in labour supply studies, we exclude the self-employed, students, individuals on parental leave, and the (registered) unemployed.

The sample of singles consists of 488 individuals: 208 women and 280 men. Here a ‘single’ is a one-person household. He or she may have dependent children living outside the household. We also selected 1,332 families living in one- or two-generation households composed of a married couple and dependent (possibly working) children. The children may live inside or outside the household.

Married women have a significantly lower participation rate than married men or singles. The distribution of labour supply is more evenly spread for wives than for husbands: they work more often in part-time jobs. In particular the wives’ weekly working time distribution has a mode at 20 h.

To obtain a data set representing the collective world for Germany, we use the following two-step procedure (details are given in Vermeulen et al., 2006). In the first step, we estimate preference parameters for single men and women and then predict their labour supply and the corresponding consumption level under the 1998 German tax-benefit system. This also involves the estimation of wage equations. In the second step, for families composed of a married couple (and children), we determine the partners’ relative weights in the household (“male power index”). We suppose that the preferences of married individuals differ from those of singles by an additional cross-leisure term, different for females and males (\(\delta_{f}\) and \(\delta_{m}\) hereafter). Therefore, apart from the latter, we assume that married individuals basically retain the same preferences as before marriage.

The predicted labour supplies, which almost perfectly fit the observed hours of work, are taken as the baseline situation for the estimation of the unitary model (see Beninger, Laisney, & Beblo, 2003, for details on the German collective data).

4. Unitary model: estimation

We now describe the estimation procedure for the unitary model, and the estimation results.

4.1. Estimation procedure for the unitary model

For the specification of the unitary model, we adopt the analogue to the individual utility functions used in the collective model (see Vermeulen et al., 2006). In order to remain close to the situation faced by an investigator who only observes aggregate household consumption, we do not take advantage of the fact that the collective baseline situation contains each spouse’s consumption (see Myck et al., 2006). Thus, we specify:

$$\begin{aligned} U\left(c,l^{f},l^{m};\mathbf{d}\right) =\beta _{c}( \mathbf{d} ) \ln\left[c-\bar{c}( \mathbf{d})\right] \\ +\beta _{l}^{f}( \mathbf{d}) \ln \left[l^{f}-\bar{l} ^{f}( \mathbf{d}) \right] +\beta _{l}^{m}( {\bf d}) \ln \left[ l^{m}-\bar{l}^{m}( \mathbf{d}) \right] \\ +\delta ( \mathbf{d}) \ln \left[ l^{f}-\bar{l}^{f}( \mathbf{d}) \right] \ln \left[ l^{m}-\bar{l}^{m}( \mathbf{d} ) \right] . \end{aligned}$$
(1)

Appendix C gives the conditions under which this utility function is increasing in its arguments and concave. We also experimented with direct translog utility functions along the lines of van Soest (1995), but with a quadratic form in logs of departures from minimal requirements, as in Eq. (1). Although several specifications were superior to our preferred specification in terms of likelihood values, all led to utility functions that were nonincreasing in at least one argument for a majority of observations. This made them useless for tax reform analysis.

The \(\beta\) and \(\delta\) functions are assumed to be linear in \(\mathbf{d}\), and the minimum requirements in consumption and leisure are set to the values chosen for the collective model. Of course, the budget constraint remains the same as for the collective model.

We suppose that each spouse has \(K\) alternative values \(h_{k}\) for his/her weekly labour supply, leading to leisure choices \(l_{k}=T-h_{k}\), where \(T\) is the total time available: 168 h a week. We choose \(K=7\), and the following set of possible values for \(h_{k}\): 0, 10, 20, 30, 40, 50, 60. Hence, the couple has \(K^{2}=49\) possible combinations. If \({\hat U}\left( c_{j},l_{j}^{f},l_{j}^{m};\mathbf{d}\right)\) denotes the utility generated by combination \(j\) of the set of combinations \(\{ (c_{j},l_{j}^{f},l_{j}^{m}) \}_{j=1}^{K^{2}}\), adding an error term \(\varepsilon_{j}\) to the utility derived from combination \(j\), we have:

$$ {\hat U}_{j}={\hat U}\left( c_{j},l_{j}^{f},l_{j}^{m};{\bf d}\right) +\varepsilon _{j}\qquad \forall j=1,\ldots,K^{2}.$$
(2)

The distribution of ɛ j is assumed to be the extreme value distribution defined by:

$$\Pr\left[\varepsilon_{j}\le \varepsilon \right]=\exp \left(-\exp \left(-\varepsilon \right) \right),\qquad \varepsilon \in \mathbb{R},$$
(3)

and the ɛ are assumed independent. If combination j is chosen by a household, its contribution to the likelihood is

$$\Pr \left[ {\hat U}_{j}>{\hat U}_{k}, \forall k\neq j\right]=\frac{\exp \left[ {\hat U}\left(c_{j},l_{j}^{f},l_{j}^{m}; \mathbf{d}\right)\right]}{\sum_{k=1}^{K^{2}}\exp \left[ {\hat U}\left( c_{k},l_{k}^{f},l_{k}^{m}; \mathbf{d}\right) \right] }.$$
(4)

This corresponds to the likelihood of the multinomial logit model. In order to account for unobservable heterogeneity in consumption, we also estimate a discrete mixture of such models (see Heckman & Singer, 1984; Hoynes, 1996) with two to three mass points on the coefficient of \(\ln \left[ c-\bar{c}\left( \mathbf{d}\right) \right]\) (mixed multinomial logit model, MMNL; details are given in Vermeulen et al., 2006, Appendix A.2).
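The construction of the choice set and of the likelihood contribution in Eq. (4) can be sketched as follows. This is a minimal illustration under stated assumptions: the parameter values, the minimum requirements and the `toy_budget` function are hypothetical placeholders (they are neither the Table 1 estimates nor the German tax-benefit system), and the characteristics \(\mathbf{d}\) are taken as already absorbed into the coefficients.

```python
import itertools
import math

T = 168                               # total weekly time available (hours)
HOURS = [0, 10, 20, 30, 40, 50, 60]   # K = 7 admissible weekly hours per spouse


def utility(c, lf, lm, p):
    """Eq. (1) evaluated for one household with coefficients p."""
    ln_c = math.log(c - p["c_min"])
    ln_f = math.log(lf - p["lf_min"])
    ln_m = math.log(lm - p["lm_min"])
    return (p["beta_c"] * ln_c + p["beta_f"] * ln_f + p["beta_m"] * ln_m
            + p["delta"] * ln_f * ln_m)


def choice_probabilities(budget, p):
    """Multinomial logit probabilities of Eq. (4) over the K^2 = 49 combinations."""
    combos = list(itertools.product(HOURS, HOURS))   # (h_f, h_m) pairs
    u = [utility(budget(hf, hm), T - hf, T - hm, p) for hf, hm in combos]
    u_max = max(u)                                   # guards against overflow in exp
    weights = [math.exp(x - u_max) for x in u]
    total = sum(weights)
    return {combo: w / total for combo, w in zip(combos, weights)}


# Hypothetical inputs, for illustration only.
params = {"beta_c": 1.0, "beta_f": 0.8, "beta_m": 0.6, "delta": 0.05,
          "c_min": 0.0, "lf_min": 60.0, "lm_min": 60.0}


def toy_budget(hf, hm):
    """Stand-in for the tax-benefit system: weekly disposable income in euro."""
    return 200.0 + 0.6 * (11.0 * hf + 14.0 * hm)


probs = choice_probabilities(toy_budget, params)
# Log-likelihood contribution of a household observed at (20 h, 40 h).
loglik_contribution = math.log(probs[(20, 40)])
```

Subtracting the maximum utility before exponentiating leaves the probabilities unchanged and is only a numerical safeguard. A mixture version would average such probabilities over the mass points, weighted by the regime probabilities.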

4.2. Estimation results

Table 1 presents estimation results for a model with three mass points. Although the coefficient of the third mass point (\(\beta_{c3}\)) is poorly determined, and the estimated regime probabilities are extremely unequal (.990, .008, and .002, computed from the logits e1 and e2, see Eq. (20) in Vermeulen et al., 2006), an LR test rejects the specification with two mass points, which itself rejects the MNL specification. We conducted a descending specification search, starting with a full set of household and personal characteristics for \(\mathbf{d}\) in \(\beta_{c}\) and \(\delta\) and with only own characteristics plus characteristics of children in \(\beta_{l}^{f}\) and \(\beta_{l}^{m}\). For instance, an indicator to distinguish between households in East and West Germany, included to account for potential heterogeneity of the two populations, turns out to have a significant effect only on the \(\delta\) coefficient. Among the variables \(\mathbf{d}\) we also include information concerning the regimes (i.e. mass points) ‘chosen’ in the calibration of the collective model (see Vermeulen et al., 2006, line below Eq. (20)). The inclusion of this kind of ‘observed unobservable heterogeneity’ aims at maximum fairness towards the unitary model. The corresponding variables turn out to be highly significant in several places (variables reg3 f and reg1 m ).

Table 1 MMNL estimates of preferences for couples: 3 mass points

Given these parameter estimates, all couples present a positive marginal utility of the female’s leisure. The concavity condition is satisfied for all couples (at least for the ‘chosen’ regime, see the next section). The presence of children has a negative effect on the consumption coefficient. This effect increases if the household has children less than 6 years old. Living in Eastern Germany has a negative effect on the unitary estimate of the interaction term δ. Table 2 gives statistics on the cross-leisure terms for both models and depending on the number of children. In the unitary estimates, δ is positive on average, and is negative for 3 couples only. In the collective model the calibrated \(\bar{\delta}_{m}\) (for males) is also positive on average, but much smaller, and the calibrated \(\bar{\delta}_{f}\) (for females) is negative on average for all household types, a puzzling result that warrants further research. At this stage our conjecture is that this may be connected to our insufficient treatment of time devoted to household production activities.

Table 2 δ(d) coefficients in unitary and collective model by number of children

5. Predictions with the unitary model

We assume that the regime corresponding to a couple is the one which gives the best labour supply predictions under the condition that it leads to an increasing utility function. The frequencies of the regimes are more or less in accordance with the estimated probabilities, although regime 2 is chosen more often than it should be (.053 predicted versus .008 estimated probability). The third regime is never chosen, as it never satisfies the positivity restrictions on the marginal utility of consumption.

As reported in Tables 3 and 4, the unitary model performs moderately well in predicting labour supply. Predictions are correct for only a third of the wives and for 45% of the husbands. The margins of the tables are not very well predicted, except for the participation rate. The results for cells within the tables are poor, as some large discrepancies occur. For instance, more than 44% of nonparticipating wives are predicted to work. Footnote 3 The unitary model tends to smooth the hours distribution. The mode is significantly lower for the unitary model, both for women and men: 80% of the husbands actually work full time in the collective baseline situation, but only 49% are predicted to work 40 h with the unitary model. Labour supply of both spouses is underpredicted on average. This points to the misspecification of the unitary model. However, the predictions from this particular model are in line with those found in the literature.
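The share of correct predictions quoted above is the mass on the main diagonal of a cross-tabulation of collective against unitary hours. A minimal sketch of that computation follows; the hours are made up for illustration and are not GSOEP data, and the function names are assumptions.

```python
from collections import Counter


def cross_tabulate(collective_hours, unitary_hours):
    """Count observations by (collective hours, unitary prediction) cell."""
    return Counter(zip(collective_hours, unitary_hours))


def share_correct(table):
    """Share of observations on the main diagonal of the cross-tabulation."""
    total = sum(table.values())
    on_diagonal = sum(n for (actual, predicted), n in table.items()
                      if actual == predicted)
    return on_diagonal / total


# Made-up weekly hours for six wives (illustration only).
collective = [0, 0, 20, 20, 40, 40]
unitary = [0, 20, 20, 30, 40, 30]

table = cross_tabulate(collective, unitary)
hit_rate = share_correct(table)
```

Off-diagonal cells such as `table[(0, 20)]` count nonparticipants who are predicted to work.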

Table 3 Collective versus unitary female labour supply, baseline tax scheme
Table 4 Collective versus unitary male labour supply, baseline tax scheme

5.1. Linear tax reform

A linear tax schedule is applied with joint taxation for couples. It consists of a basic income plus a flat tax rate, replacing the current means-tested social benefits and the smooth progressive tax rate. The tax reform applied linearizes the budget constraint, which is a rather large departure from the present design of the German tax-benefit system. Approximate revenue neutrality is achieved by setting the negative income tax for zero income at −6,000 euro for singles and −9,600 euro for couples and computing the constant marginal tax rate for both singles and couples as residual tax rate. It is important to note that the definition of a revenue-neutral reform will differ if it is based on the unitary rather than on the collective model, because predicted behavioral adjustments will differ between the two models. In fact the unitary model leads to a flat tax rate of t=.403, the collective model to t=.428. This makes the reform appear relatively much more favorable for singles with the unitary than with the collective model.
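The residual flat rate can be found by a one-dimensional root search once a behavioural model predicts gross incomes at each candidate rate. The sketch below is not the procedure used in the paper: it assumes a hypothetical constant-elasticity response, made-up households, and a revenue function that is increasing in the rate over the search interval. Its only purpose is to show why two behavioural models yield two different revenue-neutral rates.

```python
def revenue(t, households, response):
    """Net revenue: flat tax on post-reform gross income minus the basic income."""
    return sum(t * response(y0, t) - basic for y0, basic in households)


def neutral_flat_rate(target, households, response, lo=0.0, hi=0.7, tol=1e-10):
    """Bisection for the flat rate that meets the revenue target.

    Assumes revenue is increasing in t on [lo, hi] and the target is bracketed.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if revenue(mid, households, response) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def make_response(elasticity):
    """Hypothetical rule: gross income shrinks as the net-of-tax share falls."""
    return lambda y0, t: y0 * (1.0 - t) ** elasticity


# (baseline gross income, basic income) in euro per year; made-up households.
households = [(20000.0, 6000.0), (35000.0, 6000.0), (50000.0, 6000.0)]
target = 20000.0

rate_weak = neutral_flat_rate(target, households, make_response(0.1))
rate_strong = neutral_flat_rate(target, households, make_response(0.3))
```

The model with the stronger behavioural response needs a higher rate to raise the same revenue, which is the mechanism behind the gap between the unitary and collective rates reported above.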

In Fig. 1, we consider for the sake of illustration the changes in disposable income following the reform, for a particular individual, in euro per week. This is a single woman with the mean gross hourly wage rate of single women (about 13 euro), and no capital income. Footnote 4 This woman is thus a potential recipient of social benefits. The move to a linear tax system is beneficial for her only over a narrow range of hours corresponding to the progressive withdrawal of social benefits in the 1998 system. A first decrease (in post-compared to pre-reform disposable income) at very low hours is due to the fact that the flat tax rate applies from the first hour of work. A second one is due to the fact that under the 1998 system that person faces lower marginal tax rates. Wealthy households will gain from the reform.

Fig. 1 Change in disposable income, single woman, w=13 euro, no capital income

5.2. Positive aspects of the reform

The unitary model introduces distortions in the prediction of tax revenues: Table 5 reveals that the unitary setting overpredicts total tax revenue by .6 billion euro. Unlike the collective model, the unitary model predicts couples to have higher tax liabilities under linear taxation.

Table 5 Tax revenues

We predict labour supply in the case of the baseline tax-benefit system and the linear tax reform (recall that the baseline situation used for each model consists of the predictions from that model), and compare predicted participation rates and working hours for the different tax systems, obtained with the collective and with the unitary models. The largest discrepancies between collective and unitary predictions are obtained for wives. Under linear taxation these discrepancies are partly due to differences in the reform definition (regarding the tax rate), but for the 1998 tax system they are due only to the misspecification of the unitary model.

Table 6 compares the collective and unitary labour supply of wives after the reform. Footnote 5 The quality of unitary predictions proves to be better for the reform than for the baseline situation. However, labour supply is well predicted for only about a third of the wives, and it is badly predicted (prediction error ≥20 h) for another third of them.

Table 6 Collective versus unitary female labour supply, linear taxation

Tables 7 and 8 compare variations in unitary and collective labour supply for females and males. The adjustment of labour supply following the tax reform is poorly predicted by the unitary model, which predicts the change in labour supply well for only half of the wives and half of the husbands. Beninger et al. (2003) show that this is in general verified for radical reforms, such as a linear tax reform. The reaction (mostly a reduction) in hours worked is underestimated for wives but overpredicted for husbands. Indeed, in the collective model wives’ labour supply is much more affected than husbands’ by changes in the tax system. The unitary model predicts that more men than women change their labour supply: nearly 72% of wives and 66% of husbands do not react to the reform under the unitary model. This contradicts the results obtained with the collective model, according to which the corresponding values are 61% and 73%.

Table 7 Variation in female labour supply, collective versus unitary
Table 8 Variation in male labour supply, collective versus unitary

Interestingly, wives and husbands react more similarly to the reform in the unitary than in the collective setting. Indeed, Table 9 shows that for almost 60% of couples (sum of the diagonal elements), both spouses have the same labour supply change. The collective model predicts the spouses to react differently to the tax reform (see Beninger et al., 2003).

Table 9 Variation in unitary labour supply, female versus male

5.3. Normative aspects of the reform

We describe the welfare effects of the reform measured at the household level by the unitary model by showing the distribution of percentage changes in household utility for every decile of the pre-reform distribution of the household equivalent disposable income. Footnote 6 These graphs require cautious interpretation. Considering percentage changes does not by itself permit interpersonal or interhousehold welfare comparisons. But given that the composition of deciles in corresponding graphs will remain identical across reforms, the graphs may be expected to convey a feel for the importance of welfare effects. What is well defined in the graphs is the information on proportions of winners and losers by decile. The graphs show the quartiles of the distribution (box). The lines emerging from the box extend upwards to the largest utility change smaller than Q75+1.5(Q75−Q25) and downwards to the smallest utility change larger than Q25−1.5(Q75−Q25). Observations outside this range are plotted individually.
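The box, whiskers and individually plotted points described above can be computed as in the following sketch. The quartile convention (Python's "inclusive" method) is an assumption, since the text does not say which one the graphs use, and the sample values are made up.

```python
import statistics


def box_plot_summary(changes):
    """Quartiles, whisker ends and outlying points for one decile's utility changes."""
    data = sorted(changes)
    q25, q50, q75 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q75 - q25
    upper_fence = q75 + 1.5 * iqr
    lower_fence = q25 - 1.5 * iqr
    inside = [x for x in data if lower_fence < x < upper_fence]
    return {
        "q25": q25, "q50": q50, "q75": q75,
        "whisker_high": max(inside),   # largest change below the upper fence
        "whisker_low": min(inside),    # smallest change above the lower fence
        "outliers": [x for x in data if not lower_fence < x < upper_fence],
    }


# Made-up percentage utility changes for one decile (illustration only).
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

The share of winners in a decile is then simply the fraction of changes above zero (or above the indifference threshold), which is the quantity the text singles out as well defined.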

Individual welfare effects of the reform are measured separately for husbands and wives within the collective model framework. They are described by showing the distribution of percentage changes in individual utility for every decile of the pre-reform distribution of the wives’ or husbands’ equivalent disposable income. Footnote 7

A direct comparison of the welfare analysis based on the two models is made on the basis of cross-tabulation of the positions of households (winner, indifferent, loser) with the pairs of positions of the spouses. A ±.1% change has been taken to define indifference.

Considering the unitary model, very few couples benefit from the move to linear taxation (see Fig. 2). The reform also has a negative impact for most of the married women (collective model, see Fig. 3). Only for men does this reform yield positive gains (see Fig. 4). Table 10 compares the effects of a move to linear taxation. About 80% of the couples are predicted to be welfare losers in the unitary framework. However, while over 80% of the wives are losers with the collective model, only 52% of the husbands are predicted to be losers. The percentage of Pareto winning households (columns f+ m0, f0 m+ and f+ m+) is 7.8%, and there are over 60% Pareto losing households (columns f− m−, f− m0 and f0 m−). The percentage of households for which the move to linear taxation creates potential conflict amounts to 29% (22.5% where wives lose and husbands gain, plus 6.7% where wives gain and husbands lose). For 16% of the households, the unitary and the collective predictions are contradictory (e.g., decrease in household utility for the unitary model, while the reform is seen as Pareto-improving at the household level for the collective model).
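The household categories used in this comparison follow mechanically from the two spouses' percentage utility changes. A minimal sketch is given below; the function names are hypothetical, and treating a change of exactly ±.1% as indifference is an assumption about the boundary case.

```python
def position(pct_change, threshold=0.1):
    """Classify a percentage utility change as winner '+', loser '-' or indifferent '0'."""
    if pct_change > threshold:
        return "+"
    if pct_change < -threshold:
        return "-"
    return "0"


def household_type(wife_change, husband_change):
    """Map the spouses' positions to the categories used in the text."""
    f, m = position(wife_change), position(husband_change)
    if f == "0" and m == "0":
        return "indifferent"
    if "-" not in (f, m):
        return "Pareto winner"     # columns f+ m0, f0 m+ and f+ m+
    if "+" not in (f, m):
        return "Pareto loser"      # columns f- m-, f- m0 and f0 m-
    return "conflict"              # one spouse gains while the other loses


examples = {
    "both gain": household_type(2.0, 0.5),
    "wife loses, husband indifferent": household_type(-1.2, 0.05),
    "wife loses, husband gains": household_type(-0.8, 1.5),
}
```

A contradiction between the models arises, for instance, when the unitary household position is '-' while the collective category is "Pareto winner".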

Fig. 2 Relative welfare gains from a switch to linear taxation for households, unitary model

Fig. 3 Relative welfare gains from a switch to linear taxation for married women, collective model

Fig. 4 Relative welfare gains from a switch to linear taxation for married men, collective model

Table 10 Winners and losers: collective versus unitary model, linear taxation

6. Summary of comparable results for other European countries or other reforms

Similar work was conducted for Belgium, France, Italy, and the UK. Footnote 8 For Germany we also summarize results obtained for other reforms than the move to linear taxation described above (Beninger et al., 2003). All the reform definitions can be found in Section 6 in Myck et al. (2006).

6.1. Belgium

Two baseline situations are considered. The first situation corresponds to predictions obtained from the collective model with the estimated power index and estimated coefficients of leisure interaction. The second situation starts from the calibrated values of these variables (to allow shifts in bargaining power following a reform, the constant in the power index regression is adjusted so that the estimated index corresponds to the calibrated one). Unitary models, including a leisure interaction term, are estimated on both of these data sets. Regular preferences are obtained for only 38% and 24% of the observations, respectively, and adjustments to the marginal propensity to consume are performed in order to correct for this. The fit is poor, with strong under-estimation of labour force participation of females in couples. Qualitative results obtained for the reforms with the two different baselines do not differ much.

The Belgian reform improves the bargaining position of a majority of women. Collective labour supply reactions to the reform are moderate, and the same result emerges when using the unitary model instead. But whereas the unitary model predicts that 97% of the households benefit from the reform, for the collective model there is a Pareto improvement for only 65.5%, and not all of these are winners according to the unitary model. Gini coefficients on equivalent incomes and concentration ratios focusing on income tax, obtained with both types of models, are in agreement qualitatively, although they differ in absolute values. The reform slightly reduces inequality, and increases the concentration of tax payments.

The second reform considered is the introduction of a linear income tax, which replaces both the current tax system and social security contributions. For each model, the flat tax rate was first set at a level of 50%. In a second step, a basic income (applicable to both singles and individuals in couples) was chosen to ensure revenue neutrality. Since labour supply reactions may differ across models, this basic income may also differ (for the first baseline situation, e.g., the basic income associated with the collective model is equal to 2,860 euro; whereas it is equal to 2,830 euro for the unitary model).

The linear taxation also leads to moderate labour supply reactions when predicted with the collective model. For the unitary model, the predicted behavior for women is similar, although more pronounced reactions are obtained, but for men the overall direction of the change is reversed. According to the collective model, the reform is a Pareto improvement for about 30% of the households, disadvantages both spouses for some 20%, and has conflicting impacts for 46%. These ambiguous effects are only very partially captured by the unitary model, for which 42% of the households win and 57% lose. According to both models, the reform increases overall inequality, leaving the concentration ratio almost unchanged.

6.2. France

The effects of the French tax credit reform on labour supplies, as predicted by the collective model, are rather small, with only 5% of wives and 1% of husbands altering their labour supply. Some 10% of these are not recipients of the tax credit after the reform: this type of reaction is purely “collective”, and is ruled out by the unitary setting. The estimation of a unitary model without leisure interaction term (LES) on the collective baseline leads to regular preferences for all households in the sample. Predictions are poor: only 41% of the women have correct labour supply predictions (75% for men). For the positive analysis of the reform, the unitary model under-predicts the changes found with the collective one, but both models agree in finding that the reform does not fulfil its objective, as it does not succeed in increasing participation of low-wage earners, at least if one restricts attention to couples. For 19% of the couples, the reform appears indifferent from the unitary point of view, whereas he wins and she loses for the collective model. For 32% the reform is preferred (U), whereas she wins and he loses (C). For only 23% total agreement is found, in that the reform is preferred for (U) and Pareto improving for (C). Contradictions concern only 1% of households.

6.3. Germany

Beyond the linear tax reform, for which results were reported above, the German Tax Reform 2000 and a revenue-neutral move from joint to individual taxation were also analyzed.

Taking (C) as reference, (U) yields about 33% correct predictions for reactions to the German tax reform and to individual taxation, and 40% for reactions to linear taxation. Overall, (U) predicts more changes than (C).

The German tax reform 2000 has negative welfare effects for 23% of individuals in couples, with equal shares of men and women in this percentage. Overall the reform is more beneficial to women (71% win) than to men (53%). For the unitary model 81% of the households benefit from the reform, but the collective model shows only 66% Pareto winners (+ +), less than 1% of Pareto losers, and 22% of conflicting changes (+− and −+). For this reform there are few (3.4%) outright contradictions between the two models ((U+, C− −), (U+, C− =), (U+, C= −), (U+, C= =), etc.).

For individual taxation, revenue-neutrality is obtained by multiplying the tax liability by a factor of .942 for the collective model, and by a factor of .894 for the unitary model. Thus, the tax burden is predicted to shift from singles to couples in a much more pronounced way when using the unitary model. The collective model shows that the only decile of the distribution of pre-reform equivalent incomes in which a majority of women prefer the reform to the status quo is the highest decile. For men, there are some large gains and losses (measured in relative terms) at all deciles. The unitary model finds the largest percentage of winners in the highest two deciles, but also the largest losses. There are only 3.5% Pareto winners, but 56% Pareto losers, and conflicting effects arise for 32% of the households. Some 8% contradictions arise between the two models.

6.4. Italy

The estimation of a unitary model leads to poor predictions on a sample of Italian households. Working hours predictions are correct for only 34% of men and 57% of women. The estimated unitary model tends to smooth the distribution of labour supply, leading to over-prediction for men and under-prediction for women. In particular, 13% of wives are predicted to be out of the labour market whereas they are working full time, and more than 50% of men are predicted working 10 h more than actual.

Two tax reforms were evaluated, namely the 2002 tax system and a linear income tax. The first one is not expected to be revenue neutral, due to both a reduction in tax rates and a substantial increase in tax credit for children and for employment income. The hypothetical linear tax system is defined to ensure revenue neutrality. In both reforms the largest discrepancies between the collective and the unitary model are found in the male labour supply, as only 35% (31%) of the cases are on the main diagonal when the 2002 reform (the linear income tax) is introduced in the simulation exercise. Moreover, husbands are expected to react much more to the first reform according to the unitary model than what is predicted by the collective framework. The unitary model also predicts a higher drop in women’s participation.

6.5. UK

The unitary model is estimated on the collective baseline as derived in Myck et al. (2006). The household utility function includes a leisure interaction term and the estimated coefficients give rise to regular preferences for the whole sample. The unitary specification does remarkably well in predicting the underlying (collective) distribution of hours. Indeed, hours are predicted accurately for 58% of men and 64% of women, and with a 10 h margin of error for 94.5% of men and 96% of women. Relative to the collective model, however, the unitary specification overpredicts nonparticipation. While 2.9% of men and 10.2% of women are not employed in the collective baseline, the figures are 7.6% and 13.2%, respectively, in the unitary model.

The unitary model does not allow us to model the distribution of resources between partners and therefore to distinguish between two variants of the Working Families Tax Credit (WFTC) reform: one where the benefit is paid to the main carer and the other where it is paid to the main earner. Simulation of the WFTC using the unitary model results in a smaller labour market response to the reform than either of the variants simulated in the collective world. For 20.4% of the 2,619 couples with children in our sample, the reform leads to a change in the number of working hours (compared with 35.0% for either of the variants of the WFTC simulation using the collective model). As we would expect, all of those who change their hours of work choose a combination of male and female hours at which they can claim the new credit. Most couples who change their labour supply reduce their hours of work and, as expected, there is a significant shift from two- and no-earner couples to one-earner families. The main difference between the unitary and the collective simulations is the much greater effect of the reform on the employment of men in the unitary simulation. While the collective model predicts an increase in male participation from 96.3% to 98.2% (when the WFTC is paid to the main carer) or 97.7% (when it is paid to the main earner), the unitary model predicts an increase from 89.4% to 94.6%. The percentage point change in the female employment rate is similar across models: while the collective model predicts a reduction from 85.6% to either 79.5% or 80.4%, the simulation using the unitary model results in a reduction of the female employment rate from 82.4% to 76.0%.
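The percentage point changes implied by these participation rates can be checked with a short calculation. The rates are those reported in the paragraph above; the dictionary labels and the helper `pp_change` are illustrative names only.

```python
def pp_change(before, after):
    """Percentage point change between two rates given in percent."""
    return round(after - before, 1)


# Participation rates (%) before and after the WFTC, as reported in the text.
male = {
    "collective, paid to main carer": (96.3, 98.2),
    "collective, paid to main earner": (96.3, 97.7),
    "unitary": (89.4, 94.6),
}
female = {
    "collective, paid to main carer": (85.6, 79.5),
    "collective, paid to main earner": (85.6, 80.4),
    "unitary": (82.4, 76.0),
}

male_pp = {model: pp_change(*rates) for model, rates in male.items()}
female_pp = {model: pp_change(*rates) for model, rates in female.items()}

for sex, changes in (("men", male_pp), ("women", female_pp)):
    for model, change in changes.items():
        print(f"{sex:5s} {model:32s} {change:+.1f} pp")
```

For men the unitary change (+5.2 points) is roughly three to four times the collective changes (+1.9 and +1.4 points), whereas for women the three figures (-6.1, -5.2 and -6.4 points) lie close together.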

6.6. Summary

An important result is that all studies which estimate a unitary model (for Belgium, France, Germany, Italy and the UK) find substantial distortions arising from its use in predicting the positive and normative effects of reforms. The study for Spain (Raquel Carrasco and Javier Ruiz-Castillo, 2002) also produces evidence of such distortions, by isolating the ‘unitary effects’ within the collective model, that is, the effects obtained when considering only changes in the budget constraint and not the changes in the bargaining position of the spouses.

7. Conclusion

The aim of this study has been to illustrate the distortions in policy evaluation entailed by the use of unitary model estimates when the underlying data obey collective rationality. We have addressed this question by estimating a unitary model on realistically simulated micro data for Germany within a collective setup.

A comparison of the collective data and the unitary predictions showed that in the baseline situation, on average, labour supply is underpredicted by the unitary model. In total, only a third of female labour supply decisions are correctly predicted (the corresponding figure for men is 40%). Much more interesting for policy evaluation are the distortions in predicted labour supply adjustments in reaction to a tax reform. For the move to linear taxation, and taking the collective predictions as the reference, the unitary predictions for the labour supply variation are correct for only about half of the wives. In the collective setting, the labour supply of married women is more responsive to this reform than that of their husbands, whereas the unitary model predicts that more men will alter their hours of work. Thus, when policy evaluation is based on unitary estimates while the data are generated by the collective model, the changes in hours are underestimated for wives and overestimated for husbands. Our results also show that the design of revenue neutral reforms itself may be heavily distorted by the use of a unitary model on collective data.

Turning to the normative aspects of reform evaluation, we compare changes in household utility predicted by the unitary model and changes in individual utility predicted by the collective model. We find that these predictions are contradictory for more than 16% of the households (e.g., a decrease in household utility according to the unitary model, while the reform is seen as Pareto-improving at the household level according to the collective model). Another distinguishing trait of the collective model is that it allows for diverging effects on the welfare of the two partners, whereas the unitary model is silent on this issue. It turns out that the move to a linear tax system in Germany creates conflicting welfare effects for 29% of the households.
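The two shares reported above can be read as simple sign classifications. The following sketch assumes that a household counts as "contradictory" when the unitary utility change has the opposite sign to a collective Pareto verdict, and as "conflicting" when the spouses' utility changes have opposite signs. These operational definitions, the function `classify` and the four sample households are assumptions made for illustration and are not the paper's implementation.

```python
def classify(d_household, d_wife, d_husband):
    """Return (contradictory, conflicting) flags for one household.

    d_household: utility change predicted by the unitary model.
    d_wife, d_husband: individual utility changes from the collective model.
    Only the signs of the changes are used.
    """
    # Collective verdicts at the household level (assumed definitions).
    improving = min(d_wife, d_husband) >= 0 and max(d_wife, d_husband) > 0
    worsening = max(d_wife, d_husband) <= 0 and min(d_wife, d_husband) < 0
    # The unitary sign contradicts a clear collective verdict.
    contradictory = (improving and d_household < 0) or (worsening and d_household > 0)
    # One spouse gains while the other loses.
    conflicting = d_wife * d_husband < 0
    return contradictory, conflicting


# Hypothetical utility changes: (unitary household, wife, husband).
households = [
    (-0.2, 0.1, 0.3),    # unitary loss, collective Pareto improvement
    (0.4, 0.2, -0.1),    # spouses affected in opposite directions
    (0.3, 0.1, 0.2),     # both models agree on a gain
    (-0.1, -0.2, -0.3),  # both models agree on a loss
]
flags = [classify(*h) for h in households]
share_contradictory = sum(c for c, _ in flags) / len(flags)
share_conflicting = sum(k for _, k in flags) / len(flags)
print(f"contradictory: {share_contradictory:.0%}, conflicting: {share_conflicting:.0%}")
```

Households with conflicting individual effects receive no Pareto verdict in this sketch, which is why they are never flagged as contradictory.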

These results must still be considered as only illustrative, because there are several shortcomings in our approach: we do not take into account household production; the calibration approach for the identification of the spouses’ weights and the cross-leisure term may capture part of the unobservable heterogeneity in the model; and the specification of the income sharing rule may be problematic. Nevertheless, they amply document the rewards to be expected from further work on the estimation of multi-person household models in realistic settings.