1 Introduction

Demand models play an important role in the evaluation of indirect tax policy reform (Banks et al. 1997). While optimal tax design has been shown to be highly dependent on the choice of demand system,Footnote 1 the aim of this paper is to test whether the results of marginal tax reform analysis are sensitive to changes in the underlying consumer demand system.

The sensitivity of marginal tax reform results to different demand systems has been tested before. Decoster and Schokkaert (1990) use Belgian data to test tax reform results based on elasticities estimated from the Almost Ideal Demand System (AIDS), Rotterdam, CBS and the linear expenditure system (LES). They also test the sensitivity of the results to the imposition of the theoretical constraints based on consumer utility maximisation. Their study shows that it is possible to draw relevant policy conclusions that hold for all specifications of demand. Madden (1996) also examines the sensitivity of tax reform results to different demand systems, in particular to assumptions concerning the stochastic and dynamic specification of the underlying consumer demand system. He concentrates mainly on different specifications of the AIDS model, although he also addresses issues similar to those in Decoster and Schokkaert by testing the sensitivity of his results to the use of the Rotterdam, CBS and LES models. The studies of Decoster and Schokkaert, and Madden, support Ahmad and Stern’s (1984) conjecture that optimal tax reform is less sensitive than optimal tax design to changes in the underlying demand specification.

It is important to distinguish between optimal tax design and optimal tax reform from the outset. Optimal tax design refers to a scenario where taxes are designed optimally without regard to the current structure of taxes. It is concerned with the design of taxes as if we were tasked with designing a tax system given no knowledge of the current tax system. Optimal tax reform attempts to identify the direction of social welfare increasing commodity tax changes to the existing structure of taxes. Ray (1997) succinctly notes that optimal taxation can be viewed as the limiting state of a sequence of tax reforms when there is no further possibility of social welfare increasing tax changes. In the optimal tax design literature, Deaton (1981), and later Ray (1986) and Majumder (1988), show that optimal tax rate results can be driven by the demand system chosen. Deaton shows how assumptions regarding separability can drive optimal tax rate results. In an application to India, Ray tests the sensitivity of optimal tax design results to changes in the functional form of demand and recommends against using the LES, while Majumder again tests sensitivity to a PIGLOG demand system.

So what does this paper add to the literature? Since the analysis of Decoster and Schokkaert and Madden, Banks et al. (1997) introduced the Quadratic AIDS (QUAIDS) model to the demand literature. While Buse (1994) suggested that AIDS had become “the model of choice” for demand analysts, more recently Jansky (2013) argues that “nowadays there are two main demand systems”—the AIDS and the QUAIDS models. Decoster and Vermeulen (1998) suggest that QUAIDS has theoretical properties, such as the ability to capture more variety in Engel curves than rank two systems such as AIDS and Rotterdam, which can on their own be a justification for the use of the system. However, the sensitivity of marginal tax reform results to this extension of the AIDS model has not yet been tested. In practice, tax reform analysts often make an ex ante choice between these models. Therefore, given the current popularity of the AIDS and QUAIDS models, it is important to understand the sensitivity of marginal tax reform analysis to the choice between the models.

Similarly, the use of micro-data (rather than aggregate data as used in Madden and Decoster and Schokkaert) has become more common in estimating demand systems. The advantages of using micro-data in this setting are described in Blundell et al. (1993). Among other issues, they suggest that the use of aggregate data will exclude many important aggregation factors such as the proportion of total expenditure associated with particular family size, tenure group, or employment status. As Blow (2003) argues, demographics can have an important role in accurately estimating elasticities. In this paper, we use Ray’s (1983) price-scaling technique to include demographics in the demand system. Not only does this method adjust total expenditure in accordance with the size and composition of a household (the usual equivalence scale approach), it also allows the demand for certain goods to change as household composition changes. As far as we are aware, the sensitivity of tax reform results to the inclusion of demographics in the underlying demand system has not yet been tested.

While the use of micro-data allows demographics to affect consumer demand, one must also deal with the issue of observed zero-expenditures in the data, which can lead to biases in estimation if left unaccounted for. Zero-expenditures are particularly common for certain goods in household budget survey data such as tobacco, alcohol and clothing. With goods such as these, zero-expenditures can arise due to infrequent purchase (more likely in the case of clothing), withdrawal from the market or under-reporting of expenditure (both more likely in the case of alcohol and tobacco). Again, we can test the sensitivity of tax reform results to this adjustment in the underlying demand system. The use of these three methods, namely the QUAIDS extension, the adjustment for zero-expenditures and the inclusion of demographics, is increasingly popular in the demand literature. This paper attempts to measure their impact on tax reform analysis.

Using Ahmad and Stern’s (1984) model of marginal tax reform, we test the sensitivity of tax reform results to these three key advances in modelling consumer demand. Ahmad and Stern’s model bases its tax reform recommendations on equity and efficiency concerns. In reality, taxation on certain goods may be motivated by other criteria, such as merit good arguments. Schroyen (2010) proposes an extension to Ahmad and Stern’s model, which allows for merit good considerations to affect the tax reform recommendations. In the final part of the paper, we examine whether the tax reform recommendations are sensitive to the inclusion of merit good arguments, and whether the results from the merit good model themselves are sensitive to the choice of underlying demand system. We use Irish data to perform the analysis. Our data range from 1987 to 2009/2010.

The paper is laid out as follows: In Sect. 2, we detail the marginal tax reform model used in the analysis. We also present the different specifications of consumer demand from which we test the sensitivity of the tax reform results. We introduce the data in Sect. 3. The results of the paper are presented in Sects. 4 and 5, before Sect. 6 concludes.

2 Marginal tax reform model

2.1 Marginal tax reform

The primary tax reform model used in this paper is that of Ahmad and Stern (1984). This model attempts to find revenue-neutral welfare-improving directions for marginal indirect tax reform, taking equity and efficiency considerations into account. The key concept used in the model is that of the marginal revenue cost (\(MRC\)) of taxation of each good.Footnote 2 The model assumes that for any indirect tax change, there are two effects: a change in revenue, \({\partial {R}}/{\partial {t_{i}}}\), and a change in consumer welfare, \({{\partial {V}}}/{\partial {t_{i}}}\). The ratio of these effects is known as the \(MRC\) of taxation of each good. We wish to identify a vector of tax changes such that \(\hbox {d}V\ge 0\) and \(\hbox {d}R\ge 0\), with one inequality holding strictly. Such a vector of tax changes exists if the marginal cost in terms of revenue of an extra unit of social welfare raised via the \(i\hbox {th}\) good exceeds that for the \(j\hbox {th}\) good (\(MRC_i > MRC_j\)). The expression for the \(MRC\) of a unit of welfare raised through a change in the indirect tax on good \(i\) is therefore:

$$\begin{aligned} MRC_{i}=-\frac{{\partial {R}}/{\partial {t_{i}}}}{{{\partial {V}}}/{\partial {t_{i}}}} \end{aligned}$$
(1)

where the minus is inserted to denote marginal cost. At the optimum, \(MRC_{i}\) should be equal for all \(i\), so that no revenue-neutral welfare-improving tax reform exists. If not, it is possible to make a revenue-neutral welfare-improving changeFootnote 3 to the indirect tax system, by raising the tax on the good with the high \(MRC\) and lowering the tax on the good with the low \(MRC\). It is therefore the ranking of goods by \(MRC\), rather than the nominal values of the \(MRC\), that is important in making tax reform recommendations.

Ahmad and Stern show that the \(MRC\) can be estimated asFootnote 4:

$$\begin{aligned} MRC_{i}=\frac{q_{i}X_{i}}{\sum \nolimits _{h}\beta ^{h}q_{i}x_{i}^{h}}+\frac{\sum \nolimits _{k}\tau _{k}q_{k}X_{k}\epsilon _{ki}}{\sum \nolimits _{h}\beta ^{h}q_{i}x_{i}^{h}} \end{aligned}$$
(2)

where \(q_i\) is the consumer price of good \(i\), \(q_{i}x_{i}^{h}\) is the expenditure of household \(h\) on good \(i\), and \(q_{i}X_{i}\) is aggregate expenditure on good \(i\). \(\tau _{k}\) is the tax on good \(k\) as a proportion of the consumer price, and \(\epsilon _{ki}\) is the uncompensated cross-price elasticity of good \(k\) with respect to the price of good \(i\). Importantly, all of the terms required to estimate the \(MRC\) are available directly or indirectly in the data, making marginal tax reform analysis such as this “one of the most practical applications of public economics” (Schroyen 2010).
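As an illustration of how Eq. 2 can be evaluated once its ingredients are in hand, the following is a minimal sketch in Python. The function name, the array layout and the numbers in the usage note are assumptions made for this example only; survey weights are ignored, and aggregate expenditure is taken to be the simple sum of household expenditures.

```python
import numpy as np

def marginal_revenue_cost(x, beta, tau, eps):
    """Evaluate Eq. 2 for every good (illustrative sketch, not the authors' code).

    x    : (H, n) array, household expenditure q_i * x_i^h
    beta : (H,) array, welfare weights beta^h
    tau  : (n,) array, tax as a proportion of consumer price
    eps  : (n, n) array, eps[k, i] = uncompensated elasticity of good k
           with respect to the price of good i
    """
    x = np.asarray(x, dtype=float)
    beta = np.asarray(beta, dtype=float)
    tau = np.asarray(tau, dtype=float)
    eps = np.asarray(eps, dtype=float)

    X = x.sum(axis=0)            # aggregate expenditure q_i * X_i (assumed unweighted sum)
    welfare = beta @ x           # sum_h beta^h q_i x_i^h, one entry per good i
    indirect = (tau * X) @ eps   # sum_k tau_k q_k X_k eps_ki, one entry per good i
    return (X + indirect) / welfare
```

With two hypothetical households and two goods, `marginal_revenue_cost([[10, 0], [30, 20]], [1.0, 0.5], [0.1, 0.5], [[-0.5, 0.1], [0.2, -1.0]])` evaluates both terms of Eq. 2 for each good. Setting all elasticities to zero and all welfare weights to one returns an \(MRC\) of one for every good.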

The welfare weights, \(\beta ^{h}\), determine how much weight is placed on the welfare of each household when assessing the direction for marginal tax reform. In this case, the welfare weights come from a commonly used utility of income function due to Atkinson (1970):

$$\begin{aligned} U^{h}(I^{h})= & {} \frac{k(I^{h})^{1-e}}{1-e} \quad \hbox {if} \; e\ge 0 \quad \hbox {and} \quad e \ne 1 \end{aligned}$$
(3)
$$\begin{aligned} U^{h}(I^{h})= & {} k \log (I^{h})\quad \hbox {if}\; e= 1 \end{aligned}$$
(4)

where \(I^{h}\) is the total expenditure per equivalent adultFootnote 5 of household \(h\), \(e\) is a parameter reflecting inequality aversion, and \(k\) is chosen for normalisation. In practice, we normalise the welfare weights so that the poorest householdFootnote 6 has \(\beta ^{h}=1\). We therefore measure the welfare weight of household \(h\) relative to the poorest household:

$$\begin{aligned} \beta ^{h}=\left( \frac{I^{1}}{I^{h}}\right) ^{e} \end{aligned}$$
(5)

where \(I^1\) is the expenditure of the poorest household. Higher values of \(e\) result in more relative weight on the welfare of the poorest household. When \(e=0\), society has no aversion to inequality, so all households have equal welfare weight. As \(e\) moves to infinity, we get closer to the Rawlsian case, where only the poorest household has a nonzero welfare weight. In this paper, we perform the analysis for a range of values of \(e\) between 0 and 5.
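The welfare weights in Eq. 5 are simple to compute. The sketch below is illustrative only: the function name is hypothetical, and the input is assumed to be a vector of equivalised total expenditures, one entry per household.

```python
import numpy as np

def welfare_weights(expenditure, e):
    """Eq. 5: beta^h = (I^1 / I^h)^e, with I^1 the poorest household's expenditure."""
    I = np.asarray(expenditure, dtype=float)
    return (I.min() / I) ** e
```

For hypothetical expenditures of 100, 200 and 400, \(e=1\) gives weights of 1, 0.5 and 0.25, while \(e=0\) gives every household a weight of one.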

The key term of Eq. 2 from the perspective of this analysis is \(\epsilon _{ki}\). Given that the majority of the components of the \(MRC\) expression are directly observable in data or involve straightforward data calculations (tax rates, household expenditure, aggregate expenditure, welfare weights), the key source of potential sensitivity in the ranking of goods by \(MRC\) is in the estimation of consumer demand responses, \(\epsilon _{ki}\). The estimation of this term, and the consequences for the tax reform recommendations arising from these estimations,Footnote 7 is therefore the focus of this analysis.

If we examine Eq. 2 in more detail, we can see the balance between equity and efficiency considerations in this model. Assuming \(\epsilon _{ki}=0\) for all \(k\) and \(i\), the focus is on equity considerations alone. The inverse of the first term on the right-hand side of the equation is known as the distributional characteristic. The distributional characteristic describes the level of concentration of expenditure on a good among poor households. With \(e\) greater than zero, a higher distributional characteristic of a good suggests that expenditure on that good is particularly concentrated among poor households. As \(e\) increases, the consumption by the poorest households gets a higher and higher weight. The distributional characteristic gives the optimal direction of tax reform taking only equity considerations into account.

Alternatively, if we assume \(e=0\), the distributional characteristic for all goods will be equal (to one), so tax reform recommendations will be based purely on efficiency considerations. For a given set of elasticities, a higher value of \(e\) places more relative weight on equity considerations. Therefore, it will be for relatively lower values of \(e\) that we may see the largest effects of changes in the specification of the underlying demand system.
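To make the efficiency-only case explicit, note that \(e=0\) implies \(\beta ^{h}=1\) for every household in Eq. 5. Assuming aggregate expenditure is the sum of household expenditures, so that \(\sum \nolimits _{h}q_{i}x_{i}^{h}=q_{i}X_{i}\), Eq. 2 reduces to:

$$\begin{aligned} MRC_{i}\big |_{e=0}=1+\frac{\sum \nolimits _{k}\tau _{k}q_{k}X_{k}\epsilon _{ki}}{q_{i}X_{i}} \end{aligned}$$

In this case, differences in \(MRC\) across goods arise only from the tax rates and the elasticities.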

2.2 Applications of the model

Ahmad and Stern’s model of marginal indirect tax reform has been applied in a number of settings. Madden (1995a) applies the model to the Irish indirect tax system in the 1980s. He finds that there was considerable scope for indirect tax reforms at the margin for both 1980 and 1987. Similarly, Kaplanoglou and Newbery (2003) apply the model to the Greek indirect tax system using household-level data from 1987/1988. The authors found that the indirect tax structure in Greece is unnecessarily complicated and inefficient, without achieving any redistributive goals. They compared the indirect tax system in Greece with that in the UK and found that the UK indirect tax structure was simpler, more equitable and more efficient to implement and administer when simulated on Greek consumers. Ahmad and Ludlow (1989) use the model to show that a value-added tax would make Pakistan’s tax system more buoyant and reduce the production distortions inherent in Pakistan’s current tax system—and not at the expense of the poor. Emran and Stiglitz (2005) use the model to examine a “consensus” on indirect tax reform in developing countries that favours a reduction in trade taxes with an increase in VAT to raise revenue. Taking account of the informal economy, they find that contrary to the current consensus, the standard revenue-neutral selective reform of trade taxes and VAT reduces welfare under plausible conditions. They argue that a VAT base broadening with a revenue-neutral reduction in trade taxes may also reduce welfare.

Ahmad and Stern also use their model to estimate the implicit level of inequality aversion in the Indian indirect tax system, in what is known as the inverse optimum technique. Christiansen and Jansen (1978), and later Madden (1992), use versions of the inverse optimum technique to estimate the level of inequality aversion implicit in the indirect tax systems of Norway and Ireland, respectively. Rather than looking for optimal directions of change in the indirect tax system, the inverse optimum technique assumes that the current system is optimal and finds the level of inequality aversion that satisfies the optimality condition of all \(MRCs\) being equal. The model has also been applied to the Canadian indirect tax system by Cragg (1991). Using micro-data, Cragg finds that the Canadian indirect tax system was neither progressive nor regressive. He argues that the Canadian government showed no aversion to cross-class income inequality.

2.3 Estimating consumer demand responses

The majority of the information required to estimate the \(MRCs\) from Eq. 2 is available directly from various sources of data. The terms in the equation that require estimation are the uncompensated cross-price elasticities, \(\epsilon _{ki}\). This is therefore the area where there is greatest scope for sensitivity in the results of the tax reform analysis.

The central demand systems in this paper are Deaton and Muellbauer’s (1980) AIDS and Banks et al.’s (1997) QUAIDS. The AIDS model relates budget shares of various commodities linearly to the logarithm of real total expenditure and the logarithms of relative prices. The key extension to this model introduced by Banks et al. was the inclusion of a quadratic term in total expenditure, which allows for curvature in the Working-Leser Engel curve. The QUAIDS model can be written:

$$\begin{aligned} w_{i}=\alpha _{i}+\sum \limits _{j=1}^{n}\gamma _{ij}\ln p_{j} +\kappa _{i}\ln \left[ \frac{m}{a(p)}\right] +\frac{\lambda _{i}}{b(p)}\left( \ln \left[ \frac{m}{a(p)}\right] \right) ^{2} \end{aligned}$$
(6)

where \(w_i\) is the budget share for good \(i\), \(p_j\) is the relative price of good \(j\), and \(m\) is total expenditure.Footnote 8 \(\alpha _{i}\), \(\gamma _{ij}\), \(\kappa _i\) and \(\lambda _i\) are the parameters to be estimated. \(\ln a(p)\) has the translog form:

$$\begin{aligned} \ln a(p)=\alpha _{0}+\sum \limits _{i=1}^{n}\alpha _{i}\ln p_{i} +\frac{1}{2}\sum \limits _{i=1}^{n}\sum \limits _{j=1}^{n}\gamma _{ij}\ln p_{i}\ln p_{j} \end{aligned}$$
(7)

and the Cobb–Douglas price aggregator, \(b(p)\), is defined:

$$\begin{aligned} b(p)=\prod \limits _{i=1}^{n}p_{i}^{\kappa _{i}} \end{aligned}$$
(8)

As recommended by Deaton and Muellbauer (1980), \(\alpha _{0}\) is set to just below the minimum observed value of total expenditure.

Equation 6 shows clearly the extension proposed by Banks et al. While the QUAIDS model allows for a higher-order term in total expenditure (\(\lambda _{i} \ne 0\)), setting \(\lambda _{i}=0\) results in the AIDS model. Including higher-order terms in expenditure is an important extension for some of the goods in our analysis, as is shown in Sect. 3.3.

Using the parameter estimates in the models above, we can calculate the uncompensated price elasticities and the budget elasticities. First we differentiate Eq. 6 with respect to \(\ln m\) and \(\ln p_{j}\) to obtain

$$\begin{aligned} \mu _{i}\equiv & {} \frac{{\partial {w_{i}}}}{\partial {\ln m}}=\kappa _{i} +\frac{2\lambda _{i}}{b(p)}\left( \ln \left[ \frac{m}{a(p)}\right] \right) \end{aligned}$$
(9)
$$\begin{aligned} \mu _{ij}\equiv & {} \frac{{\partial {w_{i}}}}{\partial {\ln p_{j}}}=\gamma _{ij}-\mu _i\left( \alpha _{j} +\sum \limits _{k}\gamma _{jk}\ln p_{k}\right) -\frac{\lambda _{i}\kappa _{j}}{b(p)}\left( \ln \left[ \frac{m}{a(p)}\right] \right) ^{2} \end{aligned}$$
(10)

The budget elasticities and uncompensated price elasticities can then be calculated by:

$$\begin{aligned} \epsilon _{i}= & {} \frac{\mu _{i}}{w_{i}}+1 \end{aligned}$$
(11)
$$\begin{aligned} \epsilon _{ij}^{u}= & {} \frac{\mu _{ij}}{w_{i}}-\delta _{ij} \end{aligned}$$
(12)

respectively, where \(\delta _{ij}\) is the Kronecker delta.
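The chain from parameters to elasticities in Eqs. 6 to 12 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the estimation code used in the paper: the function name, the argument layout and the default `alpha0=0.0` are hypothetical, and the parameters are taken as given rather than estimated.

```python
import numpy as np

def quaids_elasticities(alpha, gamma, kappa, lam, ln_p, ln_m, alpha0=0.0):
    """Budget shares, budget elasticities and uncompensated price elasticities
    for one household, following Eqs. 6-12 (illustrative sketch)."""
    alpha, kappa, lam, ln_p = (np.asarray(v, dtype=float)
                               for v in (alpha, kappa, lam, ln_p))
    gamma = np.asarray(gamma, dtype=float)

    ln_a = alpha0 + alpha @ ln_p + 0.5 * ln_p @ gamma @ ln_p   # Eq. 7
    b = np.exp(kappa @ ln_p)                                   # Eq. 8
    y = ln_m - ln_a                                            # ln[m / a(p)]

    w = alpha + gamma @ ln_p + kappa * y + lam / b * y ** 2    # Eq. 6
    mu = kappa + 2.0 * lam / b * y                             # Eq. 9
    # Eq. 10: row i, column j
    mu_ij = (gamma
             - np.outer(mu, alpha + gamma @ ln_p)
             - np.outer(lam, kappa) / b * y ** 2)

    budget = mu / w + 1.0                                      # Eq. 11
    uncomp = mu_ij / w[:, None] - np.eye(len(w))               # Eq. 12
    return w, budget, uncomp
```

Setting `lam` to a vector of zeros gives the AIDS elasticities. When the parameters satisfy the adding-up, homogeneity and symmetry restrictions, the returned shares sum to one, the share-weighted budget elasticities sum to one, and each row of the uncompensated elasticity matrix sums to minus the corresponding budget elasticity, which gives a convenient check on any implementation.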

In order to ensure the demand system is compatible with the theory of utility maximisation, we impose a number of parameter restrictions when estimating the models. Homogeneity and symmetry are imposed on the model by restricting the parameters as (with \(n\) goods):

$$\begin{aligned} \sum \limits _{i=1}^{n}\gamma _{ij}=0 \quad \hbox {and} \quad \gamma _{ij}=\gamma _{ji} \end{aligned}$$
(13)

respectively.

Adding-up requires that \(\sum {w_{i}}=1\), which is satisfied provided:

$$\begin{aligned} \sum \limits _{i=1}^{n}\alpha _{i}=1 ,\quad \sum \limits _{i=1}^{n}\kappa _{i}=0 \quad \hbox {and} \quad \sum \limits _{i=1}^{n}\lambda _{i}=0 \end{aligned}$$
(14)

Of course, the restriction on \(\lambda _i\) only applies when estimating QUAIDS. Unrestricted estimation of the models will only satisfy the adding-up conditions given that, by construction, we have \(\sum {w_{i}}=1\). The symmetry condition is imposed by setting cross-equation restrictions when estimating the model. The homogeneity restriction is imposed by omitting one equation in estimation. The parameter estimates from the remaining equations can then be used to estimate the parameters of the omitted equation. Barten (1969) shows that the choice of omitted equation makes no difference to the parameter estimates.

2.3.1 Demographics

Blow (2003) suggests that demographics can be expected to affect the allocation of household expenditure among goods, at the very least because of economies of scale as household size increases and because different people have different needs. Specifying demographic effects correctly in demand analysis is important both in order to estimate correct price and expenditure elasticities and for the purpose of making household welfare comparisons.

To include demographicsFootnote 9 in the model, we follow Ray’s (1983) equivalence scale approach.Footnote 10 Ray’s method uses for each household an expenditure function that yields budget share equations of the form:

$$\begin{aligned} w_{i}= & {} \alpha _{i}+\sum \limits _{j=1}^{n}\gamma _{ij}\ln p_{j} +(\kappa _{i}+\eta _{i}^{'}z)\ln \left[ \frac{m}{\bar{m}_{0}(z)a(p)}\right] \nonumber \\&+\frac{\lambda _{i}}{b(p)c(p,z)}\left( \ln \left[ \frac{m}{\bar{m}_{0}(z)a(p)}\right] \right) ^{2} \end{aligned}$$
(15)

where \(\bar{m}_{0}(z)\) is parameterised as \(1+\delta ^{'}z\), \(\delta \) is a vector of parameters to be estimated, and \(c(p,z)=\prod \nolimits _{j=1}^{n}p_{j}^{\eta _{j}^{'}z}\). The adding-up condition requires that \(\sum \nolimits _{j=1}^{n}\eta _{rj}=0\) for \(r=1,\ldots ,s\), where \(s\) is the number of demographic variables in \(z\). This model is commonly referred to as the PS-QUAIDS model. In the case where \(\lambda _{i}=0\), we have the PS-AIDS model.

It is important to distinguish here between the inclusion of demographics in the demand system and the use of equivalence scales based on family composition in a more general sense. The use of standard equivalence scales adjusts income or expenditure to allow for the fact that the needs of a household grow with each additional member but not in a proportional way. Ray’s price-scaling technique allows both for economies of scale in consumption, as with standard equivalence scales, and for changes in patterns of consumption among different household compositions. For example, consumption of goods such as alcohol and tobacco may fall when a household changes from a two-adult household to a two-adult, one-child household. This reflects Rothbarth’s (1943) idea that consumption of “adult” goods is negatively affected by the presence of children in the household. He argues that incorporation of a child into the family involves fresh expenditure, which is financed by reducing the budget for goods not consumed by children. In that sense, Ray’s approach captures both the income effect of changes in the composition of a household and the substitution effect, which can change the marginal rate of substitution between goods. Ray argues, for example, that the presence of children in poor households contributes to the inability to substitute necessities in favour of luxuries with below average prices. Including demographics in the demand system will allow us to estimate elasticities taking these factors into account.

2.3.2 Zero-expenditures

The use of micro-data to estimate the elasticities means that, for certain goods, we observe a mass of zero-expenditures in the data. Ignoring this problem may lead to inconsistent estimates of the demand system parameters due to the nonlinearity in the conditional mean of \(w_{it}\) (Wooldridge 2002). Shonkwiler and Yen (1999) suggest a two-step procedure for dealing with observed zero-expenditures. In the first step, we estimate probability density and cumulative distribution values in a single-equation probit model, where the dependent variable (\(d_{it}\)) is a binary indicator of positive expenditure.

$$\begin{aligned} d_{it}^{*}=Z_{it}^{'}\pi _i+\nu _{it} \end{aligned}$$
(16)

The independent variables (\(Z_{it}\)) in this first step are the demographics that enter the demand system through Ray’s technique, in addition to the working status of the head of household and spouse, dwelling type, household tenure and an urban/rural identifier.

The probit estimates of the first step are then used in the second step to estimate:

$$\begin{aligned} w_{it}=\varPhi (Z_{it}^{'}\hat{\pi _i})f(X_{it},\nu _i) +\delta _i\varphi (Z_{it}^{'}\hat{\pi _i})+\upsilon _{it} \end{aligned}$$
(17)

where \(\varphi (Z_{it}^{'}\hat{\pi _i})\) is the normal probability density function (pdf) evaluated at \(Z_{it}^{'}\hat{\pi _i}\), \(\varPhi (Z_{it}^{'}\hat{\pi _i})\) is the normal cumulative distribution function (cdf) evaluated at \(Z_{it}^{'}\hat{\pi _i}\), and \(\upsilon _{it}\) is the error term.Footnote 11 The significance of the coefficient on the probability density values indicates whether additional information is provided by using this approach. In our case, \(f(X_{it},\nu _i)\) is specified as either the AIDS or QUAIDS model. Shonkwiler and Yen’s approach has been widely applied and analysed in the consumer demand field.Footnote 12
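The deterministic part of Eq. 17 can be sketched as follows. The example assumes that the first-step probit index \(Z_{it}^{'}\hat{\pi _i}\) has already been obtained from any probit routine and that the demand-system share \(f(X_{it},\nu _i)\) has been evaluated separately; the function names are hypothetical, and the standard normal distribution is assumed.

```python
import math
import numpy as np

def normal_pdf(z):
    """Standard normal density."""
    return np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def second_step_share(index, f_share, delta):
    """Expected budget share in Eq. 17, excluding the error term.

    index   : first-step probit index Z'pi_hat, one entry per household
    f_share : share predicted by the AIDS or QUAIDS function f(X, nu)
    delta   : coefficient on the pdf term for this good
    """
    index = np.atleast_1d(np.asarray(index, dtype=float))
    f_share = np.asarray(f_share, dtype=float)
    return normal_cdf(index) * f_share + delta * normal_pdf(index)
```

For a household that is very likely to purchase the good (a large positive index), the cdf approaches one and the pdf approaches zero, so the adjusted share converges to the unadjusted demand-system share.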

We therefore have six variations of the demand system on which we test the sensitivity of the marginal tax reform results. The AIDS and QUAIDS models are adjusted first for the inclusion of demographics, resulting in the PS-AIDS and PS-QUAIDS models. We also adjust the two models for zero-expenditures using Shonkwiler and Yen’s two-step approach, resulting in the zero-adjusted AIDS (ZA-AIDS) and zero-adjusted QUAIDS (ZA-QUAIDS) models.

3 Data components of the tax reform model

The data used to estimate the \(MRC\) of each good come from a number of sources. Cross-sectional household budget surveys, carried out by the Central Statistics Office (CSO), are used for information on household expenditure. Price data from the CSO is used in estimation of the elasticities, and various publications from the Revenue Commissioners are used to identify the indirect tax on each good in each year of the analysis.

3.1 Expenditure data

Expenditure data come from the Irish Household Budget Survey (HBS). This household-level survey is conducted every five yearsFootnote 13 and includes detailed information on household expenditure, as well as income and socio-economic variables. The main purpose of the survey is to collect detailed household income and expenditure for the purposes of weighting the Consumer Price Index.

For the analysis in this paper, we use five waves of the data, ranging between 1987 and 2009/2010.Footnote 14 The number of households surveyed in the HBS ranges from just below 6000 in 2009/2010 to just below 8000 in 1995. The overall number of households in our analysis is therefore close to 35,000. Weighting variables are included in the data to ensure the sample is representative of the Irish population. Household expenditures are converted to a common price level. When estimating the \(MRCs\), we equivalise household expenditure.Footnote 15

Household expenditure is recorded in great detail in the HBS. In the 2005 wave of the data, for example, there are household-level expenditure data on over 1000 different goods and services. In order to estimate elasticities and calculate \(MRCs\), we need to aggregate the goods into a smaller number of representative goods. We use a number of principles in deciding which goods to aggregate together for the analysis. Hicks’ Composite Commodity Theorem suggests that commodities among which the relative prices are constant can, in a natural way, be treated as a single commodity (Bradford 1974). In order to make the results policy relevant, we also wish to group goods that are relatively homogeneous. For example, white bread and brown bread would be suitable to group together, whereas beer and chairs may not be. Finally, we use the previous literature as a guide in choosing the aggregated goods in order to make our results comparable with previous work on the topic. A trade-off exists between the desire to have as many different goods and services in the analysis as possible and the need to be able to estimate the parameters of the model with precision. We therefore choose six aggregated goods on which to do our analysis. The six goods are food, alcohol, tobacco, clothing, transport and fuel, and services and other goods. The first five goods are relatively homogeneous, while the services and other goods category includes items as diverse as cinema tickets and envelopes. The sixth category is unlikely to satisfy Hicks’ Composite Commodity Theorem. However, including a residual good such as this is commonplace in the marginal tax reform literature. This approach assumes preferences are weakly separable, where maximisation of overall utility implies maximisation of the subutilities subject to whatever is spent on the groups (Deaton 1986).

3.2 Price and tax data

To estimate elasticities, we also need price data. These data come from the CSO’s statbank. We use a price index for the price of each good in the analysis, with December 2011 as the base time period in each case.Footnote 16 Within each wave of expenditure data, we can identify the quarter in which the household was interviewed. We therefore have between five and seven price observations per wave of data. Of course, most variation in price is observed between data years rather than within data years. The tax rate for each good as a percentage of the consumer price is also required to estimate the \(MRCs\). These data come from the Revenue Commissioners’ Statistical Report for each year in question. Table 1 shows the tax rateFootnote 17 on each good between 1987 and 2009. To calculate the tax rate on each aggregate good, we calculate the weighted average of the tax rates on each of the goods that the aggregate good is composed of, where the weight is each individual good’s share of total expenditure on the aggregate good. Tobacco has by far the highest tax rate of the six goods in question. Food and services and other goods are relatively low-tax goods. While the tax rate on most goods remained relatively constant over the sample period, the tax rate on alcohol decreased by almost 10 percentage points. Of course, given that the goods in the analysis are aggregations of a number of goods, changes in the tax rates reflect both actual tax rate changes and changes in expenditure patterns within the aggregated goods. The expenditure patterns within each aggregated good are also affected by the tax rates and are therefore endogenous. This problem is common to all studies dealing with aggregated goods.
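The expenditure-weighted average described above can be written as a short function. The name and the numbers in the usage note are hypothetical and serve only to illustrate the calculation.

```python
def aggregate_tax_rate(tax_rates, expenditures):
    """Expenditure-weighted average tax rate for one aggregate good.

    tax_rates    : tax rate on each component good
    expenditures : total expenditure on each component good (same order)
    """
    total = sum(expenditures)
    return sum(t * x / total for t, x in zip(tax_rates, expenditures))
```

For instance, an aggregate made up of a zero-rated component with expenditure of 75 and a component taxed at 21 per cent with expenditure of 25 has an aggregate rate of 5.25 per cent. If expenditure shifts towards the taxed component, the aggregate rate rises even though no statutory rate has changed, which is the composition effect noted in the text.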

Table 1 Tax rates % 1987–2009

3.3 Elasticities

The nonparametric relationship between total expenditure and budget share for each good gives us an idea about the sign and significance we can expect on certain coefficients in the demand system estimation. Figure 1 shows the Engel curves in Working-Leser form,Footnote 18 which relate budget shares to the log of total (equivalised) expenditure. As can be seen from Eq. 6, the Working-Leser Engel curves form the basis of the AIDS model. Food and tobacco both have downward-sloping Engel curves, suggesting that expenditure on these goods is concentrated among low-expenditure households. For two, and possibly three, of the goods in question, the curvature of the Engel curves provides evidence of the need to include a quadratic term in the demand system estimation.Footnote 19 While we can test the importance of including the higher-order terms in the demand system by testing the statistical significance of \(\lambda _{i}\), the focus of this paper is to examine whether the inclusion of these terms alters the results of the marginal tax reform analysis.
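A quick parametric counterpart to the nonparametric Engel curves is to fit a quadratic in log expenditure to the budget share of a single good. The sketch below is an assumption-laden illustration, not the method used to draw the figure: it ignores prices, demographics and survey weights, and the function name is hypothetical.

```python
import numpy as np

def quadratic_engel_fit(ln_m, w):
    """Least-squares fit of w = a + b*ln(m) + c*ln(m)^2 for one good.

    Returns (a, b, c). A value of c close to zero is consistent with a
    linear Working-Leser curve; a clearly nonzero c points to curvature.
    """
    c, b, a = np.polyfit(np.asarray(ln_m, dtype=float),
                         np.asarray(w, dtype=float), deg=2)
    return a, b, c
```

The coefficient `c` plays a role loosely analogous to \(\lambda _{i}\) in Eq. 6, although the full demand system also deflates expenditure by the price indices \(a(p)\) and \(b(p)\).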

Fig. 1 Working-Leser Engel curves

Table 2 shows the average own-price elasticity for each of the goods under each of the six consumer demand models.Footnote 20 The majority of the own-price elasticities have the expected negative sign. The two exceptions are the food own-price elasticities under the PS-QUAIDS and ZA-QUAIDS models.Footnote 21

Table 2 Own-price (uncompensated) elasticities at average budget shares

The quadratic terms are significant in the estimation of the QUAIDS model: five of the six \(\lambda _i\) terms are significant at the 1% level, and only the \(\lambda _i\) term for tobacco is statistically insignificant. This suggests that the QUAIDS model is appropriate to use with our data. Despite this, the elasticities estimated with the QUAIDS model are very similar to those estimated with the AIDS model. Only the own-price elasticity of tobacco seems implausible, with a value greater than 2 in absolute terms under both models. A review of previous literature (for example, Gospodinov and Irvine 2009; Escario and Molina 2004; Chaloupka and Tauras 2011) suggests that the own-price elasticity of tobacco should be closer to \(-\)0.3. Larger differences between the AIDS and QUAIDS models occur when we include demographics. The demand for alcohol is estimated to be implausibly elastic under the PS-AIDS specification, while PS-QUAIDS estimates alcohol to have a more plausible elasticity of \(-\)0.7. The estimates of the own-price elasticity of tobacco are improved with the inclusion of demographics in both models, although they are still relatively elastic compared with previous research.

With the adjustment for zero-expenditures in the AIDS and QUAIDS models, alcohol becomes significantly more elastic, with an own-price elasticity greater than one in absolute terms. The own-price elasticities of tobacco in both ZA-models are comparable with previous research, although neither is estimated with precision, as we cannot reject the hypothesis that they are equal to zero. Demand for five of the six goods (the exception is food) is more elastic when estimated with the ZA-QUAIDS model than with the ZA-AIDS model, although in most cases the differences are not significant. Given the relatively low number of time periods in the data from which these elasticities are estimated, the standard errors can be quite large. For the majority of elasticities, we therefore cannot reject the hypothesis that they are equal across models. Nonetheless, the point estimates of the own-price elasticities across the six models are plausible in most cases. Empirically, no set of elasticities seems markedly superior to the others. Each set has some elasticities that are in line with the previous literature and some that seem implausible.

We can also compare the budget elasticities and uncompensated cross-price elasticities of the different models in Table 3.Footnote 22 The budget elasticities are very similar between the two models, with none of the differences emerging as statistically significant. Food and tobacco are estimated to be necessity goods (budget elasticity \(<\)1) and clothing and services are estimated as luxury goods (budget elasticity \(>\)1). Fuel and transport is estimated to have a unitary budget elasticity, while the budget elasticity of alcohol is just above unity, at 1.1. Many of the cross-price elasticities estimated by the two models are also not statistically different from each other, although the PS-AIDS model estimates cross-price elasticities as large as \(-\)2.7, which seems large relative to previous literature. It is important to note here that, because the elasticities have associated standard errors, the \(MRCs\) estimated with these elasticities will in turn also have standard errors. To take account of this in policy analysis, it is common to take a concertina-type approach (for example, Madden 1996), analysing only the two goods with the highest \(MRC\)s and the two goods with the lowest \(MRC\)s. Given that the middle-ranking goods are only one ranking position away from a change in the direction of the recommended tax reform, the concertina approach reduces the likelihood of misspecifying the direction of tax reform in the case where \(MRC\) confidence intervals overlap.
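The concertina-type selection can be sketched in a few lines. The function name and the \(MRC\) values below are hypothetical and are not results from this paper; the direction of the recommendation follows the policy rule stated in Sect. 4.2 (raise the tax on high-\(MRC\) goods, lower it on low-\(MRC\) goods).

```python
def concertina(mrc_by_good, n=2):
    """Return (goods to tax more, goods to tax less) from the ranking extremes.

    Only the n highest-MRC and n lowest-MRC goods are retained;
    middle-ranked goods are left out of the recommendation.
    """
    ranked = sorted(mrc_by_good, key=mrc_by_good.get, reverse=True)
    if len(ranked) < 2 * n:
        raise ValueError("need at least 2*n goods")
    return ranked[:n], ranked[-n:]

# Hypothetical MRC values, for illustration only.
example_mrc = {
    "food": 1.02,
    "alcohol": 0.91,
    "tobacco": 0.85,
    "clothing": 1.10,
    "fuel and transport": 0.98,
    "services and other goods": 1.15,
}

raise_tax, lower_tax = concertina(example_mrc)
print("increase tax on:", raise_tax)
print("decrease tax on:", lower_tax)
```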

Table 3 Own-price, cross-price and budget elasticities from PS-AIDS and PS-QUAIDS, at average budget shares

With six sets of elasticities, and a reasonable degree of variation between the sets, a tax analyst may in practice face a choice about which set to use. In the case where demand models are nested (for example, AIDS is nested in QUAIDS), econometric tests can be used to determine the most appropriate model. Indeed, the quadratic terms in expenditure are statistically significant in the QUAIDS model, which suggests the use of QUAIDS over AIDS in this case. This process becomes more complicated, however, when we introduce the demographics and zero-expenditure adjustment.

Previous literature suggests the choice of demand system may not have a significant effect on the marginal tax reform results. In the next section, we use the different sets of elasticities estimated by the various demand models to examine whether the findings extend to the inclusion of a quadratic term (QUAIDS model), the inclusion of demographics (PS-models) and the adjustment for zero-expenditures (ZA-models) in the underlying demand system.

4 Marginal tax reform analysis with different consumer demand systems

4.1 Distributional characteristic

Before examining the \(MRCs\) themselves, Table 4 shows the ranking of the goods according to distributional characteristic. As discussed in Sect. 2.1, with zero inequality aversion (\(e=0\)), the distributional characteristic on all goods will be equal to one. With \(e\) greater than zero, a higher distributional characteristic indicates expenditure on a good is concentrated among poor households. As \(e\) increases, the consumption by the poorest households gets a higher and higher weight. Thus, ignoring efficiency concerns, according to Table 4, the policy advice in 1987 would be to increase the tax on clothing and services and other goods and decrease the tax on tobacco and food. This is the case for a relatively low level of inequality aversion (\(e=1\)) and a relatively high level of inequality aversion (\(e=5\)). Alcohol is the good most affected by changing the level of inequality aversion. In 2009, for example, alcohol changes from having the third lowest distributional characteristic to having the lowest distributional characteristic when \(e\) increases from 1 to 5. In general, however, Table 4 shows that the ranking of goods is not very sensitive to changes in inequality aversion.
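The behaviour of the distributional characteristic as \(e\) changes can be illustrated with a toy calculation. The definition itself is given in Sect. 2.1 and is not reproduced here, so the sketch assumes a common form: welfare weights \(\beta_h = (y_{min}/y_h)^e\), normalised so that the lowest-expenditure household has a weight of one, and a characteristic equal to the \(\beta\)-weighted share of total spending on the good. The household data are invented.

```python
import numpy as np

def welfare_weights(total_exp, e):
    """Assumed weights: (lowest total expenditure / own total expenditure) ** e."""
    total_exp = np.asarray(total_exp, dtype=float)
    return (total_exp.min() / total_exp) ** e

def distributional_characteristic(spending, total_exp, e):
    """spending[h, i] is household h's expenditure on good i."""
    spending = np.asarray(spending, dtype=float)
    beta = welfare_weights(total_exp, e)
    return beta @ spending / spending.sum(axis=0)

# Three hypothetical households, two goods.
# Good 0 is bought mostly by the poorest household, good 1 by the richest.
total_exp = [100.0, 200.0, 400.0]
spending = [[30.0, 5.0],
            [40.0, 20.0],
            [50.0, 60.0]]

d0 = distributional_characteristic(spending, total_exp, e=0)
d1 = distributional_characteristic(spending, total_exp, e=1)
d5 = distributional_characteristic(spending, total_exp, e=5)
print(d0, d1, d5)
```

With \(e=0\) both goods have a characteristic of one; with \(e>0\) the good concentrated among poorer households has the higher value, and in this example the ordering is the same at \(e=1\) and \(e=5\).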

Table 4 Ranking of goods according to distributional characteristic

The rankings are also quite stable across the years, with tobacco and food consistently having the highest distributional characteristic, and clothing and services and other goods usually at the bottom. Alcohol moves down the rankings in later years, particularly with the higher level of inequality aversion. This indicates that alcohol expenditure became relatively less concentrated among lower-expenditure households between the late 1980s and 2009.

4.2 Marginal revenue costs

We now turn to the estimates of the \(MRCs\). These values indicate the possibility of welfare-neutral revenue-improving marginal tax reform, with the policy rule being to increase the tax on goods with higher \(MRCs\) and decrease the tax on goods with lower \(MRCs\). When comparing \(MRCs\), it is the ranking of the good that is important, not the magnitude of the \(MRC\). This is because the \(MRC\) is a function of the welfare weight on each household, which has been normalised so that the lowest expenditure household has a weight of one. An alternative normalisation, for example the lowest expenditure household having a weight of two, would lead to different \(MRCs\). The ranking, however, would remain constant. For that reason, we are more interested in the ranking of the goods by \(MRC\) rather than the \(MRC\) of each good itself.
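The invariance of the ranking to the normalisation of the welfare weights can be checked numerically. The sketch below assumes that the \(MRC\) of good \(i\) is the revenue effect of a marginal tax change divided by its welfare cost, \(\sum_h \beta_h q_i x_i^h\); this reading is consistent with the policy rule above, but the exact definition is given in Sect. 2 and is not reproduced here. All numbers are hypothetical.

```python
import numpy as np

def mrc(beta, spending, revenue_effect):
    """Assumed MRC: revenue effect divided by welfare-weighted spending."""
    welfare_cost = np.asarray(beta, dtype=float) @ np.asarray(spending, dtype=float)
    return np.asarray(revenue_effect, dtype=float) / welfare_cost

def ranks(values):
    """Rank 1 is assigned to the highest value."""
    order = np.argsort(-np.asarray(values, dtype=float))
    result = np.empty(len(order), dtype=int)
    result[order] = np.arange(1, len(order) + 1)
    return result

beta = np.array([1.0, 0.5, 0.25])            # lowest-expenditure household has weight 1
spending = np.array([[30.0, 5.0, 10.0],      # households x goods
                     [40.0, 20.0, 25.0],
                     [50.0, 60.0, 45.0]])
revenue_effect = np.array([95.0, 70.0, 60.0])  # hypothetical values per good

mrc_base = mrc(beta, spending, revenue_effect)
mrc_rescaled = mrc(2 * beta, spending, revenue_effect)  # lowest household has weight 2

print(ranks(mrc_base), ranks(mrc_rescaled))
```

Doubling every weight rescales every \(MRC\) by the same factor, so the magnitudes change while the ranking does not.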

The distributional characteristic analysis showed that the ranking of goods is insensitive to changes in the value of the inequality aversion parameter, \(e\), a finding which mirrors that of Madden (1995a). Tables 5 and 6 show the ranking correlation coefficient between the \(MRCs\) of the goods as we change how we estimate \(\epsilon _{ki}\).

Table 5 Rank correlations between MRCs with AIDS elasticities
Table 6 Rank correlations between MRCs from AIDS and QUAIDS

Table 5 presents the rank correlations between the \(MRCs\) estimated using the AIDS elasticities with the corresponding rankings of \(MRCs\) based on PS-AIDS (top half) and the zero-expenditure adjusted AIDS (ZA-AIDS, bottom half). Taking the PS-AIDS correlation first, the ranking correlations range between 0.7 and 0.9, suggesting that the rankings are insensitive to the inclusion of demographics. Indeed, the average correlation between the AIDS and PS-AIDS \(MRC\) rankings is 0.8.

The ranking of goods by \(MRC\) is quite different following the zero-expenditure adjustment, particularly at low levels of inequality aversion. This ranking difference is mainly driven by a change in the ranking of tobacco. Recalling the results from Sect. 3.3, the zero-expenditure adjustment in the AIDS model caused the own-price elasticity of tobacco to change from \(-\)2.3 to \(-\)0.2. The less elastic the demand for a good or service, the more likely it is to be subject to a tax increase recommendation, \(ceteris\) \(paribus\). In this case, the significant change in the own-price elasticity contributed to tobacco changing from having a relatively low \(MRC\) with the AIDS elasticities to having a relatively high \(MRC\) with the ZA-AIDS elasticities. Given the relatively low number of goods in the analysis, the rank correlations are sensitive to dramatic changes in the ranking of one good. As the value of \(e\) increases, the \(MRC\) rankings converge as more weight is placed on the distributional concerns rather than efficiency concerns (Decoster and Schokkaert 1990).
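A small example shows how strongly a rank correlation over six goods reacts when a single good moves from one end of the ranking to the other. The sketch assumes a Spearman coefficient (this section does not name the coefficient used), and the rankings are hypothetical rather than taken from Table 5.

```python
import numpy as np

def spearman(rank_a, rank_b):
    """Spearman rank correlation for two rankings without ties."""
    a = np.asarray(rank_a, dtype=float)
    b = np.asarray(rank_b, dtype=float)
    n = len(a)
    return 1.0 - 6.0 * np.sum((a - b) ** 2) / (n * (n**2 - 1))

def rerank(ranks):
    """Re-rank a subset of goods from 1 to len(ranks), keeping their order."""
    order = np.argsort(ranks)
    result = np.empty(len(ranks), dtype=int)
    result[order] = np.arange(1, len(ranks) + 1)
    return result

goods = ["tobacco", "alcohol", "food", "fuel and transport",
         "clothing", "services and other goods"]
# Hypothetical MRC ranks (1 = highest MRC).
base = [6, 5, 4, 3, 2, 1]      # tobacco has the lowest MRC
shifted = [1, 6, 5, 4, 3, 2]   # tobacco jumps to the top, the rest move down one

rho_all = spearman(base, shifted)
rho_excluding_tobacco = spearman(rerank(base[1:]), rerank(shifted[1:]))
print(round(rho_all, 3), rho_excluding_tobacco)
```

Here one good changing position pulls the correlation down to about 0.14, even though the relative ordering of the other five goods is identical.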

Given the similarity in the elasticities estimated by the AIDS and QUAIDS models, it is unsurprising that the correlation between the rankings of \(MRCs\) based on these elasticities is very high, as can be seen in Table 6. Perfect correlation exists in 26 out of 30 pairs of rankings when comparing the \(MRCs\) based on the AIDS and QUAIDS (top panel). In the four other cases, the ranking correlation is above 0.9. While the inclusion of demographics in the demand systems introduces more variation in the estimated elasticities, the correlation between the \(MRC\) rankings based on these estimates remains very high, ranging between 0.7 and 1, with an average correlation of 0.9. Indeed if we take a concertina-type approach to tax reform, the policy advice remains almost identical between the models. Tobacco is consistently in the bottom of the \(MRC\) rankings, joined by alcohol in the majority of cases. For higher values of \(e\) using the QUAIDS elasticities, in some cases alcohol is replaced in the bottom two by food. At the other end of the scale, using both the PS-AIDS and PS-QUAIDS elasticities, services and other goods is consistently ranked in the top two \(MRCs\), joined by food for lower values of \(e\), and by clothing for higher values of \(e\).

The \(MRCs\) based on the ZA-AIDS and ZA-QUAIDS elasticities are also very highly correlated. In this case, however, tobacco is at the top of the \(MRC\) rankings in the majority of cases, indicating that the tax reform recommendation based on these elasticities would be to increase the tax on tobacco. As previously discussed, this result is driven mainly by the change in the own-price elasticity of tobacco following the zero-expenditure adjustment. Changes in some of the tobacco cross-price elasticities also contribute. Tobacco increases in substitutability with alcohol and transport and fuel with the zero-expenditure adjustment in the QUAIDS model, for example. Indeed, ignoring the change in ranking of tobacco,Footnote 23 the ranking correlations of \(MRCs\) based on the ZA-models compared with the other models increase substantially, in many cases higher than 0.6 with \(e=0\). Excluding tobacco, the tax reform recommendations are therefore quite robust to the choice of demand system. Taxation of tobacco is often motivated by considerations other than equity or efficiency; we examine the sensitivity of the tax reform recommendations taking these considerations into account in Sect. 5.

The 1987 \(MRC\) rankings based on the AIDS model can be compared with those in Madden (1995a).Footnote 24 A correlation coefficient of 0.6 between the rankings of the goods common to both studies shows that, although similar, the rankings are not identical. A number of factors can explain the difference in the rankings in the two papers. Firstly, as Blundell et al. (1993) explain, the use of micro-data can have strong advantages over macro data when estimating consumer demand patterns. They argue that aggregate data alone, as in Madden, are unlikely to produce reliable estimates of structural price and income coefficients. Secondly, while Madden estimated elasticities based on data from 1958 to 1988, in this paper we use data from 1987 to 2009. Finally, while Madden used the Stone index to approximate \(a(p)\), in this paper we do not use an approximation. Pashardes (1993) points out that the commonly used Stone index approximation for linear estimation of the AIDS model can bias the parameter estimates of the budget share equations.

The analysis in this section supports the conjecture of Ahmad and Stern that marginal tax reform results are relatively insensitive to the specification of the underlying demand system. Similarly, the empirical work of Decoster and Schokkaert (1990) and Madden (1996) is reflected in the findings of this paper. The choice between the AIDS model and the QUAIDS model seems to matter little to the results of the tax reform analysis, in the Irish case at least. The same argument can be made with regard to the inclusion of demographics. Any combination of the unadjusted and PS-models produces ranking correlations averaging around 0.8. The zero-expenditure adjustment had the largest impact on the rankings of \(MRCs\), driven mainly by a change in the own-price elasticity of tobacco.

Given the stability of the rankings of the goods, we can briefly discuss the policy recommendations that come from the analysis.Footnote 25 Under most demand systems with any positive value of inequality aversion, clothing and services and other goods are the goods with the highest ranking \(MRCs\), so a tax increase is suggested. The \(MRC\) ranking of alcohol rises with the level of inequality aversion, particularly in later years, so that as more weight is put on the welfare of the poorest households, the policy advice changes from reducing the tax on alcohol to increasing the tax on alcohol. Tobacco is consistently at the bottom of the rankings with the standard and PS-elasticities, so that the policy advice is to decrease the tax on this good. The exception to this is with the ZA-elasticities, where the policy advice with regard to tobacco changes from a suggested decrease in the tax rate to a suggested increase.

Of course, an element of taxation on certain goods may be to correct for merit good arguments or external effects associated with the consumption of the good. Tobacco and alcohol are two goods which are often argued to have demerit properties and negative externalities. In the case of tobacco consumption, for example, these demerit properties and negative external effects come in the form of health risks to the consumer and health risks associated with passive smoking. In addition to equity and efficiency considerations, therefore, taxation of goods may in part be motivated by a governmental or societal desire to alter household demand for certain goods. Schroyen (2010) extended the marginal tax reform methodology of Ahmad and Stern to allow for (de)merit good considerations. In the next section, we allow for these merit good arguments to enter the tax reform model and examine whether the results found in previous sections still hold.

5 Including merit good arguments in marginal tax reform analysis

Ahmad and Stern’s marginal tax reform model used above is based on the assumption that the government makes tax reform decisions based on distributional and efficiency considerations alone. Another element of taxation, however, may be corrective. In reality, taxes (or subsidies) are often used to try to alter consumer demand for particular goods on the basis of merit or demerit good arguments. Two examples of recent indirect tax policy illustrate this point. In 2011, Denmark became the first country to introduce a “fat tax”,Footnote 26 increasing taxes on saturated fat in an attempt to limit the population’s intake of fatty foods. Similarly, in the Budget announced in 2014 in Ireland, taxes on tobacco were increased. In revealing the motivation behind the tax increase, the Minister for Finance said that “it wasn’t brought in to raise revenue...it’s settled policy that the best way to reduce smoking, particularly among young people, is to increase price”.Footnote 27 In both cases, demerit properties of the goods in question motivated additional taxes to be levied. As described by Schroyen (2010), a merit good argument represents the existence of a wedge between the household’s willingness to pay for an extra unit of a commodity and that of the government.

Schroyen extended Ahmad and Stern’s marginal tax reform model to account for these demerit arguments. Again, the model is based on the principle that the \(MRC\) of all goods should be equal at the optimum. If not, welfare-improving revenue-neutral tax reforms can be found. As before, we measure the revenue effect of a marginal indirect tax reform as:

$$\begin{aligned} q_i\frac{\partial {R}}{\partial {t_{i}}}= q_iX_i+\sum \limits _{k}\tau _{k}q_{k}X_{k}\epsilon _{ki} \end{aligned}$$
(18)
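Eq. 18 can be evaluated for all goods at once with a matrix product. The sketch below is illustrative only: it reads \(\epsilon _{ki}\) as the elasticity of demand for good \(k\) with respect to the price of good \(i\) and takes \(\tau _{k}\) as given (its definition is in Sect. 2, which is not reproduced here). The function name and all numbers are hypothetical.

```python
import numpy as np

def revenue_effect(expenditure, tax_rate, elasticity):
    """Right-hand side of Eq. 18 for every good i.

    expenditure[k]   = q_k X_k   (aggregate spending on good k)
    tax_rate[k]      = tau_k
    elasticity[k, i] = eps_ki    (good k with respect to price i)
    """
    expenditure = np.asarray(expenditure, dtype=float)
    tax_rate = np.asarray(tax_rate, dtype=float)
    elasticity = np.asarray(elasticity, dtype=float)
    # Direct effect q_i X_i plus the induced change in tax revenue on all goods k.
    return expenditure + (tax_rate * expenditure) @ elasticity

# Two hypothetical goods.
spend = [100.0, 50.0]
tau = [0.2, 0.5]
eps = [[-0.5, 0.1],
       [0.2, -1.0]]

effect = revenue_effect(spend, tau, eps)
print(effect)
```

In this example the heavily taxed, price-elastic second good yields far less additional revenue than its own expenditure base would suggest, because the behavioural response erodes the tax base.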

The merit good arguments are captured through two additional terms in the measurement of the welfare effect. Schroyen shows that the welfare effect (as perceived by the government), accounting for merit good arguments, can be approximated as:

$$\begin{aligned} -q_i\frac{{\partial {V}}}{\partial {t_{i}}}\approx \sum \limits _{h}\beta _{h}\left[ q_{i}x_{i}^{h} - \varOmega (q_{n}x_{n})\left( \sum \limits _{k=1}^{n}\sigma _{k}^{h}w_{k}^{h}\epsilon _{ki}+\epsilon _{ni} \right) \right] \end{aligned}$$
(19)

The first additional term in the measurement of the welfare effect is the merit parameter, \(\varOmega \). This term represents the difference between the government’s marginal willingness to pay (MWP) and the consumer’s MWP for a commodity with (de)merit properties, which we denote good \(n\). Merit good arguments affect the government’s demand prices in two ways. First, the government’s demand price for good \(n\) is altered by the presence of merit properties, represented by \(\varOmega \). Second, the government considers households to be better off (or worse off with demerit goods) due to the inframarginal units of good \(n\) consumed. This has a scale effect on all demand prices, the size of which will depend on the budget share of good \(n\). The term in the larger round brackets, including the scale elasticity \(\sigma \),Footnote 28 accounts for the change in \(q_i\) changing the consumption pattern for all goods. Again, the measure of merit, \(\varOmega \), needs to be accounted for.
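A numerical sketch of Eq. 19 may help to show how \(\varOmega \) enters. Several assumptions are made that are not confirmed by the text: \(q_{n}x_{n}\) is read as household \(h\)'s own expenditure on good \(n\), the budget shares \(w_{k}^{h}\) are computed from spending on the listed goods only, and the scale elasticities \(\sigma _{k}^{h}\) are simply set to one because their definition (Footnote 28) is not reproduced here. The function name and data are hypothetical.

```python
import numpy as np

def merit_welfare_effect(beta, spending, sigma, elasticity, omega, n):
    """Right-hand side of Eq. 19 for every good i (illustrative reading).

    beta[h]          welfare weight of household h
    spending[h, k]   q_k x_k^h
    sigma[h, k]      scale elasticities (taken as given)
    elasticity[k, i] eps_ki
    omega            merit parameter (negative for a demerit good)
    n                column index of the (de)merit good
    """
    beta = np.asarray(beta, dtype=float)
    spending = np.asarray(spending, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    elasticity = np.asarray(elasticity, dtype=float)
    shares = spending / spending.sum(axis=1, keepdims=True)   # w_k^h
    # Bracketed response term: rows are households, columns are taxed goods i.
    response = (sigma * shares) @ elasticity + elasticity[n]
    merit_term = omega * spending[:, [n]] * response
    return beta @ (spending - merit_term)

beta = [1.0, 0.5]
spending = [[60.0, 40.0],
            [150.0, 50.0]]
sigma = np.ones((2, 2))          # placeholder values, an assumption
eps = [[-0.5, 0.1],
       [0.2, -1.0]]

standard = merit_welfare_effect(beta, spending, sigma, eps, omega=0.0, n=1)
with_demerit = merit_welfare_effect(beta, spending, sigma, eps, omega=-0.7, n=1)
print(standard, with_demerit)
```

Setting \(\varOmega = 0\) recovers the standard welfare cost \(\sum_h \beta_h q_i x_i^h\). With \(\varOmega = -0.7\), the perceived welfare cost of taxing the demerit good falls sharply in this example.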

Before examining the \(MRCs\) that take account of the merit good arguments (hereafter, \(Merit MRCs\)), two questions remain open. First, which goods in the analysis have merit or demerit properties? Second, for the goods with these properties, how do we value \(\varOmega \)? At least five of the six aggregated goods in the analysis could be argued to have merit or demerit properties. Certain food products, such as those with high levels of sugar or saturated fat, are often argued to have demerit properties (recall the Danish “fat tax” example). Alcohol and tobacco are often subject to higher tax rates than other goods on health grounds. Fuel products such as petrol and diesel could similarly be classed as demerit goods on environmental grounds, because of their emissions (see Schroyen and Aasness (2006)). Health-related expenditure within the services and other goods category could be classed as a merit good, due to the positive health effects created (see Sandler and Arce (2002)). For simplicity and clarity, and given the practical difficulty of valuing \(\varOmega \) for such a wide range of goods, we allow only tobacco to have demerit properties in this section. Using tobacco as our demerit good allows us to draw on recent evidence on the scale of demerit properties in tobacco consumption.

In practice, it is difficult, if not impossible, to accurately calculate the “true” value of \(\varOmega \). In an empirical application of Schroyen’s (2010) model, Schroyen and Aasness (2006) value the demerit properties of alcohol and tobacco consumption so that the government’s MWP lies about 70 % below that of the consumer. This figure was chosen on the basis of some sensitivity analysis of \(MRC\) rankings. Madden (2003) provided a detailed discussion on the valuation of internal and external costs of tobacco consumption. The addictive nature of tobacco, and the potential for time-inconsistent preferences of smokers, resulted in additional complications in valuing the true cost of smoking. Madden (2003) estimated the external costs of smoking to be in the range of €3.18–€4.85 per packet. Given that the average price of a packet of 20 cigarettes in Ireland in 2003 was €5.84,Footnote 29 these external costs represented between 55 and 83 % of the consumer price. If we assume the government’s MWP is based on the consumer price less the external effects of consumption, a 70 % difference between the government’s MWP and the consumer’s MWP seems a reasonable approximation of \(\varOmega \). Therefore, on the basis of Schroyen and Aasness (2006) and Madden (2003), the results of the \(Merit\) \(MRC\) analysis in this section are based on a value of \(-\)0.7 for \(\varOmega \). Of course, including internal costs of consumption of tobacco as well as external costs may substantially increase (in absolute terms) the value of \(\varOmega \). Gruber and Koszegi (2001), for example, found that the internal cost of tobacco consumption may be as high as €30 per packet, suggesting \(-\)0.7 may be an underestimate of the “true” wedge between the government’s and the consumer’s MWP for tobacco.
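The arithmetic behind the choice of \(\varOmega \) can be reproduced directly from the figures quoted in this paragraph. The variable names are ours; the inputs are the price and external-cost range cited in the text.

```python
# Figures quoted in the text (euro per packet of 20 cigarettes, Ireland, 2003).
price_per_packet = 5.84
external_cost_range = (3.18, 4.85)   # Madden (2003), as cited

# External cost as a share of the consumer price.
low_share, high_share = (cost / price_per_packet for cost in external_cost_range)

omega = -0.7   # value adopted in the analysis
print(f"external cost share: {low_share:.1%} to {high_share:.1%}")
print("abs(omega) inside this range:", low_share <= abs(omega) <= high_share)
```

Direct division gives shares of roughly 54.5 and 83 %, in line with the range reported above, and the adopted wedge of 70 % lies inside that range.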

Based on this approach, we examine two further potential sources of sensitivity in the \(MRC\) rankings. First, we examine whether the \(MRC\) rankings are sensitive to the inclusion of the merit good arguments. Second, we examine whether the rankings of \(Merit\) \(MRCs\) are sensitive to the choice of underlying demand model. Table 7 shows the ranking correlation of the \(MRCs\), as estimated in Sect. 4, with the \(Merit\, MRCs\) estimated using the extended model described above. Most sensitivity is recorded in the \(MRCs\) based on the AIDS and QUAIDS models.Footnote 30 Some sensitivity is shown with the PS-models, while the rankings based on the ZA-models are almost identical with and without the merit terms included. Given the elasticities reported in Table 2, these results are unsurprising. \(Ceteris\) \(paribus\), the less elastic the demand for a good, the more likely the recommendation will be to increase the tax on that good. Similarly, the inclusion of demerit properties in the model will also increase the probability of a tax increase recommendation. Demand for tobacco becomes less elastic as we move from (QU)AIDS to PS-(QU)AIDS to ZA-(QU)AIDS, which has a similar effect to including demerit arguments. Therefore, with the ZA-elasticities, tobacco was already subject to a tax increase recommendation, so that the inclusion of merit arguments has little effect on the rankings. With the standard and PS-AIDS and QUAIDS models, the inclusion of merit arguments moves tobacco from a low-ranked good to a high-ranked good.

Table 7 Rank correlations between MRCs with and without Merit Arguments Included

Table 8 examines the sensitivity of rankings of \(Merit\, MRCs\) to changes in the underlying demand model. In the majority of cases, the rankings show very little sensitivity to the choice of model, with ranking correlations above 0.5 in 25 out of the 30 cases, and above 0.7 in more than half of the cases examined.Footnote 31 The low correlation in two 2004 comparisons is driven by a change in the ranking of clothing in the PS-AIDS model in this year. Overall though, the tax reform recommendations are robust to changes in the underlying demand system, suggesting an increase in the tax on tobacco and services, and in most cases, a decrease in the tax on alcohol and clothing or transport and fuel. Of course, as previously discussed, tobacco is not the only good with merit or demerit properties. To include these properties for the full range of goods in the analysis, we require estimation of the wedge between the government’s MWP and the consumer’s MWP for these goods. We leave this topic to further analysis.

Table 8 Rank correlations between Merit MRCs

6 Conclusion

Consumer demand modelling has been subject to a number of key advances in recent years. Consensus among researchers seems to have formed over the use of the AIDS and QUAIDS models in estimating consumer demand. In addition, the increasing availability of household expenditure data means that analysts no longer rely on aggregate expenditure data to estimate consumer demand responses. While the advantages of these developments are well documented, the implications for marginal tax reform analysis remained unclear. This paper aims to fill that gap. We estimated a number of different sets of elasticities from different specifications of the AIDS and QUAIDS models. We estimated the PS-AIDS and PS-QUAIDS models by including demographic factors and adjusted the demand systems for the presence of zero-expenditures in the micro-data. This resulted in six sets of own- and cross-price elasticities. These elasticities were the basis for testing the sensitivity of the tax reform results. Despite finding variation in the elasticities across the different models, rankings of \(MRCs\) based on the models were found to have very high correlation coefficients. As expected, the degree of correlation was highest for the higher levels of inequality aversion. The highest degree of sensitivity was shown when we adjusted for zero-expenditures in the data. This was shown to be the result of a large change in the own-price elasticity of tobacco and associated changes in tobacco cross-price elasticities, which substantially altered the ranking of the \(MRC\) of taxation of tobacco. Taking a concertina-type approach to tax reform, the results for many of the other goods remained constant, even under the zero-expenditure adjustment.

Ahmad and Stern’s model of marginal tax reform allows for efficiency and distributional considerations to affect tax reform recommendations. In reality, alternative criteria, such as merit good arguments, may also affect tax reform decisions. Using Schroyen’s (2010) extension of Ahmad and Stern’s model which includes (de)merit good considerations, we examined two further potential sources of sensitivity in the \(MRC\) rankings. First, we examined whether the \(MRC\) rankings were sensitive to the inclusion of the merit good arguments. Second, we examined whether the rankings of \(Merit\) \(MRCs\) were sensitive to the choice of underlying demand model. Comparing the standard \(MRCs\) with the \(Merit\) \(MRCs\), most sensitivity in results was recorded when comparing the \(MRCs\) based on the standard or PS-AIDS or QUAIDS to the \(Merit\) \(MRCs\) based on the same models. Once we corrected for the zero-expenditures in the underlying demand model, little sensitivity was shown in the tax reform results to the inclusion of merit good arguments. The ranking of goods by \(Merit\) \(MRC\) was particularly insensitive to changes in the underlying demand model. This was the case even though the elasticity for the demerit good in our analysis, tobacco, changed considerably between demand models.

The analysis in this paper is intended as a complement to the work of Decoster and Schokkaert, and Madden in assessing the sensitivity of tax reform results to changes in the underlying demand system. In the Ahmad and Stern marginal tax reform framework, sensitivity in tax reform results can come from three sources: the level of inequality aversion of the government or social planner, the estimation of demand responses, or the inclusion of merit good arguments. As shown in the previous studies, and reaffirmed here, tax reform recommendations are robust to changes in the level of inequality aversion. Changes in the estimation of demand responses can introduce some sensitivity into the tax reform recommendations, although in this analysis, the sensitivity was mainly due to changes in elasticities relating to tobacco. Excluding tobacco, most tax reform recommendations were robust across the range of demand models estimated. The inclusion of merit good arguments in the model did affect the tax reform recommendations in most cases. Notably though, once the merit good arguments were included, the choice of demand model was not relevant to the resulting tax reform recommendations. On that basis, the key issues to confront for the policy-maker appear to be those where merit effects might be relevant, rather than the choice of underlying demand model.

The current popularity of the AIDS and QUAIDS models, and extensions related to the use of household expenditure data, makes it important to understand the sensitivity of tax reform results to the choice between the models. Using Irish Household Budget Survey data, we showed that tax reform results are quite insensitive to this choice, in the Irish case at least.