1 Introduction

There has been a recent surge in the number of papers studying the relationship between income inequality and economic growth. This renewed interest has been fueled by recent increases in income inequality in the U.S., as well as by empirical studies that document conflicting results on the relationship between inequality and the level or growth of per capita income [Partridge (1997), Li and Zou (1998), and Forbes (2000) find positive relationships, while Alesina and Perotti (1994, 1996) find negative relationships]. These results have generated a good deal of discussion because they pose significant challenges to policy makers.

A particular challenge, from a policy standpoint, is that while several theoretical models show positive (see e.g., Li and Zou 1998; Galor and Tsiddon 1997) or negative (see e.g., Alesina and Perotti 1994; Persson and Tabellini 1994) effects of inequality on the level or growth of per capita income, their empirical counterparts rely on correlations and not necessarily on causality between inequality and per capita income. Informed public policy cannot rest on simple correlations, because correlations that do not reflect causal effects may lead to misguided policy.

Several empirical studies use cross-sections or panels of countries to examine the relationship between economic performance and inequality. Partridge (1997), Fallah and Partridge (2007), and Panizza (2002), among others, all point out that a problem with many cross-country studies is the quality and comparability of the inequality data. These quality and comparability issues in cross-country data can result in measurement error, which may make the regression estimates numerically unstable and severely reduce their precision. Forbes (2000) notes that random measurement error, for example, can lead to attenuation bias, which decreases the significance of the results. A potentially more pervasive problem, Forbes notes, is systematic measurement error, which can generate a positive or negative bias depending on the correlation between the measurement error and other variables in the regression model. Footnote 1 This line of argument led Kanbur (2000) to make the following case against cross-country studies:

A superior approach is one which looks at country experiences in their historical and policy detail, and approaches the issues of policy directly and specifically....relying on cross-country regressions of inequality on per capita income or growth to support or contradict a policy “tradeoff” between the two does not seem to have been very productive (p. 832).

Furthermore, the relationship between inequality and the level or growth of per capita income tends to vary depending on the time intervals under consideration. In general, studies that examine the short- and/or medium-run relationship tend to find a positive relationship, whereas studies that examine the longer-run relationship tend to find a negative relationship. Li and Zou (1998) and Forbes (2000), who use data averaged over 5-year periods, find a positive relationship. Forbes (2000) shows that when the data are averaged over 10-year periods, the coefficient on inequality remains positive, but decreases so much that it becomes insignificant. Persson and Tabellini (1994) and Alesina and Rodrik (1994), who use data averaged over longer periods, find negative relationships. It is worth mentioning that this set-up (averaging over \(n\) periods) is not without criticism. Attanasio et al. (2000) note that annual data provide information that is lost when averaging. They also argue that the length of business cycles varies over time and across space, and because the interval over which these averages are computed is arbitrarily fixed, there is no guarantee that business cycles are cut in the right way. In addition, they insist that even if averaging does measure the long-run effects, it prevents the analysis of short-run effects, which usually include the interesting dynamic interplay of forces acting in opposite directions or with different magnitudes.

In this paper, we address the aforementioned problems using a new comprehensive panel of annual state-level income inequality from 1930 to 2005 assembled by Frank (2009a, b). Our goal is to examine the response of per capita income to shocks in inequality and the effects of inequality on the level of per capita income using a panel vector autoregression (VAR) approach. The new dataset, together with our empirical methodology, addresses the above-mentioned problems in several ways. Firstly, while we are under no illusions that our dataset is free of measurement error, using state-level data decreases the problem significantly. Frank (2009b) points out that:

The greater homogeneity of state-level data helps mitigate the difficulty in adequately capturing structural differences across international panels of earlier studies such as Forbes (2000) and Barro (2000). Corruption levels, labor market flexibility, tax neutrality, tradition of entrepreneurship, and many other factors are only poorly measured, if at all (Barro 2000, pp. 10–11), and these sources of heterogeneity are much more likely to contribute to omitted variable bias across countries than across states (Frank 2009b).

Secondly, a panel VAR approach allows not only for the examination of the correlation between income inequality and per capita income, but also the dynamic responses of these variables. Thirdly, prior studies on the relationship between the level or growth of per capita income and inequality have traditionally followed one of two paths: the first path, motivated by the work of Kuznets (1955), examines how economic performance affects inequality. The other line of research examines rather the effect of inequality on economic performance. Our panel VAR methodology unifies these two approaches because our approach simultaneously examines the effect of changes in inequality on per capita income and the effect of per capita income on inequality. Finally, by using annual state-level data, we are able to examine both the short-run and long-run effects of inequality and per capita income. To check for robustness, we also identify the response of the level of per capita income to several other inequality measures. To our knowledge, this paper is the first that uses a panel VAR approach to examine the effect of inequality on per capita income and per capita income on inequality using U.S. state-level data.

Our empirical approach is certainly not without limitations. A potential limitation is that the baseline estimates of our model come from a bivariate panel VAR model of per capita income and changes in inequality. While bivariate VAR models have frequently been used in the traditional time-series VAR literature (Blanchard and Gali 2007; Kilian and Vigfusson 2011; Bachmeier and Cha 2011), it is possible that several shocks might be simultaneously affecting per capita income growth. The use of state-level data limits the series that can be added to the panel VAR model. Nevertheless, we do not believe that this limitation is significant for the purposes of this paper, because our primary focus is on the cumulative responses of per capita income and inequality, and our empirical approach allows for a parsimonious, valid identification framework.

The impulse response functions (IRFs) from the estimated baseline bivariate panel VAR models show a pattern in which real per capita income tends to decrease following a shock to the Gini index. The maximum effect on real per capita income levels occurs approximately 5 years after the shock. This finding is robust to changes in the measures of inequality used. We also find that income inequality decreases following a one percentage point increase in per capita income, although the initial direction and significance differ depending on the measure of inequality used. To examine how stable our estimated impulse responses are, and to what extent they are driven by events such as World War II and the Great Moderation, we split our sample into three subsamples: 1930–1947, 1948–1984, and 1985–2005. The IRFs for the first subsample are qualitatively identical to those shown in the entire sample except that they are twice as large in magnitude. For the 1948–1984 subsample, per capita income has no significant responses following inequality shocks, while the response of inequality to income shocks is negative. For the 1985–2005 period, per capita income decreases slightly after a Gini shock, while the Gini response to an income shock is mostly insignificant, although there is an initial positive response within the first 2 years after the shock. Because some researchers argue that bivariate models may omit relevant information, we include a measure of human capital, and estimate a trivariate panel VAR model. The general results hold, indicating that the identified shocks from the bivariate VAR model are not contaminated by shocks other than income or inequality shocks.

Some researchers have argued that the results obtained when examining the short-run relationship between income inequality and per capita income or growth may be spurious if these results are not robust to medium-run and long-run situations (Partridge 1997). Consequently, we estimate the dynamic responses of the level of per capita income and inequality using data averaged over 5- and 10-year periods. We find that, in general, the aforementioned results hold in the medium run (5-year averages) and the long run (10-year averages). That is, in the medium and long run, inequality (Gini) shocks decrease per capita income, and per capita income shocks decrease inequality. The literature on the relationship between inequality and growth also stresses the importance of initial conditions for the level and growth of per capita income (Durlauf and Quah 1999). To this end, we examine how initial inequality affects growth in the medium and long terms. We find that in the medium run, initial inequality has a positive impact on the level of per capita income; however, the effect turns negative after two periods and stays negative. When we estimate the long-run response of per capita income to an initial inequality shock, the response is negative and persistent.

The rest of the paper proceeds as follows. Section 2 provides an overview of the data. Section 3 discusses the empirical methodology, while Sect. 4 estimates and presents the key results of the panel VAR model relating real per capita income and income inequality as measured by the Gini index. Section 5 concludes.

2 Overview of the data

This section presents a brief overview of the data as a prelude to the estimation of the structural VAR. Our dataset consists of annual data on the percentage change in per capita real income and various income inequality measures for the 48 contiguous U.S. states (and DC) from 1930 to 2005. The data on nominal income per capita were collected from the Regional Economic Information System of the Bureau of Economic Analysis (BEA-REIS) and deflated using the consumer price index (1982–1984 = 100). Data on income inequality were downloaded from Professor Mark Frank’s website. Figure 1 shows the state-averaged real income per capita for all 48 states and DC, and the corresponding average growth rate. It is difficult to distinguish the cyclical movements in per capita income because the overall upward trend is so strong. A slowdown in the rate of growth beginning in the early 1970s is nevertheless apparent from the graph. Figure 1b shows the growth rate of real income per capita. The most noteworthy feature of this graph is the decrease in volatility of per capita income growth after 1945 and again in the early 1970s.
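To make the construction of the income variable concrete, the following minimal Python sketch deflates a nominal per capita income series by a price index and computes annual percentage changes. The function names and all numbers are illustrative assumptions, not values from the BEA or CPI series, and the paper does not state whether simple percentage changes or log differences are used; simple percentage changes are assumed here.

```python
def deflate(nominal, cpi, base=100.0):
    """Convert nominal values to real values using a price index whose base period equals `base`."""
    if len(nominal) != len(cpi):
        raise ValueError("nominal and cpi must have the same length")
    return [n * base / p for n, p in zip(nominal, cpi)]


def pct_change(series):
    """Annual percentage change; the result is one observation shorter than the input."""
    return [100.0 * (cur - prev) / prev for prev, cur in zip(series, series[1:])]


# Hypothetical numbers for one state over three years (not actual data).
nominal_income = [10000.0, 11000.0, 12100.0]
cpi = [100.0, 105.0, 110.0]

real_income = deflate(nominal_income, cpi)
real_growth = pct_change(real_income)
```

In this made-up example nominal income grows by 10 percent a year, but real growth is lower because the price index also rises.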

Fig. 1 State-level average of real income per capita and real income per capita growth: 1930–2005. a Average log of real per capita income. b Real per capita income growth

Figure 2 shows state-averaged measures of inequality from 1930 to 2005. All these indices of inequality usually lie between 0 and 1. An index that is close to 1 indicates a society characterized by high income inequality. These graphs show that, in general, income inequality in the U.S. was low until the early 1980s, after which a distinct rise in inequality is apparent. This pattern of inequality is similar to that found in Piketty and Saez (2003). Footnote 2
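As a reminder of why the Gini index lies between 0 and 1, here is a minimal Python sketch of the textbook mean-absolute-difference formula applied to individual incomes. This is an illustration only: the series used in the paper are constructed from tax data, and the function name and inputs below are hypothetical.

```python
def gini(incomes):
    """Gini index via the mean absolute difference: sum_ij |x_i - x_j| / (2 * n^2 * mean)."""
    n = len(incomes)
    if n == 0:
        raise ValueError("incomes must not be empty")
    mean = sum(incomes) / n
    if mean <= 0:
        raise ValueError("mean income must be positive")
    total_abs_diff = sum(abs(a - b) for a in incomes for b in incomes)
    return total_abs_diff / (2.0 * n * n * mean)


# Hypothetical populations: perfect equality versus one person holding all income.
equal_society = gini([5, 5, 5, 5])
unequal_society = gini([0, 0, 0, 1])
```

With perfect equality the index is 0, and as income concentrates in fewer hands the index approaches 1.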

Fig. 2 State-level average of various inequality measures: 1930–2005

2.1 Panel unit root tests

We begin our analysis by conducting a series of unit root tests. Nonstationarity is a property very common to economic data. It can be thought of as a phenomenon whereby a variable has no clear tendency to return to a constant value or linear trend. Several procedures exist for testing the presence of unit roots in panel data, notably the Fisher-type augmented Dickey–Fuller (ADF) test (Maddala and Wu 1999; Choi 2001), the Levin–Lin–Chu (LLC) (2002) test, the Im–Pesaran–Shin (IPS) (2003) test, the Harris–Tzavalis (HT) (1999) test, and the Hadri (2000) LM test.
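The Fisher-type test listed above combines the p values of unit-specific tests into a single statistic, \(-2\sum _{i}\ln p_{i}\), which is chi-square distributed with \(2N\) degrees of freedom under the null when the \(N\) units are independent. The Python sketch below illustrates only this combination step; the state-level p values are hypothetical inputs assumed to come from separate ADF regressions, and the function names are not from any particular package.

```python
import math


def fisher_statistic(p_values):
    """Fisher combination statistic: -2 * sum(ln p_i)."""
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p values must lie in (0, 1]")
    return -2.0 * sum(math.log(p) for p in p_values)


def chi2_sf_even_df(x, df):
    """Chi-square survival function for even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def fisher_panel_test(p_values):
    """Return the combined statistic and its p value under the chi-square(2N) null."""
    stat = fisher_statistic(p_values)
    return stat, chi2_sf_even_df(stat, 2 * len(p_values))


# Hypothetical unit-level p values for three states (not results from the paper).
stat, combined_p = fisher_panel_test([0.04, 0.20, 0.01])
```

A small combined p value rejects the joint null that every series has a unit root, which, as discussed below, does not imply that every series is stationary.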

The first four tests usually test the hypothesis of a unit root for each individual series in a panel (Pesaran 2011). The formulation of the alternative hypothesis is a rather controversial issue that critically depends on the assumptions made about the nature of the homogeneity/heterogeneity of the panel. Therefore, in the event of a rejection of the null hypothesis of these tests, one must interpret the results as implying that a statistically significant proportion of the units, and not necessarily all the units, are stationary. The Hadri test, on the other hand, has a null of stationarity around a deterministic level or a unit-specific deterministic trend. It allows for homoskedastic error processes across the panel, or heteroskedastic error processes across cross-sectional units. The test allows for the correction of autocorrelation using a Newey–West estimator of the long-run variance. Because each of these tests has advantages and disadvantages, we test for stationarity of our panel using all five tests. Table 1 presents the results.

Table 1 Panel unit root tests

Table 1 shows the results of the various unit root tests. For the LLC test, we apply the Newey–West bandwidth selection algorithm, while the HT, IPS, and ADF test statistics are robust to cross-sectional correlation of the error terms. Lag lengths were chosen using the Akaike information criterion (AIC). Our inferences are based on a 5 % level of significance. As suggested by Levin et al. (2002), all the tests are carried out on demeaned data to mitigate the effects of cross-sectional dependence. Panel A of Table 1 shows the various unit root tests for the Gini inequality index. As can be seen from the table, the first four tests, namely the LLC, HT, IPS, and ADF, reject the null hypothesis of a unit root for each series in the panel. From these results, one might be tempted to conclude that the Gini coefficient of inequality is stationary. However, it must be stressed that a rejection of the null hypothesis only implies that a statistically significant proportion of the series of Gini coefficients is stationary, and not necessarily all the series in the panel. Consequently, we use the Hadri test to test the hypothesis that there is no unit root in any series (stationarity). The p value of 0 rejects this hypothesis, indicating that some of the series in the panel are nonstationary. We therefore first-difference each series to achieve stationarity. As shown in the same table, all five tests indicate that the series of Gini coefficients are stationary after first differencing. Similarly, while the HT, IPS, and ADF tests reject the unit root hypothesis for the level of per capita income when intercept and trend are included, the Hadri (and the LLC) test indicates nonstationarity. All five tests, however, indicate stationarity of the growth rate of per capita income. As a result, we proceed to the estimation of the panel vector autoregression model with the variables in differences.
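The two data transformations mentioned above, cross-sectional demeaning and first differencing, can be sketched in a few lines of Python. The panel layout (a dictionary mapping each unit to a list of yearly values), the function names, and the numbers are assumptions made for illustration only.

```python
def cross_sectional_demean(panel):
    """Subtract, in each period, the mean across units from every unit's value."""
    units = list(panel)
    n_periods = len(panel[units[0]])
    if any(len(panel[u]) != n_periods for u in units):
        raise ValueError("a balanced panel is expected")
    period_means = [sum(panel[u][t] for u in units) / len(units) for t in range(n_periods)]
    return {u: [panel[u][t] - period_means[t] for t in range(n_periods)] for u in units}


def first_difference(panel):
    """Replace each unit's series x_t by x_t - x_{t-1}; one observation is lost per unit."""
    return {u: [cur - prev for prev, cur in zip(s, s[1:])] for u, s in panel.items()}


# Hypothetical Gini levels for two states over three years (not actual data).
gini_levels = {"A": [0.40, 0.42, 0.45], "B": [0.50, 0.50, 0.51]}

demeaned = cross_sectional_demean(gini_levels)
differenced = first_difference(gini_levels)
```

After demeaning, the values in each year sum to zero across units, which removes any component common to all units in that year.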

3 Methodology

The baseline specification is a bivariate panel vector autoregression model of the growth rate of real per capita income and the change in the Gini index. Denote the percentage change (or growth rate) of real per capita income of state \(i\) in year \(t\) by \(\triangle y_{i,t}\) and the change in the Gini coefficient of inequality of state \(i\) in year \(t\) by \(\triangle g_{i,t}\). Then a reduced-form panel VAR model of the variables can be written as:

$$\begin{aligned} Y_{it}=A(L)Y_{i,t-1}+\delta _{t}+\varphi _{i}+\varepsilon _{it},\quad \varepsilon _{it}\sim N(0,\Sigma _{i}), \end{aligned}$$
(1)

where \(Y_{i,t}=\bigl [\begin{array}{cc} \triangle y_{it}&\triangle g_{it}\end{array}\bigr ]^{'},\) \(\varepsilon _{i,t}=\bigl [\begin{array}{cc} \varepsilon _{it}^{\triangle y}&\varepsilon _{it}^{\triangle g}\end{array}\bigr ]^{'}\), \(i\) indexes states, \(t\) indexes time, \(L\) is the lag operator, \(A(\cdot )\) is a polynomial matrix in \(L\), \(\delta _{t}\) denotes the unobservable time effects, and \(\varphi _{i}\) is a vector of time-invariant state fixed effects. The state fixed effects capture any permanent differences in the dependent variable across states, such as differences in real per capita income arising from differences in cost of living across states. The time effects capture the aggregate components of the dependent variable that are common across states at time \(t\), such as differences arising from using aggregate inflation data to deflate state-level nominal per capita incomes.
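For concreteness, Eq. (1) can be written out equation by equation. The expansion below is a sketch under the assumption that \(A(L)=A_{1}+A_{2}L+\cdots +A_{p}L^{p-1}\) with \(p\) lags; the symbols \(p\), \(a_{kl}^{(j)}\) (the elements of \(A_{j}\)), and the superscripts on \(\delta _{t}\) and \(\varphi _{i}\) marking the component belonging to each equation are notation introduced here for illustration:

$$\begin{aligned} \triangle y_{it}&=\sum _{j=1}^{p}\bigl (a_{11}^{(j)}\triangle y_{i,t-j}+a_{12}^{(j)}\triangle g_{i,t-j}\bigr )+\delta _{t}^{\triangle y}+\varphi _{i}^{\triangle y}+\varepsilon _{it}^{\triangle y},\\ \triangle g_{it}&=\sum _{j=1}^{p}\bigl (a_{21}^{(j)}\triangle y_{i,t-j}+a_{22}^{(j)}\triangle g_{i,t-j}\bigr )+\delta _{t}^{\triangle g}+\varphi _{i}^{\triangle g}+\varepsilon _{it}^{\triangle g}. \end{aligned}$$

In this notation, the coefficients \(a_{12}^{(j)}\) govern how past changes in inequality feed into income growth, and the coefficients \(a_{21}^{(j)}\) govern the reverse channel.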

The coefficients of a reduced-form VAR model such as (1) are rarely interpreted directly. Reduced-form VAR models are useful for forecasting purposes but not for calculating impulse response functions. In order to say something about the underlying structural shocks, it is necessary to impose more structure on the above system. Economists therefore typically work with structural VAR models whereby structural, exogenous shocks are the driving forces in the model. A reduced-form model such as (1) typically corresponds with a variety of structural models, and the data usually do not allow the researcher to distinguish between these structural models. Restrictions based on economic theory are usually imposed in order to identify a structural VAR. The imposition of such structure allows for the recovery of interesting patterns in the data using little or no theory. In fields where there is little theoretical consensus (such as the study of inequality and growth), or where models are not completely specified, structural VAR models can be especially useful.

The use of panel data, however, requires that the underlying structure of the model is the same for each state in the panel. This restriction is partially relaxed by including state fixed effects, \(\varphi _{i}\). Unfortunately, the autoregressive nature of the panel VAR means that the usual fixed effects estimation, and instrumental variable estimation with mean differencing, no longer provide unbiased estimates. Arellano and Bover (1995) show that this problem can be overcome by using a ‘Helmert procedure’, which removes only the forward mean of the variables in the VAR. That is, for \(\triangle y_{it}\) and \(\triangle g_{it}\), let \(\triangle \bar{y}_{it}=\sum _{s=t+1}^{T_{i}}\triangle y_{is}/(T_{i}-t)\) for \(i=1,\ldots ,N;\, t=1,\ldots ,T_{i}-1\) and \(\triangle \bar{g}_{it}=\sum _{s=t+1}^{T_{i}}\triangle g_{is}/(T_{i}-t)\) for \(i=1,\ldots ,N;\, t=1,\ldots ,T_{i}-1\) be the forward means of \(\triangle y_{it}\) and \(\triangle g_{it}\), respectively. Correspondingly, let \(\bar{\varepsilon }_{it}^{\triangle y}\) and \(\bar{\varepsilon }_{it}^{\triangle g}\) be the forward means of \(\varepsilon _{it}^{\triangle y}\) and \(\varepsilon _{it}^{\triangle g}\), respectively. Then the Helmert transformation for these variables is \(\triangle \tilde{y}_{it}=\gamma _{it}(\triangle y_{it}-\triangle \bar{y}_{it})\), \(\triangle \tilde{g}_{it}=\gamma _{it}(\triangle g_{it}-\triangle \bar{g}_{it})\), \(\tilde{\varepsilon }_{it}^{\triangle y}=\gamma _{it}(\varepsilon _{it}^{\triangle y}-\bar{\varepsilon }_{it}^{\triangle y})\), \(\tilde{\varepsilon }_{it}^{\triangle g}=\gamma _{it}(\varepsilon _{it}^{\triangle g}-\bar{\varepsilon }_{it}^{\triangle g})\), where \(\gamma _{it}=\sqrt{(T_{i}-t)/(T_{i}-t+1)}\). Footnote 3 Therefore, the final transformed panel VAR model is:

$$\begin{aligned} \tilde{Y}_{it}=A(L)\tilde{Y}_{i,t-1}+\tilde{\varepsilon }_{it}. \end{aligned}$$
(2)
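The Helmert (forward mean differencing) transformation described above can be sketched for a single state's series as follows. This is a minimal illustration of the formula \(\gamma _{it}(x_{it}-\bar{x}_{it})\) with \(\gamma _{it}=\sqrt{(T_{i}-t)/(T_{i}-t+1)}\); the function name and the input values are hypothetical.

```python
import math


def helmert_transform(series):
    """Forward orthogonal deviations: gamma_t * (x_t - mean of x_{t+1..T}) for t = 1..T-1."""
    T = len(series)
    transformed = []
    for t in range(T - 1):  # 0-based index; the last observation has no forward mean
        remaining = T - 1 - t  # number of future observations, T_i - t in the text
        forward_mean = sum(series[t + 1:]) / remaining
        gamma = math.sqrt(remaining / (remaining + 1))
        transformed.append(gamma * (series[t] - forward_mean))
    return transformed


# Hypothetical series for one state, and the same series shifted by a constant
# that plays the role of a state fixed effect.
x = [1.0, 2.0, 3.0]
x_with_fixed_effect = [v + 10.0 for v in x]

tx = helmert_transform(x)
tx_fe = helmert_transform(x_with_fixed_effect)
```

Because each observation is measured relative to the mean of its own future values, any time-invariant state effect cancels, so `tx` and `tx_fe` coincide. The last observation of each series is lost.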

As a result of the transformation in Eq. (2), the lagged original variables are orthogonal to the transformed variables and can be used as instruments, just as in the usual fixed effects estimation. These orthogonality relationships provide moment conditions from which the panel VAR can be estimated using GMM. After the VAR is estimated, the structural error terms are identified using a Cholesky decomposition and impulse response functions (IRFs) are generated. Footnote 4 The IRFs shown in this paper are the cumulative IRFs, and therefore they must be interpreted as the effects of per capita income and inequality shocks on the levels, and not the growth rates, of these variables.
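To illustrate how cumulative IRFs translate responses of growth rates into responses of levels, the Python sketch below sums the period-by-period responses \(A^{s}P\) of a VAR with a single lag. The one-lag simplification, the coefficient matrix `A`, and the impact matrix `P` are assumptions chosen for illustration; they are not estimates from this paper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def cumulative_irf(A, P, horizons):
    """Cumulative responses for a VAR(1): C_h = sum_{s=0}^{h} A^s P for h = 0..horizons.

    Entry [h][i][j] is the cumulative response of variable i to structural shock j.
    """
    n = len(A)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # A^0
    running = [[0.0] * n for _ in range(n)]
    out = []
    for _ in range(horizons + 1):
        step = matmul(power, P)  # response at this horizon
        running = [[running[i][j] + step[i][j] for j in range(n)] for i in range(n)]
        out.append(running)
        power = matmul(power, A)  # A^s -> A^(s+1)
    return out


# Hypothetical values; variable order is [income growth, change in Gini].
A = [[0.3, -0.2], [0.0, 0.4]]
P = [[1.0, 0.0], [-0.1, 0.5]]  # lower triangular: no impact effect of shock 2 on variable 1
responses = cumulative_irf(A, P, horizons=5)
```

In this made-up example the entry `responses[h][0][1]` is zero on impact and negative afterwards, mimicking a lagged negative effect of an inequality shock on the level of income.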

The results of any study based on structural VAR models crucially depend on the ordering of the variables in the VAR model. Therefore, before proceeding, it is worth pausing to discuss the choice of the ordering of the variables in our model. Income is used in the calculation of the Gini coefficient, so any recorded changes in income will have contemporaneous effects on the Gini coefficient. Changes in the Gini coefficient, on the other hand, have delayed effects on income. This is clear from the literature which focuses on long-run effects of inequality on growth (see e.g., Frank 2009a). Thus, the one needed restriction is that changes in the Gini coefficient do not have contemporaneous effects on growth (that is, effects within a year of the change, given annual data). Using a Cholesky decomposition, this means the Gini coefficient needs to be ordered second in the structural VAR.
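The recursive identification just described amounts to factoring the residual covariance matrix as \(\Sigma =PP'\) with \(P\) lower triangular. The Python sketch below does this for a 2×2 case with income growth ordered first and the Gini change second; the covariance values and the function name are hypothetical.

```python
import math


def cholesky_2x2(sigma):
    """Lower-triangular P with P P' = sigma for a 2x2 symmetric positive-definite matrix."""
    s11, s12, s22 = sigma[0][0], sigma[0][1], sigma[1][1]
    p11 = math.sqrt(s11)
    p21 = s12 / p11
    p22 = math.sqrt(s22 - p21 * p21)
    return [[p11, 0.0], [p21, p22]]


# Hypothetical residual covariance; order is [income growth, change in Gini].
sigma = [[4.0, 1.0], [1.0, 2.0]]
P = cholesky_2x2(sigma)

# Column j of P holds the impact responses to structural shock j.
impact_of_gini_shock_on_income = P[0][1]
impact_of_income_shock_on_gini = P[1][0]
```

The zero in the upper-right corner of `P` is the identifying restriction: a Gini shock cannot move income within the year, while an income shock may move the Gini index contemporaneously.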

We believe our empirical approach (panel VAR) has significant advantages over traditional (time-series) VAR models or panel data models because panel VARs, such as (1), capture more complexity than either traditional VARs or panel data models. Firstly, panel VARs combine features of time-series VARs, by treating all the variables in the model as endogenous, with features of panel data frameworks, by allowing for unobserved heterogeneity across states (Love and Zicchino 2006). Secondly, our approach begins with a baseline reduced-form panel VAR model. It is well documented in the panel VAR literature that these models fit economic data particularly well, and our data are no exception. In addition, our empirical methodology is consistent with several structural models, yet is parsimonious enough to capture the underlying relationship between inequality and growth without the need to make overly strong and debatable identifying assumptions. An added advantage of using a panel dataset is greater efficiency (Nijman and Verbeek 1990).

4 Estimation and results: impulse response functions

4.1 Results for a benchmark bivariate VAR model

For the panel VAR models estimated using annual state-level data, four lags seem sufficient to capture the dynamics of the system. We acknowledge that such a lag selection is subjective; however, four lags seemed the most consistent and stable choice when we estimated the panel VAR models using various transformations of the data. In addition, some researchers have found that when panel data contain a large number of cross-sectional units (states in this case), there might be an increased chance of identifying spurious relationships, regardless of the outcome of lag-selection tests (Pollakowski and Ray 1997).

Figure 3 displays the cumulative IRFs from a one-standard-deviation shock, together with their corresponding 5 % confidence bands. The confidence bands were generated by Monte Carlo simulation methods based on 500 draws. Consider first the response of per capita income to its own shock. Per capita income rises permanently, with a peak response of about 7 percent in the fourth year, after which it converges to a new long-run level which is higher than the initial one. A one-standard-deviation Gini shock leads to a permanent increase in income inequality, with some initial overshooting of approximately 17 percent.
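The general idea behind Monte Carlo bands can be sketched as follows: draw the model coefficients repeatedly from their estimated sampling distribution, recompute the cumulative response for each draw, and take percentiles across draws at each horizon. The Python example below does this for a scalar AR(1) rather than the paper's panel VAR, and the point estimate, standard error, and percentile levels are all assumptions for illustration; only the number of draws (500) comes from the text.

```python
import random


def cumulative_response_ar1(rho, horizons):
    """Cumulative response to a unit shock in an AR(1): sum_{s=0}^{h} rho^s for h = 0..horizons."""
    total, power, out = 0.0, 1.0, []
    for _ in range(horizons + 1):
        total += power
        out.append(total)
        power *= rho
    return out


def monte_carlo_bands(rho_hat, se, horizons, draws=500, lower=0.05, upper=0.95, seed=0):
    """Percentile bands from coefficient draws rho ~ N(rho_hat, se^2)."""
    rng = random.Random(seed)
    sims = [cumulative_response_ar1(rng.gauss(rho_hat, se), horizons) for _ in range(draws)]
    bands = []
    for h in range(horizons + 1):
        values = sorted(sim[h] for sim in sims)
        bands.append((values[int(lower * (draws - 1))], values[int(upper * (draws - 1))]))
    return bands


# Hypothetical point estimate and standard error (not estimates from the paper).
bands = monte_carlo_bands(rho_hat=0.5, se=0.05, horizons=5)
```

A response is usually called significant at a given horizon when the band at that horizon excludes zero.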

Fig. 3 Response of per capita income and inequality to income and inequality shocks

Now consider the response of per capita income to inequality shocks, and the response of inequality to shocks in per capita income, which are central to this paper. By construction, real per capita income does not respond to a Gini shock contemporaneously (Cholesky identification), so there is nothing to say about the contemporaneous response of real per capita income to an inequality shock. However, we find lagged responses of per capita income. An unexpected one-standard-deviation increase in income inequality decreases the level of real per capita income, leaving it permanently below what it otherwise would have been. Over time, the effect of the inequality shock gradually decreases in significance. While it is difficult to directly compare our estimates with previous studies, as these studies use different methods and usually examine the correlation, and not necessarily causality, between growth and inequality, it is interesting to note that the cumulative response of per capita income to a Gini shock is negative. Our dataset and methodology most closely follow Frank (2009a), except that we use a panel VAR methodology, while Frank (2009a) constructs individual state-level cumulative IRFs to examine the long-term responses of income to inequality shocks, and then computes the mean of the state responses to these shocks. We believe our panel VAR approach is superior to the methodology in Frank (2009a) because a panel VAR approach combines features of time-series VAR models, by treating all the variables in the model as endogenous, with features of panel data models, by allowing for unobserved heterogeneity across states (Love and Zicchino 2006), and is therefore more efficient. Nonetheless, it is interesting to note that our finding of a significant decrease in income following a permanent shock to the Gini index is consistent with the state-averaged cumulative responses documented in Frank (2009a). Negative associations between inequality and growth have also been documented by numerous studies, including Alesina and Rodrik (1994) and Alesina and Perotti (1996), among others.

Figure 3 also shows that the impact response of the Gini index to an income shock is a decline of 0.5 percent when real per capita income doubles. Following a shock to per capita income, the estimated cumulative impact on the Gini inequality index rises slightly (but is still negative) between years one and three, and then declines again by the fourth year, after which it is essentially flat. In fact, the IRFs show that the response of the Gini index to an income shock is not significantly different from zero for any horizon longer than 2 years. Frank (2009a) similarly finds that the state-averaged cumulative response of inequality to a permanent shock in per capita income is essentially zero. Again, we stress that the results presented in this paper are not directly comparable to those in previous studies, yet it is worth noting that cross-sectional studies such as Huang et al. (2009) and Beck et al. (2007) also find negative but statistically insignificant effects of growth on inequality.

4.2 Discussion of identification restrictions

The IRFs shown in Sect. 4.1 were constructed using a bivariate panel VAR model. We identified the impulse responses using the Cholesky decomposition of the covariance matrix of the residuals. The same identification assumptions apply to the analysis in the rest of the paper, so in this section, we discuss possible drawbacks of our identification assumptions. A potential objection to our approach is that the baseline specification is a bivariate panel vector autoregression model of the growth rate of real per capita income and the change in the Gini index. A bivariate panel VAR model possibly suffers from omitted variable bias. There might be other macroeconomic shocks besides Gini shocks simultaneously affecting per capita growth. If this is indeed the case, then our identified Gini shocks might be contaminated by shocks to economic variables other than per capita income. However, we do not think this issue is very serious. Given the short panels usually available for economic analyses, a bivariate panel VAR keeps the model simple, which is crucial when dealing with short panels, even though the model might suffer from omitted variable bias.

To address this concern, we estimate a trivariate VAR model of income growth, inequality, and human capital (measured using the average years of schooling per state). Footnote 5 These data are collected from Turner et al. (Journal of Economic Growth, 2007). Because there is no theory or general consensus as to where education should be placed in the VAR model, we present the results of the three-variable VAR model with education placed at different positions in the VAR.

Figure 4 presents the cumulative IRFs resulting from the three-variable VAR model. The IRFs are similar, regardless of the ordering of the variables in the VAR model. Figure 4 also suggests that the baseline bivariate VAR model does not suffer from omitted variable bias, as the inclusion of education does not change the underlying results. The effect of per capita income on inequality in this specification is identical to that in Fig. 3. Per capita income still decreases immediately following a shock to the Gini index, after which the response becomes insignificant. Footnote 6 The figure also shows a permanent rise in income in response to an education shock. This finding is consistent with endogenous growth models that posit that investment in human capital and knowledge are significant contributors to growth (see e.g., Romer 1990; Barro and Sala-i-Martin 1997). On the other hand, we find no statistically significant responses of education to income and Gini shocks. This result is not surprising. The average years of schooling for our sample is 10.55 years. This means that for most states, many individuals only have a high school education or less. Since pre-college education in the U.S. is, for the most part, free and not dependent on family income, it should be expected that income shocks would not have any significant effect on education.

Fig. 4 Trivariate impulse response functions. a Ordering: College Education, Real Per Capita Income, Gini Index. b Ordering: Real Per Capita Income, College Education, Gini Index. c Ordering: Real Per Capita Income, Gini Index, College Education

An interesting finding, from a policy standpoint, is the significant negative and persistent response of inequality to shocks to education. While many studies have focused on the role that income redistribution and tax policies can play in reducing inequality, our finding of a persistent negative response of inequality to education shocks implies that education policies that focus on equity in education may be a particularly important way to increase earnings mobility and decrease income inequality in the long run. States can strive to achieve this goal by providing equal opportunities to disadvantaged and advantaged students to achieve successful academic outcomes, thus laying solid foundations for them to achieve the higher levels of education needed to secure higher paying jobs and close the income gaps.

Our results from the three-variable VAR model give some degree of confidence that the bivariate model does not suffer from omitted variable bias.

4.3 Other measures of inequality

The Gini measure of inequality used in this paper is derived from tax filing data. Tax data, however, omit many low-end income earners, meaning a broad-based measure of income inequality can be misleading. This limitation has caused some researchers who use tax-derived data to focus instead on upper-end measures of income inequality, such as the top 10 % and top 1 % income shares (see e.g., Frank 2009a). To examine how robust our findings are to different measures of inequality, we consider four alternative measures of income inequality, namely the Theil Entropy Index, the Relative Mean Deviation, the top 10 % income share, and the top 1 % income share. Appendix A of Frank (2009b) provides detailed documentation of the construction of these measures of inequality. Figure 5 presents the impulse responses using these measures of inequality.
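To make the alternative measures concrete, the sketch below computes textbook versions of each from a list of individual incomes. It is an illustration only: the measures used in this paper are built from tax filing data as documented in Frank (2009b), and that construction is not reproduced here. The function names are hypothetical, and the normalization of the relative mean deviation is an assumption (some sources divide by an additional factor of 2).

```python
import math

def gini(incomes):
    """Gini coefficient from the rank-weighted sum over sorted incomes."""
    x = sorted(incomes)
    n = len(x)
    weighted = sum((2 * i - n - 1) * v for i, v in enumerate(x, start=1))
    return weighted / (n * sum(x))

def theil(incomes):
    """Theil entropy index (Theil T); requires strictly positive incomes."""
    n = len(incomes)
    mu = sum(incomes) / n
    return sum((v / mu) * math.log(v / mu) for v in incomes) / n

def relative_mean_deviation(incomes):
    """Mean absolute deviation from the mean, relative to the mean.

    Assumed normalization: 1 / (n * mean); other sources use 1 / (2 * n * mean).
    """
    n = len(incomes)
    mu = sum(incomes) / n
    return sum(abs(v - mu) for v in incomes) / (n * mu)

def top_share(incomes, fraction):
    """Share of total income received by the top `fraction` of earners."""
    x = sorted(incomes, reverse=True)
    k = max(1, round(fraction * len(x)))  # at least one earner in the top group
    return sum(x[:k]) / sum(x)

# Made-up incomes: nine low earners and one high earner.
sample = [1.0] * 9 + [91.0]
for name, value in [("Gini", gini(sample)),
                    ("Theil", theil(sample)),
                    ("RMD", relative_mean_deviation(sample)),
                    ("Top 10 % share", top_share(sample, 0.10))]:
    print(name, round(value, 3))
```

The broad-based measures (Gini, Theil, relative mean deviation) use the whole distribution, whereas the top shares depend only on the upper tail, which is why the latter are less affected by the omission of low-end earners from tax data.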

Fig. 5 Response of per capita income and inequality to income and inequality shocks: different measures of inequality. a Theil Entropy Index. b Relative mean deviation. c Top decile share. d Top 1 percent share

Figure 5 shows that the response of real per capita income to an inequality shock is similar across all four measures of inequality. By construction, there is no contemporaneous response of the level of real per capita income to an inequality shock for any of the four measures of inequality. However, we find that an unexpected one-standard-deviation increase in income inequality decreases the level of real per capita income, with the maximum cumulative response occurring 5 years after the shock. The cumulative responses of the various measures of inequality to shocks in income differ slightly. The impact responses of all four measures are slightly positive, which is opposite to the initial response using the Gini coefficient. Despite slight quantitative differences, the Theil index, the top decile income share, and the top 1 % income share decrease permanently, experiencing their sharpest cumulative decreases 5 years after the shock. Footnote 7 This negative response is statistically significant for the top decile and top 1 percent shares. The response of the relative mean deviation index is insignificant. Taken together, the results presented in this subsection suggest that the estimates of how income responds to an inequality shock in the baseline model are not sensitive to the choice of inequality measure. The small response of inequality to an income shock observed in the baseline model, however, is not robust to different measures of inequality.
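The statement that income has no contemporaneous response "by construction" follows from the recursive (Cholesky) identification with per capita income ordered before the inequality measure. A sketch of the algebra is given below. The notation is an assumption introduced here for illustration and is not taken from the paper: $y_t$ stacks the two variables, $u_t$ are the reduced-form residuals with covariance $\Sigma_u$, $\varepsilon_t$ are the orthogonalized shocks, $P$ is the lower-triangular Cholesky factor, $\Phi_h$ are the moving-average coefficient matrices of the VAR, and $\Psi_H$ is the cumulative response through horizon $H$.

```latex
\begin{aligned}
y_t &= (\text{income}_t,\ \text{inequality}_t)', \qquad
u_t = P\,\varepsilon_t, \qquad
\Sigma_u = P P', \qquad
P = \begin{pmatrix} p_{11} & 0 \\ p_{21} & p_{22} \end{pmatrix}, \\
\Theta_h &= \Phi_h P, \qquad \Phi_0 = I
\;\Longrightarrow\;
\Theta_0 = P, \qquad [\Theta_0]_{12} = 0, \\
\Psi_H &= \sum_{h=0}^{H} \Theta_h .
\end{aligned}
```

The zero in the upper-right corner of $P$ is the impact response of income to an inequality shock. The restriction applies only at $h = 0$: the element $[\Psi_H]_{12}$ is unrestricted for $H \ge 1$, and $p_{21}$, the impact response of inequality to an income shock, is left free.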

4.4 Structural break tests and sample splitting

How stable are the responses documented above, and to what extent are they driven by events that changed the underlying structure of the U.S. economy, such as World War II and the Great Moderation? To examine this, we split our sample into three subsamples: 1930–1947, 1948–1984, and 1985–2005. Such an arbitrary split certainly has limitations. One potential limitation is that the break dates have to be known in advance. However, we do not believe this problem is too severe for the analysis because, for the U.S., enough is known about its history to suggest that we should at least look at certain dates to check for the stability of the VAR-based impulse response functions. In addition, many studies have documented structural break dates that coincide with wars for many macroeconomic aggregates (Kilian and Ohanian 1998; Strazicich et al. 2004). Furthermore, Kim and Nelson (1999) and McConnell and Perez-Quiros (2000), among others, have documented the presence of a structural break in the volatility of GDP growth at the beginning of 1984 (the so-called Great Moderation). Other researchers have found evidence of structural breaks in other macroeconomic variables including aggregate unemployment (Warnock and Warnock 2000), aggregate consumption and aggregate income (Chauvet and Potter 2001), and wages, prices and the inflation rate (Sensier and Van Dijk 2004; Stock and Watson 2002; Bachmeier and Cha 2011). Cutler and Katz (1992) note that “movements in both the poverty rate and family income inequality indicate a break in the relationship between macroeconomic performance and inequality beginning about 1983”.
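As a minimal sketch of the sample split, the snippet below partitions state-year records into the three subsamples using the break dates given above. The record layout (dictionaries with `state` and `year` keys) and the names `SUBSAMPLES` and `split_by_period` are hypothetical; the panel VAR would then be re-estimated separately on each subsample.

```python
# Break dates from the text; the end points are inclusive.
SUBSAMPLES = {
    "1930-1947": (1930, 1947),
    "1948-1984": (1948, 1984),
    "1985-2005": (1985, 2005),
}

def split_by_period(rows, periods=SUBSAMPLES):
    """Group state-year records by the subsample their year falls in."""
    out = {label: [] for label in periods}
    for row in rows:
        for label, (start, end) in periods.items():
            if start <= row["year"] <= end:
                out[label].append(row)
                break  # the periods do not overlap
    return out

# One hypothetical state observed annually over 1930-2005.
rows = [{"state": "XX", "year": y} for y in range(1930, 2006)]
for label, part in split_by_period(rows).items():
    print(label, len(part), "annual observations per state")
```

The first subsample contains far fewer annual observations per state than the second, which is worth keeping in mind when comparing the precision of the estimated responses across panels.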

Panels a–c of Fig. 6 show the responses across the three subsamples. Consider first Fig. 6a, which shows the estimated cumulative impulse response functions for the 1930–1947 period. The results are qualitatively quite similar to the results in Sect. 4.1. Quantitatively, the estimated impulse responses are approximately twice as large as those reported in Fig. 3. While the response of the Gini to per capita income shocks remains statistically insignificant, the response of per capita income to Gini shocks is significantly different from zero, as in Fig. 3.

Fig. 6 Response of per capita income and inequality to income and inequality shocks for three sample periods. a Impulse response functions: 1930–1947. b Impulse response functions: 1948–1984. c Impulse response functions: 1985–2005

Now consider Fig. 6b, which shows the results estimated using the 1948–1984 sample. The story is quite different. Contrary to Figs. 3 and 6a, Gini shocks have no effect on per capita income. In addition, the response of the Gini index to a unit shock in per capita income is now significantly negative, implying that increases in per capita income led to decreases in income inequality during this period. This result is hardly surprising. The end of WWII saw rapid economic growth in the U.S. Sharp increases in agricultural productivity growth, fueled by rapid mechanization and chemicalization of agriculture, raised the incomes of low-income farm workers and moved them into the middle class. The post-war “baby boom” led not only to an unprecedented rise in consumer spending, but also to a rise in the number of consumers. Suburbs grew dramatically following the Federal-Aid Highway Act of 1956, while the Employment Act of 1946 cultivated an environment for sustained economic growth. Rapid industrialization (the “military-industrial complex”), the expansion of labor unions and policy, as well as President Lyndon Johnson’s “Great Society” that expanded opportunities for minorities, all led to a dominant middle class and higher per capita incomes that led to a decrease in income inequality during this period (Conte 2001).

The third panel of Fig. 6 shows the response of per capita income and inequality during the 1985–2005 period, a period generally called the “Great Moderation”. This period is typically characterized by a moderated or reduced volatility in several macroeconomic variables (Stock and Watson 2002) caused by structural shifts in output from goods to services (Moore and Zarnowitz 1986), information-technology-led improvements in inventory management (McConnell and Perez-Quiros 2000), innovations in financial markets that facilitate intertemporal smoothing of consumption and investment (Blanchard and Simon 2001), improved monetary policy (Cogley and Sargent 2001), and reductions in the variance of exogenous structural shocks (Stock and Watson 2002). This reduced variance of structural shocks implies that the response of macroeconomic variables to these shocks should be less severe than in earlier periods. This is certainly the case in Fig. 6c. The response of the Gini index to a unit shock in per capita income is positive and significant for the first 2 years following the shock, after which it becomes statistically insignificant. On the other hand, per capita income tends to decrease following an inequality shock, although the response is not as severe as in the 1930–1947 period. These findings confirm the importance of the Great Moderation for the relationship between inequality and growth. Overall, our findings in this subsection imply that, as with any macroeconomic time series, the relationship varies over time and is sensitive to particular episodes in history. They also inform us as to what circumstances drive the baseline results.

4.5 A note on the medium and long-run relationships between inequality and per capita income

In the empirical literature relating the level or growth of per capita income and inequality, this relationship is typically estimated using data averaged over 5- or 10-year periods (see e.g., Alesina and Rodrik 1994; Persson and Tabellini 1994; Forbes 2000). While previous studies adopt such a long-term perspective partly because of data limitations, this perspective also aims to explain long-run rather than short-run variations. Such a longer-term approach may also reduce the serial correlation from business cycle fluctuations that is captured in annual data. Consequently, we transform our data by constructing 5- and 10-year panels, and re-estimate the bivariate panel VAR.
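The transformation to 5- and 10-year panels can be sketched as follows for a single state's annual series. This is an illustration under stated assumptions rather than the paper's procedure: the blocks are taken to be consecutive and non-overlapping, an incomplete block at the end of the sample is dropped, and the function name `block_average` is hypothetical.

```python
def block_average(series, start_year, width):
    """Average an annual series over consecutive non-overlapping blocks.

    Returns (first year of block, block mean) pairs. A trailing block with
    fewer than `width` observations is dropped (an assumption made here).
    """
    blocks = []
    for i in range(0, len(series) - width + 1, width):
        chunk = series[i:i + width]
        blocks.append((start_year + i, sum(chunk) / width))
    return blocks

# Made-up annual series covering 1930-2005 (76 observations).
annual = [100.0 + t for t in range(76)]
five_year = block_average(annual, 1930, 5)
ten_year = block_average(annual, 1930, 10)
print(len(five_year), "five-year blocks;", len(ten_year), "ten-year blocks")
```

Averaging shortens the time dimension considerably, so the resulting impulse responses are measured in 5- or 10-year periods rather than in years.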

Figure 7a shows the impulse responses constructed using the 5-year averaged data, while Fig. 7b shows the impulse responses constructed using the 10-year averaged data. The impulse responses show that the medium- and long-term impact of inequality on per capita income is negative. We also find that shocks to per capita income reduce inequality in the medium and long run.

Fig. 7 Impulse response functions of the longer-run responses to per capita income and inequality shocks. a Medium-run (5-year averaged data) responses. b Long-run (10-year averaged data) responses

The literature on inequality and economic performance traditionally examines the effect of initial conditions on long-term economic performance (Alesina and Rodrik 1994; Li and Zou 1998; Durlauf and Quah 1999; Forbes 2000; Fallah and Partridge 2007). Empirically, one justification for using predetermined variables is that it reduces the problem of endogeneity, since, by so doing, there should be no direct simultaneity between the variables. However, simultaneity is already addressed by the lag structure inherent in VAR models. A more interesting hypothesis is that initial conditions are important in determining longer-run economic performance. For these purposes, we estimate the effect of initial inequality (inequality at the start of the period) on the level of real per capita income in the medium run (5-year average per capita income) and long run (10-year average per capita income). Figure 8 presents these results. Footnote 8

Fig. 8 Impulse response functions of the effect of initial inequality on longer-run per capita income

The results in Fig. 8 show that initial inequality decreases per capita income in the medium and long run, although there is an initial positive response to a Gini shock in the medium run. Not only are these results significant, but they also agree with those traditionally reported in the literature. For example, using data over 5-year panels, Forbes (2000) finds that “in the short and medium term, an increase in a country’s level of income inequality has a strong positive correlation with subsequent economic growth”. The positive initial response of the 5-year averaged per capita income to a Gini shock shown in Fig. 8 may in fact be capturing the “strong positive correlation” documented by Forbes. Forbes (2000) also remarks that

Moreover, these estimates do not directly contradict the previously reported negative relationship between inequality and growth....it is possible that the strong positive relationship between inequality and growth could diminish (or even reverse) over significantly longer periods

Using data over longer time periods, Alesina and Rodrik (1994), among others, have documented long-run negative correlations between inequality and income (growth). The negative responses of per capita income to initial inequality shocks shown in Fig. 8 may likewise be capturing the negative correlations documented in these previous studies of the long-run response of per capita income to initial inequality.

5 Conclusion

Previous studies on the relationship between income inequality and the level or growth of per capita income have traditionally focused on the correlation between these variables. However, one cannot make informed policy decisions based on correlations. Consequently, using a new comprehensive panel of annual U.S. state-level income inequality data from 1930 to 2005 constructed by Frank (2009b), this study reconsiders the relationship between inequality and per capita income using a panel VAR approach. A panel VAR allows not only for the examination of the correlation between income inequality and per capita income, but also for the examination of the dynamic responses of these variables to income and inequality shocks. It is worth mentioning that this paper is the first to use a panel VAR approach to examine the effect of inequality on per capita income and of per capita income on inequality using U.S. state-level data.

Results from the baseline bivariate panel VAR model indicate that per capita income rises permanently following its own shock, with a peak response of about 7 percent in the fourth year. We also find that a shock to the Gini index leads to a permanent increase in income inequality, with some initial overshooting of approximately 17 percent. Because our empirical methodology applies the Cholesky decomposition, the baseline results also show that real per capita income has no contemporaneous response to a Gini shock. However, we find that an unexpected one-standard-deviation increase in the Gini index decreases the level of real per capita income, but the long-run response is insignificant. The impact response of the Gini index to an income shock is a decline of 0.5 percent when real per capita income doubles.

We conduct a series of robustness checks to determine if our results are sensitive to the data used, different time periods, and the inclusion of additional variables in the VAR. We examine if the impulse responses from the baseline bivariate panel VAR model are sensitive to alternative measures of income inequality, namely the Theil Entropy Index, the Relative Mean Deviation, and the proportions of income held by the top 10 % and top 1 % of the population of each state. The response of income to an inequality shock using all four measures of inequality closely mirrors the response when the Gini index is used as the measure of inequality. The small response of inequality to an income shock, however, is not robust to different measures of inequality. Furthermore, we examine how stable our estimated impulse responses are through time by splitting the sample into three subsamples: 1930–1947, 1948–1984, and 1985–2005. The results from the first sample are qualitatively similar to the results from the entire sample but are approximately two times larger in magnitude. The results from the 1948–1984 sample show a pattern in which per capita income does not respond to Gini shocks, while shocks to per capita income significantly decrease inequality. For the 1985–2005 sample, per capita income decreases slightly after a Gini shock, while the Gini response to an income shock is mostly insignificant, although there is an initial positive response within the first 2 years after the shock. Because some researchers argue that the results obtained when examining the short-run relationship between income inequality and per capita income or growth may be spurious if these results are not robust to longer-run situations, we construct 5- and 10-year panels and estimate the cumulative responses of these variables to both shocks. The general pattern of the impulse responses is similar to the baseline specification, although the magnitude of these responses is now considerably smaller. When we estimate the effect of initial inequality on per capita income in the medium run, we find an initial positive response, although the response later turns and stays negative. Li and Zou (1998) and Forbes (2000) document a positive correlation between income inequality and growth, and therefore it is possible that the initial positive response of per capita income is in fact capturing the sort of positive correlation documented in those studies. In the long run, we find that the response of per capita income is negative as well. These results highlight the relatively strong finding of our estimation: increases in inequality lead to lower income per capita.

Throughout the paper, identification of the impulse responses was achieved by applying a Cholesky decomposition to the covariance matrix of the residuals. In addition, the baseline model is a bivariate panel VAR model. A bivariate VAR model potentially omits relevant information. We find evidence that, in general, the impulse responses are not sensitive to the inclusion of a third variable.