1 Introduction

Over the last four decades, the USA has experienced a significant rise in income inequality. Researchers have attributed this increase to a multitude of factors, including skill-biased technological change (Card and DiNardo 2002; Acemoglu 1998), trade and financial globalization (Feenstra and Hanson 1996; Munch and Skaksen 2008; Philippon and Reshef 2012), changes in labor market institutions (Wilkinson and Pickett 2010; Jaumotte and Buitron 2015), income redistribution policies (Bassett et al. 1999; Benabou 2000; Joumard et al. 2012), monetary policy (Galbraith 1998; Coibion et al. 2016), and educational attainment (De Gregorio and Lee 2002; Atems and Jones 2015).

Parallel to the increase in income inequality in the USA has been a rise in entrepreneurship, typically measured as the number of self-employed, the number of (small) businesses in an economy, (small) business employment, or the number of businesses per capita (Quadrini 1999, 2000, 2009; Shane 2009; Halvarsson et al. 2016). Figure 1 plots the relationship between income inequality, measured using the Gini index, and these measures of entrepreneurship (all expressed in natural logarithms) using aggregate US data from 1989 to 2013. Figure 1a illustrates the relationship between the Gini index and the number of self-employed individuals, Fig. 1b shows the trend in small private nonfarm business employment, and Fig. 1c shows the evolution of the number of small businesses in the US economy. All three figures depict upward trends in income inequality and the measures of entrepreneurship. This upward trend is, however, typically interrupted immediately following recessions, when both entrepreneurship and inequality temporarily fall.

Fig. 1 Gini coefficient, number of self-employed, and small business employment

This paper is motivated by the trends shown in Fig. 1. We seek to investigate whether entrepreneurship impacts inequality or whether this correlation is merely an empirical coincidence. That entrepreneurship may increase inequality has been suggested by, among others, Halvarsson et al. (2016), Åstebro et al. (2011), Gort and Lee (2007), Hamilton (2000), and Evans and Leighton (1989). These authors contend that while entrepreneurship increases income for some people, most self-employed and small-scale entrepreneurs have average earnings that are lower than the population average. In particular, Åstebro et al. (2011) propose a model of labor market frictions in which entrepreneurs are skilled in a variety of activities and there is complementarity between their skills, finding that average wages under self-employment are lower than mean wages under paid employment. Hamilton (2000) examines differences in earnings between self-employment and paid employment, reporting that a majority of entrepreneurs enter and remain in business even though their initial earnings, as well as their earnings growth, are lower than the earnings of individuals in paid employment. He estimates an earnings differential of 35% for self-employed individuals, further arguing that this differential potentially underestimates the true differential as fringe benefits are generally not included in wages under paid employment. Using data for Canada, Lin et al. (2000) find that self-employed earnings average approximately 8 cents to a dollar less than those of wage and salary employees. Similar findings have been reported by, among others, Mathias et al. (2017) and Åstebro and Chen (2014).

While the abovementioned studies provide evidence on earnings differentials between entrepreneurs and paid employees, few papers have directly estimated the link between entrepreneurship and income inequality at a more macrolevel. Existing studies have typically examined how the structure of inequality in compensation and opportunity within organizations triggers low-earning employees to exit these organizations for other firms or to pursue more entrepreneurial ventures. For example, Sørensen and Sharkey (2014) find that employees in organizations with high wage inequality are more likely to make the transition to entrepreneurship, but find no evidence of these individuals moving to other firms. Using linked employer-employee data for the legal services sector, Carnahan et al. (2012) also report a positive relationship between high-performance employees and their likelihood of transitioning to entrepreneurship in organizations with a low degree of compensation dispersion, while low-performing employees in organizations with greater compensation dispersion are less likely to become entrepreneurs after they exit their current organization.

The theoretical literature, however, has suggested mechanisms through which entrepreneurship may increase aggregate income and wealth inequality. These theories generally rely on incentive arguments, whereby borrowing constraints on entrepreneurial investment and savings, together with the high costs of external finance, provide incentives for entrepreneurs to accumulate wealth, leading to a higher observed equilibrium concentration of wealth among entrepreneurs than among paid employees (see e.g., Quadrini 2000, 2009; Bohacek 2006; Cagetti and DeNardi 2006). Quadrini (2000), for example, develops a dynamic general equilibrium model that incorporates entrepreneurial choice and financial frictions, showing that more wealth inequality is generated in the model that explicitly incorporates entrepreneurship than in one that does not consider entrepreneurial activities. Extending the work of Quadrini (2000), Cagetti and DeNardi (2006) construct and calibrate an occupational choice model that allows for entrepreneurial entry, exit, and investment decisions in the presence of borrowing constraints. They show that a model economy without entrepreneurs produces a concentration of wealth that is significantly lower (the top 1% held only 4% of total wealth) than that observed in aggregate US data. However, when entrepreneurs are incorporated in the benchmark model economy, the distribution of wealth in this economy is similar to that observed in the data (the top 1% owned about 30% of total wealth). While these theoretical models focus on wealth inequality, incentives to increase wealth may also lead to increases in aggregate income. For example, borrowing and other financial constraints may cause individuals with entrepreneurial goals to increase work effort (Schumpeter 1934), leading to higher levels of income growth and capital accumulation for (potential) entrepreneurs (Jones and Kim 2017), which in turn increases aggregate income inequality.

This paper represents the first attempt to estimate the impact of entrepreneurship on income inequality using US state-level data. We use data on all 50 US states and the District of Columbia (DC) from 1989 to 2013. Following, among others, Meager (1992), Van Stel et al. (2005), and Thurik et al. (2008), the paper uses the number of self-employed as a fraction of the total labor force (self-employment rate) as the measure of entrepreneurship. For robustness purposes, we also use three other proxies for entrepreneurship, namely the number of small businesses as a percent of total businesses, private small business employment as a percent of total private nonfarm employment, and the number of businesses per capita. Our econometric strategy relies on the Arellano and Bover (1995) and Blundell and Bond (1998) system generalized method of moments (GMM) estimator. Empirical results show evidence of a strong positive relationship between entrepreneurship and income inequality. The results are robust across the different measures of entrepreneurship, as well as to alternative measures of income inequality.

The rest of the paper is organized as follows. Section 2 outlines the econometric methodology, while Section 3 describes the data, their sources, and any transformations to the data. Section 4 discusses the empirical results. Concluding remarks are presented in Section 5.

2 Econometric methodology and estimation

The baseline model for investigating the impact of entrepreneurship on income inequality is:

$$ I_{\text{it}}=\alpha+\beta\,I_{i,t-1}+\gamma\,E_{i,t-1}+\boldsymbol{X}_{i,t-1}\delta+\eta_{i}+\nu_{t}+\varepsilon_{\text{it}} $$
(1)

where $I_{it}$ and $I_{i,t-1}$, respectively, denote income inequality in state $i$ in years $t$ and $t-1$, $E_{i,t-1}$ represents entrepreneurship in state $i$ in year $t-1$, and $\boldsymbol{X}_{i,t-1}$ is the year $t-1$ vector of controls in state $i$. Entrepreneurship and the control variables are unlikely to affect inequality contemporaneously, hence they enter the model in lags. In Eq. 1, $\alpha$ is a constant, $\eta_i$ captures state-specific fixed effects, $\nu_t$ are year fixed effects, and $\varepsilon_{it}$ is the error term.

Before proceeding, it is useful to discuss the choice of the variables contained in $\boldsymbol{X}$. The literature on the determinants of income inequality is vast, so we only briefly motivate the choice of the control variables in $\boldsymbol{X}$. We include the percentage of individuals with at least a high school degree and the percentage with at least a college degree as proxies for the impact of human capital on inequality. The empirical literature has documented conflicting results on the impact of education on inequality. De Gregorio and Lee (2002) report a negative effect of education on inequality, while Katz and Murphy (1992), Acemoglu (1999, 2002), and Lemieux (2006), among others, attribute the rise in wage inequality to the college-high school wage premium. Many studies (see e.g., Brewer et al. 1999; Davies and Guppy 1997; Kane and Rouse 1995; Rumberger and Thomas 1993) have shown that while the income difference between high school and college has grown, so has the income difference between college graduates with an undergraduate and a graduate degree. Some of these studies also show that the income premium for different degrees has grown, with engineering and science degrees outperforming social science degrees, and social science degrees outperforming humanities degrees. Including measures of educational attainment beyond our two measures (high school and college) would be ideal to adequately control for the impact of human capital on income inequality. These data, however, are not available annually at the US state-level, limiting our paper to the use of high school and college attainment rates.

The paper also includes state-level unemployment rates, poverty rates, and real GDP per capita growth to control for the effect of macroeconomic conditions. That macroeconomic conditions affect inequality has been studied by Blank and Blinder (1986), Silber and Zilberfarb (1994), Jantti (1994), Mocan (1999), and Cysne (2009). Union membership, state minimum wages, and an indicator variable for right-to-work states are included to control for the well-documented link between labor market institutions and inequality (see e.g., DiNardo et al. 1996; DiNardo and Lemieux 1997; Card 2001; Koeniger et al. 2007). We follow the literature on the link between redistributive government policies and income inequality (see e.g., Milanovic 2000; Goni et al. 2011) and include average income tax (individual and corporate) and capital gains tax rates, as well as welfare expenditures as a percent of total state expenditures. The percent of the population 65 years and older, and the minority share of the population are included to control for demographic influences on income inequality, while the shares of total employment accounted for by employment in seven industry categories have been included to control for state industrial composition. The industry shares considered include the proportion of total employment accounted for by employment in construction, government, manufacturing, mining, retail sales, wholesale sales, and the finance, insurance, and real estate (FIRE) sectors. This list of controls is certainly not exhaustive and was determined by the availability and reliability of data for all 50 states and DC for the sample period. Furthermore, some factors that affect inequality may be unobservable. The inclusion of the state-specific effects should help control for the impact of unobservable factors on inequality that vary across states but are time-invariant. In addition, the possibility of common trends in the variables of interest may contaminate the results obtained from estimating Eq. 1. We include the year fixed effects, $\nu_t$, to mitigate this issue.

Ordinary least squares (OLS) estimation of Eq. 1 results in an estimate of $\beta$ that is upward biased (Hsiao 1986; Bond et al. 2001). The upward bias is caused by the positive correlation between the lagged dependent variable $I_{i,t-1}$ and the state-specific fixed effects $\eta_i$. To see this, suppose that a large shock to income in state $i$ affects the income distribution of that state, causing a significant increase in inequality in year $t$, $I_{it}$. Also suppose that the effect of the shock is only captured in $\varepsilon_{it}$ and not directly modeled in Eq. 1. This increase in inequality in year $t$ will cause the average value of $\eta_i$ over the entire sample period to be higher. This means that $I_{i,t-1}$ and $\eta_i$ will both be higher in the following year, causing the OLS estimate of $\beta$, the coefficient on $I_{i,t-1}$, to be upward biased. Nickell (1981) further shows that the bias cannot be eliminated using the fixed effects or least squares dummy variable estimator. In fact, fixed effects estimation of Eq. 1 produces an estimate of $\beta$ that is downward biased.
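
The direction of the two biases can be checked in a small simulation. The following Python sketch is illustrative only: it simulates a stripped-down version of Eq. 1 with no entrepreneurship term, controls, or year effects, and the parameter values (a true coefficient of 0.5 on the lagged dependent variable, 200 units, 10 periods) and all function names are assumptions made for the example, not features of the data or estimations in this paper.

```python
import random


def simulate_panel(n_units, n_periods, beta, seed=0, burn_in=50):
    """Simulate y_it = beta * y_i,t-1 + eta_i + eps_it for each unit."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        eta = rng.gauss(0.0, 1.0)          # unit-specific fixed effect
        y = eta / (1.0 - beta)             # start at the unit's long-run mean
        series = []
        for t in range(burn_in + n_periods):
            y = beta * y + eta + rng.gauss(0.0, 1.0)
            if t >= burn_in:               # discard the burn-in periods
                series.append(y)
        panel.append(series)
    return panel


def slope(x, y):
    """Slope from a bivariate least squares regression of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def pooled_ols_beta(panel):
    """Pooled OLS of y_it on y_i,t-1, ignoring the fixed effects."""
    x = [s[t - 1] for s in panel for t in range(1, len(s))]
    y = [s[t] for s in panel for t in range(1, len(s))]
    return slope(x, y)


def within_beta(panel):
    """Fixed effects (within) estimator: demean by unit, then regress."""
    x, y = [], []
    for s in panel:
        lagged, current = s[:-1], s[1:]
        mean_lagged = sum(lagged) / len(lagged)
        mean_current = sum(current) / len(current)
        x.extend(v - mean_lagged for v in lagged)
        y.extend(v - mean_current for v in current)
    return slope(x, y)


true_beta = 0.5
panel = simulate_panel(n_units=200, n_periods=10, beta=true_beta, seed=42)
print("pooled OLS:", pooled_ols_beta(panel))
print("within (FE):", within_beta(panel))
```

With these settings, the pooled OLS estimate should lie above the true value and the within estimate below it, which is the pattern described above.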

The dynamic panel data literature has generally advocated for the use of the so-called difference GMM estimator proposed by Arellano and Bond (1991) to address this problem. This estimator first-differences all the variables in Eq. 1, thereby removing the state-specific fixed effects ($\eta_i$):

$$ {\Delta} I_{\text{it}}=\beta\,{\Delta} I_{i,t-1}+\gamma\,{\Delta} E_{i,t-1}+{\Delta}\boldsymbol{X}_{i,t-1}\delta+{\Delta}\nu_{t}+{\Delta}\varepsilon_{\text{it}} $$
(2)

After first-differencing, appropriate lags of the variables in levels can be used as instruments for the variables in first differences. Under the assumption that errors are serially uncorrelated and independent across states, we have:

$$E\left( \varepsilon_{\text{it}}\varepsilon_{\text{is}}\right)= 0\quad\text{for}\;s\neq t $$

Furthermore, assuming that

$$E\left( I_{\text{i1}}\varepsilon_{\text{it}}\right)= 0\quad\text{for}\;t\geq2 $$

second and higher order lags of $I_{it}$ are valid instruments for Eq. 2. For example, $I_{i,1}$ can be used as an instrument for $(I_{i,2}-I_{i,1})$ for period 3, and $I_{i,1}$ and $I_{i,2}$ as instruments for $(I_{i,3}-I_{i,2})$ for period 4, and so on.
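
The way the instrument set grows with each period can be made concrete. The Python sketch below builds the block-diagonal instrument matrix for the lagged dependent variable of a single state, using all available lags. The function name and the toy inequality series are hypothetical, and the sketch covers only the lagged dependent variable, not entrepreneurship or the controls.

```python
def difference_gmm_instruments(levels):
    """Instrument matrix for one unit's lagged dependent variable.

    `levels` holds the level series for periods 1..T. Row r corresponds to
    the first-differenced equation for period t = r + 3, and is filled with
    the levels from periods 1..t-2; all other entries in the row are zero.
    """
    y = [float(v) for v in levels]
    n_periods = len(y)
    if n_periods < 3:
        raise ValueError("need at least three periods")
    n_rows = n_periods - 2
    n_cols = (n_periods - 2) * (n_periods - 1) // 2
    matrix = [[0.0] * n_cols for _ in range(n_rows)]
    col = 0
    for row, t in enumerate(range(3, n_periods + 1)):
        n_instruments = t - 2              # levels dated t-2 and earlier
        for k in range(n_instruments):
            matrix[row][col + k] = y[k]
        col += n_instruments
    return matrix


# Hypothetical four-period inequality series for one state.
example = difference_gmm_instruments([10, 11, 12, 13])
for row in example:
    print(row)
```

The first row holds the single instrument available for period 3, and the second row holds the two instruments available for period 4, matching the example in the text.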

Under the assumption that entrepreneurship and the other regressors are strictly exogenous, lags, leads, and contemporaneous values of $E_{it}$ and $\boldsymbol{X}_{it}$ are valid instruments for the equations in first differences. There are, however, compelling reasons to believe that entrepreneurship and the other regressors are not strictly exogenous. Possible sources of endogeneity include business-friendly policies and attitudes toward entrepreneurship versus government intervention among the state population or the current state government. There may also be longer term reverse causal effects of inequality on entrepreneurship. Maintaining the assumption that inequality and entrepreneurship (and the other regressors) are uncorrelated contemporaneously, Arellano and Bond (1991) show that lagged values of the predetermined $E_{it}$ and $\boldsymbol{X}_{it}$ are valid instruments for the first-differenced equation. If $E_{it}$ and $\boldsymbol{X}_{it}$ are endogenous, their second and higher order lags are valid instruments for the differenced Eq. 2.

Blundell and Bond (1998, 2000) argue that when the time series are highly persistent, instruments based on lags of the variables in levels may introduce weak instrument bias. To the extent that income inequality and the regressors are persistent, the difference GMM estimator may not be appropriate for our purposes. Arellano and Bover (1995) propose an alternate but related dynamic panel data estimator, the system GMM estimator, which has been shown by Blundell and Bond (1998) to considerably alleviate the weak instrument problem of the difference GMM estimator. The estimator requires that GMM estimation be performed using a “stacked” system consisting of two equations:

$$\begin{array}{@{}rcl@{}} {\Delta} I_{\text{it}}\!\!&=&\!\beta\,{\Delta} I_{i,t-1}\,+\,\gamma\,{\Delta} E_{i,t-1}\,+\,{\Delta}\boldsymbol{X}_{i,t-1}\delta\,+\,{\Delta}\nu_{t}\,+\,{\Delta}\varepsilon_{\text{it}}\\ I_{\text{it}}\!\!&=&\!\alpha\,+\,\beta\,I_{i,t-1}\,+\,\gamma\,E_{i,t-1}\,+\,\boldsymbol{X}_{i,t-1}\delta\,+\,\eta_{i}\,+\,\nu_{t}\,+\,\varepsilon_{\text{it}}\\ \end{array} $$
(3)

The first equation in the above system (3) is the same as Eq. 2 and, as in the difference GMM estimator, uses lags of the regressors in levels as instruments, while the second equation is Eq. 1 in levels, instrumented with lags of the regressors in first differences. Blundell and Bond (1998) recommend using the second and higher order lags of the regressors as instruments when applying the system GMM estimator. In addition to alleviating the endogeneity bias caused by the introduction of the lagged endogenous variable, the system GMM is more robust to measurement error than cross-sectional models (Bond 2002; Brülhart and Mathys 2008). The state fixed effects capture time-invariant measurement errors, while sufficiently long lags of the regressors as instruments mitigate the effect of time-varying measurement errors.

In light of these advantages, the results of this paper are estimated using the system GMM estimator. We use the third and fourth lags of the regressors as instruments and apply the two-step estimator to maximize efficiency. While we believe that our sample size is sufficiently large, we nonetheless apply the Windmeijer (2005) small sample correction of the standard errors in all the two-step system GMM estimations. Our reliance on “internal” instruments makes it necessary to test for instrument validity. Consequently, we report the Hansen test for overidentifying restrictions, as well as the Arellano and Bond (1991) test for second-order serial correlation (AR(2)) in the residuals for all regressions.
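
For reference, the moment conditions behind the two equations of system (3) can be written out for the lagged dependent variable. This is a sketch in the notation of Eq. 1 under the standard assumptions of Blundell and Bond (1998), using all available lags; it is not a statement of the exact instrument set used in the estimations, which restrict the lag depth as described above. Analogous conditions apply to the entrepreneurship measure and the controls when they are treated as endogenous.

$$ E\left( I_{i,t-s}\,{\Delta}\varepsilon_{\text{it}}\right)= 0\quad\text{for}\;t\geq 3,\;s\geq 2 $$

$$ E\left[ {\Delta} I_{i,t-1}\left( \eta_{i}+\varepsilon_{\text{it}}\right)\right]= 0\quad\text{for}\;t\geq 3 $$

The first set instruments the differenced equation with lagged levels. The second instruments the levels equation with lagged differences, and requires that changes in inequality be uncorrelated with the state fixed effects.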

3 Data

The data for the analysis consist of annual observations for all 50 US states and DC over the period 1989–2013. The dependent variable in our model is state-level income inequality. Data on nine measures of income inequality are constructed by Frank (2009) and obtained from http://www.shsu.edu/~eco_mwf/inequality.html. Frank (2009) constructs the state-level inequality measures from individual tax-filing data collected from the Internal Revenue Service’s (IRS) Statistics of Income (SOI) program. The pretax SOI measure of adjusted gross income (AGI) includes wages and salary income; capital income such as dividends, interest, rents, and royalties; and entrepreneurial income from self-employment, small businesses, and partnerships. Interest on state and local bonds, and federal and state intergovernmental transfer income are excluded. Using these data, Frank (2009) uses percentile rankings to derive income share measures of inequality, namely, the top 5, 1, 0.5, 0.1, and 0.01% income shares. Frank (2009) also uses these data to construct Gini coefficients, the Atkinson index, the relative mean deviation, and the Theil entropy index. We refer the reader to Frank (2009) for details on the construction of the inequality measures. In the current paper, we use all nine measures to ensure that conclusions regarding the impact of entrepreneurship on income inequality are stable and not sensitive to alternative measures of inequality.
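
To fix ideas about what three of these measures capture, the Python sketch below computes a Gini coefficient, a Theil entropy index, and a top income share from a list of individual incomes. It is a textbook illustration on hypothetical data, not the procedure of Frank (2009), who works with IRS tax-filing data; the function names are assumptions made for the example.

```python
import math


def gini(incomes):
    """Gini coefficient from individual incomes (0 = perfect equality)."""
    x = sorted(incomes)
    n = len(x)
    total = sum(x)
    # Rank-weighted sum of incomes, with ranks starting at 1.
    weighted = sum(rank * value for rank, value in enumerate(x, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def theil(incomes):
    """Theil entropy index; requires strictly positive incomes."""
    n = len(incomes)
    mean = sum(incomes) / n
    return sum((v / mean) * math.log(v / mean) for v in incomes) / n


def top_share(incomes, fraction):
    """Share of total income received by the top `fraction` of earners."""
    x = sorted(incomes, reverse=True)
    k = max(1, math.ceil(fraction * len(x)))   # number of top earners
    return sum(x[:k]) / sum(x)


# Hypothetical incomes for four individuals.
sample = [1, 2, 3, 4]
print("Gini:", gini(sample))
print("Theil:", theil(sample))
print("Top 25% share:", top_share(sample, 0.25))
```

All three functions return zero dispersion (or an equal share) when every individual has the same income, and rise as income becomes more concentrated at the top.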

Figure 2 plots the measures of income inequality used in this paper. A particularly striking feature of the graphs is the decline in inequality (across all the nine measures) immediately following the three US recessions that occurred since 1989 (1990–1991, 2001, 2007–2009). At first glance, this decrease in inequality might seem odd since many people lost their jobs or moved to lower paying service jobs in the following years. However, just as higher income individuals tend to reap greater financial benefits than lower income individuals during good economic times, they generally tend to incur greater financial loss in bad economic times. This occurs because more volatile sources of income related to capital, such as business income, capital gains, and dividends, typically constitute a relatively higher proportion of the income of higher income individuals than labor income (labor and wages), so that an economic downturn results in more financial losses for higher income individuals. According to 2011 Congressional Budget Office (CBO) estimates, labor income accounted for 37% of the income of the top 1% income earners, but 62% of the income of the middle quintile. In contrast, capital income accounted for 58% of the income of the top 1% versus 4% for the middle quintile. The same CBO estimates reveal that the top 1% derived 1% of their income from government transfers against 25% for the middle quintile. During the Great Recession of 2007–2009, the top 1% saw their income share decrease by 36.3% versus 11.6% for the bottom 99%, with much of the decline caused by decreases in capital gains, which fell by 75% from 2007 to 2009. Dividend and interest incomes, respectively, fell by 33 and 40%, while the drop in labor income was much smaller (6%). These large decreases in capital income relative to labor income explain the decreases in inequality during the Great Recession and in previous recessions.

Fig. 2 Measures of income inequality

Of particular interest in this paper is the possible effect that entrepreneurship has on income inequality. Following, among others, Van Stel et al. (2005), and Thurik et al. (2008), we use the number of self-employed as a fraction of the total labor force (self-employment rate) as the measure of entrepreneurship. The data on self-employment are from the Current Population Survey, public-use micro-data files, while data on labor force participation are from the Bureau of Labor Statistics (BLS). For robustness purposes, we also use three other proxies for entrepreneurship, namely the number of small businesses (< 100 employees) as a percent of total businesses, private small business employment as a percent of total private nonfarm employment, and the number of businesses per capita. Data on the other proxies for entrepreneurship were obtained from the US Census Bureau’s County Business Patterns.

As mentioned earlier, we control for the effect of education, unemployment, and other social, economic, and demographic factors that directly impact income inequality. We use two measures of educational attainment, namely the fraction of total population with a high school degree or more, and the percentage of the population with a college degree or higher. The data on high school and college attainment are constructed by Frank (2009) and available at http://www.shsu.edu/~eco_mwf/inequality.html. Data on the unemployment rate, union membership, and state minimum wages come from the BLS, while gross state product (GSP) per capita growth, welfare expenditure share, and the seven state-level industry employment shares were all collected from the Regional Economic Accounts (Bureau of Economic Analysis). Income and capital gains tax rates at the state-level are constructed following Feenberg and Coutts (1993), and are available on the website of the National Bureau of Economic Research (NBER) at http://users.nber.org/~taxsim/state-rates/. Demographic controls include the percent of the population aged 65 and older, and the state minority population share. Both demographic variables, together with poverty rates and population density were collected from the Census Bureau. We include a dummy variable indicating whether or not a state is a right-to-work state. Data on states with right-to-work laws were obtained from the National Right to Work Foundation. Table 1 provides descriptive statistics of the variables.

Table 1 Definitions and descriptive statistics of the variables: 1989–2013

4 Empirical results

4.1 Benchmark results

Table 2 reports estimation results for the nine measures of income inequality and with the self-employment rate used as the measure of entrepreneurship. Because the validity of the econometric methodology and the consistency of the estimated results, respectively, depend on the validity of the instruments and on the serial correlation of the errors, we first discuss the results of the Hansen tests of overidentifying restrictions, and the Arellano and Bond (AR(2)) tests for second-order serial correlation before turning to the estimated coefficients. These tests, displayed on the last rows of Table 2, provide no evidence of model misspecification problems for any of the nine regression models. That is, the Hansen tests of overidentifying restrictions in all regressions fail to reject the null hypothesis of the joint exogeneity of the instruments, suggesting that the instruments in general are valid. The AR(2) tests for second-order serial correlation fail to reject the null of no second-order residual autocorrelation. The results of these tests give us some degree of confidence in the reliability of our empirical methodology and the consistency of our estimated results.

Table 2 The impact of entrepreneurship on US income inequality

Turning now to the estimated coefficients, several observations are evident from the table. First, regardless of the measure of inequality, the estimate of the coefficient on the self-employment rate is always positive and statistically significant at conventional levels of significance. When the Gini index is used as the measure of inequality, the estimated coefficient on the self-employment rate is 0.035, suggesting that a one percentage point increase in the self-employment rate is associated with a 0.035 point increase in income inequality (Gini index). The magnitudes of the estimated coefficients are relatively similar when alternative measures of inequality are used. For example, we find that the top 1%, top 0.5%, top 0.1%, and the relative mean deviation respectively increase by 0.039, 0.034, 0.031, and 0.0323 points following a one percentage point increase in the self-employment rate. A noteworthy exception is when inequality is measured using the Theil index, which shows that a percentage point increase in the self-employment rate increases inequality (Theil index) by as much as 0.12 points. While our results are not directly comparable with previous studies (as we use US state-level data, whereas previous studies are based on more micro-level data), it is worthwhile to note that our finding that more entrepreneurship is related to more inequality is consistent with, among others, Sørensen and Sharkey (2014), Carnahan et al. (2012), Campbell (2013), and Campbell et al. (2012).

Second, all the estimated coefficients on lagged inequality are positive and highly significant. In addition, the estimated coefficients display a high degree of persistence, with estimated coefficients generally larger than 0.85. The finding that inequality is persistent gives us some degree of confidence in our choice of using the system GMM estimator. As mentioned before, the system GMM estimator ameliorates some of the shortcomings of other dynamic panel data estimators by reducing the potential biases and imprecision associated with these estimators when variables exhibit persistence (Blundell and Bond 2000).
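
Because inequality enters Eq. 1 with a lag, the coefficient on entrepreneurship is a short-run effect, and persistence implies a larger cumulative effect of a permanent change. As a back-of-the-envelope illustration only, take the Gini coefficient of 0.035 on the self-employment rate together with an assumed round value of 0.85 for the coefficient on lagged inequality (the lower end of the range quoted above, not a reported estimate). Setting $I_{\text{it}}=I_{i,t-1}=I^{*}$ in Eq. 1 and holding everything else fixed gives:

$$ \frac{\partial I^{*}}{\partial E}=\frac{\gamma}{1-\beta}=\frac{0.035}{1-0.85}\approx 0.23 $$

A larger coefficient on lagged inequality would imply a larger long-run effect.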

Third, the estimated coefficients on the control variables are generally consistent with theoretical models and/or previous empirical findings. The coefficients on high school are positive in all nine regressions and statistically significant for the Gini, top 1%, top 0.5%, and top 0.1% regressions. The coefficient on college attainment is negative and significant at the 10% level of significance for the Gini index, but statistically insignificant in the remaining regressions. The finding that high school education is associated with higher inequality while college education seems to decrease inequality is consistent with, among others, Katz and Murphy (1992), Acemoglu (1999, 2002), and Lemieux (2006), who attribute the rise in wage inequality to the college-high school wage premium. Many studies (see e.g., Brewer et al. 1999; Davies and Guppy 1997; Kane and Rouse 1995; Rumberger and Thomas 1993) have shown that while the income difference between high school and college has grown, so has the income difference between college graduates with an undergraduate and a graduate degree. Some of these studies also show that the income premium for different degrees has grown, with engineering and science degrees outperforming social science degrees, and social science degrees outperforming humanities degrees. Ideally, including other measures of educational attainment beyond high school and college would provide a deeper understanding of the relationship between educational attainment and income inequality. These data, however, are not available annually at the US state-level, limiting our paper to the use of high school and college attainment rates.

The signs of the coefficients on the macroeconomic variables are consistent with expectation. The unemployment and poverty rates display positive signs and in many cases the estimated coefficients are statistically significantly different from zero. Mocan (1999) notes that “the consensus has been that income inequality is countercyclical in behavior, i.e., increases in unemployment worsen the relative position of low-income groups” (page 122). Mocan (1999), employing US data for the period 1948–1994, finds that increases in structural unemployment are significantly associated with higher income inequality. Positive impacts of poverty and unemployment on inequality have been reported by Jantti (1994) and Cysne (2009). The results in Table 3 further show that growing economies tend to experience a decrease in inequality. These results are consistent with the view that a rising tide lifts all boats, as a growing economy in particular, and improving macroeconomic conditions in general, provide opportunities for poor households to raise their incomes.

Table 3 The impact of entrepreneurship (small business share) on US income inequality

Table 2 further shows that union membership exerts a positive impact on inequality. The estimated coefficients on union membership are significant for the top 1%, top 0.5%, top 0.1%, and top 0.01% income shares. This result is rather surprising. However, consider that our sample runs from 1989 to 2013, a period during which union membership rates in the USA declined considerably. If unions are able to negotiate for higher wages for their members relative to the nonunionized work force, then the finding here of a positive impact of union membership has credence. Rubin (1988) finds a positive effect of unions on income inequality, and explains that because wage and salary earnings constitute the largest proportion of personal income, increases in income that result from increased union density accrue to workers at the extremes of the income distribution at the expense of those in the middle who are usually not union members. As expected, the coefficient on right to work is positive and significant for all inequality measures, suggesting that states with right-to-work laws generally experience more inequality than those without these laws. Nieswiadomy et al. (1991) and Ngarambe et al. (1998) also document a positive relationship between right to work and income inequality. In line with DiNardo et al. (1996), Lee (1999), and Autor et al. (2016), our results in Table 2 consistently provide strong evidence that minimum wages reduce income inequality.

We control for the impact of redistributive policies by including the income and capital gains tax rates, as well as the welfare expenditure share of total state government expenditures. We find no statistical evidence of a significant relationship between income tax rates and inequality, whereas increases in capital gains tax rates, as expected, are associated with reductions in inequality. Results also show that the coefficients on the welfare expenditure share are positive, but only significant for the Gini regression. Evidence that redistributive policies may not be very effective in reducing income inequality has been documented by Doerrenberg and Peichl (2014) and Alm et al. (2005). Demographic variables, which we control for by including the minority share of the population and the proportion of the population 65 and older, do not display any significant impacts on income inequality.

The impact of industrial composition on income inequality is accounted for by including seven state-level industry employment shares. Construction and manufacturing exhibit no statistically significant relationship with inequality. On the other hand, larger FIRE and mining sectors are positively related to inequality. In all nine regressions, the estimated coefficients on FIRE and mining are significant. Table 2 further shows that larger wholesale and retail trade sectors, as well as a larger government employment share, are generally associated with reductions in income inequality. Moore (2009), using US state-level data, documents very similar results.

4.2 Robustness analysis

The results presented in Table 2 used the self-employment rate in a state as the measure of entrepreneurship. To ensure that our results are not driven by the choice of the measure of entrepreneurship, we present results of regressions using small business share of total businesses in a state (small business share) as the measure of entrepreneurship. These results, shown in Table 3, continue to show a positive effect of entrepreneurship on all nine measures of inequality. In fact, using small business share, the estimated coefficients are generally larger and more highly statistically significant than when the self-employment rate is used as the entrepreneurship measure (as in Table 2). The coefficient on small business share suggests that a one percentage point increase in the share of small businesses increases the Gini index by 0.11 points, whereas a one percentage point increase in the self-employment rate (Table 2) is associated with only a 0.03 point increase in the Gini index. The positive impact of small business share on inequality is significant at the 5% level or better in eight of the nine regressions, whereas the coefficients on self-employment (Table 2) are significant at the 5% level only when the Gini index is used as the measure of inequality. Similar to the results in Table 2, the AR(2) tests find no significant evidence of second-order serial correlation in the first-differenced residuals, while the Hansen tests continue to show evidence of instrument validity. Furthermore, the estimated coefficients on many of the control variables are generally consistent with those displayed in Table 2.

As an alternative robustness check, we use the percent of total private nonfarm employment accounted for by small businesses (small employment share) as the measure of entrepreneurship. The results of this specification are shown in Table 4. The magnitudes and signs of the coefficients on small employment share are, in fact, quite similar to those obtained when the self-employment rate is used as the measure of entrepreneurship. For example, a one percentage point increase in small employment share is associated with a 0.033 point increase in the Gini index, while the corresponding increase in the Gini index when the self-employment rate is used is 0.035 points.

Table 4 Impact of entrepreneurship on income inequality

As a final robustness check, we estimate the system GMM models using the total number of businesses per capita to measure state-level entrepreneurship. The estimates of the impact of entrepreneurship remain positive and significant (Table 5). Estimates of the coefficients of the other control variables remain qualitatively and quantitatively similar to those in the previous tables. In addition, the standard diagnostic tests continue to support instrument validity. That is, the Hansen tests of overidentifying restrictions fail to reject the null hypothesis that the instruments are valid, while the AR(2) tests for second-order serial correlation fail to reject the null of no second-order residual autocorrelation. Taken together, the consistently positive and significant estimates for all four measures of entrepreneurship across the nine measures of income inequality give some degree of confidence that the findings of this paper are robust to the choice of the measure of entrepreneurship.
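The decision rule behind these two diagnostics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Hansen J statistic is treated as asymptotically chi-square with degrees of freedom equal to the number of instruments minus the number of estimated parameters, and the AR(2) statistic is treated as asymptotically standard normal. The numerical inputs below (`hansen_j`, `n_instruments`, `n_params`, `ar2_z`) are hypothetical placeholders, not values taken from Tables 2 to 5, and the helper function names are our own.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Power series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-16 * total and n < 100000:
        n += 1
        term *= z / (a + n)
        total += term
    lower = math.exp(a * math.log(z) - z - math.lgamma(a)) * total
    return max(0.0, 1.0 - lower)


def normal_two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def fails_to_reject(p_value, alpha=0.05):
    """True when the null hypothesis is not rejected at significance level alpha."""
    return p_value > alpha


# Hypothetical inputs for illustration only (not taken from the paper's tables).
hansen_j = 30.2        # assumed Hansen J statistic
n_instruments = 40     # assumed instrument count
n_params = 12          # assumed number of estimated coefficients
ar2_z = 0.85           # assumed Arellano-Bond AR(2) z statistic

hansen_p = chi2_sf(hansen_j, n_instruments - n_params)
ar2_p = normal_two_sided_p(ar2_z)

# Both nulls should survive: valid instruments and no second-order autocorrelation.
print("Hansen p-value:", round(hansen_p, 3), "fail to reject:", fails_to_reject(hansen_p))
print("AR(2) p-value:", round(ar2_p, 3), "fail to reject:", fails_to_reject(ar2_p))
```

Under this reading, a specification passes when both p-values exceed the chosen significance level. A Hansen p-value very close to 1 is often treated as a warning sign of too many instruments rather than as reassurance.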

Table 5 Impact of entrepreneurship on income inequality

5 Conclusion

Over the last four decades, the USA has experienced a simultaneous increase in income inequality and entrepreneurship. This paper explores whether there is, in fact, a relationship between entrepreneurship and income inequality, or whether this correlation is merely an empirical coincidence. The analysis uses annual data on all 50 US states and DC from 1989 to 2013. To ensure robustness of our results, we use four measures of entrepreneurship and nine measures of income inequality. Methodologically, the paper contributes to the literature by using the system GMM estimator, which is more robust to measurement error than cross-sectional models and also alleviates some of the potential endogeneity biases that arise from the simultaneous determination of entrepreneurship and inequality. Through the inclusion of state fixed effects, the methodology also allows us to control for unobservable time-invariant factors across states that may affect both inequality and entrepreneurship. Year fixed effects are also included to account for the possibility of common trends in the variables of interest.

Our results provide strong evidence of a positive relationship between entrepreneurship and income inequality. This finding is stable across different measures of entrepreneurship, as well as alternative measures of income inequality. Given the extensive literature on the growth-retarding impacts of income inequality (see, e.g., Aghion et al. 1999; Atems 2013a, b), our results suggest that policies aimed at encouraging entrepreneurship will not only increase inequality, but may be detrimental to growth. Yet, the finding that economic growth reduces inequality (Tables 2, 3, 4, and 5) suggests that perhaps growth-enhancing policies are a more effective way to reduce inequality than policies aimed at encouraging entrepreneurship, per se.

The econometric methodology employed by this paper is certainly not without criticism. One potential criticism is that our empirical results are based on the system GMM estimator. While the system GMM estimator is a considerable improvement on least squares based panel data estimators (fixed effects and random effects models) and other dynamic panel data methods such as the difference GMM estimator in terms of bias and consistency, the estimator is known to suffer from a “weak instruments” problem and is sensitive to the specification in terms of the number of lags (Bellemare et al. 2017; Bun and Windmeijer 2009). Bellemare et al. (2017) go so far as to suggest that “the use of lagged explanatory variables to solve endogeneity problems is an illusion” (page 949). Without access to reliable “external” instruments, we hesitate to conclude that the strong positive relationship between entrepreneurship and income inequality found in this paper is, in fact, causal. Addressing the question of causality is therefore an area for future research.