1 Introduction

Over the last century, the evolution of spatial inequalities and the tendency of per capita incomes to converge have been among the most intensively investigated topics in theoretical and empirical studies (Magrini 2004; Cörvers and Mayhew 2021).

In theoretical terms, there are two main views about the evolution of regional disparities. The optimistic stream, represented by the Neo-Classical class of growth theories, states that, due to “diminishing marginal productivity”, production factors are relatively more efficient in places with lower capital accumulation (Ramsey 1928; Solow 1956; Cass 1965; Koopmans 1963; Baumol 1986; Barro and Sala-i Martin 1991, 1992). The implied “convergence hypothesis” describes a process in which relatively less developed regions grow faster than well-developed ones. Thus, all regions are predicted to approach a unique steady state through a monotonic saddle path, at which income disparities are eliminated (Solow 1956; Baumol 1986; Barro and Sala-i Martin 1991, 1992; Rey and Montouri 1999; Harris 2011; Duran 2014; Magrini et al. 2015).

In contrast, pessimistic theories predict the polarization of territorial development and the expansion of inequalities. As one of the main references, Cumulative Causation Theory states that initially advantaged regions are likely to develop in a cumulative manner as they gain competitive advantages through the development process (Myrdal 1944; 1957; Perroux 1955; Hirschman 1958; Dixon and Thirlwall 1975; Armstrong and Taylor 2000; O’Sullivan 2003). Consistent with this view, prosperous metropolitan areas are likely to benefit from economic agglomeration (concentration), i.e. scale economies, due to the related positive externalities and increasing returns to scale (Marshall 1890; Perroux 1955; Hirschman 1958; Krugman 1991; 1992; Porter 1998; Armstrong and Taylor 2000; O’Sullivan 2003).

Empirically, a large number of studies have tested the convergence hypothesis in cross-country and cross-regional settings (Magrini 2004; Cörvers and Mayhew 2021). The findings of the papers focusing on the U.S. can be summarized in two groups. The first group comprises studies that report evidence of convergence, mostly during the first half of the 20th century. Examples include Barro and Sala-i Martin (1991), Rey and Montouri (1999) and Webber et al. (2005). This process may be related to the “polarization reversal” (mentioned by Fan and Casetti 1994), during which labor and capital tend to flow into peripheral locations due to the advantages of low labor and land costs and thereby reinforce the homogenization of prosperity (Norton and Rees 1979; Bourne 1980; Bluestone and Harrison 1982; Hall 1987; Scott 1988; Storper and Walker 1989; Fan and Casetti 1994; Kim 1998; Duran 2014).

In contrast, recent studies point to a non-convergent pattern (e.g. Bernat 2001; Yamamoto 2008; Heckelman 2013; Ram 2021). The rise in regional inequalities is often observed after the mid-1970s. This period is termed the polarization and “spatial restructuring” period, in which high-tech industries and services emerge as critical sectors that are spatially agglomerative and prefer to locate in developed metropolitan areas due to their advantages in infrastructure, knowledge, etc. (Norton 1987; Lampe 1988; Barff and Knight 1988; Coffey and Bailly 1991; Daniels et al. 1991; Harrington et al. 1991; Fan and Casetti 1994; Kim 1998; Duran 2014).

To the best of our knowledge, the existing empirical literature has mostly adopted a retrospective view, analyzing the past evolution of spatial inequalities. We depart from the mainstream by adopting a forward-looking perspective. There are some exceptions, such as Wear and Prestemon (2019), who project the future tendency of regional convergence across US counties; however, their methodology, spatial units, time period and results differ from ours. Similarly, Van Vuuren et al. (2007) provide future projections of income convergence across countries but do not focus on regions.

Will regional inequalities shrink over time? How will the shape of the income distribution evolve? Will spatial dependency increase in the future? Answering these questions is crucial for policy. If regional inequalities are expected to rise in the future, careful long-term preparation and planning of the related policies will become necessary. Moreover, identifying the regions that will remain backward in the future will shed light on decisions about the allocation of resources.

Methodologically, we forecast the long-term trajectory of per capita real personal income for each U.S. state using the ARIMA (Autoregressive Integrated Moving Average) model. We then estimate the future trend of disparities, the shape and geography of the income distribution and the degree of spatial dependence with the help of inequality indices (Theil, Atkinson, Coefficient of Variation), kernel density estimates, exploratory maps and Moran’s I test. The paper continues with the data and methods (Section 2), results (Section 3) and conclusions (Section 4).

2 Data and methods

The dataset covers the 48 coterminous U.S. states and the national economy over the period 1929–2022. As the main variable, we use annual per capita real personal income (y), downloaded from the U.S. Bureau of Economic Analysis (BEA n.d.). The data are deflated using the consumer price index provided on the webpage of the Federal Reserve Bank of Minneapolis (U.S. Bureau of Labor Statistics, BLS n.d.).Footnote 1

Initially, we apply an ARIMA model to the income series of the aggregate economy and the 48 states. The general ARIMA model can be expressed as follows (Box and Jenkins 1970)Footnote 2:

$${\Delta }^{d}{{\text{ln}}y}_{t}=c+{\sum\limits }_{i=1}^{p}{\alpha }_{i}{\Delta }^{d}{{\text{ln}}y}_{t-i}+{\sum\limits }_{k=1}^{q}{\beta }_{k}{\Delta }^{d}{u}_{t-k}+{u}_{t}$$
(1)

\({\Delta }^{d}\) denotes the d-th order differencing operator, where \(d\) is determined by the unit root analysis, \(p\) is the number of autoregressive time lags and \(q\) is the order of the moving average process of the residuals (Box and Jenkins 1970). We implement the estimation in R 4.3.2 using the “forecast” package and its auto.arima procedure (Peiris and Perera 1988; Wang et al. 2006; Hyndman and Khandakar 2008; Hyndman and Athanasopoulos 2018; Hyndman et al. 2023, 2024).Footnote 3

Using this ARIMA procedure and the estimated parameters, we obtain forecasted (future) values of income for each state. In other words, the estimated models are extended step by step into the future, and 80% and 95% confidence intervals are estimated and reported for the forecasts.
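The forecasting recursion can be illustrated with a minimal Python sketch (the paper itself relies on R's auto.arima, which is not reproduced here). The sketch assumes a simplified ARIMA(p,1,0) model on ln(y) with already known coefficients. The function name `forecast_log_income`, the dictionary keys and all numerical inputs are hypothetical. Intervals are computed from the normal approximation, with forecast variance accumulated from the psi-weights of the integrated process.

```python
from statistics import NormalDist


def forecast_log_income(log_y, c, alphas, sigma, horizon):
    """Point forecasts and 80%/95% intervals for ln(y) under ARIMA(p,1,0).

    Assumptions (illustrative only): d = 1, no moving average terms, and
    known parameters. log_y needs at least len(alphas) + 1 observations.
    """
    p = len(alphas)
    diffs = [b - a for a, b in zip(log_y, log_y[1:])]  # first differences
    level = log_y[-1]
    z80 = NormalDist().inv_cdf(0.90)    # two-sided 80% interval
    z95 = NormalDist().inv_cdf(0.975)   # two-sided 95% interval

    # psi-weights of the AR(p) process for the differenced series:
    # psi_0 = 1, psi_j = sum_i alpha_i * psi_{j-i}
    psi = [1.0]
    for j in range(1, horizon):
        psi.append(sum(alphas[i] * psi[j - 1 - i] for i in range(min(p, j))))

    forecasts = []
    cum_psi = 0.0    # cumulative psi-weight (effect of a shock on the level)
    variance = 0.0   # forecast error variance of the level
    for h in range(horizon):
        recent = diffs[-p:][::-1] if p else []  # most recent difference first
        next_diff = c + sum(a * d for a, d in zip(alphas, recent))
        diffs.append(next_diff)
        level += next_diff                      # integrate back to the level
        cum_psi += psi[h]
        variance += (sigma * cum_psi) ** 2
        sd = variance ** 0.5
        forecasts.append({
            "mean": level,
            "lo80": level - z80 * sd, "hi80": level + z80 * sd,
            "lo95": level - z95 * sd, "hi95": level + z95 * sd,
        })
    return forecasts
```

Because the model is written in logs, exponentiating the bounds gives intervals for the income level. With an empty `alphas` list the sketch reduces to a random walk with drift, whose interval width grows with the square root of the horizon.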

It is worth noting that the forecast of future personal income is conditional upon the assumption that the past economic and political mechanisms of income generation will remain unchanged in the future.

In order to illustrate the future shape of the income distribution, we estimate kernel density distributions of per capita relative real personal incomes, \({ry}_{i}={\text{ln}}({y}_{i})-{\text{ln}}(\overline{y})\), where \(\overline{y}\) is the cross-state average, for past and future years (1929, 2022, 2050 and 2090) (Silverman 1986; Marron and Nolan 1988; Härdle 1991; Fan and Marron 1994; Simonoff 1996).Footnote 4 In addition, we report skewness and kurtosis indicators as well as Jarque–Bera (JB) normality test results for these years (Pearson 1894; 1895; 1905; Bera and Jarque 1981; Jarque and Bera, 1980; 1987; Jambu 1991; Westfall 2014). Moreover, we present choropleth maps of \({ry}_{i}\) for the past and future years.Footnote 5
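The distributional measures described above can be sketched as follows in Python. This is an illustration rather than the authors' implementation. The Gaussian kernel and the normal-reference bandwidth \(1.06\,s\,n^{-1/5}\) are assumptions, since the text does not state which kernel or bandwidth was used, and the function names are hypothetical.

```python
import math


def relative_log_income(y):
    """ry_i = ln(y_i) - ln(y_bar), with y_bar the cross-state average."""
    y_bar = sum(y) / len(y)
    return [math.log(v) - math.log(y_bar) for v in y]


def gaussian_kde(sample, grid, bandwidth=None):
    """Gaussian kernel density estimate evaluated at each point of grid."""
    n = len(sample)
    if bandwidth is None:
        # Assumed default: normal-reference rule 1.06 * sd * n^(-1/5)
        mean = sum(sample) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in sample) / (n - 1))
        bandwidth = 1.06 * sd * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - v) / bandwidth) ** 2) for v in sample)
        for g in grid
    ]


def jarque_bera(sample):
    """Return (skewness, kurtosis, JB statistic) from central moments."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((v - mean) ** 2 for v in sample) / n
    m3 = sum((v - mean) ** 3 for v in sample) / n
    m4 = sum((v - mean) ** 4 for v in sample) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2          # equals 3 for a normal distribution
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    return skew, kurt, jb
```

Under the null hypothesis of normality the JB statistic is asymptotically chi-squared with two degrees of freedom, so larger values indicate a stronger departure from normality.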

We apply Global Moran’s I tests to \({ry}_{i}\) values in order to test the spatial dependency of the income distribution and its future evolution. We use the “spdep” R package to implement the test (Bivand et al. 2013, 2024; Bivand and Wong 2018; Bivand 2022; Moran 1948; 1950; Anselin 1988). The test is applied for the years 1929, 2022, 2050 and 2090. A row-standardized inverse distance spatial weight matrix is used, where the distance matrix between states is obtained with the help of the R “stats” package (Anselin 1988; Herrera-Gomez et al. 2012; R Core Team, 2023a).Footnote 6 Footnote 7
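As an illustration of the statistic itself, the following Python sketch computes Global Moran's I with a row-standardized inverse distance weight matrix. It assumes Euclidean distances between planar coordinates (for example, state centroids), which the text does not specify, and it omits the significance test that spdep provides. The function names are hypothetical.

```python
import math


def row_standardized_inverse_distance(coords):
    """Weights w_ij = (1 / d_ij) / sum_k (1 / d_ik), with w_ii = 0.

    Assumes distinct points, so that no off-diagonal distance is zero.
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = 1.0 / math.dist(coords[i], coords[j])
        row_sum = sum(w[i])
        w[i] = [v / row_sum for v in w[i]]  # each row sums to one
    return w


def morans_i(values, w):
    """Global Moran's I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)  # equals n after row standardization
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(v * v for v in z)
    return (n / s0) * num / den
```

Values above the expectation of \(-1/(n-1)\) point to positive spatial association (similar incomes cluster together), while values below it point to negative association.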

Finally, we calculate several income inequality indices over the period 1929–2090 using the observed and projected income data. Four inequality indices are calculated: i. Coefficient of Variation, \(\sigma ({y}_{i})/\overline{y}\), where \(\sigma\) is the standard deviation (Williamson 1965); ii. Theil Index, \(\sum_{i=1}^{48}\left(\frac{{y}_{i}}{\overline{y}}\right){\text{ln}}\left(\frac{{y}_{i}}{\overline{y}}\right)\) (Theil 1967; Gluschenko 2018); iii. Atkinson Index, calculated for state-level y values using the “DescTools” R package (Atkinson 1970; Atkinson and Bourguignon 2000; Dayioğlu and Başlevent 2006; Signorell 2023; Signorell et al. 2024); and iv. Maximum/minimum ratio, \(\max({y}_{i})/\min({y}_{i})\).Footnote 8
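The four indices can be sketched in Python as below. This is a hedged illustration, not the authors' code. The Theil index follows the formula as written in the text (a sum without division by the number of states). The population standard deviation in the CV and the inequality-aversion parameter `epsilon=0.5` in the Atkinson index are assumptions, since the text does not report them. All function names are hypothetical.

```python
import math


def coefficient_of_variation(y):
    """sigma(y) / mean(y); the population sd (divide by n) is an assumption."""
    n = len(y)
    mean = sum(y) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in y) / n)
    return sd / mean


def theil_index(y):
    """Sum of (y_i / y_bar) * ln(y_i / y_bar), as written in the text.

    Note: the textbook Theil T index additionally divides this sum by n.
    """
    mean = sum(y) / len(y)
    return sum((v / mean) * math.log(v / mean) for v in y)


def atkinson_index(y, epsilon=0.5):
    """Atkinson index; epsilon (inequality aversion) is an assumed value."""
    n = len(y)
    mean = sum(y) / n
    if epsilon == 1:
        # Limit case: one minus the ratio of geometric to arithmetic mean
        geo_mean = math.exp(sum(math.log(v) for v in y) / n)
        return 1 - geo_mean / mean
    avg = sum((v / mean) ** (1 - epsilon) for v in y) / n
    return 1 - avg ** (1 / (1 - epsilon))


def max_min_ratio(y):
    """Ratio of the richest to the poorest state."""
    return max(y) / min(y)
```

For two hypothetical states with incomes 1 and 3, these functions give a CV of 0.5 and a max/min ratio of 3. All four measures take their minimum (0, 0, 0 and 1 respectively) when every state has the same income.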

3 Empirical results

To start with the results, we first present the ARIMA model’s estimated parameters for the U.S. economy in Table 1. The automatic ARIMA routine selects two autoregressive lags (p = 2), two moving average terms (q = 2) and first-order differencing (d = 1). As for the state-level estimation, Table 2 documents the optimal parameters for the 48 states, which take quite different p, q and d values.

Table 1 U.S. ARIMA parameter estimates, ARIMA (2,1,2)
Table 2 ARIMA parameters for states

Having applied the ARIMA model, we show three examples of the forecasted income trajectories in Fig. 1: i. the US economy, ii. New York (the state with the highest per capita income in 1929) and iii. South Carolina (the state with the lowest per capita income in 1929).

Fig. 1

ARIMA forecast of per capita real personal income for the US, New York and South Carolina. Note: Dark gray shading shows the 80% confidence interval; light gray shading shows the 95% confidence interval

Based on the past and forecasted values, we plot the kernel density distribution of relative per capita income in Fig. 2 and provide the main distributional properties as well as JB test statistics in Table 3. Over the years, we observe significant changes in the shape of the income distribution. From 1929 to 2022, we observe a homogenization/convergence process, since the probability density tends to have a higher peak at the median income. Consistent with this evolution, the kurtosis value increases from 1929 to 2022, whereas the JB statistic decreases. However, from 2022 to 2050 and 2090, we observe an income divergence pattern: the income distribution is forecasted to become bi-modal and more heterogeneous over time. We expect a clearly polarized distribution by 2090, when the kurtosis reaches its lowest value and the non-normality indicator (JB) becomes higher.

Fig. 2

Kernel density estimates of relative incomes, \({ry}_{i}={\text{ln}}({y}_{i})-{\text{ln}}(\overline{y})\)

Table 3 Distributional properties of relative incomes

As one of the most important results, Fig. 3 illustrates the forecasted trajectory of the regional inequality indices. Several findings stand out. First, regarding the past period, all inequality indices decline clearly from 1929 to the mid-1970s, which points to income convergence. After the mid-1970s, however, a mild increase in disparities is observed until 2022. With regard to the future period, all of the inequality indices are expected to increase from 2022 to 2090. Specifically, the CV is expected to increase from 0.4 in 2022 to 0.43 in 2050 and 0.65 in 2090; the Theil Index is forecasted to rise from 0.16 in 2022 to 0.19 in 2050 and 0.43 in 2090; and the Atkinson Index is expected to increase from 0.51 in 2022 to 0.60 in 2050 and 1.39 in 2090. As a last remark, the ratio between the richest and poorest state (max/min ratio) is expected to increase from 1.83 (in 2022) to 2.58 (in 2090). Overall, by 2090, regional inequalities are projected to rise considerably and reach values approximately equivalent to those of the 1940s.

Fig. 3

Forecasted regional inequality indexes

These results can be compared with those of Wear and Prestemon (2019). In their study of US counties, trajectories estimated from 1975–2010 data and projected towards 2070 indicate a tendency towards further income convergence, whereas our study points to the opposite evolution.

Of particular interest, the geographical distribution of income is forecasted to change substantially in the future (Fig. 4). While in 2022 the prosperous states are concentrated in the Northeast and on the West Coast, by 2090 the pole of prosperity is likely to shift to the Southeast and Mid-north. The pattern of low-income places is projected to change as well: while in 2022 the backward states are mostly concentrated in the Southeast, by 2090 they are expected to be concentrated in the hinterland parts of the Northeast and Southwest.

Fig. 4

Geographical distribution of relative incomes (ry)

Why such a spatial pattern might occur is a critical but complex question. It may reflect different underlying dynamics of state-level economies, such as changes in industrial structure, the degree of capital accumulation, the direction of capital and human capital flows, the location choices of productive sectors, etc. However, this issue is beyond the scope of our study.

As a last finding, the results of the Moran’s I test in Table 4 reveal that spatial dependency is expected to disappear in the long run. Moran’s I statistic is positive (0.19) and strongly significant in 1929; it decreases to 0.14 but remains significant in 2022. By 2050 and 2090, however, it is forecasted to be insignificant. The disappearance of spatial association may be related to a global tendency towards greater mobility of production factors, such as knowledge and human capital as the main inputs of high-tech sectors, to increasing digitalization in social and work environments, to the further construction of worldwide interlinked supply chains, and to trade and financial openness.

Table 4 Moran’s I test result

4 Conclusions

This paper has investigated the future prospects of spatial income inequalities in the U.S. Several important results emerge from the empirical analyses. First, income disparities are expected to increase over the long term, implying a divergence pattern. Second, the forecasted shape of the income distribution is bi-modal and polarized, pointing to a widening of inequalities. Third, the geography of prosperity is projected to change, with the locations of high- and low-income areas shifting. Fourth, spatial dependence in per capita income is expected to fade away in the future.

These results are relevant from a regional policy standpoint. It may be argued that additional resources should be devoted to maintaining territorial cohesion. Special long-term policies should be formulated, particularly for the states that are expected to become backward, such as those in the hinterland parts of the Northeast and Southwest. Tax exemptions, financial aid and structural reforms regarding the industrial mix and the labor and capital markets are among the possible policies.

Finally, it is worth spending a few words on the limitations of the study and on future research topics. As the main limitation, the forecasts do not take into account economic shocks and business cycle fluctuations, which are unpredictable but may significantly influence the long-term trajectories. As for future research, it may be relevant to focus on the socio-economic and geographical determinants of the future spatial income distribution.

In sum, this topic remains largely unexplored in the literature and deserves further attention from researchers.