1 Introduction

Health care expenditures have been rising in most developed economies, and most markedly in the US (Wang 2009). In light of this, studies on OECD economies, including the US, have sought to understand the data generating process, the sources of this growth in health care expenditures, its impact on economic growth, whether health care is a necessity or a luxury good, and the resulting policy implications (Carrion-i-Silvestre 2005; Narayan 2006, 2009, 2010; Narayan and Narayan 2008a, b; Rettenmaier and Wang 2006; Narayan et al. 2011; Freeman 2012; Narayan and Popp 2012; Caporale et al. 2015; see also the references cited therein for detailed literature reviews).

In this paper, our concern is the convergence of real per capita health care expenditures across the US states. While quite a few studies have analyzed the convergence of health care expenditures across countries, for instance, OECD and European Union members (Ben-David 1996; Hitiris 1997; Maddala and Wu 1999; Nixon 1999; Hitiris and Nixon 2001; Hofmarcher et al. 2004; Kaitila 2004; Okunade et al. 2004; Abiad et al. 2007; Chou 2007; Dogan and Saracoglu 2007; Lima and Resende 2007; Narayan 2007; Aslan 2008; Kerem et al. 2008; Fallahi 2011; Panopoulou and Pantelidis 2011; Schmitt and Starke 2011; Montanari and Nelson 2013; Lau 2014; Pekkurnaz 2015; and references cited therein), there is hardly any evidence across regions within a country. While there are other factors (Footnote 1) that can result in convergence of health care expenditures, the primary reason behind convergence, or the lack of it, is believed to be income. As income converges (or fails to converge), so do (or do not) income-dependent health care expenditures.

It is well known that health care consumers in the US spend a larger share of their income on health care than those in any other developed economy, and also that health expenditure grew rapidly in most US states, particularly in those with low initial health expenditure (Wang 2009). Given this, it is worth asking why convergence in health care expenditures is an important question to analyze, primarily from a policy perspective. From a policy maker's viewpoint, the issues and results on convergence in total health care expenditure are important for understanding the growth pattern and spending trends of health expenditure, which in turn are important for understanding how followers catch up with the leaders in health expenditure. This knowledge will then allow the policy maker to evaluate existing policies in terms of health care coverage, affordability, access and quality, as currently pursued in various states, and decide accordingly whether there is a need for new initiatives.

Generally speaking, it is believed that convergence of health care expenditures, possibly due to income convergence, is more likely to occur across regions within a country than across countries, given that these regions are relatively more homogeneous than countries in terms of economic conditions, policies related to health, technology, the structure of the health industry, consumer preferences, and general features characterizing the health care system (Wang 2009). This line of thinking is also supported by Lau et al. (2014). The authors provide a detailed literature review of health expenditure convergence in European countries, and across countries in general, to show that the results on convergence of health care expenditures are contradictory, inconclusive and potentially misleading, and can be characterized, at best, as a 'mixed bag'. In this regard, Lau et al. (2014) suggest the need to use sophisticated econometric techniques that account for nonlinearities in the data generating process of health care expenditures, and hence go beyond the linear econometric frameworks of convergence testing that have primarily been used in the literature (as referenced above), to resolve the issue of mixed evidence.

Wang (2009) analyzes the convergence of health care expenditures across the US states and is, to the best of our knowledge, the only paper exploring the convergence issue for the US health care system. Using both a standard cross-sectional approach and time series-based cluster analysis, Wang (2009) documents the movements of real per capita health care expenditures across the 50 US states over the period 1980–2004. He provides evidence that while convergence has occurred across the US states, both in terms of total expenditures and in terms of their major components, the rate of convergence is slow. More importantly, his paper indicates that there is no single nation-wide convergence process, with states converging to a number of separate convergence clubs. Specifically, 38 states are found to form 16 convergence clubs of size 2 or 3, with the remaining 12 states being individually separated. Wang (2009), however, warns that even though the critical values of the cluster analysis are based on Monte Carlo simulations to account for the short length of the time series, his empirical results should be viewed with caution. From a policy perspective, his results highlight that health policies cannot be uniform across the entire country but need to be similar only within the convergence clubs, implying that health care policies should be implemented at the state level rather than at the national level.

Against this backdrop, we revisit the issue of convergence in terms of aggregate real per capita health care expenditures across the US states using an annual dataset spanning the period 1966–2009. We cast the problem of convergence as a test of stationarity of the ratio of real per capita health expenditures of a specific state relative to the corresponding cross-sectional average across all 50 US states. Given that our sample is relatively short for time series analysis, we implement a modified version of the Im et al. (2003) panel unit root test (the IPS test), as it is widely believed that panel-based approaches can increase the power of time series-based unit root tests like the Augmented Dickey-Fuller (ADF) test of Dickey and Fuller (1979), especially when the time series involved are not too long (Chang et al. 2015), as is the case here with 44 years of data. Note that the IPS test allows for testing for a unit root in dynamic heterogeneous panels based on the mean of the individual ADF t-statistics of each member of the panel. The IPS test is more general and less restrictive than other panel unit root tests such as the Breitung (2000) and Levin et al. (2002) tests since, unlike the latter two, which assume a common unit root process across the panel members, the IPS test allows for heterogeneous coefficients. As a result, the IPS test has high power relative to other popular panel data-based unit root tests (see Lau et al. 2014 for a detailed discussion in this regard).

Our modifications of the standard IPS test are in two directions. First, given the evidence of structural breaks in the health care expenditures in the US states, as reported by Rettenmaier and Wang (2006) and Freeman (2012), we model structural breaks of an unknown form as a smooth process by means of flexible Fourier transforms (Enders and Lee 2012). Such an approach is preferable to the standard methodology of modeling structural breaks through dummy variables (Carrion-i-Silvestre 2005; and references cited therein), which implies abrupt changes in the mean and/or trend of a series, something less likely to be observed in low-frequency data (Chang et al. 2015). Moreover, under the dummy variables approach, one has to know the exact number and location of the breaks. These are not usually known and, therefore, need to be estimated, which in turn introduces an undesirable pre-selection bias (Maddala and Kim 1998). By contrast, the flexible Fourier function-based approach does not require specifying the maximum number of breaks or imposing a 15–20 % truncation at the beginning or the end of the data sample, which could also possibly include breaks. Ignoring structural breaks while testing for unit roots is highly likely to lead to interpreting any departures arising from structural instabilities as permanent stochastic disturbances, i.e., to sway the analysis towards the unit root hypothesis (Canarella et al. 2012). Note that, alternatively, we could have used a nonlinear unit root testing approach as in Lau et al. (2014), on the presumption that income, which is considered to be the major determinant of health care expenditures, is a nonlinear process. It would be interesting to apply this test to our data in the future, but for now we incorporate smooth structural breaks rather than nonlinearity in the data generating process, given the evidence of structural breaks reported in the literature on US health care expenditures data.

A result of convergence, i.e., stationarity of our metric, for the entire panel does not indicate which of the states, if not all, are driving the result, since panel-based unit root tests are joint tests of a unit root across all members of a panel. Therefore, as the second extension, we augment the IPS test with a Fourier function by applying the Sequential Panel Selection Method (SPSM) proposed by Chortareas and Kapetanios (2009). The SPSM approach classifies the whole panel into groups of stationary and non-stationary series, and hence clearly identifies which of the series in the panel, if not all, are stationary processes driving the stationarity of the entire panel. In other words, the SPSM provides us with information on which states are convergent and which are not. This is clearly important from a policy perspective, since lack of convergence in certain states would imply that policy makers need additional political and policy debates concerning the reasons behind divergence, and then need to develop relevant policies to ensure that these states catch up with the average.

Note that we also apply the modified IPS test to an appropriate metric of real per capita disposable income across the US states, to check for the role of income in the convergence process of health care expenditures. This issue, too, is very important from the perspective of health policy: if health care expenditures do not converge and this seems to originate from the non-convergence of income (the primary driver of health care expenditures), then the government would need to undertake broader policies related to allocative efficiency in the economy in general, and not just in the health sector, to ensure convergence. To the best of our knowledge, this is the first paper to have developed and applied this modified IPS test, which accounts for both structural breaks and individual cross-section level stationarity, to analyze convergence in both health care expenditures and disposable income across all 50 US states. The remainder of the paper is organized as follows: Sect. 2 lays out the basics of the methodology, while Sect. 3 describes the dataset and reports the results. Finally, Sect. 4 concludes the paper.

2 Methodology

In this section, we briefly outline the modified IPS test, which not only accounts for structural breaks (via a Fourier function) but also incorporates the SPSM to identify which of the series in the panel, if not all, are I(0) and hence drive the stationarity of the overall panel.

To analyze whether the real per capita health care expenditures are converging, we need an appropriate metric. To this end, we define \(HE_{i,t} = HCE_{i,t} /\overline{{HCE_{t} }}\), i.e., the ratio of real per capita health care expenditures for a specific state at a specific point in time (\(HCE_{i,t}\)) to the cross-sectional average of real per capita health care expenditures across the 50 US states at that same point in time (\(\overline{{HCE_{t} }}\)). If the real per capita health care expenditures of a specific state i converge, then \(HE_{i,t}\) should be a stationary series. Provided that our metric of interest has a time trend for each of the cross-sections (i), as will be discussed below, the system of modified IPS equations with a Fourier function is:

$$\Delta HE_{i,t} = \xi_{i} + \delta_{i} HE_{i,t - 1} + \gamma_{i} t + \sum\limits_{j = 1}^{{k_{i} }} {\theta_{i,j} \Delta HE_{i,t - j} } + a_{i} \sin \left( {\frac{2\pi kt}{T}} \right) + b_{i} \cos \left( {\frac{2\pi kt}{T}} \right) + \varepsilon_{i,t}$$
(1)

where \(t = 1, 2, \ldots, T\). Note that the standard IPS test excludes the \(a_{i} \sin \left( {\frac{2\pi kt}{T}} \right) + b_{i} \cos \left( {\frac{2\pi kt}{T}} \right)\) part in Eq. (1). The rationale for selecting \([\sin (2\pi kt/T),\;\cos (2\pi kt/T)]\) is based on the fact that a Fourier expression is capable of approximating absolutely integrable functions to any desired degree of accuracy, where k represents the frequency selected for the approximation, and \([a_{i}, b_{i}]'\) measures the amplitude and displacement of the frequency component. It also follows that if there is a structural break, at least one frequency component must be present (Footnote 2). Gallant (1981), Becker et al. (2004), Enders and Lee (2012) and Pascalau (2010) demonstrate that a Fourier approximation can often capture the behavior of an unknown function even if this function itself is not periodic. As there is no a priori knowledge concerning the shape of the breaks in the data, a grid search is first performed to find the best frequency. Next, we turn to the SPSM process, which is based on the following steps:

  1. The IPS test with a Fourier function is first conducted on \(HE_{i,t}\). If we fail to reject the null hypothesis of a unit root (i.e., \(\delta_{i} = 0\)), then the procedure stops and we conclude that all series in the panel are non-stationary. If the null is rejected (i.e., \(\delta_{i} < 0\)), then we continue to Step 2;

  2. The series with the minimum IPS statistic is removed, since it is identified as being stationary;

  3. We return to Step 1 for the remaining series, or stop the procedure if all the series have been removed from the panel.

The final step is the separation of the whole panel into a set of mean-reverting series and a set of non-stationary series, if any.
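The procedure described in this section can be sketched in code. The following is a minimal illustration only, not the authors' implementation. It makes several assumptions that are not stated in the text: the frequency k is chosen from a hypothetical grid 1 to 5 by minimizing the sum of squared residuals, a single lagged difference is used, the panel statistic is the plain average of the individual t-statistics, and the critical value is a hypothetical placeholder (in practice, critical values for a Fourier-augmented test are nonstandard and would typically be obtained by simulation). All function names are illustrative.

```python
import numpy as np

def fourier_adf(y, k, lags=1):
    """OLS fit of Eq. (1) for one series and one frequency k.

    Returns (t-statistic on delta_i, sum of squared residuals).
    """
    y = np.asarray(y, dtype=float)
    T = len(y)
    dy = np.diff(y)                        # dy[s] = y[s+1] - y[s]
    s = np.arange(lags, T - 1)             # positions in dy with enough lags
    time = s + 2.0                         # 1-based date of observation y[s+1]
    cols = [np.ones(len(s)), y[s], time,   # constant, lagged level, trend
            np.sin(2 * np.pi * k * time / T),
            np.cos(2 * np.pi * k * time / T)]
    cols += [dy[s - j] for j in range(1, lags + 1)]  # lagged differences
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, dy[s], rcond=None)
    resid = dy[s] - X @ beta
    ssr = float(resid @ resid)
    cov = ssr / (len(s) - X.shape[1]) * np.linalg.inv(X.T @ X)
    return float(beta[1] / np.sqrt(cov[1, 1])), ssr

def best_frequency_tstat(y, k_grid=(1, 2, 3, 4, 5), lags=1):
    """Grid search over k. ASSUMPTION: keep the k with the smallest SSR."""
    fits = {k: fourier_adf(y, k, lags) for k in k_grid}
    k_star = min(fits, key=lambda k: fits[k][1])
    return k_star, fits[k_star][0]

def spsm(t_stats, critical_value):
    """Sequential Panel Selection Method on precomputed individual t-stats.

    t_stats: dict mapping series name -> individual t-statistic.
    critical_value: callable n -> critical value for a panel of n series
                    (hypothetical here; to be obtained by simulation).
    Returns (series flagged stationary in removal order, remaining series).
    """
    remaining = dict(t_stats)
    stationary = []
    while remaining:
        t_bar = float(np.mean(list(remaining.values())))  # simplified panel stat
        if t_bar >= critical_value(len(remaining)):
            break                      # Step 1: cannot reject, stop
        name = min(remaining, key=remaining.get)
        stationary.append(name)        # Step 2: drop the most negative series
        del remaining[name]            # Step 3: repeat on what is left
    return stationary, sorted(remaining)

# Simulated series (hypothetical data): smooth mean shift plus white noise.
rng = np.random.default_rng(0)
T = 44
dates = np.arange(1, T + 1)
y = 1.0 + 0.1 * np.sin(2 * np.pi * dates / T) + rng.normal(scale=0.05, size=T)
k_star, t_stat = best_frequency_tstat(y)

# Toy SPSM run with made-up t-statistics and a made-up critical value.
toy = {"S1": -4.0, "S2": -3.5, "S3": -1.0, "S4": -0.5}
stationary, nonstationary = spsm(toy, critical_value=lambda n: -1.5)
# stationary == ["S1", "S2"], nonstationary == ["S3", "S4"]
```

Because the convergence metric is defined relative to the average over all 50 states, the individual statistics do not change as series are removed in this sketch; only the panel average is recomputed at each round.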

3 Data and Empirical Results

We make use of annual data on health care expenditure (HCE) from 1966 to 2009 for the 50 US states. Data were obtained from the Centers for Medicare and Medicaid Services Health Expenditures by State of Residence. This database reports total personal health care spending by state and by service. Personal health care expenditures by State of Residence are based on State of Provider estimates adjusted for the flow of residents between states in order to consume health care services. These estimates present health spending on behalf of residents in the 50 States and in the District of Columbia. Included are estimates of aggregate and per capita health spending by type of good or service (hospital care, physician and clinical services, retail prescription drugs, etc.) and source of funding for those services (private health insurance, Medicare, Medicaid, out-of-pocket spending, etc.). Per enrollee spending for Medicare and Medicaid is presented by type of good or service, and per enrollee private health insurance spending is presented in aggregate. To ensure we have a worthwhile time series, we work with aggregate health care expenditures expressed in per capita terms, dividing by population figures obtained from the regional database of the Bureau of Economic Analysis (BEA). Given that state-level CPI is not available for the entire period under study, the nominal per capita health care expenditures are converted to real values by deflating with the aggregate US CPI. Note that we transform the data into natural logarithms before building the ratios, as discussed in Sect. 2. Figure 1 in the Appendix plots \(HE_{i,t}\) for each of the 50 states, with the states grouped into the nine census divisions for the sake of clearly observing the trend in our variable of interest, which is even more pronounced if all states are plotted independently.
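The data construction described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code: the array layout (years in rows, states in columns), the function names, and the convention that the CPI is an index with base 100 are all assumptions, and the toy numbers are made up.

```python
import numpy as np

def log_real_per_capita(nominal, population, cpi):
    """Nominal state totals -> log of real per capita values.

    nominal, population: arrays of shape (T, N), rows = years, columns = states.
    cpi: array of shape (T,), aggregate US CPI (ASSUMED base = 100).
    """
    nominal = np.asarray(nominal, dtype=float)
    population = np.asarray(population, dtype=float)
    cpi = np.asarray(cpi, dtype=float)
    per_capita = nominal / population
    real = per_capita / (cpi[:, None] / 100.0)   # same deflator for all states
    return np.log(real)

def relative_to_cross_section_mean(x):
    """Ratio of each state's value to the cross-sectional average in that year."""
    x = np.asarray(x, dtype=float)
    return x / x.mean(axis=1, keepdims=True)

# Hypothetical toy panel: 2 years, 2 states.
nominal = np.array([[1000.0, 4000.0],
                    [2000.0, 6000.0]])
population = np.array([[10.0, 10.0],
                       [10.0, 10.0]])
cpi = np.array([100.0, 200.0])

log_real = log_real_per_capita(nominal, population, cpi)
he = relative_to_cross_section_mean(log_real)   # convergence metric per state/year
```

By construction, the metric averages to one across states in every year, so a state above (below) one is above (below) the cross-sectional average at that date.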

Fig. 1 Convergence metric of real per capita health care expenditures

We start the analysis with the standard IPS test with a trend in the specification. However, the null hypothesis of a unit root cannot be rejected even at the 10 % level of significance, with the test statistic taking a value of −1.7998. This result is in line with the general lack of convergence into a single club, as reported in Wang (2009) on the basis of time series-based tests. As discussed earlier, given the evidence of structural breaks in health care expenditures, we next carry out the analysis with the modified IPS test, which now includes the Fourier function. The new test statistic turns out to be −2.9617, implying that the null of a unit root is rejected at the 1 % level of significance (Footnote 3). However, since the overall statistic does not provide any information as to which of the states are driving this result of stationarity for the entire panel, we turn to the SPSM process. The results are reported in Table 1. At the 5 % level of significance, convergence is observed across all states except New Jersey and Arkansas. However, if we base our inferences on the 10 % level of significance, we find overwhelming evidence of convergence in health care expenditures across all 50 states (Footnote 4).

Table 1 IPS unit root tests (constant and trend) with Fourier function and SPSM for \(HE_{i,t}\)

An important related question is what causes this convergence. As indicated by Freeman (2012), real per capita personal disposable income is considered to be one of the main drivers of health care expenditures. Therefore, we explore whether real per capita personal disposable income also converges, using the same specification as in Eq. (1). As with health care expenditures, our metric is the ratio of real per capita personal disposable income for a specific state at a specific point in time to the cross-sectional average of real per capita personal disposable income across the 50 US states at that same point in time. If this ratio (\(PDI_{i,t}\)) is stationary for a specific state i, then this state has convergent dynamics in terms of real per capita personal disposable income. Data on nominal personal disposable income are obtained from the regional database of the BEA, and are then converted to per capita and real values by dividing by the population and the CPI, respectively.

Figure 2 in the Appendix plots \(PDI_{i,t}\) for the 50 US states, again grouped by census division. The trend in the data is clearly visible. The IPS test with a trend and Fourier function generates a test statistic of −3.485, implying rejection of the null of a unit root at the 1 % level of significance. Next, we implement the SPSM; the results are reported in Table 2. There is strong evidence in favor of convergence of real per capita personal disposable income: at the 1 % level of significance for 49 states (all except Tennessee), and at the 5 % level of significance for all states (Footnote 5).

Fig. 2 Convergence metric of real per capita personal disposable income

Table 2 IPS unit root tests (constant and trend) with Fourier function and SPSM for \(PDI_{i,t}\)

Overall, the empirical analysis provides overwhelming evidence in favor of (broken trend-stationary) convergence in real per capita health care expenditures across the 50 US states, which is shown to be the result of (broken trend-stationary) convergence in real per capita personal disposable income, believed to be the main driver of health care expenditures (Footnote 6).

At this stage, it is important to understand what these results of health care expenditure and income convergence mean economically, drawing on the theoretical models of convergence. Given N = 50 states with different initial levels of health expenditure, convergence suggests that expenditure increased faster in lower-level states than in higher-level states over 1966–2009, so that cross-state differentials in health care expenditures and in income narrow. This narrowing process, in turn, implies that all 50 states over 1966–2009 seem to be converging to a long-run steady state, which happens to be the cross-sectional average of health care expenditures and income during this period. This concept is often called β-convergence. Alternatively, one could have analyzed what is called σ-convergence, whereby we look at whether the cross-state variance in expenditure and income declined over 1966–2009. But here, through the test of stationarity of the ratio of health care expenditure and income of a specific state relative to the cross-sectional average, we are basically checking whether the two series, i.e., the series for a specific state and the time-varying cross-sectional average, are moving towards one another, independent of their current positions. Our convergence results confirm that this is indeed the case across the 50 states over 1966–2009.
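The dispersion-based notion mentioned above (usually called σ-convergence) is not tested in this paper, but it is simple to illustrate. The sketch below, with a made-up toy panel and an illustrative function name, computes the cross-state standard deviation year by year; a declining path would be consistent with that notion of convergence.

```python
import numpy as np

def cross_state_dispersion(panel):
    """Cross-sectional standard deviation for each year.

    panel: array of shape (T, N), rows = years, columns = states.
    """
    panel = np.asarray(panel, dtype=float)
    return panel.std(axis=1, ddof=1)

# Hypothetical toy panel: 3 years, 2 states drifting towards each other.
toy_panel = np.array([[1.0, 3.0],
                      [1.5, 2.5],
                      [1.9, 2.1]])
dispersion = cross_state_dispersion(toy_panel)
shrinking = bool(np.all(np.diff(dispersion) < 0))  # True for this toy panel
```

This differs from the stationarity-based test used here, which asks whether each state's series moves towards the time-varying cross-sectional average rather than whether overall dispersion declines.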

4 Conclusion

Current time-series evidence on convergence of real per capita health care expenditures across the 50 US states into a single club is nil. Against this backdrop, using a modified version of the panel-based IPS unit root test that accommodates smooth structural changes using a Fourier function, we provided strong evidence in favor of convergence. In addition, implementing the SPSM methodology, we observed that the evidence of convergence in the entire panel is in fact driven by convergence in each of the 50 US states and not just a few cross-sectional units. Using the same methodology, we also determined that this convergence was possibly due to the presence of convergence in real per capita personal disposable income, which has been indicated in the health literature to be the main driver of health care expenditures.

The results highlight the importance of modeling structural breaks in unit root tests, since if structural breaks are not accounted for, the study reverts to earlier results in the literature showing no evidence of convergence. From a policy perspective, these results imply that common policies relating to the health care system can be pursued across all US states, since the health market is not disaggregated once we allow for smooth structural changes in the data generating process. In this regard, our results highlight the fact that policy makers should be careful in relying on simple tests of convergence that do not account for structural changes, since these can lead to misleading results of non-convergence and hence to inaccurate policy measures and unnecessary intervention. In addition, the evidence of income convergence upon accounting for structural breaks implies that there is no need to implement economy-wide policies related to income distribution. However, these results require re-evaluation on a continuous basis as new data become available, to ensure that the convergence hypothesis continues to hold in the health care sector and in the economy in general. Finally, given that our results show that income convergence seems to be driving health care convergence, general policies on allocative efficiency are likely to ensure convergence in the health care system, and health-sector specific policies might in turn not be needed. But again, this policy conclusion should be regularly re-evaluated as new data on health and income come in.