1 Introduction

There is a substantial body of evidence that health care expenditure within nations of the European Union (EU) has risen progressively over recent decades. Yet uneven patterns of health care expenditure persist, engendering disparities such as the relatively high health care expenditure per capita of Norway ($5,352), which is almost double that of Greece ($2,724). According to health statistics produced by the OECD, the average annual growth rate of health expenditure per capita for the period 2000–2009 also differed markedly, with growth rates of 6.9 % for Greece but only 0.7 % for Luxembourg. Superficially, such data could be taken to support a policy discourse that a process of convergence is occurring. Given the dangers associated with the misinterpretation of data and overly simplistic policy readings, we seek to interrogate whether there is indeed robust evidence of a ‘catch-up effect’, whereby convergence is taking place.

The concept of convergence is defined as ‘the result of a process in which the structures of different industrial societies come increasingly to resemble each other’ (Jary and Jary 1991, p. 121). The matter of ‘convergence’ has gained research traction over recent times, in part due to broader theories of economic globalisation, global governance and supranational institutions, such as the EU. The theoretical basis of economic convergence is derived from the neoclassical growth model, which maintains that in the long run all countries will converge to a common (conditional) equilibrium level of income per head (the steady state thesis), provided that trade is free, technologies are common across states, and countries share similar preferences. One of the key implications of this perspective is that the growth rates of income per head across countries are inversely related to initial conditions. Following this, the Solow (1956) model predicts that countries will converge to their steady states in a conditional sense.

Health care is considered a normal good in economics, so it should exhibit dynamics similar to those of income. There is therefore reason to suggest that the dynamic path of health care expenditure per capita should follow that of GDP per capita. This paper provides an empirical test of the health expenditure convergence hypothesis. Existing studies share one common feature: the use of linear unit root tests. Given that the EU has gone through a rather fundamental integration process, structural change can plausibly cause non-linearity in health care expenditure. We fill this gap in the literature on health care convergence by using a nonlinear panel unit root test (Cerrato et al. 2011). We find weak evidence for the convergence hypothesis among 14 EU members.

The rest of the paper is organized as follows. Section 2 reviews the literature applying linear unit root tests to health care expenditure. Section 3 presents the methodology, while Sect. 4 discusses the findings. Section 5 provides a summary and some concluding remarks.

2 Literature review

2.1 A mixed-bag of evidence: an international literature survey of studies applying the linear unit root test to health care expenditure convergence

In an EU context, there is some evidence of economic convergence, although this is derived from pre-economic crisis data. Kaitila (2004), for example, finds that the accession countries caught up with the EU’s GDP per capita during two periods: from 1960 to the 1973 oil crisis (due to increased trading activity and a high investment rate from neighbouring countries), and from the mid-1980s to the mid-1990s. Substantiating Kaitila’s analysis, Abiad et al. (2007) find accelerating income convergence in the EU, suggesting that enhanced financial integration has opened up new markets, wherein capital is flowing from the rich to the poor countries. Dogan and Saracoglu (2007) also investigate income convergence in the EU, using five panel unit root tests. Relative to the EU average, they find some evidence that the candidate countries catch up with the existing members. Indeed, examining the relation between income convergence and international trade, Ben-David (1996) finds that the mere prospect of joining the EU had a positive influence on the economic development of accession candidates. Given that disposable income is the most important determinant of personal health expenditure, does a convergence in income over recent decades necessarily imply a similar convergence in health expenditure? Answering such a question is not as clear-cut as one might initially anticipate: findings from the extant literature provide a messy picture, which supports the need for additional analysis.

Some prior studies show evidence that health care expenditures converge among EU member states (Nixon 2000). Hitiris and Nixon (2000) examine \( \sigma \) and \( \beta \) convergence in the per capita health care spending of EU members (EU-15) prior to the latest expansion. Using life expectancy and the infant mortality rate as explanatory variables, the study identifies 1960–1995 and 1980–1995 as periods of \( \sigma \) and \( \beta \) convergence, respectively. Hofmarcher et al. (2004) recognize that there was a wide gap in average health expenditure levels prior to the Monetary Union of the EU. Evidence tends not to support any notable level of convergence in per capita health care expenditure until the mid-1990s. This coincides with the expansion of EU membership in 1995; in May 2004, ten of the then EU accession countries became full members of the EU. Narayan (2007) examines whether or not the per capita health expenditures of OECD countries converge to those of the USA over the period 1960–2000. Using linear univariate and panel tests, some evidence of convergence is found. Schmitt and Starke (2011) examine ‘conditional’ convergence of various types of social expenditure, of which health expenditure is one, in 21 OECD countries, utilising an error correction model, with results strongly indicating conditional convergence.

Using a linear panel unit root test, however, Maddala and Wu (1999) conclude that the time series of health expenditure on average contains a unit root for the OECD countries. Wang (2009) examines the extent of convergence in health care expenditure (and its nine components) among the 52 states in the US. The extent and speed of convergence are found to be moderate, and the performances of individual components are diverse. He attributes the bulk of cross-state convergence in total expenditure to hospital care, while prescription drug spending is the most important diverging factor. Aslan (2008) investigates OECD per capita health care spending using the Lima and Resende (2007) persistence method. Regional health inequality is evaluated in terms of the panel data unit root tests advanced by Im et al. (2003). The evidence illustrates that one cannot reject the null hypothesis of a unit root for the log of the ratio of each country's health care expenditure to that of a reference unit, except when the reference is the average of per capita health expenditures. Intra-provincial evidence based on a linear LM panel unit root test also suggests substantial disparities in health services in China (Chou 2007). Kerem et al. (2008) use the average health expenditure-GDP ratio and health expenditure per capita of EU-12 countries to test for β-, σ- and γ-convergence. They find that economic integration does not lead to an automatic homogenization of health care expenditure and health policy. Based on social citizenship development theory, Montanari and Nelson (2013) examine convergence in various health care dimensions of different EU countries, of which health expenditure is one. For countries less severely affected by the global recession and budgetary restrictions, they find no convergence in any health dimension except the increasing share of private financing. Finally, synthesizing the existing body of literature associated with 17 OECD countries, Panopoulou and Pantelidis (2011) do not find evidence of convergence. The full panel can diverge but groups of countries can converge to different equilibria. There is one common feature of these studies: the application of linear unit root tests.

2.2 Key factors influencing health care expenditure

In view of contradictory evidence pertaining to the notion of health care expenditure convergence in different parts of the world and the EU in particular, it is necessary to consider some of the key factors affecting the spatial variation of health care expenditure. The findings of this section provide the present study with a firm conceptual grounding. Numerous studies find evidence that income (GDP as a proxy) is the most significant factor explaining variations in health care expenditure. From this it can be inferred that the paths of health expenditure should mimic economic growth paths.

One of the most important studies is that of Newhouse (1977), which examines the per capita health expenditure of 13 OECD countries. Using cross-section regression, he concluded that 92 % of the variation in health expenditure can be explained by GDP variation. His finding is consistent with the notion that health care is a luxury good. The 2012 Ageing Report by the Directorate-General for Economic and Financial Affairs of the European Commission also finds a strong GDP per capita effect on the growth of per capita public health expenditure among the EU member countries. Following this line of enquiry, numerous economists have explored the integration and cointegration properties of disposable income and health expenditure. However, up until the mid-1990s, most of these studies were confined to country-by-country analyses. McCoskey and Selden (1998) was one of the earliest studies to use a panel unit root test. Jewell et al. (2003) re-examined the perceived income-health dynamic using data for 20 OECD countries. Recognizing that the univariate unit root test lacks power in the presence of near unit root behaviour and breaks, they utilized the panel LM unit root test to account for heterogeneous structural breaks.

Whilst income appears to be a key factor influencing health, it is by no means the only factor. While confirming the importance of GDP, Hitiris and Posnett (1992) attempted to examine the non-income linkage. Their choice of variables included demographic structure, epidemiological needs and health financing, although the study found that these factors are not significant. Di Matteo and Di Matteo (1998) examine the balance between public and private health expenditure in the Canadian system. The key factors are deemed to be per capita income, government transfer variables, the share of individual income held by the top quintile of the income distribution and long-term economic forces. Applying a recursive panel estimation procedure, Herwartz and Theilen (2003) find evidence of cross-country homogeneity during the period 1961–1979. However, they also find that country-specific factors dominated between 1980 and 2000, indicating divergence in health care systems. Bilgel and Tran (2013) utilized panel data on GDP, the relative price of health care, the share of publicly funded health expenditure, the share of the senior population and life expectancy at birth to investigate the determinants of Canadian provincial health expenditures over a 28-year period. Estimation results from Generalized Instrumental Variables (GIV) and Generalized Method of Moments (GMM) suggested that the long-run income elasticity of health care expenditure was significantly lower than one, contrary to the general perception that health care is a luxury good.

Increasing international cooperation is another reason for potential convergence in health care spending. Developments over recent decades encouraged more joint actions among the EU member states to promote health protection, subsidize medical and health care policy research, establish international information systems and promote equality of health care expenditure. The 1991 Maastricht Treaty and the 1997 Treaty of Amsterdam empowered the European Parliament to strengthen European cooperation and provided new direction for community action toward tackling health issues. The Treaty of Lisbon, which entered into force on 1 December 2009, enshrines the notion of ‘cohesion’ in terms of greater political, social, territorial and economic integration. This mechanism has subsequently generated forces for convergence at the level of public health care among EU member states. The Commission of the European Communities (1994, p. 40) argues, for example, that increasing integration among EU citizens and professionals may lead member states to seek ‘long-term solutions in similar directions’, with substantial funds available to facilitate structural adjustments, epitomised by the resources associated with the European Regional Development Fund. Such politico-policy shifts have prompted some, such as Abel-Smith et al. (1995) and McKee et al. (1996), to speculate that EU health policy reforms and European law on health care provision may lead to greater convergence in health care expenditure across the EU.

3 Data and research methodology

This section describes the various unit root tests employed. We start with the traditional Dickey-Fuller test, followed by the Im et al. (2003) linear panel unit root statistic. Finally, the Cerrato et al. (2011) nonlinear panel unit root test is introduced. The empirical analysis is based on 14 EU countries, namely Austria, Belgium, Denmark, Finland, France, Germany, Greece, Ireland, Italy, Luxembourg, Netherlands, Spain, Sweden, and the U.K. This sample of countries is dictated by data availability. All time series data are annual and have been PPP-adjusted into US dollars for the period 1970–2008, to avoid outlier effects associated with the global economic recession. All data are obtained from the OECD health database 2010.

3.1 Conventional unit root test

We first employ annual health care expenditure per capita data for 14 EU countries, to construct health care expenditure series relative to the average of other EU members, such that the series of interest for country \( i \) at time \( t \) is:

$$ y_{i,t} = \ln \left(\frac{{g_{i,t} }}{{\bar{g}_{t} }}\right) $$

where \( g_{i,t} \) is the health care expenditure per capita of country \( i \), and \( \bar{g}_{t} \) is the average health care expenditure per capita in the EU excluding the country being considered. To check for the existence of a unit root, one can perform the univariate Augmented Dickey-Fuller (ADF) test. However, it is well documented in the literature that when a series is stationary but close to a unit root, the Dickey-Fuller test has relatively low power. One plausible solution is to increase the sample size, which can be a challenge for macroeconomic series. A further limitation is that EU health care expenditure data are only available on an annual basis. A panel unit root test, however, can extract more information by combining the temporal and spatial dimensions, implicitly increasing the sample size and making it a more powerful procedure.
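The construction of the relative series can be illustrated with a short sketch. It assumes a `(T, N)` array of per capita expenditures and uses an unweighted leave-one-out mean for \( \bar{g}_{t} \); the function name and the numbers are hypothetical and are not taken from the OECD data.

```python
import numpy as np

def relative_log_expenditure(g):
    """Return y[t, i] = ln(g[t, i] / mean of g[t, j] over j != i).

    g is a (T, N) array: rows are years, columns are countries.
    """
    g = np.asarray(g, dtype=float)
    n = g.shape[1]
    # Leave-one-out mean: (row total minus own value) / (N - 1).
    others_mean = (g.sum(axis=1, keepdims=True) - g) / (n - 1)
    return np.log(g / others_mean)

# Hypothetical figures for three countries over two years (illustration only).
g = np.array([[100.0, 200.0, 300.0],
              [110.0, 220.0, 330.0]])
y = relative_log_expenditure(g)
```

A value of zero means a country spends exactly the average of the others; convergence corresponds to \( y_{i,t} \) being stationary around a constant rather than wandering like a random walk.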

In the past decade or so, there has been an expanding literature on the presence of a unit root in panel data. Baltagi and Kao (2000) and Hurlin and Mignon (2004) survey recent developments. Breitung (2000), for instance, assumes that the panel data are generated by a deterministic trend and an unobservable autoregressive process. He proposes a linear transformation of the data, and constructs a statistic for testing for a unit root process. Bai and Ng (2004) use the factor structure of panels to examine the nature of stationarity. A time series with a factor structure can be nonstationary due to either the common factors or the idiosyncratic error; they devise a test that can be applied to these two components separately. The Bai and Ng (2004) approach helps in understanding nonstationarity both on a series-by-series basis and from the viewpoint of a panel. Harris et al. (2004) are concerned about the low power of the Dickey-Fuller and Bai and Ng (2004) tests. They construct a new stationarity test that captures arbitrary unknown cross-sectional disparity and allows a flexible choice of stationary dynamics (including ARMA) and contemporaneous effects. The statistic is the sum of lag-k studentized autocovariances across panels, which renders the specification of temporal dynamics unnecessary.

Chang (2004) develops a bootstrap methodology for testing nonstationarity in a cross-sectionally dependent panel. In this setup, each panel unit is characterized by a general linear process which is approximated by a finite autoregressive integrated process whose order increases with time. The bootstrap method is then applied to derive the critical values, limit distribution and asymptotic properties of the unit root process. Choi and Chue (2007) propose a subsampling test that includes panel unit root and cointegration tests as special cases. The series of interest can be cross-sectionally correlated and cointegrated. The panel data model is linear, semiparametric and a mixture process. One of the advantages of this subsampling procedure is that it can be applied to certain types of discontinuous distribution.

Based on the mean of the individual ADF t-statistics of each member in the panel, Im et al. (2003) propose the LM-bar statistic (IPS test) for testing for a unit root in dynamic heterogeneous panels. In particular, they develop three LM-bar statistics for cases where (a) the errors are \( i.i.d. \); (b) the errors are serially correlated and heterogeneous across groups; (c) the panels contain the same common trend. Assume that relative health care spending follows an autoregressive process with an individual-specific factor:

$$ y_{i,t} = (1 - \varphi_{i} )\mu_{i} + \varphi_{i} y_{i,t - 1} + \varepsilon_{i,t} ,\quad i = 1, \ldots ,N;\quad t = 1, \ldots ,T, $$
(1)

Rewriting it in first difference form, the null hypothesis is

$$ H_{0} :\beta_{i} = 0,\quad i = 1, \ldots ,N. $$

against the alternative:

$$ H_{1} :\beta_{i} < 0,\quad i = 1, \ldots ,N_{1} ;\quad \beta_{i} = 0,\quad i = N_{1} + 1, \ldots ,N. $$

where \( N_{1} \) is the number of stationary series and \( \beta_{i} = - (1 - \varphi_{i} ) \).
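For reference, the first-difference form behind these hypotheses follows from subtracting \( y_{i,t - 1} \) from both sides of Eq. (1). The intercept label \( \alpha_{i} \) is introduced here only for illustration:

$$ \Delta y_{i,t} = (1 - \varphi_{i} )\mu_{i} - (1 - \varphi_{i} )y_{i,t - 1} + \varepsilon_{i,t} = \alpha_{i} + \beta_{i} y_{i,t - 1} + \varepsilon_{i,t} ,\quad \alpha_{i} = (1 - \varphi_{i} )\mu_{i} ,\quad \beta_{i} = - (1 - \varphi_{i} ) $$

A unit root, \( \varphi_{i} = 1 \), is therefore equivalent to \( \beta_{i} = 0 \), while a stationary series with \( |\varphi_{i} | < 1 \) has \( \beta_{i} < 0 \).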

The alternative is more general that \( \beta_{i} \) can differ across groups; and it allows some individual series to have unit roots. Im et al. (2003) proceed to derive the LM-bar statistic for the series with serially correlated error:

$$ y_{i,t} = \mu_{i} \varphi_{i} (1) + \sum\limits_{j = 1}^{{p_{i} + 1}} {} \varphi_{ij} y_{i,t - j} + \varepsilon_{i,t} $$
(2)

The first difference form is:

$$ \Delta y_{i,t} = \alpha_{i} + \beta_{i} y_{i,t - 1} + \sum\limits_{j = 1}^{{p_{i} }} \rho_{ij} \Delta y_{i,t - j} + \varepsilon_{i,t} $$

where \( p_{i} \) is the number of lags, \( \varphi_{i} (1) = 1 - \sum\nolimits_{j = 1}^{{p_{i} + 1}} {\varphi_{ij} } \), \( \alpha_{i} = \mu_{i} \varphi_{i} (1) \), \( \beta_{i} = - \varphi_{i} (1) \), \( \rho_{ij} = - \sum\nolimits_{h = j + 1}^{{p_{i} + 1}} {\varphi_{ih} } \).

Notice that Im et al. (2003) do not impose restrictions on the mean equation; each panel unit is heterogeneous in the sense that it is allowed its own individual-specific factor. In matrix form,

$$ \Delta {\mathbf{y}}_{i} = \beta_{i} {\mathbf{y}}_{i, - 1} + {\mathbf{Q}}_{i} \gamma_{i} + \varepsilon_{i} $$
(3)

where \( {\mathbf{Q}}_{i} = (\tau_{T} ,\Delta {\mathbf{y}}_{i, - 1} ,\Delta {\mathbf{y}}_{i, - 2} , \ldots ,\Delta {\mathbf{y}}_{i, - p_{i} } ) \) and \( \gamma_{i} = (\alpha_{i} ,\rho_{i1} ,\rho_{i2} , \ldots ,\rho_{{ip_{i} }} )^{\prime } \).
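The individual ADF regression and the averaging of the resulting t-ratios across panel members can be sketched as follows. This is a minimal illustration under stated assumptions: a common lag order `p` for every series, plain OLS via NumPy, and a simulated stationary panel in place of the health expenditure data. The function name is hypothetical, and the resulting statistics must be compared with the appropriate tabulated unit root critical values rather than with standard normal or Student-t tables.

```python
import numpy as np

def adf_t_stat(y, p=1):
    """t-ratio on beta in: dy_t = alpha + beta*y_{t-1} + sum_j rho_j*dy_{t-j} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                         # dy[k] = y[k+1] - y[k]
    rows = len(dy) - p                      # usable observations
    cols = [np.ones(rows), y[p:-1]]         # constant and lagged level
    cols += [dy[p - j:len(dy) - j] for j in range(1, p + 1)]  # lagged differences
    X = np.column_stack(cols)
    target = dy[p:]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    sigma2 = resid @ resid / (rows - X.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)          # OLS covariance matrix
    return coef[1] / np.sqrt(cov[1, 1])

# Simulated stationary AR(1) panel (illustration only, not the OECD series).
rng = np.random.default_rng(0)
T, N = 200, 4
panel = np.zeros((T, N))
for t in range(1, T):
    panel[t] = 0.2 * panel[t - 1] + rng.standard_normal(N)

t_stats = np.array([adf_t_stat(panel[:, i], p=1) for i in range(N)])
t_bar = t_stats.mean()   # cross-section average of the individual t-ratios
```

Because each series gets its own regression, the intercept, slope and lag coefficients are free to differ across countries, which is the heterogeneity the text emphasises.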

Nonetheless, there is no reason to restrict attention to a linear mean process. Cerrato et al. (2011) augment the Im et al. (2003) heterogeneous panel unit root methodology with the Kapetanios et al. (2003) approach, yielding a nonlinear panel unit root test.

3.2 Non-linear panel unit root test

This paper fills a notable research gap by better accounting for the existence of nonlinearity in the growth dynamics of health care expenditure. The health expenditure growth path of the EU member states may follow nonlinear dynamics. The equalization of the prices of goods and factors of production follows non-linear dynamics, as shown by many researchers (e.g. Michael et al. 1997). These models suggest that exchange rate adjustment follows a non-linear path due to the existence of ‘bands of inaction’ in the exchange rate adjustment process. Within the bands, arbitrage of tradable goods is not profitable because the transaction cost (i.e. the sum of transportation costs, the cost of trade barriers, and distribution costs) is greater than the price difference. The ‘bands of inaction’ may arise from market frictions such as trade protectionism or transaction costs. Similarly, health care expenditure per capita may converge due to structural change, namely the policy shift following the integration of Europe.

In addition, Lau (2010) finds evidence of provincial income divergence using Cerrato’s NCADF test for the period 1952–2005. His findings on Chinese provincial growth dynamics call for further study of conditional convergence, since heterogeneous factors may hinder beta convergence across provinces. These factors include the inflation rate, infrastructure, human capital, the degree of openness, and the use of foreign capital among provinces. There is reason to believe that health expenditure for EU members will follow a nonlinear path because, as mentioned above, health expenditure follows the dynamics of the income path, which is nonlinear in nature.

Therefore, we proceed to use the Exponential Smooth Transition Autoregressive (ESTAR) model to specify the dynamics of relative health care expenditure across countries. Cerrato et al. (2011) develop a non-linear panel ADF test under cross-sectional dependence, which is based on the following ESTAR specification:

$$ y_{i,t} = \xi_{i} y_{i,t - 1} + \xi_{i}^{ * } y_{i,t - 1} Z(\theta_{i} ;y_{i,t - d} ) + u_{i,t} ,\quad i = 1, \ldots ,N,\quad t = 1, \ldots ,T $$
(4)
$$ Z(\theta_{i} ;y_{i,t - d} ) = 1 - e^{{[ - \theta_{i} (y_{i,t - d} - c)^{2} ]}} $$
(5)

where \( \theta_{i} \) is a positive coefficient and \( c \) is the equilibrium value of relative expenditure difference between country \( i \) and the EU average, due to regional heterogeneous factors. The initial value, \( \bar{y}_{i,0} \), is given, and the error term, \( u_{i,t} \), has the one-factor structure:

$$ u_{i,t} = \gamma_{i} f_{t} + \varepsilon_{i,t} $$
(6)
$$ \{ \varepsilon_{i,t} \}_{t} \sim i.i.d(0,\sigma_{i}^{2} ) $$

in which \( f_{t} \) is the unobserved common factor, and \( \varepsilon_{i,t} \) is the individual-specific (idiosyncratic) error. For simplification purpose, the delay parameter \( d \) is set to be equal to one so that Eq. (4) may be rewritten in first difference form:

$$ \Delta y_{i,t} = \varphi_{i} y_{i,t - 1} + \xi_{i}^{ * } y_{i,t - 1} [1 - e^{{ - \theta_{i} (y_{i,t - 1} - c)^{2} }} ] + \gamma_{i} f_{t} + \varepsilon_{i,t} $$
(7)

where \( \varphi_{i} = - (1 - \xi_{i} ) \). Assuming that \( \varphi_{i} = 0 \) and normalizing \( c \) to zero,

$$ \Delta y_{i,t} = \xi_{i}^{ * } y_{i,t - 1} [1 - e^{{( - \theta_{i} y_{i,t - 1}^{2} )}} ] + \gamma_{i} f_{t} + \varepsilon_{i,t} $$
(8)
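As a rough illustration of the dynamics implied by Eq. (8), the sketch below simulates a single ESTAR series. It assumes the common factor is switched off (\( \gamma_{i} = 0 \)) and uses Gaussian shocks; the function name and all parameter values are hypothetical choices made only to show the mechanism.

```python
import numpy as np

def simulate_estar(T, xi_star, theta, sigma=1.0, y0=0.0, seed=0):
    """Simulate dy_t = xi_star * y_{t-1} * (1 - exp(-theta * y_{t-1}**2)) + e_t."""
    rng = np.random.default_rng(seed)
    y = np.empty(T + 1)
    y[0] = y0
    for t in range(1, T + 1):
        # Transition weight in [0, 1): near 0 close to equilibrium,
        # approaching 1 for large deviations.
        z = 1.0 - np.exp(-theta * y[t - 1] ** 2)
        y[t] = y[t - 1] + xi_star * y[t - 1] * z + sigma * rng.standard_normal()
    return y

# Noise-free path from a large initial gap: the pull toward zero is strong
# at first and fades as the gap closes.
path = simulate_estar(T=20, xi_star=-0.5, theta=1.0, sigma=0.0, y0=2.0)
```

With \( \theta_{i} = 0 \) the transition weight is identically zero and the series reduces to a random walk. With \( \theta_{i} > 0 \) and a negative \( \xi_{i}^{ * } \), small deviations behave almost like a unit root process while large deviations are corrected, which is the ‘band of inaction’ behaviour described in the text.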

The null hypothesis of a unit root is \( H_{0} :\theta_{i} = 0\,\forall i \), against the alternative \( H_{1} :\theta_{i} > 0,\quad i = 1, \ldots ,N_{1} ;\quad \theta_{i} = 0,\quad i = N_{1} + 1, \ldots ,N \), where \( N_{1} \) is the number of stationary series. Notice that the alternative allows some series to have unit roots.

Because \( \xi_{i}^{ * } \) is not identified under the null, it is not feasible to test the null hypothesis directly. Thus, Cerrato et al. (2011) reparameterize Eq. (8) by first-order Taylor series approximation and obtain the auxiliary regression

$$ \Delta y_{i,t} = a_{i} + \delta_{i} y_{i,t - 1}^{3} + \gamma_{i} f_{t} + \varepsilon_{i,t} $$
(9)
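The step from Eq. (8) to Eq. (9) can be spelled out as follows. This is a sketch of the first-order expansion of the transition function around \( \theta_{i} = 0 \); it ignores the remainder term and the intercept \( a_{i} \) that Eq. (9) adds:

$$ 1 - e^{{ - \theta_{i} y_{i,t - 1}^{2} }} \approx \theta_{i} y_{i,t - 1}^{2} \quad \Rightarrow \quad \xi_{i}^{ * } y_{i,t - 1} [1 - e^{{ - \theta_{i} y_{i,t - 1}^{2} }} ] \approx \theta_{i} \xi_{i}^{ * } y_{i,t - 1}^{3} $$

The coefficient on \( y_{i,t - 1}^{3} \) in Eq. (9) therefore corresponds to \( \theta_{i} \xi_{i}^{ * } \). It is zero when \( \theta_{i} = 0 \) and negative when \( \theta_{i} > 0 \) with \( \xi_{i}^{ * } < 0 \), so a unit root can be tested through the t-ratio on the cubic term without estimating \( \xi_{i}^{ * } \) separately.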

Cerrato et al. (2011) further prove that the common factor can be approximated by a linear function of mean lagged values of \( y_{i,t} \).

$$ f_{t} \approx \frac{1}{{\bar{\gamma }_{\omega } }}\Delta \bar{y}_{\omega ,t} + \frac{b}{{\bar{\gamma }_{\omega } }}\bar{y}_{\omega ,t - 1}^{3} $$

where \( \Delta \bar{y}_{\omega ,t} = \sum\limits_{i = 1}^{N} {} \omega_{i} \Delta y_{i,t} \), \( \bar{y}_{\omega ,t - 1}^{3} = \sum\limits_{i = 1}^{N} {} \omega_{i} y_{i,t - 1}^{3} \), \( \bar{\gamma }_{\omega } = \sum\limits_{i = 1}^{N} {} \omega_{i} \gamma_{i} \).

Therefore, it follows that Eq. (9) can be written as the following non-linear cross-sectionally augmented DF (NCADF) regression:

$$ \Delta y_{i,t} = a_{i} + \delta_{i} y_{i,t - 1}^{3} + c_{i} \Delta \bar{y}_{t} + d_{i} \bar{y}_{t - 1}^{3} + e_{i,t} . $$
(10)

Given the framework above, the authors develop a unit root test in the heterogeneous panel model based on Eq. (10). Extending the idea of Kapetanios et al. (2003), they derive t-statistics on \( \hat{\delta }_{i} \), which are denoted by:

$$ t_{i,NL} (N,T) = \frac{{\hat{\delta }_{i} }}{{s.e.(\hat{\delta }_{i} )}} $$

where \( \hat{\delta }_{i} \) is the OLS estimate of \( \delta_{i} \), and \( s.e.(\hat{\delta }_{i} ) \) is its associated standard error. Following Pesaran (2007), the t-statistic in Eq. (10) can be used to construct a panel unit root test by averaging the individual test statistics:

$$ \bar{t}_{NL} (N,T) = \frac{1}{N}\sum\limits_{i = 1}^{N} t_{i,NL} (N,T) $$

This is a non-linear cross-sectionally augmented version of the IPS test (NCIPS). Cerrato et al. (2011) tabulate critical values of both the individual NCADF and the panel NCIPS tests for varying cross-section and time dimensions. Differences in health expenditure among EU states may arise from either a time-lead effect or policy consensus, and these will also form the so-called “bands of inaction” in the health expenditure adjustment process among EU members.
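The NCADF regression in Eq. (10) and the averaging step can be sketched as below. The sketch assumes equal weights \( \omega_{i} = 1/N \) for the cross-section means, no additional lagged differences, and a simulated panel of random walks with a common factor. The function name is hypothetical, and the resulting statistics are meaningful only when compared with the critical values tabulated for the NCADF and NCIPS tests.

```python
import numpy as np

def ncadf_t_stats(y):
    """Per-series t-ratios on the cubic term of the NCADF regression.

    y is a (T, N) panel. For each series i, regress dy_{i,t} on a constant,
    y_{i,t-1}**3, the cross-section mean of dy_t, and the cross-section
    mean of y_{t-1}**3 (equal weights assumed).
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y, axis=0)          # (T-1, N) first differences
    y3_lag = y[:-1] ** 3             # (T-1, N) cubed lagged levels
    dy_bar = dy.mean(axis=1)         # proxy term for the common factor
    y3_bar = y3_lag.mean(axis=1)     # second proxy term
    t_stats = []
    for i in range(y.shape[1]):
        X = np.column_stack([np.ones(len(dy)), y3_lag[:, i], dy_bar, y3_bar])
        coef, *_ = np.linalg.lstsq(X, dy[:, i], rcond=None)
        resid = dy[:, i] - X @ coef
        sigma2 = resid @ resid / (len(dy) - X.shape[1])
        cov = sigma2 * np.linalg.inv(X.T @ X)
        t_stats.append(coef[1] / np.sqrt(cov[1, 1]))
    return np.array(t_stats)

# Simulated random walks sharing a common shock (illustration only).
rng = np.random.default_rng(1)
T, N = 60, 5
shocks = rng.standard_normal((T, N)) + rng.standard_normal((T, 1))
panel = shocks.cumsum(axis=0)

t_nl = ncadf_t_stats(panel)   # individual NCADF statistics
ncips = t_nl.mean()           # panel statistic: simple average
```

The two cross-section averages stand in for the unobserved common factor, so no factor model needs to be estimated explicitly.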

4 Results and discussions

4.1 Conventional linear unit root test

Figure 1a, b display an upward trend in health care expenditure per capita over the study period 1970–2008. While health care spending in most of the countries has been growing linearly, growth in Austria, Luxembourg, the Netherlands and Ireland seems to be exponential. Spain and Greece are the member countries with the lowest per capita spending; Luxembourg and Austria have the highest. There are limited fluctuations in terms of rankings, the key exception being Ireland, which had been the second lowest prior to 1987 but by 2005 had a higher health care expenditure per capita than Spain, Italy, the UK and Sweden. Strikingly, the application of the conventional linear unit root test reveals no significant convergence in the level of health care expenditure per capita.

Fig. 1 Health care expenditure per capita (US$ PPP adjusted, constant 2005 international dollar)

Figure 2a, b show the health expenditures of the 14 EU member countries relative to the EU average. By 2008, Austria and the UK have the highest and lowest per capita health care expenditure, respectively. Finland, on the other hand, exhibits relatively large temporal variation. There is apparent convergence among different EU countries toward the mean. The range is reduced significantly for Spain, the UK, Sweden and Portugal. The convergence appears to accelerate after the mid-1990s, coinciding with the EU’s economic integration agenda discussed earlier. Notice that there is a structural break in 1969–1970 for these four countries, followed by strongly positive co-movement.

Fig. 2 Log health expenditure differential from EU average

However, the conventional linear unit root tests show only weak evidence for such a convergence hypothesis, possibly because of low power arising from their failure to take nonlinearity into account. Table 1 shows the results of the univariate unit root tests; only three countries converge to the EU mean in the long run at the 5 % significance level, namely the Netherlands, Spain and the United Kingdom, with Austria doing so to a lesser extent. Table 2 reports the results of the Im et al. (2003), Choi (2001) and Maddala and Wu (1999) linear panel unit root tests. The null hypothesis of a unit root is uniformly rejected, suggesting that at least one country converges to the EU mean; however, these tests do not indicate which countries are converging. More importantly, they fail to capture nonlinearity, so they have low power to distinguish between structural change and non-convergence. For instance, when a stationary autoregressive process is simulated with a time dummy variable indicating structural change, the Dickey-Fuller and Phillips-Perron tests fail to reject the null hypothesis. Both statistics are biased toward nonrejection.

Table 1 Univariate unit root test: null hypothesis: unit root (individual unit root process)
Table 2 Linear panel unit root test

We contended in Sect. 2 that national income follows nonlinear dynamics. Since there is strong evidence that health care spending is strongly correlated with income, it is plausible that health care spending also follows nonlinear dynamics. For example, Shelley and Wallace (2011) could not reject the null hypothesis of a unit root using data since the Great Depression. They argue that prior studies failed to correct for non-normality and heteroscedasticity in a nonlinear unit root test. Beyaert and Camacho (2008) combined a threshold model, panel data unit root tests and bootstrap standard errors. Using 1950–2004 as the sample period, they fail to detect real GDP convergence in the enlarging EU. Chong et al. (2008) applied the nonlinear unit root test of Kapetanios et al. (2003) to test for nonlinear convergence among 12 OECD countries. Only two cases converge in the long run.

To provide a heuristic proof of nonlinearity, we proceed to carry out the well-known BDS (Brock et al. 1987) test on the log relative expenditure of ten EU countries. The test is performed for series with at least 30 consecutive observations; these countries are Austria, Belgium, Denmark, Finland, Ireland, Netherlands, Portugal, Spain, Sweden and the United Kingdom. The correlation integral is the core of the BDS test; it measures the frequency with which temporal patterns are repeated in the data. The sample correlation integral at embedding dimension \( n \) is:

$$ C_{n,\varepsilon } = \frac{2}{{T_{n} (T_{n} - 1)}}\sum\limits_{n \le s < t \le T} I(x_{t}^{n} ,x_{s}^{n} ,\varepsilon ) $$
(11)

where \( x_{t}^{n} = (x_{t} ,x_{t - 1} , \ldots ,x_{t - n + 1} ) \), \( T_{n} = T - n + 1 \) and \( I(x_{t}^{n} ,x_{s}^{n} ,\varepsilon ) \) is an indicator function which is equal to one if the distance between the two \( n \)-histories (measured in the sup norm) is less than \( \varepsilon \) and zero otherwise. The BDS statistic is defined as follows:

$$ V_{n,\varepsilon } = \sqrt T \frac{{C_{n,\varepsilon } - C_{1,\varepsilon }^{n} }}{{s_{n,\varepsilon } }} $$
(12)

The denominator is the standard deviation of \( \sqrt T (C_{n,\varepsilon } - C_{1,\varepsilon }^{n} ) \). Under fairly mild regularity conditions, the statistic converges in distribution to the standard normal. Table 3 reports the nonlinearity test results for these countries. The values of \( \varepsilon \) and \( n \) are set to 0.7 and 6, respectively. Clearly, all series exhibit nonlinear dynamics, casting doubt on the traditional linear panel data method.
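The correlation integral in Eq. (11) and the numerator of Eq. (12) can be computed with a short sketch. It uses the sup norm between \( n \)-histories and a toy series; the function name is hypothetical, and the standard deviation \( s_{n,\varepsilon } \) is deliberately left out, so this is not the full BDS statistic.

```python
import numpy as np

def correlation_integral(x, n, eps):
    """Share of pairs of n-histories whose sup-norm distance is below eps."""
    x = np.asarray(x, dtype=float)
    T_n = len(x) - n + 1
    # Row k holds the n consecutive values x[k], ..., x[k + n - 1].
    hist = np.column_stack([x[k:k + T_n] for k in range(n)])
    # Pairwise sup-norm distances between all histories.
    dist = np.abs(hist[:, None, :] - hist[None, :, :]).max(axis=2)
    upper = np.triu_indices(T_n, k=1)      # each unordered pair counted once
    return (dist[upper] < eps).mean()

# Toy alternating series (illustration only).
x = np.array([0.0, 1.0, 0.0, 1.0, 0.0])
c1 = correlation_integral(x, n=1, eps=0.5)
c2 = correlation_integral(x, n=2, eps=0.5)
numerator = c2 - c1 ** 2   # zero in expectation for an i.i.d. series
```

For an i.i.d. series the probability that two \( n \)-histories are close factorises, so \( C_{n,\varepsilon } \) is approximately \( C_{1,\varepsilon }^{n} \); a large standardised gap between the two signals dependence that a linear model has not captured.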

Table 3 A nonlinearity test by the BDS test statistic

4.2 Non-linear panel unit root test results

Table 4 shows the results of the nonlinear panel unit root test. It indicates that three countries (Greece, Sweden and the UK) converge to the EU mean even after taking nonlinearity into account. The average t-statistic (−1.59) also refutes the conclusion of the Im et al. (2003) test.

Table 4 Non-linear panel unit root test (EU average as benchmark)

As a robustness check, we conduct the nonlinear unit root test again while varying the benchmark country. Tables 5, 6 and 7 report the results using the UK (lowest per capita health care expenditure), Spain and Austria (highest per capita health care expenditure) as the benchmark countries.Footnote 12 Table 5 shows that Greece and the Netherlands individually converge to the EU mean. The quality of health services could be one diverging factor in the convergence process: as Wu (2014) indicates, the quality of services provided is correlated with improvements in consumer satisfaction, which results in higher health care expenditures. When we use Spain as the reference country, only Austria shows the convergence property. If Austria is used as the reference country, France and the Netherlands converge to the EU mean. However, all the average t-statistics still convincingly reject the Im et al. (2003) result.
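Switching the benchmark only changes the reference series against which each country's expenditure is measured. A minimal sketch of that step is given below, assuming that log relative expenditure is the log of the ratio of a country's per capita expenditure to the benchmark's in the same year. The function name `log_relative` and the toy figures are hypothetical and are not taken from the data used in this paper.

```python
import math


def log_relative(expenditure, benchmark=None):
    """Log of each country's series relative to a reference series.

    expenditure: dict mapping country name -> list of per capita values by year.
    benchmark:   a country name, or None to use the cross-country average.
    """
    n_years = len(next(iter(expenditure.values())))
    if benchmark is None:
        # Cross-country average in each year as the reference
        reference = [
            sum(series[t] for series in expenditure.values()) / len(expenditure)
            for t in range(n_years)
        ]
    else:
        reference = expenditure[benchmark]
    return {
        country: [math.log(series[t] / reference[t]) for t in range(n_years)]
        for country, series in expenditure.items()
        if country != benchmark  # the benchmark relative to itself is always zero
    }


# Made-up numbers for three hypothetical countries over two years
toy = {"A": [100.0, 110.0], "B": [200.0, 220.0], "C": [300.0, 330.0]}
relative_to_mean = log_relative(toy)
relative_to_a = log_relative(toy, benchmark="A")
```

Each resulting series would then be passed to the unit root test; the benchmark country itself drops out because its relative series is identically zero.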

Table 5 Non-linear panel unit root test (UK as benchmark)
Table 6 Non-linear panel unit root test (Spain as benchmark)
Table 7 Non-linear panel unit root test (Austria as benchmark)

5 Concluding discussion

Based on longitudinal data for 14 EU countries for the period 1970–2008, this paper has tested the hypothesis that health care expenditure per capita has converged. The evidence indicates that one cannot reject the null hypothesis of a unit root for the health care expenditures of most EU member states relative to the EU average, even after taking nonlinearity into account. The use of a nonlinear unit root test is motivated by both a theoretical justification (real income following a nonlinear path) and a formal statistical test (the BDS test). Although some studies (e.g., Nixon 2000; Hitiris and Nixon 2001) have claimed to demonstrate a convergence of health care expenditure among EU members, this study concludes that existing measures to encourage convergence are limited. This generates some notable policy implications and raises issues for those researching this topic. Firstly, it warrants renewed political and policy debates concerning the extent to which a convergence in health care expenditure (and ultimately provision) has materialised. We may even expect more diverging cases, as the demand for health products and services varies across Europe owing to aging populations in some countries (Walder and Döring 2012). Secondly, it alerts policymakers to the contested and fragile nature of the health care convergence thesis. We would urge policymakers to be alert to the perils of overly simplistic readings of research findings, which can have significant policy ramifications. Thirdly, it raises a challenge for new research into this disputed policy terrain.

There are several possible explanations for the nonconvergence of health care expenditure across the sample of EU countries studied. Spencer and Walshe (2009) find varying degrees of adaptation and implementation of both health care policies and strategies throughout 24 EU member states; they argue that this can cause “varying levels of progress in implementation”. Cucic (2000) suggests that it would take much more than equalizing health care expenditure to synchronize the health care systems throughout the EU. However, because there is substantial variation in EU health care systems and health care financing, it is possible that the desired convergence will be a long and complicated process; in other words, a geohistorically mediated process. Leiter and Engelbert (2009) also point to the mobility of the labor market across borders, i.e. the ability of people to cross countries to shop for health care (or what has been termed ‘health tourism’), as an interesting phenomenon of convergence. Nonetheless, they found in their study that, over the long term, ‘countries do not move towards a common mean’. Given the different structures of health care financing in EU countries (e.g., private, public, mixed funding), one possible research avenue warranting further exploration would be an analysis of the financing structure of each EU member and the different sectors of health care providers.

One of the limitations of this study is the power of the test. The asymptotic properties of nonlinear unit root tests are still not well established in the literature. In any case, the policy implication of our finding is clear: the existing EU health policy reforms and European law on health care provision may not be able to encourage greater convergence in the EU. Further research is encouraged to investigate the determinants of health care expenditure differences across countries in the EU.