1 Introduction

Since the work of Baumol (1986) and of Barro and Sala-i-Martin (1991, 1995), many papers have analyzed convergence using two conventional approaches: \( \beta \)-convergence and \( \sigma \)-convergence. Both notions also lend themselves to time-series analysis. Indeed, the development of econometric techniques and the availability of databases covering long periods (Summers and Heston 1991) provide the opportunity to go beyond cross-sectional analysis and to exploit the properties of non-stationary time series (Bernard and Durlauf 1995; Edjo 2003), thereby better informing the debate on economic convergence.

Convergence tests have also expanded within the framework of panel data analysis. The first panel data tests were based primarily on the methodology used in cross-sectional analysis (e.g. Islam 1995; Berthélemy et al. 1997). Then, just as with individual time series, panel unit root tests were used to study economic convergence. This procedure based on panel unit root tests was first implemented by Quah (1992), Evans (1996), Evans and Karras (1996), Bernard and Jones (1996), and Gaulier et al. (1999), among others. More powerful tests were devised by combining the cross-section and time dimensions. To date, two generations of panel unit root tests have been distinguished. Most methods that analyze economic convergence using the properties of non-stationary series belong to the first generation, which assumes independence between individuals (Harris and Tzavalis 1999; Maddala and Wu 1999; Hadri 2000; Choi 2001; Levin et al. 2002; Im et al. 2003). However, as Hurlin and Mignon (2005) pointed out, this assumption of cross-section independence is particularly troublesome in applications of macroeconomic convergence tests. The second generation of unit root tests is generally based on common factor models (Bai and Ng 2004; Moon and Perron 2004; Pesaran 2007; Bai and Ng 2010) and takes into account more general forms of cross-sectional dependence. One of the most important contributions to this second generation is that of Bai and Ng (2004), who presented a toolkit for Panel Analysis of Non-stationarity in Idiosyncratic and Common components (PANIC). Its point of departure is that the unobserved common factors can be consistently estimated, provided that the cross-section dimension is large. This allows the authors to decouple the issue of whether common factors exist from the issue of whether the factors are stationary. While unit root tests are imprecise when the common and idiosyncratic components have different orders of integration, their simulations show that testing the two components directly is more accurate.

The empirical procedure we use here is inspired by this second generation of unit root tests based on factor models and explicitly takes account of the dependencies in the individual dimension. Following Westerlund et al. (2010), we focus on the fact that the cross-country correlation in the convergence equation is due not only to simple correlation of residuals but also to the presence of one or more common factors that jointly affect real per capita GDP. Therefore, the study of convergence in panels based on the standard ADF model, as suggested by Evans and Karras (1996), is no longer suitable because it leads to tests with very low power (Strauss and Yigit 2003).

Another issue addressed in this paper is the existence of breaks in per capita GDP. Studies of structural change in panel data with cross-sectional dependence are very rare (Bai and Carrion-i-Silvestre 2009; Carrion-i-Silvestre and German-Soto 2009; Westerlund et al. 2010). As pointed out by Carrion-i-Silvestre et al. (2005), ignoring these shocks in the econometrics of panel data can produce biases that lead to wrong conclusions. Financial and economic crises, economic reforms, and similar events may cause such shocks.

In Sect. 2 we present the first approaches generally used to test for convergence in non-stationary panel data, focusing on the Evans and Karras (1996) procedure. In Sect. 3 we present the econometric framework of our analysis. In Sect. 4 we conduct Monte-Carlo simulations to analyze the impact of PANIC on test performance. Section 5 presents an application using a sample of OECD member countries and a sample of countries in Sub-Saharan Africa.

2 Convergence tests in panel data econometrics

Convergence tests in panel data are generally based on the standard cross-sectional approach, whose purpose is to test whether economies with low initial income relative to their long-term position grow faster than economies with high initial income. This involves applying ordinary least squares (OLS) to the equation

$$\begin{aligned} \frac{1}{T}\ln ( {y_{i,T}/y_{i,0}}) = \kappa + \beta \ln ({y_{i,0}}) + \varphi \Xi _{i} + \xi _{i} \;\;\;\;\;\; \xi _{i} \backsim i.i.d(0,\sigma _{\xi }^2) \end{aligned}$$
(1)

where \( y_i \) is the real per capita GDP of country \( i \), \( \Xi _i \) is a vector of control variables that hold the steady state of each economy constant, and \( \xi _i \) is the error term. The index T refers to the length of the time interval. \( \kappa \), \( \beta \) and \( \varphi \) are unknown parameters to be estimated. The convergenceFootnote 1 speed \( \theta =-{{T}^{-1}}\ln \left( 1+\beta T \right) \) is the rate at which a given economy catches up with its steady state. The null hypothesis tested is the absence of convergence against the alternative that some countries converge to a level of production that is initially different. If the estimated coefficient \( \beta \) is negative and significant, the hypothesis of convergence is accepted.
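As a quick numerical illustration (in Python/NumPy, with purely hypothetical values of \( \beta \) and T rather than estimates from the paper), the implied convergence speed and the corresponding half-life can be computed directly from the formula above:

```python
import numpy as np

def convergence_speed(beta, T):
    """Implied speed of convergence theta = -(1/T) * ln(1 + beta*T) from Eq. (1)."""
    return -np.log(1.0 + beta * T) / T

# Hypothetical values: beta_hat = -0.02 over a T = 30 year interval
theta = convergence_speed(-0.02, 30)
half_life = np.log(2.0) / theta  # years needed to close half of the initial gap
```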

However, OLS estimation of (1) is useful for inference under certain conditions only. Evans and Karras (1996) explain that the estimators \( \hat{\beta } \) and \( \hat{\varphi } \) obtained by applying ordinary least squares to (1) are valid only if \( \xi _i \) and \( y_{i,0} \) are uncorrelated and if the constant term is generated as follows

$$\begin{aligned} \kappa _{i}=\psi ^{'} \Xi _{i} \end{aligned}$$
(2)

with \( \psi \equiv (\lambda -1) \varphi / \beta \).Footnote 2 In panel data, the Evans and Karras (1996) procedure based on unit root tests underlies many studies of economic convergence (see for example Gaulier et al. 1999). Considering a group of N countries, these authors show that the countries converge if, for each country, the deviations of log per capita GDP from the international average are stationary. Let \( y_{it} \) be the log per capita GDP of country i at period t, with \( i=1, \ldots, N \); \( t=1, \ldots, T \), and let \( \bar{y}_{t}= \sum _{i=1}^{N} y_{it}/N \) be the international average of \( y_{it} \). The question is thus whether the data generating process \( (y_{it}-\bar{y}_{t}) \) is stationary for all i

$$\begin{aligned} \lim _{h \rightarrow \infty } (y_{i,t+h}-\bar{y}_{t+h})=\mu _{i}. \end{aligned}$$
(3)

Convergence occurs if for each i deviations of per capita GDP from the international average tend to a constant when \( t \rightarrow \infty \). Specifically, the convergence hypothesis is accepted only if \( y_{it}-\bar{y}_{t} \) are stationary while the \( y_{it} \) are integrated of order 1. In such a case, we have stochastic convergence. However, as stressed by Carrion-i-Silvestre and German-Soto (2009), stochastic convergence is a necessary but not a sufficient condition to satisfy the definition of \( \beta \)-convergence. With \( y_{it}^{c}=y_{it}-\bar{y}_{t}\), the data generating process proposed by Evans (1996) is

$$\begin{aligned} y_{it}^{c} = \kappa _{i} + \lambda y_{i,t-1}^{c} + u_{it} \end{aligned}$$
(4)

where \( \lambda \equiv (1+\beta T)^{(1/T)} \) is less than 1 if the N economies converge and in this case \( \beta < 0 \). However, there is divergence if \( \lambda =1 \) which implies that \( \beta =0 \). The constant term \( \kappa _i \) is specific to each country and the error term is serially uncorrelated. Evans and Karras (1996) further show that in the case where the errors are correlated in the individual dimension, this specification entails serious problems of statistical inference. International trade in goods and assets means that innovations are probably correlated. In addition, given the specificity of countries in terms of technology, the parameter \( \lambda \) should be specific to each economy. Therefore, the ADF specification in panels with a heterogeneous autoregressive root is generally used as an alternative

$$\begin{aligned} \Delta y_{it}^{c}=\kappa _{i}+\rho _{i}y_{i,t-1}^{c}+\sum _{s=1}^{p}\gamma _{i,s}\Delta y_{i,t-s}^{c}+u_{it}. \end{aligned}$$
(5)

The parameter \( \rho _i \) is negative if the economies converge and zero if they diverge. The roots of \( \sum _{s} \gamma _{i,s}L^{s} \) lie outside the unit circle. Below, in the framework of a PANIC procedure, we use a more general specification of Eq. (4) which allows better control of the cross-sectional and serial correlation of \( u_{it} \). It also takes into account a possible structural change in the mean of the data generating process.
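To fix ideas, here is a minimal NumPy sketch of this setup: the panel is demeaned by the cross-country average at each date, and the ADF regression (5) is then run country by country. The simulated data and the lag order \( p=1 \) are illustrative assumptions, not the paper's data.

```python
import numpy as np

def adf_t_stat(yc, p=1):
    """t-statistic on rho in Eq. (5): Dy_t = k + rho*y_{t-1} + sum_s g_s*Dy_{t-s} + u_t."""
    dy = np.diff(yc)
    T = len(dy)
    # regressors: constant, lagged level, p lagged differences
    X = np.column_stack(
        [np.ones(T - p), yc[p:-1]] + [dy[p - s:T - s] for s in range(1, p + 1)]
    )
    Y = dy[p:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coef
    s2 = resid @ resid / (len(Y) - X.shape[1])
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return coef[1] / se

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=(200, 20)), axis=0)   # simulated log GDP panel (T x N)
y_c = y - y.mean(axis=1, keepdims=True)             # deviations from the average
stats = [adf_t_stat(y_c[:, i]) for i in range(20)]
```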

3 PANIC procedure

Using PANIC (Panel Analysis of Nonstationarity in the Idiosyncratic and Common components), we first test for stochastic convergence, a precondition of \(\beta \)-convergence, on the basis of the Bai and Ng (2010) statistics. This amounts to testing the non-stationarity of cross-economy differences in per capita GDP \( ( H_{0}: \lambda =1 ) \). If stochastic convergence is confirmed, we then conduct the \(\beta \)-convergence test.

3.1 Econometric specification

As mentioned previously, specification (4) supports valid inference only under certain conditions; if they fail, obtaining a consistent estimate of the parameters becomes very challenging. These conditions concern the error term \( u_{it} \) and can be summarized in the two points stated by Evans (1996): (i) \(u_{it}\) is a serially uncorrelated error term with zero mean and finite, constant variance; (ii) \( u_{it} \) is contemporaneously uncorrelated across countries. Following Bai and Ng (2010), to deal with the cross-sectional correlation of \( u_{it} \) we use a common factor structure in which the common and idiosyncratic components have the same order of integration for all i. We have

$$\begin{aligned} y_{it}^{c}={{\kappa }_{i}}+{{{\pi }'}_{i}}{{F}_{t}}+{{e}_{it}} \end{aligned}$$
(6)

where

$$\begin{aligned} {{F}_{t}}= \lambda {{F}_{t-1}} + {{f}_{t}} \end{aligned}$$
(7)

and

$$\begin{aligned} {{e}_{it}}= \lambda {{e}_{i,t-1}}+ {{\varepsilon }_{it}}. \end{aligned}$$
(8)

Notice that \( {f}_{t} \) and \( {\varepsilon }_{it} \) are random variables. Specification (6) allows us to avoid the problems related to conditions (i) and (ii). We first deal with the cross-sectional dependencies by using PANIC to remove the common factors. The null hypothesis of divergence is then tested on the de-factored variable \( x_{it}=e_{it}\), whose dynamic form is given by

$$\begin{aligned} {{x}_{it}}= \lambda {{x}_{i,t-1}}+ {{\varepsilon }_{it}}. \end{aligned}$$
(9)

This is equivalent to testing the null hypothesis of a unit root \( H_{0}:\lambda =1 \) against the alternative of stationarity \( H_{1}:\lambda < 1 \) for some individuals in the panel. Notice that the de-factored panel data can be obtained by projecting the panel data onto the space orthogonal to the factor loadings. The matrix of factor loadings can be estimated using a modified version of the principal component method of Bai and Ng (2002), proposed by Moon and Perron (2004). A similar orthogonalization procedure is also suggested in Phillips and Sul (2003).

The problems of structural change that may affect the economies are also taken into account in this procedure. Paci and Pigliaru (1997) argued that structural change plays a fundamental role in the convergence process, being closely associated with shifts of resources between sectors. Thus, just as with economic interdependence, omitting these breaks when modeling the convergence process generally leads to the convergence hypothesis being wrongly rejected. To take this into account, we propose a more general form of Eq. (6) which admits a single break in the mean

$$\begin{aligned} y_{it}^{c}={{\kappa }_{i}}+{{\theta }_{i}}D{{U}_{i,t}}+{{{\pi }'}_{i}}{{F}_{t}}+{{e}_{it}} \end{aligned}$$
(10)

where \( DU_{i,t} = 1 \) for \( t > T_{b,i} \) and 0 elsewhere. \( T_{b,i} \) denotes the break in the intercept for the ith individual. The first-differenced form of Eq. (10) is

$$\begin{aligned} \Delta y_{it}^{c}={{\theta }_{i}}D{{({{T}_{b,i}})}_{t}}+\,{{{\pi }'}_{i}}\Delta {{F}_{t}}+\Delta {{e}_{it}} \end{aligned}$$
(11)

where the \( D(T_{b,i}) \) are impulse dummies such that \( D(T_{b,i})_{t}=1 \) for \( t=T_{b,i}+1 \) and 0 elsewhere. When the cross-section dimension tends to infinity, their effects become negligible and can be included in the idiosyncratic error (Bai and Carrion-i-Silvestre 2009). Let us define \({{\hat{F}}_{t}}=\sum \nolimits _{s=2}^{t}{\Delta {{F}_{s}}}\) and \({{\hat{e}}_{it}}=\sum \nolimits _{s=2}^{t}{\Delta e_{is}}\). For \(t=2,\ldots ,T\), we can write

$$\begin{aligned} \hat{y}_{it}^{c}={{{\pi }'}_{i}}{{\hat{F}}_{t}}+{{\hat{e}}_{it}}. \end{aligned}$$
(12)

The \( \hat{y}_{it}^{c} \) series preserve the same non-stationarity property as the original series \( y_{it}^{c} \). In addition, model (12) has two important advantages. First, the process is not affected by the structural change, so we face the simple case of a test without a break. Second, unlike the Moon and Perron (2004) approach, which uses an orthogonalization procedure à la Phillips and Sul (2003) to eliminate the common factors, these factors are estimated explicitly before being eliminated from the model. As pointed out by Bai and Ng (2010), the method presented by Moon and Perron (2004) to eliminate common and deterministic components causes serious power problems, especially when the model contains a trend. In Sect. 4 we run Monte-Carlo simulations to verify whether this procedure can affect test performance. Let \( {{\hat{x}}_{it}} \) be the de-factored form of \({{\hat{y}}^{c}}_{it}\). Since \({{\hat{x}}_{it}}={{\hat{e}}_{it}}=\lambda {{\hat{e}}_{i,t-1}}+{{\hat{\varepsilon }}_{it}} \), the de-factored model is

$$\begin{aligned} {{\hat{x}}_{it}}=\lambda {{\hat{x}}_{i,t-1}}+{{\hat{\varepsilon }}_{it}} \end{aligned}$$
(13)

where \( \hat{\varepsilon }_{it} \) is uncorrelated across countries, in accordance with condition (ii). Notice that to select the number of common factors r we use the criteria developed by Bai and Ng (2002).
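The PANIC step just described (difference the panel, extract the common factors by principal components, cumulate the residuals back to levels) can be sketched as follows. This assumes the number of factors r is known and is an illustration in NumPy, not the authors' implementation.

```python
import numpy as np

def panic_defactor(yc, r=1):
    """Difference the panel, estimate r common factors by PCA, and cumulate
    the idiosyncratic residuals back to levels (xhat_it = ehat_it)."""
    dy = np.diff(yc, axis=0)                  # (T-1) x N first differences
    T1 = dy.shape[0]
    # principal components: eigenvectors of dy dy' scaled by sqrt(T-1)
    _, eigvec = np.linalg.eigh(dy @ dy.T)
    f_hat = np.sqrt(T1) * eigvec[:, -r:]      # estimated differenced factors
    loadings = dy.T @ f_hat / T1              # N x r factor loadings
    de = dy - f_hat @ loadings.T              # de-factored differences
    return np.cumsum(de, axis=0), np.cumsum(f_hat, axis=0)

rng = np.random.default_rng(1)
T, N = 100, 20
F = np.cumsum(rng.normal(size=T))                        # one I(1) common factor
e = np.cumsum(rng.normal(size=(T, N)), axis=0)           # I(1) idiosyncratic parts
yc = rng.normal(size=N) + F[:, None] * rng.normal(size=N) + e
x_hat, F_hat = panic_defactor(yc, r=1)
```

By construction the de-factored differences are orthogonal to the estimated factors, which is the property the subsequent unit root tests rely on.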

3.2 Testing for stochastic convergence

In this section, we present the technique for the estimation of \( \lambda \), and the test statistics of the null hypothesis \( \lambda = 1 \).

The test statistics for the null hypothesis \( \lambda =1 \) can be constructed from the pooled modified OLS estimator of the autoregressive root. This estimator is corrected to take condition (i) into account, so that possible serial correlation of the residuals \({{\hat{\varepsilon }}_{it}}\) is controlled for. Let \({{\hat{\phi } }_{\varepsilon }}\) be the sum of the positive autocovariances of the errors \( \varepsilon \) and \( \hat{x} \) the \((T-2)\times N\) matrix of \( \hat{x}_{it} \). The modified OLS estimator is

$$\begin{aligned} {{\hat{\lambda }}^{*}}=\frac{\textit{trace}({{{{\hat{x}}'}}_{-1}}\hat{x})-NT{{{\hat{\phi }}}_{\varepsilon }}}{\textit{trace}({{{{\hat{x}}'}}_{-1}}{{{\hat{x}}}_{-1}})}. \end{aligned}$$
(14)

Following Bai and Ng (2010), two test statistics for the null hypothesis \( \lambda =1 \) are constructed from this estimator of the autoregressive root. The statistics are denoted \({{P}_{a}}\) and \({{P}_{b}}\) and are the analogs of \( t_a \) and \( t_b \) in Moon and Perron (2004). Both follow a standard normal law and we have

$$\begin{aligned} P_{a}^{{}}= & {} \frac{T\sqrt{N}\left( {{{\hat{\lambda }}}^{*}}-1 \right) }{\sqrt{2\hat{\nu }_{\varepsilon }^{4}/\hat{\omega }_{\varepsilon }^{4}}}\rightarrow N(0,1);\end{aligned}$$
(15)
$$\begin{aligned} P_{b}^{{}}= & {} T\sqrt{N}\left( {{{\hat{\lambda }}}^{*}}-1 \right) \sqrt{\frac{1}{N{{T}^{2}}}{} \textit{trace}\left( {{{{\hat{x}}'}}_{-1}}{{{\hat{x}}}_{-1}} \right) \frac{\hat{\omega }_{\varepsilon }^{2}}{\hat{\nu }_{\varepsilon }^{4}}}\rightarrow N(0,1) \end{aligned}$$
(16)

where \(\omega _{\varepsilon }^{2}\) and \(\nu _{\varepsilon }^{4}\) correspond respectively to the averages over N of the individual long-term variances \(\omega _{\varepsilon ,i}^{2}\) and of the squared individual long-term variances \((\omega _{\varepsilon ,i}^{2})^{2}\) of \({{\varepsilon }_{it}}\). Let \({{\hat{\Gamma }}_{i}}(j)\) be the residual empirical autocovariance; we have

$$\begin{aligned} {{\hat{\Gamma }}_{i}}(j)=\frac{1}{T}\sum \limits _{t=1}^{T-j}{{{{\hat{\varepsilon }}}_{i,t}}}{{\hat{\varepsilon }}_{i,t+j}}. \end{aligned}$$

From \({{\hat{\Gamma }}_{i}}(j)\), it is possible to construct an estimator of the individual long-term variancesFootnote 3

$$\begin{aligned} \hat{\omega }_{\hat{\varepsilon },i}^{2}=\sum \limits _{j=-T+1}^{T-2}{\omega }({{q}_{i}},j){{\hat{\Gamma }}_{i}}(j);\quad \quad {{\hat{\phi }}_{\hat{\varepsilon },i}}=\sum \limits _{j=1}^{T-1}{\omega ({{q}_{i}},j){{{\hat{\Gamma }}}_{i}}(j)}. \end{aligned}$$

These individual variances are used to define the estimates of the means of the individual long-term variances as follows

$$\begin{aligned} \hat{\omega }_{{\hat{\varepsilon }}}^{2}=\frac{1}{N}\sum \limits _{i=1}^{N}{\hat{\omega }_{\hat{\varepsilon },i}^{2}};\quad \quad {{\hat{\phi }}_{{\hat{\varepsilon }}}}=\frac{1}{N}\sum \limits _{i=1}^{N}{{{{\hat{\phi }}}_{\hat{\varepsilon },i}}};\quad \quad \hat{\nu }_{{\hat{\varepsilon }}}^{4}=\frac{1}{N}\sum \limits _{i=1}^{N}{(\hat{\omega }_{\hat{\varepsilon },i}^{2}}{{)}^{2}}. \end{aligned}$$
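As an illustration, the individual estimators above can be sketched in NumPy with a Bartlett kernel playing the role of the weights \( \omega ({{q}_{i}},j) \) and a common bandwidth q (both illustrative assumptions):

```python
import numpy as np

def individual_lrv(eps, q=4):
    """Bartlett-kernel estimates of the two-sided long-run variance omega_i^2
    and the one-sided sum phi_i of positive-lag autocovariances,
    for each column of eps (T x N)."""
    T, N = eps.shape
    omega2, phi = np.empty(N), np.empty(N)
    w = 1.0 - np.arange(1, q + 1) / (q + 1.0)        # Bartlett weights
    for i in range(N):
        e = eps[:, i] - eps[:, i].mean()
        gamma = np.array([e[: T - j] @ e[j:] / T for j in range(q + 1)])
        phi[i] = (w * gamma[1:]).sum()               # one-sided sum
        omega2[i] = gamma[0] + 2.0 * phi[i]          # two-sided long-run variance
    return omega2, phi

rng = np.random.default_rng(2)
eps = rng.normal(size=(500, 10))
omega2, phi = individual_lrv(eps)
# averages over N entering P_a and P_b
omega2_bar, phi_bar, nu4_bar = omega2.mean(), phi.mean(), (omega2 ** 2).mean()
```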

The test statistics are obtained by substituting the estimated variances into the expressions for \( P_{a} \) and \( P_{b} \). If the realization of the statistic \({{P}_{a}}\) or \({{P}_{b}}\) lies below the normal critical value, the null of divergence is rejected and the hypothesis of stochastic convergence is accepted.
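Putting the pieces together, Eqs. (14)–(16) can be coded along the following lines. The nuisance parameters are passed in as scalars and the panel is simulated under the null, so this is a sketch rather than a full implementation.

```python
import numpy as np

def pa_pb(x, omega2, nu4, phi):
    """Pooled modified OLS estimator (14) and the P_a, P_b statistics (15)-(16).
    x is the T x N matrix of de-factored series; omega2, nu4, phi are the
    estimated variance nuisance parameters."""
    x0, x1 = x[1:], x[:-1]                       # current and lagged panels
    T, N = x0.shape
    lam = (np.trace(x1.T @ x0) - N * T * phi) / np.trace(x1.T @ x1)
    Pa = T * np.sqrt(N) * (lam - 1.0) / np.sqrt(2.0 * nu4 / omega2 ** 2)
    Pb = (T * np.sqrt(N) * (lam - 1.0)
          * np.sqrt(np.trace(x1.T @ x1) / (N * T ** 2) * omega2 / nu4))
    return lam, Pa, Pb

rng = np.random.default_rng(3)
x = np.cumsum(rng.normal(size=(200, 20)), axis=0)   # pure random walks (H0 true)
lam, Pa, Pb = pa_pb(x, omega2=1.0, nu4=1.0, phi=0.0)
```

With iid N(0,1) innovations the variance nuisance parameters are known to equal one, which is why they are hard-coded in this example.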

As noted in the previous section, implementing the economic convergence test requires selecting the number of common factors to be eliminated from the data generating process in order to define a consistent estimate of \( \lambda \). To estimate r, one can use the panel criteria defined by Bai and Ng (2002): \( PC_{i}, IC_i \) and \( BIC_i \) for \( i=1,2,3 \) [see for example Bai and Ng (2002) and Moon and Perron (2004)].

3.3 Analyzing \(\beta \)-convergence

This subsection presents the method used to analyze \(\beta \)-convergence when the hypothesis of stochastic convergence is accepted. The aim is to estimate the implied value of \( \beta \), given by \(\hat{\beta }=({{({{\hat{\lambda }}^{*}})}^{T}}-1)/T\,\). For this purpose we use \( \hat{\lambda }^{*} \), the consistent estimator of \( \lambda \). The procedure for estimating \( \lambda \) and testing the null hypothesis \(\lambda =0\) comprises three steps.

Step 1 We apply PANIC to the dataset using Eq. (6) and the obtained series follow

$$\begin{aligned} {{\hat{x}}_{it}}=\lambda {{\hat{x}}_{i,t-1}}+{{\hat{\varepsilon }}_{it}} \end{aligned}$$

where the variables are defined as in Eq. (13). Then, for each i , the \( {{\hat{x}}_{it}} \) series are normalized by the OLS regression standard error \({{\hat{\sigma }}_{{{{\hat{\varepsilon }}}_{i}}}}\) to control for heterogeneity across countries. The normalized series is

$$\begin{aligned} {{\hat{z}}_{it}}={{\hat{x}}_{it}}/{{\hat{\sigma }}_{{{{\hat{\varepsilon }}}_{i}}}}. \end{aligned}$$

Step 2 In this step we construct the following normalized model to estimate the pooled parameter, \( \lambda \)

$$\begin{aligned} {{\hat{z}}_{it}}=\lambda {{\hat{z}}_{i,t-1}}+{{\hat{v}}_{it}} \end{aligned}$$
(17)

where \( {{\hat{v}}_{it}}={{\hat{\varepsilon }}_{it}}/{{\hat{\sigma }}_{{{{\hat{\varepsilon }}}_{i}}}} \). Let \( \hat{z} \) be the matrix of observations \( {{\hat{z}}_{it}} \) and \( {{\hat{z}}_{-1}} \) the matrix of lagged observations. The corrected estimator of \( \lambda \) is

$$\begin{aligned} {{\hat{\lambda }}^{*}}=\frac{\textit{trace}({{{{\hat{z}}'}}_{-1}}\hat{z})-NT{{{\hat{\phi }}}_{\varepsilon }}}{\textit{trace}({{{{\hat{z}}'}}_{-1}}{{{\hat{z}}}_{-1}})}. \end{aligned}$$

Step 3 On the basis of the modified pooled estimator of the normalized equation, we define the corrected t-statistic

$$\begin{aligned} {{t}^{*}}\left( \lambda \right) =\frac{{{{\hat{\lambda }}}^{*}}}{{{{\hat{\sigma }}}_{{{\lambda }^{*}}}}}, \end{aligned}$$

where

$$\begin{aligned} {{\hat{\sigma }}_{{{\lambda }^{*}}}}={{\hat{\sigma }}_{{\hat{v}}}}{{\left( \sum \limits _{i=1}^{N}{\sum \limits _{t=2}^{T}{\hat{z}_{i,t-1}^{2}}} \right) }^{-1/2}};\quad {{\hat{\sigma }}_{{\hat{v}}}}=\sqrt{\textit{trace}((\hat{z}-{{{\hat{\lambda }}}^{*}}{{{\hat{z}}}_{-1}})(\hat{z}-{{{\hat{\lambda }}}^{*}}{{{\hat{z}}}_{-1}}{)}')/NT}. \end{aligned}$$

The estimated value of \( \lambda \) obtained in Step 2 is used.
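A compact NumPy sketch of the three steps, with iid errors assumed so that the serial-correlation correction \( {{\hat{\phi }}_{\varepsilon }} \) is set to zero and a simple AR(1) panel stands in for the de-factored data:

```python
import numpy as np

def t_star(x_hat):
    """Steps 1-3: normalize by per-country residual s.d., pool-estimate lambda,
    and form the corrected t-statistic t*(lambda) (phi_hat = 0 under iid errors)."""
    # Step 1: country-by-country AR(1) residual standard deviations
    N = x_hat.shape[1]
    sig = np.empty(N)
    for i in range(N):
        x0, x1 = x_hat[1:, i], x_hat[:-1, i]
        lam_i = x1 @ x0 / (x1 @ x1)
        sig[i] = np.sqrt(((x0 - lam_i * x1) ** 2).mean())
    z = x_hat / sig                              # normalized series
    # Step 2: pooled estimator on the normalized panel
    z0, z1 = z[1:], z[:-1]
    lam = np.trace(z1.T @ z0) / np.trace(z1.T @ z1)
    # Step 3: corrected t-statistic
    v = z0 - lam * z1
    sig_v = np.sqrt((v ** 2).sum() / v.size)
    return lam, lam / (sig_v / np.sqrt((z1 ** 2).sum()))

rng = np.random.default_rng(4)
T, N, true_lam = 200, 20, 0.95
x = np.zeros((T, N))
for t in range(1, T):                            # stationary AR(1) panel
    x[t] = true_lam * x[t - 1] + rng.normal(size=N)
lam_hat, t_stat = t_star(x)
```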

4 Simulation studies

This section presents the results of Monte-Carlo simulations whose main purpose is to check whether the PANIC procedure can also neutralize the statistical consequences of structural change. This question matters because a positive answer implies that specification (17) can be used not only to test the convergence hypothesis without being confronted with correlation problems in the error term, but also to handle breaks affecting the mean of the series. In line with Westerlund et al. (2010), we show that PANIC can be used even in the presence of a single break. Indeed, in this framework the break has no negative impact on the size and power of the \( P_a \) and \( P_b \) tests, and we obtain much more satisfactory results than with standard approaches.

4.1 Power and size

To study power and size we consider two cases, in each of which two experimentsFootnote 4 are conducted using 1000 replications with \(N=\{20,50\}\) and \(T=\{100,200\}\). In both cases there is a single common factor and, following Bai and Ng (2010), we hold the number of factors in the simulations at its true value. In all power and size simulations, the 5 % significance level is used.

4.1.1 Case 1

In this first case we want to assess the effect of a single break. In doing so, we use a DGP which draws upon that of Moon and Perron (2004) and we have

Experiment 1.1:

$$\begin{aligned} \begin{aligned}&y_{it}^{c}={{\kappa }_{i}}+\lambda y_{i,t-1}^{c} + {{u}_{it}} \\&{{u}_{it}} = {{\pi }_{i}}{{f}_{t}} + {{\varepsilon }_{it}} \\ \end{aligned} \end{aligned}$$

Experiment 1.2:

$$\begin{aligned} \begin{aligned}&y_{it}^{c}={{\kappa }_{i}} + {{\theta }_{i}}D{{U}_{i,t}} + \lambda y_{i,t-1}^{c} + {{u}_{it}} \\&{{u}_{it}}= {{\pi }_{i}}{{f}_{t}} + {{\varepsilon }_{it}}. \\ \end{aligned} \end{aligned}$$

In Experiment 1.2 we include break points which are randomly positioned for each i with break fractions \({{\alpha }_{i}}\) following \({{\alpha }_{i}}\sim {\ }U\left[ 0.2,0.8 \right] \). In both experiments \({{\kappa }_{i}}\sim {\ }N(0,1)\) and \(({{f}_{t}},{{\pi }_{i}},{{\varepsilon }_{it}})\sim {\ }iidN(0,{{I}_{3}})\).

Table 1 Results for Case 1

To simulate size we set \({{\lambda }_{i}}=1\) for all i, and for power we draw \( \lambda _{i} \sim {\ }U\left[ 0.9,0.99 \right] \). In both experiments, the deterministic component is not estimated: the model allows for fixed effects only (without incidental trends) and the panel data are demeaned beforehand, so no estimation of deterministic components is necessary. Notice, however, that when the model contains incidental trends and heterogeneous deterministic components must be removed, Moon and Perron (2004) show that these tests have no significant asymptotic power. This is the so-called incidental trends problem.
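The data generating processes of Experiments 1.1 and 1.2 can be sketched with the following generator; the parameter choices follow the text, while the seed and sample sizes are arbitrary illustrations.

```python
import numpy as np

def simulate_case1(N=20, T=100, lam=None, with_break=False, seed=0):
    """DGP for Experiments 1.1/1.2: y_it = kappa_i (+ theta_i*DU_it)
    + lam_i*y_{i,t-1} + u_it, with u_it = pi_i*f_t + eps_it.
    lam=None draws lam_i ~ U[0.9, 0.99] (power); lam=1.0 gives the size setup."""
    rng = np.random.default_rng(seed)
    lam_i = np.full(N, lam) if lam is not None else rng.uniform(0.9, 0.99, N)
    kappa, pi, theta = rng.normal(size=(3, N))
    f, eps = rng.normal(size=T), rng.normal(size=(T, N))
    tb = (rng.uniform(0.2, 0.8, N) * T).astype(int)   # random break dates
    y = np.zeros((T, N))
    for t in range(1, T):
        y[t] = kappa + lam_i * y[t - 1] + pi * f[t] + eps[t]
        if with_break:
            y[t] += theta * (t > tb)                  # shift in the mean
    return y

y_size = simulate_case1(lam=1.0)                      # Experiment 1.1, size
y_power = simulate_case1(with_break=True, seed=1)     # Experiment 1.2, power
```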

The empirical power corresponds to the proportion of rejections of the null hypothesis. The statistics \( P_a \) and \( P_b \) were compared with their critical values at the 5 % significance level. The simulated results for power are given in Table 1, in which for each experiment the \( P_a \) and \( P_b \) cells have two entries: the first is power and the second is size-adjusted power. Notice that in practice the tests require estimating the number of factors, and that overestimating this number can cause severe size distortions for small N. Indeed, as stressed by Moon and Perron (2004), in such a case the information criteria tend to choose a number of factors greater than the true one, both in the presence of a break (see Breitung and Eickmeier 2011) and when there is no break. Table 6 in the Appendix gives the average estimated number of factors using three variants of the information criteria, denoted \( PC_1, IC_1\) and \( BIC_3 \) respectively. In our case, however, the number of factors is always held equal to the true one.

The simulation results show that the power of the tests declines in the presence of a single break only for small sample sizes. When N becomes larger, there is no difference between the simulated power of the two experiments. Furthermore, an examination of the results on size reveals that in the presence of a break the \( P_a \) and \( P_b \) tests show size distortions and become relatively less attractive, particularly as N increases. In sum, as expected, the tests are negatively affected when there is a single break in the data generating process.

4.1.2 Case 2

The second simulation study illustrates the difference between PANIC and the standard orthogonalization procedure in controlling these negative effects. The aim is thus to highlight the role of PANIC estimation in controlling the effect of the break. As in the previous case, two DGPs are considered: the first is the Bai and Ng (2004) DGP and the second is the same DGP augmented with a single break. Thus, we have

Experiment 2.1:

$$\begin{aligned} y_{it}^{c}={{\kappa }_{i}} + {{\pi }_{i}}{{F}_{t}} + {{e}_{it}} \end{aligned}$$

Experiment 2.2:

$$\begin{aligned} y_{it}^{c}={{\kappa }_{i}}+{{\theta }_{i}}D{{U}_{i,t}} + {{\pi }_{i}}{{F}_{t}}+{{e}_{it}} \end{aligned}$$

where \( {{F}_{t}}=\Phi {{F}_{t-1}}+{{\eta }_{t}}\) and \( {{e}_{it}}=\lambda {{e}_{i,t-1}}+{{\varepsilon }_{it}} \). As in Case 1, we have \({{\alpha }_{i}}\sim {\ }U\left[ 0.2,0.8 \right] , {{\kappa }_{i}}\sim {\ }N(0,1)\) and \(({{f}_{t}},{{\pi }_{i}},{{\varepsilon }_{it}})\sim {\ }iidN(0,{{I}_{3}})\).

In each experiment, the model is estimated on the one hand using the PANIC procedure and on the other hand using an orthogonalization procedure (Ortho.). In both estimation methods, the data are first-differenced prior to estimating the common factor by principal component analysis (PCA). Then, we define the de-factored form of the model

$$\begin{aligned} {{\hat{x}}_{it}}=\lambda {{\hat{x}}_{i,t-1}}+{{\hat{\varepsilon }}_{it}}. \end{aligned}$$

To study the size of the test we set, following Bai and Ng (2010), \({{\lambda }_{i}}=\Phi =1\) for all i. For power, we consider values of \( \lambda _i \) close to the null hypothesis of a unit root. Thus, under the alternative, the parameter \( \lambda \) is specific to each individual with \( \lambda _i \sim {\ }U\left[ 0.9,0.99 \right] \), whereas \(\Phi =0.5\).

Table 2 Results for Case 2

Table 2 presents the results for power and size in each experiment described above, using the PANIC and orthogonalization methods respectively. For the two data generating processes, the size and power properties of the \( P_{a} \) and \( P_{b} \) tests are studied by considering the percentage of replications in which the unit root hypothesis is rejected. Results on power are complemented by their size-adjusted form. As expected, the results show that in the presence of a single break, the method based on PANIC remains much more satisfactory than the one based on orthogonalization. In addition, when PANIC is used, size and power are controlled, so that their properties in Experiments 2.1 and 2.2 become very similar. Notice that, besides being unsatisfactory, the orthogonalization approach also yields very mixed results. As in the previous case, the number of common factors is held equal to the true one; in practice, however, the various strategies for estimating the number of common factors developed by Bai and Ng (2002) can be used. These strategies are known to give consistent estimates of the number of common factors in the PANIC framework.

4.2 Critical values of the statistics \(t^*\left( \hat{\lambda }\right) \)

The analysis of the results obtained with the approach used here requires knowledge of the marginal significance level of the corrected t-statistic \( t^{*}\left( \hat{\lambda } \right) \). We present simulations from which the critical values for the standard thresholds of 1, 5 and 10 % can be determined. The simulation procedure is as follows. First, the parameters (variances) of \( {{\hat{\varepsilon }}_{it}} \) are collected for each i to construct the null model \({{\hat{x}}_{it}}={{\hat{\varepsilon }}_{it}} \). For this purpose, we first apply the PANIC procedure to the model \(y_{it}^{c}={{\kappa }_{i}}+{{\theta }_{i}}D{{U}_{i,t}}+{{{\pi }'}_{i}}\,{{F}_{t}}+{{e}_{it}}\) where \( {{F}_{t}}=\Phi {{F}_{t-1}}+{{\eta }_{t}} \) and \( {{e}_{it}}=\lambda {{e}_{i,t-1}}+{{\varepsilon }_{it}} \). The de-factored and de-trended model \( {{\hat{x}}_{it}}=\lambda {{\hat{x}}_{i,t-1}}+{{\hat{\varepsilon }}_{it}} \) is then estimated by OLS to obtain the residuals \( {{\hat{\varepsilon }}_{it}} \).

Table 3 Simulated critical values of \(t^*\left( \hat{\lambda }\right) \)

For each i, using the variance of \( {{\hat{\varepsilon }}_{it}}\), we generate 10,000 data sets from the null model \({{\hat{x}}_{it}}={{\hat{\varepsilon }}_{it}}\) with \({{\hat{\varepsilon }}_{it}}\sim {\ }iidN\left( 0,\sigma _{{{{\hat{\varepsilon }}}_{i}}}^{2} \right) \). On the basis of these data sets, we then estimate the alternative model \( {{\hat{x}}_{it}}=\lambda {{\hat{x}}_{i,t-1}}+{{\hat{\varepsilon }}_{it}} \). Steps 2 and 3 of the procedure presented in Sect. 3.3 are then implemented to obtain the unbiased pooled estimator of the normalized model and to compute the test statistic \( {{t}^{*}}\left( {\hat{\lambda }} \right) \). From the sample of 10,000 values of \( {{t}^{*}}\left( {\hat{\lambda }} \right) \) we obtain the critical values corresponding to the 1, 5 and 10 % quantiles, to which \( {{t}^{*}}\left( {\hat{\lambda }} \right) \) is then compared. The simulated 1, 5 and 10 % critical values for \( \lambda = 0.94, 0.98, 1.00 \) and \(N=\{20,50\}\); \(T=\{100,200\}\) are given in Table 3.
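A simplified sketch of this procedure is given below, with far fewer replications than the 10,000 used in the text and the pooled t-statistic computed without the serial-correlation correction:

```python
import numpy as np

def simulate_critical_values(sig, T=100, reps=2000, seed=0):
    """Generate the null model x_it = eps_it with country variances sig_i^2,
    fit x_it = lam * x_{i,t-1} + eps_it on normalized data, and return the
    1, 5 and 10% quantiles of the pooled t-statistics."""
    rng = np.random.default_rng(seed)
    stats = np.empty(reps)
    for r in range(reps):
        x = rng.normal(scale=sig, size=(T, len(sig)))  # null model draw
        z = x / x.std(axis=0)                          # per-country normalization
        z0, z1 = z[1:], z[:-1]
        lam = (z1 * z0).sum() / (z1 ** 2).sum()        # pooled AR(1) estimate
        v = z0 - lam * z1
        se = np.sqrt((v ** 2).mean()) / np.sqrt((z1 ** 2).sum())
        stats[r] = lam / se
    return np.quantile(stats, [0.01, 0.05, 0.10])

cv = simulate_critical_values(sig=np.ones(20))         # illustrative variances
```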

5 Convergence in developing and developed countries

5.1 Data

The data are from the World Development Indicators (WDI) of the World Bank Group. They are annual real per capita GDP series covering the period 1975–2008. To compare results for developed and poor countries we consider several samples. The first sample, OECD, includes 20 OECD member countries: Australia, Austria, Belgium, Canada, Denmark, Finland, France, Germany, Greece, Ireland, Italy, Japan, Netherlands, New Zealand, Norway, Portugal, Spain, Sweden, United Kingdom and United States. The second sample, called CFA, comprises 13 CFA zone member countries: Benin, Burkina Faso, Cameroon, Central African Republic, Chad, Congo, Côte d’Ivoire, Gabon, Guinea Bissau, Mali, Niger, Senegal and Togo. Generally, these countries have highly correlated business cycles. A third sample, AFRICA, comprises 7 other countries in Sub-Saharan Africa: Democratic Republic of Congo, Gambia, Ghana, Liberia, Nigeria, Sierra Leone and South Africa. A global sample called GLOBAL, composed of all these groups, is also considered; it comprises 40 countries, including both poor and rich economies.

5.2 Results

5.2.1 Comparing the results from the different generations of tests

We use several test statistics from the three generations developed in the literature to test the non-stationarity of the deviations of per capita GDP from the international average. All test results are presented in Table 4 and are based on a data generating process whose deterministic component contains an intercept, augmented with a single break where necessary. These statistics make it possible to test the null hypothesis of divergence and to compare the results by analyzing the impact of interdependence and/or structural change. Initially, tests for structural change are conducted using the Bai and Perron (1998) procedure. The null hypothesis of independence is also examined on the basis of the Pesaran (2004) CD statistic, which is robust to breaks. The results of these tests are given in Tables 7 and 8 and reveal both structural breaks and cross-section dependence. For the GLOBAL sample, the CD test results based on \(\textit{ADF}(p)\) regression residuals are significant at the 5 % level for both the log per capita GDP and its mean-centered variant. This also applies to the OECD sample, regardless of the lag order \(p = 1,2,3\). For the AFRICA group, only the test on the log per capita GDP rejects the null hypothesis of independence; the tests applied to the cross-section demeaned log per capita GDP do not reject it. Footnote 5
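The Pesaran (2004) CD statistic used here has a simple closed form: it aggregates the pairwise correlations of the ADF regression residuals and is asymptotically standard normal under the null of cross-section independence. A minimal sketch (function name assumed):

```python
import numpy as np

def pesaran_cd(resid):
    """Pesaran (2004) CD statistic from an (N, T) array of
    ADF-regression residuals:
        CD = sqrt(2T / (N(N-1))) * sum_{i<j} rho_ij,
    where rho_ij are the pairwise correlations across countries.
    Under cross-section independence, CD is asymptotically N(0, 1)."""
    N, T = resid.shape
    rho = np.corrcoef(resid)               # N x N pairwise correlations
    iu = np.triu_indices(N, k=1)           # upper triangle, pairs i < j
    return np.sqrt(2.0 * T / (N * (N - 1))) * rho[iu].sum()
```

With independent residuals the statistic stays near zero; a strong common factor drives it far into the rejection region, which is the pattern reported above for the GLOBAL and OECD samples.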

For the presentation of the test statistics reported in Table 4 see, for example, Gengenbach et al. (2010). Other statistics are also available in the literature, but here we consider only those best suited to the structure of our panel, given the specific properties of these tests. For the ‘second generation’ tests, which allow cross-section dependencies to be modeled on the basis of a factor model (Bai and Ng 2004; Moon and Perron 2004; Pesaran 2007), the statisticsFootnote 6 \( P_{\hat{e},\textit{Choi}}^c \) and \( MQ_{c} \) respectively test the non-stationarity of the idiosyncratic and common components of the same series. Contrary to \(P_{\hat{e},\textit{Choi}}^c\), which follows a standard normal law, \( MQ_{c} \) and \(\textit{CIPS}^{*}\) are nonstandard and their critical values are provided by the authors. The \(\textit{CIPS}^* \) statistic is built as the average of the individual \(\textit{CADF}^* \) statistics, and standard central limit theorems do not apply. At the 5 % threshold, the critical values of \( MQ_{c} \) and \(\textit{CIPS}^{*}\) are \(-\)57.04 and \(-\)2.22, respectively. For the test of Carrion-i-Silvestre et al. (2001), whose model takes into account a single break in the mean and ignores correlations in the individual dimension, the corresponding statistic follows a normal distribution with zero mean and a variance that depends on T and the position of the break. Finally, the statistic \( P_{m} \) used for the test of Bai and Carrion-i-Silvestre (2009), which includes both interdependence and structural change and thus belongs to a third generationFootnote 7 of tests, admits a standard normal distribution.

Table 4 Results based on different generations of tests

The results of the first and second generations of tests display significant disparity, both within a given generation and between the two generations. Among the first-generation tests, only that of Levin et al. (2002) accepts the hypothesis of convergence at the 1 % level for OECD and GLOBAL, and at the 10 % level for the AFRICA sample. The statistic \( {{P}_\textit{MW}} \) of Maddala and Wu (1999) validates the hypothesis of convergence for OECD member countries alone, with a significance level of 5 %, while the test \({{W}_\textit{tbar}}\) of Im et al. (1997) accepts the hypothesis of convergence for the GLOBAL sample and for OECD countries at the respective thresholds of 5 and 1 %. The hypothesis of convergence is clearly rejected for the African countries (the full test reports are available on request).

The inclusion of cross-section dependencies only (second generation of tests) also leads to mixed results. With the procedure of Bai and Ng (2004), convergence is rejected regardless of the sample considered. This procedure has the advantage of identifying the source (idiosyncratic or common) of non-convergence between the economies. The lack of convergence among OECD countries and those of the GLOBAL sample is caused by common factors. To the extent that most countries considered are active in the same economic or monetary areas, this situation of divergence may seem contradictory. Economic theory, particularly in the area of economic and monetary integration, supports the claim that the economic interdependencies generated by the policies of sub-regional integration should accelerate the convergence process. However, it should be noted that apart from the impact of integration policies, economies are also affected by shocks related to the global economy which, as shown by the test results of Bai and Ng (2004), are real sources of divergence. Bai and Carrion-i-Silvestre (2009) argue that when the data generating process contains common factors, \(I\left( 0 \right) \) factors represent the common shocks, while \(I\left( 1 \right) \) factors model the effects related to unobservable global stochastic trends. For example, Hurlin and Mignon (2005) note that in the analysis of the non-stationarity properties of GNP series, \(I\left( 1 \right) \) common factors can be assimilated to factors of global growth. Still with regard to the second generation of tests, the Moon and Perron (2004) statistic, denoted \( t_{b}^{*}\), accepts the convergence hypothesis at the 1 % level for our three samples, flatly contradicting the results of the \(\textit{CIPS}^{*}\) test, which conclude in favor of divergence for these samples.
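The Bai and Ng (2004) decomposition underlying these results can be sketched compactly: factors are estimated by principal components on the first-differenced data, and both components are re-cumulated to levels before unit root tests are applied to each separately. The following is a minimal sketch for the intercept-only case (no break handling), with hypothetical names:

```python
import numpy as np

def panic_decompose(x, r):
    """Minimal sketch of the PANIC decomposition of Bai and Ng (2004).
    x is a (T, N) panel in levels; r is the number of common factors.
    Factors are estimated by principal components on the demeaned first
    differences, then the common and idiosyncratic components are
    cumulated back to levels for separate unit root testing."""
    dx = np.diff(x, axis=0)                     # (T-1, N) first differences
    dx = dx - dx.mean(axis=0)                   # demean (intercept case)
    # principal components via SVD: factors proportional to left vectors
    u, s, vt = np.linalg.svd(dx, full_matrices=False)
    f = np.sqrt(dx.shape[0]) * u[:, :r]         # estimated differenced factors
    lam = dx.T @ f / dx.shape[0]                # (N, r) factor loadings
    de = dx - f @ lam.T                         # differenced idiosyncratic part
    F = np.cumsum(f, axis=0)                    # common components in levels
    E = np.cumsum(de, axis=0)                   # idiosyncratic components
    return F, E, lam
```

Unit root tests are then run on each column of E (pooled, as in \(P_{\hat{e},\textit{Choi}}^c\)) and on the estimated factors F (as in \(MQ_c\)), which is what allows the source of non-convergence to be attributed to the common or the idiosyncratic part.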
In the \(\textit{CIPS}^{*}\) test, the hypotheses are formulated so that, under the alternative, two categories of countries are considered: a first category of converging economies and a second category of diverging countries. Thus, if the alternative hypothesis is accepted, this reflects the fact that there is at least one country whose per capita GDP converges to the international average. The same applies to the tests of Im et al. (1997) and Moon and Perron (2004). In Table 9, we use the individual CADF statistics of Pesaran (2007) to identify, economy by economy, the countries whose per capita GDP converges to the international average.

In order to study the case where only structural change is taken into account, the test of Carrion-i-Silvestre et al. (2001), which extends the (first generation) unit root test developed by Harris and Tzavalis (1999), is also implemented. Here, the estimated break dates are common to all economies and are obtained on the basis of a supremum statistic. The common dates are 1989, 1995 and 1989 for the AFRICA, OECD and GLOBAL samples, respectively. The test results obtained with the Carrion-i-Silvestre et al. (2001) statistic are identical to those of the \(\textit{CIPS}^{*}\) tests in the sense that they conclude in favor of the non-stationarity of the cross-section demeaned per capita GDP regardless of the sample.

Table 5 Estimation results of the PANIC approach

In general, the finding that emerges from the results of the first two generations of tests is that, although considerable progress has been made in the literature on non-stationary panels, the results of empirical convergence tests are very mixed and not always in line with the predictions of economic theory. It therefore seems essential to go further and effectively integrate the various phenomena that may affect the convergence equation, the omission of which generally leads to the convergence hypothesis being wrongly rejected. Moreover, the results of the \({{P}_{m}}\) test of Bai and Carrion-i-Silvestre (2009), which belongs to the third generation of tests covering both economic co-movements and structural change, accept the hypothesis of economic convergence for the three groups of countries at the 1 % threshold. The following section presents an application based on the PANIC procedure which, in addition to testing the convergence hypothesis while taking into account both interdependencies and breaks, allows us to go further by analyzing \(\beta \)-convergence.

5.2.2 Results based on the PANIC procedure

Table 5 displays the results of the three-step approach, which is mainly based on PANIC. For the samples (AFRICA, OECD, GLOBAL), the criterion \( BIC_{3} \) selects six factors, corresponding to the estimated value of r in Sect. 5.2. The results show that countries in the overall sample converged over the period 1975–2008. The p values associated with the test statistics \({{P}_{a}} \) and \({{P}_{b}}\) are lower than the 1 % threshold, indicating the rejection of the null hypothesis of divergence for these countries. Thus, for this sample, the parameter \( {{\hat{\lambda }}^{*}}\) is significantly lower than 1, with a value \( {{\hat{\lambda }}^{*}} = 0.9421 \). The tests based on \( {{t}^{*}}\left( \lambda \right) \) show that \( \lambda \ne 0 \)Footnote 8 and that the implied value of the parameter \( \beta \) is \( \hat{\beta } = -0.0266\). These results are used to determine the speed of convergence, which for the countries in the GLOBAL sample is 5.95 %.
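The mapping from \( {{\hat{\lambda }}^{*}} \) to the reported speed of convergence is not spelled out here; a standard continuous-time approximation, \( s=-\ln \lambda \), reproduces the GLOBAL figure up to rounding, so we sketch that assumption (the function name is hypothetical):

```python
import math

def convergence_speed(lam_hat):
    """Assumed mapping from the pooled AR(1) coefficient to an annual
    speed of convergence via the continuous-time approximation
    s = -ln(lambda).  This is an illustrative assumption, not the
    paper's stated formula; it matches the reported 5.95 % for the
    GLOBAL sample up to rounding."""
    return -math.log(lam_hat)

speed_global = convergence_speed(0.9421)   # roughly 0.0596, i.e. about 5.95 %
```

A coefficient of exactly 1 (a unit root, i.e. divergence) maps to a speed of zero under this convention.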

The results for OECD countries show that the \( {{P}_{a}} \) and \( {{P}_{b}} \) statistics also accept the hypothesis of convergence for these countries at the 1 % level. In addition, there is \( \beta \)-convergence for the OECD countries during the period 1975–2008. With a parameter \( \hat{\beta } = -0.0306\), the speed of convergence is 12.30 %.

For the AFRICA sample, the null hypothesis of divergence is not rejected: the probabilities associated with \( {{P}_{a}} \) and \( {{P}_{b}} \) are higher than the standard thresholds of 5 and 10 %.

These results thus point in the same direction as the numerous panel data studies on economic convergence, in accepting \( \beta \)-convergence for the OECD countries and for the full sample. Moreover, as expected, the treatment of structural change and economic interdependencies leads to faster convergence than the approaches generally used. The estimates of Evans and Karras (1996) over the period 1950–1990, based on a larger sample of 54 rich and poor countries from the Summers and Heston database, give a convergence rate of 4.30 %. Gaulier et al. (1999) allow for heterogeneity of the convergence parameter in the procedure of Evans and Karras (1996) and obtain a convergence rate of 11.4 % for a sample of 27 OECD countriesFootnote 9 over the period 1960–1990. However, neither the period considered by these authors nor their data sources are identical to ours, which makes direct comparison more difficult. It is nevertheless important to note that the use of non-stationary panel data, particularly when the phenomena of co-movement and structural change are taken into account, substantially reduces the bias encountered in cross-sectional analysis, which drags the estimated speed of convergence towards 0. Examples are the studies by Barro and Sala-i-Martin (1991) and Mankiw et al. (1992), which found a convergence rate of about 2 %. As stressed by Evans (1997), in the context of the neoclassical growth model such a slow rate of convergence is incompatible with the fact that physical capital is the only reproducible factor and is paid its marginal product: with slow convergence (2 %, for example), the implied elasticity of output with respect to capital would have to be higher than the elasticity observed for physical capital. In other words, a more accurate analysis of the convergence process requires appropriate procedures such as the one adopted here.
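The practical gap between the classic 2 % rate and the rates estimated above is easiest to see as a half-life, the time needed to close half of an initial income gap at a constant speed \(s\), namely \( t = \ln 2 / s \). The following is illustrative arithmetic only, not a result from the estimations:

```python
import math

def half_life(speed):
    """Years needed to close half of the initial income gap when the
    gap shrinks at a constant annual speed s: t = ln(2) / s."""
    return math.log(2) / speed

hl_classic = half_life(0.02)    # about 34.7 years at the classic 2 % rate
hl_oecd = half_life(0.1230)     # about 5.6 years at the 12.30 % OECD rate
```

A 2 % rate thus implies that roughly a third of a century is needed to halve an income gap, against well under a decade at the rates obtained once co-movements and breaks are accounted for.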

6 Concluding remarks

In line with Westerlund et al. (2010), we focus on cross-sectional dependencies and structural changes which, if ignored, can lead to biases that significantly reduce the power of convergence tests. The Monte-Carlo experiments show that a PANIC-based approach controls for the common factors and structural breaks that may be associated with the convergence process. The study period (1975–2008) is one in which sub-regional integration policies were central to economic development strategies in North and South countries alike. However, these policies have changed the structure of the economies, generally permitting them to achieve higher economic growth. Thus, with the persistence of such policies, the poorest economies tend to grow faster.

The approach used to study the convergence process goes beyond the standard practice of treating the phenomena mentioned as simple nuisance parameters. Applications are made on the AFRICA sample, composed mainly of member countries of the CFA zone, and, for comparison, on a sample of OECD countries. The results confirm the rejection of the hypothesis of convergence for the countries of sub-Saharan Africa, as do previous studies of economic convergence in these countries. Beyond this, however, an important point emerges. This work has highlighted the fact that the slow rate generally observed in convergence studies is largely due to the omission of certain shocks that affect economies by creating economic co-movements and/or structural changes with significant impacts on the convergence process. This is confirmed by the results for the OECD group, which validate the assumption of \( \beta \)-convergence with a relatively high rate of convergence (12.30 %). For a heterogeneous sample of 40 rich and poor countries drawn from both the AFRICA and OECD groups, the hypothesis of economic convergence is also accepted, with a rate slower than for the OECD countries but faster than the rates measured by existing approaches in the literature.