1 Introduction

Most common panel regression models, such as the random- and fixed-effects models, assume that the regression slopes are equal across the cross section. In fact, in some parts of the literature the equal slope assumption is so common that it is hardly ever questioned. The main reason for this is twofold (see, e.g., Baltagi et al. 2008). First, it greatly simplifies estimation and inference. Indeed, as a referee to this journal points out, the main feature of panel data is the ability to selectively pool the information regarding the assumed common slope, while allowing great heterogeneity in other parts of the model. Second, if true, exploiting the equality of the slopes leads to large gains in efficiency. The problem is that if the assumption is false, panel techniques based on equal-slope models become inconsistent, causing misleading inference (see, e.g., Hsiao 2003, Chapter 6). It is therefore crucial to test for slope homogeneity before proceeding with an analysis based on this assumption.

The classical approach to testing the equal slope assumption is to employ a simple F-test (see, e.g., Hsiao and Pesaran 2008; Baltagi et al. 2008; Pesaran and Yamagata 2008). Alternatively, one may use the Hausman test of Pesaran et al. (1996), which compares two estimators, one constructed under the assumption of homogeneous slopes and one not. A third possibility is to use the Swamy-type test of Pesaran and Yamagata (2008), which is based on the dispersion of individual slope estimates around a suitable pooled estimator. Although very popular, these approaches suffer from at least two important shortcomings.

First, all three tests assume that the regression errors are cross-sectionally independent, a restriction that is likely to be violated in practice, especially in macroeconomic and financial applications involving country-level data, where strong intereconomy linkages can be expected. Recognizing this shortcoming, Phillips and Sul (2003) propose a Hausman-type test that is appropriate in the special case in which the dependence can be represented by a single common factor. The problem is, of course, that the common factor structure need not be correct, which would then invalidate the test. Another possibility, which allows for more general forms of cross-sectional dependence, is to use the seemingly unrelated regression (SUR) approach. The advantage of this approach is that, as long as the cross-sectional dimension, N, is “small” and the time series dimension, T, is “large,” the cross-sectional dependence can be allowed to be quite general. The drawback is that unless \(T \gg N\), the small-sample performance of SUR-based tests is often very poor (see Bun 2004).

The second shortcoming relates to the formulation of the hypotheses tested, which, depending on the problem at hand, may not be very informative. In particular, while the null hypothesis can certainly be formulated as that all the slopes are equal, the alternative hypothesis that there are at least some units for which homogeneity fails is typically too broad for any interesting economic conclusions; it could be that the slopes of all units are different, but it could also be that there is only a small fraction of units for which homogeneity fails. Some studies have considered clustering methods for grouping units such that the slopes are homogeneous within each group, but heterogeneous across groups (see, e.g., Kapetanios 2006; Lin and Ng 2012). Unfortunately, these methods are quite complicated to implement, which is probably also the reason why they are almost never used in practice.

In this paper, we develop several procedures to ascertain the homogeneity of a panel. The point of departure is a quite general data-generating process that allows, for example, serial correlation of unknown form and complex cross-sectional dependencies such as dynamic common factor models. In fact, except for some mild regularity conditions, there are virtually no restrictions on the forms of serial and cross-sectional dependence that can be permitted. We do require that the regressors are strictly exogenous, but the approach can be easily extended to allow more general types of regressors. Given this generality, corrections aimed at achieving asymptotically pivotal statistics are not really an option. In this paper, we therefore consider the block bootstrap as a means to obtain tests that are asymptotically valid. Two block bootstrap-based test procedures are considered: one is appropriate when testing the above-mentioned hypothesis of slope homogeneity versus at least some heterogeneity, while the other can be used to sequentially determine the units for which the slopes are equal. Both procedures are easy to implement and work well even when N and T are similar in magnitude.

The plan of the rest of the paper is as follows. Section 2 describes the model and assumptions, which are used in Sect. 3 to study the properties of the two test procedures considered. Section 4 presents the results of a small Monte Carlo study. Section 5 concludes.

2 Model and assumptions

Let \(y_{i,t}\) be generated as

$$\begin{aligned} y_{i,t} = \theta _i + \beta ^\prime _i x_{i,t} + \varepsilon _{i,t}, \end{aligned}$$
(1)

where \(\theta _{i}\) is regarded as a unit-specific intercept or fixed effect, \(x_{i,t}\) is an \(m\times 1\) vector of regressors, and \(\beta _i\) is a conformable vector of unknown slope coefficients. The purpose of this paper is to infer the extent to which \(\beta _i\) can be regarded as equal across i.

The assumptions that we will be working under are stated below, where \(\rightarrow _p\) and \(\rightarrow _d\) signify convergence in probability and distribution, respectively, and \(C\in (0, \infty )\) is a generic constant not depending on N or T. Let \(Q_{i,T} = T^{-1} (x_i^\prime M_{\tau } x_i)\), where \(x_i = (x_{i,1},\ldots ,x_{i,T})^\prime \) and \({M}_\tau = {I}_T -\tau _T (\tau ^\prime _T \tau _T)^{-1}\tau ^\prime _T\), \({I}_T\) is the \(T\times T\) identity matrix, and \(\tau _T =(1,\ldots ,1)'\) is a \(T \times 1\) vector of ones. It is also convenient to let \(\tilde{z}_t = (\tilde{z}_{1,t}^{\prime },\ldots ,\tilde{z}_{N,t}^{\prime })^\prime \), where \(\tilde{z}_{i,t} = (x_{i,t} - \mu _i) \varepsilon _{i,t}\) and \(\mu _i = E(x_{i,t})\). The sigma-field generated by \(\{\varepsilon _{i,n}\}_{n=1}^t\) (\(\{x_{i,n}\}_{n=1}^t\)) is henceforth denoted \(\mathcal {F}_{\varepsilon i,t}\) (\(\mathcal {F}_{x i,t}\)).

Assumption REGR. \(Q_{i,T} \rightarrow _p Q_i\) as \(T \rightarrow \infty \), where \(Q_i\) is a positive definite matrix.

Assumption ERR.

  (i) \(E(\varepsilon _{i,t} x_{j,n}) = 0\) for all i, t, j, and n;

  (ii) \(E(\varepsilon _{i,t}^2 | \mathcal {F}_{\varepsilon i,t-1} \cup \mathcal {F}_{x i,t}) = E(\varepsilon _{i,t}^2) = \sigma _i^2 \le C\) for all t;

  (iii) \(\Sigma = \mathrm {var}(T^{-1/2} \sum _{t=1}^T \tilde{z}_t)\) is positive definite;

  (iv) \(\{ (x_{i,t},\varepsilon _{i,t})\}_{t=1}^T\) is an \(\alpha \)-mixing sequence with mixing coefficient \(\alpha _i(n)\), which is such that \(\sup _{i=1,\ldots ,N} \alpha _i (n) \le \alpha (n)\), where \(\alpha (n) = Cn^{-a}\) for some \(a > r/(r-2)\) and \(r > 2\);

  (v) \(E( || x_{i,t} ||^{2r+p}) \le C\) and \(E( | \varepsilon _{i,t} |^{2r+p}) \le C\) for some \(p > 0\), and all i and t.

Assumptions ERR (i) and (ii) rule out non-strictly exogenous regressors (including lagged dependent variables) and heteroskedasticity across time. As we discuss in Remark 1 of Sect. 3, these assumptions are stronger than needed. However, since the main motivation of the paper is to account for cross-sectional (and serial) correlation, we have preferred to keep Assumptions ERR (i) and (ii) as they stand (see, e.g., Kapetanios 2008; Hidalgo 2003, for similar assumptions). These assumptions are needed not for the specific tests that we propose, but for the validity of our residual-based bootstrap, which can be easily modified to accommodate both endogenous regressors and errors that are heteroskedastic across time. The reason for sticking with the residual-based bootstrap is that it makes for relatively straightforward proofs. Assumption ERR (iv) allows for heterogeneous forms of serial correlation across the cross section, but imposes a uniform bound on the mixing coefficients. Many commonly encountered stochastic processes can be accommodated under this assumption (see Davidson 1994, Section 14.4). Note also that Assumption ERR does not impose any particular cross-correlation structure. In particular, it is not necessary to know whether the dependence is weak or strong (see Chudik et al. 2011; Pesaran and Tosetti 2010). The dependence also does not have to be static but may be dynamic in nature, as when \(\varepsilon _{i,t}\) is generated by a dynamic common factor model. The moment requirements in Assumption ERR (v) rule out time trends in the regressors.

3 The bootstrap test procedures

Let us denote by q the number of units that cannot be pooled; that is, q is the number of units for which \(\beta _i \ne \beta \), where \(\beta \) is the common value of \(\beta _i\) among the poolable units. The purpose of this paper is to make inference regarding q. Let us therefore denote by \(0 = q_1 < \cdots < q_K < N\) a set of K user-defined numbers, representing the numbers of non-poolable units to be considered in the testing. Let \(H_0(q_k)\) denote the null hypothesis that \(q = q_k\), where \(k = 1, \ldots ,K\), and let \(H_1(q_{k+1})\) denote the alternative hypothesis that \(q \ge q_{k+1}\).

3.1 A pooled test for testing \(q = 0\) versus \(q \ge 1\)

In this subsection, we consider the relatively simple problem of testing \(H_0(0)\) versus \(H_1(1)\); that is, the null hypothesis of homogeneity (\(q=0\)) is tested versus the alternative that there is at least one unit that cannot be pooled (\(q \ge 1\)). The reason for considering this testing problem separately is that under the null hypothesis the slopes of all the units are equal, which makes it possible to consider pooled test statistics in the spirit of much of the previous literature (see, e.g., Hsiao and Pesaran 2008; Baltagi et al. 2008; Pesaran and Yamagata 2008). The test statistic that we consider is a bootstrap Swamy-type homogeneity test and is given by

$$\begin{aligned} S = \sum _{i = 1}^N T \big ( {\hat{\beta }}_i - {\hat{\beta }}_\mathrm{{WFE}}\big )^\prime \frac{Q_{i,T}}{\hat{\sigma }_i^2} \big ( {\hat{\beta }}_i - {\hat{\beta }}_\mathrm{{WFE}}\big ), \end{aligned}$$

where \({\hat{\beta }}_i\) is the least-squares (LS) estimator of \(\beta _i\) when applied to cross-sectional unit i, \(\hat{\sigma }^2_i = T^{-1}\sum _{t=1}^T \hat{\varepsilon }_{i,t}^2\), \(\hat{\varepsilon }_{i,t} = y_{i,t} - \hat{\theta }_i - \hat{\beta }^\prime _i x_{i,t}\), and \({\hat{\beta }}_\mathrm{{WFE}}\) is the weighted fixed-effects (WFE) estimator, as given by

$$\begin{aligned} {\hat{\beta }}_\mathrm{{WFE}} = \left( \sum _{i = 1}^{N} \frac{x_i^\prime M_{\tau } x_i}{\hat{\sigma }_i^2} \right) ^{-1} \sum _{i = 1}^{N} \frac{x_i^\prime M_{\tau } {y}_i}{\hat{\sigma }_i^2}. \end{aligned}$$

As Hsiao (2003) and Pesaran and Yamagata (2008) show, under \(H_0\) and the assumption of independently identically distributed (iid) errors, \(S \rightarrow _d \chi ^2[m(N-1)]\) as \(T \rightarrow \infty \) with N held fixed. However, under the more general assumptions laid out above, this is no longer the case. The approach opted for here is therefore based on the bootstrap.
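To make the construction concrete, S and the WFE estimator can be computed as follows. This is an illustrative sketch under our own conventions (a balanced panel stored as numpy arrays, and a hypothetical function name `swamy_S`), not code from the paper:

```python
import numpy as np

def swamy_S(y, X):
    """Swamy-type homogeneity statistic S for a balanced panel.

    y : (N, T) array of dependent variables.
    X : (N, T, m) array of regressors.
    Returns (S, beta_WFE). Follows the formulas for S and the
    weighted fixed-effects estimator in the text.
    """
    N, T, m = X.shape
    # The within-transform M_tau amounts to demeaning each unit over time
    Xd = X - X.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)

    beta_i = np.empty((N, m))
    sigma2 = np.empty(N)
    for i in range(N):
        # Unit-by-unit LS estimate and residual variance
        beta_i[i] = np.linalg.lstsq(Xd[i], yd[i], rcond=None)[0]
        resid = yd[i] - Xd[i] @ beta_i[i]
        sigma2[i] = resid @ resid / T

    # Weighted fixed-effects estimator
    A = sum(Xd[i].T @ Xd[i] / sigma2[i] for i in range(N))
    b = sum(Xd[i].T @ yd[i] / sigma2[i] for i in range(N))
    beta_wfe = np.linalg.solve(A, b)

    # S = sum_i T (b_i - b_WFE)' (Q_{i,T}/sigma_i^2) (b_i - b_WFE),
    # where T * Q_{i,T} = Xd_i' Xd_i
    S = 0.0
    for i in range(N):
        d = beta_i[i] - beta_wfe
        S += d @ (Xd[i].T @ Xd[i]) @ d / sigma2[i]
    return S, beta_wfe
```

Under homogeneous slopes and well-behaved errors, `beta_wfe` should be close to the common slope and S should be of the order of its degrees of freedom.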

Algorithm BOOT.

  1. Estimate (1) using LS for each cross-sectional unit and organize the residuals, \(\hat{\varepsilon }_{i,t}\), in a \(T \times N\) matrix \(\hat{\varepsilon }= (\hat{\varepsilon }_{1},\ldots ,\hat{\varepsilon }_{T})'\), where \(\hat{\varepsilon }_{t} = (\hat{\varepsilon }_{1,t},\ldots ,\hat{\varepsilon }_{N,t})'\).

  2. Obtain \({\hat{\beta }}_\mathrm{{WFE}}\) and \(\hat{\theta }_i = \overline{y}_i - {\hat{\beta }}_\mathrm{{WFE}}\overline{x}_i\), where \(\overline{y}_i = T^{-1}\sum _{t=1}^T y_{i,t}\) with a similar definition of \(\overline{x}_i\).

  3. Choose a block length, l. Let \(J_{t} = (\hat{{\varepsilon }}_{t},\hat{{\varepsilon }}_{t+1},\ldots ,\hat{{\varepsilon }}_{t+l-1})^\prime \) be the block of l consecutive estimated errors starting at date t, and let \(I_1, I_2,\ldots \) be a sequence of iid random variables with a discrete uniform distribution on \(\{1,\dots ,(T-l+1)\}\). The \(T\times N\) matrix of pseudo errors, \({\varepsilon }^*\), is such that the first l rows are determined by \(J_{I_1}\), the next l rows are given by \(J_{I_2}\), and so on. The procedure is stopped when T rows have been generated.

  4. Simulate pseudo-data under \(H_0\) as \(y_{i,t}^* = \hat{\theta }_i + {\hat{\beta }}^\prime _\mathrm{{WFE}} x_{i,t} + \varepsilon _{i,t}^*\).

  5. Compute the bootstrap test statistic, \(S^*\), where \(S^*\) is calculated exactly as S, but with \((y_{i}, x_i)\) replaced by \((y_{i}^*, x_i)\).

  6. Repeat steps 3–5 B times.

  7. Select the bootstrap critical value as the \((1-\alpha )\)-quantile of the ordered \(S^{*}\) statistics.
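Step 3 is the key ingredient: whole cross-sectional rows of the residual matrix are resampled together, which preserves the cross-sectional dependence, while blocks of l consecutive rows preserve the serial dependence. A minimal sketch (the helper name `block_resample` and its interface are our own):

```python
import numpy as np

def block_resample(eps_hat, l, rng):
    """Moving-block resampling of a T x N residual matrix, as in step 3
    of Algorithm BOOT. Rows (cross sections) are kept intact; blocks of
    l consecutive rows are drawn with iid uniform starting points."""
    T = eps_hat.shape[0]
    n_blocks = int(np.ceil(T / l))
    # iid uniform starting indices on {0, ..., T - l}
    starts = rng.integers(0, T - l + 1, size=n_blocks)
    rows = np.concatenate([np.arange(s, s + l) for s in starts])[:T]
    return eps_hat[rows]
```

The resulting matrix has the same dimensions as the input, and each of its rows is an actual row of the original residual matrix.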

Remark 1

The above algorithm is an example of a residual-based bootstrap and is similar to the algorithms used in Bun (2004), Kapetanios (2008), Hidalgo (2003), and Zhou and Shao (2013), to mention a few. Alternatively, we may follow, for example, Freedman (1981) and Fitzenberger (1997) and block bootstrap \((y_{i,t}, x_{i,t}^\prime )\). Hence, instead of resampling \(\hat{{\varepsilon }}_{t}\) in step 3 of Algorithm BOOT, we resample \((y_{t}', x_{t}')'\), an \(N\times (1+m)\) matrix. The bootstrap test statistic is S based on \((y_{i}^*, x_i^*)\), the bootstrapped version of \((y_{i}, x_i)\). The main advantage of this resampling scheme is that it does not require homoskedasticity and strictly exogenous regressors.

Remark 2

Algorithm BOOT requires a choice of block length, l. A common approach is to set l as a deterministic function of T. Alternatively, one may follow, for example, Gonçalves and White (2005) and Gonçalves (2011) and set l according to the data-dependent rule of Andrews (1991) or Newey and West (1994), originally proposed for the purpose of bandwidth selection in long-run variance estimation (see Fitzenberger 1997, Section 3.4, for a discussion). In Sect. 4, we use Monte Carlo simulation to evaluate the effect of various rules for selecting l.

The asymptotic validity of the bootstrap procedure requires the following assumption on the block length.

Assumption BL. \(l \rightarrow \infty \) and \(l = o(\sqrt{T})\) as \(T \rightarrow \infty \).

Theorem 1

Suppose that Assumptions REGR and ERR hold and that \(T \rightarrow \infty \) with N held fixed. Under \(H_0(0)\),

$$\begin{aligned} S \rightarrow _d \sum _{j=1}^{kN} \lambda _j U_j^2, \end{aligned}$$

where \(\lambda _j \in [0,\infty )\) and \(U_j \sim N(0,1)\) independently across j. Under \(H_1(1)\),

$$\begin{aligned} S \rightarrow \infty . \end{aligned}$$

The corresponding result for \(S^*\) is given in Theorem 2.

Theorem 2

Suppose that Assumptions REGR, ERR, and BL hold and that \(T \rightarrow \infty \) with N held fixed. Under \(H_0(0)\) and \(H_1(1)\),

$$\begin{aligned} S^* \rightarrow _{d^*} \; \sum _{j=1}^{kN} \lambda _j U_j^2 \quad \hbox {in probability}, \end{aligned}$$

where \(\rightarrow _{d^*}\) signifies convergence in distribution conditional on the realization of the sample.

Remark 3

Together Theorems 1 and 2 establish the asymptotic validity of the proposed bootstrap test procedure. There are two requirements. First, S and \(S^*\) must converge to the same asymptotic null distribution. Second, under \(H_1(1)\), while \(S^*\) should converge to the same asymptotic distribution as under \(H_0(0)\), S should diverge.

Remark 4

The asymptotic results in Theorems 1 and 2 are based on letting \(T \rightarrow \infty \), while keeping N fixed. There are several reasons for considering such a large-T and fixed-N asymptotic framework. First, if T is fixed, then the above resampling scheme will not be asymptotically valid. Also, N plays no role in the proofs of Theorems 1 and 2. As there is no resampling in the cross-sectional dimension, there is no reason to expect the performance of the bootstrap test to become more (or less) accurate when N increases. Second, it is difficult to obtain distributional results for \(N \rightarrow \infty \) without imposing additional conditions on the cross-sectional dependence structure. Third, from a practical viewpoint, a test for slope equality may be of greatest interest in applications in which N is relatively small. In such cases, the condition that \(N \rightarrow \infty \) may be difficult to justify. There are, of course, applications in which it is more appropriate to rely on large-N asymptotics, but then other test statistics are likely to be more effective (see Pesaran and Yamagata 2008, for a discussion). However, in such large-N panels T is typically too small for estimation of individual slopes, and therefore, the analysis must be made conditional on the slopes being equal.

Remark 5

In contrast to, for example, Bun (2004), we make no attempt to obtain an asymptotically pivotal test statistic. Nevertheless, it should be noted that there is evidence that bootstrapping (asymptotically) pivotal test statistics leads to asymptotic refinements (see, e.g., Davidson and MacKinnon 1999). Therefore, if such a test statistic could be obtained, it might be preferable to the approach taken here.

3.2 A sequential test procedure for determining q

The test considered in the previous section is appropriate if one wishes to infer whether there is any evidence against poolability at all. The problem is that in many cases one would like to go further than just concluding that \(q > 0\) in case of a rejection, and in this section, we therefore consider a sequential test that can be used to pinpoint q. In so doing, we will assume that \(q_k = k-1\), where \(k = 1,\ldots ,K\) and \(K=N-1\), such that the number of units to be tested decreases by one at each iteration; later on we discuss how to proceed when \(q_1,\ldots ,q_K\) are set differently.

To test whether a particular unit i has the same slope coefficient vector as a certain benchmark unit b, we may use the following Wald test statistic:

$$\begin{aligned} W_i = T \big ( {\hat{\beta }}_i - {\hat{\beta }}_b \big )^\prime \big ( \hat{\sigma }_i^2 Q_{i,T}^{-1} + \hat{\sigma }_b^2 Q_{b,T}^{-1}\big )^{-1} \big ( {\hat{\beta }}_i - {\hat{\beta }}_b \big ). \end{aligned}$$

The idea is to apply this test statistic in a sequential fashion to determine the set of units with coefficient vector \(\beta _b = \beta \). The problem is that in doing so we are likely to end up with spurious rejections due to the multitude of tests; that is, we face the problem of controlling the overall significance level of the approach. To this end, we follow Smeekes (2015), who considers a bootstrap sequential unit root test to determine the stationary units in a panel.
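For concreteness, \(W_i\) can be computed as follows. This is a sketch under our own conventions (within-demeaned data passed directly, and a hypothetical function name `wald_Wi`):

```python
import numpy as np

def wald_Wi(yd_i, Xd_i, yd_b, Xd_b):
    """Wald statistic W_i comparing the slope of unit i with that of
    benchmark unit b. Inputs are within-demeaned data: a length-T
    dependent variable and a T x m regressor matrix for each unit."""
    T, m = Xd_i.shape

    def fit(yd, Xd):
        # Unit-specific LS slope, residual variance, and Q_{i,T}
        beta = np.linalg.lstsq(Xd, yd, rcond=None)[0]
        res = yd - Xd @ beta
        return beta, res @ res / T, Xd.T @ Xd / T

    b_i, s2_i, Q_i = fit(yd_i, Xd_i)
    b_b, s2_b, Q_b = fit(yd_b, Xd_b)
    d = b_i - b_b
    # V estimates the asymptotic variance of sqrt(T)(b_i - b_b)
    V = s2_i * np.linalg.inv(Q_i) + s2_b * np.linalg.inv(Q_b)
    return T * d @ np.linalg.solve(V, d)
```

Under the null that unit i shares the benchmark's slope, and given the assumptions of Sect. 2, the statistic is a nonnegative quadratic form whose bootstrap distribution is obtained from Algorithm SEQBOOT below.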

Let \(W_{(1)}, W_{(2)}, \ldots , W_{(N-1)}\) denote the \(N-1\) order statistics of \(W_{1}, \ldots , W_{b-1},\) \(W_{b+1}, \ldots , W_{N}\) defined as \(W_{(1)} \ge \cdots \ge W_{(N-1)}\). Denote by \(\hat{q}\) the estimated number of units that cannot be pooled with the benchmark. The sequential procedure is carried out as follows.

Algorithm SEQ.

  1. Set \(k = 0\).

  2. Test \(H_0(k)\) against \(H_1(k+1)\) using \(W_{(k+1)}\) as a test statistic. Reject \(H_0(k)\) if \(W_{(k+1)} > c_{\alpha }(W_{(k+1)})\), where \(c_{\alpha }(W_{(k+1)})\) is the appropriate critical value at significance level \(\alpha \).

  3. If \(H_0(k)\) is not rejected, set \(\hat{q} = k\), whereas if \(H_0(k)\) is rejected, set \(k = k+1\) and go back to step 2.

  4. Perform steps 2 and 3 until \(H_0(k)\) cannot be rejected anymore, and set \(\hat{q} = k\). If all null hypotheses up to \(H_0(N-1)\) are rejected, set \(\hat{q} = N\).
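The stopping rule above can be sketched as a short function; this is a skeleton only (the helper name and the assumption that the descending order statistics and their bootstrap critical values are precomputed are ours):

```python
def sequential_q(W_sorted, crit):
    """Algorithm SEQ skeleton.

    W_sorted : descending order statistics W_(1) >= ... >= W_(N-1).
    crit     : crit[k] is the bootstrap critical value for testing H0(k).
    Returns q-hat, the estimated number of non-poolable units.
    """
    for k in range(len(W_sorted)):
        if W_sorted[k] <= crit[k]:
            return k          # H0(k) not rejected: q-hat = k
    return len(W_sorted) + 1  # all N-1 nulls rejected: q-hat = N
```

For example, with \(N = 4\), statistics (10, 5, 2), and a common critical value of 4, the procedure rejects \(H_0(0)\) and \(H_0(1)\) but not \(H_0(2)\), so \(\hat{q} = 2\).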

We now focus on how to obtain appropriate critical values, \(c_{\alpha }(W_{(k+1)})\). Let \(\mathcal {D}_k = \{i : W_i \ge W_{(k)} \}\) denote the set of units for which \(W_i\) is larger than the kth-order statistic. The complement of \(\mathcal {D}_k\) is henceforth denoted \(\mathcal {D}_k^c\). Let us also denote by \(a_{(1:\mathcal {F})}\) the largest element of the set \(\{ a_i : i \in \mathcal {F} \}\). The following bootstrap algorithm will be used to obtain \(c_{\alpha }(W_{(k+1)})\).

Algorithm SEQBOOT.

  1. For each unit, estimate (1) by LS and organize the residuals in a \(T \times N\) matrix \(\hat{{\varepsilon }}\).

  2. Choose a block length, l, and obtain the matrix of bootstrap errors, \({{\varepsilon }}^*\), as described in step 3 of Algorithm BOOT.

  3. Simulate pseudo-data under \(H_0(k)\) as \(y_{i,t}^* = \hat{\theta }_i + {\hat{\beta }}_b^\prime x_{i,t} + \varepsilon _{i,t}^*\).

  4. Obtain \(W_i^*\) by applying \(W_i\) to \((y_{i,t}^*, x_{i,t})\) for all \(i \in \mathcal {D}_k^c\) and obtain the bootstrap test statistic as \(W_{(k+1)}^* = W_{(1:\mathcal {D}_k^c)}^*\).

  5. Repeat steps 2–4 B times.

  6. Select the bootstrap critical value \(c^*_{\alpha }(k+1)\) as the \((1-\alpha )\)-quantile of the ordered \(W_{(k+1)}^{*}\) statistics.

Remark 6

Note that the sequential procedure will not only estimate the number of poolable units, \(\hat{q}\), but will also identify those units. The set of units that are poolable (non-poolable) equals \(\mathcal {D}_{\hat{q}}^c\) (\(\mathcal {D}_{\hat{q}}\)).

In this study, the benchmark is a single unit. This is not necessary. In fact, as pointed out by Kapetanios (2003), \({\hat{\beta }}_b\) may be based on a pooled estimator. The problem with such an approach is that the pooled estimator may not make sense under the alternative of different slopes. Suppose, for example, that there are two groups of units, whose slopes are equal within each group but heterogeneous across groups. In this case, the pooled estimate will likely lie between the true coefficients, and therefore, the sequential procedure is likely to find evidence against poolability for all units. The use of a benchmark unit overcomes this problem.

The question is how to choose the benchmark unit. In some applications, there is a natural candidate, as when evaluating policy for a particular unit or when there is a “dominant” unit (see Pesaran and Chudik 2013). In other applications, the choice of benchmark may be less obvious. However, in many cases the researcher will have some a priori information as to the units that are most likely to be poolable, and in such circumstances, the benchmark unit may be picked at random from that set.

Denote by \(\mathcal {Z} = \{ i : \beta _i = \beta _b, i \ne b \}\) the set of units for which the null hypothesis \(\beta _i = \beta _b\) is true. In addition, let \(G_h\) denote the asymptotic distribution (as \(T \rightarrow \infty \)) of \(W_h\) for \(h \in \mathcal {Z}\). Lemma 1 establishes the asymptotic distributions of the relevant test statistics when testing \(H_0(k)\) against \(H_1(k+1)\).

Lemma 1

Suppose \(H_0(k)\) is tested against \(H_1(k+1)\) using Algorithms SEQ and SEQBOOT. Under Assumptions REGR and ERR, as \(T \rightarrow \infty \) with N held fixed

$$\begin{aligned}&W_{(k+1)} \rightarrow \infty \quad \hbox {if}\, k+1 \le q, \\&W_{(k+1)} \rightarrow _d G_{(1:\mathcal {Z})} \quad \hbox {if}\,k+1 > q, \\&W_{i}^* \rightarrow _{d^*} G_i \quad \hbox {for any}\, q\, \hbox {and all} \,i \in \mathcal {D}_k^{c}, \\&W_{(k+1)}^* \rightarrow _{d^*} G_{\big (1:\mathcal {D}_k^{c}\big )} \quad \hbox {for any}\, k\, \hbox {and} \,q. \end{aligned}$$

Lemma 1 states the asymptotic validity of the bootstrap approach. Here, \(G_{(1:\mathcal {Z})}\) represents the asymptotic distribution of \(W_{(1:\mathcal {Z})}\). Under Assumptions REGR and ERR, \(G_{(1:\mathcal {Z})}\) is unknown and an analytical approach to the sequential test is not feasible. Fortunately, the desired distribution can be obtained via bootstrapping.

Theorem 3

Under the conditions of Lemma 1,

$$\begin{aligned} \lim _{T \rightarrow \infty } P(\hat{q} = k)= & {} 0 \quad \text{ if } k+1 \le q, \\ \lim _{T \rightarrow \infty } P(\hat{q} = k)= & {} 1 - \alpha \quad \text{ if } k = q, \\ \lim _{T \rightarrow \infty } P(\hat{q} = k)\le & {} \alpha \quad \text{ if } k - 1 \ge q. \end{aligned}$$

Remark 7

The first part of Theorem 3 shows that asymptotically the probability of underestimating q is zero; that is, the probability that non-poolable units go undetected tends to zero as \(T \rightarrow \infty \). Also, as is evident from the third part, since the family-wise error rate (FWE) is at most \(\alpha \), the procedure is able to control the overall significance level.

While asymptotically irrelevant, the number of hypotheses tested and the true number of poolable units affect the finite-sample performance of the sequential test. In particular, for the overall procedure to have satisfactory power when the number of non-poolable units is large, the power at each step of the procedure must be close to one. If this is not the case, and if the number of tests is large, the probability of correctly labeling all non-poolable units is likely to be quite low. Consequently, the sequential procedure is mainly suited for panels in which N is relatively small. In Sect. 4, we elaborate on this issue.

As a partial solution to the problem of low power, we may follow the suggestion of Smeekes (2015) and apply the sequential procedure to another set of pre-specified null hypotheses. More specifically, instead of testing each unit against the benchmark, we may skip some units. That is, instead of considering \(q_k = k-1\) for \(k = 1,\ldots ,N-1\), we may consider another set of numbers \(q_1,\ldots ,q_K\), where \(q_k\) is not necessarily equal to \(k-1\). In this case, in the first iteration of Algorithm SEQ, \(H_0(q_1)\) is tested against \(H_1(q_2)\). If \(H_0(q_1)\) is rejected, in the second iteration \(H_0(q_2)\) is tested against \(H_1(q_3)\), and so on. By taking \(q_{k+1}-q_{k} > 1\), we reduce the number of tests conducted, which is likely to lead to an increase in the overall power of the testing procedure. There is, however, one major drawback of this approach. Since not every unit is tested against the benchmark, unless \(q \in \{q_k: k = 1,\ldots ,K\}\), \(\hat{q} = q_k\) cannot be interpreted as the number of non-poolable units, nor can \(\mathcal {D}_{\hat{q}}^c\) be interpreted as the set of poolable units. Instead, the finding that \(\hat{q} = q_k\) should be interpreted as indicating that \(q \in [q_{k-1}, q_{k+1}]\). The effect of skipping units is illustrated in the next section.

4 Monte Carlo simulations

In this section, we investigate the small-sample properties of the above tests using Monte Carlo simulations. We begin by considering \(S^*\). In this case, we focus on the performance of the test across different specifications of the dependence structure of the errors and across different block-length selection rules. We then proceed to evaluate the sequential bootstrap approach. Here, we will focus on the ability of the test to correctly identify the set of poolable units and to control the FWE.

4.1 Simulation design

The following data-generating process will be used to analyze the performance of \(S^*\):

$$\begin{aligned} y_{i,t}= & {} \theta _i + \beta _i x_{i,t} + \varepsilon _{i,t},\\ x_{i,t}= & {} \lambda _{x} f_{x,t} + v_{i,t},\\ f_{x,t}= & {} \rho _{x} f_{x,t-1} + e_{x,t},\\ v_{i,t}= & {} \rho _{x} v_{i,t-1} + u_{i,t}, \end{aligned}$$

where \(\theta _i \sim N(1,1)\), \(\lambda _{x} = \sqrt{0.5}\), \(\rho _{x}=0.5\), \(e_{x,t} \sim N(0,1-\rho _x^2)\), and \(u_{i,t} \sim N(0,(1-\rho _x^2)(1-\lambda _{x}^2))\). Three specifications for \(\varepsilon _{i,t}\) are considered.

  E1. \(\varepsilon _{i,t} \sim N(0,1)\).

  E2. \(\varepsilon _{t} = (\varepsilon _{1,t},\ldots ,\varepsilon _{N,t})^\prime \) is generated according to the following spatial first-order autoregressive model:

    $$\begin{aligned} \varepsilon _{t} = 0.5 W_N \varepsilon _{t} + e_{t}, \end{aligned}$$

    where \(e_t \sim N(0, I_N)\). The spatial weight matrix \(W_N\) is constructed as a first-order contiguity matrix in which each unit, except for the first and last, has one left and one right neighbor (see Anselin et al. 2008). To make the simulation results invariant to the variance of the error terms, \(\varepsilon _{i,t}\) is scaled by the square root of the ith diagonal element of the covariance matrix of \(\varepsilon _t\).

  E3. \(\varepsilon _{i,t}\) is generated according to the following dynamic common factor model:

    $$\begin{aligned} \varepsilon _{i,t}= & {} \lambda _{\varepsilon } f_{\varepsilon , t} + \eta _{i,t},\\ f_{\varepsilon , t}= & {} \rho _{\varepsilon } f_{\varepsilon , t-1} + e_{\varepsilon , t}, \end{aligned}$$

    where \(\lambda _{\varepsilon } = \sqrt{0.5}\), \(\eta _{i,t} \sim N(0,1-\lambda _{\varepsilon }^2)\), and \(e_{\varepsilon , t} \sim N(0,1-\rho _{\varepsilon }^2)\). When \(\rho _{\varepsilon } = 0\), \(\varepsilon _{i,t}\) is serially independent, whereas when \(\rho _{\varepsilon } \ne 0\), this is no longer the case.

Under the null, \(\beta _1 = \cdots = \beta _N = 1\), while under the alternative, \(\beta _{i} = 1\) for \(i = 1,\ldots ,\lfloor N/2\rfloor \) and \(\beta _{i} \sim N(1,0.04)\) for \(i = \lfloor N/2\rfloor + 1,\ldots ,N\), with \(\lfloor x \rfloor \) denoting the integer part of x. All tests are conducted at the 5% nominal level, and we take \(N \in \{5, 25, 50\}\) and \(T \in \{25, 50, 100\}\).
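The data-generating process above is straightforward to simulate. The sketch below covers the case with E3-type errors; the function name, the interface, the placement of the non-poolable units first in the cross section, and the use of a burn-in of 50 observations (as in the text) are our own choices:

```python
import numpy as np

def simulate_panel(N, T, q=0, rho_eps=0.3, burn=50, rng=None):
    """Simulate the Monte Carlo DGP with factor-driven regressors and
    E3 (dynamic common factor) errors; parameter values follow the text."""
    if rng is None:
        rng = np.random.default_rng()
    lam, rho = np.sqrt(0.5), 0.5
    Ttot = T + burn

    # Regressors: x_{i,t} = lam * f_{x,t} + v_{i,t}, both parts AR(1)
    f_x = np.zeros(Ttot)
    e_x = rng.normal(0, np.sqrt(1 - rho**2), Ttot)
    for t in range(1, Ttot):
        f_x[t] = rho * f_x[t - 1] + e_x[t]
    v = np.zeros((N, Ttot))
    u = rng.normal(0, np.sqrt((1 - rho**2) * (1 - lam**2)), (N, Ttot))
    for t in range(1, Ttot):
        v[:, t] = rho * v[:, t - 1] + u[:, t]
    x = lam * f_x + v

    # E3 errors: dynamic common factor plus idiosyncratic noise
    f_e = np.zeros(Ttot)
    e_f = rng.normal(0, np.sqrt(1 - rho_eps**2), Ttot)
    for t in range(1, Ttot):
        f_e[t] = rho_eps * f_e[t - 1] + e_f[t]
    eps = lam * f_e + rng.normal(0, np.sqrt(1 - lam**2), (N, Ttot))

    # Slopes: q non-poolable units (placed first) get U(0, 0.5) slopes
    theta = rng.normal(1, 1, N)
    beta = np.ones(N)
    if q > 0:
        beta[:q] = rng.uniform(0, 0.5, q)

    y = theta[:, None] + beta[:, None] * x + eps
    return y[:, burn:], x[:, burn:], beta  # discard the burn-in
```

The returned arrays have shape (N, T), with the first burn observations discarded to reduce the initial value effect.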

In case of the sequential testing procedure, we consider \(N \in \{10,20\}\). If \(N=10\), then \(q \in \{0, 1, 3, 6\}\), whereas if \(N = 20\), then \(q \in \{0, 1, 3, 6, 12\}\). In both cases, the benchmark unit is taken to be \(b=N\). The errors are generated as in E3 with \(\rho _{\varepsilon } = 0.3\). For all \(i \in \mathcal {Z}\), \(\beta _i = \beta _b = 1\), and for \(i \notin \mathcal {Z}\), \(\beta _i \sim U(0, 0.5)\). The significance level is taken to be 5%. To evaluate the ability of the sequential procedure to control the overall significance level, we calculate the empirical FWE as the proportion of tests with at least one false rejection. As a measure of power, we report the proportion of tests in which the estimated set of poolable units, \(\mathcal {D}_{\hat{q}}^c\), equals the true set, \(\mathcal {Z}\), henceforth referred to as “CP.” We also consider skipping units. In this case, we have to decide on a sequence of numbers \(q_1,\ldots ,q_K\) to be tested. In this section, \(q_{k+1} = q_{k} + \delta \), where \(\delta \in \{3,6\}\).

In a final set of experiments, we compare our sequential bootstrap test with two alternative methods frequently encountered in the literature on multiple testing, the Bonferroni and Holm procedures. Denote by \(\hat{p}_i\) the p value of \(W_i\). In the Bonferroni procedure, \(H_i : \beta _i = \beta _b\) is rejected if \(\hat{p}_i \le \alpha /(N-1)\). The Holm procedure consists of the following steps.

Holm algorithm.

  1. Set \(k = 1\).

  2. Let \(\hat{p}_{(1)} \le \cdots \le \hat{p}_{(N-1)}\) denote the ordered p values and \(H_{(1)}, \ldots , H_{(N-1)}\) the associated null hypotheses. If \(\hat{p}_{(k)} \ge \alpha /(N-k)\), accept \(H_{(k)}, H_{(k+1)}, \ldots , H_{(N-1)}\) and stop. If \(\hat{p}_{(k)} < \alpha /(N-k)\), proceed to step 3.

  3. Reject \(H_{(k)}\), set \(k = k + 1\), and go to step 2.

Compared with the Bonferroni procedure, the Holm criterion for rejecting \(H_{(k)}\) becomes increasingly less strict at larger p values. The Holm procedure is therefore expected to be more powerful.
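The Holm step-down rule can be sketched in a few lines. The helper name `holm_reject` is ours; with K = N − 1 hypotheses, the threshold \(\alpha /(K-k+1)\) at step k coincides with the \(\alpha /(N-k)\) of the algorithm above:

```python
def holm_reject(pvals, alpha=0.05):
    """Holm step-down procedure. Returns a boolean list, True where the
    corresponding null hypothesis is rejected at FWE level alpha."""
    K = len(pvals)
    order = sorted(range(K), key=lambda i: pvals[i])  # ascending p values
    reject = [False] * K
    for step, i in enumerate(order):
        if pvals[i] < alpha / (K - step):
            reject[i] = True
        else:
            break  # accept this and all hypotheses with larger p values
    return reject
```

For instance, with p values (0.001, 0.04, 0.03, 0.5) and \(\alpha = 0.05\), the smallest p value is compared with 0.05/4 = 0.0125 and rejected, while the next (0.03) exceeds 0.05/3 and the procedure stops.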

All results are based on 1,000 replications and 499 bootstrap draws. The first 50 time series observations are discarded to reduce the initial value effect.

4.2 Block-length selection rules

An important consideration in practice is the block length, l. In this section, we consider three rules for selecting l.
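To fix ideas about the role of l, recall that in block-bootstrap schemes of the moving-block type, overlapping blocks of l consecutive observations are drawn with replacement and concatenated to form a resample. The following is a minimal sketch of one such resampling step; the function name is ours.

```python
import numpy as np

def moving_block_resample(x, l, rng=None):
    """Draw one moving-block bootstrap resample of the series x.

    Overlapping blocks of length l are drawn with replacement and
    concatenated; the tail is truncated so the resample has the same
    length T as the original series.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x)
    T = len(x)
    n_blocks = int(np.ceil(T / l))
    # Each block starts at a uniformly drawn point in {0, ..., T - l}.
    starts = rng.integers(0, T - l + 1, size=n_blocks)
    resample = np.concatenate([x[s:s + l] for s in starts])
    return resample[:T]
```

Larger values of l preserve more of the serial dependence within each block, which is precisely why the choice of l matters for dependent data.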

In the first rule, \(l = \lfloor 2 T^{1/3}\rfloor \), which amounts to block lengths of 6, 7, and 9 for sample sizes of 25, 50, and 100, respectively. These block lengths are within the range usually encountered in the literature. Although simple, setting l as a function of T means ignoring the covariance structure of the data (see Hall et al. 1995). A data-driven rule is therefore often preferable.

The second rule amounts to setting l according to the Newey and West (1994) automatic procedure for bandwidth selection. This approach not only accounts for the serial correlation of the data, but is also relatively easy to implement and has good small-sample properties (see Gonçalves and White 2005; Gonçalves 2011).
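A minimal sketch of this plug-in rule for the Bartlett kernel, following Newey and West (1994), might look as follows. The function name is ours, and the constants are the standard Bartlett-kernel ones; this is an illustration of the rule, not the exact implementation used in the simulations.

```python
import numpy as np

def newey_west_lag(v):
    """Automatic lag (bandwidth) selection of Newey and West (1994)
    for the Bartlett kernel, applied to a series v."""
    v = np.asarray(v, dtype=float)
    v = v - v.mean()
    T = len(v)
    # Preliminary truncation lag n = O(T^{2/9}), as recommended for
    # the Bartlett kernel.
    n = int(4 * (T / 100.0) ** (2.0 / 9.0))
    # Sample autocovariances sigma_0, ..., sigma_n.
    sigma = np.array([v[j:] @ v[:T - j] / T for j in range(n + 1)])
    s0 = sigma[0] + 2.0 * sigma[1:].sum()
    s1 = 2.0 * np.arange(1, n + 1) @ sigma[1:]
    # Plug-in bandwidth: gamma * T^{1/3}, truncated to an integer.
    gamma = 1.1447 * ((s1 / s0) ** 2) ** (1.0 / 3.0)
    return max(1, int(gamma * T ** (1.0 / 3.0)))
```

Because the selected lag grows with the estimated serial correlation, more persistent errors lead to longer blocks, which is the property the deterministic rule lacks.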

The Newey and West (1994) approach is designed for variance estimation; however, it is more appealing to consider a block-length selection rule designed for distribution estimation. The proposals of Hall et al. (1995) and Lahiri et al. (2007) are examples of nonparametric resampling methods that can be used for this purpose. In particular, while Hall et al. (1995) propose a rule based on subsampling, the rule of Lahiri et al. (2007) is based on the jackknife-after-bootstrap (JAB) method of Lahiri (2002). In this paper, we will employ the latter method because it is computationally more efficient and has been found to perform relatively well in simulations (see Lahiri et al. 2007).

4.3 Monte Carlo results

The results of the Monte Carlo simulations are reported in Tables 1, 2, 3, and 4. Table 1 reports the results for S and \(S^*\), whereas Tables 2, 3, and 4 report the results for the sequential procedure.

Table 1 Size and power of S and \(S^*\)
Table 2 FWE and CP of the sequential test
Table 3 FWE and CP of the sequential test when skipping units
Table 4 FWE and CP of the bootstrap, Bonferroni and Holm procedures

Table 1 provides the empirical size and power of S (the asymptotic test based on iid innovations) in the absence of serial correlation. Looking first at the results for E1, we see that S tends to over-reject when \(N \ge T\), which is in line with earlier findings in the literature (see, e.g., Pesaran and Yamagata 2008). By contrast, \(S^*\) tends to perform quite well for all combinations of N and T. Regarding the block-length selection rules, we see that the three rules considered lead to very similar performance. The deterministic rule is slightly oversized; however, the distortions vanish as T increases. When it comes to the behavior under the alternative hypothesis, all tests seem to have satisfactory power properties, with power rising in both N and T.

While serial correlation makes S oversized, cross-sectional dependence alone (\(\rho _{\varepsilon } = 0\)) makes it undersized. The bootstrap tests, on the other hand, tend to perform well in all cases considered, with good size accuracy and power. As expected, the tests based on the data-driven block-length selection rules generally perform best, especially when T is small. Looking next at the results for the case when \(\rho _{\varepsilon } \ne 0\), we see that the size of S is increasing in \(\rho _{\varepsilon }\). The bootstrap tests, on the other hand, continue to perform quite well when \(\rho _{\varepsilon } = 0.3\), although there is a slight tendency to reject too often when T is relatively small. The distortions are made worse by increasing \(\rho _{\varepsilon }\) to 0.6, in which case the largest distortions are obtained by using the JAB rule. In other words, the use of a data-driven block length does not automatically lead to better size accuracy. Specifically, although the block lengths selected by the data-driven rules are increasing in \(\rho _{\varepsilon }\), they do not increase sufficiently, which in turn leads to size distortions. However, while this means that the best performance is sometimes obtained by using the deterministic rule, the difference is not very large, with the data-driven rules leading to acceptable performance in most cases considered.

To summarize the results so far, we find that S displays substantial size distortions in the presence of cross-sectional and/or serial dependence. The bootstrap tests, on the other hand, generally show small size distortions and maintain satisfactory power in small samples. In particular, the bootstrap tests are robust to the specification of the cross-sectional dependence and also perform reasonably well in the presence of error serial correlation. Since the bootstrap tests are also very simple to implement, they should be well suited for applied work.

We now turn our attention to the results for the sequential testing procedure. Here we focus on the results based on the Newey and West (1994) rule, which are shown in Tables 2, 3, and 4. Looking first at FWE in Table 2, it is evident that the sequential test is able to control the overall significance level very well in small samples. This is true for all values of q. We also see that the ability to identify the true set of poolable units (as measured by CP) depends greatly on q when T is small. However, when T increases, CP approaches 95%, as predicted by our theoretical results.

Table 3 provides the results of the sequential test when skipping units. The first thing to note is that, when T is small, CP is much higher in Table 3 than in Table 2. This result suggests that, by skipping units, we may increase the ability of the sequential test to correctly identify the true set of poolable units. On the other hand, Table 3 reveals that the increase in power comes at the expense of higher FWE, at least for the smaller values of T. This trade-off between size and power also becomes evident when comparing the results for \(\delta = 3,\,6\).

Table 4 compares the bootstrap sequential test with the Bonferroni and Holm procedures. In this case, the errors are generated as in E1. Looking first at FWE, as expected, we see that all procedures are quite successful in controlling the overall significance level. We also see that the ability to identify the true set of poolable units, as measured by CP, differs, sometimes considerably, between the procedures. For example, looking at the results for the case when \(q = 12\) and \(T = 50\), we see that the bootstrap CP is almost twice that of the Bonferroni approach. In general, the bootstrap and Holm procedures are able to identify more non-poolable units than the Bonferroni approach can, and this advantage becomes more important as q increases, which is in accordance with our expectations. We also see that the bootstrap approach outperforms the Holm procedure in terms of CP (when \(q > 0\)) in all but three cases.

In conclusion, the bootstrap sequential test can control FWE in small samples. On the other hand, the ability to find the true set of poolable units depends largely on the sample size and the true number of poolable units. When the number of poolable units is small, and T is small, the sequential procedure will inevitably end up rejecting too few non-poolable units. We show, however, that this problem can be somewhat alleviated by skipping units. Finally, the simulation results indicate that even in the iid case, in which standard multiple testing procedures apply, the bootstrap approach offers improvements in the ability to identify non-poolable units.

5 Conclusions

In this paper, we address the issue of testing for slope homogeneity in panel data models with non-spherical errors. Previous slope homogeneity tests require modeling and estimation of the error dependence structure in order to be practical. This may be problematic in the presence of complicated dependencies of possibly unknown form. Therefore, in this paper we propose a bootstrap test that can be implemented with no specification or estimation of the dependence structure. Yet, the test can accommodate both cross-sectional and serial dependence in the error process. The results of a Monte Carlo study indicate that the bootstrap test has small size distortions and satisfactory power in small samples.