1 Introduction

In a series of papers, Durbin and Watson (1950, 1951, 1971) developed a celebrated test to detect serial correlation of order one. The corresponding Durbin–Watson (DW) statistic was proposed by Sargan and Bhargava (1983) to test the null hypothesis of a random walk,

$$\displaystyle{y_{t} =\rho y_{t-1} +\varepsilon _{t}\,,\ t = 1,\ldots,T\,,\quad H_{0}:\,\rho = 1\,.}$$

Bhargava (1986) established that the DW statistic for a random walk is uniformly most powerful against the alternative of a stationary AR(1) process. The local power of DW was investigated by Hisamatsu and Maekawa (1994) following the technique of White (1958). Hisamatsu and Maekawa (1994) worked under the following assumptions: (1) a model without intercept as above, (2) a zero (or at least negligible) starting value \(y_{0}\), (3) serially independent innovations \(\{\varepsilon _{t}\},\) and (4) homoskedastic innovations. Nabeya and Tanaka (1988, 1990a) and Tanaka (1990, 1996) introduced the so-called Fredholm approach to econometrics. Using this approach, Nabeya and Tanaka (1990b) investigated the local power of DW under a more realistic setup. They allowed for an intercept and also a linear trend in the model and for errors displaying serial correlation and heteroskedasticity of a certain degree. Here, we go one step further and relax the zero starting value assumption. To this end we adopt the Fredholm approach as well.

In particular, we obtain the limiting characteristic function of the DW statistic for near-integrated processes driven by serially correlated and heteroskedastic errors, with the primary focus on revealing the effect of a “large initial condition” growing with \(\sqrt{T}\), where T is the sample size. This starting value assumption is a central theme of Müller and Elliott (2003); see also Harvey et al. (2009) for a recent discussion.

The rest of the paper is organized as follows. Section 2 makes the notation and assumptions precise. The underlying Fredholm approach is presented and discussed in Sect. 3. Section 4 contains the limiting results. Section 5 illustrates the power function of DW. A summary concludes the paper. Proofs are relegated to the Appendix.

2 Notation and Assumptions

Before stating our assumptions we fix some standard notation. Let \(\mathbb{I}(\cdot )\) denote the usual indicator function, while \(I_{T}\) stands for the identity matrix of size T. All integrals are from 0 to 1 if not indicated otherwise, and \(w(\cdot )\) denotes a Wiener process or standard Brownian motion.

We assume that the time series observations \(\{y_{t}\}\) are generated from

$$\displaystyle{ y_{t} = x_{t} +\eta _{t}\,,\quad \eta _{t} =\rho \,\eta _{t-1} + u_{t}\,,\quad t = 1,\ldots,T, }$$
(1)

where \(x_{t}\) is the deterministic component of \(y_{t}\), which we restrict to be a constant or a linear time trend. We maintain that the following conventional assumption governs the behavior of the stochastic process \(\{u_{t}\}\).

Assumption 1

The sequence \(\{u_{t}\}\) is generated by

$$\displaystyle{ u_{t} =\sum _{ j=0}^{\infty }\alpha _{ j}\varepsilon _{t-j}\quad \mbox{ with }\alpha _{0} = 1\,,\quad \sum _{j=0}^{\infty }\vert \alpha _{ j}\vert < \infty \,,\quad a:=\sum _{ j=0}^{\infty }\alpha _{ j}\neq 0\,, }$$

while \(\{\varepsilon _{t}\}\) is a sequence of martingale differences with

$$\displaystyle{ \mbox{ plim}_{T\rightarrow \infty }\frac{1} {T}\sum _{t=1}^{T}E\left (\varepsilon _{ t}^{2}\vert F_{ t-1}\right ) =\sigma ^{2}\,,\text{ }\mbox{ plim}_{ T\rightarrow \infty }\frac{1} {T}\sum _{t=1}^{T}E\left (\varepsilon _{ t}^{2}\mathbb{I}\left (\left \vert \varepsilon _{ t}\right \vert > \sqrt{T}\gamma \right )\vert F_{t-1}\right ) = 0\text{,} }$$

for any γ > 0, where \(0 <\sigma ^{2} < \infty \) and \(F_{t}\) is the σ-algebra generated by the \(\varepsilon _{s},\ s \leq t\). Also we let \(\sigma _{u}^{2}:=\lim _{T\rightarrow \infty }\frac{1} {T}\sum _{t=1}^{T}E\left (u_{t}^{2}\right )\) and \(\omega _{u}^{2}:=\lim _{T\rightarrow \infty }T^{-1}E\left (\sum \nolimits _{t=1}^{T}u_{t}\right )^{2} =\sigma ^{2}a^{2}\).
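To fix ideas, consider as a simple illustration the homoskedastic MA(1) case \(u_{t} =\varepsilon _{t} +\alpha _{1}\varepsilon _{t-1}\); then

$$\displaystyle{ \sigma _{u}^{2} =\sigma ^{2}\left (1 +\alpha _{ 1}^{2}\right )\,,\quad \omega _{u}^{2} =\sigma ^{2}\left (1 +\alpha _{ 1}\right )^{2}\,, }$$

so the two quantities differ whenever \(\alpha _{1}\neq 0\), and the requirement \(a\neq 0\) excludes the noninvertible case \(\alpha _{1} = -1\), for which the long-run variance \(\omega _{u}^{2}\) would vanish.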

Assumption 2

In model (1) we allow ρ to depend on the sample size and set \(\rho _{T} = 1 - \frac{c} {T}\) with c > 0, where the null distribution is covered as a limiting case (\(c \rightarrow 0\)).

Assumption 3

For the starting value \(\eta _{0} =\xi\) we assume: a) \(\xi = o_{p}(\sqrt{T})\) (“small starting value”), where \(\xi\) may be random or deterministic; b) \(\xi =\delta \sqrt{\omega _{u }^{2 }/\left (1 -\rho _{ T }^{2 } \right )}\), where \(\delta \sim N\left [\mu _{\delta }\mathbb{I}\left (\sigma _{\delta }^{2} = 0\right ),\sigma _{\delta }^{2}\right ]\) is independent of \(\{u_{t}\}\) (“large starting value”).

Assumption 1 allows for heteroskedasticity of \(\{u_{t}\}\). If we assume homoskedasticity, \(E\left (\varepsilon _{t}^{2}\vert F_{t-1}\right ) =\sigma ^{2}\), then \(\{u_{t}\}\) is stationary. Under Assumption 1 an invariance principle is guaranteed (see, for example, Phillips and Solo 1992). By Assumption 2, the process \(\{\eta _{t}\}\) is near-integrated as defined by Phillips (1987). The initial condition under Assumption 3a) is negligible as \(T \rightarrow \infty \). The effect of the initial condition under Assumption 3b) is not negligible, and the specification of \(\delta\) compactly covers both the random and the fixed case, depending on the value of \(\sigma _{\delta }\).
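To illustrate the setup, the following minimal Python sketch (ours, not part of the original exposition) simulates model (1) with a constant, i.i.d. innovations (so that \(\omega _{u}^{2} =\sigma ^{2}\)), a local-to-unity coefficient as in Assumption 2, and a large initial condition as in Assumption 3b); the function name and parameter choices are illustrative assumptions.

```python
import numpy as np

def simulate_model_1(T=500, c=5.0, mu_delta=0.0, sigma_delta=1.0,
                     sigma=1.0, const=1.0, seed=0):
    """Simulate y_t = x_t + eta_t, eta_t = rho_T * eta_{t-1} + u_t with
    rho_T = 1 - c/T, x_t = const, u_t = eps_t i.i.d. N(0, sigma^2), and a
    'large' starting value eta_0 = delta * sqrt(omega_u^2 / (1 - rho_T^2)),
    where delta equals mu_delta if sigma_delta = 0 and is N(0, sigma_delta^2)
    otherwise (Assumption 3b))."""
    rng = np.random.default_rng(seed)
    rho = 1.0 - c / T
    omega_u2 = sigma ** 2                      # long-run variance of u_t here
    delta = mu_delta if sigma_delta == 0.0 else rng.normal(0.0, sigma_delta)
    eta = delta * np.sqrt(omega_u2 / (1.0 - rho ** 2))
    y = np.empty(T)
    for t in range(T):
        eta = rho * eta + rng.normal(0.0, sigma)   # eta_t
        y[t] = const + eta                         # y_t = x_t + eta_t
    return y

y = simulate_model_1()
print(y[:5])
```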

We distinguish the model with demeaning from that with detrending, using μ and τ for the corresponding cases. The test statistics \(\mathit{DW}_{j,T}\) (\(j =\mu,\tau\)) are given by

$$\displaystyle{ \mathit{DW }_{j,T} = \frac{T} {\hat{\overline{\omega }}^{2}} \frac{\sum _{t=2}^{T}\left (\hat{\eta }_{t}^{j} -\hat{\eta }_{t-1}^{j}\right )^{2}} {\sum _{t=1}^{T}\left (\hat{\eta }_{t}^{j}\right )^{2}} \,, }$$
(2)

where \(\hat{\eta }_{t}^{j}\) are the OLS residuals calculated from (1) and \(\hat{\overline{\omega }}^{2}\) is a consistent estimator of \(\sigma _{u}^{2}/\omega _{u}^{2}\) (see Hamilton 1994, Sect. 10.5, for further discussion).
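As an illustration of (2) in the demeaned case, the sketch below (ours) computes \(\mathit{DW}_{\mu,T}\) from a sample path; the Bartlett-kernel long-run variance estimator applied to the residual first differences is one possible way to obtain a consistent \(\hat{\overline{\omega }}^{2}\) in the spirit of Hamilton (1994, Sect. 10.5), and the bandwidth rule is merely an assumption made for the example.

```python
import numpy as np

def dw_mu(y, bandwidth=None):
    """DW_{mu,T} of Eq. (2): (T / omega_bar^2) * sum(diff(resid)^2) / sum(resid^2),
    where resid are OLS-demeaned observations and omega_bar^2 estimates
    sigma_u^2 / omega_u^2 from the residual first differences."""
    y = np.asarray(y, dtype=float)
    T = y.shape[0]
    resid = y - y.mean()                 # OLS residuals hat{eta}_t^mu
    d = np.diff(resid)                   # behaves like u_t under the null
    sigma_u2 = np.sum(d ** 2) / d.shape[0]
    if bandwidth is None:                # a common Bartlett bandwidth rule
        bandwidth = int(np.floor(4 * (T / 100.0) ** (2.0 / 9.0)))
    omega_u2 = sigma_u2
    for k in range(1, bandwidth + 1):    # Bartlett-weighted autocovariances
        gamma_k = np.sum(d[k:] * d[:-k]) / d.shape[0]
        omega_u2 += 2.0 * (1.0 - k / (bandwidth + 1.0)) * gamma_k
    omega_bar2 = sigma_u2 / omega_u2     # estimates sigma_u^2 / omega_u^2
    return (T / omega_bar2) * np.sum(d ** 2) / np.sum(resid ** 2)

# example: a pure random walk, i.e. the null with c = 0
rw = np.cumsum(np.random.default_rng(1).normal(size=500))
print(dw_mu(rw))
```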

\(\mathit{DW}_{j,T}\) rejects the null hypothesis of ρ = 1 in favor of ρ < 1 for large values of the statistic. The critical values are typically taken from the limiting distributions \(\mathit{DW}_{j}\), which are characterized explicitly below,

$$\displaystyle{ \mathit{DW }_{j,T}\mathop{ \rightarrow }\limits^{ D}\mathit{DW }_{j}\,,\quad j =\mu,\tau \,, }$$

where \(\mathop{\rightarrow }\limits^{ D}\) denotes convergence in distribution as \(T \rightarrow \infty \).

3 Fredholm Approach

The Fredholm approach relies on expressing limiting distributions as double Stieltjes integrals over a positive definite kernel K(s, t) that is symmetric and continuous on \([0,1] \times [0,1]\). Given a kernel, one defines a type I Fredholm integral equation as

$$\displaystyle{ f(t) =\lambda \int K(s,t)\,f(s)\,\mathit{ds}\,, }$$

with eigenvalue λ and eigenfunction f. The corresponding Fredholm determinant (FD) of the kernel is defined as (see Tanaka 1990, Eq. (24))

$$\displaystyle{ D(\lambda ) =\lim _{T\rightarrow \infty }\mbox{ det}\,\left (I_{T} - \frac{\lambda } {T}\left [K\left ( \frac{j} {T}, \frac{k} {T}\right )\right ]_{j,k=1,\ldots,T}\right )\,. }$$
(3)
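Definition (3) lends itself to a direct numerical approximation: evaluate the kernel on an equidistant grid and compute the determinant of the resulting T × T matrix. The sketch below (our illustration) does this for the textbook kernel K(s, t) = min(s, t), whose FD is known to be \(\cos \sqrt{\lambda }\), so that the finite-T approximation can be checked.

```python
import numpy as np

def fredholm_determinant(kernel, lam, T=400):
    """Finite-T approximation of the Fredholm determinant in Eq. (3):
    det( I_T - (lam / T) * [K(j/T, k/T)]_{j,k=1,...,T} )."""
    grid = np.arange(1, T + 1) / T
    K = kernel(grid[:, None], grid[None, :])
    return np.linalg.det(np.eye(T) - (lam / T) * K)

# check against a known case: K(s,t) = min(s,t) has D(lambda) = cos(sqrt(lambda))
lam = 1.7
print(fredholm_determinant(np.minimum, lam))   # approximation, improves as T grows
print(np.cos(np.sqrt(lam)))                    # exact value, approximately 0.264
```

The same approximation can also be evaluated at complex arguments such as λ = 2iθ.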

Further, the so-called resolvent \(\varGamma \left (s,t;\lambda \right )\) of the kernel (see Tanaka 1990, Eq. (25)) is

$$\displaystyle{ \varGamma \left (s,t;\lambda \right ) = K\left (s,t\right ) +\lambda \int \varGamma \left (s,u;\lambda \right )K\left (u,t\right )\mathit{du}\,. }$$
(4)

These are the ingredients used to determine limiting characteristic functions, following Nabeya and Tanaka (1990a) and, more generally, Tanaka (1990).

Let \(\mathit{DW}_{j}\) (\(j =\mu,\tau\)) represent the limit of \(\mathit{DW}_{j,T}\). \(\mathit{DW}_{j}^{-1}\) can be written as \(S_{X} =\int \left \{X\left (t\right ) + n\left (t\right )\right \}^{2}\mathit{dt}\) for some stochastic process \(X\left (t\right )\) and an integrable function \(n\left (t\right )\). Tanaka (1996, Theorem 5.9, p. 164) gives the characteristic function of random variables such as \(S_{X}\), summarized in the following lemma.

Lemma 1

The characteristic function of

$$\displaystyle{ S_{X} =\int \left [X\left (t\right ) + n\left (t\right )\right ]^{2}\mathit{dt} }$$
(5)

for a continuous function \(n\left (t\right )\) is given by

$$\displaystyle{ Ee^{i\theta S_{X} } = \left [D\left (2i\theta \right )\right ]^{-1/2}\exp \left [i\theta \int n^{2}\left (t\right )\mathit{dt} - 2\theta ^{2}\int h\left (t\right )n\left (t\right )\mathit{dt}\right ], }$$
(6)

where \(h\left (t\right )\) is the solution of the following type II Fredholm integral equation

$$\displaystyle{ h\left (t\right ) = m\left (t\right ) +\lambda \int K\left (s,t\right )h\left (s\right )\mathit{ds}\text{, } }$$
(7)

evaluated at λ = 2iθ, \(K\left (s,t\right )\)  is the covariance of \(X\left (t\right ),\) and \(m\left (t\right ) =\int K\left (s,t\right )n\left (s\right )\mathit{ds}\) .

Remark 1

Although Tanaka (1996, Theorem 5.9) presents this lemma for the covariance \(K\left (s,t\right )\) of \(X\left (t\right )\), his arguments extend to a more general setting. Adopting them one can see that Lemma 1 essentially relies on an orthogonal decomposition of \(X\left (t\right )\), which does not necessarily have to be based on the covariance of \(X\left (t\right )\). In particular, if there exists a symmetric and continuous function \(C\left (s,t\right )\) such that

$$\displaystyle{ \int X^{2}\left (t\right )\mathit{dt} =\int \int C\left (s,t\right )\mathit{dw}\left (s\right )\mathit{dw}\left (t\right )\text{,} }$$

then we may find \(h\left (t\right )\) as in (7) by solving \(h\left (t\right ) =\int C\left (s,t\right )n\left (s\right )\mathit{ds} +\lambda \int C\left (s,t\right )h\left (s\right )\mathit{ds}\). This may in some cases shorten and simplify the derivations. As will be seen in the Appendix, we find the characteristic function of \(\mathit{DW}_{\mu }\) by resorting to this remark.
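Although the paper solves (7) analytically (see Sect. 4 and the Appendix), a purely numerical cross-check is sometimes handy: discretize (7) on a grid and solve the resulting linear system. The sketch below (ours) implements this and verifies it on a toy degenerate kernel \(K\left (s,t\right ) = st\) with \(n\left (s\right ) = 1\), for which \(h\left (t\right ) = t/\left (2\left (1 -\lambda /3\right )\right )\) is available in closed form; kernel and function choices are illustrative only.

```python
import numpy as np

def solve_fredholm_type2(kernel, n_func, lam, N=400):
    """Solve h(t) = m(t) + lam * int_0^1 K(s,t) h(s) ds numerically, with
    m(t) = int_0^1 K(s,t) n(s) ds as in Lemma 1, using a midpoint rule."""
    t = (np.arange(N) + 0.5) / N              # midpoint grid on [0, 1]
    w = 1.0 / N                               # quadrature weight
    K = kernel(t[:, None], t[None, :])        # K[i, j] = K(t_i, t_j)
    m = w * (K.T @ n_func(t))                 # m(t_j) ~ sum_i K(t_i, t_j) n(t_i) w
    A = np.eye(N, dtype=complex) - lam * w * K.T
    return t, np.linalg.solve(A, m.astype(complex))

# toy check: K(s,t) = s*t and n(s) = 1 give h(t) = t / (2 * (1 - lam/3))
lam = 0.8                                     # lam = 2*i*theta may also be complex
t, h = solve_fredholm_type2(lambda s, u: s * u, np.ones_like, lam)
print(np.max(np.abs(h - t / (2.0 * (1.0 - lam / 3.0)))))   # should be close to zero
```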

4 Characteristic Functions

In Proposition 2 below we give expressions for the characteristic functions of \(\mathit{DW}_{j}^{-1}\) (j = μ, τ) employing Lemma 1. To that end we use that \(\mathit{DW }_{j}^{-1} =\int \left [X_{j}\left (t\right ) + n_{j}\left (t\right )\right ]^{2}\mathit{dt}\) for some integrable functions \(n_{j}\left (t\right )\) and demeaned and detrended Ornstein–Uhlenbeck processes \(X_{\mu }\left (t\right )\) and \(X_{\tau }\left (t\right )\), respectively; see Lemma 2 in the Appendix. The functions \(n_{\mu }\left (t\right )\) and \(n_{\tau }\left (t\right )\) capture the effect of the initial condition; their exact forms are given in the Appendix. Hence, we are left with deriving the covariance functions of \(X_{\mu }\left (t\right )\) and \(X_{\tau }\left (t\right )\). We provide these rather straightforward results in the following proposition.

Proposition 1

The covariance functions of \(X_{\mu }\left (t\right )\)  and \(X_{\tau }\left (t\right )\)  from \(\mathit{DW }_{j}^{-1} =\int \left [X_{j}\left (t\right ) + n_{j}\left (t\right )\right ]^{2}\mathit{dt}\) are

$$\displaystyle\begin{array}{rcl} K_{\mu }\left (s,t\right )& =& K_{1}\left (s,t\right ) - g\left (t\right ) - g\left (s\right ) +\omega _{0}, {}\\ K_{\tau }\left (s,t\right )& =& K_{1}\left (s,t\right ) +\sum \nolimits _{ k=1}^{8}\phi _{ k}\left (s\right )\psi _{k}\left (t\right ), {}\\ \end{array}$$

where \(K_{1}\left (s,t\right ) = \frac{1} {2c}\left [e^{-c\left \vert s-t\right \vert } - e^{-c\left (s+t\right )}\right ]\), and the functions \(\phi _{k}\left (s\right )\), \(\psi _{k}\left (s\right )\), \(k = 1,2,\ldots,8\), as well as \(g\left (s\right )\) and the constant \(\omega _{0}\), can be found in the Appendix.
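Note that \(K_{1}\left (s,t\right )\) is the covariance function of the Ornstein–Uhlenbeck process \(\int _{0}^{t}e^{-c\left (t-r\right )}\mathit{dw}\left (r\right )\) and collapses to the Wiener covariance \(\min \left (s,t\right )\) as \(c \rightarrow 0\). The short sketch below (our plausibility check only) evaluates \(K_{1}\) on a grid and confirms symmetry, numerical positive semidefiniteness, and the small-c limit.

```python
import numpy as np

def K1(s, t, c):
    """Base kernel of Proposition 1: (1/(2c)) * (exp(-c|s-t|) - exp(-c(s+t)))."""
    return (np.exp(-c * np.abs(s - t)) - np.exp(-c * (s + t))) / (2.0 * c)

grid = np.linspace(0.0, 1.0, 101)
S, T = np.meshgrid(grid, grid, indexing="ij")

K = K1(S, T, c=5.0)
print(np.allclose(K, K.T))                          # symmetric
print(np.min(np.linalg.eigvalsh(K)) > -1e-10)       # positive semidefinite (numerically)
print(np.max(np.abs(K1(S, T, c=1e-4) - np.minimum(S, T))))   # small-c limit, error of order c
```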

The problem dealt with in this paper technically translates into finding \(h_{j}\left (t\right )\) as outlined in Lemma 1 for \(K_{j}\left (s,t\right )\), j = μ, τ, i.e. finding \(h_{j}\left (t\right )\) that solves a type II Fredholm integral equation of the form (7). Solving a Fredholm integral equation in general requires knowledge of the FD and the resolvent of the associated kernel. The FDs for \(K_{j}\left (s,t\right )\) (j = μ, τ) are known (Nabeya and Tanaka 1990b), but not the resolvents. Finding the resolvent is in general tedious, let alone the difficulties one might face in finding \(h_{j}\left (t\right )\) once the FD and the resolvent are known. To overcome these difficulties, we suggest a different approach to finding \(h_{j}\left (t\right )\), which we now describe. As we see from Proposition 1, the kernels of the integral equations considered here are of the following general form

$$\displaystyle{ K\left (s,t\right ) = K_{1}\left (s,t\right ) +\sum \nolimits _{ k=1}^{n}\phi _{ k}\left (s\right )\psi _{k}\left (t\right ). }$$

Thus to solve for \(h\left (t\right )\) in

$$\displaystyle{ h\left (t\right ) = m\left (t\right ) +\lambda \int K\left (s,t\right )h\left (s\right )\mathit{ds}\text{,} }$$
(8)

we let \(\upsilon = \sqrt{\lambda -c^{2}}\) and observe that (8) is equivalent to

$$\displaystyle{ h^{{\prime\prime}}\left (t\right ) +\upsilon ^{2}h\left (t\right ) = m^{{\prime\prime}}\left (t\right ) - c^{2}m\left (t\right ) +\lambda \sum \nolimits _{ k=1}^{n}b_{ k}\left [\psi _{k}^{{\prime\prime}}\left (t\right ) - c^{2}\psi _{ k}\left (t\right )\right ], }$$
(9)

with the following boundary conditions

$$\displaystyle\begin{array}{rcl} h\left (0\right )& =& m\left (0\right ) +\lambda \sum \nolimits _{ k=1}^{n}b_{ k}\psi _{k}\left (0\right ),{}\end{array}$$
(10)
$$\displaystyle\begin{array}{rcl} h^{{\prime}}\left (0\right )& =& m^{{\prime}}\left (0\right ) +\lambda \sum \nolimits _{ k=1}^{n}b_{ k}\psi _{k}^{{\prime}}\left (0\right ) +\lambda b_{ n+1},{}\end{array}$$
(11)

where

$$\displaystyle{ b_{k} =\int \phi _{k}\left (s\right )h\left (s\right )\mathit{ds}\text{, }k = 1,2,\ldots,n\text{ and }b_{n+1} =\int e^{-cs}h\left (s\right )\mathit{ds}. }$$
(12)

The solution to (9) can now be written as

$$\displaystyle{ h\left (t\right ) = c_{1}\cos \upsilon t + c_{2}\sin \upsilon t + g_{m}\left (t\right ) +\sum \nolimits _{ k=1}^{n}b_{ k}g_{k}\left (t\right ) }$$
(13)

where \(g_{k}\left (t\right )\), \(k = 1,2,\ldots,n\), are particular solutions of the following differential equations

$$\displaystyle{ g_{k}^{{\prime\prime}}\left (t\right ) +\upsilon ^{2}g_{ k}\left (t\right ) =\lambda \left [\psi _{k}^{{\prime\prime}}\left (t\right ) - c^{2}\psi _{ k}\left (t\right )\right ]\text{, }k = 1,2,\ldots,n, }$$

and \(g_{m}\left (t\right )\) is a particular solution of

$$\displaystyle{ g_{m}^{{\prime\prime}}\left (t\right ) +\upsilon ^{2}g_{ m}\left (t\right ) = m^{{\prime\prime}}\left (t\right ) - c^{2}m\left (t\right ). }$$

Using the boundary conditions (10) and (11) together with the equations in (12), the unknowns \(c_{1},c_{2},b_{1},b_{2},\ldots,b_{n+1}\) are found, giving an explicit form for (13). The solution for \(h\left (t\right )\) can then be used for the purposes of Lemma 1. It is important to note that if we replace \(K_{1}\left (s,t\right )\) with any other nondegenerate kernel, the boundary conditions (10) and (11) need to be modified accordingly.
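For completeness, the step from (8) to (9) can be traced as follows. The kernel \(K_{1}\left (s,t\right )\) satisfies

$$\displaystyle{ \frac{\partial ^{2}} {\partial t^{2}}\int K_{1}\left (s,t\right )h\left (s\right )\mathit{ds} = c^{2}\int K_{1}\left (s,t\right )h\left (s\right )\mathit{ds} - h\left (t\right )\,, }$$

because \(\partial ^{2}e^{-c\left \vert s-t\right \vert }/\partial t^{2} = c^{2}e^{-c\left \vert s-t\right \vert }\) for \(s\neq t\), the kink of \(e^{-c\left \vert s-t\right \vert }\) at s = t contributes \(-2c\,h\left (t\right )\) when integrated against h, and \(\partial ^{2}e^{-c\left (s+t\right )}/\partial t^{2} = c^{2}e^{-c\left (s+t\right )}\); the factor \(1/\left (2c\right )\) in \(K_{1}\) turns the kink contribution into \(-h\left (t\right )\). Differentiating (8) twice with respect to t and eliminating \(\int K_{1}\left (s,t\right )h\left (s\right )\mathit{ds}\) by means of (8) itself then yields (9). The boundary conditions (10) and (11) follow from evaluating (8) and its first derivative at t = 0, where \(K_{1}\left (s,0\right ) = 0\) and \(\left.\partial K_{1}\left (s,t\right )/\partial t\right \vert _{t=0} = e^{-cs}\).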

Using the method described above we establish the following proposition containing our main results for \(\mathit{DW}_{j}\) (\(j =\mu,\tau\)).

Proposition 2

For \(\mathit{DW}_{j}\) (\(j =\mu,\tau\)) we have under Assumptions 1, 2, and 3b)

$$\displaystyle\begin{array}{rcl} E\left [e^{i\theta \mathit{DW}_{j}^{-1} }\right ]& =& \left [D_{j}\left (2i\theta \right )\right ]^{-1/2}\left \{1 -\frac{\sigma _{\delta }^{2}} {c}\left [i\theta \varTheta _{j} - 2\theta ^{2}\varPsi _{ j}\left (\theta;c\right )\right ]\right \}^{-1/2} \times {}\\ & &\exp \left \{ \frac{\mu _{\delta }^{2}\left [i\theta \varTheta _{j} - 2\theta ^{2}\varPsi _{j}\left (\theta;c\right )\right ]} {2c - 2\sigma _{\delta }^{2}\left [i\theta \varTheta _{j} - 2\theta ^{2}\varPsi _{j}\left (\theta;c\right )\right ]}\right \}, {}\\ \end{array}$$

where \(D_{j}\left (\lambda \right )\)  is the FD of \(K_{j}\left (s,t\right )\)  with \(\upsilon = \sqrt{\lambda -c^{2}}\),

$$\displaystyle\begin{array}{rcl} D_{\mu }\left (\lambda \right )& =& \frac{e^{-c}} {\upsilon ^{4}} \left [\upsilon \left (\lambda -c^{3}\right )\sin \upsilon -\left (c^{2}\upsilon ^{2} + 2c\lambda \right )\cos \upsilon + 2c\lambda \right ], {}\\ D_{\tau }\left (\lambda \right )& =& e^{-c}\left [\left (\frac{c^{5} - 4\lambda c^{2}} {\upsilon ^{4}} -\frac{12\lambda \left (c + 1\right )\left (c^{2}+\lambda \right )} {\upsilon ^{6}} \right )\frac{\sin \upsilon } {\upsilon }\right. {}\\ & & \left.+\left (\frac{c^{4}} {\upsilon ^{4}} + \frac{8\lambda c^{3}} {\upsilon ^{6}} -\frac{48\lambda ^{2}\left (c + 1\right )} {\upsilon ^{8}} \right )\cos \upsilon + \frac{4\lambda \upsilon ^{2}c^{2}\left (c + 3\right ) + 48\lambda ^{2}\left (c + 1\right )} {\upsilon ^{8}} \right ], {}\\ \end{array}$$

and

$$\displaystyle\begin{array}{rcl} \varTheta _{\mu }& =& \frac{e^{-2c}} {2c^{2}} \left (-1 + e^{c}\right )\left (c - 2e^{c} + ce^{c} + 2\right ), {}\\ \varTheta _{\tau }& =& \frac{e^{-c}} {c^{4}} \left [-4\left (-6 + c^{2}\right ) - 8\left (3 + c^{2}\right )\cosh c + c\left (24 + c^{2}\right )\sinh c\right ], {}\\ \end{array}$$

where \(\varPsi _{j}\left (\theta;c\right )\)  for j = μ,τ are given in the Appendix.

Remark 2

Under Assumption 3a) the limiting distributions of \(\mathit{DW}_{j}\), j = μ, τ, are the same as those derived under a zero initial condition in Nabeya and Tanaka (1990b). These results are covered here when \(\mu _{\delta } =\sigma _{ \delta }^{2} = 0\).

Remark 3

The Fredholm determinants, \(D_{j}\left (\lambda \right )\), are taken from Nabeya and Tanaka (1990b).

Remark 4

It is possible to derive the characteristic functions using Girsanov’s theorem (see Girsanov 1960) as given, for example, in Chap. 4 of Tanaka (1996). Further, note that Girsanov’s theorem has been tailored to statistics of the form of \(\mathit{DW}_{j}\) in Lemma 1 of Elliott and Müller (2006).

5 Power Calculation

To calculate the asymptotic power function of \(\mathit{DW}_{j,T}\) (j = μ, τ) we need the quantiles \(c_{j,1-\alpha }\) as critical values, where

$$\displaystyle{ P(\mathit{DW }_{j} > c_{j,1-\alpha }) = P\left (\mathit{DW }_{j}^{-1} < c_{ j,1-\alpha }^{-1}\right ) =\alpha \,. }$$

Our graphs rely on the α = 5 % level with critical values \(c_{\mu,0.95} = 27.35230\) and \(c_{\tau,0.95} = 42.71679\), taken, up to an inversion, from Table 1 of Nabeya and Tanaka (1990b). Let ϕ(θ; c) stand for the characteristic functions obtained in Proposition 2 for large initial conditions, covering both the deterministic and the random case, in a local neighborhood of the null hypothesis characterized by c. With \(x = c_{j,1-\alpha }^{-1}\) we can hence compute the local power by evaluating the distribution function of \(\mathit{DW}_{j}^{-1}\) at x, employing the following inversion formula given in Imhof (1961)

$$\displaystyle{ F\left (x;c\right ) = \frac{1} {2} -\frac{1} {\pi } \int _{0}^{\infty }\frac{1} {\theta } \mbox{ Im}\left [e^{-i\theta x}\phi \left (\theta;c\right )\right ]d\theta. }$$
(14)

When it comes to practical computations, Imhof’s formula (14) is evaluated using a simple Simpson’s rule while correcting for the possible phase shifts that arise from the square root map over the complex domain (see Tanaka 1996, Chap. 6, for a discussion).
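As an illustration of this numerical workflow (our sketch; grid, truncation, and the test characteristic function are assumptions made for the example), the code below evaluates (14) with a composite Simpson rule and recovers the \(\chi ^{2}\left (1\right )\) distribution function from its characteristic function \(\left (1 - 2i\theta \right )^{-1/2}\). In this simple case the principal square root is already continuous along the integration path, so no phase correction is needed, whereas for \(\left [D_{j}\left (2i\theta \right )\right ]^{-1/2}\) in Proposition 2 the phase has to be tracked as just described.

```python
import numpy as np

def imhof_cdf(cf, x, theta_max=200.0, n_points=20001):
    """Evaluate F(x) = 1/2 - (1/pi) * int_0^inf Im[exp(-i*theta*x)*cf(theta)]/theta dtheta
    (Eq. (14)) by a composite Simpson rule on a truncated, equidistant grid."""
    theta = np.linspace(1e-8, theta_max, n_points)   # skip the removable point theta = 0
    integrand = np.imag(np.exp(-1j * theta * x) * cf(theta)) / theta
    h = theta[1] - theta[0]
    weights = np.ones(n_points)                      # Simpson weights 1, 4, 2, ..., 4, 1
    weights[1:-1:2] = 4.0
    weights[2:-1:2] = 2.0
    return 0.5 - (h / 3.0) * np.sum(weights * integrand) / np.pi

cf_chi2_1 = lambda theta: (1.0 - 2j * theta) ** (-0.5)   # chi-square(1) characteristic fn
print(imhof_cdf(cf_chi2_1, 1.0))   # approximately 0.6827 = P(chi^2_1 <= 1)
```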

Figure 1 shows the local power functions of \(\mathit{DW}_{j}\) (j = μ, τ) for a large deterministic initial condition, that is, for \(\sigma _{\delta }^{2} = 0\) in Assumption 3b). As is clear from Proposition 2, the power function is symmetric around \(\mu _{\delta } = 0\) and decreasing in \(\mu _{\delta }^{2}\) for any level of the local-to-unity parameter c.

For the random case we set \(\mu _{\delta } = 0\). Figure 2 contains graphs for \(\sigma _{\delta } \in \{ 0,0.1,\ldots,3\}\); to keep the shape comparable with the case of a large deterministic initial condition, the graphs are arranged symmetrically around 0. Figure 2 shows by how much the power decreases in the variance of the initial condition.

Fig. 1

Power function of DW j (\(j =\mu,\tau\)) for a large deterministic initial condition with \(\mu _{\delta } \in \{-3,-2.9,\ldots,3\}\), \(c \in \{ 0,1,\ldots,10\}\) and \(\sigma _{\delta } = 0\)

Fig. 2

Power function of DW j (\(j =\mu,\tau\)) for a large random initial condition with \(\sigma _{\delta } \in \{ 3,2.9,\ldots,0,0.1,\ldots,3\}\), \(c \in \{ 0,1,\ldots,10\}\) and \(\mu _{\delta } = 0\)

6 Summary

We analyze the effect of a large initial condition, random or deterministic, on the local-to-unity power of the Durbin–Watson unit root test. Using the Fredholm approach, the characteristic functions of the limiting distributions are derived. We observe the following findings. First, the local power after detrending is considerably lower than in the case of a constant mean. Second, a large initial value has a negative effect on power: the maximum power is achieved for \(\mu _{\delta } =\sigma _{\delta } = 0\), which corresponds to a “small initial condition.” Finally, comparing Figs. 1 and 2 one learns that deterministic and random initial values have a similar effect, depending only on the magnitude of the mean or the standard deviation, respectively, of the large initial condition.