
1 Introduction

We consider goodness-of-fit tests for univariate autoregressive moving average (ARMA) models with uncorrelated errors. Portmanteau tests and the Lagrange multiplier (LM) test are popular tools in ARMA modeling. Portmanteau test statistics, defined as the sum of squares of the first m residual autocorrelations, are commonly used in time series analysis to assess the goodness of fit. This approach was first presented by Box and Pierce [1] and Ljung and Box [15] for univariate autoregressive (AR) models. McLeod [18] derived the large-sample distribution of the residual autocorrelations and the portmanteau statistic for ARMA models. LM tests for ARMA time series models have been investigated by many authors, e.g., Godfrey [6], Newbold [19], and Hosking [7]. These test statistics compare the null hypothesis model ARMA\((p, q)\) against either ARMA\((p+m, q)\) or ARMA\((p, q+m)\). From the viewpoint of finite-sample size and power, these two types of test statistics are often used in combination. Li [14, Chap. 2] reviews several such tests. However, most of these tests impose the restriction that the errors must be independent, which rules out errors generated by nonlinear processes.

In recent years, the time series literature has been characterized by a growing interest in nonlinear models. Francq et al. [3] reported that many important classes of nonlinear processes admit ARMA representations with uncorrelated errors. Examples include bilinear processes, autoregressive conditional duration processes, Markov-switching ARMA models, generalized autoregressive conditional heteroscedasticity (GARCH) models, and hidden Markov models. Francq et al. [3] also noted that, by the Wold decomposition theorem, any purely nondeterministic second-order stationary process admits an infinite-order moving average (MA) representation driven by white noise. The ARMA model with uncorrelated errors also has this representation and can be regarded as an approximation of the infinite-order MA representation. Therefore, this model covers a very wide class of second-order stationary processes. Fitting nonlinear models is often difficult, whereas fitting ARMA models is easy and can be done with standard statistical software (e.g., SAS, R, SPSS). Additionally, the estimators are easy to interpret. Therefore, ARMA models can be useful tools even if the true process appears to be nonlinear.

There are now three portmanteau tests for ARMA models with uncorrelated errors: (i) Francq et al. [3] derived the asymptotic distribution of Ljung and Box’s [15] portmanteau statistic under uncorrelated errors. The distribution is a weighted sum of chi-squared random variables whose weights depend on the unknown ARMA parameters, so the critical values must be computed for each test. (ii) Katayama [9] modified the portmanteau statistic with a correction term so that it is asymptotically chi-squared. However, these two test statistics require an estimate of the covariance structure of a high-dimensional multivariate process and a large sample size. (iii) Kuan and Lee’s [12] portmanteau test is based on the approach developed by Kiefer, Vogelsang, and Bunzel [11] (referred to as KVB). Instead of estimating the asymptotic covariance matrix, Kuan and Lee’s [12] portmanteau test statistic employs a random normalizing matrix to eliminate the nuisance parameters of the asymptotic covariance matrix. The asymptotic critical values are tabulated via simulation. We review these test statistics in Sect. 2.

To overcome these weaknesses, we propose a new portmanteau test and an LM test in Sect. 3. Our proposed tests are based on the KVB approach. The test statistics require neither recursive estimators nor an estimate of the covariance structure of a high-dimensional multivariate process. Therefore, our test statistics have a significantly lower computational cost. We compare the finite-sample performance of these test statistics via simulations in Sect. 4, and demonstrate that our proposed tests exhibit sufficiently good empirical size and power properties compared with existing portmanteau tests.

In the remainder of this paper, \({\,\,\Rightarrow \,\,}\) denotes weak convergence (of the associated probability measures) and \({\,\,\mathop {\rightarrow }\limits ^{d}\,\,}\) denotes convergence in distribution. Throughout the paper, convergence is understood as the sample size n goes to infinity, so we omit the phrase “as \(n\rightarrow \infty \)” except in a few cases. \(W_{m}\) denotes a vector of m independent standard Wiener processes, and \(B_{m}\) is the m-dimensional Brownian bridge with \(B_{m}(\tau )=W_{m}(\tau )-\tau W_{m}(1)\) for \(\tau \in (0,1]\). \(A^{+}\) denotes the Moore–Penrose (MP) inverse of a matrix A. Let \(\partial f(y)/\partial x\) denote \(\partial f(x)/\partial x|_{x=y}\), \(\nabla _{x}f(y)\) denote \(\partial f(y)/\partial x'\), and \(\nabla _{x}'f(y)\) denote \(\partial f(y)/\partial x\). Additionally, [c] denotes the integer part of c.

Finally, this paper is based on Katayama [8], which extends the new KVB-based tests to general M tests and considers not only portmanteau and LM tests but also GMM over-identification tests and Hausman tests. That paper is available on request.

2 Review of the Portmanteau Tests

Suppose that a univariate time series \(\{Y_{t}\}\) is generated by an autoregressive moving average model ARMA\((p, q)\):

$$\begin{aligned} Y_{t} = \sum _{i=1}^{p}a_{i}^{0}Y_{t-i} +\varepsilon _{t}+\sum _{j=1}^{q}b_{j}^{0}\varepsilon _{t-j}, \quad t=0,\pm 1, \pm 2,\ldots , \end{aligned}$$
(1)

where \(\{\varepsilon _{t}\}\) is a white-noise sequence with variance \(\sigma _{0}^{2}\). It is assumed that the above model is stationary, invertible, and not redundant, so that the polynomials \(1-a_{1}^{0}z-\cdots -a_{p}^{0}z^p\) and \(1+b_{1}^{0}z +\cdots +b_{q}^{0}z^{q}\) have no common roots and all their roots lie outside the unit circle. We denote the true parameter vector by \({\theta _{0}}=(a_{1}^{0},\ldots ,a_{p}^{0}, b_{1}^{0},\ldots ,b_{q}^{0})'\); this belongs to the parameter space \(\varTheta \subset {\mathbb {R}}^{p+q}\). We suppose that \(a_{p}^{0}\ne 0\) or \(b_{q}^{0}\ne 0\), and that every \(\theta \in \varTheta \) satisfies the same conditions on the polynomials. Given a process \(\{Y_{t}\}_{t=1}^{n}\), as defined in Eq. (1), the nonlinear least-squares estimator of \({\theta _{0}}\), \({\widehat{\theta }_{n}}=(\widehat{a}_{1},\ldots ,\widehat{a}_{p}, \widehat{b}_{1},\ldots ,\widehat{b}_{q})'\), is obtained by minimizing the sum of the squared residuals. The residuals \(\widehat{\varepsilon }_{t}=\varepsilon _{t}({\widehat{\theta }_{n}})\) (\(t=1,\ldots ,n\)) from the fitted model are given by \(\widehat{\varepsilon }_{t}=Y_{t}- \widehat{a}_{1}Y_{t-1}-\cdots - \widehat{a}_{p}Y_{t-p}- \widehat{b}_{1}\widehat{\varepsilon }_{t-1}-\cdots - \widehat{b}_{q}\widehat{\varepsilon }_{t-q}\), where the unknown starting values are set to 0: \(\widehat{\varepsilon }_{0}=\cdots =\widehat{\varepsilon }_{1-q}=Y_{0}=\cdots =Y_{1-p}=0\). Throughout this paper, we assume that:

Assumption 1

\(\{Y_{t}\}\) is strictly stationary, satisfies the ARMA\((p, q)\) model (1), \(E|\varepsilon _{t}|^{4+2\nu }<\infty \), and \(\{\varepsilon _{t}\}\) is \(\alpha \)-mixing of size \(-(2+\nu )/\nu \) for some \(\nu >0\).

This assumption is somewhat stronger than Assumption 1\('\) in Francq et al. [3], because it implies the summability of the \(\alpha \)-mixing coefficients raised to the power \(\nu /(2+\nu )\). Francq and Zakoïan [4] showed that, under this assumption, \({\widehat{\theta }_{n}}\) is \(\sqrt{n}\)-consistent and asymptotically normal. Francq et al. [3] note that Assumption 1 does not require the noise to be independent or a martingale difference sequence (MDS). In Sect. 3, we apply this assumption to establish a functional central limit theorem (FCLT) for near-epoch dependent (NED) functions of the mixing process \(\{\varepsilon _{t}\}\).
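A minimal R sketch of the fitting step is given below. It uses arima() with method = "CSS", which minimizes a conditional sum of squares and closely corresponds to the nonlinear least-squares estimator with zero starting values described above; the simulated ARMA(1, 1) example is ours, for illustration only.

```r
# Fit an ARMA(1,1) by conditional least squares and extract the residuals.
set.seed(1)
y   <- arima.sim(list(ar = 0.5, ma = 0.3), n = 200)       # an illustrative ARMA(1,1) sample
fit <- arima(y, order = c(1, 0, 1), include.mean = FALSE, method = "CSS")
theta_hat <- coef(fit)                                     # (a-hat_1, b-hat_1)
eps_hat   <- as.numeric(residuals(fit))                    # residuals eps-hat_t
```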

To check the adequacy of the model fit, we examine the residual autocorrelations as follows:

$$\begin{aligned} \widehat{r}(j) = \frac{\widehat{\gamma }(j)}{\widehat{\gamma }(0)},\quad \widehat{\gamma }(j)=\frac{1}{n}\sum _{i=j+1}^{n}\widehat{\varepsilon }_{i}\widehat{\varepsilon }_{i-j}, \qquad j=0,1,\ldots ,n-1. \end{aligned}$$

The vector of residual autocorrelations, \(\widehat{r}=[\widehat{r}(1),\ldots ,\widehat{r}(m)]'\), is used to test \(H_{0}: {{\mathrm{E}}}[z_{t}]=0\) for any t, where \(z_{t}=(\varepsilon _{t-1},\ldots ,\varepsilon _{t-m})'\varepsilon _{t}\). The asymptotic joint distribution of \(\widehat{r}\) has been analyzed by Box and Pierce [1] and McLeod [18]. When \(\{\varepsilon _{t}\}\) is independent and identically distributed (i.i.d.), the asymptotic distribution of \(\sqrt{n}\widehat{r}\) is multivariate normal with mean zero and an asymptotic covariance matrix that is approximately idempotent for large m. Therefore, both of the above papers proposed a portmanteau statistic, \(Q_{m}=n\sum _{i=1}^{m} \widehat{r}(i)^{2}\) for \(p+q<m<n\), which is approximately distributed as \(\chi ^{2}_{m-p-q}\). Ljung and Box [15] showed that a better approximation can be achieved using the following modified portmanteau statistic:

$$\begin{aligned} Q_{m}^{*}=n(n+2)\sum _{i=1}^{m}\frac{\widehat{r}(i)^{2}}{n-i}. \end{aligned}$$

These statistics have been adopted by many practitioners, and have been modified or extended in various ways (see Li [14] and references therein).
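The statistics \(Q_{m}\) and \(Q_{m}^{*}\) can be computed in a few lines of R. The sketch below (our own, with illustrative parameter values) also calls Box.test(), whose \(\chi ^{2}_{m-p-q}\) benchmark via fitdf = p + q is valid only under i.i.d. errors.

```r
# Residual autocorrelations and the Box-Pierce / Ljung-Box statistics.
set.seed(1)
y <- arima.sim(list(ar = 0.5, ma = 0.3), n = 200)
e <- as.numeric(residuals(arima(y, order = c(1, 0, 1), include.mean = FALSE, method = "CSS")))
n <- length(e); m <- 10
gam <- sapply(0:m, function(j) sum(e[(j + 1):n] * e[1:(n - j)]) / n)  # gamma-hat(0..m)
r   <- gam[-1] / gam[1]                          # r-hat(1), ..., r-hat(m)
Qm      <- n * sum(r^2)                          # Box-Pierce statistic
Qm_star <- n * (n + 2) * sum(r^2 / (n - 1:m))    # Ljung-Box statistic
Box.test(e, lag = m, type = "Ljung-Box", fitdf = 2)   # fitdf = p + q = 2 here
```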

2.1 The Portmanteau Test of Francq et al. [3]

The portmanteau tests using \(Q_{m}^{*}\) were originally chi-squared tests that assume i.i.d. errors. Francq et al. [3] established the asymptotic distribution of \(Q_{m}^{*}\) under Assumption 1. In this setting, \(Q_{m}^{*}\) is no longer asymptotically chi-squared but converges to a weighted sum of chi-squared random variables, so the standard portmanteau test cannot control its type I error. Francq et al. [3] therefore based their portmanteau test on this asymptotic distribution. From McLeod [18] and Francq et al. [3], we have:

$$\begin{aligned}&\widehat{r} = \widehat{\gamma }/\sigma _{0}^{2} + O_{p}{\left( 1/n \right) },\nonumber \\&\widehat{\gamma } = \gamma +\sigma _{0}^{2}\Lambda _{0}'{\left( {\widehat{\theta }_{n}}-{\theta _{0}} \right) } + O_{p}{\left( 1/n \right) }, \end{aligned}$$
(2)

where \(\widehat{\gamma }=[\widehat{\gamma }(1),\ldots , \widehat{\gamma }(m)]', \gamma =[\gamma (1),\ldots ,\gamma (m)]'\),

$$\begin{aligned} \gamma (i) = \frac{1}{n}\sum _{j=i+1}^{n}\varepsilon _{j}\varepsilon _{j-i},\quad i=0,1,\ldots ,n-1, \end{aligned}$$

\(\Lambda _{0}=\Lambda ({\theta _{0}})=(\lambda _{1},\ldots ,\lambda _{m})\) is a \((p+q) \times m\) matrix, and \(\{\lambda _{j}\}\) is a sequence of \((p+q)\)-vectors defined by

$$\begin{aligned} \frac{\partial \varepsilon _{t}({\theta _{0}})}{\partial \theta }=\sum _{j=1}^{\infty }\lambda _{j}\varepsilon _{t-j}. \end{aligned}$$

Note that \({{\mathrm{rank}}}\{\Lambda (\theta )\}=p+q\) for any \(\theta \in \varTheta \). The distribution of \(\sqrt{n}\{\gamma ',\ ({\widehat{\theta }_{n}}-{\theta _{0}})'\}'\) is asymptotically normal with mean zero and covariance matrix \(\varSigma _{\gamma ,\theta }\). Estimating this covariance matrix is not easy, as it is the long-run variance of a stationary process. For example, the asymptotic variance of \(\sqrt{n}\gamma \) is given by:

$$\begin{aligned} \varGamma = \sum _{j=-\infty }^{\infty }{{\mathrm{E}}}\left( z_{t}z_{t-j}'\right) . \end{aligned}$$

When \(\{\varepsilon _{t}\}\) is i.i.d., \(\varGamma =\sigma _{0}^{4}{\mathbb {I}}_{m}\). However, when \(\{\varepsilon _{t}\}\) is uncorrelated but non-independent, \(\varGamma \) is not always simple.
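In practice, \(\varGamma \) is estimated by a kernel (HAC) method. The sketch below applies a Bartlett window with truncation lag M to \(\widehat{z}_{t}\); the helper name gamma_lrv and the choice of window are ours and not necessarily the exact estimator used by Francq et al. [3].

```r
# Kernel (HAC) estimator of Gamma built from z-hat_t = eps_t * (eps_{t-1}, ..., eps_{t-m})'.
gamma_lrv <- function(eps, m, M = 30) {
  n <- length(eps)
  Z <- sapply(1:m, function(j) c(rep(0, j), eps[1:(n - j)]) * eps)  # rows are z-hat_t'
  G <- crossprod(Z) / n                                             # lag-0 term
  for (h in 1:M) {
    Gh <- t(Z[(h + 1):n, , drop = FALSE]) %*% Z[1:(n - h), , drop = FALSE] / n
    G  <- G + (1 - h / (M + 1)) * (Gh + t(Gh))                      # Bartlett weights
  }
  G
}

set.seed(1)
gamma_lrv(rnorm(200), m = 4)   # roughly sigma^4 I_4 for i.i.d. N(0,1) errors, up to sampling noise
```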

Francq et al. [3] also showed that, when \(\{\varepsilon _{t}\}\) is uncorrelated but non-independent, the asymptotic variance of \(\sqrt{n}\widehat{r}\) is no longer idempotent and the asymptotic distribution of \(Q_{m}^{*}\) is a weighted sum of chi-squared random variables. Therefore, their proposed portmanteau test with \(Q_{m}^{*}\) uses critical values obtained from this non-pivotal distribution together with a nonparametric estimator of \(\varSigma _{\gamma ,\theta }\). Francq et al. [3] referred to their procedure as a modified Ljung–Box (MLB) test, and we call it the MLB test throughout this paper.

2.2 The Portmanteau Test of Katayama [9]

The MLB test of Francq et al. [3] must estimate its critical values, because the asymptotic distribution is non-pivotal. Recently, Katayama [9] proposed another approach that yields an asymptotically chi-squared statistic. First, let \(D=\Lambda _{0}' (\Lambda _{0}\varGamma ^{-1}\Lambda _{0}')^{-1}\Lambda _{0}\varGamma ^{-1}\) and let S be the square root of \(\varGamma \). Katayama [9] assumed that:

Assumption 2

The matrix S is nonsingular.

This assumption is satisfied for stationary, ergodic, and square-integrable MDSs; see, e.g., Francq and Zakoïan [5, Theorem 5.1]. From (2) and \(({\mathbb {I}}_{m}-D)\Lambda _{0}'=0\), we have:

$$\begin{aligned}&({\mathbb {I}}_{m}-D)\widehat{\gamma } = ({\mathbb {I}}_{m}-D)\gamma + O_{p}{\left( 1/n \right) },\\&S^{-1}({\mathbb {I}}_{m}-D)\sqrt{n}\widehat{\gamma } {\,\,\mathop {\rightarrow }\limits ^{d}\,\,}N(0, {\mathbb {I}}_{m}- F(F'F)^{-1}F'), \end{aligned}$$

where \(F=S^{-1}\Lambda _{0}'\). Therefore, Katayama [9] proposed that

$$\begin{aligned} Q_{m}^{\text{ K }}=\widehat{\gamma }'T_{n}({\mathbb {I}}_{m}-\widehat{D}')\widehat{\varGamma }^{-1}({\mathbb {I}}_{m}-\widehat{D})T_{n}\widehat{\gamma }, \end{aligned}$$

where \(\widehat{D}\) is a \(\sqrt{n}\)-consistent estimator of D and \(T_{n}=\{n(n+2)\}^{1/2}{{\mathrm{diag}}}\{ (n-1)^{-1/2},\ldots , (n-m)^{-1/2}\}\). The matrix \(T_{n}\) is a small-sample approximation of \(\sqrt{n}{\mathbb {I}}_{m}\), analogous to the weights in \(Q_{m}^{*}\). The matrix \(\widehat{\varGamma }\) is a consistent estimator of \(\varGamma \) computed by nonparametric methods. Katayama [9] showed that \(Q_{m}^{\text{ K }}\) is approximately \(\chi ^{2}_{m-p-q}\). However, simulations indicated that, similarly to the MLB test, \(Q_{m}^{\text{ K }}\) suffers some finite-sample size distortion as m increases [9]. This may be due to the difficulty of estimating \(\varGamma \) nonparametrically.
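For concreteness, the following sketch assembles \(Q_{m}^{\text{ K }}\) from its ingredients, assuming that \(\widehat{\gamma }\), a \((p+q)\times m\) estimate \(\widehat{\Lambda }\), and \(\widehat{\varGamma }\) (e.g., a kernel estimator such as the one sketched in Sect. 2.1) are already available; the function name q_katayama is ours.

```r
# Katayama's corrected portmanteau statistic from precomputed inputs.
q_katayama <- function(gamma_hat, Lam_hat, Gamma_hat, n, pq) {
  m    <- length(gamma_hat)
  Ginv <- solve(Gamma_hat)
  D    <- t(Lam_hat) %*% solve(Lam_hat %*% Ginv %*% t(Lam_hat)) %*% Lam_hat %*% Ginv
  Tn   <- sqrt(n * (n + 2)) * diag(1 / sqrt(n - 1:m), nrow = m)    # small-sample weights
  u    <- Tn %*% gamma_hat
  stat <- drop(t(u) %*% (diag(m) - t(D)) %*% Ginv %*% (diag(m) - D) %*% u)
  c(statistic = stat, df = m - pq)   # compare with qchisq(0.95, m - pq)
}
```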

2.3 The Portmanteau Test of Kuan and Lee [12] and Lee [13]

The main difficulty in conducting the Francq et al. [3] and Katayama [9] portmanteau tests is obtaining nonparametric estimates of \(\varSigma _{\gamma ,\theta }\) and \(\varGamma \). These estimates require an approximation of the covariance matrix of a high-dimensional multivariate process and a large sample size. Following KVB, Kuan and Lee [12] and Lee [13] proposed an alternative approach that uses random normalizing matrices to eliminate the nuisance covariance matrix. Let \(\widetilde{\theta }_{t}\) denote the nonlinear least-squares estimator from the subsample \(\{Y_{i}\}_{i=1}^{t}\), and let \(\{\widetilde{\varepsilon }_{i}\}_{i=1}^{t}\) be the residual sequence given by \(\widetilde{\theta }_{t}\). Define the matrices

$$\begin{aligned}&\widehat{C}_{n}= \frac{1}{n}\sum _{i,j=1}^{n-1}\sum _{t=1}^{i}\sum _{s=1}^{j} {\left\{ (\kappa _{ij}-\kappa _{i,j+1})-(\kappa _{i+1,j}-\kappa _{i+1,j+1}) \right\} } (\widehat{z}_{t}-\widehat{\gamma })(\widehat{z}_{s}-\widehat{\gamma })' \\&\widetilde{C}_{n} = \frac{1}{n}\sum _{i,j=1}^{n-1}\sum _{t=1}^{i}\sum _{s=1}^{j} {\left\{ (\kappa _{ij}-\kappa _{i,j+1})-(\kappa _{i+1,j}-\kappa _{i+1,j+1}) \right\} }(\widetilde{z}_{t}-\widehat{\gamma })(\widetilde{z}_{s}-\widehat{\gamma })', \nonumber \end{aligned}$$
(3)

with \(\widehat{z}_{t}=\widehat{\varepsilon }_{t}(\widehat{\varepsilon }_{t-1},\ldots ,\widehat{\varepsilon }_{t-m})'\) and \(\widetilde{z}_{t}=\widetilde{\varepsilon }_{t}(\widetilde{\varepsilon }_{t-1},\ldots ,\widetilde{\varepsilon }_{t-m})'\). Additionally, \(\kappa _{ij}=\kappa (|i-j|/n)\), where \(\kappa \) denotes a kernel function. The main idea underlying the KVB approach is to employ a normalizing random matrix instead of estimating the asymptotic variance of \(T_{n}\widehat{\gamma }\). Kuan and Lee [12] and Lee [13] considered two generalized test statistics. These are given by:

$$\begin{aligned}&\widehat{Q}_{m}^{{\text{ KL }}} =\widehat{\gamma }'T_{n}\widehat{C}_{n}^{-1}T_{n}\widehat{\gamma },\\&\widetilde{Q}_{m}^{{\text{ KL }}} = \widehat{\gamma }'T_{n}\widetilde{C}_{n}^{-1}T_{n}\widehat{\gamma }.\ \end{aligned}$$

Under conditions of the FCLT, Kuan and Lee [12] and Lee [13] showed that

$$\begin{aligned}&T_{n}\widehat{\gamma } {\,\,\mathop {\rightarrow }\limits ^{d}\,\,}V W_{m}(1),\\&\widehat{C}_{n} {\,\,\Rightarrow \,\,}S' U_{m} S,\\&\widetilde{C}_{n} {\,\,\Rightarrow \,\,}V' U_{m} V, \end{aligned}$$

where V is the matrix square root of the asymptotic covariance matrix of \(\sqrt{n} \widehat{\gamma }\), and \(U_{m}=\int _{0}^{1}\int _{0}^{1} \kappa (t-s)dB_{m}(t)dB_{m}(s)'\). It follows that

$$\begin{aligned}&\widehat{Q}_{m}^{{\text{ KL }}}{\,\,\mathop {\rightarrow }\limits ^{d}\,\,}W_{m}(1)'V'(S' U_{m} S)^{-1}VW_{m}(1),\\&\widetilde{Q}_{m}^{{\text{ KL }}} {\,\,\mathop {\rightarrow }\limits ^{d}\,\,}W_{m}(1)'U_{m}^{-1}W_{m}(1). \end{aligned}$$

Therefore, \(\widetilde{Q}_{m}^{{\text{ KL }}}\) has an asymptotically pivotal distribution, whose critical values can be obtained via simulation. The critical values of \(W_{m}(1)'U_{m}^{-1}W_{m}(1)\) are given by KVB (Table II), Lobato [16, Table 1], Kiefer and Vogelsang [10, Tables I and II], and Su [23, Table 1]. Note that Kuan and Lee [12] and Lee [13] assume V is nonsingular. This assumption is restrictive: Francq et al. [3, Remark 2] noted that V may be singular. Additionally, because the elements of V are nonlinear functions of \({\theta _{0}}\), it is difficult to verify this assumption.

3 New Portmanteau Tests and LM Tests Using the KVB Approach

The KVB-based portmanteau statistics proposed by Kuan and Lee [12] and Lee [13] do not estimate the asymptotic covariance matrix of \(\sqrt{n} \widehat{\gamma }\). However, these statistics require a recursive estimator, and the assumption on the covariance matrix is restrictive. To solve these problems, in this section, we propose new KVB-based test statistics.

3.1 New Portmanteau Tests Using the KVB Approach

We now re-examine (2). Kuan and Lee’s [12] approach was based on the asymptotic joint distribution of \(\sqrt{n}(\gamma ', ({\widehat{\theta }_{n}}-{\theta _{0}})')\). However, the asymptotic distribution of \(\sqrt{n}({\widehat{\theta }_{n}}-{\theta _{0}})\) is cumbersome. Therefore, our approach eliminates this estimation effect in a similar manner to Katayama [9].

Let \({\mathscr {P}}_{n}^{P}={\mathbb {I}}_{m}-\widehat{\Lambda }'(\widehat{\Lambda }\widehat{\Lambda }')^{-1}\widehat{\Lambda }\), where \(\widehat{\Lambda }=\Lambda ({\widehat{\theta }_{n}})\) and \({\mathscr {P}}_{0}^{P}={\mathbb {I}}_{m}-\Lambda _{0}'(\Lambda _{0}\Lambda _{0}')^{-1}\Lambda _{0}\). Then, \({\mathscr {P}}_{n}^{P}{\,\,\mathop {\rightarrow }\limits ^{p}\,\,}{\mathscr {P}}_{0}^{P}\) and \({\mathscr {P}}_{0}^{P}\Lambda _{0}'=0\). It follows from (2) that:

$$\begin{aligned} {\mathscr {P}}_{n}^{P}\widehat{\gamma } = {\mathscr {P}}_{0}^{P}\gamma + o_{p}{\left( n^{-1/2} \right) }. \end{aligned}$$
(4)

We now construct a KVB-based portmanteau statistic based on this equation. Under Assumption 1, the FCLT for NED functions of some mixing process \(\{\varepsilon _{t}\}\) (Davidson [2], Corollary 29.19) gives:

$$\begin{aligned} \frac{1}{\sqrt{n}}\sum _{t=1}^{[n\tau ]} z_{t} {\,\,\Rightarrow \,\,}S W_{m}(\tau ) \end{aligned}$$
(5)

for any \(\tau \in (0,1]\). It follows from (4), (5), and the continuous mapping theorem that:

$$\begin{aligned} T_{n}{\mathscr {P}}_{n}^{P}\widehat{\gamma } {\,\,\mathop {\rightarrow }\limits ^{d}\,\,}\Psi _{0}^{P} W_{m}(1) \end{aligned}$$
(6)

and

$$\begin{aligned} {\mathscr {P}}_{n}^{P}\widehat{C}_{n}{{\mathscr {P}}_{n}^{P}}' {\,\,\Rightarrow \,\,}\Psi _{0}^{P} U_{m} {\Psi _{0}^{P}}', \end{aligned}$$
(7)

where \(\Psi _{0}^{P}= {\mathscr {P}}_{0}^{P}S\) and \(\widehat{C}_{n}\) is given by (3). We define the following portmanteau test statistic:

$$\begin{aligned} Q_{m}^{{\text{ NEW }}} = \widehat{\gamma }'T_{n} {\mathscr {P}}_{n}^{P}\left( {\mathscr {P}}_{n}^{P}\widehat{C}_{n}{{\mathscr {P}}_{n}^{P}}'\right) ^{+}{\mathscr {P}}_{n}^{P}T_{n}\widehat{\gamma }. \end{aligned}$$

Since \({\mathscr {P}}_{n}^{P}\widehat{C}_{n}{{\mathscr {P}}_{n}^{P}}'\) is singular with rank \(m-p-q\), we use the MP inverse as a normalizing matrix. Thus, we obtain a new portmanteau test that extends those of Lobato [16] and Su [23] to the estimated parameter case.
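To make the construction concrete, the following R sketch implements \(Q_{m}^{{\text{ NEW }}}\) for an AR(1) null with a sharp origin kernel. The function name Q_new is ours, and the AR(1)-specific \(\widehat{\Lambda }\) uses \(\partial \varepsilon _{t}/\partial a = -Y_{t-1}\), so that \(\lambda _{j}=-a^{j-1}\); other ARMA models require their own \(\widehat{\Lambda }\).

```r
library(MASS)  # ginv(): Moore-Penrose inverse

Q_new <- function(y, m, rho = 1) {
  n   <- length(y)
  fit <- arima(y, order = c(1, 0, 0), include.mean = FALSE, method = "CSS")
  a   <- coef(fit)[1]
  e   <- as.numeric(residuals(fit)); e[is.na(e)] <- 0   # guard against NAs in the conditional fit

  # z-hat_t = eps-hat_t (eps-hat_{t-1}, ..., eps-hat_{t-m})' and gamma-hat
  Z   <- sapply(1:m, function(j) c(rep(0, j), e[1:(n - j)]) * e)   # n x m, rows z-hat_t'
  gam <- colMeans(Z)

  # projection P_n^P = I_m - Lambda'(Lambda Lambda')^{-1} Lambda, AR(1) case (Lambda is 1 x m)
  Lam <- matrix(-a^(0:(m - 1)), nrow = 1)
  P   <- diag(m) - t(Lam) %*% solve(Lam %*% t(Lam)) %*% Lam

  # C-hat_n of (3): second differences of the kernel applied to partial sums of z-hat_t - gamma-hat
  kap <- function(x) ifelse(abs(x) <= 1, (1 - abs(x))^rho, 0)      # sharp origin kernel
  K   <- outer(1:n, 1:n, function(i, j) kap((i - j) / n))
  W   <- K[1:(n - 1), 1:(n - 1)] - K[1:(n - 1), 2:n] -
         K[2:n, 1:(n - 1)] + K[2:n, 2:n]
  S   <- apply(sweep(Z, 2, gam), 2, cumsum)                        # partial sums S_i
  Cn  <- t(S[1:(n - 1), ]) %*% W %*% S[1:(n - 1), ] / n

  Tn  <- sqrt(n * (n + 2)) * diag(1 / sqrt(n - 1:m), nrow = m)
  v   <- P %*% Tn %*% gam
  drop(t(v) %*% ginv(P %*% Cn %*% t(P)) %*% v)
}

set.seed(1)
y <- arima.sim(list(ar = 0.9), n = 200)
Q_new(y, m = 6)   # compare with simulated critical values for W_{m-1}(1)' U_{m-1}^{-1} W_{m-1}(1)
```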

Theorem 1

Given Assumptions 1 and 2, \(Q_{m}^{{\text{ NEW }}}{\,\,\mathop {\rightarrow }\limits ^{d}\,\,}W_{m-p-q}(1)'U_{m-p-q}^{-1}W_{m-p-q}(1)\).

Proof

A necessary and sufficient condition for the continuity of the MP-inverse is that the rank is preserved in the limit: \({{\mathrm{rank}}}({\mathscr {P}}_{n}^{P}\widehat{C}_{n}{{\mathscr {P}}_{n}^{P}}')={{\mathrm{rank}}}(\Psi _{0}^{P} U_{m} {\Psi _{0}^{P}}')\); see, e.g., Schott [22, Theorem 5.21]. Because \({{\mathrm{rank}}}\{\Lambda (\theta )\}=p+q\) for any \(\theta \in \varTheta \), we have \({{\mathrm{rank}}}({\mathscr {P}}_{n}^{P}\widehat{C}_{n}{{\mathscr {P}}_{n}^{P}}')={{\mathrm{rank}}}({\mathscr {P}}_{n}^{P})=m-p-q\) and \({{\mathrm{rank}}}(\Psi _{0}^{P} U_{m} {\Psi _{0}^{P}}')={{\mathrm{rank}}}(\Psi _{0}^{P})={{\mathrm{rank}}}({\mathscr {P}}_{0}^{P})=m-p-q\). Therefore, this matrix satisfies the continuity condition of the MP-inverse. It follows from (6) and (7) that

$$\begin{aligned} Q_{m}^{{\text{ NEW }}} {\,\,\Rightarrow \,\,}W_{m}(1)'{\Psi _{0}^{P}}' \left( \Psi _{0}^{P} U_{m} {\Psi _{0}^{P}}'\right) ^{+}\Psi _{0}^{P} W_{m}(1). \end{aligned}$$

The rest of the proof is similar to that of Equation (9) in Kuan and Lee [12].

As noted by Kuan and Lee [12, Remark 2], we can modify \(\widetilde{Q}_{m}^{{\text{ KL }}}\) using the MP-inverse. However, it is difficult to estimate \({{\mathrm{rank}}}(V)\), as V is generally a complicated matrix. Our proposed portmanteau test overcomes this problem without using a recursive estimator.

3.2 New LM Test Using the KVB Approach

The LM test as a goodness-of-fit test of ARMA models is a special case of a test for a parameter constraint of a nonlinear regression model. Therefore, we briefly discuss LM tests for nonlinear regression models. Similar to our approach in the previous subsection, the new KVB-based LM test statistic uses a projection matrix. We now consider the following nonlinear regression model:

$$\begin{aligned} Y_{t}=f_{t}(Y^{t-1};\beta )+\varepsilon _{t}, \end{aligned}$$
(8)

where \(Y_{t}\) is the tth observation of a dependent variable, \(\beta \) is an r-dimensional vector of parameters to be estimated, and \(f_{t}\) is a function of \(Y^{t-1}=\{Y_{j}, j<t\}\) and \(\beta \) that is three times differentiable with respect to \(\beta \). We consider the null hypothesis \(\beta _{0} = c(\delta _{0})\), where \(\beta _{0}\) is the true value of \(\beta \), \(\delta _{0}\) is an s-dimensional constrained parameter vector, and c is a differentiable function from \({\mathbb {R}}^{s}\) to \({\mathbb {R}}^{r}\) with \(r>s\). We set \(e_{t}(\beta )=Y_{t}-f_{t}(Y^{t-1};\beta )\), and define

$$\begin{aligned} {{\mathscr {L}}_{n}}(\beta )=-\frac{1}{2n}\sum _{t=1}^{n}e_{t}(\beta )^{2} \end{aligned}$$
(9)

as a quasi-maximum log-likelihood function. Let \(\widehat{\delta }_{n}\) be a root-n consistent estimator of \(\delta _{0}\) and \(\widehat{\beta }_{n}=c(\widehat{\delta }_{n})\) so as to satisfy the first-order condition:

$$\begin{aligned} \left. \frac{\partial {{\mathscr {L}}_{n}}(c(\delta ))}{\partial \delta }\right| _{\delta =\widehat{\delta }_{n}}= \left. \frac{\partial c(\delta )'}{\partial \delta }\frac{\partial {{\mathscr {L}}_{n}}(\beta )}{\partial \beta }\right| _{\delta =\widehat{\delta }_{n},\ \beta =\widehat{\beta }_{n}}=0. \end{aligned}$$
(10)

The classical LM test is:

$$\begin{aligned} \text {LM}=\left. n\frac{\partial {{\mathscr {L}}_{n}}(\beta )}{\partial \beta '}{\left( -{{\mathrm{E}}}{\left[ \frac{\partial ^{2}{{\mathscr {L}}_{n}}(\beta )}{\partial \beta \partial \beta '} \right] } \right) }^{-1} \frac{\partial {{\mathscr {L}}_{n}}(\beta )}{\partial \beta }\right| _{\beta =\widehat{\beta }_{n}}. \end{aligned}$$

Under standard regularity conditions, and when \(\{\varepsilon _{t}\}\) is i.i.d., this test statistic is asymptotically \(\chi ^{2}_{r-s}\) when \(\beta _{0} = c(\delta _{0})\) is true; see, e.g., White [24, Section 10.1]. However, when \(\{\varepsilon _{t}\}\) is not independent but uncorrelated, \(\text {LM}\) is not always approximately chi-squared, because the asymptotic variance of \(\sqrt{n}\) times the score vector does not always coincide with the Fisher information matrix. One modification is to employ a nonparametric estimator of the asymptotic variance. Another is to use the KVB-based LM test statistic given by Kuan and Lee [12] with a recursive estimator.

We now propose another KVB-based LM test statistic with a full sample estimator. To proceed, we further suppose that

$$\begin{aligned} \left| -\nabla _{\beta }'\nabla _{\beta }{{\mathscr {L}}_{n}}(\beta ) - {{\mathrm{E}}}\left[ \nabla _{\beta }' e_{t}(\beta ) \nabla _{\beta } e_{t}(\beta )\right] \right| {\,\,\mathop {\rightarrow }\limits ^{p}\,\,}0 \end{aligned}$$
(11)

uniformly in \(\beta \). From (11) and the first-order Taylor series approximation around \(\widehat{\delta }_{n} =\delta _{0}\), we have that:

$$\begin{aligned} \frac{\partial {{\mathscr {L}}_{n}}(\widehat{\beta }_{n})}{\partial \beta }&= \frac{\partial {{\mathscr {L}}_{n}}(\beta _{0})}{\partial \beta } + {{\mathscr {J}}\!\!}_{0}C_{0}' (\widehat{\delta }_{n} - \delta _{0})+ o_{p}(n^{-1/2}), \end{aligned}$$
(12)

where \({{\mathscr {J}}\!\!}_{0}= -{{\mathrm{E}}}[\nabla _{\beta }' e_{t}(\beta _{0}) \nabla _{\beta }e_{t}(\beta _{0})]\) and \(C_{0}=\nabla _{\delta } c(\delta _{0})'\). We define the matrices \({\mathscr {P}}_{0}^{ \text{ LM }}={\mathbb {I}}_{r}-{{\mathscr {J}}\!\!}_{0}C_{0}'(C_{0}{{\mathscr {J}}\!\!}_{0}C_{0}')^{-1}C_{0}\) and \({\mathscr {P}}_{n}^{ \text{ LM }}={\mathbb {I}}_{r}-{{\mathscr {J}}\!\!}_{n}C_{n}'(C_{n}{{\mathscr {J}}\!\!}_{n}C_{n}')^{-1}C_{n}\), where \(C_{n}=\nabla _{\delta }' c(\widehat{\delta }_{n})\) and \({{\mathscr {J}}\!\!}_{n}\) denotes a consistent estimator of \({{\mathscr {J}}\!\!}_{0}\). These projection matrices are used in a similar way to \(Q_{m}^{{\text{ NEW }}}\). From (12), we have that:

$$\begin{aligned} \frac{\partial {{\mathscr {L}}_{n}}(\widehat{\beta }_{n})}{\partial \beta } ={\mathscr {P}}_{n}^{ \text{ LM }}\frac{\partial {{\mathscr {L}}_{n}}(\widehat{\beta }_{n})}{\partial \beta } ={\mathscr {P}}_{0}^{ \text{ LM }}\frac{\partial {{\mathscr {L}}_{n}}(\beta _{0})}{\partial \beta } + o_{p}(n^{-1/2}). \end{aligned}$$
(13)

The first equality comes from (10), as \(C_{n}\nabla _{\beta }'{{\mathscr {L}}_{n}}(\widehat{\beta }_{n})=0\). The second equality follows from (12), as \({\mathscr {P}}_{n}^{ \text{ LM }}\) is a consistent estimator of \({\mathscr {P}}_{0}^{ \text{ LM }}\) and \({\mathscr {P}}_{0}^{ \text{ LM }}{{\mathscr {J}}\!\!}_{0}C_{0}'=0\). Therefore, if we suppose that \(n^{1/2}\nabla _{\beta }'{{\mathscr {L}}_{n}}(\beta _{0}){\,\,\mathop {\rightarrow }\limits ^{d}\,\,}GW_{r}(1)\), then (13) implies that \(n^{1/2}\nabla _{\beta }'{{\mathscr {L}}_{n}}(\widehat{\beta }_{n}) {\,\,\mathop {\rightarrow }\limits ^{d}\,\,}{\mathscr {P}}_{0}^{ \text{ LM }}GW_{r}(1)\). We note that the asymptotic variance of \(n^{1/2}\nabla _{\beta }'{{\mathscr {L}}_{n}}(\beta _{0})\), \({\mathscr {I}}\!\!_{0}=GG'\), is not always equal to \({{\mathscr {J}}\!\!}_{0}\).

We define the following new LM test statistic:

$$\begin{aligned}&\text {LM}^{{\text{ NEW }}} = n \frac{\partial {{\mathscr {L}}_{n}}(\widehat{\beta }_{n})}{\partial \beta '} {\left( \sum _{i=1}^{n-1}\sum _{j=1}^{n-1} k_{ij}\widehat{\varphi }_{i} \widehat{\varphi }_{j}' \right) }^{+}\frac{\partial {{\mathscr {L}}_{n}}(\widehat{\beta }_{n})}{\partial \beta } ,\\&\widehat{\varphi }_{j}= \frac{1}{\sqrt{n}}{\mathscr {P}}_{n}^{ \text{ LM }}\sum _{i=1}^{j}{\left\{ -\frac{\partial e_{i}(\widehat{\beta }_{n})}{\partial \beta }e_{i}(\widehat{\beta }_{n})- \frac{\partial {{\mathscr {L}}_{n}}(\widehat{\beta }_{n})}{\partial \beta } \right\} }. \end{aligned}$$

Theorem 2 gives the limiting distribution of the LM test statistic:

Theorem 2

Assume that

  1. (i)

    \({{\mathrm{rank}}}(C_{n}{{\mathscr {J}}\!\!}_{n}C_{n}')={{\mathrm{rank}}}(C_{0}{{\mathscr {J}}\!\!}_{0}C_{0}')=s\).

  2. (ii)

    \({\displaystyle \frac{1}{\sqrt{n}} \sum _{t=1}^{[n\tau ]}\frac{\partial e_{t}(\beta _{0})}{\partial \beta }e_{t}(\beta _{0}) {\,\,\Rightarrow \,\,}GW_{r}(\tau )}\) for any \(\tau \in (0,1]\) as \(n\rightarrow \infty \), where G is an \(r\times r\) positive definite matrix.

  3. (iii)

    Equation (11) or (12) holds.

Then, \(\text {LM}^{{\text{ NEW }}} {\,\,\Rightarrow \,\,}W_{r-s}(1)'U_{r-s}^{-1}W_{r-s}(1)\) as \(n\rightarrow \infty \).

Proof

The proof is similar to that for Theorem 1. Hence, it is omitted here.

This result can be applied to the goodness-of-fit test for ARMA models with uncorrelated errors, e.g., \(H_{0}: ARMA(p,q)\) against \(H_{1}:ARMA(p+m,q)\) or \(H_{0}: ARMA(p,q)\) against \(H_{1}:ARMA(p,q+m)\), where \(p+q=s\) and \(m=r-s\). The constrained estimator \({\widehat{\theta }_{n}}\) is a quasi-maximum-likelihood estimator of the ARMA(p, q) model and \(\widehat{\beta }_{n}'=({\widehat{\theta }_{n}}', 0')\). The residuals \(\{e_{t}(\widehat{\beta }_{n})\}\) are the residuals of the fitted ARMA(p, q) model, and the derivatives \(\{\nabla _{\beta }e_{t}(\widehat{\beta }_{n})\}\) are obtained from the residuals of the alternative model. These quantities can be computed using standard statistical software, such as R and SAS, because they are the same as for ARMA models with i.i.d. errors. The first-order Taylor series approximation in (12) is obtained from the proof of Lemma 5 and Theorem 2 in Francq and Zakoïan [4]. For example, when the null model is AR(1) and the alternative model is AR\((1+m)\), we have \({\theta _{0}}=a_{1}^{0}\) and \(\beta _{0}=(1,0,\ldots ,0)'{\theta _{0}}\). Then \(f_{t}(Y^{t-1};\beta _{0})=a_{1}^{0}Y_{t-1}+a_{2}^{0}Y_{t-2}+\cdots + a_{m+1}^{0}Y_{t-m-1}\), \(e_{t}(\widehat{\beta }_{n})=Y_{t}-{\widehat{\theta }_{n}}Y_{t-1}\), \(\nabla _{\beta }e_{t}(\widehat{\beta }_{n})=-(Y_{t-1}, \ldots , Y_{t-m-1})\), and \({{\mathscr {J}}\!\!}_{n}\) is given by the sample mean of \(\{\nabla _{\beta }'e_{t}(\widehat{\beta }_{n})\nabla _{\beta }e_{t}(\widehat{\beta }_{n})\}\).
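The following R sketch implements \(\text {LM}^{{\text{ NEW }}}\) for this AR(1)-versus-AR\((1+m)\) example. The name LM_new is ours; zero starting values are used for the lagged regressors, and we read \(k_{ij}\) in the normalizing matrix as the same second-difference kernel weights that appear in \(\widehat{C}_{n}\) of (3), which is our interpretation rather than a definition spelled out in the text.

```r
library(MASS)  # ginv()

LM_new <- function(y, m, rho = 1) {
  n <- length(y); r <- m + 1
  kap <- function(x) ifelse(abs(x) <= 1, (1 - abs(x))^rho, 0)

  # lagged regressors x_t = (Y_{t-1}, ..., Y_{t-m-1})'; zeros before the sample start
  X <- sapply(1:r, function(j) c(rep(0, j), y[1:(n - j)]))          # n x r

  # constrained AR(1) least-squares fit and its residuals e_t(beta-hat)
  theta <- sum(y * X[, 1]) / sum(X[, 1]^2)
  e     <- y - theta * X[, 1]

  st    <- X * e                      # rows: -grad_beta' e_t(beta-hat) e_t = x_t e_t
  score <- colMeans(st)               # d L_n(beta-hat) / d beta
  Jn    <- crossprod(X) / n           # sample mean of x_t x_t' (its sign does not affect P below)
  Cmat  <- matrix(c(1, rep(0, m)), nrow = 1)   # grad_delta c(delta-hat)'

  P   <- diag(r) - Jn %*% t(Cmat) %*% solve(Cmat %*% Jn %*% t(Cmat)) %*% Cmat
  Phi <- t(P %*% t(apply(sweep(st, 2, score), 2, cumsum))) / sqrt(n)  # rows phi-hat_j'

  K  <- outer(1:n, 1:n, function(i, j) kap((i - j) / n))
  W  <- K[1:(n - 1), 1:(n - 1)] - K[1:(n - 1), 2:n] -
        K[2:n, 1:(n - 1)] + K[2:n, 2:n]
  Om <- t(Phi[1:(n - 1), ]) %*% W %*% Phi[1:(n - 1), ]

  drop(n * t(score) %*% ginv(Om) %*% score)
}

set.seed(1)
y <- arima.sim(list(ar = 0.9), n = 200)
LM_new(y, m = 6)   # compare with simulated critical values for W_m(1)' U_m^{-1} W_m(1)
```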

4 Some Simulation Studies

In this section, we examine the empirical size and power of the various portmanteau tests and the LM test to diagnose the goodness of fit of AR(1) models.

4.1 Empirical Significance Level

We first examine the empirical significance level for the following univariate AR(1) model: \(Y_{t}=a_{1}^{0} Y_{t-1}+\epsilon _{t}\), where \(\{\epsilon _{t}\}\) is defined by:

DGP 1:

(Gaussian GARCH(1, 1) model): \(\epsilon _{t}=\sigma _{t}z_{t}, \sigma _{t}^{2}=10^{-6} +0.1\epsilon _{t-1}^{2}+0.8\sigma _{t-1}^{2}\), where \(\{z_{t}\} \sim i.i.d. N(0,1)\);

DGP 2:

(Non-Gaussian ARCH(1) model): \(\epsilon _{t}=\sigma _{t}v_{t}, \sigma _{t}^{2}=10^{-6} +0.1\epsilon _{t-1}^{2}\), where \(\{v_{t}\} \sim i.i.d.\) Skew-Normal distribution with location, scale, and shape parameters (0.8, 1.0, 0);

DGP 3:

(All-Pass ARMA(1, 1) model): \(\epsilon _{t}=0.8\epsilon _{t-1} + w_{t}-0.8^{-1}w_{t-1}\), where \(\{w_{t}\}\) is an i.i.d. sequence from Student’s t distribution with 10 degrees of freedom;

DGP 4:

(Bilinear model): \(\epsilon _{t}=z_{t-1} + 0.5z_{t-1}\epsilon _{t-2}.\)

We selected these data generating processes (DGPs) from Francq et al. [3] and Lobato et al. [17]. DGPs 1 and 2 are MDS examples, and use the R function garchSim from the fGarch R package with default parameter values. DGPs 3 and 4 are non-MDS examples, where the parameters are given by Lobato et al. [17]. We set \(a_{1}^{0}=0.9\) and considered sample sizes of \(n=200, 400\), and 3000 in each experiment.
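As an illustration, DGP 1 combined with the AR(1) equation can be generated as follows; the paper uses fGarch::garchSim, whereas the sketch writes out the GARCH recursion to stay self-contained (the seed, start-up value, and lack of burn-in are ours).

```r
# DGP 1: Gaussian GARCH(1,1) errors driving Y_t = 0.9 Y_{t-1} + eps_t.
set.seed(123)
n <- 200; z <- rnorm(n)
eps <- sig2 <- numeric(n)
sig2[1] <- 1e-6 / (1 - 0.1 - 0.8)      # start at the unconditional variance
eps[1]  <- sqrt(sig2[1]) * z[1]
for (t in 2:n) {
  sig2[t] <- 1e-6 + 0.1 * eps[t - 1]^2 + 0.8 * sig2[t - 1]
  eps[t]  <- sqrt(sig2[t]) * z[t]
}
y <- as.numeric(filter(eps, filter = 0.9, method = "recursive"))   # the AR(1) series
```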

Five different test statistics were examined. The first two have to estimate the long-run variance matrices:

  1. (i)

    \(Q_{m}^{\text {MLB}}\): Francq et al. [3]’s MLB portmanteau test (discussed in Sect. 2.1), where \(M=30\) in step 2 of Francq et al. [3].

  2. (ii)

    \(Q_{m}^{\text{ K }}\): Katayama’s [9] modified portmanteau test statistic (discussed in Sect. 2.2).

The remaining three test statistics are based on the KVB approach, where we use the sharp origin kernels with exponent \(\rho \) proposed by Phillips et al. [20]:

  1. (iii)

    \(\widetilde{Q}_{m,\rho }^{\text {KL}}\): Kuan and Lee’s KVB-based portmanteau statistics with the AR(1) recursive estimator (discussed in Sect. 2.3).

  2. (iv)

    \(Q_{m,\rho }^{\text {NEW}}\): Our proposed portmanteau test, described in Sect. 3.1.

  3. (v)

    \(\text {LM}_{m,\rho }^{\text {NEW}}\): Our proposed LM test, discussed in Sect. 3.2, where the null model is AR(1) and the alternative model is AR\((1+m)\).

The sharp origin kernel \(\kappa _{\rho }(x)\) is given by:

$$\begin{aligned} \kappa _{\rho }(x)= {\left\{ \begin{array}{ll} (1-|x|)^{\rho } &{} |x| \le 1 \\ 0 &{} \text {otherwise}, \end{array}\right. } \end{aligned}$$

where \(\rho \) is a positive integer. When \(\rho =1\), the sharp origin kernel is the usual Bartlett kernel. As \(\rho \) increases, \(\kappa _{\rho }(x)\) becomes more concentrated at the origin, with a sharper, more pronounced peak. We investigated the cases \(\rho =1,8,16,32,48,64\); for reasons of space, we present only \(\rho =1,16,64\) here.
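For reference, this kernel is a one-liner in R (kappa_rho is our own name; it is the same kernel used in the earlier sketches):

```r
# Sharp origin kernel; rho = 1 gives the Bartlett kernel.
kappa_rho <- function(x, rho = 1) ifelse(abs(x) <= 1, (1 - abs(x))^rho, 0)
sapply(c(1, 16, 64), function(r) kappa_rho(0.25, rho = r))   # the weight at x = 0.25 shrinks as rho grows
```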

The asymptotic distributions of (iii)–(v) are \(W_{\tau }(1)'U_{\tau }^{-1}W_{\tau }(1)\), where \(\kappa =\kappa _{\rho }\) and \(\tau =m\) or \(m-1\). The critical values of the distribution are obtained by simulations. The Brownian motion and Brownian bridge process are approximated using the normalized partial sum of \(n=2000\) i.i.d. N(0, 1) random variables, and the simulation involves 30,000 replications. These critical values have also been computed by Su [23].
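The simulation of these critical values can be sketched as follows; the coarser numbers of steps and replications (and the name sim_cv) are ours, chosen only to keep the illustration fast.

```r
# Simulated upper critical values of W_m(1)' U_m^{-1} W_m(1) for the sharp origin kernel.
sim_cv <- function(m, rho = 1, steps = 500, reps = 2000, level = 0.95) {
  kap <- function(x) ifelse(abs(x) <= 1, (1 - abs(x))^rho, 0)
  K   <- outer(1:steps, 1:steps, function(i, j) kap((i - j) / steps))
  stat <- replicate(reps, {
    dW <- matrix(rnorm(steps * m, sd = 1 / sqrt(steps)), steps, m)  # Wiener increments
    W1 <- colSums(dW)                       # W_m(1)
    dB <- sweep(dW, 2, W1 / steps)          # Brownian-bridge increments
    U  <- t(dB) %*% K %*% dB                # discretized double integral U_m
    drop(t(W1) %*% solve(U) %*% W1)
  })
  quantile(stat, level)
}

sim_cv(m = 2)   # 95% critical value; compare with the tables in KVB and Su [23]
```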

Tables 1 and 2 present the relative rejection frequencies (in %) for \(m=2,6,10,14\) and \(n=200\) and \(n=400\), respectively. The tests using \({Q_{m}^{\text {MLB}}}\), \({Q_{m}^{\text {K}}}\), and \(\widetilde{Q}_{m,\rho }^{\text {KL}}\) show noticeable under-rejection for larger m. Our proposed tests using \(Q_{m,\rho }^{\text {NEW}}\) and \(\text {LM}_{m,\rho }^{\text {NEW}}\) exhibit relatively stable sizes, which is one of their advantages in finite samples. The tests using \(Q_{m,\rho }^{\text {NEW}}\) and \(\text {LM}_{m,\rho }^{\text {NEW}}\) show a slight under-rejection for DGP 1 as m increases. The tests using \(\text {LM}_{m,\rho }^{\text {NEW}}\) show a tendency to over-reject in some cases when \(n=200\), though this is not observed when \(n=400\).

Table 1 Empirical significance level of DGPs 1–4 (\(n=200, a_{0}=0.9\))
Table 2 Empirical significance level of DGPs 1–4 (\(n=400, a_{0}=0.9\))
Table 3 Empirical power of DGPs 1–4 (\(n=200, a_{0}=0.9, \alpha _{2}^{0}=-0.15\))
Table 4 Empirical power of DGPs 1–4 (\(n=200, a_{0}=0.9, \alpha _{2}^{0}=-0.30\))
Table 5 Empirical size-adjusted power of DGPs 1–4 (\(n=200, a_{0}=0.9, \alpha _{2}^{0}=-0.15\))
Table 6 Empirical size-adjusted power of DGPs 1–4 (\(n=200, a_{0}=0.9, \alpha _{2}^{0}=-0.30\))

4.2 Empirical Power

We next conducted 3000 replications with \(n=200\) for the univariate AR(2) models defined by \(Y_{t}=a_{1}^{0}Y_{t-1}+a_{2}^{0}Y_{t-2}+\varepsilon _{t}\), where \(a_{1}^{0}=0.9\), \(a_{2}^{0}=-0.15\) or \(-0.30\), and \(\{\varepsilon _{t}\}\) is generated by DGPs 1–4. We fitted an AR(1) model and conducted the tests at the 5% significance level. Tables 3 and 4 present the empirical powers corresponding to the empirical sizes in Table 1; Table 3 corresponds to \(a_{2}^{0}=-0.15\) and Table 4 to \(a_{2}^{0}=-0.30\).

The tests using \({\text {LM}}_{m,64}^{\text {NEW}}\) were the most powerful in almost all cases. All three KVB-based tests gain power as \(\rho \) increases, which is consistent with the asymptotic power envelope under local alternatives given by Phillips et al. [20, 21]. Tests using \({Q_{m}^{\text {MLB}}}\) and \({Q_{m,64}^{\text {NEW}}}\) were also powerful, although \({Q_{m}^{\text {MLB}}}\) showed a serious under-rejection frequency under the null. It is interesting that our proposed tests, \(Q_{m,\rho }^{\text {NEW}}\) and \(\text {LM}_{m,\rho }^{\text {NEW}}\), give similar powers for \(m=6,10,14\). This similarity is explained by Hosking [7, Section 4]: although portmanteau tests examine the goodness of fit without a particular alternative in mind, they can be approximately interpreted as LM tests against a particular form of ARMA alternative.

To compare the potential power properties, we also computed the size-adjusted powers; the results are listed in Tables 5 and 6. The tests using \({Q_{m}^{\text {MLB}}}\) are the most powerful for \(m=6,10,14\). For \(m=2\), the tests using \(\text {LM}_{2,64}^{\text {NEW}}\) are the most powerful, and \(Q_{2,64}^{\text {NEW}}\) has comparatively greater power than \(Q_{2}^{\text {MLB}}\). Our proposed portmanteau test \(Q_{m,\rho }^{\text {NEW}}\) exhibited superior power to Kuan and Lee’s \(\widetilde{Q}_{m,\rho }^{\text {KL}}\).

From these simulations, we conclude that our proposed tests are sufficiently efficient in terms of empirical size and power compared with existing portmanteau tests. Besides size and power, our proposed tests are also superior in terms of computational cost: as m increases, \({Q_{m}^{\text {MLB}}}\), \({Q_{m}^{\text {K}}}\), and the LM test require a large sample size n, because these statistics must estimate long-run variance matrices involving \(\varGamma \). In summary, we recommend our proposed tests for checking the goodness of fit of ARMA models with uncorrelated errors.