1 Introduction

Recently, there has been growing interest in modeling non-negative integer-valued time series, especially time series of counts. Several models have been proposed in the literature; in particular, the INteger-valued AutoRegressive (INAR) model has been studied in many research papers. These integer-valued models are motivated by the need to account for the discrete nature of certain data sets, often counts of events, objects or individuals. Application areas of such time series include epidemiology, actuarial statistics, neurobiology and psychometry. See, for example, Ristić et al. (2009), Moriña et al. (2011), Park and Kim (2012), Maiti et al. (2015) and Davis et al. (2016).

A first-order integer-valued autoregressive model (INAR(1)) is defined as

$$\begin{aligned} X_t = \phi \circ X_{t-1} + Z_t,\quad t \ge 1, \end{aligned}$$
(1)

where \( \phi \in [0, 1)\), \(\{Z_t\}\) is a sequence of independent and identically distributed (i.i.d.) non-negative integer-valued random variables with probability mass function \(f_z\), \(E(Z_t) = \lambda \) and \(Var(Z_t) = \sigma ^2_z\), and \(Z_t\) is independent of \(X_0\) for all t. The symbol ‘\(\circ \)’ stands for a thinning operator, which, conditional on \(X_{t-1}\), is defined as

$$\begin{aligned} \phi \circ X_{t-1} = \sum \limits _{i=1}^{X_{t-1}}B_i, \end{aligned}$$

where \(\{B_i, i=1,2,\ldots ,X_{t-1}\}\) are i.i.d. Bernoulli random variables with parameter \(\phi .\) This model closely resembles the ordinary AR(1) model for a continuous-valued time series. As an example of a standard INAR(1) model, one may take \(X_t\) to be the number of surviving cancer patients in a hospital at time t, \(\phi \) the probability of survival from time \(t-1\) to t, and \(Z_t\) the number of new cancer patients admitted at time t (Zheng et al. 2006). Thus, \(\{X_t\}\) can be regarded as a branching process with immigration. For more details and further examples, we refer to the papers by McKenzie (1985a, b, 1986, 1987, 1988a, b), Al-Osh and Alzaid (1987, 1991, 1992), Alzaid and Al-Osh (1988), Bouzar and Jayakumar (2008), Kim and Weiß (2013), Kashikar et al. (2013) and Khao et al. (2015), among others. A web page on integer-valued time series is also available at https://sites.google.com/site/integervaluedtimeseries/.
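The recursion (1) is straightforward to simulate. The following is a minimal sketch, assuming Poisson(\(\lambda \)) innovations purely for illustration (model (1) allows any \(f_z\)); the function names `thin`, `poisson` and `simulate_inar1` are hypothetical helpers and use only the Python standard library.

```python
import math
import random


def thin(phi, x, rng):
    """Thinning phi o x: the sum of x i.i.d. Bernoulli(phi) variables."""
    return sum(1 for _ in range(x) if rng.random() < phi)


def poisson(lam, rng):
    """Draw a Poisson(lam) variate by Knuth's multiplication method (adequate for small lam)."""
    floor, k, prod = math.exp(-lam), 0, rng.random()
    while prod > floor:
        k += 1
        prod *= rng.random()
    return k


def simulate_inar1(n, phi, lam, x0=0, seed=0):
    """Simulate X_1, ..., X_n from X_t = phi o X_{t-1} + Z_t with Z_t ~ Poisson(lam)."""
    rng = random.Random(seed)
    path, x = [], x0
    for _ in range(n):
        x = thin(phi, x, rng) + poisson(lam, rng)  # survivors plus new arrivals
        path.append(x)
    return path


series = simulate_inar1(n=5000, phi=0.4, lam=1.0)
# For a stationary INAR(1) series the mean is lam / (1 - phi).
print(sum(series) / len(series), 1.0 / (1 - 0.4))
```

In the hospital illustration, `thin` counts the patients who survive from one period to the next and `poisson` supplies the new admissions.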

It may be noted that the thinning parameter \(\phi \) need not remain constant; it may vary with time. For example, in the illustration above, the survival probability \(\phi \) need not be constant throughout the time period. Treating the parameter \(\phi \) as a random variable taking values in [0, 1), Zheng et al. (2007) introduced a random coefficient integer-valued autoregressive model of order one (RCINAR(1)), defined as follows:

$$\begin{aligned} X_t = \phi _t \circ X_{t-1} + Z_t,\quad t \ge 1, \end{aligned}$$
(2)

where \(\{\phi _t\}\) is a sequence of i.i.d. random variables taking values in [0, 1). Let \(\phi = E(\phi _t)\) and \(\sigma ^2_{\phi } = Var(\phi _t).\) The random variable \(X_0\) is assumed to be independent of \(\{\phi _t\},\) which in turn is independent of \(\{Z_t\}.\) Zheng et al. (2007) studied this model extensively and established several of its properties. These include Markov chain properties such as periodicity, positive recurrence and ergodicity, as well as the conditional and unconditional mean and variance, the covariance and the autocorrelation. They further derived conditional least squares and quasi-likelihood estimators of the model parameters and established their asymptotic properties. Zheng et al. (2006) extended all the above results to a p-th order random coefficient integer-valued autoregressive (RCINAR(p)) model. Two interesting real data sets were modeled under this setup. Zhang et al. (2011) studied the empirical likelihood method for the estimation of an RCINAR(1) process.

One problem of interest under this setup is the following: given the data \(\{X_1,X_2,\ldots ,X_n\},\) test the hypothesis that the thinning parameter \(\phi \) does not vary with time (the series is INAR(1)) against the alternative that it varies randomly over time (the series is RCINAR(1)). This is essentially the same as testing \(H_0:\sigma ^2_{\phi } = 0\) against \(H_1:\sigma ^2_{\phi } > 0\). The problem is of considerable interest because, once it is known that the thinning parameter \(\phi \) has no stochastic time variation, the inference procedures are considerably simpler than when the thinning parameter is stochastic and time varying. Therefore, in this paper we address this testing problem.
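The difference between the two hypotheses can be seen by simulating model (2). The sketch below is illustrative only: it assumes \(\phi _t \sim \) Beta(a, b) and Poisson(\(\lambda \)) innovations, and the function name `simulate_rcinar1` and the parameter values are hypothetical choices rather than anything prescribed by the model.

```python
import math
import random


def simulate_rcinar1(n, a, b, lam, x0=0, seed=0):
    """Simulate X_1, ..., X_n from model (2) with phi_t ~ Beta(a, b) and Z_t ~ Poisson(lam)."""
    rng = random.Random(seed)
    floor = math.exp(-lam)
    path, x = [], x0
    for _ in range(n):
        phi_t = rng.betavariate(a, b)  # a fresh thinning probability for each period
        survivors = sum(1 for _ in range(x) if rng.random() < phi_t)
        # Poisson(lam) innovation by Knuth's multiplication method (adequate for small lam)
        z, prod = 0, rng.random()
        while prod > floor:
            z += 1
            prod *= rng.random()
        x = survivors + z
        path.append(x)
    return path


series = simulate_rcinar1(n=5000, a=2.0, b=3.0, lam=1.0)
mean = sum(series) / len(series)
var = sum((v - mean) ** 2 for v in series) / (len(series) - 1)
# E(phi_t) = a / (a + b) = 0.4 here, so the stationary mean is lam / (1 - 0.4).
print(mean, var)
```

Under \(H_0\) the draw of `phi_t` would be replaced by the constant \(\phi \), which recovers model (1).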

It may be possible to construct a likelihood ratio test for the above testing problem. However, the likelihood ratio approach has a serious drawback: because the true value of the parameter under the null hypothesis lies on the boundary of the parameter space, the asymptotics are non-standard. Therefore, in this paper, we develop a locally most powerful type test for this hypothesis. Kale and Ramanathan (1997) proposed a similar test for the randomness of the environment in a branching process. Kang and Lee (2009) considered a change-point problem in an RCINAR(1) model and employed the cumulative sum (CUSUM) test based on the conditional least-squares and modified quasi-likelihood estimators. Han and McCabe (2013) considered the problem of testing the constancy of the parameter for a general class of non-Gaussian time series, which includes integer-valued time series. They advocated the use of the two-sided CUSUM test for parameter constancy proposed by Brown et al. (1975). These authors were interested in structural break-type models, where the parameters change at certain specific (unknown) time points. Our approach allows the parameters to be completely random under the alternative hypothesis. Schweer and Weiß (2014) proposed a test for overdispersion in an INAR(1) process, and Meintanis and Karlis (2014) proposed a test of the hypothesis that the innovation distribution belongs to the family of Poisson stopped-sum distributions. If \( \phi _{t}\) in model (2) is random, then it leads to overdispersion, because \( \sigma _{\phi }^2\) contributes to the variance of the series in addition to the variance of the innovation random variable. The tests proposed in these two papers can also be used to test for overdispersion in count time series data, but they do not specifically test whether the overdispersion is due to randomness in the thinning parameter. Our test is specifically designed to identify randomness in the thinning parameter of the model. Zhao and Hu (2015) proposed a test for randomness of the coefficient of an RCINAR(1) process, but it is based on the least squares estimate of the parameter \( \sigma ^2_{\phi } \).

This paper is organized as follows. In Sect. 2, we derive the test statistic and its asymptotic distribution theory under the general setup. Section 3 considers these for Poisson and geometric INAR(1) models. A simulation study is reported in Sect. 4 to judge the performance of the suggested test. Section 5 consists of various applications and data analysis. Section 6 concludes the paper.

2 Test statistic and its asymptotic distribution

2.1 RCINAR(1) model

Let \(\{X_1,X_2,\ldots ,X_n\}\) be time series data from an RCINAR(1) model satisfying (2). Let \(\{\phi _{t}\} \) be an i.i.d. sequence of random variables with probability distribution \( P_{\phi } \). Note that, conditional on \(X_{t-1}\) and \(\phi _{t}\), \( \phi _{t} \circ X_{t-1}\) is binomial with parameters \((X_{t-1},\phi _{t})\), and hence, conditional on \((X_{t-1},\phi _{t})=(x,\phi )\), the distribution of \( X_{t}\) is given by \( f_z(X_t-\phi \circ x). \)

Before deriving the test statistic, we impose the following regularity conditions:

C1. The probability mass function (p.m.f.) \(f_z\) of \(Z_t = X_t - \phi \circ X_{t-1}\) is such that \(\log f_z\) is thrice differentiable with respect to \(\phi \) and \(\lambda \), and the derivatives (partial and mixed) are bounded in a neighbourhood of \((\phi , \lambda ).\)

C2. Differentiation of \(f_z\) three times with respect to \((\phi , \lambda )\) under the integral sign is permitted.

C3. The distribution \( P_{\phi }\) of \(\phi _{t}\) is such that \(E|\phi _t|^{3} < \infty . \)

Since the \(Z_t\)’s are i.i.d. with probability mass function \(f_z\), we can write the likelihood function \(L_{H_1}\) for the RCINAR(1) model as

$$\begin{aligned} L_{H_1}(X_{1},\ldots ,X_{n})=\prod \limits _{t=1}^{n}P(X_{t}|X_{t-1})P(X_{0}) \end{aligned}$$

and

$$\begin{aligned} P(X_{t}=y|X_{t-1}=x, \phi _{t}=\phi )= & {} P(\phi _{t}\circ X_{t-1}+Z_{t}=y|X_{t-1}=x, \phi _{t}=\phi )\\= & {} f_{z}(y-\phi \circ x). \end{aligned}$$

Therefore,

$$\begin{aligned} L_{H_1}(X_{1},\ldots ,X_{n}) =P(X_{0}) \prod \limits _{t=1}^{n} f_z(z_t) = P(X_{0})\prod \limits _{t=1}^{n}\int f_z(X_t - \phi \circ X_{t-1}) dP_{\phi }. \end{aligned}$$
(3)

It is assumed that the distribution of \( X_{0} \) is free of \( (\phi , \sigma _{\phi }^{2}) \). Expanding the integrand of \(L_{H_1}\) in a Taylor series around \(\phi ,\) the mean of \(\phi _t,\) yields

$$\begin{aligned} L_{H_1}= & {} P(X_{0})\prod \limits _{t=1}^{n}\int \Bigg \{ f_z(X_t - \phi {\circ } X_{t-1})+ (\phi _t - \phi ) f^{'}_{z}(X_t - \phi {\circ } X_{t-1})\\&+ \frac{(\phi _t - \phi )^2}{2} f^{''}_z(X_t - \phi \circ X_{t-1}) + \Delta _n \Bigg \} dP_{\phi }. \end{aligned}$$
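The passage from this expansion to the score can be made explicit. Note that \(\int (\phi _t - \phi )\, dP_{\phi } = 0\) and \(\int (\phi _t - \phi )^2\, dP_{\phi } = \sigma ^2_{\phi }\). As a sketch of the intermediate step, write \(f_t = f_z(X_t - \phi \circ X_{t-1})\) and \(R_t = \int \Delta _n\, dP_{\phi }\) (both abbreviations are introduced here only for brevity), and assume that \(R_t\) is of smaller order than \(\sigma ^2_{\phi }\) under C1 to C3. Integrating term by term then gives

$$\begin{aligned} \log L_{H_1} = \log P(X_{0}) + \sum \limits _{t=1}^{n} \log \left\{ f_t + \frac{\sigma ^2_{\phi }}{2}\, f^{''}_t + R_t \right\} , \end{aligned}$$

so that each summand contributes \(f^{''}_t/(2 f_t)\) to the derivative with respect to \(\sigma ^2_{\phi }\) at \(\sigma ^2_{\phi } = 0\).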

Therefore,

$$\begin{aligned} \left. \frac{\partial \log L_{H_{1}}}{\partial \sigma _{\phi }^{2}}\right| _{\sigma _{\phi }^{2} = 0} = \frac{1}{2}\sum \limits _{t=1}^{n}\frac{f^{''}_z(X_t - \phi \circ X_{t-1})}{f_z(X_t - \phi \circ X_{t-1})}, \end{aligned}$$

which results in the test statistic:

$$\begin{aligned} T_n(\underline{\theta })=\sum \limits _{t=1}^{n}\frac{1}{2}\frac{f_z^{''}(X_t-\phi \circ X_{t-1})}{f_z(X_t-\phi \circ X_{t-1})} = \sum \limits _{t=1}^{n}Y_t(\underline{\theta }),\quad \text{ say, } \end{aligned}$$

where

$$\begin{aligned} Y_t(\underline{\theta }) = \frac{1}{2}\frac{f_z^{''}(X_t-\phi \circ X_{t-1})}{f_z(X_t-\phi \circ X_{t-1})}. \end{aligned}$$

The actual test statistic is obtained by replacing \(\underline{\theta } = (\phi , \lambda )^{T}\) with its maximum likelihood estimator \(\underline{\hat{\theta }} = (\hat{\phi }, \hat{\lambda })^{T}.\) It can be easily verified that \(\{Y_t(\underline{\theta }); \mathcal{F}_{t}^{Y}\}\) is a zero-mean martingale difference sequence, and hence we have the following lemma (Basawa and Prakasa Rao 1980).

Lemma 2.1

Under the following assumptions,

A1. \(E|Y_t(\underline{\theta })|^{2+\delta } \le \Delta < \infty \) for all \(t \ge 1\), where \(\Delta \) is a constant and \(\delta >0 \), and

A2. \(\lim _{n \rightarrow \infty } \frac{1}{n}\sum \limits _{t=1}^{n}E(Y_t^2(\underline{\theta })|\mathcal{F}_{t-1}^Y) = \sigma ^2(\underline{\theta }) ~~a.s.,\)

$$\begin{aligned} \frac{1}{\sqrt{n}}\sum \limits _{t=1}^{n}Y_t(\underline{\theta }) \mathop {\rightarrow }\limits ^{d} N\left( 0, \sigma ^2(\underline{\theta })\right) . \end{aligned}$$

The asymptotic normality of the test statistic is considered in the following theorem whose proof is deferred to Appendix 1.

Theorem 2.1

Under the regularity conditions C1–C3 and the assumptions A1–A2 of Lemma 2.1,

$$\begin{aligned} (\sqrt{n}~\hat{\nu })^{-1} T_n(\underline{\hat{\theta }})~\mathop {\rightarrow }\limits ^{d}~N(0,1), \end{aligned}$$
(4)

where

$$\begin{aligned} \hat{\nu }^{2} = \sigma ^{2}(\underline{\hat{\theta }}) - \underline{\sigma }^{T}(\underline{\hat{\theta }})~I^{-1}(\underline{\hat{\theta }})~\underline{\sigma }(\underline{\hat{\theta }}). \end{aligned}$$

(The expressions for \(I(\underline{\theta })\) and \(\underline{\sigma }(\underline{\theta })\) are derived in Appendix 1.)

2.2 RCINAR(p) model

Following Zheng et al. (2006), we define a p-th order random coefficient integer-valued autoregressive (RCINAR(p)) model as

$$\begin{aligned} X_t = \sum \limits _{i=1}^p\phi _{it}\circ X_{t-i} + Z_t,\quad t \ge 1, \end{aligned}$$

where \({\phi _t} =(\phi _{1t},\phi _{2t}, \ldots ,\phi _{pt})'\) is a \(p\times 1\) random vector with \(E({\phi _t}) = {\phi } =(\phi _{1},\phi _{2}, \ldots ,\phi _{p})'\) and variance-covariance matrix \(\Sigma _{\phi }\). The random vectors \({ \phi _t},~t = 1,2,\ldots ,n\), are assumed to be i.i.d. and to take values in \([0,1)^p\). Our interest here is to examine whether the variation in \({ \phi _t}\) is significant, that is, to test the hypothesis \(H_0 : {\phi _t = \phi }~~\text{ for }~~ t = 1,2,\ldots ,n\) (non-random) against \(H_1 :{\phi _t}\) is random as described above. This is equivalent to testing \(H_0 : {\Sigma _{\phi } = 0}\) against \(H_1 : {\Sigma _{\phi } \ne 0}\). Under the assumption that \(\Sigma _{\phi }\) is a diagonal matrix, the hypotheses may instead be stated as \(H_0 : \sigma ^2_{\phi _i}= 0,~ i = 1, 2, \ldots , p,\) against \(H_1 : \sigma ^2_{\phi _i} > 0 \) for some i.

Proceeding along the same lines as for the RCINAR(1) model, we arrive at the following test statistic for the RCINAR(p) model:

$$\begin{aligned} \underline{\mathbf{T}} = \left( T_{n1}(\underline{\hat{\theta }}), T_{n2}(\underline{\hat{\theta }}),\ldots , T_{np}(\underline{\hat{\theta }})\right) '. \end{aligned}$$

An appropriate quadratic form of this vector can be proved to have a chi-square distribution with p degrees of freedom asymptotically.

3 Illustrations

3.1 Poisson INAR(1) model

In this section we derive the test statistic in the case of a Poisson INAR(1) model. Let \(Z_t\) in model (1) be distributed as Poisson(\(\lambda \)), that is, \(\{X_t\}\) is a Poisson INAR(1) model. Then

$$\begin{aligned} f(X_t|X_{t-1}) = f_z(X_t-\phi \circ X_{t-1}) = \sum \limits _{k=0}^{M_t} P_k~\phi ^k~(1-\phi )^{X_{t-1}-k}, \end{aligned}$$

where

$$\begin{aligned} P_k = P_k(X_t, X_{t-1}, \lambda ) = \left( \begin{array}{c} X_{t-1} \\ k \end{array}\right) \frac{\lambda ^{X_t-k}}{(X_t-k)!} e^{-\lambda } \end{aligned}$$

and

$$\begin{aligned} M_t = \min (X_t,X_{t-1}). \end{aligned}$$

The test statistic in this case is

$$\begin{aligned} T_n(\underline{\theta }) = \sum \limits _{t=1}^{n}Y_t(\underline{\theta }),\quad \text{ where }\quad Y_t(\underline{\theta }) = \frac{A(\underline{\theta })}{B(\underline{\theta })}, \end{aligned}$$

with

$$\begin{aligned} A(\underline{\theta })= & {} \sum \limits _{k=0}^{M_t} P_k~\phi ^{k-2}~(1-\phi )^{X_{t-1}-k-2}\Big \{k(k-1)(1-\phi )^2\\&-\,(X_{t-1}-k)[2k\phi (1-\phi )-\phi ^2(X_{t-1}-k-1)]\Big \} \end{aligned}$$

and

$$\begin{aligned} B(\underline{\theta }) = 2~\sum \limits _{k=0}^{M_t} P_k~\phi ^k~(1-\phi )^{X_{t-1}-k}. \end{aligned}$$

For the Poisson INAR(1) model, all the regularity conditions are satisfied. Further, \(\{X_t\}\) is a stationary and ergodic process (Freeland and McCabe 2004). Hence, by applying Theorem 2.1, we can claim the asymptotic normality of the test statistic \(T_n(\underline{\theta }).\) However, the expressions for the terms in the asymptotic variance are cumbersome. Therefore, we adopt a computational procedure for evaluating the elements of the empirical Fisher information matrix, as suggested by Freeland and McCabe (2004). These elements converge to the corresponding theoretical quantities as the sample size increases to infinity (by the strong law of large numbers). The details are deferred to Appendix 2.
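The quantities \(P_k\), \(B(\underline{\theta })\) and \(A(\underline{\theta })\) above translate directly into code. The following sketch is one possible implementation, not the authors' code: the function names are hypothetical, the second derivative is written in expanded form (equivalent to factoring out \(\phi ^{k-2}(1-\phi )^{X_{t-1}-k-2}\) as in \(A(\underline{\theta })\)) to avoid negative powers, and the sum for \(T_n\) conditions on the first observation of the series.

```python
import math


def p_k(k, x_t, x_prev, lam):
    """P_k: binomial coefficient times the Poisson(lam) mass at x_t - k."""
    return math.comb(x_prev, k) * lam ** (x_t - k) * math.exp(-lam) / math.factorial(x_t - k)


def transition_pmf(x_t, x_prev, phi, lam):
    """f(x_t | x_prev) for the Poisson INAR(1) model; B(theta) equals twice this value."""
    return sum(p_k(k, x_t, x_prev, lam) * phi ** k * (1 - phi) ** (x_prev - k)
               for k in range(min(x_t, x_prev) + 1))


def second_derivative(x_t, x_prev, phi, lam):
    """A(theta): second derivative of f(x_t | x_prev) with respect to phi."""
    total = 0.0
    for k in range(min(x_t, x_prev) + 1):
        r = x_prev - k  # individuals removed by thinning
        d2 = 0.0
        if k >= 2:
            d2 += k * (k - 1) * phi ** (k - 2) * (1 - phi) ** r
        if k >= 1 and r >= 1:
            d2 -= 2 * k * r * phi ** (k - 1) * (1 - phi) ** (r - 1)
        if r >= 2:
            d2 += r * (r - 1) * phi ** k * (1 - phi) ** (r - 2)
        total += p_k(k, x_t, x_prev, lam) * d2
    return total


def y_t(x_t, x_prev, phi, lam):
    """Summand Y_t = A / B = f'' / (2 f)."""
    return second_derivative(x_t, x_prev, phi, lam) / (2.0 * transition_pmf(x_t, x_prev, phi, lam))


def t_n(series, phi, lam):
    """T_n summed over consecutive pairs, conditioning on the first observation."""
    return sum(y_t(series[t], series[t - 1], phi, lam) for t in range(1, len(series)))


print(t_n([0, 1, 2, 1, 0], phi=0.3, lam=1.0))  # toy series, for illustration only
```

Because \(\sum _y f(y|x) = 1\) for every \(\phi \), the second derivatives sum to zero over y, which gives a numerical check of the zero conditional mean of \(Y_t\).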

3.2 Geometric INAR(1) model

If the count time series data are overdispersed, then the usual Poisson INAR(1) model is not suitable; however, we may use geometric INAR(1) models in such situations. The geometric INAR(1) process with negative binomial thinning proposed by Ristić et al. (2009) is given by

$$\begin{aligned} X_{t}=\alpha * X_{t-1}+\epsilon _{t},\quad t\ge 1 \end{aligned}$$
(5)

where \( \alpha *X=\sum \nolimits _{i=1}^{X} W_{i}\), \( \alpha \in [0,1) \), \( \{X_{t}\} \) is a stationary process with \( P(X_{t}=x)=\mu ^{x}/(1+\mu )^{x+1},~x=0,1,2,\ldots \), \( \{ W_{i} \} \) is a sequence of i.i.d. random variables with the geometric\((\alpha /(1+\alpha ))\) distribution, and \( \{ \epsilon _{t} \} \) is independent of \( W_{i} \) and \( X_{t-l} \) for all \( l\ge 1 \). The distribution of \(\epsilon _{t}\) is

$$\begin{aligned} P(\epsilon _{t}=l)=\Big (1-\frac{\alpha \mu }{\mu -\alpha }\Big )\frac{\mu ^{l}}{(1+\mu )^{{l}+1}} +\frac{\alpha \mu }{\mu -\alpha }\frac{\alpha ^{l}}{(1+\alpha )^{l+1}},\quad l=0,1,2,\ldots . \end{aligned}$$

The mean and variance of \(\epsilon _{t}\) are \(\mu _{\epsilon }=(1-\alpha )\mu \) and \( \sigma _{\epsilon }^2=(1+\alpha )\mu ((1+\mu )(1-\alpha )-\alpha ) \) respectively. The conditional p.m.f. is,

$$\begin{aligned}&P(X_{t}=y|X_{t-1}=x) \\&\quad =\left\{ \begin{array}{ll} \displaystyle \Big (1-\frac{\alpha \mu }{\mu -\alpha }\Big )\frac{\mu ^{y}}{(1+\mu )^{{y}+1}} +\frac{\alpha \mu }{\mu -\alpha }\frac{\alpha ^{y}}{(1+\alpha )^{y+1}},&{}\quad \text{ if }\, x=0,\\ \displaystyle \frac{\mu \alpha ^{y+1}}{(\mu -\alpha )(1+\alpha )^{y+x+1}}{{y+x}\atopwithdelims (){y}}&{}\\ \quad \displaystyle +\Big (1-\frac{\alpha \mu }{(\mu -\alpha )}\Big )\frac{\mu ^{y}}{(1+\alpha )^{x}(1+\mu )^{y+1}}\sum \limits _{r=0}^{y}{{x+r-1}\atopwithdelims (){x-1}}\Big (\frac{\alpha (1+\mu )}{\mu (1+\alpha )}\Big )^r,&{}\quad \text{ if }\, x\ge 1,\end{array}\right. \end{aligned}$$

Therefore, by taking \( \underline{\theta }={(\mu , \alpha )}^{T}\), the test statistic \(T_{n}(\underline{\theta }) \) can be obtained as

$$\begin{aligned} T_{n}(\underline{\theta })=\sum \limits _{t=1}^{n}Y_{t}(\underline{\theta })=\sum \limits _{t=1}^{n}\frac{A(\underline{\theta })}{B(\underline{\theta })}, \end{aligned}$$

where

$$\begin{aligned}&\displaystyle B(\underline{\theta })=2f(X_{t}|X_{t-1}). \\&\displaystyle A(\underline{\theta })=\left\{ \begin{array}{ll}\displaystyle \frac{-2\mu ^{X_{t}+2}}{(1+\mu )^{X_{t}}(\mu -\alpha )^{3}}+\frac{\mu \alpha ^{X_{t}-1}}{(1+\alpha )^{X_{t}+3}(\mu -\alpha )^3}A_{1\alpha \mu },&{}\quad \text{ if }\,X_{t-1}=0,\\ \displaystyle \frac{\mu \alpha ^{X_{t}-1} }{(1+\alpha )^{3+X_{t-1}+X_{t}}(\mu -\alpha )^3}{{X_{t}+X_{t-1}}\atopwithdelims (){X_{t}}}A_{2\alpha \mu }&{}\\ \quad \displaystyle +\frac{\mu ^{X_{t}}\alpha ^{-2} (\alpha -\mu )^{-3}}{(1+\alpha )^{2+X_{t-1}}(1+\mu )^{1+X_{t}}}\sum \limits _{k=0}^{X_{t}}\Big (\frac{\alpha (1+\mu )}{\mu (1+\alpha )}\Big )^{k}A_{3\alpha \mu },&{}\quad \text{ if }\,X_{t-1}>0, \end{array}\right. \end{aligned}$$

with,

$$\begin{aligned} A_{1\alpha \mu }= & {} 2 \alpha ^4-4 \alpha ^3 X_{t}+\mu ^2 X_{t} (1+X_{t})-2 \alpha \mu (1+X_{t}) (-1+\mu +X_{t})\\&+\,\alpha ^2 ((-1+X_{t}) X_{t}+\,6 \mu (1+X_{t})), \\ A_{2\alpha \mu }= & {} 2 \alpha (\alpha ^3+\mu +3 \alpha \mu -\mu ^2)+\alpha ^2 (\alpha -\mu )^2 X_{t-1}^2+(-4 \alpha ^3+\mu ^2-2 \alpha \mu ^2\\&+\,\alpha ^2 (-1+6 \mu )) X_{t}+\,(\alpha -\mu )^2 X_{t}^2+\alpha (\alpha -\mu ) X_{t-1}\\&\times \left( 3 \alpha ^2+2 \mu -\alpha \mu -2 (\alpha -\mu ) X_{t}\right) \end{aligned}$$

and

$$\begin{aligned} A_{3\alpha \mu }= & {} \,-\,\mu ^3 (-1+k) k+\alpha ^5 (1+\mu ) X_{t-1} (1+X_{t-1})+\alpha \mu ^2 k (3 (-1+k)\\&+\,\mu (3+k+2 X_{t-1}))-\,\alpha ^4 (1+X_{t-1})\\&\times \left( 2 k+2 \mu ^2 (-1+X_{t-1})+\mu (2 k+3 X_{t-1})\right) \\&-\alpha ^2 \mu \left( 3 (-1+k) k+\mu ^2 X_{t-1} (3+2 k+X_{t-1})\right. \\&\left. +2 \mu \left( -1+k^2+3 k (1+X_{t-1})\right) \right) \\&+\;\alpha ^3 \left( (-1+k) k+\mu ^3 (-1+X_{t-1}) X_{t-1}+\mu k (5+k+6 X_{t-1}) \right. \\&\left. +\;\mu ^2 \left( 4+5 X_{t-1}+3 X_{t-1}^2+k (2+4 X_{t-1})\right) \right) . \end{aligned}$$

For the geometric INAR(1) model, the asymptotic normality of \( T_{n}(\underline{\theta }) \) can be established using arguments similar to those in the Poisson INAR(1) case. Here too, the expressions for the elements of the variance-covariance matrix of the asymptotic normal distribution are cumbersome. An approach similar to that of Freeland and McCabe (2004) is adopted to obtain the empirical Fisher information matrix. The details are deferred to Appendix 2.

Table 1 Empirical level of the test at nominal level 0.05 for Poisson INAR(1) model
Table 2 Empirical power of the test at nominal level 0.05 for Poisson RCINAR(1) model with \(\phi _{t}\sim \) Beta(a, b)
Table 3 Empirical level of the test at nominal level 0.05 for geometric INAR(1) model
Table 4 Empirical power of the test at nominal level 0.05 for geometric RCINAR(1) model with \(\phi _{t}\sim \) Beta(a, b)
Table 5 Empirical level of the Zhao and Hu (2015) test at nominal level 0.05 for Poisson INAR(1) model
Table 6 Empirical power of the Zhao and Hu (2015) test at nominal level 0.05 for Poisson RCINAR(1) model with \(\phi _{t}\sim \) Beta(a, b)

3.3 Procedure to compute the test statistics

  • Step 1 Estimate parameter values from the time series data {\(X_{1},X_{2},\ldots ,X_{n}\)}.

  • Step 2 Compute the first and second order partial derivatives of the likelihood with respect to the parameters in the model.

  • Step 3 Compute \( T_{n}(\underline{\theta })\) using (7) for Poisson INAR(1) model or (8) for geometric INAR(1) model (see Appendix 2).

  • Step 4 Compute \( \sigma _1(\underline{\theta }),\sigma _2({\underline{\theta }}), \sigma ^2(\underline{\theta }) \) and elements of \(I(\underline{\theta })\).

  • Step 5 Compute \( \hat{\nu }^{2} \) as given in Theorem 2.1 and using this, compute the value of \((\sqrt{n}~\hat{\nu })^{-1} T_n(\underline{\hat{\theta }})\).
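Step 5 can be sketched as follows, assuming that Steps 1 to 4 have already produced \(T_n(\underline{\hat{\theta }})\), \(\sigma ^2(\underline{\hat{\theta }})\), the vector \((\sigma _1, \sigma _2)\) and the \(2\times 2\) matrix \(I(\underline{\hat{\theta }})\). The function names are hypothetical and the numerical inputs in the usage line are made up for illustration; they are not values from this paper.

```python
import math


def nu_squared(sigma2, sigma_vec, info):
    """nu^2 = sigma^2 - sigma' I^{-1} sigma for a 2x2 information matrix I."""
    (a, b), (c, d) = info
    det = a * d - b * c
    if det <= 0:
        raise ValueError("information matrix must have a positive determinant")
    s1, s2 = sigma_vec
    # Quadratic form sigma' I^{-1} sigma via the closed-form inverse of a 2x2 matrix.
    quad = (d * s1 * s1 - (b + c) * s1 * s2 + a * s2 * s2) / det
    return sigma2 - quad


def standardized_statistic(t_n, n, sigma2, sigma_vec, info):
    """Return T_n / (sqrt(n) * nu_hat), the quantity in (4)."""
    nu2 = nu_squared(sigma2, sigma_vec, info)
    if nu2 <= 0:
        raise ValueError("estimated nu^2 is not positive")
    return t_n / (math.sqrt(n) * math.sqrt(nu2))


# Made-up inputs, for illustration only.
z = standardized_statistic(t_n=10.0, n=100, sigma2=3.0,
                           sigma_vec=(1.0, 2.0), info=[[2.0, 0.0], [0.0, 4.0]])
print(z)
```

The resulting value is then compared with the standard normal distribution, as stated in Theorem 2.1.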

4 Simulation study

To assess the performance of the proposed test, we carried out two simulation studies, one under the Poisson INAR(1) model and the other under the geometric INAR(1) model. First, we simulated observations from the Poisson INAR(1) model with various combinations of the parameters. The test statistic was computed for samples of sizes 50, 100, 300, 500, 1000 and 5000, with 1000 simulations in each case. The distribution of \(\phi _{t}\) under the alternative hypothesis is assumed to be Beta(a, b). The empirical level and power of the test under the Poisson INAR(1) setup are reported in Tables 1 and 2, respectively, for various combinations of \(\lambda \), a and b. The test maintains the level, and the power tends to one as the sample size increases. However, for small sample sizes such as 50 or 100, the test performs less well. For higher values of \(\phi \) and smaller sample sizes, the level is distorted because these values of \(\phi \) are close to the non-stationary region; even in this case, the test maintains the level and achieves power as the sample size increases.
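For reference, a Beta(a, b) alternative fixes both the mean and the variance of the thinning parameter through the standard Beta moments:

$$\begin{aligned} \phi = E(\phi _t) = \frac{a}{a+b},\qquad \sigma ^2_{\phi } = Var(\phi _t) = \frac{ab}{(a+b)^2(a+b+1)}. \end{aligned}$$

For instance, a = 2 and b = 3 give \(\phi = 0.4\) and \(\sigma ^2_{\phi } = 0.04\); these values are chosen only for illustration and are not necessarily among the combinations used in the tables.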

In the second simulation study, we consider the geometric INAR(1) model, with the same sample sizes and number of simulations for uniformity. The empirical size and power of the test are given in Tables 3 and 4, respectively, and similar conclusions can be drawn from them. Here too, the distribution of \(\alpha _{t}\) under the alternative hypothesis is assumed to be Beta with parameters (a, b). The power of the test is computed for various combinations of \(\mu \), a and b. In this case, we cannot choose higher values of \(\alpha \) (such as 0.6, 0.7, 0.8) together with smaller values of \(\mu ,\) because of the restriction \(\alpha \le {\mu }/{(1+\mu )}\). Therefore, we have chosen higher values of \(\alpha \) only for higher values of \(\mu \) (see Table 3). The proposed test works well in these cases too.
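The restriction \(\alpha \le \mu /(1+\mu )\) is what keeps the mixing weight \(\alpha \mu /(\mu -\alpha )\) in the innovation distribution of Sect. 3.2 within [0, 1]. A minimal numerical check, with a hypothetical function name and illustrative parameter values, compares the truncated moments of the innovation p.m.f. with the stated mean and variance:

```python
def innovation_pmf(l, alpha, mu):
    """P(eps_t = l) for the geometric INAR(1) model; requires 0 <= alpha <= mu / (1 + mu)."""
    if mu <= 0 or not 0 <= alpha <= mu / (1 + mu):
        raise ValueError("need mu > 0 and 0 <= alpha <= mu / (1 + mu)")
    w = alpha * mu / (mu - alpha)  # mixing weight; the restriction keeps it in [0, 1]
    geom_mu = (mu / (1 + mu)) ** l / (1 + mu)
    geom_alpha = (alpha / (1 + alpha)) ** l / (1 + alpha)
    return (1 - w) * geom_mu + w * geom_alpha


mu, alpha = 2.0, 0.5  # illustrative values with alpha <= mu / (1 + mu) = 2/3
probs = [innovation_pmf(l, alpha, mu) for l in range(400)]  # tail beyond 400 is negligible
total = sum(probs)
mean = sum(l * p for l, p in enumerate(probs))
var = sum(l * l * p for l, p in enumerate(probs)) - mean ** 2
print(total)                                                           # total mass
print(mean, (1 - alpha) * mu)                                          # numerical vs stated mean
print(var, (1 + alpha) * mu * ((1 + mu) * (1 - alpha) - alpha))        # numerical vs stated variance
```

For a pair such as \(\alpha = 0.8\), \(\mu = 1\), the weight exceeds one and the mixture is no longer a valid p.m.f., which is why the function rejects it.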

The third simulation study compares the proposed test with the test of Zhao and Hu (2015), who proposed a test for the same hypothesis using the least squares estimator of \( \sigma _{\phi }^{2} \) as the test statistic. We found that their test does not perform as well as the test suggested in this paper in terms of maintaining the level and achieving power. The simulation results given in their paper are not convincing, as the range of \( \phi \) will not lie in [0, 1) for the values of \( \sigma _{\phi }^{2} \) chosen by them. For comparison, we computed the empirical level and power of their test for various values of \(\phi \) under the Poisson INAR(1) model at the 5% nominal level, using the same Beta(a, b) distribution for \(\phi _{t}\) under the alternative hypothesis. The results are reported in Tables 5 and 6. For the computation of the level, we used the same parameter combinations, sample sizes and number of simulations as in their paper. Comparing these results with the level and power of our test in Tables 1 and 2, it can be seen that the test suggested in this paper is better at maintaining the level and achieving power than that of Zhao and Hu (2015).

5 Applications

We consider three applications of the above discussed testing procedure for data from real life situations.

5.1 Dengue data

We consider the weekly time series (753 weeks) of dengue cases reported in the city of Hamburg, Germany, for the period 2001 to 2015. These data are available from the Robert Koch Institute website https://survstat.rki.de. The mean and variance of the data are 0.4063 and 0.5341, respectively. From the mean, the variance and the correlation plots in Fig. 1, it can be seen that the data can be modeled by a Poisson INAR(1) model. The time series plot in Fig. 1 casts doubt on the constancy of the thinning parameter, and it is therefore of interest to test the hypothesis of constancy. The value of the test statistic in (4) for testing \(H_0:\sigma ^2_{\phi } = 0\) against \(H_1:\sigma ^2_{\phi } > 0\) turned out to be 3.5447, with p value 0.0003, leading to rejection of the null hypothesis at the 5% level of significance. Therefore, it is better to model these data with an RCINAR(1) model rather than an INAR(1) model.

Fig. 1 Time series, ACF and PACF plots of dengue data

5.2 Tuberculosis data

These data consist of 521 weekly counts of tuberculosis cases reported in the city of Freiburg in the state of Baden-Württemberg, Germany, from 2001 to 2010, available at https://survstat.rki.de. The mean and variance are 2.7428 and 3.4106, respectively. From these values and the correlation structure in Fig. 2, it may be concluded that the data are overdispersed and have an AR(1) structure. Therefore, a geometric INAR(1) model would be appropriate for these data. The time series plot shows that the process has large variation, and we are interested in testing whether this variation can be attributed to randomness of the thinning parameter. We therefore test \(H_0:\sigma _{\phi }^{2}=0\) against \(H_1:\sigma ^2_{\phi } > 0\). The value of the test statistic in (4) turned out to be 2.4081 with p value 0.0160, which leads to rejection of the null hypothesis at the \(5\%\) level of significance. Thus, in this case a geometric RCINAR(1) model would be more appropriate than the geometric INAR(1) model.
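A p value for the standardized statistic in (4) follows from the standard normal limit in Theorem 2.1. The sketch below assumes a two-sided normal tail; this convention is an assumption made here because it appears to be consistent with the value reported for the tuberculosis data, and it is not stated explicitly in the text. The function name is hypothetical.

```python
import math


def two_sided_normal_p(z):
    """Two-sided p value P(|N(0, 1)| > |z|), computed with the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Statistic reported above for the tuberculosis data.
print(round(two_sided_normal_p(2.4081), 4))
```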

Fig. 2 Time series, ACF and PACF plots of tuberculosis data

5.3 Poliomyelitis data

These data consist of 168 monthly counts of poliomyelitis cases in the USA for the period 1970 to 1983; see Zeger (1988). The mean and variance are 1.3413 and 3.5153, respectively. From the correlation structure in Fig. 3 and the mean and variance, it can be concluded that the data are overdispersed and have an AR(1) structure. Hence, a geometric INAR(1) model would be suitable in this case. We carried out the test of \(H_0:\sigma _{\phi }^{2}=0\) against \(H_1:\sigma ^2_{\phi } > 0\). The value of the test statistic in (4) turned out to be 0.7905 with p value 0.3739, and hence we cannot reject the null hypothesis of a constant thinning parameter at the \( 5\% \) level of significance. We also applied the test for the Poisson RCINAR(1) model to the data; the value of the test statistic turned out to be 2.5179 with p value 0.1125. This implies that, although the data are overdispersed, the overdispersion is not due to randomness in the thinning parameter.

Fig. 3 Time series, ACF and PACF plots for poliomyelitis data

6 Conclusions

In this paper we have proposed a locally most powerful-type test for the constancy of the thinning parameter of an RCINAR(1) process. The suggested test works well for RCINAR(1) models with Poisson and geometric marginals. It is clear from the applications that the thinning parameter need not remain constant over time. Therefore, such a test should be conducted whenever random variation is suspected. A more complex RCINAR model is necessary only when the proposed test rejects the null hypothesis. The test suggested in this paper is not a general test for overdispersion; rather, it tests for overdispersion caused by randomness of the thinning parameter. Note that overdispersion may arise in several other ways.