Abstract
Integer-valued autoregressive models are widely used for modeling time-dependent count data. Many inference problems related to these models are not yet addressed, owing to the complexities of the associated distribution theory. In this paper, we consider one such inference problem. For a random coefficient integer-valued autoregressive model, we develop a locally most powerful-type test for testing the hypothesis that the thinning parameter is constant over time. The asymptotic distribution of the suggested test statistic is derived. The Poisson and geometric INAR(1) models are considered to illustrate the suggested methodology. Simulation studies indicate that the suggested test performs quite well. We apply our methods to count time series data sets in which the thinning parameter is suspected to vary.
1 Introduction
Recently, there has been a growing interest in modeling non-negative integer-valued time series, especially time series of counts. Several models have been proposed in the literature; in particular, the INteger-valued AutoRegressive (INAR) model has been the subject of study in various research papers. The motivation for these integer-valued data models comes from the need to account for the discrete nature of certain data sets, often counts of events, objects or individuals. The application areas of such integer-valued time series include epidemiology, actuarial statistics, neurobiology, psychometry, etc. See, for example, Ristić et al. (2009), Moriña et al. (2011), Park and Kim (2012), Maiti et al. (2015) and Davis et al. (2016).
A first-order integer-valued autoregressive model (INAR(1)) is defined as
$$\begin{aligned} X_t = \phi \circ X_{t-1} + Z_t, \quad t = 1, 2, \ldots , \end{aligned}$$
(1)
where \( \phi \in [0, 1)\), \(\{Z_t\}\) is a sequence of independent and identically distributed (i.i.d.) non-negative integer-valued random variables with probability mass function \(f_z\), \(E(Z_t) = \lambda \), \(Var(Z_t) = \sigma ^2_z\), and \(\{Z_t\}\) is independent of \(X_0\) for all t. The symbol ‘\(\circ \)’ stands for a thinning operator, which, conditional on \(X_{t-1}\), is defined as
$$\begin{aligned} \phi \circ X_{t-1} = \sum \limits _{i=1}^{X_{t-1}} B_i, \end{aligned}$$
where \(\{B_i, i=1,2,\ldots ,X_{t-1}\}\) are i.i.d. Bernoulli random variables with parameter \(\phi .\) This model has a lot of similarity to the ordinary AR(1) model for a continuous time series. As an example of a standard INAR(1) model, one may consider \(X_t\) as the number of surviving cancer patients in a hospital at time t, \(\phi \) as the probability of survival from time \(t-1\) to t, and \(Z_t\) as the number of new cancer patients admitted at time t (Zheng et al. 2006). Thus, \(\{X_t\}\) can be considered as a branching process with immigration. For more details and further examples, we refer to the papers by McKenzie (1985a, b, 1986, 1987, 1988a, b), Al-osh and Alzaid (1987, 1991, 1992), Alzaid and Al-osh (1988), Bouzar and Jayakumar (2008), Kim and Weiß (2013), Kashikar et al. (2013) and Khoo et al. (2015), among others. In fact, a web page on integer-valued time series is available at https://sites.google.com/site/integervaluedtimeseries/.
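To make the thinning recursion concrete, the following is a minimal simulation sketch (ours, not part of the paper), assuming Poisson(\(\lambda \)) innovations as in the standard Poisson INAR(1) model; the function name `simulate_inar1` and the NumPy-based implementation are our own choices.

```python
import numpy as np

def simulate_inar1(n, phi, lam, x0=0, seed=0):
    """Simulate X_t = phi o X_{t-1} + Z_t with binomial thinning and Poisson(lam) Z_t."""
    rng = np.random.default_rng(seed)
    x = np.empty(n + 1, dtype=np.int64)
    x[0] = x0
    for t in range(1, n + 1):
        survivors = rng.binomial(x[t - 1], phi)   # phi o X_{t-1}: each unit survives w.p. phi
        x[t] = survivors + rng.poisson(lam)       # add the innovation Z_t
    return x[1:]
```

Long simulated paths fluctuate around the stationary mean \(\lambda /(1-\phi )\), which provides a quick sanity check on the implementation.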
It may be noted that the thinning parameter \(\phi \) need not remain constant; it may vary with time. For example, in the above discussion, the survival probability \(\phi \) need not be constant throughout the time period. Treating the parameter \(\phi \) as a random variable defined on [0, 1), Zheng et al. (2007) have introduced a random coefficient integer-valued autoregressive model of order one (RCINAR(1)). This may be defined as follows:
$$\begin{aligned} X_t = \phi _t \circ X_{t-1} + Z_t, \quad t = 1, 2, \ldots , \end{aligned}$$
(2)
where \(\{\phi _t\}\) is a sequence of i.i.d. random variables defined on [0, 1). Let \(\phi = E(\phi _t)\) and \(\sigma ^2_{\phi } = Var(\phi _t).\) The random variable \(X_0\) is assumed to be independent of \(\{\phi _t\},\) which in turn is independent of \(\{Z_t\}.\) Zheng et al. (2007) have studied this model extensively and discussed several of its properties, including Markov chain properties such as periodicity, positive recurrence and ergodicity, as well as the conditional and unconditional mean and variance, covariance and autocorrelation. They further derived conditional least squares and quasi-likelihood estimators of the model parameters and established their asymptotic properties. Zheng et al. (2006) have extended all the above results to a p-th order random coefficient integer-valued autoregressive (RCINAR(p)) model, and modeled two interesting real data sets under this setup. Zhang et al. (2011) have studied the empirical likelihood method for the estimation of an RCINAR(1) process. One problem of interest under this setup is, given the data \(\{X_1,X_2,\ldots ,X_n\},\) to test the hypothesis that there is no time variation in the thinning parameter \(\phi \) (the series is INAR(1)) against the alternative that it varies randomly over time (it is RCINAR(1)). This is essentially the same as testing \(H_0:\sigma ^2_{\phi } = 0\) against \(H_1:\sigma ^2_{\phi } > 0\). This problem is of great interest because, once it is known that there is no stochastic time variation in the thinning parameter \(\phi \), the inference procedures are somewhat easier than in the case when the thinning parameter is stochastic and time varying. Therefore, in this paper we address this testing problem.
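The RCINAR(1) mechanism differs from INAR(1) only in that a fresh thinning probability is drawn at each step. Here is a hedged simulation sketch (our own code), assuming a Beta(a, b) law for \(\phi _t\), the same family used in the simulation study of Sect. 4, and Poisson innovations; all names are our own.

```python
import numpy as np

def simulate_rcinar1(n, a, b, lam, x0=0, seed=0):
    """Simulate X_t = phi_t o X_{t-1} + Z_t with phi_t ~ Beta(a, b) i.i.d."""
    rng = np.random.default_rng(seed)
    x = np.empty(n + 1, dtype=np.int64)
    x[0] = x0
    for t in range(1, n + 1):
        phi_t = rng.beta(a, b)                    # fresh random thinning probability
        x[t] = rng.binomial(x[t - 1], phi_t) + rng.poisson(lam)
    return x[1:]
```

Under \(H_0\) (\(\sigma ^2_{\phi } = 0\), a degenerate Beta) this reduces to the INAR(1) recursion; under the alternative the extra variability of \(\phi _t\) inflates the conditional variance of the series.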
It may be possible to construct a likelihood ratio test for the above testing problem. However, the likelihood ratio approach has a serious setback: since the true value of the parameter under the null hypothesis lies on the boundary of the parameter space, the asymptotics will not be smooth. Therefore, in this paper, we develop a locally most powerful-type test for testing this hypothesis. Kale and Ramanathan (1997) have proposed a similar test for testing the randomness of the environment in a branching process. Incidentally, Kang and Lee (2009) have considered a change-point problem in an RCINAR(1) model and have employed the cumulative sum (CUSUM) test based on the conditional least-squares and modified quasi-likelihood estimators. Han and McCabe (2013) have considered the problem of testing the constancy of the parameter for a general class of non-Gaussian time series, which includes integer-valued time series. They have advocated the use of the two-sided CUSUM test for parameter constancy proposed by Brown et al. (1975). These authors were interested in structural break-type models, where the parameters change at certain specific (unknown) time points. Our approach allows the parameters to be completely random under the alternative hypothesis. Schweer and Weiß (2014) proposed a test for overdispersion in the INAR(1) process, and Meintanis and Karlis (2014) proposed a test of the hypothesis that the innovation distribution belongs to the family of Poisson stopped-sum distributions. If \( \phi _{t}\) in model (2) is random, then it leads to overdispersion, as the variance of the series involves both \( \sigma _{\phi }^2\) and the variance of the innovation random variable. The tests proposed in these two papers can also be used to test for overdispersion in count time series data, but they do not specifically test whether the overdispersion is due to randomness in the thinning parameter.
Our test is specifically designed to detect randomness in the thinning parameter of the model. Zhao and Hu (2015) have proposed a test for the randomness of the coefficient of an RCINAR(1) process, but it is based on the least squares estimate of the parameter \( \sigma ^2_{\phi } \).
This paper is organized as follows. In Sect. 2, we derive the test statistic and its asymptotic distribution theory under a general setup. Section 3 specializes these results to the Poisson and geometric INAR(1) models. A simulation study is reported in Sect. 4 to judge the performance of the suggested test. Section 5 consists of various applications and data analysis. Section 6 concludes the paper.
2 Test statistic and its asymptotic distribution
2.1 RCINAR(1) model
Let \(\{X_1,X_2,\ldots ,X_n\}\) be the time series data from an RCINAR(1) model which satisfies (2). Let \(\{\phi _{t}\} \) be an i.i.d. sequence of random variables with probability distribution \( P_{\phi } \). Note that, conditional on \(\{\phi _{t}\} \), \( \phi _{t} \circ X_{t-1}\) is binomial with parameters (\(X_{t-1},\phi _{t}\)) and hence conditional on both (\(X_{t-1},\phi _{t}\))=(\(x,\phi \)), the distribution of \( X_{t}\) is given by \( f_z(X_t-\phi \circ x). \)
Before deriving the test statistic, we state the following regularity conditions:
- C1: The probability mass function (p.m.f.) \(f_z\) of \(Z_t = X_t - \phi \circ X_{t-1}\) is such that \(\log f_z\) is thrice differentiable with respect to \(\phi \) and \(\lambda \), and the derivatives (partial and mixed) are bounded in a neighbourhood around \((\phi , \lambda ).\)
- C2: Thrice differentiation of \(f_z\) with respect to \((\phi , \lambda )\) under the integral sign is permitted.
- C3: The distribution \( P_{\phi }\) of \(\phi _{t}\) is such that \(E{{|{\phi _t}|}}^{3} < \infty . \)
Since the \(Z_t\)’s are i.i.d. with probability mass function \(f_z\), we can write down the likelihood function \(L_{H_1}\) for the RCINAR(1) model as
and
Therefore,
It is assumed that the distribution of \( X_{0} \) does not depend on \( (\phi , \sigma _{\phi }^{2}) \). Expanding \(L_{H_1}\) around \(\phi ,\) the mean of \(\phi _t,\) using a Taylor series expansion yields
Therefore,
which results in the test statistic
$$\begin{aligned} T_n(\underline{\theta }) = \sum \limits _{t=1}^{n} Y_t(\underline{\theta }), \end{aligned}$$
where \(Y_t(\underline{\theta }) = \frac{1}{2}\,{f^{''}(X_t|X_{t-1})}/{f(X_t|X_{t-1})}\), the second derivative of the transition p.m.f. being taken with respect to \(\phi \) (see Appendix 2).
The actual test statistic may be obtained by replacing \(\underline{\theta } = (\phi , \lambda )^{T}\) with its maximum likelihood estimator \(\underline{\hat{\theta }} = (\hat{\phi }, \hat{\lambda })^{T}.\) It can be easily verified that \(\{Y_t(\underline{\theta }); \mathcal{F}_{t}^{Y}\}\) is a zero-mean martingale and hence we have the following lemma (Basawa and Prakasa Rao 1980).
Lemma 2.1
Under the following assumptions:
A1. \(E|Y_t(\underline{\theta })|^{2+\delta } \le \Delta < \infty ,~\forall ~ t \ge 1,\) \(\Delta \) a constant, \(\delta >0 \); and
A2. \(\lim _{n \rightarrow \infty } \frac{1}{n}\sum \limits _{t=1}^{n}E(Y_t^2(\underline{\theta })|\mathcal{F}_{t-1}^Y) = \sigma ^2(\underline{\theta }) ~~a.s.,\)
we have \(\frac{1}{\sqrt{n}}\sum \limits _{t=1}^{n} Y_t(\underline{\theta }) \mathop {\rightarrow }\limits ^{d} N(0, \sigma ^2(\underline{\theta })).\)
The asymptotic normality of the test statistic is considered in the following theorem whose proof is deferred to Appendix 1.
Theorem 2.1
Under the regularity conditions C1–C3 and the assumptions A1–A2 of Lemma 2.1,
$$\begin{aligned} (\sqrt{n}~\hat{\nu })^{-1}\, T_n(\underline{\hat{\theta }}) \mathop {\rightarrow }\limits ^{d} N(0, 1), \end{aligned}$$
where
(The expressions for \(I(\underline{\theta })\) and \(\underline{\sigma }(\underline{\theta })\) are derived in Appendix 1.)
2.2 RCINAR(p) model
Following Zheng et al. (2006), we define a p-th order random coefficient integer-valued autoregressive (RCINAR(p)) model as
$$\begin{aligned} X_t = \phi _{1t} \circ X_{t-1} + \phi _{2t} \circ X_{t-2} + \cdots + \phi _{pt} \circ X_{t-p} + Z_t, \quad t = 1, 2, \ldots , \end{aligned}$$
where \({\phi _t} =(\phi _{1t},\phi _{2t}, \ldots ,\phi _{pt})'\) is a \(p\times 1\) random vector with \(E({\phi _t}) = {\phi } =(\phi _{1},\phi _{2}, \ldots ,\phi _{p})'\) and variance-covariance matrix \(\Sigma _{\phi }\). The random vectors \({ \phi _t},~t = 1,2,\ldots ,n\) are assumed to be i.i.d. and defined on \([0,1)^p\). Our interest here is to examine whether the variation in \({ \phi }\) is significant or not, that is, to test the hypothesis \(H_0 : {\phi _t = \phi }~~\text{ for }~~ t = 1,2,\ldots ,n\) (non-random) against \(H_1 :{\phi _t}\) is random as described above. This is equivalent to testing \(H_0 : {\Sigma _{\phi } = 0}\) against \(H_1 : {\Sigma _{\phi } \ne 0}\). Under the assumption that \(\Sigma _{\phi }\) is a diagonal matrix, we may state the hypotheses as \(H_0 : \sigma ^2_{\phi _i}= 0,~ i = 1, 2, \ldots , p\) against \(H_1 : \sigma ^2_{\phi _i} > 0 \) for some i.
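A sketch of how such an RCINAR(p) path might be simulated under the diagonal-\(\Sigma _{\phi }\) assumption above, with independent Beta-distributed coefficients and Poisson innovations; the function name and distributional choices are our own illustration, not the paper's.

```python
import numpy as np

def simulate_rcinar_p(n, beta_params, lam, seed=0):
    """Simulate X_t = sum_i phi_{it} o X_{t-i} + Z_t with independent Beta coefficients
    (diagonal Sigma_phi) and Poisson(lam) innovations; needs sum_i E(phi_{it}) < 1."""
    rng = np.random.default_rng(seed)
    p = len(beta_params)
    x = [0] * p                                   # initialize the first p lags at zero
    for _ in range(n):
        total = rng.poisson(lam)                  # innovation Z_t
        for i, (a, b) in enumerate(beta_params):
            phi_it = rng.beta(a, b)               # random coefficient for lag i + 1
            total += rng.binomial(x[-1 - i], phi_it)
        x.append(int(total))
    return np.array(x[p:])
```

With, say, two Beta(1, 4) coefficients (each with mean 0.2), the path fluctuates around \(\lambda /(1 - 0.4)\), matching the stationary mean of the model.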
Now, proceeding on the similar lines as in RCINAR(1), it is possible to consider RCINAR(p) model and arrive at the test statistic,
An appropriate quadratic form of this vector can be proved to have a chi-square distribution with p degrees of freedom asymptotically.
3 Illustrations
3.1 Poisson INAR(1) model
In this section we derive the test statistic in the case of a Poisson INAR(1) model. Let \(Z_t\) in model (1) be distributed as Poisson(\(\lambda \)), that is, \(\{X_t\}\) is a Poisson INAR(1) model. Then
where
and
The test statistic in this case is
with
and
For the Poisson INAR(1) model, all the regularity conditions are satisfied. Further, \(\{X_t\}\) is a stationary and ergodic process (Freeland and McCabe 2004). Hence, by applying Theorem 2.1, we can claim the asymptotic normality of the test statistic \(T_n(\underline{\theta }).\) However, the computations of the terms involved in the asymptotic variance of the normal distribution are messy. Therefore, we adopt a computational procedure for evaluating the elements of the empirical Fisher information matrix, as suggested by Freeland and McCabe (2004). These elements will converge to the theoretical quantities as the sample size increases to infinity (by the strong law of large numbers). The details are deferred to Appendix 2.
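As an illustration of how the Poisson INAR(1) quantities can be evaluated numerically, the following sketch (ours, not the authors' code) computes the transition p.m.f. \(f(X_t|X_{t-1})\) as a binomial–Poisson convolution and the statistic \(T_n(\underline{\theta }) = \sum _t Y_t(\underline{\theta })\) with \(Y_t = \frac{1}{2} f^{''}/f\), using the score expression of Appendix 2; for \(f^{''}\) we use the identity \(f^{''} = X_{t-1}(X_{t-1}-1)[f(X_t-2|X_{t-1}-2) - 2f(X_t-1|X_{t-1}-2) + f(X_t|X_{t-1}-2)]\), obtained by differentiating the binomial weights twice, which is algebraically equivalent to the Appendix 2 form. The standardization by \(\hat{\nu }\) is omitted here.

```python
from math import comb, exp, factorial

def f_cond(x, y, phi, lam):
    """Transition pmf f(x | y) of Poisson INAR(1): binomial(y, phi) + Poisson(lam)."""
    if x < 0 or y < 0:
        return 0.0
    return sum(comb(int(y), k) * phi**k * (1.0 - phi)**(y - k)
               * exp(-lam) * lam**(x - k) / factorial(int(x) - k)
               for k in range(min(int(x), int(y)) + 1))

def score_phi(xt, y, phi, lam):
    """l-dot_t(phi) as given in Appendix 2."""
    f = f_cond(xt, y, phi, lam)
    return (y / (1.0 - phi)) * (f_cond(xt - 1, y - 1, phi, lam) - f) / f

def hess_phi(xt, y, phi, lam):
    """l-ddot_t(phi phi) = f''/f - (f'/f)^2, with f'' from the binomial-weight identity."""
    f = f_cond(xt, y, phi, lam)
    fpp = y * (y - 1) * (f_cond(xt - 2, y - 2, phi, lam)
                         - 2.0 * f_cond(xt - 1, y - 2, phi, lam)
                         + f_cond(xt, y - 2, phi, lam))
    return fpp / f - score_phi(xt, y, phi, lam) ** 2

def T_n(x, phi, lam):
    """Equation (7): T_n = (1/2) sum_t [ l-ddot_t(phi phi) + l-dot_t(phi)^2 ] = sum_t Y_t."""
    return 0.5 * sum(hess_phi(x[t], x[t - 1], phi, lam)
                     + score_phi(x[t], x[t - 1], phi, lam) ** 2
                     for t in range(1, len(x)))
```

Under \(H_0\), \(E(Y_t) = 0\), so \(T_n\) fluctuates around zero; large positive standardized values indicate a random thinning parameter.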
3.2 Geometric INAR(1) model
If the count time series data are overdispersed, then the usual Poisson INAR(1) model is not suitable; in such situations, we may use geometric INAR(1) models. The geometric INAR(1) process with negative binomial thinning proposed by Ristić et al. (2009) is given by
$$\begin{aligned} X_t = \alpha * X_{t-1} + \epsilon _t, \quad t = 1, 2, \ldots , \end{aligned}$$
where \( \alpha *X=\sum \nolimits _{i=1}^{X} W_{i},\quad \alpha \in [0,1) \), \( \{X_{t}\} \) is a stationary process with \( P(X_{t}=x)=\mu ^{x}/(1+\mu )^{x+1},~x=0,1,2,\ldots \), \( \{ W_{i} \} \) is a sequence of i.i.d. random variables with geometric (\(\alpha /(1+\alpha \))) distribution, and \( \{ \epsilon _{t} \} \) is independent of \( W_{i} \) and \( X_{t-l} \) for all \( l\ge 1 \). The distribution of \(\epsilon _{t}\) is
$$\begin{aligned} P(\epsilon _{t}=l)=\left( 1-\frac{\alpha \mu }{\mu -\alpha }\right) \frac{\mu ^{l}}{(1+\mu )^{l+1}}+\frac{\alpha \mu }{\mu -\alpha }\,\frac{\alpha ^{l}}{(1+\alpha )^{l+1}},\quad l=0,1,2,\ldots \end{aligned}$$
The mean and variance of \(\epsilon _{t}\) are \(\mu _{\epsilon }=(1-\alpha )\mu \) and \( \sigma _{\epsilon }^2=(1+\alpha )\mu ((1+\mu )(1-\alpha )-\alpha ) \) respectively. The conditional p.m.f. is,
Therefore, by taking \( \underline{\theta }={(\mu , \alpha )}^{\tau }\), the test statistic \(T_{n}(\underline{\theta }) \) can be obtained as
where
with,
and
For the geometric INAR(1) model, it is possible to establish the asymptotic normality of \( T_{n}(\underline{\theta }) \) using similar arguments as in the Poisson INAR(1) case. Here too, the terms involved in the elements of the variance-covariance matrix of the asymptotic normal distribution are messy. An approach similar to that of Freeland and McCabe (2004) is adopted to obtain the empirical Fisher information matrix. The details are deferred to Appendix 2.
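For completeness, a simulation sketch of the NGINAR(1) process (our own code, not the authors'): the innovation \(\epsilon _t\) is drawn from the two-component geometric mixture of Ristić et al. (2009) with mixing weight \(\alpha \mu /(\mu -\alpha )\), which reproduces the innovation mean \((1-\alpha )\mu \) quoted above; all function names are ours.

```python
import numpy as np

def nb_thin(x, alpha, rng):
    """alpha * X: sum of X i.i.d. geometric(alpha/(1+alpha)) counts on {0, 1, 2, ...}."""
    if x == 0:
        return 0
    # numpy's geometric lives on {1, 2, ...}; subtracting 1 per draw shifts it to {0, 1, ...}
    return int(rng.geometric(1.0 / (1.0 + alpha), size=x).sum() - x)

def simulate_nginar1(n, alpha, mu, seed=0):
    """Simulate X_t = alpha * X_{t-1} + eps_t with eps_t from the geometric mixture
    of Ristic et al. (2009); requires alpha <= mu / (1 + mu)."""
    rng = np.random.default_rng(seed)
    w = alpha * mu / (mu - alpha)                 # mixing weight of the alpha-component
    x = [0]
    for _ in range(n):
        scale = alpha if rng.random() < w else mu # choose the mixture component
        eps = int(rng.geometric(1.0 / (1.0 + scale))) - 1  # geometric on {0,1,...}, mean scale
        x.append(nb_thin(x[-1], alpha, rng) + eps)
    return np.array(x[1:])
```

The restriction \(\alpha \le \mu /(1+\mu )\), mentioned in the simulation study of Sect. 4, is exactly what keeps the mixing weight w a valid probability; long paths fluctuate around the stationary mean \(\mu \).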
3.3 Procedure to compute the test statistics
-
Step 1 Estimate parameter values from the time series data {\(X_{1},X_{2},\ldots ,X_{n}\)}.
-
Step 2 Compute the first and second order partial derivatives of the likelihood with respect the parameters in the model.
-
Step 3 Compute \( T_{n}(\underline{\theta })\) using (7) for Poisson INAR(1) model or (8) for geometric INAR(1) model (see Appendix 2).
-
Step 4 Compute \( \sigma _1(\underline{\theta }),\sigma _2({\underline{\theta }}), \sigma ^2(\underline{\theta }) \) and elements of \(I(\underline{\theta })\).
-
Step 5 Compute \( \hat{\nu }^{2} \) as given in Theorem 2.1 and using this, compute the value of \((\sqrt{n}~\hat{\nu })^{-1} T_n(\underline{\hat{\theta }})\).
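Step 1 for the Poisson INAR(1) case can be sketched as a conditional maximum likelihood fit. This is our own illustration, assuming the transition p.m.f. is the binomial–Poisson convolution; the helper names and the use of scipy.optimize.minimize are our choices, not the paper's, and the p.m.f. is restated so the sketch is self-contained.

```python
import numpy as np
from math import comb, exp, factorial, log
from scipy.optimize import minimize

def f_cond(x, y, phi, lam):
    """Transition pmf of Poisson INAR(1): binomial(y, phi) thinning plus Poisson(lam)."""
    if x < 0 or y < 0:
        return 0.0
    return sum(comb(int(y), k) * phi**k * (1.0 - phi)**(y - k)
               * exp(-lam) * lam**(x - k) / factorial(int(x) - k)
               for k in range(min(int(x), int(y)) + 1))

def fit_poisson_inar1(x):
    """Step 1: conditional MLE of (phi, lambda) by maximizing sum_t log f(X_t | X_{t-1})."""
    def nll(theta):
        phi, lam = theta
        return -sum(log(f_cond(x[t], x[t - 1], phi, lam)) for t in range(1, len(x)))
    res = minimize(nll, x0=np.array([0.3, max(float(np.mean(x)), 0.1)]),
                   bounds=[(1e-4, 1.0 - 1e-4), (1e-4, None)])
    return res.x  # (phi_hat, lambda_hat)
```

The fitted \(\underline{\hat{\theta }}\) is then plugged into the statistic of Step 3 and the variance terms of Steps 4–5.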
4 Simulation study
In order to assess the performance of the proposed test statistic, we have carried out two simulation studies, one under the Poisson INAR(1) model and another under the geometric INAR(1) model. First, we simulate observations from the Poisson INAR(1) model with various combinations of the parameters. The test statistic is computed for samples of sizes 50, 100, 300, 500, 1000 and 5000, and 1000 simulations were carried out in each case. The distribution of \(\phi _{t}\) is assumed to be Beta(a, b) under the alternative hypothesis. The empirical level and power of the test are reported in Tables 1 and 2, respectively, under the Poisson INAR(1) setup for various combinations of \(\lambda , a~ \text{ and }~ b\). It can be seen that the test maintains the level, and the power tends to one as the sample size increases. However, for small sample sizes such as 50 or 100, the performance of the suggested test is not as good. For higher values of \(\phi \) and smaller sample sizes, the level is distorted, as these values of \(\phi \) are close to the non-stationary region; but in this case also the test maintains the level and achieves power as the sample size increases.
In the second simulation study, we consider the geometric INAR(1) model. Here also we have used the same sample sizes and the same number of simulations, for uniformity. The empirical size and power of the test are given in Tables 3 and 4, respectively. Similar conclusions can be drawn from these tables. Here too, the distribution of \(\alpha _{t}\) under the alternative hypothesis is assumed to be Beta with parameters (a, b). The power of the test is computed for various combinations of \(\mu \), a and b. In this case, we cannot choose higher values of \(\alpha \) (such as 0.6, 0.7, 0.8) corresponding to smaller values of \(\mu ,\) because of the restriction \(\alpha \le {\mu }/{(1+\mu )}\). Therefore, we have chosen higher values of \(\alpha \) for higher values of \(\mu \) (see Table 3). It is found that the proposed test works well in these cases too.
The third simulation was carried out to compare the proposed test with the test given by Zhao and Hu (2015). These authors have proposed a test for the same hypothesis using the least squares estimator of \( \sigma _{\phi }^{2} \) as the test statistic. In this simulation study, we found that the test of Zhao and Hu (2015), based on the least squares estimator of \( \sigma _{\phi }^{2} \), does not perform well in terms of maintaining the level and achieving power, in comparison with the test suggested in this paper. The simulation results given in their paper are not convincing, as the range of \( \phi \) will not be in [0, 1) for the values of \( \sigma _{\phi }^{2} \) chosen by them. For comparison purposes, we have computed the empirical level as well as the power of their test for various values of \(\phi \) under the Poisson INAR(1) model, at the 5% nominal level. We have considered the same Beta(a, b) distribution for the parameter \(\phi _{t}\) under the alternative hypothesis. The results are reported in Tables 5 and 6. We have used the same parameter combinations, sample sizes and number of simulations as in their paper for the computation of the level. Comparing with the level and power of our test in Tables 1 and 2, it can be clearly seen that the test suggested in this paper is better in terms of maintaining the level and achieving power than that of Zhao and Hu (2015).
5 Applications
We consider three applications of the above discussed testing procedure for data from real life situations.
5.1 Dengue data
The weekly time series data (753 weeks) on dengue cases reported for the period 2001 to 2015 in the city of Hamburg, Germany, are considered. These data are available at the Robert Koch Institute website https://survstat.rki.de. The mean and variance of the data are found to be 0.4063 and 0.5341, respectively. From the mean, variance and correlation plots given in Fig. 1, it can be seen that the data can be modeled by a Poisson INAR(1) model. From the time series plot given in Fig. 1, non-constancy of the thinning parameter may be suspected, and therefore it is of interest to test the hypothesis of constancy of the thinning parameter. The value of the test statistic in (4) for testing \(H_0:\sigma ^2_{\phi } = 0\) against \(H_1:\sigma ^2_{\phi } > 0\) turned out to be 3.5447, with p value 0.0003, indicating rejection of the null hypothesis at the 5% level of significance. Therefore, it is better to model these data with an RCINAR(1) rather than an INAR(1) model.
5.2 Tuberculosis data
These data consist of 521 weekly counts of tuberculosis cases reported in the city of Freiburg in the state of Baden-Württemberg, Germany, from 2001 to 2010, available at https://survstat.rki.de. The mean and variance are found to be 2.7428 and 3.4106, respectively. Using this information and the correlation structure from Fig. 2, it may be concluded that the data are overdispersed and have an AR(1) structure. Therefore, a geometric INAR(1) model would be appropriate for these data. From the time series plot, it is clear that the process has larger variation. We are interested in testing whether this larger variation can be attributed to the randomness of the thinning parameter. We therefore test the hypothesis \(H_0:\sigma _{\phi }^{2}=0\) against \(H_1:\sigma ^2_{\phi } > 0\). The value of the test statistic in (4) turned out to be 2.4081, with p value 0.0160, which indicates rejection of the null hypothesis at the \(5\%\) level of significance. Thus, in this case a geometric RCINAR(1) model would be much more appropriate than the geometric INAR(1).
5.3 Poliomyelitis data
These data consist of 168 monthly counts of poliomyelitis cases in the USA for the period 1970 to 1983; see Zeger (1988). The mean and variance are found to be 1.3413 and 3.5153, respectively. From the correlation structure in Fig. 3 and the mean and variance, it can be concluded that the data are overdispersed and have an AR(1) structure. Hence, a geometric INAR(1) model would be suitable in this case. We carried out the test of \(H_0:\sigma _{\phi }^{2}=0\) against \(H_1:\sigma ^2_{\phi } > 0\). The value of the test statistic in (4) turned out to be 0.7905, with p value 0.3739, and hence we cannot reject the null hypothesis of a constant thinning parameter at the \( 5\% \) level of significance. We also applied the Poisson RCINAR(1) test to the data; the value of the test statistic turned out to be 2.5179, with p value 0.1125. This implies that though the data are overdispersed, the overdispersion is not due to randomness in the thinning parameter.
6 Conclusions
In this paper, we have proposed a locally most powerful-type test for testing the constancy of the thinning parameter of an RCINAR(1) process. The suggested test is found to work well for RCINAR(1) models with Poisson and geometric marginals. It is clear from the applications that the thinning parameter need not remain constant over time. Therefore, it is essential to conduct such a test whenever random variation is suspected. A more complex RCINAR model would be necessary only when the proposed test rejects the null hypothesis. The test suggested in this paper is not a general test of the overdispersion hypothesis; rather, it tests for the presence of overdispersion due to randomness of the thinning parameter. Note that overdispersion may arise in various other ways.
References
Al-osh MA, Alzaid AA (1987) First order integer-valued autoregressive (INAR(1)) processes. J Time Ser Anal 8:261–275
Al-osh MA, Alzaid AA (1991) Binomial autoregressive moving average models. Stoch Models 7:261–282
Al-osh MA, Alzaid AA (1992) First order autoregressive time series with negative binomial and geometric marginals. Commun Stat Theory Methods 21:2483–2492
Alzaid AA, Al-osh MA (1988) First order integer-valued autoregressive (INAR(1)) processes: distributional and regression properties. Stat Neerl 42:53–61
Basawa IV, Prakasa Rao BLS (1980) Statistical inference for stochastic processes. Academic Press, London
Bouzar N, Jayakumar K (2008) Time series with discrete semistable marginals. Stat Pap 49:619–635
Brown RL, Durbin J, Evans JM (1975) Techniques for testing the constancy of regression relationships over time. J R Stat Soc Ser B Methodol 37(2):149–192
Davis RA, Holan SH, Lund R, Ravishanker N (2016) Handbook of discrete-valued time series. Chapman and Hall, New York
Freeland RK, McCabe BPM (2004) Analysis of low count time series data by Poisson autoregression. J Time Ser Anal 25:701–722
Han L, McCabe B (2013) Testing for parameter constancy in non-Gaussian time series. J Time Ser Anal 34:17–29
Kale M, Ramanathan TV (1997) A test for randomness of the environments in a branching process. Stat Pap 38:409–421
Kang J, Lee S (2009) Parameter change test for random coefficient integer-valued autoregressive processes with application to polio data analysis. J Time Ser Anal 30(2):239–258
Kashikar A, Rohan N, Ramanathan TV (2013) Integer autoregressive models with structural breaks. J Appl Stat 40(12):2653–2669
Khoo WC, Ong SH, Biswas A (2015) Modeling time series of counts with a new class of INAR(1) model. Stat Pap. doi:10.1007/s00362-015-0704-0
Kim Y-H, Weiß CH (2013) Parameter estimation for binomial AR(1) models with applications in finance and industry. Stat Pap 54:563–590
Maiti R, Biswas A, Das S (2015) Time series of zero inflated counts and their coherent forecasting. J Forecast 34:694–707
McKenzie E (1985a) Contribution to the discussion of Lawrence and Lewis. J R Stat Soc B 47:187–188
McKenzie E (1985b) Some simple models for discrete variate time series. Water Resour Bull 21:645–650
McKenzie E (1986) Autoregressive moving-average processes with negative-binomial and geometric marginal distributions. Adv Appl Probab 18:679–705
McKenzie E (1987) Innovation distributions for gamma and negative binomial autoregressions. Scand J Stat 14:79–85
McKenzie E (1988a) The distributional structure of finite moving-average processes. J Appl Probab 25:313–321
McKenzie E (1988b) Some ARMA models for dependent sequence of Poisson counts. Adv Appl Probab 20:822–835
Meintanis SG, Karlis D (2014) Validation tests for innovation distribution in INAR time series models. Comput Stat 29:1221–1241
Moriña D, Puig P, Ríos J, Vilella A, Trilla A (2011) A statistical model for hospital admissions caused by seasonal diseases. Stat Med 30:3125–3136
Park Y, Kim Y-H (2012) Diagnostic checks for integer-valued autoregressive models using expected residuals. Stat Pap 53:951–970
Ristić MM, Bakouch HS, Nastić AS (2009) A new geometric first order integer-valued autoregressive (NGINAR(1)) process. J Stat Plan Inference 139:2218–2226
Schweer S, Weiß CH (2014) Compound Poisson INAR(1) processes: stochastic properties and testing for overdispersion. Comput Stat Data Anal 77:267–284
Zeger SL (1988) A regression model for time series of counts. Biometrika 75:621–629
Zhang H, Wang D, Zhu F (2011) The empirical likelihood for first-order random coefficient integer-valued autoregressive processes. Commun Stat Theory Methods 40(3):492–509
Zhao Z, Hu Y (2015) Statistical inference for first order random coefficient integer-valued autoregressive processes. J Inequal Appl. doi:10.1186/s13660-015-0886-y
Zheng H, Basawa IV, Datta S (2006) Inference for p-th order random coefficient integer-valued autoregressive processes. J Time Ser Anal 27:411–440
Zheng H, Basawa IV, Datta S (2007) First-order random coefficient integer-valued autoregressive processes. J Stat Plan Inference 137:212–229
Acknowledgements
The authors are thankful to the editor and referees for their comments. They are also thankful to Dr. A. S. Kashikar for fruitful discussions. Manik Awale would like to acknowledge the University Grants Commission of India for the award of a Research Fellowship under the Faculty Improvement Program. N. Balakrishna would like to acknowledge a grant from the Department of Science & Technology, Government of India, SR/S4/MS-837/13. T. V. Ramanathan’s research was partially supported by a grant from the Department of Science & Technology, Government of India, SR/S4/MS-866/13.
Appendices
Appendix 1: Proof of Theorem 2.1
Using the standard likelihood theory, under the null hypothesis, we can write
where
Now expanding \((1/\sqrt{n})\sum _{t=1}^n Y_t(\underline{\hat{\theta }})\) around \((1/\sqrt{n})\sum _{t=1}^n Y_t(\underline{\theta })\) we have
with \(Y_t^{'}(\underline{\theta }) = \left( \frac{\partial Y_t}{\partial \phi }~ \frac{\partial Y_t}{\partial \lambda }\right) ^{T}.\) The remainder term is negligible in probability, since \(\underline{\hat{\theta }} \mathop {\rightarrow }\limits ^{p} \underline{\theta }.\) In view of the regularity conditions \(C_1\)–\(C_2\), we may replace \(\frac{1}{n}\sum _{t=1}^{n} Y_t^{'}(\underline{\theta })\) by its expectation. Therefore,
Now,
This is because
and
Thus, (6) is asymptotically equivalent to
where
Using regularity conditions and Lemma 2.1, we have
Therefore,
where
Thus, we have proved the theorem.
Appendix 2: Elements of the Information Matrix
1. Poisson INAR(1): We have the log likelihood
$$\begin{aligned} l = \log ~L = \sum \limits _{t=1}^{n}\log ~f(X_t|X_{t-1})=\sum \limits _{t=1}^{n}l_t,\quad \text{ say. } \end{aligned}$$
The first and second derivatives of l with respect to \(\phi \) are given by
$$\begin{aligned} \frac{\partial l}{\partial \phi }\,=\, \mathop {l}\limits ^{.}_{\phi } \,=\, \sum \limits _{t=1}^{n}\frac{1}{f(X_{t}|X_{t-1})}\frac{\partial }{\partial \phi }f(X_{t}|X_{t-1})=\sum \limits _{t=1}^{n}\mathop {l}\limits ^{.}_t(\phi ),\quad \text{ say, } \end{aligned}$$
where
$$\begin{aligned} \mathop {l}\limits ^{.}_t(\phi )\,=\,\frac{X_{t-1}}{1-\phi }\frac{f(X_t-1|X_{t-1}-1)-f(X_t|X_{t-1})}{f(X_t|X_{t-1})}, \end{aligned}$$
and
$$\begin{aligned} \frac{\partial ^2l}{\partial \phi ^2}=\;\mathop {l}\limits ^{..}_{\phi \phi }\;=\;\sum \limits _{t=1}^{n}\left[ \frac{\frac{\partial ^2}{\partial \phi ^2}f(X_t|X_{t-1})}{f(X_t|X_{t-1})} - \left\{ \frac{\frac{\partial }{\partial \phi }f(X_t|X_{t-1})}{f(X_t|X_{t-1})} \right\} ^2\right] =\sum \limits _{t=1}^{n}\mathop {l}\limits ^{..}_t(\phi \phi ),\quad \text{ say }, \end{aligned}$$
where
$$\begin{aligned} \mathop {l}\limits ^{..}_t(\phi \phi )= & {} \frac{1}{\phi ^2(1-\phi )^2}\left[ \frac{2\phi ^2X_{t-1}f(X_t-1|X_{t-1}-1)}{f(X_t|X_{t-1})}-\phi ^2X_{t-1}\right. \\&\left. +\,\frac{\phi ^2X_{t-1}(X_{t-1}-1)f(X_t-2|X_{t-1}-2)}{f(X_t|X_{t-1})}\right. \\&-\left. \left\{ \frac{\phi X_{t-1}f(X_t-1|X_{t-1}-1)}{f(X_t|X_{t-1})} \right\} ^2\right] . \end{aligned}$$
Recall that
$$\begin{aligned} Y_t(\underline{\theta })=\frac{1}{2}\frac{f^{''}(X_t|X_{t-1})}{f(X_t|X_{t-1})} = \frac{1}{2}\frac{f_z^{''}(X_t-\phi \circ X_{t-1})}{f_z(X_t-\phi \circ X_{t-1})}, \end{aligned}$$
where
$$\begin{aligned} \underline{\theta }=(\phi ,\lambda ). \end{aligned}$$
Thus, we have
$$\begin{aligned} \mathop {l}\limits ^{..}_t(\phi \phi )= & {} \frac{\frac{\partial ^2}{\partial \phi ^2}f(X_t|X_{t-1})}{f(X_t|X_{t-1})} - \left\{ \frac{\frac{\partial }{\partial \phi }f(X_t|X_{t-1})}{f(X_t|X_{t-1})} \right\} ^2\\= & {} 2Y_t(\underline{\theta })-\left\{ \frac{\frac{\partial }{\partial \phi }f(X_t|X_{t-1})}{f(X_t|X_{t-1})} \right\} ^2, \end{aligned}$$
or
$$\begin{aligned} Y_t(\underline{\theta }) = \frac{1}{2}\left[ \mathop {l}\limits ^{..}_t(\phi \phi )+\left\{ \mathop {l}\limits ^{.}_t(\phi )\right\} ^2\right] . \end{aligned}$$
And thus, the test statistic is
$$\begin{aligned} T_n(\underline{\theta })=\sum \limits _{t=1}^{n}Y_t(\underline{\theta })=\frac{1}{2}\sum \limits _{t=1}^{n}\left[ \mathop {l}\limits ^{..}_t(\phi \phi )+\left\{ \mathop {l}\limits ^{.}_t(\phi )\right\} ^2\right] . \end{aligned}$$
(7)
Now the derivative of the log likelihood with respect to \(\lambda \) is given by
$$\begin{aligned} \frac{\partial l}{\partial \lambda } = \mathop {l}\limits ^{.}_{\lambda }=\sum \limits _{t=1}^{n} \frac{1}{f(X_t|X_{t-1})} \frac{\partial }{\partial \lambda } f(X_t|X_{t-1}) = \sum \limits _{t=1}^{n} \mathop {l}\limits ^{.}_t(\lambda ),\quad \text{ say, } \end{aligned}$$
where
$$\begin{aligned} \mathop {l}\limits ^{.}_t(\lambda )=\frac{f(X_t-1|X_{t-1})-f(X_t|X_{t-1})}{f(X_t|X_{t-1})}. \end{aligned}$$
Also, the second derivative and the cross derivatives are
$$\begin{aligned}\frac{\partial ^2 l}{\partial \lambda ^2} = \mathop {l}\limits ^{..}_{\lambda \lambda }=\sum \limits _{t=1}^{n}\mathop {l}\limits ^{..}_t(\lambda \lambda ),\quad \text{ say, } \end{aligned}$$
and
$$\begin{aligned} \frac{\partial ^2 l}{\partial \phi \partial \lambda }\,=\, \frac{\partial ^2 l}{\partial \lambda \partial \phi } \,=\, \sum \limits _{t=1}^{n} \mathop {l}\limits ^{..}_t(\phi \lambda ),\quad \text{ say, } \end{aligned}$$
where
$$\begin{aligned} \mathop {l}\limits ^{..}_t(\phi \lambda ) \,= & {} \, \frac{1}{\lambda \phi (1-\phi )} \Bigg [\frac{\lambda \phi X_{t-1}f(X_t-2|X_{t-1}-1)}{f(X_t|X_{t-1})}\\&-\,\frac{\lambda f(X_t-1|X_{t-1})\phi X_{t-1}f(X_t-1|X_{t-1}-1)}{\{f(X_t|X_{t-1})\}^2}\Bigg ] . \end{aligned}$$
The terms appearing in the asymptotic variance \(\nu ^2\) can be derived as
$$\begin{aligned} \sigma ^2(\underline{\theta }) = E\left( Y_t(\underline{\theta })\right) ^2=\frac{1}{4}E\left[ \mathop {l}\limits ^{..}_t(\phi \phi )+\left\{ \mathop {l}\limits ^{.}_t(\phi )\right\} ^2\right] ^2, \end{aligned}$$$$\begin{aligned} \underline{\sigma }(\underline{\theta }) = (\sigma _1(\underline{\theta }), \sigma _2(\underline{\theta }))^{T} \end{aligned}$$and
$$\begin{aligned} I(\underline{\theta }) = \left[ \begin{array}{ll}I_{11}(\underline{\theta }) &{}\quad I_{12}(\underline{\theta })\\ I_{21}(\underline{\theta }) &{}\quad I_{22}(\underline{\theta })\end{array} \right] , \end{aligned}$$where
$$\begin{aligned} \sigma _1(\underline{\theta })= & {} E\left[ \mathop {l}\limits ^{.}_t(\phi ) Y_t(\underline{\theta })\right] =\frac{1}{2}~E\left[ \mathop {l}\limits ^{.}_t(\phi )\left\{ \mathop {l}\limits ^{..}_t(\phi \phi )+[\mathop {l}\limits ^{.}_t(\phi )]^2\right\} \right] ,\\ \sigma _2(\underline{\theta })= & {} E\left[ \mathop {l}\limits ^{.}_t(\lambda ) Y_t(\underline{\theta })\right] =\frac{1}{2}~E\left[ \mathop {l}\limits ^{.}_t(\lambda )\left\{ \mathop {l}\limits ^{..}_t(\phi \phi )+[\mathop {l}\limits ^{.}_t(\phi )]^2\right\} \right] , \end{aligned}$$$$\begin{aligned} I_{11}(\underline{\theta })=-E\left[ \mathop {l}\limits ^{..}_t(\phi \phi )\right] \approx -\frac{1}{n} \sum \limits _{t=1}^{n}\mathop {l}\limits ^{..}_t(\phi \phi ), \end{aligned}$$$$\begin{aligned} I_{12}(\underline{\theta })= -E\left[ \mathop {l}\limits ^{..}_t(\phi \lambda )\right] \approx -\frac{1}{n} \sum \limits _{t=1}^{n}\mathop {l}\limits ^{..}_t(\phi \lambda ) \end{aligned}$$and
$$\begin{aligned} I_{22}(\underline{\theta })= -E\left[ \mathop {l}\limits ^{..}_t(\lambda \lambda )\right] \approx -\frac{1}{n} \sum \limits _{t=1}^{n}\mathop {l}\limits ^{..}_t(\lambda \lambda ). \end{aligned}$$
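The sample-average approximations of \(I_{11}\), \(I_{12}=I_{21}\) and \(I_{22}\) above can likewise be computed numerically. The following sketch (Python, illustrative names; finite differences stand in for the closed-form second derivatives) estimates the entries of \(I(\underline{\theta })\) for the Poisson INAR(1) model.

```python
import math

def trans_pmf(x_t, x_prev, phi, lam):
    # Poisson INAR(1) transition pmf (binomial thinning + Poisson innovation).
    return sum(
        math.comb(x_prev, k) * phi**k * (1 - phi)**(x_prev - k)
        * math.exp(-lam) * lam**(x_t - k) / math.factorial(x_t - k)
        for k in range(min(x_t, x_prev) + 1)
    )

def info_matrix(x, phi, lam, h=1e-4):
    # Approximates I_11, I_12 (= I_21) and I_22 by minus the sample average
    # of numerical second derivatives of log f(X_t | X_{t-1}), mirroring the
    # -(1/n) * sum approximations in the text.
    def ll(p, l, t):
        return math.log(trans_pmf(x[t], x[t - 1], p, l))
    n = len(x) - 1
    i11 = i12 = i22 = 0.0
    for t in range(1, len(x)):
        l0 = ll(phi, lam, t)
        i11 -= (ll(phi + h, lam, t) - 2 * l0 + ll(phi - h, lam, t)) / h**2
        i22 -= (ll(phi, lam + h, t) - 2 * l0 + ll(phi, lam - h, t)) / h**2
        # central cross difference for the mixed partial in (phi, lambda)
        i12 -= (ll(phi + h, lam + h, t) - ll(phi + h, lam - h, t)
                - ll(phi - h, lam + h, t) + ll(phi - h, lam - h, t)) / (4 * h**2)
    return [[i11 / n, i12 / n], [i12 / n, i22 / n]]
```

The matrix is symmetric by construction, matching \(I_{12}(\underline{\theta })=I_{21}(\underline{\theta })\).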
2. Geometric INAR(1): Proceeding as in the case of the Poisson INAR(1), we have
$$\begin{aligned} \log L=\sum \limits _{t=1}^{n}\log f(X_{t}|X_{t-1}) =\sum \limits _{t=1}^{n}l_{t},\quad \text{ say }. \end{aligned}$$The first and second derivatives of the log likelihood with respect to \( \alpha \) are
$$\begin{aligned} \frac{\partial l}{\partial \alpha }= \sum \limits _{t=1}^{n}\frac{1}{f(X_t|X_{t-1})}\frac{\partial }{\partial \alpha }f(X_t|X_{t-1})=\sum \limits _{t=1}^{n}\mathop {l}\limits ^{.}_t(\alpha ),\quad \text{ say }, \end{aligned}$$and
$$\begin{aligned} \frac{\partial ^2l}{\partial \alpha ^2}=\sum \limits _{t=1}^{n}\left[ \frac{\frac{\partial ^2}{\partial \alpha ^2}f(X_t|X_{t-1})}{f(X_t|X_{t-1})} - \left\{ \frac{\frac{\partial }{\partial \alpha }f(X_t|X_{t-1})}{f(X_t|X_{t-1})} \right\} ^2\right] =\sum \limits _{t=1}^{n}\mathop {l}\limits ^{..}_t(\alpha \alpha ),\quad \text{ say }. \end{aligned}$$Recall that
$$\begin{aligned} Y_t(\underline{\theta })=\frac{1}{2}\frac{f^{''}(X_t|X_{t-1})}{f(X_t|X_{t-1})} = \frac{1}{2}\frac{f_z^{''}(X_t-\alpha \circ X_{t-1})}{f_z(X_t-\alpha \circ X_{t-1})}, \end{aligned}$$where
$$\begin{aligned}\underline{\theta }=(\alpha ,\mu ). \end{aligned}$$Thus, we have
$$\begin{aligned} \mathop {l}\limits ^{..}_t(\alpha \alpha )= & {} \frac{\frac{\partial ^2}{\partial \alpha ^2}f(X_t|X_{t-1})}{f(X_t|X_{t-1})} - \left\{ \frac{\frac{\partial }{\partial \alpha }f(X_t|X_{t-1})}{f(X_t|X_{t-1})} \right\} ^2\\= & {} 2Y_t(\underline{\theta })-\left\{ \frac{\frac{\partial }{\partial \alpha }f(X_t|X_{t-1})}{f(X_t|X_{t-1})} \right\} ^2 \end{aligned}$$or
$$\begin{aligned} Y_t(\underline{\theta }) = \frac{1}{2}\left[ \mathop {l}\limits ^{..}_t(\alpha \alpha )+\left\{ \mathop {l}\limits ^{.}_t(\alpha )\right\} ^2\right] .\end{aligned}$$Therefore, the test statistic is
$$\begin{aligned} T_n(\underline{\theta })=\sum \limits _{t=1}^{n}Y_t(\underline{\theta })=\frac{1}{2}\sum \limits _{t=1}^{n}\left[ \mathop {l}\limits ^{..}_t(\alpha \alpha )+\left\{ \mathop {l}\limits ^{.}_t(\alpha )\right\} ^2\right] . \end{aligned}$$(8)
Now the derivatives of the log likelihood with respect to \(\mu \) are
$$\begin{aligned} \frac{\partial l}{\partial \mu }= & {} \sum \limits _{t=1}^{n} \frac{1}{f(X_t|X_{t-1})} \frac{\partial }{\partial \mu } f(X_t|X_{t-1}) = \sum \limits _{t=1}^{n} \mathop {l}\limits ^{.}_t(\mu ),\quad \text{ say },\\ \frac{\partial ^2 l}{\partial \mu ^2}= & {} \sum \limits _{t=1}^{n}\mathop {l}\limits ^{..}_t(\mu \mu ),\quad \text{ say }, \end{aligned}$$and the cross derivative is
$$\begin{aligned} \frac{\partial ^2 l}{\partial \alpha \partial \mu } = \frac{\partial ^2 l}{\partial \mu \partial \alpha } = \sum \limits _{t=1}^{n} \mathop {l}\limits ^{..}_t(\alpha \mu ),\quad \text{ say }. \end{aligned}$$The terms appearing in the asymptotic variance \(\nu ^2\) can be derived as
$$\begin{aligned} \sigma ^2(\underline{\theta }) = E\left( Y_t(\underline{\theta })\right) ^2= & {} \frac{1}{4}E\left[ \mathop {l}\limits ^{..}_t(\alpha \alpha )+\left\{ \mathop {l}\limits ^{.}_t(\alpha )\right\} ^2\right] ^2,\\ \underline{\sigma }(\underline{\theta })= & {} (\sigma _1(\underline{\theta }), \sigma _2(\underline{\theta }))^{T} \end{aligned}$$and
$$\begin{aligned} I(\underline{\theta }) = \left[ \begin{array}{cc}I_{11}(\underline{\theta }) &{}\quad I_{12}(\underline{\theta })\\ I_{21}(\underline{\theta }) &{}\quad I_{22}(\underline{\theta })\end{array} \right] , \end{aligned}$$where
$$\begin{aligned} \sigma _1(\underline{\theta })= & {} E\left[ \mathop {l}\limits ^{.}_t(\alpha ) Y_t(\underline{\theta })\right] =\frac{1}{2}~E\left[ \mathop {l}\limits ^{.}_t(\alpha )\left\{ \mathop {l}\limits ^{..}_t(\alpha \alpha )+[\mathop {l}\limits ^{.}_t(\alpha )]^2\right\} \right] ,\\ \sigma _2(\underline{\theta })= & {} E\left[ \mathop {l}\limits ^{.}_t(\mu ) Y_t(\underline{\theta })\right] =\frac{1}{2}~E\left[ \mathop {l}\limits ^{.}_t(\mu )\left\{ \mathop {l}\limits ^{..}_t(\alpha \alpha )+[\mathop {l}\limits ^{.}_t(\alpha )]^2\right\} \right] ,\end{aligned}$$$$\begin{aligned}I_{11}(\underline{\theta })= & {} -E\left[ \mathop {l}\limits ^{..}_t(\alpha \alpha )\right] \approx -\frac{1}{n} \sum \limits _{t=1}^{n}\mathop {l}\limits ^{..}_t(\alpha \alpha ),\\ I_{12}(\underline{\theta })= & {} -E\left[ \mathop {l}\limits ^{..}_t(\alpha \mu )\right] \approx - \frac{1}{n} \sum \limits _{t=1}^{n}\mathop {l}\limits ^{..}_t(\alpha \mu ) \end{aligned}$$and
$$\begin{aligned}I_{22}(\underline{\theta })= -E\left[ \mathop {l}\limits ^{..}_t(\mu \mu )\right] \approx -\frac{1}{n} \sum \limits _{t=1}^{n}\mathop {l}\limits ^{..}_t(\mu \mu ). \end{aligned}$$
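Since (8) has exactly the same form as (7), a single routine covers both models once the appropriate transition pmf is supplied. For illustration only, the sketch below pairs binomial thinning with geometric innovations of mean \(\mu \); the geometric INAR(1) specification used in this paper (its thinning operator and innovation law) may differ, so this pmf is an assumption, as are all names.

```python
import math

def geom_pmf(j, mu):
    # Assumed innovation law for illustration: geometric with mean mu,
    # P(Z = j) = (1/(1+mu)) * (mu/(1+mu))**j for j = 0, 1, 2, ...
    p = 1.0 / (1.0 + mu)
    return p * (1 - p)**j

def trans_pmf(x_t, x_prev, alpha, mu):
    # Hypothetical geometric INAR(1) transition pmf: binomial thinning of
    # x_prev plus a geometric(mu) innovation. The paper's model may use a
    # different thinning/innovation specification.
    return sum(
        math.comb(x_prev, k) * alpha**k * (1 - alpha)**(x_prev - k)
        * geom_pmf(x_t - k, mu)
        for k in range(min(x_t, x_prev) + 1)
    )

def T_n(x, alpha, mu, h=1e-4):
    # Statistic (8), structurally identical to (7), with the alpha-derivatives
    # of log f(X_t | X_{t-1}) approximated by central finite differences.
    stat = 0.0
    for t in range(1, len(x)):
        l0 = math.log(trans_pmf(x[t], x[t - 1], alpha, mu))
        lp = math.log(trans_pmf(x[t], x[t - 1], alpha + h, mu))
        lm = math.log(trans_pmf(x[t], x[t - 1], alpha - h, mu))
        stat += 0.5 * ((lp - 2 * l0 + lm) / h**2 + ((lp - lm) / (2 * h))**2)
    return stat
```

Only `trans_pmf` changes between the Poisson and geometric cases; the statistic and the information-matrix approximations are otherwise identical.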
Awale, M., Balakrishna, N. & Ramanathan, T.V. Testing the constancy of the thinning parameter in a random coefficient integer autoregressive model. Stat Papers 60, 1515–1539 (2019). https://doi.org/10.1007/s00362-017-0884-x