1 Introduction

During the past few decades, there has been an upsurge of interest in count data time series analysis. In particular, INAR(1) models have attracted the attention of a great number of researchers due to their strong application value, see Weiß (2008). The most common way to construct INAR(1) processes is to use different types of thinning operators. The binomial thinning operator “\(\circ \)”, proposed by Steutel and Van Harn (1979), is the most popular one. It is defined as

$$\begin{aligned} \alpha \circ X=\sum _{i=1}^{X}B_i, \end{aligned}$$
(1)

where \(\{B_i\}\) is an independent identically distributed (iid) Bernoulli(\(\alpha \)) random sequence independent of X and \(\alpha \in [0,1]\). Based on the binomial thinning operator, several INAR(1) processes have been proposed, see, for example, Kim and Lee (2017), Bourguignon and Vasconcellos (2015), Jazi et al. (2012) and Bourguignon et al. (2019), and the references therein. The models cited above are suitable for modeling count data with the infinite range \(\{0,1,\ldots \}\).
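
The thinning in (1) is straightforward to simulate: conditional on \(X=x\), \(\alpha \circ X\) is a Binomial\((x,\alpha )\) draw. A minimal sketch in Python (the helper name is ours):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def binomial_thinning(alpha, x, rng):
    # Conditional on X = x, alpha o X is a sum of x iid Bernoulli(alpha)
    # variables, i.e. a single Binomial(x, alpha) draw.
    return rng.binomial(x, alpha)

print(binomial_thinning(0.3, 10, rng))  # a random value in {0, ..., 10}
```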

The use of mixture models is another approach to constructing INAR(1) processes. As pointed out by Shirozhan and Mohammadpour (2018a) and Khoo et al. (2017), mixture models provide an appealing tool for time series modeling and have been used to model population heterogeneity. It is well known that finite mixture models provide more flexibility in empirical modeling and are able to cater for multimodality in the data. For these reasons, several mixture INAR(1) models based on Pegram’s operator have been proposed. Pegram’s operator was introduced by Pegram (1980) and is defined below.

Definition 1

(Pegram’s Operator) Consider two independent discrete random variables U and V. Pegram’s operator mixes U and V with the weights \(\phi \) and \(1-\phi \) as

$$\begin{aligned} Z=(\phi ,U) *(1-\phi ,V) \end{aligned}$$
(2)

with the corresponding marginal probability function

$$\begin{aligned} \mathrm {P}(Z=j)=\phi \mathrm {P}(U=j)+(1-\phi )\mathrm {P}(V=j),\quad j=0,1,2,\ldots \end{aligned}$$
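
Operationally, a draw from Z in (2) can be generated by selecting U with probability \(\phi \) and V otherwise, which reproduces the marginal probability function above. A small illustrative sketch:

```python
import numpy as np

rng = np.random.default_rng(seed=2)

def pegram_mix(phi, draw_u, draw_v, rng):
    # Z = (phi, U) * (1 - phi, V): return a draw of U with probability phi
    # and a draw of V otherwise, so that
    # P(Z = j) = phi P(U = j) + (1 - phi) P(V = j).
    return draw_u() if rng.random() < phi else draw_v()

# Example: mix two Poisson variables with weight phi = 0.7.
z = pegram_mix(0.7, lambda: rng.poisson(2.0), lambda: rng.poisson(5.0), rng)
```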

Based on Pegram’s operator and different types of thinning operators, three new INAR(1) models have been proposed. Khoo et al. (2017) introduced a novel INAR(1) process to provide more flexibility in empirical modeling. Shirozhan and Mohammadpour (2018a, b) proposed INAR(1) models with Poisson marginal distribution and serially dependent innovation, respectively. However, these three models focus on count data with the infinite range \(\{0,1,\ldots \}\). The binomial AR(1) (BAR(1)) process, proposed by McKenzie (1985), is the most common way to model count data with a finite range \(\{0,1,\ldots ,n\}\). The definition of the BAR(1) model is given below.

Definition 2

(BAR(1) Model) Let \(p\in (0;1)\) and \(\rho \in (\max \{-\frac{p}{1-p},-\frac{1-p}{p}\};1)\). Define \(\beta :=p(1-\rho )\) and \(\alpha :=\beta +\rho \). Fix \(n\in {\mathbb {N}}\). A BAR(1) process \(\{X_t\}\) is defined by the recursion

$$\begin{aligned} X_t=\alpha \circ X_{t-1}+\beta \circ (n-X_{t-1}) \quad \text {with}\; X_0\sim \mathrm {B}(n, p), \end{aligned}$$
(3)

where all thinnings are performed independently of each other, and where the thinnings at time t are independent of \((X_s)_{s<t}\).
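
The recursion (3) is easy to simulate because both thinned components are conditionally binomial. A minimal sketch (helper name ours):

```python
import numpy as np

def simulate_bar1(n, p, rho, T, rng):
    # Simulate T observations from the BAR(1) recursion (3); both
    # thinnings are conditionally binomial draws.
    beta = p * (1.0 - rho)
    alpha = beta + rho
    x = np.empty(T, dtype=int)
    x[0] = rng.binomial(n, p)  # X_0 ~ B(n, p)
    for t in range(1, T):
        x[t] = rng.binomial(x[t - 1], alpha) + rng.binomial(n - x[t - 1], beta)
    return x

x = simulate_bar1(n=10, p=0.4, rho=0.5, T=200, rng=np.random.default_rng(3))
```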

The BAR(1) model has been widely used due to its strong application value, and several authors have studied it during the past ten years. Cui and Lund (2010) studied inference methods for the BAR(1) model. Weiß (2009a) investigated marginal and serial properties of jumps in the BAR(1) process. Weiß (2009b) proposed several approaches to monitor a binomial AR(1) process. Weiß and Kim (2013) presented four approaches for estimating the parameters of the BAR(1) model. Weiß (2013) considered the moments, cumulants, and estimation of the BAR(1) model. Scotto et al. (2014) introduced new classes of bivariate time series models that are useful for fitting count data time series with a finite range of counts. Yang et al. (2018) presented a new approach for estimating the parameters of the self-exciting threshold BAR(1) model.

The binomial index of dispersion, \(\mathrm {BID}\), is a useful metric for quantifying the dispersion behavior of a count data random variable X with a finite range \(\{0,1,\ldots ,n\}\). It is defined as

$$\begin{aligned} \mathrm {BID}=\frac{n\sigma ^2}{\mu (n-\mu )}, \end{aligned}$$

where \(\mu \) and \(\sigma ^2\) are the mean and variance of the random variable X, respectively. A finite-range count data random variable is said to be overdispersed if \(\mathrm {BID}>1\), equidispersed if \(\mathrm {BID}=1\), and underdispersed if \(\mathrm {BID}<1\). The \(\mathrm {BID}\) of the binomial distribution equals one, so the BAR(1) model cannot accommodate underdispersion or overdispersion. To solve this problem, several extended BAR(1) models have been proposed. Weiß and Pollett (2014) considered a class of density-dependent BAR(1) processes \(n_t\) with range \(\{0,1,\ldots ,n\}\), where the thinning probabilities are not constant in time but depend on the current density \(n_t/n\). Kim and Weiß (2015) considered the modeling of count data time series with a finite range having extra-binomial variation. Möller et al. (2016) proposed types of self-exciting threshold BAR(1) models for integer-valued time series with a finite range, which are based on the BAR(1) model. Möller et al. (2018) developed four extensions of the BAR(1) processes, which can accommodate a broad variety of zero patterns.
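
As a quick empirical check, the sample \(\mathrm {BID}\) of an observed series can be computed directly from the definition above; a small sketch (helper name ours):

```python
import numpy as np

def sample_bid(x, n):
    # n * variance / (mean * (n - mean)), using the sample mean and variance.
    mu, s2 = x.mean(), x.var(ddof=1)
    return n * s2 / (mu * (n - mu))

x = np.random.default_rng(0).binomial(10, 0.4, size=200)
print(sample_bid(x, n=10))  # close to 1 for binomial data (equidispersion)
```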

To better model count data with bounded support, in this paper we introduce a mixture INAR(1) model with bounded support based on mixing the Pegram and binomial thinning operators. One main advantage of the mixture model is that multimodality can be described well. Furthermore, the new model not only handles equidispersion, underdispersion and overdispersion, but also performs well in capturing zero inflation.

The contents of this paper are organized as follows. In Sect. 2, we introduce the new model. Some probabilistic and statistical properties are investigated. In Sect. 3, the CML method is used to estimate the model parameters. Section 4 presents some simulation studies for the proposed estimation method. In Sect. 5, the forecasting problem is addressed. Section 6 gives two real data examples and the forecasting methods discussed in Sect. 5 are applied. The paper ends with a discussion section.

2 Definition and properties of the new process

In this section, a mixture INAR(1) model with a finite range \(\{0,1,\ldots ,n\}\), based on mixing the Pegram and binomial thinning operators, is presented. Note that the BAR(1) process defined in (3) comprises two thinned components: \(\alpha \circ X_{t-1}\) and \(\beta \circ (n-X_{t-1})\). Applying Pegram’s mixing operator (\(*\)) with mixing weight \(\phi \) to these two integer-valued random variables yields the proposed model given below.

Definition 3

(Mixture of Pegram-BAR(1) Model) Let \(\phi , \alpha , \beta \in (0;1)\). Fix \(n\in {\mathbb {N}}\) and the initial value of the process \(X_0\in \{0,1,\ldots ,n\}\). Then the mixture of Pegram-BAR(1) model \(\{X_t\}\) is defined by the recursion

$$\begin{aligned} X_t=(\phi ,\alpha \circ X_{t-1})*(1-\phi ,\beta \circ (n-X_{t-1})), \end{aligned}$$
(4)

where \(\circ \) and \(*\) are the binomial thinning and Pegram mixing operators, respectively. Given \(X_{t-1}\), the random variables \(\alpha \circ X_{t-1}\) and \(\beta \circ (n-X_{t-1})\) are independent of each other. All thinnings are performed independently of each other, and the thinnings at time t are independent of \((X_s)_{s<t}\). For convenience, we denote the new model by MPTBAR(1) (BAR(1) model with the mixture of Pegram and binomial thinning operators).
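
Simulating the MPTBAR(1) recursion (4) amounts to flipping a \(\phi \)-coin between the two thinned components at each step. A minimal sketch (helper name ours):

```python
import numpy as np

def simulate_mptbar1(n, alpha, beta, phi, T, x0, rng):
    # MPTBAR(1) recursion (4): with probability phi take alpha o X_{t-1},
    # otherwise take beta o (n - X_{t-1}).
    x = np.empty(T, dtype=int)
    x[0] = x0
    for t in range(1, T):
        if rng.random() < phi:
            x[t] = rng.binomial(x[t - 1], alpha)      # alpha o X_{t-1}
        else:
            x[t] = rng.binomial(n - x[t - 1], beta)   # beta o (n - X_{t-1})
    return x

x = simulate_mptbar1(n=10, alpha=0.3, beta=0.4, phi=0.5, T=500, x0=3,
                     rng=np.random.default_rng(4))
```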

Since the MPTBAR(1) model is Markovian, its transition probabilities determine the process. The transition probabilities of the MPTBAR(1) model are given in the following proposition.

Proposition 1

For fixed \(n \in {\mathbb {N}}\), the transition probabilities of the MPTBAR(1) model are given by

$$\begin{aligned} \mathrm {P}(X_t=i|X_{t-1}=j)&=I_{\{i\le j,i\le n-j\}}\Bigg (\phi \left( {\begin{array}{c}j\\ i\end{array}}\right) \alpha ^i(1-\alpha )^{j-i}\nonumber \\&\quad +(1-\phi )\left( {\begin{array}{c}n-j\\ i\end{array}}\right) \beta ^i(1-\beta )^{n-j-i}\Bigg )\nonumber \\&\quad +I_{\{i\le j,n-j<i\}}\phi \left( {\begin{array}{c}j\\ i\end{array}}\right) \alpha ^i(1-\alpha )^{j-i}\nonumber \\&\quad +I_{\{j< i,i\le n-j\}}(1-\phi )\left( {\begin{array}{c}n-j\\ i\end{array}}\right) \beta ^i(1-\beta )^{n-j-i}. \end{aligned}$$
(5)
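
For numerical work it is convenient to collect (5) into an \((n+1)\times (n+1)\) transition matrix. A sketch using scipy (the helper name transition_matrix is ours):

```python
import numpy as np
from scipy.stats import binom

def transition_matrix(n, alpha, beta, phi):
    # Entry [j, i] is P(X_t = i | X_{t-1} = j) from (5); binom.pmf is zero
    # outside its support, which reproduces the indicator functions.
    i = np.arange(n + 1)
    P = np.empty((n + 1, n + 1))
    for j in range(n + 1):
        P[j] = phi * binom.pmf(i, j, alpha) + (1 - phi) * binom.pmf(i, n - j, beta)
    return P

P = transition_matrix(10, 0.3, 0.4, 0.5)
assert np.allclose(P.sum(axis=1), 1.0)  # each row is a probability vector
```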

Remark 1

As pointed out by a referee, (5) shows that the MPTBAR(1) model admits impossible one-step transitions. To be specific, for \(i, j\in \{0,1,\ldots ,n\}\), we have \(\mathrm {P}(X_t=i|X_{t-1}=j)=0\) if \(j<i\) and \(n-j<i\). For example, suppose that \(n=6\); then \(\mathrm {P}(X_t=6|X_{t-1}=1)=0\), \(\mathrm {P}(X_t=6|X_{t-1}=2)=\mathrm {P}(X_t=5|X_{t-1}=2)=0\), \(\mathrm {P}(X_t=6|X_{t-1}=3)=\mathrm {P}(X_t=5|X_{t-1}=3)=\mathrm {P}(X_t=4|X_{t-1}=3)=0\), \(\mathrm {P}(X_t=6|X_{t-1}=4)=\mathrm {P}(X_t=5|X_{t-1}=4)=0\) and \(\mathrm {P}(X_t=6|X_{t-1}=5)=0\). This problem is addressed in Sect. 7.

The strict stationarity and ergodicity of the MPTBAR(1) model are essential for establishing the asymptotic properties of the parameter estimates. We therefore state the following theorem.

Theorem 1

The process \(\{X_t\}\) is an irreducible, aperiodic and positive recurrent (and thus ergodic) Markov chain. Hence, there exists a strictly stationary process satisfying Eq. (4).

Proof

Let \(\mathrm {I}\) denote the state space of \(\{X_t\}\). From Definition 3, we know that \(\mathrm {I}=\{0,1,\ldots ,n\}\) is finite. First, from (5), for all \(j \in \mathrm {I}\), we have

$$\begin{aligned} \mathrm {P}(X_{t+1}=0|X_{t}=j)=\phi (1-\alpha )^j+(1-\phi )(1-\beta )^{n-j}>0 \end{aligned}$$

and, for all \(i \in \mathrm {I}\),

$$\begin{aligned} \mathrm {P}(X_{t+1}=i|X_{t}=0)&=I_{\{i=0\}}[\phi +(1-\phi )(1-\beta )^{n}]\\&\quad +I_{\{i\ne 0\}}\bigg [(1-\phi )\left( {\begin{array}{c}n\\ i\end{array}}\right) \beta ^{i}(1-\beta )^{n-i}\bigg ]>0. \end{aligned}$$

For all \(i,j \in \mathrm {I}\), we have

$$\begin{aligned} \mathrm {P}(X_{t+2}=i|X_{t}=j)&=\sum _{k=0}^{n}\mathrm {P}(X_{t+2}=i,X_{t+1}=k|X_{t}=j)\\&=\sum _{k=0}^{n}\mathrm {P}(X_{t+1}=k|X_{t}=j)\mathrm {P}(X_{t+2}=i|X_{t}=j,X_{t+1}=k)\\&=\sum _{k=0}^{n}\mathrm {P}(X_{t+1}=k|X_{t}=j)\mathrm {P}(X_{t+2}=i|X_{t+1}=k)\\&\ge \mathrm {P}(X_{t+1}=0|X_{t}=j)\mathrm {P}(X_{t+2}=i|X_{t+1}=0)>0. \end{aligned}$$

For all \(i,j \in \mathrm {I}\), based on \(\mathrm {P}(X_{t+2}=i|X_{t}=j)>0\), we have

$$\begin{aligned} \mathrm {P}(X_{t+3}=i|X_{t}=j)&=\sum _{k=0}^{n}\mathrm {P}(X_{t+3}=i,X_{t+2}=k|X_{t}=j)\\&=\sum _{k=0}^{n}\mathrm {P}(X_{t+2}=k|X_{t}=j)\mathrm {P}(X_{t+3}=i|X_{t}=j,X_{t+2}=k)\\&=\sum _{k=0}^{n}\mathrm {P}(X_{t+2}=k|X_{t}=j)\mathrm {P}(X_{t+3}=i|X_{t+2}=k)\\&\ge \mathrm {P}(X_{t+2}=0|X_{t}=j)\mathrm {P}(X_{t+3}=i|X_{t+2}=0)>0. \end{aligned}$$

In a similar way, for all \(i,j \in \mathrm {I}\) and \(s\ge 4\), we can prove that

$$\begin{aligned} \mathrm {P}(X_{t+s}=i|X_{t}=j)>0. \end{aligned}$$

Thus, we conclude that for all \(i,j \in \mathrm {I}\) and \(m\ge 2\), we have

$$\begin{aligned} \mathrm {P}(X_{t+m}=i|X_{t}=j)>0. \end{aligned}$$

This implies primitivity, and since we have a finite Markov chain, the process is ergodic with a uniquely determined stationary marginal distribution. \(\square \)

Next, some statistical properties, such as the autocorrelation function and the binomial dispersion index \(\mathrm {BID}\), are studied. Since the one-step conditional moments are the most important regression properties, we begin with the following proposition.

Proposition 2

Suppose \(\{X_t\}\) is the stationary process defined in (4). The one-step conditional moments of \(\{X_t\}\) are given by

$$\begin{aligned} \mathrm {E}(X_t|X_{t-1})&=[\phi \alpha -(1-\phi )\beta ]X_{t-1}+(1-\phi )n\beta ,\\ \mathrm {E}(X_t^2|X_{t-1})&=(n^2-n)(1-\phi )\beta ^2+n(1-\phi )\beta +[\phi \alpha ^2+(1-\phi )\beta ^2]X_{t-1}^2\\&\quad +[\phi \alpha -\phi \alpha ^2+(1-\phi )\beta ^2-2n(1-\phi )\beta ^2-(1-\phi )\beta ]X_{t-1}. \end{aligned}$$

Proof

It is easy to obtain

$$\begin{aligned} \mathrm {E}[X_{t}|X_{t-1}]&=\phi \alpha X_{t-1}+(1-\phi )\beta (n-X_{t-1})\\&=[\phi \alpha -(1-\phi )\beta ]X_{t-1}+(1-\phi )n\beta \end{aligned}$$

and

$$\begin{aligned} \mathrm {E}[X_{t}^2|X_{t-1}]&=\phi \mathrm {E}[(\alpha \circ X_{t-1})^2|X_{t-1}]+(1-\phi ) \mathrm {E}\big [\big (\beta \circ (n-X_{t-1})\big )^2|X_{t-1}\big ]\\&=\phi [\alpha X_{t-1}+\alpha ^2X_{t-1}(X_{t-1}-1)]\\&\quad +(1-\phi ) [\beta (n-X_{t-1})+\beta ^2(n-X_{t-1})(n-X_{t-1}-1)]\\&=(1-\phi )(n\beta +n^2\beta ^2-n\beta ^2)+[\phi \alpha ^2+(1-\phi )\beta ^2]X_{t-1}^2\\&\quad +[\phi (\alpha -\alpha ^2)+(1-\phi )(\beta ^2-2n\beta ^2-\beta )]X_{t-1}. \end{aligned}$$

\(\square \)

Based on the one-step conditional moments of the MPTBAR(1) process in Proposition 2, the mean and variance of our model can be given by the following proposition.

Proposition 3

Suppose \(\{X_t\}\) is the stationary process defined in (4). The mean and variance of \(\{X_t\}\) are given by

$$\begin{aligned} \mathrm {E}(X_{t})&=\frac{(1-\phi )n\beta }{1-[\phi \alpha -(1-\phi )\beta ]},\\ \mathrm {Var}(X_{t})&=\frac{\phi (1-\phi )n\beta (\alpha -\alpha ^2)+(1-\phi )^2n\beta (\beta ^2-2n\beta ^2-\beta )}{\big [1-\big (\phi \alpha ^2+(1-\phi )\beta ^2\big )\big ]\big [1-\big (\phi \alpha -(1-\phi )\beta \big )\big ]}\\&\quad +\frac{(1-\phi )(n^2\beta ^2-n\beta ^2+n\beta )}{1-[\phi \alpha ^2+(1-\phi )\beta ^2]} -\frac{(1-\phi )^2n^2\beta ^2}{\big [1-\big (\phi \alpha -(1-\phi )\beta \big )\big ]^2}. \end{aligned}$$

Proof

Based on Proposition 2, we have

$$\begin{aligned} \mathrm {E}(X_{t})&=\mathrm {E}[\mathrm {E}(X_{t}|X_{t-1})]=\mathrm {E}\big [\big (\phi \alpha -(1-\phi )\beta \big )X_{t-1}+(1-\phi )n\beta \big ] \\&=\frac{(1-\phi )n\beta }{1-[\phi \alpha -(1-\phi )\beta ]},\\ \mathrm {E}(X_{t}^2)&=\mathrm {E}[\mathrm {E}(X_{t}^2|X_{t-1})]=\frac{(1-\phi )(n^2\beta ^2-n\beta ^2+n\beta )}{1-[\phi \alpha ^2+(1-\phi )\beta ^2]}\\&\quad +\frac{\phi (\alpha -\alpha ^2)+(1-\phi )(\beta ^2-2n\beta ^2-\beta )}{1-[\phi \alpha ^2+(1-\phi )\beta ^2]}\mathrm {E}(X_{t})\\&=\frac{\phi (1-\phi )n\beta (\alpha -\alpha ^2)+(1-\phi )^2n\beta (\beta ^2-2n\beta ^2-\beta )}{\big [1-\big (\phi \alpha ^2+(1-\phi )\beta ^2\big )\big ]\big [1-\big (\phi \alpha -(1-\phi )\beta \big )\big ]}\\&\quad +\frac{(1-\phi )(n^2\beta ^2-n\beta ^2+n\beta )}{1-[\phi \alpha ^2+(1-\phi )\beta ^2]},\\ \mathrm {Var}(X_{t})&=\mathrm {E}(X_{t}^2)-[\mathrm {E}(X_{t})]^2 \\&=\frac{\phi (1-\phi )n\beta (\alpha -\alpha ^2)+(1-\phi )^2n\beta (\beta ^2-2n\beta ^2-\beta )}{\big [1-\big (\phi \alpha ^2+(1-\phi )\beta ^2\big )\big ]\big [1-\big (\phi \alpha -(1-\phi )\beta \big )\big ]}\\&\quad +\frac{(1-\phi )(n^2\beta ^2-n\beta ^2+n\beta )}{1-[\phi \alpha ^2+(1-\phi )\beta ^2]} -\frac{(1-\phi )^2n^2\beta ^2}{\big [1-\big (\phi \alpha -(1-\phi )\beta \big )\big ]^2}. \end{aligned}$$

\(\square \)

Next, we give the expression for the binomial dispersion index of the MPTBAR(1) process.

Proposition 4

Suppose \(\{X_t\}\) is the stationary process defined in (4). The binomial dispersion index \(\mathrm {BID}\) of \(\{X_t\}\) is given by

$$\begin{aligned} \mathrm {BID}&= \frac{\big [1-\big (\phi \alpha -(1-\phi )\beta \big )\big ]^2\big [\phi (\alpha -\alpha ^2)+(1-\phi )(\beta ^2-2n\beta ^2-\beta )\big ]}{(1-\phi \alpha )\big [1-\big (\phi \alpha ^2+(1-\phi )\beta ^2\big )\big ]\big [1-\big (\phi \alpha -(1-\phi )\beta \big )\big ]}\\&\quad +\frac{(n\beta -\beta +1)\big [1-\big (\phi \alpha -(1-\phi )\beta \big )\big ]^2}{(1-\phi \alpha )\big [1-\big (\phi \alpha ^2+(1-\phi )\beta ^2\big )\big ]}\\&\quad -\frac{(1-\phi )n\beta }{1-\phi \alpha }. \end{aligned}$$

As we will see in Sect. 4, the MPTBAR(1) model is able to model equidispersion, overdispersion and underdispersion, depending on the true values of the model parameters. In the following proposition, we consider the autocorrelation function of the MPTBAR(1) model.

Proposition 5

Suppose \(\{X_t\}\) is the stationary process defined in (4). For any non-negative integer h, the autocovariance at lag h is given by

$$\begin{aligned} \mathrm {Cov}(X_{t},X_{t+h})=[\phi \alpha -(1-\phi )\beta ]^h\mathrm {Var}(X_{t}). \end{aligned}$$
(6)

Proof

It is easy to obtain

$$\begin{aligned} \mathrm {Cov}(X_{t},X_{t+h})&=\mathrm {E}(X_{t}X_{t+h})-\mathrm {E}(X_{t+h})\mathrm {E}(X_{t})\\&=\mathrm {E}\big [X_{t}\big (\phi \alpha X_{t+h-1}+(1-\phi )\beta (n-X_{t+h-1})\big )\big ]\\&\quad -\mathrm {E}(X_{t})\mathrm {E}\big (\phi \alpha X_{t+h-1}+(1-\phi )\beta (n-X_{t+h-1})\big )\\&=[\phi \alpha -(1-\phi )\beta ]\mathrm {E}(X_{t}X_{t+h-1})\\&\quad -[\phi \alpha -(1-\phi )\beta ]\mathrm {E}(X_{t})\mathrm {E}(X_{t+h-1})\\&=[\phi \alpha -(1-\phi )\beta ]\mathrm {Cov}(X_{t},X_{t+h-1})\\&=\cdots =[\phi \alpha -(1-\phi )\beta ]^h\mathrm {Var}(X_{t}). \end{aligned}$$

\(\square \)

Remark 2

From (6), it follows that the autocorrelation function of the MPTBAR(1) model can be given by \(\mathrm {Corr}(X_{t},X_{t+h})=[\phi \alpha -(1-\phi )\beta ]^h\) for \(h\ge 0\).

To obtain the unique stationary distribution of the MPTBAR(1) process, a Markov chain approach proposed by Weiß (2010) is applied. Let \({{\mathbf {P}}}\) denote the transition matrix of the MPTBAR(1) process, i.e.,

$$\begin{aligned} {{\mathbf {P}}}=(\mathrm {P}(j|i))_{i,j=0,1,\ldots ,n} \end{aligned}$$
(7)

with \(\mathrm {P}(j|i)=\mathrm {P}(X_t=j|X_{t-1}=i)\) the transition probability given in (5). Let \(\varvec{\Pi }\) denote the stationary marginal distribution of our process. Then the marginal distribution is obtained by solving the invariance equation \(\varvec{\Pi }=\varvec{\Pi }{{\mathbf {P}}}\). We will show the marginal distribution plots of our model in Sect. 4.
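
Numerically, \(\varvec{\Pi }\) can be obtained by solving the invariance equation together with the normalization constraint; a sketch, reusing the transition_matrix helper sketched after Proposition 1:

```python
import numpy as np

def stationary_distribution(P):
    # Solve Pi = Pi P together with sum(Pi) = 1 as an overdetermined
    # linear system (least squares).
    m = P.shape[0]
    A = np.vstack([P.T - np.eye(m), np.ones((1, m))])
    b = np.append(np.zeros(m), 1.0)
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

pi = stationary_distribution(transition_matrix(10, 0.3, 0.4, 0.5))
mean = pi @ np.arange(11)  # stationary mean, cf. Proposition 3
```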

3 Conditional maximum likelihood estimation

Suppose that \(\{X_t\}\) is a strictly stationary and ergodic solution of the MPTBAR(1) model. Our aim is to estimate the parameter vector \(\varvec{\eta }=(\alpha ,\beta ,\phi )^{\mathrm {\top }}\) from a sample \((X_1,X_2,\ldots ,X_T)\); we assume that n is known. The CML method is used to estimate the model parameters, and some analytical and asymptotic results for the estimators are derived. Assuming that \(x_1\) is fixed, the CML estimates are obtained by maximizing the conditional log-likelihood function of the MPTBAR(1) model

$$\begin{aligned} \ell (\varvec{\eta })=\log \Big (\prod _{t=2}^{T}\mathrm {P}(X_t|X_{t-1})\Big )=\sum _{t=2}^{T}\log \big (\mathrm {P}(X_t|X_{t-1})\big ), \end{aligned}$$
(8)

where \(\mathrm {P}(X_t|X_{t-1})\) is defined in (5). Because the CML estimates have no closed form, we obtain them by numerical optimization.
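
A sketch of this numerical maximization, again reusing the transition_matrix helper sketched after Proposition 1 (starting values and optimizer choice are illustrative):

```python
import numpy as np
from scipy.optimize import minimize

def neg_cloglik(eta, x, n):
    # Negative conditional log-likelihood (8); eta = (alpha, beta, phi).
    alpha, beta, phi = eta
    P = transition_matrix(n, alpha, beta, phi)
    # P[x[:-1], x[1:]] picks out P(X_t = x_t | X_{t-1} = x_{t-1}) for
    # t = 2, ..., T (x must be an integer array on {0, ..., n}).
    return -np.sum(np.log(P[x[:-1], x[1:]]))

def cml_fit(x, n, start=(0.5, 0.5, 0.5)):
    eps = 1e-6  # keep the parameters inside (0, 1)
    res = minimize(neg_cloglik, np.asarray(start), args=(x, n),
                   method="L-BFGS-B", bounds=[(eps, 1 - eps)] * 3)
    return res.x  # (alpha_hat, beta_hat, phi_hat)

eta_hat = cml_fit(x, n=10)  # x: an observed series on {0, ..., n}
```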

Theorem 2

There exists a consistent CML estimator of \(\varvec{\eta }\), maximizing (8), that is also asymptotically normally distributed,

$$\begin{aligned} \sqrt{T-1}\begin{pmatrix} {\widehat{\alpha }}_{\textit{CML}}-\alpha \\ {\widehat{\beta }}_{\textit{CML}}-\beta \\ {\widehat{\phi }}_{\textit{CML}}-\phi \end{pmatrix} \mathop {\longrightarrow }\limits ^{d}\mathrm {N}({\mathbf {0}},{\mathbf {I}}^{-1}(\varvec{\eta })), \end{aligned}$$

where \({\mathbf {I}}(\varvec{\eta })\) is the Fisher information matrix.

Proof

To prove Theorem 2, we check that Condition 5.1 in Billingsley (1961) is satisfied. Let \(\mathrm {I}\) be the state space of \(\{X_t\}\) and write \(\mathrm {P}_{\varvec{\eta }}(j|i)=\mathrm {P}(X_{t}=j|X_{t-1}=i)\). Let \(\mathrm {D}\) be the set of pairs \((i,j)\) with \(i,j\in \mathrm {I}\) such that \(\mathrm {P}_{\varvec{\eta }}(j|i)>0\). First, \(\mathrm {D}\) is independent of \(\varvec{\eta }\). Furthermore, for every \((i,j)\in \mathrm {D}\), \(\mathrm {P}_{\varvec{\eta }}(j|i)\) is a polynomial in \(\alpha \), \(\beta \), \(\phi \), so its partial derivatives with respect to \(\alpha \), \(\beta \), \(\phi \) of any order exist and are continuous functions of \(\alpha \), \(\beta \), \(\phi \).

Without loss of generality, we will assume \(n>2\) in the proof below. The gradients of

$$\begin{aligned} \mathrm {P}_{\varvec{\eta }}(0|0)&=\phi +(1-\phi )(1-\beta )^n,\\ \mathrm {P}_{\varvec{\eta }}(0|1)&=\phi (1-\alpha )+(1-\phi )(1-\beta )^{n-1},\\ \mathrm {P}_{\varvec{\eta }}(0|2)&=\phi (1-\alpha )^2+(1-\phi )(1-\beta )^{n-2}, \end{aligned}$$

are shown to be linearly independent, i.e. the matrix

$$\begin{aligned} \begin{pmatrix} \partial \mathrm {P}_{\varvec{\eta }}(0|0)/\partial \alpha &{} \partial \mathrm {P}_{\varvec{\eta }}(0|0)/\partial \beta &{} \partial \mathrm {P}_{\varvec{\eta }}(0|0)/\partial \phi &{} \\ \partial \mathrm {P}_{\varvec{\eta }}(0|1)/\partial \alpha &{} \partial \mathrm {P}_{\varvec{\eta }}(0|1)/\partial \beta &{} \partial \mathrm {P}_{\varvec{\eta }}(0|1)/\partial \phi &{} \\ \partial \mathrm {P}_{\varvec{\eta }}(0|2)/\partial \alpha &{} \partial \mathrm {P}_{\varvec{\eta }}(0|2)/\partial \beta &{} \partial \mathrm {P}_{\varvec{\eta }}(0|2)/\partial \phi \end{pmatrix} \end{aligned}$$

has rank 3.

So Condition 5.1 in Billingsley (1961) is satisfied, and there exists a consistent CML estimator of \(\varvec{\eta }\) that is also asymptotically normally distributed (Billingsley 1961, Theorems 2.1 and 2.2). \(\square \)

4 Simulation study

In this section, we conduct a simulation study to examine the finite-sample performance of the CML estimates. The initial value \(X_0\) is fixed at 3. We generate data from the MPTBAR(1) model with sample sizes \(T=50, 100, 200, 500\). In the simulations, the mean squared error (MSE), mean absolute deviation error (MADE) and standard deviation (SD) are computed over 1000 replications. The true values of the parameters are selected as:

  (a) \((n,\alpha ,\beta ,\phi )=(5,0.9,0.8,0.1)\) (\(\mathrm {BID}=0.9845\), underdispersion);

  (b) \((n,\alpha ,\beta ,\phi )=(5,0.8,0.7,0.2)\) (\(\mathrm {BID}=1\), equidispersion);

  (c) \((n,\alpha ,\beta ,\phi )=(10,0.3,0.4,0.5)\) (\(\mathrm {BID}=2.1155\), overdispersion; zero-inflation);

  (d) \((n,\alpha ,\beta ,\phi )=(40,0.3,0.8,0.4)\) (multimodality).

Figure 1 shows the sample paths, autocorrelation functions (ACF) and marginal distributions of some typical MPTBAR(1) models. Note that the ACF of the MPTBAR(1) model with \((n,\alpha ,\beta ,\phi )=(10,0.3,0.4,0.5)\) suggests that the samples do not stem from a first-order autoregressive process. The explanation for this phenomenon is that the first-order autocorrelation coefficient of the MPTBAR(1) model is close to zero in this situation.

Fig. 1 Sample paths, ACF and marginal distributions of the MPTBAR(1) model for: (a) \((n,\alpha ,\beta ,\phi )=(5,0.9,0.8,0.1)\); (b) \((n,\alpha ,\beta ,\phi )=(5,0.8,0.7,0.2)\); (c) \((n,\alpha ,\beta ,\phi )=(10,0.3,0.4,0.5)\); (d) \((n,\alpha ,\beta ,\phi )=(40,0.3,0.8,0.4)\)

Table 1 MSE, MADE and SD of the CML estimates

Fig. 2 Q–Q plots of the CML estimators for the MPTBAR(1) model with \(n=10\), \(\alpha =0.3\), \(\beta =0.4\), \(\phi =0.5\) and sample size \(T=500\)

A summary of the simulation results is shown in Table 1. We observe that the MSE, MADE and SD of the estimators decrease as the sample size T increases, as expected. The CML estimation method thus produces reliable estimates of the model parameters. Figure 2 shows the Q–Q plots of the CML estimators for the MPTBAR(1) model with \((n,\alpha ,\beta ,\phi )=(10,0.3,0.4,0.5)\); they suggest that the estimators are approximately normally distributed, as expected. Similar results are obtained for other parameter combinations; to save space, those figures are omitted here.

5 Forecasting for the MPTBAR(1) process

Forecasting is always an important topic in time series analysis. The most common method of forecasting is to use the conditional expectation, which yields forecasts with minimum mean squared error. For this reason, we use this method to forecast an MPTBAR(1) process. The h-step-ahead predictor of the MPTBAR(1) model is given by

$$\begin{aligned} {\widehat{X}}_{t+h}=\mathrm {E}[X_{t+h}|X_{t}],\quad h=1,2,\ldots \end{aligned}$$

From the properties of the binomial thinning operator, we have the following proposition.

Proposition 6

Suppose \(\{X_t\}\) is defined in (4). The h-step conditional mean of \(X_t\) is given by

$$\begin{aligned} \mathrm {E}(X_{t+h}|X_t)&=[\phi \alpha -(1-\phi )\beta ]^hX_t+(1-\phi )n\beta \frac{1-[\phi \alpha -(1-\phi )\beta ]^h}{1-[\phi \alpha -(1-\phi )\beta ]}. \end{aligned}$$

Proof

It is easy to obtain

$$\begin{aligned} \mathrm {E}(X_{t+1}|X_{t})&=[\phi \alpha -(1-\phi )\beta ]X_{t}+n(1-\phi )\beta ,\\ \mathrm {E}(X_{t+2}|X_t)&=\mathrm {E}\big [\big (\phi \alpha X_{t+1}+(1-\phi )\beta (n-X_{t+1})\big )\big |X_t\big ]\\&=\mathrm {E}\Big [\big [[\phi \alpha -(1-\phi )\beta ]\big (\phi \alpha X_{t}+(1-\phi )\beta (n-X_{t})\big )+(1-\phi )n\beta \big ]\big | X_t\Big ]\\&=[\phi \alpha -(1-\phi )\beta ]^2X_t+[\phi \alpha -(1-\phi )\beta ](1-\phi )n\beta +(1-\phi )n\beta ,\\ \mathrm {E}(X_{t+h}| X_t)&=[\phi \alpha -(1-\phi )\beta ]^hX_t+(1-\phi )n\beta \sum _{j=0}^{h-1}[\phi \alpha -(1-\phi )\beta ]^j\\&=[\phi \alpha -(1-\phi )\beta ]^hX_t+(1-\phi )n\beta \frac{1-[\phi \alpha -(1-\phi )\beta ]^h}{1-[\phi \alpha -(1-\phi )\beta ]}. \end{aligned}$$

\(\square \)

Remark 3

It can be easily seen that \(\mathrm {E}(X_{t})=\lim _{h\rightarrow \infty }\mathrm {E}(X_{t+h}| X_t)=\frac{(1-\phi )n\beta }{1-[\phi \alpha -(1-\phi )\beta ]}\).

From Proposition 6, based on the sample \(X_1, X_2, \ldots , X_t\), the h-step-ahead predictor \({\widehat{X}}_{t+h}\), \(h \in {\mathbb {N}}\), can be given by

$$\begin{aligned} {\widehat{X}}_{t+h}=[{\widehat{\phi }}{\widehat{\alpha }}-(1-{\widehat{\phi }}){\widehat{\beta }}]^hX_t+(1-{\widehat{\phi }})n{\widehat{\beta }}\frac{1-[{\widehat{\phi }}{\widehat{\alpha }}-(1-{\widehat{\phi }}){\widehat{\beta }}]^h}{1-[{\widehat{\phi }}{\widehat{\alpha }}-(1-{\widehat{\phi }}){\widehat{\beta }}]}, \quad h \in {\mathbb {N}}, \end{aligned}$$

where \({\widehat{\alpha }}\), \({\widehat{\beta }}\) and \({\widehat{\phi }}\) are estimators for \(\alpha \), \(\beta \) and \(\phi \), respectively.
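
A direct plug-in computation of this predictor might look as follows (all names illustrative):

```python
def cond_mean_forecast(x_t, n, alpha_hat, beta_hat, phi_hat, h):
    # Plug-in h-step-ahead conditional-expectation forecast (Proposition 6).
    a = phi_hat * alpha_hat - (1 - phi_hat) * beta_hat
    return a**h * x_t + (1 - phi_hat) * n * beta_hat * (1 - a**h) / (1 - a)

print(cond_mean_forecast(x_t=3, n=10, alpha_hat=0.3, beta_hat=0.4,
                         phi_hat=0.5, h=2))  # typically not an integer
```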

However, this procedure will seldom produce an integer-valued \({\widehat{X}}_{t+h}\). To address this, Freeland and McCabe (2004) proposed using the h-step-ahead conditional forecasting distribution to forecast the future value; Freeland and McCabe (2004), Möller et al. (2016), Li et al. (2018) and Maiti and Biswas (2017) have applied this method to forecast their processes. As pointed out by Möller et al. (2016), this approach leads to forecasts that are themselves counts, and therefore coherent with the sample space, and point forecasts are easily obtained from the median or the mode of the forecasting distribution. By the Chapman–Kolmogorov equations, \(p_h(x_{t+h}|X_t)\), the h-step-ahead conditional distribution of \(X_{t+h}\) given \(X_t\) of the MPTBAR(1) process, is given by

$$\begin{aligned} p_h(x_{t+h}|X_t=x_t)=\mathrm {P}(X_{t+h}=x_{t+h}|X_t=x_t)=[{\mathbf {P}}^{h}]_{x_t,x_{t+h}}, \end{aligned}$$

where \(\mathbf {P}\) denotes the transition matrix defined by (7).
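
Numerically, the h-step-ahead conditional distribution is a row of the h-th power of \(\mathbf {P}\), from which the mode and median point forecasts discussed above follow directly; a sketch:

```python
import numpy as np

def h_step_distribution(P, x_t, h):
    # Row x_t of P^h: the conditional distribution p_h(. | X_t = x_t)
    # obtained from the Chapman-Kolmogorov equations.
    return np.linalg.matrix_power(P, h)[x_t]

dist = h_step_distribution(P, x_t=3, h=2)  # P: transition matrix from (7)
mode_forecast = int(np.argmax(dist))
median_forecast = int(np.searchsorted(np.cumsum(dist), 0.5))
```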

We write the h-step-ahead conditional distribution as \(p_h(x_{t+h}|X_t,\varvec{\eta })\), where \(\varvec{\eta }=(\alpha ,\beta ,\phi )\). Before implementing this forecasting method, we must estimate \(\varvec{\eta }\); in practice, the CML method can be used. By Theorem 2, the CML estimate \(\widehat{\varvec{\eta }}_{CML}\) is asymptotically normally distributed around the true value \(\varvec{\eta }_{0}\). Following Li et al. (2018), we have the next theorem.

Theorem 3

For a fixed \(x\in \{0,1,\ldots ,n\}\), the quantity \(p_h(x|X_t,\widehat{\varvec{\eta }}_{CML})\) has an asymptotically normal distribution, i.e.,

$$\begin{aligned} \sqrt{T-1}\big (p_h(x|X_t,\widehat{\varvec{\eta }}_{CML}) -p_h(x|X_t,\varvec{\eta }_{0})\big ) \mathop {\longrightarrow }\limits ^{d}\mathrm {N}(0,{\mathbf {DI}}^{\mathbf {-1}}(\varvec{\eta }) {\mathbf {D}}^{\top }), \end{aligned}$$

where \({\mathbf {D}}=\bigg (\frac{\partial p_h(x|X_t,\varvec{\eta })}{\partial \alpha }\Big |_{\varvec{\eta }=\varvec{\eta }_{0}},\frac{\partial p_h(x|X_t,\varvec{\eta })}{\partial \beta }\Big |_{\varvec{\eta }=\varvec{\eta }_{0}},\frac{\partial p_h(x|X_t,\varvec{\eta })}{\partial \phi }\Big |_{\varvec{\eta }=\varvec{\eta }_{0}}\bigg )\) and \({\mathbf {I}}(\varvec{\eta })\) is the Fisher information matrix from Theorem 2.

Based on Theorem 3, the \(100(1-\alpha )\%\) confidence interval for \(p_h(x_{t+h}|X_t,\varvec{\eta })\) can be given by

$$\begin{aligned} {\mathcal {C}}_{\varvec{\eta }}^{\alpha }=\Big (p_h(x|X_t,\widehat{\varvec{\eta }}_{CML})-\frac{\sigma }{\sqrt{T-1}}u_{1-\frac{\alpha }{2}},\; p_h(x|X_t,\widehat{\varvec{\eta }}_{CML})+\frac{\sigma }{\sqrt{T-1}}u_{1-\frac{\alpha }{2}}\Big ), \end{aligned}$$

where \(\sigma =\sqrt{{\mathbf {DI}}^{\mathbf {-1}}(\varvec{\eta }) {\mathbf {D}}^{\top }}\) and \(u_{1-\frac{\alpha }{2}}\) is the \((1-\frac{\alpha }{2})\)-quantile of N(0, 1).

To compare the two forecasting methods, we apply both to a real data set in Sect. 6.

6 Data analysis

In this section, the MPTBAR(1) model is first applied to a real data set for illustrative purposes. We consider a data set representing the monthly number of car beats in Pittsburgh (among \(n=42\) such car beats) that had at least one police offense report of prostitution in that month. The data consist of 96 observations, from January 1990 to December 1997, and have been investigated by Möller et al. (2018). Figure 3 shows the sample path, ACF and PACF of the observations. Descriptive statistics for the data are listed in Table 2. The binomial dispersion index of the data set is 1.28, indicating overdispersion.

Fig. 3 Sample path, ACF and PACF of the prostitution data

Table 2 Descriptive statistics for the prostitution data
Table 3 Estimates of the parameters and statistics for the prostitution data

We illustrate the competitiveness and usefulness of the MPTBAR(1) model in applications by comparing our process with the following models:

  • BAR(1) model (McKenzie 1985);

  • RZ-BAR(1), IZ-BAR(1), ZIB-AR(1) and ZT\(^{0}\)-BAR(1) models (Möller et al. 2018).

The unknown parameters of the fitted models are estimated by the (conditional) maximum likelihood method. In addition, the following statistics are reported: Akaike information criterion (AIC), Bayesian information criterion (BIC), the binomial index of dispersion \(\mathrm {BID}\) and the zero frequency \(p_0\).

From Table 3, we conclude that the BAR(1) model is not suitable for this data set because it fails to capture the zero-inflated and overdispersed characteristics of the data; the explanation for this phenomenon is that the marginal distribution of the BAR(1) model is binomial. While the RZ-BAR(1) and IZ-BAR(1) processes can capture the zero inflation and overdispersion of the data, these two models perform poorly in terms of AIC and BIC. Comparing the ZIB-AR(1) model with the ZT\(^{0}\)-BAR(1) model, we find that the ZIB-AR(1) model performs better on every statistic except BIC. Furthermore, the MPTBAR(1) and ZIB-AR(1) models provide the most satisfactory results for this data set: both perform well, and the MPTBAR(1) model gives the best fit in terms of AIC and BIC. Although both models describe the overdispersion accurately, the ZIB-AR(1) model performs slightly better than the MPTBAR(1) model in terms of \(\mathrm {BID}\). Since the zero frequencies of the two models are very close to the empirical zero frequency, we conclude that the MPTBAR(1) and ZIB-AR(1) models capture the zero-inflated feature of the data precisely. Based on these considerations, the MPTBAR(1) and ZIB-AR(1) models are the most appropriate for this data set.

Fig. 4 h-step-ahead forecasting conditional distribution of the prostitution data: (a) \(h=1\); (b) \(h=2\); (c) \(h=3\); (d) \(h=4\); (e) \(h=5\); (f) \(h=6\)

Table 4 Properties of the standardized Pearson residuals

For the models above, we consider the corresponding Pearson residual analysis. The standardized Pearson residuals are defined as

$$\begin{aligned} e_t=\frac{X_t-\mathrm {E}(X_t|X_{t-1})}{\sqrt{\mathrm {Var}(X_t|X_{t-1})}},\quad t=2,\ldots ,T. \end{aligned}$$
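
A sketch of this computation for the MPTBAR(1) model, using the one-step conditional moments of Proposition 2 (helper name ours):

```python
import numpy as np

def pearson_residuals(x, n, alpha, beta, phi):
    # Standardized Pearson residuals, using the one-step conditional
    # moments of Proposition 2.
    xp = x[:-1].astype(float)
    m1 = (phi * alpha - (1 - phi) * beta) * xp + (1 - phi) * n * beta
    m2 = ((n**2 - n) * (1 - phi) * beta**2 + n * (1 - phi) * beta
          + (phi * alpha**2 + (1 - phi) * beta**2) * xp**2
          + (phi * alpha - phi * alpha**2 + (1 - phi) * beta**2
             - 2 * n * (1 - phi) * beta**2 - (1 - phi) * beta) * xp)
    return (x[1:] - m1) / np.sqrt(m2 - m1**2)

e = pearson_residuals(x, n=10, alpha=0.3, beta=0.4, phi=0.5)
print(e.mean(), e.var())  # near 0 and 1 if the model is well specified
```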

As pointed out by Möller et al. (2018), if the model is correctly specified, the residuals should have zero mean, unit variance, and no significant serial correlation in \(e_t\) and \(e_t^2\). We report the mean, variance and first-order autocorrelation coefficients of \(e_t\) and \(e_t^2\) in Table 4. Comparing the properties of the residuals in Table 4, the values of \({\widehat{\rho }}_{e_t}(1)\) and \({\widehat{\rho }}_{e_t^2}(1)\) of the six models are competitive. Based on the mean and variance of the residuals, the MPTBAR(1) model gives the best performance; thus, the MPTBAR(1) process is the most appropriate model for this data set. Based on the above discussion, we conclude that the MPTBAR(1) process is a useful model for count data with bounded support and is suitable for capturing the binomial dispersion and zero-inflation characteristics of the data.

To check the predictability of the MPTBAR(1) model and compare the two forecasting methods discussed in Sect. 5, we now investigate the h-step-ahead forecasts of the prostitution data for varying h. Figure 4 shows the h-step-ahead conditional distributions for \(h=1,2,3,4,5,6\). Either the median or the mode of the h-step-ahead conditional distribution can be viewed as a point prediction. For comparison, a standard descriptive measure of forecasting accuracy, the predicted mean absolute deviation (PMAD), is adopted. This measure is given by

$$\begin{aligned} \mathrm {PMAD}=\frac{1}{H}\sum _{h=1}^{H}|X_{t+h}-{\widehat{X}}_{t+h}|. \end{aligned}$$

The conditional expectation and conditional distribution point predictors of the series are presented in Table 5. From Table 5, the PMAD value of the mode of the h-step-ahead conditional distribution predictor is smaller than that of the h-step-ahead conditional expectation predictor. The median of the h-step-ahead conditional distribution predictor performs poorly in terms of PMAD; the reason is that the median of the one-step-ahead conditional distribution is ten, which is much greater than the observed value zero. A possible explanation is that the one-step-ahead conditional distribution in Fig. 4 is heavy-tailed. Based on these facts, we conclude that the mode of the h-step-ahead conditional distribution is the more appropriate point predictor for this data set.

Fig. 5 Sample path, ACF and PACF of the drunken driving data

Table 5 h-step-ahead predictions for the real data
Table 6 Descriptive statistics for the drunken driving data
Table 7 Comparison of AIC and BIC for the SETBAR(1) model with different threshold values

As pointed out by a referee, the results obtained by fitting the MPTBAR(1) and BAR(1) models should also be compared with other competitors capable of coping with overdispersion and underdispersion. For this, we conduct a second application of the MPTBAR(1) model to a real data set for comparative purposes. We compare the MPTBAR(1) model with the BAR(1) and SETBAR(1) (Möller et al. 2016) models; the SETBAR(1) model with appropriate parameter settings is able to capture equidispersion, underdispersion and overdispersion. The second data set is computed from the file PghCarBeat.csv, downloaded from the Forecasting Principles site (http://www.forecastingprinciples.com). The data are given for 42 different car beats and cover January 1990 to December 1997. For each month t, the value \(x_t\) counts the number of car beats that reported at least one case of drunken driving. So our data have a finite range with fixed upper limit \(n=42\), and the series contains 96 observations.

Figure 5 shows the sample path, ACF and PACF of the observations. Descriptive statistics for the data are listed in Table 6. Table 7 shows the CML estimates, AIC and BIC for the SETBAR(1) model with different threshold values. From the sample path in Fig. 5, the threshold values \(R\in \{3,\ldots ,7\}\) form a reasonable range. Comparing AIC and BIC in Table 7, the SETBAR(1) model with threshold \(R=3\) is the best choice. Table 8 lists the CML estimates, AIC and BIC for the MPTBAR(1) model, the BAR(1) model and the SETBAR(1) model with threshold \(R=3\). From Table 8, the BAR(1) model gives the worst fit in terms of both AIC and BIC. Although the SETBAR(1) model gives the best fit in terms of AIC, the MPTBAR(1) model performs best in terms of BIC; thus, the MPTBAR(1) and SETBAR(1) models are competitive for this data set. However, using the SETBAR(1) model requires selecting a suitable threshold by experimentation, which may be inconvenient in practice. Based on these considerations, we recommend the MPTBAR(1) model for fitting this data set.

Table 8 Estimates of the parameters and statistics for the drunken driving data

7 Discussion

The aim of the present work is to introduce a mixture INAR(1) process to model count data with a finite range \(\{0,1,\ldots ,n\}\). Parameter estimation, forecasting and diagnostic checking for the new model are investigated, and applications to real data sets illustrate its use. However, one aspect of the MPTBAR(1) model still needs more research. As pointed out by a referee, (5) shows that the MPTBAR(1) model admits impossible one-step transitions; for example, \(\mathrm {P}(X_t=n|X_{t-1}=1)=0\) and \(\mathrm {P}(X_t=n-1|X_{t-1}=2)=\mathrm {P}(X_t=n|X_{t-1}=2)=0\). Following the referee’s suggestion, we consider the following mixture INAR(1) model \(\{X_t\}\) to fix this problem.

Definition 4

Let \(\phi , \alpha \in (0;1)\). Fix \(n\in {\mathbb {N}}\) and the initial value of the process \(X_0\in \{0,1,\ldots ,n\}\). The new model \(\{X_t\}\) is defined by the recursion

$$\begin{aligned} X_t=(\phi ,\alpha \circ X_{t-1})*(1-\phi ,\epsilon _t), \end{aligned}$$
(9)

where \(\circ \) and \(*\) are the binomial thinning and Pegram mixing operators, respectively, and \(\{\epsilon _t\}\) is a sequence of iid discrete random variables on \(\{0,1,\ldots ,n\}\) with mean \(\mu _{\epsilon }\) and variance \(\sigma _{\epsilon }^2\).

The one-step transition probabilities of this model are given by

$$\begin{aligned} \mathrm {P}(X_t=i|X_{t-1}=j)&=I_{\{i\le j\}}\Bigg (\phi \left( {\begin{array}{c}j\\ i\end{array}}\right) \alpha ^i(1-\alpha )^{j-i}+(1-\phi )\mathrm {P}(\epsilon _t=i)\Bigg ) \nonumber \\&\quad +I_{\{j<i\}}(1-\phi )\mathrm {P}(\epsilon _t=i) \nonumber \\&=I_{\{i\le j\}}\phi \left( {\begin{array}{c}j\\ i\end{array}}\right) \alpha ^i(1-\alpha )^{j-i}+(1-\phi )\mathrm {P}(\epsilon _t=i). \end{aligned}$$
(10)

From (10), all one-step transition probabilities are positive.
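
As a quick numerical check of this positivity, the earlier transition-matrix sketch can be adapted to (10) (helper names ours):

```python
import numpy as np
from scipy.stats import binom

def transition_matrix_eps(n, alpha, phi, eps_pmf):
    # Transition matrix of the modified model (9) via (10);
    # eps_pmf[i] = P(eps_t = i) on {0, ..., n}.
    i = np.arange(n + 1)
    P = np.empty((n + 1, n + 1))
    for j in range(n + 1):
        P[j] = phi * binom.pmf(i, j, alpha) + (1 - phi) * eps_pmf
    return P

# With eps_t ~ B(n, p), 0 < p < 1, every pmf value is positive, so no
# one-step transition is impossible.
n, p = 10, 0.4
P2 = transition_matrix_eps(n, alpha=0.3, phi=0.5,
                           eps_pmf=binom.pmf(np.arange(n + 1), n, p))
assert (P2 > 0).all()
```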

Next, we consider the one-step conditional moments. We obtain

$$\begin{aligned} \mathrm {E}(X_t|X_{t-1})&=\alpha \phi X_{t-1}+(1-\phi )\mu _{\epsilon },\\ \mathrm {E}(X_t^2|X_{t-1})&=\alpha ^2\phi X_{t-1}^2+(\alpha \phi -\alpha ^2\phi ) X_{t-1}+(1-\phi )(\mu _{\epsilon }^2+\sigma _{\epsilon }^{2}). \end{aligned}$$

Based on the one-step conditional moments, the mean and variance are given by

$$\begin{aligned} \mathrm {E}(X_{t})&=\frac{(1-\phi )\mu _{\epsilon }}{1-\alpha \phi },\\ \mathrm {Var}(X_{t})&=\frac{\mu _{\epsilon }(\alpha \phi -\alpha ^2\phi )(1-\phi )}{(1-\alpha \phi )(1-\alpha ^2\phi )} +\frac{(1-\phi )(\mu _{\epsilon }^2+\sigma _{\epsilon }^{2})}{1-\alpha ^2\phi } -\frac{\mu _{\epsilon }^2(1-\phi )^2}{(1-\alpha \phi )^2}. \end{aligned}$$

The binomial dispersion index of the model is given by

$$\begin{aligned} \mathrm {BID}&=\frac{n\mu _{\epsilon }(\alpha \phi -\alpha ^2\phi )(1-\phi )(1-\alpha \phi )}{[n\mu _{\epsilon }(1-\phi )(1-\alpha \phi )-\mu _{\epsilon }^2(1-\phi )^2](1-\alpha ^2\phi )}\\&\quad +\frac{n(1-\phi )(1-\alpha \phi )^2(\mu _{\epsilon }^2+\sigma _{\epsilon }^{2})}{[n\mu _{\epsilon }(1-\phi )(1-\alpha \phi )-\mu _{\epsilon }^2(1-\phi )^2](1-\alpha ^2\phi )}\\&\quad -\frac{n\mu _{\epsilon }^2(1-\phi )^2}{n\mu _{\epsilon }(1-\phi )(1-\alpha \phi )-\mu _{\epsilon }^2(1-\phi )^2}. \end{aligned}$$

We verify that the autocovariance function of the process defined in (9) is given by

$$\begin{aligned} \mathrm {Cov}(X_{t},X_{t+h})=(\alpha \phi )^h\mathrm {Var}(X_{t}),\quad h\in \{0,1,\ldots \}. \end{aligned}$$
(11)

From (11), the autocorrelation function is given by \(\mathrm {Corr}(X_{t},X_{t+h})=(\alpha \phi )^h\) for \(h\in \{0,1,\ldots \}\).

7.1 Special cases

Now we consider two special cases of the process defined by (9). In the first special case, \(\{\epsilon _t\}\) in (9) is assumed to follow the binomial distribution \(\mathrm {B}(n,p)\). Then the mean and variance of \(\epsilon _t\) are

$$\begin{aligned} \mathrm {E}(\epsilon _t)=np, \quad \mathrm {Var}(\epsilon _t)=np(1-p). \end{aligned}$$

The one-step transition probabilities of the model are

$$\begin{aligned} \mathrm {P}(X_t=i|X_{t-1}=j)&=I_{\{i\le j\}}\phi \left( {\begin{array}{c}j\\ i\end{array}}\right) \alpha ^i(1-\alpha )^{j-i}+(1-\phi )\left( {\begin{array}{c}n\\ i\end{array}}\right) p^i(1-p)^{n-i}. \end{aligned}$$

Following the referee’s suggestion and Möller et al. (2016), since the transition matrix is primitive and the state space of the model is finite, the process is ergodic with a uniquely determined stationary marginal distribution.

The one-step conditional moments of the process are given by

$$\begin{aligned} \mathrm {E}(X_t|X_{t-1})&=\alpha \phi X_{t-1}+(1-\phi )np,\\ \mathrm {E}(X_t^2|X_{t-1})&=\alpha ^2\phi X_{t-1}^2+(\alpha -\alpha ^2)\phi X_{t-1}+np(np+1-p)(1-\phi ). \end{aligned}$$

The mean and variance of the model are given by

$$\begin{aligned} \mathrm {E}(X_{t})&=\frac{(1-\phi )np}{1-\alpha \phi },\\ \mathrm {Var}(X_{t})&=\frac{np(\alpha -\alpha ^2)\phi (1-\phi )}{(1-\alpha \phi )(1-\alpha ^2\phi )} +\frac{np(np+1-p)(1-\phi )}{1-\alpha ^2\phi } -\frac{(np)^2(1-\phi )^2}{(1-\alpha \phi )^2}. \end{aligned}$$

The binomial dispersion index of the model is given by

$$\begin{aligned} \mathrm {BID}&=\frac{n^2p(\alpha -\alpha ^2)\phi (1-\phi )(1-\alpha \phi )}{[n^2p(1-\phi )(1-\alpha \phi )-(np)^2(1-\phi )^2](1-\alpha ^2\phi )}\\&\quad +\frac{n^2p(1-\phi )(1-\alpha \phi )^2(np+1-p)}{[n^2p(1-\phi )(1-\alpha \phi )-(np)^2(1-\phi )^2](1-\alpha ^2\phi )}\\&\quad -\frac{n(np)^2(1-\phi )^2}{n^2p(1-\phi )(1-\alpha \phi )-(np)^2(1-\phi )^2}. \end{aligned}$$

In the second special case, we assume that \(\{\epsilon _t\}\) in (9) follows the zero-inflated binomial distribution \(\mathrm {ZIB}(n,p,\pi )\). The probability mass function of \(\mathrm {ZIB}(n,p,\pi )\) is given by

$$\begin{aligned} \mathrm {P}(\epsilon _t=k)=\left\{ \begin{array}{l} {\pi +(1-\pi )(1-p)^n}, \quad k=0,\\ (1-\pi )\left( {\begin{array}{c}n\\ k\end{array}}\right) p^k(1-p)^{n-k},\quad k=1,2,\ldots ,n. \end{array} \right. \end{aligned}$$

The mean and variance of \(\mathrm {ZIB}(n,p,\pi )\) are

$$\begin{aligned} \mathrm {E}(\epsilon _t)=np(1-\pi ), \quad \mathrm {Var}(\epsilon _t)=(1-\pi )np(1-p)+(1-\pi )(np)^2-(1-\pi )^2(np)^2. \end{aligned}$$

The one-step transition probabilities of the model are

$$\begin{aligned} \mathrm {P}(X_t=i|X_{t-1}=j)&=I_{\{i\le j\}}\phi \left( {\begin{array}{c}j\\ i\end{array}}\right) \alpha ^i(1-\alpha )^{j-i}\\&\quad +(1-\phi )\Bigg (\pi I_{\{i=0\}}+(1-\pi )\left( {\begin{array}{c}n\\ i\end{array}}\right) p^i(1-p)^{n-i}\Bigg ). \end{aligned}$$

Similarly, the model is ergodic with uniquely determined stationary marginal distribution.

The one-step conditional moments of the model are given by

$$\begin{aligned} \mathrm {E}(X_t|X_{t-1})&=\alpha \phi X_{t-1}+np(1-\phi )(1-\pi ),\\ \mathrm {E}(X_t^2|X_{t-1})&=\alpha ^2\phi X_{t-1}^2+(\alpha \phi -\alpha ^2\phi )X_{t-1}\\&\quad +np(1-p)(1-\phi )(1-\pi )+(np)^2(1-\phi )(1-\pi ). \end{aligned}$$

The mean and variance of the model are

$$\begin{aligned} \mathrm {E}(X_{t})&=\frac{np(1-\phi )(1-\pi )}{1-\alpha \phi },\\ \mathrm {Var}(X_{t})&=\frac{np(\alpha -\alpha ^2)\phi (1-\phi )(1-\pi )}{(1-\alpha \phi )(1-\alpha ^2\phi )}\\&\quad +\frac{np(np+1-p)(1-\phi )(1-\pi )}{1-\alpha ^2\phi } -\frac{(np)^2(1-\phi )^2(1-\pi )^2}{(1-\alpha \phi )^2}. \end{aligned}$$

The binomial dispersion index \(\mathrm {BID}\) of the model is given by

$$\begin{aligned} \mathrm {BID}&=\frac{n^2p(\alpha -\alpha ^2)(1-\alpha \phi )\phi (1-\phi )(1-\pi )}{[n^2p(1-\alpha \phi )(1-\phi )(1-\pi )-(np)^2(1-\pi )^2(1-\phi )^2](1-\alpha ^2\phi )}\\&\quad +\frac{n^2p(1-\phi )(1-\alpha \phi )^2(1-\pi )(1-p+np)}{[n^2p(1-\pi )(1-\phi )(1-\alpha \phi )-(np)^2(1-\pi )^2(1-\phi )^2](1-\alpha ^2\phi )}\\&\quad -\frac{n(np)^2(1-\pi )^2(1-\phi )^2}{n^2p(1-\pi )(1-\phi )(1-\alpha \phi )-(np)^2(1-\pi )^2(1-\phi )^2}. \end{aligned}$$

Also, the marginal distributions of the two special models can be obtained by solving the invariance equation \({\varvec{\Pi }}^{'}={\varvec{\Pi }}^{'}{{\mathbf {P}}^{'}}\), where \({\varvec{\Pi }}^{'}\) and \({{\mathbf {P}}^{'}}\) denote the marginal distribution and transition matrix of the corresponding process.