1 Introduction

Count models are used in many theoretical and practical disciplines, including engineering, health, transportation, and insurance. Data science approaches have been used to describe pandemic behaviour, crop harvesting, business data mining, e-commerce fraud, and other challenges (see Tien 2017). One of the most important uses of statistics is the representation of natural events or other real-world circumstances through a probability function whose distribution fits those events. To express such incidents using a random variable (rv), we must first model them, and any rv can be represented by a probability distribution function, which may be discrete, continuous, or mixed. In this article, we provide a mixed count model based on the Lagrange expansion given in Jenson (1902).

The Poisson distribution is one of the models most often used in the literature for count data. However, because of its unique characteristics, this distribution is inappropriate for most count data, especially when there are issues with overdispersion or underdispersion. Most count data deviate from the equidispersion of the Poisson distribution, which restricts the uses of this distribution (see Kusumawati and Wong 1987; Khan et al. 2018). Researchers have proposed mixed-Poisson distributions for modeling count datasets as a potential remedy for this problem. For instance, Bhati et al. (2017) created the Poisson-transmuted exponential distribution, a new mixed-Poisson distribution, by combining the Poisson distribution with the transmuted exponential distribution. The Poisson–Bilal distribution was first introduced by Altun (2020a). Altun et al. (2021) introduced the Poisson–Xgamma distribution (PXGD). The Poisson-generalized Lindley distribution (PGLD) was first developed by Altun (2021). A detailed literature overview on mixed-Poisson distributions may be found in Karlis and Xekalaki (2005). Numerous researchers have proposed using generalized distributions to deal with circumstances where many non-homogeneous occurrences arise and common distributions are unsuccessful in explaining the behavior of their problems. Generalized distributions can represent both homogeneous and heterogeneous populations and are significantly wider than their usual forms; see Consul and Jain (1973), Wagh and Kamalja (2017) and Bhattacharyya et al. (2021).

The generalized Poisson distribution (GPD) was created by Consul and Jain (1973) using the Lagrange expansion described in Jenson (1902). The GPD is better suited to many forms of data with overdispersion or underdispersion than the classical Poisson distribution, which lacks dispersion flexibility. According to Consul and Jain (1973), the variance of the GPD is greater than, equal to, or less than the mean depending on whether the value of the parameter is positive, zero, or negative, respectively. They also showed that the variance and mean values increase with the parameter values; see Khan et al. (2018) and Wagh and Kamalja (2017). Several statistical applications prefer the GPD model, which generalizes the Poisson distribution. The characteristics of the GPD and its ability to represent overdispersed and underdispersed data, as well as equidispersed data, make it a desirable distribution in distribution theory and a variety of applications, including branching processes, queuing theory, science, ecology, biology, and genetics. Moreover, the GPD occupies a central place in the theory of Lagrangian distributions.

The idea of this work is based on the mixture of the GPD and the XLindley distribution (XLD) using the Lagrangian expansion given in Jenson (1902). Chouia and Zeghdoudi (2021) developed the XLD by mixing the exponential and Lindley distributions. This work is motivated by the following: the XLD is simple and easy to apply, and, in general, it is preferable to try simpler distributions before more complicated ones; moreover, the XLD can be used quite effectively in analyzing many real lifetime datasets, with applications to Ebola, Corona and Nipah virus data, and gives adequate fits to many datasets. For more details, see Chouia and Zeghdoudi (2021).

Several disciplines, including agriculture, biology, ecology, engineering, epidemiology, sociology, etc., frequently use count data with extra zeros. Examples of such data include the number of women over 80 who pass away each day (see Hasselblad 1969), the number of fetal movements per second (Leroux and Puterman 2011), the number of HIV-positive patients (Van den Broek 1995), the number of ambulance calls for illnesses brought on by the heat (Bassil et al. 2011), and the number of health services visits during a follow-up time (Feng 2021). To explain count data with excess zeros, a number of zero-inflated models, such as the zero-inflated Poisson distribution (ZIPD), the zero-inflated negative binomial distribution, and many others, have been studied in the literature (see Wagh and Kamalja 2018). Zero-inflated models are becoming more and more common in several areas. In this article, we also develop the zero-inflated version of the GPXLD, named the zero-inflated GPXLD (ZIGPXLD).

The rest of the article is organized as follows. A detailed description of the Lagrange expansion and the XLD is given in Sect. 2. The definition of the new distribution and some of its special cases are discussed in Sect. 3. Some mathematical properties and other details are presented in Sect. 4. In Sect. 5, the maximum likelihood estimation technique is used to estimate the unknown parameters of the new distribution. The performance of the maximum likelihood estimates of the GPXLD parameters is studied using simulation in Sect. 6. A zero-inflated model with respect to the new distribution is discussed in Sect. 7. Applications and empirical studies of the new model on two real datasets are conducted in Sect. 8. Finally, Sect. 9 provides concluding remarks.

2 Some Preliminaries

In this section, we define the XLD and give some mathematical background on the generalized Lagrangian family.

2.1 Generalized Lagrangian Family (GLF)

Let g(z) and h(z) be two analytic functions of z,  which are successively differentiable in \([-1,1]\) such that \(g(1)=h(1)=1,\) and \(g(0)\ne {0}.\) Lagrange considered the inversion of the Lagrange transformation \(u=\frac{z}{g(z)},\) and expressed it as a power series of u. Jenson (1902) defined the Lagrange expansion as:

$$\begin{aligned} {} h(u)=h(0)+\sum _{x=1}^{\infty }\frac{u^{x}}{x!}\, \biggl \{D^{x-1}\big [\big (g(z)\big )^{x}h^{\prime }(z)\big ]\biggr \}\bigg |_{z=0}, \end{aligned}$$
(1)

where \(D^{r}=\frac{\partial ^{r}}{\partial {z^{r}}}\) and \(h^{\prime }(z)=\frac{\partial {h(z)}}{\partial {z}}.\)

If every term in the series (1) is non-negative, the series becomes a probability generating function (pgf) in u and gives the probability mass function (pmf) of the discrete GLF, which is as follows:

$$\begin{aligned} {} P(X=x)=\left\{ \begin{array}{ll} h(0) &{}\quad x=0,\\ \frac{\biggl \{D^{x-1}\big [\big (g(z)\big )^{x}h^{\prime }(z)\big ]\biggr \}\bigg |_{z=0}}{x!}&{}\quad x=1,2,3\ldots \end{array}\right. \end{aligned}$$
(2)

Using the Lagrange expansion described in (1), Consul and Shenton (1972) defined and studied the discrete GLF. For more references on the discrete GLF, see Consul and Famoye (2006).

Following Li et al. (2006), it is possible to obtain the Lagrangian probability model by relaxing the assumption that \(g(1)=h(1)=1.\) With this relaxation, we create a novel discrete mixture distribution based on the pmf of the discrete GLF given in (2).

2.2 The XLindley Distribution

A rv T follows an XLD, denoted as \(T\sim XLD(\theta ),\) if its probability density function (pdf) is given by

$$\begin{aligned} f_{T}(t)=\frac{\theta ^{2}\left( 2+\theta +t\right) e^{-\theta {t}}}{\left( 1+\theta \right) ^{2}},\quad t>0,\ \theta >0. \end{aligned}$$
(3)

Now, the cumulative distribution function (cdf) of the XLD is given as

$$\begin{aligned} F_{T}(t)=1-\bigg (1+\frac{\theta {t}}{(1+\theta )^{2}}\bigg )e^{-\theta{t}}, \end{aligned}$$

with \(t>0\) and \(\theta >0.\)

The rth distributional moment \((\mu _{r})\) associated with the XLD is given by

$$\begin{aligned} \mu _{r}=E(T^{r})=\frac{\left( \theta ^{2}+2\theta +r+1\right) r!}{\left( 1+\theta \right) ^{2}\theta ^{r}},\quad r=1,2,3\ldots \end{aligned}$$

We have employed the gamma function defined by \(\Gamma (m)=\int _{0}^{\infty }t^{m-1}e^{-t}dt,\) with the relation \(\Gamma (m)=(m-1)!\) for any positive integer m.
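As a quick numerical sanity check, the pdf (3), the cdf, and the moment formula above can be verified in a few lines. The sketch below uses Python with SciPy's quad for the numerical integration; this is purely our illustration and not part of the original derivation:

```python
import math
from scipy.integrate import quad

def xld_pdf(t, theta):
    """pdf (3) of the XLindley distribution."""
    return theta**2 * (2 + theta + t) * math.exp(-theta * t) / (1 + theta)**2

def xld_cdf(t, theta):
    """cdf of the XLD."""
    return 1 - (1 + theta * t / (1 + theta)**2) * math.exp(-theta * t)

def xld_moment(r, theta):
    """Closed-form r-th moment of the XLD."""
    return (theta**2 + 2 * theta + r + 1) * math.factorial(r) / ((1 + theta)**2 * theta**r)

theta = 1.5  # arbitrary illustrative value
# the pdf integrates to one, and its partial integral matches the cdf
assert abs(quad(xld_pdf, 0, math.inf, args=(theta,))[0] - 1) < 1e-8
assert abs(quad(xld_pdf, 0, 2.0, args=(theta,))[0] - xld_cdf(2.0, theta)) < 1e-8
# closed-form moments agree with numerical integration for r = 1, 2, 3
for r in (1, 2, 3):
    num = quad(lambda t: t**r * xld_pdf(t, theta), 0, math.inf)[0]
    assert abs(num - xld_moment(r, theta)) < 1e-6
```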

The graphical depiction of the pdf of the XLD is shown in the plots in Fig. 1. To learn more about the XLD, see Chouia and Zeghdoudi (2021).

Fig. 1
figure 1

Various pdf shapes of the XLD for different parameter values

3 The Generalized Poisson–XLindley Distribution

The following theorem from Li et al. (2008) is used with the Lagrangian probability model to generate the novel mixture of the XLD:

Theorem 3.1

Let \(g(z)>0\) and \(h(z)>0\) (for all \(z>0\)) be analytic functions such that \(g(0)\ne {0},\) \(\biggl \{D^{x-1}\left[ \left( g(z)\right) ^{x}h^{\prime }(z)\right] \biggr \}_{z=0}{\ge {0}},\) and \(h(0)\ge {0},\) where \(D=\frac{\partial }{\partial {z}}\) is a derivative operator. If the series

$$\begin{aligned} h(u)=h(0)+\sum _{x=1}^{\infty }\frac{u^{x}}{x!}\biggl \{D^{x-1}\left[ \left( g(z)\right) ^{x}h^{\prime }(z)\right] \biggr \}\bigg |_{z=0} \end{aligned}$$

converges uniformly on any closed and bounded interval, then a rv X has a uniform mixture of the Lagrangian probability model with the pmf

$$\begin{aligned} {} P(X=x)=\left\{ \begin{array}{ll} \int _{0}^{1}\bigl \{\frac{h(0)}{h(t)}\bigr \}dt,&{}\quad x=0,\\ \int _{0}^{1}\biggl \{\frac{\big (\frac{t}{g(t)}\big )^{x}}{x! h(t)}\biggl \{D^{x-1}\left[ \left( g(z)\right) ^{x}h^{\prime }(z)\right] \biggr \}\bigg |_{z=0}\biggr \}dt,&{}\quad x\ge {1}. \end{array}\right. \end{aligned}$$
(4)

Proof

Proof is given in Li et al. (2008) and hence omitted. \(\square\)

Theorem 3.2

Let g(t) and h(t) satisfy the conditions in Theorem 3.1, and let f(t) be a pdf for some continuous rv T. Then the pmf of X, a continuous mixture of the Lagrangian probability model, is given by

$$\begin{aligned} {} P(X=x)=\left\{ \begin{array}{ll} h(0)\int _{-\infty }^{\infty }\big (\frac{f(t)}{h(t)}\big )dt,&{}\quad x=0,\\ \\ \int _{-\infty }^{\infty }\,\biggl \{f(t)\frac{\big (\frac{t}{g(t)}\big )^{x}}{x! h(t)}\biggl \{D^{x-1}\left[ \left( g(z)\right) ^{x}h^{\prime }(z)\right] \biggr \}\bigg |_{z=0}\biggr \}dt,&{}\quad x\ge {1}. \end{array}\right. \end{aligned}$$
(5)

Proof

Proof is given in Li et al. (2008) and hence omitted. \(\square\)

Proposition 3.1

Assume that X follows the new mixture GPXLD with \(\lambda >0,\) \(0<\rho <1\) and \(\theta >0,\) then the pmf of X is given by

$$\begin{aligned} {} p(x)=\frac{\lambda \left( \lambda +\rho {x}\right) ^{x-1}\theta ^{2}}{\left( 1+\theta \right) ^{2}\left( \theta +\lambda +\rho {x}\right) ^{x+2}}\biggl \{\left( 2+\theta \right) \left( \theta +\lambda +\rho {x}\right) +x+1\biggr \},\quad x=0,1,2,\ldots \end{aligned}$$
(6)

This distribution is denoted as GPXLD\((\lambda ,\rho ,\theta ),\) and one can note \(X\sim {GPXLD(\lambda ,\rho ,\theta )}\) to inform that X follows the GPXLD with parameters \(\lambda ,\) \(\rho\) and \(\theta .\)

Proof

Let \(g(z)=e^{\rho {z}}\) and \(h(z)=e^{\lambda {z}},\) where \(0<\rho <1\) and \(\lambda >0.\) Under the transformation \(z=ue^{\rho {z}}\) and using the Lagrange expansion given in (1), we have

$$\begin{aligned} e^{\lambda {z}}&=1+\sum _{x=1}^{\infty }\,\frac{u^{x}}{x!}\,D^{x-1}\bigg [\left( e^{\rho {z}}\right) ^{x}\lambda \,e^{\lambda {z}}\bigg ]\bigg |_{z=0}\\ &=1+\sum _{x=1}^{\infty }\,\frac{\lambda u^{x}}{x!}\,D^{x-1}\bigg [e^{(\lambda +\rho {x})z}\bigg ]\bigg |_{z=0}\\ &=1+\sum _{x=1}^{\infty }\frac{\lambda }{x!}\bigg (\frac{z}{g(z)}\bigg )^{x}(\lambda +\rho {x})^{x-1}\\ &=1+\sum _{x=1}^{\infty }\frac{\lambda }{x!}\bigg (\frac{z}{e^{\rho {z}}}\bigg )^{x}(\lambda +\rho {x})^{x-1}, \end{aligned}$$

substituting \(z=t,\) we get

$$\begin{aligned} e^{\lambda {t}}=\sum _{x=0}^{\infty }\frac{\lambda \left( te^{-\rho t}\right) ^{x}\left( \lambda +\rho {x}\right) ^{x-1}}{x!}, \end{aligned}$$

which implies

$$\begin{aligned} 1=\sum _{x=0}^{\infty }\frac{\lambda {t}\left( \lambda {t}+\rho {tx}\right) ^{x-1}\,e^{-\lambda {t}-\rho {tx}}}{x!}, \end{aligned}$$

when \(t = 1\) the above formulation reduces to the GPD given in Consul and Jain (1973).

Therefore, by Theorem 3.1, we have a uniform mixture of GPD as:

$$\begin{aligned} P(X=x)&=\int _{0}^{1}\frac{\lambda {t}\left( \lambda {t}+\rho {tx}\right) ^{x-1}\,e^{-\lambda {t}-\rho {tx}}}{x!}dt\\ &=\frac{\lambda }{\left( \lambda +\rho {x}\right) ^{2}}\bigg [1-e^{-(\lambda +\rho {x})}\sum _{j=0}^{x}\frac{\left( \lambda +\rho {x}\right) ^{j}}{j!}\bigg ], \end{aligned}$$

where \(x=0,1,2,\ldots .\)

Clearly, g(t) and h(t) generate a Lagrangian probability model, which satisfies the conditions given in Theorem 3.1. More generally, assuming that the conditions given in Theorem 3.1 hold, let the variable t be a continuous rv following the XLD with pdf,

$$\begin{aligned} f(t)=\frac{\theta ^{2}\left( 2+\theta +t\right) e^{-\theta {t}}}{\left( 1+\theta \right) ^{2}},\quad t>0,\ \theta >0. \end{aligned}$$

By using Theorem 3.2, the pmf of the proposed new mixture model is obtained as follows:

$$\begin{aligned} p(x)&=\int _{0}^{\infty }\bigg (\frac{\theta ^{2}\left( 2+\theta +t\right) e^{-\theta {t}}}{\left( 1+\theta \right) ^{2}}\bigg )\frac{t^{x}e^{-\lambda {t}-\rho {tx}}}{x!}\,\lambda \left( \lambda +\rho {x}\right) ^{x-1}dt\\ &=\frac{\lambda \left( \lambda +\rho {x}\right) ^{x-1}\theta ^{2}}{x!\left( 1+\theta \right) ^{2}}\biggl \{(2+\theta )\int _{0}^{\infty }t^{x}e^{-\left( \theta +\lambda +\rho {x}\right) t}dt+\int _{0}^{\infty }t^{x+1}e^{-\left( \theta +\lambda +\rho {x}\right) t}dt\biggr \}\\ &=\frac{\lambda \left( \lambda +\rho {x}\right) ^{x-1}\theta ^{2}}{x!\,(1+\theta )^{2}}\biggl \{\frac{(2+\theta )\Gamma (x+1)}{\left( \theta +\lambda +\rho {x}\right) ^{x+1}}+\frac{\Gamma (x+2)}{\left( \theta +\lambda +\rho {x}\right) ^{x+2}}\biggr \}\\ &=\frac{\lambda \left( \lambda +\rho {x}\right) ^{x-1}\theta ^{2}}{\left( 1+\theta \right) ^{2}\left( \theta +\lambda +\rho {x}\right) ^{x+2}}\biggl \{\left( 2+\theta \right) \left( \theta +\lambda +\rho {x}\right) +x+1\biggr \}. \end{aligned}$$

Hence the proof. \(\square\)
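The pmf (6) can be checked numerically against its defining mixture, i.e. by integrating the conditional GPD pmf with parameters \(\lambda t\) and \(\rho t\) against the XLD pdf. The short Python sketch below (our own illustration; parameter values are arbitrary) performs this check:

```python
import math
from scipy.integrate import quad

def gpxld_pmf(x, lam, rho, theta):
    """pmf (6) of the GPXLD."""
    a = theta + lam + rho * x
    return (lam * (lam + rho * x)**(x - 1) * theta**2
            * ((2 + theta) * a + x + 1) / ((1 + theta)**2 * a**(x + 2)))

def mixture_pmf(x, lam, rho, theta):
    """Direct mixture: integrate the GPD(lam*t, rho*t) pmf against the XLD pdf."""
    def integrand(t):
        gpd = (lam * t * (lam * t + rho * t * x)**(x - 1)
               * math.exp(-lam * t - rho * t * x) / math.factorial(x))
        xld = theta**2 * (2 + theta + t) * math.exp(-theta * t) / (1 + theta)**2
        return gpd * xld
    return quad(integrand, 0, math.inf)[0]

lam, rho, theta = 0.7, 0.16, 0.28
for x in range(6):
    # closed form (6) agrees with the numerically evaluated mixture
    assert abs(gpxld_pmf(x, lam, rho, theta) - mixture_pmf(x, lam, rho, theta)) < 1e-6
```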

Proposition 3.2

If \(\rho =0,\) the GPD reduces to the Poisson distribution. On doing this, we obtain the Poisson mixture of XLD with parameters \(\lambda\) and \(\theta .\)

Proof

$$\begin{aligned} p(x)&=\int _{0}^{\infty }\frac{\theta ^{2}(2+\theta +t)e^{-\theta {t}}}{(1+\theta )^{2}}\frac{t^{x}e^{-\lambda {t}}\lambda ^{x}}{x!}dt\\ &=\frac{\theta ^{2}\lambda ^{x}}{(1+\theta )^{2}x!}\int _{0}^{\infty }(2+\theta +t)e^{-\theta {t}}t^{x}e^{-\lambda {t}}dt\\ &=\frac{\theta ^{2}\lambda ^{x}}{(1+\theta )^{2}x!}\bigg [\int _{0}^{\infty }(2+\theta )t^{x}e^{-(\theta +\lambda )t}dt+\int _{0}^{\infty }t^{x+1}e^{-(\theta +\lambda )t}dt\bigg ]\\ &=\frac{\lambda ^{x}\theta ^{2}}{(1+\theta )^{2}(\theta +\lambda )^{x+2}}\big [(2+\theta )(\theta +\lambda )+x+1\big ]. \end{aligned}$$

Hence the proof. \(\square\)
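Proposition 3.2 can also be confirmed numerically: setting \(\rho =0\) in pmf (6) reproduces the Poisson–XLindley pmf derived above. A minimal Python check (our illustration; the parameter values are arbitrary):

```python
def gpxld_pmf(x, lam, rho, theta):
    """pmf (6) of the GPXLD."""
    a = theta + lam + rho * x
    return (lam * (lam + rho * x)**(x - 1) * theta**2
            * ((2 + theta) * a + x + 1) / ((1 + theta)**2 * a**(x + 2)))

def pxld_pmf(x, lam, theta):
    """Poisson-XLindley pmf of Proposition 3.2 (the rho = 0 special case)."""
    return (lam**x * theta**2 * ((2 + theta) * (theta + lam) + x + 1)
            / ((1 + theta)**2 * (theta + lam)**(x + 2)))

# the rho = 0 case of (6) coincides with the Poisson mixture of the XLD
for x in range(10):
    assert abs(gpxld_pmf(x, 1.2, 0.0, 0.9) - pxld_pmf(x, 1.2, 0.9)) < 1e-12
# the special case is a proper pmf (geometric-rate tail, so truncation error is tiny)
assert abs(sum(pxld_pmf(x, 1.2, 0.9) for x in range(200)) - 1) < 1e-8
```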

Figure 2 displays the graphical representation of the pmf of the GPXLD for different values of the parameters \(\lambda ,\) \(\rho\) and \(\theta .\)

The hrf of the GPXLD is obtained by substituting the pmf of the GPXLD in the following equation

$$\begin{aligned} {} h(x)=P(X=x|X\ge {x}) =\frac{p(x)}{\sum _{j=x}^{\infty }p(j)}. \end{aligned}$$
(7)

From (7), it is clear that a closed-form expression of the hrf is intricate to determine; nevertheless, in order to determine its shape, we sketch its graph. Figure 3 demonstrates that the hrf of the GPXLD exhibits all of the typical shapes, such as decreasing, upside-down bathtub and increasing shapes, for varying parameter values.
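Although the hrf (7) lacks a simple closed form, it is straightforward to evaluate numerically, because the infinite tail sum in the denominator equals one minus the cumulative sum of the pmf. A small Python sketch of ours (parameter values arbitrary):

```python
def gpxld_pmf(x, lam, rho, theta):
    """pmf (6) of the GPXLD."""
    a = theta + lam + rho * x
    return (lam * (lam + rho * x)**(x - 1) * theta**2
            * ((2 + theta) * a + x + 1) / ((1 + theta)**2 * a**(x + 2)))

def gpxld_hrf(x, lam, rho, theta):
    """hrf (7): p(x) divided by the survival mass sum_{j>=x} p(j) = 1 - P(X < x)."""
    tail = 1.0 - sum(gpxld_pmf(j, lam, rho, theta) for j in range(x))
    return gpxld_pmf(x, lam, rho, theta) / tail

values = [gpxld_hrf(x, 0.7, 0.16, 0.28) for x in range(8)]
assert all(0 < v < 1 for v in values)              # valid conditional probabilities
assert values[0] == gpxld_pmf(0, 0.7, 0.16, 0.28)  # h(0) = p(0) since P(X >= 0) = 1
```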

Fig. 2
figure 2

Various shapes of pmf of the GPXLD for different parameter values

Fig. 3
figure 3

Various shapes of hrf of the GPXLD for different parameter values

4 Mathematical Properties

In this section, different structural properties of the GPXLD are derived. These include the median, the mode, the non-central moments, etc.

4.1 Median

Let X be a rv following the GPXLD. Then the median of X is the smallest integer m in \(\left\{ 0,1,2,\ldots \right\}\) such that \(P(X\le {m})\ge {\frac{1}{2}},\) that is,

$$\begin{aligned} \sum _{x=0}^{m}\biggl \{\frac{\left( \lambda +\rho {x}\right) ^{x-1}\big ((2+\theta )(\theta +\lambda +\rho {x})+x+1\big )}{\left( \theta +\lambda +\rho {x}\right) ^{x+2}}\biggr \}\ge \frac{\left( 1+\theta \right) ^{2}}{2\,\lambda {\theta ^{2}}}, \end{aligned}$$
(8)

which is equivalent to the desired result.
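In practice the median is found by accumulating the pmf until the cumulative probability reaches 1/2, which is exactly condition (8). A Python sketch of ours (the parameter values are chosen arbitrarily for illustration):

```python
def gpxld_pmf(x, lam, rho, theta):
    """pmf (6) of the GPXLD."""
    a = theta + lam + rho * x
    return (lam * (lam + rho * x)**(x - 1) * theta**2
            * ((2 + theta) * a + x + 1) / ((1 + theta)**2 * a**(x + 2)))

def gpxld_median(lam, rho, theta):
    """Smallest integer m with P(X <= m) >= 1/2."""
    cum, m = 0.0, 0
    while True:
        cum += gpxld_pmf(m, lam, rho, theta)
        if cum >= 0.5:
            return m
        m += 1

# for these values p(0) = 52/81 > 1/2 already, so the median is 0
assert gpxld_median(1.0, 0.1, 2.0) == 0
```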

4.2 Mode

Let X be a rv following the GPXLD. Then the mode of X, denoted by \(x_{m},\) is the value in \(\left\{ 0,1,2,\ldots \right\}\) at which the pmf attains its maximum. It is obtained as follows:

We must find the integer \(x=x_{m}\) for which f(x) has the greatest value. That is, we aim to solve \(f(x)\ge {f(x-1)}\) and \(f(x)\ge {f(x+1)}.\) First, note that f(x) can also be written as:

$$\begin{aligned} f(x)=\frac{\lambda \,\theta ^{2}}{(1+\theta )^{2}}\,\eta (x), \end{aligned}$$

where

$$\begin{aligned} \eta (x)=\frac{\left( \lambda +\rho {x}\right) ^{x-1}\bigl \{(2+\theta )(\theta +\lambda +\rho {x})+x+1\bigr \}}{\left( \theta +\lambda +\rho {x}\right) ^{x+2}}. \end{aligned}$$

Obviously, \(f(x)\ge {f(x-1)}\) implies that

$$\begin{aligned} {} \frac{\eta (x)}{\eta {(x-1)}}\ge {1}. \end{aligned}$$
(9)

Also, \(f(x)\ge {f(x+1)}\) implies that

$$\begin{aligned} {} \frac{\eta {(x)}}{\eta (x+1)}\ge {1}. \end{aligned}$$
(10)

By combining (9) and (10), we get (11).

$$\begin{aligned} {} \eta (x_{m})\ge \eta (x_{m}-1)\quad \text{and}\quad \eta (x_{m})\ge {\eta (x_{m}+1)}. \end{aligned}$$
(11)
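Condition (11) can be checked by a direct search over the support, since the pmf is proportional to \(\eta (x)\). A Python sketch of ours (arbitrary parameter values; the search range is a practical truncation):

```python
def gpxld_pmf(x, lam, rho, theta):
    """pmf (6) of the GPXLD."""
    a = theta + lam + rho * x
    return (lam * (lam + rho * x)**(x - 1) * theta**2
            * ((2 + theta) * a + x + 1) / ((1 + theta)**2 * a**(x + 2)))

def gpxld_mode(lam, rho, theta, upper=200):
    """Direct search for the x maximising the pmf (equivalently eta(x))."""
    return max(range(upper), key=lambda x: gpxld_pmf(x, lam, rho, theta))

xm = gpxld_mode(1.0, 0.1, 2.0)
# condition (11): the mode dominates its right neighbour
assert gpxld_pmf(xm, 1.0, 0.1, 2.0) >= gpxld_pmf(xm + 1, 1.0, 0.1, 2.0)
```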

4.3 rth Order Non-central Moment

The rth non-central moment \(\mu _{r}^{\prime }=E(X^{r})\) of the rv X from the pmf given in (5) is:

$$\begin{aligned} {} \mu _{r}^{\prime }=E(X^{r})=\sum _{x=0}^{\infty }x^{r}p(x) \end{aligned}$$
(12)

and

$$\begin{aligned} {} E(X^{r})=\sum _{x=0}^{\infty }x^{r}\int _{0}^{\infty }f(t)\frac{t^{x}}{x!\left( g(t)\right) ^{x}h(t)}\bigg [D^{x-1}\left( g(z)\right) ^{x}h^{\prime }(z)\bigg ]\bigg |_{z=0}dt. \end{aligned}$$
(13)

Then

$$\begin{aligned} {} E(X)=\int _{0}^{\infty }\frac{g(t)}{h(t)}\sum _{x=0}^{\infty }x\,\frac{t^{x}}{\left( g(t)\right) ^{x}x!}\bigg [D^{x-1}\left( g(z)\right) ^{x}h^{\prime }(z)\bigg ]\bigg |_{z=0}dt. \end{aligned}$$
(14)

Jenson (1902) showed that the Lagrange expansion could be written as

$$\begin{aligned} {} h(t)=h(0)+\sum _{x=1}^{\infty }\frac{\big (\frac{t}{g(t)}\big )^{x}}{x!}\bigg [D^{x-1}\left( g(z)\right) ^{x}h^{\prime }(z)\bigg ]\bigg |_{z=0}. \end{aligned}$$
(15)

Taking the first derivative of (15) partially with respect to t,  we have

$$\begin{aligned} {} D^{1}\left[ h(t)\right] =\left( \frac{g(t)}{t}\right) \,D^{1}\left[ \frac{t}{g(t)}\right] \sum _{x=1}^{\infty }\frac{x\,\big (\frac{t}{g(t)}\big )^{x}}{x!}\bigg [D^{x-1}\left( g(z)\right) ^{x}h^{\prime }(z)\bigg ]\bigg |_{z=0}, \end{aligned}$$
(16)

which implies that

$$\begin{aligned} {} \frac{t\,D^{1}(h(t))}{g(t)D^{1}\left( \frac{t}{g(t)}\right) }=\sum _{x=1}^{\infty }\frac{x\,\big (\frac{t}{g(t)}\big )^{x}}{x!}\bigg [D^{x-1}\left( g(z)\right) ^{x}h^{\prime }(z)\bigg ]\bigg |_{z=0}. \end{aligned}$$
(17)

On using (17) in (14), we get

$$\begin{aligned} E(X)=\int _{0}^{\infty }\,f(t)\,\frac{t\,D^{1}(h(t))}{h(t)g(t)D^{1}\left( \frac{t}{g(t)}\right) }dt=\int _{0}^{\infty }\frac{f(t)D^{1}\log {(h(t))}}{D^{1}\log {\left( \frac{t}{g(t)}\right) }}dt. \end{aligned}$$

Taking the second derivative of (17), we get

$$\begin{aligned} D^{1}\bigg [\frac{t\,D^{1}(h(t))}{g(t)D^{1}\left( \frac{t}{g(t)}\right) }\bigg ]=\sum _{x=1}^{\infty }\frac{x^{2}\,\big (\frac{t}{g(t)}\big )^{x-1}}{x!}\bigg [D^{x-1}\left( g(z)\right) ^{x}h^{\prime }(z)\bigg ]\bigg |_{z=0}. \end{aligned}$$

On multiplying both sides by \(f(t)t\left[ h(t)g(t)D^{1}\left( \frac{t}{g(t)}\right) \right] ^{-1},\) we get

$$\begin{aligned} f(t)t\left[ h(t)g(t)D^{1}\left( \frac{t}{g(t)}\right) \right] ^{-1} D^{1}\bigg [\frac{t\,D^{1}(h(t))}{g(t)D^{1}\left( \frac{t}{g(t)}\right) }\bigg ]=\sum _{x=1}^{\infty }\frac{x^{2}\,f(t)\big (\frac{t}{g(t)}\big )^{x}}{h(t)x!}\bigg [D^{x-1}\left( g(z)\right) ^{x}h^{\prime }(z)\bigg ]\bigg |_{z=0}. \end{aligned}$$

Therefore,

$$\begin{aligned} E(X^{2})&=\sum _{x=0}^{\infty }x^{2}p(x)\\ &=\sum _{x=0}^{\infty }x^{2}\int _{0}^{\infty }\frac{f(t)(\frac{t}{g(t)})^{x}}{h(t)x!}\bigg [D^{x-1}\left\{ (g(z))^{x}h'(z)\right\} \bigg ]\bigg |_{z=0}dt\\ &=\int _{0}^{\infty }\sum _{x=0}^{\infty }\frac{x^{2}f(t)\left( \frac{t}{g(t)}\right) ^{x}}{h(t)x!}\bigg [D^{x-1}\left\{ (g(z))^{x}h'(z)\right\} \bigg ]\bigg |_{z=0}dt\\ &=\int _{0}^{\infty }\frac{f(t)t}{h(t)g(t)D\left( \frac{t}{g(t)}\right) }\,D\bigg [\frac{tD\,h(t)}{D\left( \frac{t}{g(t)}\right) g(t)}\bigg ]dt\\ &=\int _{0}^{\infty }\frac{f(t)}{h(t)D\log \left( \frac{t}{g(t)}\right) }D\bigg [\frac{h(t)\,D\log h(t)}{D\log \left( \frac{t}{g(t)}\right) }\bigg ]dt. \end{aligned}$$

In a similar manner, the \(r\)th order non-central moment of X is given by

$$\begin{aligned} {} E(X^{r})=\int _{0}^{\infty }f(t)W_{r}(t)dt=E\left( W_{r}(T)\right) , \end{aligned}$$
(18)

where \(W_{1}(t)=L(t)\,D\log {h(t)},\) \(W_{2}(t)=L(t)D\left\{ W_{1}(t)\right\} ,\)   \(\ldots ,W_{r}(t)=L(t)D\left( W_{r-1}(t)\right) ,\) where

$$\begin{aligned} L(t)=\left[ D\log \left( \frac{t}{g(t)}\right) \right] ^{-1}. \end{aligned}$$

It is important to observe that the integrals involved are of incomplete gamma type; consequently, the mean and variance of the GPXLD do not exist, as in the case of the quasi-negative binomial distribution, see Li et al. (2011).

4.4 Mean and Variance

Using (18), the mean (\(\mu _{x}\)) of the GPXLD is derived as follows:

$$\begin{aligned} \mu _{x}={\text{E}(\text{X})}&=\int _{0}^{\infty }\frac{f(t)D^{1} \log {(h(t))}}{D^{1}\log {\left( \frac{t}{g(t)}\right) }}dt\\ &=\frac{\lambda \theta ^{2}}{\left( 1+\theta \right) ^{2}} \int _{0}^{\infty }\left( 2+\theta +t\right) t\left( 1-\rho {t}\right) ^{-1}e^{-\theta {t}}dt\\ &=\frac{\lambda \theta ^{2}}{\left( 1+\theta \right) ^{2}}\biggl \{\left( 2+\theta \right) \int _{0}^{\infty }t(1-\rho {t})^{-1}e^{-\theta {t}}dt+\int _{0}^{\infty }t^{2}(1- \rho {t})^{-1}e^{-\theta {t}}dt\biggr \}. \end{aligned}$$

Analogously, using (18), the variance \((\sigma ^{2}_{x})\) can be derived as follows:

$$\begin{aligned} \sigma ^{2}_{x}&=\int _{0}^{\infty }\frac{f(t)}{h(t)D\log \left( \frac{t}{g(t)}\right) }D\bigg [\frac{h(t)\,D\log h(t)}{D\log \left( \frac{t}{g(t)}\right) }\bigg ]dt-\mu _{x}^{2}\\ &=\frac{\lambda \theta ^{2}}{(1+\theta )^{2}}\int _{0}^{\infty }(2+\theta +t)\,e^{-\theta {t}}\biggl \{t\left( 1-\rho {t}\right) ^{-3}+\lambda {t^{2}}\left( 1-\rho {t}\right) ^{-2}\biggr \}dt-\mu _{x}^{2}. \end{aligned}$$


5 Estimation

Here, we employ the method of maximum likelihood (ML) to estimate the GPXLD’s unknown parameters.

Let \(X_1,X_2,\ldots ,X_n\) be n independently and identically distributed (iid) rvs from the GPXLD\((\lambda ,\rho ,\theta )\) (consequently, with the pmf from (6)), and let \(x_1,x_2,\ldots ,x_n\) be the corresponding n observations. The likelihood function is then given by

$$\begin{aligned} L = \frac{\lambda ^{n}\theta ^{2n}\prod _{i=1}^{n}\left( \lambda +\rho {x_{i}}\right) ^{x_{i}-1}\prod _{i=1}^{n}\biggl \{\left( 2+\theta \right) \left( \theta +\lambda +\rho {x_{i}}\right) +x_{i}+1\biggr \}}{\left( 1+\theta \right) ^{2n}\prod _{i=1}^{n}\left( \theta +\lambda +\rho {x_{i}}\right) ^{x_{i}+2}}. \end{aligned}$$

The log-likelihood function is given by

$$\begin{aligned} {\mathcal {L}}_{n}&=n\log \lambda +2n\log \theta +\sum _{i=1}^{n}\left( x_{i}-1\right) \log (\lambda +\rho {x_{i}})-2n\log (1+\theta )\\ &\quad +\sum _{i=1}^{n}\log \biggl \{\left( 2+\theta \right) \left( \theta +\lambda +\rho {x_{i}}\right) +x_{i}+1\biggr \}-\sum _{i=1}^{n}\left( x_{i}+2\right) \log \left( \theta +\lambda +\rho {x_{i}}\right) . \end{aligned}$$
(19)

The ML estimate (MLE) of the parameter vector \({\Theta }=(\lambda , \rho ,\theta ),\) say \({\hat{\Theta }}=({\hat{\lambda }}, {\hat{\rho }},{\hat{\theta }}),\) is obtained by solving the likelihood equations \(\frac{\partial {{\mathcal {L}}_{n}}}{\partial \lambda }=0,\) \(\frac{\partial {{\mathcal {L}}_{n}}}{\partial \rho }=0,\) and \(\frac{\partial {{\mathcal {L}}_{n}}}{\partial \theta }=0.\) With these notations, \({\hat{\lambda }},\) \({\hat{\rho }}\) and \({\hat{\theta }}\) are also called the MLEs of \(\lambda ,\) \(\rho\) and \(\theta ,\) respectively. The likelihood equations are

$$\begin{aligned}&\frac{\partial {{\mathcal {L}}_{n}}}{\partial \lambda }= \frac{n}{\lambda }+\sum _{i=1}^{n}\frac{\left( x_{i}-1\right) }{\left( \lambda +\rho {x_{i}}\right) }+\sum _{i=1}^{n} \frac{\left( 2+\theta \right) }{\bigl \{(2+\theta ) (\theta +\lambda +\rho {x_{i}})+x_{i}+1\bigr \}}- \sum _{i=1}^{n}\frac{\left( x_{i}+2\right) }{\left( \theta +\lambda +\rho {x_{i}}\right) }=0\\ &\frac{\partial {{\mathcal {L}}_{n}}}{\partial \rho }=\sum _{i=1}^{n}\frac{x_{i} (x_{i}-1)}{\left( \lambda +\rho {x_{i}}\right) }+\sum _{i=1}^{n}\frac{\left( 2+\theta \right) x_{i}}{\biggl \{(2+\theta )\left( \theta +\lambda +\rho {x_{i}}\right) +x_{i}+1\biggr \}} -\sum _{i=1}^{n}\frac{x_{i}(x_{i}+2)}{(\theta +\lambda +\rho {x_{i}})}=0 \end{aligned}$$

and

$$\begin{aligned} \frac{\partial {{\mathcal {L}}_{n}}}{\partial \theta }=\frac{2n}{\theta }+\sum _{i=1}^{n}\frac{2(1+\theta )+\lambda +\rho {x_{i}}}{\biggl \{(2+\theta )(\theta +\lambda +\rho {x_{i}})+x_{i}+1\biggr \}} -\frac{2n}{\left( 1+\theta \right) }-\sum _{i=1}^{n}\frac{(x_{i}+2)}{\theta +\lambda +\rho {x_{i}}}=0. \end{aligned}$$

The likelihood equations do not admit closed-form solutions. Nevertheless, the MLEs can be determined numerically by maximizing the log-likelihood function given in (19), for instance with the L-BFGS-B technique implemented in the R programming language.
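As an illustration of the same idea, the numerical maximization of (19) can be sketched in Python with scipy.optimize; the count data below are invented purely for the example, and the starting values and bounds are our own choices:

```python
import math
from scipy.optimize import minimize

def gpxld_negloglik(params, xs):
    """Negative of the log-likelihood (19)."""
    lam, rho, theta = params
    ll = 0.0
    for x in xs:
        a = theta + lam + rho * x
        ll += (math.log(lam) + 2 * math.log(theta)
               + (x - 1) * math.log(lam + rho * x)
               - 2 * math.log(1 + theta)
               + math.log((2 + theta) * a + x + 1)
               - (x + 2) * math.log(a))
    return -ll

# toy counts, invented purely for illustration
xs = [0, 0, 1, 0, 2, 1, 0, 3, 1, 0, 0, 2, 5, 1, 0, 0, 1, 2, 0, 4]
res = minimize(gpxld_negloglik, x0=[0.5, 0.2, 0.5], args=(xs,),
               method="L-BFGS-B",
               bounds=[(1e-6, None), (1e-6, 1 - 1e-6), (1e-6, None)])
assert res.success
lam_hat, rho_hat, theta_hat = res.x  # the MLEs under this toy dataset
```

The box constraints keep \(\lambda >0,\) \(0<\rho <1\) and \(\theta >0,\) so every logarithm in the objective stays well defined throughout the optimization.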

6 Simulation Study

In this section, we carry out a brief simulation exercise to assess how well the ML estimates perform in random samples. Here, we simulate a GPXLD random sample using the inverse transformation method (see Ross 2013). The inverse transform algorithm used to generate a GPXLD rv is as follows:

  • Step 1: Generate a random number U from the uniform U(0, 1) distribution.

  • Step 2: \(i=0,\) \(p=\frac{\theta ^{2}\left[ (2+\theta )(\theta +\lambda )+1\right] }{(1+\theta )^{2}(\theta +\lambda )^{2}},\) \(F=p.\)

  • Step 3: If \(U<F,\) set \(X=i,\) and stop.

  • Step 4: \(p=p\times \frac{\left( \lambda +\rho (i+1)\right) ^{i}(\theta +\lambda +\rho {i})^{i+2}}{\left( \theta +\lambda +\rho (i+1)\right) ^{i+3}(\lambda +\rho {i})^{i-1}}\times \frac{\left( (2+\theta )(\theta +\rho (i+1)+\lambda )+i+2\right) }{\left( (2+\theta )(\theta +\rho {i}+\lambda )+i+1\right) },\) \(F=F+p,\) \(i=i+1.\)

  • Step 5: Go to Step 3.

where p is the probability that \(X=i,\) and F is the probability that X is less than or equal to i.
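The steps above can be sketched as follows in Python (our illustration; the seed and parameter values are arbitrary):

```python
import random

def gpxld_rvs(lam, rho, theta, rng):
    """One GPXLD variate via the inverse-transform Steps 1-5 above."""
    u = rng.random()                                    # Step 1
    i = 0
    p = (theta**2 * ((2 + theta) * (theta + lam) + 1)
         / ((1 + theta)**2 * (theta + lam)**2))         # Step 2: p = P(X = 0)
    F = p
    while u >= F:                                       # Step 3 (stop when U < F)
        # Step 4: multiply p by the ratio p(i + 1) / p(i)
        p *= ((lam + rho * (i + 1))**i * (theta + lam + rho * i)**(i + 2)
              / ((theta + lam + rho * (i + 1))**(i + 3) * (lam + rho * i)**(i - 1))
              * ((2 + theta) * (theta + rho * (i + 1) + lam) + i + 2)
              / ((2 + theta) * (theta + rho * i + lam) + i + 1))
        F += p
        i += 1                                          # Step 5: back to Step 3
    return i

rng = random.Random(2023)  # fixed seed, chosen arbitrarily
sample = [gpxld_rvs(1.0, 0.1, 2.0, rng) for _ in range(1000)]
# the empirical proportion of zeros should be near p(0) = 52/81
assert abs(sample.count(0) / 1000 - 52 / 81) < 0.07
```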

The iteration process is repeated for \(N=1000\) times. The specification of the parameter values is as follows:

  1. (i)

    \(\lambda =0.98, \rho =0.51\) and \(\theta =0.01.\)

  2. (ii)

    \(\lambda =0.70,\rho =0.16,\theta =0.28.\)

  3. (iii)

    \(\lambda =0.14,\rho =0.24,\theta =0.75.\)

Thus, we computed the average mean square error (MSE) and the average absolute bias of the MLEs.

The average absolute bias of the simulated estimates equals \(\frac{1}{1000}\sum _{i=1}^{1000}|{\hat{d}}_{i}-d|\) and the average MSE of the simulated estimates equals \(\frac{1}{1000}\sum _{i=1}^{1000}({\hat{d}}_{i}-d)^2,\) in which i indexes the iterations, \(d\in \left\{ \lambda ,\rho ,\theta \right\}\) and \({\hat{d}}_{i}\) is the estimate of d in the ith iteration.

Table 1 provides a summary of the study for the samples of sizes 50, 125, 500, and 1000. As the sample size increases, the MSE decreases for all the parameter sets, and the MLEs move closer to the true parameter values, indicating the consistency property of the MLEs.

Table 1 Simulation results for three parameters \(\lambda ,\rho\) and \(\theta\)

7 Zero-Inflated GPXLD

Long or heavy tail properties and an excessive amount of zeros are frequent characteristics of overdispersed count data. The negative binomial distribution (NBD) and the GPD are often used to fit data with long or heavy tails. These distributions, however, might not be able to accurately fit the proportion of zeros in the case of an excessive number of zeros. As a result of clustering, the situation with excessive zeros frequently occurs (see Johnson et al. 2005). In this section, we present the definition and some important properties of the zero-inflated version of the newly proposed GPXLD, known as the zero-inflated generalized Poisson–XLindley distribution (ZIGPXLD).

Definition 7.1

Let \(\psi\) be a rv degenerate at the point zero and let X follow the GPXLD\((\lambda ,\rho ,\theta ).\) Assume that \(\psi\) and X are statistically independent. Then a discrete rv Y is said to follow the zero-inflated GPXLD, or in short the ZIGPXLD, if its pmf has the following form:

$$\begin{aligned} f(y)&=\omega \,P(\psi =y)+(1-\omega )\,P(X=y)\\ &=\left\{ \begin{array}{ll} \omega +(1-\omega )\bigg (\frac{\theta }{(1+\theta )(\theta +\lambda )}\bigg )^{2}\biggl \{(2+\theta )(\theta +\lambda )+1\biggr \} ,&{}\quad y=0\\ (1-\omega )\,\frac{\lambda \left( \lambda +\rho {y}\right) ^{y-1}\theta ^{2}}{\left( 1+\theta \right) ^{2}\left( \theta +\lambda +\rho {y}\right) ^{y+2}}\biggl \{\left( 2+\theta \right) \left( \theta +\lambda +\rho {y}\right) +y+1\biggr \},&{}\quad y=1,2,3\ldots \end{array}\right. \end{aligned}$$
(20)

in which \(\omega \in {[0,1]},\) \(\lambda >0,\) \(0<\rho <1\) and \(\theta >0.\)
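A quick numerical check of pmf (20) in Python (our own illustration): the zero cell receives the extra mass \(\omega ,\) the positive cells are scaled by \(1-\omega ,\) and \(\omega =0\) recovers pmf (6).

```python
def gpxld_pmf(x, lam, rho, theta):
    """pmf (6) of the GPXLD."""
    a = theta + lam + rho * x
    return (lam * (lam + rho * x)**(x - 1) * theta**2
            * ((2 + theta) * a + x + 1) / ((1 + theta)**2 * a**(x + 2)))

def zigpxld_pmf(y, omega, lam, rho, theta):
    """pmf (20) of the ZIGPXLD."""
    if y == 0:
        return omega + (1 - omega) * gpxld_pmf(0, lam, rho, theta)
    return (1 - omega) * gpxld_pmf(y, lam, rho, theta)

# omega = 0 recovers the GPXLD ...
for y in range(8):
    assert zigpxld_pmf(y, 0.0, 1.0, 0.1, 2.0) == gpxld_pmf(y, 1.0, 0.1, 2.0)
# ... and inflation moves exactly omega extra mass to zero (p(0) = 52/81 here)
assert abs(zigpxld_pmf(0, 0.3, 1.0, 0.1, 2.0) - (0.3 + 0.7 * 52 / 81)) < 1e-9
```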

Clearly, when \(\omega =0,\) the ZIGPXLD reduces to the GPXLD\((\lambda ,\rho ,\theta )\) with pmf given in (6). Next, we present certain properties of the ZIGPXLD through the following results.

By definition, the pgf of the ZIGPXLD with pmf given in (20) is

$$\begin{aligned} \Psi (t)&=\sum _{y=0}^{\infty }t^{y}\,f(y)\\ &=\omega +(1-\omega )\bigg (\frac{\theta }{(1+\theta )(\theta +\lambda )}\bigg )^{2}\biggl \{(2+\theta )(\theta +\lambda )+1\biggr \}\\ {}&+(1-\omega )\sum _{y=1}^{\infty }t^{y}\,\frac{\lambda \left( \lambda +\rho {y}\right) ^{y-1}\theta ^{2}}{\left( 1+\theta \right) ^{2}\left( \theta +\lambda +\rho {y}\right) ^{y+2}}\biggl \{\left( 2+\theta \right) \left( \theta +\lambda +\rho {y}\right) +y+1\biggr \}. \end{aligned}$$

The corresponding mean and variance of the ZIGPXLD are as follows:

$$\begin{aligned} Mean=(1-\omega )\sum _{y=1}^{\infty }y\,\frac{\lambda \left( \lambda +\rho {y}\right) ^{y-1}\theta ^{2}}{\left( 1+\theta \right) ^{2}\left( \theta +\lambda +\rho {y}\right) ^{y+2}}\biggl \{\left( 2+\theta \right) \left( \theta +\lambda +\rho {y}\right) +y+1\biggr \} \end{aligned}$$

and

$$\begin{aligned} Variance=\, & {} (1-\omega )\sum _{y=1}^{\infty }y^{2}\frac{\lambda \left( \lambda +\rho {y}\right) ^{y-1}\theta ^{2}}{\left( 1+\theta \right) ^{2}\left( \theta +\lambda +\rho {y}\right) ^{y+2}}\biggl \{\left( 2+\theta \right) \left( \theta +\lambda +\rho {y}\right) +y+1\biggr \}\\ &-\bigg (Mean\bigg )^{2}. \end{aligned}$$

The likelihood function of the ZIGPXLD based on n observations, say \((x_{1},x_{2},\ldots ,x_{n}),\) is:

$$\begin{aligned} L(\omega ,\lambda ,\rho ,\theta )&=\prod _{i=1}^{n}\bigg [\omega +(1-\omega )\bigg (\frac{\theta }{(1+\theta )(\theta +\lambda )}\bigg )^{2}\biggl \{(2+\theta )(\theta +\lambda )+1\biggr \}\bigg ]^{I(x_{i}=0)}\\ &\quad \times \bigg [(1-\omega )\,\frac{\lambda \left( \lambda +\rho {x_{i}}\right) ^{x_{i}-1}\theta ^{2}}{\left( 1+\theta \right) ^{2}\left( \theta +\lambda +\rho {x_{i}}\right) ^{x_{i}+2}}\biggl \{\left( 2+\theta \right) \left( \theta +\lambda +\rho {x_{i}}\right) +x_{i}+1\biggr \}\bigg ]^{I(x_{i}>0)}. \end{aligned}$$
(21)

The log-likelihood function of the equation given in (21) can be expressed as follows:

$$\begin{aligned} {\mathcal {L}}&=\sum _{i:x_{i}=0}\log \bigg [\omega +(1-\omega )\bigg (\frac{\theta }{(1+\theta )(\theta +\lambda )}\bigg )^{2}\biggl \{(2+\theta )(\theta +\lambda )+1\biggr \}\bigg ]\\ &\quad +\sum _{i:x_{i}>0}\log \bigg [(1-\omega )\,\frac{\lambda \left( \lambda +\rho {x_{i}}\right) ^{x_{i}-1}\theta ^{2}}{\left( 1+\theta \right) ^{2}\left( \theta +\lambda +\rho {x_{i}}\right) ^{x_{i}+2}}\biggl \{\left( 2+\theta \right) \left( \theta +\lambda +\rho {x_{i}}\right) +x_{i}+1\biggr \}\bigg ]. \end{aligned}$$
(22)

The estimates of the parameters in the non-linear equation given in (22) can be obtained by numerical optimization, for example using the “optim” or “nlm” functions in the R software; see R Core Team (2021).
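For illustration, the negative log-likelihood corresponding to (22) can be sketched in Python (the function below is ours, written so that it can be handed to a general-purpose minimizer such as scipy.optimize.minimize, in the same way optim or nlm would be used in R):

```python
import math

def gpxld_logpmf(y, lam, rho, theta):
    # Log-pmf of the GPXLD component, computed in log space.
    a, b = lam + rho * y, theta + lam + rho * y
    return (math.log(lam) + (y - 1) * math.log(a)
            + 2 * math.log(theta) - 2 * math.log1p(theta)
            - (y + 2) * math.log(b)
            + math.log((2 + theta) * b + y + 1))

def zigpxld_negloglik(par, x):
    # Negative log-likelihood of the ZIGPXLD for observations x:
    # zeros use the inflated mass, positive counts the (1 - omega) part.
    omega, lam, rho, theta = par
    log_p0 = math.log(omega + (1 - omega) * math.exp(gpxld_logpmf(0, lam, rho, theta)))
    ll = 0.0
    for xi in x:
        if xi == 0:
            ll += log_p0
        else:
            ll += math.log1p(-omega) + gpxld_logpmf(xi, lam, rho, theta)
    return -ll
```

The minimization would be carried out over \((\omega ,\lambda ,\rho ,\theta )\) subject to \(\omega \in [0,1],\) \(\lambda ,\theta >0\) and \(0<\rho <1,\) e.g. with box constraints.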

8 Applications to Real-Life Data

The goal of this section is to demonstrate empirically the usefulness of the GPXLD and the ZIGPXLD.

8.1 Presentation

To show the usage of the proposed models, we utilize two real-life data applications in this paper: the first is the number of suicides data set given in Kadhum and Abdulah (2021), which is used to compare the data-modeling ability of the GPXLD with that of some competitive distributions, and the second is the COVID-19 pandemic data set given in El-morshedy et al. (2021), which is used to compare the data-modeling ability of the ZIGPXLD with that of the ZIPD.

We consider the negative log-likelihood \((-\log \text{L}),\) the \(\chi ^{2}\) statistic, and information criteria such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the corrected Akaike information criterion (AICc). The better-fitting distribution corresponds to the smaller \(\chi ^{2},\) AIC, BIC and AICc values.

\(\text{AIC}=2k-2\,\log \text{L},\) \(\text{BIC}=k\,\log \,\text{n}-2\,\log \text{L}\) and \({\mathrm{AICc}}= \text{AIC}+\frac{2k(k+1)}{n-k-1},\)

where k is the number of parameters in the statistical model, n is the sample size and \(\log \text{L}\) is the maximized value of the log-likelihood function under the considered model.
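These criteria are simple functions of the maximized log-likelihood; a minimal Python helper (ours) makes the definitions concrete:

```python
import math

def info_criteria(loglik, k, n):
    # AIC, BIC and AICc from the maximized log-likelihood loglik,
    # the number of parameters k, and the sample size n.
    aic = 2 * k - 2 * loglik
    bic = k * math.log(n) - 2 * loglik
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    return aic, bic, aicc
```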

Also, a graphical technique based on the total time on test (TTT) transform is used to identify the shape of the hrf of the data sets. If the empirical TTT plot is convex, concave, convex then concave, or concave then convex, the associated hrf is decreasing, increasing, bathtub-shaped, or upside-down bathtub-shaped, respectively (see Aarset 1987). We use the RStudio software for the numerical evaluation of these data sets.
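The empirical scaled TTT transform plotted in such graphs is \(G(r/n)=\big[\sum _{i=1}^{r}x_{(i)}+(n-r)x_{(r)}\big]\big/\sum _{i=1}^{n}x_{(i)},\) where \(x_{(i)}\) are the order statistics; a Python sketch (function name ours):

```python
def ttt_points(data):
    # Scaled total time on test transform: returns the points (r/n, G(r/n)),
    # whose convex/concave shape indicates the form of the hrf.
    x = sorted(data)
    n, total = len(x), sum(x)
    pts = []
    for r in range(1, n + 1):
        g = (sum(x[:r]) + (n - r) * x[r - 1]) / total
        pts.append((r / n, g))
    return pts
```

Plotting these points against the diagonal reproduces the TTT plots shown in the figures.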

8.2 Number of Suicides Data Set

The first real data set is the number of suicides in the city of Baghdad during the 2017–2020 period; such data are rare and random events (see Kadhum and Abdulah 2021). Table 2 shows the descriptive measures of these data, which include the sample size n, minimum (min), first quartile \((Q_{1}),\) median (Md), third quartile \((Q_{3}),\) maximum (max), and interquartile range (IQR). The empirical index of dispersion (ID) of the data equals 0.9958, so the model employed to describe this data set should be capable of dealing with underdispersion. To demonstrate the GPXLD's potential benefit, the following distributions are considered for comparison.
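The empirical ID is simply the sample variance divided by the sample mean; values below one indicate underdispersion and values above one overdispersion. A short Python version (function name ours):

```python
def index_of_dispersion(data):
    # Sample variance (n - 1 denominator) divided by the sample mean.
    n = len(data)
    mean = sum(data) / n
    var = sum((v - mean) ** 2 for v in data) / (n - 1)
    return var / mean
```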

  • The new Poisson weighted exponential distribution (NPWED) proposed by Altun (2020b), and defined by the following pmf:

    $$\begin{aligned} p_{1}(x)=\alpha (1+\theta )(1+\alpha +\alpha {\theta })^{-x-1},\quad x=0,1,2\ldots , \end{aligned}$$

    with \(\theta >0\) and \(\alpha >0.\)

  • The PXGD proposed by Bilal et al. (2020), and defined by the following pmf:

    $$\begin{aligned} p_{2}(x)=\frac{\theta ^{2}}{2(1+\theta )^{x+4}}\biggl \{2(1+\theta )^{2}+\theta (x+1)(x+2)\biggr \},\quad x=0,1,2\ldots , \end{aligned}$$

    with \(\theta >0.\)

  • The NBD as given in Consul and Famoye (2006), and defined by the following pmf:

    $$\begin{aligned} p_{3}(x)=\frac{\lambda }{\lambda +x}\left( {\begin{array}{c}\lambda +x\\ x\end{array}}\right) \,\rho ^{x}(1-\rho )^{\lambda },\quad x=0,1,2\ldots , \end{aligned}$$

    with \(\lambda >0\) and \(0<\rho <1.\)

  • The discrete Lindley distribution (DLD) given in Bilal et al. (2020), and defined by the following pmf:

    $$\begin{aligned} p_{4}(x)=\frac{\rho ^{x}}{1+\lambda }\,\biggl \{\lambda (1-2\rho )+(1-\rho )(1+\lambda {x})\biggr \},\quad x=0,1,2\ldots , \end{aligned}$$

    with \(\lambda >0\) and \(0<\rho <1.\)

  • The PGLD proposed by Altun (2021), and defined by the following pmf:

    $$\begin{aligned} p_{5}(x)=\frac{1}{(\theta +1)^{x+2}}\biggl \{\theta ^{2}+\frac{\theta ^{\alpha }(\theta +1)^{1-\alpha }\Gamma (x+\alpha )}{\Gamma (\alpha )\Gamma (x+1)}\biggr \},\quad x=0,1,2\ldots , \end{aligned}$$

    with \(\theta >0\) and \(\alpha >0.\)
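Each competing pmf can be cross-checked numerically by verifying that it sums to one over the support. A Python sketch of four of them (function names ours; log-gamma is used where factorial-type terms appear, and the DLD is omitted here):

```python
import math

def npwed_pmf(x, alpha, theta):
    # NPWED pmf (geometric form in x).
    return alpha * (1 + theta) * (1 + alpha + alpha * theta) ** (-x - 1)

def pxgd_pmf(x, theta):
    # Poisson-Xgamma pmf.
    return (theta ** 2 / (2 * (1 + theta) ** (x + 4))
            * (2 * (1 + theta) ** 2 + theta * (x + 1) * (x + 2)))

def nbd_pmf(x, lam, rho):
    # NBD pmf; lam/(lam+x) * C(lam+x, x) equals Gamma(lam+x)/(Gamma(x+1)Gamma(lam)).
    logc = math.lgamma(lam + x) - math.lgamma(x + 1) - math.lgamma(lam)
    return math.exp(logc + x * math.log(rho) + lam * math.log1p(-rho))

def pgld_pmf(x, theta, alpha):
    # Poisson-generalized Lindley pmf, via log-gamma for stability.
    mix = math.exp(alpha * math.log(theta) + (1 - alpha) * math.log(theta + 1)
                   + math.lgamma(x + alpha) - math.lgamma(alpha) - math.lgamma(x + 1))
    return (theta ** 2 + mix) / (theta + 1) ** (x + 2)
```

Truncating each sum at a few hundred terms suffices for moderate parameter values, since every pmf above has a geometrically decaying tail.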

In addition, Fig. 4 shows an empirical TTT plot of the data and it reveals an increasing hrf.

Fig. 4 Total time on test (TTT) plot for the suicides data set

According to Table 3, the GPXLD’s \(\chi ^{2},\) AIC, BIC and AICc values are lower than those of the other distributions under consideration. Therefore, the proposed model is the best choice for modeling the provided data set.

Table 2 Descriptive statistics for the number of suicides data set
Table 3 MLEs, AIC, BIC and AICc values for the suicides data set
Fig. 5 Total time on test (TTT) plot for the COVID-19 pandemic data set

Table 4 Descriptive statistics for the COVID-19 pandemic data set

8.3 COVID-19 Pandemic Data Set

Second, we make use of the data set of daily new COVID-19-related deaths in Armenia. The data are available at https://www.worldometers.info/coronavirus/country/armenia/ (accessed on 10 September 2020) and were also studied by El-morshedy et al. (2021); they cover the period between 15 February 2020 and 4 October 2020. This data set exhibits an overdispersion problem, with an ID of 4.4822, so the model employed to describe it should be capable of dealing with overdispersion. Table 4 shows the descriptive measures of these data, which include n, min, \(Q_{1},\) Md, \(Q_{3},\) max, and IQR. As will be seen in Table 5, the best fit is provided by the ZIGPXLD, followed by the ZIPD.

In addition, Fig. 5 shows an empirical TTT plot of the data and it reveals a decreasing hrf.

According to Table 5, the ZIGPXLD’s AIC, BIC and AICc values are lower than those of the other distributions under consideration. Therefore, the proposed zero-inflated model is the best choice for modeling the provided data set.

Table 5 MLEs, AIC, BIC and AICc values for the COVID-19 datasets

9 Conclusion

In this work, a mixed count model, known as the GPXLD, is proposed. We show that the Poisson mixture of the XLD arises as a special case of it. In particular, we derive some mathematical properties of the GPXLD, and the estimation of the parameters is implemented by the maximum likelihood method. We also propose a zero-inflated version of the GPXLD, known as the ZIGPXLD. The two proposed distributions are applied to two real data sets and compared with some important competitive distributions. The comparison of the minus log-likelihood, \(\chi ^{2},\) AIC, BIC and AICc values shows that the GPXLD and the ZIGPXLD provide the best fits. In conclusion, the GPXLD is a flexible model that offers an alternative way to model count data, and the ZIGPXLD an alternative for count data with too many zeros. Future research may consider an INAR(1) process based on the GPXLD and a bivariate version of the GPXLD; these extensions require considerable examination, which we leave to further work.