Abstract
In this study, a one-parameter discrete probability distribution, named the “Poisson Moment Exponential distribution”, is proposed and studied. Mathematical properties of the proposed distribution are derived and discussed. Seven estimation methods are used for the parameter: maximum likelihood, maximum product spacing, Anderson-Darling, Cramér-von Mises, least-squares, weighted least-squares, and right-tailed Anderson-Darling. The behavior of these estimators is assessed using a Monte Carlo simulation study. Four real datasets from different fields (failure times, slow-pace students’ marks, epileptic seizure counts, and European corn borer counts) are used to show the flexibility of the proposed distribution. The results show that the proposed discrete distribution fits these datasets well.
1 Introduction
For modeling count observations, several one-parameter discrete distributions have been proposed by combining the Poisson distribution with one-parameter lifetime distributions. Compounding a discrete with a continuous probability distribution is a useful technique for developing flexible distributions to aid in the analysis of count data. Count models are important in a variety of applied and theoretical fields, including health, transportation, insurance, and engineering. Data science approaches have been used to describe pandemonium behavior, crop harvesting, corporate data mining, e-commerce fraud, and other challenges [1,2,3,4]. Existing discrete probability distributions include the discrete Weibull [5], discrete Lindley [6], discrete Burr-Hatke [7], discrete Rayleigh [8], Poisson Ailamujia [9], Poisson xgamma [10], exponentiated discrete Lindley [11], discrete inverted Topp-Leone [12], discrete Ramos-Louzada [13], discrete type-II half-logistic exponential [14], and discrete power-Ailamujia [15] distributions.
The moment exponential (ME) distribution was proposed by [16] by weighting the exponential distribution in accordance with Fisher's (1934) theory. Its probability density function is given by
\[ f\left( {x;\beta } \right) = \frac{x}{{\beta^{2} }}e^{ - x/\beta } ,\quad x > 0,\;\beta > 0, \]
where β is the scale parameter. The ME model has attracted considerable attention due to its flexibility, so various authors have studied it and generalized it further for more complex datasets. Examples include the generalized exponentiated moment exponential [17], Marshall-Olkin length-biased exponential [18], Kumaraswamy moment exponential [19], and Weibull-moment exponential [20] distributions, and the references therein.
In this study, a new discrete probability distribution is proposed by combining the Poisson and moment exponential distributions, as there is a need for a more flexible distribution for statistical data analysis. The proposed distribution is over-dispersed, so it is better suited to analyzing over-dispersed count data sets. After a slight reparameterization, the PMEx distribution is similar to the Poisson Ailamujia distribution proposed by [9]. We further investigate new results covering the mode, the moment generating function, and associated measures. We calculate its reliability characteristics, including the survival function, hazard function (failure rate), reversed hazard function, second failure rate, and mean residual life function. The actuarial measures are also derived. Seven different estimation methods are used to estimate the model parameter. Finally, four different datasets are used to show the flexibility of the PMEx distribution.
The rest of the study is organized as follows. The derivation of the Poisson-Moment Exponential distribution (PMExD) and the shape of its probability mass function (pmf) are presented in Sect. 2. Section 3 deals with the derivation of statistical properties. The estimation of the PMExD parameter using the methods of maximum likelihood, maximum product spacing, Anderson-Darling, right-tailed Anderson-Darling, Cramér-von Mises, least-squares, and weighted least-squares is discussed in Sect. 4. A comprehensive simulation study is also discussed in that section. In Sect. 5, the suitability of the PMExD is compared with that of some competitive models. Finally, Sect. 6 gives the concluding remarks of the study.
2 PMEx Distribution
Definition 1
If \(Y\left| \lambda \right.\sim P\left( \lambda \right),\) where \(\lambda\) is itself a random variable that follows a moment exponential distribution with parameter \(\beta\), then the distribution that arises from marginalizing over \(\lambda\) is known as a compound of the Poisson and the moment exponential distributions. The resultant distribution is denoted \(PMExD\left( {Y;\beta } \right)\). It should be emphasized that because the parent (Poisson) distribution is discrete, the proposed distribution is also discrete.
Theorem 1
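The probability mass function (pmf) of the Poisson Moment Exponential distribution, i.e. \(PMExD\left( {x;\beta } \right)\), is given by
\[ p\left( {x;\beta } \right) = \frac{{\left( {x + 1} \right)\beta^{x} }}{{\left( {1 + \beta } \right)^{x + 2} }},\quad x = 0,1,2, \ldots ,\;\beta > 0. \]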
Proof:
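The pmf of a PMExD can be obtained using Definition 1 as follows:
\[ p\left( {x;\beta } \right) = \int_{0}^{\infty } {P\left( {X = x\left| \lambda \right.} \right)f\left( {\lambda ;\beta } \right)d\lambda } ,\quad P\left( {X = x\left| \lambda \right.} \right) = \frac{{e^{ - \lambda } \lambda^{x} }}{x!}. \]
The parameter λ follows the moment exponential distribution with the probability density function (pdf)
\[ f\left( {\lambda ;\beta } \right) = \frac{\lambda }{{\beta^{2} }}e^{ - \lambda /\beta } ,\quad \lambda > 0. \]
We have
\[ p\left( {x;\beta } \right) = \frac{1}{{x!\,\beta^{2} }}\int_{0}^{\infty } {\lambda^{x + 1} e^{ - \lambda \left( {1 + \frac{1}{\beta }} \right)} d\lambda } = \frac{{\Gamma \left( {x + 2} \right)}}{{x!\,\beta^{2} }}\left( {\frac{\beta }{1 + \beta }} \right)^{x + 2} = \frac{{\left( {x + 1} \right)\beta^{x} }}{{\left( {1 + \beta } \right)^{x + 2} }}, \]
which is the pmf of PMExD. □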
Note that for \(\alpha = \frac{1}{\beta },\) we obtain the Poisson Ailamujia distribution.
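As a numerical sanity check on Theorem 1, the sketch below compares the closed form \(p(x;\beta) = (x+1)\beta^{x}/(1+\beta)^{x+2}\), which follows from Definition 1 with the ME density \(f(\lambda;\beta) = (\lambda/\beta^{2})e^{-\lambda/\beta}\), against direct numerical integration of the Poisson-ME mixture. The function names, the Simpson grid, and the truncation point of the integral are illustrative assumptions and are not taken from the study.

```python
import math


def pmex_pmf(x, beta):
    """Closed-form PMEx pmf, written with q = beta / (1 + beta) to avoid overflow."""
    q = beta / (1.0 + beta)
    return (x + 1) * q**x * (1.0 - q) ** 2


def mixture_pmf(x, beta, steps=20000):
    """Poisson(lam) pmf integrated against the ME density using Simpson's rule."""
    upper = 40.0 + 5.0 * x  # assumed truncation point; the integrand is negligible beyond it
    h = upper / steps       # steps must be even for Simpson's rule

    def integrand(lam):
        if lam == 0.0:
            return 0.0
        # Work on the log scale so large x does not overflow.
        log_poisson = -lam + x * math.log(lam) - math.lgamma(x + 1)
        log_me = math.log(lam) - 2.0 * math.log(beta) - lam / beta
        return math.exp(log_poisson + log_me)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


if __name__ == "__main__":
    for x in range(4):
        print(x, pmex_pmf(x, 2.0), mixture_pmf(x, 2.0))
```

The two columns printed for each \(x\) should agree to several decimal places, and the closed-form probabilities sum to one over the support.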
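Remark 1:
Treating \(x\) as continuous, the first derivative of \(p\left( x \right)\) is
\[ p^{\prime}\left( x \right) = \frac{{\beta^{x} }}{{\left( {1 + \beta } \right)^{x + 2} }}\left[ {1 - \left( {x + 1} \right)\log \left( {1 + \frac{1}{\beta }} \right)} \right], \]
and setting \(p^{\prime}\left( x \right) = 0\) gives:
1. For \(\beta > \frac{1}{e - 1}\): \(\hat{x} = \frac{{1 - \log \left( {1 + \frac{1}{\beta }} \right)}}{{\log \left( {1 + \frac{1}{\beta }} \right)}}\) is a critical point at which \(p\left( {\hat{x};\beta } \right)\) is maximum.
2. For \(0 < \beta \le \frac{1}{e - 1}\): the pmf is a decreasing function of \(x\).
The second derivative is
\[ p^{\prime\prime}\left( x \right) = - \frac{{\beta^{x} }}{{\left( {1 + \beta } \right)^{x + 2} }}\log \left( {1 + \frac{1}{\beta }} \right)\left[ {2 - \left( {x + 1} \right)\log \left( {1 + \frac{1}{\beta }} \right)} \right], \]
which is negative at \(x = \hat{x}\) because \(\left( {\hat{x} + 1} \right)\log \left( {1 + \frac{1}{\beta }} \right) = 1\). Therefore the mode of PMExD is located at \(\hat{x}\) when \(\beta > \frac{1}{e - 1}\) and at \(x = 0\) otherwise.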
Figure 1 shows the plots of the PMEx distribution pmf for different β values. It is found that the probabilities can be decreasing or unimodal shaped.
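The mode result in Remark 1 can be explored numerically. The sketch below, under the assumption that the pmf is \(p(x;\beta) = (x+1)\beta^{x}/(1+\beta)^{x+2}\), computes the continuous critical point \(\hat{x}\) and, separately, the integer value that maximizes the pmf by direct search. Because the support is the non-negative integers, the integer maximizer is one of the two integers neighbouring \(\hat{x}\). The helper names and the search bound `x_max` are hypothetical.

```python
import math


def pmex_pmf(x, beta):
    """PMEx pmf with q = beta / (1 + beta)."""
    q = beta / (1.0 + beta)
    return (x + 1) * q**x * (1.0 - q) ** 2


def critical_point(beta):
    """Continuous critical point x_hat from Remark 1 (positive when beta > 1/(e - 1))."""
    c = math.log(1.0 + 1.0 / beta)
    return (1.0 - c) / c


def integer_mode(beta, x_max=1000):
    """First integer in {0, ..., x_max} at which the pmf is largest."""
    return max(range(x_max + 1), key=lambda x: pmex_pmf(x, beta))


if __name__ == "__main__":
    for beta in (0.4, 2.5, 7.3):
        print(beta, critical_point(beta), integer_mode(beta))
```

For \(\beta = 0.4\), which is below \(1/(e-1)\), the search returns 0, in line with the decreasing case of the remark.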
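The corresponding cumulative distribution function (cdf), obtained by summing the pmf, is
\[ F\left( {x;\beta } \right) = 1 - \frac{{\beta^{x + 1} \left( {x + 2 + \beta } \right)}}{{\left( {1 + \beta } \right)^{x + 2} }},\quad x = 0,1,2, \ldots \]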
3 Mathematical Properties
3.1 Survival (Reliability) Function
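The reliability function is defined as the probability of a system surviving beyond a certain time. The survival function of PMExD is as follows:
\[ S\left( {x;\beta } \right) = P\left( {X > x} \right) = \frac{{\beta^{x + 1} \left( {x + 2 + \beta } \right)}}{{\left( {1 + \beta } \right)^{x + 2} }}. \]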
3.2 Hazard Function (Failure Rate)
The hazard function of PMExD is given by
Proposition 1:
The hazard function of PMExD is an increasing function of x.
Proof:
Using the idea of Glaser (1980) and from the pmf of PMExD.
It follows that
As \(\rho^{\prime}\left( x \right) > 0\) the hazard function of PMExD is increasing.□
Figure 2 displays the failure rate curves of the PMEx distribution for some choices of parameter β.
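The monotonicity stated in Proposition 1 can be checked numerically with a short sketch. It assumes the pmf \(p(x;\beta) = (x+1)\beta^{x}/(1+\beta)^{x+2}\), the survival function \(P(X > x) = \beta^{x+1}(x+2+\beta)/(1+\beta)^{x+2}\) obtained by summing that pmf, and the discrete hazard convention \(h(x) = P(X = x)/P(X \ge x)\). The convention and the function names are assumptions; other definitions of the discrete failure rate exist.

```python
def pmex_pmf(x, beta):
    """PMEx pmf with q = beta / (1 + beta)."""
    q = beta / (1.0 + beta)
    return (x + 1) * q**x * (1.0 - q) ** 2


def pmex_sf(x, beta):
    """Survival function P(X > x); evaluates to 1 at x = -1."""
    q = beta / (1.0 + beta)
    return q ** (x + 1) * (x + 2 + beta) / (1.0 + beta)


def pmex_cdf(x, beta):
    """Cumulative distribution function P(X <= x)."""
    return 1.0 - pmex_sf(x, beta)


def pmex_hazard(x, beta):
    """Assumed convention: h(x) = P(X = x) / P(X >= x), with P(X >= x) = pmex_sf(x - 1)."""
    return pmex_pmf(x, beta) / pmex_sf(x - 1, beta)


if __name__ == "__main__":
    beta = 1.5
    rates = [pmex_hazard(x, beta) for x in range(10)]
    print(rates)
    print(all(a < b for a, b in zip(rates, rates[1:])))  # increasing failure rate
```

Under this convention the hazard simplifies to \((x+1)/\left[(1+\beta)(x+1+\beta)\right]\), which is increasing in \(x\) for every \(\beta > 0\).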
3.3 Reversed Hazard and Second Failure Rate
The reversed hazard rate and second failure rate of PMEx distribution are given by, respectively
and
3.4 Mean Residual Life (MRL)
For a discrete random variable, the MRL function is defined as
where \({\mathbb{N}}_{0} = \left\{ {0, \,1,\, 2, \ldots ,\,w} \right\}\) and \(0 < w < \infty\). Let \(X\) be a PMExD random variable; then the MRL is given by
3.5 Moments and Associated Measures
3.5.1 Moment Generating Function
The moment generating function of the PMEx distribution can be obtained as
The first four moments about the origin of the PMExD are
The moments about the mean can be obtained using the relation \(\mu_{r} = E\left( {Y - \mu_{1}^{^{\prime}} } \right)^{r}\). The first four moments about the mean for the PMExD are given by
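The dispersion index (DI), the ratio of the variance \(2\beta \left( {1 + \beta } \right)\) to the mean \(2\beta\), is
\[ DI = \frac{{Var\left( X \right)}}{{E\left( X \right)}} = 1 + \beta . \]
Since \(DI > 1\) for every \(\beta > 0\), the PMExD is overdispersed.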
The coefficient of variation (CV), coefficients of skewness (CS), and coefficients of kurtosis (CK) for the PMExD are given by
Some moments, variance, CV, DI in terms of β are presented in Table 1.
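The moment-based measures can be cross-checked by brute-force summation. The sketch below assumes the pmf \(p(x;\beta) = (x+1)\beta^{x}/(1+\beta)^{x+2}\), which is a negative binomial pmf with two successes and success probability \(1/(1+\beta)\), so that the mean is \(2\beta\), the variance is \(2\beta(1+\beta)\), and the dispersion index is \(1+\beta\). The truncation at `terms` and the function name are illustrative assumptions.

```python
import math


def pmex_summary(beta, terms=5000):
    """Mean, variance, CV and dispersion index by truncated summation over the support."""
    q = beta / (1.0 + beta)
    probs = [(x + 1) * q**x * (1.0 - q) ** 2 for x in range(terms)]
    mean = sum(x * p for x, p in enumerate(probs))
    second = sum(x * x * p for x, p in enumerate(probs))
    variance = second - mean**2
    return {
        "mean": mean,
        "variance": variance,
        "cv": math.sqrt(variance) / mean,
        "di": variance / mean,
    }


if __name__ == "__main__":
    for beta in (0.25, 0.5, 1.0, 2.0, 5.0):
        s = pmex_summary(beta)
        # Compare with the closed forms 2*beta, 2*beta*(1 + beta) and 1 + beta.
        print(beta, s["mean"], s["variance"], s["di"])
```

The dispersion index exceeds one for every \(\beta\) in the loop, which is the over-dispersion property noted above.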
3.6 Actuarial Measures
In this section, two risk measures, the value at risk (VaR) and the tail value at risk (TVaR), are derived for the PMExD. The VaR measure is frequently used by practitioners in actuarial science and in standard financial market risk analysis. The VaR is always supplied with a level of confidence, say p, and represents the loss in portfolio value that will be equaled or exceeded only 100(1 − p) percent of the time. Let X represent the loss random variable. The VaR of PMExD is derived as
\(P\left( {X > \tau_{p} } \right) = 1 - p,\) and then \(\tau_{p} = F^{ - 1} \left( p \right),\) where \(\tau_{p}\) is the solution in \(x\) of the equation \(\beta^{x + 1} \left( {\beta + 2 + x} \right) = \left( {1 + \beta } \right)^{x + 2} \left( {1 - p} \right)\).
The TVaR is also known as “conditional tail expectation” or “tail conditional expectation”. It is a useful metric for calculating the expected value of a loss when an event occurs outside of a particular probability level. If X belongs to the PMExD, then the TVaR of X is
Some values of \(VaR_{p}\) and \(TVaR_{p}\) measures for PMExD are listed in Table 2.
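A minimal sketch of how such risk measures could be computed on the integer support is given below. It assumes the survival function \(P(X > x) = \beta^{x+1}(x+2+\beta)/(1+\beta)^{x+2}\), takes the VaR as the smallest integer \(x\) with \(F(x) \ge p\), and takes the TVaR as the conditional tail expectation \(E\left(X \mid X > VaR_{p}\right)\). Both conventions, the function names, and the truncation of the tail sum are assumptions; the values in Table 2 may rest on a different convention.

```python
def pmex_pmf(x, beta):
    """PMEx pmf with q = beta / (1 + beta)."""
    q = beta / (1.0 + beta)
    return (x + 1) * q**x * (1.0 - q) ** 2


def pmex_sf(x, beta):
    """Survival function P(X > x)."""
    q = beta / (1.0 + beta)
    return q ** (x + 1) * (x + 2 + beta) / (1.0 + beta)


def pmex_cdf(x, beta):
    """Cumulative distribution function P(X <= x)."""
    return 1.0 - pmex_sf(x, beta)


def value_at_risk(p, beta):
    """Smallest integer x with F(x) >= p, for 0 < p < 1."""
    x = 0
    while pmex_cdf(x, beta) < p:
        x += 1
    return x


def tail_value_at_risk(p, beta, terms=5000):
    """Assumed convention: E[X | X > VaR_p], using a truncated tail sum."""
    v = value_at_risk(p, beta)
    tail_sum = sum(x * pmex_pmf(x, beta) for x in range(v + 1, v + 1 + terms))
    return tail_sum / pmex_sf(v, beta)


if __name__ == "__main__":
    for p in (0.90, 0.95, 0.99):
        print(p, value_at_risk(p, 2.0), tail_value_at_risk(p, 2.0))
```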
4 Parameter Estimation
In this section, seven different estimation methods are used to estimate the parameter of the PMExD: maximum likelihood (MLE), maximum product spacing (MPSE), Anderson–Darling (ADE), Cramér-von Mises (CVME), least-squares (OLSE), weighted least-squares (WLSE), and right-tailed Anderson-Darling (RADE).
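Let \(X_{1} ,\,X_{2} , \ldots ,\,X_{n}\) be a random sample from the PMExD and \(X_{\left( 1 \right)} < X_{\left( 2 \right)} < \ldots < X_{\left( n \right)}\) denote the corresponding order statistics. Moreover, \(x_{\left( i \right)}\) refers to the observed values of \(X_{\left( i \right)}\). In this regard, the log-likelihood function of the PMEx distribution is
\[ \ell \left( \beta \right) = \sum\limits_{i = 1}^{n} {\log \left( {x_{i} + 1} \right)} + \log \left( \beta \right)\sum\limits_{i = 1}^{n} {x_{i} } - \left( {2n + \sum\limits_{i = 1}^{n} {x_{i} } } \right)\log \left( {1 + \beta } \right). \]
Then, the MLE of parameter \(\beta\) is obtained by solving \(\partial \ell /\partial \beta = 0\), which gives
\[ \hat{\beta } = \frac{{\bar{x}}}{2} = \frac{1}{2n}\sum\limits_{i = 1}^{n} {x_{i} } . \]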
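The maximum likelihood step can be illustrated on the first data set of Sect. 5. The sketch below assumes the pmf \(p(x;\beta) = (x+1)\beta^{x}/(1+\beta)^{x+2}\), for which setting the derivative of the log-likelihood to zero gives the closed form \(\hat{\beta} = \bar{x}/2\). The function names are hypothetical, and the check against neighbouring values of \(\beta\) is only a sanity test, not the procedure used in the study.

```python
import math

# Data set I (failure times of 15 electronic components), as listed in Sect. 5.
DATA_SET_I = [1, 5, 6, 11, 12, 19, 20, 22, 23, 31, 37, 46, 54, 60, 66]


def log_likelihood(beta, xs):
    """PMEx log-likelihood: sum log(x+1) + (sum x) log(beta) - (2n + sum x) log(1+beta)."""
    n, total = len(xs), sum(xs)
    constant = sum(math.log(x + 1) for x in xs)
    return constant + total * math.log(beta) - (2 * n + total) * math.log(1.0 + beta)


def mle(xs):
    """Closed-form maximizer of the log-likelihood: half the sample mean."""
    return sum(xs) / (2.0 * len(xs))


if __name__ == "__main__":
    beta_hat = mle(DATA_SET_I)
    print(beta_hat, log_likelihood(beta_hat, DATA_SET_I))
    # The log-likelihood should be lower on either side of beta_hat.
    print(log_likelihood(0.9 * beta_hat, DATA_SET_I), log_likelihood(1.1 * beta_hat, DATA_SET_I))
```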
Let us define five functions that are used to obtain the minimum distance-based estimates:
and
The ADEs, RADEs, CVMEs, OLSEs and WLSEs of the parameter \(\beta\) are given, respectively, by
The estimators presented in Eqs. (12)–(16) can be obtained by using the optim() function in R.
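To illustrate how a minimum-distance estimate might be computed, the sketch below implements two of the criteria in their standard textbook forms, which are assumed here: the least-squares objective \(\sum_{i}\left[F(x_{(i)}) - i/(n+1)\right]^{2}\) and the Cramér-von Mises objective \(1/(12n) + \sum_{i}\left[F(x_{(i)}) - (2i-1)/(2n)\right]^{2}\). The cdf \(F(x;\beta) = 1 - \beta^{x+1}(x+2+\beta)/(1+\beta)^{x+2}\) is also assumed. A plain Python grid search followed by golden-section refinement stands in for R's optim(); the search interval and all helper names are hypothetical.

```python
import math

# Data set I from Sect. 5, used purely as an example sample.
DATA_SET_I = [1, 5, 6, 11, 12, 19, 20, 22, 23, 31, 37, 46, 54, 60, 66]


def pmex_cdf(x, beta):
    """Assumed PMEx cdf P(X <= x)."""
    q = beta / (1.0 + beta)
    return 1.0 - q ** (x + 1) * (x + 2 + beta) / (1.0 + beta)


def ols_objective(beta, xs):
    """Least-squares distance between the fitted cdf and i / (n + 1)."""
    n = len(xs)
    return sum((pmex_cdf(x, beta) - i / (n + 1)) ** 2
               for i, x in enumerate(sorted(xs), start=1))


def cvm_objective(beta, xs):
    """Cramer-von Mises distance between the fitted cdf and (2i - 1) / (2n)."""
    n = len(xs)
    return 1.0 / (12 * n) + sum((pmex_cdf(x, beta) - (2 * i - 1) / (2 * n)) ** 2
                                for i, x in enumerate(sorted(xs), start=1))


def log_grid(lo=1e-3, hi=1e3, size=400):
    """Log-spaced candidate values of beta (assumed search interval)."""
    return [lo * (hi / lo) ** (k / (size - 1)) for k in range(size)]


def minimize_1d(objective, iters=60):
    """Coarse grid search, then golden-section refinement around the best grid point."""
    pts = log_grid()
    k = min(range(len(pts)), key=lambda j: objective(pts[j]))
    a, b = pts[max(k - 1, 0)], pts[min(k + 1, len(pts) - 1)]
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - g * (b - a), a + g * (b - a)
    for _ in range(iters):
        if objective(c) < objective(d):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    # Keep whichever candidate is better, so the result never loses to the grid.
    return min((pts[k], 0.5 * (a + b)), key=objective)


if __name__ == "__main__":
    print("OLSE:", minimize_1d(lambda b: ols_objective(b, DATA_SET_I)))
    print("CVME:", minimize_1d(lambda b: cvm_objective(b, DATA_SET_I)))
```

The Anderson-Darling, right-tailed Anderson-Darling, and weighted least-squares criteria could be plugged into `minimize_1d` in the same way once their objective functions are written down.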
The maximum product spacing estimator is obtained using the following approach. For \(m = 1,2,3, \ldots , h + 1\), let \(D_{m} \left( \beta \right) = F\left( {x_{\left( m \right)} |\beta } \right) - F\left( {x_{{\left( {m - 1} \right)}} |\beta } \right)\) be the uniform spacings of a random sample of size \(h\) from the PMEx model, where \(F\left( {x_{\left( 0 \right)} |\beta } \right) = 0\), \(F\left( {x_{{\left( {h + 1} \right)}} |\beta } \right) = 1\) and \(\mathop \sum \limits_{m = 1}^{h + 1} D_{m} \left( \beta \right) = 1\). The MPSE of the parameter \(\beta\), say \(\hat{\beta }\), is obtained by maximizing the geometric mean of the spacings
with respect to the parameter \(\beta\).
4.1 Simulation Study
This section presents a comprehensive simulation study that compares the performance of the estimators derived in the previous section. Samples are generated from the PMExD with sizes n = 10, 25, 50, 100, 200, and five parameter settings \(\left( {\beta = 0.25, \,0.5,\,1.0,\,2.0,\, 5.0} \right)\) are considered. The simulation procedure is based on 10,000 repetitions. The ABSs, MREs and MSEs are given by
and
The average estimates (AVEs), average biases (ABSs), mean relative errors (MREs), and mean square errors (MSEs) are presented in Tables 3, 4, 5, 6 and 7.
It is found that the average estimates move closer to the true parameter values as the sample size increases. Further, the ABSs and MSEs of all estimators decrease as the sample size increases. Hence, the estimators, including the MLE, exhibit the consistency property. As a result, we conclude that the MLE performs well in estimating the parameter of the PMEx distribution.
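A reduced version of such a Monte Carlo experiment can be sketched as follows for the MLE alone. Several assumptions are made: samples are drawn as the sum of two independent geometric variables with success probability \(1/(1+\beta)\), which has pmf \((x+1)\beta^{x}/(1+\beta)^{x+2}\); the MLE is \(\hat{\beta} = \bar{x}/2\); the ABS is taken as the mean absolute deviation from the true value, the MRE as that deviation divided by \(\beta\), and the MSE as the mean squared deviation. The replication count is reduced from 10,000 for speed, and all names are hypothetical.

```python
import math
import random


def sample_pmex(n, beta, rng):
    """Draw n PMEx variates as the sum of two geometric variables (inversion method)."""
    log_q = math.log(beta / (1.0 + beta))

    def geometric():
        u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
        return int(math.log(u) / log_q)

    return [geometric() + geometric() for _ in range(n)]


def simulate_mle(beta, n, reps, seed=0):
    """Average estimate, absolute bias, mean relative error and MSE of the MLE."""
    rng = random.Random(seed)
    estimates = [sum(sample_pmex(n, beta, rng)) / (2.0 * n) for _ in range(reps)]
    errors = [b - beta for b in estimates]
    return {
        "ave": sum(estimates) / reps,
        "abs_bias": sum(abs(e) for e in errors) / reps,
        "mre": sum(abs(e) / beta for e in errors) / reps,
        "mse": sum(e * e for e in errors) / reps,
    }


if __name__ == "__main__":
    for n in (10, 25, 50, 100, 200):
        print(n, simulate_mle(beta=2.0, n=n, reps=2000, seed=1))
```

With this setup the MSE is expected to shrink roughly in proportion to \(1/n\), since the variance of \(\bar{x}/2\) is \(\beta(1+\beta)/(2n)\).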
5 Application
In this section, four datasets from different fields are used for application purposes. The fit of the PMExD is compared with some competitive one-parameter distributions such as discrete Rayleigh (DR), discrete Burr-Hatke (DBH), discrete Pareto (DPr), discrete inverted Topp-Leone (DITL), and Poisson. The fitted probability distributions are compared using maximized log-likelihood \(\left( l \right)\), Akaike information criterion (AIC), Bayesian information criterion (BIC), Kolmogorov–Smirnov (KS), and the Chi-square test with its corresponding P-values.
Data set I: The first data set was reported in [21] and represents the failure times for a sample of 15 electronic components in an accelerated life test. The observations are: 1, 5, 6, 11, 12, 19, 20, 22, 23, 31, 37, 46, 54, 60 and 66. The mean, variance, dispersion index, skewness, and kurtosis for this data set are 27.533, 431.98, 15.689, 0.5532, and 2.0616, respectively. These measures indicate that the data set is over-dispersed and positively skewed, with a kurtosis below 3 (platykurtic). The MLEs and goodness-of-fit measures for this dataset are listed in Table 8. The P-P plots for all fitted models are given in Fig. 3.
From Table 8, the PMEx distribution gives a higher log-likelihood and a higher Kolmogorov–Smirnov P-value than the competing models. Furthermore, the proposed distribution yields the minimum values of the AIC and BIC criteria. Figure 3 supports the results listed in Table 8.
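The model-selection criteria can be reproduced in outline with a few lines of code. The sketch below fits the PMExD and the Poisson model to Data set I by maximum likelihood and computes \(AIC = 2k - 2\ell\) and \(BIC = k\log n - 2\ell\) with \(k = 1\). It assumes the pmf \((x+1)\beta^{x}/(1+\beta)^{x+2}\) with \(\hat{\beta} = \bar{x}/2\); only the Poisson competitor is included because its pmf is standard, and the function names are hypothetical.

```python
import math

# Data set I (failure times of 15 electronic components).
DATA_SET_I = [1, 5, 6, 11, 12, 19, 20, 22, 23, 31, 37, 46, 54, 60, 66]


def pmex_loglik(xs):
    """Maximized PMEx log-likelihood, using beta_hat = mean / 2."""
    beta = sum(xs) / (2.0 * len(xs))
    return sum(math.log(x + 1) + x * math.log(beta) - (x + 2) * math.log(1.0 + beta)
               for x in xs)


def poisson_loglik(xs):
    """Maximized Poisson log-likelihood, using mu_hat = mean."""
    mu = sum(xs) / len(xs)
    return sum(-mu + x * math.log(mu) - math.lgamma(x + 1) for x in xs)


def information_criteria(loglik, n, k=1):
    """Return (AIC, BIC) for a model with k parameters fitted to n observations."""
    return 2 * k - 2 * loglik, k * math.log(n) - 2 * loglik


if __name__ == "__main__":
    n = len(DATA_SET_I)
    print("PMExD  :", information_criteria(pmex_loglik(DATA_SET_I), n))
    print("Poisson:", information_criteria(poisson_loglik(DATA_SET_I), n))
```

Because the sample variance is far larger than the sample mean, the equi-dispersed Poisson model is penalized heavily on this data set.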
Data set II: The second dataset consists of the 2003 final examination marks of 48 slow-pace students in mathematics at the Indian Institute of Technology at Kanpur (Bakouch et al. 2014). The observations are; 29, 25, 50, 15, 13, 27, 15, 18, 7, 7, 8, 19, 12, 18, 5, 21, 15, 86, 21, 15, 14, 39, 15, 14, 70, 44, 6, 23, 58, 19, 50, 23, 11, 6, 34, 18, 28, 34, 12, 37, 4, 60, 20, 23, 40, 65, 19 and 31. The mean, variance, skewness, kurtosis, and dispersion index of this data set are 25.895, 346.14, 1.3317, 4.3233, and 13.367 respectively. The parameter estimates along with model selection measures are given in Table 9. Figure 4 shows the P-P plots of all fitted models for the second data set.
The estimates \(\hat{\beta }\) and goodness-of-fit measures for the PMExD and the other models are reported in Table 9. The PMEx distribution is more adequate for these data than the DR, DBH, DPr, Poisson, and DITL distributions. Figure 4 also supports these results.
Data set III: The third data set represents epileptic seizure counts reported in [22]. The parameter estimates and goodness-of-fit measures for all fitted distributions for the third data set are listed in Table 10. It is evident from Table 10 that the proposed distribution fits this data set quite well. Figure 5 also supports these results.
Data set IV: The fourth data set is the biological experiment data on the European corn borer obtained from [23] and listed in Table 11. The experiment was conducted randomly on 8 hills in 15 replications, and the experimenter counted the number of borers per hill of corn. The MLEs for all fitted models are presented in Table 11. The fitted pmfs are given in Fig. 6. It is found that the PMEx distribution provides a better fit than the competing distributions.
6 Conclusion
A new one-parameter discrete distribution is proposed. The new distribution is called the Poisson moment exponential distribution (PMExD), and it is suitable for modeling over-dispersed datasets. We derived its mathematical properties, including its moments and associated measures, reliability properties, and two actuarial risk measures. The distribution parameter is estimated using seven different estimation methods. A Monte Carlo simulation study was used to investigate the efficiency of the derived estimators. The PMExD was applied to four datasets from different fields and compared with the DR, DBH, DPr, Poisson, and DITL distributions. The findings show that the PMEx distribution outperforms the competing distributions. Further, the neutrosophic form of the proposed distribution for modeling datasets with indeterminacy is under investigation.
Data Availability
The data are given in the manuscript.
References
1. Olson DL, Shi Y (2007) Introduction to business data mining, vol 10. McGraw-Hill/Irwin, New York
2. Shi Y, Tian Y, Kou G, Peng Y, Li J (2011) Optimization based data mining: theory and applications. Springer, Berlin
3. Tien JM (2017) Internet of things, real-time decision making, and artificial intelligence. Ann Data Sci 4(2):149–178
4. Shi Y (2022) Advances in big data analytics: theory, algorithms and practices. Springer, Berlin
5. Nakagawa T, Osaki S (1975) The discrete Weibull distribution. IEEE Trans Reliab 24(5):300–301
6. Gómez-Déniz E, Calderín-Ojeda E (2011) The discrete Lindley distribution: properties and applications. J Stat Comput Simul 81(11):1405–1416
7. El-Morshedy M, Eliwa MS, Altun E (2020) Discrete Burr-Hatke distribution with properties, estimation methods and regression model. IEEE Access 8:74359–74370
8. Roy D (2004) Discrete Rayleigh distribution. IEEE Trans Reliab 53(2):255–260
9. Hassan A, Shalbaf GA, Bilal S, Rashid A (2020) A new flexible discrete distribution with applications to count data. J Stat Theory Appl 19(1):102–108
10. Para BA, Jan TR, Bakouch HS (2020) Poisson Xgamma distribution: a discrete model for count data analysis. Model Assist Stat Appl 15(2):139–151
11. El-Morshedy M, Eliwa MS, Nagy H (2020) A new two-parameter exponentiated discrete Lindley distribution: properties, estimation and applications. J Appl Stat 47(2):354–375
12. Eldeeb AS, Ahsan-ul-Haq M, Babar A (2021) A discrete analog of inverted Topp-Leone distribution: properties, estimation and applications. Int J Anal Appl 19(5):695–708
13. Eldeeb AS, Ahsan-ul-Haq M, Eliwa MS (2021) A discrete Ramos-Louzada distribution for asymmetric and over-dispersed data with leptokurtic-shaped: properties and various estimation techniques with inference. AIMS Math 7(2):1726–1741. https://doi.org/10.3934/math.2022099
14. Ahsan-ul-Haq M, Babar A, Hashmi S, Alghamdi AS, Afify AZ (2021) The discrete type-II half-logistic exponential distribution with applications to COVID-19 data. Pak J Stat Oper Res 2021:921–932
15. Alghamdi AS, Ahsan-ul-Haq M, Babar A, Aljohani HM, Afify AZ, Cell QE (2022) The discrete power-Ailamujia distribution: properties, inference, and applications. AIMS Math 7(5):8344–8360
16. Dara ST, Ahmad M (2012) Recent advances in moment distribution and their hazard rates. LAP LAMBERT Academic Publishing, Chisinau
17. Iqbal Z, Hasnain SA, Salman M, Ahmad M, Hamedani GG (2014) Generalized exponentiated moment exponential distribution. Pak J Stat 30(4):537–554
18. ul Haq MA, Usman RM, Hashmi S, Al-Omeri AI (2017) The Marshall-Olkin length-biased exponential distribution and its applications. J King Saud Univ Sci 4763(October):1–11
19. Hashmi B, Hashmi S, Ahsan ul Haq M, Muhammad Usman R (2019) A generalized exponential distribution with increasing, decreasing and constant shape hazard curves. Electron J Appl Stat Anal 12:223–244
20. Hashmi S, Ahsan M, Haq U, Muhammad R, Ozel G (2019) The Weibull-moment exponential distribution: properties, characterizations and applications. J Reliab Stat Stud 12(1):1–22
21. Lawless JF (2011) Statistical models and methods for lifetime data, vol 362. Wiley, Hoboken
22. Chakraborty S (2010) On some distributional properties of the family of weighted generalized Poisson distribution. Commun Stat 39(15):2767–2788
23. Beall G (1940) The fit and significance of contagious distributions when applied to observations on larval insects. Ecology 21(4):460–474
Contributions
MA is responsible for the entire contribution to this publication.
Ethics declarations
Ethical statements
Four datasets are used in the application section and taken from literature.
Conflict of interest
The authors have no conflict of interest.
Funding
The author received no specific funding for this study.
Cite this article
Ahsan-ul-Haq, M. On Poisson Moment Exponential Distribution with Applications. Ann. Data. Sci. 11, 137–158 (2024). https://doi.org/10.1007/s40745-022-00400-0
DOI: https://doi.org/10.1007/s40745-022-00400-0