Abstract
As a flexible extension of the common Poisson model, the Conway–Maxwell–Poisson distribution allows for describing under- and overdispersion in count data via an additional parameter. Estimation methods for the two Conway–Maxwell–Poisson parameters are then required to fit the model. In this work, two characterization results are provided related to maximum likelihood estimation of the Conway–Maxwell–Poisson parameters. The first states that maximum likelihood estimation fails if and only if the range of the observations is less than two. Assuming that the maximum likelihood estimate exists, the second result comprises a simple necessary and sufficient condition for the maximum likelihood estimate to be a solution of the likelihood equation; otherwise, it lies on the boundary of the parameter set. A simulation study is carried out to investigate the accuracy of the maximum likelihood estimate in dependence on the range of the underlying observations.
1 INTRODUCTION
The Conway–Maxwell–Poisson (CMP) distribution was introduced in [9] as an extension of the common Poisson model that allows under- and overdispersion in count data to be described. Over the last two decades, its flexibility and usefulness have been pointed out, and there is by now a variety of articles in the literature dealing with CMP distributions and their properties, including, among others, distribution theory, inference, regression models, time series, tree-based models, Bayesian methods, multivariate extensions, and applications; see, for instance, [1, 2, 4–8, 10–12, 16–22].
The counting density of the CMP distribution \(P_{\boldsymbol{\vartheta}}\) is given by
\[f_{\boldsymbol{\vartheta}}(k)\;=\;C(\boldsymbol{\vartheta})\,\frac{\lambda^{k}}{(k!)^{\nu}}\,,\qquad k\in{\mathbb{N}}_{0}\,,\qquad(1)\]
with normalizing constant
\[C(\boldsymbol{\vartheta})\;=\;\biggl(\sum_{k=0}^{\infty}\frac{\lambda^{k}}{(k!)^{\nu}}\biggr)^{-1}\qquad(2)\]
for \(\boldsymbol{\vartheta}=(\lambda,\nu)\in\Theta\) with parameter set
\[\Theta\;=\;\bigl((0,1)\times\{0\}\bigr)\,\cup\,\bigl((0,\infty)\times(0,\infty)\bigr)\,.\]
Geometric distributions, Poisson distributions, and Bernoulli distributions are contained in the model; they result from setting \(\nu=0\) (with \(\lambda\in(0,1)\)), from setting \(\nu=1\), and from letting \(\nu\) tend to infinity, respectively. A CMP distribution has variance not smaller than its mean for \(\nu<1\) (overdispersion) and variance not larger than its mean for \(\nu>1\) (underdispersion); see, e.g., [13, Sect. 4.2; 10, Subsect. 2.3.1]. In case \(\nu=1\), i.e., for Poisson distributions, mean and variance coincide.
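To make the model tangible, the following Python sketch (ours, not taken from the article) evaluates the counting density (1), approximating the normalizing constant (2) by truncating its series; the truncation point `K` and the log-sum-exp stabilization are illustrative choices.

```python
import math

def cmp_pmf(k, lam, nu, K=500):
    """Counting density (1) of the CMP distribution.

    The normalizing constant (2) is approximated by truncating its
    series at K terms; K = 500 is an illustrative choice that is
    ample for moderate lambda and nu >= 0.
    """
    log_terms = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(K)]
    m = max(log_terms)  # log-sum-exp stabilization
    log_c = -(m + math.log(sum(math.exp(t - m) for t in log_terms)))
    return math.exp(log_c + k * math.log(lam) - nu * math.lgamma(k + 1))

print(cmp_pmf(2, 3.0, 1.0))  # Poisson(3) case: exp(-3) * 3^2 / 2! = 0.2240...
print(sum(cmp_pmf(k, 0.5, 0.0) for k in range(200)))  # geometric case, sums to ~1
```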
In this article, two characterization results are provided related to maximum likelihood (ML) estimation of the CMP parameter \(\boldsymbol{\vartheta}\). In a previous article by the authors, a sufficient condition for the non-existence of the ML estimate has been derived, namely that the range of the observations is less than two; see [5]. Here, we point out that this condition is also necessary, so that the ML estimate exists if and only if the range is at least two (Section 2). Moreover, in case of existence, a simple necessary and sufficient condition for the ML estimate to be a solution of the likelihood equation is derived (Section 3). If this condition is not met, the ML estimate lies on the boundary of \(\Theta\). Finally, a simulation study is performed to assess the accuracy of the ML estimator given the range of the observations (Section 4).
2 A CHARACTERIZATION RESULT ON THE EXISTENCE OF THE ML ESTIMATE
Let \(X_{1},\dots,X_{n}\) be independent and identically distributed random variables with distribution \(P_{\boldsymbol{\vartheta}}\) and counting density \(f_{\boldsymbol{\vartheta}}\) given by formula (1), and let \(x_{1},\dots,x_{n}\in{\mathbb{N}}_{0}\) be realizations of \(X_{1},\dots,X_{n}\). Moreover, let \(x_{(1)}\leq\ldots\leq x_{(n)}\) denote the realizations in ascending order. Furthermore, we introduce the statistic
\[\mathbf{T}(x)\;=\;\bigl(x,\,-\ln(x!)\bigr)\,,\qquad x\in{\mathbb{N}}_{0}\,,\]
and denote the convex support of \(\mathbf{T}\), i.e., the closed convex hull of the support of \(\mathbf{T}\), by \(M\). According to Lemma 2.3 in [5], we have the representation
\[M\;=\;\bigl\{(t_{1},t_{2})\in{\mathbb{R}}^{2}:\;t_{1}\geq 0,\;t_{2}\leq u(t_{1})\bigr\}\,,\]
where
\[u(t)\;=\;-\ln(\lfloor t\rfloor!)-(t-\lfloor t\rfloor)\ln(\lfloor t\rfloor+1)\,,\qquad t\geq 0\,,\]
denotes the piecewise linear function interpolating the points \((k,-\ln(k!))\), \(k\in{\mathbb{N}}_{0}\).
Theorem 2.1. An ML estimate of \(\boldsymbol{\vartheta}\) based on \(x_{1},\dots,x_{n}\) exists if and only if \(x_{(n)}-x_{(1)}\geq 2\), i.e., if the range of all observations is at least two. In case of existence, the ML estimate is uniquely determined.
Proof. As shown in [5, 15], the set \(\{P_{\boldsymbol{\vartheta}}^{(n)}:\boldsymbol{\vartheta}\in\Theta\}\) of \(n\)-fold product measures of \(P_{\boldsymbol{\vartheta}}\), \(\boldsymbol{\vartheta}\in\Theta\), forms a full exponential family with canonical parameter \(\boldsymbol{\zeta}=(\ln(\lambda),\nu)\) and minimal sufficient statistic \(\mathbf{T}^{(n)}(\mathbf{x})=\sum_{i=1}^{n}\mathbf{T}(x_{i})\) for \(\mathbf{x}=(x_{1},\dots,x_{n})\in{\mathbb{N}}_{0}^{n}\). Applying Theorem 9.13 in [3, p. 151] to the corresponding exponential family consisting of the distributions of \(\mathbf{T}^{(n)}\), an ML estimate of \(\boldsymbol{\zeta}\) (and then also of \(\boldsymbol{\vartheta}\)) based on \(\mathbf{x}\in{\mathbb{N}}_{0}^{n}\) exists and is then unique if and only if \(\mathbf{T}^{(n)}(\mathbf{x})\) lies in the interior of the convex support of \(\mathbf{T}^{(n)}\) or, equivalently, if \(\sum_{i=1}^{n}\mathbf{T}(x_{i})/n\) lies in the interior of \(M\). The latter condition, in turn, is equivalent to \(x_{(n)}-x_{(1)}\geq 2\) by [5, Lemma 3.2]. \(\Box\)
Theorem 2.1 improves upon a former result in [5, Theorem 3.3 and Corollary 3.5] stating that \(x_{(n)}-x_{(1)}\geq 2\) is a necessary condition for the existence of the ML estimate.
In case of existence, the unique ML estimate \(\hat{\boldsymbol{\vartheta}}=(\hat{\lambda},\hat{\nu})\), say, of \(\boldsymbol{\vartheta}\) is either the unique solution of the likelihood equation
\[\boldsymbol{\pi}(\boldsymbol{\vartheta})\;=\;\frac{1}{n}\sum_{i=1}^{n}\mathbf{T}(x_{i})\qquad(3)\]
with mapping \(\boldsymbol{\pi}=(\pi_{1},\pi_{2}):\textrm{int}(\Theta)\rightarrow\textrm{int}(M)\) defined by
\[\boldsymbol{\pi}(\boldsymbol{\vartheta})\;=\;E_{\boldsymbol{\vartheta}}(\mathbf{T}(X_{1}))\;=\;\biggl(C(\boldsymbol{\vartheta})\sum_{k=0}^{\infty}k\,\frac{\lambda^{k}}{(k!)^{\nu}}\,,\;-C(\boldsymbol{\vartheta})\sum_{k=0}^{\infty}\ln(k!)\,\frac{\lambda^{k}}{(k!)^{\nu}}\biggr)\,,\qquad\boldsymbol{\vartheta}\in\textrm{int}(\Theta)\,,\qquad(4)\]
or lies on the boundary of \(\Theta\) and, hence, corresponds to a geometric distribution; see [5] for details. In the latter case, we necessarily have \(\hat{\boldsymbol{\vartheta}}=(\overline{x}_{n}/(1+\overline{x}_{n}),0)\) with \(\overline{x}_{n}=\sum_{i=1}^{n}x_{i}/n\), since \(\overline{x}_{n}/(1+\overline{x}_{n})\) is the unique ML estimate of \(\lambda\) in the subfamily of geometric distributions. The image of \(\boldsymbol{\pi}\) is of complicated analytic form, but its graphical representation may be useful to decide between the cases; see [5].
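Numerically, an interior ML estimate can be obtained by solving the moment equation (3) with a generic root finder. The sketch below is a minimal illustration under our own choices (truncation of the series in (4) at 500 terms, SciPy's `fsolve`, a log-scale parametrization to keep both parameters positive); it is not the implementation used in [5] or in this article.

```python
import numpy as np
from math import lgamma
from scipy.optimize import fsolve

K = 500  # illustrative series truncation
LOG_FACT = np.array([lgamma(j + 1) for j in range(K)])  # ln(k!), k = 0..K-1

def pi_map(lam, nu):
    """Mean-parameter map (4): (E X, -E ln(X!)) under the CMP distribution."""
    k = np.arange(K)
    log_w = k * np.log(lam) - nu * LOG_FACT
    w = np.exp(log_w - log_w.max())
    w /= w.sum()  # normalized CMP probabilities
    return np.array([w @ k, -(w @ LOG_FACT)])

def ml_interior(x):
    """Solve the likelihood equation (3), pi(theta) = mean of T(x_i),
    on the log scale so that both parameters stay positive."""
    x = np.asarray(x, dtype=float)
    target = np.array([x.mean(), -np.mean([lgamma(v + 1) for v in x])])
    root = fsolve(lambda z: pi_map(np.exp(z[0]), np.exp(z[1])) - target,
                  x0=[np.log(max(x.mean(), 0.1)), 0.0])
    return np.exp(root)  # (lambda_hat, nu_hat)

print(ml_interior([0, 1, 1, 1, 1, 2, 3]))  # hypothetical data with range >= 2
```

For data falling under the boundary case, the estimate \((\overline{x}_{n}/(1+\overline{x}_{n}),0)\) should be returned instead; the sketch assumes the interior case.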
3 CHARACTERIZING THE EXISTENCE OF A SOLUTION OF THE LIKELIHOOD EQUATION
In the following, we derive an analytic criterion to decide whether, in case of existence, the ML estimate lies in the interior \(\textrm{int}(\Theta)=(0,\infty)^{2}\) of \(\Theta\), and can thus be obtained as a solution of the likelihood equation, or lies on the boundary \(\{(\lambda,0):\lambda\in(0,1)\}\) of \(\Theta\). For this, note that the mapping \(\boldsymbol{\pi}:\textrm{int}(\Theta)\rightarrow\boldsymbol{\pi}(\textrm{int}(\Theta))\) is bijective and continuously differentiable; see [5, Sect. 2]. Therefore, \(\boldsymbol{\pi}\) possesses a continuous inverse function \(\boldsymbol{\pi}^{-1}:\boldsymbol{\pi}(\textrm{int}(\Theta))\rightarrow\textrm{int}(\Theta)\). In particular, \(\boldsymbol{\pi}(\textrm{int}(\Theta))\) is open as the pre-image of the open set \(\textrm{int}(\Theta)\) under the continuous mapping \(\boldsymbol{\pi}^{-1}\). For our purposes, it will be convenient to extend the domain of \(\boldsymbol{\pi}\) from \(\textrm{int}(\Theta)\) to \(\Theta\) according to formula (4).
First, some preliminary results related to the behavior of \(\boldsymbol{\pi}\) on the boundary of \(\Theta\) are stated within several lemmas.
Lemma 3.1. Let \(\boldsymbol{\vartheta}\in\Theta\) be a boundary point of \(\Theta\) and \(\boldsymbol{\vartheta}_{j}\in\textrm{int}(\Theta)\), \(j\in{\mathbb{N}}\), be a sequence in the interior of \(\Theta\) with \(\boldsymbol{\vartheta}_{j}\rightarrow\boldsymbol{\vartheta}\) for \(j\rightarrow\infty\). Then, \(C(\boldsymbol{\vartheta}_{j})\rightarrow C(\boldsymbol{\vartheta})\) and \(\boldsymbol{\pi}(\boldsymbol{\vartheta}_{j})\rightarrow\boldsymbol{\pi}(\boldsymbol{\vartheta})\) for \(j\rightarrow\infty\).
Proof. Let \(\boldsymbol{\vartheta}\in\Theta\) be some boundary point of \(\Theta\), i.e., \(\boldsymbol{\vartheta}=(\lambda,0)\) for some \(\lambda\in(0,1)\), and let \(\boldsymbol{\vartheta}_{j}=(\lambda_{j},\nu_{j})\in\textrm{int}(\Theta)\) with \(\boldsymbol{\vartheta}_{j}\rightarrow(\lambda,0)\) for \(j\rightarrow\infty\). Moreover, let \(\varepsilon>0\) be such that \(\lambda+2\varepsilon<1\). For \(j\in{\mathbb{N}}\), let \(g_{j}(k)=\lambda_{j}^{k}/k!^{\nu_{j}}\), \(k\in{\mathbb{N}}_{0}\). Furthermore, let \(\mu\) denote the counting measure on \({\mathbb{N}}_{0}\). Then, there exists \(j_{0}\in{\mathbb{N}}\) with \(g_{j}(k)\leq g(k)=(\lambda+\varepsilon)^{k}\), \(k\in{\mathbb{N}}_{0}\), for all \(j\geq j_{0}\), where \(g\) is \(\mu\)-integrable. Hence, the dominated convergence theorem yields
\[\sum_{k=0}^{\infty}g_{j}(k)\;=\;\sum_{k=0}^{\infty}\frac{\lambda_{j}^{k}}{(k!)^{\nu_{j}}}\;\longrightarrow\;\sum_{k=0}^{\infty}\lambda^{k}\;=\;\frac{1}{1-\lambda}\qquad(j\rightarrow\infty)\]
and thus \(C(\boldsymbol{\vartheta}_{j})\rightarrow C(\boldsymbol{\vartheta})=1-\lambda\) for \(j\rightarrow\infty\); see formula (2).
Next, we define for \(j\in{\mathbb{N}}\) the functions \(h_{j}(k)=kg_{j}(k)\) and \(\tilde{h}_{j}(k)=\ln(k!)g_{j}(k)\) for \(k\in{\mathbb{N}}_{0}\). Then, for \(j\geq j_{0}\), we have that \(h_{j}(k)\leq h(k)=k(\lambda+\varepsilon)^{k}\) and \(\tilde{h}_{j}(k)\leq\tilde{h}(k)=\ln(k!)(\lambda+\varepsilon)^{k}\) for \(k\in{\mathbb{N}}_{0}\). Obviously, \(h\) is \(\mu\)-integrable by the ratio test for series. To see that \(\tilde{h}\) is \(\mu\)-integrable, note that by l’Hospital’s rule
\[\lim_{x\rightarrow\infty}\frac{\ln(x+1)}{\ln(\Gamma(x+1))}\;=\;\lim_{x\rightarrow\infty}\frac{1/(x+1)}{\psi(x+1)}\;=\;0\,,\]
where \(\psi\) denotes the digamma function satisfying \(\psi(x)\rightarrow\infty\) for \(x\rightarrow\infty\). Hence, there exists \(k_{0}\in{\mathbb{N}}\) such that \(\ln(k+1)/\ln(k!)<\varepsilon/(\lambda+\varepsilon)\) for \(k\geq k_{0}\). It follows that
\[\frac{\tilde{h}(k+1)}{\tilde{h}(k)}\;=\;\Bigl(1+\frac{\ln(k+1)}{\ln(k!)}\Bigr)(\lambda+\varepsilon)\;<\;\Bigl(1+\frac{\varepsilon}{\lambda+\varepsilon}\Bigr)(\lambda+\varepsilon)\;=\;\lambda+2\varepsilon\;<\;1\]
for \(k\geq k_{0}\), and \(\tilde{h}\) is then \(\mu\)-integrable by the ratio test for series; cf. the proof of Lemma 2.2 in [5]. Applying the dominated convergence theorem, we finally obtain
\[\boldsymbol{\pi}(\boldsymbol{\vartheta}_{j})\;=\;\biggl(C(\boldsymbol{\vartheta}_{j})\sum_{k=0}^{\infty}h_{j}(k)\,,\;-C(\boldsymbol{\vartheta}_{j})\sum_{k=0}^{\infty}\tilde{h}_{j}(k)\biggr)\;\longrightarrow\;\biggl(C(\boldsymbol{\vartheta})\sum_{k=0}^{\infty}k\lambda^{k}\,,\;-C(\boldsymbol{\vartheta})\sum_{k=0}^{\infty}\ln(k!)\,\lambda^{k}\biggr)\;=\;\boldsymbol{\pi}(\boldsymbol{\vartheta})\]
for \(j\rightarrow\infty\). \(\Box\)
Lemma 3.2 states a characterization of a boundary point \(\boldsymbol{\vartheta}\) of \(\Theta\) in terms of \(\boldsymbol{\pi}(\boldsymbol{\vartheta})\).
Lemma 3.2. Let \(\boldsymbol{\vartheta}\in\Theta\). Then \(\boldsymbol{\vartheta}\) is a boundary point of \(\Theta\) if and only if \(\boldsymbol{\pi}(\boldsymbol{\vartheta})\) is a boundary point of \(\boldsymbol{\pi}(\textrm{int}(\Theta))\).
Proof. First, let \(\boldsymbol{\vartheta}\in\Theta\) be a boundary point of \(\Theta\). Then there exists a sequence \(\boldsymbol{\vartheta}_{j}\in\textrm{int}(\Theta)\), \(j\in{\mathbb{N}}\), with \(\boldsymbol{\vartheta}_{j}\rightarrow\boldsymbol{\vartheta}\) for \(j\rightarrow\infty\). By Lemma 3.1, we have \(\boldsymbol{\pi}(\boldsymbol{\vartheta}_{j})\rightarrow\boldsymbol{\pi}(\boldsymbol{\vartheta})\) for \(j\rightarrow\infty\), such that every neighborhood of \(\boldsymbol{\pi}(\boldsymbol{\vartheta})\) must contain the points \(\boldsymbol{\pi}(\boldsymbol{\vartheta}_{j})\in\boldsymbol{\pi}(\textrm{int}(\Theta))\), \(j\geq j_{0}\), for some \(j_{0}\in{\mathbb{N}}\).
Next, suppose that \(\boldsymbol{\pi}(\boldsymbol{\vartheta})\in\boldsymbol{\pi}(\textrm{int}(\Theta))\). Then there exists some \(\tilde{\boldsymbol{\vartheta}}\in\textrm{int}(\Theta)\) with \(\boldsymbol{\pi}(\boldsymbol{\vartheta})=\boldsymbol{\pi}(\tilde{\boldsymbol{\vartheta}})\). Since \(\boldsymbol{\vartheta}_{j}\in\textrm{int}(\Theta)\) for all \(j\in{\mathbb{N}}\) and \(\boldsymbol{\pi}(\boldsymbol{\vartheta}_{j})\rightarrow\boldsymbol{\pi}(\boldsymbol{\vartheta})\) for \(j\rightarrow\infty\) by Lemma 3.1, it follows that
\[\boldsymbol{\vartheta}_{j}\;=\;\boldsymbol{\pi}^{-1}(\boldsymbol{\pi}(\boldsymbol{\vartheta}_{j}))\;\longrightarrow\;\boldsymbol{\pi}^{-1}(\boldsymbol{\pi}(\tilde{\boldsymbol{\vartheta}}))\;=\;\tilde{\boldsymbol{\vartheta}}\]
for \(j\rightarrow\infty\) by using that \(\boldsymbol{\pi}^{-1}\) is continuous. This leads to the contradiction \(\boldsymbol{\vartheta}=\tilde{\boldsymbol{\vartheta}}\in\textrm{int}(\Theta)\). Hence, we have \(\boldsymbol{\pi}(\boldsymbol{\vartheta})\notin\boldsymbol{\pi}(\textrm{int}(\Theta))\), and \(\boldsymbol{\pi}(\boldsymbol{\vartheta})\) is a boundary point of \(\boldsymbol{\pi}(\textrm{int}(\Theta))\).
On the other hand, for \(\boldsymbol{\vartheta}\in\textrm{int}(\Theta)\), it is evident that \(\boldsymbol{\pi}(\boldsymbol{\vartheta})\) lies in the interior of \(\boldsymbol{\pi}(\textrm{int}(\Theta))\), since \(\boldsymbol{\pi}(\textrm{int}(\Theta))\) is open. \(\Box\)
In what follows, let the mapping \(d:(0,\infty)\rightarrow(-\infty,0)\) be defined by
\[d(z)\;=\;-\sum_{k=2}^{\infty}\ln(k)\Bigl(\frac{z}{1+z}\Bigr)^{k}\,,\qquad z>0\,,\qquad(5)\]
and let \(D=\{(z,d(z)):z>0\}\) denote the graph of \(d\).
Lemma 3.3. \(D\) is a subset of the boundary of \(\boldsymbol{\pi}(\textrm{int}(\Theta))\) and \(D\cap\boldsymbol{\pi}(\textrm{int}(\Theta))=\emptyset\).
Proof. Applying Lemma 3.2 and using formula (4), the boundary of \(\boldsymbol{\pi}(\textrm{int}(\Theta))\) contains the points
\[\boldsymbol{\pi}(\lambda,0)\;=\;\Bigl(\frac{\lambda}{1-\lambda}\,,\;-(1-\lambda)\sum_{k=2}^{\infty}\ln(k!)\,\lambda^{k}\Bigr)\,,\qquad\lambda\in(0,1)\,.\qquad(6)\]
Since for \(\lambda\in(0,1)\)
\[(1-\lambda)\sum_{k=2}^{\infty}\ln(k!)\,\lambda^{k}\;=\;\sum_{k=2}^{\infty}\ln(k)\,\lambda^{k}\,,\qquad(7)\]
formula (6) can be rewritten as
\[\boldsymbol{\pi}(\lambda,0)\;=\;\Bigl(\frac{\lambda}{1-\lambda}\,,\;-\sum_{k=2}^{\infty}\ln(k)\,\lambda^{k}\Bigr)\,,\qquad\lambda\in(0,1)\,,\]
which, by setting \(z=\lambda/(1-\lambda)\), can be reparametrized as \((z,d(z))\), \(z>0\), and the first assertion is shown. The second assertion is obvious from the fact that \(\boldsymbol{\pi}(\textrm{int}(\Theta))\) is open and thus cannot contain any of its boundary points. \(\Box\)
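Since the series in formula (5) converges geometrically, \(d\) is straightforward to evaluate by truncation. A minimal Python sketch (the truncation point `K` is our illustrative choice):

```python
from math import log

def d(z, K=10_000):
    """Boundary curve (5): d(z) = -sum_{k>=2} ln(k) (z/(1+z))^k.
    The series converges geometrically; K = 10000 is an illustrative
    truncation that is ample for moderate z."""
    q = z / (1.0 + z)
    return -sum(log(k) * q ** k for k in range(2, K))

print(d(1.0))  # -sum ln(k)/2^k = -0.5078..., the value d(1) used in Lemma 3.4
```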
Lemma 3.3 enables us to formulate a condition that every point in \(\boldsymbol{\pi}(\textrm{int}(\Theta))\) necessarily fulfils.
Lemma 3.4. For \((t_{1},t_{2})\in\boldsymbol{\pi}(\textrm{int}(\Theta))\), it holds that \(t_{2}>d(t_{1})\) with mapping \(d\) given by formula (5).
Proof. Let \(\boldsymbol{\vartheta}=(1,1)\in\textrm{int}(\Theta)\). Then, according to formula (4) with \(C(\boldsymbol{\vartheta})=1/e\), we have \(\pi_{1}(\boldsymbol{\vartheta})=e^{-1}\sum_{k=1}^{\infty}k/k!=1\) and, together with formula (5),
\[\pi_{2}(\boldsymbol{\vartheta})\;=\;-\frac{1}{e}\sum_{k=2}^{\infty}\frac{\ln(k!)}{k!}\;=\;-\sum_{k=2}^{\infty}a_{k}\qquad\text{and}\qquad d(1)\;=\;-\sum_{k=2}^{\infty}\frac{\ln(k)}{2^{k}}\;=\;-\sum_{k=2}^{\infty}b_{k}\,,\]
where
\[a_{k}\;=\;\frac{\ln(k!)}{e\,k!}\qquad\text{and}\qquad b_{k}\;=\;\frac{\ln(k)}{2^{k}}\,,\qquad k\geq 2\,.\]
To establish that \(d(1)<\pi_{2}(\boldsymbol{\vartheta})\), we show by induction that \(a_{k}<b_{k}\) for all \(k\geq 2\).
Obviously, \(a_{2}=\ln(2)/(2e)<\ln(2)/4=b_{2}\) and \(a_{3}=\ln(6)/(6e)\approx 0.110<0.137\approx\ln(3)/8=b_{3}\). Next, let \(a_{k}<b_{k}\) for some \(k\geq 3\). Then, by using that \((k+1)!>2^{k+1}\),
\[a_{k+1}\;=\;\frac{a_{k}}{k+1}+\frac{\ln(k+1)}{e\,(k+1)!}\;<\;\frac{b_{k}}{k+1}+\frac{\ln(k+1)}{e\,2^{k+1}}\;<\;\frac{\ln(k+1)}{2^{k+1}}\Bigl(\frac{2}{k+1}+\frac{1}{e}\Bigr)\;\leq\;\frac{\ln(k+1)}{2^{k+1}}\Bigl(\frac{1}{2}+\frac{1}{e}\Bigr)\;<\;b_{k+1}\,,\]
where we used that \(b_{k}/(k+1)=2\ln(k)/((k+1)2^{k+1})<2\ln(k+1)/((k+1)2^{k+1})\), that \(2/(k+1)\leq 1/2\) for \(k\geq 3\), and that \(1/2+1/e<1\).
Now, since \(\boldsymbol{\pi}(\textrm{int}(\Theta))\subset\textrm{int}(M)\subset(0,\infty)\times(-\infty,0)\) is connected by [14, p. 668] and \(D\cap\boldsymbol{\pi}(\textrm{int}(\Theta))=\emptyset\) by Lemma 3.3, the connected set \(\boldsymbol{\pi}(\textrm{int}(\Theta))\) contains the point \(\boldsymbol{\pi}((1,1))=(1,\pi_{2}((1,1)))\) lying strictly above the graph \(D\) and must therefore lie entirely above \(D\); the assertion follows. \(\Box\)
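The term-wise comparison in the preceding proof is also easy to confirm numerically; the following snippet (purely illustrative) checks \(a_{k}<b_{k}\) for the first few indices:

```python
from math import e, factorial, log

# Check a_k < b_k, k >= 2, from the proof of Lemma 3.4 (illustrative only)
for k in range(2, 15):
    a_k = log(factorial(k)) / (e * factorial(k))
    b_k = log(k) / 2 ** k
    assert a_k < b_k, k
# consequently -sum a_k = pi_2((1,1)) exceeds -sum b_k = d(1)
```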
We arrive at the main result of this section stating a simple and useful characterization.
Theorem 3.5. Let \(x_{(n)}-x_{(1)}\geq 2\) and \(d\) be given by formula (5). Moreover, let
\[\overline{t}_{n}\;=\;-\,\frac{1}{n}\sum_{i=1}^{n}\ln(x_{i}!)\,.\]

(i) If \(\overline{t}_{n}\leq d(\overline{x}_{n})\), then the ML estimate of \(\boldsymbol{\vartheta}\) lies on the boundary of \(\Theta\) and is given by \(\hat{\boldsymbol{\vartheta}}=(\overline{x}_{n}/(\overline{x}_{n}+1),0)\).

(ii) If \(\overline{t}_{n}>d(\overline{x}_{n})\), then the ML estimate of \(\boldsymbol{\vartheta}\) lies in the interior of \(\Theta\) and is the unique solution of the likelihood equation (3).
Proof. By Theorem 2.1, existence of the ML estimate \(\hat{\boldsymbol{\vartheta}}\) of \(\boldsymbol{\vartheta}\) based on \(x_{1},\dots,x_{n}\) is guaranteed. If \(\overline{t}_{n}\leq d(\overline{x}_{n})\), it follows from Lemma 3.4 that \((\overline{x}_{n},\overline{t}_{n})\notin\boldsymbol{\pi}(\textrm{int}(\Theta))\), such that a solution of the likelihood equation (3) with respect to \(\boldsymbol{\vartheta}\in(0,\infty)^{2}\) does not exist. Hence, \(\hat{\boldsymbol{\vartheta}}\) necessarily lies on the boundary of \(\Theta\) and must then be given by \(\hat{\boldsymbol{\vartheta}}=(\overline{x}_{n}/(\overline{x}_{n}+1),0)\).
Now, let \(\overline{t}_{n}>d(\overline{x}_{n})\). Suppose that the ML estimate \(\hat{\boldsymbol{\vartheta}}\) lies on the boundary of \(\Theta\). Then, it necessarily holds that \(\hat{\boldsymbol{\vartheta}}=(\hat{\lambda},0)\) with \(\hat{\lambda}=\overline{x}_{n}/(\overline{x}_{n}+1)\). Note that the log-likelihood function based on \(\mathbf{x}=(x_{1},\dots,x_{n})\) is given by
\[\ell_{n}(\boldsymbol{\vartheta};\mathbf{x})\;=\;n\bigl(\ln(C(\boldsymbol{\vartheta}))+\overline{x}_{n}\ln(\lambda)+\nu\,\overline{t}_{n}\bigr)\,,\qquad\boldsymbol{\vartheta}=(\lambda,\nu)\in\Theta\,.\]
Let \(\hat{\boldsymbol{\vartheta}}_{\nu}=(\hat{\lambda},\nu)\) and \(q(\nu)=\ell_{n}(\hat{\boldsymbol{\vartheta}}_{\nu};\mathbf{x})/n\) for \(\nu\geq 0\). Since \(\ln(C(\boldsymbol{\vartheta}))=-\ln(\sum_{k=0}^{\infty}\lambda^{k}/k!^{\nu})\) is differentiable on \((0,\infty)^{2}\) and by using formula (4), it follows that
\[q'(\nu)\;=\;\frac{\partial}{\partial\nu}\ln(C(\hat{\boldsymbol{\vartheta}}_{\nu}))+\overline{t}_{n}\;=\;\overline{t}_{n}-\pi_{2}(\hat{\boldsymbol{\vartheta}}_{\nu})\,,\qquad\nu>0\,.\]
According to Lemma 3.1, we have \(\boldsymbol{\pi}(\hat{\boldsymbol{\vartheta}}_{\nu})\rightarrow\boldsymbol{\pi}(\hat{\boldsymbol{\vartheta}})\) for \(\nu\searrow 0\) and, hence,
\[\lim_{\nu\searrow 0}q'(\nu)\;=\;\overline{t}_{n}-\pi_{2}(\hat{\boldsymbol{\vartheta}})\;=\;\overline{t}_{n}-d(\overline{x}_{n})\;>\;0\qquad(8)\]
by using formula (4), formula (7) with \(\lambda=\hat{\lambda}\), and formula (5) with \(z=\overline{x}_{n}\). Since \(C(\hat{\boldsymbol{\vartheta}}_{\nu})\rightarrow C(\hat{\boldsymbol{\vartheta}})\) by Lemma 3.1, we also have \(\ell_{n}(\hat{\boldsymbol{\vartheta}}_{\nu};\mathbf{x})\rightarrow\ell_{n}(\hat{\boldsymbol{\vartheta}};\mathbf{x})\) for \(\nu\searrow 0\), and formula (8) then implies the existence of some small \(\nu>0\) with \(\ell_{n}(\hat{\boldsymbol{\vartheta}}_{\nu};\mathbf{x})>\ell_{n}(\hat{\boldsymbol{\vartheta}};\mathbf{x})\), which contradicts \(\hat{\boldsymbol{\vartheta}}\) being the ML estimate. \(\Box\)
The usefulness of Theorem 2.1 and Theorem 3.5 is demonstrated by means of an example.
Example 3.6. We consider three data sets of size \(n=7\), namely
Applying Theorem 2.1, the ML estimates of \(\boldsymbol{\vartheta}\) based on \(\mathbf{x}^{(1)}\) and on \(\mathbf{x}^{(2)}\) uniquely exist, while the ML estimate based on \(\mathbf{x}^{(3)}\) does not exist. The arithmetic mean of the observations in \(\mathbf{x}^{(1)}\) and in \(\mathbf{x}^{(2)}\) is each equal to 1, and we have
Since
and
applying Theorem 3.5 yields that the ML estimate of \(\boldsymbol{\vartheta}\) based on \(\mathbf{x}^{(1)}\) lies in the interior of \(\Theta\) and is the unique solution of the likelihood equation, while the ML estimate of \(\boldsymbol{\vartheta}\) based on \(\mathbf{x}^{(2)}\) lies on the boundary of \(\Theta\) and is given by \(\hat{\boldsymbol{\vartheta}}=(1/2,0)\).
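In practice, the checks of Theorem 2.1 and Theorem 3.5 amount to a few lines of code. The following Python sketch (our illustration; the function name, return conventions, and sample data are hypothetical) classifies a data set accordingly:

```python
from math import lgamma, log

def classify_ml(x, K=10_000):
    """Locate the ML estimate for data x via Theorems 2.1 and 3.5.

    Returns 'nonexistent', ('boundary', (lambda_hat, 0.0)), or
    'interior' (the likelihood equation (3) must then be solved
    numerically). Name and return conventions are ours.
    """
    if max(x) - min(x) < 2:                       # Theorem 2.1
        return "nonexistent"
    n = len(x)
    xbar = sum(x) / n
    tbar = -sum(lgamma(v + 1) for v in x) / n     # mean of -ln(x_i!)
    q = xbar / (1.0 + xbar)
    d_xbar = -sum(log(k) * q ** k for k in range(2, K))  # formula (5)
    if tbar <= d_xbar:                            # Theorem 3.5(i)
        return ("boundary", (q, 0.0))
    return "interior"                             # Theorem 3.5(ii)

print(classify_ml([0, 1, 1, 1, 1, 2, 3]))  # hypothetical sample -> 'interior'
```

For this hypothetical sample, \(\overline{x}_{7}=9/7\) and \(\overline{t}_{7}\approx-0.355>d(9/7)\approx-0.79\), so case (ii) of Theorem 3.5 applies.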
The conditions \(\overline{t}_{n}\leq d(\overline{x}_{n})\) and \(\overline{t}_{n}>d(\overline{x}_{n})\) in Theorem 3.5 are equivalent to \(\overline{t}_{n}\leq\tilde{d}(\overline{x}_{n}/(\overline{x}_{n}+1))\) and \(\overline{t}_{n}>\tilde{d}(\overline{x}_{n}/(\overline{x}_{n}+1))\), respectively, with mapping
\[\tilde{d}(\lambda)\;=\;-\sum_{k=2}^{\infty}\ln(k)\,\lambda^{k}\,,\qquad\lambda\in(0,1)\,,\]
the graph of which is depicted in Fig. 1 to ease the comparison of the values \(\overline{t}_{n}\) and \(\tilde{d}(\overline{x}_{n}/(\overline{x}_{n}+1))\) for a given data set.
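A rough version of such a figure can be generated by truncating the series defining \(\tilde{d}\); a minimal matplotlib sketch (grid and truncation point are our choices):

```python
import numpy as np
import matplotlib.pyplot as plt

lam = np.linspace(0.01, 0.95, 400)
d_tilde = -sum(np.log(k) * lam ** k for k in range(2, 5000))  # truncated series

plt.plot(lam, d_tilde)
plt.xlabel(r"$\overline{x}_n/(\overline{x}_n+1)$")
plt.ylabel(r"$\tilde{d}$")
plt.title("Decision curve of Theorem 3.5 (cf. Fig. 1)")
plt.show()
```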
4 SIMULATION STUDY
We perform a simulation study to investigate the accuracy of the ML estimates in dependence on the range of the underlying observations. Let \(\boldsymbol{\vartheta}_{1}=(\lambda_{1},\nu_{1})=(7.24,0.8)\) and \(\boldsymbol{\vartheta}_{2}=(\lambda_{2},\nu_{2})=(293\,162.5,5)\) be two parameter vectors corresponding to an overdispersed CMP distribution \(P_{\boldsymbol{\vartheta}_{1}}\) and an underdispersed CMP distribution \(P_{\boldsymbol{\vartheta}_{2}}\). Here, for \(i=1,2\), the parameter \(\lambda_{i}\) is determined in dependence on \(\nu_{i}\) as \((12+(\nu_{i}-1)/(2\nu_{i}))^{\nu_{i}}\) to ensure that the mean of \(P_{\boldsymbol{\vartheta}_{i}}\) is approximately equal to 12; see, e.g., [22, formula (7)]. For each parameter vector, we generate \(m=100\,000\) realizations of a sample of size \(n=20\) and compute the corresponding ML estimates, all of which exist by Theorem 2.1, since the observed range of every sample turns out to be greater than 1 (for the parameter vectors considered, the probability of non-existence of the ML estimate is very small). The ML estimates thus obtained are then grouped with respect to the range of the underlying observations and can be considered realizations of the ML estimator conditioned on the range \(X_{(20)}-X_{(1)}\). For every such group consisting of the ML estimates \((\hat{\lambda}_{i}^{(j)},\hat{\nu}_{i}^{(j)})\), \(1\leq j\leq k\), say, we separately calculate the (empirical) relative absolute bias
\[\frac{1}{\lambda_{i}}\,\biggl|\frac{1}{k}\sum_{j=1}^{k}\hat{\lambda}_{i}^{(j)}-\lambda_{i}\biggr|\]
and the scaled (empirical) root-mean-square error
\[\frac{1}{\lambda_{i}}\,\biggl(\frac{1}{k}\sum_{j=1}^{k}\bigl(\hat{\lambda}_{i}^{(j)}-\lambda_{i}\bigr)^{2}\biggr)^{1/2}\]
as well as the respective quantities for the dispersion parameter \(\nu_{i}\), \(i=1,2\). The results are shown in Tables 1 and 2 along with the relative frequency of every group. All accuracy measures turn out to be monotone, either decreasing or increasing, in the range of the observations. In the overdispersed case, the ML estimates of \(\lambda_{1}\) and \(\nu_{1}\) are most inaccurate when the range is small, while a large range appears to be less problematic. This finding also applies to the ML estimate of \(\lambda_{2}\) in the underdispersed case. The precision of the ML estimate of \(\nu_{2}\), in turn, appears not to be affected by a small range of observations. Having shown in Theorem 2.1 that an ML estimate does not exist for a range of 0 or 1, the simulation study additionally suggests that, for small ranges greater than 1, ML estimation may produce highly inaccurate values. Although the probability of non-existence of the ML estimate will typically be small in applications, observing a small range greater than 1 might well occur, in which case the ML estimate should be interpreted with caution.
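A reduced version of this experiment can be sketched as follows; sampling is by inversion from the truncated pmf, and the seed, truncation points, helper names, and the number of replications are our illustrative choices (the study above uses \(m=100\,000\) replications and an ML routine as sketched at the end of Section 2):

```python
import numpy as np
from math import lgamma

rng = np.random.default_rng(12345)  # illustrative seed

def cmp_sample(lam, nu, size, K=200):
    """Draw CMP variates from the pmf (1), with the normalizing
    series (2) truncated at K terms (an illustrative choice)."""
    k = np.arange(K)
    log_w = k * np.log(lam) - nu * np.array([lgamma(j + 1) for j in range(K)])
    p = np.exp(log_w - log_w.max())
    p /= p.sum()
    return rng.choice(k, size=size, p=p)

def rel_abs_bias(estimates, true):
    """Empirical relative absolute bias |mean - true| / true."""
    return abs(np.mean(estimates) - true) / true

def scaled_rmse(estimates, true):
    """Empirical root-mean-square error scaled by the true value."""
    return np.sqrt(np.mean((np.asarray(estimates) - true) ** 2)) / true

# Group a reduced number of samples by the observed range; the ML
# estimate per sample would be computed as in the Section 2 sketch.
ranges = [np.ptp(cmp_sample(7.24, 0.8, 20)) for _ in range(1000)]
vals, counts = np.unique(ranges, return_counts=True)
print(dict(zip(vals.tolist(), (counts / 1000).tolist())))  # range frequencies
```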
REFERENCES
1. O. A. Adeoti, J.-C. Malela-Majika, S. C. Shongwe, and M. Aslam, "A homogeneously weighted moving average control chart for Conway–Maxwell Poisson distribution," Journal of Applied Statistics 49 (12), 3090–3119 (2022).
2. N. Balakrishnan, S. Barui, and F. S. Milienos, "Piecewise linear approximations of baseline under proportional hazards based COM-Poisson cure models," Communications in Statistics—Simulation and Computation (2022).
3. O. Barndorff-Nielsen, Information and Exponential Families in Statistical Theory (Wiley, Chichester, 2014).
4. S. Bedbur and U. Kamps, "Uniformly most powerful unbiased tests for the dispersion parameter of the Conway–Maxwell–Poisson distribution," Statistics and Probability Letters 196, 109801 (2023).
5. S. Bedbur, U. Kamps, and A. Imm, "On the existence of maximum likelihood estimates for the parameters of the Conway–Maxwell–Poisson distribution," ALEA—Latin American Journal of Probability and Mathematical Statistics 20, 561–577 (2023).
6. A. Benson and N. Friel, "Bayesian inference, model selection and likelihood estimation using fast rejection sampling: The Conway–Maxwell–Poisson distribution," Bayesian Analysis 16 (3), 905–931 (2021).
7. Ch. Chanialidis, L. Evers, T. Neocleous, and A. Nobile, "Efficient Bayesian inference for COM-Poisson regression models," Statistics and Computing 28, 595–608 (2018).
8. S. B. Chatla and G. Shmueli, "A tree-based semi-varying coefficient model for the COM-Poisson distribution," Journal of Computational and Graphical Statistics 29 (4), 827–846 (2020).
9. R. W. Conway and W. L. Maxwell, "A queuing model with state dependent service rates," Journal of Industrial Engineering 12, 132–136 (1962).
10. F. Daly and R. E. Gaunt, "The Conway–Maxwell–Poisson distribution: Distributional theory and approximation," ALEA—Latin American Journal of Probability and Mathematical Statistics 13, 635–658 (2016).
11. A. Huang and A. S. I. Kim, "Bayesian Conway–Maxwell–Poisson regression models for overdispersed and underdispersed counts," Communications in Statistics—Theory and Methods 50 (13), 3094–3105 (2021).
12. T. Kang, S. M. Levy, and S. Datta, "Analyzing longitudinal clustered count data with zero inflation: Marginal modeling using the Conway–Maxwell–Poisson distribution," Biometrical Journal 63 (4), 761–786 (2021).
13. C. C. Kokonendji, D. Mizère, and N. Balakrishnan, "Connections of the Poisson weight function to overdispersion and underdispersion," Journal of Statistical Planning and Inference 138 (5), 1287–1296 (2008).
14. S. Kotz, N. Balakrishnan, and N. L. Johnson, Continuous Multivariate Distributions, Models, and Applications (Wiley, New York, 2nd ed., 2000).
15. P. N. Krivitsky, "Exponential-family random graph models for valued networks," Electronic Journal of Statistics 6, 1100–1128 (2012).
16. U. Mammadova and M. R. Özkale, "Conway–Maxwell Poisson regression-based control charts under iterative Liu estimator for monitoring count data," Applied Stochastic Models in Business and Industry 38 (4), 695–725 (2022).
17. M. S. Melo and A. P. Alencar, "Conway–Maxwell–Poisson seasonal autoregressive moving average model," Journal of Statistical Computation and Simulation 92 (2), 283–299 (2022).
18. S. H. Ong, R. C. Gupta, T. Ma, and S. Z. Sim, "Bivariate Conway–Maxwell Poisson distributions with given marginals and correlation," Journal of Statistical Theory and Practice 15, 10 (2021).
19. L. S. C. Piancastelli, N. Friel, W. Barreto-Souza, and H. Ombao, "Multivariate Conway–Maxwell–Poisson distribution: Sarmanov method and doubly intractable Bayesian inference," Journal of Computational and Graphical Statistics 32 (2), 483–500 (2023).
20. K. F. Sellers, Sh. Borle, and G. Shmueli, "The COM-Poisson model for count data: A survey on methods and applications," Applied Stochastic Models in Business and Industry 28 (2), 104–116 (2012).
21. K. F. Sellers, A. W. Swift, and K. S. Weems, "A flexible distribution class for count data," Journal of Statistical Distributions and Applications 4, 22 (2017).
22. G. Shmueli, T. P. Minka, J. B. Kadane, Sh. Borle, and P. Boatwright, "A useful distribution for fitting discrete data: Revival of the Conway–Maxwell–Poisson distribution," Journal of the Royal Statistical Society: Series C (Applied Statistics) 54 (1), 127–142 (2005).
Funding
This work was supported by ongoing institutional funding. No additional grants to carry out or direct this particular research were obtained.
Ethics declarations
The authors of this work declare that they have no conflicts of interest.