1 Introduction

The construction of a bivariate distribution with specified marginals and correlation has been a challenging problem since the early twentieth century. There is much interest in this problem because of its wide-ranging applications. For instance, Ong [22] considered computer sampling of some bivariate discrete distributions with given marginals and correlation. Recently, Lin et al. [21] gave a good survey of this topic, complementing previous surveys. A comprehensive overview is found in the monograph by Balakrishnan and Lai [2].

A simple method to formulate a bivariate distribution with fixed marginals and varying correlation is the well-known Farlie–Gumbel–Morgenstern (FGM) family of distributions defined by

$$H\left( {x,y} \right) = F\left( x \right)G\left( y \right)\left\{ {1 + \beta \bar{F} \bar{G}} \right\}$$
(1.1)

where \(F\left( x \right)\) and \(G\left( y \right)\) are the cumulative distribution functions, \(\bar{F} = 1 - F\) and \(\bar{G} = 1 - G\), and \(\beta \in \left[ { - 1, 1} \right]\) is the parameter that controls the correlation. Schucany et al. [25] showed that the FGM family (1.1) has a correlation coefficient restricted to the interval \(\left( { - 1/3, 1/3} \right)\). Various researchers ([14,15,16, 20]; see also Balakrishnan and Lai [2]) have advocated methods to overcome this drawback of the FGM family.

Sarmanov [23] introduced a family of distributions with better flexibility, and this family includes the FGM family as a particular case. The Sarmanov family is defined by

$$h\left( {x,y} \right) = f\left( x \right)g\left( y \right)\left\{ {1 + \beta \phi_{1} \left( x \right)\phi_{2} \left( y \right)} \right\},\;\;x, y \in \mathbb{R},$$
(1.2)

where \(f\left( x \right)\) and \(g\left( y \right)\) are the probability density functions (pdf) of \(F\left( x \right)\) and \(G\left( y \right)\), respectively, and \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\) are measurable functions [21] (also known as mixing functions) satisfying the conditions

$$E\left[ {\phi_{1} \left( X \right)} \right] = 0,\;\;E\left[ {\phi_{2} \left( Y \right)} \right] = 0\;\;{\text{and}}\;\;1 + \beta \phi_{1} \left( x \right)\phi_{2} \left( y \right) \ge 0.$$

If \(\phi_{1} \left( x \right) = 1 - 2F\) and \(\phi_{2} \left( y \right) = 1 - 2G\), then (1.2) reduces to (1.1). Lee [19] and Shubina and Lee [28] have made a detailed study of the Sarmanov family, and in the literature the family (1.2) is referred to as the Sarmanov–Lee family. It offers a vastly improved range of correlation [21]; for example, the maximum correlation for the bivariate distribution with uniform marginals is 3/4, as opposed to 1/3 for the FGM distribution (1.1). Different bivariate distributions with given marginals are constructed by choosing different mixing functions \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\); Lee [19] has given some examples of such choices. In this paper, we give a general method of constructing bivariate generalizations of weighted discrete distributions by considering a particularly simple choice of \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\).

The objective of this paper is to propose bivariate extensions of the univariate Conway–Maxwell–Poisson (CMP) distribution whose marginals are CMP. Recently, Sellers et al. [24] considered a bivariate CMP (BCMP) distribution whose marginal distributions are not CMP.

We construct BCMP distributions with CMP marginals and range of correlation over \(\left[ { - 1, 1} \right]\) by using two instances of (1.2). The first is a general method for weighted distributions. The second bivariate distribution is constructed by using \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\) based on the probability generating function. The proposed BCMP distributions include as a special case a bivariate Poisson distribution which has correlation in \(\left[ { - 1, 1} \right]\); by contrast, Holgate’s [12] bivariate Poisson distribution, constructed via a common random element, attains positive correlation only.

The CMP distribution generalizes the Poisson distribution by allowing for over-dispersion \(\left( {\nu < 1} \right)\) or under-dispersion \(\left( {\nu > 1} \right).\) Its probability mass function (pmf) is given by

$$P(X = x) = \frac{{\lambda^{x} }}{{(x!)^{\nu } }}\frac{1}{Z(\lambda ,\nu )},\;x = 0,1,2, \ldots$$
(1.3)

where

$$Z(\lambda ,\nu ) = \sum\limits_{j = 0}^{\infty } {\frac{{\lambda^{j} }}{{(j!)^{\nu } }}} ,\;\lambda > 0,\;\nu \ge 0$$
(1.4)

is the normalizing constant. The mean and variance of the CMP distribution have the following approximations [27]

$$E\left[ X \right] \approx \lambda^{1/\nu } - \frac{\nu - 1}{2\nu },{\text{var}}\left( X \right) \approx \frac{1}{\nu }\lambda^{1/\nu }$$
(1.5)
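As a quick numerical illustration, the sketch below (Python; the truncation point of the infinite series and the parameter values are assumptions for the example, not part of the model) computes the exact mean and variance of the CMP distribution from (1.3) and (1.4) and compares them with the approximations in (1.5), which are accurate when \(\lambda > 10^{\nu }\).

```python
import numpy as np
from scipy.special import gammaln

def cmp_pmf(lam, nu, x_max=200):
    """Truncated CMP pmf (1.3); x_max is an assumed truncation point
    for the infinite series defining Z in (1.4)."""
    x = np.arange(x_max + 1)
    log_w = x * np.log(lam) - nu * gammaln(x + 1)  # log of lambda^x / (x!)^nu
    w = np.exp(log_w - log_w.max())                # stabilized weights
    return w / w.sum()                             # normalization plays the role of Z

lam, nu = 6.0, 0.7                                 # illustrative over-dispersed case
p = cmp_pmf(lam, nu)
x = np.arange(p.size)
mean_exact = np.sum(x * p)
var_exact = np.sum(x**2 * p) - mean_exact**2

mean_approx = lam**(1 / nu) - (nu - 1) / (2 * nu)  # approximation (1.5)
var_approx = lam**(1 / nu) / nu

print(mean_exact, mean_approx)  # close here, since lam = 6 > 10**0.7 ~ 5.01
print(var_exact, var_approx)
```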

The CMP distribution may be regarded as a weighted Poisson distribution with pmf

$$P(X = x) = \frac{{e^{ - \lambda } \lambda^{x} }}{x!}\frac{{(x!)^{1 - \nu } }}{W(\lambda ,\nu )},\;x = 0,1,2, \ldots$$
(1.6)

where \(W(\lambda ,\,\nu )\) is the normalizing constant.

The CMP distribution is also appealing from a theoretical point of view because it belongs to the class of two-parameter power series distributions [27]. As a consequence, sufficient statistics and other elegant properties may be derived. Some of these have recently been investigated by Gupta et al. [10]. Kadane et al. [17] investigated the number of solutions which give rise to the same sufficient statistics.

The paper is organized as follows. Section 2 defines the proposed BCMP distributions and Sect. 3 discusses some dependence properties. The statistical analyses concerning parameter estimation and tests of hypotheses of independence and adequacy are presented in Sect. 4. Section 4 also contains simulation studies of the power of the Rao score and likelihood ratio tests. Section 5 illustrates an application to a real data set. Some concluding remarks are given in Sect. 6.

2 BCMP Distribution and Properties

2.1 Bivariate Discrete Distributions Based on Sarmanov–Lee Family

In this section, we present two bivariate distributions by choosing different mixing functions \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\).

2.1.1 Bivariate Weighted Discrete Distributions

We propose a general method for constructing Sarmanov-type bivariate distributions for weighted distributions. Consider a discrete distribution with pmf \(p\left( x \right)\) and mean \(\mu\). The weighted distribution for \(p\left( x \right)\) has pmf

$$P(X = x) = p\left( x \right)\frac{w\left( x \right)}{W},\;x = 0,1,2, \ldots$$

where \(w\left( x \right)\) is the weight and \(W\) is the normalizing constant. We make use of the simpler \(p\left( x \right)\) to construct the mixing functions. Let \(\alpha\) be a positive real number and set

$$\phi_{1} \left( x \right) = p^{\alpha } \left( x \right) - E\left[ {p^{\alpha } \left( X \right)} \right]$$

and

$$\phi_{2} \left( y \right) = p^{\alpha } \left( y \right) - E\left[ {p^{\alpha } \left( Y \right)} \right].$$

The expectation \(E\left[ {p^{\alpha } \left( X \right)} \right]\) is taken with respect to the weighted pmf \(P(X = x)\). A bivariate distribution with joint pmf based on (1.2) is given by

$$\begin{aligned} & P\left( {X = x,Y = y} \right) \\ & \quad = P\left( {X = x} \right)P\left( {Y = y} \right)\left\{ {1 + \beta \left( {p^{\alpha } \left( x \right) - E\left[ {p^{\alpha } \left( X \right)} \right]} \right)\left( {p^{\alpha } \left( y \right) - E\left[ {p^{\alpha } \left( Y \right)} \right]} \right)} \right\},\;\;\beta \in \left[ { - 1, 1} \right] \\ \end{aligned}$$
(2.1)
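The construction (2.1) is easy to check numerically. The sketch below (Python; the Poisson base pmf, the CMP-type weight \((x!)^{1-\nu}\) from (1.6), the support truncation and the parameter values are all illustrative assumptions) verifies that the joint pmf sums to one, is nonnegative and reproduces the weighted marginals.

```python
import numpy as np
from scipy.stats import poisson
from scipy.special import gammaln

def weighted_pmf(p, w):
    """Weighted pmf p(x)w(x)/W on a truncated support."""
    q = p * w
    return q / q.sum()

def sarmanov_joint(p1, P1, p2, P2, alpha, beta):
    """Joint pmf (2.1); p1, p2 are base pmfs, P1, P2 the weighted marginals."""
    phi1 = p1**alpha - np.sum(p1**alpha * P1)  # E[p^alpha(X)] w.r.t. the weighted pmf
    phi2 = p2**alpha - np.sum(p2**alpha * P2)
    return P1[:, None] * P2[None, :] * (1.0 + beta * np.outer(phi1, phi2))

x = np.arange(60)
lam1, nu1, lam2, nu2, beta = 2.0, 1.5, 1.5, 0.8, 0.8   # illustrative values
p1, p2 = poisson.pmf(x, lam1), poisson.pmf(x, lam2)
w1 = np.exp((1 - nu1) * gammaln(x + 1))                # CMP weight (x!)^(1-nu), cf. (1.6)
w2 = np.exp((1 - nu2) * gammaln(x + 1))
P1, P2 = weighted_pmf(p1, w1), weighted_pmf(p2, w2)

H = sarmanov_joint(p1, P1, p2, P2, alpha=1.0, beta=beta)
assert np.isclose(H.sum(), 1.0)
assert np.allclose(H.sum(axis=1), P1) and np.allclose(H.sum(axis=0), P2)
assert (H >= 0).all()   # beta must keep 1 + beta*phi1(x)*phi2(y) >= 0
```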

2.1.2 Bivariate Discrete Distributions Based on Probability Generating Functions

Let \(\phi_{1} \left( x \right) = \theta^{x} - G\left( \theta \right)\) and \(\phi_{2} \left( y \right) = \theta^{y} - G\left( \theta \right)\), where \(0 < \theta < 1\) and \(G\left( \theta \right)\) is the probability generating function of the marginal distribution. A bivariate distribution is defined as follows with joint pmf

$$P\left( {X = x,Y = y} \right) = P\left( {X = x} \right)P\left( {Y = y} \right)\left\{ {1 + \beta \left( {\theta^{x} - G\left( \theta \right)} \right)\left( {\theta^{y} - G\left( \theta \right)} \right)} \right\}$$
(2.2)

Note that the functions \(\phi_{i} \left( t \right),i = 1,2\), are bounded and \(\sum\nolimits_{t = 0}^{\infty } \phi_{i} \left( t \right)P\left( t \right) = 0\).
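A brief numerical sketch (Python; the Poisson marginal, its mean and \(\theta = e^{-1}\) are illustrative assumptions) confirms the moment condition and the boundedness of these mixing functions.

```python
import numpy as np
from scipy.stats import poisson

theta = np.exp(-1.0)
t = np.arange(100)                  # truncated support
P = poisson.pmf(t, 2.3)             # any discrete marginal; Poisson is illustrative
G = np.sum(theta**t * P)            # pgf G(theta) = E[theta^T]
phi = theta**t - G                  # mixing function of Sect. 2.1.2
assert np.isclose(np.sum(phi * P), 0.0)   # E[phi(T)] = 0 by construction
assert np.all(np.abs(phi) < 1.0)          # bounded, since 0 < theta < 1
```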

Let \(\xi_{i} = \sum t\phi_{i} \left( t \right)P\left( t \right)\). From Theorem 2 of Lee [19], the correlation coefficient \(\rho\) of (2.1) and (2.2) is given by

$$\rho = \frac{{\beta \xi_{1} \xi_{2} }}{{\sigma_{1} \sigma_{2} }}.$$
(2.3)

where \(\sigma_{1}^{2} ,\sigma_{2}^{2}\) are the variances.

For model (2.1) the function \(\xi_{i}\) is given by

$$\xi_{i} = \sum x\phi_{i} \left( x \right)P\left( x \right) = \sum \left( {xp^{\alpha } \left( x \right)P\left( x \right) - xP\left( x \right)E\left[ {p^{\alpha } \left( X \right)} \right]} \right) = \delta_{i} - \mu_{i} \gamma_{i} , i = 1,2$$

where \(\delta_{i} = E\left[ {Xp^{\alpha } \left( X \right)} \right]\), \(\mu_{i}\) is the mean for \(P\left( x \right)\) and \(\gamma_{i} = E\left[ {p^{\alpha } \left( X \right)} \right]\). Thus,

$$\rho = \frac{{\beta \left( {\delta_{1} - \mu_{1} \gamma_{1} } \right)\left( {\delta_{2} - \mu_{2} \gamma_{2} } \right)}}{{\sigma_{1} \sigma_{2} }}$$
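Continuing the numerical sketch given after (2.1), the covariance computed directly from the joint pmf can be compared with \(\beta \xi_{1} \xi_{2}\); a short check (Python, with \(\alpha = 1\) as before) follows.

```python
# Continues the sketch after (2.1): x, p1, p2, P1, P2, H and beta are as defined there.
mu1, mu2 = np.sum(x * P1), np.sum(x * P2)
delta1, gamma1 = np.sum(x * p1 * P1), np.sum(p1 * P1)   # delta_i, gamma_i with alpha = 1
delta2, gamma2 = np.sum(x * p2 * P2), np.sum(p2 * P2)

cov_direct = np.sum(np.outer(x, x) * H) - mu1 * mu2      # E[XY] - mu1*mu2 from the joint pmf
cov_formula = beta * (delta1 - mu1 * gamma1) * (delta2 - mu2 * gamma2)
assert np.isclose(cov_direct, cov_formula)               # i.e. beta * xi_1 * xi_2
```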

2.2 BCMP Distributions

Based on (2.1) and (2.2) we have the following BCMP distributions.

The bivariate distribution with joint pmf based on (2.1) is given by

$$P\left( {X = x,Y = y} \right) = P\left( {X = x} \right)P\left( {Y = y} \right)\left\{ {1 + \beta \left[ {p^{\alpha } \left( x \right) - E\left[ {p^{\alpha } \left( X \right)} \right]} \right]\left[ {p^{\alpha } \left( y \right) - E\left[ {p^{\alpha } \left( Y \right)} \right]} \right]} \right\}$$
(2.4)

where the CMP marginals \(P\left( {X = x} \right)\) and \(P\left( {Y = y} \right)\) are given by (1.3) with parameters \(\left( {\lambda_{1} , \nu_{1} } \right)\) and \(\left( {\lambda_{2} , \nu_{2} } \right)\), respectively. We consider the computation of \(E\left[ {p^{\alpha } \left( X \right)} \right]\) and \(E\left[ {p^{\alpha } \left( Y \right)} \right]\) by writing them in terms of CMP normalizing constants. For simplicity, we suppress the subscripts and write the parameters as \(\left( {\lambda ,\nu } \right)\).

Since \(Z(\lambda ,\,\nu ) = \sum\limits_{j = 0}^{\infty } {\frac{{\lambda^{j} }}{{(j!)^{\nu } }}}\), we express

$$E\left[ {p^{\alpha } \left( X \right)} \right] = \mathop \sum \limits_{x = 0}^{\infty } e^{ - \lambda \alpha } \frac{{\lambda^{x\alpha } }}{{\left( {x!} \right)^{\alpha } }}P\left( x \right) = \frac{{e^{ - \lambda \alpha } Z\left( {\lambda^{\alpha + 1} ,\nu + \alpha } \right)}}{{Z\left( {\lambda ,\nu } \right)}}\mathop \sum \limits_{x = 0}^{\infty } \frac{{\lambda^{{x\left( {\alpha + 1} \right)}} }}{{\left( {x!} \right)^{\nu + \alpha } Z\left( {\lambda^{\alpha + 1} ,\nu + \alpha } \right)}}$$

The sum on the right runs over the pmf \(P\left( {R = x} \right) = \frac{{\lambda^{{x\left( {\alpha + 1} \right)}} }}{{\left( {x!} \right)^{\nu + \alpha } Z\left( {\lambda^{\alpha + 1} ,\nu + \alpha } \right)}}\) of a CMP random variable \(R\) with parameters \(\left( {\lambda^{\alpha + 1} ,\nu + \alpha } \right)\) (for \(\alpha = 1\), these are \(\left( {\lambda^{2} ,\nu + 1} \right)\)) and hence equals one. That is,

$$E\left[ {p^{\alpha } \left( X \right)} \right] = \frac{{e^{ - \lambda \alpha } Z\left( {\lambda^{\alpha + 1} ,\nu + \alpha } \right)}}{{Z\left( {\lambda ,\nu } \right)}}.$$

Note that \(E\left[ {p^{\alpha}\left( X \right)} \right]\) is an infinite sum of products of Poisson and CMP probabilities, and it is easy to see that \(E\left[ {p^{\alpha}\left( X \right)} \right] \le 1\). Hence, \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\) are bounded.
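The closed form for \(E\left[ {p^{\alpha}\left( X \right)} \right]\) can be verified numerically; the sketch below (Python, with assumed truncation points and illustrative parameter values) compares the direct sum with the ratio of normalizing constants.

```python
import numpy as np
from scipy.stats import poisson
from scipy.special import gammaln

def logZ(lam, nu, j_max=300):
    """Log of the normalizing constant (1.4), truncated at an assumed j_max."""
    j = np.arange(j_max + 1)
    t = j * np.log(lam) - nu * gammaln(j + 1)
    m = t.max()
    return m + np.log(np.sum(np.exp(t - m)))

lam, nu, alpha = 1.8, 1.4, 0.6                 # illustrative values
x = np.arange(150)
P = np.exp(x * np.log(lam) - nu * gammaln(x + 1) - logZ(lam, nu))  # CMP pmf (1.3)
p = poisson.pmf(x, lam)                        # Poisson base pmf

lhs = np.sum(p**alpha * P)                     # direct computation of E[p^alpha(X)]
rhs = np.exp(-lam * alpha + logZ(lam**(alpha + 1), nu + alpha) - logZ(lam, nu))
assert np.isclose(lhs, rhs)
```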

Let \(\phi_{1} \left( x \right) = \theta^{x} - g\left( {\theta ;\lambda_{1} , \nu_{1} } \right)\) and \(\phi_{2} \left( y \right) = \theta^{y} - g\left( {\theta ;\lambda_{2} , \nu_{2} } \right)\), where \(0 < \theta < 1\) and \(g\left( {\theta ;\lambda ,\nu } \right)\) is the probability generating function of the CMP distribution given by

$$g\left( {\theta ;\lambda ,\nu } \right) = \frac{{Z\left( {\lambda \theta , \nu } \right)}}{{Z\left( {\lambda ,\nu } \right)}}.$$

The BCMP distribution corresponding to (2.2) has joint pmf defined as follows:

$$P\left( {X = x,Y = y} \right) = P\left( {X = x} \right)P\left( {Y = y} \right)\left\{ {1 + \beta\phi_{1} \left( x \right)\phi_{2} \left( y \right)} \right\},$$
(2.5)

where the CMP marginals \(P\left( {X = x} \right)\) and \(P\left( {Y = y} \right)\) are given by (1.3) with parameters \(\left( {\lambda_{1} , \nu_{1} } \right)\) and \(\left( {\lambda_{2} , \nu_{2} } \right)\), respectively.

If \(\nu_{1} = \nu_{2} = 1\), then (2.5) is the joint pmf of the bivariate Poisson distribution given in Section 6.3 of Lee [19] with \(\theta = e^{ - 1}\).

2.3 Correlation Coefficient

Let \(\mu\) and \(\sigma^{2}\) denote the mean and variance of the CMP distribution. For BCMP distributions (2.4) and (2.5), the variances \(\sigma_{1}^{2} ,\sigma_{2}^{2}\) may be approximated by (1.5).

For BCMP distribution (2.4),

$$\xi_{i} = \sum x\phi_{i} \left( x \right)P\left( x \right) = \sum \left( {x\frac{{e^{{ - \lambda_{i} }} \lambda_{i}^{x} }}{x!} - xP\left( x \right)} \right) = \lambda_{i} - \mu_{i} ,i = 1,2$$

Thus (2.4) has a very simple expression for the correlation coefficient:

$$\rho = \frac{{\beta \left( {\lambda_{1} - \mu_{1} } \right)\left( {\lambda_{2} - \mu_{2} } \right)}}{{\sigma_{1} \sigma_{2} }}$$

For (2.5),

$$\xi_{i} = \theta \frac{{\partial g_{i} \left( \theta \right)}}{\partial \theta } - \mu_{i} g_{i} \left( \theta \right),i = 1,2$$

where \(g_{i} \left( \theta \right) = g\left( {\theta ;\lambda_{i} , \nu_{i} } \right), i = 1, 2\). The correlation coefficient is given by

$$\rho = \frac{{\beta \left( {\theta \frac{{\partial g_{1} \left( \theta \right)}}{\partial \theta } - \mu_{1} g_{1} \left( \theta \right)} \right)\left( {\theta \frac{{\partial g_{2} \left( \theta \right)}}{\partial \theta } - \mu_{2} g_{2} \left( \theta \right)} \right)}}{{\sigma_{1} \sigma_{2} }}$$

For the computation of \(Z\left( {\lambda , \nu } \right)\) given by (1.4) and related quantities such as derivatives, see Section 4 of Gupta et al. [10].
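Putting the pieces together, a compact numerical sketch (Python; series truncations and parameter values are assumptions) evaluates \(\rho\) for (2.5), using exact moments computed from the series rather than the approximations in (1.5).

```python
import numpy as np
from scipy.special import gammaln

def cmp_pmf(lam, nu, x_max=200):
    """Truncated CMP pmf (1.3)."""
    x = np.arange(x_max + 1)
    log_w = x * np.log(lam) - nu * gammaln(x + 1)
    w = np.exp(log_w - log_w.max())
    return w / w.sum()

def corr_bcmp_pgf(lam1, nu1, lam2, nu2, beta, theta=np.exp(-1.0)):
    """Correlation of the BCMP model (2.5): rho = beta*xi_1*xi_2/(sigma_1*sigma_2)
    with xi_i = theta*g_i'(theta) - mu_i*g_i(theta)."""
    factors = []
    for lam, nu in ((lam1, nu1), (lam2, nu2)):
        P = cmp_pmf(lam, nu)
        x = np.arange(P.size)
        mu = np.sum(x * P)
        sigma = np.sqrt(np.sum(x**2 * P) - mu**2)
        g = np.sum(theta**x * P)         # pgf g(theta) = Z(lam*theta, nu)/Z(lam, nu)
        tgp = np.sum(x * theta**x * P)   # theta * dg/dtheta = E[X theta^X]
        factors.append((tgp - mu * g) / sigma)
    return beta * factors[0] * factors[1]

print(corr_bcmp_pgf(0.9, 2.0, 0.7, 3.0, beta=1.0))   # illustrative parameter values
```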

3 Dependence Properties of the BCMP Distribution

The most common measures for determining the relationship between two variables are the Pearson correlation coefficient, Kendall’s tau and Spearman’s rho. As a generalization of Pearson’s correlation coefficient, Bjerve and Doksum [3], Doksum et al. [7] and Blyth [4,5,6] introduced and discussed the correlation curve. The correlation curve is a local measure of the strength of association between the two variables X and Y. The correlation curve \(\rho \left( x \right)\), a function of x, describes how the amount of variance explained by a regression curve varies locally. However, \(\rho \left( x \right)\) does not treat X and Y on an equal footing, but requires Y to be a response and X a predictor variable. The correlation curve is thus a regression concept.

A local dependence function, a function of x and y, should measure the strength and direction of association locally, treating both variables symmetrically. For a bivariate distribution, it is defined as follows:

An \(r \times c\) contingency table with cell probabilities \(p_{i,j}\) specifies the joint distribution of two discrete random variables X and Y as

$$p_{i,j} = P\left( {X = i,Y = j} \right),\;1 \le i \le r,\;1 \le j \le c$$
(3.1)

The two marginal distributions for X and Y are

$$P\left( {X = i} \right) = \mathop \sum \limits_{j = 1}^{c} p_{i,j} = p_{i + } \;{\text{and}}\;P\left( {Y = j} \right) = \mathop \sum \limits_{i = 1}^{r} p_{i,j} = p_{ + j}$$

respectively. Yule and Kendall [30] and Goodman [9] suggested the following set of local cross product ratios

$$\alpha_{i,j} = \frac{{p_{i,j} p_{i + 1,j + 1} }}{{p_{i,j + 1} p_{i + 1,j} }},\;1 \le i \le r - 1,\;1 \le j \le c - 1$$
(3.2)

Equation (3.2) defines the local dependence function. Also, let \(\gamma_{i,j} = \ln \alpha_{i,j}\). Both \(\alpha_{i,j}\) and \(\gamma_{i,j}\) measure the association of the \(2 \times 2\) tables formed from adjacent rows and adjacent columns. It is known that the set \(\left\{ {\alpha_{i,j} } \right\}\), or equivalently \(\left\{ {\gamma_{i,j} } \right\}\), together with the marginal probability distributions uniquely determines the bivariate distribution. For more explanation, see Wang [29] and Holland and Wang [13].

We now present a very important property of the local dependence function. It is stated in terms of the totally positive of order 2 (TP2) and reverse regular of order 2 (RR2) properties defined below.

Definition

A discrete bivariate distribution \(P\left( {X = i,Y = j} \right)\) is said to be TP2 (RR2) if for \(a_{1} < a_{2}\), \(b_{1} < b_{2}\)

$$\left| {\begin{array}{*{20}c} {p(a_{1} ,b_{1} )} & {p(a_{1} ,b_{2} )} \\ {p(a_{2} ,b_{1} )} & {p(a_{2} ,b_{2} )} \\ \end{array} } \right| \ge ( \le )0$$
(3.3)

where \(p\left( {a_{i} ,b_{j} } \right) = P\left( {X = a_{i} ,Y = b_{j} } \right)\). It can be easily verified that the TP2 (RR2) property is equivalent to \(\alpha_{i,j} \ge \left( \le \right)1\), where \(\alpha_{i,j}\) is the local dependence function. This is also equivalent to \(\gamma_{i,j} \ge \left( \le \right)0\).
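For a joint pmf stored as a two-dimensional array with strictly positive entries, both the local dependence function (3.2) and the TP2 check take one line each; a sketch (Python) follows.

```python
import numpy as np

def local_dependence(P):
    """Local cross product ratios (3.2) of a joint pmf P given as a 2-D array
    with strictly positive entries."""
    return (P[:-1, :-1] * P[1:, 1:]) / (P[:-1, 1:] * P[1:, :-1])

def is_tp2(P, tol=1e-12):
    """TP2 holds iff all local cross product ratios are >= 1, cf. (3.3)."""
    return bool(np.all(local_dependence(P) >= 1.0 - tol))
```

Applied to a joint pmf built from the Sarmanov family, this check should agree with the sign conditions on \(\omega\) derived below.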

We now obtain the local dependence function for the Sarmanov family of discrete distributions.

3.1 Local Dependence Function for Sarmanov Family of Discrete Distributions

For this family (writing \(\omega\) for the dependence parameter \(\beta\) in (1.2)),

$$\alpha_{ij} = \frac{{\left[ {1 + \omega \phi_{1} (x)\phi_{2} (y)} \right]\left[ {1 + \omega \phi_{1} (x + 1)\phi_{2} (y + 1)} \right]}}{{\left[ {1 + \omega \phi_{1} (x)\phi_{2} (y + 1)} \right]\left[ {1 + \omega \phi_{1} (x + 1)\phi_{2} (y)} \right]}}$$
(3.4)

So \(\alpha_{i,j} \ge ( \le )1\) is equivalent to

$$\omega \left[ {\phi_{2} \left( {y + 1} \right) - \phi_{2} \left( y \right)} \right]\left[ {\phi_{1} \left( {x + 1} \right) - \phi_{1} \left( x \right)} \right] \ge \left( \le \right)0$$
(3.5)

Thus, the Sarmanov family is TP2 if

(a) \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\) are both increasing or both decreasing, and

(b) \(\omega > 0\).

Similarly, the Sarmanov family is RR2 if

(a) \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\) are both increasing or both decreasing, and

(b) \(\omega < 0\).

Let us now obtain such conditions for the BCMP distribution.

We have

$$\phi_{1} \left( x \right) = \theta^{x} - g_{1} \left( \theta \right),0 < \theta < 1.$$

This gives

$$\begin{aligned} \phi_{1} \left( {x + 1} \right) - \phi_{1} \left( x \right) & = \left[ {\theta^{x + 1} - g_{1} \left( \theta \right)} \right] - \left[ {\theta^{x} - g_{1} \left( \theta \right)} \right] \\ & = \theta^{x} \left( {\theta - 1} \right) < 0,\;0 < \theta < 1 \\ \end{aligned}$$
(3.6)

Hence, \(\phi_{1} \left( x \right)\) is decreasing. Similarly, \(\phi_{2} \left( y \right)\) is decreasing.

Hence, the BCMP distribution (2.5) is TP2 if and only if \(\omega > 0\).

Note that the TP2 condition is the same as the condition for positive dependence notion that Lehmann [18] called positive likelihood ratio dependence. This notion leads naturally to the order described below.

3.2 Positively Likelihood Ratio Dependent Ordering

Let \(\left( {X_{1} ,X_{2} } \right)\) and \(\left( {Y_{1} ,Y_{2} } \right)\) be two bivariate random vectors having the same marginals. Then, we say that \(\left( {X_{1} ,X_{2} } \right)\) is smaller than \(\left( {Y_{1} ,Y_{2} } \right)\) in the positively likelihood ratio dependent (PLRD) order, denoted by

$$\left( {X_{1} ,X_{2} } \right) \le_{\text{PLRD}} \left( {Y_{1} ,Y_{2} } \right)$$

if

$$f\left( {x_{1} ,y_{1} } \right)f\left( {x_{2} ,y_{2} } \right)g\left( {x_{1} ,y_{2} } \right)g\left( {x_{2} ,y_{1} } \right) \le f\left( {x_{1} ,y_{2} } \right)f\left( {x_{2} ,y_{1} } \right)g\left( {x_{1} ,y_{1} } \right)g\left( {x_{2} ,y_{2} } \right), x_{1} \le x_{2} ,y_{1} \le y_{2}$$

where F and G have (continuous or discrete) densities f and g; see [26].

It can be easily seen that

$$\left( {X_{1} ,X_{2} } \right) \le_{PLRD} \left( {Y_{1} ,Y_{2} } \right)$$

is equivalent to

$$\alpha_{{\left( {X_{1} ,X_{2} } \right)}} \le \alpha_{{\left( {Y_{1} ,Y_{2} } \right)}}$$

where \(\alpha_{{\left( {X_{1} ,X_{2} } \right)}}\) and \(\alpha_{{\left( {Y_{1} ,Y_{2} } \right)}}\) are the local dependence functions for \(\left( {X_{1} ,X_{2} } \right)\) and \(\left( {Y_{1} ,Y_{2} } \right)\), respectively.

We shall now compare two Sarmanov families with parameters \(\omega_{1}\) and \(\omega_{2}\).

After tedious algebra, it can be verified that

$$\begin{aligned} & \alpha_{{\left( {X_{1} ,X_{2} } \right)}} - \alpha_{{\left( {Y_{1} ,Y_{2} } \right)}} \\ & \quad = \left( {\omega_{1} - \omega_{2} } \right) \left[ {\phi_{1} \left( {x + 1} \right) - \phi_{1} \left( x \right)} \right]\left[ {\phi_{2} \left( {y + 1} \right) - \phi_{2} \left( y \right)} \right] \times \left[ {1 - \omega_{1} \omega_{2} \phi_{1} \left( x \right)\phi_{2} \left( y \right)\phi_{1} \left( {x + 1} \right)\phi_{2} \left( {y + 1} \right)} \right] \\ \end{aligned}$$

Now assume

(a) \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\) are both increasing or both decreasing, and

(b)
$$\omega_{1} \omega_{2} \phi_{1} \left( x \right)\phi_{2} \left( y \right)\phi_{1} \left( {x + 1} \right)\phi_{2} \left( {y + 1} \right) \le 1$$

Then, \(\alpha _{{\left( {X_{1} ,X_{2} } \right)}} \le \alpha_{{\left( {Y_{1} ,Y_{2} } \right)}}\) if \(\omega_{1} < \omega_{2}\).

The alternative conditions are

(c) \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\) are both increasing or both decreasing, and

(d)
$$\omega_{1} \omega_{2} \phi_{1} \left( x \right)\phi_{2} \left( y \right)\phi_{1} \left( {x + 1} \right)\phi_{2} \left( {y + 1} \right) \ge 1$$

Then, \(\alpha _{{\left( {X_{1} ,X_{2} } \right)}} \le \alpha_{{\left( {Y_{1} ,Y_{2} } \right)}}\) if \(\omega_{1} > \omega_{2}\).

We now investigate the PLRD ordering for the BCMP distribution. In this case,

$$\omega_{1} \phi_{1} \left( x \right)\phi_{2} \left( y \right) \ge - 1\;\;{\text{and}}\;\;\omega_{2} \phi_{1} \left( {x + 1} \right)\phi_{2} \left( {y + 1} \right) \ge - 1.$$

This does not imply that \(\omega_{1} \omega_{2} \phi_{1} \left( x \right)\phi_{2} \left( y \right)\phi_{1} \left( {x + 1} \right)\phi_{2} \left( {y + 1} \right) \ge 1\). Hence, we cannot compare the two vectors of the Sarmanov family according to the PLRD ordering.

Remark. The PLRD ordering is difficult to check in most cases. By contrast, finding families of distributions that are not ordered by this relation is relatively easy. For more explanation and comments, see [8].

4 Statistical Analysis

In this section, we examine statistical inference for the BCMP distribution given by (2.5).

4.1 Parameter Estimation

The method of moments estimation of the parameters proceeds as follows. The marginal parameters \(\left( {\lambda ,\nu } \right)\) are estimated by equating the first and second marginal sample moments to the approximations in (1.5). The estimate for \(\beta\) is then obtained from (2.3) by equating it with the sample correlation coefficient.

For maximum likelihood estimation (MLE), the simulated annealing (SA) algorithm is used to determine the estimates corresponding to the global optimum. SA is a popular algorithm for locating the global extremum of a non-smooth function with a large number of local extrema. Henderson et al. [11] have discussed the convergence of SA and presented practical guidelines for implementing the algorithm, especially the choice of its control parameters, to ensure good performance.

The log-likelihood function of the model is given by

$$\ln L = \sum\limits_{i = 1}^{n} {\left[ {x_{i} \ln \lambda_{1} + y_{i} \ln \lambda_{2} + \ln \left\{ {1 + \beta \phi_{1} \left( {x_{i} } \right)\phi_{2} \left( {y_{i} } \right)} \right\} - \nu_{1} \left( {\ln x_{i} !} \right) - \nu_{2} \left( {\ln y_{i} !} \right) - \ln Z_{1} - \ln Z_{2} } \right]}$$
(4.1)

where \(Z_{1} = Z(\lambda_{1} ,\,\nu_{1} ),\)\(Z_{2} = Z(\lambda_{2} ,\,\nu_{2} )\). For the following sections, we consider \(\theta = e^{ - 1}\).
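As a hedged illustration of the MLE step, the sketch below (Python) maximizes (4.1) using SciPy's dual_annealing as a stand-in for the SA algorithm discussed above; the series truncation, the parameter bounds and the data arrays xs, ys are assumptions made for the sake of the example.

```python
import numpy as np
from scipy.optimize import dual_annealing
from scipy.special import gammaln

THETA = np.exp(-1.0)  # theta fixed at e^{-1}, as in the text

def logZ(lam, nu, j_max=300):
    """Log of Z (1.4), truncated at an assumed j_max."""
    j = np.arange(j_max + 1)
    t = j * np.log(lam) - nu * gammaln(j + 1)
    m = t.max()
    return m + np.log(np.sum(np.exp(t - m)))

def phi(t, lam, nu):
    """Mixing function theta^t - g(theta; lam, nu), with g = Z(lam*theta, nu)/Z(lam, nu)."""
    g = np.exp(logZ(lam * THETA, nu) - logZ(lam, nu))
    return THETA**t - g

def neg_loglik(par, xs, ys):
    """Negative log-likelihood (4.1) of the BCMP model (2.5)."""
    lam1, nu1, lam2, nu2, beta = par
    mix = 1.0 + beta * phi(xs, lam1, nu1) * phi(ys, lam2, nu2)
    if np.any(mix <= 0.0):
        return np.inf  # outside the valid parameter region
    ll = (xs * np.log(lam1) + ys * np.log(lam2) + np.log(mix)
          - nu1 * gammaln(xs + 1) - nu2 * gammaln(ys + 1)
          - logZ(lam1, nu1) - logZ(lam2, nu2))
    return -np.sum(ll)

# xs, ys: paired count data as NumPy arrays; the bounds below are illustrative.
# bounds = [(0.01, 10.0), (0.01, 5.0), (0.01, 10.0), (0.01, 5.0), (-1.0, 1.0)]
# result = dual_annealing(neg_loglik, bounds, args=(xs, ys), seed=1)
# print(result.x, -result.fun)   # ML estimates and maximized log-likelihood
```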

4.2 Test of Hypotheses

In this subsection, the tests for the two hypotheses of interest, that is, test of independence and test of adequacy of the proposed BCMP distribution are discussed. To compare the null model (restricted model) against the alternative model (unrestricted model), the score and the likelihood ratio (LR) tests are chosen and their test statistics are summarized as follows.

Let \(H_{0}\): \(\theta = \theta^{*}\) versus \(H_{1}\): \(\theta \ne \theta^{*}\), where \(\theta\) here denotes the parameter vector. The score test statistic is \(T = S^{'} V^{ - 1} S\) where

$$S^{'} = \left. {\frac{\partial }{\partial \theta }\ln L\left( \theta \right)} \right|_{{\theta = \hat{\theta }^{*} }}$$

is the score function and

$$V = \left. {E\left[ { - \frac{{\partial^{2} \ln L(\theta )}}{{\partial \theta \,\partial \theta^{\prime} }}} \right]} \right|_{{\theta = \hat{\theta }^{*} }}$$

is the information matrix.

The test statistic based on the LR test is defined as \(\text{LR} = - 2\ln \left[ {L\left( {\hat{\theta }^{*} } \right)/L\left( {\hat{\theta }} \right)} \right]\), where \(\hat{\theta }^{*}\) and \(\hat{\theta }\) are the restricted and unrestricted maximum likelihood estimates.
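Given the restricted and unrestricted maximized log-likelihoods (for example, from SA fits as sketched in Sect. 4.1), the LR statistic and its chi-square p-value take only a few lines; a sketch (Python) follows.

```python
from scipy.stats import chi2

def lr_test(loglik_restricted, loglik_full, df):
    """LR = -2 ln[L(theta_hat*) / L(theta_hat)], approximately chi-square(df)."""
    lr = -2.0 * (loglik_restricted - loglik_full)
    return lr, chi2.sf(lr, df)

# Test of independence (beta = 0): df = 1.  Test of nu_1 = nu_2 = 1: df = 2.
# lr, pval = lr_test(ll_restricted, ll_unrestricted, df=1)
```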

4.2.1 Test of Independence

The random variables X and Y are independent if \(\beta = 0\). The hypotheses to be tested are \(H_{0} :\beta = 0\) against \(H_{1} :\beta \ne 0\).

The score functions evaluated under the null hypothesis are needed; they are easily obtained from the log-likelihood in (4.1).

$$\begin{aligned} \left. {\frac{\partial \ln L}{\partial \beta }} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} {\phi_{1} \left( {x_{i} } \right)} \phi_{2} \left( {y_{i} } \right) \\ \left. {\frac{\partial \ln L}{{\partial \lambda_{1} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} {\frac{{x_{i} }}{{\lambda_{1} }} - \frac{1}{{Z_{1} }}\left( {\frac{{\partial Z_{1} }}{{\partial \lambda_{1} }}} \right)} ,\;\;\left. {\frac{\partial \ln L}{{\partial \lambda_{2} }}} \right|_{\beta = 0} = \sum\limits_{i = 1}^{n} {\frac{{y_{i} }}{{\lambda_{2} }} - \frac{1}{{Z_{2} }}\left( {\frac{{\partial Z_{2} }}{{\partial \lambda_{2} }}} \right)} \\ \left. {\frac{\partial \ln L}{{\partial \nu_{1} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} { - \ln \left( {x_{i} !} \right) - \frac{1}{{Z_{1} }}\left( {\frac{{\partial Z_{1} }}{{\partial \nu_{1} }}} \right)} ,\;\;\left. {\frac{\partial \ln L}{{\partial \nu_{2} }}} \right|_{\beta = 0} = \sum\limits_{i = 1}^{n} { - \ln \left( {y_{i} !} \right) - \frac{1}{{Z_{2} }}\left( {\frac{{\partial Z_{2} }}{{\partial \nu_{2} }}} \right)} \\ \end{aligned}$$

The elements of the information matrix corresponding to \(\beta = 0\) are

$$\begin{aligned} \left. {\frac{{\partial^{2} \ln L}}{{\partial \beta^{2} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} { - \left( {\phi_{1} \left( {x_{i} } \right)\phi_{2} \left( {y_{i} } \right)} \right)^{2} } \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \beta \partial \lambda_{1} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} {\phi_{2} \left( {y_{i} } \right)\left( {\frac{{\partial \phi_{1} \left( {x_{i} } \right)}}{{\partial \lambda_{1} }}} \right)} \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \beta \partial \lambda_{2} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} {\phi_{1} \left( {x_{i} } \right)\left( {\frac{{\partial \phi_{2} \left( {y_{i} } \right)}}{{\partial \lambda_{2} }}} \right)} \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \beta \partial \nu_{1} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} {\phi_{2} \left( {y_{i} } \right)\left( {\frac{{\partial \phi_{1} \left( {x_{i} } \right)}}{{\partial \nu_{1} }}} \right)} \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \beta \partial \nu_{2} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} {\phi_{1} \left( {x_{i} } \right)\left( {\frac{{\partial \phi_{2} \left( {y_{i} } \right)}}{{\partial \nu_{2} }}} \right)} \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \lambda_{1}^{2} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} { - \frac{{x_{i} }}{{\lambda_{1}^{2} }} + \left( {\frac{1}{{Z_{1}^{{}} }}\frac{{\partial Z_{1} }}{{\partial \lambda_{1} }}} \right)}^{2} - \frac{1}{{Z_{1}^{{}} }}\left( {\frac{{\partial^{2} Z_{1} }}{{\partial \lambda_{1}^{2} }}} \right) \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \lambda_{2}^{2} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} { - \frac{{y_{i} }}{{\lambda_{2}^{2} }} + \left( {\frac{1}{{Z_{2}^{{}} }}\frac{{\partial Z_{2} }}{{\partial \lambda_{2} }}} \right)}^{2} - \frac{1}{{Z_{2}^{{}} }}\left( {\frac{{\partial^{2} Z_{2} }}{{\partial \lambda_{2}^{2} }}} \right) \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \nu_{j}^{2} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} {\frac{1}{{Z_{j}^{2} }}\left[ {\left( {\frac{{\partial Z_{j} }}{{\partial \nu_{j} }}} \right)^{2} - Z_{j}^{{}} \frac{{\partial^{2} Z_{j} }}{{\partial \nu_{j}^{2} }}} \right]} , \, j = 1,2 \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \lambda_{j} \partial \nu_{j} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} {\frac{1}{{Z_{j}^{2} }}\left( {\frac{{\partial Z_{j} }}{{\partial \lambda_{j} }}\frac{{\partial Z_{j} }}{{\partial \nu_{j} }} - Z_{j}^{{}} \frac{{\partial^{2} Z_{j} }}{{\partial \lambda_{j} \partial \nu_{j} }}} \right)} , \, j = 1,2 \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \lambda_{1} \partial \lambda_{2} }}} \right|_{\beta = 0} & = \left. {\frac{{\partial^{2} \ln L}}{{\partial \lambda_{1} \partial \nu_{2} }}} \right|_{\beta = 0} = \left. {\frac{{\partial^{2} \ln L}}{{\partial \lambda_{2} \partial \nu_{1} }}} \right|_{\beta = 0} = \left. {\frac{{\partial^{2} \ln L}}{{\partial \nu_{1} \partial \nu_{2} }}} \right|_{\beta = 0} = 0 \\ \end{aligned}$$

Both the score and LR test statistics have an approximate chi-square distribution with 1 degree of freedom.

4.2.2 Test of BCMP

To test whether the bivariate Poisson distribution is adequate against the BCMP alternative, the hypotheses are

$$H_{0} :\nu_{1} = \nu_{2} = 1\;{\text{versus}}\;H_{1} :\;H_{0} \;{\text{is}}\;{\text{not}}\;{\text{true}}.$$

From Eqs. (2.5) and (4.1), the score functions are found to be

$$\begin{aligned} \left. {\frac{\partial \ln L}{\partial \beta }} \right|_{{\nu_{1} = \nu_{2} = 1}} & = \sum\limits_{i = 1}^{n} {\frac{{a_{1} a_{2} }}{f}} \\ \left. {\frac{\partial \ln L}{{\partial \lambda_{1} }}} \right|_{{\nu_{1} = \nu_{2} = 1}} & = \sum\limits_{i = 1}^{n} { - 1 + \frac{{x_{i} }}{{\lambda_{1} }} + \frac{{\beta a_{1} e^{{ - \lambda_{1} + \lambda_{1} \theta }} \left( {1 - \theta } \right)}}{f}} \\ \left. {\frac{\partial \ln L}{{\partial \lambda_{2} }}} \right|_{{\nu_{1} = \nu_{2} = 1}} & = \sum\limits_{i = 1}^{n} { - 1 + \frac{{y_{i} }}{{\lambda_{2} }} + \frac{{\beta a_{2} e^{{ - \lambda_{2} + \lambda_{2} \theta }} \left( {1 - \theta } \right)}}{f}} \\ \left. {\frac{\partial \ln L}{{\partial \nu_{1} }}} \right|_{{\nu_{1} = \nu_{2} = 1}} & = \sum\limits_{i = 1}^{n} { - \ln \left( {x_{i} !} \right) - e^{{ - \lambda_{1} }} \left( {\frac{{\partial Z_{1} }}{{\partial \nu_{1} }}} \right)} + \frac{{\beta a_{1} }}{f}\left[ {e^{{ - 2\lambda_{1} + \lambda_{1} \theta }} \frac{{\partial Z_{1} }}{{\partial \nu_{1} }} - e^{{ - \lambda_{1} }} \frac{{\partial Z_{1}^{*} }}{{\partial \nu_{1} }}} \right] \\ \left. {\frac{\partial \ln L}{{\partial \nu_{2} }}} \right|_{{\nu_{1} = \nu_{2} = 1}} & = \sum\limits_{i = 1}^{n} { - \ln \left( {y_{i} !} \right) - e^{{ - \lambda_{2} }} \frac{{\partial Z_{2} }}{{\partial \nu_{2} }}} + \frac{{\beta a_{2} }}{f}\left[ {e^{{ - 2\lambda_{2} + \lambda_{2} \theta }} \frac{{\partial Z_{2} }}{{\partial \nu_{2} }} - e^{{ - \lambda_{2} }} \frac{{\partial Z_{2}^{*} }}{{\partial \nu_{2} }}} \right] \\ \end{aligned}$$

where \(a_{1} = \left( { - e^{{ - \lambda_{2} + \lambda_{2} \theta }} + \theta^{{y_{i} }} } \right)\), \(a_{2} = \left( { - e^{{ - \lambda_{1} + \lambda_{1} \theta }} + \theta^{{x_{i} }} } \right)\), \(f = 1 + \beta a_{1} a_{2}\), \(Z_{1}^{*} = Z(\theta \lambda_{1} ,\,\nu_{1} )\) and \(Z_{2}^{*} = Z(\theta \lambda_{2} ,\,\nu_{2} )\).

The elements of information matrix evaluated under the null hypothesis are

$$\begin{aligned} \left. {\frac{{\partial ^{2} \ln L}}{{\partial \beta ^{2} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} { - \left( {\frac{{a_{1} a_{2} }}{f}} \right)} ^{2} \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \beta \partial \lambda _{j} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} { - \frac{{b_{j} }}{g^{2}}\left( {e^{{\lambda _{1} + \lambda _{2} + \lambda _{j} \theta }} } \right)\left( {\theta - 1} \right)} ,{\text{ }}j = 1,2 \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \beta \partial \nu _{1} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} {\frac{{\left( {e^{{\lambda _{2} }} b_{1} } \right)}}{g^{2}}\left( {e^{{\lambda _{1} \theta }} \frac{{\partial Z_{1} }}{{\partial \nu _{1} }} - e^{{\lambda _{1} }} \frac{{\partial Z_{1}^{*} }}{{\partial \nu _{1} }}} \right)} \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \beta \partial \nu _{2} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} {\frac{{\left( {e^{{\lambda _{1} }} b_{2} } \right)}}{g^{2}}\left( {e^{{\lambda _{2} \theta }} \frac{{\partial Z_{2} }}{{\partial \nu _{2} }} - e^{{\lambda _{2} }} \frac{{\partial Z_{2}^{*} }}{{\partial \nu _{2} }}} \right)} \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \lambda _{1} ^{2} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} { - \frac{{x_{i} }}{{\lambda _{1}^{2} }} - \frac{{\beta b_{1} e^{{\lambda _{1} \theta }} }}{g}\left( {\theta - 1} \right)^{2} \left( {\frac{{\beta b_{1} e^{{\lambda _{1} \theta }} }}{g} + 1} \right)} \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \lambda _{1} \partial \lambda _{2} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} {\left( {\frac{{\theta - 1}}{g}} \right)^{2} } \beta e^{{(\lambda _{1} + \lambda _{2} )(1 + \theta )}} \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \lambda _{2} ^{2} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} { - \frac{{y_{i} }}{{\lambda _{2}^{2} }} - \frac{{\beta b_{2} e^{{\lambda _{2} \theta }} \left( {\theta - 1} \right)^{2} }}{g}\left( {\frac{{\beta b_{2} e^{{\lambda _{2} \theta }} }}{g} + 1} \right)} \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \lambda _{j} \partial \nu _{j} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} \begin{gathered} e^{{ - \lambda _{j} }} \left( {\frac{{\partial Z_{j} }}{{\partial \nu _{j} }} - \frac{{\partial ^{2} Z_{j} }}{{\partial \lambda _{j} \partial \nu _{j} }}} \right) + \left( {\frac{{\beta b_{j} }}{g}} \right)^{2} \left( {e^{{\lambda _{j} \left( {\theta - 1} \right)}} \left( {\theta - 1} \right)\left( {e^{{\lambda _{j} \theta }} \frac{{\partial Z_{j} }}{{\partial \nu _{j} }} - e^{{\lambda _{j} }} \frac{{\partial Z_{j}^{*} }}{{\partial \nu _{j} }}} \right)} \right) \hfill \\ + \frac{1}{g}\left[ {\beta b_{j} e^{{ - \lambda _{j} }} \left( {e^{{\lambda _{j} \theta }} \frac{{\partial ^{2} Z_{j} }}{{\partial \lambda _{j} \partial \nu _{j} }} + e^{{\lambda _{j} \theta }} \left( {\theta - 2} \right)\frac{{\partial Z_{j} }}{{\partial \nu _{j} }} + e^{{\lambda _{j} }} \left( {\frac{{\partial Z_{j}^{*} }}{{\partial \nu _{j} }} - \frac{{\partial ^{2} Z_{j}^{*} }}{{\partial \lambda _{j} \partial \nu _{j} }}} \right)} \right)} \right] \hfill \\ \end{gathered} ,{\text{ }}j = 1,2 \\ \left. 
{\frac{{\partial ^{2} \ln L}}{{\partial \lambda _{1} \partial \nu _{2} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} {\frac{1}{{g^{2} }}\left( {\beta e^{{\lambda _{1} + \lambda _{1} \theta }} \left( {\theta - 1} \right)\left( { - e^{\lambda _{2} \theta }\frac{{\partial Z_{2} }}{{\partial \nu _{2} }} + e^{\lambda _{2}} \frac{{\partial Z_{2}^{*} }}{{\partial \nu _{2} }}} \right)} \right)} \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \lambda _{2} \partial \nu _{1} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} {\frac{1}{{g^{2} }}\left( {\beta e^{{\lambda _{2} + \lambda _{2} \theta }} \left( {\theta - 1} \right)\left( { - e^{{\lambda _{1} \theta }} \frac{{\partial Z_{1} }}{{\partial \nu _{1} }} + e^{{\lambda _{1} }} \frac{{\partial Z_{{_{1} }}^{*} }}{{\partial \nu _{1} }}} \right)} \right)} \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \nu _{j} ^{2} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} {e^{{ - 4\lambda _{j} }} \left\{ \begin{aligned} & e^{{2\lambda _{j} }} \left( {\frac{{\partial Z_{j} }}{{\partial \nu _{j} }}} \right)^{2} - e^{{3\lambda _{j} }} \frac{{\partial ^{2} Z_{j} }}{{\partial \nu _{j}^{2} }} - \left( {\frac{{\beta a_{j} }}{f}\left( {e^{{\lambda _{j} \theta }} \frac{{\partial Z_{j} }}{{\partial \nu _{j} }} - e^{{\lambda _{j} }} \frac{{\partial Z_{j}^{*} }}{{\partial \nu _{j} }}} \right)} \right)^{2} \\ & + \frac{{\beta a_{j} e^{{\lambda _{j} }} }}{f}\left( \begin{gathered} - 2e^{{\lambda _{j} \theta }} \left( {\frac{{\partial Z_{j} }}{{\partial \nu _{j} }}} \right)^{2} + e^{{\lambda _{j} + \lambda _{j} \theta }} \frac{{\partial ^{2} Z_{j} }}{{\partial \nu _{j}^{2} }} + \hfill \\ 2e^{{\lambda _{j} }} \left( {\frac{{\partial Z_{j} }}{{\partial \nu _{j} }}\frac{{\partial Z_{j}^{*} }}{{\partial \nu _{j} }}} \right) - e^{{2\lambda _{j} }} \frac{{\partial ^{2} Z_{j}^{*} }}{{\partial \nu _{j}^{2} }} \hfill \\ \end{gathered} \right) \\ \end{aligned} \right\}} ,{\text{ }}j = 1,2 \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \nu _{1} \partial \nu _{2} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} {\frac{\beta }{{g^{2} }}\left( {e^{{\lambda _{1} \theta }} \frac{{\partial Z_{1} }}{{\partial \nu _{1} }} - e^{{\lambda _{1} }} \frac{{\partial Z_{1}^{*} }}{{\partial \nu _{1} }}} \right)\left( {e^{{\lambda _{2} \theta }} \frac{{\partial Z_{2} }}{{\partial \nu _{2} }} - e^{{\lambda _{2} }} \frac{{\partial Z_{2}^{*} }}{{\partial \nu _{2} }}} \right)} \\ \end{aligned}$$

where \(b_{1} = e^{{\lambda_{2} }} \theta^{{y_{i} }} - e^{{\lambda_{2} \theta }}\), \(b_{2} = e^{{\lambda_{1} }} \theta^{{x_{i} }} - e^{{\lambda_{1} \theta }}\) and g = \(\beta e^{{\left( {\lambda_{1} + \lambda_{2} } \right)\theta }} - \beta e^{{\lambda_{1} + \lambda_{2} \theta }} \theta^{{x_{i} }} - \beta e^{{\lambda_{2} + \lambda_{1} \theta }} \theta^{{y_{i} }} + e^{{\lambda_{1} + \lambda_{2} }} \left( {1 + \beta \theta^{{x_{i} + y_{i} }} } \right)\).

Both the score and LR test statistics have an approximate chi-square distribution with 2 degrees of freedom.

4.3 Monte Carlo Simulation Study

The performance of the proposed score and LR tests is compared in order to assess their efficiency. A simulation study of 1000 replications has been carried out for small (n = 100), medium (n = 500) and large (n = 1000) sample sizes for the bivariate discrete distributions based on probability generating functions. The nominal significance level \(\alpha\) is taken as 5% and 10%. For the test of independence, different values of \(\beta\) ranging from − 1 to 1 are considered with a few combinations of \(\nu_{1}\) and \(\nu_{2}\). Tables 1 and 2 display, respectively, the simulated power of the tests under the test of independence and the test of adequacy of the BCMP distribution.

Table 1 Simulated power of Score and LR tests for test of independence (\(\lambda_{1} = 0.9,\nu_{1} = 2,\lambda_{2} = 0.7,\nu_{2} = 3\))
Table 2 Simulated power of Score and LR tests for test of adequacy of BCMP (\(\lambda_{1} = 0.9,\nu_{1} = 2,\lambda_{2} = 0.7,\nu_{2} = 3\))

From Table 1, the score test performs better than the LR test in maintaining the nominal significance level of 5% (\(\beta\) = 0) for small sample sizes, but the reverse holds at the nominal level of 10%. Detection is weak when \(\beta\) is 0.5 away from zero and the sample size is small; the power improves greatly as the sample size increases. When \(\beta\) is 0.8 or 1.0 away from zero, the score test outperforms the LR test for small sample sizes, but the powers are very close to each other as the sample size increases.

The powers of the proposed tests increase when \(\beta\) diverges further from the value 0 and when the sample size increases. Moreover, the powers of the two tests are very close to each other when the sample size increases to 500 and 1000, regardless of the value of \(\beta\) and the nominal level.

As shown in Table 2, the score test outperforms the LR test in maintaining the nominal levels of 5% and 10% regardless of the sample size; the LR test overestimates the nominal significance level for small sample sizes. In addition, when the parameters \(\nu_{1}\) and \(\nu_{2}\) are both smaller than 1.0, the score and LR tests are comparable to each other. However, if both parameters are larger than 1.0, the LR test achieves higher power than the score test. If \(\nu_{1}\) is fixed at 1.0, the LR test outperforms the score test when \(\nu_{2}\) is larger than 1.0, and the reverse holds when \(\nu_{2}\) is smaller than 1.0. The same pattern applies when \(\nu_{2}\) is fixed at 1.0 and \(\nu_{1}\) is set larger or smaller than 1.0.

Overall, both the score and LR tests are powerful when the sample size is 500 or larger, as almost 96% detection can be achieved.

5 Example of Application to Real Data

As an illustration of application, we consider the data set of the number of accidents sustained by 122 experienced shunters over 2 successive periods of time [1], which was also used by Sellers et al. [24].

In this section, all the summation series involved are computed by recursion with double-precision accuracy, and a truncation approach is applied to the normalizing constant \(Z\left( {\lambda ,\nu } \right)\), subject to \(Z\left( {\lambda ,\nu } \right) \le 1 \times 10^{200}\). To compute the correlation coefficient \(\rho = \frac{{\beta \xi_{1} \xi_{2} }}{{\sigma_{1} \sigma_{2} }}\), the means \(\mu_{1}\), \(\mu_{2}\) and variances \(\sigma_{1}^{2} ,\sigma_{2}^{2}\) of the marginals \(P\left( {X = x} \right)\) and \(P\left( {Y = y} \right)\) are required. They are computed by the following exact expressions because the approximations in (1.5) hold only when \(\lambda > 10^{\nu }\); this condition must be checked even when \(\nu\) is close to zero (for example, when \(\nu = 0.01\), we must ensure that \(\lambda > 10^{\nu } = 1.0233\)).

$$\begin{aligned} & \mu_{i} = \lambda_{i} \frac{{\partial \ln Z\left( {\lambda_{i} ,\nu_{i} } \right)}}{{\partial \lambda_{i} }} = \lambda_{i} \sum\limits_{j} {\frac{{j\lambda_{i}^{j - 1} }}{{(j!)^{{\nu_{i} }} }}} /\sum\limits_{j} {\frac{{\lambda_{i}^{j} }}{{(j!)^{{\nu_{i} }} }}} ,\quad i = 1,2 \\ & \sigma_{i}^{2} = \lambda_{i} \frac{{\partial \mu_{i} }}{{\partial \lambda_{i} }} = \mu_{i} + \lambda_{i}^{2} \left[ { - \left( {\sum\limits_{j} {\frac{{j\lambda_{i}^{j - 1} }}{{(j!)^{{\nu_{i} }} }}} } \right)^{2} + \left( {\sum\limits_{j} {\frac{{(j - 1)j\lambda_{i}^{j - 2} }}{{(j!)^{{\nu_{i} }} }}} } \right)\left( {\sum\limits_{j} {\frac{{\lambda_{i}^{j} }}{{(j!)^{{\nu_{i} }} }}} } \right)} \right]\left( {\sum\limits_{j} {\frac{{\lambda_{i}^{j} }}{{(j!)^{{\nu_{i} }} }}} } \right)^{ - 2} ,\quad i = 1,2 \\ \end{aligned}$$
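A compact numerical sketch (Python; the truncation point j_max is an assumption) computes the same exact moments through \(E\left[ X \right]\) and \(E\left[ {X^{2} } \right]\) of the truncated series, which is algebraically equivalent to the derivative expressions above.

```python
import numpy as np
from scipy.special import gammaln

def cmp_moments(lam, nu, j_max=300):
    """Exact CMP mean and variance from the truncated series (1.4);
    the common normalizing factor cancels in the ratios."""
    j = np.arange(j_max + 1)
    log_t = j * np.log(lam) - nu * gammaln(j + 1)   # log of lambda^j / (j!)^nu
    t = np.exp(log_t - log_t.max())                 # stable up to a factor that cancels
    mu = np.sum(j * t) / t.sum()
    var = np.sum(j**2 * t) / t.sum() - mu**2
    return mu, var

print(cmp_moments(1.5, 1.0))   # nu = 1 recovers the Poisson mean and variance (both 1.5)
```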

For this data set [1], the sample moments are:

$$\begin{aligned} & {\text{Sample}}\;{\text{means:}}\;\;\hat{\mu}_{1} = 0.9754,\;\hat{\mu}_{2} = 1.2705 \\ & {\text{Sample}}\;{\text{variances:}}\;\;\hat{\sigma}_{1}^{2} = 1.2969,\;\hat{\sigma}_{2}^{2} = 1.6535 \\ & {\text{Sample}}\;{\text{correlation:}}\;\;\hat{\rho} = 0.2585. \\ \end{aligned}$$

For comparison, this data set has also been fitted by the bivariate negative binomial (BNB) distribution; see Table 3.

Table 3 Observed and expected number of accidents sustained by 122 experienced shunters over 2 successive periods of time

To confirm that the MLEs for the data in Table 3 really give a maximum, some likelihood function values were computed at points around the ML estimates. Based on these values, the ML estimates do correspond to the maximum (Table 4).

Table 4 Likelihood function values for points around the ML estimates

From the summary statistics presented in Table 5, the fits by both proposed BCMP distributions are significantly better than that of the BNB based upon the \(\chi^{2}\) values and the log-likelihood values. For this data set, Sellers et al. [24] gave a log-likelihood value of − 341.704 for their BCMP distribution, which is about the same as that of the BNB. Thus, based on the log-likelihood values, the proposed BCMP distributions fit better than the BCMP distribution of Sellers et al. [24].

Table 5 Summary statistics

6 Concluding Remarks

BCMP distributions with CMP marginal distributions and range of correlation coefficient over (− 1, 1) have been proposed. They are based on the Sarmanov family of bivariate distributions, which is a simple and elegant approach to constructing bivariate distributions. A general method is also proposed for constructing bivariate generalizations of weighted distributions. The univariate CMP distribution is very popular in applications since it has the flexibility to analyze data that exhibit under- or over-dispersion. The BCMP distribution proposed by Sellers et al. [24] does not have CMP marginal distributions. It is shown in this article that the proposed BCMP distributions fit much better than the BCMP of Sellers et al. [24]. Thus, the proposed BCMP distributions will be of great utility to data analysts.