1 Introduction

The construction of a bivariate distribution with specified marginals and correlation has been a challenging problem since the early twentieth century. There is much interest in this problem because of its wide-ranging applications. For instance, Ong [22] considered computer sampling of some bivariate discrete distributions with given marginals and correlation. Recently, Lin et al. [21] gave a good survey of this topic, complementing previous surveys. A comprehensive overview is found in the monograph by Balakrishnan and Lai [2].

A simple method to formulate a bivariate distribution with fixed marginals and varying correlation is the well-known Farlie–Gumbel–Morgenstern (FGM) family of distributions defined by

$$H\left( {x,y} \right) = F\left( x \right)G\left( y \right)\left\{ {1 + \beta \bar{F} \bar{G}} \right\}$$
(1.1)

where \(F\left( x \right)\) and \(G\left( y \right)\) are the cumulative distribution functions, \(\bar{F} = 1 - F\) and \(\bar{G} = 1 - G\), and \(\beta \in \left[ { - 1, 1} \right]\) is the parameter that controls the correlation. Schucany et al. [25] showed that the FGM family (1.1) has a correlation coefficient restricted to the interval \(\left( { - 1/3, 1/3} \right)\). Various researchers ([14,15,16, 20]; see also Balakrishnan and Lai [2]) have advocated methods to overcome this drawback of the FGM family.

Sarmanov [23] introduced a family of distributions with better flexibility, and this family includes the FGM family as a particular case. The Sarmanov family is defined by

$$h\left( {x,y} \right) = f\left( x \right)g\left( y \right)\left\{ {1 + \beta \phi_{1} \left( x \right)\phi_{2} \left( y \right)} \right\},\;\;x, y \in \mathbb{R},$$
(1.2)

where \(f\left( x \right)\) and \(g\left( y \right)\) are the probability density functions (pdf) of \(F\left( x \right)\) and \(G\left( y \right)\), respectively, and \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\) are measurable functions [21] (also known as mixing functions) satisfying the conditions

$$E\left[ {\phi_{1} \left( X \right)} \right] = 0,\;\;E\left[ {\phi_{2} \left( Y \right)} \right] = 0\;\;{\text{and}}\;\;1 + \beta \phi_{1} \left( x \right)\phi_{2} \left( y \right) \ge 0.$$

If \(\phi_{1} \left( x \right) = 1 - 2F\) and \(\phi_{2} \left( y \right) = 1 - 2G\), then (1.2) reduces to (1.1). Lee [19] and Shubina and Lee [28] have made a detailed study of the Sarmanov family, and in the literature the family (1.2) is referred to as the Sarmanov–Lee family. It offers a vastly improved range of correlation [21]; for example, the maximum correlation for the bivariate distribution with uniform marginals is 3/4, as opposed to 1/3 for the FGM distribution (1.1). Different bivariate distributions with given marginals are constructed by choosing different mixing functions \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\); Lee [19] has given some examples of such choices. In this paper, we give a general method of constructing bivariate generalizations of weighted discrete distributions by considering a particularly simple choice of \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\).

The objective of this paper is to propose bivariate extensions of the univariate Conway–Maxwell–Poisson (CMP) distribution whose marginals are CMP. Recently, Sellers et al. [24] considered a bivariate CMP (BCMP) distribution whose marginal distributions are not CMP.

We construct BCMP distributions with CMP marginals and range of correlation over \(\left[ { - 1, 1} \right]\) by using two instances of (1.2). The first is a general method for weighted distributions. The second bivariate distribution is constructed by using \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\) based on the probability generating function. The proposed BCMP distributions include as a special case a bivariate Poisson distribution which has correlation in \(\left[ { - 1, 1} \right]\); by contrast, Holgate’s [12] bivariate Poisson distribution, constructed via a common random element, attains positive correlation only.

The CMP distribution generalizes the Poisson distribution by allowing for over-dispersion \(\left( {\nu < 1} \right)\) or under-dispersion \(\left( {\nu > 1} \right).\) Its probability mass function (pmf) is given by

$$P(X = x) = \frac{{\lambda^{x} }}{{(x!)^{\nu } }}\frac{1}{Z(\lambda ,\nu )},\;x = 0,1,2, \ldots$$
(1.3)

where

$$Z(\lambda ,\nu ) = \sum\limits_{j = 0}^{\infty } {\frac{{\lambda^{j} }}{{(j!)^{\nu } }}} ,\;\lambda > 0,\;\nu \ge 0$$
(1.4)

is the normalizing constant. The mean and variance of the CMP distribution have the following approximations [27]

$$E\left[ X \right] \approx \lambda^{1/\nu } - \frac{\nu - 1}{2\nu },{\text{var}}\left( X \right) \approx \frac{1}{\nu }\lambda^{1/\nu }$$
(1.5)
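As a quick numerical illustration, the sketch below (Python; the truncation point of the infinite series and the parameter values are assumptions for the example, not part of the model) computes the exact mean and variance of the CMP distribution from (1.3) and (1.4) and compares them with the approximations in (1.5), which are accurate when \(\lambda > 10^{\nu }\).

```python
import numpy as np
from scipy.special import gammaln

def cmp_pmf(lam, nu, x_max=200):
    """Truncated CMP pmf (1.3); x_max is an assumed truncation point
    for the infinite series defining Z in (1.4)."""
    x = np.arange(x_max + 1)
    log_w = x * np.log(lam) - nu * gammaln(x + 1)  # log of lambda^x / (x!)^nu
    w = np.exp(log_w - log_w.max())                # stabilized weights
    return w / w.sum()                             # normalization plays the role of Z

lam, nu = 6.0, 0.7                                 # illustrative over-dispersed case
p = cmp_pmf(lam, nu)
x = np.arange(p.size)
mean_exact = np.sum(x * p)
var_exact = np.sum(x**2 * p) - mean_exact**2

mean_approx = lam**(1 / nu) - (nu - 1) / (2 * nu)  # approximation (1.5)
var_approx = lam**(1 / nu) / nu

print(mean_exact, mean_approx)  # close here, since lam = 6 > 10**0.7 ~ 5.01
print(var_exact, var_approx)
```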

The CMP distribution may be regarded as a weighted Poisson distribution with pmf

$$P(X = x) = \frac{{e^{ - \lambda } \lambda^{x} }}{x!}\frac{{(x!)^{1 - \nu } }}{W(\lambda ,\nu )},\;x = 0,1,2, \ldots$$
(1.6)

where \(W(\lambda ,\,\nu )\) is the normalizing constant.

The CMP distribution is also appealing from a theoretical point of view because it belongs to the class of two-parameter power series distributions [27]. As a consequence, sufficient statistics and other elegant properties may be derived. Some of these have recently been investigated by Gupta et al. [10]. Kadane et al. [17] investigated the number of solutions which give rise to the same sufficient statistics.

The paper is organized as follows. Section 2 defines the proposed BCMP distributions and Sect. 3 discusses some dependence properties. The statistical analyses concerning parameter estimation and tests of hypotheses of independence and adequacy are presented in Sect. 4. Section 4 also contains simulation studies of the power of the Rao score and likelihood ratio tests. Section 5 illustrates an application to a real data set. Some concluding remarks are given in Sect. 6.

2 BCMP Distribution and Properties

2.1 Bivariate Discrete Distributions Based on Sarmanov–Lee Family

In this section, we present two bivariate distributions by choosing different mixing functions \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\).

2.1.1 Bivariate Weighted Discrete Distributions

We propose a general method for constructing Sarmanov-type bivariate distributions for weighted distributions. Consider a discrete distribution with pmf \(p\left( x \right)\) and mean \(\mu\). The weighted distribution for \(p\left( x \right)\) has pmf

$$P(X = x) = p\left( x \right)\frac{w\left( x \right)}{W},\;x = 0,1,2, \ldots$$

where \(w\left( x \right)\) is the weight and \(W\) is the normalizing constant. We make use of the simpler \(p\left( x \right)\) to construct the mixing functions. Let \(\alpha\) be a positive real number and set

$$\phi_{1} \left( x \right) = p^{\alpha } \left( x \right) - E\left[ {p^{\alpha } \left( X \right)} \right]$$

and

$$\phi_{2} \left( y \right) = p^{\alpha } \left( y \right) - E\left[ {p^{\alpha } \left( Y \right)} \right].$$

The expectation \(E\left[ {p^{\alpha } \left( X \right)} \right]\) is taken with respect to the weighted pmf \(P(X = x)\). A bivariate distribution with joint pmf based on (1.2) is given by

$$\begin{aligned} & P\left( {X = x,Y = y} \right) \\ & \quad = P\left( {X = x} \right)P\left( {Y = y} \right)\left\{ {1 + \beta \left( {p^{\alpha } \left( x \right) - E\left[ {p^{\alpha } \left( X \right)} \right]} \right)\left( {p^{\alpha } \left( y \right) - E\left[ {p^{\alpha } \left( Y \right)} \right]} \right)} \right\},\;\;\beta \in \left[ { - 1, 1} \right] \\ \end{aligned}$$
(2.1)
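The construction (2.1) is easy to check numerically. The sketch below (Python; the Poisson base pmf, the CMP-type weight \((x!)^{1-\nu}\) from (1.6), the support truncation and the parameter values are all illustrative assumptions) verifies that the joint pmf sums to one, is nonnegative and reproduces the weighted marginals.

```python
import numpy as np
from scipy.stats import poisson
from scipy.special import gammaln

def weighted_pmf(p, w):
    """Weighted pmf p(x)w(x)/W on a truncated support."""
    q = p * w
    return q / q.sum()

def sarmanov_joint(p1, P1, p2, P2, alpha, beta):
    """Joint pmf (2.1); p1, p2 are base pmfs, P1, P2 the weighted marginals."""
    phi1 = p1**alpha - np.sum(p1**alpha * P1)  # E[p^alpha(X)] w.r.t. the weighted pmf
    phi2 = p2**alpha - np.sum(p2**alpha * P2)
    return P1[:, None] * P2[None, :] * (1.0 + beta * np.outer(phi1, phi2))

x = np.arange(60)
lam1, nu1, lam2, nu2, beta = 2.0, 1.5, 1.5, 0.8, 0.8   # illustrative values
p1, p2 = poisson.pmf(x, lam1), poisson.pmf(x, lam2)
w1 = np.exp((1 - nu1) * gammaln(x + 1))                # CMP weight (x!)^(1-nu), cf. (1.6)
w2 = np.exp((1 - nu2) * gammaln(x + 1))
P1, P2 = weighted_pmf(p1, w1), weighted_pmf(p2, w2)

H = sarmanov_joint(p1, P1, p2, P2, alpha=1.0, beta=beta)
assert np.isclose(H.sum(), 1.0)
assert np.allclose(H.sum(axis=1), P1) and np.allclose(H.sum(axis=0), P2)
assert (H >= 0).all()   # beta must keep 1 + beta*phi1(x)*phi2(y) >= 0
```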

2.1.2 Bivariate Discrete Distributions Based on Probability Generating Functions

Let \(\phi_{1} \left( x \right) = \theta^{x} - G\left( \theta \right)\) and \(\phi_{2} \left( y \right) = \theta^{y} - G\left( \theta \right)\), where \(0 < \theta < 1\) and \(G\left( \theta \right)\) is the probability generating function of the marginal distribution. A bivariate distribution is defined as follows with joint pmf

$$P\left( {X = x,Y = y} \right) = P\left( {X = x} \right)P\left( {Y = y} \right)\left\{ {1 + \beta \left( {\theta^{x} - G\left( \theta \right)} \right)\left( {\theta^{y} - G\left( \theta \right)} \right)} \right\}$$
(2.2)

Note that the functions \(\phi_{i} \left( t \right),i = 1,2\), are bounded and \(\sum\nolimits_{t = 0}^{\infty } \phi_{i} \left( t \right)P\left( t \right) = 0\).
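A brief numerical sketch (Python; the Poisson marginal, its mean and \(\theta = e^{-1}\) are illustrative assumptions) confirms the moment condition and the boundedness of these mixing functions.

```python
import numpy as np
from scipy.stats import poisson

theta = np.exp(-1.0)
t = np.arange(100)                  # truncated support
P = poisson.pmf(t, 2.3)             # any discrete marginal; Poisson is illustrative
G = np.sum(theta**t * P)            # pgf G(theta) = E[theta^T]
phi = theta**t - G                  # mixing function of Sect. 2.1.2
assert np.isclose(np.sum(phi * P), 0.0)   # E[phi(T)] = 0 by construction
assert np.all(np.abs(phi) < 1.0)          # bounded, since 0 < theta < 1
```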

Let \(\xi_{i} = \sum t\phi_{i} \left( t \right)P\left( t \right)\). From Theorem 2 of Lee [19], the correlation coefficient \(\rho\) of (2.1) and (2.2) is given by

$$\rho = \frac{{\beta \xi_{1} \xi_{2} }}{{\sigma_{1} \sigma_{2} }}.$$
(2.3)

where \(\sigma_{1}^{2} ,\sigma_{2}^{2}\) are the variances.

For model (2.1) the function \(\xi_{i}\) is given by

$$\xi_{i} = \sum x\phi_{i} \left( x \right)P\left( x \right) = \sum \left( {xp^{\alpha } \left( x \right)P\left( x \right) - xP\left( x \right)E\left[ {p^{\alpha } \left( X \right)} \right]} \right) = \delta_{i} - \mu_{i} \gamma_{i} , i = 1,2$$

where \(\delta_{i} = E\left[ {Xp^{\alpha } \left( X \right)} \right]\), \(\mu_{i}\) is the mean for \(P\left( x \right)\) and \(\gamma_{i} = E\left[ {p^{\alpha } \left( X \right)} \right]\). Thus,

$$\rho = \frac{{\beta \left( {\delta_{1} - \mu_{1} \gamma_{1} } \right)\left( {\delta_{2} - \mu_{2} \gamma_{2} } \right)}}{{\sigma_{1} \sigma_{2} }}$$
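Continuing the numerical sketch given after (2.1), the covariance computed directly from the joint pmf can be compared with \(\beta \xi_{1} \xi_{2}\); a short check (Python, with \(\alpha = 1\) as before) follows.

```python
# Continues the sketch after (2.1): x, p1, p2, P1, P2, H and beta are as defined there.
mu1, mu2 = np.sum(x * P1), np.sum(x * P2)
delta1, gamma1 = np.sum(x * p1 * P1), np.sum(p1 * P1)   # delta_i, gamma_i with alpha = 1
delta2, gamma2 = np.sum(x * p2 * P2), np.sum(p2 * P2)

cov_direct = np.sum(np.outer(x, x) * H) - mu1 * mu2      # E[XY] - mu1*mu2 from the joint pmf
cov_formula = beta * (delta1 - mu1 * gamma1) * (delta2 - mu2 * gamma2)
assert np.isclose(cov_direct, cov_formula)               # i.e. beta * xi_1 * xi_2
```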

2.2 BCMP Distributions

Based on (2.1) and (2.2) we have the following BCMP distributions.

The bivariate distribution with joint pmf based on (2.1) is given by

$$P\left( {X = x,Y = y} \right) = P\left( {X = x} \right)P\left( {Y = y} \right)\left\{ {1 + \beta \left[ {p^{\alpha } \left( x \right) - E\left[ {p^{\alpha } \left( X \right)} \right]} \right]\left[ {p^{\alpha } \left( y \right) - E\left[ {p^{\alpha } \left( Y \right)} \right]} \right]} \right\}$$
(2.4)

where the CMP marginals \(P\left( {X = x} \right)\) and \(P\left( {Y = y} \right)\) are given by (1.3) with parameters \(\left( {\lambda_{1} , \nu_{1} } \right)\) and \(\left( {\lambda_{2} , \nu_{2} } \right)\), respectively. We consider the computation of \(E\left[ {p^{\alpha } \left( X \right)} \right]\) and \(E\left[ {p^{\alpha } \left( Y \right)} \right]\) by writing them in terms of CMP normalizing constants. For simplicity, we suppress the subscripts and write the parameters as \(\left( {\lambda ,\nu } \right)\).

Since \(Z(\lambda ,\,\nu ) = \sum\limits_{j = 0}^{\infty } {\frac{{\lambda^{j} }}{{(j!)^{\nu } }}}\), we express

$$E\left[ {p^{\alpha } \left( X \right)} \right] = \mathop \sum \limits_{x = 0}^{\infty } e^{ - \lambda \alpha } \frac{{\lambda^{x\alpha } }}{{\left( {x!} \right)^{\alpha } }}P\left( x \right) = \frac{{e^{ - \lambda \alpha } Z\left( {\lambda^{\alpha + 1} ,\nu + \alpha } \right)}}{{Z\left( {\lambda ,\nu } \right)}}\mathop \sum \limits_{x = 0}^{\infty } \frac{{\lambda^{{x\left( {\alpha + 1} \right)}} }}{{\left( {x!} \right)^{\nu + \alpha } Z\left( {\lambda^{\alpha + 1} ,\nu + \alpha } \right)}}$$

The sum on the right runs over the pmf \(P\left( {R = x} \right) = \frac{{\lambda^{{x\left( {\alpha + 1} \right)}} }}{{\left( {x!} \right)^{\nu + \alpha } Z\left( {\lambda^{\alpha + 1} ,\nu + \alpha } \right)}}\) of a CMP random variable \(R\) with parameters \(\left( {\lambda^{\alpha + 1} ,\nu + \alpha } \right)\) (for \(\alpha = 1\), these are \(\left( {\lambda^{2} ,\nu + 1} \right)\)) and hence equals one. That is,

$$E\left[ {p^{\alpha } \left( X \right)} \right] = \frac{{e^{ - \lambda \alpha } Z\left( {\lambda^{\alpha + 1} ,\nu + \alpha } \right)}}{{Z\left( {\lambda ,\nu } \right)}}.$$

Note that \(E\left[ {p^{\alpha}\left( X \right)} \right]\) is an infinite sum of products of Poisson and CMP probabilities, and it is easy to see that \(E\left[ {p^{\alpha}\left( X \right)} \right] \le 1\). Hence, \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\) are bounded.
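The closed form for \(E\left[ {p^{\alpha}\left( X \right)} \right]\) can be verified numerically; the sketch below (Python, with assumed truncation points and illustrative parameter values) compares the direct sum with the ratio of normalizing constants.

```python
import numpy as np
from scipy.stats import poisson
from scipy.special import gammaln

def logZ(lam, nu, j_max=300):
    """Log of the normalizing constant (1.4), truncated at an assumed j_max."""
    j = np.arange(j_max + 1)
    t = j * np.log(lam) - nu * gammaln(j + 1)
    m = t.max()
    return m + np.log(np.sum(np.exp(t - m)))

lam, nu, alpha = 1.8, 1.4, 0.6                 # illustrative values
x = np.arange(150)
P = np.exp(x * np.log(lam) - nu * gammaln(x + 1) - logZ(lam, nu))  # CMP pmf (1.3)
p = poisson.pmf(x, lam)                        # Poisson base pmf

lhs = np.sum(p**alpha * P)                     # direct computation of E[p^alpha(X)]
rhs = np.exp(-lam * alpha + logZ(lam**(alpha + 1), nu + alpha) - logZ(lam, nu))
assert np.isclose(lhs, rhs)
```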

Let \(\phi_{1} \left( x \right) = \theta^{x} - g\left( {\theta ;\lambda_{1} , \nu_{1} } \right)\) and \(\phi_{2} \left( y \right) = \theta^{y} - g\left( {\theta ;\lambda_{2} , \nu_{2} } \right)\), where \(0 < \theta < 1\) and \(g\left( {\theta ;\lambda ,\nu } \right)\) is the probability generating function of the CMP distribution given by

$$g\left( {\theta ;\lambda ,\nu } \right) = \frac{{Z\left( {\lambda \theta , \nu } \right)}}{{Z\left( {\lambda ,\nu } \right)}}.$$

The BCMP distribution corresponding to (2.2) has joint pmf defined as follows:

$$P\left( {X = x,Y = y} \right) = P\left( {X = x} \right)P\left( {Y = y} \right)\left\{ {1 + \beta\phi_{1} \left( x \right)\phi_{2} \left( y \right)} \right\},$$
(2.5)

where the CMP marginals \(P\left( {X = x} \right)\) and \(P\left( {Y = y} \right)\) are given by (1.3) with parameters \(\left( {\lambda_{1} , \nu_{1} } \right)\) and \(\left( {\lambda_{2} , \nu_{2} } \right)\), respectively.

If \(\nu_{1} = \nu_{2} = 1\), then (2.5) is the joint pmf of the bivariate Poisson distribution given in Section 6.3 of Lee [19] with \(\theta = e^{ - 1}\).

2.3 Correlation Coefficient

Let \(\mu\) and \(\sigma^{2}\) denote the mean and variance of the CMP distribution. For BCMP distributions (2.4) and (2.5), the variances \(\sigma_{1}^{2} ,\sigma_{2}^{2}\) may be approximated by (1.5).

For BCMP distribution (2.4),

$$\xi_{i} = \sum x\phi_{i} \left( x \right)P\left( x \right) = \sum \left( {x\frac{{e^{{ - \lambda_{i} }} \lambda_{i}^{x} }}{x!} - xP\left( x \right)} \right) = \lambda_{i} - \mu_{i} ,i = 1,2$$

Thus (2.4) has a very simple expression for the correlation coefficient:

$$\rho = \frac{{\beta \left( {\lambda_{1} - \mu_{1} } \right)\left( {\lambda_{2} - \mu_{2} } \right)}}{{\sigma_{1} \sigma_{2} }}$$

For (2.5),

$$\xi_{i} = \theta \frac{{\partial g_{i} \left( \theta \right)}}{\partial \theta } - \mu_{i} g_{i} \left( \theta \right),i = 1,2$$

where \(g_{i} \left( \theta \right) = g\left( {\theta ;\lambda_{i} , \nu_{i} } \right), i = 1, 2\). The correlation coefficient is given by

$$\rho = \frac{{\beta \left( {\theta \frac{{\partial g_{1} \left( \theta \right)}}{\partial \theta } - \mu_{1} g_{1} \left( \theta \right)} \right)\left( {\theta \frac{{\partial g_{2} \left( \theta \right)}}{\partial \theta } - \mu_{2} g_{2} \left( \theta \right)} \right)}}{{\sigma_{1} \sigma_{2} }}$$

For the computation of \(Z\left( {\lambda , \nu } \right)\) given by (1.4) and related quantities such as derivatives, see Section 4 of Gupta et al. [10].
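Putting the pieces together, a compact numerical sketch (Python; series truncations and parameter values are assumptions) evaluates \(\rho\) for (2.5), using exact moments computed from the series rather than the approximations in (1.5).

```python
import numpy as np
from scipy.special import gammaln

def cmp_pmf(lam, nu, x_max=200):
    """Truncated CMP pmf (1.3)."""
    x = np.arange(x_max + 1)
    log_w = x * np.log(lam) - nu * gammaln(x + 1)
    w = np.exp(log_w - log_w.max())
    return w / w.sum()

def corr_bcmp_pgf(lam1, nu1, lam2, nu2, beta, theta=np.exp(-1.0)):
    """Correlation of the BCMP model (2.5): rho = beta*xi_1*xi_2/(sigma_1*sigma_2)
    with xi_i = theta*g_i'(theta) - mu_i*g_i(theta)."""
    factors = []
    for lam, nu in ((lam1, nu1), (lam2, nu2)):
        P = cmp_pmf(lam, nu)
        x = np.arange(P.size)
        mu = np.sum(x * P)
        sigma = np.sqrt(np.sum(x**2 * P) - mu**2)
        g = np.sum(theta**x * P)         # pgf g(theta) = Z(lam*theta, nu)/Z(lam, nu)
        tgp = np.sum(x * theta**x * P)   # theta * dg/dtheta = E[X theta^X]
        factors.append((tgp - mu * g) / sigma)
    return beta * factors[0] * factors[1]

print(corr_bcmp_pgf(0.9, 2.0, 0.7, 3.0, beta=1.0))   # illustrative parameter values
```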

3 Dependence Properties of the BCMP Distribution

The most common measures for determining the relationship between two variables are the Pearson correlation coefficient, Kendall’s tau and Spearman’s rho. As a generalization of Pearson’s correlation coefficient, Bjerve and Doksum [3], Doksum et al. [7] and Blyth [4,5,6] introduced and discussed the correlation curve. The correlation curve is a local measure of the strength of association between the two variables X and Y. The correlation curve \(\rho \left( x \right)\), a function of x, describes how the amount of variance explained by a regression curve varies locally. However, \(\rho \left( x \right)\) does not treat X and Y on an equal footing, but requires Y to be a response and X a predictor variable. The correlation curve is thus a regression concept.

A local dependence function, a function of x and y, should measure the strength and direction of association locally, treating both variables symmetrically. For a bivariate distribution, it is defined as follows:

An \(r \times c\) contingency table with cell probabilities \(p_{i,j}\) specifies the joint distribution of two discrete random variables X and Y as

$$p_{i,j} = P\left( {X = i,Y = j} \right),\;1 \le i \le r,\;1 \le j \le c$$
(3.1)

The two marginal distributions for X and Y are

$$P\left( {X = i} \right) = \mathop \sum \limits_{j = 1}^{c} p_{i,j} = p_{i + } \;{\text{and}}\;P\left( {Y = j} \right) = \mathop \sum \limits_{i = 1}^{r} p_{i,j} = p_{ + j}$$

respectively. Yule and Kendall [30] and Goodman [9] suggested the following set of local cross product ratios

$$\alpha_{i,j} = \frac{{p_{i,j} p_{i + 1,j + 1} }}{{p_{i,j + 1} p_{i + 1,j} }},\;1 \le i \le r - 1,\;1 \le j \le c - 1$$
(3.2)

Equation (3.2) defines the local dependence function. Also, let \(\gamma_{i,j} = \ln \alpha_{i,j}\). Both \(\alpha_{i,j}\) and \(\gamma_{i,j}\) measure the association of the \(2 \times 2\) tables formed from adjacent rows and adjacent columns. It is known that the set \(\left\{ {\alpha_{i,j} } \right\}\), or equivalently \(\left\{ {\gamma_{i,j} } \right\}\), together with the marginal probability distributions uniquely determines the bivariate distribution. For more explanation, see Wang [29] and Holland and Wang [13].

We now present a very important property of the local dependence function. It is stated in terms of the totally positive of order 2 (TP2) and reverse regular of order 2 (RR2) properties defined below.

Definition

A discrete bivariate distribution \(P\left( {X = i,Y = j} \right)\) is said to be TP2 (RR2) if for \(a_{1} < a_{2}\), \(b_{1} < b_{2}\)

$$\left| {\begin{array}{*{20}c} {p(a_{1} ,b_{1} )} & {p(a_{1} ,b_{2} )} \\ {p(a_{2} ,b_{1} )} & {p(a_{2} ,b_{2} )} \\ \end{array} } \right| \ge ( \le )0$$
(3.3)

where \(p\left( {a_{i} ,b_{j} } \right) = P\left( {X = a_{i} ,Y = b_{j} } \right)\). It can be easily verified that the TP2 (RR2) property is equivalent to \(\alpha_{i,j} \ge \left( \le \right)1\), where \(\alpha_{i,j}\) is the local dependence function. This is also equivalent to \(\gamma_{i,j} \ge \left( \le \right)0\).
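For a joint pmf stored as a two-dimensional array with strictly positive entries, both the local dependence function (3.2) and the TP2 check take one line each; a sketch (Python) follows.

```python
import numpy as np

def local_dependence(P):
    """Local cross product ratios (3.2) of a joint pmf P given as a 2-D array
    with strictly positive entries."""
    return (P[:-1, :-1] * P[1:, 1:]) / (P[:-1, 1:] * P[1:, :-1])

def is_tp2(P, tol=1e-12):
    """TP2 holds iff all local cross product ratios are >= 1, cf. (3.3)."""
    return bool(np.all(local_dependence(P) >= 1.0 - tol))
```

Applied to a joint pmf built from the Sarmanov family, this check should agree with the sign conditions on \(\omega\) derived below.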

We now obtain the local dependence function for the Sarmanov family of discrete distributions.

3.1 Local Dependence Function for Sarmanov Family of Discrete Distributions

For this family (writing \(\omega\) for the dependence parameter \(\beta\) in (1.2)),

$$\alpha_{ij} = \frac{{\left[ {1 + \omega \phi_{1} (x)\phi_{2} (y)} \right]\left[ {1 + \omega \phi_{1} (x + 1)\phi_{2} (y + 1)} \right]}}{{\left[ {1 + \omega \phi_{1} (x)\phi_{2} (y + 1)} \right]\left[ {1 + \omega \phi_{1} (x + 1)\phi_{2} (y)} \right]}}$$
(3.4)

So \(\alpha_{i,j} \ge ( \le )1\) is equivalent to

$$\omega \left[ {\phi_{2} \left( {y + 1} \right) - \phi_{2} \left( y \right)} \right]\left[ {\phi_{1} \left( {x + 1} \right) - \phi_{1} \left( x \right)} \right] \ge \left( \le \right)0$$
(3.5)

Thus, the Sarmanov family is TP2 if

(a) \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\) are both increasing or both decreasing, and

(b) \(\omega > 0\).

Similarly, the Sarmanov family is RR2 if

(a) \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\) are both increasing or both decreasing, and

(b) \(\omega < 0\).

Let us now obtain such conditions for the BCMP distribution.

We have

$$\phi_{1} \left( x \right) = \theta^{x} - g_{1} \left( \theta \right),0 < \theta < 1.$$

This gives

$$\begin{aligned} \phi_{1} \left( {x + 1} \right) - \phi_{1} \left( x \right) & = \left[ {\theta^{x + 1} - g_{1} \left( \theta \right)} \right] - \left[ {\theta^{x} - g_{1} \left( \theta \right)} \right] \\ & = \theta^{x} \left( {\theta - 1} \right) < 0,\;0 < \theta < 1 \\ \end{aligned}$$
(3.6)

Hence, \(\phi_{1} \left( x \right)\) is decreasing. Similarly, \(\phi_{2} \left( y \right)\) is decreasing.

Hence, the BCMP distribution (2.5) is TP2 if and only if \(\omega > 0\).

Note that the TP2 condition is the same as the condition for positive dependence notion that Lehmann [18] called positive likelihood ratio dependence. This notion leads naturally to the order described below.

3.2 Positively Likelihood Ratio Dependent Ordering

Let \(\left( {X_{1} ,X_{2} } \right)\) and \(\left( {Y_{1} ,Y_{2} } \right)\) be two bivariate random vectors having the same marginals. Then, we say that \(\left( {X_{1} ,X_{2} } \right)\) is smaller than \(\left( {Y_{1} ,Y_{2} } \right)\) in the positively likelihood ratio dependent (PLRD) order, denoted by

$$\left( {X_{1} ,X_{2} } \right) \le_{\text{PLRD}} \left( {Y_{1} ,Y_{2} } \right)$$

if

$$f\left( {x_{1} ,y_{1} } \right)f\left( {x_{2} ,y_{2} } \right)g\left( {x_{1} ,y_{2} } \right)g\left( {x_{2} ,y_{1} } \right) \le f\left( {x_{1} ,y_{2} } \right)f\left( {x_{2} ,y_{1} } \right)g\left( {x_{1} ,y_{1} } \right)g\left( {x_{2} ,y_{2} } \right), x_{1} \le x_{2} ,y_{1} \le y_{2}$$

where F and G have (continuous or discrete) densities f and g; see [26].

It can be easily seen that

$$\left( {X_{1} ,X_{2} } \right) \le_{PLRD} \left( {Y_{1} ,Y_{2} } \right)$$

is equivalent to

$$\alpha_{{\left( {X_{1} ,X_{2} } \right)}} \le \alpha_{{\left( {Y_{1} ,Y_{2} } \right)}}$$

where \(\alpha_{{\left( {X_{1} ,X_{2} } \right)}}\) and \(\alpha_{{\left( {Y_{1} ,Y_{2} } \right)}}\) are the local dependence functions for \(\left( {X_{1} ,X_{2} } \right)\) and \(\left( {Y_{1} ,Y_{2} } \right)\), respectively.

We shall now compare two Sarmanov families with parameters \(\omega_{1}\) and \(\omega_{2}\).

After tedious algebra, it can be verified that

$$\begin{aligned} & \alpha_{{\left( {X_{1} ,X_{2} } \right)}} - \alpha_{{\left( {Y_{1} ,Y_{2} } \right)}} \\ & \quad = \left( {\omega_{1} - \omega_{2} } \right) \left[ {\phi_{1} \left( {x + 1} \right) - \phi_{1} \left( x \right)} \right]\left[ {\phi_{2} \left( {y + 1} \right) - \phi_{2} \left( y \right)} \right] \times \left[ {1 - \omega_{1} \omega_{2} \phi_{1} \left( x \right)\phi_{2} \left( y \right)\phi_{1} \left( {x + 1} \right)\phi_{2} \left( {y + 1} \right)} \right] \\ \end{aligned}$$

Now assume

(a) \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\) are both increasing or both decreasing, and

(b)
$$\omega_{1} \omega_{2} \phi_{1} \left( x \right)\phi_{2} \left( y \right)\phi_{1} \left( {x + 1} \right)\phi_{2} \left( {y + 1} \right) \le 1$$

Then, \(\alpha _{{\left( {X_{1} ,X_{2} } \right)}} \le \alpha_{{\left( {Y_{1} ,Y_{2} } \right)}}\) if \(\omega_{1} < \omega_{2}\).

The alternative conditions are

(c) \(\phi_{1} \left( x \right)\) and \(\phi_{2} \left( y \right)\) are both increasing or both decreasing, and

(d)
$$\omega_{1} \omega_{2} \phi_{1} \left( x \right)\phi_{2} \left( y \right)\phi_{1} \left( {x + 1} \right)\phi_{2} \left( {y + 1} \right) \ge 1$$

Then, \(\alpha _{{\left( {X_{1} ,X_{2} } \right)}} \le \alpha_{{\left( {Y_{1} ,Y_{2} } \right)}}\) if \(\omega_{1} > \omega_{2}\).

We now investigate the PLRD ordering for the BCMP distribution. In this case,

$$\omega_{1} \phi_{1} \left( x \right)\phi_{2} \left( y \right) \ge - 1\;\;{\text{and}}\;\;\omega_{2} \phi_{1} \left( {x + 1} \right)\phi_{2} \left( {y + 1} \right) \ge - 1.$$

This does not imply that \(\omega_{1} \omega_{2} \phi_{1} \left( x \right)\phi_{2} \left( y \right)\phi_{1} \left( {x + 1} \right)\phi_{2} \left( {y + 1} \right) \ge 1\). Hence, we cannot compare the two vectors of the Sarmanov family according to the PLRD ordering.

Remark. The PLRD ordering is difficult to check in most cases. By contrast, finding families of distributions that are not ordered by this relation is relatively easy. For more explanation and comments, see [8].

4 Statistical Analysis

In this section, we examine statistical inference for the BCMP distribution given by (2.5).

4.1 Parameter Estimation

The method of moments estimation of the parameters proceeds as follows. The marginal parameters \(\left( {\lambda ,\nu } \right)\) are estimated by equating the first and second marginal sample moments to the approximations in (1.5). The estimate for \(\beta\) is then obtained from (2.3) by equating it with the sample correlation coefficient.

For maximum likelihood estimation (MLE), the simulated annealing (SA) algorithm is used to determine the estimates corresponding to the global optimum. SA is a popular algorithm for locating the global extremum of a non-smooth function with a large number of local extrema. Henderson et al. [11] have discussed the convergence of SA and presented practical guidelines for implementing the algorithm, especially the choice of its control parameters, to ensure good performance.

The log-likelihood function of the model is given by

$$\ln L = \sum\limits_{i = 1}^{n} {\left[ {x_{i} \ln \lambda_{1} + y_{i} \ln \lambda_{2} + \ln \left\{ {1 + \beta \phi_{1} \left( {x_{i} } \right)\phi_{2} \left( {y_{i} } \right)} \right\} - \nu_{1} \left( {\ln x_{i} !} \right) - \nu_{2} \left( {\ln y_{i} !} \right) - \ln Z_{1} - \ln Z_{2} } \right]}$$
(4.1)

where \(Z_{1} = Z(\lambda_{1} ,\,\nu_{1} ),\)\(Z_{2} = Z(\lambda_{2} ,\,\nu_{2} )\). For the following sections, we consider \(\theta = e^{ - 1}\).
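As a hedged illustration of the MLE step, the sketch below (Python) maximizes (4.1) using SciPy's dual_annealing as a stand-in for the SA algorithm discussed above; the series truncation, the parameter bounds and the data arrays xs, ys are assumptions made for the sake of the example.

```python
import numpy as np
from scipy.optimize import dual_annealing
from scipy.special import gammaln

THETA = np.exp(-1.0)  # theta fixed at e^{-1}, as in the text

def logZ(lam, nu, j_max=300):
    """Log of Z (1.4), truncated at an assumed j_max."""
    j = np.arange(j_max + 1)
    t = j * np.log(lam) - nu * gammaln(j + 1)
    m = t.max()
    return m + np.log(np.sum(np.exp(t - m)))

def phi(t, lam, nu):
    """Mixing function theta^t - g(theta; lam, nu), with g = Z(lam*theta, nu)/Z(lam, nu)."""
    g = np.exp(logZ(lam * THETA, nu) - logZ(lam, nu))
    return THETA**t - g

def neg_loglik(par, xs, ys):
    """Negative log-likelihood (4.1) of the BCMP model (2.5)."""
    lam1, nu1, lam2, nu2, beta = par
    mix = 1.0 + beta * phi(xs, lam1, nu1) * phi(ys, lam2, nu2)
    if np.any(mix <= 0.0):
        return np.inf  # outside the valid parameter region
    ll = (xs * np.log(lam1) + ys * np.log(lam2) + np.log(mix)
          - nu1 * gammaln(xs + 1) - nu2 * gammaln(ys + 1)
          - logZ(lam1, nu1) - logZ(lam2, nu2))
    return -np.sum(ll)

# xs, ys: paired count data as NumPy arrays; the bounds below are illustrative.
# bounds = [(0.01, 10.0), (0.01, 5.0), (0.01, 10.0), (0.01, 5.0), (-1.0, 1.0)]
# result = dual_annealing(neg_loglik, bounds, args=(xs, ys), seed=1)
# print(result.x, -result.fun)   # ML estimates and maximized log-likelihood
```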

4.2 Test of Hypotheses

In this subsection, the tests for the two hypotheses of interest, that is, test of independence and test of adequacy of the proposed BCMP distribution are discussed. To compare the null model (restricted model) against the alternative model (unrestricted model), the score and the likelihood ratio (LR) tests are chosen and their test statistics are summarized as follows.

Let \(H_{0}\): \(\theta = \theta^{*}\) versus \(H_{1}\): \(\theta \ne \theta^{*}\), where \(\theta\) here denotes the parameter vector. The score test statistic is \(T = S^{'} V^{ - 1} S\) where

$$S^{'} = \left. {\frac{\partial }{\partial \theta }\ln L\left( \theta \right)} \right|_{{\theta = \hat{\theta }^{*} }}$$

is the score function and

$$V = \left. {E\left[ { - \frac{{\partial^{2} \ln L(\theta )}}{{\partial \theta \,\partial \theta^{\prime} }}} \right]} \right|_{{\theta = \hat{\theta }^{*} }}$$

is the information matrix.

The test statistic based on the LR test is defined as \(\text{LR} = - 2\ln \left[ {L\left( {\hat{\theta }^{*} } \right)/L\left( {\hat{\theta }} \right)} \right]\), where \(\hat{\theta }^{*}\) and \(\hat{\theta }\) are the restricted and unrestricted maximum likelihood estimates.
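Given the restricted and unrestricted maximized log-likelihoods (for example, from SA fits as sketched in Sect. 4.1), the LR statistic and its chi-square p-value take only a few lines; a sketch (Python) follows.

```python
from scipy.stats import chi2

def lr_test(loglik_restricted, loglik_full, df):
    """LR = -2 ln[L(theta_hat*) / L(theta_hat)], approximately chi-square(df)."""
    lr = -2.0 * (loglik_restricted - loglik_full)
    return lr, chi2.sf(lr, df)

# Test of independence (beta = 0): df = 1.  Test of nu_1 = nu_2 = 1: df = 2.
# lr, pval = lr_test(ll_restricted, ll_unrestricted, df=1)
```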

4.2.1 Test of Independence

The random variables X and Y are independent if \(\beta = 0\). The hypotheses to be tested are \(H_{0} :\beta = 0\) against \(H_{1} :\beta \ne 0\).

The score functions evaluated under the null hypothesis are needed; they are easily obtained from the log-likelihood in (4.1).

$$\begin{aligned} \left. {\frac{\partial \ln L}{\partial \beta }} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} {\phi_{1} \left( {x_{i} } \right)} \phi_{2} \left( {y_{i} } \right) \\ \left. {\frac{\partial \ln L}{{\partial \lambda_{1} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} {\frac{{x_{i} }}{{\lambda_{1} }} - \frac{1}{{Z_{1} }}\left( {\frac{{\partial Z_{1} }}{{\partial \lambda_{1} }}} \right)} ,\;\;\left. {\frac{\partial \ln L}{{\partial \lambda_{2} }}} \right|_{\beta = 0} = \sum\limits_{i = 1}^{n} {\frac{{y_{i} }}{{\lambda_{2} }} - \frac{1}{{Z_{2} }}\left( {\frac{{\partial Z_{2} }}{{\partial \lambda_{2} }}} \right)} \\ \left. {\frac{\partial \ln L}{{\partial \nu_{1} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} { - \ln \left( {x_{i} !} \right) - \frac{1}{{Z_{1} }}\left( {\frac{{\partial Z_{1} }}{{\partial \nu_{1} }}} \right)} ,\;\;\left. {\frac{\partial \ln L}{{\partial \nu_{2} }}} \right|_{\beta = 0} = \sum\limits_{i = 1}^{n} { - \ln \left( {y_{i} !} \right) - \frac{1}{{Z_{2} }}\left( {\frac{{\partial Z_{2} }}{{\partial \nu_{2} }}} \right)} \\ \end{aligned}$$

The elements of the information matrix corresponding to \(\beta = 0\) are

$$\begin{aligned} \left. {\frac{{\partial^{2} \ln L}}{{\partial \beta^{2} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} { - \left( {\phi_{1} \left( {x_{i} } \right)\phi_{2} \left( {y_{i} } \right)} \right)^{2} } \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \beta \partial \lambda_{1} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} {\phi_{2} \left( {y_{i} } \right)\left( {\frac{{\partial \phi_{1} \left( {x_{i} } \right)}}{{\partial \lambda_{1} }}} \right)} \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \beta \partial \lambda_{2} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} {\phi_{1} \left( {x_{i} } \right)\left( {\frac{{\partial \phi_{2} \left( {y_{i} } \right)}}{{\partial \lambda_{2} }}} \right)} \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \beta \partial \nu_{1} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} {\phi_{2} \left( {y_{i} } \right)\left( {\frac{{\partial \phi_{1} \left( {x_{i} } \right)}}{{\partial \nu_{1} }}} \right)} \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \beta \partial \nu_{2} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} {\phi_{1} \left( {x_{i} } \right)\left( {\frac{{\partial \phi_{2} \left( {y_{i} } \right)}}{{\partial \nu_{2} }}} \right)} \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \lambda_{1}^{2} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} { - \frac{{x_{i} }}{{\lambda_{1}^{2} }} + \left( {\frac{1}{{Z_{1}^{{}} }}\frac{{\partial Z_{1} }}{{\partial \lambda_{1} }}} \right)}^{2} - \frac{1}{{Z_{1}^{{}} }}\left( {\frac{{\partial^{2} Z_{1} }}{{\partial \lambda_{1}^{2} }}} \right) \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \lambda_{2}^{2} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} { - \frac{{y_{i} }}{{\lambda_{2}^{2} }} + \left( {\frac{1}{{Z_{2}^{{}} }}\frac{{\partial Z_{2} }}{{\partial \lambda_{2} }}} \right)}^{2} - \frac{1}{{Z_{2}^{{}} }}\left( {\frac{{\partial^{2} Z_{2} }}{{\partial \lambda_{2}^{2} }}} \right) \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \nu_{j}^{2} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} {\frac{1}{{Z_{j}^{2} }}\left[ {\left( {\frac{{\partial Z_{j} }}{{\partial \nu_{j} }}} \right)^{2} - Z_{j}^{{}} \frac{{\partial^{2} Z_{j} }}{{\partial \nu_{j}^{2} }}} \right]} , \, j = 1,2 \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \lambda_{j} \partial \nu_{j} }}} \right|_{\beta = 0} & = \sum\limits_{i = 1}^{n} {\frac{1}{{Z_{j}^{2} }}\left( {\frac{{\partial Z_{j} }}{{\partial \lambda_{j} }}\frac{{\partial Z_{j} }}{{\partial \nu_{j} }} - Z_{j}^{{}} \frac{{\partial^{2} Z_{j} }}{{\partial \lambda_{j} \partial \nu_{j} }}} \right)} , \, j = 1,2 \\ \left. {\frac{{\partial^{2} \ln L}}{{\partial \lambda_{1} \partial \lambda_{2} }}} \right|_{\beta = 0} & = \left. {\frac{{\partial^{2} \ln L}}{{\partial \lambda_{1} \partial \nu_{2} }}} \right|_{\beta = 0} = \left. {\frac{{\partial^{2} \ln L}}{{\partial \lambda_{2} \partial \nu_{1} }}} \right|_{\beta = 0} = \left. {\frac{{\partial^{2} \ln L}}{{\partial \nu_{1} \partial \nu_{2} }}} \right|_{\beta = 0} = 0 \\ \end{aligned}$$

Both the score and LR test statistics have an approximate chi-square distribution with 1 degree of freedom.

4.2.2 Test of BCMP

To test whether the bivariate Poisson distribution is adequate against the BCMP alternative, the hypotheses are

$$H_{0} :\nu_{1} = \nu_{2} = 1\;{\text{versus}}\;H_{1} :\;H_{0} \;{\text{is}}\;{\text{not}}\;{\text{true}}.$$

From Eqs. (2.5) and (4.1), the score functions are found to be

$$\begin{aligned} \left. {\frac{\partial \ln L}{\partial \beta }} \right|_{{\nu_{1} = \nu_{2} = 1}} & = \sum\limits_{i = 1}^{n} {\frac{{a_{1} a_{2} }}{f}} \\ \left. {\frac{\partial \ln L}{{\partial \lambda_{1} }}} \right|_{{\nu_{1} = \nu_{2} = 1}} & = \sum\limits_{i = 1}^{n} { - 1 + \frac{{x_{i} }}{{\lambda_{1} }} + \frac{{\beta a_{1} e^{{ - \lambda_{1} + \lambda_{1} \theta }} \left( {1 - \theta } \right)}}{f}} \\ \left. {\frac{\partial \ln L}{{\partial \lambda_{2} }}} \right|_{{\nu_{1} = \nu_{2} = 1}} & = \sum\limits_{i = 1}^{n} { - 1 + \frac{{y_{i} }}{{\lambda_{2} }} + \frac{{\beta a_{2} e^{{ - \lambda_{2} + \lambda_{2} \theta }} \left( {1 - \theta } \right)}}{f}} \\ \left. {\frac{\partial \ln L}{{\partial \nu_{1} }}} \right|_{{\nu_{1} = \nu_{2} = 1}} & = \sum\limits_{i = 1}^{n} { - \ln \left( {x_{i} !} \right) - e^{{ - \lambda_{1} }} \left( {\frac{{\partial Z_{1} }}{{\partial \nu_{1} }}} \right)} + \frac{{\beta a_{1} }}{f}\left[ {e^{{ - 2\lambda_{1} + \lambda_{1} \theta }} \frac{{\partial Z_{1} }}{{\partial \nu_{1} }} - e^{{ - \lambda_{1} }} \frac{{\partial Z_{1}^{*} }}{{\partial \nu_{1} }}} \right] \\ \left. {\frac{\partial \ln L}{{\partial \nu_{2} }}} \right|_{{\nu_{1} = \nu_{2} = 1}} & = \sum\limits_{i = 1}^{n} { - \ln \left( {y_{i} !} \right) - e^{{ - \lambda_{2} }} \frac{{\partial Z_{2} }}{{\partial \nu_{2} }}} + \frac{{\beta a_{2} }}{f}\left[ {e^{{ - 2\lambda_{2} + \lambda_{2} \theta }} \frac{{\partial Z_{2} }}{{\partial \nu_{2} }} - e^{{ - \lambda_{2} }} \frac{{\partial Z_{2}^{*} }}{{\partial \nu_{2} }}} \right] \\ \end{aligned}$$

where \(a_{1} = \left( { - e^{{ - \lambda_{2} + \lambda_{2} \theta }} + \theta^{{y_{i} }} } \right)\), \(a_{2} = \left( { - e^{{ - \lambda_{1} + \lambda_{1} \theta }} + \theta^{{x_{i} }} } \right)\), \(f = 1 + \beta a_{1} a_{2}\), \(Z_{1}^{*} = Z(\theta \lambda_{1} ,\,\nu_{1} )\) and \(Z_{2}^{*} = Z(\theta \lambda_{2} ,\,\nu_{2} )\).

The elements of information matrix evaluated under the null hypothesis are

$$\begin{aligned} \left. {\frac{{\partial ^{2} \ln L}}{{\partial \beta ^{2} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} { - \left( {\frac{{a_{1} a_{2} }}{f}} \right)} ^{2} \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \beta \partial \lambda _{j} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} { - \frac{{b_{j} }}{g^{2}}\left( {e^{{\lambda _{1} + \lambda _{2} + \lambda _{j} \theta }} } \right)\left( {\theta - 1} \right)} ,{\text{ }}j = 1,2 \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \beta \partial \nu _{1} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} {\frac{{\left( {e^{{\lambda _{2} }} b_{1} } \right)}}{g^{2}}\left( {e^{{\lambda _{1} \theta }} \frac{{\partial Z_{1} }}{{\partial \nu _{1} }} - e^{{\lambda _{1} }} \frac{{\partial Z_{1}^{*} }}{{\partial \nu _{1} }}} \right)} \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \beta \partial \nu _{2} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} {\frac{{\left( {e^{{\lambda _{1} }} b_{2} } \right)}}{g^{2}}\left( {e^{{\lambda _{2} \theta }} \frac{{\partial Z_{2} }}{{\partial \nu _{2} }} - e^{{\lambda _{2} }} \frac{{\partial Z_{2}^{*} }}{{\partial \nu _{2} }}} \right)} \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \lambda _{1} ^{2} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} { - \frac{{x_{i} }}{{\lambda _{1}^{2} }} - \frac{{\beta b_{1} e^{{\lambda _{1} \theta }} }}{g}\left( {\theta - 1} \right)^{2} \left( {\frac{{\beta b_{1} e^{{\lambda _{1} \theta }} }}{g} + 1} \right)} \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \lambda _{1} \partial \lambda _{2} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} {\left( {\frac{{\theta - 1}}{g}} \right)^{2} } \beta e^{{(\lambda _{1} + \lambda _{2} )(1 + \theta )}} \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \lambda _{2} ^{2} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} { - \frac{{y_{i} }}{{\lambda _{2}^{2} }} - \frac{{\beta b_{2} e^{{\lambda _{2} \theta }} \left( {\theta - 1} \right)^{2} }}{g}\left( {\frac{{\beta b_{2} e^{{\lambda _{2} \theta }} }}{g} + 1} \right)} \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \lambda _{j} \partial \nu _{j} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} \begin{gathered} e^{{ - \lambda _{j} }} \left( {\frac{{\partial Z_{j} }}{{\partial \nu _{j} }} - \frac{{\partial ^{2} Z_{j} }}{{\partial \lambda _{j} \partial \nu _{j} }}} \right) + \left( {\frac{{\beta b_{j} }}{g}} \right)^{2} \left( {e^{{\lambda _{j} \left( {\theta - 1} \right)}} \left( {\theta - 1} \right)\left( {e^{{\lambda _{j} \theta }} \frac{{\partial Z_{j} }}{{\partial \nu _{j} }} - e^{{\lambda _{j} }} \frac{{\partial Z_{j}^{*} }}{{\partial \nu _{j} }}} \right)} \right) \hfill \\ + \frac{1}{g}\left[ {\beta b_{j} e^{{ - \lambda _{j} }} \left( {e^{{\lambda _{j} \theta }} \frac{{\partial ^{2} Z_{j} }}{{\partial \lambda _{j} \partial \nu _{j} }} + e^{{\lambda _{j} \theta }} \left( {\theta - 2} \right)\frac{{\partial Z_{j} }}{{\partial \nu _{j} }} + e^{{\lambda _{j} }} \left( {\frac{{\partial Z_{j}^{*} }}{{\partial \nu _{j} }} - \frac{{\partial ^{2} Z_{j}^{*} }}{{\partial \lambda _{j} \partial \nu _{j} }}} \right)} \right)} \right] \hfill \\ \end{gathered} ,{\text{ }}j = 1,2 \\ \left. 
{\frac{{\partial ^{2} \ln L}}{{\partial \lambda _{1} \partial \nu _{2} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} {\frac{1}{{g^{2} }}\left( {\beta e^{{\lambda _{1} + \lambda _{1} \theta }} \left( {\theta - 1} \right)\left( { - e^{\lambda _{2} \theta }\frac{{\partial Z_{2} }}{{\partial \nu _{2} }} + e^{\lambda _{2}} \frac{{\partial Z_{2}^{*} }}{{\partial \nu _{2} }}} \right)} \right)} \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \lambda _{2} \partial \nu _{1} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} {\frac{1}{{g^{2} }}\left( {\beta e^{{\lambda _{2} + \lambda _{2} \theta }} \left( {\theta - 1} \right)\left( { - e^{{\lambda _{1} \theta }} \frac{{\partial Z_{1} }}{{\partial \nu _{1} }} + e^{{\lambda _{1} }} \frac{{\partial Z_{{_{1} }}^{*} }}{{\partial \nu _{1} }}} \right)} \right)} \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \nu _{j} ^{2} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} {e^{{ - 4\lambda _{j} }} \left\{ \begin{aligned} & e^{{2\lambda _{j} }} \left( {\frac{{\partial Z_{j} }}{{\partial \nu _{j} }}} \right)^{2} - e^{{3\lambda _{j} }} \frac{{\partial ^{2} Z_{j} }}{{\partial \nu _{j}^{2} }} - \left( {\frac{{\beta a_{j} }}{f}\left( {e^{{\lambda _{j} \theta }} \frac{{\partial Z_{j} }}{{\partial \nu _{j} }} - e^{{\lambda _{j} }} \frac{{\partial Z_{j}^{*} }}{{\partial \nu _{j} }}} \right)} \right)^{2} \\ & + \frac{{\beta a_{j} e^{{\lambda _{j} }} }}{f}\left( \begin{gathered} - 2e^{{\lambda _{j} \theta }} \left( {\frac{{\partial Z_{j} }}{{\partial \nu _{j} }}} \right)^{2} + e^{{\lambda _{j} + \lambda _{j} \theta }} \frac{{\partial ^{2} Z_{j} }}{{\partial \nu _{j}^{2} }} + \hfill \\ 2e^{{\lambda _{j} }} \left( {\frac{{\partial Z_{j} }}{{\partial \nu _{j} }}\frac{{\partial Z_{j}^{*} }}{{\partial \nu _{j} }}} \right) - e^{{2\lambda _{j} }} \frac{{\partial ^{2} Z_{j}^{*} }}{{\partial \nu _{j}^{2} }} \hfill \\ \end{gathered} \right) \\ \end{aligned} \right\}} ,{\text{ }}j = 1,2 \\ \left. {\frac{{\partial ^{2} \ln L}}{{\partial \nu _{1} \partial \nu _{2} }}} \right|_{{\nu _{1} = \nu _{2} = 1}} & = \sum\limits_{{i = 1}}^{n} {\frac{\beta }{{g^{2} }}\left( {e^{{\lambda _{1} \theta }} \frac{{\partial Z_{1} }}{{\partial \nu _{1} }} - e^{{\lambda _{1} }} \frac{{\partial Z_{1}^{*} }}{{\partial \nu _{1} }}} \right)\left( {e^{{\lambda _{2} \theta }} \frac{{\partial Z_{2} }}{{\partial \nu _{2} }} - e^{{\lambda _{2} }} \frac{{\partial Z_{2}^{*} }}{{\partial \nu _{2} }}} \right)} \\ \end{aligned}$$

where \(b_{1} = e^{{\lambda_{2} }} \theta^{{y_{i} }} - e^{{\lambda_{2} \theta }}\), \(b_{2} = e^{{\lambda_{1} }} \theta^{{x_{i} }} - e^{{\lambda_{1} \theta }}\) and g = \(\beta e^{{\left( {\lambda_{1} + \lambda_{2} } \right)\theta }} - \beta e^{{\lambda_{1} + \lambda_{2} \theta }} \theta^{{x_{i} }} - \beta e^{{\lambda_{2} + \lambda_{1} \theta }} \theta^{{y_{i} }} + e^{{\lambda_{1} + \lambda_{2} }} \left( {1 + \beta \theta^{{x_{i} + y_{i} }} } \right)\).

Both the score and LR test statistics have an approximate chi-square distribution with 2 degrees of freedom.

4.3 Monte Carlo Simulation Study

The performance of the proposed score and LR tests is compared in order to assess their efficiency. A simulation study of 1000 replications has been carried out for small (n = 100), medium (n = 500) and large (n = 1000) sample sizes for the bivariate discrete distributions based on probability generating functions. The nominal significance level \(\alpha\) is taken as 5% and 10%. For the test of independence, different values of \(\beta\) ranging from − 1 to 1 are considered with a few combinations of \(\nu_{1}\) and \(\nu_{2}\). Tables 1 and 2 display, respectively, the simulated power of the tests under the test of independence and the test of adequacy of the BCMP distribution.

Table 1 Simulated power of Score and LR tests for test of independence (\(\lambda_{1} = 0.9,\nu_{1} = 2,\lambda_{2} = 0.7,\nu_{2} = 3\))
Table 2 Simulated power of Score and LR tests for test of adequacy of BCMP (\(\lambda_{1} = 0.9,\nu_{1} = 2,\lambda_{2} = 0.7,\nu_{2} = 3\))

From Table 1, the score test performs better than the LR test in maintaining the nominal significance level of 5% (\(\beta\) = 0) for small sample sizes, but the reverse holds at the nominal level of 10%. Detection is weak when \(\beta\) is 0.5 away from zero and the sample size is small; the power improves greatly as the sample size increases. When \(\beta\) is 0.8 or 1.0 away from zero, the score test outperforms the LR test for small sample sizes, but the powers are very close to each other as the sample size increases.

The powers of the proposed tests increase when \(\beta\) diverges further from the value 0 and when the sample size increases. Moreover, the powers of the two tests are very close to each other when the sample size increases to 500 and 1000, regardless of the value of \(\beta\) and the nominal level.

As shown in Table 2, the score test outperforms the LR test in maintaining the nominal levels of 5% and 10% regardless of the sample size; the LR test overestimates the nominal significance level for small sample sizes. In addition, when the parameters \(\nu_{1}\) and \(\nu_{2}\) are both smaller than 1.0, the score and LR tests are comparable to each other. However, if both parameters are larger than 1.0, the LR test achieves higher power than the score test. If \(\nu_{1}\) is fixed at 1.0, the LR test outperforms the score test when \(\nu_{2}\) is larger than 1.0, and the reverse holds when \(\nu_{2}\) is smaller than 1.0. The same pattern applies when \(\nu_{2}\) is fixed at 1.0 and \(\nu_{1}\) is set larger or smaller than 1.0.

Overall, both the score and LR tests are powerful when the sample size is 500 or larger, as almost 96% detection can be achieved.

5 Example of Application to Real Data

As an illustration of application, we consider the data set of the number of accidents sustained by 122 experienced shunters over 2 successive periods of time [1], which was also used by Sellers et al. [24].

In this section, all the summation series involved are computed by recursion with double-precision accuracy, and a truncation approach is applied to the normalizing constant \(Z\left( {\lambda ,\nu } \right)\), subject to \(Z\left( {\lambda ,\nu } \right) \le 1 \times 10^{200}\). To compute the correlation coefficient \(\rho = \frac{{\beta \xi_{1} \xi_{2} }}{{\sigma_{1} \sigma_{2} }}\), the means \(\mu_{1}\), \(\mu_{2}\) and variances \(\sigma_{1}^{2} ,\sigma_{2}^{2}\) of the marginals \(P\left( {X = x} \right)\) and \(P\left( {Y = y} \right)\) are required. They are computed by the following exact expressions because the approximations in (1.5) hold only when \(\lambda > 10^{\nu }\); this condition must be checked even when \(\nu\) is close to zero (for example, when \(\nu = 0.01\), we must ensure that \(\lambda > 10^{\nu } = 1.0233\)).

$$\begin{aligned} & \mu_{i} = \lambda_{i} \frac{{\partial \ln Z\left( {\lambda_{i} ,\nu_{i} } \right)}}{{\partial \lambda_{i} }} = \lambda_{i} \sum\limits_{j} {\frac{{j\lambda_{i}^{j - 1} }}{{(j!)^{{\nu_{i} }} }}} /\sum\limits_{j} {\frac{{\lambda_{i}^{j} }}{{(j!)^{{\nu_{i} }} }}} ,\quad i = 1,2 \\ & \sigma_{i}^{2} = \lambda_{i} \frac{{\partial \mu_{i} }}{{\partial \lambda_{i} }} = \mu_{i} + \lambda_{i}^{2} \left[ { - \left( {\sum\limits_{j} {\frac{{j\lambda_{i}^{j - 1} }}{{(j!)^{{\nu_{i} }} }}} } \right)^{2} + \left( {\sum\limits_{j} {\frac{{(j - 1)j\lambda_{i}^{j - 2} }}{{(j!)^{{\nu_{i} }} }}} } \right)\left( {\sum\limits_{j} {\frac{{\lambda_{i}^{j} }}{{(j!)^{{\nu_{i} }} }}} } \right)} \right]\left( {\sum\limits_{j} {\frac{{\lambda_{i}^{j} }}{{(j!)^{{\nu_{i} }} }}} } \right)^{ - 2} ,\quad i = 1,2 \\ \end{aligned}$$
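A compact numerical sketch (Python; the truncation point j_max is an assumption) computes the same exact moments through \(E\left[ X \right]\) and \(E\left[ {X^{2} } \right]\) of the truncated series, which is algebraically equivalent to the derivative expressions above.

```python
import numpy as np
from scipy.special import gammaln

def cmp_moments(lam, nu, j_max=300):
    """Exact CMP mean and variance from the truncated series (1.4);
    the common normalizing factor cancels in the ratios."""
    j = np.arange(j_max + 1)
    log_t = j * np.log(lam) - nu * gammaln(j + 1)   # log of lambda^j / (j!)^nu
    t = np.exp(log_t - log_t.max())                 # stable up to a factor that cancels
    mu = np.sum(j * t) / t.sum()
    var = np.sum(j**2 * t) / t.sum() - mu**2
    return mu, var

print(cmp_moments(1.5, 1.0))   # nu = 1 recovers the Poisson mean and variance (both 1.5)
```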

For this data set [1], the sample moments are:

$$\begin{aligned} & {\text{Sample}}\;{\text{means:}}\;\;\hat{\mu}_{1} = 0.9754,\;\hat{\mu}_{2} = 1.2705 \\ & {\text{Sample}}\;{\text{variances:}}\;\;\hat{\sigma}_{1}^{2} = 1.2969,\;\hat{\sigma}_{2}^{2} = 1.6535 \\ & {\text{Sample}}\;{\text{correlation:}}\;\;\hat{\rho} = 0.2585. \\ \end{aligned}$$

For comparison, this data set has also been fitted by the bivariate negative binomial (BNB) distribution; see Table 3.

Table 3 Observed and expected number of accidents sustained by 122 experienced shunters over 2 successive periods of time

To confirm that the MLEs for the data in Table 3 really give a maximum, some likelihood function values were computed at points around the ML estimates. Based on these values, the ML estimates do correspond to the maximum (Table 4).

Table 4 Likelihood function values for points around the ML estimates

From the summary statistics presented in Table 5, the fits by both proposed BCMP distributions are significantly better than that of the BNB based upon the \(\chi^{2}\) values and the log-likelihood values. For this data set, Sellers et al. [24] gave a log-likelihood value of − 341.704 for their BCMP distribution, which is about the same as that of the BNB. Thus, based on the log-likelihood values, the proposed BCMP distributions fit better than the BCMP distribution of Sellers et al. [24].

Table 5 Summary statistics

6 Concluding Remarks

BCMP distributions with CMP marginal distributions and range of correlation coefficient over (− 1, 1) have been proposed. They are based on the Sarmanov family of bivariate distributions, which is a simple and elegant approach to constructing bivariate distributions. A general method is also proposed for constructing bivariate generalizations of weighted distributions. The univariate CMP distribution is very popular in applications since it has the flexibility to analyze data that exhibit under- or over-dispersion. The BCMP distribution proposed by Sellers et al. [24] does not have CMP marginal distributions. It is shown in this article that the proposed BCMP distributions fit much better than the BCMP of Sellers et al. [24]. Thus, the proposed BCMP distributions will be of great utility to data analysts.