Abstract
Copulas are multivariate distribution functions whose margins are uniformly distributed. They are therefore useful for modeling several types of data, as they allow different dependence patterns. Numerous new classes of copulas have been suggested in the literature, each with characteristics that make it compatible with certain types of data. In this paper, we introduce a new family of Archimedean copulas. The multiplicative Archimedean generator of this copula is the inverse of the probability generating function of a truncated-Poisson distribution. The properties of this copula are studied in detail. Three applications are provided to compare this copula with well-known ones.
1 Introduction
Copulas are multivariate distribution functions with uniform marginal distributions. The value of copulas lies in two essential properties. First, copulas are one of the easiest ways to derive multivariate distribution functions. Second, they allow various dependence structures. The word copula first appeared in statistics in 1959, in Sklar [54]. Nevertheless, Hoeffding [26, 27] contributed a great deal to the idea of copulas. He discussed some nonparametric association measures of probability distributions that are invariant under monotone transformations of the marginals. He also illustrated the benefits of a scale-invariant correlation when comparing two multivariate distributions. Moreover, he obtained bounds for standardized distributions, which are now known as the Fréchet-Hoeffding upper and lower bounds: he showed that if \(H\left( x_1,\ldots ,x_n\right) \) is an n-variate distribution function with univariate marginals \(\{G_i,i=1,\ldots ,n\}\) defined on [0,1], then \(\max \{\sum _{i=1}^{n}G_i-n+1,0\}\le H\left( x_1,\ldots ,x_n\right) \le \min \{G_i,i=1,\ldots ,n\}\). Later, the work of Hoeffding was translated by Fisher and Sen [17] under the name “The Collected Works of Wassily Hoeffding.” Similar results were obtained by Fréchet [19] and Dall’Aglio [13]. The books of Nelsen [50], Trivedi [56] and Durante [16], among others, can be viewed as basic references for copulas. Kolev et al. [35] published a review paper of copula works.
A definition of a bivariate copula as stated in Nelsen [50] is
Definition 1
A two-dimensional copula is a function C from \(I^2=[0,1]^2\) to \(I=[0,1]\) with the following properties:
1. For every u, v in I,
$$\begin{aligned} C(u,0)=0=C(0,v) \end{aligned}$$
and
$$\begin{aligned} C(u,1)=u \quad \text {and}\quad C(1,v)=v. \end{aligned}$$
2. For every \(u_1, u_2, v_1, v_2\) in I such that \(u_1\le u_2\) and \(v_1\le v_2\),
$$\begin{aligned} C(u_2,v_2 )-C(u_2,v_1 )-C(u_1,v_2 )+C(u_1,v_1 )\ge 0. \end{aligned}$$
Accordingly, copulas are distribution functions on \(I^2\) with uniform marginals. Thus, the Fréchet-Hoeffding upper and lower bounds apply to any copula C, i.e., for every u, v in I,
$$\begin{aligned} W(u,v)=\max \{u+v-1,0\}\le C(u,v)\le \min \{u,v\}=M(u,v). \end{aligned}$$
Note that in the bivariate case both the upper and lower bounds satisfy Definition 1; thus, W and M are copulas. (For the proof, see Nelsen [50].)
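The bounds can be checked numerically. The following is a minimal Python sketch, assuming the standard bivariate forms \(W(u,v)=\max \{u+v-1,0\}\) and \(M(u,v)=\min \{u,v\}\); the function names, the grid size and the choice of the product copula as a test case are illustrative and not part of the original text.

```python
def W(u, v):
    """Frechet-Hoeffding lower bound."""
    return max(u + v - 1.0, 0.0)

def M(u, v):
    """Frechet-Hoeffding upper bound."""
    return min(u, v)

def Pi(u, v):
    """Product (independence) copula, used here only as a test case."""
    return u * v

def within_bounds(C, n=50, tol=1e-12):
    """Check W <= C <= M on an (n+1) x (n+1) grid of [0,1]^2."""
    grid = [i / n for i in range(n + 1)]
    return all(W(u, v) - tol <= C(u, v) <= M(u, v) + tol
               for u in grid for v in grid)

print(within_bounds(Pi))  # True
```

Any other candidate function C(u, v) can be passed to `within_bounds` in the same way; a grid check of this kind is a necessary condition only and does not prove that C is a copula.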
The construction of copulas is a long-standing problem that remains an important topic. Plackett [53] and Ali-Mikhail-Haq [3] derived their copulas from algebraic characteristics. The members of the Plackett family are obtained from joint distributions with a constant global cross-ratio. Ali-Mikhail-Haq [3] focused on bivariate distributions whose survival odds ratios satisfy a determined relation. Marshall and Olkin [41, 42] and Hougaard [28] used mixtures and compound distributions to generate copulas. Hougaard [28] suggested the combined approach, in which the marginals are modeled by standard Cox models and the dependence is modeled by copulas. He used a gamma frailty model to construct a copula that is now known as the Gumbel–Hougaard copula. Marshall and Olkin [41] defined a bivariate exponential distribution with exponential marginals as \(Y_1 =\min \{X_1, X_3\}\) and \(Y_2 =\min \{X_2, X_3\}\), where \(\{X_i,i=1,2,3\}\) are independent exponential random variables. This distribution gives rise to the Marshall–Olkin family of copulas. Additionally, Marshall and Olkin [42] used inverse Laplace transformations to generate a special class of copulas (the so-called Archimedean copulas). Genest and MacKay [22] introduced Archimedean copulas in the form \(C(u,v)=\varphi (\varphi ^{-1}(u)+\varphi ^{-1}(v))\), where \(\varphi \) is a real-valued function that satisfies certain conditions. The function \(\varphi \) is known as the additive Archimedean generator. Most of the famous copulas belong to the Archimedean class; for example, the Clayton family, which first appeared in Clayton [12], the family of Ali, Mikhail and Haq (AMH) [3], and the Frank family, introduced by Frank [18], are all Archimedean copulas. An exhaustive list of Archimedean copulas can be found in Hutchinson and Lai [29]. Joe [30] presented a copula generated by the inversion method.
These and other methods are discussed in the third chapter of Joe [31]. Kim and Sungur [33] defined a method to generate a new copula from an existing one by adding a multiplier of two real functions; they obtained the properties of these functions and some dependence measures. Durante et al. [15] gave a generalization of the Archimedean class of bivariate copulas generated by two functions satisfying specific criteria. Najjari et al. [49] introduced a new family of Archimedean copulas using the hyperbolic cotangent function as a generator. They studied its properties and gave closed forms for Kendall’s tau and Spearman’s rho measures of correlation. A comparison with some well-known copulas was conducted using a real data set. Mazo et al. [43] developed a form to obtain multivariate copulas from a product of bivariate copulas and studied its characteristics; parameter estimation was addressed through a simulation study. Alhadlaq and Alzaid [2] utilized the properties of cumulative distribution functions (cdfs) on [0, 1] to obtain additive and multiplicative Archimedean generators. Also, observing that probability generating functions (pgfs) are distribution functions on [0, 1], they discussed conditions under which pgfs and their inverses can be used as Archimedean generators.
One of the main interests of copula inference is the estimation of the dependence parameter(s). A standard method is to maximize the full likelihood function (ML). Yet the computations of this approach are sometimes complicated, especially when the copula has more than one dependence parameter. Alternative methods suggested in the literature estimate the parameters in two steps. The method called inference functions for margins (IFM) first estimates the parameters of the marginals using the ordinary maximum likelihood approach. Next, the dependence parameter is estimated from the copula after replacing the marginal parameters by their estimates. For a full explanation of this technique and more, see Joe [31], Chapter 5. A similar approach was suggested by Genest et al. [21]. This semiparametric method uses the empirical distributions of the marginals, so only the dependence parameter needs to be estimated. The Bayesian approach is less common in copula estimation. Choroś et al. [11] presented a review of parametric, semiparametric, and nonparametric methods of copula estimation.
Because the copula is specified independently of the marginals, it is convenient to handle several modeling issues with copulas. Applications from different fields, such as finance, economics, engineering, and survival analysis, have been studied using copulas. Bouyé et al. [9] studied copula models with finance applications. A review of copula applications to multivariate survival analysis can be found in Georges et al. [25]. Patton [52] discussed economic time series applications of copulas. Some applications in engineering can be found in Kumar [37]. However, choosing the appropriate copula has been a matter of investigation. Several studies considered goodness-of-fit tests for copula models. Genest and Rivest [24] and Smith [55] studied model selection among bivariate Archimedean copulas. Klugman and Parsa [34] extended the Pearson goodness-of-fit test to the bivariate case. Panchenko [51] developed a goodness-of-fit test for copulas based on positive definite bilinear forms. A goodness-of-fit test based on the theoretical and sample versions of Spearman’s dependence function was introduced by Mesfioui et al. [45]. Genest et al. [23] provided a review of some goodness-of-fit tests and presented new ones.
Alhadlaq and Alzaid [2] introduced a new approach based on cdfs and pgfs to define Archimedean generators. Seven forms of Archimedean copulas were given using this technique. It was shown that most of the well-known Archimedean copulas can be generated using certain cdfs/pgfs. The technique provides a good starting point in the search for an Archimedean generator: one can survey the cdfs and pgfs looking for one that meets the required conditions, then use it to construct a new Archimedean copula. Among the new examples of Archimedean copulas produced by this technique is the one based on the pgf of the truncated-Poisson distribution. We will call it the truncated-Poisson copula. In this paper, we study the truncated-Poisson copula in detail. The dependence properties and the range of the dependence parameter are investigated. We also derive the form of the Kendall distribution function of the copula. Three real-life data sets (two continuous and one discrete) are used to illustrate that the new copula gives a better fit than some of the well-known copulas.
The remainder of this paper proceeds as follows: In Sect. 2, the truncated-Poisson copula is introduced, and some related functions and properties are discussed. In Sect. 3, some dependence concepts are investigated. Real data sets are used to compare the truncated-Poisson copula with some other well-known copula models in Sect. 4. Our conclusions are presented in Sect. 5.
2 Truncated-Poisson Copula
In this section, the zero-truncated Poisson probability generating function serves as the inverse of a multiplicative Archimedean generator. The acceptable range of the dependence parameter is then discussed, and the resulting copula is studied in detail. The survival copula, conditional copula, and the copula density are presented. The product copula is obtained as a limiting case of this family. Some dependence properties and dependence measures are investigated. In terms of Spearman’s rho correlation coefficient, this copula covers a range of dependence between \(-0.205\) and 0.187.
First, we recall Theorem 8 of Alhadlaq and Alzaid [2].
Theorem 1
If a probability generating function G satisfies \(G(0)=0\), then its inverse \(G^{-1}\) is a strict multiplicative Archimedean generator, i.e.,
$$\begin{aligned} C(u,v)=G\left( G^{-1}(u)\,G^{-1}(v)\right) \end{aligned}$$
is a copula.
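The probability generating function of the Poisson distribution is \(G_P(t)=e^{\mu (t-1)} ,\mu \ge 0\). As \(G_P(0)=e^{-\mu }\ne 0\), we truncate this function at zero by defining
$$\begin{aligned} G_T(t)=\frac{G_P(t)-G_P(0)}{1-G_P(0)}=\frac{e^{\mu t}-1}{e^{\mu }-1}. \end{aligned}$$
For easier notation, set \(\theta =e^\mu -1\), hence, \(\mu =\ln {\left[ 1+\theta \right] }\). Therefore, the truncated probability generating function in terms of \(\theta \) is given by
$$\begin{aligned} G_T(t)=\frac{(1+\theta )^{t}-1}{\theta }. \end{aligned}$$
The inverse of \(G_T (t)\) is
$$\begin{aligned} \psi _\theta (s)=G_{T}^{-1}(s)=\frac{\ln \left[ 1+\theta s\right] }{\ln \left[ 1+\theta \right] }. \end{aligned}$$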
Therefore, according to Theorem 1, \(G_{T}^{-1}(s)\) is a strict multiplicative Archimedean generator. However, the range of \(\theta \) can be extended without violating the conditions of a multiplicative Archimedean generator: for any \(\theta \ge e^{-1}-1\cong -0.63\), \(\psi _\theta (s)=G_{T}^{-1}(s)\) still satisfies the following:
i. \(\psi _\theta :[0,1]\rightarrow [0,1]\) is continuous with \(\psi _\theta (0)=0\) and \(\psi _\theta (1)=1\).
ii. \(\psi _\theta \) is non-decreasing, where \(\psi _\theta ^{'}(s)=\frac{\theta }{(1+\theta s)\ln \left[ 1+\theta \right] }\ge 0\) for \(\theta > -1\).
iii. \(\psi _\theta \) is log-concave, as \(\frac{\partial ^2\ln \left[ \psi _\theta (s)\right] }{\partial s^2}=\frac{-\theta ^2\left( 1+\ln \left[ 1+\theta s\right] \right) }{(1+\theta s)^2\ln ^2\left[ 1+\theta s\right] }\le 0\) when \(1+\ln \left[ 1+\theta s\right] \ge 0\), i.e., \(\theta \ge \max _{s\in (0,1]}\left\{ \frac{e^{-1}-1}{s}\right\} = e^{-1}-1 \cong -0.6321\).
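Conditions i to iii can be checked numerically. The sketch below assumes the generator form \(\psi _\theta (s)=\ln [1+\theta s]/\ln [1+\theta ]\), which is the form consistent with the derivative stated in condition ii; the function names, grid size and tolerance are illustrative choices. Log-concavity is tested through second differences of \(\ln \psi _\theta \) on a grid, which is a numerical indication rather than a proof.

```python
import math

def psi(s, theta):
    """Assumed generator: ln(1 + theta*s) / ln(1 + theta), theta > -1, theta != 0."""
    return math.log1p(theta * s) / math.log1p(theta)

def check_generator(theta, n=1000, tol=1e-9):
    """Return (boundary, monotone, log_concave) flags for conditions i-iii."""
    grid = [k / n for k in range(1, n + 1)]          # grid on (0, 1]
    vals = [psi(s, theta) for s in grid]
    boundary = abs(psi(0.0, theta)) < tol and abs(psi(1.0, theta) - 1.0) < tol
    monotone = all(b >= a - tol for a, b in zip(vals, vals[1:]))
    logs = [math.log(v) for v in vals]
    # Log-concavity: second differences of ln(psi) should be non-positive.
    log_concave = all(logs[i - 1] - 2.0 * logs[i] + logs[i + 1] <= tol
                      for i in range(1, len(logs) - 1))
    return boundary, monotone, log_concave

print(check_generator(math.exp(-1) - 1))  # lower end of the admissible range
print(check_generator(-0.9))              # below the range: log-concavity fails
```

For \(\theta =-0.9\) the first two conditions still hold, but the log-concavity check fails, in line with the bound \(\theta \ge e^{-1}-1\).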
Figure 1 shows the shape of the truncated-Poisson Archimedean generator for several values of the dependence parameter.
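Thus, the copula corresponding to \(\psi _\theta \) is given by
$$\begin{aligned} C_T(u,v;\theta )=\frac{1}{\theta }\left( \exp \left\{ \frac{\ln \left[ 1+\theta u\right] \ln \left[ 1+\theta v\right] }{\ln \left[ 1+\theta \right] }\right\} -1\right) . \end{aligned}$$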
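The Kendall distribution function of \(C_T(U,V;\theta )\) is
$$\begin{aligned} K_{C_T}(t;\theta )=t-\frac{(1+\theta t)\ln \left[ 1+\theta t\right] }{\theta }\ln \left[ \frac{\ln \left[ 1+\theta t\right] }{\ln \left[ 1+\theta \right] }\right] . \end{aligned}$$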
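As a sketch of how the Kendall distribution function can be evaluated, the code below uses the standard Archimedean identity \(K(t)=t-\varphi (t)/\varphi '(t)\) with additive generator \(\varphi =-\ln \psi _\theta \) and the assumed generator \(\psi _\theta (t)=\ln [1+\theta t]/\ln [1+\theta ]\). The function name `kendall_cdf` is hypothetical, and the closed form in the docstring is derived under these assumptions.

```python
import math

def kendall_cdf(t, theta):
    """K(t) = t - (1 + theta*t) ln(1 + theta*t) ln(psi_theta(t)) / theta, for 0 < t <= 1."""
    log_term = math.log1p(theta * t)
    psi = log_term / math.log1p(theta)   # generator value, lies in (0, 1]
    return t - (1.0 + theta * t) * log_term * math.log(psi) / theta

for t in (0.25, 0.5, 0.75, 1.0):
    print(t, kendall_cdf(t, 14.0))
```

For \(\theta \) close to zero the values approach \(t-t\ln t\), the Kendall distribution function of the product copula.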
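The survival copula \({\hat{C}}_T(u,v;\theta )=u+v-1+C_T(1-u,1-v;\theta )\), the conditional copula \(C_T(v|u;\theta )=\partial C_T(u,v;\theta )/\partial u\), and the copula density \(c_T(u,v;\theta )=\partial ^2 C_T(u,v;\theta )/\partial u\,\partial v\) are, respectively, given by
$$\begin{aligned} {\hat{C}}_T(u,v;\theta )=u+v-1+\frac{1}{\theta }\left( \exp \left\{ \frac{\ln \left[ 1+\theta (1-u)\right] \ln \left[ 1+\theta (1-v)\right] }{\ln \left[ 1+\theta \right] }\right\} -1\right) , \end{aligned}$$
$$\begin{aligned} C_T(v|u;\theta )=\frac{\ln \left[ 1+\theta v\right] }{(1+\theta u)\ln \left[ 1+\theta \right] }\exp \left\{ \frac{\ln \left[ 1+\theta u\right] \ln \left[ 1+\theta v\right] }{\ln \left[ 1+\theta \right] }\right\} \end{aligned}$$
and
$$\begin{aligned} c_T(u,v;\theta )=\frac{\theta }{(1+\theta u)(1+\theta v)\ln \left[ 1+\theta \right] }\left( 1+\frac{\ln \left[ 1+\theta u\right] \ln \left[ 1+\theta v\right] }{\ln \left[ 1+\theta \right] }\right) \exp \left\{ \frac{\ln \left[ 1+\theta u\right] \ln \left[ 1+\theta v\right] }{\ln \left[ 1+\theta \right] }\right\} . \end{aligned}$$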
Figures 2 and 3 present, respectively, the density of the truncated-Poisson copula with its contour plot and the copula with its contour plot, for different levels of correlation in terms of Kendall’s tau (\(\tau \)), which is discussed later in Theorem 5. From Fig. 2, it seems that as the dependence parameter increases, the density becomes steadier and increases faster near (0,0). From the copula graphs, it seems that the copula increases faster for smaller values of the parameter than for larger ones. However, the copula is not stochastically ordered in \(\theta \), as shown in the following example.
Example 1
Let \(\alpha =1\le \beta =10\le \eta =30\) and \(u=v=0.5\), then we have \(C_{T}\left( u,v;\alpha \right) =0.268 <0.281=C_{T}\left( u,v;\beta \right) \), but, \(C_{T}(u,v;\beta )=0.281>0.279=C_{T}(u,v;\eta )\).
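The values in Example 1 can be reproduced with a few lines of Python. This is a minimal sketch that assumes the copula form \(C_T(u,v;\theta )=\frac{1}{\theta }\left( \exp \{\ln [1+\theta u]\ln [1+\theta v]/\ln [1+\theta ]\}-1\right) \) obtained from the generator \(\psi _\theta \) and Theorem 1; the function name `tp_copula` is illustrative.

```python
import math

def tp_copula(u, v, theta):
    """Truncated-Poisson copula C_T(u, v; theta), theta >= exp(-1) - 1, theta != 0."""
    a = math.log1p(theta * u) * math.log1p(theta * v) / math.log1p(theta)
    return math.expm1(a) / theta

for theta in (1, 10, 30):
    print(theta, round(tp_copula(0.5, 0.5, theta), 3))
# 1 0.268
# 10 0.281
# 30 0.279
```

The value at \(\theta =10\) exceeds those at both \(\theta =1\) and \(\theta =30\), which is the lack of ordering in \(\theta \) that the example points out.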
Theorem 2
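The truncated-Poisson copula tends to the product copula as \(\theta \rightarrow 0\) or \(\theta \rightarrow \infty \), i.e.,
$$\begin{aligned} \lim _{\theta \rightarrow 0}C_T(u,v;\theta )=\lim _{\theta \rightarrow \infty }C_T(u,v;\theta )=uv. \end{aligned}$$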
Proof
For the proof, see Appendix A.
3 Dependence
Here, we present some dependence properties and derive three correlation measures for the truncated-Poisson copula. We also calculate its upper and lower tail dependence indices. Spearman’s rho and Kendall’s tau correlation coefficients are obtained as integrals, through which they can be calculated numerically.
Theorem 3
The copula density \(c_T\) is totally positive of order two (\(TP_2\)) for \(\theta \ge 0\).
Proof
For the proof, see Appendix A.
The total positivity of \(c_T\) is considered a very strong positive dependence property and it implies some other dependence properties. Some of these properties are listed in the following corollary.
Corollary 1
Let X and Y be continuous random variables with the truncated-Poisson copula \(C_T\) with \(\theta \ge 0\). Let \(\rho _{(X,Y)}\), \(\tau _{(X,Y)}\), and \(\beta _{(X,Y)}\) denote the Spearman’s, Kendall’s, and Blomqvist’s correlation coefficients, respectively. Then,
i. X and Y are positively quadrant-dependent, \(PQD\left( X, Y\right) \).
ii. \(C_T\) is \(TP_2\) and \({\hat{C}}_T\) is \(TP_2\).
iii. \(\rho _{\left( X,Y\right) }\ge \tau _{\left( X,Y\right) }\ge 0\) and \(\beta _{\left( X,Y\right) }\ge 0\).
iv. \(C_T\succ C_\pi =uv\) (the product copula).
(See Joe [31], pages 366-367.) Due to the symmetry of Archimedean copulas, X and Y in the previous properties are exchangeable.
Theorem 4
Spearman’s rho correlation coefficient for the truncated-Poisson copula is given by
where \(Ei(x)=-\int _{-x}^{\infty }\frac{e^{-t}}{t}dt\) is the exponential integral function, and \(li(x)=\int _{0}^{x}\frac{dt}{\ln \left[ t\right] }\) denotes the logarithmic integral function.
Proof
For the proof, see Appendix A.
Figure 4 displays Spearman’s rho correlation coefficient for the truncated-Poisson copula. \(\rho _{C_T}\), approximately, takes a range between \(-0.205\) and 0.187 at \(\theta \cong -0.63\) and \(\theta \cong 14\), respectively.
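Spearman’s rho can also be approximated directly from its definition, \(\rho =12\int _0^1\int _0^1 C(u,v)\,du\,dv-3\), without the closed form. The sketch below applies a midpoint rule to the assumed copula form \(C_T(u,v;\theta )=\frac{1}{\theta }\left( \exp \{\ln [1+\theta u]\ln [1+\theta v]/\ln [1+\theta ]\}-1\right) \); the function names and the grid size are illustrative choices, and the output is a numerical approximation only.

```python
import math

def tp_copula(u, v, theta):
    """Assumed truncated-Poisson copula form, theta != 0."""
    a = math.log1p(theta * u) * math.log1p(theta * v) / math.log1p(theta)
    return math.expm1(a) / theta

def spearman_rho(theta, n=200):
    """Midpoint-rule approximation of 12 * (integral of C over [0,1]^2) - 3."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    total = sum(tp_copula(u, v, theta) for u in pts for v in pts)
    return 12.0 * total * h * h - 3.0

print(spearman_rho(math.exp(-1) - 1))  # near the lower end of the range
print(spearman_rho(14.0))              # near the upper end of the range
```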
Theorem 5
The Kendall’s tau correlation coefficient for the truncated-Poisson copula is given by
where Ei(x) is the exponential integral function and \(\gamma \) is Euler’s constant, with approximate numerical value 0.5772.
Proof
For the proof, see Appendix A.
Figure 5 exhibits Kendall’s tau correlation coefficient for the truncated-Poisson copula. Notice that the admissible range for \(\tau _{C_T}\) is approximately between \(-0.137\) and 0.125 at \(\theta \cong -0.63\) and \(\theta \cong 14\), respectively.
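Kendall’s tau can likewise be approximated by numerical integration. The sketch below uses the standard identity for Archimedean copulas, \(\tau =1+4\int _0^1 \varphi (t)/\varphi '(t)\,dt\), with additive generator \(\varphi =-\ln \psi _\theta \) and the assumed generator \(\psi _\theta (t)=\ln [1+\theta t]/\ln [1+\theta ]\), so that \(\varphi (t)/\varphi '(t)=(1+\theta t)\ln [1+\theta t]\ln [\psi _\theta (t)]/\theta \). The function names and the number of midpoints are illustrative.

```python
import math

def phi_ratio(t, theta):
    """phi(t) / phi'(t) for the additive generator phi = -ln(psi_theta)."""
    log_term = math.log1p(theta * t)
    psi = log_term / math.log1p(theta)
    return (1.0 + theta * t) * log_term * math.log(psi) / theta

def kendall_tau(theta, n=20000):
    """Midpoint-rule approximation of 1 + 4 * integral_0^1 phi/phi' dt."""
    h = 1.0 / n
    return 1.0 + 4.0 * h * sum(phi_ratio((i + 0.5) * h, theta) for i in range(n))

print(kendall_tau(math.exp(-1) - 1))  # negative, near the lower end of the range
print(kendall_tau(14.0))              # positive, near the upper end of the range
```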
Corollary 2
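The medial correlation coefficient, Blomqvist’s \(\beta \), for the truncated-Poisson copula \(C_T\) is given by
$$\begin{aligned} \beta _{C_T}=4C_T\left( \tfrac{1}{2},\tfrac{1}{2};\theta \right) -1=\frac{4}{\theta }\left( \exp \left\{ \frac{\ln ^2\left[ 1+\theta /2\right] }{\ln \left[ 1+\theta \right] }\right\} -1\right) -1, \end{aligned}$$
which approximately covers the interval \(\left( -0.149,0.125\right) \).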
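Blomqvist’s \(\beta \) is simple to evaluate, since by definition \(\beta =4C(1/2,1/2)-1\). The following sketch assumes the copula form \(C_T(u,v;\theta )=\frac{1}{\theta }\left( \exp \{\ln [1+\theta u]\ln [1+\theta v]/\ln [1+\theta ]\}-1\right) \); the function names are illustrative.

```python
import math

def tp_copula(u, v, theta):
    """Assumed truncated-Poisson copula form, theta != 0."""
    a = math.log1p(theta * u) * math.log1p(theta * v) / math.log1p(theta)
    return math.expm1(a) / theta

def blomqvist_beta(theta):
    """Medial correlation coefficient: 4 * C(1/2, 1/2) - 1."""
    return 4.0 * tp_copula(0.5, 0.5, theta) - 1.0

print(round(blomqvist_beta(math.exp(-1) - 1), 4))  # -0.1496
print(round(blomqvist_beta(14.0), 4))              # 0.1248
```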
Theorem 6
The upper and lower tail dependence coefficients of the truncated-Poisson copula are both zero, i.e., this copula has no tail dependence.
Proof
For the proof, see Appendix A.
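The absence of tail dependence can be illustrated numerically with the usual definitions \(\lambda _L=\lim _{u\rightarrow 0^+}C(u,u)/u\) and \(\lambda _U=\lim _{u\rightarrow 1^-}(1-2u+C(u,u))/(1-u)\). The sketch below evaluates both ratios close to the limits for the assumed copula form \(C_T(u,v;\theta )=\frac{1}{\theta }\left( \exp \{\ln [1+\theta u]\ln [1+\theta v]/\ln [1+\theta ]\}-1\right) \); the function names and evaluation points are illustrative, and small values only suggest, not prove, that the limits are zero.

```python
import math

def tp_copula(u, v, theta):
    """Assumed truncated-Poisson copula form, theta != 0."""
    a = math.log1p(theta * u) * math.log1p(theta * v) / math.log1p(theta)
    return math.expm1(a) / theta

def lower_tail_ratio(u, theta):
    """C(u, u) / u, whose limit as u -> 0+ is the lower tail coefficient."""
    return tp_copula(u, u, theta) / u

def upper_tail_ratio(u, theta):
    """(1 - 2u + C(u, u)) / (1 - u), whose limit as u -> 1- is the upper tail coefficient."""
    return (1.0 - 2.0 * u + tp_copula(u, u, theta)) / (1.0 - u)

for theta in (-0.5, 2.0, 14.0):
    print(theta, lower_tail_ratio(1e-6, theta), upper_tail_ratio(1 - 1e-4, theta))
```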
Theorem 7
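The truncated-Poisson family can be generalized to a family of n-copulas, i.e., n-variate copulas, for \(\theta \ge 0\) and for all \(n\ge 2\), as
$$\begin{aligned} C_T(u_1,\ldots ,u_n;\theta )=\frac{1}{\theta }\left( \exp \left\{ \frac{\prod _{i=1}^{n}\ln \left[ 1+\theta u_i\right] }{\ln ^{n-1}\left[ 1+\theta \right] }\right\} -1\right) . \end{aligned}$$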
Proof
For the proof, see Appendix A.
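A sketch of the n-variate construction, assuming it takes the multiplicative Archimedean form \(C_T(u_1,\ldots ,u_n;\theta )=G_T\left( \prod _{i=1}^{n}\psi _\theta (u_i)\right) \) with \(\psi _\theta (s)=\ln [1+\theta s]/\ln [1+\theta ]\) and \(G_T(t)=((1+\theta )^t-1)/\theta \). The function name `tp_copula_n` is hypothetical.

```python
import math

def tp_copula_n(us, theta):
    """n-variate truncated-Poisson copula (assumed form), theta > 0."""
    log_base = math.log1p(theta)
    prod = 1.0
    for u in us:
        prod *= math.log1p(theta * u) / log_base   # psi_theta(u_i)
    # G_T(prod) = ((1 + theta)^prod - 1) / theta
    return math.expm1(prod * log_base) / theta

print(tp_copula_n([0.5, 0.5], 1.0))        # bivariate case, about 0.2677
print(tp_copula_n([0.3, 1.0, 1.0], 5.0))   # margin check, about 0.3
```

Setting all but one argument to 1 returns the remaining argument, and any argument equal to 0 gives 0, as expected of a copula.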
4 Applications
When modeling with copulas, the marginal distributions are estimated separately using either the empirical distributions or suitable parametric distributions. After that, one seeks a copula that matches the dependence structure of the data. In this section, we study the fit of the truncated-Poisson copula to three datasets. We choose datasets that were previously studied in the literature using other bivariate distributions (some of which are copulas). Some well-known copulas (such as Frank, Gumbel–Hougaard, Clayton and Farlie–Gumbel–Morgenstern (FGM)) are also fitted for the sake of comparison.
The Kendall plot (K-plot), introduced in [20], is used here as a tool to detect dependence. In this plot, independence is indicated when the points lie on the diagonal line, whereas points lying far above (beneath) the diagonal line indicate that the data have a positive (negative) correlation. For more explanation, we refer to Genest and Boies [20]. Genest and Rivest [24] proposed a graphical tool to determine the best candidate model among a set of suggested Archimedean copulas. This plot is based on a function called the lambda function, which is given by \(\uplambda _\theta =t-K_C\left( t;\theta \right) \), where \(K_C(t;\theta )\) is the Kendall distribution function of a copula \(C(u,v;\theta )\). The lambda function of the empirical distribution is plotted together with those of the suggested parametric models; the closer a model’s curve is to the empirical one, the better the fit. The maximum likelihood estimates of the parameter(s) will be obtained. Among the many studies that have addressed goodness-of-fit testing, we consider the two tests based on the Cramér-von Mises statistic \(S_n\) and the Kolmogorov–Smirnov statistic \(T_n\), as they focus on the Archimedean class. These tests were introduced by Genest et al. [23] as follows:
Let \(F_r=\frac{\sum _{t=1}^{n}I_{\left( x_t<x_r,y_t<y_r\right) }}{n-1}; r=1, 2,\cdots , n\), be the empirical cumulative distribution function evaluated at the r-th observation, where I is the indicator function. The empirical Kendall distribution is then defined by \(K_n\left( t\right) =\frac{\sum _{r=1}^{n} I_{\left( F_r\le t\right) }}{n}\). The two statistics \(S_n\) and \(T_n\) are given by
$$\begin{aligned} S_n=\int _{0}^{1}\left| {\mathbb {K}}_n\left( t\right) \right| ^2 k_{C}\left( t;{\hat{\theta }}\right) dt \end{aligned}$$
and
$$\begin{aligned} T_n=\sup _{0\le t\le 1}\left| {\mathbb {K}}_n\left( t\right) \right| , \end{aligned}$$
where \({\mathbb {K}}_n\left( t\right) =\sqrt{n}\left( K_n\left( t\right) -K_{C}\left( t;{\hat{\theta }}\right) \right) \) and \(k_{C}\left( t;{\hat{\theta }}\right) =\frac{dK_{C}\left( t;{\hat{\theta }}\right) }{dt}\) is the Kendall density of the copula.
Large values of \(S_n\) and \(T_n\) lead to rejection of the hypothesis that a parametric family of copulas is suitable to represent the data. Through a power study, Genest et al. [23] indicated that, in general, \(S_n\) is more reliable than \(T_n\). We will apply these two tests using both the maximum likelihood estimator and the Kendall’s tau estimator.
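The empirical quantities used by these tests can be computed as in the following sketch. It implements \(F_r\) and \(K_n(t)\) as defined above and approximates the Kolmogorov–Smirnov-type statistic \(\sup _t \sqrt{n}\,|K_n(t)-K_C(t)|\) on a finite grid. The function names, the grid approximation of the supremum, the toy data and the use of the product copula’s Kendall distribution \(t-t\ln t\) as the parametric candidate are all illustrative assumptions; p-values would additionally require a resampling step that is not shown.

```python
import math

def pseudo_obs(x, y):
    """F_r = #{t : x_t < x_r and y_t < y_r} / (n - 1), for r = 1..n."""
    n = len(x)
    return [sum(1 for t in range(n) if x[t] < x[r] and y[t] < y[r]) / (n - 1)
            for r in range(n)]

def empirical_kendall(F, t):
    """K_n(t) = #{r : F_r <= t} / n."""
    return sum(1 for f in F if f <= t) / len(F)

def ks_statistic(F, K_param, m=1000):
    """Grid approximation of sup_t sqrt(n) * |K_n(t) - K_param(t)|."""
    n = len(F)
    grid = [k / m for k in range(1, m + 1)]
    return math.sqrt(n) * max(abs(empirical_kendall(F, t) - K_param(t))
                              for t in grid)

def K_independence(t):
    """Kendall distribution function of the product copula (illustrative candidate)."""
    return t - t * math.log(t)

# Toy data, perfectly concordant, for illustration only.
x = [1, 2, 3, 4]
y = [1, 2, 3, 4]
F = pseudo_obs(x, y)
print(F, empirical_kendall(F, 0.5), ks_statistic(F, K_independence))
```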
4.1 Kidney infection data
This data set, from McGilchrist and Aisbett [44], contains the recurrence times to infection at the point of insertion of the catheter for 30 kidney patients. For each patient, two recurrence times were recorded. The first variable X refers to the first recurrence time and Y refers to the second recurrence time. The Spearman’s correlation is 0.153, and Kendall’s tau correlation is 0.111. The data are shown in Table 7 in Appendix B.
A plot of the data is presented in Fig. 6.
Mirhosseini et al. [47] modeled these data with the bivariate generalized exponential (BGE) distribution, whose probability function is given by
where \(\uplambda _1, \uplambda _2>0,0<\theta \le 1\). Two other models were applied for the purpose of comparison. The first is the bivariate exponential model suggested by Block and Basu [8] with probability function
where \(\uplambda _1,\uplambda _2,\uplambda _3>0\) and \(\uplambda =\uplambda _1+\uplambda _2+\uplambda _3\). The second model is that of Kundu and Gupta [39], which is based on the Clayton copula with generalized exponential margins. The distribution function of the Kundu and Gupta model is given by
Among the three models, that of Kundu and Gupta gave the best fit, followed by BGE. Abd Elaal and Jarwan [1] discussed the same data with two copula-based models. They used generalized exponential margins with the FGM copula and the Plackett copula, respectively. They concluded that the Plackett copula gives a slightly better fit than the FGM copula. Almetwally et al. [4] suggested the FGM bivariate Weibull (FGMBW) for the data, then compared it with three models: the FGM bivariate Gamma (FGMBG) proposed by Kotz et al. [36], the FGM bivariate generalized exponential (FGMBGE) of Abd Elaal and Jarwan [1], and the bivariate Marshall–Olkin Weibull (BMOW) of Kundu and Dey [38]. They showed that the FGMBW model gives the best fit for the data.
The values of Kendall’s and Spearman’s coefficients indicate that these data are weakly correlated. Figure 7 exhibits the K-plot for the kidney infection data. The dots lie directly above the diagonal line, which indicates weak positive correlation.
To fit these data, we considered the truncated-Poisson, Frank and Gumbel–Hougaard copulas using both Gamma and Weibull margins. Next, we draw the lambda function for the empirical data against the four copulas in Fig. 8. From the graph, it appears that the truncated-Poisson copula gives the closest fit to the data. However, the other suggested copulas also seem acceptable. The goodness-of-fit statistics \(S_n\) and \(T_n\) were obtained using both Kendall’s estimator and the maximum likelihood estimator (MLE), where the empirical distributions are considered for the marginals. Our results using ten thousand samples are reported in Table 1. Both tests show that all the suggested copulas are acceptable to fit the data. However, the truncated-Poisson copula outperforms the other copulas. The independence hypothesis is rejected by the statistic \(T_n\) but not by \(S_n\). Yet, we will carry on with modeling for the sake of comparison with the previously fitted models.
To compare our model with the FGMBW model, we obtained the estimates of the parameters using the maximum likelihood method with Gamma and Weibull margins, respectively. The results are listed in Table 2. In both cases, among the FGM, Frank, Gumbel–Hougaard and truncated-Poisson copulas, the last one gives the smallest AIC value. However, the difference between the AIC values, which is less than 2, may be considered not significant (see Burnham and Anderson [10]).
4.2 Pima Indians diabetes data
These data were collected by the US National Institute of Diabetes and Digestive and Kidney Diseases. The sample consists of 332 diabetic women aged 21 years or more, of Pima Indian heritage and living near Phoenix, Arizona. These data are easily accessible from R’s package MASS. A plot of these data is shown in Fig. 9. Li and Fang [40] studied the relationship between the body mass index (BMI) and the diabetes pedigree function (PED). They modeled these two variables using their proposed copula, which they called the sine copula. Under the assumption that the marginal distributions are Gamma distributions, they obtained the estimates of the parameters using the maximum likelihood method as \(\left( \alpha _1,\beta _1\right) =\left( 21.82,1.52\right) \) and \(\left( \alpha _2,\beta _2\right) =\left( 2.58,0.20\right) \), respectively. After a comparison with the Clayton and Frank copulas, they concluded that the sine copula gives the best fit for the data (smallest AIC). Bekrizadeh [7] applied a new technique to obtain a generalization of the FGM copula. The resulting copula was used to model the Pima Indian data. Yet, the sine copula gives a better fit in terms of AIC.
Spearman’s and Kendall’s coefficients between BMI and PED are 0.097 and 0.0642, respectively, which indicates that these data have weak positive dependence. This is also apparent from the K-plot in Fig. 10, where most of the points are near the independence line. We fitted these data with the truncated-Poisson copula. As these data, to our knowledge, have only been studied through the FGM copula, we also considered the Frank and Gumbel–Hougaard copulas.
The lambda function of these copulas with the product copula versus the empirical copula is shown in Fig. 11. It seems that the truncated-Poisson and the Frank copulas are more suitable to fit the data.
Table 3 presents the results of the goodness-of-fit tests using both statistics \(S_n\) and \(T_n\). In general, all of the suggested copulas are appropriate to model the data. This test rejects the hypothesis of independence. The truncated-Poisson copula gives better results than the others. Table 4 lists the estimates of the dependence parameter using the IFM method, with the marginals estimated as Gamma distributions with parameters \(\left( \alpha _1,\beta _1\right) =\left( 21.82,1.52\right) \) and \(\left( \alpha _2,\beta _2\right) =\left( 2.58,0.20\right) \), respectively. The Gumbel–Hougaard copula returned the smallest AIC value. However, as the differences between the other AIC values and the smallest one are less than four, these models are also acceptable to fit the data (see Burnham and Anderson [10]).
4.3 Aircraft data
The data consist of flight aborts for 109 aircraft over one year. The first variable represents the flight aborts in the first 6 months and the second variable represents the flight aborts in the last 6 months. The data are shown in Table 8 in Appendix B. A plot of these data is shown in Fig. 12. The data have a correlation coefficient of \(-0.16\). The sample mean and variance of the first and second variables are, respectively, \({\bar{x}}=0.62, S_{x}^{2}=1.03\) and \({\bar{y}}=0.72, S_{y}^{2}=1.07\).
These data were first studied by Mitchell and Paulson [48], who fitted them with a bivariate negative binomial distribution (BNB). Zamani et al. [57] used a bivariate Poisson–Lindley distribution (BPL) to fit the data and compared the results with the bivariate Poisson (BP) and BNB. Using the AIC criterion, they found that BPL outperformed the other two models. Barbiero [5] modeled these data with a bivariate discrete Weibull based on the FGM copula. He estimated the parameters using four estimation methods (two of them are FML and IFM). In terms of AIC, his model gives a better fit than the BPL. The Frank and the Gaussian copulas with geometric, discrete Weibull, and discrete Lindley marginals were considered to model the data in Barbiero [6]. The latter results showed that the smallest AIC is the one calculated from the Gaussian copula model with geometric marginals.
We fitted these data by the truncated-Poisson copula using discrete Weibull marginals with parameters \((q_1,\alpha _1)\) and \((q_2,\alpha _2)\), and geometric marginals with parameters \(p_1\) and \(p_2\), respectively. For the discrete Weibull marginals, we obtained the parameter estimates using the IFM method as \({\hat{q}}_1=0.3788,{\hat{\alpha }}_1=0.9774,{\hat{q}}_2=0.4496,{\hat{\alpha }}_2=1.1202\) and \({\hat{\theta }}=-0.63\), whereas for the geometric margins, the parameter estimates are \({\hat{p}}_1=0.6190,{\hat{p}}_2=0.5734\) and \({\hat{\theta }}=-0.63\). Also, we fitted the data by the FGM copula with geometric marginals and obtained \({\hat{p}}_1=0.6190,{\hat{p}}_2=0.5734\) and \({\hat{\theta }}=-0.63\). A comparison between our model and some of those previously studied in the literature is performed using AIC and BIC values. The results are summarized in Table 5.
In general, geometric margins give better fits. The smallest AIC and BIC values correspond to the truncated-Poisson copula with geometric marginals. However, there seems to be no significant difference between the AIC and BIC values of the suggested copulas. Figure 13 exhibits the empirical, truncated-Poisson and Frank cumulative distributions (with geometric marginals). In most of the area of the graph, the Frank copula lies beneath the truncated-Poisson copula. This relation is reversed only in a small area near the point (0, 0). The truncated-Poisson copula intersects the empirical surface along several curves, whereas the Frank copula meets the empirical distribution on a small area near the point (0, 0). It seems that even though the range of the correlation for the truncated-Poisson copula does not cover the correlation of these data, it still gives the better fit. The mean absolute and mean squared distances, multiplied by a hundred, between the observed probabilities and the fitted probabilities obtained by the truncated-Poisson and the Frank copulas with geometric margins are given in Table 6.
5 Conclusions
In this paper, we proposed a new family of copulas and discussed some of its properties. The dependence structure of this copula was studied from different aspects. Spearman’s and Kendall’s correlation coefficients and the upper and lower tail dependence were investigated. Three applications in which the truncated-Poisson copula is compatible with the data were discussed.
From the illustrative applications, we notice that the truncated-Poisson copula may perform better than other copulas for datasets with small correlations. Also, the truncated-Poisson copula is sensitive toward independence, i.e., it is more appropriate for fitting two dependent variables with weak correlation. Moreover, for discrete datasets, the truncated-Poisson copula is capable of giving a good fit even though the correlation of the data is outside the range \((-0.137,0.125)\). This is due to the fact that discrete correlation measures are not margin-free (see Denuit and Lambert [14] and Mesfioui and Tajar [46]). That is, the truncated-Poisson copula could be considered as a tool to distinguish between independence and weak dependence. This new copula is not suitable for datasets with lower or upper tail dependence, as it has no tail dependence. As shown in Theorem 7, this copula could be used to model multivariate datasets.
Notes
The authors thank Prof. Wissem Jedidi for supplying this part of the proof of this theorem.
References
Abd Elaal, M.K., Jarwan, R.S.: Inference of bivariate generalized exponential distribution based on copula functions. Appl. Math. Sci. 11(24), 1155–1186 (2017)
Alhadlaq, W., Alzaid, A.: Distribution function, probability generating function and Archimedean generator. Symmetry 12(12), 2108 (2020)
Ali, M.M., Mikhail, N., Haq, M.S.: A class of bivariate distributions including the bivariate logistic. J. Multivar. Anal. 8(3), 405–412 (1978)
Almetwally, E.M., Muhammed, H.Z., El-Sherpieny, E.-S.A.: Bivariate Weibull distribution: properties and different methods of estimation. Ann. Data Sci. 7(1), 163–193 (2020)
Barbiero, A.: A bivariate count model with discrete Weibull margins. Math. Comput. Simul. 156, 91–109 (2019)
Barbiero, A.: Modeling correlated counts in reliability engineering. In: Advances in System Reliability Engineering, pp. 167–191. Elsevier (2019)
Bekrizadeh, H.: Generalized family of copulas: definition and properties. Thailand Stat. 19(1), 163–178 (2021)
Block, H.W., Basu, A.: A continuous, bivariate exponential extension. J. Am. Stat. Assoc. 69(348), 1031–1037 (1974)
Bouyé, E., Durrleman, V., Nikeghbali, A., Riboulet, G., Roncalli, T.: Copulas for finance-a reading guide and some applications. Available at SSRN 1032533, (2000)
Burnham, K.P., Anderson, D.R.: Multimodel inference: understanding aic and bic in model selection. Sociol. Methods Res. 33(2), 261–304 (2004)
Choroś, B., Ibragimov, R., Permiakova, E.: Copula estimation. In: Copula theory and its applications, pp. 77–91. Springer (2010)
Clayton, D.G.: A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika 65(1), 141–151 (1978)
Dall’Aglio, G.: Fréchet classes: the beginnings. In: Advances in Probability Distributions with Given Marginals, pp. 1–12. Springer (1991)
Denuit, M., Lambert, P.: Constraints on concordance measures in bivariate discrete data. J. Multivar. Anal. 93(1), 40–57 (2005)
Durante, F., Quesada-Molina, J.J., Sempi, C.: A generalization of the archimedean class of bivariate copulas. Ann. Inst. Stat. Math. 59(3), 487–498 (2007)
Durante, F., Sempi, C.: Principles of copula theory, vol. 474. CRC press, Boca Raton (2016)
Fisher, N., Sen, P. (eds.): The collected works of Wassily Hoeffding. Springer, New York (1994)
Frank, M.J.: On the simultaneous associativity of F(x, y) and x + y − F(x, y). Aequationes Math. 19(1), 194–226 (1979)
Fréchet, M.: Sur les tableaux de corrélation dont les marges sont données. Ann. Univ. Lyon, 3e Série, Sci. Sect. A 14, 53–77 (1951)
Genest, C., Boies, J.-C.: Detecting dependence with kendall plots. Am. Stat. 57(4), 275–284 (2003)
Genest, C., Ghoudi, K., Rivest, L.-P.: A semiparametric estimation procedure of dependence parameters in multivariate families of distributions. Biometrika 82(3), 543–552 (1995)
Genest, C., MacKay, R.J.: Copules archimédiennes et familles de lois bidimensionnelles dont les marges sont données. Can. J. Stat. 14(2), 145–159 (1986)
Genest, C., Rémillard, B., Beaudoin, D.: Goodness-of-fit tests for copulas: A review and a power study. Insur. Math. Econom. 44(2), 199–213 (2009)
Genest, C., Rivest, L.-P.: Statistical inference procedures for bivariate archimedean copulas. J. Am. Stat. Assoc. 88(423), 1034–1043 (1993)
Georges, P., Lamy, A.-G., Nicolas, E., Quibel, G., Roncalli, T.: Multivariate survival modelling: a unified approach with copulas. Available at SSRN 1032559, (2001)
Hoeffding, W.: Masstabinvariante Korrelationstheorie. Schriften des Mathematischen Instituts und Instituts für Angewandte Mathematik der Universität Berlin 5, 181–233 (1940)
Hoeffding, W.: Masstabinvariante Korrelationsmasse für diskontinuierliche Verteilungen. Archiv für mathematische Wirtschafts- und Sozialforschung 7, 49–70 (1941)
Hougaard, P.: Modelling multivariate survival. Scand. J. Stat. pp. 291–304 (1987)
Hutchinson, T.P., Lai, C.D.: Continuous bivariate distributions emphasising applications. Technical report (1990)
Joe, H.: Parametric families of multivariate distributions with given margins. J. Multivar. Anal. 46(2), 262–282 (1993)
Joe, H.: Multivariate models and multivariate dependence concepts. CRC Press, Boca Raton (1997)
Karlin, S.: Total positivity, vol. 1. Stanford University Press (1968)
Kim, J.-M., Sungur, E.A.: New class of bivariate copulas. In: Proceedings for the Spring Conference of Korean Statistical Society, pp. 207–212 (2004)
Klugman, S.A., Parsa, R.: Fitting bivariate loss distributions with copulas. Insur. Math. Econom. 24(1–2), 139–148 (1999)
Kolev, N., Anjos, U., Mendes, BVd.M.: Copulas: a review and recent developments. Stoch. Model. 22(4), 617–660 (2006)
Kotz, S., Balakrishnan, N., Johnson, N.L.: Continuous multivariate distributions, Volume 1: models and applications, vol. 1. Wiley, Hoboken (2004)
Kumar, P.: Copula functions and applications in engineering. In: Logistics, supply chain and financial predictive analytics, pp. 195–209. Springer (2019)
Kundu, D., Dey, A.K.: Estimating the parameters of the Marshall-Olkin bivariate weibull distribution by em algorithm. Comput. Stat. Data Anal. 53(4), 956–965 (2009)
Kundu, D., Gupta, R.D.: Absolute continuous bivariate generalized exponential distribution. AStA Adv. Stat. Anal. 95(2), 169–185 (2011)
Li, X., Fang, R.: A new family of bivariate copulas generated by univariate distributions. J. Data Sci. 10(1), 107–127 (2012)
Marshall, A.W., Olkin, I.: A generalized bivariate exponential distribution. J. Appl. Probab. 4(2), 291–302 (1967)
Marshall, A.W., Olkin, I.: Families of multivariate distributions. J. Am. Stat. Assoc. 83(403), 834–841 (1988)
Mazo, G., Girard, S., Forbes, F.: A class of multivariate copulas based on products of bivariate copulas. J. Multivar. Anal. 140, 363–376 (2015)
McGilchrist, C., Aisbett, C.: Regression with frailty in survival analysis. Biometrics, pp. 461–466 (1991)
Mesfioui, M., Quessy, J.-F., Toupin, M.-H.: On a new goodness-of-fit process for families of copulas. Can. J. Stat. 37(1), 80–101 (2009)
Mesfioui, M., Tajar, A.: On the properties of some nonparametric concordance measures in the discrete case. Nonparamet. Stat. 17(5), 541–554 (2005)
Mirhosseini, S.M., Amini, M., Kundu, D., Dolati, A.: On a new absolutely continuous bivariate generalized exponential distribution. Stat. Methods Appl. 24(1), 61–83 (2015)
Mitchell, C., Paulson, A.: A new bivariate negative binomial distribution. Naval Res. Logist. Quart. 28(3), 359–374 (1981)
Najjari, V., Bacigál, T., Bal, H.: An archimedean copula family with hyperbolic cotangent generator. Internat. J. Uncertain. Fuzziness Knowl.-Based Syst. 22(05), 761–768 (2014)
Nelsen, R.B.: An introduction to copulas. Springer Science & Business Media, New York (2007)
Panchenko, V.: Goodness-of-fit test for copulas. Physica A 355(1), 176–182 (2005)
Patton, A.J.: A review of copula models for economic time series. J. Multivar. Anal. 110, 4–18 (2012)
Plackett, R.L.: A class of bivariate distributions. J. Am. Stat. Assoc. 60(310), 516–522 (1965)
Sklar, M.: Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Statist. Univ. Paris 8, 229–231 (1959)
Smith, M.D.: Modelling sample selection using archimedean copulas. Economet. J. 6(1), 99–123 (2003)
Trivedi, P.K., Zimmer, D.M.: Copula modeling: an introduction for practitioners. Now Publishers Inc (2007)
Zamani, H., Faroughi, P., Ismail, N.: Bivariate poisson-lindley distribution with application. J. Math. Stat. 11(1), 1 (2015)
Acknowledgements
The authors would like to thank the Deanship of Scientific Research at King Saud University for funding and supporting this research through the initiative of DSR Graduate Students Research Support (GSR).
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Proofs
1.1 Proof of Theorem 2
(i) To prove that \(\lim _{\theta \rightarrow 0}{C_T\left( u,v;\theta \right) }=uv\), we only need to prove that
$$\begin{aligned} \lim _{\theta \rightarrow 0}{\frac{\psi _\theta (t)\ln \left[ \psi _\theta (s)\right] }{\psi _\theta ^{\prime }(t)}} =t\ln \left[ s\right] , \end{aligned}$$(see Nelsen [50], Pages 139–140). Therefore,
$$\begin{aligned} \lim _{\theta \rightarrow 0}\frac{\psi _\theta (t)\ln \left[ \psi _\theta (s)\right] }{\psi _\theta ^{\prime }(t)}&=\lim _{\theta \rightarrow 0}\frac{(1+\theta t)\ln \left[ 1+\theta t\right] }{\theta }\ln \left[ {\frac{\ln \left[ 1+\theta s\right] }{\ln \left[ 1+\theta \right] }}\right] \\&=\lim _{\theta \rightarrow 0}\frac{(1+\theta t)\ln \left[ 1+\theta t\right] }{\theta }\ln \left[ \lim _{\theta \rightarrow 0}{\frac{\ln \left[ 1+\theta s\right] }{\ln \left[ 1+\theta \right] }}\right] . \end{aligned}$$Using L’Hôpital’s rule for both limits, we get
$$\begin{aligned} \lim _{\theta \rightarrow 0}\frac{\psi _\theta (t)\ln \left[ \psi _\theta (s)\right] }{\psi _\theta ^{\prime }(t)} =&\lim _{\theta \rightarrow 0}\left[ t\ln \left[ 1+\theta t\right] +t\right] \ln \left[ \lim _{\theta \rightarrow 0}{\frac{\left( 1+\theta \right) s}{1+\theta s}}\right] =t\ln \left[ s\right] . \end{aligned}$$ Thus, \(\lim _{\theta \rightarrow 0}{C_T\left( u,v;\theta \right) }=uv\).
(ii) For \(\theta \rightarrow \infty \), we have (see Notes)
$$\begin{aligned} \lim _{\theta \rightarrow \infty }\frac{\psi _\theta (t)\ln \left[ \psi _\theta (s)\right] }{\psi _\theta ^{\prime }(t)}&=\lim _{\theta \rightarrow \infty }\frac{\left( 1+\theta t\right) \ln \left[ 1+\theta t\right] \ln \left[ {\frac{\ln \left[ 1+\theta s\right] }{\ln \left[ 1+\theta \right] }}\right] }{\theta }\\&=\lim _{\theta \rightarrow \infty }{\left( \frac{1}{\theta }+t\right) } \lim _{\theta \rightarrow \infty }{\ln \left[ 1+\theta t\right] \ln {\left[ \frac{\ln \left[ \frac{\left( 1+\theta s\right) }{\left( 1+\theta \right) }\right] }{\ln \left[ 1+\theta \right] }+1\right] }}. \end{aligned}$$Note that for small x, we have \(\ln {\left[ 1+x\right] }\sim x\). Also, for large \(\theta \), \(\ln \left[ \frac{\left( 1+\theta s\right) }{1+\theta }\right] \) tends to \(\ln {\left[ s\right] }\) and hence \(\frac{\ln \left[ \frac{\left( 1+\theta s\right) }{\left( 1+\theta \right) }\right] }{\ln \left[ 1+\theta \right] }\) becomes small. Therefore, \(\ln {\left[ \frac{\ln \left[ \frac{\left( 1+\theta s\right) }{\left( 1+\theta \right) }\right] }{\ln \left[ 1+\theta \right] }+1\right] }\sim \frac{\ln \left[ s\right] }{\ln \left[ 1+\theta \right] }\). This implies
$$\begin{aligned}&\lim _{\theta \rightarrow \infty }{\left( \frac{1}{\theta }+t\right) }\lim _{\theta \rightarrow \infty }{\ln {\left[ 1+\theta t\right] }}\ln {\left[ \frac{\ln \left[ \frac{\left( 1+\theta s\right) }{\left( 1+\theta \right) }\right] }{\ln \left[ 1+\theta \right] }+1\right] } \\&\quad =t\lim _{\theta \rightarrow \infty }\frac{\ln \left[ 1+\theta t\right] \ln \left[ s\right] }{\ln \left[ 1+\theta \right] }=t\ln \left[ s\right] . \end{aligned}$$Hence, \(\lim _{\theta \rightarrow \infty }{C_T\left( u,v;\theta \right) }=uv\). \(\square \)
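Both limits can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the multiplicative generator \(\psi _\theta (t)=\ln \left[ 1+\theta t\right] /\ln \left[ 1+\theta \right] \) that appears in the expressions above, and the names `psi`, `psi_inv` and `copula_T` are hypothetical helpers introduced only for illustration.

```python
import math

def psi(t, theta):
    """Multiplicative generator read off from the proof: ln(1 + theta*t) / ln(1 + theta)."""
    return math.log1p(theta * t) / math.log1p(theta)

def psi_inv(x, theta):
    """Inverse of psi: ((1 + theta)**x - 1) / theta, written with expm1 for accuracy."""
    return math.expm1(x * math.log1p(theta)) / theta

def copula_T(u, v, theta):
    """Assumed form C_T(u, v; theta) = psi^{-1}(psi(u) * psi(v)), for theta > 0."""
    return psi_inv(psi(u, theta) * psi(v, theta), theta)

u, v = 0.3, 0.7
thetas = (1e-6, 1.0, 1e6, 1e50)
# Distance between C_T(u, v; theta) and the independence copula u * v.
gaps = {theta: abs(copula_T(u, v, theta) - u * v) for theta in thetas}
for theta in thetas:
    print(f"theta = {theta:g}: |C_T - uv| = {gaps[theta]:.3e}")
```

The gap shrinks roughly linearly in \(\theta \) near zero, whereas the approach to independence for large \(\theta \) is only logarithmic, which matches the \(\ln \left[ s\right] /\ln \left[ 1+\theta \right] \) approximation used in part (ii).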
1.2 Proof of Theorem 3
The density of the truncated-Poisson copula is given by
This function is a product of the two non-negative functions
and
The mixed second derivative of the natural logarithm of \(f_1\) is \(\frac{\partial ^{2}\ln \left[ f_1\left( u,v\right) \right] }{\partial u \partial v}=\frac{\theta ^2}{\left( 1+\theta u\right) \left( 1+\theta v\right) \ln \left[ 1+\theta \right] }\ge 0\). For \(f_2\), the first derivative is \(\frac{\partial \ln \left[ f_2\left( u,v\right) \right] }{\partial u}=\frac{\theta \ln {\left[ 1+\theta v\right] }}{\left( 1+\theta u\right) \left[ \ln {\left[ 1+\theta \right] }+\ln {\left[ 1+\theta u\right] }\ln {\left[ 1+\theta v\right] }\right] }-\frac{\theta }{1+\theta u}\), and differentiating once more with respect to v gives \(\frac{\partial ^{2}\ln \left[ f_2\left( u,v\right) \right] }{\partial u \partial v}=\frac{\theta ^2\ln {\left[ 1+\theta \right] }}{\left( 1+\theta u\right) \left( 1+\theta v\right) \left[ \ln {\left[ 1+\theta \right] }+\ln {\left[ 1+\theta u\right] }\ln {\left[ 1+\theta v\right] }\right] ^2}\ge 0\). Hence, \(\ln {f_1}\) and \(\ln {f_2}\) are both supermodular. Thus, \(f_1\) and \(f_2\) are \(TP_2\) functions. Therefore, the product of the two functions \(c_T=f_1f_2\) is also \(TP_2\) (see Karlin [32]). \(\square \)
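The \(TP_2\) property can also be probed numerically. The sketch below is an illustration under assumptions, not the authors' code: the density is obtained by differentiating the assumed copula \(C_T(u,v;\theta )=\left( e^{\ln [1+\theta u]\ln [1+\theta v]/\ln [1+\theta ]}-1\right) /\theta \), and the split into `f1` and `f2` is one factorization consistent with the derivatives quoted in the proof (the constant factors may be allocated differently in the original). All function names are hypothetical.

```python
import math

def copula_T(u, v, theta):
    """Assumed copula: (exp(a*b/L) - 1) / theta with a, b, L as defined below."""
    a, b, L = math.log1p(theta * u), math.log1p(theta * v), math.log1p(theta)
    return math.expm1(a * b / L) / theta

def density_T(u, v, theta):
    """Mixed partial derivative of copula_T, written as the product f1 * f2."""
    a, b, L = math.log1p(theta * u), math.log1p(theta * v), math.log1p(theta)
    f1 = math.exp(a * b / L)
    f2 = theta * (L + a * b) / (L ** 2 * (1 + theta * u) * (1 + theta * v))
    return f1 * f2

def is_tp2_on_grid(theta, n=15, tol=1e-12):
    """Check c(u1,v1) c(u2,v2) >= c(u1,v2) c(u2,v1) for all u1 < u2, v1 < v2 on a grid."""
    grid = [(i + 0.5) / n for i in range(n)]
    for i, u1 in enumerate(grid):
        for u2 in grid[i + 1:]:
            for j, v1 in enumerate(grid):
                for v2 in grid[j + 1:]:
                    lhs = density_T(u1, v1, theta) * density_T(u2, v2, theta)
                    rhs = density_T(u1, v2, theta) * density_T(u2, v1, theta)
                    if lhs < rhs - tol:
                        return False
    return True

# Finite-difference check that density_T matches the mixed derivative of copula_T.
theta, u, v, h = 5.0, 0.3, 0.6, 1e-3
fd = (copula_T(u + h, v + h, theta) - copula_T(u + h, v, theta)
      - copula_T(u, v + h, theta) + copula_T(u, v, theta)) / h ** 2
print("finite difference:", fd, "closed form:", density_T(u + h / 2, v + h / 2, theta))
print("TP2 on grid for theta = 5:", is_tp2_on_grid(theta))
```

A grid check of this kind cannot replace the proof; it only guards against algebra slips in the closed-form density.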
1.3 Proof of Theorem 4
In terms of copulas, Spearman’s Rho correlation is given by
For \(\theta \ge 0\), take \(t=2\ln \left[ \left( 1+\theta \right) \left( 1+\theta v\right) \right] \) for the first integral, and \(s=\left( 1+\theta \right) \left( 1+\theta v\right) \) for the second one. Hence,
\(\square \)
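As a numerical cross-check of the closed form, Spearman's rho can be approximated directly from the standard copula representation \(\rho =12\int _0^1\int _0^1 C(u,v)\,du\,dv-3\). The sketch below uses a plain midpoint rule and assumes the copula form implied by the generator \(\psi _\theta (t)=\ln \left[ 1+\theta t\right] /\ln \left[ 1+\theta \right] \); `copula_T` and `spearman_rho` are hypothetical names, and the grid size is an arbitrary choice.

```python
import math

def copula_T(u, v, theta):
    """Assumed truncated-Poisson copula for theta > 0."""
    a, b, L = math.log1p(theta * u), math.log1p(theta * v), math.log1p(theta)
    return math.expm1(a * b / L) / theta

def spearman_rho(theta, n=200):
    """Midpoint-rule approximation of 12 * double integral of C over the unit square, minus 3."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            total += copula_T(u, v, theta)
    return 12.0 * total * h * h - 3.0

for theta in (1e-8, 1.0, 10.0):
    print(f"theta = {theta:g}: rho ~ {spearman_rho(theta):.4f}")
```

Values obtained this way can be compared with the expression derived in the proof for the same \(\theta \).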
1.4 Proof of Theorem 5
As our copula belongs to the Archimedean class, we will use the following formula to derive Kendall’s tau correlation coefficient
By setting \(t=\frac{\ln \left[ 1+\theta s\right] }{\ln \left[ 1+\theta \right] }\), we get
\(\square \)
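Kendall's tau can be cross-checked in the same spirit. For an Archimedean copula with additive generator \(\phi \), the standard formula is \(\tau =1+4\int _0^1\phi (t)/\phi ^{\prime }(t)\,dt\); with \(\phi =-\ln \psi \) this becomes \(\tau =1+4\int _0^1\psi (t)\ln \left[ \psi (t)\right] /\psi ^{\prime }(t)\,dt\). The sketch below evaluates that integral with a midpoint rule, assuming \(\psi _\theta (t)=\ln \left[ 1+\theta t\right] /\ln \left[ 1+\theta \right] \); whether this is exactly the form used in the proof is an assumption, and `kendall_tau` is a hypothetical name.

```python
import math

def kendall_tau(theta, n=20000):
    """Midpoint-rule value of 1 + 4 * integral_0^1 psi(t) ln(psi(t)) / psi'(t) dt, theta > 0."""
    L = math.log1p(theta)
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        a = math.log1p(theta * t)
        # psi(t) / psi'(t) = (1 + theta*t) * ln(1 + theta*t) / theta, and ln psi(t) = ln(a / L).
        total += (1.0 + theta * t) * a / theta * math.log(a / L)
    return 1.0 + 4.0 * total * h

for theta in (1e-8, 1.0, 10.0):
    print(f"theta = {theta:g}: tau ~ {kendall_tau(theta):.4f}")
```

As \(\theta \rightarrow 0\) the integrand tends to \(t\ln \left[ t\right] \), whose integral is \(-1/4\), so the approximation returns a value close to zero, in line with the independence limit.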
1.5 Proof of Theorem 6
The upper tail dependence is given by
Using L’Hôpital’s rule, we get
Also, the lower tail dependence is given by
Using L’Hôpital’s rule, we get
\(\square \)
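The absence of tail dependence can be illustrated numerically with the usual definitions \(\lambda _U=\lim _{u\rightarrow 1^-}\left( 1-2u+C(u,u)\right) /(1-u)\) and \(\lambda _L=\lim _{u\rightarrow 0^+}C(u,u)/u\). The sketch below evaluates both ratios close to the limits, again assuming the copula form implied by \(\psi _\theta (t)=\ln \left[ 1+\theta t\right] /\ln \left[ 1+\theta \right] \); the function names are hypothetical.

```python
import math

def copula_T(u, v, theta):
    """Assumed truncated-Poisson copula for theta > 0."""
    a, b, L = math.log1p(theta * u), math.log1p(theta * v), math.log1p(theta)
    return math.expm1(a * b / L) / theta

def upper_tail_ratio(u, theta):
    """(1 - 2u + C(u, u)) / (1 - u); its limit as u -> 1 is the upper tail coefficient."""
    return (1.0 - 2.0 * u + copula_T(u, u, theta)) / (1.0 - u)

def lower_tail_ratio(u, theta):
    """C(u, u) / u; its limit as u -> 0 is the lower tail coefficient."""
    return copula_T(u, u, theta) / u

theta = 5.0
for eps in (1e-2, 1e-3, 1e-4, 1e-5):
    print(f"eps = {eps:g}: upper ~ {upper_tail_ratio(1.0 - eps, theta):.2e}, "
          f"lower ~ {lower_tail_ratio(eps, theta):.2e}")
```

Both ratios decay roughly in proportion to the distance from the corner, which is the behaviour expected when both tail coefficients are zero.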
1.6 Proof of Theorem 7
An Archimedean copula can be generalized to the multivariate case if the inverse of its additive Archimedean generator is completely monotone (see Nelsen [50], Theorem 4.6.2). The inverse of the additive generator of the truncated-Poisson copula takes the form \(\phi _{\theta }^{-1}(t)=\psi _{\theta }^{-1}\left( e^{-t}\right) =\frac{e^{e^{-t}\ln \left[ 1+\theta \right] }-1}{\theta }\). The function \(f_1\left( t\right) =e^{-t}\) is completely monotone, and \(f_2\left( t\right) =\frac{e^{t\ln \left[ 1+\theta \right] }-1}{\theta }\) is absolutely monotone for \(\theta \ge 0\). Hence, the composition \(f_2\circ f_1=\phi _{\theta }^{-1}\left( t\right) \) is completely monotone, which completes the proof. \(\square \)
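To make the multivariate extension concrete, the d-variate copula can be written as \(C_T(u_1,\ldots ,u_d;\theta )=\psi _\theta ^{-1}\left( \prod _{i=1}^{d}\psi _\theta (u_i)\right) \). The sketch below is a minimal illustration for \(\theta >0\) only, assuming \(\psi _\theta (t)=\ln \left[ 1+\theta t\right] /\ln \left[ 1+\theta \right] \) as in the inverse generator above; `copula_T_d` is a hypothetical name.

```python
import math

def copula_T_d(us, theta):
    """d-variate Archimedean form psi^{-1}(prod_i psi(u_i)) for theta > 0 (assumed generator)."""
    L = math.log1p(theta)
    prod = 1.0
    for u in us:
        prod *= math.log1p(theta * u) / L  # psi(u)
    return math.expm1(prod * L) / theta    # psi^{-1}(prod)

theta = 2.0
point = [0.3, 0.7, 0.9]
print("C_T(0.3, 0.7, 0.9):", copula_T_d(point, theta))
# Setting all other arguments to 1 should recover the remaining margin.
print("margin check:", copula_T_d([0.4, 1.0, 1.0], theta))
```

The margin check reflects the uniform-margin property: since \(\psi _\theta (1)=1\), arguments equal to one drop out of the product.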
Applications’ Datasets
Cite this article
Alzaid, A.A., Alhadlaq, W.M. A New Family of Archimedean Copulas: The Truncated-Poisson Family of Copulas. Bull. Malays. Math. Sci. Soc. 45 (Suppl 1), 477–504 (2022). https://doi.org/10.1007/s40840-022-01333-w
DOI: https://doi.org/10.1007/s40840-022-01333-w
Keywords
- Copula
- Archimedean generator
- Archimedean copula
- Truncated-Poisson Copula
- Probability generating function