1 Introduction

The Weibull distribution is one of the most popular lifetime distributions. It has been used widely in mechanical engineering (University of Cambridge, 2003), in modelling strength data (University of Cambridge, 2003), and in modelling data sets from many other fields. In some study areas, particular parameters of this distribution describe an important feature of the process under study (Basu et al., 2009), and in some analyses the parameter value alone represents the quality (University of Cambridge, 2003). Although the Weibull distribution can model very different kinds of lifetime data, its fit is weaker for some data sets. To address this, researchers have added parameters to obtain more flexible models (Marshall, 1997; Mudholkar & Srivastava, 1993). The Weibull distribution is also related to a number of other distributions (Rinne, 2008). The main aim of this study is to increase the modelling efficiency of the Weibull distribution by a different technique, one previously used to obtain a new distribution (Ünözkan & Yilmaz, 2019). With this approach, the resulting distribution has three parameters and may be more flexible across different kinds of data sets. In this way, the Weibull distribution gains additional modelling capability.

2 Materials and Methods

In an earlier study aimed at obtaining a new distribution for flows, a conditional Farlie-Gumbel-Morgenstern distribution with exponential marginals was used (Ünözkan & Yilmaz, 2019). That construction relies on the following theorem.

2.1 Theorem (Sklar’s Theorem)

Let \(F\) be a joint cumulative distribution function with marginals \(H\) and \(G\). Then there exists a copula function \(C\) such that, for every \(x\) and \(y\) in \({\mathbb{R}}\) (Sklar, 1959),

$$F\left(x,y\right)=C\left(H\left(x\right),G\left(y\right)\right)$$

The Farlie-Gumbel-Morgenstern (FGM) copula with arguments \(u,v\in \left[0,1\right]\) and dependence parameter \(\theta \in \left[-1,1\right]\) can be written as follows (Nelsen, 2006).

$${C}_{\theta }\left(u,v\right)=uv+\theta uv\left(1-u\right)\left(1-v\right)$$

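Hence, substituting \(u=H\left(x\right)\) and \(v=G\left(y\right)\) and writing the dependence parameter as \(\lambda\), the bivariate FGM distribution with marginals \(H\left(x\right)\) and \(G\left(y\right)\) is as follows, where \(\overline{H }=1-H\) and \(\overline{G }=1-G\):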

$$F\left(x,y\right)=H\left(x\right)G\left(y\right)\left[1+\lambda \overline{H }\left(x\right)\overline{G }\left(y\right)\right].$$

The probability density function of this distribution is as below.

$$f\left(x,y\right)=h\left(x\right)g\left(y\right)\left[1+\lambda \left(1-2H\left(x\right)\right)\left(1-2G\left(y\right)\right)\right]$$
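The relation between the joint distribution function and the density above can be checked numerically. The following is a minimal sketch, assuming Weibull marginals for both \(H\) and \(G\) and arbitrary illustrative parameter values; the function names are hypothetical and are not taken from the original study.

```python
import math


def weibull_cdf(t, alpha, beta):
    """Weibull CDF with shape alpha and scale beta (illustrative marginal)."""
    return 1.0 - math.exp(-((t / beta) ** alpha)) if t > 0 else 0.0


def weibull_pdf(t, alpha, beta):
    """Weibull density matching weibull_cdf."""
    if t <= 0:
        return 0.0
    z = (t / beta) ** alpha
    return (alpha / beta) * (t / beta) ** (alpha - 1.0) * math.exp(-z)


def fgm_joint_cdf(x, y, lam, alpha, beta):
    """F(x, y) = H(x) G(y) [1 + lam (1 - H(x)) (1 - G(y))]."""
    H = weibull_cdf(x, alpha, beta)
    G = weibull_cdf(y, alpha, beta)
    return H * G * (1.0 + lam * (1.0 - H) * (1.0 - G))


def fgm_joint_pdf(x, y, lam, alpha, beta):
    """f(x, y) = h(x) g(y) [1 + lam (1 - 2 H(x)) (1 - 2 G(y))]."""
    H = weibull_cdf(x, alpha, beta)
    G = weibull_cdf(y, alpha, beta)
    h = weibull_pdf(x, alpha, beta)
    g = weibull_pdf(y, alpha, beta)
    return h * g * (1.0 + lam * (1.0 - 2.0 * H) * (1.0 - 2.0 * G))


def mixed_difference(x, y, lam, alpha, beta, eps=1e-3):
    """Central finite-difference approximation of d^2 F / (dx dy)."""
    def F(a, b):
        return fgm_joint_cdf(a, b, lam, alpha, beta)
    return (F(x + eps, y + eps) - F(x + eps, y - eps)
            - F(x - eps, y + eps) + F(x - eps, y - eps)) / (4.0 * eps * eps)


if __name__ == "__main__":
    args = (0.8, 1.3, 0.7, 2.0, 1.5)  # x, y, lam, alpha, beta (arbitrary values)
    print(mixed_difference(*args), fgm_joint_pdf(*args))
```

The two printed numbers should agree closely, since the density is the mixed partial derivative of the joint distribution function.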

Given \(Y=y\), \(X\) has the following conditional probability density function.

$${f}_{\left.X\right|Y=y}(x)=h\left(x\right)\left[1+\lambda \left(1-2H\left(x\right)\right)\left(1-2G\left(y\right)\right)\right]$$

Integrating over \(x\) and using \(\int_{-\infty }^{x}h\left(s\right)\left(1-2H\left(s\right)\right)ds=H\left(x\right)-{H\left(x\right)}^{2}=H\left(x\right)\overline{H }\left(x\right)\), the conditional distribution function of \(X\) given \(Y=y\) is as below.

$${F}_{\left.X\right|Y=y}(x)=\int_{-\infty }^{x}h\left(s\right)\left[1+\lambda \left(1-2H\left(s\right)\right)\left(1-2G\left(y\right)\right)\right]ds$$
$${F}_{\left.X\right|Y=y}(x)=H\left(x\right)+\lambda \left(1-2G\left(y\right)\right)H\left(x\right)\overline{H }\left(x\right)$$

Setting \(x=y=t\), that is, conditioning on \(Y=t\), the probability of \(X\le t\) is

$${F}_{\left.X\right|Y=t}(t)=H\left(t\right)+\lambda \left(1-2G\left(t\right)\right)H\left(t\right)\overline{H }\left(t\right)$$

as provided in (Ünözkan & Yilmaz, 2019).

The Weibull distribution is widely used for modelling natural events because of its modelling capability, which makes it a natural choice for the marginals.

Taking identical marginals, \(G=H\), we then have

$$F\left(t\right)=\left(1+\lambda \right)H\left(t\right)-\lambda {H\left(t\right)}^{2}\left(3-2H\left(t\right)\right)$$
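As a check on this expression, the conditional density integrated up to \(t\), with \(y=t\) and identical marginals \(G=H\), expands as follows. This is a worked step added for clarity, not part of the original derivation.

$$\begin{aligned} F\left(t\right) & =\int_{-\infty }^{t}h\left(s\right)\left[1+\lambda \left(1-2H\left(s\right)\right)\left(1-2H\left(t\right)\right)\right]ds \\ & =H\left(t\right)+\lambda \left(1-2H\left(t\right)\right)\left(H\left(t\right)-{H\left(t\right)}^{2}\right) \\ & =H\left(t\right)+\lambda H\left(t\right)\left(1-3H\left(t\right)+2{H\left(t\right)}^{2}\right) \\ & =\left(1+\lambda \right)H\left(t\right)-\lambda {H\left(t\right)}^{2}\left(3-2H\left(t\right)\right) \end{aligned}$$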

We know from the literature that the transmuted distribution with baseline \(H(t)\) is \(\left(1+\lambda \right)H\left(t\right)-\lambda {H\left(t\right)}^{2}\). Here, \({H\left(t\right)}^{2}\) is the failure distribution of a two-component parallel system with independent and identical components, denoted \({H}_{2:2}\). In the same spirit, \(F\left(t\right)\) can be rewritten in the form \(\left(1+\lambda \right)H\left(t\right)-\lambda {H}_{3:2}(t)\), where \({H}_{3:2}(t)={H\left(t\right)}^{2}\left(3-2H\left(t\right)\right)\) is the failure distribution of a 2-out-of-3 system with independent and identical components. Thus, we have a different form of transmuted distribution. Hence, when the baseline distribution is assumed to be Weibull, we obtain the special form given in Eq. (1).
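Before specializing to the Weibull case, the 2-out-of-3 interpretation of \({H}^{2}\left(3-2H\right)\) can be illustrated by simulation. The sketch below assumes Weibull component lifetimes with arbitrary parameter values; a 2-out-of-3 system fails at the second component failure, so its lifetime is the middle of three lifetimes. The function names are hypothetical.

```python
import math
import random


def weibull_cdf(t, alpha, beta):
    """Weibull CDF with shape alpha and scale beta."""
    return 1.0 - math.exp(-((t / beta) ** alpha)) if t > 0 else 0.0


def two_out_of_three_cdf(t, alpha, beta):
    """Failure distribution H^2 (3 - 2H) of a 2-out-of-3 system."""
    H = weibull_cdf(t, alpha, beta)
    return H * H * (3.0 - 2.0 * H)


def simulate_two_out_of_three(t, alpha, beta, n_sim=20000, seed=0):
    """Monte Carlo estimate of P(system lifetime <= t)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        # random.weibullvariate takes (scale, shape), i.e. (beta, alpha)
        # in the notation of this paper.
        lifetimes = sorted(rng.weibullvariate(beta, alpha) for _ in range(3))
        if lifetimes[1] <= t:  # second failure brings the system down
            hits += 1
    return hits / n_sim


if __name__ == "__main__":
    t, alpha, beta = 1.0, 2.0, 1.5  # arbitrary illustrative values
    print(simulate_two_out_of_three(t, alpha, beta),
          two_out_of_three_cdf(t, alpha, beta))
```

The simulated proportion should be close to the closed-form value, up to Monte Carlo error.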

Suppose that \(H\left(t\right)=G\left(t\right)={1-e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\).

$$F\left(t\right)=\left(1+\lambda \right)\left(1-{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)-\lambda {\left(1-{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)}^{2}\left(3-2\left(1-{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)\right) \tag{1}$$

The probability density function of conditional Farlie-Gumbel-Morgenstern with Weibull marginal (CFGM-W) is as below.

$$f\left(t\right)=\frac{d}{dt}\left[\left(1+\lambda \right)\left(1-{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)-\lambda {\left(1-{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)}^{2}\left(3-2\left(1-{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)\right)\right]$$
$$\begin{aligned} f\left(t\right) & =\left(\frac{\alpha }{\beta }{\left(\frac{t}{\beta }\right)}^{\alpha -1}{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right) \\ & \quad \times \left(1+\lambda -6\lambda \left(1-{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)+6\lambda {\left(1-{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)}^{2}\right),\quad \lambda \in \left[-1,1\right],\ \alpha ,\beta >0 \end{aligned} \tag{2}$$
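Equations (1) and (2) translate directly into code. The following minimal sketch uses hypothetical function names and arbitrary parameter values, and compares the density with a central finite difference of the distribution function.

```python
import math


def cfgmw_cdf(t, alpha, beta, lam):
    """CFGM-W distribution function, Eq. (1)."""
    if t <= 0:
        return 0.0
    H = 1.0 - math.exp(-((t / beta) ** alpha))
    return (1.0 + lam) * H - lam * H * H * (3.0 - 2.0 * H)


def cfgmw_pdf(t, alpha, beta, lam):
    """CFGM-W density, Eq. (2)."""
    if t <= 0:
        return 0.0
    z = (t / beta) ** alpha
    H = 1.0 - math.exp(-z)
    h = (alpha / beta) * (t / beta) ** (alpha - 1.0) * math.exp(-z)
    return h * (1.0 + lam - 6.0 * lam * H + 6.0 * lam * H * H)


if __name__ == "__main__":
    t, alpha, beta, lam = 1.2, 2.0, 1.5, 0.5  # arbitrary illustrative values
    eps = 1e-5
    numeric = (cfgmw_cdf(t + eps, alpha, beta, lam)
               - cfgmw_cdf(t - eps, alpha, beta, lam)) / (2.0 * eps)
    print(numeric, cfgmw_pdf(t, alpha, beta, lam))
```

With \(\lambda =0\) both functions reduce to the ordinary Weibull distribution function and density.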

Some shapes of the probability density function are shown in Figs. 1, 2 and 3.

According to Fig. 1, parameter \(\beta\) mainly determines where the density is located. The other two parameters change the shape of the probability density function considerably, as shown in Figs. 2 and 3. Therefore, we believe that CFGM-W can be used for data sets whose plots are bimodal.

Fig. 1 Plots of the probability density function-1. Three curves of \(f(t)\) against \(t\) are plotted, with approximate peaks at (1, 3.5), (1.8, 2.5), and (1.8, 2).

Fig. 2 Plots of the probability density function-2. Four curves of \(f(t)\) against \(t\) are plotted, two of them bell-shaped, with approximate peaks at (1.6, 4.4), (1.6, 3), (1.75, 2.6), and (1.75, 2.5).

Fig. 3 Plots of the probability density function-3. Three saddle-shaped curves of \(f(t)\) against \(t\) are plotted, with approximate peaks at (1.5, 2.2) and (0.5, 0.5); the third curve has two maxima, at approximately (1, 0.6) and (2, 0.6).

The survival function of CFGM-W is as follows;

$$S\left(t\right)=1-F\left(t\right)$$
$$=1-\left(\left(1+\lambda \right)\left({1-e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)-\lambda {\left({1-e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)}^{2}\left(3-2\left({1-e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)\right)\right)$$
$$=1-\left(1+\lambda \right)\left(H\left(t\right)\right)+\lambda {\left(H\left(t\right)\right)}^{2}\left(3-2\left(H\left(t\right)\right)\right)$$
$$S\left(t\right)=\left(1+\lambda \right)\overline{H }\left(t\right)-\lambda \left(3{\left(\overline{H }\left(t\right)\right)}^{2}-2{\left(\overline{H }\left(t\right)\right)}^{3}\right)$$

The hazard rate function of CFGM-W is as below.

$$r\left(t\right)=\frac{f\left(t\right)}{S\left(t\right)}$$
$$=\frac{\left(\frac{\alpha }{\beta }{\left(\frac{t}{\beta }\right)}^{\alpha -1}{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)\left(1+\lambda -6\lambda \left({1-e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)+6\lambda {\left({1-e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)}^{2}\right)}{1-\left(\left(1+\lambda \right)\left({1-e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)-\lambda {\left({1-e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)}^{2}\left(3-2\left({1-e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)\right)\right)}$$
$$=\frac{\left(\frac{\alpha }{\beta }{\left(\frac{t}{\beta }\right)}^{\alpha -1}{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)\left(1+\lambda -6\lambda \left(1-{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)+6\lambda {\left(1-{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)}^{2}\right)}{\left(1+\lambda \right){e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}-3\lambda {e}^{-2{\left(\frac{t}{\beta }\right)}^{\alpha }}+2\lambda {e}^{-3{\left(\frac{t}{\beta }\right)}^{\alpha }}}$$
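The survival and hazard rate functions can be sketched in the same way. The code below is an illustration with hypothetical function names; it writes the survival function in terms of \(\overline{H }\left(t\right)={e}^{-{\left(t/\beta \right)}^{\alpha }}\), so that \({\overline{H }}^{2}={e}^{-2{\left(t/\beta \right)}^{\alpha }}\) and \({\overline{H }}^{3}={e}^{-3{\left(t/\beta \right)}^{\alpha }}\).

```python
import math


def cfgmw_pdf(t, alpha, beta, lam):
    """CFGM-W density for t > 0."""
    z = (t / beta) ** alpha
    H = 1.0 - math.exp(-z)
    h = (alpha / beta) * (t / beta) ** (alpha - 1.0) * math.exp(-z)
    return h * (1.0 + lam - 6.0 * lam * H + 6.0 * lam * H * H)


def cfgmw_survival(t, alpha, beta, lam):
    """S(t) = (1 + lam) Hbar - lam (3 Hbar^2 - 2 Hbar^3), Hbar = exp(-(t/beta)^alpha)."""
    hbar = math.exp(-((t / beta) ** alpha))
    return (1.0 + lam) * hbar - lam * (3.0 * hbar ** 2 - 2.0 * hbar ** 3)


def cfgmw_hazard(t, alpha, beta, lam):
    """Hazard rate r(t) = f(t) / S(t); intended for moderate t where S(t) > 0."""
    return cfgmw_pdf(t, alpha, beta, lam) / cfgmw_survival(t, alpha, beta, lam)


if __name__ == "__main__":
    for lam in (-0.5, 0.0, 0.5):  # arbitrary illustrative values
        print(lam, cfgmw_hazard(1.2, 2.0, 1.5, lam))
```

For \(\lambda =0\) the hazard reduces to the Weibull hazard \(\frac{\alpha }{\beta }{\left(\frac{t}{\beta }\right)}^{\alpha -1}\), which gives a quick sanity check.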

Some shapes of the hazard rate function are shown in Figs. 4 and 5.

According to Figs. 4 and 5, parameter \(\beta\) has a large impact on both the probability density function and the hazard rate function. Thus, we believe that CFGM-W can be used for data sets in which the type of risk changes over time.

Fig. 4 Plots of hazard rate function-1. Four increasing curves of \(r(t)\) against \(t\) are plotted. The first curve begins at (0, 0), remains constant until (1.4, 0), rises through (2, 40), peaks at (2.3, 160), then falls and ends at (2.3, 105). The gamma value is 0.5. All values are approximate.

Fig. 5 Plots of hazard rate function-2. Two curves of \(r(t)\) against \(t\) are plotted. The first curve begins at (0, 0.4), passes through (1, 0.4), and ends at (8, 1.25). The second curve begins at (0, 1.3), passes through (0.25, 2), and ends at (8, 4.5). All values are approximate.

Figures 4 and 5 show an inverse relationship between the hazard rate function and the value of parameter \(\beta\): when \(\beta\) increases, the hazard rate decreases. According to the plots, the hazard rate changes at the beginning, when some components deteriorate rapidly. Thereafter a balance is formed and an almost constant hazard rate is observed.

According to Fig. 3, parameter \(\alpha\) governs bimodality. When \(\alpha\) is larger than 3, the second mode has the higher peak. When \(\alpha\) is smaller than 3, the first mode has the higher peak.
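Bimodality can be explored numerically by counting local maxima of the density on a grid. The sketch below is only illustrative: the parameter values are assumptions chosen for the example and are not the values used in the figures, and the helper names are hypothetical.

```python
import math


def cfgmw_pdf(t, alpha, beta, lam):
    """CFGM-W density for t > 0."""
    z = (t / beta) ** alpha
    H = 1.0 - math.exp(-z)
    h = (alpha / beta) * (t / beta) ** (alpha - 1.0) * math.exp(-z)
    return h * (1.0 + lam - 6.0 * lam * H + 6.0 * lam * H * H)


def count_modes(alpha, beta, lam, t_max=3.0, n_grid=3000):
    """Count strict local maxima of the density on a uniform grid over (0, t_max]."""
    ts = [t_max * (i + 1) / n_grid for i in range(n_grid)]
    fs = [cfgmw_pdf(t, alpha, beta, lam) for t in ts]
    return sum(1 for i in range(1, n_grid - 1)
               if fs[i - 1] < fs[i] > fs[i + 1])


if __name__ == "__main__":
    # lam = 0 is the plain Weibull density; lam = 1 is the upper end of [-1, 1].
    for lam in (0.0, 1.0):
        print(lam, count_modes(alpha=3.0, beta=1.0, lam=lam))
```

The grid range `t_max` has to cover the bulk of the density for the count to be meaningful, so it should be scaled with \(\beta\).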

2.2 Maximum Likelihood Estimation

For a sample \({t}_{1},{t}_{2},\dots ,{t}_{n}\), and writing \(H\left({t}_{i}\right)=1-{e}^{-{\left(\frac{{t}_{i}}{\beta }\right)}^{\alpha }}\), the likelihood function is

$$L\left(\beta ,\alpha ,\lambda ;\underline{t}\right)=f\left({t}_{1},{t}_{2},{t}_{3},\dots ,{t}_{n};\beta ,\alpha ,\lambda \right)=\prod_{i=1}^{n}f\left({t}_{i};\beta ,\alpha ,\lambda \right)$$
$$=\prod_{i=1}^{n}\left(\frac{\alpha }{\beta }{\left(\frac{{t}_{i}}{\beta }\right)}^{\alpha -1}{e}^{-{\left(\frac{{t}_{i}}{\beta }\right)}^{\alpha }}\right)\left(1+\lambda -6\lambda H\left({t}_{i}\right)+6\lambda {H\left({t}_{i}\right)}^{2}\right)$$
$$={\alpha }^{n}{\beta }^{-n\alpha }\prod_{i=1}^{n}{{t}_{i}}^{\alpha -1}\,{e}^{-\sum_{i=1}^{n}{\left(\frac{{t}_{i}}{\beta }\right)}^{\alpha }}\prod_{i=1}^{n}\left(1+\lambda -6\lambda H\left({t}_{i}\right)+6\lambda {H\left({t}_{i}\right)}^{2}\right)$$

Using the log-likelihood, the maximum likelihood estimates of the parameters can be obtained by differentiating with respect to \(\beta ,\alpha\) and \(\lambda\) and setting the derivatives to zero.

$$\text{log}\left(L\left(\beta ,\alpha ,\lambda ;\underline{t}\right)\right)=n\text{log}\alpha -n\alpha \text{log}\beta +\left(\alpha -1\right)\sum_{i=1}^{n}\text{log}\left({t}_{i}\right)-\sum_{i=1}^{n}{\left(\frac{{t}_{i}}{\beta }\right)}^{\alpha }$$
$$+\sum_{i=1}^{n}\text{log}\left(1+\lambda -6\lambda H\left({t}_{i}\right)+6\lambda {H\left({t}_{i}\right)}^{2}\right)$$
$$\frac{\partial }{\partial \lambda }\text{log}\left(L\left(\beta ,\alpha ,\lambda ;\underline{t}\right)\right)=\sum_{i=1}^{n}\frac{1-6H\left({t}_{i}\right)+6{H\left({t}_{i}\right)}^{2}}{1+\lambda -6\lambda H\left({t}_{i}\right)+6\lambda {H\left({t}_{i}\right)}^{2}}=0$$
$$\frac{\partial }{\partial \beta }\text{log}\left(L\left(\beta ,\alpha ,\lambda ;\underline{t}\right)\right)=-\frac{n\alpha }{\beta }+\frac{\alpha }{\beta }\sum_{i=1}^{n}{\left(\frac{{t}_{i}}{\beta }\right)}^{\alpha }$$
$$-\frac{\alpha }{\beta }\sum_{i=1}^{n}\frac{6\lambda \left(2H\left({t}_{i}\right)-1\right){\left(\frac{{t}_{i}}{\beta }\right)}^{\alpha }{e}^{-{\left(\frac{{t}_{i}}{\beta }\right)}^{\alpha }}}{1+\lambda -6\lambda H\left({t}_{i}\right)+6\lambda {H\left({t}_{i}\right)}^{2}}=0$$
$$\frac{\partial }{\partial \alpha }\text{log}\left(L\left(\beta ,\alpha ,\lambda ;\underline{t}\right)\right)=\frac{n}{\alpha }+\sum_{i=1}^{n}\text{log}\left(\frac{{t}_{i}}{\beta }\right)-\sum_{i=1}^{n}{\left(\frac{{t}_{i}}{\beta }\right)}^{\alpha }\text{log}\left(\frac{{t}_{i}}{\beta }\right)$$
$$+\sum_{i=1}^{n}\frac{6\lambda \left(2H\left({t}_{i}\right)-1\right){\left(\frac{{t}_{i}}{\beta }\right)}^{\alpha }{e}^{-{\left(\frac{{t}_{i}}{\beta }\right)}^{\alpha }}\text{log}\left(\frac{{t}_{i}}{\beta }\right)}{1+\lambda -6\lambda H\left({t}_{i}\right)+6\lambda {H\left({t}_{i}\right)}^{2}}=0$$

These equations have no closed-form solution and are solved numerically.
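In practice the log-likelihood is evaluated and maximized numerically. The sketch below is a minimal illustration and not the procedure used in this study: it evaluates the CFGM-W log-likelihood as a sum of log-densities and runs a simple grid search over \(\lambda\) with \(\alpha\) and \(\beta\) held fixed. The function names and the sample values are hypothetical.

```python
import math


def cfgmw_log_likelihood(data, alpha, beta, lam):
    """Sum of log f(t_i) over the sample; returns -inf if a density term is not positive."""
    total = 0.0
    for t in data:
        z = (t / beta) ** alpha
        H = 1.0 - math.exp(-z)
        w = 1.0 + lam - 6.0 * lam * H + 6.0 * lam * H * H
        if w <= 0.0:
            return float("-inf")
        total += (math.log(alpha / beta) + (alpha - 1.0) * math.log(t / beta)
                  - z + math.log(w))
    return total


def profile_lambda(data, alpha, beta, n_grid=201):
    """Grid search for lambda in [-1, 1] with alpha and beta fixed (a crude sketch)."""
    best_lam, best_ll = None, float("-inf")
    for k in range(n_grid):
        lam = -1.0 + 2.0 * k / (n_grid - 1)
        ll = cfgmw_log_likelihood(data, alpha, beta, lam)
        if ll > best_ll:
            best_lam, best_ll = lam, ll
    return best_lam, best_ll


if __name__ == "__main__":
    sample = [0.5, 0.9, 1.2, 1.4, 2.0]  # made-up values for illustration only
    print(profile_lambda(sample, alpha=1.5, beta=1.0))
```

A full fit would maximize over all three parameters at once with a general-purpose optimizer; the grid search here only shows how the likelihood depends on \(\lambda\).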

2.3 Least Squares Estimation

$$F\left(t\right)=\left(1+\lambda \right)\left({1-e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)-\lambda {\left({1-e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)}^{2}\left(3-2\left({1-e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)\right)=u$$

Writing \(H\left(t\right)=1-{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\), this equation becomes

$$\lambda H\left(t\right)\left(H\left(t\right)-1\right)\left(2H\left(t\right)-1\right)+H\left(t\right)-u=0$$

For the sample points \({t}_{i}\), with \({u}_{i}\) denoting the empirical distribution function value assigned to \({t}_{i}\), the sum of squares is

$$SS=\sum_{i=1}^{n}{\left(\lambda H\left({t}_{i}\right)\left(H\left({t}_{i}\right)-1\right)\left(2H\left({t}_{i}\right)-1\right)+H\left({t}_{i}\right)-{u}_{i}\right)}^{2}$$

Setting the derivative with respect to \(\lambda\) to zero,

$$\frac{d}{d\lambda }\sum_{i=1}^{n}{\left(\lambda H\left({t}_{i}\right)\left(H\left({t}_{i}\right)-1\right)\left(2H\left({t}_{i}\right)-1\right)+H\left({t}_{i}\right)-{u}_{i}\right)}^{2}=0$$

gives

$$\widehat{{\lambda }_{LSE}}=\frac{\sum_{i=1}^{n}\left({u}_{i}-H\left({t}_{i}\right)\right)\left(H\left({t}_{i}\right)\left(H\left({t}_{i}\right)-1\right)\left(2H\left({t}_{i}\right)-1\right)\right)}{\sum_{i=1}^{n}{\left(H\left({t}_{i}\right)\left(H\left({t}_{i}\right)-1\right)\left(2H\left({t}_{i}\right)-1\right)\right)}^{2}}$$

The remaining two equations are

$$\frac{d}{d\beta }\sum_{i=1}^{n}{\left(\lambda H\left({t}_{i}\right)\left(H\left({t}_{i}\right)-1\right)\left(2H\left({t}_{i}\right)-1\right)+H\left({t}_{i}\right)-{u}_{i}\right)}^{2}=0$$
$$\frac{d}{d\alpha }\sum_{i=1}^{n}{\left(\lambda H\left({t}_{i}\right)\left(H\left({t}_{i}\right)-1\right)\left(2H\left({t}_{i}\right)-1\right)+H\left({t}_{i}\right)-{u}_{i}\right)}^{2}=0$$

With the least squares estimation, we obtain a closed-form estimator for parameter \(\lambda\), given \(\alpha\) and \(\beta\). For the other parameters, \(\alpha\) and \(\beta\), numerical methods may be used with software support.
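The closed-form step can be sketched as follows. Setting the derivative of the sum of squares with respect to \(\lambda\) to zero, with \(\alpha\) and \(\beta\) fixed and \({A}_{i}=H\left({t}_{i}\right)\left(H\left({t}_{i}\right)-1\right)\left(2H\left({t}_{i}\right)-1\right)\), gives \(\widehat{\lambda }=\sum {A}_{i}\left({u}_{i}-H\left({t}_{i}\right)\right)/\sum {{A}_{i}}^{2}\). The code assumes plotting positions \({u}_{i}=i/\left(n+1\right)\) for the ordered sample when none are supplied; this choice and the function name are assumptions made for the example, not details given in the study.

```python
import math


def lse_lambda(times, alpha, beta, u=None):
    """Least squares estimate of lambda for fixed alpha and beta.

    times : sample values; they are sorted internally.
    u     : optional empirical CDF values matching the sorted sample.
            Defaults to i / (n + 1), an assumed plotting position.
    """
    ts = sorted(times)
    n = len(ts)
    if u is None:
        u = [(i + 1) / (n + 1) for i in range(n)]
    num = 0.0
    den = 0.0
    for t, ui in zip(ts, u):
        H = 1.0 - math.exp(-((t / beta) ** alpha))
        A = H * (H - 1.0) * (2.0 * H - 1.0)
        num += (ui - H) * A
        den += A * A
    return num / den


if __name__ == "__main__":
    sample = [0.3, 0.7, 1.1, 1.6, 2.4]  # made-up values for illustration only
    print(lse_lambda(sample, alpha=2.0, beta=1.5))
```

The estimate is not constrained to \(\left[-1,1\right]\), so in practice it may need to be clipped or used as a starting value for a numerical search over all three parameters.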

In this study, MATLAB 2016b is used to obtain the parameter estimates and the Kolmogorov–Smirnov test statistics.

3 Results and Discussion

Using several data sets, we first compare CFGM-W with the most common statistical distributions and then offer it as a new distribution for lifetime data. To assess how well each distribution fits a data set, we use the Kolmogorov–Smirnov test. Its p-value indicates how plausible it is that the data come from the fitted distribution (Næss, 2012; Ross, 2009).

When the test does not reject several candidate distributions, a new problem arises: which distribution is better for this data set? According to the hypothesis test, many distributions may be compatible with the empirical (non-parametric) distribution of the data. The Akaike Information Criterion (AIC) can be used to compare them, and the distribution with the minimum AIC value is selected as the best (Akaike, 1974). Since the AIC is a penalty value, the minimum AIC corresponds to the closest agreement with the empirical distribution of the data set (Snipes & Taylor, 2014; University of Cambridge, 2003).
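The AIC comparison can be sketched in a few lines using the standard definition \(\text{AIC}=2k-2\text{log}L\), where \(k\) is the number of estimated parameters. The log-likelihood values below are made-up numbers for illustration and are not results from this study; the function names are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 log L."""
    return 2.0 * n_params - 2.0 * log_likelihood


def best_by_aic(fits):
    """Return the name of the model with the smallest AIC.

    fits maps a model name to (maximized log-likelihood, number of parameters).
    """
    return min(fits, key=lambda name: aic(*fits[name]))


if __name__ == "__main__":
    # Hypothetical fits: (log-likelihood, parameter count)
    fits = {"Weibull": (-105.0, 2), "CFGM-W": (-100.0, 3)}
    for name, (ll, k) in fits.items():
        print(name, aic(ll, k))
    print(best_by_aic(fits))
```

The extra parameter of a richer model is only worthwhile under this criterion when it raises the log-likelihood by more than one unit.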

In this section, CFGM-W is compared with the best-known lifetime distributions using several data sets. Among the candidates, the distribution with the smallest Kolmogorov–Smirnov statistic is considered the best model, and the p-value of the test indicates how plausible the fit is.
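The Kolmogorov–Smirnov statistic is the largest vertical distance between the empirical distribution function of the sample and the fitted distribution function. The sketch below computes it for any distribution function passed in as a callable; the function name and the small example are illustrative only and do not reproduce the MATLAB computations of this study.

```python
def ks_statistic(data, cdf):
    """One-sample Kolmogorov-Smirnov statistic D = sup |F_n(x) - F(x)|."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x,
        # so both sides of the jump are compared with F(x).
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d


if __name__ == "__main__":
    # Toy example against the uniform distribution on [0, 1].
    print(ks_statistic([0.5, 0.1, 0.9], lambda x: x))
```

To test a fitted CFGM-W model, the callable would be its distribution function evaluated at the estimated parameters.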

Data 1: The first data set consists of the flood peak values (in m³/s) of the Wheaton River near Carcross in Yukon Territory, Canada. It contains 72 exceedances for the years 1958–1984, rounded to one decimal place. The data were analyzed by Choulakian and Stephens (2001) and later used by Merovci and Puka (2014) and Ünözkan and Yilmaz (2019) (see Table 1).

Table 1 Wheaton river flood peaks (m³/s) data

According to Table 2, the new distribution offers the best model. The other distributions have been used widely, but CFGM-W fits this flow data better than all of the distributions compared.

Table 2 Wheaton river flood peaks (m³/s) data test results

Data 2: This data set contains 56 measurements of total flows from Sefaatli Creek in April between 1953 and 2014. The data were received from the Turkish State Water Affairs Directorate and were first used in a study of flow distribution [6] (see Table 3).

Table 3 Sefaatli Creek’s mean flows (m³/s) in April data

According to Table 4, the new distribution again offers the best model. The other distributions have been used widely, but CFGM-W fits this flow data better than all of the distributions compared.

Table 4 Sefaatli Creek’s mean flows (m³/s) in April test results

Data 3: This data set, used by Bhaumik et al. (2009), contains vinyl chloride concentrations (in mg/l) obtained from clean-up gradient monitoring wells (see Table 5).

Table 5 Vinyl chloride data

Table 6 shows that the new distribution increases the modelling capability of the Weibull distribution. Although the Weibull distribution is commonly used, the new distribution offers a better model than the classical one.

Table 6 Vinyl chloride data test results

Data 4: The last data set contains Kevlar Epoxy strength results for spaceships (Badrinarayan & Barlow, 1992). The test is applied to fibers under 90% pressure, and the data represent failure times (see Table 7).

Table 7 Tensile strength under 90% pressure data

According to Table 8, the new distribution offers the best model. The Weibull and three-parameter Weibull distributions are also adequate, but CFGM-W fits better than both of these well-known distributions.

Table 8 Tensile strength under 90% pressure data test results

4 Conclusion

The results and discussion section demonstrates the capability of the new distribution. Compared with other lifetime distributions, it may be more appropriate for some data sets. The structure of CFGM-W changes considerably with each of its three parameters. Table 9 gives the maximum likelihood estimates of the parameters for the models fitted to Data 1 to Data 4.

Table 9 Values of parameter estimation in models

CFGM-W achieves a good fit across quite different parameter values. According to the test results for Data 1 to Data 4, we suggest that CFGM-W can be used for many kinds of lifetime data.

CFGM-W gives the best results in all four data sets. Based on the tables in the application part, we conclude that CFGM-W can be regarded as a lifetime distribution.