Abstract
New statistical distributions are obtained through various techniques aimed at improving modeling efficiency. In this study, a novel distribution is introduced by extracting the conditional diagonal section of the bivariate Farlie-Gumbel-Morgenstern distribution whose marginals follow the Weibull distribution. The properties of the proposed distribution are examined, its structure is discussed, and its parameters are estimated using established methods. A reliability analysis is also conducted. To gauge the effectiveness of the distribution for statistical modeling, data sets from the existing literature are used. The findings indicate that the approach offers an efficient and robust model suited to lifetime data sets. According to Kolmogorov–Smirnov test statistics, the modeling efficiency of the Weibull distribution is increased by more than 20% in some situations.
Keywords
- Copula
- Farlie-Gumbel-Morgenstern distribution
- Generating distribution
- Reliability analysis
- Weibull distribution
1 Introduction
The Weibull distribution is one of the most popular lifetime distributions. It has been used widely in mechanical engineering (University of Cambridge, 2003), in modelling strength data (University of Cambridge, 2003), and in modelling data sets from many other fields. In some study areas, particular parameters of the distribution describe an important property (Basu et al., 2009), and in some analyses a single parameter value represents quality (University of Cambridge, 2003). Although the distribution can model very different kinds of lifetime data sets, for some data sets its modelling success may be lower. To address this, some researchers add parameters for better modelling (Marshall, 1997; Mudholkar & Srivastava, 1993). The Weibull distribution is also related to a number of other distributions (Rinne, 2008). The main aim of this study is to increase the modelling efficiency of the Weibull distribution with a different technique. With this approach, the resulting distribution has three parameters and may be more flexible across different kinds of data sets. The technique was previously used to obtain a new distribution (Ünözkan & Yilmaz, 2019). In this article, it gives the Weibull distribution additional flexibility.
2 Materials and Methods
In an earlier study that derived a new distribution for flow data (Ünözkan & Yilmaz, 2019), a conditional Farlie-Gumbel-Morgenstern distribution was used with exponential marginals. The construction relies on the following theorem.
2.1 Theorem (Sklar’s Theorem)
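Let \(F\) be a joint cumulative distribution function with marginals \(H\) and \(G\). Then there exists a copula function \(C\) such that, for every \(x\) and \(y\) in \({\mathbb{R}}\) (Sklar, 1959),
\[F\left(x,y\right)=C\left(H\left(x\right),G\left(y\right)\right).\]
The Farlie-Gumbel-Morgenstern (FGM) copula with marginals \(u\) and \(v\) can be written as below (Nelsen, 2006).
\[C\left(u,v\right)=uv\left[1+\lambda \left(1-u\right)\left(1-v\right)\right],\quad -1\le \lambda \le 1.\]
Hence, the bivariate FGM distribution with marginals \(H\left(x\right)\) and \(G\left(y\right)\) is as follows:
\[F\left(x,y\right)=H\left(x\right)G\left(y\right)\left[1+\lambda \left(1-H\left(x\right)\right)\left(1-G\left(y\right)\right)\right].\]
The probability density function of this distribution, with \(h\) and \(g\) denoting the marginal densities, is as below.
\[f\left(x,y\right)=h\left(x\right)g\left(y\right)\left[1+\lambda \left(1-2H\left(x\right)\right)\left(1-2G\left(y\right)\right)\right].\]
Under the condition \(Y=y\), \(X\) has the conditional probability density function
\[f\left(x\mid y\right)=h\left(x\right)\left[1+\lambda \left(1-2H\left(x\right)\right)\left(1-2G\left(y\right)\right)\right].\]
Under the condition \(Y=y\), \(X\) has the conditional distribution function
\[F\left(x\mid y\right)=H\left(x\right)\left[1+\lambda \left(1-H\left(x\right)\right)\left(1-2G\left(y\right)\right)\right].\]
Under the condition \(Y=t\), the probability of \(X\le t\) is
\[P\left(X\le t\mid Y=t\right)=H\left(t\right)\left[1+\lambda \left(1-H\left(t\right)\right)\left(1-2G\left(t\right)\right)\right],\]
as provided by Ünözkan and Yilmaz (2019).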
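Among models of natural events, the Weibull distribution has a wide range of use because of its modelling capability.
Taking identical marginals \(G=H\), we then have
\[F\left(t\right)=H\left(t\right)\left[1+\lambda \left(1-H\left(t\right)\right)\left(1-2H\left(t\right)\right)\right]=\left(1+\lambda \right)H\left(t\right)-3\lambda {H\left(t\right)}^{2}+2\lambda {H\left(t\right)}^{3}.\]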
We know from the literature that the transmuted distribution with baseline \(H(t)\) is \(\left(1+\lambda \right)H\left(t\right)-\lambda {H\left(t\right)}^{2}\). Here, \({H\left(t\right)}^{2}\) is the failure distribution of a two-component parallel system with independent and identically distributed components, denoted \({H}_{2:2}\). In the light of this idea, \(F\left(t\right)\) can also be rewritten in the form \(\left(1+\lambda \right)H\left(t\right)-\lambda {H}_{3:2}(t)\), where \({H}_{3:2}(t)=3{H\left(t\right)}^{2}-2{H\left(t\right)}^{3}\) is the failure distribution of a 2-out-of-3 system with independent and identically distributed components. Thus, we have a different form of transmuted distribution. When the baseline distribution is assumed to be Weibull, we obtain the following special form.
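The rewriting described above can be checked numerically. The following is a minimal sketch, assuming the conditional diagonal section has the form \(F(t)=H(t)\left[1+\lambda (1-H(t))(1-2H(t))\right]\) (the FGM conditional distribution evaluated at \(x=y=t\) with identical marginals) and that \({H}_{3:2}=3{H}^{2}-2{H}^{3}\). The function names are illustrative and do not come from the original study.

```python
def diagonal_section_cdf(h, lam):
    """Diagonal-section form F = H * [1 + lam * (1 - H) * (1 - 2H)]."""
    return h * (1.0 + lam * (1.0 - h) * (1.0 - 2.0 * h))


def two_out_of_three_cdf(h):
    """Failure distribution of a 2-out-of-3 system with i.i.d. components."""
    return 3.0 * h ** 2 - 2.0 * h ** 3


def transmuted_form_cdf(h, lam):
    """Transmuted-type form F = (1 + lam) * H - lam * H_{3:2}."""
    return (1.0 + lam) * h - lam * two_out_of_three_cdf(h)
```

Both functions take a baseline probability `h = H(t)` in [0, 1], so the identity can be verified on a grid of `h` values without fixing a particular baseline distribution.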
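Suppose that \(H\left(t\right)=G\left(t\right)=1-{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\). Then the distribution function is
\[F\left(t\right)=\left(1-{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)\left[1+\lambda {e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\left(2{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}-1\right)\right],\quad t>0.\]
The probability density function of the conditional Farlie-Gumbel-Morgenstern distribution with Weibull marginals (CFGM-W) is as below.
\[f\left(t\right)=\frac{\alpha }{\beta }{\left(\frac{t}{\beta }\right)}^{\alpha -1}{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\left[1+\lambda \left(1-6{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}+6{e}^{-2{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)\right],\quad t>0.\]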
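To make the CFGM-W construction concrete, the sketch below evaluates its distribution and density functions. It assumes the diagonal-section form \(F=H\left[1+\lambda (1-H)(1-2H)\right]\) with the Weibull baseline given above; the density is obtained by differentiating that expression. The function names are hypothetical, and the study itself used MATLAB rather than Python.

```python
import math


def weibull_cdf(t, alpha, beta):
    """Baseline Weibull CDF H(t) = 1 - exp(-(t / beta) ** alpha) for t > 0."""
    if t <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((t / beta) ** alpha))


def cfgm_w_cdf(t, alpha, beta, lam):
    """CFGM-W distribution function, assuming F = H [1 + lam (1 - H)(1 - 2H)]."""
    h = weibull_cdf(t, alpha, beta)
    return h * (1.0 + lam * (1.0 - h) * (1.0 - 2.0 * h))


def cfgm_w_pdf(t, alpha, beta, lam):
    """CFGM-W density: Weibull density times [1 + lam (1 - 6s + 6s^2)]."""
    if t <= 0.0:
        return 0.0
    s = math.exp(-((t / beta) ** alpha))  # Weibull survival function
    weibull_pdf = (alpha / beta) * (t / beta) ** (alpha - 1.0) * s
    return weibull_pdf * (1.0 + lam * (1.0 - 6.0 * s + 6.0 * s * s))
```

Setting `lam = 0` recovers the ordinary Weibull distribution, which is a convenient sanity check when experimenting with parameter values.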
Some shapes of the probability density function are shown in Figs. 1, 2 and 3.
According to Fig. 1, parameter \(\beta\) solely determines location. The other two parameters change the shape of the probability density function considerably, as seen in Figs. 2 and 3. Therefore, we believe that CFGM-W can be used for data groups that have bimodal data plots.
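The survival function of CFGM-W is as follows:
\[\bar{F}\left(t\right)=1-F\left(t\right)={e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\left[1+\lambda \left(1-{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)\left(1-2{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)\right].\]
The hazard rate function of CFGM-W, the ratio of the density to the survival function, is as below.
\[r\left(t\right)=\frac{\alpha }{\beta }{\left(\frac{t}{\beta }\right)}^{\alpha -1}\frac{1+\lambda \left(1-6{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}+6{e}^{-2{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)}{1+\lambda \left(1-{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)\left(1-2{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\right)}.\]
Some shapes of the hazard rate function are shown in Figs. 4 and 5.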
According to Figs. 4 and 5, parameter \(\beta\) has a big impact on both the probability density function and the hazard rate function. Thus, we believe that CFGM-W can be used for data groups that may pose changing types of risk.
Figures 4 and 5 show an inverse relationship between the hazard rate function and the value of parameter \(\beta\): when \(\beta\) increases, the hazard rate decreases. According to the plots, the proportion of deaths changes initially, and some components deteriorate rapidly at the beginning. Thereafter a balance is formed and an almost constant hazard rate is observed.
According to Fig. 3, parameter \(\alpha\) determines bimodality. When \(\alpha\) is greater than 3, the second mode has the higher peak. When \(\alpha\) is lower than 3, the first mode has the higher peak.
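The survival and hazard rate functions discussed above can be evaluated numerically to explore how the parameters affect the risk profile. This is a minimal sketch, assuming the survival function \(1-F(t)\) with \(F=H\left[1+\lambda (1-H)(1-2H)\right]\) and a Weibull baseline; the function names are illustrative, and \(\lambda >-1\) is assumed so that the denominator stays positive.

```python
import math


def cfgm_w_survival(t, alpha, beta, lam):
    """Survival function 1 - F(t) = s [1 + lam (1 - s)(1 - 2s)], s = exp(-(t/beta)^alpha)."""
    if t <= 0.0:
        return 1.0
    s = math.exp(-((t / beta) ** alpha))
    return s * (1.0 + lam * (1.0 - s) * (1.0 - 2.0 * s))


def cfgm_w_hazard(t, alpha, beta, lam):
    """Hazard rate f(t) / (1 - F(t)) for t > 0, written as Weibull hazard times a ratio."""
    s = math.exp(-((t / beta) ** alpha))
    weibull_hazard = (alpha / beta) * (t / beta) ** (alpha - 1.0)
    numerator = 1.0 + lam * (1.0 - 6.0 * s + 6.0 * s * s)
    denominator = 1.0 + lam * (1.0 - s) * (1.0 - 2.0 * s)
    return weibull_hazard * numerator / denominator
```

With `lam = 0` the ratio equals one and the hazard reduces to the Weibull hazard, so for `alpha = 1` it is the constant `1 / beta`.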
2.2 Maximum Likelihood Estimation
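Using the log-likelihood, the maximum likelihood estimates of the parameters can be obtained by differentiating it with respect to \(\beta\), \(\alpha\) and \(\lambda\) and setting the derivatives to zero. For a sample \({t}_{1},\dots ,{t}_{n}\), the log-likelihood is
\[\ell \left(\alpha ,\beta ,\lambda \right)=n\mathrm{ln}\alpha -n\alpha \mathrm{ln}\beta +\left(\alpha -1\right)\sum_{i=1}^{n}\mathrm{ln}{t}_{i}-\sum_{i=1}^{n}{\left(\frac{{t}_{i}}{\beta }\right)}^{\alpha }+\sum_{i=1}^{n}\mathrm{ln}\left[1+\lambda \left(1-6{e}^{-{\left(\frac{{t}_{i}}{\beta }\right)}^{\alpha }}+6{e}^{-2{\left(\frac{{t}_{i}}{\beta }\right)}^{\alpha }}\right)\right].\]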
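Because the likelihood equations have no obvious closed-form solution, the estimates are found numerically in practice. The sketch below evaluates the CFGM-W log-likelihood and maximizes it with a coarse grid search. It assumes the density is the Weibull density multiplied by \(1+\lambda (1-6s+6{s}^{2})\) with \(s={e}^{-{(t/\beta )}^{\alpha }}\), and it restricts \(\lambda\) to \([-1,1]\) as in the FGM copula. The grid search is a stand-in chosen for simplicity, not the procedure used in the study, and the function names are hypothetical.

```python
import math


def cfgm_w_loglik(data, alpha, beta, lam):
    """Log-likelihood of CFGM-W; returns -inf outside the assumed parameter space."""
    if alpha <= 0.0 or beta <= 0.0 or not -1.0 <= lam <= 1.0:
        return -math.inf
    total = 0.0
    for t in data:
        z = (t / beta) ** alpha
        s = math.exp(-z)
        factor = 1.0 + lam * (1.0 - 6.0 * s + 6.0 * s * s)
        if factor <= 0.0:
            return -math.inf
        total += (math.log(alpha) - alpha * math.log(beta)
                  + (alpha - 1.0) * math.log(t) - z + math.log(factor))
    return total


def grid_search_mle(data, alphas, betas, lams):
    """Return (best log-likelihood, (alpha, beta, lam)) over a user-supplied grid."""
    best_ll, best_params = -math.inf, None
    for a in alphas:
        for b in betas:
            for l in lams:
                ll = cfgm_w_loglik(data, a, b, l)
                if ll > best_ll:
                    best_ll, best_params = ll, (a, b, l)
    return best_ll, best_params
```

A grid result of this kind is typically used only as a starting point for a proper numerical optimizer.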
2.3 Least Squares Estimation
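Suppose that \(H\left(t\right)=1-{e}^{-{\left(\frac{t}{\beta }\right)}^{\alpha }}\), so that \(F\left(t\right)=H\left(t\right)+\lambda H\left(t\right)\left(1-H\left(t\right)\right)\left(1-2H\left(t\right)\right)\) is linear in \(\lambda\).
With least squares estimation, we can therefore reach a closed-form estimator for parameter \(\lambda\). For the other parameters, \(\alpha\) and \(\beta\), numerical methods may be used with software support.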
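The closed form for \(\lambda\) can be illustrated as follows. This is a sketch under stated assumptions: the least squares criterion is taken to be the sum of squared differences between \(F({t}_{(i)})\) and the plotting positions \({p}_{i}=i/(n+1)\), which is a common choice but is not confirmed by the text, and \(\alpha\) and \(\beta\) are treated as given. Writing \({w}_{i}={H}_{i}(1-{H}_{i})(1-2{H}_{i})\), the minimizer is \(\widehat{\lambda }=\sum ({p}_{i}-{H}_{i}){w}_{i}/\sum {w}_{i}^{2}\). All function names are hypothetical.

```python
import math


def weibull_cdf(t, alpha, beta):
    """Baseline Weibull CDF H(t) for t > 0."""
    if t <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((t / beta) ** alpha))


def cfgm_w_cdf(t, alpha, beta, lam):
    """CFGM-W distribution function, assuming F = H [1 + lam (1 - H)(1 - 2H)]."""
    h = weibull_cdf(t, alpha, beta)
    return h * (1.0 + lam * (1.0 - h) * (1.0 - 2.0 * h))


def cfgm_w_quantile(p, alpha, beta, lam, iterations=200):
    """Invert the CDF by bisection for 0 < p < 1 (helper for building check data)."""
    lo, hi = 0.0, beta
    while cfgm_w_cdf(hi, alpha, beta, lam) < p:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cfgm_w_cdf(mid, alpha, beta, lam) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lambda_least_squares(data, alpha, beta):
    """Closed-form least squares estimate of lam for fixed alpha and beta."""
    ordered = sorted(data)
    n = len(ordered)
    num = den = 0.0
    for i, t in enumerate(ordered, start=1):
        p = i / (n + 1.0)  # assumed plotting position
        h = weibull_cdf(t, alpha, beta)
        w = h * (1.0 - h) * (1.0 - 2.0 * h)
        num += (p - h) * w
        den += w * w
    return num / den
```

In a full fit, this estimate could be recomputed inside a numerical search over \(\alpha\) and \(\beta\), which reduces the search to two dimensions.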
In this study, MATLAB 2016b was used to obtain the parameter estimates and the Kolmogorov–Smirnov test statistics.
3 Results and Discussion
Using several data groups, we first compare CFGM-W with the most common statistical distributions. Subsequently, we offer CFGM-W as a new distribution for lifetime data of different kinds. To compare distributions, we use Kolmogorov–Smirnov test statistics to assess how well each distribution fits the data sets. The p-value of the Kolmogorov–Smirnov test indicates how well the distribution explains the data (Næss, 2012; Ross, 2009).
When the test does not reject a candidate distribution, a new question arises: which distribution is best for the data set? According to the hypothesis test, several distributions may be statistically indistinguishable from the empirical (non-parametric) distribution of the data. The Akaike Information Criterion (AIC) can be used to compare these distributions, and the distribution with the minimum AIC value is selected as the best (Akaike, 1974). Since the AIC is a penalized measure and its minimum corresponds to the greatest similarity to the empirical distribution of the data set, the distribution with the minimum AIC is the closest to the data (Snipes & Taylor, 2014; University of Cambridge, 2003).
In this section, CFGM-W is compared with the best-known lifetime distributions on several data groups using Kolmogorov–Smirnov test statistics. The smallest value of the statistic is taken to indicate the best fit, and the p-value informs us about the plausibility of the fit.
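The two comparison criteria described above can be sketched in a few lines. The code computes the one-sample Kolmogorov–Smirnov statistic as the largest gap between the empirical distribution function and a fitted distribution function, and the AIC as \(2k-2\ell\) for \(k\) parameters and maximized log-likelihood \(\ell\). The function names are illustrative; p-values are not computed here, and the study's own figures were produced in MATLAB.

```python
def ks_statistic(data, cdf):
    """One-sample Kolmogorov-Smirnov statistic D for a fitted CDF (a callable)."""
    ordered = sorted(data)
    n = len(ordered)
    d = 0.0
    for i, t in enumerate(ordered, start=1):
        f = cdf(t)
        # Compare F with the empirical CDF just after and just before t.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def aic(loglik, n_params):
    """Akaike Information Criterion: 2k - 2 * log-likelihood."""
    return 2.0 * n_params - 2.0 * loglik
```

For example, `ks_statistic(sample, lambda t: fitted_cdf(t))` can be evaluated for each candidate distribution, where `fitted_cdf` is a placeholder for any fitted distribution function; the candidate with the smallest statistic, or the smallest AIC, is preferred.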
Data 1: The first data set consists of flood peak values (in m³/s) of the Wheaton River near Carcross in Yukon Territory, Canada. It contains 72 exceedances for the years 1958–1984, rounded to one decimal place. The data were analyzed by Choulakian and Stephens (2001) and later used by Merovci and Puka (2014) and Ünözkan and Yilmaz (2019) (see Table 1).
In Table 2 the new distribution offers the best model. The other distributions have been used widely, but CFGM-W fits these flow data better than the other distributions compared.
Data 2: This data group contains 56 measurements of total April flows of Sefaatli Creek from 1953 to 2014. The data were obtained from the Turkish State Water Affairs Directorate and were first used in a study of flow distributions [6] (see Table 3).
In Table 4 the new distribution again offers the best model. The other distributions have been used widely, but CFGM-W fits these flow data better than the other distributions compared.
Data 3: This data set, used by Bhaumik et al. (2009), contains vinyl chloride concentrations (in mg/l) obtained from clean-up gradient monitoring wells (see Table 5).
Table 6 shows that the new distribution increases the modelling capability of the Weibull distribution. Although the Weibull distribution is commonly used, the new distribution offers a better model than the classical distribution.
Data 4: The last data set contains Kevlar Epoxy strength results in spaceships (Badrinarayan & Barlow, 1992). The test was applied to fibers under 90% pressure. The data represent failure times (see Table 7).
In Table 8 the new distribution offers the best model. The Weibull and three-parameter Weibull distributions also fit the data, but CFGM-W fits better than these two well-known distributions.
4 Conclusion
The results and discussion section demonstrates the capability of the new distribution. Compared with other lifetime distributions, it may be more appropriate for some data groups. The structure of CFGM-W changes considerably with each of its three parameters. Table 9 gives the maximum likelihood estimates of the parameters for the models fitted to Data 1 to Data 4.
CFGM-W achieves a good fit across different parameter values. According to the test results for Data 1 to Data 4, we suggest that CFGM-W can be used for many kinds of lifetime data.
We observe that CFGM-W has the best results in all data groups. Based on the tables in the application part, we conclude that CFGM-W can be regarded as a lifetime distribution.
References
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723. https://doi.org/10.1109/TAC.1974.1100705
Badrinarayan, B., & Barlow, J. W. (1992). Metal parts from selective laser sintering of metal-polymer powders. Solid Freeform Fabrication Symposium, 1, 141–146.
Basu, B., Tiwari, D., Kundu, D., & Prasad, R. (2009). Is Weibull distribution the most appropriate statistical strength distribution for brittle materials? Ceramics International, 35(1), 237–246. https://doi.org/10.1016/j.ceramint.2007.10.003
Bhaumik, D. K., Kapur, K., & Gibbons, R. D. (2009). Testing parameters of a gamma distribution for small samples. Technometrics, 51(3), 326–334. https://doi.org/10.1198/tech.2009.07038
Choulakian, V., & Stephens, M. A. (2001). Goodness-of-fit tests for the generalized Pareto distribution. Technometrics, 43(4), 478–484. https://doi.org/10.1198/00401700152672573
Marshall, A. (1997). A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families. Biometrika, 84(3), 641–652. https://doi.org/10.1093/biomet/84.3.641
Merovci, F., & Puka, L. (2014). Transmuted Pareto distribution. ProbStat Forum.
Mudholkar, G. S., & Srivastava, D. K. (1993). Exponentiated Weibull family for analyzing bathtub failure-rate data. IEEE Transactions on Reliability, 42(2), 299–302. https://doi.org/10.1109/24.229504
Næss, S. K. (2012). Application of the Kolmogorov-Smirnov test to CMB data: Is the universe really weakly random? Astronomy and Astrophysics, 538, A17. https://doi.org/10.1051/0004-6361/201117344
Nelsen, R. B. (2006). An introduction to copulas (Vol. 42, Issue 3). Springer New York. https://doi.org/10.1007/0-387-28678-0
Rinne, H. (2008). The Weibull distribution. Chapman and Hall/CRC. https://doi.org/10.1201/9781420087444
Ross, S. M. (2009). Introduction to probability and statistics for engineers and scientists. Elsevier. https://doi.org/10.1016/B978-0-12-370483-2.X0001-X
Sklar, M. (1959). Fonctions de Répartition à n Dimensions et Leurs Marges. In Annales de l’ISUP (Issue 3, pp. 229–231). Publications de l’Institut Statistique de l’Université de Paris.
Snipes, M., & Taylor, D. C. (2014). Model selection and Akaike Information Criteria: An example from wine ratings and prices. Wine Economics and Policy, 3(1), 3–9. https://doi.org/10.1016/j.wep.2014.03.001
University of Cambridge. (2003). Materials data book. In Materials & design.
Ünözkan, H., & Yilmaz, M. (2019). A new method for generating distributions: An application to flow data. International Journal of Statistics and Applications, 9(3), 92–99. https://doi.org/10.5923/j.statistics.20190903.04
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Unozkan, H., Potas, N., Yilmaz, M. (2024). Some Comments on Increasing Modelling Efficiency of Weibull Distribution. In: Erçetin, Ş.Ş., Açıkalın, Ş.N., Tomé, L. (eds) Chaos, Complexity, and Leadership 2023. ICCLS 2018. Springer, Cham. https://doi.org/10.1007/978-3-031-64265-4_10
DOI: https://doi.org/10.1007/978-3-031-64265-4_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-64264-7
Online ISBN: 978-3-031-64265-4
eBook Packages: Physics and Astronomy (R0)