Introduction

Natural (plant) fibers are promising reinforcements to replace synthetic glass fibers in many nonstructural and semi-structural applications thanks to their desirable properties such as high specific strength and stiffness, good acoustic insulation, vibration damping, and lower environmental impacts [1,2,3,4]. As a result, the use of these natural reinforcements in polymer composite materials has recently gained substantial attention. However, unlike synthetic fibers, the properties of natural fibers have a relatively large uncertainty that arises from their natural variabilities, extraction, and processing. These uncertainty sources are generally hard or impossible to control due to their stochastic nature. Therefore, it is necessary to consider the variation of properties, such as fiber strength, when modeling or predicting the behavior of natural fiber-reinforced composites.

Weibull statistics is a popular data processing tool for interpreting life data such as time to failure and strength. Weibull distribution parameters can be adjusted to fit many life distributions that are compatible with the weakest link theory or have a sudden failure characteristic. For example, the Weibull distribution has been widely used to process the strength data of brittle materials, such as glass and carbon fibers, in which the most severe flaw controls the strength [2, 5, 6]. It has also been successfully applied to a variety of natural fibers such as hemp, flax, sisal, agave, and bamboo fibers [7,8,9,10,11,12,13,14], as natural fiber failure follows the “brittleness” and “weakest link” assumptions of the Weibull distribution [15].

Although the basic version of the Weibull distribution is the 2-parameter model, it is usually not capable of accurately describing the strength distribution of technical natural fibers [8, 15]. Many efforts have been made to improve the accuracy of the Weibull distribution by incorporating the factors that influence the failure mechanism into the distribution model [4, 16,17,18]. The modified 3-parameter Weibull distribution is an improved version in which a third parameter is introduced so that the effect of defect density is considered as a function of fiber geometry. For instance, Trujillo et al. [15] applied the 3-parameter Weibull model to the strength data of bamboo fibers tested at different gauge lengths and showed that the model can predict fiber strength as a function of length with acceptable accuracy. According to the modified 3-parameter Weibull distribution, the probability of failure of a fiber with a volume \(V\) at a stress less than or equal to \(\sigma\) is given by [15]:

$$ F\left( \sigma | \sigma_{0}, \beta, m \right) = 1 - \exp\left( - \left( \frac{V}{V_{0}} \right)^{\beta} \left( \frac{\sigma}{\sigma_{0}} \right)^{m} \right) $$
(1)

where \(m\) is the shape parameter, \(\sigma_{0}\) is the scale parameter, and \(\beta\) is the geometry sensitivity. In Eq. (1), \(V\) is the fiber volume, and \(V_{0}\) is the reference volume, an arbitrary parameter used to normalize the effect of the fiber volume. Interpretations of the values of the distribution parameters are available in [14, 15]. There are several classical methods for estimating the Weibull distribution parameters from an experimentally measured dataset, such as Linear Regression (LR), Maximum Likelihood (ML), and the Method of Moments (MM). However, depending on the scatter of the experimental data and the corresponding confidence interval, these methods may yield different values for the distribution parameters. This divergence in the estimated parameters becomes more pronounced for natural fibers due to the high variation in their strength. Therefore, the classical methods of estimating the Weibull parameters are not reliable approaches for finding the distribution that best fits the observed strength data.
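As an illustration of how Eq. (1) is evaluated in practice, the following minimal Python sketch computes the failure probability of a single fiber; the parameter values and fiber dimensions are hypothetical and only serve to show the calculation.

```python
import numpy as np

def modified_weibull_cdf(sigma, V, sigma0, beta, m, V0=1.0):
    """Failure probability of a fiber of volume V at stress sigma, Eq. (1).

    sigma0: scale parameter, m: shape parameter, beta: geometry sensitivity,
    V0: arbitrary reference volume used to normalize the fiber volume.
    """
    return 1.0 - np.exp(-(V / V0) ** beta * (sigma / sigma0) ** m)

# Hypothetical values for illustration only (not fitted results from this study)
print(modified_weibull_cdf(sigma=150.0, V=2.5, sigma0=250.0, beta=0.6, m=3.0))
```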

Unlike the classical methods, the Bayesian approach is a comprehensive alternative for parameter estimation and model calibration, and it has gained considerable attention in recent years. The Bayesian approach not only avoids over-fitting but also provides a credible interval for the fitting parameters, which is a valuable quantity for evaluating the uncertainty of the unknown parameters [19]. The latter is the main advantage of the Bayesian estimator when the scatter of the experimental data is high; in fact, the Bayesian estimator can describe the uncertainty of the estimated parameters through the posterior predictive distribution. Using Bayesian inference, the posterior distribution is obtained by combining our prior beliefs about the parameters with an assessment of their fit to the experimental data [20, 21].

Almongy et al. [22] recently proposed a Bayesian approach to estimate the credible intervals of generalized Weibull distribution parameters in the case of type-II censored data and applied their model to carbon fiber strength data. Chako et al. [23] used a Bayesian estimator to obtain the parameters of the Weibull distribution for analyzing competing risk data with binomial removals, which is applicable to specific failure data sets. Ducros et al. [24] proposed a Bayesian restoration maximization approach for fitting a mixture of 2-parameter Weibull distributions to a set of heterogeneous data.

In the Bayesian approach, we are interested in the posterior distribution of the estimated parameters. According to Bayesian inference, the posterior probability is given by:

$$ p\left( {\varvec{\theta}} | {\varvec{x}} \right) = \frac{f({\varvec{x}}|{\varvec{\theta}})\,\pi \left( {\varvec{\theta}} \right)}{\int f({\varvec{x}}|{\varvec{\theta}})\,\pi \left( {\varvec{\theta}} \right)\,d{\varvec{\theta}}} $$
(2)

where \({\varvec{x}}\) is the observed data, \({\varvec{\theta}}\) \(\left( {\varvec{\theta}} \in R^{q}, q \ge 1 \right)\) is the parameter vector, \(\pi \left( {\varvec{\theta}} \right)\) denotes the prior distribution, and \(f({\varvec{x}}|{\varvec{\theta}})\) is the likelihood function derived from a statistical model for the observed data. The prior is specified subjectively in advance; however, computing the likelihood function in closed form is not possible in most cases [24,25,26,27]. The term \(\smallint f({\varvec{x}}|{\varvec{\theta}})\pi \left( {\varvec{\theta}} \right)d{\varvec{\theta}}\) is the normalization constant. Since this constant does not depend on \({\varvec{\theta}}\), the posterior distribution is directly proportional to the product of the likelihood function and the prior distribution.

Approximate Bayesian Computation (ABC) is a computational method based on Bayesian inference that seeks to estimate the posterior distributions of model parameters computationally rather than analytically. The ABC approach is a popular method for assessing statistical models, particularly for analyzing complex problems, because it facilitates the estimation of the posterior distribution by bypassing the evaluation of the likelihood function of conventional Bayesian inference. As a result, various methods have been introduced under this scheme over recent years [28,29,30,31,32]. However, it has been shown that implementing the ABC with direct Monte Carlo sampling is not computationally efficient, particularly for high-dimensional models. Hence, various algorithms have been proposed to enhance the efficiency of the ABC [33]. One way is to take advantage of Markov Chain Monte Carlo (MCMC) sampling; another is to apply sequential importance sampling with some variations, known as Sequential Monte Carlo (SMC) [34].

In this study, for the first time, we applied the ABC approach to fit a modified 3-parameter Weibull distribution to the experimentally measured strengths of date palm fibers and determined the highest posterior density intervals for the non-deterministic distribution parameters. The Metropolis–Hastings (MH) algorithm, as a class of MCMC sampling, and an SMC algorithm were employed to obtain the posterior distributions of the parameters. This study provides a reliable method for predicting the fiber strength while accounting for the fiber dimensions.

Estimation of modified Weibull distribution using ABC

In order to estimate the modified 3-parameter Weibull distribution parameters, the ABC algorithm is applied to the strength data of date palm fibers measured at seven different gauge lengths. Here, the uncertainty of the distribution parameters (i.e., posterior distribution) is modeled by fitting the most appropriate statistical distribution based on the chi-square goodness of fit test. The general idea of the ABC algorithm can be described as follows:

  • Generate a family of random parameters from the prior distribution (prior belief)

  • Simulate data from the generative model using the generated parameters

  • Compare the simulated data with the observed data (update our belief)

  • Repeat this process until the posterior distributions of the parameters converge to stationary distributions

The prior distributions are chosen to be exponential with parameters \(\lambda_{1}\), \(\lambda_{2}\), and \(\lambda_{3}\) for \(\sigma_{0}\), \(\beta\), and \(m\), respectively. The likelihood of the modified Weibull distribution, assuming independent observations, can be written as:

$$ f\left( {\varvec{x}} | {\varvec{\theta}} \right) = f\left( x_{1} | {\varvec{\theta}} \right) f\left( x_{2} | {\varvec{\theta}} \right) \cdots f\left( x_{n} | {\varvec{\theta}} \right) $$
(3)

thus

$$ f\left( {\varvec{x}} | {\varvec{\theta}} \right) = \prod_{i = 1}^{n} \left[ \frac{m}{\sigma_{0}^{m}} \left( \frac{V_{i}}{V_{0}} \right)^{\beta} \sigma_{i}^{m - 1} \right] \exp\left( - \sum_{i = 1}^{n} \left( \frac{V_{i}}{V_{0}} \right)^{\beta} \left( \frac{\sigma_{i}}{\sigma_{0}} \right)^{m} \right) $$
(4)

where \(n\) is the number of observed data points and the vector \({\varvec{x}} = \left( x_{1}, x_{2}, \ldots, x_{n} \right)^{T}\) contains the observed data. In order to compare the simulated and real data, we calculate the norm of the difference vector and compare it with a threshold \(\varepsilon\), whose value depends on the desired level of accuracy. Although various quantile-based methods have been suggested in the literature for determining the optimal \(\varepsilon\), they do not apply to all kinds of models [35, 36]. The comparison process is carried out as follows:

The candidate parameters are accepted if \(\left\| {\varvec{x}}_{sim} - {\varvec{x}}_{obs} \right\| \le \varepsilon\) and rejected otherwise,

where \(\left\| \cdot \right\|\) is the vector norm. This process is repeated until \(N\) particles are accepted and the posterior distribution converges to a stationary distribution. In the following, we explore different approaches, namely MCMC and SMC, to improve the computational efficiency of the process.
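For concreteness, a minimal Python sketch of this rejection scheme is given below (the study's implementation was in MATLAB). The generative model simulates strengths by inverting Eq. (1); the interpretation of \(\lambda_{1}\), \(\lambda_{2}\), \(\lambda_{3}\) as exponential rates, and the prior rates, threshold, and fiber volumes in the usage comment, are placeholders and assumptions, not values from this study.

```python
import numpy as np

def simulate_strengths(theta, volumes, rng, V0=1.0):
    """Draw one synthetic strength per fiber volume from Eq. (1) by inverse-CDF sampling."""
    sigma0, beta, m = theta
    u = rng.uniform(size=np.shape(volumes))
    return sigma0 * (-np.log(1.0 - u) / (np.asarray(volumes) / V0) ** beta) ** (1.0 / m)

def abc_rejection(sigma_obs, volumes, rates, eps, n_accept, rng):
    """Plain (direct Monte Carlo) ABC: keep prior draws whose simulated data fall
    within eps of the observed strengths in the Euclidean norm."""
    accepted = []
    while len(accepted) < n_accept:
        theta = rng.exponential(1.0 / np.asarray(rates))   # exponential priors on (sigma0, beta, m)
        sigma_sim = simulate_strengths(theta, volumes, rng)
        if np.linalg.norm(sigma_sim - sigma_obs) <= eps:
            accepted.append(theta)
    return np.array(accepted)

# e.g. abc_rejection(sigma_obs, volumes, rates=[0.01, 1.0, 0.5], eps=..., n_accept=1000,
#                    rng=np.random.default_rng(0))  -- rates and eps are placeholders
```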

Approximate Bayesian computation using Markov chain Monte Carlo (ABC MCMC)

MCMC sampling is an efficient method for generating a family of random variables from a target population. The method uses a Markov process to produce a new random variable (i.e., a new state) from the previous one (i.e., the old state). The ergodic theorem, which plays for Markov chains the role that the law of large numbers plays for independent samples, ensures that this process converges to a stationary distribution [37]. As a result, MCMC sampling is used extensively in the ABC framework to facilitate the process.

Several MCMC-based algorithms have been proposed in the literature that apply to various statistical models. Among them, Metropolis–Hastings (MH), Gibbs Sampling, and Hamiltonian Monte Carlo are the most frequently used [38,39,40]. In this study, we use the Metropolis–Hastings algorithm to efficiently generate random numbers and thereby reduce the computational cost.

The Metropolis–Hastings algorithm comprises two parts: generating candidate values and evaluating (accepting or rejecting) them. In the first part, we need to define a new distribution called the proposal distribution. Various proposal distributions have been suggested in previous studies [41,42,43]. Here, the gamma distribution is used as the proposal distribution because the Weibull parameters are strictly positive. The gamma distribution is generally described by two parameters: the shape parameter \(a\) and the scale parameter \(b\). The scale parameter is defined in terms of the precision parameter \(\tau\) as:

$$ b = \frac{1}{\tau } $$
(5)

Considering that the mean value of the gamma distribution is obtained by μ = ab, the proposal distribution can be stated as:

$$ f(x|\mu ,\tau ) = \mathrm{gamma}\left( \mu \tau, \frac{1}{\tau} \right) $$
(6)

This simple substitution facilitates generating the Weibull parameters by proposing values in terms of the precision \(\tau\) and the mean \(\mu\) instead of the shape and scale parameters.
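To make this reparameterization concrete, the short check below (with illustrative values only) verifies that a gamma distribution with shape \(a = \mu\tau\) and scale \(b = 1/\tau\) has mean \(\mu\) and variance \(\mu/\tau\):

```python
from scipy import stats

mu, tau = 3.0, 2.0                                   # illustrative mean and precision
proposal = stats.gamma(a=mu * tau, scale=1.0 / tau)  # shape a = mu*tau, scale b = 1/tau
print(proposal.mean())   # a*b   = mu      -> 3.0
print(proposal.var())    # a*b^2 = mu/tau  -> 1.5
```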

In the second part of the MH algorithm, a judgment is performed on the proposed value. The algorithm accepts or rejects particles component-wise; in other words, a value is proposed for each parameter and the judgment is then made. To initiate the Markov process, we guess an initial value for the parameters and choose a fixed value for the precision \(\tau\). This process is repeated for \(t = 1, \ldots, N\), where \(N\) is the total number of iterations of the chain and \(t\) is the current iteration, and it is continued until the required convergence is achieved. The error ellipses and parameter distributions are visualized in the next section for different numbers of iterations. Figure 1 shows the entire process of the ABC MCMC-MH algorithm, including the judgment step.

Figure 1

Flowchart of the ABC MCMC using the Metropolis–Hastings algorithm
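A minimal Python sketch of the component-wise ABC MCMC step is given below (the study's implementation was in MATLAB). It reuses the `simulate_strengths` helper from the earlier rejection sketch, assumes exponential priors with rates `rates`, and uses a Marjoram-type acceptance rule (a distance check followed by a Metropolis–Hastings ratio with the gamma-proposal correction); the exact judgment used in this study is the one shown in Fig. 1.

```python
import numpy as np
from scipy import stats

def gamma_logpdf(x, mu, tau):
    """Log-density of the gamma proposal parameterized by mean mu and precision tau."""
    return stats.gamma.logpdf(x, a=mu * tau, scale=1.0 / tau)

def abc_mcmc(sigma_obs, volumes, rates, eps, tau, theta0, n_iter, rng):
    """Component-wise ABC MCMC with gamma proposals (a sketch, not the authors' code).

    A proposed component is accepted only if the simulated strengths fall within eps
    of the observations and the Metropolis-Hastings ratio (exponential prior times the
    asymmetric-proposal correction) passes; otherwise the old value is kept."""
    theta = np.array(theta0, dtype=float)
    chain = np.empty((n_iter, theta.size))
    for t in range(n_iter):
        for j in range(theta.size):
            candidate = theta.copy()
            candidate[j] = rng.gamma(shape=theta[j] * tau, scale=1.0 / tau)
            sigma_sim = simulate_strengths(candidate, volumes, rng)
            if np.linalg.norm(sigma_sim - sigma_obs) <= eps:
                log_ratio = (
                    stats.expon.logpdf(candidate[j], scale=1.0 / rates[j])
                    - stats.expon.logpdf(theta[j], scale=1.0 / rates[j])
                    + gamma_logpdf(theta[j], candidate[j], tau)   # q(old | new)
                    - gamma_logpdf(candidate[j], theta[j], tau)   # q(new | old)
                )
                if np.log(rng.uniform()) < log_ratio:
                    theta = candidate
        chain[t] = theta
    return chain
```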

Approximate Bayesian computation using sequential Monte Carlo (ABC SMC)

The idea of the SMC algorithm is based on population Monte Carlo and importance sampling. The application of different types of SMC samplers within the ABC framework has been investigated by several researchers [44, 45]. The ABC SMC algorithm produces random particles within a common measurable space using the particle distributions \(\left\{ {P_{{\varepsilon_{t} }} ({\varvec{\theta}}|{\varvec{x}})} \right\}_{1 \le t \le T}\). The ABC algorithm is applied to the particles with a decreasing sequence of thresholds \(\left\{ {\varepsilon_{t} } \right\}_{1 \le t \le T}\), one at each time step \(t\).

For \(t \ge 2\), instead of sampling from the prior distribution, the parameters are sampled from the set of particles accepted in the previous sequence and perturbed according to a suitable perturbation kernel. Each particle is associated with a weight, \(\left\{ {{\varvec{\theta}}^{{\left( {i,{ }t - 1} \right)}} ,{ }w^{{\left( {i,{ }t - 1} \right)}} } \right\}_{1 \le i \le N}\), at each time step and is propagated according to a kernel \(\left\{ {K_{t} \left( {{\varvec{\theta}}^{{\left( {i,{ }t} \right)}} {|}{\varvec{\theta}}^{{\left( {j,{ }t - 1} \right)}} } \right)} \right\}_{1 \le t \le T}\) for any particles \({\varvec{\theta}}^{{\left( {i,{ }t} \right)}}\) and \({\varvec{\theta}}^{{\left( {j,{ }t - 1} \right)}}\). Note that only the particles of the first sequence are sampled from the prior distribution; thus, the choice of kernel strongly influences the acceptance rate and the efficiency of the ABC SMC.

Here, the uniform kernel is employed for perturbing the particles [34]. In this kernel, each component \(\theta_{j}\), \(j \in \left\{ 1,2, \ldots ,d \right\}\), of the parameter vector \({\varvec{\theta}} = \left( \theta_{1}, \ldots ,\theta_{d} \right)\) is perturbed independently within \(\left[ \theta_{j} - \sigma_{j}^{\left( t \right)}, \theta_{j} + \sigma_{j}^{\left( t \right)} \right]\) with a density of \(\frac{1}{2\sigma_{j}^{\left( t \right)}}\), where \(\sigma_{j}^{\left( t \right)}\) is the kernel width, defined as:

$$ \sigma_{j}^{\left( t \right)} = \frac{1}{2}\left( {\mathop {\max }\limits_{1 \le k \le N} \left\{ {\theta_{j}^{{\left( {k, t - 1} \right)}} } \right\} - \mathop {\min }\limits_{1 \le k \le N} \left\{ {\theta_{j}^{{\left( {k, t - 1} \right)}} } \right\}} \right) $$
(7)

This sequence is repeated until \(t = T\), when the particle population converges to a stationary distribution. Another factor influencing the efficiency of the ABC SMC is the sequence of decreasing thresholds, \(\left\{ {\varepsilon_{t} } \right\}_{1 \le t \le T}\). Various methods have been proposed in the literature for determining the thresholds; however, their limitations make them inapplicable to all models [35, 46]. In this study, the thresholds were determined by trial and error. Figure 2 illustrates the implementation flowchart of the ABC SMC algorithm.

Figure 2

Flowchart of the ABC SMC algorithm
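A minimal Python sketch of one ABC SMC sequence with the uniform perturbation kernel of Eq. (7) is shown below; it reuses the `simulate_strengths` helper from the rejection sketch, assumes the exponential priors introduced earlier (interpreted with rates `rates`), and uses the standard importance-weight update for a kernel-perturbed population. It is an illustration of the scheme, not the authors' MATLAB code.

```python
import numpy as np
from scipy import stats

def kernel_widths(prev_particles):
    """Per-component kernel width of Eq. (7): half the range of the previous population."""
    return 0.5 * (prev_particles.max(axis=0) - prev_particles.min(axis=0))

def prior_pdf(theta, rates):
    """Product of the independent exponential prior densities for (sigma0, beta, m)."""
    return float(np.prod(stats.expon.pdf(theta, scale=1.0 / np.asarray(rates))))

def smc_step(prev_particles, prev_weights, sigma_obs, volumes, rates, eps_t, n, rng):
    """One ABC SMC sequence: resample from the previous (normalized-weight) population,
    perturb each component uniformly within +/- its kernel width, and keep proposals
    whose simulated strengths fall within the current threshold eps_t."""
    prev_particles = np.asarray(prev_particles)
    prev_weights = np.asarray(prev_weights)
    widths = kernel_widths(prev_particles)
    particles, weights = [], []
    while len(particles) < n:
        idx = rng.choice(len(prev_particles), p=prev_weights)
        theta = prev_particles[idx] + rng.uniform(-widths, widths)
        if np.any(theta <= 0.0):                      # Weibull parameters must stay positive
            continue
        sigma_sim = simulate_strengths(theta, volumes, rng)
        if np.linalg.norm(sigma_sim - sigma_obs) > eps_t:
            continue
        # importance weight: prior density over the uniform-kernel mixture density
        inside = np.all(np.abs(prev_particles - theta) <= widths, axis=1)
        kernel_mix = prev_weights[inside].sum() / np.prod(2.0 * widths)
        particles.append(theta)
        weights.append(prior_pdf(theta, rates) / kernel_mix)
    weights = np.asarray(weights)
    return np.asarray(particles), weights / weights.sum()
```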

Both methods were implemented in MATLAB and applied to the experimentally measured strengths of the date palm fibers to determine the fitting parameters of the modified 3-parameter Weibull distribution. The performance of the two methods and their computational efficiency are compared in the next section.

Results and discussion

Fiber strength data

The tensile strength of the technical date palm fibers, extracted from the leaf sheath, was measured through single fiber tensile tests as per ASTM C1557-20. The fibers were tested at seven different gauge lengths: 10 mm, 15 mm, 20 mm, 25 mm, 30 mm, 40 mm, and 50 mm. The mean diameter (d) of each fiber was obtained by measuring the width of the fiber at six different locations along its length using optical microscopy. The corresponding cross-sectional area (A) was then calculated assuming a circular cross-section, and the fiber volume was obtained by multiplying the cross-sectional area by the fiber length. At least twenty fibers were tested at each gauge length, resulting in at least 140 data points in total.
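A minimal sketch of the geometry calculation described above is given below; the diameter readings and gauge length are hypothetical and only illustrate the procedure.

```python
import numpy as np

def fiber_volume(diameters_mm, gauge_length_mm):
    """Fiber volume from the mean of several width measurements,
    assuming a circular cross-section (as described above)."""
    d = np.mean(diameters_mm)             # mean of the six width readings, mm
    area = np.pi * d ** 2 / 4.0           # cross-sectional area, mm^2
    return area * gauge_length_mm         # volume, mm^3

# Hypothetical readings for one fiber tested at a 20 mm gauge length
print(fiber_volume([0.21, 0.22, 0.20, 0.23, 0.21, 0.22], 20.0))
```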

Figure 3 shows the distributions of the diameters of all tested fibers and their corresponding strengths. The fiber diameters ranged from 0.107 to 0.361 mm. To ensure an unbiased comparison, a one-way ANOVA was carried out at a confidence level of 95% (\(\alpha = 0.05\)). Table 1 reports the mean value and standard deviation of the fiber diameters for the different gauge lengths and the p value of the ANOVA test. The calculated p value is greater than 0.05, meaning there is no statistically significant difference in fiber diameter between the gauge-length groups.
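The ANOVA check can be reproduced with standard tools; the sketch below uses hypothetical diameter groups (the actual per-gauge-length data are those summarized in Table 1) and shows only three of the seven groups for brevity.

```python
from scipy import stats

# Hypothetical diameter readings (mm) grouped by gauge length, for illustration only
d_10mm = [0.21, 0.25, 0.19, 0.28, 0.22]
d_25mm = [0.23, 0.20, 0.27, 0.24, 0.21]
d_50mm = [0.22, 0.26, 0.20, 0.25, 0.23]

f_stat, p_value = stats.f_oneway(d_10mm, d_25mm, d_50mm)
# p_value > 0.05 -> no significant diameter difference between gauge-length groups
print(f_stat, p_value)
```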

Figure 3

Scatterplot of fiber diameter versus strength with their corresponding marginal distributions

Table 1 ANOVA results for different gauge lengths of fibers

Estimation of modified 3-parameter Weibull distribution parameters

The ABC MCMC and ABC SMC algorithms were applied to the strength data of the date palm fibers to estimate the fitting parameters of the modified Weibull distribution of Eq. (1). The results are presented below, and the performance of the two algorithms is compared and evaluated.

Figure 4 illustrates the error ellipses of the estimated parameters for both the ABC MCMC and ABC SMC methods with 10,000 iterations. Error ellipses are a useful graphical tool for showing the pair-wise correlation between the computed values. Here, we used a 95% confidence level to plot the error ellipses, meaning that 95% of the population of each parameter pair falls within the ellipse. Comparing the orientations of the error ellipses, both methods resulted in a relatively similar pair-wise correlation between the parameters. In addition, all the error ellipses of the ABC SMC method are larger than those of the ABC MCMC method. A larger error ellipse indicates a higher scatter in the posterior distribution of the parameters; therefore, we can conclude that the ABC SMC algorithm results in higher standard deviations in the estimated parameters.
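One common way to construct such 95% error ellipses from two chains of posterior samples is sketched below; this is the standard covariance-based construction and not necessarily the exact plotting routine used here.

```python
import numpy as np
from scipy import stats

def error_ellipse(x, y, confidence=0.95):
    """Semi-axes and orientation of the confidence ellipse for two parameter samples.

    Scales the covariance eigenvalues by the chi-squared quantile with 2 degrees of
    freedom so that the stated fraction of a bivariate-normal population lies inside."""
    cov = np.cov(x, y)
    eigvals, eigvecs = np.linalg.eigh(cov)              # ascending eigenvalues
    scale = stats.chi2.ppf(confidence, df=2)            # ~5.991 for 95%, 2 d.o.f.
    semi_axes = np.sqrt(scale * eigvals)
    angle = np.arctan2(eigvecs[1, -1], eigvecs[0, -1])  # orientation of the major axis
    return semi_axes, angle
```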

Figure 4

Error ellipses of the estimated parameters with respect to each other for a ABC MCMC (MH) and b ABC SMC algorithms

The traces of the ABC MCMC estimations of the parameters \(\sigma_{0}\), \(\beta\), and \(m\) against the number of iterations are given in Fig. 5. The results show that, after a few iterations, the generated parameters fall within a stable range, which results in a stationary distribution.

Figure 5

Trace of the generated fitting parameters \(\sigma_{0}\), \(\beta\), and \(m\) against the iterations of the ABC MCMC algorithm

Figure 6 visualizes the population of the parameters of the modified Weibull distribution, \({\varvec{\theta}} = \left( {\sigma_{0} ,\beta ,m} \right)\), at different sequences of the ABC SMC simulation. We can clearly see that the parameter space rapidly shrinks as the sequence advances, meaning that the simulation converges to the posterior distribution of the parameters.

Figure 6

Population of the parameters of the modified Weibull distributions in different sequences of the ABC SMC simulation: a sequence 2, b sequence 6, c sequence 12

To provide a better understanding of each algorithm's efficiency, the ABC was also implemented using direct Monte Carlo to estimate the model parameters. Figure 7 compares the acceptance rates of the ABC MCMC, ABC SMC, and ABC (direct Monte Carlo). The results indicate that the plain ABC has the lowest acceptance rate, significantly lower than those of the other two algorithms, and that the ABC MCMC has the highest, exceeding that of the ABC SMC. This result is consistent with the error ellipses of these two models shown in Fig. 4.

Figure 7

Acceptance rate of ABC MCMC, ABC SMC, and ABC-direct Monte Carlo

The histograms of the marginal posterior distributions of the estimated fitting parameters obtained from the ABC (direct Monte Carlo), ABC MCMC, and ABC SMC after reaching convergence are compared in Fig. 8. The ABC MCMC and ABC SMC were found to converge to stationary distributions for all the parameters after 50,000 iterations, whereas the ABC required at least 2 million iterations to converge.

Figure 8

Histograms of the fitting parameters \(\sigma_{0} ,{ }\beta ,\) and \(m\)

As can be seen, all the posterior distributions of the parameters have a central tendency and are unimodal. This enables us to characterize the estimated parameters by finding the best-fitting probability distributions. However, there are some differences between the algorithms in the spread and mode of the estimated parameter distributions. The results obtained from the ABC SMC are very close to those of the ABC; both are more scattered and have different peaks compared to those of the ABC MCMC. The larger scatter of the ABC SMC results is consistent with its larger error ellipses shown in Fig. 4. This difference is associated with the sampling algorithm of the ABC SMC, in which, instead of sampling from the prior distribution, the parameters are sampled from the set of accepted particles at the previous stage and perturbed according to the uniform kernel. The higher scatter in the estimated parameters means higher uncertainty in the fiber strength predicted by the modified Weibull distribution, as discussed in the next section.

Parameter distributions

In order to characterize the uncertainty of the fitting parameters of the modified Weibull distribution, we fit probability distributions to the posterior data of the parameters \(\sigma_{0}\), \(\beta\), and \(m\). The chi-squared (\(\chi^{2}\)) goodness of fit test was used to find the distributions that best fit the estimated parameters. The test is based on the null hypothesis (\(H_{0}\)) that the estimated data follow a specific distribution; a p value greater than 0.05 indicates that the null hypothesis is not rejected at the 5% significance level. A summary of the distributions fitted to the posterior data of the Weibull parameters estimated by the ABC (direct Monte Carlo), ABC SMC, and ABC MCMC, together with the associated p values for different numbers of iterations, is given in Tables 2, 3, and 4, respectively. This information is useful because it can be used to describe and compare the uncertainty of the estimated parameters of the modified Weibull distribution.

Table 2 Statistical distributions fitted to the parameters estimated by ABC (direct Monte Carlo), with the corresponding chi-squared goodness of fit
Table 3 Statistical distributions fitted to the parameters estimated by ABC SMC, with the corresponding chi-squared goodness of fit
Table 4 Statistical distributions fitted to the parameters estimated by ABC MCMC, with the corresponding chi-squared goodness of fit
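As an illustration of this fitting procedure, the sketch below fits a candidate scipy distribution to a vector of posterior samples and runs a chi-squared goodness-of-fit test; the equal-count binning scheme and the `stats.lognorm` candidate in the usage comment are assumptions, not necessarily the exact choices made in this study.

```python
import numpy as np
from scipy import stats

def chi2_gof(samples, dist, n_bins=20):
    """Chi-squared goodness of fit of a scipy distribution to posterior samples.

    The distribution is first fitted by ML; observed bin counts are then compared
    with the counts expected under the fitted CDF. A p value above 0.05 means the
    null hypothesis (the samples follow this distribution) is not rejected."""
    params = dist.fit(samples)
    edges = np.quantile(samples, np.linspace(0.0, 1.0, n_bins + 1))  # equal-count bins
    observed, _ = np.histogram(samples, bins=edges)
    expected = len(samples) * np.diff(dist.cdf(edges, *params))
    expected *= observed.sum() / expected.sum()       # match totals for the test
    stat, p = stats.chisquare(observed, expected, ddof=len(params))
    return stat, p

# e.g. chi2_gof(posterior_sigma0, stats.lognorm) for the sigma_0 posterior samples
```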

The estimated highest density intervals (HDIs) of the Weibull distribution parameters, indicated on their probability density functions at a 95% credible level, are shown in Fig. 9. Comparing the HDIs of the corresponding parameters from the two algorithms, it is clear that the HDIs of the ABC SMC cover a wider range than those of the ABC MCMC. This difference in the estimated HDIs is associated with the higher scatter of the posterior distributions estimated by the ABC SMC algorithm (Fig. 8).
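The HDIs can be obtained directly from the posterior samples; the following minimal sketch returns the narrowest interval containing 95% of the samples, which coincides with the HDI for unimodal posteriors such as those in Fig. 8.

```python
import numpy as np

def hdi(samples, mass=0.95):
    """Highest density interval: the narrowest interval containing `mass`
    of the posterior samples (assumes a unimodal posterior)."""
    sorted_s = np.sort(np.asarray(samples))
    n_in = int(np.ceil(mass * len(sorted_s)))                       # samples inside the interval
    widths = sorted_s[n_in - 1:] - sorted_s[:len(sorted_s) - n_in + 1]
    start = int(np.argmin(widths))                                  # narrowest candidate interval
    return sorted_s[start], sorted_s[start + n_in - 1]
```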

Figure 9

Highest density interval (HDI) for parameters \(\sigma_{0} ,{ }\beta ,{ }m\) estimated by a ABC MCMC (MH) and b ABC SMC algorithms

Therefore, a higher level of uncertainty is predicted when the ABC SMC algorithm is employed to fit the Weibull distribution to the experimental observations. This is illustrated in Fig. 10, in which the uncertainty in the cumulative distribution function (CDF) of the fiber strength is compared with the empirical CDF and with the modified Weibull distribution fitted by the ML method. The upper and lower bounds of the empirical CDF are also given for comparison. The advantage of this illustration is that the uncertainty of the corresponding failure probability can be evaluated, through a probability distribution, for any value of the fiber strength; the difference between the deterministic estimation of the fitting parameters and the ABC methods can also be visualized.

Figure 10

The cumulative distribution functions of the date palm fiber strength, showing the uncertainty in the failure probability together with the estimate obtained by ML

Despite the differences between the Bayesian estimation algorithms, both led to conservative results compared to the ML. For example, the probability of failure at \(\sigma = 170\) MPa estimated by the modified Weibull distribution with deterministic parameters is 0.157, whereas the mean probability values obtained by the ABC MCMC and ABC SMC are 0.161 and 0.131, respectively. As a result, neglecting the uncertainty of the parameters can lead to an overestimation of the strength of the fibers.
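This kind of comparison can be reproduced by pushing the posterior samples through Eq. (1); the sketch below returns the mean failure probability and a 95% interval at a chosen stress. The `posterior` array, the fiber volume, and the names in the usage comment are placeholders for the user's own samples, not results of this study.

```python
import numpy as np

def failure_probability(sigma, V, posterior, V0=1.0):
    """Distribution of the failure probability at stress `sigma` from Eq. (1), obtained by
    evaluating the modified Weibull CDF at every posterior sample (sigma0, beta, m)."""
    sigma0, beta, m = posterior[:, 0], posterior[:, 1], posterior[:, 2]
    p = 1.0 - np.exp(-(V / V0) ** beta * (sigma / sigma0) ** m)
    return p.mean(), np.percentile(p, [2.5, 97.5])    # mean and a central 95% interval

# e.g. failure_probability(170.0, V_mean, posterior_samples) -- the resulting values
# depend entirely on the posterior samples supplied
```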

Conclusion

In this paper, the ABC framework was applied to estimate the parameters of the modified 3-parameter Weibull distribution used to model the strength of date palm fibers. Two computationally efficient methods, namely MCMC and SMC, were employed, and their performance in fitting the distribution to the strength data was investigated. The Metropolis–Hastings algorithm was implemented for sampling from the probability distribution in the ABC MCMC method. In contrast, the ABC SMC sampled from the set of previously accepted particles and perturbed them according to the uniform perturbation kernel. The marginal posterior distributions of the fitting parameters were modeled by the best-fit probability distributions, and their corresponding HDIs at a 95% credible level were computed.

It was found that the posterior distributions of the parameters obtained by the ABC SMC algorithm were more spread out than those obtained by the ABC MCMC, and the latter therefore resulted in smaller HDIs for the fitting parameters. The results of this study suggest that the uncertainties in the estimation of the Weibull distribution parameters should be assessed whenever the experimental data show high variability, as is the case for natural fiber strength. Accounting for these uncertainties is essential for a reliable and accurate predictive model of the failure probability of date palm fibers.