Introduction

Process capability indices (PCIs) are statistical measures used to assess the ability of a process to produce output within specification limits. These indices indicate how well a process meets customer requirements and help to identify areas for improvement. The term PCI was first introduced in ref. 1, and since then there has been extensive research on the use and implementation of PCIs in various industries and sectors. Researchers and practitioners2,3,4,5,6,7 have explored different aspects of PCIs to enhance their effectiveness and applicability in different contexts. Several indices have been defined, but the most commonly used are \({C}_{p}, {C}_{pk}, {C}_{pm}\) and \({C}_{pmk}\). The authors of ref. 2 defined a supersaturated generalized PCI, based on two non-negative parameters \((\lambda ,\nu )\), that covers these four indices:

$${C}_{p}\left(\lambda ,\nu \right)=\frac{d-\lambda \left|\mu -M\right|}{3\sqrt{{\sigma }^{2}+\nu {\left(\mu -T\right)}^{2}}}$$
(1)

Here \(\mu\) and \(\sigma\) are the process mean and standard deviation, \(d=(USL-LSL)/2\) is the half-width of the specification interval, \(M=(USL+LSL)/2\) is its midpoint and \(T\) is the target value. All four basic PCIs are obtained by setting \(\lambda\) and \(\nu\) to 0 or 1: \((\lambda ,\nu )=(0,0), (1,0), (0,1)\) and \((1,1)\) give \({C}_{p}, {C}_{pk}, {C}_{pm}\) and \({C}_{pmk}\), respectively. These indices rely on the assumption that the underlying process data follow a normal distribution, in which case the PCIs are calculated using the mean and standard deviation of the data3,7. However, alternative methods are available for estimating PCIs when the normality assumption is violated7. The approaches investigated for non-normal quality characteristics fall into two broad categories. (i) Transform the non-normal data to normality and use the traditional PCIs. The Box-Cox transformation, the Johnson transformation system and Clements' method using Pearson curves are commonly used for this purpose, and refs. 8,9,10,11 preferred to compute PCIs from the transformed data. (ii) Develop modified or robust PCIs suitable for non-normal data; refs. 3,4,10,12,13,14,15,16,17,18 used robust measures to compute PCIs. Comparisons of the different approaches are given in7,17,19,20,21 and references therein. Among the alternative measures for computing PCIs under non-normality, the most attractive method was suggested in ref. 12: the natural interval \(\left(6\sigma \right)\) in \({C}_{p}\) is replaced by the distance between the 99.865th and 0.135th percentiles of the distribution, which proved to be more reliable for the Pearson family of distributions. The method is simple and attractive from both a practical and a theoretical point of view because it does not require a transformation of the data. Later, Pearn and Kotz22 used Clements' approach to define two further indices, \({\text{C}}_{\text{pm}}\) and \({\text{C}}_{\text{pmk}}\). The authors of ref. 23 modified Clements' idea, using a single measure for all cases to define all four PCIs. Using three processes, one on-target and two off-target, they showed that the modified estimator outperforms the original Clements estimators.
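As a small illustration of how Eq. (1) generates the four basic indices, the sketch below evaluates \({C}_{p}\left(\lambda ,\nu \right)\) for the four combinations of 0 and 1. It assumes the usual definitions \(d=(USL-LSL)/2\) and \(M=(USL+LSL)/2\); the function name and the numerical process values are hypothetical and chosen only for illustration.

```python
import math

def cp_superstructure(lam, nu, mu, sigma, lsl, usl, target):
    """Evaluate C_p(lambda, nu) from Eq. (1)."""
    d = (usl - lsl) / 2.0      # half-width of the specification interval (assumed definition)
    mid = (usl + lsl) / 2.0    # midpoint M of the specification interval (assumed definition)
    numerator = d - lam * abs(mu - mid)
    denominator = 3.0 * math.sqrt(sigma ** 2 + nu * (mu - target) ** 2)
    return numerator / denominator

# Hypothetical normal process, slightly off target.
mu, sigma, lsl, usl, target = 10.5, 1.0, 6.0, 14.0, 10.0

cp = cp_superstructure(0, 0, mu, sigma, lsl, usl, target)    # C_p
cpk = cp_superstructure(1, 0, mu, sigma, lsl, usl, target)   # C_pk
cpm = cp_superstructure(0, 1, mu, sigma, lsl, usl, target)   # C_pm
cpmk = cp_superstructure(1, 1, mu, sigma, lsl, usl, target)  # C_pmk

print(cp, cpk, cpm, cpmk)
```

Because the process mean is off target in this example, the indices that penalize the deviation from the midpoint or the target are smaller than \({C}_{p}\).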
In another study, Kashif et al.4 examined the effectiveness of modified PCIs based on the Pearn and Chen quantile method and found that Gini's mean difference is a more trustworthy indicator of variability for Weibull data. Very few applications of PCIs based on Clements' approach are available for different asymmetric behaviors of non-normal distributions24. Moreover, the usual use of the sample standard deviation rests on the assumption of normality, and the statistic is sensitive to outliers. When the data are non-normal or contain outliers, the sample standard deviation may not accurately represent the dispersion of the underlying population, and alternative estimators of population variability are recommended in the literature3,4,7,25,26. Among these alternatives, the median absolute deviation (MAD) is a robust measure that is less sensitive to extreme values and does not assume a specific distribution. More details on these topics can be found in27,28,29,30,31,32,33,34,35,36.

For some non-normal distributions, modified PCIs based on robust location and dispersion measures have shown promising results. Kashif et al.3,4 compared first and second generation PCIs based on robust measures for the Weibull distribution.

However, the performance of the third generation PCIs \(({C}_{pm} \; \&\; {C}_{pmk})\) has yet to be evaluated for different asymmetric behaviors of non-normal distributions. The hypothesis of this study is that using robust measures to estimate process variability can yield more accurate estimates of the third generation PCIs under non-normal distributions. For this purpose, the median absolute deviation (MAD), Gini’s mean difference (GMD) and the interquartile range (IQR) are considered as robust measures that may perform well under non-normality.

In view of this problem, the present study evaluates the performance of three robust scale measures, MAD, GMD and IQR, in third generation PCIs and compares them with quantile-based PCIs using the Weibull process. In addition, bootstrap confidence intervals of the proposed robust PCIs are constructed for different asymmetric behaviors of the Weibull process. The rest of the paper is structured as follows: the “Material and methods” section explains the third generation PCIs based on the aforementioned robust measures. The “Results and discussion” section reports the comparison of the robust third generation PCIs with the PC method, along with their interval estimation.

Material and methods

Robust process capability indices

The use of robust measures in PCIs was introduced in ref. 16. However, their possible effects on PCIs are less well known. Robust methods have, on the other hand, been used successfully in the development of control chart theory25,37,38,39. As noted in ref. 39, a robust measure of variability should be used when the median replaces the sample mean as the measure of central tendency. In this study, three robust measures of dispersion, the median absolute deviation (MAD), the interquartile range (IQR) and Gini’s mean difference (GMD), are considered to derive robust PCIs.

Quantile based PCI

Suppose that \(y\) is a random variable with probability density \(f\left(y,\theta \right)\), where \(\theta ={\left({\theta }_{1},\dots ,{\theta }_{k}\right)}^{\tau }\) is the column vector of unknown process parameters and \(\tau\) denotes the transpose. Let \({y}_{1},{y}_{2},\dots ,{y}_{n}\) be an i.i.d. random sample from the process with density \(f\left(y,\theta \right)\). The likelihood and log-likelihood functions of \(\theta\) are given by

$$L\left(\theta \right)=\prod_{i=1}^{n}f\left({y}_{i},\theta \right).$$
(2)
$$l\left(\theta \right)=\sum_{i=1}^{n}\ln f\left({y}_{i},\theta \right).$$
(3)

respectively. The \(\alpha\)-quantile of the process distribution is defined implicitly by the function

$$\alpha =F\left({\mathbb{Q}}_{\alpha };\theta \right)={\int }_{-\infty }^{{\mathbb{Q}}_{\alpha }}f\left(y,\theta \right)dy .$$
(4)

Then the quantile-based PCI superstructure is a function of the population parameter \(\theta\). That is

$${C}_{Np}\left(\eta ,\kappa ,\theta \right)=\frac{d-\eta \left|{\mathbb{Q}}_{{p}_{2}}(\theta )-m\right|}{3\sqrt{{\left[\frac{{\mathbb{Q}}_{{p}_{3}}(\theta )-{\mathbb{Q}}_{{p}_{1}}(\theta )}{6}\right]}^{2}+\kappa {\left({\mathbb{Q}}_{{p}_{2}}(\theta )-T\right)}^{2}}}$$
(5)

Let \(\widehat{\theta }={\left({\widehat{\theta }}_{1},{\widehat{\theta }}_{2},\dots ,{\widehat{\theta }}_{k}\right)}^{\tau }\), which maximizes \(L\left(\theta \right)\) or \(l\left(\theta \right)\), be the MLE of \(\theta\). The maximum likelihood estimator of the quantile \({\mathbb{Q}}_{\alpha }\) is defined as \({\widehat{\mathbb{Q}}}_{\alpha }={\mathbb{Q}}_{\alpha }\left(\widehat{\theta }\right)\). Therefore, the parametric maximum likelihood estimator of the supersaturated PCI is

$${\widehat{C}}_{Np}\left(\eta ,\kappa ,\widehat{\theta }\right)=\frac{d-\eta \left|{\mathbb{Q}}_{{p}_{2}}(\widehat{\theta })-m\right|}{3\sqrt{{\left[\frac{{\mathbb{Q}}_{{p}_{3}}(\widehat{\theta })-{\mathbb{Q}}_{{p}_{1}}(\widehat{\theta })}{6}\right]}^{2}+\kappa {\left({\mathbb{Q}}_{{p}_{2}}(\widehat{\theta })-T\right)}^{2}}}$$
(6)

Note that \({C}_{Np}\left(\eta ,\kappa ,\theta \right)\) is a real-valued function of the quantiles \({\mathbb{Q}}_{{p}_{1}},{\mathbb{Q}}_{{p}_{2}}\) and \({\mathbb{Q}}_{{p}_{3}}\), which are continuous real-valued functions of the parameter \(\theta\). Since \(\widehat{\theta }\) is a consistent estimator of \(\theta\), \({\widehat{C}}_{Np}\left(\eta ,\kappa ,\widehat{\theta }\right)\) is a consistent estimator of \({C}_{Np}\left(\eta ,\kappa ,\theta \right)\) under some regularity conditions40.
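The following sketch shows one way Eq. (5) could be evaluated once a quantile function is available. It uses the Weibull quantile function that appears later in the paper as Eq. (26). The choice \({p}_{1}=0.00135\), \({p}_{2}=0.5\), \({p}_{3}=0.99865\) is an assumption based on the percentiles mentioned in the Introduction, and the specification limits and target are hypothetical, set here from the distribution itself so that the result is easy to check.

```python
import math

def weibull_quantile(alpha, shape, scale):
    """Weibull quantile function, Eq. (26)."""
    return scale * (-math.log(1.0 - alpha)) ** (1.0 / shape)

def c_np(eta, kappa, quantile, lsl, usl, target,
         p1=0.00135, p2=0.5, p3=0.99865):
    """Quantile-based PCI of Eq. (5); p1, p2, p3 are assumed percentile levels."""
    d = (usl - lsl) / 2.0   # half-width of the specification interval (assumed definition)
    m = (usl + lsl) / 2.0   # midpoint of the specification interval (assumed definition)
    q1, q2, q3 = quantile(p1), quantile(p2), quantile(p3)
    spread = (q3 - q1) / 6.0
    return (d - eta * abs(q2 - m)) / (3.0 * math.sqrt(spread ** 2 + kappa * (q2 - target) ** 2))

# Low-asymmetry Weibull parameters used in the paper's simulation.
shape, scale = 2.8, 3.5
q = lambda a: weibull_quantile(a, shape, scale)

# Hypothetical limits: the 0.135th and 99.865th percentiles, with the median as target.
lsl, usl, target = q(0.00135), q(0.99865), q(0.5)

cpm_q = c_np(0, 1, q, lsl, usl, target)    # quantile-based C_pm
cpmk_q = c_np(1, 1, q, lsl, usl, target)   # quantile-based C_pmk
print(cpm_q, cpmk_q)
```

With these limits the quantile-based \({C}_{pm}\) equals 1 by construction, while \({C}_{pmk}\) is smaller because the median of a skewed distribution does not coincide with the midpoint of the limits.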

Median absolute deviation (MAD) based PCI

Suppose that the sample median (MD) is computed from a random sample \(({x}_{1},{x}_{2},\dots \dots ,{x}_{n})\). Then MAD from the sample median is defined as25,26,41.

$$MAD=b\,\mathrm{median}\left\{\left|{x}_{i}-MD\right|\right\} .$$
(7)

The constant \(b\) in (7) makes MAD a consistent estimator of \(\sigma\). For the normal distribution, \(b=1.4826\). For any other distribution, the value changes to \(b=1/{Q}_{0.75}\), where \({Q}_{0.75}\) is the 0.75 quantile of the standardized underlying distribution; under normality, \(1/{Q}_{0.75}=1.4826\). Thus, taking \(b=1\) in (7), the estimator of \(\sigma\) under normality is

$$\widehat{\sigma }=1.4826\left(MAD\right)$$
(8)

Using (8) the MAD based estimators for supersaturated and third generation PCIs can be defined as

$${\widehat{C}}_{MAD}\left(\eta ,k\right)=\frac{d-\eta \left|M-m\right|}{3\sqrt{{\widehat{\sigma }}^{2}+k{\left(M-T\right)}^{2}}}$$
(9)
$${\widehat{C}}_{pmMAD}=\frac{USL-LSL}{6\sqrt{{\widehat{\sigma }}^{2}+{\left(M-T\right)}^{2}}}$$
(10)
$${\widehat{C}}_{pmkMAD}=\frac{\mathrm{min}(USL-M,M-LSL)}{3\sqrt{{\widehat{\sigma }}^{2}+{\left(M-T\right)}^{2}}}$$
(11)
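A minimal sketch of Eqs. (7) to (11) is given below. It assumes that \(M\) in (10) and (11) denotes the sample median and that the normal-consistency constant 1.4826 is applied once to the raw median absolute deviation. The data, limits and target are hypothetical.

```python
import math
import statistics

def mad_sigma(x, b=1.4826):
    """Scaled median absolute deviation about the sample median."""
    md = statistics.median(x)
    return b * statistics.median([abs(xi - md) for xi in x])

def cpm_mad(x, lsl, usl, target):
    """MAD-based C_pm, Eq. (10), with M taken as the sample median (assumption)."""
    m = statistics.median(x)
    s = mad_sigma(x)
    return (usl - lsl) / (6.0 * math.sqrt(s ** 2 + (m - target) ** 2))

def cpmk_mad(x, lsl, usl, target):
    """MAD-based C_pmk, Eq. (11), with M taken as the sample median (assumption)."""
    m = statistics.median(x)
    s = mad_sigma(x)
    return min(usl - m, m - lsl) / (3.0 * math.sqrt(s ** 2 + (m - target) ** 2))

# Hypothetical data: median 5, raw median absolute deviation 1.
x = [2, 4, 4, 5, 6, 7, 9]
sigma_hat = mad_sigma(x)
cpm_value = cpm_mad(x, lsl=0.0, usl=10.0, target=5.0)
cpmk_value = cpmk_mad(x, lsl=0.0, usl=10.0, target=5.0)
print(sigma_hat, cpm_value, cpmk_value)
```

In this centred example the median coincides with both the target and the midpoint of the limits, so the two indices agree.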

Interquartile range (IQR) based PCI

The population IQR for any continuous distribution is defined as

$$IQR={Q}_{3}-{Q}_{1}$$
(12)

where the upper and lower quartiles are found by solving the following integrals

$$\underset{-\infty }{\overset{{Q}_{3}}{\int }}f\left(x\right)dx=0.75 .$$
(13)
$$\underset{-\infty }{\overset{{Q}_{1}}{\int }}f\left(x\right)dx=0.25 .$$
(14)

Using (12) the IQR based estimators for supersaturated and third generation PCIs can be defined as

$${\widehat{C}}_{pmIQR}=\frac{USL-LSL}{2(IQR+\left|M-T\right|)}$$
(15)
$${\widehat{C}}_{pmkIQR}=\frac{\mathrm{min}(USL-M,M-LSL)}{2(IQR+\left|M-T\right|)}$$
(16)
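Equations (15) and (16) can be sketched with sample quartiles as follows. The use of `statistics.quantiles` (default "exclusive" method) for \({Q}_{1}\) and \({Q}_{3}\) is an assumption, since the paper does not state which sample quartile rule is used, and \(M\) is again assumed to be the sample median. The data and limits are hypothetical.

```python
import statistics

def sample_iqr(x):
    """Sample interquartile range Q3 - Q1 (quartile rule is an assumption)."""
    q1, _, q3 = statistics.quantiles(x, n=4)
    return q3 - q1

def cpm_iqr(x, lsl, usl, target):
    """IQR-based C_pm as written in Eq. (15)."""
    m = statistics.median(x)
    return (usl - lsl) / (2.0 * (sample_iqr(x) + abs(m - target)))

def cpmk_iqr(x, lsl, usl, target):
    """IQR-based C_pmk as written in Eq. (16)."""
    m = statistics.median(x)
    return min(usl - m, m - lsl) / (2.0 * (sample_iqr(x) + abs(m - target)))

# Hypothetical data 1, 2, ..., 11: Q1 = 3, median = 6, Q3 = 9.
x = list(range(1, 12))
iqr_value = sample_iqr(x)
cpm_value = cpm_iqr(x, lsl=0.0, usl=12.0, target=6.0)
cpmk_value = cpmk_iqr(x, lsl=0.0, usl=12.0, target=6.0)
print(iqr_value, cpm_value, cpmk_value)
```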

Gini’s mean difference (GMD) based PCI

Gini’s mean difference for a set of \(n\) observations \(\left\{{x}_{1},{x}_{2},\cdots ,{x}_{n}\right\}\) of a random variable \(X\), arranged in ascending order of magnitude as \({x}_{(1)}\le {x}_{(2)}\le \cdots \le {x}_{(n)}\), is defined as

$${G}_{n}=\frac{2}{n\left(n-1\right)}\sum_{i=1}^{n}\sum_{j=1}^{i-1}\left|{x}_{(i)}-{x}_{(j)}\right|$$
(17)
$${G}_{n}=\frac{2}{n\left(n-1\right)}\sum_{i=1}^{n}\left[\left({x}_{(i)}-{x}_{(1)}\right)+\left({x}_{(i)}-{x}_{(2)}\right)+\cdots +\left({x}_{(i)}-{x}_{(i-1)}\right)\right]$$
(18)
$${G}_{n}=\frac{2}{n\left(n-1\right)}\sum_{i=1}^{n}\left(2i-n-1\right){x}_{(i)} .$$
(19)

If the random variable \(X\) follows a normal distribution with mean \(\mu\) and variance \({\sigma }^{2}\), then ref. 42 suggests the following unbiased estimator of the standard deviation \((\sigma )\):

$${\sigma }^{*}=\frac{c}{n\left(n-1\right)}\sum_{i=1}^{n}\left(2i-n-1\right){x}_{(i)} ,$$
(20)

where \(c=\sqrt{\pi }=1.77245\). Later, ref. 43 proved that

$${\sigma }^{*}=0.8862\,{G}_{n}$$
(21)

is an unbiased measure of variability. Using (21), the GMD based estimators for supersaturated and third generation PCIs can be defined as

$${\widehat{C}}_{pmGMD}=\frac{USL-LSL}{6\sqrt{{{\sigma }^{*}}^{2}+{(M-T)}^{2}}}$$
(22)
$${\widehat{C}}_{pmkGMD}=min\left[\frac{USL-M}{3\sqrt{{{\sigma }^{*}}^{2}+{(M-T)}^{2}}},\frac{M-LSL}{3\sqrt{{{\sigma }^{*}}^{2}+{(M-T)}^{2}}}\right]$$
(23)
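The sketch below computes Gini's mean difference from the order-statistic form (19), checks it against the direct pairwise form, and then evaluates GMD-based indices. It assumes that \(M\) is the sample median, and the \({C}_{pm}\) function uses the factor 6 in the denominator, as in Eq. (10) and the usual definition of \({C}_{pm}\). The data and limits are hypothetical.

```python
import math
import statistics

def gmd(x):
    """Gini's mean difference via the order-statistic form, Eq. (19)."""
    xs = sorted(x)
    n = len(xs)
    total = sum((2 * i - n - 1) * xi for i, xi in enumerate(xs, start=1))
    return 2.0 * total / (n * (n - 1))

def gmd_pairwise(x):
    """Same quantity from absolute differences over all pairs with j < i."""
    n = len(x)
    total = sum(abs(x[i] - x[j]) for i in range(n) for j in range(i))
    return 2.0 * total / (n * (n - 1))

def sigma_star(x):
    """Scale estimate of Eq. (21)."""
    return 0.8862 * gmd(x)

def cpm_gmd(x, lsl, usl, target):
    """GMD-based C_pm with the factor 6 of Eq. (10); M is the sample median (assumption)."""
    m = statistics.median(x)
    return (usl - lsl) / (6.0 * math.sqrt(sigma_star(x) ** 2 + (m - target) ** 2))

def cpmk_gmd(x, lsl, usl, target):
    """GMD-based C_pmk, Eq. (23); M is the sample median (assumption)."""
    m = statistics.median(x)
    return min(usl - m, m - lsl) / (3.0 * math.sqrt(sigma_star(x) ** 2 + (m - target) ** 2))

x = [7.0, 1.0, 4.0, 2.0]   # hypothetical, deliberately unsorted
g = gmd(x)
print(g, gmd_pairwise(x), sigma_star(x))
print(cpm_gmd(x, 0.0, 8.0, 3.0), cpmk_gmd(x, 0.0, 8.0, 3.0))
```

The order-statistic form needs only one sort, whereas the pairwise form grows quadratically with the sample size, which matters when the index is recomputed many times in a simulation or bootstrap.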

Case studies for non-normal distribution

One of the most suitable distributions for modelling quality characteristics is the Weibull distribution. The two-parameter Weibull distribution, with shape parameter \(\gamma\) and scale parameter \(\beta\), has density

$$f\left(z,\gamma ,\beta \right)=\frac{\gamma }{\beta }{\left(\frac{z}{\beta }\right)}^{\gamma -1}{e}^{-{\left(\frac{z}{\beta }\right)}^{\gamma }}$$
(24)

The cumulative distribution function and the quantile function for (24) are, respectively,

$$F\left(z,\gamma ,\beta \right)=1-exp\left[-{\left(\frac{z}{\beta }\right)}^{\gamma }\right]$$
(25)
$${\mathbb{Q}}_{\alpha }=\beta {\left[-\mathit{ln}(1-\alpha )\right]}^{\frac{1}{\gamma }}$$
(26)

The maximum likelihood estimators of \(\gamma\) and \(\beta\) satisfy

$$\widehat{\gamma }={\left[\left(\frac{\sum_{i=1}^{n}{{z}_{i}}^{\widehat{\gamma }}\left(\mathit{ln}{z}_{i}\right)}{\sum_{i=1}^{n}{{z}_{i}}^{\widehat{\gamma }}}\right)-\frac{{\sum }_{i=1}^{n}\mathit{ln}{z}_{i}}{n}\right]}^{-1}$$
(27)
$$\widehat{\beta }={\left(\frac{\sum_{i=1}^{n}{{z}_{i}}^{\widehat{\gamma }}}{n}\right)}^{\frac{1}{\widehat{\gamma }}}.$$
(28)

Equation (27) is implicit in \(\widehat{\gamma }\) and has to be solved numerically; \(\widehat{\beta }\) then follows from (28).
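Because (27) defines the shape estimate only implicitly, some root-finding step is needed. The sketch below uses simple bisection on the equivalent estimating equation; this is one possible numerical scheme and not necessarily the one used in the study. The function name, tolerance and simulated check are illustrative assumptions.

```python
import math
import random

def weibull_mle(z, tol=1e-10, max_iter=200):
    """Solve Eqs. (27) and (28) for the Weibull shape and scale by bisection."""
    n = len(z)
    logs = [math.log(v) for v in z]
    mean_log = sum(logs) / n

    def g(shape):
        # Eq. (27) rearranged: weighted mean of ln z minus mean of ln z minus 1/shape = 0.
        powers = [v ** shape for v in z]
        weighted = sum(p * l for p, l in zip(powers, logs)) / sum(powers)
        return weighted - mean_log - 1.0 / shape

    lo, hi = 1e-3, 1.0
    while g(hi) < 0:          # g is increasing in the shape, so expand until the root is bracketed
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    shape = 0.5 * (lo + hi)
    scale = (sum(v ** shape for v in z) / n) ** (1.0 / shape)   # Eq. (28)
    return shape, scale

# Check on simulated data; random.weibullvariate takes (scale, shape) in that order.
rng = random.Random(1)
z = [rng.weibullvariate(3.5, 2.8) for _ in range(5000)]
shape_hat, scale_hat = weibull_mle(z)
print(shape_hat, scale_hat)
```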

The IQR of Weibull process

The IQR of the Weibull process in (24) follows from the quantile function (26):

$${IQR}_{Wei}={\mathbb{Q}}_{0.75}-{\mathbb{Q}}_{0.25}=\beta \left\{{\left[-\mathit{ln}\left(0.25\right)\right]}^{\frac{1}{\gamma }}-{\left[-\mathit{ln}\left(0.75\right)\right]}^{\frac{1}{\gamma }}\right\}$$
(29)

The Gini’s mean difference of Weibull process

Following the procedure of ref. 44, the expected value of the GMD for the Weibull distribution is

$$E\left({G}_{n}\right)=\beta \left(2-{2}^{1-\frac{1}{\gamma }}\right)\Gamma \left(1+\frac{1}{\gamma }\right)={\sigma }_{gw}$$
(30)
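The closed forms for the Weibull IQR and the expected GMD can be evaluated directly. The sketch below derives the IQR from the quantile function (26) and the expected GMD from (30), and prints both for the three (shape, scale) pairs used in the paper's simulation. The function names are illustrative.

```python
import math

def weibull_iqr(shape, scale):
    """Q(0.75) - Q(0.25) using the Weibull quantile function, Eq. (26)."""
    upper = (-math.log(0.25)) ** (1.0 / shape)
    lower = (-math.log(0.75)) ** (1.0 / shape)
    return scale * (upper - lower)

def weibull_expected_gmd(shape, scale):
    """Expected Gini's mean difference of a Weibull variable, Eq. (30)."""
    return scale * (2.0 - 2.0 ** (1.0 - 1.0 / shape)) * math.gamma(1.0 + 1.0 / shape)

# (shape, scale) pairs listed in the simulation section of the paper.
for shape, scale in [(2.8, 3.5), (1.8, 2.0), (1.0, 1.3)]:
    print(shape, scale, weibull_iqr(shape, scale), weibull_expected_gmd(shape, scale))
```

A quick sanity check is the exponential case (shape 1), where the IQR is the scale times \(\mathit{ln}\,3\) and the expected GMD equals the scale.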

To evaluate the performance of robust third generation PCIs at different skewness behaviour of Weibull distribution, shape and scale parameters are selected so that the skewness level may be categorized as low, moderate and high as shown in Fig. 1.

Figure 1 PDF plots of Weibull distributions with different asymmetry levels along with shape and scale parameters.

Methods of bootstrap confidence interval

The bootstrap technique originated in ref. 45. Moreover, Efron45 and Hall et al.46 provide theoretical details about the technique. It can be used to construct confidence intervals for parameters when the usual interval estimation approach is not feasible. Bootstrap confidence intervals (BCIs) are commonly applied to various PCIs. Suppose that \({\varsigma }_{1}, {\varsigma }_{2}, \dots , {\varsigma }_{n}\) constitute a random sample of n observations taken from a distribution of interest, say \(\upphi\), i.e. \({\varsigma }_{1}, {\varsigma }_{2}, \dots , {\varsigma }_{n}\sim\upphi\). Let \(\widehat{\uptheta }\) represent an estimator of an arbitrary PCI, say \({C}_{pm}\) or \({C}_{pmk}\).

Then the bootstrap technique is implemented as follows:

  i. A bootstrap sample of n observations is drawn with replacement from the original sample, with mass \(\frac{1}{n}\) at each point; this bootstrap sample is denoted as \({\varsigma }_{1}^{*},{\varsigma }_{2}^{*},\dots ,{\varsigma }_{n}^{*}\).

  ii. From the kth bootstrap sample, for \(1\le k\le B\), the kth bootstrap estimator of θ (an arbitrary PCI) is computed as \({\widehat{\uptheta }}^{*}=\widehat{\uptheta }\left({\varsigma }_{1}^{*},{\varsigma }_{2}^{*},\dots ,{\varsigma }_{n}^{*}\right).\)

  iii. If the number of resamples is B, a total of B estimates \({\widehat{\uptheta }}^{*}\) is obtained. Arranging the whole collection from the smallest to the largest value constitutes an empirical bootstrap distribution of \(\widehat{\uptheta }\)13. B = \(1000\) bootstrap resamples are used in this article.

The confidence intervals of \(\widehat{\uptheta }\) can be constructed using any of the following three bootstrap techniques.
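The resampling steps above can be sketched as follows. The function name, the seed argument and the example data are hypothetical; any PCI estimator that maps a sample to a number could be passed in place of the median used here for brevity.

```python
import random
import statistics

def bootstrap_estimates(sample, estimator, B=1000, seed=None):
    """Return the B bootstrap estimates, sorted from smallest to largest."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(B):
        # Step i: draw n observations with replacement, each with probability 1/n.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        # Step ii: evaluate the estimator on the resample.
        estimates.append(estimator(resample))
    # Step iii: the ordered collection is the empirical bootstrap distribution.
    return sorted(estimates)

# Hypothetical data; the median stands in for a PCI estimator.
data = [2.1, 2.9, 3.4, 3.6, 3.8, 4.0, 4.4, 4.9, 5.3, 6.2]
boot = bootstrap_estimates(data, statistics.median, B=1000, seed=42)
print(boot[0], boot[-1])
```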

Method 1: Standard bootstrap (SB) confidence interval

The sample average and sample standard deviation are computed as follows using the 1000 bootstrap estimates \({\widehat{\uptheta }}^{*}\left(i\right)\), \(i=1,\dots ,1000\):

$${\overline{\uptheta } }^{*}={\left(1000\right)}^{-1}\sum_{i=1}^{1000}{\widehat{\uptheta }}^{*}\left(i\right)$$
(31)
$${s}_{{\widehat{\uptheta }}^{*}}=\sqrt{\frac{1}{999}\sum_{i=1}^{1000}{\left({\widehat{\uptheta }}^{*}\left(i\right)-{\overline{\uptheta } }^{*}\right)}^{2}}$$
(32)

Consequently, the \(100\left(1-\alpha \right)\mathrm{\%}\) SB confidence interval is obtained as

$${\mathrm{CI}}_{\mathrm{SB}}={\overline{\uptheta } }^{*}\pm {z}_{\left(1-\frac{\alpha }{2}\right)} {s}_{{\widehat{\uptheta }}^{*}},$$
(33)

where \({z}_{\left(1-\frac{\alpha }{2}\right)}\) is the \({\left(1-\frac{\alpha }{2}\right)}{\mathrm{th}}\) quantile of the standard normal distribution.
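Equations (31) to (33) translate directly into a few lines of code. The sketch below takes a list of bootstrap estimates and returns the SB limits; the function name and the short list used in the example are hypothetical.

```python
import statistics

def sb_interval(boot, alpha=0.05):
    """Standard bootstrap interval: mean of the estimates +/- z * their standard deviation."""
    mean = statistics.fmean(boot)                            # Eq. (31)
    sd = statistics.stdev(boot)                              # Eq. (32), divisor B - 1
    z = statistics.NormalDist().inv_cdf(1.0 - alpha / 2.0)   # standard normal quantile
    return mean - z * sd, mean + z * sd                      # Eq. (33)

# Hypothetical bootstrap estimates (a real run would use B = 1000 values).
boot = [1.0, 2.0, 3.0, 4.0, 5.0]
lower, upper = sb_interval(boot)
print(lower, upper)
```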

Method 2: Percentile bootstrap (PB) confidence interval

Since there is a total of \(B\) resamples of \({\widehat{\uptheta }}^{*}\), these resamples will produce \(B\) estimates of \({\widehat{\uptheta }}^{*}\). An arrangement of these estimates from the smallest value to the largest value will form an empirical distribution of \({\widehat{\uptheta }}^{*}\). From the ordered empirical distribution of \({\widehat{\uptheta }}^{*}\), choose the \(100\left(\frac{\alpha }{2}\right)\) and \(100\left(1-\frac{\alpha }{2}\right)\) percentiles as the end points of the interval, which results in the \(100\left(1-\alpha \right)\mathrm{\%}\) PB confidence interval for \({\widehat{\uptheta }}^{*}\) given as

$${\mathrm{CI}}_{\mathrm{PB}}=\left({{\widehat{\uptheta }}^{*}}_{1000\left(\frac{\alpha }{2}\right)},{{\widehat{\uptheta }}^{*}}_{1000\left(1-\frac{\alpha }{2}\right)}\right)$$
(34)

For example, the \(95\mathrm{\%}\) confidence interval with 1000 bootstrap estimates is

$${\mathrm{CI}}_{\mathrm{PB}}=\left({{\widehat{\uptheta }}^{*}}_{\left(25\right)},{{\widehat{\uptheta }}^{*}}_{\left(975\right)}\right)$$
(35)

where \({{\widehat{\uptheta }}^{*}}_{\left(25\right)}\) and \({{\widehat{\uptheta }}^{*}}_{\left(975\right)}\) are the 25th and 975th values in the ordered collection of bootstrap estimates.
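The percentile interval of Eqs. (34) and (35) only requires sorting and indexing. In the sketch below, the rounding of \(B\alpha /2\) to the nearest integer position is an assumption for cases where it is not a whole number; the function name and example values are hypothetical.

```python
def pb_interval(boot, alpha=0.05):
    """Percentile bootstrap interval from the ordered bootstrap estimates."""
    ordered = sorted(boot)
    B = len(ordered)
    # 1-based positions B*(alpha/2) and B*(1 - alpha/2), kept inside [1, B].
    k_lower = min(max(int(round(B * alpha / 2.0)), 1), B)
    k_upper = min(max(int(round(B * (1.0 - alpha / 2.0))), 1), B)
    return ordered[k_lower - 1], ordered[k_upper - 1]

# With the values 1..1000 as stand-in estimates, the 95% limits are the 25th and 975th values.
boot = [float(v) for v in range(1, 1001)]
lower, upper = pb_interval(boot)
print(lower, upper)
```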

Method 3: Bias-corrected percentile bootstrap (BCPB) confidence interval

This technique was established to address the potential bias that could occur as the bootstrap distribution is based on a sample from the complete bootstrap distribution, which may be shifted higher or lower than would be expected. The following steps explain the implementation of this technique:

  i. Using the (ordered) distribution of \({\widehat{\uptheta }}^{*}\), calculate

    $${l}_{0}=\mathrm{Pr}\left({\widehat{\uptheta }}^{*}\le \widehat{\uptheta }\right)$$
    (36)

  ii. Letting \({\rho }^{-1}\) denote the inverse distribution function of the standard normal variable, calculate

    $${q}_{0}={\rho }^{-1}({l}_{0})$$
    (37)

  iii. The lower and upper percentiles of the ordered distribution of \({\widehat{\uptheta }}^{*}\) are

    $${P}_{L}=\rho \left(2{q}_{0}+{z}_{\left(\frac{\alpha }{2}\right)}\right)$$
    (38)

    and

    $${P}_{U}=\rho \left(2{q}_{0}+{z}_{\left(1-\frac{\alpha }{2}\right)}\right)$$
    (39)

    respectively, where ρ, \({z}_{\left(\frac{\alpha }{2}\right)}\) and \({z}_{\left(1-\frac{\alpha }{2}\right)}\) are the distribution function, the \({\left(\frac{\alpha }{2}\right)}{\mathrm{th}}\) quantile and the \({\left(1-\frac{\alpha }{2}\right)}{\mathrm{th}}\) quantile, respectively, of the standard normal distribution. Consequently, the \(100\left(1-\alpha \right)\mathrm{\%}\) BCPB confidence interval is constructed as

    $${\mathrm{CI}}_{\mathrm{BCPB}}=\left({{\widehat{\uptheta }}^{*}}_{1000\left({P}_{L}\right)},{{\widehat{\uptheta }}^{*}}_{1000\left({P}_{U}\right)}\right)$$
    (40)
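The three BCPB steps can be sketched as below. The clamping of \({l}_{0}\) away from 0 and 1 and the rounding of the percentile positions are assumptions added so that the code does not fail on edge cases; they are not part of the description above. The function name and example values are hypothetical.

```python
import statistics

def bcpb_interval(boot, theta_hat, alpha=0.05):
    """Bias-corrected percentile bootstrap interval, Eqs. (36) to (40)."""
    nd = statistics.NormalDist()
    ordered = sorted(boot)
    B = len(ordered)
    # Step i: proportion of bootstrap estimates not exceeding the original estimate.
    l0 = sum(1 for v in ordered if v <= theta_hat) / B
    l0 = min(max(l0, 1.0 / (B + 1)), B / (B + 1.0))   # guard against 0 or 1 (assumption)
    # Step ii: bias-correction constant.
    q0 = nd.inv_cdf(l0)
    # Step iii: adjusted lower and upper percentile levels.
    p_lower = nd.cdf(2.0 * q0 + nd.inv_cdf(alpha / 2.0))
    p_upper = nd.cdf(2.0 * q0 + nd.inv_cdf(1.0 - alpha / 2.0))

    def pick(p):
        k = min(max(int(round(B * p)), 1), B)   # 1-based position in the ordered estimates
        return ordered[k - 1]

    return pick(p_lower), pick(p_upper)

boot = [float(v) for v in range(1, 1001)]
# No bias: half of the estimates lie at or below theta_hat, so the PB limits are recovered.
centred = bcpb_interval(boot, theta_hat=500.0)
# Estimates mostly below theta_hat: both limits move upward.
shifted = bcpb_interval(boot, theta_hat=700.0)
print(centred, shifted)
```

When \({l}_{0}=0.5\) the correction vanishes and the interval coincides with the percentile bootstrap interval, which is a convenient check of an implementation.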

The average width (AW) is considered to compare the three different types of BCIs. The AW of the BCI is computed using a total of \(M\) trials. Next, the estimated AW is computed as

$$\mathrm{AW}=\frac{\sum_{i=1}^{M}({U}_{{p}_{i}}-{L}_{{w}_{i}})}{M}$$
(41)

where \({L}_{{w}_{i}}\) and \({U}_{{p}_{i}}\) are the estimated lower and upper confidence limits of the \(100(1-\alpha )\mathrm{\%}\) confidence interval, for any of the three types of BCIs, based on the ith replicate.
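Equation (41) is a plain average of interval widths over the \(M\) trials. A minimal sketch, with hypothetical interval limits:

```python
def average_width(intervals):
    """Average width of a collection of (lower, upper) confidence limits, Eq. (41)."""
    return sum(upper - lower for lower, upper in intervals) / len(intervals)

# Hypothetical limits from M = 3 trials.
intervals = [(1.0, 1.5), (0.9, 1.6), (1.1, 1.4)]
aw = average_width(intervals)
print(aw)
```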

Results and discussion

The point and interval estimates of the modified PCIs based on the quantile (PC), MAD, IQR and GMD methods for different asymmetric behaviors of the Weibull distribution are given in Tables 1, 2, 3, 4.

Table 1 The statistical indicators of index \({C}_{pm}\) and \({C}_{pmk}\) for different asymmetric level of Weibull process based on PC-method.
Table 2 The statistical indicators of index \({C}_{pm}\) and \({C}_{pmk}\) using selected asymmetric level of Weibull process based on MAD-method.
Table 3 The statistical indicators of index \({C}_{pm}\) and \({C}_{pmk}\) using selected asymmetric level of Weibull process based on IQR-method.
Table 4 The statistical indicators of index \({C}_{pm}\) and \({C}_{pmk}\) using selected asymmetric level of Weibull process based on GMD-method.

Following ref. 47, a target value of 1.33, corresponding to existing processes, was considered for the point estimation of the indices \({C}_{pm}\) and \({C}_{pmk}\). The performance of each modified PCI under different asymmetric behaviors is evaluated using 10,000 simulated samples of sizes 25, 50, 75 and 100. The R statistical language was used for the simulation study. Bias and mean square error (MSE) are used as the comparison criteria. The simulation proceeds in the following steps:

  1. Generate 10,000 samples of size 25 from the Weibull process with parameters \(\left(Shape, Scale\right)=\left(2.8, 3.5\right), \left(1.80, 2.00\right), \left(1.00, 1.30\right)\).

  2. Compute \({\widehat{C}}_{pm}\) and \({\widehat{C}}_{pmk}\) based on MAD (median absolute deviation), IQR (interquartile range) and GMD (Gini's mean difference).

  3. Calculate the average and standard deviation of the computed PCIs.

  4. Repeat the entire process for sample sizes of 50, 75 and 100.
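The simulation steps can be sketched in Python as follows (the study itself used R). Only the MAD-based \({C}_{pm}\) is shown, the number of replications is reduced for speed, and the specification limits and target are hypothetical because they are not listed in this section. Bias and MSE are computed relative to the reference value 1.33.

```python
import math
import random
import statistics

def cpm_mad(x, lsl, usl, target, b=1.4826):
    """MAD-based C_pm with the sample median as location (assumption)."""
    med = statistics.median(x)
    sigma = b * statistics.median([abs(v - med) for v in x])
    return (usl - lsl) / (6.0 * math.sqrt(sigma ** 2 + (med - target) ** 2))

def simulate(shape, scale, n, reps, lsl, usl, target, reference=1.33, seed=0):
    """Return mean, standard deviation, bias and MSE of the estimator over reps samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        # random.weibullvariate takes (scale, shape) in that order.
        sample = [rng.weibullvariate(scale, shape) for _ in range(n)]
        estimates.append(cpm_mad(sample, lsl, usl, target))
    mean = statistics.fmean(estimates)
    sd = statistics.stdev(estimates)
    bias = mean - reference
    mse = statistics.fmean([(e - reference) ** 2 for e in estimates])
    return mean, sd, bias, mse

# Low-asymmetry case from the paper; limits and target are hypothetical.
results = {}
for n in (25, 50, 75, 100):
    results[n] = simulate(shape=2.8, scale=3.5, n=n, reps=500,
                          lsl=0.0, usl=8.0, target=3.0)
    print(n, results[n])
```

The same loop can be repeated for the other (shape, scale) pairs and for the IQR- and GMD-based estimators by swapping the estimator function.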

Results for PC based PCIs

Simulation results for the quantile approach suggested in ref. 23 are presented in Table 1 for the Weibull distribution. The table reports the simulated mean, standard deviation, bias and mean square error (MSE) relative to the target value of 1.33 for both indices under low, moderate and high asymmetric behavior of the Weibull distribution.

For the index \({C}_{pm}\), the PC-method gives good results under low and moderate asymmetry but underestimates the target value under high asymmetry. As the sample size increases, the estimated values approach the target value, with smaller bias and mean square error. For the index \({C}_{pmk}\), the PC-method is more accurate compared with the other three indices and gives the lowest bias and MSE under low and moderate asymmetric conditions for the largest sample \((n=100)\). Across the three asymmetric levels of the Weibull distribution, the following conclusion can be drawn: the PC-method gives a lower bias and MSE for the index \({C}_{pmk}\) under low and moderate asymmetric behavior when the target value is \(1.33\).

Results of MAD base PCIs

Table 2 summarizes the results of the MAD-based estimators of both PCIs, i.e. \({C}_{pm}\) and \({C}_{pmk}\). Unlike the PC-method, the MAD-based estimators of the two indices show a different pattern for the Weibull process. Overall, the performance of the MAD-based estimators is consistently better than that of the PC-based estimators from low to high asymmetry.

For the index \({C}_{pm}\), except under high asymmetry, the MAD-based estimator is closer to the target value and less biased for large samples. The MAD-based estimator of the index \({C}_{pmk}\) performs well for small sample sizes only. It gives accurate results under low and moderate asymmetric conditions, whereas for the new process it deals better with high asymmetry. In both cases, it slightly underestimates the target values for large samples.

Results for IQR based PCIs

The simulation results of the IQR based PCIs \({C}_{pm}\) and \({C}_{pmk}\) for the Weibull distribution under low, moderate and high asymmetric levels are reported in Table 3. For both indices, the IQR-method overestimates at all asymmetric levels and for all sample sizes, so these estimators cannot be considered good estimators. Large bias and MSE are observed in all cases, and the estimation worsens for both indices as the asymmetry level moves from low to high. The simulation results therefore indicate that the IQR-method is not useful or attractive from a practical point of view because of its large bias and MSE.

Results for GMD based PCIs

In this section, the performance of both PCIs, \({C}_{pm}\) and \({C}_{pmk}\), based on the GMD-method is assessed and compared under low, moderate and high asymmetric conditions of the Weibull distribution. The results are presented in Table 4. They indicate that GMD-based PCIs perform better under the moderate asymmetric condition for the index \({C}_{pm}\) for large samples, and that the bias and MSE decrease as the sample size increases. For the index \({C}_{pmk}\), the GMD-based estimator slightly overestimates the target value of 1.67 for small samples under low asymmetry, but the bias increases as the sample size increases. Under moderate asymmetry this method underestimates, and under high asymmetry it again overestimates the target value of new processes. Based on these observations, the GMD-based estimators of the indices \({C}_{pm}\) and \({C}_{pmk}\) lead to the following results:

  1. The GMD-method performs well for new processes under moderate asymmetric conditions and large sample sizes for the index \({C}_{pm}\).

  2. For the index \({C}_{pmk}\), this method is good for small samples under low asymmetric conditions.

  3. Compared with the other methods, the mean estimated values of the GMD-method increase as the sample size increases. However, for an efficient process in which very little product falls outside the specification limits, GMD is recommended under high asymmetry.

Bootstrap confidence intervals for \({C}_{pm}\) and \({C}_{pmk}\)

In this section, four bootstrap confidence intervals, namely the standard (SB), percentile (PB), bias-corrected percentile (BCPB) and percentile-t (PTB) bootstrap confidence intervals, are discussed for the indices \({C}_{pm}\) and \({C}_{pmk}\) using the PC, MAD and GMD methods. For the simulation, Weibull processes are used under low, moderate and high asymmetric conditions for sample sizes n = 25, 50, 75 and 100. The results are presented in Tables 5, 6, 7, 8, 9, 10, which report the true index value, the 95% confidence limits and the coverage probability of each index under low, moderate and high asymmetric conditions for all sample sizes. These results are based on 1000 replications and different values of USL and LSL for the three types of processes, which are given in Table 2 above.

Table 5 The 95% bootstrap confidence intervals with coverage probabilities for Weibull distribution using PC methods for index \({C}_{pm}\).
Table 6 The 95% bootstrap confidence intervals with coverage probabilities for Weibull distribution using PC methods for index \({C}_{pmk}\).
Table 7 The 95% bootstrap confidence intervals with coverage probabilities for Weibull distribution using MAD methods for index \({C}_{pm}\).
Table 8 The 95% bootstrap confidence intervals with coverage probabilities for Weibull distribution using MAD methods for index \({C}_{pmk}\).
Table 9 The 95% bootstrap confidence intervals with coverage probabilities for Weibull distribution using GMD methods for index \({C}_{pm}\).
Table 10 The 95% bootstrap confidence intervals with coverage probabilities for Weibull distribution using GMD methods for index \({C}_{pmk}\).

Tables 5 and 6 present the 95% BCIs for the Weibull process using the PC-method, with the coverage probability of each method reported below each interval. Similarly, Tables 7 and 8 present the 95% BCIs for the Weibull process, along with coverage probabilities, using the MAD method. The results in all these tables indicate that the average width of the confidence intervals, i.e. the difference between the upper and lower confidence limits, decreases as the sample size increases in all cases under study. Moreover, the asymmetry level affects the average width, which increases as asymmetry increases.

BCIs for Weibull distribution

From the results for the Weibull distribution, the following conclusions are drawn.

  i. Among the PC-based estimators of both indices \({C}_{pm}\) and \({C}_{pmk}\), the BCPB method gives the smallest average width under low, moderate and high asymmetric behavior of the Weibull process.

  ii. Based on the average width, the four bootstrap methods are ranked as BCPB < PB < PTB < SB.

  iii. The coverage probability increases with the sample size and reaches the nominal level of 0.95 for large samples in the case of the SB and \({\text{BCPB}}\) methods. The other two methods, however, do not reach the nominal level, particularly for small samples.

  iv. For the MAD method, both the \({\text{BCPB}}\) and PB CIs show a smaller average width than SB and PTB. Based on the average width, the four bootstrap methods are ranked as BCPB < PB < PTB < SB.

  v. Of the \({\text{BCPB}}\) and PB CIs, the former shows lower coverage probability than the latter. Consequently, the PB CI performs better for the MAD-method.

  vi. In both methods, the average width approximately doubles in the transition from low to high asymmetric conditions. That is, under high asymmetry the CI is wider than under low and moderate asymmetry.

In general, the \({\text{BCPB}}\) CI is recommended for all asymmetric conditions when the PC-method is used. On the other hand, the PB CI is recommended for the MAD-method under low, moderate and high asymmetric behavior of the Weibull process. The recommendation is based on low average width and high coverage probability among the four BCIs.

Application of proposed methodology using practical data

A data set was analysed using GMD, MAD and PC based PCIs. The results are reported in the following section.

Data: strength measures in GPa for single fibres data

In this section, a real-life example is presented to demonstrate the application of the MAD, PC and GMD methods for the indices \({C}_{pm}\) and \({C}_{pmk}\). The data represent strength measurements in GPa for single fibres and impregnated 1000-carbon fibre tows. Single fibres were tested under tension at a gauge length of 20 mm with sample size \(n=69\); the data are given in Table 1148,49,50.

Table 11 Data set of strength measure in GPa for single fibre (20-mm).

To select the appropriate distribution, different goodness of fit statistics51 were used and are reported in Table 12 along with summary statistics of the data. Based on the AIC and BIC values, the two-parameter Weibull distribution is more suitable for these data than the other distributions. Fitting the two-parameter Weibull distribution gives the maximum likelihood estimates \(\widehat{\upgamma }=\) 5.504809 and \(\widehat{\upbeta }=\) 2.650830 for the shape and scale parameters, respectively.

Table 12 The summary statistics and goodness of fit statistics for fibre strength data.

To evaluate the adequacy of the fit, the K-S goodness of fit test is used. The K-S distance for these data is 0.056 with p-value 0.9816, which is also in favor of the Weibull distribution. The lower and upper specification limits used for the calculation of the PCIs were (0.3989, 4.4960). The estimates of both indices using the three methods and their corresponding bootstrap CIs are reported in Table 13. As in the simulation study, the MAD and GMD methods are more accurate than the PC-method. Both indices \({C}_{pm}\) and \({C}_{pmk}\) show better performance, and the estimated values are close to the target value for existing processes. Based on the average width of the CIs, the four bootstrap methods are ranked as \({\text{BCPB}} \, \text{<} \, {\text{PB}} \, \text{<} \, {\text{PTB}} \, \text{<} \, {\text{SB}}\). Overall, the MAD and GMD based estimators show a wider spread of CIs.

Table 13 The point estimates and width of four BCIs for fibre strength data.

Summary and conclusion

Statistical Process Control (SPC) is an attractive statistical tool that is commonly used nowadays to monitor processes in many industries. Within SPC, PCIs have become an important tool for measuring whether the quality of a product lies within specified limits. It is difficult to choose a PCI that performs accurately under a non-normal distribution, because non-normality affects both the process variability and the mean. Moreover, a PCI that does not reach a high target value (> 1.33) should not be dismissed for that reason alone. The conditions under which a PCI performs poorly therefore open a new research horizon.

A pragmatic attempt has been made to address the non-normality issue in PCIs using the quantile (PC), MAD, IQR and GMD methods under asymmetric conditions of the Weibull distribution. The point and interval estimation of the modified PCIs were assessed using simulation studies. Point estimation of quantile-based PCIs using the PC method was found to be effective under low and moderate asymmetry of the Weibull process. The PC-based estimator tends to under-estimate, and this tendency increases as the sample size increases. The results indicate that the PC-based estimator not only produces a large bias but also both under- and overestimates the target values. Moreover, the PC-based estimator is influenced by high asymmetry and gives the worst estimates for all three distributions.

The simulation studies reveal that the MAD method can be used successfully and has great potential for dealing with non-normality in a Weibull process under low and moderate asymmetry, where MAD-based estimators tend to produce very accurate results. Under high asymmetry, the MAD estimator of the index \({C}_{pm}\) performed well only for samples of size less than 50.

The simulation studies show that the IQR method overestimates the PCIs at the selected asymmetry levels of the Weibull distribution. Moreover, a large bias and MSE were observed for all sample sizes, and the situation became worse as the asymmetry level moved from low to high. Therefore, the IQR-based estimators were not considered good estimators for dealing with non-normality.

Finally, we demonstrated the application of GMD as a measure of variability in the PCIs \({C}_{pm}\) and \({C}_{pmk}\) for the Weibull distribution under low, moderate and high asymmetric conditions. The results indicate that the GMD method works well to some extent under high asymmetry, but more research is required to obtain better estimates of the PCIs.
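For reference, the three robust spread measures discussed in this summary (MAD, IQR and GMD) can be computed as in the sketch below. The function names are hypothetical, and the normal-consistency constants (1.4826, 1.349 and \(\sqrt{\pi }/2\)) are a common convention assumed here; they are not necessarily the scaling used when these measures are substituted into the PCIs in this study.

```python
import numpy as np


def mad_scale(x):
    """Median absolute deviation about the median, scaled to estimate sigma under normality."""
    x = np.asarray(x, dtype=float)
    return 1.4826 * np.median(np.abs(x - np.median(x)))


def iqr_scale(x):
    """Interquartile range, scaled to estimate sigma under normality."""
    q1, q3 = np.percentile(np.asarray(x, dtype=float), [25, 75])
    return (q3 - q1) / 1.349


def gmd_scale(x):
    """Gini's mean difference (mean of |x_i - x_j| over all pairs i != j),
    computed from order statistics and scaled by sqrt(pi)/2."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    i = np.arange(1, n + 1)
    gmd = 2.0 * np.sum((2 * i - n - 1) * x) / (n * (n - 1))
    return gmd * np.sqrt(np.pi) / 2


# Sanity check on standard normal data: all three should be close to 1.
rng = np.random.default_rng(0)
z = rng.normal(size=20000)
normal_check = (mad_scale(z), iqr_scale(z), gmd_scale(z))
print(normal_check)
```

On skewed data such as a Weibull sample the three measures generally disagree with each other and with the sample standard deviation, which is one reason the resulting PCIs behave differently across asymmetry levels.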

Besides point estimation, interval estimates of all PCIs were constructed. Four types of bootstrap confidence intervals, i.e., SB, PB, BCPB and PTB, and their coverage probabilities were calculated using simulation studies. For each method, the appropriate confidence interval was selected on the basis of a low average width and a high coverage probability.
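The selection criterion (average width together with coverage probability) can be illustrated with a small Monte Carlo sketch for two of the four intervals, SB and PB. Everything below is a simplified assumption rather than the simulation design of this study: the sample median is used as a stand-in statistic because its true value is known in closed form for a Weibull distribution, the SB interval is centred on the mean of the bootstrap estimates, and the numbers of replications and resamples are kept small for speed. The BCPB and PTB intervals are omitted.

```python
import numpy as np
from statistics import NormalDist


def bootstrap_cis(x, estimator, n_boot=300, alpha=0.05, rng=None):
    """Return standard (SB) and percentile (PB) bootstrap intervals for `estimator`."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    idx = rng.integers(0, x.size, size=(n_boot, x.size))   # resample with replacement
    boot = np.array([estimator(row) for row in x[idx]])
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre, spread = boot.mean(), boot.std(ddof=1)
    lo, hi = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    return {"SB": (centre - z * spread, centre + z * spread), "PB": (lo, hi)}


def coverage_and_width(true_value, sampler, estimator, n_rep=100, rng=None):
    """Estimate the coverage probability and average width of each interval type."""
    rng = np.random.default_rng() if rng is None else rng
    hits = {"SB": 0, "PB": 0}
    widths = {"SB": 0.0, "PB": 0.0}
    for _ in range(n_rep):
        cis = bootstrap_cis(sampler(rng), estimator, rng=rng)
        for name, (lo, hi) in cis.items():
            hits[name] += int(lo <= true_value <= hi)
            widths[name] += hi - lo
    return {name: (hits[name] / n_rep, widths[name] / n_rep) for name in hits}


# Assumption: Weibull process with the parameters reported for the fibre data.
shape, scale, n = 5.504809, 2.650830, 69
true_median = scale * np.log(2) ** (1 / shape)


def sampler(rng):
    return scale * rng.weibull(shape, size=n)


summary = coverage_and_width(true_median, sampler, np.median,
                             rng=np.random.default_rng(42))
for name, (coverage, width) in summary.items():
    print(f"{name}: coverage={coverage:.2f} average width={width:.3f}")
```

Replacing `np.median` with a PCI estimator, and `true_median` with the true index value of the simulated process, gives the same kind of comparison for the indices themselves.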

The simulations illustrated that \({\text{BCPB}}\) CIs produce the smallest average widths and highest coverage probabilities under all asymmetric levels of the Weibull distribution for the quantile-based (PC) indices \({C}_{pm}\) and \({C}_{pmk}\). On the other hand, the \({\text{PB}}\) and \({\text{PTB}}\) CIs are recommended for MAD-based indices. Both the degree of asymmetry and the sample size affect the width and coverage probability of the confidence intervals. Moreover, the coverage probabilities approach the nominal level as the sample size increases. The BCPB and \({\text{PB}}\) CIs provide a higher coverage probability with a smaller width in the case of GMD-based estimators.

Recommendations

Based on this comprehensive study, we make the following two recommendations.

  1. The performance of both modified PCIs is strongly affected by the asymmetric behavior of the distribution. Moreover, accurate performance of a particular method for one distribution does not guarantee accurate results for another distribution with different tail behavior.

  2. To deal with high asymmetry, more care is needed for both point and interval estimation. In general, for point estimation, the quantile-based PC method leads to under-estimation, while robust methods such as MAD, IQR and GMD lead to over-estimation. For interval estimation, a wider spread of CIs was observed under high asymmetry than under low and moderate asymmetry.