1 Introduction

The Rayleigh distribution was originally introduced by Rayleigh in connection with a problem in acoustics, and it is a particular case of the Weibull distribution. Its hazard rate function grows linearly over time, and this property makes the Rayleigh distribution indispensable in numerous application fields. Applications arise in the construction of physical models, for instance sound radiation, ray radiation, wave heights and wind speed. Because the Rayleigh distribution can represent an instantaneous state exactly, it is used to describe instantaneous peak values in communication theory; see Gómez-Déniz and Gómez-Déniz (2013). Polovko (1968) and Dyer and Whisenand (1973) pointed out that the Rayleigh distribution plays a significant role in electronic vacuum devices and communication engineering, for example in modeling the swing of radio noise and the envelopes of certain stochastic processes, whose probability density is Rayleigh. In addition, the Rayleigh distribution fits well models that grow rapidly over time, so it is very popular in probability and statistics. It is widely used as a lifetime model in the field of reliability, and it also applies in other sciences, including operations research and biology. In the recent past, distributions associated with the Rayleigh distribution have attracted the attention of many authors. Raqab and Madi (2009) considered informative and non-informative priors to obtain estimation and prediction for the exponentiated Rayleigh model. Mahmoud and Ghazal (2016) proposed estimation of the exponentiated Rayleigh distribution under a type-II censoring model. Kundu and Raqab (2005) introduced different estimation procedures for the unknown parameter(s) of the generalized Rayleigh distribution. Wu (2006) discussed acceptance sampling schemes when the samples come from the generalized Rayleigh distribution.
Apart from that, Ali (2015) derived Bayesian estimation for the inverted Rayleigh distribution. Soliman et al. (2010) obtained estimation and prediction under lower record values for the inverted Rayleigh distribution. More recently, Rastogi and Tripathi (2014) studied the estimation of the unknown parameters of the inverted exponentiated Rayleigh distribution under type-II progressive censoring.

We note that the hazard rate of the inverted exponentiated Rayleigh distribution is nonmonotone; in fact, its failure rate function is unimodal. Nonmonotone hazard rates are common in many practical situations: the lifetimes of electrical and mechanical components, or the condition of hospital patients under treatment, change with time and are not simply decreasing. The inverted exponentiated Rayleigh distribution can therefore fit such data, which is why it is favored by many statisticians. To the best of our knowledge, this distribution has not been studied under progressive first-failure censoring, so we consider the estimation of its parameters based on a progressive first-failure censored sample. Among the known censoring modes, progressive first-failure censoring is a particularly time-saving and cost-efficient sampling method; it helps us reduce experimental cost and complete the experiment quickly.

In consideration of cost and time, many experimenters pay attention to censoring in the life testing of reliability analysis; see Aslam et al. (2018), Aslam et al. (2019) and Sajid Ali and Butt (2019). More importantly, removal of units is reasonable, as studied by Aslam et al. (2018), Muhammad (2018), Aslam (2019) and Aslam and Arif (2018). The most common censoring schemes, type-I and type-II censoring, have been used widely with all kinds of lifetime models; one can refer to Kay (1976), Aslam et al. (2018) and Sinha (1986). A more flexible scheme is progressive censoring, which allows test units to be removed at intermediate time points rather than only at the terminal point (Balakrishnan and Aggarwala 2000); Nasrullah Khan (2018) applied this censoring method. Moreover, Balasooriya (1995) proposed dividing the test items into several groups in order to reduce the time and cost of the experiment. An attractive scheme that unites the features of progressive censoring and grouped (first-failure) censoring is the progressive first-failure censoring scheme, explained in detail by Wu and Kus (2009). For some recent studies, one can refer to Soliman et al. (2011) and Soliman et al. (2012).

The cumulative distribution function (cdf) of the inverted exponentiated Rayleigh distribution (IERD), denoted by IERD \( (\alpha , \beta ) \), is written as

$$\begin{aligned} F(x;\alpha ,\beta )=1-\left(1-e^{-(\beta /x^{2}) }\right)^{\alpha }, \quad x,\alpha ,\beta >0 \end{aligned}$$
(1.1)

where \(\alpha \) is the shape parameter and \(\beta \) is the scale parameter. The corresponding probability density function (pdf) is given by

$$\begin{aligned} f(x;\alpha ,\beta )=2\alpha \beta x^{-3} e^{-\left(\beta /x^{2}\right) }\left(1-e^{-(\beta /x^{2}) }\right)^{\alpha -1}, \quad x,\alpha ,\beta >0 \end{aligned}$$
(1.2)

Next, we give the IERD’s reliability and failure rate functions, respectively,

$$\begin{aligned} r(t)&= {} \left(1-e^{-(\beta /t^{2}) }\right)^{\alpha }, \quad t>0, \end{aligned}$$
(1.3)
$$\begin{aligned} h(t)&= {} 2 \alpha \beta t^{-3}e^{-(\beta /t^{2}) }\left(1-e^{-(\beta /t^{2}) }\right)^{-1}, \quad t>0 \end{aligned}$$
(1.4)
Fig. 1

Plots of f(t) and h(t) of IERD when \( \beta =1 \)

Figure 1 shows the density and failure rate of the inverted exponentiated Rayleigh distribution for \( \alpha =0.5,0.8,1 \) and \( \beta =1 \). It is easy to see that the failure rate of the IERD is nonmonotone.
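The unimodality of the failure rate can also be checked numerically. The following Python sketch (our illustration, not part of the original analysis) evaluates h(t) from Eq. (1.4) on a grid and locates its mode:

```python
import math

def ierd_hazard(t, alpha, beta):
    """Failure rate h(t) of the IERD, Eq. (1.4)."""
    z = math.exp(-beta / t**2)
    return 2.0 * alpha * beta * t**-3 * z / (1.0 - z)

# Evaluate on a grid: h(t) first rises, then falls (unimodal hazard).
alpha, beta = 0.5, 1.0
ts = [0.1 * i for i in range(2, 60)]
hs = [ierd_hazard(t, alpha, beta) for t in ts]
peak = max(range(len(hs)), key=hs.__getitem__)
print("hazard peaks near t =", ts[peak])
```

For these parameter values the maximum occurs at an interior point of the grid, with h(t) decreasing on both sides, confirming that the hazard is nonmonotone.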

We first define the progressive first-failure censoring scheme as follows. Suppose a life test involves n independent groups, each containing k units. When the first failure is observed, the group in which it occurs and \(R_{1} \) further groups are randomly removed from the experiment; when the second failure occurs, the group containing it and \( R_{2} \) of the remaining groups are randomly removed, and so on. The procedure continues until the m-th failure, yielding \( x_{1:m:n:k}<x_{2:m:n:k}<\cdots <x_{m:m:n:k} \) under the progressive censoring scheme \( R=( R_{1}, R_{2},\ldots , R_{m}) \) as the progressively first-failure censored order statistics, drawn from a population with pdf \( f(\cdot ) \) and cdf \( F(\cdot ) \). For simplicity, we write \( x_{i} \) for \( x_{i:m:n:k}\). It is obvious that \( m+R_{1}+ R_{2}+\cdots + R_{m} = n \). The joint pdf of a progressively first-failure censored dataset was derived by Wu and Kus (2009):

$$\begin{aligned} f( x_{1},x_{2} ,\ldots , x_{m})= Ck^{m}\prod _{i=1}^{m}f( x_{i})(1-F( x_{i}))^{k(R_i+1)-1}; \end{aligned}$$
(1.5)

\( 0< x_{1}<x_{2}<\cdots<x_{m}<\infty. \)

where \( C = n (n- R_{1}-1)(n- R_{1}-R_{2}-2)\cdots (n- R_{1}-R_{2}-\cdots -R_{m-1}-m+1)\). Note that the censoring scheme \( R=( R_{1}, R_{2},\ldots , R_{m}) \) is fixed in advance.

Several familiar schemes arise as special cases. If \( k=1\), \(R_{1}=R_{2}=\cdots =R_{m}=0 \) and \( m=n \), the progressively first-failure censored scheme reduces to the complete sample. With group size \( k=1 \), the scheme becomes progressive type-II censoring. When \( R=(0,\ldots ,0) \), it corresponds to the first-failure censored sample. More interestingly, with \( k=1 \) and \( R=(0,\ldots ,0,n-m) \), the plan can be regarded as the traditional type-II censoring plan. Hence, progressive first-failure censoring is a generalization of censoring whose advantage is accounting for test cost and time.
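The sampling mechanism above can be sketched in a few lines of Python. This is our own minimal simulation (function names are ours): each group's first-failure time is the minimum of its k unit lifetimes, drawn from the IERD by inverting the cdf (1.1), and at each observed failure \(R_i\) surviving groups are withdrawn at random:

```python
import math
import random

def ierd_sample(alpha, beta):
    """One draw from IERD(alpha, beta) by inverting the cdf (1.1):
    F(x) = 1 - (1 - exp(-beta/x^2))^alpha."""
    u = random.random()
    return math.sqrt(-beta / math.log(1.0 - (1.0 - u) ** (1.0 / alpha)))

def pff_sample(alpha, beta, n, k, R):
    """Progressively first-failure censored sample: n groups of k units,
    removal scheme R = (R_1, ..., R_m).  Returns x_1 < ... < x_m."""
    assert len(R) + sum(R) == n
    # each group fails when its first (minimum) unit lifetime is reached
    groups = [min(ierd_sample(alpha, beta) for _ in range(k)) for _ in range(n)]
    xs = []
    for Ri in R:
        x = min(groups)          # next observed first failure
        xs.append(x)
        groups.remove(x)
        for _ in range(Ri):      # withdraw R_i surviving groups at random
            groups.remove(random.choice(groups))
    return xs
```

Because the removed groups are chosen independently of their future failure times, deleting random entries of the remaining list reproduces the censoring mechanism described above.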

The structure of the article is as follows. Section 2 deals with the MLEs and uses the observed Fisher information matrix to obtain asymptotic confidence intervals (CIs) and coverage probabilities (CPs) for the parameters. In Sect. 3 we provide parametric bootstrap CIs of the parameters constructed from the censored sample. Section 4 discusses Bayesian estimation of the parameters, where we compute the Bayes estimates using the Metropolis–Hastings (M–H) technique and obtain highest posterior density (HPD) credible intervals in the same section. Section 5 reports a Monte Carlo simulation study carried out to evaluate the performance of the methods under different censoring schemes, numbers of groups and group sizes. Finally, we illustrate all the methods of this article with a real dataset in the last section.

2 Maximum likelihood estimation

2.1 Point estimation

Let \( x_{1}, x_{2}, \ldots , x_{m} \) be a progressively first-failure censored sample from the inverted exponentiated Rayleigh distribution IERD\((\alpha ,\beta )\) with censoring scheme \( \mathbf{R } \). Using Eqs. (1.1), (1.2) and (1.5), the likelihood function can be written as

$$\begin{aligned} L(\alpha ,\beta )=Ck^{m}2^{m}\alpha ^{m}\beta ^{m}\prod _{i=1}^{m}x_{i}^{-3}e^{-\sum _{i=1}^{m}\beta /x_{i}^2} e^{\sum _{i=1}^{m}\{\alpha k (R_{i}+1)-1\}\ln \left(1-e^{-\beta /x_{i}^{2}}\right)}. \end{aligned}$$
(2.1)

We can also write the log-likelihood function as

$$\begin{aligned}&log L(\alpha , \beta )\propto m\ln (\alpha )+m\ln (\beta )-\beta \sum _{i=1}^{m}\left( \frac{1}{x_{i}^{2}}\right) \nonumber \\&\quad +\sum _{i=1}^{m}\{\alpha k (R_{i}+1)-1\}\ln \left(1-e^{-\beta /x_{i}^{2}}\right) \end{aligned}$$
(2.2)

Taking derivatives of Eq. (2.2) with respect to \(\alpha \) and \(\beta \) and equating them to zero yields the likelihood equations:

$$\begin{aligned} \frac{\partial \log L(\alpha , \beta )}{\partial \alpha }&= {} \frac{m}{\alpha }+\sum _{i=1}^{m}k(R_{i}+1)\ln \left(1-e^{-\beta /x_{i}^{2}}\right)=0, \end{aligned}$$
(2.3)
$$\begin{aligned} \frac{\partial \log L(\alpha , \beta )}{\partial \beta }&= {} \frac{m}{\beta }-\sum _{i=1}^{m}\frac{1}{x_{i}^{2}}\nonumber \\&\quad +\,\sum _{i=1}^{m} \frac{1}{x_{i}^{2}}\frac{\alpha k(R_{i}+1)-1}{1-e^{-\beta /x_{i}^{2}}}e^{-\beta /x_{i}^{2}}=0. \end{aligned}$$
(2.4)

From Eq. (2.3) we obtain

$$\begin{aligned} {\hat{\alpha }}= -\frac{m}{\sum _{i=1}^{m}k(R_{i}+1)\ln \left(1-e^{-\beta /x_{i}^{2}}\right)}, \end{aligned}$$
(2.5)

Substituting Eq. (2.5) into Eq. (2.4) gives

$$\begin{aligned}&\frac{m}{\beta }-\sum _{i=1}^{m}\left( \dfrac{1}{x_{i}^{2}}\right) \nonumber \\&\quad +\,\sum _{i=1}^{m} \frac{1}{x_{i}^{2}}\left( \frac{-m k(R_{i}+1)}{\sum _{j=1}^{m}k(R_{j}+1)\ln \left(1-e^{-\beta /x_{j}^{2}}\right)}-1\right) \frac{e^{-\beta /x_{i}^{2}}}{1-e^{-\beta /x_{i}^{2}}}=0. \end{aligned}$$
(2.6)

The estimate \( {\hat{\beta }} \) can be obtained with the help of the Newton–Raphson iteration method. Then, by the invariance property of MLEs, the estimates of r(t) and h(t) at a predetermined time t are

$$\begin{aligned} {\hat{r}}(t)&= {} \left(1-e^{-({\hat{\beta }}/t^{2}) }\right)^{{\hat{\alpha }}} \end{aligned}$$
(2.7)
$$\begin{aligned} {\hat{h}}(t)&= {} 2{\hat{\alpha }}{\hat{\beta }} t^{-3}e^{-\left({\hat{\beta }}/t^{2}\right) }\left(1-e^{-({\hat{\beta }}/t^{2}) }\right)^{-1} \end{aligned}$$
(2.8)

Here, \( t>0 \).
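The estimation step can be sketched numerically. The Python fragment below (our sketch; the paper uses Newton–Raphson, while we substitute bisection for robustness, assuming the profile score changes sign once on the search bracket) solves the equation obtained by inserting Eq. (2.5) into Eq. (2.4) for \(\beta\), then recovers \(\hat{\alpha}\) from Eq. (2.5):

```python
import math

def profile_score(beta, xs, R, k):
    """Left side of Eq. (2.4) with alpha replaced by alpha_hat(beta), Eq. (2.5)."""
    m = len(xs)
    D = sum(k * (Ri + 1) * math.log(1.0 - math.exp(-beta / x**2))
            for x, Ri in zip(xs, R))
    a_hat = -m / D                                   # Eq. (2.5), D < 0
    s = m / beta - sum(1.0 / x**2 for x in xs)
    for x, Ri in zip(xs, R):
        z = math.exp(-beta / x**2)
        s += (a_hat * k * (Ri + 1) - 1.0) * z / (x**2 * (1.0 - z))
    return s

def ierd_mle(xs, R, k, lo=1e-6, hi=100.0, tol=1e-10):
    """(alpha_hat, beta_hat): bisection on the profile score, then Eq. (2.5)."""
    assert profile_score(lo, xs, R, k) > 0 > profile_score(hi, xs, R, k)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if profile_score(mid, xs, R, k) > 0:
            lo = mid
        else:
            hi = mid
    beta_hat = 0.5 * (lo + hi)
    m = len(xs)
    D = sum(k * (Ri + 1) * math.log(1.0 - math.exp(-beta_hat / x**2))
            for x, Ri in zip(xs, R))
    return -m / D, beta_hat
```

The score tends to \(+\infty\) as \(\beta \rightarrow 0\) and is negative for large \(\beta\) in typical samples, so a sign-changing bracket is easy to find.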

2.2 Confidence interval estimation

Based on the likelihood equations, this section derives the observed Fisher information (OFI). Writing \( \theta =(\alpha ,\beta ) \), the Fisher information matrix of \( \theta \) is

$$\begin{aligned} I( \theta )=E\left[ \begin{array}{cc} -\frac{\partial ^2\log L(\alpha , \beta )}{\partial \alpha ^2 }&{} -\frac{\partial ^2\log L(\alpha , \beta )}{\partial \alpha \partial \beta }\\ -\frac{\partial ^2\log L(\alpha , \beta )}{\partial \beta \partial \alpha }&{} -\frac{\partial ^2\log L(\alpha , \beta )}{\partial \beta ^2 } \end{array} \right] \end{aligned}$$

Here,

$$\begin{aligned} \frac{\partial ^2\log L(\alpha , \beta )}{\partial \alpha ^2 }&= {} -\frac{m}{\alpha ^{2}} \\ \frac{\partial ^2\log L(\alpha , \beta )}{\partial \beta ^2 }&= {} -\frac{m}{\beta ^{2}}- \sum _{i=1}^{m}\frac{\{k\alpha (R_{i}+1)-1\}e^{-\beta /x_{i}^{2}}}{x_{i}^{4}\left(1-e^{-\beta /x_{i}^{2}}\right)^{2}} \\ \frac{\partial ^2\log L(\alpha , \beta )}{\partial \alpha \partial \beta }&= {} \frac{\partial ^2\log L(\alpha , \beta )}{\partial \beta \partial \alpha }= \sum _{i=1}^{m}\frac{k(R_{i}+1)e^{-\beta /x_{i}^{2}}}{x_{i}^{2}\left(1-e^{-\beta /x_{i}^{2}}\right)} \end{aligned}$$

We use the OFI matrix in our calculations rather than the Fisher information matrix, because the expectations of the above expressions are hard to evaluate. The OFI matrix is obtained as \( I({\hat{\theta }}) \) with \({\hat{\theta }} =({\hat{\alpha }},{\hat{\beta }}) \)

$$\begin{aligned} I({\hat{\theta }})=\left[ \begin{array}{cc} -\frac{\partial ^2\log L(\alpha , \beta )}{\partial \alpha ^2 }&{} -\frac{\partial ^2\log L(\alpha , \beta )}{\partial \alpha \partial \beta }\\ -\frac{\partial ^2\log L(\alpha , \beta )}{\partial \beta \partial \alpha }&{} -\frac{\partial ^2\log L(\alpha , \beta )}{\partial \beta ^2 } \end{array} \right] _{\theta ={\hat{\theta }}} \end{aligned}$$

Next, the inverse of the OFI matrix gives the observed variance–covariance matrix of the MLEs,

$$\begin{aligned} I^{-1}({\hat{\theta }})=\left[ \begin{array}{cc} {\hat{Var}}({\hat{\alpha }})&{} {\hat{Cov}}({\hat{\alpha }},{\hat{\beta }})\\ {\hat{Cov}}({\hat{\beta }},{\hat{\alpha }})&{}{\hat{Var}}({\hat{\beta }}) \end{array} \right] \end{aligned}$$

The asymptotic distribution of the MLE \({\hat{\theta }} \) is normal, \({\hat{\theta }} \sim N(\theta , I^{-1}({\hat{\theta }})) \); see Lawless (2011). Therefore, an approximate \( 100(1-\varepsilon ) \% \) CI for \( \theta \) is \( {\hat{\theta }}\pm z_{\varepsilon /2}\sqrt{{\hat{Var}}({\hat{\theta }})} \). The CP can be obtained by Monte Carlo simulation as

$$\begin{aligned} CP_{\theta }=P\left[ \left| \dfrac{({\hat{\theta }}-\theta )}{\sqrt{{\hat{Var}}({\hat{\theta }})}}\right| \le z_{\varepsilon /2} \right] \end{aligned}$$

where \( z_{\varepsilon /2} \) is the upper \( (\varepsilon /2){th} \) percentile of the standard normal distribution, and \(\theta \) is \(\alpha \) or \( \beta \).
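A compact numerical sketch of this interval construction (ours; the mixed derivative is obtained by differentiating Eq. (2.3) with respect to \(\beta\)) builds the 2×2 OFI matrix, inverts it in closed form, and returns the normal-approximation CIs:

```python
import math

def observed_info(alpha, beta, xs, R, k):
    """Observed information I(theta): negated second derivatives of log L."""
    m = len(xs)
    Iaa = m / alpha**2
    Ibb = m / beta**2
    Iab = 0.0
    for x, Ri in zip(xs, R):
        z = math.exp(-beta / x**2)
        Ibb += (k * alpha * (Ri + 1) - 1.0) * z / (x**4 * (1.0 - z)**2)
        Iab -= k * (Ri + 1) * z / (x**2 * (1.0 - z))
    return [[Iaa, Iab], [Iab, Ibb]]

def asymptotic_ci(alpha_hat, beta_hat, xs, R, k, z_crit=1.959964):
    """95% normal-approximation CIs from the inverse of the 2x2 OFI matrix."""
    (a, b), (_, d) = observed_info(alpha_hat, beta_hat, xs, R, k)
    det = a * d - b * b          # assumes I(theta_hat) is positive definite
    var_a, var_b = d / det, a / det
    return ((alpha_hat - z_crit * math.sqrt(var_a), alpha_hat + z_crit * math.sqrt(var_a)),
            (beta_hat - z_crit * math.sqrt(var_b), beta_hat + z_crit * math.sqrt(var_b)))
```

At the MLE the OFI matrix is normally positive definite, so the closed-form 2×2 inverse is safe to use.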

3 Bootstrap estimation for confidence intervals

Clearly, the asymptotic-normal CIs perform well when the effective sample size m is large. However, in practice m is not always large, and the bootstrap method is then a wise choice. In this section, we propose two parametric bootstrap algorithms: (I) the percentile bootstrap (Boot-p) method, based on the theory of Efron (1982), and (II) the bootstrap-t (Boot-t) method, based on Hall (1988). The bootstrap procedures are given below.

3.1 Boot-p method

Step 1: Compute the MLEs \( ({\hat{\alpha }},{\hat{\beta }}) \) from the original sample \( x_1,x_2,\ldots ,x_m \), where n, k and \( \mathbf{R } \) are the number of groups, the size of each group and the censoring scheme, respectively.

Step 2: Use \( ({\hat{\alpha }},{\hat{\beta }}) \) to generate an independent bootstrap sample under the same n, k and \( \mathbf{R } \), and compute the bootstrap MLEs \( ({\hat{\alpha }}^{*},{\hat{\beta }}^{*}) \).

Step 3: Repeat Step 2 N times to obtain a series of bootstrap estimates \( ({\hat{\alpha }}^{*}_{i},{\hat{\beta }}^{*}_{i}) \), \( i= 1,2,\ldots ,N \).

Step 4: After ordering \({\hat{\alpha }}^{*}_{(1)}\le {\hat{\alpha }}^{*}_{(2)}\le \cdots \le {\hat{\alpha }}^{*}_{(N)}\), the \( 100(1-2\epsilon ) \% \) Boot-p CI of \( \alpha \) is provided as

$$\begin{aligned} ({\hat{\alpha }}^{*}_{Boot-p(\epsilon )},~~ {\hat{\alpha }}^{*}_{Boot-p(1-\epsilon )}) \end{aligned}$$

The same method yields the corresponding interval for \(\beta \).

3.2 Boot-t method

The first two steps of the Boot-t method are the same as in the Boot-p method.

Step 3: Repeat Step 2 N times and compute the statistics

$$\begin{aligned} T_{\alpha i}^{*}=\dfrac{{\hat{\alpha }}^{*}_{i}-{\hat{\alpha }}}{\sqrt{I^{-1}({\hat{\alpha }}^{*}_{i})}}, \quad i=1,2,\ldots ,N. \end{aligned}$$

Step 4: After ordering \(T^{*}_{(\alpha 1)}\le T^{*}_{(\alpha 2)}\le \cdots \le T^{*}_{(\alpha N)} \), the approximate \( 100(1-2\epsilon ) \% \) Boot-t CI of \( \alpha \) is given by

$$\begin{aligned}&({\hat{\alpha }}-T^{*}_{\alpha Boot-t(\epsilon )}\sqrt{I^{-1}({\hat{\alpha }})}, \quad {\hat{\alpha }}+\,T^{*}_{\alpha Boot-t(1-\epsilon )}\sqrt{I^{-1}({\hat{\alpha }})}) \end{aligned}$$

Similarly, applying the same approach to \(\beta \) yields the confidence interval for \( \beta \).

4 Bayesian estimation

Although the traditional estimation methods are generally effective, they can become mathematically difficult. On the other hand, simulation methods such as Markov chain Monte Carlo (MCMC) have received increasing attention. Moreover, simulation provides point estimates and interval estimates at the same time. For parameter estimation we consider the squared error loss function. We assume gamma priors for both \( \alpha \) and \( \beta \):

$$\begin{aligned} \pi _{1}(\alpha )&= {} \dfrac{b^{a}}{\Gamma (a)}\alpha ^{a-1}e^{-b \alpha }, \alpha>0, a>0, b>0, \end{aligned}$$
(4.1)
$$\begin{aligned} \pi _{2}(\beta )&= {} \dfrac{d^{c}}{\Gamma (c)}\beta ^{c-1}e^{-d \beta }, \beta>0, c>0, d>0. \end{aligned}$$
(4.2)

Here a, b, c and d reflect prior knowledge about the parameters. The hyper-parameters make the model more flexible; when they are zero, the prior is non-informative. In addition, the gamma prior is a very reasonable choice, and many authors have applied it; see Gupta and Singh (2013) and Almutairi et al. (2015). The posterior distribution of the parameters is obtained by combining the likelihood function with the prior distributions

$$\begin{aligned} h_{\alpha ,\beta }(\alpha ,\beta |data)=\dfrac{l(data|\alpha ,\beta )\pi _{1}(\alpha )\pi _{2}(\beta )}{\int _{0}^{\infty }\int _{0}^{\infty }l(data|\alpha ,\beta )\pi _{1}(\alpha )\pi _{2}(\beta )d\alpha d\beta } \end{aligned}$$
(4.3)

and the joint posterior can be written as

$$\begin{aligned} h_{\alpha ,\beta }(\alpha ,\beta |data)\propto \alpha ^{m+a-1}\beta ^{m+c-1}\prod _{i=1}^{m}\dfrac{e^{-\beta /x_{i}^{2}}}{1-e^{-\beta /x_{i}^{2}}}e^{\left\{-\alpha [b-\sum _{i=1}^{m}k(R_{i}+1)\ln (1-e^{-\beta /x_{i}^{2}})]-d\beta \right\}} \end{aligned}$$
(4.4)

Because we do not have much prior information, we consider \( a=b=c=d=0 \). Next, we give the conditional posterior distributions of the parameters \( \alpha \) and \( \beta \), respectively:

$$\begin{aligned}&h_{\alpha }(\alpha ,\beta |data)\propto \alpha ^{m+a-1}e^{\left\{-\alpha [b-\sum _{i=1}^{m}k(R_{i}+1)\ln (1-e^{-\beta /x_{i}^{2}})]\right\}} \end{aligned}$$
(4.5)
$$\begin{aligned}&h_{\beta }(\alpha ,\beta |data)\propto \beta ^{m+c-1}\prod _{i=1}^{m}\dfrac{e^{-\beta /x_{i}^{2}}}{1-e^{-\beta /x_{i}^{2}}}e^{\left\{\sum _{i=1}^{m}\alpha k(R_{i}+1)\ln (1-e^{-\beta /x_{i}^{2}})-d\beta \right\}} \end{aligned}$$
(4.6)

We can see that \( h_{\alpha }(\alpha ,\beta |data) \) is a gamma density with shape parameter \( m+a \) and rate parameter \( b-\sum _{i=1}^{m}k(R_{i}+1)\ln (1-e^{-\beta /x_{i}^{2}})\), so any gamma generating algorithm can be used to obtain samples of \( \alpha \). However, samples of \( \beta \) cannot be drawn directly by standard approaches. For this reason, we apply the Metropolis–Hastings algorithm.

4.1 Point estimation of Metropolis–Hastings algorithm

There are many methods for Bayesian estimation, for example Lindley’s method, the importance sampling procedure and the Metropolis–Hastings (M–H) algorithm. However, the M–H algorithm is more attractive than the other two because: (I) the numerical and computational procedures are relatively straightforward, (II) it provides point estimates as well as HPD intervals for the parameters, and (III) it allows a full analysis of the data. The M–H approach is flexible, simple and effective. For applications of the algorithm, one can refer to Dey and Pradhan (2014).

The M–H approach was developed by Metropolis et al. (1953) and later extended by Hastings (1970), and it is now one of the most popular MCMC methods. The M–H algorithm works as follows:

Step 1: Give an initial guess \( \beta ^{0} \) and set t=1.

Step 2: Generate \( \alpha ^{t} \) from Gamma \( (m+a, b-\sum _{i=1}^{m}k(R_{i}+1)\ln (1-e^{-\beta ^{t-1}/x_{i}^{2}}))\).

Step 3: Generate \( \beta ^{*} \) from \( h_{\beta }(\alpha ,\beta |data) \) using the proposal distribution \( N(\beta ^{t-1},\sigma ^{2}) \), where \( \sigma ^{2} \) is the variance of \( \beta \).

Step 4: Calculate \( \rho (\beta ^{*}|\beta ^{t-1})= \min \{\dfrac{f(\beta ^{*}|x)q(\beta ^{t-1}|\beta ^{*})}{f(\beta ^{t-1}|x)q(\beta ^{*}|\beta ^{t-1})},1\}. \)

Step 5: Draw \(\upsilon \) from Uniform(0,1).

Step 6: If \( \upsilon \le \rho (\beta ^{*}|\beta ^{t-1}) \), set \( \beta ^{t}=\beta ^{*} \); otherwise \( \beta ^{t}=\beta ^{t-1} \).

Step 7: Set \(t=t+1.\)

Step 8: Repeat Steps 2–7 N times.

We discard the first M samples as burn-in in order to reduce the effect of the initial values. Thus, the approximate Bayes estimate of \( \theta =(\alpha , \beta ) \) is obtained as

$$\begin{aligned} {\hat{\theta }}_{MH}=\dfrac{1}{N-M} \sum _{j=M+1}^{N}\theta _{j} \end{aligned}$$
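The steps above can be sketched as follows in Python (our sketch, with illustrative defaults for N, M, \(\sigma\) and the starting value). Since the normal random-walk proposal is symmetric, the proposal densities q cancel in the ratio of Step 4, leaving only the conditional posterior of \(\beta\):

```python
import math
import random

def log_cond_beta(beta, alpha, xs, R, k, c=0.0, d=0.0):
    """log of the conditional posterior of beta, Eq. (4.6), up to a constant."""
    if beta <= 0:
        return float("-inf")
    m = len(xs)
    s = (m + c - 1.0) * math.log(beta) - d * beta
    for x, Ri in zip(xs, R):
        lz = math.log(1.0 - math.exp(-beta / x**2))
        s += -beta / x**2 - lz + alpha * k * (Ri + 1) * lz
    return s

def mh_sampler(xs, R, k, N=2000, M=500, sigma=0.5,
               a=0.0, b=0.0, c=0.0, d=0.0, beta0=1.0):
    """M-H within Gibbs: alpha | beta is Gamma (Eq. 4.5); beta is updated
    by a normal random walk with the usual acceptance probability."""
    m = len(xs)
    alphas, betas = [], []
    beta = beta0
    for _ in range(N):
        # Step 2: exact gamma draw (gammavariate takes shape and SCALE)
        rate = b - sum(k * (Ri + 1) * math.log(1.0 - math.exp(-beta / x**2))
                       for x, Ri in zip(xs, R))
        alpha = random.gammavariate(m + a, 1.0 / rate)
        # Steps 3-6: random-walk proposal, symmetric q cancels
        prop = random.gauss(beta, sigma)
        logr = (log_cond_beta(prop, alpha, xs, R, k, c, d)
                - log_cond_beta(beta, alpha, xs, R, k, c, d))
        if random.random() < math.exp(min(0.0, logr)):
            beta = prop
        alphas.append(alpha)
        betas.append(beta)
    # discard M burn-in draws; posterior means under squared error loss
    return sum(alphas[M:]) / (N - M), sum(betas[M:]) / (N - M)
```

Proposals \(\beta^{*} \le 0\) get log-density \(-\infty\) and are rejected automatically, which keeps the chain on the positive half-line.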

4.2 Highest posterior density credible interval

A point estimate does not take sampling error into consideration, and it is difficult to judge by a single value when sampling error inevitably occurs. This is where interval estimation shows its advantage: it accounts for the sampling error on top of the point estimate and guarantees, with a given probability, that the error does not exceed a specified range. Interval estimation therefore provides more reliable information.

Based on the Bayesian point estimation in the last subsection, we build the HPD credible interval for the parameter \( \theta \) from the same MCMC output, where we obtained \( N-M \) retained values of \( \theta \). Sorting \( \theta _{M+1},\theta _{M+2},\ldots ,\theta _{N} \) gives the ordered values \(\theta _{(M+1)}<\theta _{(M+2)}<\cdots <\theta _{(N)} \), from which an HPD credible interval for \( \theta \) can be constructed. For the rigorous theory behind this method, see Chen and Shao (1999); Soliman et al. (2010), Soliman et al. (2012) and Dube et al. (2016) also apply it to construct HPD credible intervals of parameters. For a given credible level \( \delta \), the \( 100(1-2\delta )\% \) HPD credible interval of \( \theta \) is as follows:

$$\begin{aligned} (\theta _{[\delta (N-M)]}, \theta _{[(1-\delta )(N-M)]}) \end{aligned}$$
(4.7)

where \( \theta _{[\delta (N-M)]} \) is the \( [\delta (N-M)] \)-th value of \(\theta _{(M+1)}<\theta _{(M+2)}<\cdots <\theta _{(N)} \), and \( [\delta (N-M)] \) denotes the integer part of \( \delta (N-M) \).
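Following Chen and Shao (1999), the HPD interval can be approximated from the retained draws by scanning all empirical \(100(1-2\delta)\%\) intervals and keeping the shortest. A minimal sketch (ours):

```python
def hpd_interval(draws, delta=0.025):
    """Chen-Shao style interval from MCMC draws: among all empirical
    100(1 - 2*delta)% intervals, return the shortest one."""
    s = sorted(draws)
    n = len(s)
    w = int((1.0 - 2.0 * delta) * n)     # window length in draws
    best = min(range(n - w), key=lambda i: s[i + w] - s[i])
    return s[best], s[best + w]
```

For a unimodal posterior sample, the shortest covering interval concentrates around the mode, which is exactly the HPD property.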

5 Simulation study

In this section we conduct a simulation study to compare the behavior of the different estimates for diverse censoring schemes with various group sizes and numbers of groups, and for different priors under different criteria. We assess the performance of the MLEs and Bayes estimates in terms of bias and mean squared error (MSE). In addition, we compare the interval estimates (asymptotic CIs, bootstrap CIs and HPD credible intervals) with respect to average confidence length (AL) and CP. For the Bayes estimators, we consider two priors:

1. Prior 1: \( a=b=c=d=0 \);

2. Prior 2: \( a=b=c=d=1 \).

It is clear that prior 2 carries more information than prior 1. Six censoring schemes (CS) and four combinations of (k, n, m) are considered; they are listed next.

1. Six CS: (25, 0 * 24), (1 * 25), (0 * 24, 25), (20, 0 * 29), ((2, 0, 0) * 10), (0 * 29, 20);

2. Four (k, n, m): (2, 50, 25), (2, 50, 30), (3, 50, 25), (3, 50, 20);

where (1 * 25) denotes a scheme of 25 ones, \( (1, 1, \ldots , 1) \).

Tables 1 and 2 report the average absolute bias and MSE of the MLEs and Bayes estimates of \( \alpha \) and \( \beta \) when \(\alpha =0.5 ,~ \beta =1\). From Tables 1 and 2, we draw the following conclusions:

Table 1 Bias and MSE of MLE and Bayes estimate of the \( \alpha \) assume \(\alpha =0.5, \beta =1\)
Table 2 Bias and MSE of MLE and Bayes estimate of the \( \beta \) assume \(\alpha =0.5, \beta =1\)
1. As the group size k increases, the MSE does not necessarily decrease; however, bias and MSE decrease as the effective sample size m increases;

2. Bayesian estimators based on the informative prior perform better than those based on the non-informative prior;

3. The Bayes estimates are better than the MLEs in terms of MSE.

The estimated values \( {\hat{r}}(t) \) and \( {\hat{h}}(t) \) are presented in Tables 3 and 4.

Table 3 Bias and MSE of MLE and Bayes estimate of r(t) assume t=2
Table 4 Bias and MSE of MLE and Bayes estimate of h(t) assume t=2

One can note that

1. In all cases \( {\hat{r}}(t) \) underestimates r(t), while \( {\hat{h}}(t) \) overestimates h(t);

2. As the group size k increases and the sample size m decreases, the bias and MSE of both the MLEs and the Bayes estimates increase;

3. The Bayes estimates obtained with the informative prior perform best.

The AL and CP for the parameters \( \alpha \) and \( \beta \) are reported in Tables 5 and 6. We can see that

1. the ALs of the asymptotic CIs, bootstrap CIs and HPD credible intervals become narrower as the effective sample size m and group size k increase;

2. the Bayes intervals work better than the MLE-based intervals in terms of AL and CP, and the informative-prior Bayes intervals perform especially well;

3. the bootstrap CIs are worse than the other CIs in terms of AL and CP when the effective sample size is not large.

Table 5 AL and associated CP of the parameter \( \alpha =0.5 \) with 95% asymptotic CI, bootstrap CI, HPD credible interval
Table 6 AL and associated CP of the parameter \( \beta =1 \) with 95% asymptotic CI, bootstrap CI, HPD credible interval

6 Real data

We now use a real dataset to illustrate the estimation methods of this paper. We analyze the strength dataset initially reported by Badar and Priest (1982), which gives strength measurements, in GPa, of single carbon fibers and impregnated 1000-carbon fiber tows tested under tension at gauge lengths of 10 mm. The dataset is given in Table 7.

Table 7 Data set (gauge lengths of 10mm)

This data was analyzed previously by Kundu and Gupta (2006) and Kundu and Gupta (2005).

Before proceeding further, we fit the inverted exponentiated Rayleigh distribution to the data, after removing the first observation and subtracting 0.75 from each value in order to group the dataset, and compare its fit with the inverted exponentiated exponential distribution (IEED) and the inverted exponentiated Pareto distribution (IEPD). Their pdfs are as follows:

$$\begin{aligned} f(x;\alpha ,\beta )&= {} \alpha \beta x^{-2} e^{-(\beta /x) }\left(1-e^{-(\beta /x) }\right)^{\alpha -1}, \quad x >0,\alpha>0,\beta>0 \\ f(x;\alpha ,\beta )&= {} \alpha \beta \frac{1}{x(1+x)} e^{-\beta \ln (1+1/x) }\left(1-e^{-\beta \ln (1+1/x) }\right)^{\alpha -1}, \quad x>0,\alpha>0,\beta >0 \end{aligned}$$

We use several criteria to evaluate the goodness of fit of these distributions, all computed at the MLEs: the negative log-likelihood (\(-\ln L \)), the Akaike information criterion (\( \textit{AIC} \)), the Bayesian information criterion (\(\textit{BIC}\)) and the Kolmogorov–Smirnov (KS) statistic, for which we report the \( \textit{p} \) value. Here L is the maximized likelihood of the fitted model, \( \textit{AIC}=2\times (\textit{p}-\ln (\textit{L})) \) with \(\textit{p}\) the number of the distribution’s parameters, and \( \textit{BIC}=\textit{p}\times \ln (n)-2\times \ln (\textit{L}) \) with n the number of observations in the sample. The smallest \( -\ln L,~\textit{AIC},~\textit{BIC} \) and the highest KS \( \textit{p} \) value indicate the best-fitting distribution. Apart from that, Fig. 2 provides a graphical comparison of the fits of the different distributions.
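Given a maximized negative log-likelihood, the two information criteria are a one-line computation each; the following sketch (ours) uses the definitions above:

```python
import math

def aic_bic(neg_log_lik, p, n):
    """AIC = 2(p - ln L) and BIC = p ln(n) - 2 ln L,
    given -ln L, the number of parameters p, and the sample size n."""
    aic = 2.0 * p + 2.0 * neg_log_lik
    bic = p * math.log(n) + 2.0 * neg_log_lik
    return aic, bic
```

For two-parameter models such as the IERD, p = 2, and smaller AIC/BIC values indicate a better fit.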

Fig. 2

Quantile–quantile plots of the three distributions for the real data example

Table 8 lists the \( -\ln L,~\textit{AIC},~\textit{BIC} \) and \( \textit{p} \) values.

Table 8 Fitting of the real data

Next, we divide the dataset into \( n=31 \) groups, each containing \( k=2 \) units, to generate a first-failure censored sample. Then, we consider three progressive censoring schemes applied to this sample. Table 9 presents the censoring schemes and the corresponding samples.

Table 9 Different progressively first-failure censoring samples

Table 10 reports the estimates of the two parameters and the reliability characteristics obtained by the MLEs and the Bayes estimators.

Table 10 MLE and Bayes estimate corresponding to actual example

We compute r(t) and h(t) for the real data using \( t=4 \). In this section, Bayes estimation is carried out with the non-informative prior, since we have no prior information about the parameters. Table 11 lists the \( 95\% \) asymptotic, Boot-p and Boot-t CIs and the HPD credible intervals for the parameters.

Table 11 The 95% CI of asymptotic, Boot-p and Boot-t CI, and HPD credible interval of actual example

7 Conclusion

Censoring is a common technique for acquiring samples from various experiments, and progressive first-failure censoring is an extension of censoring that helps to reduce time and cost. The hazard rate of the IERD is nonmonotone, which matches many practical situations. In this paper we study the estimation of the parameters of the IERD under progressive first-failure censoring. We derive the MLEs and the corresponding asymptotic CIs of the parameters, and we also compute confidence intervals by the bootstrap method. Bayes estimates and the associated HPD interval estimates under the squared error loss function are developed; since explicit forms of the Bayes estimators are hard to obtain, we make use of the M–H algorithm. A simulation study is used to evaluate the performance of the various estimators, and the theoretical results are applied to real data. The methodology discussed in this article should be helpful to data analysts and reliability practitioners.