1 Introduction

Entropy is a measure of the uncertainty of random variables and of the information shared by them; mathematically, it represents the expected quantity of information carried by a distribution. In the nineteenth century, the concept of entropy was introduced in thermodynamics and statistical mechanics as a measure of the disorder of a physical system. Shannon (1948) developed the idea of the entropy of random variables and related it to the theory of communication as a measure of information. In the last two decades, a considerable body of literature has been devoted to the importance and applications of entropy, see, for example, Adami (2004); Misra et al. (2005); Liu et al. (2011) and Robinson (2011). There has also been much recent work on parametric estimation of entropy using classical and Bayesian methods for several distributions; among others are Cho et al. (2015) and Lee (2017).

Now, we suppose that the random variable X has distribution function F with pdf f. The Shannon entropy \(H_{X}(f)\) of X is defined by

$$\begin{aligned} H_{X}(f)=-\int _{-\infty }^{\infty }f(x;\theta )\log f(x;\theta )\textrm{d}x, \end{aligned}$$

provided that the integral exists, see Cover and Thomas (2005). Entropy is high when the probability is spread out and low for a sharply peaked distribution; that is, H is a measure of the uncertainty associated with f. Moreover, entropy can be viewed as a measure of the uniformity of a distribution. Differential entropy has many applications in various areas, especially in time series, population genetics, physics, machine learning and information theory. For further properties and applications of entropy, one may refer to Liu et al. (2011) and Singh (2013).

In this paper, we suppose that N units are subjected to a life-testing experiment and that the lifetimes of the units follow a Rayleigh distribution with a probability density function (pdf) and cumulative distribution function:

$$\begin{aligned} f(x)=\frac{x}{\beta ^{2}}\exp \left( -\frac{x^{2}}{2\beta ^{2}}\right) \quad \textrm{and} \quad F(x)=1-\exp \left( -\frac{x^{2}}{2\beta ^{2}}\right) , \end{aligned}$$
(1)

respectively, for \( x>0\). The corresponding reliability and hazard functions at \(t_{0}\) (an arbitrary time) are given by

$$\begin{aligned} R(t_{0})=\exp \left( -\frac{t_{0}^{2}}{2\beta ^{2}}\right) \quad \textrm{and} \quad h(t_{0})=\frac{t_{0}}{\beta ^{2}},\quad t_{0}>0. \end{aligned}$$

Historically, the Rayleigh distribution was introduced by Rayleigh (1880). It is widely used in communication engineering, electrovacuum devices and several areas of statistics, as a suitable model for service times or for the lifetime of an object that deteriorates rapidly over time, since its failure rate is a linear increasing function of time. As a result, it has attracted several researchers and has a number of applications, cf. Lee (2018) and Polovko (1968). Over the past decade, a large body of literature has been devoted to the problem of inference about the Rayleigh distribution (see, for example, Dey and Dey (2014); Kim and Han (2009); Kotb and Raqab (2018) and Kotb and Mohie El-Din (2022)). Due to the importance of the Rayleigh model, our goal is to estimate the entropy for this model. The mean, variance and entropy of the Rayleigh model can be written as follows:

$$\begin{aligned} \mu =\sqrt{\frac{\pi }{2}}\beta ,\quad \textrm{Var}(X)=\frac{4-\pi }{2}\beta ^2 \quad \textrm{and} \quad H(f)=1+\log \left( \frac{\beta }{\sqrt{2}}\right) +\frac{\hbar }{2}, \end{aligned}$$
(2)

where \(\hbar =0.577216\) is the Euler-Mascheroni constant.
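As a quick numerical sanity check on (2), the closed-form entropy can be compared with direct numerical integration of \(-\int f\log f\,\textrm{d}x\). The sketch below uses \(\beta =0.735\) (the value used in the simulation study of Sect. 4); the integration grid and truncation point are arbitrary illustrative choices:

```python
import math

def rayleigh_pdf(x, beta):
    """Rayleigh pdf from Eq. (1)."""
    return (x / beta**2) * math.exp(-x**2 / (2 * beta**2))

def shannon_entropy_numeric(beta, n=200_000):
    """Approximate H(f) = -integral of f log f by the midpoint rule on [0, 12*beta]."""
    upper = 12 * beta          # the density beyond ~12*beta is negligible
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        fx = rayleigh_pdf(x, beta)
        if fx > 0.0:
            total -= fx * math.log(fx) * h
    return total

def shannon_entropy_closed(beta, euler_gamma=0.577216):
    """Closed form from Eq. (2): 1 + log(beta/sqrt(2)) + gamma/2."""
    return 1 + math.log(beta / math.sqrt(2)) + euler_gamma / 2
```

For \(\beta =0.735\) both routes agree with the value \(H(f)\approx 0.634\) quoted in the simulation study.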

On another important issue, one can utilize type-I and type-II censoring schemes, as well as a mixture of the two (referred to as hybrid censoring), in life studies when the researcher is unable to observe the lifetimes of all test units. Balasooriya (1995) introduced a new scheme, named the first-failure censoring scheme, which tests \(k\times n\) (instead of only n) units by testing n independent sets, each containing k units; the censoring plan is conducted by first-failure testing for each set separately. One disadvantage of the censoring plans mentioned above is that they do not allow the removal of units or sets from the test at different stages during the experiment. Therefore, many researchers have introduced censoring mechanisms that are more general than the traditional censoring schemes. The progressive scheme is a generalization of the type-I or type-II censoring schemes in which the removal of units prior to failure is preplanned at points other than the terminal point of the experiment. For further details on progressive censoring, one may refer to Balakrishnan and Aggarwala (2000) and Kotb and Raqab (2019); for an excellent review of results on progressive censoring schemes, one may refer to Balakrishnan (2007). Wu and Kus (2009) proposed a new life test plan, called the progressive first-failure censoring (PFFC) scheme, to improve the cost-effectiveness of choosing sample units for an experiment by combining first-failure censoring with progressive censoring. More recent references can be found in Ahmadi et al. (2013) and Wu and Huang (2012). The PFFC scheme has been used by several authors to estimate unknown parameters of various distributions; see, for example, Kotb et al. (2021); Mohammed et al. (2017) and Soliman et al. (2012).

In this study, our main aim is to discuss the problem of estimating the entropy of the Rayleigh model H(f) based on PFFC Rayleigh data. Also, we present a comparison study to find out which prior distribution, square root inverted gamma (SRIG) prior or Gumbel prior, is more informative in the sense of estimating the entropy.

This article is organized as follows. In Sect. 2, we derive the maximum likelihood estimators (MLEs) and the associated two-sided approximate CI. In Sect. 3, we derive the Bayes estimates (BEs) and the corresponding CrIs with SRIG and Gumbel priors under four different loss functions for the entropy of the Rayleigh model. In Sect. 4, an intensive Monte Carlo simulation study is performed to compare the developed methods. A real data set of COVID-19 mortality rates in Mexico over 108 days is analyzed for illustrative purposes in Sect. 5. Finally, we conclude with a brief summary and some remarks on the simulation results in Sect. 6.

2 Non-Bayesian estimation

Let \(X_{1;m,n,k}^{R}<X_{2;m,n,k}^{R}<\ldots <X_{m;m,n,k}^{R}\) be PFFC order statistics with the progressive censoring scheme (CS) \({\textbf{R}}=\left( R_{1},R_{2},\ldots ,R_{m}\right) \), where \(n=m+\sum _{j=1}^{m}R_{j}\). Based on the PFFC sample \(X_{1;m,n,k}^{R}\), \(X_{2;m,n,k}^{R}, \ldots , X_{m;m,n,k}^{R}\), written for simplicity as \({\textbf {x}}=(x_{1},x_{2},\ldots ,x_{m})\), the joint pdf is given by

$$\begin{aligned} L(\theta |{\textbf {x}})=C \prod _{j=1}^{m}f(x_{j},\theta )\left[ 1-F(x_{j},\theta ) \right] ^{k(R_{j}+1)-1}, \end{aligned}$$
(3)

where \(C=P k^{m}\) with \(P=n(n-R_{1}-1)(n-R_{1}-R_{2}-2)\cdots (n-R_{1}-R_{2}-\cdots -R_{m-1}-m+1)\).

On substituting the cumulative distribution function and pdf in (1) into (3), the likelihood function of \(\beta \) can be written as

$$\begin{aligned} L(\beta |{\textbf {x}})=C\left( \prod _{j=1}^{m}\frac{x_{j}}{\beta ^{2}}\right) \exp \left( -\frac{k}{2\beta ^{2}} \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}\right) , \end{aligned}$$
(4)

hence, the log-likelihood function from (4) can be written as (except for the constant term)

$$\begin{aligned} \ell (\beta |{\textbf {x}})=-2m\log \beta -\frac{k}{2\beta ^{2}} \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}. \end{aligned}$$
(5)

In our set-up, the MLE of \(\beta \) is found to be

$$\begin{aligned} {\tilde{\beta }}_{ML}=\left( \frac{k}{2m}\sum _{j=1}^{m}(R_{j}+1) x_{j}^{2}\right) ^{1/2}. \end{aligned}$$
(6)

Using the invariance property of MLEs (see Zehna (1966)), from (2) and (6), the MLE of the entropy function is given by

$$\begin{aligned} {\tilde{H}}_{ML}=1+\log \left( \frac{{\tilde{\beta }}_{ML}}{\sqrt{2}}\right) +\frac{\hbar }{2}. \end{aligned}$$
(7)
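For illustration, the estimators (6) and (7) can be computed directly from a PFFC sample. The sketch below uses a small made-up sample and censoring scheme; only the formulas themselves come from the text:

```python
import math

EULER_GAMMA = 0.577216   # the constant denoted by hbar in the text

def beta_mle(x, R, k):
    """MLE of beta from Eq. (6) for a PFFC sample x with scheme R and group size k."""
    m = len(x)
    s = sum((r + 1) * xi**2 for xi, r in zip(x, R))
    return math.sqrt(k * s / (2 * m))

def entropy_mle(x, R, k):
    """MLE of H(f) from Eq. (7), via the invariance property of MLEs."""
    return 1 + math.log(beta_mle(x, R, k) / math.sqrt(2)) + EULER_GAMMA / 2

# Made-up PFFC sample: m = 5 observed first failures,
# scheme R = (1, 0, 0, 0, 2), group size k = 3.
x = [0.21, 0.35, 0.48, 0.66, 0.81]
R = [1, 0, 0, 0, 2]
b_hat, h_hat = beta_mle(x, R, 3), entropy_mle(x, R, 3)
```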

Moreover, after some algebraic simplification, the Fisher information on \(\beta \) can be obtained by using (5) as

$$\begin{aligned} I_{\beta }=\frac{1}{\beta ^{2}}\left\{ -2m+\frac{3k}{\beta ^{2}}E\left[ \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}\right] \right\} . \end{aligned}$$

Hence, the sampling distribution of \(({\tilde{\beta }}-\beta )/\sqrt{I_{\beta }^{-1}}\) can be approximated by a standard normal distribution. This enables us to construct a CI for \(\beta \) based on the limiting standard normal distribution. For constructing the asymptotic CI of \(\Lambda =H(f)\), the asymptotic variance of \({\tilde{\Lambda }}\) has to be computed. From the asymptotic efficiency of MLEs, the variance of \({\tilde{\Lambda }}\) can be approximated by

$$\begin{aligned} AVar({\tilde{\Lambda }})=\left[ \left( \frac{\partial \Lambda }{\partial \beta }\right) ^{2}I_{\beta }^{-1}\right] _{{\tilde{\beta }}}. \end{aligned}$$

Meeker and Escobar (1998) reported that the CI based on the asymptotic theory of \(\ln {\tilde{\Lambda }}\) is superior to the one based on \({\tilde{\Lambda }}\). The CI with confidence level \(100(1-\tau )\%\) for \(\ln {\tilde{\Lambda }}\) (denoted by LN) is given by

$$\begin{aligned} \left( \Lambda _{LN}^{(l)},\Lambda _{LN}^{(u)}\right)= & {} \left[ \frac{{\tilde{\Lambda }}}{\exp \left( z_{1-\tau /2}{\tilde{S}}_{\Lambda } /{\tilde{\Lambda }}\right) }, {\tilde{\Lambda }}\exp \left( z_{1-\tau /2}{\tilde{S}}_{\Lambda }/{\tilde{\Lambda }}\right) \right] , \end{aligned}$$

where \({\tilde{S}}_{\Lambda }=\sqrt{\textrm{AVar}({\tilde{\Lambda }})}\).
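One convenient simplification, not stated explicitly in the text, is an observed-information plug-in: evaluating \(I_{\beta }\) at \({\tilde{\beta }}\) and replacing the expectation by the observed value \(\sum _{j}(R_{j}+1)x_{j}^{2}=2m{\tilde{\beta }}^{2}/k\) gives \(I_{{\tilde{\beta }}}=4m/{\tilde{\beta }}^{2}\); since \(\partial \Lambda /\partial \beta =1/\beta \), this yields \(AVar({\tilde{\Lambda }})\approx 1/(4m)\). A sketch of the LN interval under this approximation (the sample is made up; the LN construction requires \({\tilde{\Lambda }}>0\)):

```python
import math

def ln_confidence_interval(x, R, k, z=1.959964):
    """Approximate 95% CI for Lambda = H(f) on the log scale (LN method),
    using the plug-in approximation AVar ~ 1/(4m); requires Lambda_hat > 0."""
    m = len(x)
    s = sum((r + 1) * xi**2 for xi, r in zip(x, R))
    beta_hat = math.sqrt(k * s / (2 * m))
    lam_hat = 1 + math.log(beta_hat / math.sqrt(2)) + 0.577216 / 2
    se = 1.0 / (2 * math.sqrt(m))            # S_Lambda = sqrt(AVar)
    factor = math.exp(z * se / lam_hat)
    return lam_hat / factor, lam_hat * factor
```

Note that, by construction, the two limits are symmetric around \({\tilde{\Lambda }}\) on the multiplicative scale.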

3 Bayes estimation

Here, we present the posterior densities of \(\beta \) and H based on Rayleigh PFFC data in order to obtain the BEs (either point or interval) of the entropy of the Rayleigh distribution. This is done with respect to the squared error loss (SEL), LINEX loss, general entropy loss (GEL), and Al-Bayyati loss (ABL) functions. The posterior distribution is computationally efficient and analytically tractable when the prior and posterior densities belong to the same family (conjugacy). So, we use the SRIG and Gumbel priors to obtain the posterior densities of \(\beta \) and H, respectively.

3.1 Bayes estimation using SRIG prior

The SRIG prior density of \(\beta \), see Fernández (2000), is written as

$$\begin{aligned} \pi \left( \beta ;\delta \right) \propto \beta ^{-2\sigma -1}\exp \left( -\frac{\rho }{2\beta ^{2}}\right) ,\quad \beta >0, \end{aligned}$$
(8)

where \(\delta =(\sigma ,\rho )\), and \(\sigma >0\) and \(\rho >0\) are hyperparameters. By combining (4) and (8), the posterior density of \(\beta \) can be written as

$$\begin{aligned} \pi _{1}(\beta |{\textbf {x}})= & {} \frac{1}{I\left( \sigma ,\rho \right) }\; \beta ^{-2(\sigma +m)-1}\exp \left( -\frac{1}{2\beta ^{2}}\left( \rho +k \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}\right) \right) ,\qquad \end{aligned}$$
(9)

where \(I\left( \sigma ,\rho \right) \) is the normalizing constant,

$$\begin{aligned} I\left( \sigma ,\rho \right) = \frac{2^{\sigma +m-1}\Gamma \left( \sigma +m\right) }{\left( \rho +k \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}\right) ^{\sigma +m}}. \end{aligned}$$

It is known that the BE of \(\Lambda \) under the SEL function is the posterior mean of \(\Lambda \). Then, the BE of H(f) can be obtained as follows:

$$\begin{aligned} {\tilde{H}}_{SE}= & {} 1+\int _{0}^{\infty }\log \left( \frac{\beta }{\sqrt{2}}\right) \pi _{1}(\beta |{\textbf {x}}) \textrm{d}\beta +\frac{\hbar }{2}\\= & {} 1+\frac{1}{2}\log \left( \frac{1}{4}\left( \rho +k \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}\right) \right) -\frac{1}{2}PG\left[ \sigma +m\right] +\frac{\hbar }{2}, \end{aligned}$$

where PG\(\left[ z\right] \equiv \textrm{PolyGamma}\left[ z\right] \) denotes the digamma function, which can be computed easily in Wolfram Mathematica. There are various asymmetric loss functions in the literature, see Chandra (2001) and Zellner (1986). One of the most common asymmetric loss functions is the LINEX loss function, which was introduced by Varian (1975) and has since been used in different estimation problems. The BE of H under the LINEX loss function, denoted by \({\tilde{H}}_{BL}\), is given by

$$\begin{aligned} {\tilde{H}}_{BL}= & {} -\frac{1}{c}\ln E_{H}\left[ \exp \left( -c H\right) |{\textbf {x}}\right] \\= & {} -\frac{1}{c}\ln \left( \frac{2^{c}\Gamma ^{*}(c)\exp \left( -c-c \hbar /2\right) }{\left( \rho +k \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}\right) ^{c/2}}\right) , \end{aligned}$$

where \(\Gamma ^{*}(c)=\Gamma \left( \sigma +m+c/2\right) /\Gamma \left( \sigma +m\right) \). A suitable alternative to the modified LINEX loss, see Basu and Ebrahimi (1991), is the GEL proposed by Calabria and Pulcini (1996). Based on posterior density (9), the BE of H under the GEL function may be defined as

$$\begin{aligned} {\tilde{H}}_{GE}=\left( E\left[ H^{-\pounds }|{\textbf {x}}\right] \right) ^{-1/\pounds }=\left( \frac{\eta \left( -\pounds \right) }{ I\left( \sigma ,\rho \right) }\right) ^{-1/\pounds }, \end{aligned}$$

where

$$\begin{aligned} \eta \left( \zeta \right)= & {} \int _{0}^{\infty }\left( 1+\log \left( \frac{\beta }{\sqrt{2}}\right) +\frac{\hbar }{2}\right) ^{\zeta }\beta ^{-2(\sigma +m)-1}\nonumber \\{} & {} \times \exp \left( -\frac{1}{2\beta ^{2}}\left( \rho +k \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}\right) \right) \textrm{d}\beta . \end{aligned}$$
(10)

Since the integral in (10) has no closed form, the BEs can be approximated via numerical methods. Moreover, we use the loss function introduced by Al-Bayyati (2002); various authors, for example Kotb and Raqab (2018) and Kotb and Mohie El-Din (2022), have used this loss function in different estimation problems. The ABL function for H is

$$\begin{aligned} \Im _{AB}\left( {\tilde{H}},H\right) =H^{b}\left( {\tilde{H}}-H\right) ^{2}, \quad b\in {\mathbb {R}}. \end{aligned}$$

Due to its analytical tractability in Bayesian analysis, the ABL function is frequently used. By using the posterior density (9), the BE of H under the ABL function is given by

$$\begin{aligned} {\tilde{H}}_{AB}=\frac{E\left[ H^{b+1}|{\textbf {x}}\right] }{E\left[ H^{b}|{\textbf {x}}\right] }=\frac{\eta \left( b+1\right) }{\eta \left( b\right) }, \end{aligned}$$

where \(\eta \left( b\right) \) is given in (10).
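The SEL estimate needs only the digamma function, while the LINEX, GEL and ABL estimates can alternatively be approximated by Monte Carlo: under the substitution \(u=\left( \rho +k\sum _{j}(R_{j}+1)x_{j}^{2}\right) /(2\beta ^{2})\), the posterior (9) of \(\beta \) maps to \(u\sim \)Gamma\((\sigma +m,1)\), so posterior draws of H are cheap to generate. A sketch (the loss-parameter values c, b, \(\pounds \) are arbitrary illustrative choices; the GEL and ABL moments assume the posterior draws of H are positive):

```python
import math, random

EULER_GAMMA = 0.577216

def digamma(z):
    """Digamma function psi(z) (PG[z] in the text): recurrence + asymptotic series."""
    result = 0.0
    while z < 6:
        result -= 1 / z
        z += 1
    inv2 = 1 / (z * z)
    return (result + math.log(z) - 1 / (2 * z)
            - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252)))

def entropy_bayes_sel(x, R, k, sigma, rho):
    """BE of H under SEL with the SRIG(sigma, rho) prior (closed form above)."""
    m = len(x)
    A = rho + k * sum((r + 1) * xi**2 for xi, r in zip(x, R))
    return 1 + 0.5 * math.log(A / 4) - 0.5 * digamma(sigma + m) + EULER_GAMMA / 2

def entropy_bayes_mc(x, R, k, sigma, rho, c=0.5, b=1.0, pound=0.5,
                     n_draws=20000, seed=1):
    """Monte Carlo BEs of H under SEL, LINEX, GEL and ABL: draw u ~ Gamma(sigma+m, 1),
    then H = 1 + (1/2) log(A/(4u)) + EULER_GAMMA/2 is a draw from the posterior of H."""
    rng = random.Random(seed)
    m = len(x)
    A = rho + k * sum((r + 1) * xi**2 for xi, r in zip(x, R))
    hs = [1 + 0.5 * math.log(A / (4 * rng.gammavariate(sigma + m, 1.0)))
          + EULER_GAMMA / 2 for _ in range(n_draws)]
    n = float(n_draws)
    h_sel = sum(hs) / n
    h_linex = -math.log(sum(math.exp(-c * h) for h in hs) / n) / c
    # GEL/ABL moments assume all posterior draws of H are positive
    h_gel = (sum(h ** (-pound) for h in hs) / n) ** (-1 / pound)
    h_abl = sum(h ** (b + 1) for h in hs) / sum(h ** b for h in hs)
    return h_sel, h_linex, h_gel, h_abl
```

By Jensen's inequality the LINEX estimate (c > 0) lies below, and the ABL estimate (b = 1) above, the SEL estimate.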

3.1.1 Bayes probability intervals

The posterior density of H can be defined through re-parameterization of the original parameters. This can be achieved by considering a one-to-one transformation from

$$\begin{aligned} \Omega _{\beta }=\left\{ \beta :\beta >0\right\} \quad \textrm{onto}\quad \Omega _{H}=\left\{ H:-\infty<H<\infty \right\} . \end{aligned}$$

Therefore, the posterior density of H based on \({\textbf {x}}\) is

$$\begin{aligned} \pi _{2}(H|{\textbf {x}})= & {} \frac{1}{J\left( \sigma ,\rho \right) }\;\exp \left( -2\left( \sigma +m\right) H -\frac{d}{4}\right. \nonumber \\{} & {} \times \left. \left( \rho +k \sum _{j=1}^{m}(1+R_{j})x_{j}^{2}\right) e^{-2H}\right) , \end{aligned}$$
(11)

where \(d=\exp (2+\hbar )\) and \(J\left( \sigma ,\rho \right) \) is the normalizing constant,

$$\begin{aligned} J\left( \sigma ,\rho \right) = \frac{\Gamma \left( \sigma +m\right) }{2\left( (d/4)\left( \rho +k \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}\right) \right) ^{\sigma +m}}, \end{aligned}$$

and \(\Gamma (.)\) is the gamma function.

A \(100(1-\tau )\%\) (\(0<\tau <1\)) two-sided Bayes probability interval (BPI) of H for the limits \(\phi _{L}\) and \(\phi _{U}\) can be established by solving:

$$\begin{aligned} \frac{\tau }{2}=\int _{-\infty }^{\phi _{L}}\pi _{2}\left( H|{\textbf{x}}\right) \textrm{d}H=\varphi _{L},\; \textrm{say}, \end{aligned}$$

and

$$\begin{aligned} \frac{\tau }{2}=\int _{\phi _{U}}^{\infty }\pi _{2}\left( H|{\textbf{x}}\right) \textrm{d}H=\varphi _{U},\; \textrm{say}, \end{aligned}$$

see Martz and Waller (1982, pp. 208-209). After some algebra (substituting \(u=e^{-2H}\) in the above integrals), \(\phi _{L}\) and \(\phi _{U}\) can be obtained by solving:

$$\begin{aligned} \varphi _{L}= \Gamma \left( \sigma +m;\, B\,e^{-2\phi _{L}}\right) =\frac{\tau }{2}, \end{aligned}$$

and

$$\begin{aligned} \varphi _{U}=1-\Gamma \left( \sigma +m;\, B\,e^{-2\phi _{U}}\right) =\frac{\tau }{2}, \end{aligned}$$

where \(B=(d/4)\left( \rho +k \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}\right) \) and \(\Gamma (\epsilon ; z)=\frac{1}{\Gamma (\epsilon )}\int _z^{\infty } u^{\epsilon -1}e^{-u}\,\textrm{d}u\) is the regularized upper incomplete gamma function.

Next, we use the procedure proposed by Berger (1985) to develop a \(100(1-\tau )\%\) highest posterior density (HPD) credible set for H of the form \(CS_{H}=\left\{ H: \pi _{2}\left( H|{\textbf{x}}\right) >C_{\tau }\right\} \), where \(C_{\tau }\) is chosen so that \(P(H\in CS_{H}|{\textbf{x}})=1-\tau \). Based on the posterior density, the \(100(1-\tau )\%\) HPD CrI, \((\phi _{L},\phi _{U})\) of H, must satisfy:

$$\begin{aligned} \int _{\phi _{L}}^{\phi _{U}}\pi _{2}\left( H|{\textbf{x}}\right) \textrm{d}H=1-\tau \quad \textrm{and}\quad \pi _{2}\left( \phi _{L}|{\textbf{x}}\right) =\pi _{2}\left( \phi _{U}|{\textbf{x}}\right) . \end{aligned}$$

Hence, the \(100(1-\tau )\%\) HPD CrI, \((\phi _{L},\phi _{U})\) of H, can be obtained numerically by solving: \(\varphi _{L}+\varphi _{U}=\tau \) and

$$\begin{aligned}\frac{e^{-2\phi _{L}}-e^{-2\phi _{U}}}{\phi _{U}-\phi _{L}}=\frac{2\left( \sigma +m\right) }{(d/4)\left( \rho +k \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}\right) }. \end{aligned}$$

Another quite useful method is Monte Carlo sampling: one generates Markov chain Monte Carlo (MCMC) samples and then computes the BE and the corresponding CrI of H. It should be noted that the posterior density (11) is a modified extreme value (MEV) distribution. Specifically, the posterior pdf of H can be rewritten in the following form:

$$\begin{aligned} H|{\textbf {x}}\sim MEV\left( \sigma +m,\frac{d}{4}\left( \rho +k \sum _{j=1}^{m}(1+R_{j})x_{j}^{2}\right) \right) . \end{aligned}$$

Moreover, to compute the BE as well as the corresponding CrI of H, we apply the following Metropolis-Hastings (M-H) algorithm:

M-H algorithm for estimation

  • Step 1: Generate H from

    $$\begin{aligned} MEV\left( \sigma +m,\frac{d}{4}\left( \rho +k \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}\right) \right) . \end{aligned}$$
  • Step 2: Repeat Step 1 for N times to obtain \(H_{t}\), \(t=1,2,\ldots ,N\).

  • Step 3: To obtain the credible intervals of H, arrange \(H_{t}\) as \(H^{[1]}\), \(H^{[2]}\), \(\ldots \), \(H^{[N]}\).

  • Step 4: The \(100(1-\tau )\%\) symmetric credible interval of H is

    $$\begin{aligned} \left( L^{MC},U^{MC}\right) =\left( H^{[N (\tau /2)]},H^{[N(1-\tau /2)]}\right) . \end{aligned}$$
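The steps above can be sketched as follows. The MEV draw uses the fact that if \(u\sim \)Gamma(shape,1) then \(H=-\tfrac{1}{2}\log (u/B)\) has density proportional to \(\exp (-2\,\textrm{shape}\,H-Be^{-2H})\); the shape and B values below are hypothetical placeholders standing in for the posterior parameters computed from data:

```python
import math, random

def mev_sample(shape, B, n_draws=1000, seed=7):
    """Draws from pi(H) proportional to exp(-2*shape*H - B*exp(-2H)):
    if u ~ Gamma(shape, 1), then H = -(1/2) log(u/B) has this density."""
    rng = random.Random(seed)
    return [-0.5 * math.log(rng.gammavariate(shape, 1.0) / B)
            for _ in range(n_draws)]

def symmetric_credible_interval(draws, tau=0.05):
    """Steps 3-4: order the draws and read off the tau/2 and 1-tau/2 quantiles."""
    hs = sorted(draws)
    n = len(hs)
    return hs[int(n * tau / 2)], hs[int(n * (1 - tau / 2)) - 1]

# Hypothetical posterior shape and scale values for illustration only
shape, B = 8.5, 33.0
lo, hi = symmetric_credible_interval(mev_sample(shape, B, n_draws=5000))
```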

3.2 Bayesian estimation using Gumbel prior

The likelihood function of H can be defined through re-parameterization of the original parameters. This can be achieved by considering a one-to-one transformation from \(\Omega _{\beta }\) onto \(\Omega _{H}\). Therefore, the likelihood function of H based on \({\textbf {x}}\) is

$$\begin{aligned} \Im (H|{\textbf {x}})\propto \exp \left( -2m H-\frac{d k}{4}e^{-2H} \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}\right) , \end{aligned}$$
(12)

where \(d=\exp (2+\hbar )\). From Eq. (12), the MLE of H can be obtained as

$$\begin{aligned} {\tilde{H}}_{ML}=-\frac{1}{2}\ln \left( \frac{4m}{d k \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}}\right) . \end{aligned}$$

It is clear that the above equation is equivalent to Eq. (7). Now, to develop the BE of H, we take the prior density of H to be Gumbel, denoted by Gum(2). It is well known that the random variable H has a Gumbel distribution, Gum(\(\lambda \)), if its pdf is given by

$$\begin{aligned} \pi (H)=\lambda \exp \left( -\lambda (H-\mu )- e^{-\lambda (H-\mu )}\right) , \end{aligned}$$
(13)

where \(-\infty<H<\infty \), \(\mu >0\) is the location parameter and \(\lambda >0\) is the scale parameter. By combining (12) and (13), the posterior density of H is

$$\begin{aligned} \pi _{3}(H|{\textbf {x}})= & {} \frac{1}{J\left( \mu \right) }\;\exp \left( -2(m+1)H-e^{-2H}\left( \frac{d k}{4} \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}+e^{2\mu }\right) \right) , \end{aligned}$$
(14)

where \(J(\mu )\) is the normalizing constant,

$$\begin{aligned} J\left( \mu \right) = \frac{\Gamma \left( m+1\right) }{2\left( (d k/4) \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}+e^{2\mu }\right) ^{m+1}}. \end{aligned}$$

It follows from (14) that the BE of H, under SEL function, is found to be

$$\begin{aligned} {\tilde{H}}_{SE}^{*}= & {} \log \left( (d k/4) \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}+e^{2\mu }\right) ^{1/2}+\frac{\hbar }{2}-\frac{1}{2}HN(m), \end{aligned}$$

where \(HN(m)\equiv \textrm{HarmonicNumber}[m]\) gives the mth harmonic number \(H_{m}\). Similarly, the BE of H based on the LINEX loss function can be expressed as follows:

$$\begin{aligned} {\tilde{H}}_{BL}^{*}= & {} -\frac{1}{c}\ln \left( \frac{\Gamma ^{**}(c)}{\left( (d k/4) \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}+e^{2\mu }\right) ^{c/2}}\right) , \end{aligned}$$

where \(\Gamma ^{**}(c)=\Gamma \left( m+c/2+1\right) /\Gamma \left( m+1\right) \). The BEs of H under the GEL and ABL functions are

$$\begin{aligned} {\tilde{H}}_{AB}^{*}=\frac{\eta \left( b+1\right) }{\eta \left( b\right) }\quad \textrm{and} \quad {\tilde{H}}_{GE}^{*}=\left( \frac{\eta \left( -\pounds \right) }{J\left( \mu \right) }\right) ^{-1/\pounds }, \end{aligned}$$

where

$$\begin{aligned} \eta \left( b\right)= & {} \int _{-\infty }^{\infty }H^{b}\exp \left( -2(m+1)H-e^{-2H}\left( \frac{d k}{4}\sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}+e^{2\mu }\right) \right) \textrm{d}H. \end{aligned}$$

Since the integral in the above equation has no closed form, the BEs can be approximated via numerical methods.

3.2.1 Bayes probability intervals

A \(100(1-\tau )\%\) two-sided BPI of H for the limits \(\phi _{L}^{*}\) and \(\phi _{U}^{*}\) can be obtained by solving:

$$\begin{aligned} \frac{\tau }{2}=\int _{-\infty }^{\phi _{L}^{*}}\pi _{3}\left( H|{\textbf{x}}\right) \textrm{d}H=\varphi _{L}^{*},\; \textrm{say}, \end{aligned}$$

and

$$\begin{aligned} \frac{\tau }{2}=\int _{\phi _{U}^{*}}^{\infty }\pi _{3}\left( H|{\textbf{x}}\right) \textrm{d}H=\varphi _{U}^{*},\; \textrm{say}. \end{aligned}$$

After some algebra (substituting \(u=e^{-2H}\) in the above integrals), we can find \(\phi _{L}^{*}\) and \(\phi _{U}^{*}\) by solving:

$$\begin{aligned} \varphi _{L}^{*}= \Gamma \left( m+1;\, B^{*}e^{-2\phi _{L}^{*}}\right) =\frac{\tau }{2}, \end{aligned}$$

and

$$\begin{aligned} \varphi _{U}^{*}=1-\Gamma \left( m+1;\, B^{*}e^{-2\phi _{U}^{*}}\right) =\frac{\tau }{2}, \end{aligned}$$

respectively, where

$$\begin{aligned} B^{*}=\frac{d k}{4} \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}+e^{2\mu }, \end{aligned}$$

and

$$\begin{aligned} \Gamma (\epsilon ; z)=\frac{1}{\Gamma (\epsilon )}\int _z^{\infty } u^{\epsilon -1}\;e^{-u}\;\textrm{d}u,\quad \epsilon>0,\; z \ge 0, \; (\Gamma (\epsilon ; 0)\equiv 1), \end{aligned}$$

is the regularized upper incomplete gamma function.
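The regularized upper incomplete gamma function \(\Gamma (\epsilon ;z)\) is not in the Python standard library; a sketch of the classical series/continued-fraction evaluation (cf. the gser/gcf routines of Numerical Recipes):

```python
import math

def reg_upper_gamma(eps, z, tol=1e-12, itmax=300):
    """Regularized upper incomplete gamma, using the classical
    series (z < eps+1) / continued-fraction (z >= eps+1) split."""
    if z <= 0:
        return 1.0
    prefac = math.exp(-z + eps * math.log(z) - math.lgamma(eps))
    if z < eps + 1:
        # power series for the regularized lower incomplete gamma P(eps, z)
        term = 1.0 / eps
        total = term
        a = eps
        for _ in range(itmax):
            a += 1
            term *= z / a
            total += term
            if abs(term) < abs(total) * tol:
                break
        return 1.0 - total * prefac
    # modified Lentz continued fraction for Q(eps, z)
    tiny = 1e-300
    b = z + 1.0 - eps
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, itmax):
        an = -i * (i - eps)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < tol:
            break
    return h * prefac
```

As checks, \(\Gamma (1;z)=e^{-z}\) and \(\Gamma (1/2;z)=\textrm{erfc}(\sqrt{z})\).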

Based on posterior density (14), the \(100(1-\tau )\%\) HPD CrI, \((\phi _{L}^{*},\phi _{U}^{*})\) of H, must satisfy:

$$\begin{aligned}{} & {} \int _{\phi _{L}^{*}}^{\phi _{U}^{*}}\pi _{3}\left( H|{\textbf{x}}\right) \textrm{d}H=1-\tau \quad \text { and } \quad \pi _{3}\left( \phi _{L}^{*}|{\textbf{x}}\right) =\pi _{3}\left( \phi _{U}^{*}|{\textbf{x}}\right) . \end{aligned}$$
(15)

After some algebra, the equations in (15) become

$$\begin{aligned} \varphi _{L}^{*}+\varphi _{U}^{*}=\tau \quad \textrm{and} \quad \frac{e^{-2\phi _{L}^{*}}-e^{-2\phi _{U}^{*}}}{\phi _{U}^{*}-\phi _{L}^{*}}=\frac{2\left( m+1\right) }{(d k/4) \sum _{j=1}^{m}(R_{j}+1)x_{j}^{2}+e^{2\mu }}. \end{aligned}$$

By the simultaneous solution of the above equations, the values of \(\phi _{L}^{*}\) and \(\phi _{U}^{*}\) can be obtained numerically.
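As an alternative to solving these equations directly, the HPD interval can be approximated from posterior draws as the shortest interval containing \(100(1-\tau )\%\) of them (the Chen-Shao approach). A sketch for the Gumbel-prior posterior (14), with a made-up sample and an illustrative \(\mu \):

```python
import math, random

def hpd_from_draws(draws, tau=0.05):
    """Chen-Shao style HPD: the shortest interval containing 100(1-tau)% of draws."""
    hs = sorted(draws)
    n = len(hs)
    keep = int(n * (1 - tau))
    best = min(range(n - keep), key=lambda i: hs[i + keep] - hs[i])
    return hs[best], hs[best + keep]

def posterior_draws_gumbel(x, R, k, mu, n_draws=5000, seed=11):
    """Draws from pi_3(H|x): u ~ Gamma(m+1, 1), then H = -(1/2) log(u/B*)."""
    rng = random.Random(seed)
    m = len(x)
    d = math.exp(2 + 0.577216)
    B = (d * k / 4) * sum((r + 1) * xi**2 for xi, r in zip(x, R)) + math.exp(2 * mu)
    return [-0.5 * math.log(rng.gammavariate(m + 1, 1.0) / B)
            for _ in range(n_draws)]

draws = posterior_draws_gumbel([0.21, 0.35, 0.48, 0.66, 0.81],
                               [1, 0, 0, 0, 2], 3, mu=0.446)
lo_hpd, hi_hpd = hpd_from_draws(draws)
```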

4 Numerical simulation study

Here, we carry out numerical computations to examine and compare the performance of the estimators of the entropy based on PFFC samples from the Rayleigh distribution. In each case, the MLEs and the BEs using the SEL, ABL and GEL functions are computed and compared in terms of the mean square error (MSE), for the following sampling CSs:

  • CS-I: \(R_{m}=n-m\), \(R_{j}=0\) for \(j\ne m\).

  • CS-II: \(R_{1}=n-m\), \(R_{j}=0\) for \(j\ne 1\).

  • CS-III: \(R_{m/2}=n-m\), \(R_{j}=0\) for \(j\ne m/2\), if m is even, \(R_{(m+1)/2}=n-m\), \(R_{j}=0\) for \(j\ne (m+1)/2\), if m is odd.

Based on these CSs, we replicate the process 5000 times and then compute the corresponding MSEs as well as the average confidence interval lengths (ALs) and coverage probabilities (CPs) of the LN \(95\%\) CIs and of the BPI, HPD and MCMC credible intervals. For the MCMC method, we compute the intervals based on 1000 MCMC samples and discard the first 100 values as 'burn-in'. For computing the BEs, the hyperparameter values \((\sigma ,\rho )=(3.5,1.5)\) and \(\mu =0.446\) are chosen, which allows us to generate the value \(\beta =0.735\). Based on this generated value of \(\beta \), the true value of H is 0.634. For the different tests, we considered the sample sizes \(n(m)=24(21)\) (small), \(n(m)=35(30)\) (moderate) and \(n(m)=80(40,60,70)\) (large) with group sizes \(k=1, 3\). The MSEs of the simulated estimates of H are displayed in Table 1, and the ALs and simulated CPs of H(f) are displayed in Table 2.
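The simulation loop can be sketched as follows. PFFC samples are generated by applying the Balakrishnan-Sandhu uniform-spacings algorithm to the first-failure distribution \(F_{1:k}(x)=1-(1-F(x))^{k}\), whose Rayleigh inverse is \(x=\beta \sqrt{-2\log (1-u)/k}\); the scheme shown is CS-I with \(n=24\), \(m=21\), \(k=3\), and the replication count is reduced for speed:

```python
import math, random

def pffc_sample(beta, R, k, rng):
    """Generate a PFFC sample from Rayleigh(beta) under scheme R with group size k:
    PFFC order statistics from F are progressive type-II order statistics from
    F_{1:k}(x) = 1 - (1 - F(x))^k (Balakrishnan-Sandhu uniform-spacings method)."""
    m = len(R)
    e = [j + sum(R[m - j:]) for j in range(1, m + 1)]   # e_m equals n
    v = [rng.random() ** (1.0 / ej) for ej in e]
    xs, prod = [], 1.0
    for i in range(1, m + 1):
        prod *= v[m - i]                 # multiply V_m, V_{m-1}, ... in turn
        u = 1.0 - prod                   # i-th progressive uniform order statistic
        xs.append(beta * math.sqrt(-2.0 * math.log(1.0 - u) / k))
    return xs

def simulate_mse(beta, R, k, reps=3000, seed=3):
    """Monte Carlo MSE of the entropy MLE (Eq. (7)) over repeated PFFC samples."""
    rng = random.Random(seed)
    m = len(R)
    h_true = 1 + math.log(beta / math.sqrt(2)) + 0.577216 / 2
    sq = 0.0
    for _ in range(reps):
        x = pffc_sample(beta, R, k, rng)
        b2 = k * sum((r + 1) * xi**2 for xi, r in zip(x, R)) / (2 * m)
        sq += (1 + 0.5 * math.log(b2 / 2) + 0.577216 / 2 - h_true) ** 2
    return sq / reps

# CS-I with n = 24, m = 21, k = 3: R = (0, ..., 0, n - m)
mse_cs1 = simulate_mse(0.735, [0] * 20 + [3], 3)
```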

Now, we compare the various loss functions by using the optimal estimates of the entropy \(H_{\eta }(f)\) based on the optimal choice of \(\eta =(c,b,\pounds )\). For simplicity of discussion, we want to determine the value of \(\eta \) that gives the best estimate of the entropy. Thus, we choose the optimal value of \(\eta \), \(\eta _{_{OP}}\), as listed below

$$\begin{aligned} \eta _{_{OP}}= \left\{ \eta |\min _{\eta } |{\tilde{H}}_{\eta }(f)-H_{t}(f)|\right\} , \end{aligned}$$

where \(H_{t}(f)\) is the true value of the entropy H. Here, we use the NMinimize function of Mathematica 11 to obtain the value \(\eta _{_{OP}}\) which minimizes \(Abs({\tilde{H}}(\eta )-H_{t})\): \(\eta _{_{OP}}= \eta /.\textrm{Last}[\textrm{NMinimize}[Abs({\tilde{H}}_{\eta }(f)-H_{t}(f)),\eta ]]\). The MSEs of the different estimators of the entropy with respect to both the SRIG prior and the Gumbel prior under the different loss functions are summarized in Figure 1, based on the optimal choice of \(\eta =(c,b,\pounds )\). Visually, it is evident that the BEs under the ABL function based on the SRIG prior perform better than the corresponding BEs in the other cases.

Fig. 1

Results of MSEs of the BEs with respect to both the SRIG prior and the Gumbel prior based on the optimal choice of \(\eta =(c,b,\pounds )\) and different CSs

5 Data analysis (COVID-19 mortality rates from Mexico)

Now, we analyze a real-life data set fitted by the Rayleigh distribution to illustrate the methods of estimation developed here. The considered data set is taken from ”https://covid19.who.int/” and represents COVID-19 mortality rates in Mexico over 108 days. The data were recorded from 4 March to 20 July 2020 and are presented in Table 3. Before progressing further, we check whether the Rayleigh model is suitable for analyzing these data by using the Kolmogorov-Smirnov distance and the Kaplan-Meier estimator (see Kaplan and Meier (1958)). We test the null hypothesis

$$\begin{aligned} H_0: F(x) \sim \text{ Rayleigh }\;\text{ model }\quad \mathrm {vs.} \quad H_1: F(x) \not \sim \text{ Rayleigh }\;\text{ model }. \end{aligned}$$

Table 4 presents the Cramer-von Mises, Anderson-Darling, Kolmogorov-Smirnov and Pearson \(\chi ^{2}\) tests with the corresponding p-values. We reject \(H_0\) if the p-value \(<\tau \) \((\tau = 0.05)\). Based on the p-values, one cannot rule out the possibility that the data came from the Rayleigh distribution. Furthermore, Figure 2 shows the fitted and empirical survival functions as well as the P-P plot of the Kaplan-Meier estimator and the Q-Q plot. Visually, the depicted points for the fitted Rayleigh survival function lie near the 45° line, indicating a good fit. For further confirmation, the correlation coefficient between the observed data and the corresponding expected values of the underlying model is also reported in Table 4. These results indicate that the Rayleigh distribution fits the data set quite well.
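As a sketch of the Kolmogorov-Smirnov check (with a made-up complete sample rather than the Table 3 data, and the complete-sample Rayleigh MLE \({\hat{\beta }}^{2}=\sum x_{i}^{2}/(2n)\), the \(k=1\), uncensored special case of (6)):

```python
import math

def ks_distance(data, beta):
    """Kolmogorov-Smirnov distance between the empirical CDF and Rayleigh(beta)."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        Fx = 1 - math.exp(-x**2 / (2 * beta**2))
        # compare against the empirical CDF just before and just after the jump
        d = max(d, abs(Fx - i / n), abs(Fx - (i + 1) / n))
    return d

# Made-up sample; beta_hat is the complete-sample Rayleigh MLE
data = [1.2, 2.4, 3.1, 3.5, 4.0, 4.4, 5.0, 5.9, 6.3, 7.1]
beta_hat = math.sqrt(sum(x**2 for x in data) / (2 * len(data)))
D_ks = ks_distance(data, beta_hat)
```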

To illustrate the inferential methods developed in this paper, after ordering the data, the Mexico mortality data are randomly divided into 36 groups with \(k=3\) items in each group. Suppose that a pre-determined PFFC plan is applied using three different progressive censoring schemes, see Table 5. Nine groups are censored in this example, and 27 first failures are observed. Using Table 5, the estimates of the entropy function are given in Tables 6 and 7. Moreover, the lower (L) and upper (U) \(95\%\) CI limits for H(f) are displayed in Table 8.

Fig. 2

The empirical\(^{(a)}\), and P-P plot\(^{(b)}\) and QQ-plot\(^{(c)}\) of Kaplan-Meier estimator of Rayleigh distribution for COVID-19 data of Mexico

Table 1 MSEs for the simulated estimates of H(f), with \((\sigma ,\rho ,\mu )= (3.5,1.5,0.446 )\)
Table 2 ALs and CPs of \(95\%\) CIs for the simulated estimates of H(f)
Table 3 Real data set
Table 4 Correlation coefficient, MLEs and goodness-of-fit test statistics
Table 5 Three different PFFC data sets

6 Conclusions

Here, we have addressed the problem of estimating the entropy under PFFC Rayleigh data. The MLEs, BEs and the corresponding CIs and CrIs of the entropy have been obtained. The BEs of the entropy under the SEL, LINEX, ABL and GEL functions are developed and compared with the MLEs in the sense of MSE. Our findings are applied and illustrated by a simulation study, for different choices of \((n, m, k)\) and different CSs, and by a real-life data set. From the results presented earlier, the following remarks can be made:

  1. From the results obtained in this article, we obtain the following special cases: (a) Setting \(n=m\), \(k=1\) and \(R_{j}=0\), \(j=1,2,\ldots ,m\), we get the result for the complete sample case. (b) Setting \(n=m\), \(k\ne 1\) and \(R_{j}=0\), \(j=1,2,\ldots ,m\), we get the result for the first-failure censored sample. (c) Setting \(R_{m}=n-m\), \(k\ne 1\) and \(R_{j}=0\), \(j=1,2,\ldots ,m-1\), we get the result for the type-II first-failure censored sample. (d) If \(k=1\), then we obtain the result for the progressive type-II censored sample. (e) If \(R_{m}=n-m\), \(k=1\) and \(R_{j}=0\), \(j=1,2,\ldots ,m-1\), then we obtain the result for the usual type-II censored sample case.

  2. According to Table 1 and Figure 1, we note the following observations:

    (a) It is seen that the BEs of the entropy based on PFFC data under the SEL function perform well when compared with the MLEs.

    (b) Based on the SEL function, it can be observed that the BEs under the SRIG prior have smaller MSEs than the corresponding BEs under the Gumbel prior.

    (c) For fixed values of n and m, we found that the results are not sensitive to the CSs or to k. Also, for fixed values of k and for the selected censoring schemes, we found that the results are relatively sensitive to n and m.

    (d) From Figure 1, it is noticed that the MSEs of the BEs under the ABL function are smaller than those of the corresponding estimators under the other loss functions; hence, it is recommended to choose the ABL function for estimating the parameters within the Bayesian approach. This shows the importance of adopting different loss functions when developing the BEs.

  3. Table 2 shows that:

    (a) In general, the ALs and CPs for the BPIs are close to those of the MCMC intervals in most cases.

    (b) In all cases, for fixed values of n and m, the ALs and CPs are not sensitive to the CSs or to k.

    (c) It is observed that the HPD credible intervals perform well when compared to the approximate CIs based on the MLEs or to the other credible intervals, for different censoring schemes. Moreover, under the SRIG and Gumbel prior cases, the ALs of all the estimators generally have the following order:

      $$\begin{aligned} AL_{HPD}<AL_{MCMC}<AL_{BPI}<AL_{LN}. \end{aligned}$$
    (d) As expected, all the estimators and ALs improve as n and m increase. It is also observed that the ALs based on the BPIs and MCMC are better under the SRIG prior than under the Gumbel prior.

Table 6 Estimates of H(f), with \((\sigma ,\rho ,\mu )= (3.5,1.5,0.446 )\)
Table 7 Estimates of H(f), with \((\sigma ,\rho ,\mu )= (3.5,1.5,0.446 )\)
Table 8 The lower (L) and the upper (U) of the \(95\%\) CIs for H(f)