1 Introduction

The exponential distribution has been used quite effectively to analyze lifetime data, mainly due to its analytical tractability. Although the one-parameter exponential distribution has several interesting properties, such as the lack of memory property, one of its major disadvantages is that it has a constant hazard function. Moreover, the probability density function (PDF) of the exponential distribution is always a decreasing function. For these reasons several generalizations of the exponential distribution have been suggested in the literature. For example, the Weibull, gamma and generalized exponential (GE) distributions, the last considered by Gupta and Kundu [11], are different extensions of the exponential distribution, and all of them contain the exponential distribution as a special case. All three distributions can have decreasing or unimodal PDFs and monotone hazard functions. Unfortunately, none of them can have a non-monotone hazard function. In many practical situations one might observe non-monotone hazard functions, and clearly in those cases none of these distributions can be used.

In the last decade, Marshall and Olkin [16] introduced a general method of adding a shape parameter in order to expand a family of distributions. They applied their method to the one-parameter exponential distribution and obtained a two-parameter extension of it. They also indicated that the method could be applied to the two-parameter Weibull distribution, but did not pursue it further.

The main aim of this paper is to apply the Marshall-Olkin method to the two-parameter generalized exponential distribution. We introduce a new distribution function, for \(\alpha > 0\), \(\lambda > 0\), \(\theta > 0\),

$$\begin{aligned} G(x;\lambda ,\alpha ,\theta )=\frac{(1-e^{-\lambda x})^\alpha }{\theta +(1-\theta )(1-e^{-\lambda x})^\alpha }, \quad \hbox {for} \quad x > 0, \end{aligned}$$
(1)

and 0 otherwise. Clearly, (1) is a proper distribution function, and it generalizes the generalized exponential distribution. From now on a random variable \(X\) with the distribution function (1) will be denoted by MOGE\((\alpha , \lambda , \theta )\).

It may be observed that several special cases can be obtained from (1). For example, if we set \(\theta =1\) in (1), then we obtain the generalized exponential distribution as introduced by Gupta and Kundu [11]. It will be denoted by GE\((\alpha , \lambda )\). For \(\alpha =1\) we obtain the Marshall-Olkin exponential distribution introduced by Marshall and Olkin [16]. For \(\alpha =1\) and \(\theta =1\) we obtain the exponential distribution with parameter \(\lambda \). Now we provide some physical justification of the proposed model, see also Marshall and Olkin [16] in this respect.
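
To make the model concrete, the following Python sketch (not part of the original development; the function names and the use of NumPy are ours) evaluates the distribution function (1) and generates random variates by inverting (1) in closed form, which also illustrates the later remark that simulation from the MOGE distribution is straightforward.

```python
import numpy as np

def moge_cdf(x, alpha, lam, theta):
    """Distribution function (1) of MOGE(alpha, lambda, theta)."""
    u = (1.0 - np.exp(-lam * np.asarray(x))) ** alpha
    return u / (theta + (1.0 - theta) * u)

def moge_quantile(p, alpha, lam, theta):
    """Closed-form inverse of (1): solve (1 - e^{-lam x})^alpha = theta p / (1 - (1 - theta) p)."""
    p = np.asarray(p)
    v = theta * p / (1.0 - (1.0 - theta) * p)
    return -np.log(1.0 - v ** (1.0 / alpha)) / lam

def moge_rvs(n, alpha, lam, theta, seed=None):
    """Inverse-transform sampling from MOGE(alpha, lambda, theta)."""
    rng = np.random.default_rng(seed)
    return moge_quantile(rng.uniform(size=n), alpha, lam, theta)

# sanity check: theta = 1 reduces (1) to the GE(alpha, lambda) distribution function
assert abs(moge_cdf(1.3, 2.0, 1.0, 1.0) - (1.0 - np.exp(-1.3)) ** 2.0) < 1e-12
```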

First, let us consider a series system with \(N\) independent components. Suppose that a random variable \(N\) has the probability mass function \(P(N=n)=\theta (1-\theta )^{n-1}\), \(n=1,2,\ldots \) and \(0<\theta <1\). Let \(X_1\), \(X_2\), \(\ldots \) represent the lifetimes of each component and suppose they are independent and identically distributed (i.i.d.) GE random variables with parameters \(\lambda \) and \(\alpha \). Then a random variable \(Y=\min (X_1,\ldots ,X_N)\) represents the time to the first failure with distribution function

$$\begin{aligned} G(x)&= 1-\sum _{n=1}^\infty P(\min (X_1,\ldots ,X_n)>x)\theta (1-\theta )^{n-1} \\&= 1-\theta (1-(1-e^{-\lambda x})^\alpha )\sum _{n=1}^\infty \left( (1-\theta )(1-(1-e^{-\lambda x})^\alpha )\right) ^{n-1} \\&= \frac{(1-e^{-\lambda x})^\alpha }{\theta +(1-\theta )(1-e^{-\lambda x})^\alpha }, \quad \hbox {for} \quad x > 0. \end{aligned}$$

Thus we obtain the distribution function given by (1).

Second, let us now consider a parallel system with \(N\) independent components and suppose that the random variable \(N\) has the probability mass function \(P(N=n)=\theta ^{-1}(1-\theta ^{-1})^{n-1}\), \(n=1,2,\ldots \) and \(\theta >1\). Let \(X_1\), \(X_2\), \(\ldots \) represent the lifetimes of the components and suppose they are i.i.d. GE random variables with parameters \(\lambda \) and \(\alpha \). Then the random variable \(Z=\max (X_1,\ldots ,X_N)\) represents the lifetime of the system, and the distribution function of \(Z\) is again given by (1).
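
A quick Monte Carlo check of these constructions is sketched below (ours; the sample size and parameter values are arbitrary). For \(0<\theta <1\) it compares the empirical distribution function of \(Y=\min (X_1,\ldots ,X_N)\), with \(N\) geometric and the \(X_i\) i.i.d. GE, against (1).

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, lam, theta = 2.0, 1.0, 0.3            # 0 < theta < 1: geometric-minimum construction

def ge_rvs(size):
    # inverse-transform sampling from GE(alpha, lambda)
    return -np.log(1.0 - rng.uniform(size=size) ** (1.0 / alpha)) / lam

def moge_cdf(x):
    u = (1.0 - np.exp(-lam * x)) ** alpha
    return u / (theta + (1.0 - theta) * u)

# N has P(N = n) = theta (1 - theta)^(n-1), n = 1, 2, ...; Y = min(X_1, ..., X_N)
N = rng.geometric(theta, size=100_000)
Y = np.array([ge_rvs(n).min() for n in N])

for x in (0.5, 1.0, 2.0):
    print(x, (Y <= x).mean(), moge_cdf(x))   # empirical vs. theoretical CDF, should be close
```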

Third, let \(\theta >{1}/{2}\). Using the series expansion

$$\begin{aligned} (\theta +(1-\theta )(1-e^{-\lambda x})^\alpha )^{-1} =\theta ^{-1}\sum _{k=0}^\infty \left( 1-{\theta ^{-1}}\right) ^k (1-e^{-\lambda x})^{\alpha k}, \end{aligned}$$

we obtain that the distribution function (1) can be rewritten as

$$\begin{aligned} G(x;\lambda ,\alpha ,\theta )=\theta ^{-1}\sum \limits _{k=0}^\infty (1-\theta ^{-1})^k F_{GE(\alpha (k+1),\lambda )}(x). \end{aligned}$$

On the other hand, if \(0<\theta <2\), by using the series expansion

$$\begin{aligned} (\theta +(1-\theta )(1-e^{-\lambda x})^\alpha )^{-1}=\sum _{k=0}^\infty \left( 1-{\theta }\right) ^k \left( 1-(1-e^{-\lambda x})^{\alpha }\right) ^k, \end{aligned}$$

we obtain that (1) can be rewritten as

$$\begin{aligned} G(x;\lambda ,\alpha ,\theta )=\sum _{k=0}^\infty (1-\theta )^k \sum _{j=0}^k {k\atopwithdelims ()j}(-1)^j F_{GE(\alpha (j+1),\lambda )}(x). \end{aligned}$$

Thus it follows that the distribution function (1) can be represented as a generalized mixture of generalized exponential distribution functions. It may be mentioned that generalized mixture distributions have received some attention recently; see for example [9]. Since negative weights are also allowed, a generalized mixture is more flexible than an ordinary mixture model.

We call this new three-parameter extension of the GE distribution the Marshall-Olkin generalized exponential (MOGE) distribution. As expected, this new three-parameter distribution has two shape parameters and one scale parameter. It is observed that the proposed MOGE distribution can have decreasing or unimodal PDFs. It is interesting to observe that the hazard function can take four different major shapes: it can be increasing, decreasing, bathtub shaped or inverted bathtub shaped. Therefore it can be used quite extensively to analyze lifetime data. Since it has only three unknown parameters, the estimation of the unknown parameters is also not very difficult. It may be mentioned that not many three-parameter distributions can exhibit all of these hazard shapes; therefore, the introduction of the proposed three-parameter MOGE distribution will be quite useful. Moreover, since the MOGE distribution has a distribution function in compact form, it can be used very effectively to analyze censored data, and generation from a MOGE distribution is also very straightforward.

We have derived several properties of the MOGE distribution. The PDF of the proposed MOGE distribution is either a decreasing or a unimodal function. Interestingly, because of the introduction of the new shape parameter, the MOGE distribution can have an increasing, decreasing, unimodal or bathtub shaped hazard function. The median and mode can be obtained in explicit forms. The moments cannot be obtained explicitly; we have obtained them in terms of infinite series. A small table is provided indicating the first four moments of the MOGE distribution for different values of the shape parameters. We have obtained the density function of the \(i\)-th order statistic, and it is observed that it can be represented as an infinite mixture of beta generalized exponential density functions. We have also provided the Rényi entropy, which measures the uncertainty of variation. Since the MOGE distribution can be obtained as a geometric maximum or minimum of i.i.d. GE random variables, several ordering properties can be easily established.

The maximum likelihood estimators (MLEs) cannot be obtained in explicit form; a three-dimensional optimization procedure is needed to compute them. We propose to use the EM algorithm, see [8], to compute the MLEs of the unknown parameters. Two data analyses are performed for illustrative purposes.

The paper is organized as follows. In Sect. 2 we derive the probability density function and discuss its shapes. The hazard function is considered in Sect. 3. In Sect. 4 we give some expressions for the moments. The order statistics and the limiting distributions of the sample extremes are considered in Sect. 5. In Sect. 6 we derive two entropies, the Rényi and the Shannon entropy. We derive several ordering relations of the MOGE distribution in Sect. 7. The maximum likelihood estimation and an EM algorithm are provided in Sect. 8. Analyses of two data sets are provided in Sect. 9, and finally we conclude the paper in Sect. 10.

2 The probability density function

If the random variable \(X\) has the distribution function (1), the corresponding probability density function (PDF), for \(\alpha > 0\), \(\lambda > 0\) and \(\theta > 0\), is

$$\begin{aligned} g(x;\alpha ,\lambda ,\theta ) =\frac{\alpha \lambda \theta e^{-\lambda x} (1-e^{-\lambda x})^{\alpha -1}}{(\theta +(1-\theta )(1-e^{-\lambda x})^\alpha )^2}, \quad x > 0, \end{aligned}$$
(2)

see also [3]. Suppose that \(0<\theta <2\). Then the reciprocal of the squared denominator in (2) can be expanded as

$$\begin{aligned} (\theta +(1-\theta )(1-e^{-\lambda x})^\alpha )^{-2}&= \sum _{k=0}^\infty (k+1) (1-\theta )^k (1-(1-e^{-\lambda x})^\alpha )^k\\&= \sum _{k=0}^\infty (k+1) (1-\theta )^k \sum _{j=0}^k {k\atopwithdelims ()j}(-1)^j (1-e^{-\lambda x})^{\alpha j}. \end{aligned}$$

Using this result, we obtain that the pdf given by (2) can be expressed in the generalized mixture form as

$$\begin{aligned} g(x;\alpha ,\lambda ,\theta )=\theta \sum _{k=0}^\infty \sum _{j=0}^k (-1)^j (1-\theta )^k {k+1\atopwithdelims ()j+1} f_{GE(\alpha (j+1),\lambda )}(x), \end{aligned}$$
(3)

where \(f_{GE(\alpha (j+1),\lambda )}(x)\) denotes the pdf of a random variable with generalized exponential distribution with parameters \(\alpha (j+1)\) and \(\lambda \), see also [3]. Note that the density \(g(x; \alpha , \lambda , \theta )\) can be represented in the generalized mixture form of beta generalized exponential probability density functions, Barreto-Souza et al. [4], as \(\displaystyle g(x;\alpha ,\lambda ,\theta )=\theta \sum \limits _{k=0}^\infty (1-\theta )^k f_{BGE(1,k+1,\lambda ,\alpha )}(x)\).

Similarly, if \(\theta >1/2\) and using the expansion

$$\begin{aligned} (\theta +(1-\theta )(1-e^{-\lambda x})^\alpha )^{-2}=\theta ^{-2}\sum _{k=0}^\infty (k+1) \left( 1-\frac{1}{\theta }\right) ^k (1-e^{-\lambda x})^{\alpha k}, \end{aligned}$$

we obtain the expression

$$\begin{aligned} g(x;\alpha ,\lambda ,\theta ) =\theta ^{-1}\sum _{k=0}^\infty \left( 1-\frac{1}{\theta }\right) ^k f_{GE(\alpha (k+1),\lambda )}(x). \end{aligned}$$
(4)
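
The generalized mixture representations (3) and (4) are easy to verify numerically. The sketch below (ours; the truncation point and the parameter values are illustrative) compares a truncated version of (4), valid for \(\theta >1/2\), with a direct evaluation of (2).

```python
import numpy as np

alpha, lam, theta = 2.0, 1.0, 1.5            # theta > 1/2, so (4) applies
x = 0.7

def ge_pdf(x, a):
    # PDF of GE(a, lambda)
    return a * lam * np.exp(-lam * x) * (1.0 - np.exp(-lam * x)) ** (a - 1.0)

# direct evaluation of (2)
u = (1.0 - np.exp(-lam * x)) ** alpha
g_direct = (alpha * lam * theta * np.exp(-lam * x)
            * (1.0 - np.exp(-lam * x)) ** (alpha - 1.0)
            / (theta + (1.0 - theta) * u) ** 2)

# truncated generalized mixture (4)
g_series = sum((1.0 - 1.0 / theta) ** k * ge_pdf(x, alpha * (k + 1))
               for k in range(200)) / theta

print(g_direct, g_series)                    # the two evaluations agree
```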

Let us consider the shape of the PDF of MOGE distribution. Since \(\lambda \) is the scale parameter, the shape of the PDF of MOGE distribution does not depend on \(\lambda \). It can be easily shown that for (i) \(0 < \alpha \le 1\) and \(0 < \theta \le 1\), the PDF of MOGE decreases with \(g(0)=\infty \) and \(g(\infty )=0\), (ii) \(0 < \alpha \le 1\) and \(\theta > 1\), for some \(x_1 < x_2\), the probability density function \(g(x; \alpha , \theta )\) decreases on \((0,x_1)\cup (x_2,\infty )\) and increases on \([x_1,x_2]\). Furthermore, \(g(0)=\infty \) and \(g(\infty )=0\). (iii) For \(\alpha >1\), it follows that the PDF \(g(x; \alpha , \theta )\) has a single mode and \(g(0)=g(\infty )=0\).

We can conclude that the shape of the PDF of the MOGE distribution differs from the shape of the PDF of the GE distribution. The PDF of the GE distribution is a decreasing function for \(0<\alpha <1\), while for \(\alpha >1\) it is unimodal. Some possible shapes of the probability density function \(g(x; \alpha , \theta )\) are presented in Fig. 1.

Fig. 1
figure 1

The PDF of the MOGE distribution for different values of \(\alpha \) and \(\theta \) when \(\lambda \) \(=\) 1. (i) \(\alpha = 0.8\), \(\theta = 2.0\), (ii) \(\alpha = 0.4\), \(\theta = 4.0\), (iii) \(\alpha = 2.0\), \(\theta = 2.0\)

3 The hazard rate function

Now we study the shapes of the hazard function of MOGE distribution for different values of \(\alpha \) and \(\theta \). Since \(\lambda \) is the scale parameter, the shape of the hazard function does not depend on \(\lambda \). So without loss of generality we assume that \(\lambda = 1\). Therefore, the hazard function of the MOGE is of the form

$$\begin{aligned} h(x; \alpha , \theta ) = \frac{\alpha e^{-x}(1-e^{-x})^{\alpha -1}}{(\theta +(1-\theta )(1-e^{-x})^\alpha ) (1-(1-e^{-x})^\alpha )}, \quad \hbox {for} \quad x > 0. \end{aligned}$$

Since the shape of \(h(x; \alpha , \theta )\) is the same as the shape of \(\ln h(x; \alpha , \theta )\), we study the shape of \(\ln h(x; \alpha , \theta )\) only. The first derivative of \(\ln h(x; \alpha , \theta )\) is

$$\begin{aligned} \frac{d \ln h(x; \alpha , \theta )}{dx} =\frac{s(x)}{(1-e^{-x})(\theta +(1-\theta )(1-e^{-x})^\alpha ) (1-(1-e^{-x})^\alpha )}, \end{aligned}$$

where

$$\begin{aligned} s(x)=-\theta +(2\theta -1)(1-e^{-x})^\alpha + \alpha \theta e^{-x}+(1-\theta )(1-e^{-x})^{2\alpha }(1+\alpha e^{-x})\,. \end{aligned}$$

Four shapes of the hazard rate function are possible:

  • If \(0<\alpha <1\) and \(0<\theta <(1+\alpha ){/}(2\alpha )\), then the function \(s(x)\) is negative for \(x>0\) and it follows that the hazard function is a decreasing function with \(h(0)=\infty \) and \(h(\infty )= 1\).

  • If \(0<\alpha <1\) and \(\theta >(1+\alpha ){/}(2\alpha )\), then the function \(s\) has one root \(x_0\) with \(s(0)=\theta (\alpha -1)<0\) and \(s(\infty )=0\). Thus we obtain that the hazard rate function \(h(x)\) decreases on \((0,x_0)\) and increases on \((x_0,\infty )\) with \(h(0)=\infty \) and \(h(\infty )= 1\).

  • If \(\alpha > 1\) and \(\theta >(1+\alpha ){/}(2\alpha )\), then the function \(s(x)\) is positive for \(x>0\) and it follows that the hazard function is an increasing function with \(h(0)=0\) and \(h(\infty )= 1\).

  • If \(\alpha >1\) and \(0<\theta <(1+\alpha ){/}(2\alpha )\), then the function \(s(x)\) has one root \(x_0\) with \(s(0)=\theta (\alpha -1)>0\) and \(s(\infty )=0\). Thus we obtain that the hazard function \(h(x)\) increases on \((0,x_0)\) and decreases on \((x_0,\infty )\) with \(h(0)=0\) and \(h(\infty )= 1\).

In comparison with the hazard rate function of the Weibull, gamma or GE distributions, the hazard rate function of the proposed MOGE distribution has two more possible shapes. Therefore it becomes more flexible for analyzing lifetime data. Some possible shapes of the hazard function \(h(x; \alpha , \theta )\) for different values of \(\alpha \) and \(\theta \), are presented in Fig. 2.
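
The four regimes can also be recovered numerically from the sign pattern of \(s(x)\). The following sketch (ours; the grid, its upper limit and the tolerance are pragmatic choices, not part of the analysis above) classifies the hazard shape for \(\lambda = 1\).

```python
import numpy as np

def hazard_shape(alpha, theta, x_max=10.0, n=2000, tol=1e-12):
    """Classify the MOGE hazard shape (lambda = 1) from the sign pattern of s(x)."""
    x = np.linspace(1e-4, x_max, n)
    e = np.exp(-x)
    u = 1.0 - e
    s = (-theta + (2.0 * theta - 1.0) * u ** alpha + alpha * theta * e
         + (1.0 - theta) * u ** (2.0 * alpha) * (1.0 + alpha * e))
    neg, pos = np.any(s < -tol), np.any(s > tol)
    if neg and not pos:
        return "decreasing"
    if pos and not neg:
        return "increasing"
    # one sign change: bathtub if s starts negative, upside-down bathtub otherwise
    return "bathtub" if s[np.abs(s) > tol][0] < 0.0 else "upside-down bathtub"

# the four regimes listed above
for a, t in [(0.5, 0.5), (0.5, 2.0), (1.5, 2.0), (1.5, 0.5)]:
    print(a, t, hazard_shape(a, t))
```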

Fig. 2
figure 2

The hazard function of the MOGE distribution for different values of \(\alpha \) and \(\theta \) when \(\lambda = 1\). (i) \(\alpha = 0.5\), \(\theta = 0.5\), (ii) \(\alpha = 0.5\), \(\theta = 2.0\), (iii) \(\alpha = 1.5\), \(\theta = 0.5\), (iv) \(\alpha = 1.5\), \(\theta = 2.0\)

Let us derive now the reverse hazard rate function. As was noted in [19], the reverse hazard rate function is useful in constructing the information matrix and in estimating the survival function for censored data. The reverse hazard function of MOGE distribution is given as

$$\begin{aligned} r(x)=\frac{g(x)}{G(x)} =\frac{\alpha \theta e^{- x}}{(1-e^{-x})(\theta +(1-\theta )(1-e^{-x})^\alpha )},\quad x,\alpha ,\lambda ,\theta >0. \end{aligned}$$

The reverse hazard rate function decreases on \((0,\infty )\) with \(r(0)=\infty \) and \(r(\infty )=0\). Note that for \(\theta \ne 1\) the reverse hazard function is not a linear function of \(\alpha \), unlike the reverse hazard function of the GE distribution.

4 Moments

In this section we derive the \(n\)-th moment of a random variable \(X \sim \) MOGE\((\alpha , \lambda , \theta )\). Let \(Y_{\alpha ,\lambda } \sim \) GE\((\alpha , \lambda )\). We first consider the case \(0<\theta <2\). Using (3), we obtain the \(n\)-th moment of \(X\) as

$$\begin{aligned} E(X^n)=\theta \sum _{k=0}^\infty \sum _{j=0}^k (-1)^j (1-\theta )^k {k+1\atopwithdelims ()j+1} E(Y_{\alpha (j+1),\lambda }^n). \end{aligned}$$
(5)

Nadarajah and Kotz [17] derived the \(n\)-th moment of a random variable \(Y_{\alpha ,\lambda }\) as

$$\begin{aligned} E\left( Y_{\alpha ,\lambda }^n\right) =\frac{\alpha (-1)^n}{\lambda ^n} \frac{\partial ^n}{\partial p^n} B(\alpha ,p)\Big |_{p=1}. \end{aligned}$$
(6)

Here for \(u > 0\) and \(v > 0\), \(B(u,v)\) is the beta function defined as follows: \(\displaystyle B(u,v) = \textstyle \int _{0}^{1} x^{u-1} (1-x)^{v-1} dx\). Now by combining (5) and (6), the \(n\)-th moment of a random variable \(X\) can be calculated as

$$\begin{aligned} E(X^n)=\frac{\alpha \theta (-1)^n}{\lambda ^n}\sum _{k=0}^\infty \sum _{j=0}^k \frac{(-1)^j (1-\theta )^k (k+1)!}{j! (k-j)!} \frac{\partial ^n}{\partial p^n} B(\alpha (j+1),p)\Big |_{p=1}\,. \end{aligned}$$

In particular, the expectation is

$$\begin{aligned} E(X)=\frac{\theta }{\lambda } \sum \limits _{k=0}^\infty \sum \limits _{j=0}^k (-1)^j (1-\theta )^k {k+1\atopwithdelims ()j+1} \left[ \Psi (\alpha (j+1)+1)-\Psi (1) \right] , \end{aligned}$$

and the second moment is

$$\begin{aligned} E(X^2)&= \frac{\theta }{\lambda ^2} \sum \limits _{k=0}^\infty \sum \limits _{j=0}^k (-1)^j (1-\theta )^k {k+1\atopwithdelims ()j+1} \left[ \Psi ^2(1)+\Psi ^{\prime }(1)\right. \\&\left. \;-2\Psi (1) \Psi (\alpha (j+1)+1)-\Psi ^{\prime }(\alpha (j+1)+1)+\Psi ^2(\alpha (j+1)+1)\right] , \end{aligned}$$

where \(\Psi (x)=\frac{d\log \Gamma (x)}{dx}\) is Euler's psi (digamma) function.

Similarly, for the case \(\theta >1/2\), it can be shown that the \(n\)-th moment of \(X\) can be calculated as

$$\begin{aligned} E(X^n)=\frac{\alpha (-1)^n}{\theta \lambda ^n}\sum _{k=0}^\infty \left( 1-\frac{1}{\theta }\right) ^k (k+1) \frac{\partial ^n}{\partial p^n} B(\alpha (k+1),p)\Big |_{p=1}, \end{aligned}$$

see [3]. The first two moments can be written as

$$\begin{aligned} E(X)=\frac{1}{\theta \lambda } \sum \limits _{k=0}^\infty (1-\theta ^{-1})^k \left[ \Psi (\alpha (k+1)+1)-\Psi (1) \right] , \end{aligned}$$

and the second moment is

$$\begin{aligned} E(X^2)&= \frac{1}{\theta \lambda ^2} \sum \limits _{k=0}^\infty (1-\theta ^{-1})^k \left[ \Psi ^2(1)+\Psi ^{\prime }(1)-2\Psi (1) \Psi (\alpha (k+1)+1)\right. \\&\left. \;-\Psi ^{\prime }(\alpha (k+1)+1)+\Psi ^2(\alpha (k+1)+1)\right] . \end{aligned}$$
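
These series converge quickly when \(|1-1/\theta |\) is small, and they can be checked against direct numerical integration. The sketch below (ours; the truncation at 200 terms and the parameter values are arbitrary) computes \(E(X)\) for the \(\theta >1/2\) case both ways.

```python
import numpy as np
from scipy.special import digamma
from scipy.integrate import quad

alpha, lam, theta = 2.0, 1.0, 1.5            # theta > 1/2, so the series above applies

# E(X) from the truncated series representation
k = np.arange(200)
ex_series = ((1.0 - 1.0 / theta) ** k
             * (digamma(alpha * (k + 1) + 1.0) - digamma(1.0))).sum() / (theta * lam)

# cross-check: integrate x * g(x) with g given by (2)
def pdf(x):
    u = (1.0 - np.exp(-lam * x)) ** alpha
    return (alpha * lam * theta * np.exp(-lam * x)
            * (1.0 - np.exp(-lam * x)) ** (alpha - 1.0)
            / (theta + (1.0 - theta) * u) ** 2)

ex_quad, _ = quad(lambda x: x * pdf(x), 0.0, np.inf)
print(ex_series, ex_quad)                    # the two values agree
```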

5 Order statistics

In this section we consider the order statistics \(X_{1:n}\), \(X_{2:n}\), \(\ldots \), \(X_{n:n}\) from a random sample \(X_1\), \(X_2\), \(\ldots \), \(X_n\) of the MOGE distribution. Let us derive the density function of the \(i\)-th order statistic \(X_{i:n}\), \(1\le i\le n\). We have that

$$\begin{aligned} g_{i:n}(x)&= \frac{n!}{(i-1)! (n-i)!}\,\cdot g(x) (G(x))^{i-1} (1-G(x))^{n-i}\\&= \frac{n!}{(i-1)! (n-i)!}\,\cdot \frac{\alpha \lambda \theta ^{n-i+1} e^{-\lambda x} (1-e^{-\lambda x})^{\alpha i-1} (1-(1-e^{-\lambda x})^\alpha )^{n-i}}{ (\theta + (1-\theta )(1-e^{-\lambda x})^\alpha )^{n+1}}\,. \end{aligned}$$

This density function can be represented as an infinite weighted sum of beta generalized exponential density functions. Consider the case when \(\theta >1/2\). Using the series expansion \((1-z)^{-k}=\sum _{j=0}^\infty \frac{\Gamma (k+j)}{j! \Gamma (k)}\, z^j\), \(k>0\), we obtain the representation

$$\begin{aligned} g_{i:n}(x)=\frac{1}{\theta ^i}\sum _{j=0}^\infty {i+j-1\atopwithdelims ()j}\left( 1-\frac{1}{\theta }\right) ^j f_{BGE}(x;i+j,n-i+1,\lambda ,\alpha ). \end{aligned}$$

Similarly, in the case \(0<\theta <2\), we obtain the representation

$$\begin{aligned} g_{i:n}(x)=\theta ^{n-i+1}\sum _{j=0}^\infty {n+j-i\atopwithdelims ()j}(1-\theta )^j f_{BGE}(x;i,n+j-i+1,\lambda ,\alpha ). \end{aligned}$$

Barreto-Souza et al. [4] derived the moments of the \(i\)-th order statistic from the beta generalized exponential distribution. Let \(\mu _{i:n}^r(a,b)\) represent the \(r\)-th moment of the \(i\)-th order statistic from the BGE\((a,b,\lambda ,\alpha )\) distribution. Then the \(r\)-th moment of the \(i\)-th order statistic from the MOGE distribution can be derived as

$$\begin{aligned} E(X_{i:n}^r) =\left\{ \begin{array}{ll} \frac{1}{\theta ^i}\sum _{j=0}^\infty {i+j-1\atopwithdelims ()j} \left( 1-\frac{1}{\theta }\right) ^j \mu _{i:n}^r(i+j,n-i+1), &{}\quad \theta >1/2, \\ \theta ^{n-i+1}\sum _{j=0}^\infty {n+j-i\atopwithdelims ()j}(1-\theta )^j \mu _{i:n}^r(i,n+j-i+1), &{}\quad 0<\theta <2, \end{array}\right. \end{aligned}$$

see Barreto-Souza et al. [3].

Now we discuss the asymptotic distributions of the order statistics. First we consider the sample maxima \(X_{n:n}\). Since \(G^{-1}(1)=\infty \), \(\lim _{x\rightarrow \infty } h(x)=\lambda \) and \(\lim _{x\rightarrow \infty } \frac{g^{\prime }(x)}{g(x)}=-\lambda \), the von Mises’ condition (iii) from Arnold et al. [2, Theorem 8.3.3] is satisfied. This implies that

$$\begin{aligned} \frac{X_{n:n}-a_n}{b_n}\mathop {\rightarrow }\limits ^{d} e^{-e^{-x}},\, x\in \mathbf {R}, \end{aligned}$$

where the normalizing constants \(a_n\) and \(b_n\) can be derived by Arnold et al. [2, Theorem 8.3.4 (iii)].

Second we consider the sample minimum \(X_{1:n}\). Since \(G^{-1}(0)=0\) and \(\lim _{\varepsilon \rightarrow 0_+} \frac{G(\varepsilon x)}{G(\varepsilon )}=x^\alpha \), we obtain from Arnold et al. [2, Theorem 8.3.6 (ii)] that

$$\begin{aligned} \frac{X_{1:n}-a_n^*}{b_n^*}\mathop {\rightarrow }\limits ^{d} 1-e^{-(-x)^\alpha },\quad x<0,\alpha >0, \end{aligned}$$

where \(a_n^*=0\) and \(b_n^*=G^{-1}(1/n)\).

Finally, the asymptotic distribution of the order statistics \(X_{n-i+1:n}\) follows from the asymptotic distribution of the sample maxima. Thus

$$\begin{aligned} \frac{X_{n-i+1:n}-a_n}{b_n}\mathop {\rightarrow }\limits ^{d} e^{-e^{-x}} \sum _{j=0}^{i-1} \frac{e^{-j x}}{j!}, \quad x\in \mathbf {R}, \end{aligned}$$

where \(a_n\) and \(b_n\) are the normalizing constants derived by Arnold et al. [2, Theorem 8.3.4 (iii)].

6 Rényi entropy

Entropy is a measure of diversity, uncertainty or randomness of a system. A popular entropy is the Rényi entropy, see [20], which generalizes the well-known Shannon entropy.

The Rényi entropy is given by \(\displaystyle I_R(\xi )=\textstyle \frac{1}{1-\xi }\log \smallint _0^\infty g^\xi (x) dx\), where \(\xi >0\) and \(\xi \ne 1\). For \(\theta >1/2\) the function \(g^\xi (x)\) can be expanded as

$$\begin{aligned} g^\xi (x)=\frac{\lambda ^\xi \alpha ^\xi }{\theta ^\xi } \sum _{j=0}^\infty \frac{\Gamma (2\xi +j)}{\Gamma (2\xi )j!}\left( 1-\frac{1}{\theta }\right) ^j e^{-\lambda \xi x}(1-e^{-\lambda x})^{\xi (\alpha -1)+\alpha j}. \end{aligned}$$

Now using the fact that \(\int _0^\infty e^{-\lambda \xi x}(1-e^{-\lambda x})^{\xi (\alpha -1)+\alpha j} dx=\lambda ^{-1} B(\xi ,\xi (\alpha -1)+\alpha j+1)\), we obtain that for \(\theta >1/2\) the Rényi entropy is

$$\begin{aligned} I_R(\xi )=\frac{1}{1-\xi }\log \left\{ \frac{\lambda ^{\xi -1}\alpha ^\xi }{\theta ^\xi } \sum _{j=0}^\infty \frac{\Gamma (2\xi +j)}{\Gamma (2\xi )j!} \left( 1-\frac{1}{\theta }\right) ^j B(\xi ,\xi (\alpha -1)+\alpha j+1)\right\} . \end{aligned}$$

Let us consider the case when \(0<\theta <2\). The function \(g^\xi (x)\) can be expanded as

$$\begin{aligned} g^\xi (x)=\lambda ^\xi \alpha ^\xi \theta ^\xi \sum _{j=0}^\infty \sum _{k=0}^j \frac{\Gamma (2\xi +j)}{\Gamma (2\xi )j!} {j\atopwithdelims ()k}(-1)^k \left( 1-\theta \right) ^j e^{-\lambda \xi x}(1-e^{-\lambda x})^{\xi (\alpha -1)+\alpha k}, \end{aligned}$$

which implies that the Rényi entropy for \(0<\theta <2\) is

$$\begin{aligned} I_R(\xi )&= \frac{1}{1-\xi }\log \left\{ \lambda ^{\xi -1}\alpha ^\xi \theta ^\xi \sum _{j=0}^\infty \sum _{k=0}^j\frac{\Gamma (2\xi +j)}{\Gamma (2\xi )j!} {j\atopwithdelims ()k}(-1)^k \left( 1-\theta \right) ^j B(\xi ,\xi (\alpha -1)+\alpha k+1)\right\} . \end{aligned}$$

A special case of the Rényi entropy is the Shannon entropy, defined as \(E(-\log g(X))\), where \(X\) is a random variable. The Shannon entropy is the limit of \(I_R(\xi )\) as \(\xi \rightarrow 1\). If the random variable \(X\) has the MOGE distribution, then the Shannon entropy is \(E(-\log g(X))=-\log (\alpha \lambda \theta )+\lambda E(X)-(\alpha -1)E(\log (1-e^{-\lambda X}))+2 E(\log (\theta +(1-\theta )(1-e^{-\lambda X})^\alpha ))\). Replacing \(E(\log (1-e^{-\lambda X}))= \log \theta {/}(\alpha (1-\theta ))\) and \(E(\log (\theta +(1-\theta )(1-e^{-\lambda X})^\alpha ))= 1+\log \theta {/}(1-\theta )\) in the last equation, we obtain that the Shannon entropy is

$$\begin{aligned}E(-\log g(X))=2-\log (\alpha \lambda )+\lambda E(X)+\frac{(\alpha \theta +1)\log \theta }{\alpha (1-\theta )}\,.\end{aligned}$$
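
Both the series form and the defining integral of the Rényi entropy are straightforward to evaluate numerically. The sketch below (ours; the parameter values and the truncation point are illustrative) compares the two for the \(\theta >1/2\) case.

```python
import numpy as np
from scipy.special import gammaln, betaln
from scipy.integrate import quad

alpha, lam, theta, xi = 2.0, 1.0, 1.5, 2.0   # theta > 1/2 and xi != 1

def pdf(x):
    u = (1.0 - np.exp(-lam * x)) ** alpha
    return (alpha * lam * theta * np.exp(-lam * x)
            * (1.0 - np.exp(-lam * x)) ** (alpha - 1.0)
            / (theta + (1.0 - theta) * u) ** 2)

# Renyi entropy from its definition, by numerical integration of g^xi
integral, _ = quad(lambda x: pdf(x) ** xi, 0.0, np.inf)
renyi_quad = np.log(integral) / (1.0 - xi)

# Renyi entropy from the truncated series representation for theta > 1/2
j = np.arange(200)
terms = (np.exp(gammaln(2.0 * xi + j) - gammaln(2.0 * xi) - gammaln(j + 1.0)
                + betaln(xi, xi * (alpha - 1.0) + alpha * j + 1.0))
         * (1.0 - 1.0 / theta) ** j)
renyi_series = np.log(lam ** (xi - 1.0) * alpha ** xi / theta ** xi * terms.sum()) / (1.0 - xi)

print(renyi_quad, renyi_series)              # the two values agree
```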

7 Random minima, maxima and different ordering relations

In reliability and survival analysis the occurrence of a series or parallel system with a random number of components is very common, see for example [12]. In many agricultural and biological experiments it is impossible to have a fixed sample size, as some of the observations often get lost for different reasons. In many situations the sample size may depend on the occurrence of some specific event, which makes the sample size random. For example, when a common dose of radiation is given to a group of animals, the interest is often in the times at which the first and the last animal expire, see [7]. In actuarial science, the claims received by an insurer in a certain time interval often make up a sample of random size, and the largest claim amount is of chief interest there, see [15]. It has already been observed that the proposed MOGE distribution can be obtained as a random minimum or a random maximum of GE distributions, depending on whether \(0 < \theta < 1\) or \(1 < \theta < \infty \). Therefore, the proposed model may be used quite effectively in these cases. In this section we establish different results based on this property.

Let us recall the following definitions. Let \(U\) and \(V\) be two continuous random variables with PDFs \(f_U\) and \(f_V\), respectively. The corresponding cumulative distribution functions (CDFs) will be denoted by \(F_U\) and \(F_V\), respectively. The random variable \(U\) is said to be smaller than the random variable \(V\) in the likelihood ratio ordering (denoted by \(U \le _{lr} V\)) if \(f_U(x) f_V(y) \ge f_U(y) f_V(x)\), for all \(0 < x \le y < \infty \). The random variable \(U\) is said to be smaller than the random variable \(V\) in stochastic order (denoted by \(U \le _{st} V\)) if \(P(U \ge x) \le P(V \ge x)\), for all \(x > 0\). The random variable \(U\) is said to be smaller than the random variable \(V\) in dispersive order (denoted by \(U \le _{disp} V\)) if \(\displaystyle F^{-1}_U(\beta ) - F_U^{-1}(\alpha ) \le F^{-1}_V(\beta ) - F_V^{-1}(\alpha )\), for all \(0 < \alpha \le \beta < 1\). The random variable \(U\) is said to be smaller than the random variable \(V\) in hazard rate order (denoted by \(U \le _{hr} V\)), if \(\displaystyle P(V > x)/P(U > x)\) is an increasing function of \(x\). The random variable \(U\) is said to be smaller than the random variable \(V\) in the convex transform order (denoted by \(U \le _c V\)), if \(F_V^{-1} F_U\) is a convex function in \((0,\infty )\). Further, the random variable \(U\) is said to be smaller than the random variable \(V\) in star order (denoted by \(U \le _* V\)), if \(F_V^{-1} F_U(x)/x\) is an increasing function of \(x \in (0,\infty )\). We have the following results.

Result 1

Let \(X \sim \) GE\((\alpha , \lambda )\), \(Y \sim \) MOGE\((\alpha ,\lambda , \theta )\), \(Z \sim \) MOGE\((\alpha ,\lambda , 1/\theta )\), where \(\alpha > 0\), \(\lambda > 0\) and \(0 < \theta < 1\), then

$$\begin{aligned} Y \le _{lr} X \le _{lr} Z. \end{aligned}$$

Proof

The result mainly follows from Corollary 2.5 of [21]. \(\square \)

Result 2

  1. (a)

    If \(Y_1 \sim \) MOGE\((\alpha _1,\lambda ,\theta )\) and \(Y_2 \sim \) MOGE\((\alpha _2,\lambda ,\theta )\), where \(0 < \alpha _1 < \alpha _2\), \(\theta > 0\) and \(\lambda > 0\), then \(Y_1 \le _{st} Y_2\).

  2. (b)

    If \(Y_1 \sim \) MOGE\((\alpha ,\lambda _1,\theta )\) and \(Y_2 \sim \) MOGE\((\alpha ,\lambda _2,\theta )\), where \(0 < \lambda _1 < \lambda _2\), \(\theta > 0\) and \(\alpha > 0\), then \(Y_2 \le _{st} Y_1\).

Proof

Note that if \(U \sim \) GE\((\alpha _1,\lambda )\) and \(V \sim \) GE\((\alpha _2,\lambda )\), then for \(\alpha _1 \le \alpha _2\), \(U \le _{st} V\). Similarly, if \(U \sim \) GE\((\alpha ,\lambda _1)\) and \(V \sim \) GE\((\alpha ,\lambda _2)\), then for \(\lambda _1 \le \lambda _2\), \(V \le _{st} U\). Hence both (a) and (b) follow using Theorem 3.1 of [21]. \(\square \)

Result 3

  1. (a)

    If \(Y_1 \sim \) MOGE\((\alpha _1,\lambda ,\theta )\) and \(Y_2 \sim \) MOGE\((\alpha _2,\lambda ,\theta )\), where \(0 < \alpha _1 < \alpha _2\), \(\theta > 0\) and \(\lambda > 0\), then \(Y_1 \le _{disp} Y_2\).

  2. (b)

    If \(Y_1 \sim \) MOGE\((\alpha _1,\lambda ,\theta )\) and \(Y_2 \sim \) MOGE\((\alpha _2,\lambda ,\theta )\), where \(0 < \alpha _1 < \alpha _2\), \(\theta > 0\) and \(\lambda > 0\), then \(Y_1 \le _{hr} Y_2\).

Proof

Suppose \(U \sim \) GE\((\alpha _1,\lambda )\) and \(V \sim \) GE\((\alpha _2,\lambda )\), then for \(\alpha _1 \le \alpha _2\), \(U \le _{disp} V\) and \(U \le _{hr} V\), see [11]. Hence (a) and (b) follow using Theorem 3.2 and Theorem 3.3, respectively of [21]. \(\square \)

Result 4

  1. (a)

    If \(Y_1 \sim \) MOGE\((\alpha _1,\lambda ,\theta )\) and \(Y_2 \sim \) MOGE\((\alpha _2,\lambda ,\theta )\), where \(0 < \alpha _1 < \alpha _2\), \(\theta > 0\) and \(\lambda > 0\), then \(Y_1 \le _c Y_2\).

  2. (b)

    If \(Y_1 \sim \) MOGE\((\alpha _1,\lambda ,\theta )\) and \(Y_2 \sim \) MOGE\((\alpha _2,\lambda ,\theta )\), where \(0 < \alpha _1 < \alpha _2\), \(\theta > 0\) and \(\lambda > 0\), then \(Y_1 \le _* Y_2\).

Proof

Suppose \(U \sim \) GE\((\alpha _1,\lambda )\) and \(V \sim \) GE\((\alpha _2,\lambda )\), then for \(\alpha _1 \le \alpha _2\), \(U \le _c V\), see [11]. Hence (a) follows using Theorem 2 (a) of [5]. Since convex ordering implies star ordering, (b) follows from (a). \(\square \)

8 Estimation

In this section we consider the maximum likelihood estimation of the unknown parameters based on a complete sample. Let us assume that we have a sample of size \(n\), namely \(\{x_1\), \(\ldots , x_n\}\) from MOGE\((\alpha , \lambda ,\theta )\) distribution. The log-likelihood function is given by

$$\begin{aligned} l(\alpha , \theta , \lambda | Data)&= \log L(\alpha ,\theta ,\lambda ) \nonumber \\&= n\log (\alpha \lambda \theta )- \lambda \sum _{i=1}^n x_i+(\alpha -1) \sum _{i=1}^n \log (1-e^{-\lambda x_i}) \\&- 2\sum _{i=1}^n \log (\theta +(1-\theta )(1-e^{-\lambda x_i})^\alpha ). \end{aligned}$$

The normal equations are obtained by taking the first derivatives of the log-likelihood function with respect to \(\lambda \), \(\alpha \) and \(\theta \) and equating them to zero, as follows:

$$\begin{aligned} \frac{\partial \log L(\alpha ,\theta ,\lambda )}{\partial \lambda }&= \frac{n}{\lambda }-\sum _{i=1}^n x_i+(\alpha -1) \sum _{i=1}^n \frac{x_i e^{-\lambda x_i}}{1-e^{-\lambda x_i}}\\&-2\alpha (1-\theta )\sum _{i=1}^n \frac{x_i e^{-\lambda x_i} (1-e^{-\lambda x_i})^{\alpha -1}}{\theta +(1-\theta )(1-e^{-\lambda x_i})^\alpha }\\ \frac{\partial \log L(\alpha ,\theta ,\lambda )}{\partial \alpha }&= \frac{n}{\alpha }-\sum _{i=1}^n \log (1-e^{-\lambda x_i}) +2\theta \sum _{i=1}^n \frac{\log (1-e^{-\lambda x_i})}{\theta +(1-\theta )(1-e^{-\lambda x_i})^\alpha }\\ \frac{\partial \log L(\alpha ,\theta ,\lambda )}{\partial \theta }&= \frac{n}{\theta }-2\sum _{i=1}^n \frac{1-(1-e^{-\lambda x_i})^\alpha }{\theta +(1-\theta )(1-e^{-\lambda x_i})^\alpha }\,. \end{aligned}$$
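
Purely as an illustration, a direct numerical maximization of this log-likelihood can be sketched as follows (ours, not the estimation procedure proposed below; the log-scale reparametrization, the derivative-free optimizer and the starting values are pragmatic assumptions).

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(log_params, x):
    """Negative log-likelihood of MOGE; parameters are passed on the log scale
    so that the optimizer works on an unconstrained space."""
    alpha, lam, theta = np.exp(log_params)
    u = (1.0 - np.exp(-lam * x)) ** alpha
    ll = (len(x) * np.log(alpha * lam * theta) - lam * x.sum()
          + (alpha - 1.0) * np.log(1.0 - np.exp(-lam * x)).sum()
          - 2.0 * np.log(theta + (1.0 - theta) * u).sum())
    return -ll

# illustration on a sample simulated by inverting (1)
rng = np.random.default_rng(1)
alpha0, lam0, theta0 = 1.5, 2.0, 0.5
p = rng.uniform(size=500)
v = theta0 * p / (1.0 - (1.0 - theta0) * p)
x = -np.log(1.0 - v ** (1.0 / alpha0)) / lam0

res = minimize(neg_loglik, x0=np.log([1.0, 1.0, 1.0]), args=(x,), method="Nelder-Mead")
print(np.exp(res.x))                         # estimates of (alpha, lambda, theta)
```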

It is clear that the MLEs do not have explicit expressions, and they have to be obtained by solving a three-dimensional optimization problem. We may use the standard Gauss-Newton or Newton-Raphson methods, but they have their usual convergence problems. If the initial guesses are not close to the optimal values, the iteration may not converge; see for example [18] for a recent reference on this issue in a related problem. Moreover, choosing three-dimensional initial guesses may not be very simple in most practical situations. Before progressing further we present the following result related to the MLE.

If the parameters \(\lambda \) and \(\theta \) are known, the properties of the MLE of the parameter \(\alpha \) follow from the following theorem.

Theorem 1

Let \(\alpha \) be the true value of the parameter. If \(0<\theta <1\), then the equation \(\frac{\partial \log L(\alpha ,\theta ,\lambda )}{\partial \alpha }=0\) has exactly one root. If \(\theta >1\), then the root of equation \(\frac{\partial \log L(\alpha ,\theta ,\lambda )}{\partial \alpha }=0\) lies in the interval \([(2\theta -1)^{-1}\psi _\lambda ^{-1},\psi _\lambda ^{-1}]\), where \(\psi _\lambda =-n^{-1} \sum _{i=1}^n \log (1-e^{-\lambda x_i})\).

Proof

Let us first consider the case \(0<\theta <1\). Then the function \(\frac{\partial \log L(\alpha ,\theta ,\lambda )}{\partial \alpha }\) is decreasing with \(\lim _{\alpha \rightarrow 0} \frac{\partial \log L(\alpha ,\theta ,\lambda )}{\partial \alpha }=\infty \) and \(\lim _{\alpha \rightarrow \infty } \frac{\partial \log L(\alpha ,\theta ,\lambda )}{\partial \alpha }=-n\psi _\lambda <0\). Thus it follows that there exists exactly one root. Now consider the case \(\theta >1\) and let \(w(\alpha ;\lambda ,\theta )= 2\theta \sum _{i=1}^n \frac{\log (1-e^{-\lambda x_i})}{\theta +(1-\theta )(1-e^{-\lambda x_i})^\alpha }\). The function \(w\) is increasing. We can see that \(\lim _{\alpha \rightarrow 0} w=2\theta \sum _{i=1}^n \log (1-e^{-\lambda x_i})\) and \(\lim _{\alpha \rightarrow \infty } w=2\sum _{i=1}^n \log (1-e^{-\lambda x_i})\). This implies that

$$\begin{aligned} \frac{n}{\alpha }+(2\theta -1) \sum _{i=1}^n \log (1-e^{-\lambda x_i})<\frac{\partial \log L(\alpha ,\theta ,\lambda )}{\partial \alpha }<\frac{n}{\alpha }+\sum _{i=1}^n \log (1-e^{-\lambda x_i}). \end{aligned}$$

Then we obtain that \(\frac{\partial \log L(\alpha ,\theta ,\lambda )}{\partial \alpha }>0\) for \(\alpha < (2\theta -1)^{-1}\psi _\lambda ^{-1}\) and \(\frac{\partial \log L(\alpha ,\theta ,\lambda )}{\partial \alpha }<0\) for \(\alpha > \psi _\lambda ^{-1}\). This proves the theorem. \(\square \)

As mentioned before, it is possible to use a standard three-dimensional optimization algorithm to maximize the log-likelihood function. Instead, we propose a simple iterative technique to compute the MLEs of the unknown parameters, which avoids solving the three-dimensional optimization problem directly and only requires solving lower-dimensional problems. The idea comes from the following observations.

Let us consider the random variables \(X\) and \(Z\) with the following joint PDF

$$\begin{aligned} f(x, z;\ \alpha ,\lambda ,\theta )=\frac{\alpha \lambda \theta z e^{-\lambda x}(1-e^{-\lambda x})^{\alpha -1}}{(1-(1-e^{-\lambda x})^{\alpha })^2}\, e^{-z(\theta -1+(1-(1-\exp (-\lambda x))^\alpha )^{-1})}, \end{aligned}$$
(7)

It can be easily observed that the marginal distribution of the random variable \(X\) is the Marshall-Olkin generalized exponential distribution. Based on a random sample of size \(n\), say \(\{x_i, z_i\}\), from (7), the log-likelihood function can be written as

$$\begin{aligned} \ln l(\alpha ,\lambda ,\theta ; \ Data, z_1, \ldots , z_n)&= n\ln \alpha + n\ln \lambda + n\ln \theta + \sum _{i=1}^n \ln z_i - \lambda \sum _{i=1}^n x_i \nonumber \\&+ (\alpha -1)\sum _{i=1}^n \log (1-e^{-\lambda x_i})-2\sum _{i=1}^n \log (1-(1-e^{-\lambda x_i})^\alpha ) \nonumber \\&-\sum _{i=1}^n z_i \left( \theta -1+(1-(1-e^{-\lambda x_i})^\alpha )^{-1}\right) . \end{aligned}$$
(8)

Note that the maximization of (8) with respect to \(\alpha , \lambda \) and \(\theta \) can be decoupled. The maximization of (8) with respect to \(\theta \) can be obtained as \(\displaystyle \widehat{\theta } = \frac{n}{\sum _{i=1}^n z_i}\), and the maximization of (8) with respect to \(\alpha \) and \(\lambda \) can be obtained by maximizing \(g(\alpha , \lambda )\), where

$$\begin{aligned} g (\alpha ,\lambda )&= n\ln \alpha + n\ln \lambda - \lambda \sum _{i=1}^n x_i + (\alpha -1)\sum _{i=1}^n \log (1-e^{-\lambda x_i}) \nonumber \\&-2\sum _{i=1}^n \log (1-(1-e^{-\lambda x_i})^\alpha ) -\sum _{i=1}^n z_i (1-(1-e^{-\lambda x_i})^\alpha )^{-1}. \end{aligned}$$
(9)

The method proposed by Song et al. [22] can be used to maximize (9). The method was used by Kannan et al. [13] in a similar problem, and it can be described as follows. Let us write \(g(\alpha , \lambda )\) as

$$\begin{aligned} g(\alpha , \lambda ) = g_1(\alpha , \lambda ) + g_2(\alpha , \lambda ), \end{aligned}$$
(10)

where

$$\begin{aligned} g_1(\alpha , \lambda ) = n\ln \alpha + n\ln \lambda - \lambda \sum _{i=1}^n x_i + (\alpha -1)\sum _{i=1}^n \log (1-e^{-\lambda x_i}) , \end{aligned}$$
(11)

and

$$\begin{aligned} g_2(\alpha , \lambda ) = -2\sum _{i=1}^n \log (1-(1-e^{-\lambda x_i})^\alpha ) -\sum _{i=1}^n z_i (1-(1-e^{-\lambda x_i})^\alpha )^{-1}. \end{aligned}$$
(12)

We need to solve

$$\begin{aligned} g^{\prime }(\alpha , \lambda ) = g_1^{\prime }(\alpha , \lambda ) + g_2^{\prime }(\alpha , \lambda ) = 0 \quad \Leftrightarrow \quad g_1^{\prime }(\alpha , \lambda ) = - g_2^{\prime }(\alpha , \lambda ), \end{aligned}$$
(13)

Here \(\displaystyle g^{\prime }(\alpha , \lambda ) = \left( \frac{\partial g(\alpha , \lambda )}{\partial \alpha }, \frac{\partial g(\alpha , \lambda )}{\partial \lambda } \right) \). First solve

$$\begin{aligned} g_1^{\prime }(\alpha , \lambda ) = 0. \end{aligned}$$
(14)

using the following fixed point type non-linear equation iteratively

$$\begin{aligned} \lambda = \left( \frac{1}{n} \sum _{i=1}^n \frac{x_i e^{-\lambda x_i}}{(1-e^{-\lambda x_i})} \left( 1 + \frac{n}{\sum _{i=1}^n \ln (1 - e^{-\lambda x_i})} \right) + \frac{1}{n} \sum _{i=1}^n x_i \right) ^{-1}. \end{aligned}$$
(15)

If \(\lambda ^{(0)}\) is the solution of (15), then obtain

$$\begin{aligned} \alpha ^{(0)} = - \frac{n}{\sum _{i=1}^n \ln (1 - e^{-\lambda ^{(0)} x_i})}. \end{aligned}$$
(16)
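
A small sketch of this initialization step is given below (ours; the unit starting value for \(\lambda \) and the fixed iteration count are arbitrary choices, and convergence of the fixed-point iteration is not guaranteed for every data set).

```python
import numpy as np

def solve_g1(x, lam=1.0, n_iter=200):
    """Solve g_1'(alpha, lambda) = 0 via the fixed-point iteration (15), then use (16)."""
    n = len(x)
    for _ in range(n_iter):
        w = x * np.exp(-lam * x) / (1.0 - np.exp(-lam * x))
        S = np.log(1.0 - np.exp(-lam * x)).sum()
        lam = 1.0 / (w.mean() * (1.0 + n / S) + x.mean())        # equation (15)
    alpha = -n / np.log(1.0 - np.exp(-lam * x)).sum()            # equation (16)
    return alpha, lam

# quick check on a simulated GE(2, 1.5) sample: the iteration recovers the GE MLEs
rng = np.random.default_rng(2)
x = -np.log(1.0 - rng.uniform(size=1000) ** 0.5) / 1.5
print(solve_g1(x))
```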

Now \(\alpha ^{(1)}\) and \(\lambda ^{(1)}\) can be obtained as the solution of the following

$$\begin{aligned} g_1^{\prime }(\alpha , \lambda ) = - g_2^{\prime }(\alpha ^{(0)}, \lambda ^{(0)}), \end{aligned}$$
(17)

similarly, \(\alpha ^{(2)}\) and \(\lambda ^{(2)}\) can be obtained as the solution of the following

$$\begin{aligned} g_1^{\prime }(\alpha , \lambda ) = - g_2^{\prime }(\alpha ^{(1)}, \lambda ^{(1)}), \end{aligned}$$
(18)

The iteration continues until it converges. Note that the solution \((\widetilde{\alpha }, \widetilde{\lambda })\) of the following equation, for arbitrary \(c_1\) and \(c_2\),

$$\begin{aligned} g_1^{\prime }(\alpha , \lambda ) = (c_1, c_2) \end{aligned}$$
(19)

can be obtained as follows. First solve the non-linear equation iteratively

$$\begin{aligned} \lambda = \left[ \frac{c_2}{n} + \frac{1}{n} \sum _{i=1}^n x_i + \left( 1 - \frac{n}{c_1 - \sum _{i=1}^n \ln (1 - e^{-\lambda x_i})} \right) \times \left( \frac{1}{n} \sum _{i=1}^n \frac{x_i e^{-\lambda x_i}}{1 - e^{-\lambda x_i}} \right) \right] ^{-1}\quad \end{aligned}$$
(20)

to obtain \(\widetilde{\lambda }\), and then obtain

$$\begin{aligned} \widetilde{\alpha } = \left[ \frac{c_1 - \sum _{i=1}^n \ln (1 - e^{-\lambda x_i})}{n} \right] ^{-1}, \end{aligned}$$
(21)

see Kannan et al. [13]. Finally, for the implementation of the EM algorithm we need the following result. The conditional expectation of \(Z\) given \(X = x\) is

$$\begin{aligned} E(Z|X = x; \, \alpha , \lambda , \theta ) = \frac{2(1-(1-e^{-\lambda x})^{\alpha })}{\theta + (1-\theta ) (1-e^{-\lambda x})^{\alpha }}. \end{aligned}$$
(22)

Now we are ready to provide the EM algorithm. Suppose at the \(k\)-th stage the values of \(\alpha \), \(\lambda \) and \(\theta \) are \(\alpha ^{(k)}\), \(\lambda ^{(k)}\) and \(\theta ^{(k)}\), respectively.

E-step: In the E-step, obtain the ‘pseudo log-likelihood function’ (8) by replacing \(z_i\) with \(z_i^{(k)}\), where

$$\begin{aligned} z_{i}^{(k)} = E(Z|X = x_{i}; \, \alpha ^{(k)}, \lambda ^{(k)}, \theta ^{(k)}); \quad i = 1, \ldots , n. \end{aligned}$$
(23)

M-step: At the \(k\)-th stage, in the M-step, we maximize the ‘pseudo log-likelihood’ function with respect to \(\alpha \), \(\lambda \) and \(\theta \) to compute \(\alpha ^{(k+1)}\), \(\lambda ^{(k+1)}\) and \(\theta ^{(k+1)}\). The maximization can be performed as described before.
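
A compact sketch of the whole procedure is given below (ours); for brevity the M-step maximizes \(g(\alpha , \lambda )\) of (9) with a generic two-dimensional optimizer on the log scale, whereas the scheme described above iterates the fixed-point equations (15)-(21). There are no numerical safeguards, and the starting values and iteration counts are arbitrary.

```python
import numpy as np
from scipy.optimize import minimize

def em_moge(x, alpha=1.0, lam=1.0, theta=1.0, n_iter=50):
    """EM sketch for MOGE: E-step from (22), M-step maximizing the pseudo log-likelihood (8)."""
    n = len(x)
    for _ in range(n_iter):
        # E-step: z_i = E(Z | X = x_i) at the current parameter values, equation (22)
        u = (1.0 - np.exp(-lam * x)) ** alpha
        z = 2.0 * (1.0 - u) / (theta + (1.0 - theta) * u)

        # M-step: closed form for theta, numerical maximization of g(alpha, lambda) in (9)
        theta = n / z.sum()

        def neg_g(p):
            a, l = np.exp(p)
            v = (1.0 - np.exp(-l * x)) ** a
            return -(n * np.log(a) + n * np.log(l) - l * x.sum()
                     + (a - 1.0) * np.log(1.0 - np.exp(-l * x)).sum()
                     - 2.0 * np.log(1.0 - v).sum()
                     - (z / (1.0 - v)).sum())

        alpha, lam = np.exp(minimize(neg_g, np.log([alpha, lam]),
                                     method="Nelder-Mead").x)
    return alpha, lam, theta

# usage: alpha_hat, lam_hat, theta_hat = em_moge(np.asarray(data))
```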

9 Data Analysis

For illustrative purposes, in this section we present the analysis of two data sets to show how our proposed model works in practice.

Guinea Pig Data: This data set has been obtained from Bjerkedal [6]. It represents the survival times (in days) of guinea pigs injected with different doses of tubercle bacilli. It may be mentioned that guinea pigs have high susceptibility to human tuberculosis, which is why they are often used in this kind of study. The data set consists of the survival times of 72 animals that were under the regimen 4.342. The regimen number is the common logarithm of the number of bacillary units in 0.5 ml of challenge solution, \(i.e.\) regimen 4.342 corresponds to 2.2 \(\times \) \(10^4\) bacillary units per 0.5 ml \((\hbox {log} (2.2 \times 10^4) = 4.342)\), see Gupta et al. [10].

This data set is available in Gupta et al. [10], and it has also been analyzed by them. The preliminary data analysis by Gupta et al. [10] indicated that the data are right skewed and that the empirical hazard function is unimodal. For this reason Gupta et al. [10] analyzed the data using the log-normal model. The MLEs of the log-normal parameters are 5.0043 (\(\mu \)) and 0.6290 (\(\sigma \)), respectively, and the associated log-likelihood value is \(-\)429.0945. Based on the Kolmogorov-Smirnov (KS) distance 0.1298 and the associated \(p\) value (0.1765), and also from the quantile plot, they claimed that the log-normal model provides a good fit to the data.

Since the proposed MOGE model can have a unimodal hazard function, we also analyze the data using the MOGE model. The MLEs of \(\alpha \), \(\theta \) and \(\lambda \) are 3.6050, 1.0287 and 0.0113, respectively. The associated 95 % bootstrap confidence intervals are (0.7288, 6.4930), (0.1734, 4.2919) and (0.0075, 0.0223), respectively. The corresponding log-likelihood value is \(-\)425.8080. The KS distance between the fitted and the empirical distribution functions is 0.0917, and the associated \(p\) value is 0.5803.

For comparison purposes, we have also fitted the Birnbaum-Saunders distribution, which can also have a unimodal hazard function, see for example Kundu et al. [14]. The MLEs of the unknown Birnbaum-Saunders parameters are 0.7038 (\(\alpha \)) and 141.7175 (\(\beta \)), and the associated log-likelihood value is \(-\)434.0186. The KS distance between the fitted and the empirical distribution functions is 0.1569 and the associated \(p\) value is 0.0576.

If we want to perform the following test:

$$\begin{aligned} H_0: \hbox {log-normal} \quad \hbox {vs.} \quad H_1: \hbox {MOGE} \end{aligned}$$

then based on the likelihood ratio test statistic (6.573) the corresponding \(p\) value is \(<\)0.05, based on the \(\chi ^2_1\) distribution. Since the \(p\) value is quite small we reject the null hypothesis. Similarly, we also reject the null hypothesis of the following test:

$$\begin{aligned} H_0: \hbox {Birnbaum-Saunders} \quad \hbox {vs.} \quad H_1: \hbox {MOGE} \end{aligned}$$

Therefore, based on the KS distance and also based on the likelihood ratio test, we prefer the MOGE distribution to the log-normal or Birnbaum-Saunders distributions.

Strength Data

Now we present the analysis of a data set obtained from Prof. R.G. Surles. The data are strength measurements, in GPa, for single carbon fibers and impregnated 1,000-carbon fiber tows; the single fibers were tested under tension at a gauge length of 1 mm. The data are provided below:

2.247, 2.64, 2.908, 3.099, 3.126, 3.245, 3.328, 3.355, 3.383, 3.572, 3.581, 3.681, 3.726, 3.727,
3.728, 3.783, 3.785, 3.786, 3.896, 3.912, 3.964, 4.05, 4.063, 4.082, 4.111, 4.118, 4.141, 4.246,
4.251, 4.262, 4.326, 4.402, 4.457, 4.466, 4.519, 4.542, 4.555, 4.614, 4.632, 4.634, 4.636, 4.678,
4.698, 4.738, 4.832, 4.924, 5.043, 5.099, 5.134, 5.359, 5.473, 5.571, 5.684, 5.721, 5.998, 6.06

Before progressing further, we first provide the histogram of the strength data in Fig. 3. It is immediate that the data are unimodal. We further provide the scaled TTT transform, see [1], of the data set in Fig. 4.

Fig. 3
figure 3

The histogram of the strength data set

Fig. 4
figure 4

The scaled TTT transform of the strength data set

Since the scaled TTT plot is concave, it indicates that the empirical hazard function is an increasing function. We have subtracted 2.0 from all the data points before analyzing the data set. We have fitted the proposed MOGE model; the estimates of \(\alpha \), \(\theta \) and \(\lambda \) are 1.5759, 67.6793 and 2.0866, respectively. The associated 95% bootstrap confidence intervals are (0.4821, 2.6123), (55.2345, 82.5123) and (1.6041, 2.6704), respectively. The corresponding log-likelihood value is \(-\)67.8507. The KS distance between the fitted and the empirical distribution functions is 0.0474 and the associated \(p\) value is 0.9996.

For comparison purposes we have also fitted the two-parameter Weibull, gamma and GE distributions. The MLEs, the corresponding log-likelihood values, the KS distances between the fitted and the empirical distribution functions and the associated \(p\) values are reported in Table 1.

Table 1 Maximum likelihood estimates, maximized log-likelihood values, K-S statistics and the associated \(p\)-values for Weibull, gamma and GE distributions while fitting to the strength data

From the table it is clear that among the Weibull, gamma and GE distributions, the Weibull provides the best fit. Now if we want to test the hypothesis

$$\begin{aligned} H_0: \hbox {Weibull} \quad \hbox {vs.} \quad H_1: \hbox {MOGE} \end{aligned}$$

then based on the likelihood ratio test, the \(p\) value is less than 0.05. Therefore, we reject the null hypothesis. Similarly, if we test \(H_0: \hbox {GE}\) or \(H_0: \hbox {Gamma}\), with the alternative \(H_1: \hbox {MOGE}\) in both cases, we reject the null hypothesis in both cases. Therefore, in this case also, based on the KS distances and the likelihood ratio tests, we prefer the MOGE distribution to the Weibull, gamma or GE distributions.

10 Conclusions

In this paper we have introduced a new three-parameter distribution by applying the Marshall-Olkin method to the generalized exponential distribution. This new three-parameter distribution has an explicit distribution function, and its PDF is also in a compact form. It is a very flexible three-parameter distribution, and it can exhibit all four different hazard function shapes depending on the two shape parameters. Finally, it should be mentioned that although we have considered only the generalized exponential distribution, many of the properties remain valid for a more general class of distributions, namely the proportional reversed hazard class. It will be interesting to investigate the properties of the general Marshall-Olkin proportional reversed hazard class. More work is needed in that direction.