Abstract
We extend the existing family of flexible survival models by assembling models scattered across the literature into a more unified form under a single umbrella. New special cases are obtained not only by constraining the shape and scale parameters of the extended generalized gamma (EGG) model to fixed constants, but also by imposing relationships (such as equality, reciprocal, and negative reciprocal) between them. Apart from common parametric distributions such as exponential, Weibull, gamma, and log normal, the further extended family includes Rayleigh, inverse Rayleigh, ammag, inverse ammag, and half-normal distributions. The models are applied, in a Bayesian framework, to time to entry into first marriage among Eritrean men and women based on data from the 2010 Population and Health Survey. The application demonstrates that the further extended family of distributions provides a wide range of alternatives for a baseline distribution in the analysis of survival data. The empirical results reveal that the inverse gamma model fits the data for men best. It also performs nearly as well as the EGG model in the data for women as well as in the combined sample.
Keywords
- Extended generalized gamma (EGG) model
- Parametric survival models
- Accelerated failure-time (AFT) models
- Proportional hazards models
- Bayesian inference
- Parametric inference
- Likelihood ratio test
- Eritrea
- Entry into marriage
- Nuptiality
- Demographic and Health Surveys (DHS)
- Education
- Residence
- Exponential distribution
- Inverse exponential distribution
- Weibull distribution
- Reciprocal Weibull distribution
- Rayleigh distribution
- Inverse Rayleigh distribution
- Standard and generalized gamma distributions
- Inverse gamma distribution
- Ammag distribution
- Inverse Ammag distribution
- Log normal distribution
- Half normal distributions
- Log predictive density score (LPDS)
- Censoring
- Bayes Factors
- Time to event data
- Model comparison
- Markov chain Monte Carlo (MCMC)
- Random walk
- Metropolis–Hastings algorithm
- Block sampling
- Posterior distribution
- Ergodic mean theorem
- Inefficiency factor (IF)
1 Introduction
The usual goals in the analysis of survival data include: (a) describing the distributional shape of the time variable; (b) comparing the survival experiences of different groups in a population; and (c) modeling the relationship between explanatory variables and survival time—as measured by time to the event of interest or the rate at which the event occurs.
Two classes of models are common in the literature for investigating effects of explanatory variables on survival. In the Cox proportional hazards (PH) model, the explanatory variables act multiplicatively on a baseline hazard so that their effect is to increase or decrease the hazard relative to that of the baseline group. A second class of models, known as accelerated failure-time models, specifies that the covariates act multiplicatively on the time to event itself so that their effect is to accelerate or decelerate time to event relative to the event time for the baseline group. According to Wei (1992), the accelerated failure-time model has an intuitive physical interpretation and would be a useful alternative to the Cox PH model in survival analysis.
It has been documented that covariate effects on survival time are not robust to the choice of the baseline distribution; see, for instance, Addison and Portugal (1987), Bergström and Edin (1992), Bergström et al. (1994), and Ghilagaber (2005). It is, therefore, of paramount importance to correctly specify the baseline distribution if results from analysis of survival data are to be utilized optimally. A number of distributions for survival data are available in the literature, scattered across disciplines and application areas. Some previous works have attempted to put these scattered models in a more unified form by embedding a number of competing models under the umbrella of a general parametric framework, as in Butler and McDonald (1986) and Peng et al. (1998). This enables the use of ordinary parametric inference for assessment of each competing model relative to a more comprehensive one. Ghilagaber (2005), among others, shows that five parametric duration models (exponential, Weibull, gamma, log normal, and reciprocal Weibull) may be treated as special cases of a more general extended generalized gamma (EGG) model by constraining the shape and/or scale parameters of the EGG model to some fixed constants.
In this chapter, we extend the EGG model further and increase the family of flexible distributions to include 13 special cases. This is achieved by including distributions that not only constrain the shape and scale parameters to specified constants but also impose some relationships between them. The new set of special cases includes the Rayleigh and inverse Rayleigh distributions as well as the ammag and inverse ammag distributions as described in Cox et al. (2007). Further, a half-normal distribution can be obtained as a special case of the ammag distribution.
A Bayesian approach is used to fit the EGG model and its 13 special cases to data on time to entry into first marriage among Eritrean men and women. Each special case model is then assessed relative to a more general model using the log predictive density score (LPDS); see Li et al. (2010). Compared to the classical likelihood approach, the Bayesian approach provides three main advantages. First, we sample from the posterior density using Markov chain Monte Carlo (MCMC), and hence we can make exact inference for any sample size in parametric survival models of varying complexity. Second, we do not need to worry about being trapped at a local maximum, since the algorithm can explore the whole region of the parameter space supported by the data. Third, it is straightforward to investigate the behavior of the joint posterior density, whereas in a frequentist paradigm we need to run simulations with pre-specified true parameter values when evaluating the performance of maximum likelihood estimates.
In Sect. 2, we introduce the accelerated failure-time models and demonstrate how a number of common distributions can be brought under the umbrella of the EGG model. Bayesian density estimation of the EGG model and MCMC implementation is described in Sect. 3. In Sect. 4, we illustrate the models of Sect. 2 and the methods of Sect. 3 using real-life data from the 2010 Eritrean Population and Health Survey. Section 5 concludes the chapter by way of summary and concluding remarks. A full list of the distributions used in this chapter, a proof for a lemma, and the R code used in the illustrative example are provided in Appendices.
2 Parametric Models for Survival Data
2.1 Background
Survival data contain information on durations until event or censoring \((t_{1}, t_{2}, \ldots , t_{n})\) together with a censoring indicator as well as background variables or covariates \((z_{1}, z_{2}, \ldots , z_{p})\) that are often socio-demographic characteristics of individuals or organizations. The distribution of survival time, T, may be described by its three equivalent functions: the survival function, \(S(t)=P\left (T>t\right )\), the density function, f(t), or the hazard (intensity) function, h(t) = f(t)∕S(t), where the last two functions require absolute continuity.
These functions can vary not only over time, but also among individuals within a population. Thus, one objective in the analysis of survival data is to draw inferences about the influence of covariates on these functions. One popular model is the Cox proportional hazards model presented in Cox (1972) where a p-dimensional vector of covariates z affects the hazard function in a multiplicative manner according to
$$\displaystyle \begin{aligned} h(t\mid \mathbf{z})=h_{0}(t)\exp (\mathbf{z}'\boldsymbol{\beta}), \qquad (1) \end{aligned}$$
where \(h_{0}(t)\) is an unspecified baseline function of time and \(\boldsymbol {\beta } \in R^{p}\) is an unknown vector of parameters representing the effect of the covariates z. The factor \(\exp (\mathbf {z}'\boldsymbol {\beta })\) describes the intensity (hazard) for an individual with vector z relative to that of a standard individual (with z = 0).
2.2 Accelerated Failure-Time Models
A second class of models, the accelerated failure-time model, specifies the covariates to act multiplicatively on the event time itself rather than on the hazard function.
If \(T_{0}\) is the random time to event associated with an individual in the baseline group (z = 0), then the accelerated failure-time model specifies that for an individual with a non-zero vector of covariates z, the event time is given by
$$\displaystyle \begin{aligned} T=e^{\mathbf{z}'\boldsymbol{\beta}}T_{0}, \qquad (2) \end{aligned}$$
or equivalently
$$\displaystyle \begin{aligned} \ln (T)=\mathbf{z}'\boldsymbol{\beta}+\ln (T_{0}), \qquad (3) \end{aligned}$$
where, as before, T is the event time, z is a vector of covariates, and β is a vector of regression parameters. Since covariates alter, by a scale factor, the rate at which an individual traverses the time axis, Eq. (2) is referred to as the accelerated failure-time model. Thus, in accelerated failure-time models, the effect of the explanatory variables is to accelerate or decelerate time to event relative to \(T_{0}\).
The model in (3) is a linear model with \(\ln (T_0)\) playing the role of an error term with an underlying baseline distribution. Usually, a scale parameter δ is allowed in the model to give
$$\displaystyle \begin{aligned} \ln (T)=\mathbf{z}'\boldsymbol{\beta}+\delta \epsilon , \qquad (4) \end{aligned}$$
where a more conventional notation 𝜖 is used for the error term.
From (4), we note that \(T=e^{\mathbf {z}'\boldsymbol {\beta }}T_{0}^{\delta }\). Thus, the survival function of T may be written in terms of that of \(T_{0}\):
$$\displaystyle \begin{aligned} S(t\mid \mathbf{z})=P\left (T_{0}^{\delta }>te^{-\mathbf{z}'\boldsymbol{\beta}}\right )=S_{0}\left (te^{-\mathbf{z}'\boldsymbol{\beta}}\right ), \qquad (5) \end{aligned}$$
where \(S_{0}(\cdot )\) is the survival function of the baseline time with scale parameter δ, \(T_{0}^{\delta }\), and \(e^{-\mathbf {z}'\boldsymbol {\beta }}\) is the accelerating/decelerating factor. In other words, the probability for an individual with covariate vector z surviving beyond time t is the same as the probability for an individual in the baseline group (z = 0) surviving beyond time \(te^{-\mathbf {z}'\boldsymbol {\beta }}\). A positive coefficient β shifts the time \(te^{-\mathbf {z}'\boldsymbol {\beta }}\) to the left of t, while a negative β shifts it to the right of t, if all components of z are positive. Accordingly, the density and hazard functions can also be written in terms of the baseline density and hazard:
$$\displaystyle \begin{aligned} f(t\mid \mathbf{z})=e^{-\mathbf{z}'\boldsymbol{\beta}}f_{0}\left (te^{-\mathbf{z}'\boldsymbol{\beta}}\right ), \qquad h(t\mid \mathbf{z})=e^{-\mathbf{z}'\boldsymbol{\beta}}h_{0}\left (te^{-\mathbf{z}'\boldsymbol{\beta}}\right ). \end{aligned}$$
The distribution of \(T_{0}\) in (4) may be selected from positive-valued distributions such as Weibull or log normal that, in turn, yield extreme-value and normal distributions for the error term 𝜖. Below, we demonstrate how the list may be expanded by assembling various models under the same umbrella.
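The survival relation of the accelerated failure-time model can be checked numerically. The following is a minimal sketch, assuming a Weibull-type baseline survival function and a hypothetical covariate vector and coefficient vector chosen only for illustration; none of these values come from the chapter.

```python
import math

def weibull_baseline_survival(t, shape=1.5):
    """Baseline survival S0(t) = exp(-t**shape); the Weibull form is an illustrative choice."""
    return math.exp(-(t ** shape))

def aft_survival(t, z, beta, baseline=weibull_baseline_survival):
    """S(t | z) = S0(t * exp(-z'beta)) under the accelerated failure-time model."""
    linear_predictor = sum(zj * bj for zj, bj in zip(z, beta))
    return baseline(t * math.exp(-linear_predictor))

# Hypothetical covariates and coefficients (assumptions, not estimates from the chapter).
z = [1.0, 0.0]
beta = [0.4, -0.2]

t = 1.2
s_cov = aft_survival(t, z, beta)                              # individual with covariates z
s_base = aft_survival(t * math.exp(-0.4), [0.0, 0.0], beta)   # baseline at the rescaled time
print(s_cov, s_base)
```

Surviving beyond t with covariates z has the same probability as a baseline individual surviving beyond the rescaled time, and a positive coefficient lengthens survival in this sketch.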
2.3 The Extended Generalized Gamma (EGG) Model
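Stacy (1962) introduced the generalized gamma model that is useful in embedding competing models into a single parametric framework. This model is the distribution of T such that \(\ln (T)=\mu +\delta \epsilon \), where μ ∈ R, δ > 0, and the random error term 𝜖 has the density
$$\displaystyle \begin{aligned} f(\epsilon )=\frac{\exp \left (k\epsilon -e^{\epsilon }\right )}{\Gamma (k)}, \qquad -\infty <\epsilon <\infty , \end{aligned}$$
where k > 0 is an additional shape parameter. Prentice (1974) showed that a reparameterization of the form \(q=k^{-\frac {1}{2}}\) leads to a standard normal distribution for the error term 𝜖 at q = 0, which thereby becomes an interior point of the parameter space. The final model with parameters μ, q ∈ R and δ > 0 can be written as \(\ln (T)=\mu +\delta \epsilon \), where the error density function f(q, 𝜖) is given by
$$\displaystyle \begin{aligned} f(q,\epsilon )=\begin{cases} \dfrac{|q|\,(q^{-2})^{q^{-2}}}{\Gamma (q^{-2})}\exp \left \{q^{-2}\left [q\epsilon -\exp (q\epsilon )\right ]\right \}, & q\neq 0,\\ \dfrac{1}{\sqrt{2\pi }}\exp \left (-\dfrac{\epsilon ^{2}}{2}\right ), & q=0. \end{cases} \qquad (6) \end{aligned}$$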
The distribution of T when the error term has the density given in Eq. (6) is known as the extended generalized gamma (EGG) distribution, see, for instance (Ghilagaber, 2005; Ghilagaber et al., 2014).
As can be seen from the lower part of (6), the EGG model reduces to the standard normal distribution for 𝜖 when the shape parameter q is equal to zero. Accordingly, T will have a log-normal distribution. When the shape parameter q = 1, (6) reduces to
$$\displaystyle \begin{aligned} f(1,\epsilon )=\exp \left (\epsilon -e^{\epsilon }\right ), \end{aligned}$$
which is the standard (type 1) extreme-value distribution. As \(\ln \left ( T\right ) \) is a linear function of 𝜖, it has the same (extreme-value) distribution as 𝜖. Hence, \(T=\exp (\mathbf {z}'\boldsymbol {\beta } +\delta \epsilon )\) as defined in Eq. (4) will have a Weibull distribution. If q = 1 and δ = 1, then T has the exponential distribution as a special case of the Weibull distribution. The case of q = −1 corresponds to extreme maximum-value distribution for \(\ln \left ( T\right )\). This, in turn, corresponds to reciprocal Weibull distribution for T.
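The two limiting cases just described can be verified numerically. The sketch below assumes the Prentice (1974) form of the error density, \(f(q,\epsilon )=|q|(q^{-2})^{q^{-2}}\exp \{q^{-2}[q\epsilon -\exp (q\epsilon )]\}/\Gamma (q^{-2})\) for q ≠ 0; the function names are illustrative and not taken from the chapter's R code.

```python
import math

def egg_error_log_density(eps, q):
    """Log density of the EGG error term in the Prentice (1974) parameterization (assumed form)."""
    if q == 0.0:
        return -0.5 * math.log(2.0 * math.pi) - 0.5 * eps * eps
    k = q ** -2  # shape of the underlying gamma variable
    return (math.log(abs(q)) + k * math.log(k)
            + k * (q * eps - math.exp(q * eps)) - math.lgamma(k))

def extreme_value_log_density(eps):
    """Log density of the standard (minimum) extreme-value distribution."""
    return eps - math.exp(eps)

def normal_log_density(eps):
    """Log density of the standard normal distribution."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * eps * eps

for eps in (-1.0, 0.0, 0.5):
    gap_at_one = egg_error_log_density(eps, 1.0) - extreme_value_log_density(eps)
    gap_near_zero = egg_error_log_density(eps, 0.01) - normal_log_density(eps)
    print(eps, gap_at_one, gap_near_zero)
```

The first gap vanishes up to rounding, and the second is small, which illustrates how the normal case is approached continuously as q tends to zero.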
The case of δ = 1 and q > 0 is also of interest. Farewell and Prentice (1977) argue that this gives the ordinary gamma distribution for T. Others (Bergström and Edin, 1992; Bergström et al., 1994, 1997) argue that this did not hold in their case illustrations. Consequently, we shall relax this special case to δ = 1 and q ∈ R and label it the “gamma” distribution in our illustrative example. Below, we further extend the above family of distributions by imposing some relationships between the scale and shape parameters.
2.4 Further Extensions of the EGG Model
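We begin with a baseline distribution for time to event, \(T_{0}\thicksim EGG(0,1,q)\), and label it the standard generalized gamma distribution, with density and survival functions given by
$$\displaystyle \begin{aligned} f(t_{0})=\begin{cases} \dfrac{|q|}{t_{0}\,\Gamma (q^{-2})}\left (q^{-2}t_{0}^{q}\right )^{q^{-2}}\exp \left (-q^{-2}t_{0}^{q}\right ), & q\neq 0,\\ \dfrac{1}{t_{0}\sqrt{2\pi }}\exp \left [-\dfrac{(\ln t_{0})^{2}}{2}\right ], & q=0, \end{cases} \qquad S(t_{0})=\begin{cases} 1-\dfrac{\gamma \left (q^{-2},q^{-2}t_{0}^{q}\right )}{\Gamma (q^{-2})}, & q>0,\\ 1-\Phi (\ln t_{0}), & q=0,\\ \dfrac{\gamma \left (q^{-2},q^{-2}t_{0}^{q}\right )}{\Gamma (q^{-2})}, & q<0, \end{cases} \end{aligned}$$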
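where Φ(⋅) is the cumulative distribution function of the standard normal distribution. By transformation, \(T=e^{\mu }T_{0}^{\delta }\thicksim EGG(\mu ,\delta ,q)\), and T is said to have the extended generalized gamma distribution with location parameter μ ∈ R, scale parameter δ > 0, and an additional index shape parameter q ∈ R. We denote this by \( T\thicksim EGG(\mu ,\delta ,q)\), with density
$$\displaystyle \begin{aligned} f(t)=\begin{cases} \dfrac{|q|}{\delta }\,t^{\frac{q}{\delta }-1}\,\dfrac{\left [q^{-2}(e^{-\mu })^{\frac{q}{\delta }}\right ]^{q^{-2}}}{\Gamma (q^{-2})}\left (t^{\frac{q}{\delta }}\right )^{q^{-2}-1}\exp \left [-q^{-2}(e^{-\mu })^{\frac{q}{\delta }}t^{\frac{q}{\delta }}\right ], & q\neq 0,\\ \dfrac{1}{\delta t\sqrt{2\pi }}\exp \left [-\dfrac{(\ln t-\mu )^{2}}{2\delta ^{2}}\right ], & q=0. \end{cases} \qquad (7) \end{aligned}$$
The component
$$\displaystyle \begin{aligned} \frac{\left [q^{-2}(e^{-\mu })^{\frac{q}{\delta }}\right ]^{q^{-2}}}{\Gamma (q^{-2})}\left (t^{\frac{q}{\delta }}\right )^{q^{-2}-1}\exp \left [-q^{-2}(e^{-\mu })^{\frac{q}{\delta }}t^{\frac{q}{\delta }}\right ] \end{aligned}$$
in the above equation is the density of the gamma distribution for \( t^{\frac {q}{ \delta }}\) with a shape parameter \(q^{-2}\) and a rate parameter \({q}^{-2}({e}^{-\mu })^{\frac {q}{\delta }}.\) The next lemma gives the rth moment and the first four central moments of the EGG density. The following definitions of skewness and excess kurtosis are used:
$$\displaystyle \begin{aligned} S(T)=\frac{E\left [\left (T-E(T)\right )^{3}\right ]}{V(T)^{3/2}}, \qquad K(T)=\frac{E\left [\left (T-E(T)\right )^{4}\right ]}{V(T)^{2}}-3, \end{aligned}$$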
where V (T) is the variance.
Lemma 1
If \(T\thicksim EGG(\mu ,\delta ,q)\) , then
A simplified proof of Lemma 1 is provided in Appendix 2.
From Lemma 1, we note that S(T) and K(T) are functions of q and δ∕q, implying that both q and δ∕q are shape parameters.
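The survival function of T is then given by
$$\displaystyle \begin{aligned} S(t)=\begin{cases} 1-\dfrac{\gamma \left [q^{-2},t^{\frac{q}{\delta }}q^{-2}(e^{-\mu })^{\frac{q}{\delta }}\right ]}{\Gamma (q^{-2})}, & q>0,\\ 1-\Phi \left (\dfrac{\ln t-\mu }{\delta }\right ), & q=0,\\ \dfrac{\gamma \left [q^{-2},t^{\frac{q}{\delta }}q^{-2}(e^{-\mu })^{\frac{q}{\delta }}\right ]}{\Gamma (q^{-2})}, & q<0, \end{cases} \qquad (8) \end{aligned}$$
where \(\gamma \left [q^{-2},t^{\frac {q}{\delta }}{q}^{-2}( {e}^{-\mu })^{\frac {q}{\delta }}\right ]/\Gamma (q^{-2})\) is the corresponding cumulative distribution function of the gamma distribution for \(t^{\frac {q}{ \delta }}\) when q > 0, and \(\gamma \left [ q^{-2},t^{\frac {q}{\delta }}{q} ^{-2}({e}^{-\mu })^{\frac {q}{ \delta }}\right ]\) is a lower incomplete gamma function of the form \( \gamma (s,r)=\int \nolimits _{0}^{r}x^{s-1}e^{-x}dx\) described in Abramowitz and Stegun (1964).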
The EGG model redefined in Eqs. (7) and (8) is a rich and versatile model containing many special cases based on different combinations of q and δ.
Apart from those mentioned in the previous subsection, the list may be extended to include the inverse exponential (q = −δ = −1), standard gamma (q = δ), inverse gamma (when q = −δ), ammag (q = 1∕δ), inverse ammag (q = −1∕δ), Rayleigh (q = 1 and δ = 1∕2), inverse Rayleigh (q = −1 and δ = 1∕2), and half-normal (\(q=\sqrt { 2}\) and \(\delta =\sqrt {2}/2\)).
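Two of the listed special cases can be verified directly from the EGG density. The sketch below assumes that the density of T is obtained from the Prentice-form error density through \(\epsilon =(\ln t-\mu )/\delta \); the value of μ is hypothetical, and the Rayleigh case is written as the distribution of T when \(T^{2}\) is exponential with rate \(e^{-2\mu }\).

```python
import math

def egg_density(t, mu, delta, q):
    """Density of T ~ EGG(mu, delta, q), written via the error term eps = (ln t - mu) / delta."""
    eps = (math.log(t) - mu) / delta
    if q == 0.0:
        log_f_eps = -0.5 * math.log(2.0 * math.pi) - 0.5 * eps * eps
    else:
        k = q ** -2
        log_f_eps = (math.log(abs(q)) + k * math.log(k)
                     + k * (q * eps - math.exp(q * eps)) - math.lgamma(k))
    # The Jacobian of eps with respect to t is 1 / (delta * t).
    return math.exp(log_f_eps) / (delta * t)

def exponential_density(t, mu):
    """Exponential density with rate exp(-mu): the case q = 1, delta = 1."""
    rate = math.exp(-mu)
    return rate * math.exp(-rate * t)

def rayleigh_density(t, mu):
    """Rayleigh-type density with T**2 exponential at rate exp(-2 * mu): the case q = 1, delta = 1/2."""
    rate = math.exp(-2.0 * mu)
    return 2.0 * t * rate * math.exp(-rate * t * t)

mu = 0.3  # hypothetical location value
for t in (0.5, 1.0, 2.5):
    print(t,
          egg_density(t, mu, 1.0, 1.0) - exponential_density(t, mu),
          egg_density(t, mu, 0.5, 1.0) - rayleigh_density(t, mu))
```

Both differences are zero up to rounding, so fixing (q, δ) at (1, 1) or (1, 1∕2) recovers the exponential and Rayleigh densities under these assumptions.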
EGG nests more special cases such as Maxwell–Boltzmann, but we have not included this in the present chapter since our focus is on the distribution of survival time T. Further, the equivalent distributions of some special cases are excluded. For instance, the inverse gamma model is equivalent to the Levy model in some special cases: inverse gamma\((q^{-2}, q^{-2}e^{-\mu })\) ↔ Levy(0, c) when \(q^{-2} = 1/2\) and \(q^{-2}e^{-\mu } = 2c\). The standard gamma model is also equivalent to a chi-squared model in some situations: standard gamma\((q^{-2}, q^{-2}e^{-\mu })\) ↔ \(\chi ^2_{(v)}\) when \(q^{-2} = v/2\) and \(q^{-2}e^{-\mu } = 1/2\).
To sum up, the EGG model comprises at least 13 special cases whose relationships are depicted in Fig. 1. Each special case model can be assessed relative to a more comprehensive one using appropriate procedures for comparing nested models. A summary of the density functions, f(t), and survival functions, S(t), for the 13 special cases is provided in Appendix 1. The corresponding hazard functions can be obtained by \(h_{EGG}(t) = f_{EGG}(t)/S_{EGG}(t)\). The hazards in the EGG models can be of various forms: increasing, decreasing, bathtub, or arc-shaped (Cox et al., 2007).
When we adapt the generalized gamma distribution to accelerated failure-time models, the location parameter μ can be composed of a linear predictor based on p covariates, \( \mu =\beta _{0}+\sum \limits _{j=1}^{p}z_{j}\beta _{j}\), which justifies the feasibility of the EGG in accelerated failure-time models.
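The distribution of \(\epsilon =\ln (T_{0})\) is given in Eq. (6). When q = 0, 𝜖 is standard normal distributed; when q ≠ 0, it can be manipulated to give
$$\displaystyle \begin{aligned} f(\epsilon ,q)=\frac{|q|}{\Gamma (q^{-2})}\left (q^{-2}e^{q\epsilon }\right )^{q^{-2}}\exp \left (-q^{-2}e^{q\epsilon }\right ), \qquad (9) \end{aligned}$$
with the corresponding survival functions
$$\displaystyle \begin{aligned} S(\epsilon ,q)=\begin{cases} 1-\dfrac{\gamma \left (q^{-2},q^{-2}e^{q\epsilon }\right )}{\Gamma (q^{-2})}, & q>0,\\ \dfrac{\gamma \left (q^{-2},q^{-2}e^{q\epsilon }\right )}{\Gamma (q^{-2})}, & q<0. \end{cases} \qquad (10) \end{aligned}$$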
Based on the density of 𝜖, Fig. 2 shows the shape of some density functions, f(𝜖), for some selected values of q. Here, we have a special case of \(\ln (T)=\mathbf {z}'\boldsymbol {\beta } +\delta \epsilon =\mu +\delta \epsilon \) in which μ = 0 and δ = 1. We note that the densities are positively skewed for q < 0 and negatively skewed for q > 0, with both the absolute skewness and kurtosis monotone increasing in |q|, which is in accordance with Prentice (1974).
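The sign pattern of the skewness can be reproduced by numerical integration. The sketch below again assumes the Prentice (1974) form of the error density for q ≠ 0 and uses a plain Riemann sum on a fixed grid; the grid limits and step count are arbitrary choices made for illustration.

```python
import math

def egg_error_density(eps, q):
    """EGG error density for q != 0 in the Prentice (1974) form (assumed)."""
    k = q ** -2
    return math.exp(math.log(abs(q)) + k * math.log(k)
                    + k * (q * eps - math.exp(q * eps)) - math.lgamma(k))

def skewness(q, lower=-40.0, upper=40.0, steps=40000):
    """Skewness of eps, computed with a simple Riemann sum on a fixed grid."""
    h = (upper - lower) / steps
    grid = [lower + i * h for i in range(steps + 1)]
    weights = [egg_error_density(e, q) * h for e in grid]
    mean = sum(w * e for w, e in zip(weights, grid))
    var = sum(w * (e - mean) ** 2 for w, e in zip(weights, grid))
    third = sum(w * (e - mean) ** 3 for w, e in zip(weights, grid))
    return third / var ** 1.5

for q in (-1.0, -0.5, 0.5, 1.0):
    print(q, round(skewness(q), 3))
```

For q = 1 the computed value agrees with the known skewness of the minimum extreme-value distribution (about −1.14), and the sign flips with the sign of q.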
3 Bayesian Inference in the Extended Generalized Gamma Model
Bayesian inference for a three-parameter EGG model and a four-parameter generalized gamma distribution (EGG model with one extra location parameter) is discussed in Tsionas (2001) and Van Noortwijk (2001) for situations where there is no censoring. Inference becomes more complicated in the presence of censored observations due to, for instance, the difficulty of finding a conjugate prior or deriving the full conditional posteriors.
Heleno and Alberto (1986) have used a Bayesian approach for the EGG model with censored data using the Jeffreys multi-parameter prior. Ramos et al. (2017) have shown that both the Jeffreys prior and the reference priors give improper posteriors for the EGG model, and then proposed the overall reference prior in Berger et al. (2015), which provided a proper posterior. In this section, we present Bayesian inference in the EGG model that allows for any type of censoring mechanism.
3.1 Prior and Posterior
In a Bayesian framework, any prior information about the parameters of interest is combined with the data (likelihood) to derive a posterior distribution.
In our present case, we use normal priors with mean 0 and large variance \(\sigma _1^{2}\) for each of the effect parameters \(\beta _{j}\) (j = 0, ⋯ , p). We also assume a vague prior, a gamma distribution with hyperparameters a and b, for the scale parameter δ. For the index shape parameter q, a normal prior with mean 0 and large variance \(\sigma _2^{2}\) is assumed. These independent priors can be summarized as follows:
$$\displaystyle \begin{aligned} \beta _{j}\thicksim N(0,\sigma _{1}^{2}),\quad j=0,\cdots ,p, \qquad \delta \thicksim \text{Gamma}(a,b), \qquad q\thicksim N(0,\sigma _{2}^{2}). \end{aligned}$$
We can use any prior that reflects our prior knowledge (if any) of the unknown parameters. In our illustration in Sect. 4, we will use \(\sigma _{1} = \sigma _{2} = 1000\) and hyperparameters a = b = 1. The rationale behind this is to let the likelihood dominate the posterior so that the inferences drawn are driven by the data.
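Denoting data with \(\mathcal {D}\), the joint posterior distribution is then given by
$$\displaystyle \begin{aligned} f(\boldsymbol{\beta},\delta ,q\mid \mathcal{D})\propto L(\boldsymbol{\beta},\delta ,q;\mathcal{D})\,\prod _{j=0}^{p}f(\beta _{j})\,f(\delta )\,f(q), \end{aligned}$$
where \(L(\boldsymbol {\beta }, \delta , q;\mathcal {D})\) is the likelihood function, and f(⋅) is the prior density function of \(\beta _{j}\), δ, and q with known hyperparameters. The above posterior can be generalized to other types of likelihood functions based on other censoring mechanisms (than the standard right censoring assumed in our present case). With right censored data, the likelihood function becomes
$$\displaystyle \begin{aligned} L(\boldsymbol{\beta},\delta ,q;\mathcal{D})=\prod _{i=1}^{n}\left [\frac{1}{\delta t_{i}}f(\epsilon _{i},q)\right ]^{d_{i}}\left [S(\epsilon _{i},q)\right ]^{1-d_{i}}, \qquad \epsilon _{i}=\frac{\ln (t_{i})-\mathbf{z}_{i}'\boldsymbol{\beta}}{\delta }, \end{aligned}$$
where \(d_{i}\) is the censoring indicator and \(f(\epsilon _{i}, q)\) and \(S(\epsilon _{i}, q)\) are given by Eqs. (9) and (10), respectively.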
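The structure of a right-censored likelihood can be sketched in a few lines. To stay self-contained, the example below is restricted to the log-normal special case (q = 0), where the survival function only needs the normal distribution; it assumes that an indicator value of 1 marks an observed event, and the toy data and parameter values are hypothetical.

```python
import math

def normal_log_pdf(x):
    """Log density of the standard normal distribution."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * x * x

def normal_log_sf(x):
    """Log of the standard normal survival function 1 - Phi(x)."""
    return math.log(0.5 * math.erfc(x / math.sqrt(2.0)))

def censored_loglik(times, events, covariates, beta, delta):
    """Right-censored AFT log-likelihood for the log-normal special case (q = 0)."""
    total = 0.0
    for t, d, z in zip(times, events, covariates):
        mu = sum(zj * bj for zj, bj in zip(z, beta))
        eps = (math.log(t) - mu) / delta
        if d == 1:   # event observed: log density of T at t
            total += normal_log_pdf(eps) - math.log(delta * t)
        else:        # right censored: log probability of surviving beyond t
            total += normal_log_sf(eps)
    return total

# Hypothetical toy data: age at marriage or at interview, an intercept, and one dummy covariate.
times = [18.0, 22.0, 25.0, 30.0, 27.0]
events = [1, 1, 0, 1, 0]
covariates = [[1.0, 0.0], [1.0, 1.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0]]

ll_plausible = censored_loglik(times, events, covariates, beta=[3.1, 0.1], delta=0.3)
ll_implausible = censored_loglik(times, events, covariates, beta=[0.0, 0.0], delta=0.3)
print(ll_plausible, ll_implausible)
```

Extending the sketch to q ≠ 0 would replace the two normal helper functions with the EGG error density and a regularized incomplete gamma function.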
Since there is no explicit analytical form for the posterior distribution, sampling is performed using numerical methods based on Markov Chain Monte Carlo (MCMC).
3.2 MCMC: Random Walk Metropolis–Hastings Algorithm with Block Sampling
We sample all parameters sequentially from the joint posterior distribution using the Metropolis–Hastings algorithm. See Gelman et al. (2004) for more details on the Metropolis–Hastings algorithm and its nice properties. A random walk Metropolis–Hastings algorithm with block sampling is used, and the sampling procedure for the parameters θ = (β, δ, q)′ can be summarized as follows:
(1) Set the initial values for the parameters \(\boldsymbol {\theta }_{0} = (\boldsymbol {\beta }_{0}, \delta _{0}, q_{0})'\).
(2) Construct the proposal distribution \(J(\boldsymbol {\theta }_{p}|\boldsymbol {\theta }_{c}) \thicksim N(\boldsymbol {\theta }_{c}, c^{2}\boldsymbol {\Sigma })\), where \(\boldsymbol {\theta }_{p}\) is the candidate value, \(\boldsymbol {\theta }_{c}\) is the current value, c is the scaling constant, and Σ is a known covariance matrix. Here we choose \(\boldsymbol {\Sigma }=-H^{-1}(\hat {\boldsymbol {\theta }})\), where \(H(\hat {\boldsymbol {\theta }})\) is the Hessian matrix evaluated at \(\hat {\boldsymbol { \theta }}\), which is obtained by Newton’s method. Following Gelman et al. (2004), we choose a value of \(c=2.4/\sqrt {k}\), where k is the length of the vector θ.
(3) Generate \(\boldsymbol {\theta }^{*}\) from \(J(\boldsymbol {\theta }_{p}|\boldsymbol {\theta }_{c})\) and U from U(0, 1).
(4) If
$$\displaystyle \begin{aligned} U<\frac{f(\boldsymbol{\theta}^{*}|\mathcal{D})f(\boldsymbol{\theta}^{*})J(\boldsymbol{\theta}_{c}|\boldsymbol{\theta}_{p})}{f(\boldsymbol{\theta} _{c}|\mathcal{D})f(\boldsymbol{\theta}_{c})J(\boldsymbol{\theta}_{p}| \boldsymbol{\theta}_{c})}, \end{aligned}$$
the candidate vector \(\boldsymbol {\theta }^{*}\) is accepted and \(\boldsymbol {\theta }_{c} = \boldsymbol {\theta }^{*}\); otherwise, we keep \(\boldsymbol {\theta }_{c}\).
(5) Return to step (2).
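The steps above can be sketched as a short sampler. This is a simplified illustration rather than the chapter's implementation: it uses a diagonal proposal covariance instead of \(c^{2}\boldsymbol {\Sigma }\) with \(\boldsymbol {\Sigma }=-H^{-1}(\hat {\boldsymbol {\theta }})\), works on the log scale, and targets a hypothetical two-parameter log posterior chosen only so that the result can be checked.

```python
import math
import random

def random_walk_metropolis(log_post, start, step_sd, n_iter, seed=0):
    """Random walk Metropolis sampler with a Gaussian proposal; all parameters move in one block."""
    rng = random.Random(seed)
    current = list(start)
    current_lp = log_post(current)
    draws, accepted = [], 0
    for _ in range(n_iter):
        proposal = [c + rng.gauss(0.0, s) for c, s in zip(current, step_sd)]
        proposal_lp = log_post(proposal)
        # The proposal is symmetric, so the J terms cancel in the acceptance ratio.
        log_u = math.log(1.0 - rng.random())  # 1 - U lies in (0, 1], so the log is defined
        if log_u < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        draws.append(list(current))
    return draws, accepted / n_iter

def toy_log_post(theta):
    """Hypothetical target: independent N(0, 1) and N(1, 0.5**2) components."""
    x, y = theta
    return -0.5 * x * x - 0.5 * ((y - 1.0) / 0.5) ** 2

scale = 2.4 / math.sqrt(2)            # the 2.4 / sqrt(k) rule with k = 2 parameters
step_sd = [scale * 1.0, scale * 0.5]  # scaled by the (known) target standard deviations
draws, rate = random_walk_metropolis(toy_log_post, [0.0, 0.0], step_sd, 20000)
kept = draws[2000:]                   # discard a burn-in period
mean_x = sum(d[0] for d in kept) / len(kept)
mean_y = sum(d[1] for d in kept) / len(kept)
print(rate, mean_x, mean_y)
```

For the EGG model, `log_post` would return the log-likelihood plus the log priors, and return negative infinity whenever δ ≤ 0 so that such proposals are always rejected.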
3.3 Posterior Statistics and Convergence Diagnostics
We summarize our posterior distribution by way of posterior means and highest posterior density (hpd) intervals. Since MCMC is based on the ergodic mean theorem (the Markov chain law of large numbers), convergence can be assessed using diagnostic plots such as a plot of the cumulative mean against the number of iterations. In addition, inefficiency factors (IF) can be computed as a measure of the efficiency of the MCMC scheme.
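Both diagnostics are simple to compute from a stored chain. The sketch below assumes a common definition of the inefficiency factor, one plus twice the sum of the autocorrelations up to a truncation lag; the truncation at 50 lags and the simulated chains are illustrative assumptions.

```python
import random

def cumulative_mean(chain):
    """Running mean of the draws, as used in a convergence plot."""
    total, out = 0.0, []
    for i, x in enumerate(chain, start=1):
        total += x
        out.append(total / i)
    return out

def inefficiency_factor(chain, max_lag=50):
    """IF = 1 + 2 * (sum of sample autocorrelations up to max_lag); an assumed, common definition."""
    n = len(chain)
    mean = sum(chain) / n
    centered = [x - mean for x in chain]
    var = sum(c * c for c in centered) / n
    acc = 0.0
    for lag in range(1, max_lag + 1):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        acc += cov / var
    return 1.0 + 2.0 * acc

rng = random.Random(1)
iid_chain = [rng.gauss(0.0, 1.0) for _ in range(20000)]  # independent draws
ar_chain = [0.0]                                         # strongly autocorrelated draws
for _ in range(19999):
    ar_chain.append(0.8 * ar_chain[-1] + rng.gauss(0.0, 1.0))

if_iid = inefficiency_factor(iid_chain)
if_ar = inefficiency_factor(ar_chain)
print(if_iid, if_ar)
```

An inefficiency factor near one indicates draws that are nearly as informative as independent samples, while larger values indicate how many correlated draws are needed to match one independent draw.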
3.4 Bayesian Model Comparisons
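The common way of comparing models in the Bayesian framework is to use the Bayes factor, which is the ratio of the marginal likelihoods of two competing models.
Suppose we have a set of candidate models \(\mathcal {M}_{m}, m=1,\cdots ,M\) and the corresponding model parameters \(\boldsymbol {\theta }_{m}\). The posterior model probability is then given by
$$\displaystyle \begin{aligned} P(\mathcal{M}_{m}|Y)=\frac{P(Y|\mathcal{M}_{m})P(\mathcal{M}_{m})}{\sum _{j=1}^{M}P(Y|\mathcal{M}_{j})P(\mathcal{M}_{j})}, \end{aligned}$$
where Y represents the data at hand. The posterior odds \(P(\mathcal {M} _m|Y)/P(\mathcal {M}_l|Y)\) can be used to compare two models, and it can be written in terms of the Bayes factor:
$$\displaystyle \begin{aligned} \frac{P(\mathcal{M}_{m}|Y)}{P(\mathcal{M}_{l}|Y)}=BF_{ml}\times \frac{P(\mathcal{M}_{m})}{P(\mathcal{M}_{l})}, \end{aligned}$$
where \(BF_{ml}\) is the Bayes factor between \(\mathcal {M}_m\) and \( \mathcal {M}_l\) with the form
$$\displaystyle \begin{aligned} BF_{ml}=\frac{P(Y|\mathcal{M}_{m})}{P(Y|\mathcal{M}_{l})}. \end{aligned}$$
The marginal likelihood is the expectation of the likelihood with respect to the prior:
$$\displaystyle \begin{aligned} P(Y|\mathcal{M}_{m})=\int P(Y|\boldsymbol{\theta}_{m},\mathcal{M}_{m})P(\boldsymbol{\theta}_{m}|\mathcal{M}_{m})\,d\boldsymbol{\theta}_{m}. \end{aligned}$$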
It is sensitive to the choice of the prior, especially when the prior is not very informative (Villani et al., 2009). For instance, if \(P(\boldsymbol {\theta _m}|\mathcal {M}_m)\) is far from \(P(Y|\boldsymbol {\theta _{m}},\mathcal {M}_m)\), while \(P( \boldsymbol {\theta _{l}}|\mathcal {M}_l)\) is close to \(P(Y|\boldsymbol { \theta _{l}},\mathcal {M}_l)\), it is possible that \(P(Y|\mathcal { M}_{m})\) is less than \(P(Y|\mathcal {M}_{l})\) even though \(\mathcal {M}_l\) is a sub-model of \(\mathcal {M}_m\).
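The link between marginal likelihoods, Bayes factors, and posterior model probabilities can be illustrated with a small helper that works on the log scale. The log marginal likelihood values below are hypothetical placeholders, not results from the chapter, and equal prior model probabilities are assumed by default.

```python
import math

def posterior_model_probs(log_marginals, prior_probs=None):
    """Posterior model probabilities from log marginal likelihoods, using a log-sum-exp shift."""
    m = len(log_marginals)
    if prior_probs is None:
        prior_probs = [1.0 / m] * m  # equal prior model probabilities (assumption)
    log_terms = [lm + math.log(p) for lm, p in zip(log_marginals, prior_probs)]
    shift = max(log_terms)           # subtracting the maximum avoids numerical underflow
    weights = [math.exp(v - shift) for v in log_terms]
    total = sum(weights)
    return [w / total for w in weights]

def log_bayes_factor(log_marginal_m, log_marginal_l):
    """Log Bayes factor of model m against model l."""
    return log_marginal_m - log_marginal_l

log_marginals = [-1000.0, -1002.0]   # hypothetical values for two competing models
probs = posterior_model_probs(log_marginals)
print(probs, math.exp(log_bayes_factor(*log_marginals)))
```

With equal prior model probabilities, the posterior odds returned here coincide with the Bayes factor.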
To avoid such sensitivity to the choice of prior, we compare our models in the illustration on the basis of their predictive performance. The data are split randomly into B folds; B − 1 folds are used as training data \( \tilde {y}_{-b}\), while the remaining fold is used as test data \(\tilde {y}_b\). The B-fold cross-validation of the log predictive density score (LPDS) is then formed as
$$\displaystyle \begin{aligned} LPDS=\frac{1}{B}\sum _{b=1}^{B}\ln p(\tilde{y}_{b}|\tilde{y}_{-b}). \end{aligned}$$
In other words, part of the observations are used to update the flat (non-informative) prior and the sensitivity to the prior can be reduced substantially. According to Villani et al. (2009), the Bayes factor is roughly B times more discriminatory than the LPDS. For selecting models in Sect. 4, the LPDS was computed using B = 5 folds of the data.
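The mechanics of a B-fold LPDS can be shown with a toy model in which the predictive density is available in closed form. The sketch below assumes uncensored exponential data with a Gamma(a, b) prior on the rate, and it interleaves the observations into folds deterministically rather than splitting at random; for the EGG model the predictive density would instead be approximated from the MCMC draws.

```python
import math

def log_joint_predictive(test, train, a=1.0, b=1.0):
    """Log posterior predictive density of the test fold: exponential data, Gamma(a, b) prior on the rate."""
    a_post = a + len(train)   # posterior shape after seeing the training data
    b_post = b + sum(train)   # posterior rate after seeing the training data
    m = len(test)
    return (math.lgamma(a_post + m) - math.lgamma(a_post)
            + a_post * math.log(b_post)
            - (a_post + m) * math.log(b_post + sum(test)))

def lpds(data, n_folds=5):
    """Average log predictive density over the folds, each predicted from the remaining folds."""
    folds = [data[i::n_folds] for i in range(n_folds)]  # deterministic interleaved split
    scores = []
    for b in range(n_folds):
        train = [y for j, fold in enumerate(folds) if j != b for y in fold]
        scores.append(log_joint_predictive(folds[b], train))
    return sum(scores) / n_folds

data = [0.5 + 0.25 * i for i in range(20)]  # hypothetical durations
score = lpds(data)
print(score)
```

Candidate models are then ranked by their scores, with a higher LPDS indicating better out-of-sample predictive performance.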
4 Application: Educational and Residential Differences in Marriage Timing Among Eritrean Men and Women
We now illustrate the models and methods described in the previous sections with real-life data—entry into marriage among Eritrean men and women based on its 2010 Population and Health Survey (EPHS2010).
The main goals with the illustration are to study the distributional shapes of the times to marriage, model the effects of covariates on these event times, and examine if inferences regarding covariate effects are robust to the choice of distributional shape.
The study of marriage timing (age at marriage) is also of substantive interest in its own right because of its strong negative association with women’s health, directly (Raj, 2010) or indirectly through its negative impact on health care utilization (Godha et al., 2016).
4.1 Data and Variables
The data used for illustration in this chapter come from the 2010 Eritrea Population and Health Survey, EPHS2010 (National-Statistics-Office-Eritrea and Fafo-AIS, 2013). The EPHS2010 was designed as a follow-up to its predecessors—the 1995 and 2002 Demographic and Health Surveys (National-Statistics-Office-Eritrea and Macro-International-Inc., 1997, 2003), and to update the information from the previous surveys as well as provide findings on some new topics of interest.
The EPHS2010 was conducted between January and July 2010 and gathered information from 30224 women aged 15–49 and 5021 men aged 15–59. For the purpose of this paper, only respondents with known values on marital status at the time of the survey are used in the analyses. This resulted in 10238 usable records for women and all 5021 records for men. Detailed tabulations for the entire survey may be found in the EPHS2010 Final Report (National-Statistics-Office-Eritrea and Fafo-AIS, 2013). Summary statistics for the subset of data used in the present chapter are shown in Table 1.
By the survey time (January–July 2010), 7421 of the 10238 women (72 %) and 2569 of the 5021 men (51 %) have responded they were ever married (this includes those who might have been separated or widowed after). The rest, 2817 women and 2452 men (28 % and 49 %, respectively), have responded that they were still single at the time of interview. The distribution of the women across educational level shows that 4186 (41 %) had no education at all, 2055 (20 %) had primary-level education, 1827 (18 %) had middle-level education, 1894 (18 %) had secondary-level education, while the rest 276 (3 %) had post-secondary education. The corresponding figures for men are 1051 (21 %), 803 (16 %), 1209 (24 %), 1516 (30 %), and 442 (9 %), respectively. Further, 1819 (18 %) of the women respondents were from the capital (Asmara), 2504 (24 %) were from other towns, while the majority 5915 (58 %) were from rural areas. The corresponding figures for men are 931 (19 %), 1257 (25 %), and 2833 (56 %), respectively.
The columns of percentage married in Table 1 reveal clear differentials across both educational levels and residence for both women and men. For instance, while women with no education constitute 41 % of the entire sample, they constitute 51 % of the marriages (3799 of 7421). Women with post-secondary education, on the other hand, constitute only 1 % of the marriages (96 of 7421). The pattern is similar but less dramatic for men: those with no education constitute 35 % of the marriages, while those with post-secondary education constitute only 8 % of the marriages. Differentials across residence show that women from rural areas constitute 58 % of the sub-sample but 64 % of the marriages. Women from the capital, on the other hand, constitute 18 % of the sub-sample but only 13 % of the marriages. The contribution of men from the capital to the sub-sample is 19 %, while their contribution to the total marriages is 15 %. Men from rural areas constitute 56 % of the sub-sample but 65 % of the marriages.
Plots of survival functions by education and residence for women and men are shown in Figs. 3, 4, 5, 6, and 7. Figures 3 and 4 show plots for women by education and residence, respectively, while Figs. 5 and 6 show the corresponding plots for men. Figure 7 shows gender differences in entry to first marriage among all men and women.
The plots depict what we already noted in Table 1—that there are differentials across education and residence and that the educational differences are more pronounced in the women data than in men data. The last figure shows that women enter marriage at faster rates than men.
The summary in Table 1 and Figs. 3, 4, 5, 6, and 7 provides a good description of the data at hand, but in order to make sound inferences based on the sub-sample, we need deeper analyses of the data and formal statistical tests. Ghilagaber (2018) has analyzed the data sets using frequentist statistical methods ranging from elementary measures of association between an event of interest and background variables to more complex and advanced methods that utilize the data more efficiently. Elsewhere in this book, Munezero and Ghilagaber (2022b) analyze the data sets using a dynamic Bayesian approach where covariate effects are allowed to vary over time.
In the next sub-section, we present and discuss results from fitting the further extended EGG model of Sect. 2 to the above data sets in the Bayesian framework of Sect. 3.
4.2 Results from Bayesian Analysis of the Data Using the EGG Model
Table 2 contains a summary of our results to which we will return at the end of this section. Results from fitting the extended generalized gamma (EGG) model and its 13 special cases to the data for women, men, and the combined sample are shown in Tables 3, 4, and 5, respectively.
In Table 3, the results from the unconstrained EGG model show that the scale and shape parameters (which are freely estimated from the data) are δ = 0.246 and q = −0.526, respectively.
These estimates give early indications of the constants to which the scale and shape parameters are close as well as the relationship between them. For instance, the estimated shape parameter (− 0.526) is much closer to − 1 and 0 than it is to 1. This, in turn, means the reciprocal Weibull distribution (which constrains the shape parameter to − 1) and the log-normal distribution (which constrains the shape parameter to 0) are more plausible candidate distributions than the Weibull distribution (which constrains the shape parameter to 1).
With regard to the relationship between the scale and shape parameters, a model that imposes negative equality (δ = −q) seems more plausible than, for instance, a model that imposes a reciprocal or negative reciprocal relationship. A reciprocal relationship would give a scale parameter of 1∕(−0.526) = −1.90, while a negative reciprocal relationship would yield −(1∕(−0.526)) = 1.90, both of which are far from the freely estimated scale parameter of 0.246. This, in turn, excludes models such as ammag and inverse ammag in favor of the inverse gamma model.
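The arithmetic behind this comparison can be laid out explicitly. The short sketch below takes the two freely estimated values reported for women and computes the scale parameter implied by each relationship; it is a descriptive closeness check only, not a formal model comparison.

```python
q_hat, delta_hat = -0.526, 0.246  # freely estimated EGG values reported for women

# Scale parameter implied by each constraint, given the estimated shape parameter.
implied_delta = {
    "standard gamma (delta = q)": q_hat,
    "inverse gamma (delta = -q)": -q_hat,
    "ammag (delta = 1/q)": 1.0 / q_hat,
    "inverse ammag (delta = -1/q)": -1.0 / q_hat,
}

# List the constraints from the smallest to the largest gap to the estimated scale.
for name, value in sorted(implied_delta.items(), key=lambda kv: abs(kv[1] - delta_hat)):
    print(f"{name:30s} implied delta = {value:6.3f}, gap = {abs(value - delta_hat):.3f}")

closest = min(implied_delta, key=lambda name: abs(implied_delta[name] - delta_hat))
```

Under these estimates the negative-equality constraint gives the smallest gap, in line with the preference for the inverse gamma model.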
The above closeness of the special case models to the more general EGG model is also reflected in the values of log predictive density scores (LPDS) given in the last columns of each model. For instance, the LPDS of the EGG model is − 4584, while that of the closest model (the inverse gamma) is − 4594. On the other hand, the LPDS for ammag and inverse ammag are − 5705 and − 5184, respectively, which are far from that of the EGG.
Another point worth noting is that the estimates of the covariate effects and their associated 95% hpd intervals are more alike in models that are close to each other (in terms of estimated scale and/or shape parameters or in terms of LPDS) than in models that are far apart.
Thus, for the women data, it would not make much difference if we base our conclusions on the estimates from the EGG model or the inverse gamma model though a formal test would favor the larger EGG model.
The results for men shown in Table 4 can be interpreted similarly. Here, the scale and shape parameters estimated freely from the data in the EGG model are δ = 0.235 and q = −0.199, respectively. Again, the inverse gamma model, which imposes negative equality between the scale and shape parameters (δ = −q), seems to be much more plausible than any other model. In fact, a closer look at the LPDS values shows that it even outperforms the larger EGG model, though the difference in LPDS is marginal.
Hence, for the men data, we have a very strong evidence to base our conclusions on the results from the inverse gamma model that, of course, are identical to those from the EGG model.
Last, the results for the combined sample are shown in Table 5. Reasoning as above leads to the choice of the EGG model or the inverse gamma model, though a formal test would favor the larger EGG model. That the results for the combined sample reflect those for women is not surprising, because women constitute about two-thirds of the combined sample.
The final estimates of covariate effects and their associated 95% hpd from our chosen models for the respective data sets are summarized in Table 2.
The results in Table 2 show significant differentials in entry into first marriage across women’s educational level and residence: lower education and rural residence are associated with higher intensities of marriage. For men, the educational differences are less pronounced, as there is no significant difference in the intensities of entry into first marriage between those with no education and those with primary or middle education. The residential differential is, however, still significant. The results for the combined sample follow those of women because, as mentioned before, women constitute the majority of the combined sample.
5 Summary and Concluding Remarks
In this chapter, we presented the extended generalized gamma (EGG) model for survival data with censored observations. Previous works have shown that five known models can be treated as special cases of the EGG model by constraining the scale parameter, shape parameter, or both to some constants. In the present chapter, we extended the EGG model further to include 13 special case models. This was achieved by imposing relationships between the scale and shape parameters in addition to constraining them to some constants.
The models were illustrated with data on entry into first marriage among Eritrean men and women from the 2010 Eritrean Population and Health Survey (EPHS 2010). Inference was fully Bayesian, using a random walk Metropolis–Hastings algorithm to sample from the posterior distribution, and we compared the models with each other and with the more general EGG model using the log predictive density score (LPDS).
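To make the sampling step concrete, the following is a minimal Python sketch of a random walk Metropolis–Hastings sampler with right-censored data. It is not the chapter's R code: it uses only the exponential special case from Appendix 1, with a single parameter μ, and the N(0, 10²) prior, the step size, the burn-in length, and the simulated data are all illustrative assumptions.

```python
import math
import random


def simulate_censored_exponential(n, mu, censor_at, seed=0):
    """Simulate exponential durations with mean e^mu, right-censored at censor_at."""
    rng = random.Random(seed)
    times, events = [], []
    for _ in range(n):
        t = rng.expovariate(math.exp(-mu))  # rate e^{-mu}
        times.append(min(t, censor_at))
        events.append(1 if t <= censor_at else 0)
    return times, events


def log_posterior(mu, times, events, prior_sd=10.0):
    """Log posterior of mu, up to an additive constant.

    Events contribute log f(t) = -mu - e^{-mu} t; censored spells contribute
    log S(t) = -e^{-mu} t. The normal prior is an assumption of this sketch.
    """
    rate = math.exp(-mu)
    loglik = sum(d * (-mu) - rate * t for t, d in zip(times, events))
    logprior = -0.5 * (mu / prior_sd) ** 2
    return loglik + logprior


def rw_metropolis(times, events, n_iter=6000, step=0.15, start=0.0, seed=1):
    """Random walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    mu = start
    lp = log_posterior(mu, times, events)
    draws, accepted = [], 0
    for _ in range(n_iter):
        proposal = mu + rng.gauss(0.0, step)
        lp_prop = log_posterior(proposal, times, events)
        # Symmetric proposal: accept with probability min(1, posterior ratio).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < lp_prop - lp:
            mu, lp = proposal, lp_prop
            accepted += 1
        draws.append(mu)
    return draws, accepted / n_iter


if __name__ == "__main__":
    times, events = simulate_censored_exponential(300, mu=3.0, censor_at=30.0)
    draws, acc_rate = rw_metropolis(times, events)
    kept = draws[1000:]  # discard burn-in
    print("posterior mean of mu:", sum(kept) / len(kept))
    print("acceptance rate:", acc_rate)
```

With a weak prior, the posterior mean of μ should sit close to the maximum likelihood value log(Σt∕Σd), which gives a quick sanity check on the sampler. The same loop extends to the EGG model by replacing `log_posterior` with the corresponding censored log-likelihood and proposing all parameters jointly.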
The application demonstrates that the further extended family of distributions provides a wide range of alternatives for a baseline distribution in the analysis of survival data with censored observations. For instance, we found that the inverse gamma model, in which the scale parameter is constrained to be the negative of the shape parameter (δ = −q), fits the men’s data best and outperforms the EGG model. It also performs well in the women’s data and the combined sample, though the evidence is not as strong as in the men’s data. This was in accordance with the freely estimated values of the scale and shape parameters in the EGG model.
The empirical results in the final selected models reveal significant differentials in the pace of entry into first marriage across women’s educational levels and residence. As would be expected, lower education and rural residence are associated with higher intensities of marriage. Educational differentials are, however, less pronounced for men, as there was no significant difference in the intensities of entry into first marriage between those with no education (the baseline group) and those with primary or middle education. The residential differential was still significant in the men’s data. When we analyzed the combined data, the results followed those of women, mainly because women constitute about two-thirds of the combined sample.
It may be worth noting that the educational level of the individuals refers to what had been achieved by the survey time. As such, it is anticipatory in the sense that the reported educational level might have been achieved after the event of interest. However, our aim here is to demonstrate the models and methods empirically, and the anticipatory nature of education does not affect that purpose. Ghilagaber and Koskinen (2009), Ghilagaber and Larsson (2019), and Munezero and Ghilagaber (2022a) study potential biases due to the use of anticipatory covariates and how to account for them.
Our analysis was based on the tacit assumption that the survivor function S(t) tends to 0 as the study period gets longer. This, in turn, means that we have assumed all individuals will experience the event of interest sooner or later. This may not be true for the event in our illustrative example (marriage), as some individuals may never marry for various reasons. Future work may, therefore, consider accounting for such long-term survivors (those who never experience the event of interest). This can be achieved by using, for instance, a mixture model consisting of a hazard/intensity model for those who have experienced the event or may experience it in the future and a logistic model for the probability of being a long-term survivor (never experiencing the event).
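The mixture idea in the paragraph above can be sketched as follows. This minimal Python example assumes a Weibull survivor function (the Weibull row of Appendix 1) for those who eventually experience the event and a logistic model for the long-term survivor probability; the covariate vector `x`, the coefficients `gamma`, and all numerical values are hypothetical.

```python
import math


def weibull_survival(t, mu, delta):
    """S(t) = exp(-(t / e^mu)^(1/delta)), as in the Weibull row of Appendix 1."""
    return math.exp(-((t / math.exp(mu)) ** (1.0 / delta)))


def never_event_probability(x, gamma):
    """Logistic model for the probability of being a long-term survivor."""
    eta = sum(g * xi for g, xi in zip(gamma, x))
    return 1.0 / (1.0 + math.exp(-eta))


def mixture_survival(t, x, gamma, mu, delta):
    """Population survivor function: pi + (1 - pi) * S(t).

    pi is the long-term survivor probability; S(t) applies only to those who
    will eventually experience the event.
    """
    pi = never_event_probability(x, gamma)
    return pi + (1.0 - pi) * weibull_survival(t, mu, delta)


if __name__ == "__main__":
    x = [1.0, 1.0]        # hypothetical: intercept and one binary covariate
    gamma = [-2.0, 0.5]   # hypothetical logistic coefficients
    for t in (0.0, 10.0, 20.0, 40.0, 1e6):
        print(t, mixture_survival(t, x, gamma, mu=3.0, delta=0.5))
```

Unlike S(t) alone, the mixture survivor function levels off at π rather than at 0 as t grows, which is what allows for individuals who never marry.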
References
Abramowitz, M., & Stegun, I. (1964). Handbook of mathematical functions with formulas, graphs, and mathematical tables. New York: Dover publications.
Addison, J., & Portugal, P. (1987). On the distributional shape of unemployment duration. The Review of Economics and Statistics, 69(3), 520–526.
Berger, J. O., Bernardo, J. M., Sun, D. et al. (2015). Overall objective priors. Bayesian Analysis, 10, 189–221.
Bergström, R., & Edin, P. (1992). Time aggregation and the distributional shape of unemployment duration. Journal of Applied Econometrics, 7(1), 5–30.
Bergström, R., Engwall, L., & Wallerstedt, E. (1994). Organizational foundations and closures in a regulated environment: Swedish commercial banks 1831–1990. Scandinavian Journal of Management, 10(1), 29–48.
Bergström, R., Engwall, L., & Wallerstedt, E. (1997). The importance of flexible hazard functions in the analysis of organizational survival data–experiences from a cohort of Swedish commercial banks. Quality and Quantity, 31(1), 15–35.
Butler, R., & McDonald, J. (1986). Trends in unemployment duration data. The Review of Economics and Statistics, 68(4), 545–557.
Cox, C., Chu, H., Schneider, M., & Muñoz, A. (2007). Parametric survival analysis and taxonomy of hazard functions for the generalized gamma distribution. Statistics in Medicine, 26(23), 4352–4374.
Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society: Series B (Methodological), 34(2), 187–202.
Farewell, V., & Prentice, R. (1977). A study of distributional shape in life testing. Technometrics, 19(1), 69–75.
Gelman, A., Carlin, J., Stern, H., & Rubin, D. (2004). Bayesian data analysis. Boca Raton, FL: Chapman & Hall/CRC.
Ghilagaber, G. (2005). The extended generalized gamma model and its special cases: Applications to modeling marriage durations. Quality and Quantity, 39(1), 71–85.
Ghilagaber, G. (2018). Statistical analysis of educational and urban-rural differences in marriage-timing: Evidence from the 2010 Eritrean population and health survey for women and men. In Proceedings of the International Conference on Eritrean Studies, 20–22 July 2016 (pp. 1097–1124).
Ghilagaber, G., Elisa, W., & Gyimah, S. O. (2014). A family of flexible parametric duration functions and their applications to modeling child-spacing in sub-Saharan Africa. In Advanced Techniques for Modelling Maternal and Child Health in Africa. Springer Series on Demographic Methods and Population Analysis (Vol. 34, pp. 185–209).
Ghilagaber, G., & Koskinen, J. H. (2009). Bayesian adjustment of anticipatory covariates in analyzing retrospective data. Mathematical Population Studies, 16(2), 105–130.
Ghilagaber, G., & Larsson, R. (2019). Maximum likelihood adjustment of anticipatory covariates in the analysis of retrospective data. Research report 2019–1. Stockholm: Department of Statistics, Stockholm University.
Godha, D., Gage, A. J., Hotchkiss, D. R., & Cappa, C. (2016). Predicting maternal health care use by age at marriage in multiple countries. Journal of Adolescent Health, 58(5), 504–511.
Heleno, A., & Alberto, J. (1986). The log-linear model with a generalized gamma distribution for the error: a Bayesian approach. Statistics and Probability Letters, 4(6), 325–332.
Li, F., Villani, M., & Kohn, R. (2010). Flexible modeling of conditional distributions using smooth mixtures of asymmetric student t densities. Journal of Statistical Planning and Inference, 140, 3638–3654.
Munezero, P., & Ghilagaber, G. (2022a). Dynamic Bayesian adjustment of anticipatory covariates in retrospective data: application to the effect of education on divorce risk. Journal of Applied Statistics, 49(6), 1382–1401.
Munezero, P., & Ghilagaber, G. (2022b). Dynamic Bayesian modelling of educational and residential differences in family initiation among Eritrean men and women. In Modern Biostatistical Methods for Evidence-Based Global Health Research. Springer Series on Emerging Topics in Statistics and Biostatistics.
National-Statistics-Office-Eritrea & Fafo-AIS (2013). Eritrea Population and Health Survey 2010. New York: National Statistics Office and Fafo Institute for Applied International Studies.
National-Statistics-Office-Eritrea & Macro-International-Inc. (1997). Eritrea Demographic and Health Survey 1995. New York: National Statistics Office and Macro International.
National-Statistics-Office-Eritrea & Macro-International-Inc. (2003). Eritrea Demographic and Health Survey 2002. New York: National Statistics and Evaluation Office and ORC Macro.
Peng, Y., Dear, K., & Denham, J. (1998). A generalized F mixture model for cure rate estimation. Statistics in Medicine, 17(8), 813–830.
Prentice, R. (1974). A log gamma model and its maximum likelihood estimation. Biometrika, 61(3), 539.
Raj, A. (2010). When the mother is a child: the impact of child marriage on the health and human rights of girls. Archives of Disease in Childhood, 95(11), 931–935.
Ramos, P. L., Achcar, J. A., Moala, F. A., Ramos, E., & Louzada, F. (2017). Bayesian analysis of the generalized gamma distribution using non-informative priors. Statistics, 51, 824–843.
Stacy, E. (1962). A generalization of the gamma distribution. The Annals of Mathematical Statistics, 33(3), 1187–1192.
Tsionas, E. (2001). Exact inference in four-parameter generalized gamma distributions. Communications in Statistics-Theory and Methods, 30(4), 747–756.
Van Noortwijk, J. (2001). Bayes estimates of flood quantiles using the generalised gamma distribution. In System and Bayesian Reliability; Essays in Honor of Professor Richard E. Barlow on his 70th Birthday (pp. 351–374).
Villani, M., Kohn, R., & Giordani, P. (2009). Regression density estimation using smooth adaptive Gaussian mixtures. Journal of Econometrics, 153, 155–173.
Wei, L. J. (1992). The accelerated failure time model: a useful alternative to the Cox regression model in survival analysis. Statistics in Medicine, 11(14–15), 1871–1879.
Acknowledgements
Yuli Liang’s research is partly supported by the strategic funding of Örebro University. Liang carried out part of the work while affiliated with Stockholm University. The data sets analyzed in this chapter were provided to the co-author (GG) by the National Statistics Office, Eritrea. He is grateful to its director Mr. Ainom Berhane and its senior statisticians Mr. Hagos Ahmed and Mr. Samuel Tesfamariam. The views expressed in the chapter are solely of the authors and do not express the views or opinions of the data source or its employees.
Appendices
Appendix 1: Density Functions, f(t), and Survival Functions, S(t), of Special Cases in the Extended Generalized Gamma Model
Distribution | f(t) | S(t) |
---|---|---|
Exponential | \(e^{-\mu }e^{-e^{-\mu }t}\) | \(e^{-e^{-\mu }t}\) |
I. Exponential | \(e^{\mu }e^{-\frac {e^{\mu }}{t}}/t^2\) | \(1-e^{-\frac {e^{\mu }}{t}}\) |
Weibull | \(\frac {1/\delta }{e^{\mu }}\left (\frac {t}{e^{\mu }}\right )^{1/\delta -1}e^{-\left (\frac {t}{e^{\mu }}\right )^{1/\delta }}\) | \(e^{-t^{1/\delta }e^{-\mu /\delta }}\) |
R. Weibull | \(\frac {1/\delta }{e^{\mu }}\left (\frac {t}{e^{\mu }}\right )^{-1/\delta -1}e^{-\left (\frac {t}{e^{\mu }}\right )^{-1/\delta }}\) | \(1-e^{-t^{-1/\delta }e^{\mu /\delta }}\) |
Rayleigh | \(\frac {2t}{e^{2\mu }}e^{-\frac {t^2}{e^{2\mu }}}\) | \(e^{-\frac {t^2}{e^{2\mu }}}\) |
I. Rayleigh | \(\frac {2e^{2\mu }}{t^3}e^{-\frac {e^{2\mu }}{t^2}}\) | \(1-e^{-\frac {e^{2\mu }}{t^2}}\) |
S. Gamma | \(1/\Gamma (q^{-2})\left (q^{-2}e^{-\mu }\right )^{q^{-2}}t^{q^{-2}-1}e^{-q^{-2}e^{-\mu }t}\) | \(1-\gamma \left (q^{-2},tq^{-2}e^{-\mu }\right )/\Gamma (q^{-2})\) |
Gamma (q > 0) | \(qt^{q-1}\frac {1}{\Gamma (q^{-2})}\left (q^{-2}e^{-q\mu }\right )^{q^{-2}}\left (t^q\right )^{q^{-2}-1}e^{-q^{-2}e^{-q\mu }t^q}\) | \(1-\gamma \left (q^{-2},t^qq^{-2}e^{-q\mu }\right )/\Gamma (q^{-2})\) |
Gamma (q < 0) | \(-qt^{q-1}\frac {1}{\Gamma (q^{-2})}\left (q^{-2}e^{-q\mu }\right )^{q^{-2}}\left (t^q\right )^{q^{-2}-1}e^{-q^{-2}e^{-q\mu }t^q}\) | \(\gamma \left (q^{-2},t^qq^{-2}e^{-q\mu }\right )/\Gamma (q^{-2})\) |
I. Gamma | \(t^{-2}\frac {1}{\Gamma (q^{-2})}\left (q^{-2}e^{\mu }\right )^{q^{-2}}\left (t^{-1}\right )^{q^{-2}-1}e^{-q^{-2}e^{\mu }t^{-1}}\) | \(\gamma \left (q^{-2},t^{-1}q^{-2}e^{\mu }\right )/\Gamma (q^{-2})\) |
Ammag | \(q^2t^{q^2-1}\frac {1}{\Gamma (q^{-2})}\left (q^{-2}e^{-q^2\mu }\right )^{q^{-2}}\left (t^{q^2}\right )^{q^{-2}-1}e^{-q^{-2}e^{-q^2\mu }t^{q^2}}\) | \(1-\gamma \left (q^{-2},\frac {t^{q^2}}{q^2}e^{-q^2\mu }\right )/\Gamma (q^{-2})\) |
I. Ammag | \(q^2t^{-q^2-1}\frac {1}{\Gamma (q^{-2})}\left (q^{-2}e^{q^2\mu }\right )^{q^{-2}}\left (t^{-q^2}\right )^{q^{-2}-1}e^{-q^{-2}e^{q^2\mu }t^{-q^2}}\) | \(\gamma \left (q^{-2},\frac {t^{-q^2}}{q^2}e^{q^2\mu }\right )/\Gamma (q^{-2})\) |
Log normal | \(\frac {1}{t\delta \sqrt {2\pi }}e^{-\frac {(\log t-\mu )^2}{2\delta ^2}}\) | \(\frac {1}{2}\left (1-\mathrm {erf}\left (\frac {\log t-\mu }{\sqrt {2}\delta }\right )\right )\) |
Half-normal | \(\frac {1}{e^{\mu }}\sqrt {\frac {2}{\pi }}e^{-\frac {t^2}{2e^{2\mu }}}\) | \(1-\mathrm {erf}\left (\frac {t}{\sqrt {2}e^{\mu }}\right )\) |
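A quick way to check rows of the table above is to confirm numerically that S(t) equals one minus the integral of f from 0 to t. The following minimal Python sketch does this for the Weibull, reciprocal Weibull, and half-normal rows using a midpoint rule; the function names and parameter values are illustrative assumptions.

```python
import math


def weibull(t, mu, delta):
    """Return (f(t), S(t)) for the Weibull row."""
    s, a = math.exp(mu), 1.0 / delta
    z = (t / s) ** a
    return (a / s) * (t / s) ** (a - 1.0) * math.exp(-z), math.exp(-z)


def reciprocal_weibull(t, mu, delta):
    """Return (f(t), S(t)) for the reciprocal Weibull row."""
    s, a = math.exp(mu), 1.0 / delta
    z = (t / s) ** (-a)
    return (a / s) * (t / s) ** (-a - 1.0) * math.exp(-z), 1.0 - math.exp(-z)


def inverse_rayleigh(t, mu):
    """Return (f(t), S(t)) for the inverse Rayleigh row."""
    s2 = math.exp(2.0 * mu)
    z = s2 / (t * t)
    return 2.0 * s2 / t ** 3 * math.exp(-z), 1.0 - math.exp(-z)


def half_normal(t, mu):
    """Return (f(t), S(t)) for the half-normal row."""
    s = math.exp(mu)
    f = math.sqrt(2.0 / math.pi) / s * math.exp(-t * t / (2.0 * s * s))
    return f, 1.0 - math.erf(t / (math.sqrt(2.0) * s))


def survival_by_quadrature(model, t, n=20000, **params):
    """Approximate 1 - integral_0^t f(u) du with the midpoint rule."""
    h = t / n
    area = h * sum(model((i + 0.5) * h, **params)[0] for i in range(n))
    return 1.0 - area


if __name__ == "__main__":
    cases = [
        (weibull, {"mu": 3.0, "delta": 0.5}),
        (reciprocal_weibull, {"mu": 3.0, "delta": 0.5}),
        (half_normal, {"mu": 3.0}),
    ]
    for model, params in cases:
        closed_form = model(25.0, **params)[1]
        numeric = survival_by_quadrature(model, 25.0, **params)
        print(model.__name__, closed_form, numeric)
```

The same functions also show how the rows nest: setting δ = 1∕2 in the reciprocal Weibull expressions reproduces the inverse Rayleigh row.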
Appendix 2: Proof of Lemma 1
Proof:
Let \(x=q^{-2}(e^{-\mu })^{\frac {q}{\delta }}t^{\frac {q}{\delta }}\); then \( t=(q^2e^{\frac {q\mu }{\delta }}x)^{\frac {\delta }{q}}\),
When r = 1, we get \(E(T)\), and when r = 2, we get \(E(T^2)\). Using these, we have \(V(T)=E(T^2)-[E(T)]^2\). \(\blacksquare \)
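The display equation linking the substitution to the moments does not appear in the proof as reproduced above. A sketch of that step, under the assumption (not restated in this appendix) that \(x\) as defined in the substitution follows a gamma distribution with shape \(q^{-2}\) and unit scale, and that \(q^{-2}+r\delta /q>0\), would be:

\[
E(T^r)=\left (q^2e^{\frac {q\mu }{\delta }}\right )^{\frac {r\delta }{q}}E\left (x^{\frac {r\delta }{q}}\right )=\left (q^2\right )^{\frac {r\delta }{q}}e^{r\mu }\,\frac {\Gamma \left (q^{-2}+\frac {r\delta }{q}\right )}{\Gamma \left (q^{-2}\right )}.
\]

The first equality raises \(t=(q^2e^{\frac {q\mu }{\delta }}x)^{\frac {\delta }{q}}\) to the power r, and the second uses the standard gamma moment \(E(x^s)=\Gamma (k+s)/\Gamma (k)\) with \(k=q^{-2}\) and \(s=r\delta /q\). The exact form stated in Lemma 1 may differ in presentation.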
Appendix 3: R Program Codes for Bayesian Inference
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this chapter
Liang, Y., Ghilagaber, G. (2022). Bayesian Survival Analysis with the Extended Generalized Gamma Model: Application to Demographic and Health Survey Data. In: Chen, D.-G., Manda, S.O.M., Chirwa, T.F. (eds) Modern Biostatistical Methods for Evidence-Based Global Health Research. Emerging Topics in Statistics and Biostatistics. Springer, Cham. https://doi.org/10.1007/978-3-031-11012-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-11011-5
Online ISBN: 978-3-031-11012-2
eBook Packages: Mathematics and Statistics (R0)