Abstract
The parametric survival model with a Weibull distribution can be used to model a wide range of practical lifetime data. While several studies have compared the fit of various distributions to right-censored and interval-censored data, there are no recommendations in the literature on optimal distributions for left-censored heavy-tailed data. The Parametric Reverse Hazards (PRH) model has gained considerable attention from time-to-event data researchers for its excellent properties and its appropriateness for analyzing left-censored survival data. To analyze left-censored, heavy-tailed data, we derived the PRH model for a variety of distributions including the Exponential, Log-normal, Inverse Gaussian, Log-logistic, Gompertz–Makeham, Gamma, Generalized Gamma, Inverse Gamma, Generalized Inverse Gamma, Weibull, Inverse Weibull, Generalized Inverse Weibull, Modified Weibull, Flexible Weibull, Power Generalized Weibull, and Marshall–Olkin distributions. Extensive statistical simulations were used to assess the performance of the derived PRH models and to compare them, in order to establish a guideline for which distributions best fit left-censored heavy-tailed data. We then applied the best-performing model to the South Carolina Enhanced HIV/AIDS Reporting Surveillance System data to explain the effects of different demographic, social, and treatment factors on patients’ viral load transition from detectable to undetectable levels.
1 Introduction
Human immunodeficiency virus (HIV) is a chronic disease which weakens the immune system, leading to increased susceptibility to a wide range of infections and some types of cancer [46]. An important biomarker to measure HIV disease progression is HIV viral load (VL), the number of copies of actively replicating HIV virus in an individual [42]. By the US Health and Human Services guideline, if the number of copies is less than or equal to 200 per milliliter of blood, VL is classified as undetectable; otherwise, it is classified as detectable [14]. To date, there is no cure for HIV but the suppression of VL to undetectable levels improves physical functioning, reduces opportunistic infections, reduces HIV related mortality, and is associated with a substantial decrease in the probability of transmitting HIV to others [6, 10, 15]. Not only is suppressing VL important on an individual level, it also has the potential to decrease HIV incidence rates in a community because of reduced infectivity [10, 13]. Consequently, the focus of care has shifted from survival to improving health outcomes and the success of highly active antiretroviral therapy (ART) to suppress VL to undetectable levels for prolonged periods of time has transformed HIV into a manageable chronic disease [42].
Survival models of patient VL may be an effective way to gain insight into the HIV epidemic, since traditional regression models cannot handle censored data directly. Additionally, these models can be used to assess the effect of various factors and treatments on VL suppression. The commonly used form of the survival model can be written as
$$\begin{aligned} \lambda (t \mid x_{i}) = \lambda _{0}(t) \exp (x_{i}^{\top } \beta ), \end{aligned}$$
where \(\lambda _{0}(t)\) is the baseline hazard function, \(x_{i}\) is the set of covariates, and \(\beta \) are parameters estimating covariate effects on hazard. In semi-parametric survival models, the regression coefficients are estimated leaving the baseline hazard unspecified. For example, the Cox Proportional Hazards (PH) model [11] introduced the use of the partial-likelihood function to estimate the coefficients without needing to characterize the baseline Hazard Rate. To avoid making distributional assumptions about the baseline hazard, several studies used nonparametric methods to correct for censoring [18, 31, 34, 35]. However, this can also be disadvantageous since assuming an underlying distribution naturally smooths the data so that censoring has less impact on parameter estimates.
Parametric survival models can be advantageous in many respects: a well-suited parametric distribution for the baseline hazard generally yields more precise estimates of the hazard parameters than the semi-parametric counterpart. However, these benefits come with a challenge that applied researchers commonly face, namely selecting an appropriate parametric distribution. The choice becomes even harder when the data present characteristics that are not frequently studied in the related literature, e.g., left-censored time-to-event outcomes and heavy-tailed event time densities. Given the importance of choosing the right distribution in a parametric survival model, guidance on choosing well-fitting parametric distributions would be a useful addition to the literature for applied researchers.
Several studies (e.g., [23, 24, 39]) have compared the fits of various distributions to right-censored and interval-censored lifetime data. However, there are no known recommendations or explorations in the literature to guide the choice of distribution when modeling left-censored time-to-event data with heavy-tailed event times. These data features are not rare in chronic disease biomarker settings, e.g., HIV VL as discussed above and characterized further below (see Fig. 1).
Among the very limited number of studies analyzing time to HIV VL suppression, [40] notably applied a lognormal survival model using a fully parametric approach to account for left-censored HIV VL counts. The choice of the lognormal distribution was guided by previous work [17], in which the estimated lognormal survival distribution function was contained within the 95% confidence interval of the nonparametric Kaplan–Meier estimate. Although [40] presented a sensitivity analysis comparing the lognormal survival model to a univariate mixed model and a Cox PH model, it was not known whether any other survival distribution could provide a better fit to the data.
When analyzing left-censored event time data, a further challenge is choosing an appropriate hazard function for estimating event risk. The most widely used measure of time-to-event risk, the Hazard Rate (HR), is appropriate for right-censored lifetime data but may be very unstable when used to analyze left-censored event times [43]. A more appropriate choice for left-censored time-to-event risk is the Reversed Hazard Rate (RHR) [43].
Since its introduction in 1963 [2], the RHR has been used in various applications, and several articles [5, 12, 16, 20, 28, 29, 36, 37] have studied its properties and devised methodologies based on it for analyzing left-censored lifetime data. One recent development is the Parametric Reversed Hazards (PRH) model, which is based on the RHR and intended for left-censored lifetime data [43]. In that formulation [43], the lifetime random variable was assumed to follow an inverted Weibull distribution.
The current study derives the PRH model for a variety of distributions that may be appropriate for left-censored heavy-tailed data, including the Exponential, Log-normal, Inverse Gaussian, Log-logistic, Gompertz–Makeham, Gamma, Generalized Gamma, Inverse Gamma, Generalized Inverse Gamma, Weibull, Inverse Weibull, Generalized Inverse Weibull, Modified Weibull, Flexible Weibull, Power Generalized Weibull, and Marshall–Olkin distributions. Extensive statistical simulations are used to assess the performance of the derived PRH models and to compare them, in order to establish a guideline for which distributions best fit left-censored, heavy-tailed HIV VL data. We then apply the best-performing model to the South Carolina Enhanced HIV/AIDS Reporting Surveillance System (SC eHARS) data to explain the effects of different demographic, social, and treatment factors on patients’ VL transition from detectable to undetectable levels. Recommendations from this study may help researchers apply more accurate models for this type of censoring, specifically in HIV VL-related studies, where left censoring may be common and the data show uncommon features such as heavy tails.
2 The Parametric Reverse Hazards model
The Parametric Reversed Hazard (PRH) model [43] is a fully parametric model based on the Reversed Hazard Rate (RHR) for the analysis of left-censored data. The Hazard Rate (HR), used for analyzing the more common right-censored time-to-event data, is the instantaneous rate of an event in an infinitesimal time width, \(\varDelta t\), following an event-free time t. For a lifetime random variable T, it is expressed mathematically as
$$\begin{aligned} h(t) = \lim _{\varDelta t \rightarrow 0} \frac{P(t \le T < t + \varDelta t \mid T \ge t)}{\varDelta t}. \end{aligned}$$
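In contrast, the RHR of T is the instantaneous rate of the event occurring in an infinitesimal time width, \(\varDelta t\), preceding t, given that the event occurred at or before time t. It is defined as
$$\begin{aligned} \lambda (t) = \lim _{\varDelta t \rightarrow 0} \frac{P(t - \varDelta t < T \le t \mid T \le t)}{\varDelta t}. \end{aligned}$$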
In terms of the distribution function, F(t), and probability density function, f(t), the RHR function can be written as
$$\begin{aligned} \lambda (t) = \frac{f(t)}{F(t)}. \end{aligned}$$
Letting X be a \(p \times 1\) vector of covariates, we can now define the PRH model as
$$\begin{aligned} \lambda (t \mid X) = \lambda _{0}(t)\, g(\beta ; X), \end{aligned}$$
where \(\lambda _{0}(t)\) is the baseline RHR, \(g(\beta ; X)\) is a nonnegative function of X and \(\beta \) (a \(p \times 1\) vector of regression parameters), and \(\lambda (t | X)\) is the RHR of T given the covariates X.
The PRH model can also be expressed in terms of the distribution function as
$$\begin{aligned} F(t \mid X) = \left[ F_{0}(t)\right] ^{g(\beta ; X)}, \end{aligned}$$
where F(t|X) is the distribution function of T given X and \(F_{0}(t)\) is the baseline distribution function in the absence of covariates.
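The link between the RHR form and the distribution-function form of the PRH model follows in one line, because the RHR is the derivative of the log distribution function. The short check below is a sketch that uses only the symbols already defined, plus \(f_{0}(t)\) for the baseline density (a notational assumption introduced here). It starts from \(F(t \mid X) = [F_{0}(t)]^{g(\beta ; X)}\) and recovers the multiplicative RHR form:
$$\begin{aligned} \lambda (t \mid X) = \frac{d}{dt} \log F(t \mid X) = g(\beta ; X)\, \frac{d}{dt} \log F_{0}(t) = g(\beta ; X)\, \frac{f_{0}(t)}{F_{0}(t)} = g(\beta ; X)\, \lambda _{0}(t). \end{aligned}$$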
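Suppose that the lifetime random variable T is randomly left-censored by Z. In practice, we observe the vectors \((Y, \delta , X)\), where \(Y = \max (T, Z)\) and \(\delta = I(T = Y)\), with \(I(\cdot )\) being the indicator function. For n independent observations \((y_{i}, \delta _{i}, x_{i})\), \(i = 1, \ldots , n\), the likelihood function can then be written as
$$\begin{aligned} L = \prod _{i=1}^{n} \left[ f(y_{i} \mid x_{i})\right] ^{\delta _{i}} \left[ F(y_{i} \mid x_{i})\right] ^{1 - \delta _{i}}. \end{aligned}$$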
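To make the left-censored likelihood concrete, the following minimal sketch evaluates its logarithm for a generic baseline. An observed event contributes \(\log f(y \mid x) = \log \lambda (y \mid x) + \log F(y \mid x)\), while a left-censored observation contributes \(\log F(y \mid x)\) only. Several pieces are assumptions made for illustration and do not come from the original analysis: the log-linear form \(g(\beta ; X) = \exp (X^{\top }\beta )\), the function name `prh_loglik`, the callables `log_rhr0` and `log_cdf0`, and the toy baseline \(F_{0}(t) = \exp (-1/t)\).

```python
import math


def prh_loglik(beta, y, delta, x, log_rhr0, log_cdf0):
    """Log-likelihood of a PRH model for left-censored data.

    ASSUMPTION: g(beta; X) = exp(x' beta); the text only requires g >= 0.
    log_rhr0(t) and log_cdf0(t) return log lambda_0(t) and log F_0(t).
    """
    total = 0.0
    for yi, di, xi in zip(y, delta, x):
        eta = sum(b * v for b, v in zip(beta, xi))  # linear predictor x' beta
        g = math.exp(eta)
        # log F(y|x) = g * log F_0(y) enters for every observation;
        # log lambda(y|x) = eta + log lambda_0(y) enters only when delta = 1.
        total += di * (eta + log_rhr0(yi)) + g * log_cdf0(yi)
    return total


# Toy baseline (illustrative only): F_0(t) = exp(-1/t), so lambda_0(t) = 1/t^2.
def toy_log_cdf0(t):
    return -1.0 / t


def toy_log_rhr0(t):
    return -2.0 * math.log(t)


event_ll = prh_loglik([0.0], [2.0], [1], [[1.0]], toy_log_rhr0, toy_log_cdf0)
censored_ll = prh_loglik([0.0], [2.0], [0], [[1.0]], toy_log_rhr0, toy_log_cdf0)
```

Passing the baseline as two callables keeps the covariate part separate from the distributional part, so the same function could be reused for any candidate baseline and handed to a numerical optimizer.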
Using this general notation, we show the derivation assuming Generalized Inverse Weibull as the baseline hazard distribution. See supplementary materials for model derivations for the other baseline hazard distributions.
When the lifetime random variable follows a Generalized Inverse Weibull distribution, the baseline distribution function is given by
The baseline RHR of T is then obtained as
In the presence of the covariates X, we have
From these, the likelihood and the log-likelihood functions are obtained as
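As a sketch of how these four steps play out, assume one common three-parameter form of the Generalized Inverse Weibull distribution with \(\alpha , \kappa , \gamma > 0\). This parameterization and the symbol \(\kappa \) for the shape (chosen to avoid a clash with the regression vector \(\beta \)) are assumptions made here and may differ from the form used in the original derivation. Writing \(g_{i} = g(\beta ; x_{i})\), the baseline distribution function, the baseline RHR, their covariate-dependent versions, and the log-likelihood would then be
$$\begin{aligned} F_{0}(t)&= \exp \left[ -\gamma \left( \frac{\alpha }{t}\right) ^{\kappa }\right] , \qquad \lambda _{0}(t) = \gamma \kappa \alpha ^{\kappa } t^{-(\kappa + 1)}, \\ F(t \mid X)&= \exp \left[ -\gamma \, g(\beta ; X) \left( \frac{\alpha }{t}\right) ^{\kappa }\right] , \qquad \lambda (t \mid X) = g(\beta ; X)\, \gamma \kappa \alpha ^{\kappa } t^{-(\kappa + 1)}, \\ \log L&= \sum _{i=1}^{n} \delta _{i} \left[ \log g_{i} + \log \gamma + \log \kappa + \kappa \log \alpha - (\kappa + 1) \log y_{i}\right] - \gamma \sum _{i=1}^{n} g_{i} \left( \frac{\alpha }{y_{i}}\right) ^{\kappa }. \end{aligned}$$
The first sum collects \(\delta _{i} \log \lambda (y_{i} \mid x_{i})\) and the second collects \(\log F(y_{i} \mid x_{i})\), which enters for censored and uncensored observations alike.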
Similar derivations for several other distributions, including the Exponential, Log-normal, Inverse Gaussian, Log-logistic, Gompertz–Makeham, Gamma, Generalized Gamma, Inverse Gamma, Generalized Inverse Gamma, Weibull, Inverse Weibull, Modified Weibull, Flexible Weibull, Power Generalized Weibull, and Marshall–Olkin distributions, are provided in the supplementary materials.
3 Simulation Study
We used the SC eHARS HIV VL data, further described in the next section, as a real-life example of such data and simulated data with a similar distribution for the time to transition from detectable to undetectable VL after HIV diagnosis. Figure 1 presents the density of the time to detectable-to-undetectable VL transition for both the observed data and a randomly selected set of simulated data, showing that the densities are similar. The transition times were simulated from a Skewed Normal distribution with location, scale, and shape parameters of 5, 30, and 50, respectively. Different parameter values were tried for the Skewed Normal distribution, using a trial-and-error approach, until the simulated data matched the SC VL data as closely as possible.
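The data-generating step can be sketched as follows. The original simulations were run in R; this Python sketch uses only the standard library and is not the authors' code. It draws Skewed Normal event times with location 5, scale 30, and shape 50 through the standard stochastic representation of that distribution, then applies left censoring. The censoring mechanism is an assumption: the text does not say how censoring was imposed, so a single fixed limit at the empirical quantile matching the target rate is used here. The function name `simulate_left_censored` and the seed are hypothetical.

```python
import math
import random


def simulate_left_censored(n, censor_rate, loc=5.0, scale=30.0, shape=50.0, seed=1):
    """Draw skew-normal event times and left-censor a fixed share of them."""
    rng = random.Random(seed)
    d = shape / math.sqrt(1.0 + shape ** 2)
    times = []
    for _ in range(n):
        # Standard skew-normal variate: d * |Z0| + sqrt(1 - d^2) * Z1,
        # with Z0 and Z1 independent standard normal draws.
        z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        times.append(loc + scale * (d * abs(z0) + math.sqrt(1.0 - d ** 2) * z1))
    # ASSUMPTION: one fixed censoring limit at the empirical censor_rate
    # quantile, so that about that share of the times falls below it.
    limit = sorted(times)[round(censor_rate * n)]
    y = [max(t, limit) for t in times]               # Y = max(T, Z)
    delta = [1 if t >= limit else 0 for t in times]  # delta = I(T = Y)
    return y, delta


y, delta = simulate_left_censored(n=1000, censor_rate=0.30)
censored_share = 1.0 - sum(delta) / len(delta)
```

A random censoring variable Z could be substituted for the fixed limit without changing the rest of the sketch, at the cost of only approximately hitting the target censoring rate.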
To assess which model fits best, we used the following information criteria:
1. Akaike Information Criterion (AIC), which rewards goodness of fit but penalizes the model for increasing the number of estimated parameters:
$$\begin{aligned} AIC = 2k - 2\ln (L) \end{aligned}$$
2. Bayesian Information Criterion (BIC), which uses a larger penalty than AIC:
$$\begin{aligned} BIC = k\ln (n) - 2\ln (L) \end{aligned}$$
3. Corrected Akaike Information Criterion (AICC), which corrects the AIC for overfitting in cases where the sample size is small relative to the number of parameters in the model:
$$\begin{aligned} AICC = AIC + \frac{2k(k+1)}{n-k-1} \end{aligned}$$
4. Hannan–Quinn Information Criterion (HQIC), which is often cited in the literature but, unlike AIC, is not asymptotically efficient:
$$\begin{aligned} HQIC = 2k\ln (\ln (n)) - 2\ln (L) \end{aligned}$$
5. Bozdogan’s Consistent Akaike Information Criterion (CAIC), another adjusted form of AIC, which is consistent:
$$\begin{aligned} CAIC = k(\ln (n)+1) - 2\ln (L) \end{aligned}$$
Here k is the number of parameters to be estimated, L is the maximized value of the likelihood function, and n is the number of observations. The model with the smallest average AIC, BIC, AICC, HQIC, and CAIC values was judged to be the best-fitting model. The simulation studies were conducted using the statistical computing software R, version 3.2.5. Summaries of the simulation results are presented in Tables 1, 2, and 3.
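The five criteria can be computed from a fitted model's maximized log-likelihood in a few lines. The sketch below is illustrative rather than the code used in the study (which was written in R); the function name `information_criteria` and the example values of the log-likelihood, k, and n are hypothetical.

```python
import math


def information_criteria(loglik, k, n):
    """Return AIC, BIC, AICC, HQIC and CAIC from the maximized log-likelihood.

    loglik: ln(L) at the maximum, k: number of estimated parameters,
    n: number of observations (AICC requires n > k + 1).
    """
    aic = 2 * k - 2 * loglik
    return {
        "AIC": aic,
        "BIC": k * math.log(n) - 2 * loglik,
        "AICC": aic + (2 * k * (k + 1)) / (n - k - 1),
        "HQIC": 2 * k * math.log(math.log(n)) - 2 * loglik,
        "CAIC": k * (math.log(n) + 1) - 2 * loglik,
    }


# Hypothetical values, for illustration only.
ic = information_criteria(loglik=-100.0, k=3, n=50)
```

Averaging each entry over the simulation replicates for every candidate distribution, and ranking the averages from smallest to largest, would reproduce the kind of comparison described above.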
Table 1 summarizes the results for the simulated data with a censoring rate of 20%, Table 2 for data with censoring rate of 30%, and Table 3 for data with censoring rate 40%. From these tables, it is clear that the Generalized Inverse Weibull distribution consistently performs the best, having the lowest average AIC, BIC, AICC, HQIC, and CAIC. Following closely behind in performance are the Log-Logistic, Log-Normal, Inverse Gaussian, and Gamma distributions, respectively. This is consistent across all censoring rates and sample sizes. The consistently worst performing distributions are the Modified Weibull, Inverse Weibull, Inverse Gamma, Power Generalized Weibull, and Exponential distributions, respectively.
4 Application to SC eHARS Data
The HIV epidemic disproportionately impacts the Southern states of the US in terms of both the overall number of people living with HIV/AIDS (PLWHA) and survival rates after diagnosis [33]. SC, like many Southern states, ranks high for poverty, unemployment, and low educational completion, all of which are characteristics that may promote disease transmission. The number of PLWHA in SC increased from 12,089 in 2004 to 16,311 in 2014 [38]. Studies on retention in HIV care found that a large proportion of PLWHA in SC failed to remain in care on a regular basis [30, 41]. Given the HIV burden in SC and the need to focus on retention in HIV care within the context of the National HIV/AIDS Strategy goals, it is important to identify factors associated with VL suppression. Identifying these factors will assist in developing targeted strategies to reduce the HIV burden in SC.
Since January 2004, all health care providers, hospitals, and laboratories in SC have been legally mandated to report all CD4 count and VL measurements to the SC Department of Health and Environmental Control (DHEC) [7]. These data are stored in the SC eHARS database along with the patient’s socio-demographic characteristics. The quality rating of the SC eHARS database exceeds the CDC minimum standards of reporting timeliness, with 95% of new cases being reported within 6 months of HIV diagnosis and 98% of all HIV cases reported [44]. Our sample consisted of 6,221 SC residents who were aged \(\ge 13\) years; were diagnosed with or living with HIV infection between January 1, 2005, and December 31, 2013; had detectable VL at the start of the study period; and had at least two reported VL values during the study period.
This study applies the best model as determined from the simulation study to left-censored heavy-tailed HIV VL data from South Carolina. The aim of applying the PRH model to this dataset is to explain the risk behavior of transitioning from detectable VL to undetectable VL. Patients with undetectable VL at the beginning of the study were defined as being left-censored. Covariates that were assessed include gender (male or female), race (White, Black, or other), HIV risk exposure group (heterosexual, men who have sex with men, or other), place of residence (rural or urban), age at baseline, initial treatment regimen (single tablet regimen, multiple tablet regimen), and baseline CD4 count (200 or less, 201 to 350, 351 to 500, or more than 500). Note that HIV risk exposure group refers to how the patient was first exposed to HIV with options including heterosexual HIV infected partner, men who have sex with other men, injecting drug user, no identifiable risk, and no risk reported. Results from the PRH model are presented and discussed in the next section.
Of the individuals in our sample, 1703 (27%) had an undetectable VL at the beginning of the observation period and were therefore considered left-censored (Table 4). The mean age of the sample at baseline was 40.0 years (range = 14.8–81.6). The majority of subjects were male (n = 3657, 58.8%), Black (n = 4966, 79.8%), and lived in an urban county when diagnosed with HIV (n = 4208, 67.6%). The CD4 count at the beginning of the study was less than 200 cells/mm\(^{3}\) for just over one third of the individuals (34.03%). Almost half of the sample had a missing treatment regimen (n = 2928, 47.1%).
The Generalized Inverse Weibull distribution, which was found to be the best-performing distribution in the simulation study, is applied to analyze the left-censored SC eHARS time-to-event data for detectable-to-undetectable VL transition. Table 5 shows the results of the estimated PRH model using a Generalized Inverse Weibull distribution. Treatment regimen is a very important variable for our model, since it allows us to assess which type of treatment has the most impact, if any, on the transition from detectable to undetectable VL. However, this information is missing for almost 50% of the subjects in our sample. Thus, we first fit the model without the starting treatment regimen (Model 1) and then fit a second model, on a reduced sample, that includes the treatment variable (Model 2). Had there not been such a large proportion of missing values in the treatment variable, we would have fit only Model 2.
Several covariates were shown to have an effect on the time to transition from detectable to undetectable VL. However, the behavior of some of these covariates changed markedly between the model that includes the treatment variable and the model that omits it, which suggests that an interaction may be present between treatment regimen and each of the other covariates. Additional models were run to test for these interactions. The only statistically significant interaction found was between treatment regimen and age, the results of which are shown in Table 6.
The final model is shown in Table 6. Males are 1.11 times more likely to reach undetectable levels faster than their female counterparts (95% CI 1.00, 1.24). White individuals are 1.53 times more likely to reach undetectable levels faster than Black individuals (95% CI 1.40, 1.67). Individuals of other races are 0.86 times as likely to reach undetectable levels faster as Black individuals, though this finding is not significant (95% CI 0.63, 1.16). HIV risk exposure group, place of residence (rural vs urban), and CD4 count do not seem to have any statistically significant impact on the time taken to transition from detectable to undetectable VL levels. The significant interaction between treatment regimen and age indicates that older people living with HIV/AIDS are 0.97 times as likely to reach undetectable levels faster as their younger counterparts (95% CI 0.97, 0.98).
5 Discussion
The current study derived several extensions of the PRH model and conducted extensive simulation studies to evaluate the usefulness of parametric regression models based on the Reversed Hazard Rate for analyzing left-censored heavy-tailed HIV viral load time-to-event data. Simulation studies suggested the best distribution to use under the PRH model is the Generalized Inverse Weibull distribution followed in order of performance by Log-Logistic, Log-Normal, Inverse Gaussian, and Gamma distributions.
Application of this best-performing model to the SC eHARS data revealed important factors affecting the time to transition from detectable to undetectable viral load levels. Males were found to be more likely to reach undetectable levels faster than females. This trend is also evident in several recent studies [4, 7, 25]. The disparity may be attributable to higher rates of treatment adherence among males compared to females. Though some studies did not find an association between gender and treatment adherence, a meta-analysis [22] of 207 studies concluded that males adhere more to ART than females.
White individuals are more likely to reach undetectable levels faster than Black individuals. This is supported by several studies which show that Black individuals are disproportionately affected by HIV/AIDS, as they tend to have poorer access to health care and are less likely to receive treatment, less likely to adhere to treatment, and less likely to survive HIV/AIDS [4, 7, 8, 9, 19, 27, 32].
This study did not find any statistically significant association between place of residence and time to transition from detectable to undetectable VL levels. This may seem contrary to the expectation that individuals who live in urban areas would reach undetectable levels faster than those who live in rural areas, given the typically better access to health care and wider range of specialists available to people living with HIV/AIDS in urban areas [44, 45]. However, other studies (e.g., [9]) have also reported no significant effect of place of residence on detectable-to-undetectable VL transition, in line with the current study.
Finally, the interaction between drug regimen and age indicates that older people who are on a multiple tablet regimen are likely to reach undetectable levels more slowly than their younger counterparts. There are mixed findings on this in the existing literature. Young people with HIV tend to have delayed diagnosis and thus higher VL at baseline. One study [7] suggests that this, along with underutilization of health care due to HIV-related stigma, explains its finding that younger people with HIV reach undetectable levels more slowly than their older counterparts. A possible explanation of our result is that older people are less adherent to treatment [22], or that they have a co-existing morbidity which affects the rate at which they reach undetectable levels.
There are several limitations of the SC eHARS database. Data on VL and CD4 count measurements were not available for those who dropped out of medical care after initial diagnosis, including those who died or moved to a different state. Additionally, persons living with HIV/AIDS who have not been diagnosed are not captured in this database. The database also does not include information on morbidities co-existing with HIV/AIDS, which can alter the effect of drug regimens, especially in older people. Since the interaction between age and drug regimen was found to have a statistically significant effect on the VL transition, co-existing conditions warrant further exploration. These limitations may explain why we did not find associations with some factors that we would expect based on prior research.
Regardless of these limitations, the application to the SC eHARS database provides important information on the trajectories of VL in SC over time. The results obtained in this study can be used to direct researchers in applying more accurate models when studying similar databases. We recommend that the Generalized Inverse Weibull PRH model be used for analyses involving skewed, left-censored heavy-tailed HIV VL data.
References
Bagdonavicius, V., Nikulin, M.: Accelerated Life Models: Modeling and Statistical Analysis. CRC Press, London (2001)
Barlow, R.E., Marshall, A.W., Proschan, F.: Properties of probability distributions with monotone hazard rate. Ann. Math. Stat. 5, 375–389 (1963)
Bebbington, M., Lai, C.D., Zitikis, R.: A flexible Weibull extension. Reliab. Eng. Syst. Saf. 92(6), 719–726 (2007)
Beer, L., Mattson, C.L., Bradley, H., Skarbinski, J., et al.: Understanding cross-sectional racial, ethnic, and gender disparities in antiretroviral use and viral suppression among HIV patients in the United States. Medicine 95(13), e3171 (2016)
Block, H.W., Savits, T.H., Singh, H.: The reversed hazard rate function. Probab. Eng. Inf. Sci. 12(01), 69–90 (1998)
Chakraborty, H., Sen, P.K., Helms, R.W., Vernazza, P.L., Fiscus, S.A., Eron, J.J., Patterson, B.K., Coombs, R.W., Krieger, J.N., Cohen, M.S.: Viral burden in genital secretions determines male-to-female sexual transmission of HIV-1: a probabilistic empiric model. AIDS 15(5), 621–627 (2001)
Chakraborty, H., Iyer, M., Duffus, W.A., Samantapudi, A.V., Albrecht, H., Weissman, S.: Disparities in viral load and CD4 count trends among HIV-infected adults in South Carolina. AIDS Patient Care STDS 29(1), 26–32 (2015)
Chakraborty, H., Weissman, S., Duffus, W.A., Hossain, A., Varma Samantapudi, A., Iyer, M., Albrecht, H.: HIV community viral load trends in South Carolina. Int. J. STD AIDS 28(3), 265–276 (2017)
Chakraborty, H., Hossain, A., Latif, M.A.: A three-state continuous time Markov chain model for HIV disease burden. J. Appl. Stat. 46(9), 1671–1688 (2019)
Cohen, M.S., Chen, Y.Q., McCauley, M., Gamble, T., Hosseinipour, M.C., Kumarasamy, N., Hakim, J.G., Kumwenda, J., Grinsztejn, B., Pilotto, J.H., et al.: Prevention of HIV-1 infection with early antiretroviral therapy. N. Engl. J. Med. 365(6), 493–505 (2011)
Cox, D.R.: Regression models and life-tables. J. R. Stat. Soc. Ser. B (Methodological) 34(2), 187–220 (1972)
Di Crescenzo, A.: Some results on the proportional reversed hazards model. Stat. Probab. Lett. 50(4), 313–321 (2000)
Dieffenbach, C.W.: Preventing HIV transmission through antiretroviral treatment mediated virologic suppression: aspects of an emerging scientific agenda. Curr. Opin. HIV AIDS 7(2), 106–110 (2012)
Eisinger, R.W., Dieffenbach, C.W., Fauci, A.S.: HIV viral load and transmissibility of HIV infection: undetectable equals untransmittable. J. Am. Med. Assoc. 321(5), 451–452 (2019)
Gill, C.J., Griffith, J.L., Jacobson, D., Skinner, S., Gorbach, S.L., Wilson, I.B.: Relationship of HIV viral loads, CD4 counts and HAART use to health-related quality of life. J. Acquir. Immune Defic. Syndr. 30(5), 485–492 (1999)
Gupta, R.C., Wu, H.: Analyzing survival data by proportional reversed hazard model. Int. J. Reliab. Appl. 2(1), 1–26 (2001)
Henderson, R., Diggle, P., Dobson, A.: Joint modelling of longitudinal measurements and event time data. Biostatistics 1(4), 465–480 (2000)
Kleinschmidt, I., Schwabe, C., Benavente, L., Torrez, M., Ridl, F.C., Segura, J.L., Ehmer, P., Nchama, G.N.: Marked increase in child survival after four years of intensive malaria control. Am. J. Trop. Med. Hyg. 80(6), 882–888 (2009)
Kong, M.C., Nahata, M.C., Lacombe, V.A., Seiber, E.E., Balkrishnan, R.: Association between race, depression, and antiretroviral therapy adherence in a low-income population with HIV infection. J. Gen. Intern. Med. 27(9), 1159–1164 (2012)
Kundu, D., Gupta, R.D.: A class of bivariate models with proportional reversed hazard marginals. Sankhya B 72(2), 236–253 (2010)
Lai, C., Xie, M., Murthy, D.: A modified Weibull distribution. IEEE Trans. Reliab. 52(1), 33–37 (2003)
Langebeek, N., Gisolf, E.H., Reiss, P., Vervoort, S.C., Hafsteinsdóttir, T.B., Richter, C., Sprangers, M.A., Nieuwkerk, P.T.: Predictors and correlates of adherence to combination antiretroviral therapy (ART) for chronic HIV infection: a meta-analysis. BMC Med. 12(1), 142 (2014)
Leung, K.M., Elashoff, R.M., Afifi, A.A.: Censoring issues in survival analysis. Annu. Rev. Public Health 18(1), 83–104 (1997)
Lindsey, J.: A study of interval censoring in parametric regression models. Lifetime Data Anal. 4(4), 329–354 (1998)
Manolescu, L., Marinescu, P.: Sex differences in HIV-1 viral load and absolute CD4 cell count in long term survivors HIV-1 infected patients from Giurgiu, Romania. Rom. Rev. Lab. Med. 21(2), 217–224 (2013)
Marshall, A.W., Olkin, I.: A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families. Biometrika 84(3), 641–652 (1997)
Muthulingam, D., Chin, J., Hsu, L., Scheer, S., Schwarcz, S.: Disparities in engagement in care and viral suppression among persons with HIV. JAIDS J. Acquir. Immune Defic. Syndr. 63(1), 112–119 (2013)
Nanda, A.K., Das, S.: Dynamic proportional hazard rate and reversed hazard rate models. J. Stat. Plan. Inference 141(6), 2108–2119 (2011)
Nanda, A.K., Shaked, M.: The hazard rate and the reversed hazard rate orders, with applications to order statistics. Ann. Inst. Stat. Math. 53(4), 853–864 (2001)
Olatosi, B.A., Probst, J.C., Stoskopf, C.H., Martin, A.B., Duffus, W.A.: Patterns of engagement in care by HIV-infected adults: South Carolina, 2004–2006. AIDS 23(6), 725–730 (2009)
Passamonti, F., Rumi, E., Arcaini, L., Boveri, E., Elena, C., Pietra, D., Boggi, S., Astori, C., Bernasconi, P., Varettoni, M., et al.: Prognostic factors for thrombosis, myelofibrosis, and leukemia in essential thrombocythemia: a study of 605 patients. Haematologica 93(11), 1645–1651 (2008)
Reif, S., Geonnotti, K.L., Whetten, K.: HIV infection and AIDS in the Deep South. Am. J. Public Health 96(6), 970–973 (2006)
Reif, S., Whetten, K., Wilson, E., Gong, W.: HIV/AIDS epidemic in the South reaches crisis proportions in last decade (2011)
Rosholm, M.: An analysis of the processes of labor market exclusion and (re-) inclusion (2001)
Rosholm, M.: HIV/aids epidemic in the south reaches crisis proportions in last decade (2002)
Roy, D.: A characterization of model approach for generating bivariate life distributions using reversed hazard rates. J. Jpn. Stat. Soc. 32(2), 239–245 (2002)
Sankaran, P., Gleeja, V.: On bivariate reversed hazard rates. J. Jpn. Stat. Soc. 36(2), 213–224 (2006)
South Carolina Department of Health and Environmental Control (SCDHEC): An epidemiologic profile of HIV and AIDS in South Carolina (2015)
Sparling, Y.H., Younes, N., Lachin, J.M., Bautista, O.M.: Parametric survival models for interval-censored data with time-dependent covariates. Biostatistics 7(4), 599–614 (2006)
Thiébaut, R., Jacqmin-Gadda, H., Babiker, A., Commenges, D.: Joint modelling of bivariate longitudinal data with informative dropout and left-censoring, with application to the evolution of CD4+ cell count and HIV RNA viral load in response to treatment of HIV infection. Stat. Med. 24(1), 65–82 (2005)
Tripathi, A., Youmans, E., Gibson, J.J., Duffus, W.A.: The impact of retention in early HIV medical care on viro-immunological parameters and survival: a statewide study. AIDS Res. Hum. Retroviruses 27(7), 751–758 (2011)
US Department of Health & Human Services: Viral Load (2020). https://www.aids.gov/hiv-aids-basics/index.html
Variyath, A., Sankaran, P.: Parametric regression models using reversed hazard rates. J. Probab. Stat. 2014, 1–5 (2014)
Weis, K.E., Liese, A.D., Hussey, J., Gibson, J.J., Duffus, W.A.: Associations of rural residence with timing of HIV diagnosis and stage of disease at diagnosis, South Carolina 2001–2005. J. Rural Health 26(2), 105–112 (2010)
Weissman, S., Duffus, W.A., Iyer, M., Chakraborty, H., Samantapudi, A.V., Albrecht, H.: Rural-urban differences in HIV viral loads and progression to AIDS among new HIV cases. South. Med. J. 108(3), 180–188 (2015)
World Health Organization: WHO Fact Sheets: HIV/AIDS (2020). https://www.who.int/news-room/fact-sheets/detail/hiv-aids
Acknowledgements
This research was supported by the Duke University Center for AIDS Research (CFAR), a NIH-funded program (5P30 AI064518). We thank Teresa G. Stephens from the Division of Surveillance and Technical Support, South Carolina Department of Health and Environmental Control, for her help with data management, and Chris Finney, Demographics Division, South Carolina Department of Revenue and Fiscal Affairs, for creating the de-identified data extract used in this study. Opinions stated herein are the authors’ and do not represent any official position of the South Carolina Department of Health and Environmental Control or the South Carolina Department of Revenue and Fiscal Affairs.
Additional information
Communicated by Rafiqul I. Chowdhury.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
1.1 Exponential Distribution
When the lifetime random variable follows an Exponential distribution, the baseline distribution function is given by
The baseline Reversed Hazard Rate of T is then obtained as
In the presence of the covariates X, we have the following
From these, the likelihood function for the Exponential distribution is obtained as
so that the log likelihood function is
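As a worked sketch of the steps in this subsection, the expressions can be written out under two assumptions that are not restated in this appendix: the rate parametrization of the Exponential distribution, and the proportional form of the reversed hazards model, \(r(t \mid X) = r_0(t)\exp(X^\top \beta)\). Here \(\delta_i = 1\) marks an observed time and \(\delta_i = 0\) a left-censored one. The symbols \(\lambda\), \(\beta\) and \(\delta_i\) are illustrative and may differ from the notation of the main text.

```latex
\begin{align*}
F_0(t) &= 1 - e^{-\lambda t}, \qquad t > 0,\ \lambda > 0, \\
r_0(t) &= \frac{f_0(t)}{F_0(t)}
        = \frac{\lambda e^{-\lambda t}}{1 - e^{-\lambda t}}
        = \frac{\lambda}{e^{\lambda t} - 1}, \\
F(t \mid X) &= \left[F_0(t)\right]^{\exp(X^\top \beta)}, \qquad
r(t \mid X) = r_0(t)\, \exp(X^\top \beta), \\
L &= \prod_{i=1}^{n} \left[f(t_i \mid X_i)\right]^{\delta_i}
     \left[F(t_i \mid X_i)\right]^{1-\delta_i}
   = \prod_{i=1}^{n} \left[r(t_i \mid X_i)\right]^{\delta_i} F(t_i \mid X_i), \\
\ell &= \sum_{i=1}^{n} \Big\{ \delta_i \big[ \log \lambda
        - \log\!\big(e^{\lambda t_i} - 1\big) + X_i^\top \beta \big]
        + \exp(X_i^\top \beta)\, \log\!\big(1 - e^{-\lambda t_i}\big) \Big\}.
\end{align*}
```

The second form of \(L\) uses \(f = r\,F\), which is why every subject contributes a factor \(F(t_i \mid X_i)\) and only the observed times contribute a reversed hazard term.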
1.2 Log-normal Distribution
When the lifetime random variable follows a Log-normal distribution, the baseline distribution function is given by
The baseline Reversed Hazard Rate of T is then obtained as
In the presence of the covariates X, we have the following
From these, the likelihood function for the Log-normal distribution is obtained as
so that the log likelihood function is
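For numerical work, the Log-normal baseline reversed hazard rate can be evaluated directly as the ratio of the density to the distribution function. The following is a minimal Python sketch, assuming the usual \((\mu, \sigma)\) parametrization on the log scale; the function names are hypothetical and are not taken from the paper.

```python
import math


def lognormal_cdf(t, mu, sigma):
    """Baseline distribution function F0(t) = Phi((log t - mu) / sigma)."""
    z = (math.log(t) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lognormal_pdf(t, mu, sigma):
    """Baseline density f0(t) = phi(z) / (sigma * t), z = (log t - mu) / sigma."""
    z = (math.log(t) - mu) / sigma
    return math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))


def lognormal_reversed_hazard(t, mu, sigma):
    """Baseline reversed hazard rate r0(t) = f0(t) / F0(t)."""
    return lognormal_pdf(t, mu, sigma) / lognormal_cdf(t, mu, sigma)
```

Because the reversed hazard rate is the derivative of \(\log F_0(t)\), a finite-difference check on the logarithm of `lognormal_cdf` gives a quick sanity test. For times far in the lower tail the distribution function underflows, so a production implementation would need a log-scale evaluation instead of this direct ratio.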
1.3 Inverse Gaussian Distribution
When the lifetime random variable follows an Inverse Gaussian distribution, the baseline distribution function is given by
The baseline Reversed Hazard Rate of T is then obtained as
In the presence of the covariates X, we have the following
From these, the likelihood function for the Inverse Gaussian distribution is obtained as
so that the log likelihood function is
1.4 Log-logistic Distribution
When the lifetime random variable follows a Log-logistic distribution, the baseline distribution function is given by
The baseline Reversed Hazard Rate of T is then obtained as
In the presence of the covariates X, we have the following
From these, the likelihood function for the Log-logistic distribution is obtained as
so that the log likelihood function is
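For the Log-logistic case the baseline quantities have simple closed forms. The block below is a sketch under an assumed scale-shape parametrization, with scale \(\alpha > 0\) and shape \(\kappa > 0\); these symbols and the proportional form \(r(t \mid X) = r_0(t)\exp(X^\top \beta)\) are assumptions and may not match the paper's notation.

```latex
\begin{align*}
F_0(t) &= \frac{(t/\alpha)^{\kappa}}{1 + (t/\alpha)^{\kappa}}, \qquad t > 0, \\
f_0(t) &= \frac{(\kappa/\alpha)\,(t/\alpha)^{\kappa - 1}}
               {\left[1 + (t/\alpha)^{\kappa}\right]^{2}}, \\
r_0(t) &= \frac{f_0(t)}{F_0(t)}
        = \frac{\kappa / t}{1 + (t/\alpha)^{\kappa}}, \\
r(t \mid X) &= \frac{\kappa / t}{1 + (t/\alpha)^{\kappa}}\, \exp(X^\top \beta).
\end{align*}
```

Under this parametrization the baseline reversed hazard rate is strictly decreasing in \(t\) for every \(\kappa > 0\).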
1.5 Gompertz–Makeham Distribution
When the lifetime random variable follows a Gompertz–Makeham distribution, the baseline distribution function is given by
The baseline Reversed Hazard Rate of T is then obtained as
In the presence of the covariates X, we have the following
From these, the likelihood function for the Gompertz–Makeham distribution is obtained as
so that the log likelihood function is
1.6 Gamma Distribution
When the lifetime random variable follows a Gamma distribution, the baseline distribution function is given by
where \(\gamma (\alpha ,t)\) is the incomplete Gamma function and \(\varGamma (\alpha )\) is the complete Gamma function. The baseline Reversed Hazard Rate of T is then obtained as
In the presence of the covariates X, we have the following
From these, the likelihood function for the Gamma distribution is obtained as
so that the log likelihood function is
Note that the Exponential distribution is a special case of this result.
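This remark can be checked in one line, assuming the rate parametrization \(F_0(t) = \gamma(\alpha, \lambda t)/\varGamma(\alpha)\), where the rate symbol \(\lambda\) is an assumption made here for illustration:

```latex
\[
\alpha = 1:\qquad
F_0(t) = \frac{\gamma(1, \lambda t)}{\varGamma(1)}
       = \int_0^{\lambda t} e^{-u}\, du
       = 1 - e^{-\lambda t}.
\]
```

This is the Exponential distribution function, so the Gamma reversed hazard rate and likelihood reduce to their Exponential counterparts when \(\alpha = 1\).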
1.7 Generalized Gamma Distribution
When the lifetime random variable follows a Generalized Gamma distribution, the baseline distribution function is given by
where \(\gamma (\alpha ,t)\) is the incomplete Gamma function and \(\varGamma (\alpha )\) is the complete Gamma function. The baseline Reversed Hazard Rate of T is then obtained as
In the presence of the covariates X, we have the following
From these, the likelihood function for the Generalized Gamma distribution is obtained as
so that the log likelihood function is
1.8 Inverse Gamma Distribution
When the lifetime random variable follows an Inverse Gamma distribution, the baseline distribution function is given by
where \(\gamma (\alpha ,t)\) is the incomplete Gamma function and \(\varGamma (\alpha )\) is the complete Gamma function. The baseline Reversed Hazard Rate of T is then obtained as
In the presence of the covariates X, we have the following
From these, the likelihood function for the Inverse Gamma distribution is obtained as
so that the log likelihood function is
1.9 Weibull Distribution
When the lifetime random variable follows a Weibull distribution, the baseline distribution function is given by
The baseline Reversed Hazard Rate of T is then obtained as
In the presence of the covariates X, we have the following
From these, the likelihood function for the Weibull distribution is obtained as
so that the log likelihood function is
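To make the structure of this log likelihood concrete, the following Python sketch evaluates it for a Weibull baseline. It assumes the shape-scale form \(F_0(t) = 1 - \exp\{-(t/\lambda)^{k}\}\), the proportional reversed hazards form \(F(t \mid X) = [F_0(t)]^{\exp(X^\top \beta)}\), and an indicator equal to 1 for observed times and 0 for left-censored ones. All function and argument names are hypothetical; this is not the authors' implementation.

```python
import math


def weibull_log_cdf(t, shape, scale):
    """log F0(t) for F0(t) = 1 - exp(-(t / scale) ** shape)."""
    # -expm1(-u) equals 1 - exp(-u) and stays accurate for small u.
    return math.log(-math.expm1(-((t / scale) ** shape)))


def weibull_log_reversed_hazard(t, shape, scale):
    """log r0(t) = log f0(t) - log F0(t)."""
    u = (t / scale) ** shape
    log_pdf = math.log(shape / scale) + (shape - 1.0) * math.log(t / scale) - u
    return log_pdf - weibull_log_cdf(t, shape, scale)


def prh_log_likelihood(times, observed, covariates, beta, shape, scale):
    """Left-censored log likelihood under proportional reversed hazards.

    Each subject contributes exp(eta) * log F0(t); subjects with an
    observed time add log r0(t) + eta, where eta = x' beta.
    """
    total = 0.0
    for t, d, x in zip(times, observed, covariates):
        eta = sum(b * xi for b, xi in zip(beta, x))
        total += math.exp(eta) * weibull_log_cdf(t, shape, scale)
        if d:
            total += weibull_log_reversed_hazard(t, shape, scale) + eta
    return total
```

A call such as `prh_log_likelihood([1.0, 2.0], [1, 0], [[0.0], [0.0]], [0.5], 1.0, 1.0)` treats the first time as observed and the second as left-censored. The function could be passed (negated) to a general-purpose optimizer to obtain maximum likelihood estimates, although no particular optimizer is implied by the paper.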
1.10 Generalized Inverse Weibull Distribution
When the lifetime random variable follows a Generalized Inverse Weibull distribution, the baseline distribution function is given by
The baseline Reversed Hazard Rate of T is then obtained as
In the presence of the covariates X, we have the following
From these, the likelihood function for the Generalized Inverse Weibull distribution is obtained as
so that the log likelihood function is
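The Generalized Inverse Weibull case is notable because the reversed hazard rate is a pure power function of \(t\). The sketch below assumes one common three-parameter form of the distribution function and the proportional form \(F(t \mid X) = [F_0(t)]^{\exp(X^\top \beta)}\); the symbols \(\alpha\), \(\kappa\), \(\gamma\) and the indicator \(\delta_i\) (1 for an observed time, 0 for a left-censored one) are assumptions and may differ from the paper's parametrization.

```latex
\begin{align*}
F_0(t) &= \exp\!\left[-\gamma \left(\frac{\alpha}{t}\right)^{\kappa}\right],
          \qquad t > 0, \\
r_0(t) &= \frac{d}{dt} \log F_0(t)
        = \gamma\, \kappa\, \alpha^{\kappa}\, t^{-(\kappa + 1)}, \\
F(t \mid X) &= \exp\!\left[-\gamma\, e^{X^\top \beta}
               \left(\frac{\alpha}{t}\right)^{\kappa}\right], \qquad
r(t \mid X) = \gamma\, \kappa\, \alpha^{\kappa}\, t^{-(\kappa + 1)}\, e^{X^\top \beta}, \\
\ell &= \sum_{i=1}^{n} \Big\{ \delta_i \big[ \log(\gamma \kappa)
        + \kappa \log \alpha - (\kappa + 1) \log t_i + X_i^\top \beta \big]
        - \gamma\, e^{X_i^\top \beta} \left(\frac{\alpha}{t_i}\right)^{\kappa} \Big\}.
\end{align*}
```

Under these assumptions the covariates act only through the rescaling \(\gamma \mapsto \gamma\, e^{X^\top \beta}\), so the conditional distribution remains within the same family.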
1.11 Modified Weibull Distribution
When the lifetime random variable follows a Modified Weibull distribution [21], the baseline distribution function is given by
The baseline Reversed Hazard Rate of T is then obtained as
In the presence of the covariates X, we have the following
From these, the likelihood function for the Modified Weibull distribution is obtained as
so that the log likelihood function is
1.12 Flexible Weibull Distribution
When the lifetime random variable follows a Flexible Weibull distribution [3], the baseline distribution function is given by
The baseline Reversed Hazard Rate of T is then obtained as
In the presence of the covariates X, we have the following
From these, the likelihood function for the Flexible Weibull distribution is obtained as
so that the log likelihood function is
1.13 Power Generalized Weibull Distribution
When the lifetime random variable follows a Power Generalized Weibull distribution [1], the baseline distribution function is given by
The baseline Reversed Hazard Rate of T is then obtained as
In the presence of the covariates X, we have the following
From these, the likelihood function for the Power Generalized Weibull distribution is obtained as
so that the log likelihood function is
1.14 Marshall–Olkin Distribution
When the lifetime random variable follows a Marshall–Olkin distribution [26], the baseline distribution function is given by
The baseline Reversed Hazard Rate of T is then obtained as
In the presence of the covariates X, we have the following
From these, the likelihood function for the Marshall–Olkin distribution is obtained as
so that the log likelihood function is
Cite this article
Hossain, A., Islam, F. & Chakraborty, H. Parametric Regression Model Based on Reversed Hazard Rate: An Application to left censored heavy tailed HIV Viral Load Data. Bull. Malays. Math. Sci. Soc. 45 (Suppl 1), 567–598 (2022). https://doi.org/10.1007/s40840-022-01360-7
DOI: https://doi.org/10.1007/s40840-022-01360-7
Keywords
- Left censored data
- Hazard Rate
- HIV/AIDS
- Reversed Hazard Rate
- Parametric regression
- Survival analysis
- Viral Load