Introduction

Cluster or group-randomized trials (GRTs) are often used to test the efficacy of HIV prevention interventions [1, 2]. In GRTs, intact social groups (e.g. clinics, communities, or schools), rather than individual participants, are randomly assigned to study conditions. This study design is used when individual randomization is not feasible, either because the intervention is targeted at the community or structural level or because investigators are concerned about contamination of study conditions due to frequent interaction between participants. GRTs have been used in resource-limited settings to test a number of HIV prevention interventions including HIV testing and counseling [3], peer education [4–6], mass media [7, 8], risk reduction counseling [9], and treatment of sexually transmitted infections [5, 10–12].

Individuals within groups in GRTs tend to share characteristics and experiences, inducing correlation among observations within groups and violating the assumption of independence of observations required by many statistical tests. Failure to adjust for this within-group correlation in analyses can increase the chance that an investigator will incorrectly observe a difference between the two study conditions when no difference actually exists (i.e. inflate the risk of a type I error) [13, 14].

The amount of similarity among observations in GRTs is typically quantified by the intraclass correlation coefficient (ICC). Values of the ICC generally range from 0 to 1 (though they can be negative) and vary by type of outcome variable, group type, and group size [15]. To ensure that a GRT is adequately powered, the sample size must be increased by a variance inflation factor (VIF; also called a “design effect”) [13]. The VIF is [1 + (m − 1) ICC], where m is the average group size. Both the magnitude of the ICC and group size will impact power to detect an effect, though group size has less of an impact than the number of groups [13]. To properly calculate the VIF and the sample size needed for a GRT, investigators should have an accurate estimate of the expected ICC based on prior studies. However, very few HIV trials have published ICC estimates [16] and even fewer have published these estimates for resource-limited settings [16]. A 2004 extension of the CONSORT statement for cluster-randomized trials requires investigators reporting the results of cluster-randomized trials to report both the ICC used for sample size estimation and ICCs for primary outcomes [17], but few investigators in HIV prevention seem aware of this requirement [18].
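The variance inflation factor defined above can be illustrated with a short Python sketch. The function names and the example inputs (18 groups of 200 members, ICC of 0.05) are hypothetical values chosen for illustration and are not results from this study.

```python
def variance_inflation_factor(m: float, icc: float) -> float:
    """Return the VIF (design effect) 1 + (m - 1) * ICC for average group size m."""
    if m < 1:
        raise ValueError("average group size m must be at least 1")
    return 1.0 + (m - 1.0) * icc


def effective_sample_size(n_total: float, m: float, icc: float) -> float:
    """Return the number of independent observations that n_total clustered
    observations are roughly equivalent to, i.e. n_total divided by the VIF."""
    return n_total / variance_inflation_factor(m, icc)


# Hypothetical example: 18 groups of 200 members each with an assumed ICC of 0.05.
example_vif = variance_inflation_factor(m=200, icc=0.05)
example_ess = effective_sample_size(n_total=18 * 200, m=200, icc=0.05)
print(f"VIF = {example_vif:.2f}, effective sample size = {example_ess:.0f}")
```

Because the VIF grows with both m and the ICC, even a small ICC can substantially reduce the effective sample size when groups are large.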

Covariate adjustment in analysis of a GRT can reduce the magnitude of an ICC by explaining between-group variation, and a reduced ICC will improve the power of a study to detect an intervention effect [15]. However, covariate adjustment can also decrease power by increasing between-group variation and making the ICC larger. The increase is due to confounding. In the unadjusted data, the members of a group appear less similar than they really are due to the uneven distribution of a covariate among the groups. Adjustment for the covariate reveals the confounding, as the members of a group now appear more similar to one another than to members of other groups, reflected in higher ICC. Therefore, it is beneficial to explore not just ICC estimates, but also the impact of covariate adjustment on ICC estimates using covariates other investigators are likely to measure.

This paper presents ICC estimates from a GRT evaluating a clinic-based HIV prevention intervention conducted in sub-Saharan Africa. We provide ICC estimates for a variety of common variables related to HIV risk behavior and physical and mental health, and compare unadjusted ICCs with ICCs adjusted for key covariates to determine where covariate adjustment can reduce ICCs. We then demonstrate sample size estimation for GRTs using our ICC estimates. The ICC estimates from this analysis will allow investigators planning similar studies to more precisely estimate the required sample size for future HIV prevention trials.

Methods

Study Design

The HIV Prevention for People Living with HIV Project was a group-randomized controlled trial evaluating a clinic-based HIV prevention intervention for people living with HIV. Eighteen HIV care and treatment clinics in three sub-Saharan African countries (Kenya, Namibia and Tanzania) were paired on clinic characteristics (e.g. number of patients enrolled in care, health care provider/patient ratio) and then randomly assigned to either the intervention or control arm. A detailed report of the study design, intervention and patient characteristics was previously published [19]. Approximately 200 patient participants were enrolled from each clinic and interviewed using a structured questionnaire at baseline and at 6 and 12 months post-intervention. For this analysis, we used data collected at baseline from patient questionnaires and medical chart reviews.

Study Measures

Physical Health

ICCs were estimated for several measures of study participants’ physical health status. CD4 count (cells/mm³) is presented as a continuous variable and as two categorical variables representing different thresholds for anti-retroviral treatment initiation. Participants’ level of HIV medication adherence within the past 30 days was measured in two ways. They were first asked to report how many times they had missed a dose of their prescribed medication(s). Participants were then shown a visual analog scale (VAS) and asked to mark their level of HIV medication adherence on the scale. Physical functioning and mental health over the past 30 days were assessed using the SF-8 Health Survey, an 8-item scale [20, 21].

Mental Health

The Center for Epidemiologic Studies Depression Scale (CES-D Scale) was used to assess symptoms of depression experienced by participants over the previous week [22]. Social support (physical, emotional, and other support) was assessed with an 8-item measure. The last question asked whether someone would be able to take care of the participants’ children if he/she was sick. The total score for participants without children was multiplied by 8/7 to preserve the range of 0–8 for all respondents.

HIV Risk Behaviors

Disclosure and Knowledge of Partner’s Status

Participants were asked whether they had disclosed their HIV status to sex partners and/or disclosed to anyone else. Participants were also asked whether their sex partners had been tested for HIV and whether they knew the HIV status of each partner.

Condom/Contraceptive Use

Participants reported how many times they had sex and how many times they used condoms during vaginal sex in the past 90 days and in the past 14 days, and a dichotomous “consistent condom use” variable was created from these questions. Participants also reported whether they used a condom during their last act of vaginal sex. Dual contraceptive method use was defined as consistent condom use in addition to one of the following methods of contraception: oral contraceptive pills, intra-uterine device (IUD), injectables, implants or female sterilization.

Other HIV Risk Behaviors

Questions about exchanging sex for a place to stay, money, food and/or gifts during the past 6 months were collapsed into one dichotomous sex exchange variable (yes/no). Female participants were asked whether someone had hit, slapped, kicked or inflicted physical harm upon them and/or forced them to have sexual intercourse or perform sexual acts within the past 6 months (intimate partner violence). The World Health Organization’s Alcohol Use Disorders Identification Test (AUDIT), a 10-item scale, was used to assess participants’ alcohol use [23]. ICCs from this scale are presented in two ways: (1) as a dichotomous variable comparing non-drinker/non-harmful drinker versus harmful/likely dependent drinker over a 6-month period; and (2) as a dichotomous variable (yes/no) indicating any binge drinking during the past 6 months. A cutoff of 8 on the AUDIT scale was used to categorize participants into the harmful/likely dependent drinker category. Binge drinking was defined as consuming 6 or more drinks on one occasion.

Statistical Methods

We calculated ICCs for each of the outcome measures of interest including binary and continuous variables by using the following equation:

$$ {{\uprho}} = \frac{{{{\upsigma}}_{\text{g}}^{2} }}{{{{\upsigma}}_{\text{m}}^{2} + {{\upsigma}}_{\text{g}}^{2} }} $$

where \( {{\upsigma}}_{\text{g}}^{2} \) is the between-group component of variance and \( {{\upsigma}}_{\text{m}}^{2} \) is the within-group component of variance, following the notation of Murray [24]. For the present analyses, g represents the clinic, and m represents participants within clinic. We estimated ICCs: (1) without covariates, (2) with country as a covariate, and (3) with country and other socio-demographic covariates (gender, age, any paid work and time since HIV diagnosis). Each of the two adjusted models was compared with the unadjusted model using the percentage change in the ICC estimate: \( {\text{ICC}}_{\text{change}} = ({{\uprho}}_{\text{adj}} - {{\uprho}}_{\text{unadj}})/{{\uprho}}_{\text{unadj}} \times 100 \).
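The ICC definition and the percentage-change comparison described above can be sketched in Python as follows. The function names and the numeric inputs are hypothetical and serve only to illustrate the arithmetic. In practice the variance components would come from a fitted mixed model.

```python
def icc(var_between: float, var_within: float) -> float:
    """Return rho = sigma_g^2 / (sigma_m^2 + sigma_g^2)."""
    total = var_between + var_within
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_between / total


def icc_percent_change(rho_adj: float, rho_unadj: float) -> float:
    """Return (rho_adj - rho_unadj) / rho_unadj * 100.
    Negative values mean covariate adjustment reduced the ICC."""
    if rho_unadj == 0:
        raise ValueError("unadjusted ICC must be non-zero")
    return (rho_adj - rho_unadj) / rho_unadj * 100.0


# Hypothetical variance components (not taken from the study tables).
rho_unadjusted = icc(var_between=0.04, var_within=0.96)
rho_adjusted = icc(var_between=0.03, var_within=0.97)
change = icc_percent_change(rho_adjusted, rho_unadjusted)
print(f"unadjusted = {rho_unadjusted:.3f}, adjusted = {rho_adjusted:.3f}, "
      f"change = {change:.1f} %")
```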

We used SAS version 9.2 [25] to conduct all analyses. We estimated the variance component for clinic and participants within each clinic by using the SAS GLIMMIX procedure to fit generalized linear mixed models for binary variables and PROC MIXED to fit linear mixed models for continuous variables. For binary variables, the variance components were converted from logit scale to the linear scale before calculating the ICC [24]. Lower and upper confidence limits were calculated as [26]:

$$ \left( {\frac{{\frac{F}{FU} - 1}}{{m + \frac{F}{FU} - 1}},\;\frac{{\frac{F}{FL} - 1}}{{m + \frac{F}{FL} - 1}}} \right) $$

where F = [ICC*(m − 1) + 1]/(1 − ICC), m is the average number of members per group, FL is the value from the F-distribution at α = 0.025, with m − 1 numerator and g*(m − 1) denominator degrees of freedom, and FU is the value from the F-distribution at α = 0.975 with the same degrees of freedom.

Results

A total of 3,538 study participants, of whom 2,054 (58 %) were female, completed baseline patient questionnaires in the 18 clinics. A full socio-demographic profile of study participants has been previously presented [19]. Table 1 presents descriptive statistics, ICC estimates and 95 % confidence intervals for the physical and mental health variables. The unadjusted ICC estimates ranged from 0.012 for CD4 counts less than 350 cells/mm³ to 0.019 for CD4 counts less than 200 cells/mm³. When we adjusted for country alone, the ICC estimates for these two variables increased slightly. However, after adjusting for all covariates (country, gender, age, any paid work and time since HIV diagnosis), ICCs for the three CD4 count variables decreased by as much as 33 %. ICC estimates for medication adherence during the past 30 days were 0.029 for any missed doses and 0.041 for the visual analog scale. After adjusting for covariates, these values increased slightly.

Table 1 Intraclass correlation coefficients for variables related to the physical and mental health of HIV-positive patients attending HIV clinical care, 2009–2011 (N = 3,538)

The ICC estimates for physical and mental functioning were 0.043 and 0.060, respectively. Those for social support and depression were 0.076 and 0.090, respectively. The percent reduction in covariate-adjusted ICCs ranged from about 8 % to over 16 %.

Table 2 presents descriptive statistics and ICC estimates for HIV risk behavior variables. ICC estimates were 0.027 for disclosure to anyone and 0.039 for disclosure to sex partners and these were reduced by about 20 and 60 % respectively with covariate adjustment. After covariate adjustment, the ICC estimate for knowing whether the partner was tested for HIV decreased while the estimate for knowing the partner’s HIV status increased slightly. ICC estimates for the three condom use variables were 0.020 for condom use during the past 14 days, 0.038 for condom use during the past 90 days, and 0.055 for condom use at the last sexual encounter. Covariate adjustment reduced the estimates for the three condom use variables substantially, with percent change ranging from about −40 % to over −80 %. The ICC estimate for dual method use was 0.049 and decreased about 35 % after covariate adjustment.

Table 2 Intraclass correlation coefficients for HIV risk behavior variables from HIV-positive patients attending HIV clinical care, 2009–2011 (N = 3,538)

Covariate adjustment decreased the estimate for any sex exchange during the past 6 months by about 20 %. The ICC estimate for intimate partner violence (IPV) was 0.008 and this estimate decreased by 25 % following covariate adjustment. The ICC estimate for both binge drinking and harmful/likely dependent drinking was 0.030. After covariate adjustment, the ICC estimates for both variables decreased substantially.

Discussion

This paper is unique in its presentation of ICC estimates for behavioral and clinical variables from an HIV prevention intervention in a resource-limited setting. Though estimates varied widely, we can point to some general trends likely to be helpful to investigators. ICCs for patients’ perceptions of their physical functioning and mental health tended to be higher than those for HIV risk behavior variables. ICCs for CD4 count were lower than those for most of the HIV risk behavior variables. The lowest ICCs were observed for three behavior variables: knowledge of a sex partner’s HIV status, any sex exchange during the past 6 months, and (for female participants only) any experience with intimate partner violence during the past 6 months. Our ICCs follow the pattern noted by others that unadjusted ICCs tend to be smallest for physiological measures, higher for behaviors, and highest for knowledge, attitude, and belief measures [15, 27].

Overall, covariate adjustment reduced ICC estimates for most variables. For HIV risk behavior variables, adjustment for country reduced ICCs by as much as 79 %. Adding socio-demographic covariates, after adjusting for country, further decreased ICCs for most variables except for knowledge of a sex partner’s HIV status and sex exchange during the past 6 months. The often substantial decreases observed in the magnitude of the ICC estimate after covariate adjustment can significantly reduce the number of groups required to adequately power a GRT, lowering costs and participant burden. It is most beneficial to have multiple estimates of ICCs of interest and demonstrations of the impact of adjustment for covariates because ICCs and the effect of covariate adjustment vary due to sampling error and differences in study characteristics.

Only one other publication that we are aware of reports ICCs for sexual risk behavior variables [27]. These ICCs were estimated using data from a recent intervention trial testing the effectiveness of a condom awareness campaign. Although the target population and setting differ (young women in western US cities), the ICC estimates are surprisingly similar to ours. For example, the adjusted ICC estimate for any unprotected sex during the past 90 days (among those who ever had sex) was 0.0143 at the baseline assessment, and our ICC estimate for the same variable was 0.006. ICC estimates for condom use at last sex were also very similar at 0.0163 for the 2009 study and 0.013 for the current study.

When estimating sample size for a GRT, it is desirable to identify published ICC estimates for the same variable using data from a similar study design and target population as those in the study being planned. However, if this is not possible due to the scarcity of published estimates, investigators are encouraged to explore available datasets from past similar studies to estimate ICCs. For this purpose, cross-sectional data may be used if the dataset contains a variable indicating group (cluster) membership. If no such data are available, investigators should use published estimates for as similar a study as possible to the one being planned, keeping in mind that ICCs are generally larger for (1) smaller clusters (e.g. clinics vs. entire communities), (2) behavioral or attitudinal variables (vs. physiologic variables), and (3) studies that do not include multiple timepoints, as including time in the analysis has been demonstrated to reduce ICCs appreciably [15]. ICC estimates used for sample size estimation can be modified following those general guidelines, and a sensitivity analysis (varying the ICC to determine the impact on needed sample size) should be done to ensure that if the ICC is larger than expected, the study will still have sufficient power to detect the expected intervention effect.

To illustrate how ICC estimates can be used to calculate the sample size requirements for a simple cross-sectional comparison between two study conditions in a group randomized trial, consider the following formula, adapted from Murray [24]:

$$ {\text{g}} = \frac{{\left[ {{\widehat{{\text{P}}_{1}}} \left( {1 - {\widehat{{\text{P}}_{1}}}} \right) + {\widehat{{\text{P}}_{2}}} \left( {1 - {\widehat{{\text{P}}_{2}}} } \right)} \right]\left[ {1 + \left( {{\text{m}} - 1} \right){\text{ICC}}_{\text{m:g:c}} } \right]\left( {{\text{t}}_{{{\text{critical}}:{{\upalpha}}/2}} + {\text{t}}_{{{\text{critical}}:{{\upbeta}}}} } \right)^{2} }}{{{\text{m}}\left( {{\widehat{{\text{P}}_{1}}} - {\widehat{{\text{P}}_{2}}} } \right)^{2} }} $$

where g is the number of groups per condition; \( {\widehat{{\text{P}}_{1}}} \) and \( {\widehat{{\text{P}}_{2}}} \) are the event rates in the two study conditions; m is the average number of members per group (clinic); \( {\text{ICC}}_{\text{m:g:c}} \) is the intraclass correlation coefficient measuring the correlation of members nested within groups, nested within study condition; and \( {\text{t}}_{{{\text{critical}}:{{\upalpha}}/2}} \) and \( {\text{t}}_{{{\text{critical}}:{{\upbeta}}}} \) are critical values from the t-distribution. About 76 % of participants reported consistent condom use in the past 90 days and the ICC estimate for this variable was 0.039 (Table 2). If we assume 200 participants in each group, in order to have 80 % power at the 0.05 significance level to detect a minimum increase of 10 percentage points in the proportion reporting condom use in the intervention group (from 76 % to 86 %), we would need about 12 groups per condition. If the ICC drops to about 0.004, as was the case with our covariate adjustment, only about 4 groups per condition would be required. This demonstrates how variation in ICC estimates can dramatically change the sample size necessary to maintain adequate power to detect an intervention effect.
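The calculation described above, with the squared difference between the two event rates in the denominator, can be sketched in Python. To keep the sketch free of external dependencies, it substitutes standard normal critical values for the t critical values. This is an assumption of the sketch, not the method used in this paper, and it yields somewhat fewer groups than the t-based figures quoted above. The function name and the list of ICC values in the sensitivity loop are illustrative.

```python
import math
from statistics import NormalDist


def groups_per_condition(p1: float, p2: float, m: float, icc: float,
                         alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate number of groups per condition for a two-condition
    comparison of proportions in a GRT (normal approximation to the
    t critical values; assumption of this sketch)."""
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)               # power = 1 - beta
    binomial_variance = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    vif = 1.0 + (m - 1.0) * icc                        # variance inflation factor
    return (binomial_variance * vif * (z_alpha + z_beta) ** 2
            / (m * (p1 - p2) ** 2))


# Sensitivity analysis: vary the ICC and observe the required number of groups.
# 0.76 -> 0.86 and m = 200 follow the worked example in the text; the
# intermediate ICC values are illustrative.
for icc_value in (0.004, 0.020, 0.039, 0.060):
    g = groups_per_condition(p1=0.76, p2=0.86, m=200, icc=icc_value)
    print(f"ICC = {icc_value:.3f}: g = {g:.2f} (round up to {math.ceil(g)})")
```

A t-based calculation would iterate, because the degrees of freedom for the critical values depend on the number of groups being solved for.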

This analysis has several limitations. First, the relatively small number of groups resulted in wider confidence intervals than would be the case if we had more groups. Second, the ICC estimates for these variables were derived from an HIV/AIDS prevention trial conducted among HIV-positive patients attending HIV clinical care in a resource-limited setting. It may be difficult to generalize these ICC estimates to other populations and settings. This highlights the need for investigators working in other settings and with other populations to publish their ICC estimates.

Reviews have shown that very few publications of GRTs include evidence of power calculations that have adequately considered the ICC effect. Given the high cost and participant burden associated with GRTs, it is imperative that these studies are adequately powered to detect an intervention effect. ICCs used to estimate sample size for a study should always be reported in the paper reporting the trial results, as required by the CONSORT extension for cluster-randomized trials [17]. This practice will allow others to evaluate the assumptions sample size estimates were based on, and to use the ICC estimates for planning future trials. Investigators should also consider reporting ICCs for other variables they measured that might be used as outcome variables in future trials, and exploring the impact of covariate adjustment on ICC estimates. We hope that the unadjusted and covariate-adjusted ICCs we provided will allow investigators planning similar trials to more precisely estimate the required sample size.