Introduction

A major goal of HIV prevention efforts is to reduce the number of new infections that occur each year. Although the mid-1980s in the United States were marked by as many as 150,000 new infections per year, since the late 1990s the rate has held steady at approximately 40,000 new cases per year (Centers for Disease Control and Prevention [CDC] 2006). Each new case adds to a growing HIV/AIDS prevalence and increases the burden and impact of the disease on public health. Although highly active antiretroviral therapy has greatly increased lifespan for those infected with HIV, a growing HIV-positive population brings with it new challenges, including an increase in potential sources of infection and an increase in drug-resistant strains of HIV (Crepaz et al. 2006). As increasing numbers of individuals live with the disease, there is an additional consequence: the lifetime cost of treating HIV/AIDS. In fact, a recent study suggested that the average lifespan of an HIV-infected individual in the United States is now approximately 24.2 years from the time one enters medical care (Schackman et al. 2006). This study also reported that the average lifetime medical costs for such an individual are projected to be between $303,100 and $618,900. Thus, multiplying 40,000 new HIV infections per year by these medical costs yields figures of between $12,124,000,000 and $24,756,000,000 for each cohort of new infections.
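Written out, the cost range cited above is simply the product of the annual incidence estimate and the projected lifetime cost bounds:

$$
40{,}000 \times \$303{,}100 = \$12{,}124{,}000{,}000
\qquad\text{and}\qquad
40{,}000 \times \$618{,}900 = \$24{,}756{,}000{,}000.
$$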

For a multitude of reasons, then, researchers and practitioners have continued to focus on prevention efforts in order to decrease new HIV infections. Future preventive interventions, however, will have the greatest chance of being effective only if lessons learned from past intervention efforts are seriously considered. The earliest HIV prevention interventions were grassroots programs developed and implemented in 1982, primarily by gay men in San Francisco and New York City, along with activities initiated by CDC in the early 1980s (CDC 2006). Since that time, a multitude of behavioral interventions to reduce HIV risk behavior have been designed, implemented, and formally evaluated, resulting in a large scientific literature on such interventions. Behavioral interventions to reduce HIV risk behavior have been undertaken in a variety of venues, from clinics to community centers, and using a variety of approaches, including face-to-face, small group, and community-level approaches. Such interventions are typically peer- or expert-led and vary in their length and intensity, from one to as many as 12 or more sessions. In addition, whereas some interventions are evaluated based on changes in a single outcome (e.g., increased condom use), others examine changes in multiple outcomes (e.g., reduced unprotected sex, number of sex partners, and sexually transmitted diseases [STDs]).

Efficacy of Behavioral Interventions

What do we know about the efficacy of these interventions in reducing HIV-related sexual risk behavior? Although a number of narrative reviews of such interventions exist, most useful are meta-analyses that quantitatively synthesize the literature. Meta-analyses are characterized by a number of strengths, including (1) exhaustive literature searches; (2) a systematic approach to coding study features; (3) a focus on precise effect sizes rather than solely on statistical significance; (4) an ability to synthesize large literatures; and (5) an ability to empirically test moderators of study outcomes and help understand why certain studies had stronger effects than others (Johnson et al. in press; Noar 2006; Rosenthal 1991).

In the area of behavioral interventions to reduce HIV-related sexual risk behavior, a number of meta-analyses exist. An early meta-analysis of behavioral interventions yielded promising results (Kalichman et al. 1996), while another early meta-analysis called into question the value of standard HIV counseling and testing as a primary prevention strategy (Weinhardt et al. 1999). This was followed by a significant effort by CDC (begun in 1996; Neumann et al. 2002) to synthesize the growing behavioral intervention literature via a meta-analysis initiative. In 2002, CDC published the first results from the Prevention Research Synthesis (PRS) project in a special issue of the Journal of Acquired Immune Deficiency Syndromes. This effort resulted in four meta-analyses involving 99 behavioral interventions and four key risk populations—sexually active adolescents (Mullen et al. 2002), heterosexual adults (Neumann et al. 2002), drug users (Semaan et al. 2002), and men who have sex with men (MSM; Johnson et al. 2002). Each of the projects provided evidence that behavioral interventions can be efficacious in changing HIV sexual risk behavior, although few moderators of effect size were uncovered in those projects (see Des Jarlais and Semaan 2002).

As more behavioral interventions have been evaluated and published, more meta-analyses of such interventions have been completed. The three most visible research groups that have undertaken these projects are CDC’s PRS team (mentioned above), Blair Johnson and his Synthesis of HIV and AIDS Research Project team (SHARP; e.g., Johnson et al. 2003, 2006), and Dolores Albarracín and her Attitudes and Persuasion Lab (e.g., Albarracín et al. 2003, 2005, 2007). Since 2002, these and other researchers have undertaken many additional (updated and new) meta-analytic projects. These include meta-analyses of behavioral interventions with adolescents (Johnson et al. 2003), heterosexual adults (Logan et al. 2002), MSM (Herbst et al. 2005, 2007a), Hispanics/Latinos (Albarracín et al. 2007; Herbst et al. 2007b), injection drug users (Copenhaver et al. 2006), STD clinic patients (Crepaz et al. 2007; Ward et al. 2005), people with severe mental illness (Johnson-Masotti et al. 2003), and individuals living with HIV (Crepaz et al. 2006; Johnson et al. 2006). Projects focusing on particular types of interventions, rather than particular target populations, have also appeared in the literature. These include meta-analyses focused on condom use communications (Albarracín et al. 2003), intervention source characteristics (Durantini et al. 2006), long-term intervention effects (Albarracín et al. 2005), eroticizing safer sex strategies (Scott-Sheldon and Johnson 2006), fear arousal and counseling and testing (Earl and Albarracín 2007), and frequency of sexual activity following safer sex interventions (Smoak et al. 2006). Given that the intervention literature has continued to grow at a rapid pace, these more recent meta-analyses have been able to take advantage of the more sophisticated analyses that larger literatures allow. In particular, more recent projects have been better able to determine which factors moderate intervention effects (i.e., which study features are associated with stronger effects on outcome variables).

Given the large number of behavioral interventions that have now been evaluated and published in the literature, and the multitude of meta-analyses that have examined such interventions across numerous at-risk populations, the following questions are posed: What kinds of effects do “typical” behavioral interventions produce across key outcomes such as condom use, unprotected sex, number of sex partners, and incident STDs? And how consistent are these effects across differing at-risk populations? These are critical questions, the answers to which may serve a variety of purposes. That is, understanding typical effects of behavioral interventions may (1) aid researchers, practitioners, and policy-makers in understanding the kinds of effects that such interventions do (and do not) achieve, which may help in making decisions about whether and when to implement and disseminate such interventions in practice; (2) aid researchers conducting cost-effectiveness studies by supplying figures for typical intervention effects; and (3) help researchers understand whether interventions tend to have similar or different effects across at-risk populations and key outcomes.

In addition, another critical question posed is the following: what moderators of intervention efficacy have been uncovered in extant meta-analyses, and which moderators have most consistently been shown to be associated with larger or smaller intervention effects? Researchers in the area of HIV prevention behavioral interventions have long suggested that particular principles, such as use of behavioral theory, targeting of interventions, and safer sex skill-building, are critical components of effective interventions (Edgar et al. 2008a; Peterson and DiClemente 2000). However, does meta-analytic evidence provide empirical support for these and other principles? Understanding which study features are associated with more efficacious interventions can help behavioral scientists to understand key ingredients of efficacious interventions and inform the development of the next generation of interventions to reduce HIV infection.

Current Study

The overriding purpose of the current study was to review extant meta-analyses of behavioral interventions aimed at reducing HIV-related sexual risk behavior in a defined target population. Specifically, it was of interest to (1) examine and compare effect sizes across differing behavioral outcomes in order to examine, across the meta-analytic literature, the kinds of effects that are typical in such interventions; (2) examine and compare effect sizes across differing target populations in order to examine how interventions have performed across populations; and (3) examine which moderators of effect size have been most often found to be associated with larger intervention effects.

Methods

Search Strategy and Inclusion Criteria

A comprehensive search to locate meta-analyses of behavioral interventions to reduce HIV-related sexual risk behavior was undertaken. The intent was to locate all meta-analyses published in peer-reviewed journals that were available (in print or electronic form) through May 2007 and met criteria for this review. Comprehensive searches of the Medline and PsycINFO computerized databases were conducted using combinations of the following keywords: sexual risk, HIV/AIDS, prevention, behavioral intervention, program, condom use, unprotected sex; and meta-analysis, research synthesis, systematic review. Articles were also located based on the author’s own knowledge of such meta-analyses and by examining citations within review and meta-analysis articles. Meta-analyses were included in the review if they:

  1. conducted a meta-analysis (quantitative research synthesis) of formally developed and evaluated HIV prevention behavioral interventions targeting sexual risk behavior;

  2. were focused on a defined target population, such as MSM or heterosexual adults. Meta-analyses that included numerous target populations together were excluded;

  3. examined one or more of the following outcome variables: condom use, unprotected sex, number of sexual partners, STD acquisition, sexual risk composite. Meta-analyses that focused solely on pregnancy prevention programs were excluded, as the focus here is HIV prevention;

  4. were published in a peer-reviewed journal.

Initial searches using the search strategies discussed above resulted in hundreds of abstracts examined for relevance. Forty-nine articles that had the potential to be included in the review were located and examined. Of these:

  • Eighteen articles (37%) were excluded because they were narrative reviews of the literature rather than meta-analyses (e.g., Robin et al. 2004).

  • Seven articles (14%) were excluded because they did not focus on one particular target population, but rather included many target populations together in the same meta-analysis (e.g., Albarracín et al. 2003; Kalichman et al. 1996). It should be noted that the intervention literature examined in these excluded meta-analyses overlaps greatly with the intervention literature examined in the final set of included meta-analyses.

  • Two articles (4%) were excluded because they focused on pregnancy (rather than HIV) prevention (e.g., DiCenso et al. 2002).

  • Two articles (4%) were excluded because rather than testing formally developed behavioral interventions, they examined the efficacy of HIV counseling and testing as it tends to be implemented in practice (Hutchinson et al. 2006; Weinhardt et al. 1999). The authors of the Weinhardt et al. (1999) meta-analysis themselves indicated that there is an important distinction between formally developed interventions and counseling and testing as it is implemented in practice (Weinhardt et al. 2000).

  • One article (2%) was excluded because it was a meta-analysis of P values rather than effect size (Mize et al. 2002), and as such could not be meaningfully compared to the other meta-analyses included in this review.

  • One report (2%) was excluded because it was not published in a peer-reviewed journal (Johnson et al. 2007). A number of very similar meta-analyses were published in peer-reviewed journals, however, and were included in the current review (Herbst et al. 2007a; Johnson et al. 2005).

As a result of these search strategies and inclusion criteria, a final set of 18 meta-analyses (37% of the 49 meta-analysis/narrative review articles) was included in the current review.

Article Coding

Study characteristics and effect sizes were retrieved and coded by two independent coders. Basic descriptive information from each study was coded along with effect sizes, 95% confidence intervals, and heterogeneity information. In order to understand which study features led to stronger intervention outcomes, results of moderator analyses were also coded. However, early in the coding it became clear that many different approaches to testing moderators were used in this set of 18 meta-analyses. In particular, some meta-analyses stratified effect sizes into groups and conducted within-group statistical tests (Crepaz et al. 2006; Johnson et al. 2002, 2005; Ward et al. 2005); others took the same approach but used the QB statistic to test for between-group differences (Albarracín et al. 2007; Crepaz et al. 2007; Herbst et al. 2005, 2007a; Johnson et al. 2006; Mullen et al. 2002; Neumann et al. 2002; Semaan et al. 2002); others used correlation/regression techniques (Copenhaver et al. 2006; Johnson et al. 2003, 2006; Prendergast et al. 2001); and one used focused comparisons (Logan et al. 2002). A small number of meta-analyses did not examine moderating influences (Herbst et al. 2007a; Johnson-Masotti et al. 2003). While all meta-analyses that tested moderators presented some form of univariate analysis of moderators, a smaller number went a step further and also conducted multivariate analyses (e.g., Johnson et al. 2003; Johnson et al. 2002, 2005).

The most conservative approach to testing (and thus summarizing) moderators would be to consider only evidence from multivariate tests. As many of the meta-analyses did not conduct such analyses, summarizing results from such tests was not feasible. Another approach is to consider a variable a moderator only if the between-group test (e.g., QB statistic) is statistically significant (or, in the case of continuous moderators, if there is a statistically significant association with effect size). For example, if one group of studies conducted skills-training and the other group did not, and there is a statistically significant difference between the effect sizes of those two groups, then clear evidence of a significant moderator exists. This approach was feasible, as nearly every meta-analysis conducted some type of between-group test of moderators, and in the case of those that did not, the authors of those meta-analyses were contacted and supplied those data.

Relying solely on this criterion to identify moderators of potential importance, however, may be overly conservative. This is because tests of moderation in meta-analysis are often statistically under-powered (Hedges and Pigott 2004). In fact, many of the meta-analyses in the current review cautioned that statistical power may have been too low to detect moderators using between-group tests (e.g., Crepaz et al. 2007; Herbst et al. 2005; Prendergast et al. 2001). Perhaps because of this, some meta-analyses did not present such tests but rather discussed the potential importance of moderators based on the magnitude of effect sizes and/or the results of within-group significance tests (e.g., Crepaz et al. 2006; Ward et al. 2005). In addition, the purpose here was to summarize factors that may be important in order to look for patterns within and across this set of meta-analyses and to stimulate further research on promising moderator variables.

Considering these arguments, the approach to coding the moderators in the current review was as follows:

  1. If a moderator was found to be statistically significant (P < .05) by way of a between-group test, or, in the case of continuous moderators, by way of a statistically significant (P < .05) association with effect size, it was considered to have demonstrated evidence of moderation.

  2. If there was evidence that a moderator may have been of importance, by way of a within-group statistical test, this was considered evidence of possible moderation. For instance, if a group of interventions that conducted skills-training had a larger and statistically significant (P < .05) effect size as compared to the smaller and non-significant effect size of the no-skills comparison group, this was considered evidence of possible moderation. Note that a variable was only considered a possible moderator if two conditions held: (1) one subgroup had a statistically significant (P < .05) effect size while the other did not; and (2) the subgroup with the statistically significant effect size had a larger effect size than the other subgroup(s). This latter rule was included to guard against flagging variables that clearly have not acted as moderators, for instance where two subgroups met condition 1 but had identical effect sizes, or where the non-significant group actually had the larger effect size (either situation can occur when statistical bias is introduced because a large set of studies is in one subgroup and a very small set in the other). It is also important to note that these possible moderators could only be identified in meta-analyses that presented the full stratification analyses. This was the case in every meta-analysis that tested for moderators except Johnson et al. (2003, 2006), Prendergast et al. (2001), Copenhaver et al. (2006), and Albarracín et al. (2007). Given that these meta-analyses took a different statistical approach (typically a correlation/regression approach), variables with possible evidence of moderation could not be identified in them.

  3. If a variable was tested for moderation and neither of the above types of evidence was found, the variable was considered lacking evidence of moderation. (These decision rules are sketched schematically below.)
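To make these decision rules concrete, the following sketch expresses them as a simple classification function. The function name, arguments, and example values are hypothetical rather than drawn from the reviewed meta-analyses, and the sketch assumes effect sizes are coded so that larger values indicate stronger intervention effects.

```python
def classify_moderator(between_group_p=None, subgroup_results=None):
    """Classify a coded moderator using the review's three-level scheme.

    between_group_p: P value from a between-group test (e.g., the QB statistic)
        or from an association between a continuous moderator and effect size;
        None if no such test was reported.
    subgroup_results: list of (effect_size, within_group_p) tuples from a
        stratified analysis; None if no stratification was reported.
    """
    # Rule 1: a significant between-group test (or association) is evidence of moderation.
    if between_group_p is not None and between_group_p < .05:
        return "evidence of moderation"

    # Rule 2: possible moderation requires at least one significant and one
    # non-significant subgroup, with a significant subgroup showing the larger effect.
    if subgroup_results:
        significant = [es for es, p in subgroup_results if p < .05]
        non_significant = [es for es, p in subgroup_results if p >= .05]
        if significant and non_significant and max(significant) > max(non_significant):
            return "evidence of possible moderation"

    # Rule 3: the variable was tested but neither criterion was met.
    return "lacking evidence of moderation"

# Hypothetical example: a skills-training subgroup with a significant effect, a
# no-skills subgroup with a smaller non-significant effect, and a non-significant
# between-group test.
print(classify_moderator(between_group_p=.20,
                         subgroup_results=[(1.40, .01), (1.05, .30)]))
# -> evidence of possible moderation
```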

The coders met after each article was coded on all dimensions to compare their work and discuss any discrepancies. Inter-coder reliability was calculated for each characteristic that was coded. Percent agreement was calculated for each coding category by dividing the number of agreed-upon coded instances by the total number of instances. Cohen’s (1960) kappa, which corrects for agreement expected by chance, was also calculated. Percent agreement ranged from a low of 91% to a high of 100%, with a mean of 98% (most categories had 100% agreement). Cohen’s kappa ranged from a low of .82 to a high of 1.0, with a mean of .97. These figures indicated very good agreement between the coders. All discrepancies between coders were resolved through discussion.
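For readers unfamiliar with these indices, the following is a minimal sketch of how percent agreement and Cohen’s kappa are computed in their standard forms; the coded values shown are hypothetical and are not drawn from the actual coding data.

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Proportion of coded instances on which the two coders assigned the same code."""
    return sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)

def cohens_kappa(coder_a, coder_b):
    """Cohen's (1960) kappa: observed agreement corrected for agreement expected by chance."""
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in set(coder_a) | set(coder_b)) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical codes for one category (e.g., whether each intervention used skills training).
a = ["yes", "yes", "no", "yes", "no", "yes"]
b = ["yes", "yes", "no", "no", "no", "yes"]
print(round(percent_agreement(a, b), 2))  # 0.83
print(round(cohens_kappa(a, b), 2))       # 0.67
```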

Effect Size Conversion

The meta-analyses included in this review used three different effect size indicators. While 11 (61%) of the meta-analyses used odds ratios (OR), 6 (33%) used d and 1 (6%) used r. In order to provide a common metric for interpretation and comparison across all meta-analyses, effect sizes and confidence intervals from the d and r meta-analyses were converted to OR using the formula provided in Sanchez-Meca et al. (2003). The OR statistic was chosen because it was the predominant statistic used in the meta-analyses, it required few conversions to be made, and it is easily interpreted. In addition, for consistency and ease of interpretation, the direction of some ORs (i.e., whether they fell below or above 1) was reversed using the formula 1/OR, such that values above 1 indicate an increase in condom use and values below 1 indicate a reduction in unprotected sex, number of sexual partners, STDs, and composite sexual risk.
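Sanchez-Meca et al. (2003) discuss several conversion formulas rather than a single one, so the exact equations applied here are not restated in full; the sketch below assumes the Cox transformation (d = 0.6061 × ln OR), which reproduces the d = .18 to OR ≈ 1.35 conversion reported later in the Discussion, together with the standard r-to-d formula. Treat these specific formula choices as assumptions.

```python
import math

def d_to_or(d):
    """Standardized mean difference (d) to odds ratio via the Cox transformation,
    one of the methods discussed by Sanchez-Meca et al. (2003): d = 0.6061 * ln(OR)."""
    return math.exp(d / 0.6061)

def r_to_d(r):
    """Correlation (r) to d using the standard formula d = 2r / sqrt(1 - r^2)."""
    return 2 * r / math.sqrt(1 - r ** 2)

def reverse_or(or_value):
    """Reverse the direction of an OR (1/OR) so outcomes are coded consistently."""
    return 1 / or_value

print(round(d_to_or(0.18), 2))          # ~1.35
print(round(d_to_or(r_to_d(0.10)), 2))  # a small r converts to OR ~1.39 under these choices
print(round(reverse_or(0.76), 2))       # OR = .76 becomes 1.32 when direction is reversed
```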

Results

Table 1 lists characteristics for each of the 18 meta-analyses. As can be seen, to date meta-analyses have been conducted on interventions with sexually active adolescents (two articles), heterosexual adults (two articles), Hispanics/Latinos (two articles), MSM (four articles), drug users (three articles), people with severe mental illness (one article), STD clinic patients (two articles), and people living with HIV (two articles). These meta-analyses have examined interventions from as early as 1966 (Ward et al. 2005) and as late as 2006 (Herbst et al. 2007a). In the current review, the term “study” (represented by the letter k) is used to refer to the primary intervention trials. The current set of meta-analyses typically treated each research trial as one study (deriving one effect size from each report), although in some cases trials only reported data in subgroups (e.g., separately for males and females), leading meta-analysts to treat those separate groups as different “studies” (deriving multiple effect sizes from a single report). Using this definition, these meta-analyses have included as few as four studies with a cumulative N = 365 (Johnson-Masotti et al. 2003) and as many as 56 studies with a cumulative N = 35,282 (Johnson et al. 2003), with a median of k = 19 primary studies and N = 9,423 participants. While the Albarracín et al. (2007) meta-analysis appears to be the largest in the group, it also took a different analytic approach compared to the others by examining 350 treatment and control study conditions with a cumulative N = 110,092.

Table 1 Description of basic meta-analytic study characteristics by target population

Efficacy of Behavioral Interventions

Table 2 is a summary of effect size indices and heterogeneity tests across study outcomes in the meta-analyses. Note that some mean effect sizes were found to be homogeneous, meaning that the individual study effect sizes that made up those means were similar and moderator analyses were less likely to be fruitful. Many of the other mean effect sizes, however, were heterogeneous; in those cases there was variability in the individual study effect sizes that can be more fully understood through moderator analyses, and the mean should be interpreted with that caveat in mind. In addition, all effect sizes reported are from fixed effects analyses except for Crepaz et al. (2006, 2007), Neumann et al. (2002), Herbst et al. (2007a), Semaan et al. (2002), and Ward et al. (2005), which used random effects analyses. For ease of comparison, effect sizes and their 95% confidence intervals were plotted and are presented in Figs. 1–5 along with heterogeneity information and k for each analysis.

Table 2 Summary of meta-analytic effect sizes
Fig. 1 Forest plot of meta-analytic effect sizes and 95% confidence intervals for condom use. Asterisks indicate that effects were found to be statistically heterogeneous

Next, median effect sizes were computed for each outcome. Because there was some overlap among studies in these meta-analyses, before medians were computed the most representative (i.e., largest) meta-analysis for each target population was retained within each outcome, while the other meta-analyses that overlapped were discarded. This was intended to reduce the potential impact of the same research trial appearing in more than one meta-analytic project. Results indicated that every meta-analysis (11 of 11) that examined condom use found significant effects, and the median effect size (with Neumann et al. 2002; Herbst et al. 2007b; Johnson et al. 2002 removed) was OR = 1.34 (range 1.13–1.64), suggesting that typical interventions produced a 34% increase in the odds of condom use. The weakest effects were found in adolescents and the strongest effects in MSM. As Fig. 1 shows, there appears to be some variability on this outcome. Nine of 11 meta-analyses that examined unprotected sex found significant results, and the median effect size (with Johnson et al. 2002; Herbst et al. 2005, 2007a removed) was OR = .76 (range .57–.93), suggesting a 32% reduction (1/.76 = 1.32) in the odds of unprotected sex. The weakest effects were in injection drug users and the strongest effects in people living with HIV. An examination of Fig. 2 suggests little variability in mean effect sizes across extant meta-analyses.

Fig. 2 Forest plot of meta-analytic effect sizes and 95% confidence intervals for unprotected sex. Asterisks indicate that effects were found to be statistically heterogeneous

Only three of eight meta-analyses examining numbers of sex partners found significant results in the expected direction. Three additional meta-analyses contained ORs that suggest a reduction in sex partners, but the confidence intervals included one. One meta-analysis found significant results suggesting an increase (rather than a decrease) in number of sexual partners (Johnson-Masotti et al. 2003). It should be noted, however, that these results appear to reflect participants in less intensive intervention groups increasing sexual partners less than those in more intensive intervention groups (see Johnson-Masotti et al. 2003). The median effect across meta-analyses (with Neumann et al. 2002; Johnson et al. 2002 removed) was OR = .87 (range .74–1.54), suggesting a 15% reduction (1/.87 = 1.15) in sex partners, with the strongest effects in MSM. Examination of Fig. 3 suggests more variability on this outcome as compared to condom use and unprotected sex. In addition, four of six meta-analyses found significant results for reduction of STDs, while the remaining two were suggestive of no effect. The median value (with Ward et al. 2005 removed) was OR = .74 (range .20–1.12), suggesting a 35% reduction (1/.74 = 1.35) in the odds of acquiring an STD, with the strongest effects in people living with HIV. Examination of Fig. 4 reveals some variability on this outcome. Finally, for the sexual risk composite outcome, the make-up of which varied by meta-analysis, five of five meta-analyses had significant results, with a median effect across meta-analyses of OR = .78 (range .65–.86). This suggests that interventions typically produced a 28% reduction (1/.78 = 1.28) in the odds of engaging in sexual risk behavior. Examination of Fig. 5 suggests little variability on this outcome.
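The medians, ranges, and percent changes in odds reported above follow a simple bookkeeping rule: ORs above 1 are read directly as percent increases in odds, while protective ORs below 1 are inverted (1/OR) and read as percent reductions. A minimal sketch of that rule, using hypothetical input ORs rather than the actual values from Table 2, is shown below.

```python
from statistics import median

def summarize(ors):
    """Median and range of meta-analytic odds ratios for one outcome."""
    return median(ors), min(ors), max(ors)

def pct_change_in_odds(or_value):
    """Percent change in odds: read ORs above 1 directly; invert (1/OR) protective ORs below 1."""
    if or_value >= 1:
        return round((or_value - 1) * 100)
    return round((1 / or_value - 1) * 100)

# Hypothetical condom-use ORs from several meta-analyses:
med, low, high = summarize([1.13, 1.25, 1.34, 1.40, 1.64])
print(med, low, high)            # 1.34 1.13 1.64
print(pct_change_in_odds(1.34))  # 34  (% increase in the odds of condom use)
print(pct_change_in_odds(0.76))  # 32  (% reduction in the odds of unprotected sex)
```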

Fig. 3 Forest plot of meta-analytic effect sizes and 95% confidence intervals for number of sexual partners. Asterisks indicate that effects were found to be statistically heterogeneous

Fig. 4 Forest plot of meta-analytic effect sizes and 95% confidence intervals for acquisition of sexually transmitted diseases (STDs). Asterisks indicate that effects were found to be statistically heterogeneous

Fig. 5 Forest plot of meta-analytic effect sizes and 95% confidence intervals for sexual risk composite. Asterisks indicate that effects were found to be statistically heterogeneous

Moderators of Intervention Efficacy

All moderator characteristics that were tested in individual meta-analyses are listed in Table 3 by meta-analytic study. Those which demonstrated evidence of moderation are in bold; those which demonstrated evidence of possible moderation are in italics; and those which did not provide evidence of moderation are in plain type. Here, some of the key findings are highlighted. With regard to participant characteristics, evidence to support segmentation practices is apparent. For instance, evidence exists that interventions delivered to single racial groups of adolescents (Mullen et al. 2002), same gender groups of Latinos (Herbst et al. 2007b), and separate gender sessions with drug users (Prendergast et al. 2001) have been more efficacious than non-segmented interventions. Evidence of possible moderation exists for interventions with same gender groups of heterosexuals (Logan et al. 2002) and same or predominant racial groups of MSM (Johnson et al. 2005). Crepaz et al. (2007) found a slightly different result in STD patients—namely that interventions delivered to females may have been more efficacious than those delivered to males, while Johnson et al. (2003) and Ward et al. (2005) found no evidence for segmentation effects in same gender groups.

Table 3 Summary of tests of potential moderating variables in the meta-analyses

With regard to intervention characteristics, and related to the above findings, one meta-analysis found support (Crepaz et al. 2007) and one found possible support (Herbst et al. 2007b) for interventions which matched the race of facilitators to participants, while three meta-analyses found no support for racial/ethnic targeting practices (Albarracín et al. 2007; Crepaz et al. 2007; Johnson et al. 2003). In addition, a number of meta-analyses found evidence (Crepaz et al. 2006; Herbst et al. 2007b; Johnson et al. 2003, 2006; Prendergast et al. 2001) or possible evidence (Crepaz et al. 2007; Herbst et al. 2005; Johnson et al. 2002) to support skills training as an important characteristic of interventions, although some meta-analyses did not find support for particular skills-training components. Further, three meta-analyses provided evidence for the superiority of theory-based interventions (Herbst et al. 2005; Johnson et al. 2003, 2006), while two provided possible evidence (Crepaz et al. 2006; Ward et al. 2005) and two provided either null evidence (Herbst et al. 2007b) or evidence against theory-based components (Albarracín et al. 2007). While one meta-analysis found evidence for the superiority of group- over individual-level interventions (Neumann et al. 2002) and two found possible differences in format (Crepaz et al. 2006; Herbst et al. 2007b), a number of meta-analyses testing intervention format found no such differences. Further, two meta-analyses found larger effects (Crepaz et al. 2006; Herbst et al. 2007b) and three found possible larger effects (Crepaz et al. 2007; Herbst et al. 2005; Ward et al. 2005) among interventions of greater length, one found larger effects in shorter interventions (Mullen et al. 2002), and a number of meta-analyses found no association for these variables.

Finally, with regard to methodological characteristics, two meta-analyses (Johnson et al. 2003; Semaan et al. 2002) found evidence that interventions were more efficacious when the comparison group received less or no HIV-related intervention, one meta-analysis found possible evidence of this (Crepaz et al. 2006), and one meta-analysis had the opposite finding (Prendergast et al. 2001). Interestingly, a number of meta-analyses that tested the influence of comparison group on effect size did not find any association. In addition, two meta-analyses found evidence that the shorter the time to follow-up, the larger the study effects (Copenhaver et al. 2006; Logan et al. 2002), and one found possible evidence of this (Mullen et al. 2002). However, two meta-analyses found possible evidence of larger effects among studies with longer follow-up periods (Herbst et al. 2007b; Johnson et al. 2002). Finally, a number of meta-analyses testing study attrition/retention found that it had no impact on effect size.

Discussion

The purpose of the current study was to review and synthesize meta-analyses of behavioral interventions to reduce HIV-related sexual risk behavior. One of the most promising findings is that every meta-analysis in this review found significant sexual risk reduction on at least one outcome variable. In fact, every meta-analysis that examined condom use found significant effects on that outcome, and 9 of 11 found effects on unprotected sex. Examination of effects on reduction of sex partners, however, revealed that such effects were weaker and less consistent. It may be that the prevailing message within risk reduction interventions is to use condoms and avoid unprotected sex, rather than to limit numbers of sexual partners. Indeed, even though it takes only one infected partner to contract an STD or HIV, many behavioral interventions appear to have focused primarily on a “use condoms” and “avoid unprotected sex” message (e.g., Peterson and DiClemente 2000). In addition, in some cases individuals may find “safer sex” behavioral changes easier to make in the form of using condoms rather than limiting numbers of partners.

Also promising is the fact that four of six meta-analyses found significant reductions in STDs as a result of interventions. Although changes in safer sexual behaviors theoretically should lead to reduced STD incidence, demonstrating this link is difficult for a number of reasons (Crosby et al. 2003; Fishbein and Pequegnat 2000; Noar et al. 2004), including: (1) condoms are not 100% protective against all STDs; (2) some individuals may adopt condom use but use condoms inconsistently, leaving themselves at risk for infection; and (3) incorrect use of condoms can and does decrease condom effectiveness. Despite these challenges, several meta-analyses demonstrated a reduction in STD incidence, and given that many of these projects also demonstrated reductions in unprotected sex and/or increases in condom use (but either modest or no reductions in numbers of sexual partners), changes in these particular behavioral outcomes appear to have led to changes in the disease outcomes.

Further, calculation of median effect sizes across meta-analyses suggested effect sizes that future HIV prevention behavioral interventions might seek to outperform. Although these values are simple medians, rather than effect sizes generated from sophisticated weighting and aggregation of meta-analytic effect sizes, they may still be useful in suggesting typical effects of interventions as well as benchmarks against which future interventions can be compared. Using such a metric, the current review suggests that typical behavioral interventions increased the odds of condom use by 34%, decreased the odds of unprotected sex by 32%, decreased the odds of sex partners by 15%, decreased the odds of new STDs by 35%, and decreased the odds of risk behavior (as measured by sexual risk indices) by 28%. It is important to note that the distinction between condom use and unprotected sex was preserved in this review just as individual meta-analysts preserved it. While there is a strong relationship between these outcome variables, they are not perfectly correlated, and conceptual and measurement issues should be kept in mind with regard to these outcomes (e.g., Crosby et al. 2004; Fishbein and Pequegnat 2002; Noar et al. 2006). Although there is potential for intervention outcomes to be affected by which outcome measure is used (Fishbein and Pequegnat 2002), the current review suggests that behavioral interventions have impacted these behavioral outcomes similarly.

There are some caveats that should be considered when interpreting these median effect sizes, however. First, it is possible that effect sizes generated by intervention studies, and thus meta-analyses of those studies, are underestimates of the actual effects of behavioral interventions. This may be the case because effect sizes are calculated as the comparison between an intervention group and a comparison group, where the comparison group may receive some HIV-related intervention (see Johnson et al. 2005) and where no-treatment control groups often improve over time (see Albarracín et al. 2005). Thus, it may be that the actual impact of behavioral interventions is greater than that captured in these effect sizes. In addition, many of the tests of heterogeneity of effect size in this set of meta-analyses were statistically significant, strongly suggesting that collections of intervention studies were often diverse and that variability in effect sizes could be explained through analysis of moderating variables. Thus, one could argue that effect size means and medians are of limited value in these cases, as they may not provide an accurate picture of the literature. Alternatively, a more practical view may be that such effect size estimates suggest general tendencies in the literature and provide a sense of the kinds of effects that behavioral interventions typically produce. Finally, given that these effect sizes were not computed through aggregation and weighting, as is done in individual meta-analyses, meta-analytic projects with larger numbers of studies (which typically provide more reliable effect size estimates) were not weighted more heavily than those with smaller numbers of studies. In particular cases such weighting would have been a real strength, such as with the STD outcome, where some meta-analyses had as few as two studies in their analysis.

That said, some concurrence for the effect size estimate for condom use comes from what is likely the largest single meta-analysis of HIV prevention interventions conducted to date. Albarracín et al. (2005) conducted a meta-analysis of 354 HIV prevention interventions and 99 control groups spanning 17 years of literature. This meta-analysis was conducted independently of the meta-analyses in the current review but covers much of the same literature. In addition, it took a different analytical approach in that it collapsed all intervention conditions and all control conditions, respectively, and then compared effect sizes. The efficacy of intervention conditions for condom use was d = .26 and the efficacy of control conditions was d = .08. In order to estimate the true intervention effect, one value can be subtracted from the other to arrive at d = .18. This value converts to OR = 1.35, which is nearly identical to the OR = 1.34 found in the current review.
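The arithmetic behind this comparison can be made explicit. Assuming the same Cox-type d-to-OR conversion noted in the Methods section (an assumption about the exact formula, though it reproduces the value reported here):

$$
d_{\text{net}} = 0.26 - 0.08 = 0.18,
\qquad
OR = \exp\!\left(\frac{0.18}{0.6061}\right) \approx e^{0.297} \approx 1.35.
$$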

On the issue of effect size, it should also be noted that just as replication of primary studies is important to the progress of science, so are replications, extensions, and updates of meta-analyses. It was promising to see that all four meta-analyses of interventions with MSM yielded very similar weighted mean effect sizes for unprotected sex, all between OR = .65 and .78. Many of these projects contained the same core literature but were updated as newer studies were published. Updates and extensions of meta-analyses in other areas may similarly help to validate effect sizes in those literatures as well as continue to track the growing behavioral intervention literature among important target populations.

In the current review, when considering all of the sexual risk outcomes, there was some variability by target population. For instance, interventions with MSM, people living with HIV, and Hispanics/Latinos appeared to have the strongest effects overall; more moderate effects were found in heterosexual adults and drug users; and weaker/more mixed effects were found in adolescents and STD patients. While the moderate and stronger effects have occurred among important at-risk and infected populations, the fact that many adolescent and STD clinic-based interventions have been less successful may be cause for concern and calls for the development of new intervention approaches with these populations. Results of moderator analyses within those particular meta-analyses may shed light on features that can be maximized to improve future HIV prevention efforts among these populations. A focus on target populations also reveals the glaring omission of any meta-analysis of interventions with African-Americans. The HIV/AIDS epidemic now impacts African-Americans more than any other racial/ethnic group, and at such a serious rate that the CDC has called AIDS among African-Americans a “major health crisis” and issued a heightened national response to HIV/AIDS prevention among this group (CDC 2007). Efficacious interventions with African-Americans are thus urgently needed to reduce the racial disparity that currently exists with regard to new HIV infections (Holtgrave et al. 2007), as are meta-analyses of such interventions to determine effect sizes and moderators of intervention efficacy.

The current review also examined moderator analyses across extant meta-analyses of HIV prevention interventions. Such analyses vary by individual meta-analysis, both because of decisions made by particular meta-analysts and because of the particular make-up of the group of interventions being examined. For instance, Crepaz et al. (2007) could not test theory-based vs. non-theory-based interventions because every study except one was theory-based. Indeed, what “theory-based” means, as well as what authors report, may vary from intervention to intervention, making what seems like a straightforward analysis more complex (e.g., see Michie and Abraham 2004). That said, such analyses can potentially advance the field by helping researchers gain a greater understanding of which intervention features may increase efficacy. Although there were some conflicting findings across meta-analyses in terms of particular moderating factors, some general conclusions can be made.

Evidence was found to support segmentation strategies within interventions in that a number of interventions were more efficacious when they were delivered to single (versus mixed) race or gender groups. This was presumably the case because homogeneous groups allow for intervention content to be more carefully targeted to those groups. Related to this, some meta-analyses provided evidence that interventions were more efficacious when the race of facilitators was matched to participants and one provided support for tailoring on gender/cultural norms (Herbst et al. 2007b). These findings are consistent with both tailoring and targeting practices (Noar et al. 2007) as well as a recent meta-analysis which suggested that HIV prevention interventions were more efficacious when interventionists and recipients shared similar demographics (Durantini et al. 2006). Future research should consider the variety of ways in which interventions can be targeted to particular groups, including with regard to core intervention content, the way that intervention content is presented, the target audience itself, and with regard to the facilitator(s) of intervention sessions (Noar et al. 2007; Resnicow et al. 2008). This is a particularly important area of study given the disproportionate impact of HIV/AIDS on racial minority populations (CDC 2006).

In addition, evidence was found to support skills-training as an important component of behavioral interventions. This finding is consistent with several behavioral theories which suggest that individuals need not only the motivation to engage in safer sex, but also the skills and self-efficacy to engage in safer sexual behaviors (Fisher and Fisher 2000; Noar 2007). It is also consistent with the findings of a recent large meta-analysis of HIV prevention interventions across numerous target populations which found skills training to be an important intervention component (Albarracín et al. 2005). In the current review, evidence did not support all skills training components in all meta-analyses, however, and an important distinction to draw is what types of skills are being taught. For instance, some interventions focus on personal or self-management skills such as goal setting and self-reinforcement, others focus on communication skills such as discussing and negotiating condom use, and still others focus on technical skills such as how to correctly use a condom. The relation of these differing skill types to efficacy varied, and interventionists should carefully consider which skills might be most important for a particular target population. In addition, how such skills are taught, which has been found to vary greatly by intervention, is an important consideration and one that is worthy of further research attention (see Edgar et al. 2008b).

Moreover, support was found for the long held notion that theory-based interventions are more efficacious than those that are not theory-based (e.g., Peterson and DiClemente 2000). As noted, however, as increasing numbers of interventions are becoming theory-based, such a simple comparison may no longer be feasible in meta-analysis. Instead, meta-analysts might focus more carefully on which particular theories and theoretical concepts may act as moderators of intervention efficacy. Some recent meta-analyses have begun to move in this direction by examining particular theoretical concepts and their relation to intervention efficacy and finding, for instance, that attitudinal and behavioral skills (self-efficacy) concepts are generally more effective in interventions than perceived susceptibility (e.g., Albarracín et al. 2003, 2005). Interestingly, the Albarracín et al. (2007) meta-analysis included in the current review found that for Latinos/Latin Americans, perceived susceptibility arguments were effective while attitudinal and behavioral skills components were not. This suggests that some theoretical concepts may be more or less effective with particular populations, something further research might more closely examine. Although behavioral theories in the HIV/AIDS area hold many similarities to one another (Noar 2007), there are also important conceptual differences that should be considered in future analyses.

Finally, methodological moderators are important to consider in future interventions. Not surprisingly, some meta-analyses found that when behavioral interventions were compared to no-treatment control groups or minimal intervention groups, they achieved larger effect sizes than when compared to other equally intensive interventions. Methodologically, this is very important to take into account when interpreting the results of any intervention evaluation. In addition, at this point in the literature it is likely that interventions outperform no-treatment control groups and future intervention trials may aim to compare interventions to at least minimal intervention comparison groups, if not other HIV prevention interventions of comparable length/intensity. As the science of HIV prevention continues to progress, it is likely that we will see more of a trend toward comparing interventions to other interventions. In that case, it is critical that effect sizes generated from such studies (and meta-analyses of those studies) are interpreted appropriately, as we would expect an intervention compared to no-treatment control to have a larger effect size than one compared to another equally intensive intervention. In fact, given some trends in the literature in this direction, some meta-analyses have already begun to separate these different classes of comparisons into separate analyses (e.g., see Johnson et al. 2005).

Limitations

This review had several limitations. Most importantly, each meta-analysis contained in the review was in and of itself a unique project. Like any review, by necessity the current article could only focus on generalizations across projects, rather than focus on the unique approach and details of each meta-analytic study. Meta-analysts make differing decisions in a number of areas, including what statistical approach to use, what moderators to test, whether to include unpublished work in their review, and so forth. As a result of summarizing a number of meta-analyses using different approaches into one article, some of the intricate details of each paper were likely not represented. For instance, while some meta-analyses conducted more rigorous multivariate tests of moderating factors (e.g., Johnson et al. 2003; Johnson et al. 2002, 2005), these were not summarized in the current paper given that not all meta-analyses conducted such analyses. Readers interested in a particular meta-analysis with a particular target population should consult the original article for more complete reporting and details.

In addition, meta-analyses often cannot make up for biases or limitations that occur in primary studies. In a sense, a meta-analysis is only as good as the primary studies that exist in the literature. Consulting the individual meta-analyses included in this review can shed more light on the strengths and weaknesses of each particular review. Further, the current “review of reviews” was limited in that it did not include every HIV prevention behavioral intervention conducted to date. For instance, although interventions with high-risk populations such as African-Americans do exist in the literature, they may be underrepresented in the current review given that a meta-analysis of interventions with African-Americans has not yet been published. Rather, the current review included a particular set of meta-analyses, selected through detailed inclusion criteria, to represent interventions delivered to a number of key target populations. Although each meta-analysis likely yields a reasonable estimate of effect size, because each review contains its own inclusion and exclusion criteria, there is no guarantee of consistency across meta-analyses of differing target populations.

Finally, while methods for meta-analysis have been studied for decades, methods for “meta-analysis of meta-analyses” do not currently exist. Thus, as discussed earlier, the current review relied on simple statistics, such as median effect size, to attempt to estimate the “typical” effect of interventions. More sophisticated approaches that aggregate and weight effect sizes are used in meta-analysis, and perhaps in the future such methods will be applied to meta-analysis of meta-analyses.

Implications for Translation and Dissemination

A major conclusion of the current review is that behavioral interventions are efficacious in increasing condom use, reducing unprotected sex, and reducing STD incidence, perhaps including HIV. Now that the efficacy of numerous interventions has been demonstrated, such programs can only have further public health impact if their reach is broadened beyond the scope of the participants involved in the original research trials. Such a task is complicated by the fact that most of the studies in the HIV prevention behavioral intervention literature are efficacy rather than effectiveness trials. Glasgow et al. (2003) define efficacy trials as those that involve intensive standardized interventions, delivery in one setting, motivated samples, plentiful resources, and implementation by research staff. Effectiveness trials, on the other hand, involve brief feasible interventions, delivery in multiple settings, broad and heterogeneous samples, adaptation to fit the setting, and implementation by different staff persons with competing priorities. Thus, although HIV behavioral interventions have demonstrated widespread efficacy in research trials, it is yet to be determined whether they are capable of widespread effectiveness under real-world conditions. In addition, many if not most HIV prevention interventions have been studied only with regard to short-term efficacy; few examine longer-term efficacy and the sustainability of intervention effects, a salient issue with regard to translation. More effectiveness trials of interventions that have demonstrated high efficacy and are targeted to key at-risk populations are greatly needed (Solomon et al. 2006), perhaps building in longer-term follow-ups with regard to intervention effects.

Moreover, given the urgency of the HIV epidemic, recent efforts have been undertaken to translate and disseminate many of the most efficacious interventions (e.g., Lyles et al. 2007) into practice. This includes a major CDC effort, the Diffusion of Behavioral Interventions (DEBI) project, whose objective is to enhance the capacity of community-based agencies and state and local health departments to implement efficacious interventions (Collins et al. 2006). Such projects are challenging, however, given that there is a paucity of research on the translation of evidence-based interventions from research to practice in community settings (Rebchook et al. 2006). In fact, translational research is revealing the difficulty and challenges that come with this task, including the fact that interventions often need to be adapted to local circumstances while maintaining fidelity to core elements, and the fact that technical assistance often needs to accompany interventions (see Neumann and Sogolow 2000; Solomon et al. 2006). Moreover, insights from translational work suggest that additional interventions that maximize the feasibility of translation and dissemination may need to be developed (Harshbarger et al. 2006; Neumann and Sogolow 2000). From this perspective, researchers have a role to play in translation even at the early stages of development and testing of an intervention (Eke et al. 2006; Glasgow et al. 2003). Thus, future HIV prevention behavioral interventions should, where possible, be developed with translational concerns in mind, in order to increase the ability of such interventions to ultimately be disseminated (see Eke et al. 2006; Glasgow et al. 2003; Rietmeijer 2007). In addition, further research on the translation and adaptation of efficacious interventions into new contexts is urgently needed.