Introduction

People with intellectual disability can manifest impaired intellectual functioning and adaptive skills [1]. However, these limitations do not affect the interest in romantic relationships and the sexual desires that arise in adolescence [2] and develop during adulthood in the same way as in people without disabilities. Recent studies argue that much of this population feels the need to understand the process of falling in love and also wishes to find a partner and have children [3].

Paradoxically, although people with intellectual disabilities have concerns about love and sex, their knowledge of this topic is limited [4]. This situation is largely due to unequal access to sex education [5]. Although teachers believe that with the right professional development they could educate about sexuality [6], and parents express their desire for their children to be socially included [7], fears persist regarding these individuals' capacity for sexual activity and reproduction [8].

The scientific literature has reported the negative consequences that scarce opportunities for affective-sexual expression have for the physical and psychological health of people with intellectual disabilities. These negative outcomes range from inappropriate sexual behaviors [9] and low self-esteem [10] to unsafe sex [11, 12] and experiences of sexual abuse [13, 14]. Therefore, valid sex education programs that contribute to improving quality of life seem necessary [15, 16].

In this vein, several theoretical reviews have examined the scientific work on sex education programs for people with disabilities. Whitehouse and McCabe [17] pointed out that programs addressing specific areas of sexuality outnumbered those taking a comprehensive approach. They also noted that many programs had not evaluated their effectiveness through standardized tests. However, they described in detail the few studies that provided statistical evidence of program validity, and concluded that the experimental groups obtained better results than the control groups. In addition, the reported efficacy was greater in cases with follow-up. Later, Barge et al. [18] and Doughty and Kane [19] reviewed the scientific literature on the effectiveness of behavioral skills, decision making and sexual abuse prevention programs between 1998 and 2007. They concur with Whitehouse and McCabe [17] in that participants in the intervention groups gave significantly more positive responses than those in the control groups [18, 19] and, when follow-up was measured, the effectiveness scores increased [19]. However, both reviews differed from Whitehouse and McCabe [17] in that the programs also addressed the development of attitudes and behaviors, not only the improvement of theoretical knowledge. In recent years, scientific interest in the topic has continued, although the findings have not been satisfactory. Schaafsma et al. [20] investigated the development process of the programs and concluded that they lack a theoretical foundation, focus mainly on people with intellectual disabilities without taking other agents into account, and are not systematically evaluated. Regarding the effectiveness of teaching methods, Schaafsma et al. [21] point out the need to describe them in detail to facilitate their understanding, and note that the contents are scarcely applied to everyday situations.

Although previous reviews have described the programs' characteristics, no previous work has analyzed the issue from a meta-analytical approach that quantitatively synthesizes the degree of effectiveness reported in the scientific literature. Meta-analysis is the best option to fill this gap, since it allows the intervention effects from different investigations that share methodological properties to be integrated in order to obtain global empirical evidence of the intensity of the effects [22].

The purpose of this study is to evaluate the degree of effectiveness of sex education programs for people with intellectual disabilities and to determine which moderating variables are involved in this effectiveness. This general purpose is broken down into the following specific aims: (a) to analyze the characteristics of sex education programs for people with intellectual disabilities; (b) to study the variability of the results according to substantive, methodological and extrinsic variables; (c) to propose future lines of research based on the results obtained.

Based on the previous scientific literature, it is expected that: (1) sex education programs for people with intellectual disabilities will be more effective in intervention groups than in control groups [17,18,19]; (2) the age and sex of the participants will influence the results, with older participants and single-sex groups obtaining the best scores; (3) the intellectual quotient (IQ) of the participants, the intervention technique and the country in which the program has been applied will influence the results as substantive variables; (4) programs taught by people with a higher level of education and experience will obtain the best results; (5) programs with longer sessions will be more effective; (6) programs including a follow-up will reveal better results in the experimental group [17, 19]; (7) programs reported in studies published in recent years will be more effective.

Method

Selection Criteria of the Studies

Studies that met the following criteria were selected: (a) the program had to cover sex education content; (b) the participants had to be people with intellectual disabilities; (c) the study had to use a design with an experimental group and a control group and pretest–posttest measurements; (d) the study had to provide enough data to calculate the effect sizes.

Search Procedures

The bibliographic search was carried out in 4 databases: Web of Science (Science Citation Index Expanded, Social Science Citation Index Expanded), Scopus, PsycINFO and ERIC. The search strategy used was (sex* or “sex education”) and (“intellectual disability” or “mental retardation”) and (program* or intervent* or treat*). No time limits were imposed, in order to obtain the maximum number of studies, from the first publication dates until October 2017. Other sources (e.g. Google Scholar and the reference lists of the theoretical reviews) were also used to retrieve research works that may not have been recovered from the databases mentioned.

A total of 3826 records were identified, of which 2866 were unique once records duplicated across the databases were discounted. Of these, 42 addressed the evaluation of sex education programs for people with intellectual disabilities. A thorough reading of the articles led to the selection of 8 papers that met all the inclusion criteria. Some of them included different measures of program evaluation, which were analyzed as independent studies. As a result, 31 independent studies were identified. Fig. 1 shows the selection process, which followed the PRISMA checklist.

Fig. 1 Study selection process

Coding of Studies

The studies were coded according to three types of moderating variables: substantive, methodological and extrinsic, following the guidelines of Lipsey [23]. The gender, age and IQ level of the participants; the level of training or experience of the people who taught the program; the duration of the sessions; the intervention technique; the type of control group (active or inactive); and the country in which the program was delivered were coded as substantive variables. The methodological variables were whether a follow-up was carried out and whether participants were randomly assigned to the experimental and control groups. In addition, one extrinsic variable was coded: the year of publication. The coding was performed separately by two researchers in order to obtain a reliable and accurate set of codes.

Computation of Effect Size and Statistical Analysis

The effect size index used was the standardized mean difference, known as d [24]. Negative values of d indicated an improvement in the posttest. Because studies with larger samples should carry greater weight in the statistical analysis of effect sizes, the model of Hedges and Olkin [25], which weights each effect size by the inverse of its variance, was applied. The mean efficacy was calculated and heterogeneity was assessed using the Q test and the I² index [26]. Statistical significance was set at p < .05. Heterogeneity was considered significant when p < .05 and I² > 50%. Where heterogeneity was present, the influence of the moderating variables was examined. The analyses were run in Review Manager 5.3 from the Cochrane Collaboration.
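The computation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the exact routine implemented in Review Manager: the function names, the large-sample variance approximation for d, the fixed-effect pooling, and the subtraction order (control minus experimental, chosen so that a higher experimental mean yields a negative d) are all assumptions made for the example.

```python
import math


def standardized_mean_difference(mean_exp, sd_exp, n_exp, mean_ctrl, sd_ctrl, n_ctrl):
    """d based on the pooled standard deviation.

    Assumed sign convention: control minus experimental, so that a higher
    experimental mean gives a negative d.
    """
    pooled_var = ((n_exp - 1) * sd_exp ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / (
        n_exp + n_ctrl - 2
    )
    return (mean_ctrl - mean_exp) / math.sqrt(pooled_var)


def d_variance(d, n_exp, n_ctrl):
    """Common large-sample approximation of the sampling variance of d."""
    return (n_exp + n_ctrl) / (n_exp * n_ctrl) + d ** 2 / (2 * (n_exp + n_ctrl))


def pool_inverse_variance(effects, variances):
    """Inverse-variance weighted mean, Cochran's Q and I^2 (in percent)."""
    weights = [1.0 / v for v in variances]
    mean = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - mean) ** 2 for w, d in zip(weights, effects))
    df = len(effects) - 1
    # I^2 is the share of Q in excess of its degrees of freedom, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return mean, q, i2
```

With the heterogeneity criterion stated in the text, a set of effect sizes would be flagged when the Q test gives p < .05 and the returned I² exceeds 50.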

Results

Descriptive Characteristics of the Studies

The eight selected studies were published between 1988 and 2017. The characteristics of each of them are shown in Table 1. In all the research reports the design was quasi-experimental, and each pre-existing group was randomly assigned to the experimental or control condition, except in two cases [27, 28]. The study by Khemka et al. [29] was the only one with participant drop-out, specifically in the control group during the follow-up phase. Participants’ ages ranged between 11 and 56 years. Of the eight studies, four included participants with mild intellectual disability, two included participants with mild or moderate disability, and one included participants with mild, moderate or severe disability [28]. One study did not provide information about this characteristic [27]. With regard to gender, samples were composed of men only, women only, or both. Three categories of intervention techniques were identified: psychosocial techniques, cognitive-behavioral techniques and traditional educational strategies based on information transmission. As for the activity of the control group during the implementation of the program, the control group received no intervention in all the analyzed works except one, in which a program not based on sex education was applied [30]. In terms of geographical location, five investigations were conducted in the United States, one in Japan, one in Australia and one in China.

Table 1 Descriptive characteristics of the studies included in the meta-analysis

Regarding the intervention, the average duration was nine sessions of 1 h per week. Three of the studies reported a follow-up, carried out on average 6 weeks later. Among the contents covered by the programs, social skills and decision making in situations of abuse predominated, followed by inappropriate sexual behavior and sexual abuse. To a lesser extent, healthy sexual relations and the management of fear and stress were also addressed. Additionally, two investigations did not provide data on the reliability and validity of the instruments used to evaluate the programs [27, 31]. The instructors were in most cases researchers assisted by other agents or personnel previously trained by specialists. In this sense, only a minority of studies used researchers or students to carry out the intervention.

Mean Effect Size and Heterogeneity Analysis

Seven independent meta-analyses were carried out: one for each component of the programs (inappropriate behaviors, social skills and relationships, decision making, sexual relations, sexual abuse, other variables) and one assessing the effect size for all the studies as a whole, which was called the global effect.

The main measure of treatment effectiveness was the effect size obtained in the posttest and in the follow-up. The global mean effect size across all studies (d = − .64) indicates that sex education programs for people with intellectual disabilities were effective, in favor of the experimental group (see Table 2). Effect sizes were of high magnitude for the dimensions inappropriate behaviors and decision making (d = − 1.26 and − 1.03, respectively), of moderate magnitude for the global effect and sexual abuse (d = − .64 and − .71, respectively), and of small magnitude for social skills and relationships (d = − .41). The homogeneity test was significant and the I² index showed heterogeneity in the effect sizes for the global effect and for the components inappropriate behaviors and decision making, so analyses of possible moderating variables were performed to explain the heterogeneity in these cases.
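The magnitude labels used in this paragraph are consistent with Cohen's conventional benchmarks for d (0.2, 0.5 and 0.8). The text does not state which cutoffs were applied, so the thresholds in the following sketch are an assumption, and the function name is hypothetical.

```python
def magnitude(d):
    """Label an effect size using Cohen's conventional benchmarks (assumed cutoffs)."""
    size = abs(d)  # the sign only encodes direction, not magnitude
    if size >= 0.8:
        return "high"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "small"
    return "negligible"


# Effect sizes as reported in the text for the posttest.
reported = {
    "inappropriate behaviors": -1.26,
    "decision making": -1.03,
    "global effect": -0.64,
    "sexual abuse": -0.71,
    "social skills and relationships": -0.41,
}
labels = {name: magnitude(d) for name, d in reported.items()}
```

Under these assumed cutoffs, the labels assigned by the function match the ones given in the text for all five reported values.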

Table 2 Effect size and analysis of heterogeneity in the posttest

Figure 2 presents a forest plot for the overall mean effect of all the studies, showing a medium degree of variability in the effect sizes. Figures 3 and 4 present forest plots for the dimensions inappropriate behaviors and decision making in the posttest, showing in both cases a high degree of heterogeneity (I² = 84% and 79%, respectively).

Fig. 2 Forest plot of effect sizes for global effect behaviors in the posttest

Fig. 3 Forest plot of effect sizes for measures of inappropriate behaviors in the posttest

Fig. 4 Forest plot of effect sizes for measures of decision making behaviors in the posttest

Mean Effect Size in the Follow-Up

Of the selected studies, 16 were analyzed again to calculate the follow-up effect size; the follow-up periods ranged from 1 week to 2 months. Figure 5 presents the resulting forest plot, with a statistically significant effect size of d = − .62 (95% CI: − .84 to − .40) in favor of the experimental group.
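As a rough check on the reported interval, the standard error and z statistic can be recovered from the confidence limits. The sketch below assumes a symmetric normal-approximation interval (critical value 1.96); the function name is hypothetical and the calculation is illustrative, not taken from the original analysis.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Standard error implied by a symmetric normal-approximation interval."""
    return (upper - lower) / (2 * z_crit)


# Follow-up effect size and 95% CI as reported in the text.
d, lower, upper = -0.62, -0.84, -0.40

se = se_from_ci(lower, upper)                      # about 0.11
z = d / se                                         # well beyond -1.96
p_two_sided = math.erfc(abs(z) / math.sqrt(2))     # two-sided normal p-value
```

The interval is centered on the reported d and excludes zero, which agrees with the statement that the follow-up effect is statistically significant.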

Fig. 5 Forest plot of effect sizes for post posttest

Funnel Plot

Since all the studies included in the meta-analysis were published articles, publication bias was examined. For this purpose, a funnel plot was drawn to check whether publication bias could threaten the results of the meta-analysis (see Fig. 6). The effect sizes are distributed fairly symmetrically, so publication bias is ruled out as a threat to the validity of the results of the meta-analysis.
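A funnel plot places each study's effect size against its standard error, with pseudo 95% limits drawn around the pooled estimate. The sketch below computes those quantities and a crude symmetry count. The data are hypothetical, the function name is illustrative, and the left/right count is only a simple numerical stand-in for the visual inspection described in the text.

```python
def funnel_summary(effects, std_errors):
    """Pooled estimate plus simple symmetry counts for a funnel plot."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    # Studies on each side of the pooled estimate (rough symmetry check).
    left = sum(1 for d in effects if d < pooled)
    right = sum(1 for d in effects if d > pooled)
    # Studies inside the pseudo 95% limits pooled +/- 1.96 * SE.
    inside = sum(
        1 for d, se in zip(effects, std_errors) if abs(d - pooled) <= 1.96 * se
    )
    return pooled, left, right, inside


# Hypothetical effect sizes and standard errors, for illustration only.
example_effects = [-0.9, -0.3, -0.7, -0.5]
example_ses = [0.4, 0.4, 0.2, 0.2]
pooled, left, right, inside = funnel_summary(example_effects, example_ses)
```

In this made-up example the studies split evenly around the pooled estimate and all fall within the limits, which is the pattern a symmetric funnel would show.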

Fig. 6 Funnel plot meta-analysis for global effect

Analyzing Moderator Variables

Eleven moderator variables were tested: participants’ gender, year of publication, level of training/experience of the program’s instructor, follow-up/non-follow-up, duration, country, age, IQ level, intervention techniques, type of assignment to the control/experimental group, and intervention/non-intervention in the control group.

Global Effect

The inter-category homogeneity statistic was significant for the variables gender, publication year and level of training of the program’s instructor (see Table 3). Regarding gender, the effectiveness of the program was compared across three groups (men, women and mixed). Statistically significant differences were observed between the groups (Q = 10.27, p = .006): the groups of men (d = − 2.94) and of women (d = − .88) showed larger effects than mixed groups (d = − .37). Regarding year of publication, three publication periods were compared (1988–1999, 2000–2009, 2010–2017). Statistically significant differences were obtained between the groups (Q = 6.60; p = .04): studies published between 2000 and 2009 (d = − 1.23) showed larger effects than those published between 1988 and 1999 (d = − .56) or between 2010 and 2017 (d = − .38). Finally, as regards level of training, the instructors of the program were divided into three groups (low, medium and high). The results revealed statistically significant differences between the groups (Q = 6.17; p = .04): instructors with a high level of education (d = − .90) had a greater impact in the experimental group than those with a medium (d = − .71) or low (d = − .34) level of education. Gender explains 80.5% of the heterogeneity, while publication year and the level of training of the program’s instructor explain 69.7% and 67.6%, respectively.
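The percentages reported above coincide with the I² statistic for subgroup differences, (Q − df)/Q, computed with df = 2 for three groups. This correspondence is an observation about the arithmetic rather than something the text states, and the helper names below are hypothetical. The chi-square tail probability is written in closed form for df = 1 and df = 2 only, to keep the sketch free of external libraries.

```python
import math


def chi2_upper_tail(q, df):
    """Upper-tail chi-square probability; closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(q / 2))
    if df == 2:
        return math.exp(-q / 2)
    raise ValueError("this sketch only handles df = 1 or df = 2")


def i2_between(q, df):
    """Share of between-group Q in excess of its degrees of freedom, in percent."""
    return max(0.0, (q - df) / q) * 100


# Between-group Q statistics as reported in the text (three groups, so df = 2).
reported_q = {"gender": 10.27, "publication year": 6.60, "instructor training": 6.17}
explained = {name: round(i2_between(q, 2), 1) for name, q in reported_q.items()}
p_values = {name: chi2_upper_tail(q, 2) for name, q in reported_q.items()}
```

The `explained` values come out as 80.5, 69.7 and 67.6, matching the reported percentages, and all three p-values fall below .05.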

Table 3 Moderating variables for mean global effect

Program Components

Regarding the analysis of the moderating variables for the two program components that revealed high levels of heterogeneity, none of the moderating variables was significant for the inter-category homogeneity statistic in the Inappropriate behaviors dimension. In contrast, significant moderating variables were found for the Decision making dimension: gender, year, level of training of the program’s instructor, follow-up and duration of the sessions.

With regard to gender, the effectiveness of the program was compared across two groups (women and mixed) using the data reported by the nine studies that make up this dimension. Statistically significant differences were observed between the groups (Q = 10.37, p = .001): groups composed only of women (d = − 1.52) showed larger effects than mixed groups (d = − .27).

Regarding year of publication, two publication periods were compared (2000–2009, 2010–2017). Statistically significant differences were obtained between the groups (Q = 10.37, p = .001), with publications from 2000 to 2009 (d = − 1.52) showing a greater impact than those from 2010 to 2017 (d = − .27).

As regards the level of training of the people who implemented the programs, two groups were distinguished (low and medium). The results revealed statistically significant differences between the groups (Q = 10.37, p = .001): instructors with a medium level of training (d = − .71) were more effective than those with a low level (d = − .27).

With respect to the follow-up variable, two groups were distinguished (follow-up and non-follow-up). The results revealed statistically significant differences between the groups (Q = 10.37, p = .001): programs that included a follow-up (d = − 1.52) were more effective than those that did not (d = − .27).

Finally, regarding the duration variable, two groups were distinguished (40–45 min/session and 45–60 min/session). The results revealed statistically significant differences between the groups (Q = 4.97, p = .03): studies that applied the program in shorter sessions (d = − .93) showed larger effects than those with longer sessions (d = − .27).

Discussion

The aim of this study was to determine the effectiveness of sex education programs for people with intellectual disabilities and to analyze the influence of possible moderating variables. In line with the previous scientific literature [17,18,19], the programs examined proved effective in favor of the intervention groups. Specifically, an effect size of moderate magnitude (d = − .64) was obtained in favor of the experimental group for the overall effect of the studies, a result which supports the first hypothesis.

In line with the second hypothesis, gender was a moderating variable that affects the effectiveness of the programs: groups formed by participants of a single sex (men or women) showed larger effects than mixed groups. However, the second hypothesis also proposed that the participants’ age would influence the effect size, with older participants obtaining better scores, and this variable was not significant. Since age is not a moderating variable, it can be inferred that no specific age offers a greater guarantee of success for the application of the programs. Nevertheless, in order to fulfill the preventive nature of these programs, it is advisable to deliver them during adolescence [2, 32].

The third hypothesis of the study is rejected because, despite the predominance of participants with mild intellectual disability and of studies conducted in the United States, the IQ level and the country did not influence the effect size. Thus, it is concluded that neither of these substantive variables acts as a moderator of the effectiveness of the programs. In contrast, the level of training of the programs’ instructors had an impact on the effect size, with the most highly trained professionals being the most effective. These findings confirm the fourth hypothesis and identify the level of training as another moderating variable.

Regarding the duration variable, although the majority of programs had a similar number of sessions, differences were found in session length. For the Decision making component, the results revealed that programs whose sessions lasted between 40 and 45 min had a greater impact on the experimental group than programs with longer sessions. These findings do not support the fifth hypothesis, which expected that longer sessions would have a greater impact. However, the results obtained could be explained by the fact that the longer the session, the greater the probability of fatigue or inadequate attention, which are areas of difficulty in this population [33].

As for the studies that applied follow-up measures, the results support the sixth hypothesis: the effect size in favor of the experimental groups was significant and of greater magnitude in the investigations that included follow-up than in those that did not [17, 19].

Regarding the year of publication, studies published between 2000 and 2009 were significantly more effective than those of earlier and later years for the global effect, and than those of later years for the Decision making dimension. This finding rejects the seventh hypothesis, because the most recent studies were expected to be the most effective. In light of all the theoretical reviews analyzed [16,17,18,19,20], this result could be explained by the fact that from 1998 onwards sex education programs for people with intellectual disabilities stopped addressing only theoretical content and started to consider attitudinal and behavioral issues. Moreover, in 2002 the American Association on Mental Retardation proposed a new theoretical model, which may have encouraged the development of new programs aimed at people with intellectual disabilities.

In this study none of the moderating variables for the dimension Inappropriate behaviors were found to be significant. These results could be due to the smaller number of studies that were linked to this component, in this case only five.

At this point, some limitations of this meta-analysis should be mentioned. One of them was the small number of studies that fulfilled the selection criteria. As a consequence, results need to be interpreted with caution pending the publication of new studies in this field. Another limitation was the absence of a more detailed description of some studies’ characteristics (e.g. intervention techniques). Finally, the limited number of research teams that investigate this topic limits the generalizability of the results. Despite these limitations, the results have several practical implications. First, sex education programs for people with intellectual disabilities should include as areas of intervention the recognition of inappropriate behaviors and decision making in situations of abuse, since these are the components that showed the greatest effectiveness. In addition, groups should be formed by participants of only one sex, and sessions should not exceed 45 min in order to avoid fatigue or inattention in the participants. Finally, the instructors should have a high degree of training and carry out a follow-up to evaluate the effectiveness of the program over time. Given the demonstrated effectiveness of sex education programs, activities that promote sex education for adolescents and adults with intellectual disabilities should be considered, taking into account the guidance offered by this work.