With substantial substance use among youth (Johnston et al. 2016), implementing effective prevention programming remains an important goal. However, there is considerable debate in the prevention community about how to implement these programs and no clear theoretical guidance to date, particularly when taking programs to scale. While the goal of some is universal prevention, others argue that adaptation or modification of prevention programs for different cultural contexts is both inevitable and desirable (Barrera et al. 2017; Castro et al. 2004).

The current study examines these issues during the implementation of keepin’ it REAL (kiR), a middle-school substance use prevention curriculum that merits being “taken to scale” by virtue of its status as one of the few multi-cultural, evidence-based programs listed on websites such as Youth.gov (http://youth.gov/content/keepin%E2%80%99-it-real), NREPP (https://nrepp.samhsa.gov/Legacy/ViewIntervention.aspx?id=133), and CrimeSolutions.gov (https://www.crimesolutions.gov/TopicDetails.aspx?ID=4#program), as well as its recommendation in the U.S. Surgeon General’s (2016) report on addiction (U.S. Department of Health and Human Services 2016). The curriculum is based on a model derived from Narrative Engagement Theory (Miller-Day and Hecht 2013), the Principle of Cultural Grounding (Hecht and Krieger 2006), and Social Emotional Learning Theory (Durlak et al. 2015). The intervention is narrative in both content and delivery and was culturally grounded in youth cultures as well as the urban, multi-ethnic population of Phoenix, Arizona, where two efficacy trials demonstrated effects on substance use (Hecht et al. 2006; Marsiglia et al. 2011).

kiR is multi-cultural because it is grounded in the narratives of various ethnic groups as well as both genders and across socio-economic statuses (i.e., provides representation of multiple cultures). The multi-cultural version of the curriculum proved more effective than culturally targeted versions (Hecht et al. 2006). Studies of other interventions, however, led to the conclusion that cultural grounding may not have the same effect on all ethnic groups (Johnson et al. 2005). Discrepant findings and the potential for consideration of other cultural factors such as rurality suggest ongoing questions about adaptation that led to the design of the current group randomized trial to evaluate the effectiveness of the multi-cultural, urban (hereafter referred to as urban) version of kiR as well as a new, culturally adapted rural version of the program created for this study.

Designed Adaptation

The literature suggests that one of the primary reasons curricula get adapted is to fit local needs (for review, see Barrera et al. 2017). Implementers often feel that the generic curriculum does not match the needs of their audience and adapt for race/ethnicity, geography, and other factors. Although there are risks inherent in the process of adaptation, there are also benefits, including addressing local needs, increasing community ownership, and increasing cultural relevance (Botvin 2004; Dusenbury et al. 2003). In some cases, adaptation is necessary due to the “mismatch effect” of programs that are implemented in populations that are very different from the group for which they were originally developed. Castro et al. (2004) argue that this mismatch threatens efficacy even when there is high fidelity because messages tend to be more effective when they represent the culture of the target audience (Hecht et al. 2003). This might explain the finding that in schools with higher non-white populations, teachers are more likely to locally adapt prevention curricula developed for white audiences (Ringwalt et al. 2004).

Although some prevention researchers believe that the need for and effectiveness of local adaptation may be over-stated (Elliott and Mihalic 2004), others support balancing the need for program fidelity with a desire for local or cultural adaptation (Dusenbury et al. 2003; Ringwalt et al. 2004). However, despite general interest in adaptation among the prevention community (e.g., Baker 2001; Barrera et al. 2011; Castro et al. 2004; Lee et al. 2008) and a call for evaluation of designed or “planned” adaptations (Pentz 2004), there are few examples of programs that describe the process of adapting a program for a new intended audience. In one study, Botvin et al. (1989) report how they adapted a smoking prevention curriculum originally tested with predominantly white, suburban students for an urban, Hispanic population. A second example is Project Northland, which was originally designed to prevent early-onset alcohol use among rural adolescents in Minnesota and was adapted for use with a multi-ethnic population in Chicago (Komro et al. 2008). The current study addresses the limitations of previous research by examining the effectiveness of designed adaptation in the context of rural substance use prevention.

One model of designed adaptation is based on the Principle of Cultural Grounding (Hecht and Krieger 2006). The Principle is a prevention approach based on communication competence (Spitzberg and Cupach 1984) and Narrative Engagement Theory (Miller-Day and Hecht 2013), as well as multi-culturalism (Green 1999). This Principle considers culture a prime factor in both curriculum development and adaptation, indicating a perceived need for curricula that communicate a high degree of cultural sensitivity to their target audience (Hecht et al. 2003). In doing so, it addresses how to adapt a curriculum to a new culture, including both surface structures (e.g., language, rituals, food) and deep structures (e.g., values and beliefs) (Castro et al. 2004). Cultural grounding argues that the prevention messages must be derived from and with cultural group members as active participants in message design and production. It invokes core values and communication styles as central features of a culture’s deep structure that are expressed in narratives.

The process of culturally grounding kiR began with studies articulating an adolescent perspective (i.e., youth culture) on substance use. These studies described how adolescents make sense of drug offers, their norms and values, how they make decisions about use, and how they resist offers (Miller et al. 2000; Pettigrew et al. 2011). Characterized as a “from kids, through kids, to kids” approach, curriculum development started with cultural narratives and proceeded iteratively through participatory action research (Hecht and Miller-Day 2009). The current study provides an evaluation of a version of the curriculum culturally adapted for rural youth as compared with the original version created for urban youth (see Colby et al. 2013) to test the need for cultural adaptation. The design allows for a comparison of the adapted and non-adapted versions of the program against controls in rural schools to examine the overall effectiveness of the curriculum.

Designed Adaptation of kiR for Rural Youth

The original kiR, culturally grounded in the narratives of urban youth, was adapted or “re-grounded” for rural schools (Colby et al. 2013). The urban focus of kiR, as well as of many, perhaps most, programs, may mean that cultural mismatch is experienced when they are implemented in rural areas. Briefly, rural cultures differ from their urban and suburban counterparts in several important ways, including experiencing considerable health disparities (Haynes and Smedley 1999). These disparities extend to substance use, where rural adolescents report higher levels of tobacco, alcohol, and methamphetamine use than their non-rural counterparts (Warren et al. 2016) and often begin using drugs at an earlier age (Zollinger et al. 2006). Additional research suggests that protective factors, such as peer and parental disapproval, may be weaker among rural youth (Lenardson et al. 2012). Formative research collected rural narratives about drug offers, identifying the complexities of rural drug offer/refusal processes (Moreland et al. 2013; Pettigrew et al. 2012). In collaboration with rural teachers, this information was integrated into the curriculum, including role play activities, decision scenarios, and homework, as well as new videos that retain the prevention strategy, curriculum design, and intervention strategy (for more detail, see Colby et al. 2013).

Delivery Quality

As noted, culturally mismatched programs are more likely to be adapted during implementation giving rise to questions about whether effectiveness is driven by the mismatch or the quality of delivery. Regardless of whether adapted or not, the success of interventions may rest on the quality of their delivery because altering material may detract from program outcomes (Botvin 2004; Elliott and Mihalic 2004). Unfortunately, when evidence-based programs are taken to scale, they are rarely implemented as designed (Elliott and Mihalic 2004; Botvin et al. 1989; Miller-Day and Hecht 2013; Ringwalt et al. 2004).

Dusenbury et al. (2003) argue for conceptualizing implementation broadly to include adherence (i.e., fidelity), dose, participant responsiveness, quality of delivery, and program differentiation. This reflects a shift from viewing all change as maladaptive to asking a more global question of the quality of implementation with fidelity as one of many factors and perhaps not even the most important. Berkel et al. (2011) suggest a multi-dimensional view that includes both teacher delivery and student responsiveness behaviors, and Pettigrew et al. (2015) report that delivery quality (comprised of both teacher and student behaviors) is a better predictor for program outcomes than adherence to the program. Findings like these suggest taking a broad and inclusive view toward delivery quality and that more research is needed to evaluate its importance. Thus, the current study aims to examine the relationship between delivery quality and program adaptation.

Hypotheses

We hypothesized that the designed adaptation of the kiR curriculum delivered with high quality would be more effective than the adapted curriculum delivered with low quality or the control condition. Similarly, it was posited that the multi-cultural, non-adapted/urban kiR delivered with high quality would be more effective than the non-adapted/urban curriculum delivered with low quality or the control condition. In accomplishing these goals, this study provides a test of the novel curriculum based on the principle of cultural grounding and narrative engagement theory while providing answers to questions about the most efficacious adaptation. The following hypotheses are posited:

  • H: There will be significant differences in substance use across the five conditions such that:

    • Ha: Participants in the high quality designed-adapted rural kiR will report less substance use than those in the control condition.

    • Hb: Participants in the low quality designed-adapted rural kiR will report less substance use than those in the control condition.

    • Hc: Participants in the high quality designed-adapted rural kiR will report less substance use than those in the low quality designed-adapted rural kiR condition.

    • Hd: Participants in the high quality non-adapted urban kiR will report less substance use than those in the control condition.

    • He: Participants in the low quality non-adapted urban kiR will report less substance use than those in the control condition.

    • Hf: Participants in the high quality non-adapted urban kiR will report less substance use than those in the low quality non-adapted urban kiR condition.

Methods

Participants and Procedures

Schools were recruited to participate from rural school districts in Pennsylvania and Ohio based on rural-urban classifications provided by the National Center for Education Statistics (http://nces.ed.gov). Approximately 64 rural schools were contacted and a total of 39 schools from rural areas agreed to participate and were randomly assigned to the control condition (n = 14), the culturally non-adapted urban curriculum condition (n = 11), or the culturally adapted rural curriculum (n = 14) (see Graham et al. 2014). Prior to implementation, treatment (adapted/rural and non-adapted/urban conditions) schools were provided curriculum materials and a standard 1-day training. After training, 32 teachers in the treatment schools delivered the program. No other delivery support was provided beyond training so that conditions approximated an effectiveness trial. Control schools were provided information about study procedures and promised the curriculum materials and training after the study ended.

Students from participating schools were invited to participate in the study. Random assignment of schools to conditions minimized the chance for selection bias among participating students. Passive parental consent and active student assent were obtained to participate in four waves of self-report, paper-and-pencil surveys on computer-scannable forms administered by the Penn State University’s Survey Research Center. Survey data were collected in fall and spring of 7th grade (2009, 2010) and subsequently in spring of 8th (2011) and 9th grades (2012). Surveys followed a planned missing design due to time constraints (Graham et al. 1996, 2006). All procedures were approved by a university institutional review board.

A total of 2781 students (at 9th grade: M = 14.71 years, SD = .60) participated in all four waves of data collection from 7th through 9th grades. This sample included 1095 students in the control condition (39%), 664 in the non-adapted/urban condition (24%; N = 329, 12% for low delivery quality; N = 335, 12% for high delivery quality), and 1022 in the adapted/rural condition (37%; N = 590, 21% for low delivery quality; N = 432, 15% for high delivery quality). Of the total, 51% reported themselves as male and 97% identified as European American. Students’ demographics matched the geographic location of data collection. See Fig. 1 for CONSORT flow diagram.

Fig. 1 CONSORT flow diagram for the rural keepin’ it REAL RCT. Note: one school from the non-adapted condition declined to participate in the wave 3 survey (2011) but participated in wave 4

Measures

Both observational and self-report measures were used in this study. Observational measures assessed delivery quality and self-report measures assessed student outcomes. The analyses used wave 1 pre-test data as baseline control variables for student outcomes and wave 4 final post-test data, collected 24 months after implementation.

Delivery Quality (Observations)

To measure delivery quality, kiR lessons were videotaped and coded by three trained coders. Teachers in treatment schools were provided digital video cameras to record all lessons. Digital recordings were mailed to project staff. Teachers were compensated $10 per lesson if they completed all research activities. This resulted in a corpus of 688, 20–60-min videos. A random selection of 276 videos was analyzed including approximately four lessons per class. The first and last lessons were excluded to focus on lessons with the most prevention content (Pettigrew et al. 2012).
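The video selection described above can be illustrated with a minimal sketch. It assumes, hypothetically, that each class is identified by a label and each recording by its lesson number; the function name, the seed, and the cap of four lessons per class are illustrative choices, and the study's actual sampling procedure may have differed in detail.

```python
import random


def sample_lessons(recorded, first_lesson, last_lesson, per_class=4, seed=0):
    """Randomly pick up to `per_class` recorded lessons for each class.

    `recorded` maps a (hypothetical) class label to the lesson numbers that
    were videotaped. The first and last lessons of the curriculum are
    excluded before sampling, as described in the text.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    selected = {}
    for class_id, lessons in sorted(recorded.items()):
        eligible = sorted(
            lesson for lesson in set(lessons)
            if lesson not in (first_lesson, last_lesson)
        )
        k = min(per_class, len(eligible))  # some classes have fewer videos
        selected[class_id] = sorted(rng.sample(eligible, k))
    return selected
```

Classes with fewer eligible recordings than the cap simply contribute all of them, which is one way to arrive at "approximately" four lessons per class.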

A team of three trained coders received 20 h of coding training (see Pettigrew et al. 2015 for more detail). Five indicators of teacher engagement were rated on four-point scales measuring how attentive, enthusiastic, serious, clear, and positive the teacher was. Student engagement assessed the level of attentiveness and participation observed during the lesson on a four-point rating scale. A third dimension, global teaching quality or overall effectiveness, was rated on a five-point scale from poor to excellent. Weekly meetings were held and, monthly, coders were randomly assigned the same video to re-check coding agreement. This process prevented coder drift and allowed coders to maintain consistent standards. Inter-coder reliability using the Krippendorff alpha (Hayes and Krippendorff 2007) showed high agreement at four different time points during the coding process (0.94, 0.93, 0.84, and 0.92).

For this study, teacher engagement, student engagement, and global teaching quality were combined into a single variable labeled delivery quality. Because teacher engagement and student responsiveness are “two measures of a single phenomenon unfolding at the same time, or at least in very rapid sequence” (Pettigrew et al. 2015, p. 97), we subsumed these into a single variable. These procedures followed previous work (i.e., Pettigrew et al. 2015), which demonstrated that adherence and delivery quality were highly correlated and that delivery quality was a better predictor of adolescent substance use outcomes. Thus, we opted to focus the current analysis on effects of delivery quality. We averaged each indicator within-lesson and then averaged across lessons, producing a mean rating for each variable for each unique class of students. We then computed delivery quality by calculating the mean of weighted standardized variable scores where teacher engagement was given twice the weight of student engagement and global teaching quality. Two factors determined our weighting calculus: videos were positioned in the backs of classrooms to focus on teachers while still capturing the entire class, and coders rated six items assessing teachers and two assessing students (Pettigrew et al. 2015). We created weighted standardized scores for the delivery quality variable and used zero as a cut-off point to determine high and low delivery quality in the treatment conditions. That is, positive means of the delivery quality were entered as high delivery quality and negative means of the delivery quality were entered as low delivery quality. Because we used weighted scores to compute delivery quality, a standardized mean was considered an appropriate cut-point between high and low quality.
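The scoring steps above can be sketched in a few lines of code. This is an illustration under stated assumptions, not the procedure as actually run: it standardizes with the population standard deviation, divides the weighted sum by the sum of the weights (2 + 1 + 1), and treats a score of exactly zero as low, none of which the text specifies. The function names are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize class-level mean ratings (population SD is an assumption)."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def delivery_quality(teacher, student, global_quality, weights=(2.0, 1.0, 1.0)):
    """Weighted mean of standardized scores, one value per class.

    Teacher engagement is weighted twice as heavily as student engagement
    and global teaching quality, as described in the text. Dividing by the
    sum of the weights is an assumption; it does not change the sign of a
    score, so the high/low split is unaffected.
    """
    z_teacher = zscores(teacher)
    z_student = zscores(student)
    z_global = zscores(global_quality)
    w_t, w_s, w_g = weights
    total = w_t + w_s + w_g
    return [
        (w_t * t + w_s * s + w_g * g) / total
        for t, s, g in zip(z_teacher, z_student, z_global)
    ]


def classify(scores):
    """Positive scores are high quality; zero or below is low (assumption)."""
    return ["high" if score > 0 else "low" for score in scores]
```

Because each standardized variable has a mean of zero, the composite also averages zero across classes, which is why zero serves as a natural cut-point.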

Substance Use (W1 and W4 Surveys)

Four items measured youth lifetime use of alcohol, cigarettes, marijuana, and chewing tobacco (Hansen and Graham 1991). Students were asked to answer the following items: “How many drinks of alcohol have you had in your entire life” with ten response options, “How many cigarettes have you smoked in your entire life” with ten response options, “How many times have you used marijuana in your entire life” with seven response options, and “How many times have you used chewing tobacco in your entire life” with eight response options. See Table 1 for descriptive statistics of substance use in the five conditions.

Table 1 Baseline (W1) and lifetime (W4) substance use: means and standard deviations

Analytical Plan

Prior to the main analyses, multiple imputation (Graham 2012) was employed to handle missing data by entering youth reports of lifetime substance use at W2 and W3 as auxiliary variables, using Mplus (Muthén and Muthén 1998–2015). Next, four dummy variables were created to compare a reference group with the rest of the conditions. For example, the first analysis used the control condition as the reference group and tested the comparisons between the reference group and the other four conditions. The second analysis used the low quality non-adapted/urban condition as the reference group and compared the outcomes between the reference group and two other conditions (i.e., the control condition and the high quality non-adapted/urban condition). Since the individual responses were nested within schools, we controlled for school level effects. Baseline substance use also was included as a covariate in the main analysis due to its significant correlations with lifetime substance use at W4. In this way, we were able to compare all five conditions while controlling for school level effects and baseline substance use. To summarize, we tested the hypotheses by computing a series of mixed model analyses to examine the effects on youth substance use behavior (W4 in 9th grade) while statistically controlling for school level effects and baseline (W1 in 7th grade) pre-test reports of youth lifetime substance use.
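The dummy coding step can be illustrated as follows. This sketch covers only the coding of the five conditions against a chosen reference group; it does not fit the mixed models, which were estimated in Mplus. The condition labels and the function name are hypothetical.

```python
# Hypothetical labels for the five study conditions.
CONDITIONS = ["control", "urban_low", "urban_high", "rural_low", "rural_high"]


def dummy_code(condition, reference, levels=CONDITIONS):
    """Return four 0/1 indicators for one student's condition.

    The reference group has no indicator of its own: a student in the
    reference group scores 0 on every dummy, so each coefficient in a
    later model is a contrast between one condition and the reference.
    """
    if condition not in levels or reference not in levels:
        raise ValueError("unknown condition label")
    return {
        f"is_{level}": int(condition == level)
        for level in levels
        if level != reference
    }
```

Re-running the same coding with a different reference (for example, the low quality non-adapted/urban condition) yields the contrasts needed for the within-curriculum comparisons such as Hc and Hf.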

Results

The hypotheses predicted that there would be significantly different effects on youth substance use across the five conditions. Mixed model analyses compared the high- and low-quality adapted/rural and non-adapted/urban conditions to the control condition (see Table 2). We found that youth in both high- and low-quality delivery of the adapted/rural condition reported significantly less cigarette use than those in the control condition (Ha and Hb partially supported). All the other effects for high-quality adapted/rural curriculum were in the desired direction but not statistically significant. Effects for low-quality adapted/rural curriculum were not statistically significant but were in the desired direction for alcohol and chewing tobacco. Youth in the high-quality, non-adapted/urban condition reported less marijuana use than those in the low quality, non-adapted/urban condition (Hf partially supported) but neither high nor low quality delivery of the non-adapted curriculum produced substance use outcomes that differed significantly from the control. The other hypotheses were not supported.

Table 2 Mixed model results

Discussion

Findings from this study underscore the importance of both planned or “designed” adaptation and, to a limited degree, delivery quality. Advancing knowledge in these two domains contributes new evidence in prevention science that can aid program developers and the prevention community as interventions are scaled. Findings also support the effectiveness of keepin’ it REAL when it is culturally grounded but not when it is culturally mismatched. This section reviews the findings and explores their implications.

Designed Adaptation

The most significant finding was the emergence of designed adaptation as the key in generating positive program effects. The adapted, rural curriculum that matches the culture of the rural target audience proved effective in reducing adolescent cigarette use. A similar, but non-significant pattern was observed for alcohol, marijuana, and chewing tobacco. This general pattern was true regardless of delivery quality. While it may be that the study was under-powered for the low baseline rates reported in Table 1 (i.e., the study might have demonstrated significant effects if the sample were larger), we cannot assume a methodological explanation. One possible explanation for the alcohol findings could be the regulatory environment. Both Ohio and Pennsylvania restrict retail sales to state-run distributors making it more difficult for under-age youth to obtain alcohol. If youth do not have access to alcohol they consume less and, therefore, there may be less need for early prevention interventions like keepin’ it REAL in addressing this problem.

Overall, we can conclude that the adapted or culturally matched curriculum proved efficacious in reducing tobacco use among rural youth. The same, however, cannot be said for the non-adapted/urban or culturally mismatched curriculum. While the high-quality implementation of this curriculum resulted in significantly less marijuana use than the low-quality delivery, neither differed significantly from the control group for any of the substances.

These findings support the importance of cultural re-grounding (Colby et al. 2013) when a curriculum is culturally mismatched. The current findings suggest that even evidence-based practices can be ineffective when culturally mismatched. Pettigrew and Hecht (2015) argue that this process should be considered when developing prevention curricula, partly because it allows the voices of the target intervention group to be represented in the program. It also may explain why some evidence-based programs produce null effects when taken to new populations (Ringwalt et al. 2010). Problems of “taking programs to scale” are widely acknowledged and cultural mismatch may be part of a broader explanation. If the voices of the new target culture are not represented in the program, the Principle of Cultural Grounding (Hecht and Krieger 2006) argues that the target population will not easily adopt or internalize these messages. Accordingly, programs may require adaptation when transported to new settings.

As Colby et al. (2013) note, questions remain about the means for inclusion in multilevel, culturally situated community interventions, which requires consideration of core components and philosophies (see also Barrera et al. 2017). While cultural re-grounding appears successful, it is not necessarily easy and can be resource-intensive. As a potential solution, Miller et al. (2000) provided a “how-to” appendix for schools to develop their own prevention programs and Colby et al. (2013) provide an exemplar of this process. Whatever methods are used, the current study suggests a need to find ways to adapt programs to make them match local culture.

A second strategy is to build dissemination into the initial design by making the curriculum truly multi-cultural. This means a curriculum that is inclusive rather than targeted to salient identities. The multi-cultural, non-adapted/urban curriculum was developed for Phoenix, Arizona schools and included youth, SES, gender as well as three ethnic cultures that represented over 95% of the local population (Mexican American, African American, White American). When this was adapted for national distribution by D.A.R.E. America, kiR was modified to include rural, suburban and urban adaptations (including allowing for mixing to accommodate contexts that have elements that cut across geographic identities) along with an even greater range of ethnic identities (i.e., Asian Americans and other groups). We do not assume that all members of an identity group can be represented this way because there are likely to be small groups in certain sections of the country that do not identify with the represented groups. The D.A.R.E. Alaska program, for example, felt the need to re-ground kiR to address the needs of their indigenous population. However, we believe that inclusiveness fosters engagement. Thus, when students across the USA participate in D.A.R.E., which is in over 70% of school districts, most can see themselves represented.

Third, we argue that narrative pedagogy is a key to facilitate taking curriculum to scale. Even after including as many identity groups and cultural factors as possible in assembling a multi-cultural curriculum there may be other local factors that need to be considered. Narrative Engagement Theory (Miller-Day and Hecht 2013) uses narrative pedagogy as a strategy for accommodating new settings. Using stories as exemplars, scenarios for activities, role plays, discussions, and other pedagogical tactics provides students and implementers with opportunities for localization while maintaining fidelity. During these activities, participating youth provide their own narratives or stories, localizing the content as they apply it. This is particularly true when developmentally appropriate narratives are derived from youth culture to stimulate discussion and role plays. The theory argues that personal narratives emerge in response to these iconic narratives, which engages audiences and localizes the curriculum by allowing youth to “see themselves” even in a curriculum that does not directly address every aspect of their individual cultural identities. Thus, narrative pedagogy is an essential element of the curriculum plan to facilitate dissemination.

Delivery Quality

Perhaps the most surprising result was the limited influence of delivery quality. The only effects observed were within the non-adapted/urban condition, where high-quality delivery produced better outcomes than low quality. We can speculate about why this was observed. For example, the effects of cultural matching may have overridden those of delivery. In other words, cultural matching may simply be a more powerful predictor of prevention outcomes, smoothing out or masking any delivery effects. We find, for example, when examining only the treatment group that delivery quality does, in fact, impact program outcomes (Pettigrew et al. 2015). The most powerful prevention likely results from synergy between content and delivery; however, findings indicate that culturally appropriate and strong prevention content matters more than good delivery.

Thus, delivery quality remains an important topic of study and practice with many unanswered questions (for review, see Durlak and DuPre 2008). Efforts should remain focused on either (a) developing prevention programs that are immune to low delivery quality (Bumbarger 2015), such as adaptive interventions (Collins et al. 2004) or (b) selecting, training, and supporting high delivery quality (Fixsen et al. 2009). While programs may never be totally immune to low-quality delivery, programmed interventions and/or those online may minimize the threat. For example, REAL Prevention recently developed and implemented an e-learning curriculum that locks students into a progression through the lessons and informs instructors if students are not making progress. This program, and others like it, capitalize on emerging technologies for scaling interventions. Evaluations can provide evidence for the efficacy of this approach.

Trainer and training issues have long been discussed in the prevention literature, which concludes that training improves outcomes (Fixsen et al. 2005). What is less clear is how much and what kind of training and technical support are needed. Some find that training plus ongoing phone reminders and support improved outcomes (Kaner et al. 1999) whereas others found no effects from training plus coaching (Ringwalt et al. 2009a; Ringwalt et al. 2009b). D.A.R.E. America provides extensive training, including a practicum, and ongoing technical support for their implementation of kiR. D.A.R.E. requires a minimum of 80 h of training for new officer implementers and provides a network of officer mentors and educators to support their efforts. As a result, studies of officer implementers demonstrate their fidelity and delivery quality (Bumbarger and Miller 2007; Hammond et al. 2008). Unfortunately, little is known about training or technical support, other than acknowledging that training matters. Another option involves taking advantage of technology that allows user customization (i.e., the user makes choices in curriculum options to individualize the content) (Kang and Sundar 2016) while retaining the overall prevention strategy.

Limitations

Although findings in this study shed light on the important topics of adaptation and implementation, results should be interpreted with caution. Our analysis did not control for other aspects of implementation quality, such as adherence or dose. In addition, the findings may have been limited by the cut-off score used to differentiate high and low delivery quality. While we based the cut-off score on an empirical criterion utilizing the findings of Pettigrew et al. (2015) to identify the most salient implementation variables, there may be better, conceptually grounded methods.

In addition, as indicated in the Consort Flow Diagram (see Fig. 1), several schools declined to participate in this study. This may limit generalizability.

This study was conducted with youth in districts defined as rural by the National Center for Education Statistics across two states in the USA who may not represent all rural youth in the country. There also likely is heterogeneity in the degree to which youth living in rural places identify with rural life. To better understand effects of cultural mismatch, attention should be given to the interplay between the structural and phenomenological definitions of culture. Moreover, while the sample is similar to the local population characteristics, it does not include substantial numbers of rural minority populations and does not allow for subgroup analysis. The findings, therefore, may not generalize to the entire youth population in the United States.

Overall Conclusions

The current study attempted to answer questions about the importance of designed adaptation and delivery quality for the widely disseminated, evidence-based keepin’ it REAL curriculum. In our study, only the rural version of the program adapted specifically for rural youth demonstrated effectiveness in reducing substance use and those results were limited to tobacco use. Given the important health effects of this substance, this finding should not be minimized and provides some support for the effectiveness of keepin’ it REAL when adapted for local circumstances as well as, hopefully, for the nationally disseminated D.A.R.E. program that is more widely grounded in geographic, ethnic, gender, and other types of diversity.

At the same time, delivery quality did not provide the overall benefits reflected in previous research (Pettigrew et al. 2015). Further research is needed to determine if delivery quality is primarily protective or if delivery itself is a quality of effective interventions (i.e., improves effects of all curricula). One can even conceptualize a combination of these: protecting against iatrogenic effects of some curricula, increasing the effectiveness of moderately effective curricula, and having no effects on powerful curricula.

Based on this, we might ask, what classifies as a universal program? Questions remain about the key characteristics of a target population (i.e., what cultural features require re-grounding) and how these characteristics are defined (e.g., will a program developed in rural Pennsylvania and Ohio be effective in rural Maine? Rural California?). Do any of these characteristics transcend the importance of core components and clear logic models? Such questions about implementation science are important to consider and investigate as programs developed in one place are disseminated elsewhere. Prevention and implementation science that includes processes for taking programs to scale is in its infancy; we hope that this study demonstrates the complexity of these issues, raises relevant questions for future exploration, and shares some guidance in this process.

In conclusion, this study highlights the need for culturally regrounding prevention curricula. The non-adapted/urban version of the keepin’ it REAL curriculum, which has proven effective in two previous randomized clinical trials in an urban, multi-ethnic community, did not achieve the same effects when it was implemented in the largely white, rural settings of Ohio and Pennsylvania. The controversies around fidelity, implementation quality, universal prevention, cultural grounding/appropriateness, etc. are addressed, in part, by these findings that argue for regrounding curriculum and suggest that this process may be more important than implementation quality in achieving program effects. Greater attention to issues surrounding inclusion (i.e., multi-culturalism) versus targeting is needed as we address the viability of universal prevention.

Beyond the more conceptual implications for prevention science, these findings demonstrate the need for further research in our under-served rural communities. Rural culture, like other cultural variables, merits consideration when interventions are designed and implemented. While our rural populations may be smaller than they were even 30 years ago, a significant number of people still live in these cultures and others have likely retained a portion of their rural identities even after relocating to other geographic areas.