The commercial sexual exploitation of children and youth (CSECY) is a well-established public health problem. Research links CSECY to a host of negative health outcomes, including self-harm, suicide, substance use, severe mental health problems (posttraumatic stress disorder [PTSD], depression, anxiety, dissociation), sexually transmitted infections and human immunodeficiency virus (HIV), as well as experiences of violence (e.g., rape, physical abuse, psychological abuse, murder; Clawson et al. 2009; Greene et al. 1999; Hughes 2014; Yates et al. 1991). Despite the data supporting its impact on health, there is a lack of consistent evidence on the incidence and prevalence of CSECY. This is most likely related to the phenomenon’s hidden nature (Albanese 2007; Cusik 2002), as well as limitations of available research (Marcus et al. 2012). Scholars consider CSECY to be a significant problem in the United States (Albanese 2007; Barnitz 2001; Clayton et al. 2013; Estes and Weiner 2001), though available estimates vary widely depending on the data source and the focus of data collection. According to End Child Prostitution, Child Pornography and Trafficking of Children for Sexual Purposes International (ECPAT 2012), at least 100,000 children are exploited in the United States through prostitution every year. A study of CSECY in New York City alone estimated a population of 3946 affected youth (Curtis et al. 2008). Prospective cohort data have also been used to estimate incidence rates of trading sex among “street-involved” youth. For example, in the Montreal Street Youth Cohort study, among female youth ages 14 to 25 who had never engaged in commercial sex at baseline (n = 148 of a total sample of 330), almost a quarter (22 %, n = 33) had engaged in commercial sex by follow-up (an average of 2.4 years post-baseline; Weber et al. 2004).

Systematic reviews pertaining to CSECY have synthesized literature on a variety of sub-topics, including governmental approaches for addressing child prostitution (Cusik 2002) and theory-informed victim vulnerabilities (Reid 2012). Others have focused on the broad topic of sex trafficking, without specific attention to children or youth (e.g., van der Laan et al. 2011). A recent review covered a breadth of topics, such as the service needs of victims, existing research efforts, placement types and interventions, and child welfare department responses to the commercial sexual exploitation of children (Hughes 2014). The author concluded that despite growing attention to the issue of CSECY and a variety of “promising practices” to address it, a great deal of additional work is needed to build a robust evidence base in this area. Only a few existing victim service programs have benefitted from formal evaluation. That review did not focus specifically on synthesizing existing evaluation findings and thus did not provide the depth or detail likely to be optimal for either characterizing the current evidence base of CSECY-focused programming or guiding future efforts in this area. The present paper aims to address this need through a systematic review of available evaluation research on programs and policies intended to prevent the commercial sexual exploitation of children and youth.

Terminology and Definitions

Many terms are used in the scientific literature, in advocacy discourse, and in popular media to describe the exchange of sex for money or in-kind items (e.g., housing, food, gifts). Sometimes these terms are used interchangeably, whereas at other times one term is used as an umbrella term for a larger phenomenon (e.g., sexual exploitation) into which sub-phenomena (e.g., commercial sexual exploitation, domestic minor sex trafficking) fall. The following definition of “the sex trade” from The Young Women’s Empowerment Project is useful, as it serves as an umbrella term under which CSECY falls:

[The sex trade is] any form of being sexual (or the idea of being sexual) in exchange for money, gifts, safety, drugs, hormones, or survival needs like housing, food, clothes, or immigration and documentation – whether [the young person involved] gets to keep the money/goods/service or someone else profits from these acts. (Iman et al. 2009, p. 7)

In many instances, the various concepts and terms that fall under the broad category of “the sex trade” can be considered interchangeable or differing only slightly from one another (i.e., sexual exploitation of children [SEC], commercial sexual exploitation of children [CSEC], domestic minor sex trafficking [DMST], child/juvenile prostitution, and survival sex). Despite the slight conceptual differences that may be associated with such terminological differences (which will not be discussed in detail here), the current review includes research that is positioned under any of the major terms that have been used in the literature on CSECY.

The Need for Rigorous and Informative Program Evaluation

Program evaluation is one of 10 essential public health services considered to be important for all communities to undertake. Specifically, communities should “evaluate effectiveness, accessibility, and quality of personal and population-based health services” (Centers for Disease Control and Prevention [CDC] 2013, p. 1). Evaluation can be defined as the “examination of the worth, merit, or significance of an object” and a program as “any set of organized activities supported by a set of resources to achieve a specific and intended result” (CDC 2012, p. 1). Despite the directive to evaluate programs thoroughly, evaluation is neither practiced consistently across public health areas nor well integrated into everyday program operations (CDC 1999). It is particularly lacking in efforts to improve child welfare (Collins 2008). Although there are myriad efforts across the nation aiming to address CSECY (e.g., education for at-risk youth, training for victim and support service professionals, outreach and public awareness initiatives, prevention programs, and hotlines for victims and service providers), “the specific goals, target populations, sources of funding, ideology, and designs of these efforts vary significantly” (Hughes 2014, p. 9).

The RE-AIM Framework

The present review uses the RE-AIM framework from the field of public health to synthesize the limited evaluation findings available on programs and policies that address CSECY. RE-AIM emerged from a need for improved reporting on implementation and external validity of public health research and was developed to address issues of translation from “lab” to real-world settings (Kessler and Glasgow 2011). The framework is based on the idea that comprehensive evaluation of the impact of a program or policy requires assessment of five dimensions in combination: (1) reach, (2) effectiveness, (3) adoption, (4) implementation, and (5) maintenance (Glasgow and Linnan 2008; Glasgow et al. 1999). The RE-AIM framework is especially useful for reviewing program and policy evaluations as it calls attention to what is typically evaluated (i.e., one or two of the five categories) versus what could or should be evaluated. For example, evaluations that focus purely on the internal validity of the effectiveness of a program or policy in a specified environment miss the potential for evaluating external validity, or the ability to transfer the program or policy to other settings and potentially to other groups (Glasgow et al. 1999).

We are interested in both the internal and external validity of programs and policies addressing CSECY; therefore, we use the RE-AIM framework to guide our review of existing evaluations. RE-AIM has been used as an evaluation framework for a variety of public health interventions. A systematic review of the use of RE-AIM between 1999 and 2010 found 71 articles employing the framework for evaluation in a variety of content areas, including physical activity and obesity, disease management, substance abuse, and health promotion in diverse settings of community, policy, healthcare, and education (Gaglio et al. 2013). Notably, of the 71 articles, only 61 % (n = 44) reported on all five dimensions of the framework (Gaglio et al. 2013).

The overarching research questions addressed in this review are nested under each of the five dimensions of RE-AIM. Table 1 outlines the RE-AIM framework and provides descriptions of the five dimensions as well as the corresponding questions for the current review.

Table 1 RE-AIM categories and questions for systematic review

Method

Search Strategies

Following best practice guidelines for systematic literature reviews (Cooper 2010), evaluations of CSECY programs and policies were identified through multiple channels. These included 1) a search of several reference databases, including Google Scholar, PsycINFO, PubMed, and ProQuest theses and dissertations, and of the websites of relevant journals (e.g., Youth Violence and Juvenile Justice) using the following search terms: “commercial sexual exploitation of children”, “commercial sexual exploitation of youth”, “domestic minor sex trafficking”, “youth prostitution”, “child prostitution”, and/or “child sexual exploitation” coupled with “intervention”, “prevention”, “policy”, and/or “program”; 2) an examination of relevant prevention and training programs’ websites; 3) a review of the reference lists of retrieved articles as well as those of prior reviews (e.g., Hughes 2014); and 4) contact with authors who had published an article in the past 5 years that was deemed appropriate for inclusion in the final review, as well as with other known researchers in the area of CSECY.
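The pairing of topic terms with intervention terms described above can be sketched as follows. This is an illustration only: the helper name build_queries and the quoted-phrase AND format are assumptions, since the exact query syntax used in each database is not reported here.

```python
from itertools import product

# Topic and intervention terms exactly as listed in the search strategy.
TOPIC_TERMS = [
    "commercial sexual exploitation of children",
    "commercial sexual exploitation of youth",
    "domestic minor sex trafficking",
    "youth prostitution",
    "child prostitution",
    "child sexual exploitation",
]
INTERVENTION_TERMS = ["intervention", "prevention", "policy", "program"]


def build_queries(topics, interventions):
    """Pair every topic term with every intervention term.

    Hypothetical helper: the AND/quoted-phrase syntax is an assumption,
    not the documented syntax of any particular database.
    """
    return [f'"{t}" AND "{i}"' for t, i in product(topics, interventions)]


queries = build_queries(TOPIC_TERMS, INTERVENTION_TERMS)
print(len(queries))  # 6 topic terms x 4 intervention terms
print(queries[0])
```

Enumerating the pairs makes the size of the search space explicit: six topic terms crossed with four intervention terms give 24 distinct term combinations per database.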

Article Selection and Data Abstraction

The search processes identified more than 4000 documents. After reviewing document titles and abstracts for relevance, the number of documents for potential inclusion was reduced to 34. We applied inclusion and exclusion criteria (described below) based on a thorough review of available content, which resulted in 12 documents (containing 13 studies) that met our criteria for the review.

Inclusion and Exclusion Criteria

Identified documents had to meet the following criteria for inclusion in the review:

  • The document described a program or policy that 1) aimed to prevent sexual exploitation or commercial sex among children or youth; or 2) aimed to intervene with children or youth who have engaged in commercial sex; or 3) identified children or youth who had been sexually exploited in some capacity. If different but synonymous terms to describe sexual exploitation were used, for example, domestic minor sex trafficking or youth prostitution, the article was included.

  • The target audience for the program or policy included children/youth, service professionals working with affected youth, buyers of sex from youth, or other exploiters. Youth was defined broadly to include persons up to age 24 (United Nations 2013).

  • The document provided evaluation data about the program or policy that addressed at least one of the guiding questions for the review (see Table 1).

  • The program or policy’s setting was a developed country or region (e.g., the United States, Canada, the United Kingdom, Australia), since the conditions under which youth engage in commercial sex may be significantly different in non-western/developing countries and thus, associated programs and policies would not be amenable to synthesis.

Reports were excluded from the review if either of the following applied:

  • The program or policy focused on international sex trafficking occurring in the United States (i.e., young people trafficked from one country to another by a third party).

  • The program or policy addressed commercial sex among adults in general, without a specific focus on children or youth.

Coding and Data Abstraction

A coding guide was created to review articles and abstract key findings for synthesis (sheet available upon request from the first author). The main sections of the coding guide were as follows: Report Characteristics, Program Characteristics, Sample and Research Design, and Evaluation Data. Report Characteristics documented the publication year, outlet type, and discipline of the primary author. Program Characteristics documented the name of the program or policy, the target audience (e.g., at-risk youth, youth victims, service providers, buyers of sex), and the level of prevention (i.e., universal: targets the general population; selective: targets at-risk groups; or indicated: targets those already engaged in the behavior; Institute of Medicine 1994). Sample and Research Design documented information about sample size and demographics, method(s) of data collection (quantitative, qualitative, mixed, or process/“lessons learned”), and the type of data (e.g., self-report, record review, individual interview). Evaluation Data documented the key components and findings of the evaluation from the perspective of the RE-AIM framework and our associated guiding questions.
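The four sections of the coding guide can be pictured as a single record per study. The sketch below is a hypothetical representation, not the actual coding sheet: the class name CodedStudy, its field names, and the missing_dimensions helper are all assumptions introduced for illustration, and the example record is entirely made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

RE_AIM_DIMENSIONS = ("reach", "effectiveness", "adoption", "implementation", "maintenance")
PREVENTION_LEVELS = ("universal", "selective", "indicated")


@dataclass
class CodedStudy:
    """One coded evaluation; all field names are illustrative assumptions."""

    # Report Characteristics
    publication_year: int
    outlet_type: str
    primary_author_discipline: str
    # Program Characteristics
    program_name: str
    target_audience: List[str]
    prevention_levels: List[str]  # a program may target more than one level
    # Sample and Research Design
    sample_size: Optional[int]
    collection_method: str  # quantitative, qualitative, mixed, or lessons learned
    data_types: List[str]
    # Evaluation Data, keyed by RE-AIM dimension
    findings: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self):
        # Reject prevention levels outside the three-level scheme.
        unknown = set(self.prevention_levels) - set(PREVENTION_LEVELS)
        if unknown:
            raise ValueError(f"Unknown prevention level(s): {sorted(unknown)}")

    def missing_dimensions(self) -> List[str]:
        """RE-AIM dimensions for which no finding was abstracted."""
        return [d for d in RE_AIM_DIMENSIONS if not self.findings.get(d)]


# Entirely made-up record, for illustration only.
example = CodedStudy(
    publication_year=2012,
    outlet_type="journal article",
    primary_author_discipline="social work",
    program_name="Example Program",
    target_audience=["at-risk youth"],
    prevention_levels=["selective", "indicated"],
    sample_size=25,
    collection_method="mixed",
    data_types=["self-report", "record review"],
    findings={"effectiveness": "placeholder", "implementation": "placeholder"},
)
print(example.missing_dimensions())
```

Keying the evaluation data by RE-AIM dimension makes gaps in reporting directly visible for each study, which is the comparison the review relies on.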

Results

We first report the overall descriptive characteristics of the studies included in the final review in order to provide context for our findings in terms of the RE-AIM framework. The 12 documents included in the analysis yielded 13 studies for review. Most of the documents were peer-reviewed journal articles (n = 8). The remaining four consisted of two governmental reports, a non-governmental publication, and an unpublished manuscript. Of the 13 studies, 11 evaluated a program and the remaining two evaluated a policy (see Table 2 for a brief description of each program or policy). Only five explicitly identified a guiding theory or conceptual framework for the program or policy (e.g., community organizing/coalition building principles, the Transtheoretical Model, and developmental traumatology coupled with resilience theory). Many of the programs targeted more than one level of prevention simultaneously: 23.1 % (n = 3) applied a universal approach, 53.8 % (n = 7) a selective approach, and 84.6 % (n = 11) an indicated approach. Sixty-two percent (n = 8) of the programs or policies focused on at-risk youth and/or youth with a history of sexual exploitation; 8 % (n = 1) focused on offenders; and the remaining 30 % (n = 4) targeted various service providers, including medical, mental health, legal, and others. One study evaluated the Trafficking Victims Protection Act (TVPA; Adams et al. 2010), a policy that aims to protect children through the prosecution of offenders of commercial sexual exploitation of children (under 18 years old). One program, the Hmong Youth Task Force, included members of the general community (Saewyc et al. 2007).
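The prevention-level percentages above are each computed against the 13 studies, so they are not expected to sum to 100 %. A minimal sketch of that arithmetic, using only the counts reported in the text (the variable and function names are illustrative):

```python
TOTAL_STUDIES = 13

# Counts as reported; a single program can appear under several levels.
prevention_counts = {"universal": 3, "selective": 7, "indicated": 11}


def percent(count, total):
    """Share of studies, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {level: percent(n, TOTAL_STUDIES) for level, n in prevention_counts.items()}
print(shares)

# The level counts add up to more than the number of studies,
# which reflects programs coded at more than one prevention level.
print(sum(prevention_counts.values()), "level codes across", TOTAL_STUDIES, "studies")
```

The 21 level codes across 13 studies confirm that the categories overlap rather than partition the sample.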

Table 2 Inventory of studies selected for inclusion

Sample and Research Design

Sample and research design characteristics of studies are summarized in Table 2. The median sample size of the evaluations (including comparison/control group participants, if any) was 25. Fewer than half of the articles included data describing the race/ethnicity of participants and none provided information about participating youths’ sexual identity/orientation. Ten of the studies were quasi-experimental (i.e., lacking random assignment), though only three used a control or comparison group; the remainder used a one-group pretest/posttest design. The remaining three studies were non-experimental (i.e., descriptive/exploratory). No studies used a true experimental research design involving random assignment. One study evaluating a child protective services staff training attempted random assignment with the intention of delaying the control group’s participation in the training until after the evaluation research was complete. Randomization was unsuccessful, however, as many control participants switched to the treatment group due to scheduling conflicts with the control group’s scheduled training date (McMahon-Howard and Reimers 2013).

Six of the 13 studies utilized a multiple methods approach, one a purely qualitative approach, another a “lessons learned” approach, and the remaining five, a quantitative approach. Data were collected for the 13 studies in a variety of ways, including surveys (n = 9), medical and case record reviews (n = 4), observations (n = 3), interviews (n = 8), focus groups (n = 4), and other secondary sources (n = 2).

Findings of the evaluations with respect to the components of the RE-AIM framework are summarized in Table 3.

Table 3 Program and policy evaluation findings by RE-AIM categories

Reach

In general, minimal relevant data about the program or policy’s reach (or planned reach) were provided in the documents reviewed. Notably, none of the documents provided information on whether the studies’ samples represented the larger population of potential participants. Several of the documents did not provide key data on the program/policy’s sample response rate and participant demographics. Some of the studies excluded participants for reasons that were not made clear and that seem counterintuitive, given the needs of those affected by commercial sexual exploitation. One example is the evaluation of the GRACE program, in which transgender females were excluded from the study sample, although it has been documented that transgender women are disproportionately affected by violence and, in particular, sexual exploitation (Dank et al. 2015).

Effectiveness

Outcome data were shared for nearly all of the studies. Evidence of favorable outcomes for participating youth was reported in seven of the eight studies reporting on outcomes of this type. These outcomes included increases in employment and higher educational aspirations (Cohen et al. 2011; Saewyc and Edinburgh 2010), decreases in arrest (for both LIFESKILLS and GRACE programs; Cohen et al. 2011), increased family connectedness and support (Pierce 2012; Saewyc and Edinburgh 2010), and increased knowledge about when to use condoms (Edinburgh and Saewyc 2009). While some of the studies reviewed demonstrated decreases in drug and alcohol use (e.g., Pierce 2012; Saewyc and Edinburgh 2010), others did not demonstrate statistically significant changes (e.g., Cohen et al. 2011). Murphy et al. (2016) noted improved attitudes and increased knowledge about the problem of CSECY, whereas Cohen et al. (2011) did not.

Positive outcomes for service professionals were documented in all five studies that reported on outcomes of this type (Adams et al. 2010; Ferguson et al. 2009; McMahon-Howard and Reimers 2013; Saewyc et al. 2007; Swann and Balding 2002) and in the two evaluations that reported on offender prosecutions (Adams et al. 2010; Swann and Balding 2002). Evidence of the positive impact of a community task force (Saewyc et al. 2007) and of the perceived benefits of a countrywide policy (Swann and Balding 2002) was also reported.

Adoption

Most of the documents did not report data pertaining to issues of program or policy adoption other than a basic overview of the type of organization or area in which a program or policy was implemented. This may be a result of the programs being implemented in one, rather than multiple, settings; but without more detail, it is difficult to discern the breadth of the adoption of a program or policy. For example, additional sites may have been approached to implement a program or policy, but were unwilling or unable to do so. This information about what did not happen is useful and important to report. Staff within adopting sites may also vary in their readiness and willingness to implement, another important detail to report. One of the studies that did provide useful adoption data was Swann and Balding’s (2002) report on a countrywide policy in England.

Implementation

Most of the documents reviewed provided some findings related to implementation; however, program fidelity was rarely discussed (Cohen et al. 2011 is an exception). Data on implementation were most often related to 1) the number of sessions that participants completed (e.g., Cohen et al. 2011), 2) the ways in which the program or policy was adapted over time to meet the needs of participants (e.g., adding “fun” group sessions to LIFESKILLS; Cohen et al. 2011), or 3) opportunities for participants to take on leadership roles as they became comfortable (Hickle and Roe-Sepowitz 2014). Factors hindering implementation and program/policy “success” were addressed by some of the studies, including the need for a consistent definition of CSEC with respect to the Trafficking Victims Protection Act (TVPA; Adams et al. 2010) and the need for consistent member engagement in the Hmong Youth Task Force (Saewyc et al. 2007). The authors frequently shared data indicating that participants enjoyed, and/or believed they benefited from, the program or policy (e.g., Murphy et al. 2016; Thomson et al. 2011; Cohen et al. 2011).

Maintenance

The reviewed documents provided minimal information related to the domain of maintenance. In general, authors (who were typically also the evaluators) frequently provided suggestions for improving the program, but did not discuss whether and how the program would be maintained. Although some authors described challenges to sustaining the program or policy under evaluation, there was little discussion of plans for obtaining funding or other resources required for continuing the work. Saewyc and Edinburgh (2010) did find a number of sustained positive outcomes (increases in protective factors and decreases in risk factors) for the youth participants at 6 months and 12 months post-intervention. Hickle and Roe-Sepowitz (2014) discussed plans for a quasi-experimental follow-up evaluation to their non-experimental “lessons learned” study, and Saewyc et al. (2007) described future priorities for their task force as well as a belief that the work would continue indefinitely in some capacity.

Discussion

Only a limited number of documents were available to include in our review, and a number of them lacked detailed information for each of the RE-AIM categories. Collectively, however, the work reviewed here is an important early step in understanding potentially effective approaches to addressing CSECY. Available funding is limited and competitive, and there is a dearth of information on what programs and policies might work in specific contexts. Despite this, we applaud the efforts of the researchers and practitioners reviewed in this paper who aimed to make a difference, often with little financial and administrative support.

Key Areas for Improvement

In comparison to the fairly large number of programs and policies that appear to have been put in place to target the issue of CSECY, this review points to a relative paucity of evaluation data available on these efforts.

Need for More Comprehensive Evaluations

The RE-AIM framework used in our review served to identify a number of areas that merit attention in future evaluations of efforts to address CSECY. First, we must consistently conduct evaluations of programs and policies, as evaluations provide data to determine the overall utility of an intervention (CDC 1999). Second, there is a need for more comprehensive evaluations of existing programs and policies, as many of the reviewed documents lacked complete evaluation data for each of the RE-AIM domains. Saewyc and Edinburgh’s (2010) evaluation of the Runaway Intervention Program is the exception to this point and provided data for all five RE-AIM domains, allowing the reader to gain a more complete sense of the study. As with all of the studies reviewed, however, some data that would be useful in fully understanding the program’s utility are omitted or incompletely described.

Reach

Few of the documents reviewed provided detailed information on the sample and its representativeness of the specified population, though this information may have been largely unknown to the researchers given the dearth of incidence and prevalence data on the phenomenon. We recommend that the best estimate of a program or policy’s reach be included in evaluation reports. Additionally, there is a need to gather and report individual-level descriptive data on study participants in order to inform current and future programs and policies that target more diverse groups (e.g., buyers of sex, parents of CSECY, and lesbian, gay, bisexual, transgender, and queer youth). This information is particularly useful given current attention to the relevance of these groups in terms of addressing CSECY (e.g., Clayton et al. 2013).

Effectiveness

Given the limitations of the study designs and sample sizes in the reviewed documents, it is premature to reach any conclusions about the effectiveness of the programs or policies involved. Additionally, many of the evaluations reviewed lacked methodological rigor, making it difficult to determine effectiveness. Overall, the studies employing solely or heavily quantitative designs used the most rigorous research methods. The one qualitative study and the mixed and multiple methods studies fell short of qualitative research standards in specific ways. First, none of the authors thoroughly described the key components of a sound qualitative, multiple method, or mixed methods research study, including (a) the researchers’ critical reflection on their role as a data collection instrument, (b) the study design (e.g., narrative, ethnography, case study) and its fit with the research question, and (c) the steps taken in analysis (Creswell 2014; Creswell and Miller 2000). Second, the authors neglected to validate the data through accepted procedures, including triangulation and member checking (Charmaz 2014; Creswell 2014; Creswell and Miller 2000). These findings point to a clear need for future CSECY program or policy evaluation studies employing qualitative methods (which we recommend be used when possible) to incorporate more rigorous data collection and analysis procedures.

Another important methodological consideration that is relevant for both overall evaluation and the specific domain of effectiveness is the use of experimental research designs when evaluating programs or policies targeting CSECY. The challenge of doing so is evident in McMahon-Howard and Reimers’ (2013) unsuccessful attempt at random assignment that was discussed above. The inability to keep the study’s treatment and control groups separated highlights the complexities of conducting experimental research on the issue of CSECY. It also points to the ethical imperative that those in need of an intervention should not be required to wait to participate. In the case of the evaluated child protective services staff training (McMahon-Howard and Reimers 2013), it was more important for all of the participants to complete training at a time that was convenient for them than to keep the random assignment intact.

Adoption

Few of the reviewed documents provided information on the setting(s) in which the program or policy was implemented and the staff’s willingness to support implementation. We recommend that these data be included in future evaluation reports, since they are useful both for understanding the context of the program/policy and for determining whether the program/policy could be adopted in another setting. There is also a need to focus evaluation efforts on policies that have been implemented in a variety of settings, as only two of the 13 studies were policy evaluations.

Implementation

The documents reviewed provided detailed implementation data. These data were mostly related to participant satisfaction and length of time in the program or number of sessions completed. While this information is useful for evaluation, we recommend that measures of fidelity also be included in evaluation reporting in order to gain a complete understanding of the implementation process, including when and how adaptations were made.

Maintenance

Most of the documents reviewed did not report on the ways in which the program/policy’s outcomes will be maintained over time or what changes will be made in future iterations. While it may not be feasible to gather data after a program/policy has ended, it is still useful for evaluators to consider what data would be ideal to collect and to delineate plans for continuation of the program/policy over time.

External Validity

Scholars have drawn attention to the under-emphasis on external validity in evaluation, in favor of attention to internal validity (Green and Glasgow 2006; Glasgow et al. 1999, 2006). The findings of this review highlight the need to conduct thorough evaluations that not only focus on effectiveness in ideal situations (internal validity), but also gather and report data and contextual information that speak to the potential for transferring programs and policies to other settings (external validity). We believe the RE-AIM framework can be instructive for all evaluators who wish to conduct thorough evaluations with the potential for translation to other settings and populations.

Limitations of the Current Review

There are several limitations of our review that should be discussed. First, we were limited to the findings and other information that were included in the evaluation reports. It is possible that additional data, such as fidelity of program implementation, were collected and analyzed, but not included in the published documents. It is also worth noting that the majority of the evaluations were published in scholarly journals, many of which have page or word limits. This may explain why the journal articles, in particular, did not thoroughly address all of the RE-AIM dimensions. We believe, however, that data for each of these categories should be reported in journal articles, even if only briefly, as well as through other means (e.g., supplemental on-line materials, scholarly or practice-based research reports) so that external validity may be assessed and the work translated for practical applications. Second, although a multi-channel literature search process was used, it is possible that some relevant evaluations were overlooked. If this is the case, the findings of our review may not be truly representative of the full breadth of available evaluation literature on CSECY programs and policies. Finally, because this is a newer area in program and policy development and implementation, there may be work that is being conducted without funding to support formal evaluations and/or the systematic sharing of evaluation data (e.g., journal articles, reports). It would be extremely useful to have a central location where practitioners and researchers could access data from ongoing or completed evaluations of programs and policies targeting CSECY. Such a resource would serve to (1) encourage and provide space for the reporting of evaluation data, even in an informal sense, (2) share evaluation information that may strengthen developing and existing programs and policies, (3) connect those working on similar initiatives, and (4) prevent unnecessary duplication of efforts.

Concluding Observations

We have proposed that there is much to be gained from more comprehensive and rigorous evaluations of programs and policies to address CSECY and have provided a number of specific recommendations that may be helpful. We also believe that the limitations noted, when considered collectively, are potentially indicative of larger, more systemic issues. One of these issues may be a tendency for funders, program developers, researchers, and others to champion efforts to “save” children from commercial sexual exploitation by any means that seem appropriate. Such a response is understandable given the raw emotions and sense of urgency to “do something” that are evoked by the realities of the commercial sexual exploitation of children and youth. However, this also means that programs and policies to combat CSECY may be rolled out without the necessary attention to, and provisions for, evaluation planning and support. A second, broader concern is the reality that in contrast to many of the other significant issues confronted by young persons (e.g., mental health problems, academic difficulties), commercial sexual exploitation is not likely to be within the realm of the direct experience or knowledge of most policy-makers and funders. As such, we suspect that CSECY is less well-positioned than many other “causes” to attract the sustained investment required to support the top-flight evaluation work that we are recommending. Thus, we are excited to see the growing number of interventions for both the general population and at-risk and affected youth, as well as governmental and other sources of support for them. We hope that funders, program developers, and researchers will understand the need for such efforts to be informed by strong evaluation data and reporting, as well as the broader systemic factors that may present formidable barriers to advancing the availability of well-evaluated programs and policies.