Introduction

Psychological treatments for children and adolescents have been given less attention than those implemented in the adult population. In many cases, psychological interventions involving children and adolescents were designed as adaptations of those of adults (Jacobs et al. 2008) when in clinical practice it can be verified that, for example, a child suffering from depression has specific characteristics that differ greatly from those of adults in terms of the etiology, symptoms, evolution and treatment of this disorder. In their comprehensive review of the literature on the treatment of adolescents, Weisz and Hawley (2002) examined 25 empirically supported psychotherapies that have been used in children and adolescents. According to these authors, 14 of the 25 therapies have been shown to be effective in adolescents. Interestingly, seven are downward adaptations of treatments originally designed for adults and six are upward adaptations of treatments originally designed for children, leaving only one that was developed specifically for adolescents. In conclusion, few of the 14 empirically supported treatments that have been used in adolescents were designed with a focus on the primary developmental task of adolescence (Holmbeck et al. 2010).

Interest in therapies for children and adolescents began a little later than Eysenck’s influential work (1952), which questioned the benefit of psychotherapies, and the subsequent meta-analyses of Smith and Glass (1977) and Shapiro and Shapiro (1982), which supported the beneficial effects of psychotherapy in adults. In this regard, Casey and Berman (1985) published a meta-analysis of child treatment studies, concluding that “the evidence from this review suggests that previous doubts about the overall efficacy of psychotherapy with children can be laid to rest” (p. 388). Later, Weisz and colleagues conducted two meta-analyses of psychotherapy studies with children (Weisz et al. 1987, 1995). These studies were the first to provide empirical evidence that the effects of child psychotherapy appear to differ depending on a variety of factors, including the child’s problem and the type of therapy (Southam-Gerow and Prinstein 2014). Recently, Weisz et al. (2017) have performed a new meta-analysis of child and adolescent treatment studies encompassing the last five decades, concluding that youth psychological therapy has a beneficial effect of moderate magnitude and is relatively durable over time, although this effect depends on the child’s problem, the type of therapy used, the control condition employed and who reports the outcome.

The American Psychological Association (APA) Task Force on Promotion and Dissemination of Psychological Procedures, which included professionals from the private health sector, the public health system, researchers and users, made a significant effort to systematically define how psychological treatments should be evaluated. The task force published several reports (Chambless and Hollon 1998; Chambless and Ollendick 2001; Chambless et al. 1996, 1998) with lists of evidence-based treatments based on criteria to assess randomized controlled trials (RCTs) using control groups following standardized treatment guidelines (APA 2006). Criteria began to be developed to clearly define empirically supported treatments (ESTs) for mental health disorders (Barlow 1996; Seligman 1995; Shapiro 1996).

Possibly one of the major contributions of the list of ESTs has involved the creation of institutions that act as mediators between research and clinical practice, as well as the establishment of explicit criteria for judging the quality of evidence of the various interventions. This mediation entails both the evaluation of evidence (through selective reviews guided by criteria) and the transfer of information (through publications, books, manuals, training courses, etc.) to the different stakeholders involved (psychologists, patients, health institutions and the general public). However, the institutions that evaluate the evidence often use different criteria and degrees of assessment, thus suggesting that the reliability among lists is significantly different in terms of how they are constructed and analyzed (Primero and Moriana 2011).

The evidence concerning psychosocial treatments for children and adolescents experiencing behavioral health problems is building up at an impressive rate (Southam-Gerow and Prinstein 2014). For the period 1965–2009, Chorpita et al. (2011) identified over 750 treatment protocols from 435 studies on child and adolescent mental health. Moreover, in the last few decades, professionals and stakeholders have shown a growing interest in psychosocial treatments that have been found to ameliorate child and adolescent clinical disorders (Silverman and Hinshaw 2008), and several authors have proposed different criteria to evaluate the evidence of psychological treatments in children and adolescents (Chorpita et al. 2011; Kazdin and Wilson 1978). In addition, the Society of Clinical Child and Adolescent Psychology of the APA (Lonigan et al. 1998; Silverman and Hinshaw 2008; Southam-Gerow and Prinstein 2014) and other organizations (e.g., National Institute for Health and Care Excellence, Australian Psychological Society and Cochrane Collaboration) have made different proposals in this regard, although agreement among them is not unanimous.

The present study therefore aims to analyze and compile lists of evidence-based psychological treatments in children and adolescents by disorder using data provided by RCTs, meta-analyses, guidelines and systematic reviews of the Society of Clinical Child and Adolescent Psychology of the APA, the National Institute for Health and Care Excellence (NICE), the Australian Psychological Society (APS) and Cochrane Collaboration. The data were then reviewed to compare the criteria, levels of evidence and lists of these organizations with the aim of analyzing the level of agreement among them.

These four organizations were selected for the review for the following reasons. The Society of Clinical Child and Adolescent Psychology of the APA is a leading international organization which promotes evidence-based psychological treatments in children and adolescents. NICE and Cochrane Collaboration are international organizations that provide guidance on all kinds of evidence-based therapies for a wide range of health disorders, and the APS provides clear and rigorous information about the efficacy of a broad range of psychological interventions across mental disorders.

Method

The method used in this review conforms to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Moher et al. 2009).

Description of the Organizations Included in the Study

Society of Clinical Child and Adolescent Psychology (Division 53) of the APA

APA is the leading scientific and professional organization representing psychology in the USA. APA’s 54 divisions are interest groups organized by members. Some represent subdisciplines of psychology (e.g., clinical psychology), while others focus on thematic areas such as aging or ethnic minorities. The Society of Clinical Child and Adolescent Psychology (Division 53) includes APA members who are active in practice, research, teaching, administration and/or conduct studies in the field of clinical child and adolescent psychology. The mission of Division 53 of the APA (D53) is to promote the advancement of clinical child and adolescent psychology by integrating its scientific and professional aspects, and promoting scientific inquiry, training, and professional practice in clinical child and adolescent psychology as a means of improving the mental health of children, adolescents and families. The D53 Web site (www.effectivechildtherapy.com) informs the general public about research evidence for psychological treatments in this age group.

Evidence-based treatment reviews have appeared in the Journal of Clinical Child and Adolescent Psychology (JCCAP) over the past two decades and have also been disseminated on the D53 Web site. In 1998, Lonigan et al. (1998) published a special issue on empirical support for specific psychological treatments. Some years later, Silverman and Hinshaw (2008) published a second special issue of evidence-based treatment updates. Due to the large number of new treatment studies, the D53 Board of Directors determined that a decennial review of the evidence base was insufficient to keep up with the rapidly accumulating evidence (Southam-Gerow and Prinstein 2014). Therefore, a new special issue focusing on evidence-based treatments was published in 2014 (Southam-Gerow and Prinstein 2014) and D53 aimed to publish updates on evidence-based treatments for various child and adolescent problems more regularly.

D53 currently classifies levels of evidence into five levels. To be considered a Level One treatment (also defined as “Works well” or “Well-established treatments”), at least two large-scale RCTs must have demonstrated the superior efficacy of the treatment to some other treatments and the studies must have been conducted by independent investigatory teams working in different research settings. Level Two therapies (also defined as “Works” or “Probably efficacious therapies”) have strong research support, but may not have been tested by different or independent teams. In Level Three therapies (also defined as “Might work” or “Possibly efficacious therapies”), there may be one study showing that the treatment is better than no treatment, or there may be a number of smaller clinical studies without all of the appropriate procedural controls. Level Four therapies (also defined as “Unknown,” “Untested” or “Experimental therapies”) may be in use, but have not been studied carefully. For some child/adolescent symptoms or disorders with limited therapy options, a treatment at this level could be worth considering. Finally, Level Five therapies (also defined as “Does not work” or “Tested but did not work”) have been tested in well-designed studies and have not yet shown positive results or have been shown to make symptoms or behaviors worse. A therapy currently listed as Level Five would not be a good treatment option.

National Institute for Health and Care Excellence (NICE)

NICE is an organization that is responsible for providing evidence-based guidance on health and social care to the National Health Service (NHS) in the UK, and works closely with other organizations such as NHS England, Public Health England or Health Education England. NICE publishes clinical guidelines, technology appraisal guidance, interventional procedures guidance and public health guidelines that make evidence-based recommendations on a wide range of health, public health and social care topics. Its competences range from providing information, education and advice to launching campaigns and prevention programs for specific treatments for primary, secondary and specialized services covering all medical specialties. Each NICE guideline is developed by a different committee of experts, which includes members from clinical practice, public health and social care. In addition, all committees include at least two lay members, who can be patients, caregivers, service users or the general public. The committees conduct systematic reviews and network meta-analyses for evaluating and comparing the benefits and cost-effectiveness of the different forms of treatment included in each guideline. The process to develop each guideline usually takes between 18 and 24 months, although there are “short clinical guidelines” that take between 11 and 13 months to produce and are generally used in cases where the development of a guideline on an emerging problem is considered urgent. NICE classifies evidence by level in a hierarchy which is similar to that of D53, although different criteria are used. Level I includes the type of evidence obtained from meta-analyses and RCTs (at least one) and corresponds to recommendation grade “A”; level II includes evidence from at least one controlled study without randomized groups, or a quasi-experimental study, and corresponds to grade “B”; level III, which includes descriptive studies (or those which do not fully meet the criteria in levels I and II), also corresponds to grade “B”; and level IV, which includes evidence obtained from expert committee reports or opinions and/or clinical experiences, corresponds to grade “C.” More recently, the NICE guidelines have incorporated the GRADE system for rating clinical guidelines (Atkins et al. 2004). The GRADE system classifies levels of evidence as high quality (further research is very unlikely to change our confidence in the estimate of the effect); moderate quality (further research is likely to have an important impact on our confidence in the estimate of the effect and may change the estimate); low quality (further research is likely to have an important impact on our confidence in the estimate of the effect and is likely to change the estimate) and very low quality (any estimate of effect is very uncertain).

Cochrane Collaboration

This organization comprises a network of researchers, practitioners, patients and caregivers from over 130 countries working cooperatively to provide evidence-based data in order to facilitate decision making about which treatment to choose for a particular disorder or health problem. The Cochrane Collaborators are affiliated to the organization through Cochrane groups, which are review groups related to health topics, thematic networks, groups involved in the methodology of systematic reviews and regional centers. These groups are established around the world, and most of their work is done online. Each group is a “mini-organization” in itself, with its own funding, Web site and workload. Based on their interests, experience or geographical location, collaborators join a group or, in some cases, various groups. The Cochrane groups perform systematic reviews and meta-analyses of specific health topics on all kinds of diseases. The reviews provide a summary of the results of available studies, mainly RCTs, which present information about the effectiveness of interventions in a specific health topic. Cochrane reports on evidence for and against treatments, treatment efficacy and treatment comparison studies to facilitate decision making in health care. Like NICE, Cochrane has also recently incorporated the GRADE model (Atkins et al. 2004) as criteria to determine the quality of evidence.

Australian Psychological Society (APS)

The APS is the premier professional organization for psychologists in Australia. The functions of the APS are conducted through more than 201 active member groups within the society. Each group consists of an elected committee that meets regularly and organizes activities, such as professional development. Evidence-based practice has become a central issue in the delivery of health care in Australia and, as such, government-sponsored health programs require the use of treatment interventions that are evidence-based as a means of discerning the allocation of funding.

The National Health and Medical Research Council (NHMRC) of Australia has published a guide for evaluating evidence and developing clinical practice guidelines. The NHMRC guide informs public health policy in Australia and has been adopted as a protocol for evidence reports by the APS. The NHMRC has developed a rating scale to designate the level of evidence of clinical studies: Level I, systematic review of all relevant randomized controlled trials; Level II, at least one properly designed randomized controlled trial; Level III-1, well-designed pseudo-randomized controlled trials (alternate allocation or some other method); Level III-2, comparative studies with concurrent controls and non-randomized allocation (cohort studies) or interrupted time series with a control group; Level III-3, comparative studies with historical control, two or more single-arm studies or interrupted time series without a parallel control group; and Level IV, case series, either post-test or pre-test and post-test.

APS has published a comprehensive review of the available evidence up to January 2010, which examines the efficacy of a broad range of psychological interventions across mental disorders affecting adults, adolescents and children (APS 2010). This review of the literature examining the efficacy of a broad range of psychological interventions for the ICD-10 mental disorders has been undertaken to support the delivery of psychological services under government mental health initiatives. To determine the level of evidence of the treatments included in the review, APS uses the criteria developed by NHMRC mentioned above.

Search Strategy

We first consulted the Web sites of the organizations described above (APA, Division 53, www.effectivechildtherapy.org; NICE, www.nice.org.uk; Cochrane, www.cochrane.org; and APS, www.psychology.org.au) to gather all the treatments, disorders and levels of evidence they report for children and adolescents. In a second stage, we collected the RCTs, reviews and meta-analyses presented by each organization. The last date of access and updated information uploaded by the organization was October 15, 2017.

Inclusion and Exclusion Criteria

Owing to the sheer number of related disorders and treatments, we selected as our inclusion criteria only those investigated in children and adolescents. Problems related to health psychology, learning disorders, speech disorders, personality disorders, substance abuse, self-harm, body-focused repetitive behaviors and drug therapies were excluded. In the case of Cochrane, the following types of reviews were also excluded: reviews of specific sectors of the population (e.g., psychological interventions for depression in adolescents and adults with congenital heart disease), prevention reviews, reviews on assessment tools, systematic reviews of studies on specific non-psychological procedures (i.e., cranial magnetic stimulation or electroconvulsive therapy), systematic reviews of studies assessing diagnostic test accuracy and the protocols for reviews.

Data Collection Process

Treatment recommendations for the disorders addressed in this study can be found in the Results section. Information on the evidence provided by the different organizations for each treatment is specified in the tables, while the box corresponding to treatments for which there is no reference to evidence is left blank. When an organization deems that there are not enough studies to consider the treatment effective, we use the term “Insufficient Evidence.” In addition, next to the level of evidence we specify the number of RCTs and meta-analyses or systematic reviews that each organization has used to reach its conclusions.

Accordingly, in the row corresponding to D53 we classify the quality of the evidence of a particular treatment as Level One, Level Two, Level Three, Level Four or Level Five. In the row corresponding to NICE, we specify the grade of recommendation (A, B, C) for posttraumatic stress disorder and obsessive–compulsive disorder, or the level of evidence according to the GRADE criteria (high, moderate, low and very low) for other disorders included by this organization. Moreover, the updated guideline for attention deficit hyperactivity disorder (ADHD) (NICE 2013a) does not report the level of evidence of behavioral classroom management (BCM) and organization training (OT). Consequently, we only indicate whether these treatments are considered effective, non-effective or if there is insufficient evidence, without specifying the level of effectiveness of the treatments in the tables. Finally, some treatments are accompanied by the indication “no research support” or, when appropriate, “advised against using.” For Cochrane, we opted to show the data exactly as it appears in the systematic reviews obtained from the system. Specifically, for all the reviews conducted after 2012 and those of Reichow et al. (2012), Storebø et al. (2011) and Krisanaprakornkit et al. (2010), we indicate the level of evidence according to the GRADE criteria, while for other reviews we indicate whether a particular treatment is effective or non-effective. Regarding APS, we specify the levels of evidence according to the criteria used by the organization itself, which are described above (Level I, Level II, Level III-1, Level III-2, Level III-3 and Level IV).

Finally, the total number of organizations that report a given therapy as being effective is shown in the tables. For this purpose, we have considered that a therapy is deemed effective by an organization in the following cases. D53: Level One, Level Two, Level Three and Level Four; NICE: A, B, C, high, moderate, low, very low or effective; Cochrane: high, moderate, low, very low or effective; APS: Level I, Level II, Level III-1, Level III-2, Level III-3 or Level IV.
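The counting rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not the procedure actually used to build the tables; the names `EFFECTIVE_LABELS` and `count_effective` and the example ratings are hypothetical.

```python
from typing import Dict, Optional

# Labels that count as "effective" for each organization, as listed in the text.
EFFECTIVE_LABELS = {
    "D53": {"Level One", "Level Two", "Level Three", "Level Four"},
    "NICE": {"A", "B", "C", "high", "moderate", "low", "very low", "effective"},
    "Cochrane": {"high", "moderate", "low", "very low", "effective"},
    "APS": {"Level I", "Level II", "Level III-1", "Level III-2",
            "Level III-3", "Level IV"},
}


def count_effective(ratings: Dict[str, Optional[str]]) -> int:
    """Count the organizations whose rating of a therapy counts as effective.

    `ratings` maps an organization name to the label it reports, or to None
    when the organization does not report on the therapy (a blank table cell).
    """
    return sum(
        1
        for org, label in ratings.items()
        if label in EFFECTIVE_LABELS.get(org, set())
    )


# Hypothetical ratings for a single therapy (not taken from the tables).
example = {"D53": "Level One", "NICE": "low", "Cochrane": None, "APS": "Level II"}
print(count_effective(example))  # 3
```

Labels such as “Level Five,” “Insufficient Evidence” or “non-effective” are simply absent from the sets, so they do not add to the count.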

Statistical Analysis

To analyze agreement among organizations, we have classified the different levels of evidence proposed by each organization into an ordinal scheme as no evidence, weak evidence, moderate evidence and strong evidence (see Table 1). In the case of NICE for autism and D53 for autism, depression and disruptive disorder, where different levels of evidence may appear for a treatment (see Tables 3, 5 and 6, respectively), we have used the higher level of evidence.

Table 1 Ordinal scheme to classify the different levels of evidence

The intra-class correlation (ICC) is one of the most commonly used statistics for assessing inter-rater reliability (IRR) for ordinal, interval and ratio variables (Hallgren 2012). The ICC is suitable for this type of measurement because it evaluates the reliability of the ratings by comparing the variability of the different ratings of the same treatment with the total variation across all ratings and treatments. As in the previous study of Moriana et al. (2017), IRR was assessed using a two-way mixed, consistency, average measures ICC to determine the level of agreement among the four organizations for each diagnosis, taking into account only those therapies considered effective by at least one institution.

According to Hallgren (2012), higher ICC values suggest a greater IRR, with an ICC estimate of 1 indicating perfect agreement and 0 indicating only random agreement. Moreover, this author states that negative ICC estimates indicate systematic disagreement, and some ICCs may be less than −1 when there are three or more coders. The cutoffs proposed by Cicchetti (1994) for the qualitative rating of agreement based on ICC values were used, with IRR being poor for ICC values less than .40, fair for values between .40 and .59, good for values between .60 and .74 and excellent for values between .75 and 1.
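As a minimal sketch of how such an estimate can be obtained, the following Python code computes a two-way mixed, consistency, average measures ICC from its ANOVA mean squares, ICC = (MS_rows − MS_error) / MS_rows, and applies the Cicchetti (1994) cutoffs. It is not the software used in this study, which is not specified here. The function names are hypothetical, and the example assumes the ordinal scheme is coded numerically from 0 (no evidence) to 3 (strong evidence).

```python
from typing import Sequence


def icc_consistency_average(ratings: Sequence[Sequence[float]]) -> float:
    """Two-way mixed, consistency, average measures ICC.

    ratings[i][j] is the score that rater j (organization) gives to
    target i (treatment). Every target must be scored by every rater.
    """
    n = len(ratings)       # number of targets
    k = len(ratings[0])    # number of raters
    if n < 2 or k < 2:
        raise ValueError("need at least two targets and two raters")
    if any(len(row) != k for row in ratings):
        raise ValueError("every target needs a score from every rater")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA decomposition without replication.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    if ms_rows == 0:
        raise ValueError("ICC is undefined when all targets share the same mean")
    return (ms_rows - ms_error) / ms_rows


def cicchetti_label(icc: float) -> str:
    """Qualitative rating of agreement using the Cicchetti (1994) cutoffs."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


# Hypothetical scores: three treatments rated by two organizations.
example = [[0, 2], [2, 0], [3, 3]]
icc = icc_consistency_average(example)
print(round(icc, 3), cicchetti_label(icc))  # 0.25 poor
```

Because the estimate equals 1 − MS_error / MS_rows, it falls below 0 whenever the residual mean square exceeds the between-treatment mean square, and below −1 when it is more than twice as large.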

Results

Search Results

The APA Division 53 Web site includes a list of 13 diagnostic categories. In accordance with the inclusion criteria, 10 mental disorders were analyzed, giving rise to a total of 91 psychotherapeutic interventions associated with them.

We consulted the guidelines relating to mental disorders published on the NICE Web site and reviewed sections corresponding to evidence-based treatments. Of the 39 guidelines published by the mental health and behavioral conditions group, nine met the criteria for inclusion in our review. One set of guidelines on urological conditions that provide information on 13 disorders and 63 therapies was also included.

We analyzed the systematic reviews provided by Cochrane for the group of mental disorders in children and adolescents and obtained data from the evidence for each of the treatments reviewed. The Cochrane Web site includes a total of 935 reviews belonging to the mental health and developmental, psychosocial and learning problems group. Of these, 22 which provide information on 26 psychological treatments for eight disorders met the criteria for inclusion in our analysis.

Finally, we incorporated the lists of treatments included in the document published by APS (2010). This guide includes 17 disorders in the interventions in children and adolescents section. Consistent with the inclusion criteria, 14 disorders relating to 21 interventions were selected.

Agreement for Included Disorders

In what follows, we compare the four organizations to determine whether there is agreement among them regarding treatments for the disorders.

Anxiety Disorders

General Symptoms of Anxiety

The only organizations that provide information about effective psychological treatments for general symptoms of anxiety are D53 and Cochrane, which present 21 different types of treatments supported by some degree of evidence. The ICC (.266) indicates poor agreement among organizations for this disorder. The review presented by D53 (Higa-McMillan et al. 2016) does not specify the number of studies included in analyses for each treatment family. According to the review, there is Level One evidence for cognitive behavioral therapy (CBT), exposure, modeling, CBT with parents, education and CBT with medication; Level Two evidence for family psychoeducation, relaxation and assertiveness training, attention control, CBT for children and parents, cultural storytelling, hypnosis and stress inoculation; Level Three evidence for contingency management and group therapy; Level Four evidence for biofeedback, CBT with parents only, play therapy, psychodynamic, rational emotive therapy and social skills; and Level Five evidence for assessment/monitoring, attachment therapy, client-centered therapy, eye movement desensitization and reprocessing (EMDR), peer pairing, psychoeducation, relationship counseling and teacher psychotherapy. In turn, Cochrane (James et al. 2015) suggests that CBT is an effective treatment for childhood and adolescent anxiety disorders, with a low-to-moderate level of evidence (41 RCTs).

Specific Anxiety Disorders

Psychological treatment for social anxiety disorder (SAD) in children and adolescents has been studied by NICE and APS, which report three different types of treatments supported by some degree of evidence. The ICC (0) indicates random agreement among organizations for this disorder. The only treatment that APS (2010) considers effective for this disorder is CBT, which was rated as Level II of evidence (two RCTs). However, in addition to considering CBT effective and assigning it a low level of evidence (eight RCTs), NICE (2013b) also considers CBT with parents (very low to low; three RCTs) and self-help therapy (low; two RCTs) to be effective for this disorder. As a result, CBT is the only therapy considered effective by NICE and APS.

Specific phobias (SP) in children and adolescents are only documented by APS (2010), which assigns CBT a Level II of evidence (one RCT). This organization is also the only one that provides evidence for generalized anxiety disorder in this age group, for which it confers a Level I of evidence to CBT (one RCT). Given that only one organization included treatments for these disorders, the ICC could not be calculated. Finally, no organization provides information regarding empirically supported treatments for panic disorder in this age group.

Attention Deficit Hyperactivity Disorder

In reviewing the treatments included by the four organizations for attention deficit hyperactivity disorder (ADHD) in children and adolescents, we found nine different types of treatments supported by some degree of evidence (see Table 2). The ICC (.173) indicates poor agreement among organizations for this disorder. Behavioral parent training (BPT) was the treatment with the highest level of agreement (three organizations consider it effective), while the other treatments were regarded as effective by fewer than three institutions.

Table 2 Attention deficit hyperactivity disorder: Level of evidence/RCTs/meta-analyses or systematic reviews of psychological treatments and number of organizations in agreement

Autism

In examining treatments for autism in children and adolescents, we identified 14 different types of treatments supported by some degree of evidence (see Table 3). The ICC (−1.447) indicates systematic disagreement among organizations for this disorder. Parent training was the treatment with the highest level of agreement (three organizations consider it effective). The other treatments were regarded as effective by fewer than three institutions, and 12 of them are considered effective by only one organization.

Table 3 Autism: Level of evidence/RCTs/meta-analyses or systematic reviews of psychological treatments and number of organizations in agreement

Bipolar Disorder

When analyzing treatments for bipolar disorder in children and adolescents, we found four different types of treatments supported by some degree of evidence (see Table 4). The ICC (.667) indicates good agreement among organizations for this disorder. Family-focused therapy (FFT) was the treatment that obtained the highest level of agreement (three organizations consider it effective), while the other therapies were deemed effective by only one institution.

Table 4 Bipolar disorder: Level of evidence/RCTs/meta-analyses or systematic reviews of psychological treatments and number of organizations in agreement

Depression

An analysis of the treatments for depression in children and adolescents revealed 12 treatments supported by some degree of evidence (see Table 5). The ICC (.286) indicates poor agreement among organizations for this disorder. CBT, interpersonal therapy, FFT and self-help therapy obtained the highest level of agreement (three organizations consider them effective), but none of them obtained the consensus of the four organizations, since Cochrane suggests that there is very limited evidence upon which to base conclusions about the relative effectiveness of psychological interventions for treating depressive disorders in this age group (Cox et al. 2014). The other treatments studied were regarded as effective by fewer than three institutions, and five of them are considered effective by only one organization.

Table 5 Depression: Level of evidence/RCTs/meta-analyses or systematic reviews of psychological treatments and number of organizations in agreement

Disruptive Behavior

In examining treatments for disruptive behavior in children and adolescents, we found 12 different types of treatments supported by some degree of evidence (see Table 6). The ICC (−.273) indicates systematic disagreement among organizations for this disorder. Family-focused interventions (FFI) and parent-focused behavior therapy (PFBT) both obtained the highest degree of agreement (three organizations regard them as effective). The other treatments were considered effective by one or two institutions.

Table 6 Disruptive behavior in children and adolescents: Level of evidence/RCTs/meta-analyses or systematic reviews of psychological treatments and number of organizations in agreement

Eating Disorders

Anorexia Nervosa

When reviewing the treatments documented for anorexia nervosa (AN), five different treatments were found to be supported by some degree of evidence (see Table 7). The ICC (.655) indicates good agreement among organizations for this disorder. Family therapy-behavioral (FTB) obtained the highest level of agreement (three organizations consider it effective). The other treatments were regarded as effective by one or two organizations.

Table 7 Anorexia Nervosa: Level of evidence/RCTs/meta-analyses or systematic reviews of psychological treatments and number of organizations in agreement

Bulimia Nervosa

We found four treatments supported by some degree of evidence when reviewing treatments for bulimia nervosa (see Table 8). The ICC (0) indicates random agreement among organizations for this disorder. FTB obtained the highest level of agreement (three organizations consider it effective), while the other treatments were considered effective by one or two institutions.

Table 8 Bulimia Nervosa: Level of evidence/RCTs/meta-analyses or systematic reviews of psychological treatments and number of organizations in agreement

Binge Eating Disorder

Binge eating disorder (BED) in children and adolescents is documented only by NICE (2017), which assigns a low level of evidence to individual CBT (1 RCT/0 meta-analyses or systematic reviews), group CBT (0/0) and self-help therapy (1/0). As for the other organizations, D53 states that CBT is somewhat effective in adolescents with BED but also notes that no child and adolescent therapies for this disorder have been tested for effectiveness. Cochrane has no reviews for this age group, and APS (2010) reports that no recent studies have been found to indicate the effectiveness of any interventions for this disorder. Given that only one organization included treatments for this disorder, the ICC could not be calculated.

Enuresis

In reviewing the treatments included by the four organizations for enuresis, we identified 10 different types of treatments supported by some degree of evidence (see Table 9). The ICC (−1.15) indicates systematic disagreement among organizations for this disorder. Enuresis alarm, CBT, random waking and star charts were the treatments with the highest level of agreement (two organizations regard them as effective). The other six therapies were considered effective by only one organization.

Table 9 Enuresis: Level of evidence/RCTs/meta-analyses or systematic reviews of psychological treatments and number of organizations in agreement

Insomnia

Insomnia in children and adolescents is only documented by APS (2010), which assigns a Level II of evidence to CBT (one meta-analysis). Given that only one organization included treatments for this disorder, the ICC could not be calculated.

Obsessive–Compulsive Disorder

We found three treatments supported by some degree of evidence when reviewing treatments for obsessive–compulsive disorder (OCD) (see Table 10). The ICC (.955) indicates excellent agreement among organizations for this disorder. Individual CBT obtained the maximum level of agreement (four organizations consider it effective). The other therapies, both variants of CBT, were deemed effective only by D53.

Table 10 Obsessive–compulsive disorder: Level of evidence/RCTs/meta-analyses or systematic reviews of psychological treatments and number of organizations in agreement

Posttraumatic Stress Disorder

In examining treatments for posttraumatic stress disorder (PTSD), we found 10 different types of treatments supported by some degree of evidence (see Table 11). The ICC (.579) indicates fair agreement among organizations for this disorder. CBT was the treatment that obtained the highest level of agreement (three organizations consider it effective). The other treatments studied were regarded as effective by fewer than three institutions, eight of which are considered effective by only one organization.

Table 11 Posttraumatic stress disorder: Level of evidence/RCTs/meta-analyses or systematic reviews of psychological treatments and number of organizations in agreement

Psychosis and Schizophrenia

Psychosis and schizophrenia in children and adolescents are only documented by NICE (2013f), which assigns a low level of evidence to CBT (12 RCTs), family therapy (two RCTs) and arts therapies (one RCT), including dance movement therapy, body psychotherapy, drama therapy and music therapy. Furthermore, this organization recommends that supportive therapy or social skills training not be routinely provided as specific therapies for children and adolescents with psychosis or schizophrenia. Given that only one organization included treatments for this disorder, the ICC could not be calculated.

Discussion

The goal of the criteria used to evaluate psychological treatment is to help therapists and clients make good choices about the treatments they provide or request (Southam-Gerow and Prinstein 2014). However, recommendations regarding the effectiveness of a given treatment depend on the organization being reviewed (Moriana et al. 2017). These authors analyzed evidence-based treatments provided by Division 12 of the APA, NICE, Cochrane and APS in relation to mental disorders in adults and concluded that, in most cases, there was little agreement among organizations and that there were several discrepancies within certain disorders.

Based on the previous study, the objective of this work was to compile a list of evidence-based psychological treatments by disorder in relation to mental disorders in children and adolescents. For this purpose, data provided by four international organizations were used to analyze the level of agreement among them regarding each diagnosis and each treatment within the disorders. The results of the analysis showed that agreement is low for most of the disorders, as only three of them show an acceptable ICC. Excellent agreement among organizations was found for OCD, while good agreement was observed for bipolar disorder and anorexia nervosa. For all other disorders, the agreement among institutions was low.
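As an illustration only, the verbal labels attached to the ICC values reported in the Results can be reproduced with a simple lookup. The sketch below assumes commonly used cut-offs (below .40 poor, .40 to .59 fair, .60 to .74 good, .75 and above excellent); these thresholds are not stated in this section and are an assumption, as is the hypothetical function name describe_icc. The ICC values themselves are those reported above.

```python
def describe_icc(icc: float) -> str:
    """Map an ICC value to a verbal agreement label.

    The cut-offs are ASSUMED conventional thresholds, not taken from the text.
    """
    if icc < 0:
        return "systematic disagreement"
    if icc == 0:
        return "random agreement"
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


# ICC values as reported in the Results for each disorder.
reported_icc = {
    "depression": 0.286,
    "disruptive behavior": -0.273,
    "anorexia nervosa": 0.655,
    "bulimia nervosa": 0.0,
    "enuresis": -1.15,
    "OCD": 0.955,
    "PTSD": 0.579,
}

for disorder, icc in reported_icc.items():
    print(f"{disorder}: ICC = {icc:+.3f} ({describe_icc(icc)})")
```

Under these assumed cut-offs, each label matches the wording used in the corresponding Results paragraph.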

As in adults, the main findings of this study highlight the existing discrepancies in the evidence presented by different organizations reporting on the effectiveness of psychological treatments in children and adolescents. Moriana et al. (2017) reported that the discrepancies in adults could be explained by a combination of different issues: the procedures or committees may be biased, different studies were reviewed, different criteria are used by each organization or the reviews of existing evidence were conducted in different time periods.

In analyzing the existing discrepancies in children and adolescents, the fact that numerous treatments are included by a single organization may support the theory that the procedures or committees are biased. In most cases, these institutions only provide information on treatments they consider effective with a higher or lower level of evidence. Therefore, we cannot determine why they do not recommend certain treatments. This is evident in PTSD, where eight out of 10 treatments are considered effective by only one organization. In some cases, however, organizations also provide information about therapies they do not consider effective, but numerous treatments are still included by a single organization. In autism, for example, information is provided for 18 therapies, of which 11 are reported by a single organization. This also occurs with ADHD or depression in seven out of 12 treatments and five out of 12 treatments, respectively. Moreover, the evidence provided by NICE and Cochrane may be biased as it relies on the meta-analyses which they commission, and the recommendations of D53 are based on the reviews that they perform. APS is the only institution that bases part of its recommendations on the reviews or meta-analyses conducted by other organizations or institutions.

As concerns the issue of whether or not different studies were reviewed, the analysis of the main discrepancies regarding therapies for mental disorders in children and adolescents shows that, in some cases, the organizations do indeed use different studies to determine the quality of the evidence. For example, in the case of ADHD, D53 (Evans et al. 2014) considers that behavioral classroom management is a Level One treatment for this disorder based on the RCTs of Fabiano et al. (2010) and Mikami et al. (2013), while NICE (2013a), based solely on Mikami et al. (2013), deems that the evidence on the beneficial effect of this therapy is insufficient. The same applies to bladder training and retention control training (BTRCT) for enuresis, where Cochrane (Caldwell et al. 2013b) confers a low level of evidence for this therapy but NICE (2010) does not believe that the evidence for BTRCT is sufficient to recommend its use over other treatments. When comparing the six studies used by Cochrane (Caldwell et al. 2013b) and the five studies on which the NICE (2010) recommendations are based, we found that only two coincide (i.e., Bennett et al. 1985 and Harris and Purohit 1977).

Several discrepancies were found for autism, which may also be due to the fact that different studies were reviewed. For instance, while NICE (2013a) considers that the evidence for music therapy is inconclusive based solely on the RCT of Gattino et al. (2011), Cochrane (Geretsegger et al. 2014), based on 10 studies (including the RCT of Gattino et al. 2011), suggests with a low-to-moderate level of evidence that music therapy may help children with autism to improve their skills in important areas such as social interaction and communication. The same applies to the picture exchange communication system (PECS). Thus, while D53 (Smith and Iadarola 2015) reports a Level Two of evidence on the effectiveness of PECS based on the RCTs of Yoder and Stone (2006a, b), NICE (2013c) considers that it is not possible to draw conclusions about the relative benefit of PECS on reciprocal social communication and interaction in children with autism based on the RCT of Howlin et al. (2007).

As to the different criteria used by each organization, a comparison among them showed that the requirements for granting, for example, the highest level of evidence to a certain treatment differed among institutions. D53 requires at least two large-scale RCTs which have demonstrated the superior efficacy of the treatment to some other treatment. The criteria used initially by NICE require at least one meta-analysis or RCT. The GRADE system, used later by NICE and Cochrane, grants the highest level of evidence if further research is very unlikely to change the confidence in the estimate of the effect. Finally, APS requires a systematic review of all relevant RCTs to confer the highest level of evidence. The analysis of these discrepancies also shows that, in other cases, the studies which the institutions use to determine the quality of the evidence are the same. Therefore, in these cases, the reason for the discrepancies could be the criteria used. This is the case for autism, where, for example, D53 (Smith and Iadarola 2015) confers a Level Three of evidence to the Early Start Denver Model (ESDM) based on the RCT of Dawson et al. (2010), while NICE (2013c), based on the same study, considers that the evidence for ESDM on overall autistic behaviors is inconclusive. The case of family therapy for depression is significant. NICE (2015) considers this therapy to be effective (low level of evidence) based solely on the RCT of Diamond et al. (2002), while D53 (Weersing et al. 2017) grants a Level Three of evidence to this therapy based on Diamond et al. (2002) and Brent et al. (1997), among other studies. In contrast, Cochrane (Henken et al. 2007), also drawing on Diamond et al. (2002) and Brent et al. (1997), among others, suggests that the current evidence base is too heterogeneous and sparse to draw conclusions on the overall effectiveness of family therapy in the treatment of depression. Lastly, APS (2010) confers a Level I of evidence to family therapy based on this Cochrane review and another review presented by David-Ferdon and Kaslow (2008).

As regards enuresis, we have also found differences among organizations which may be due to the fact that different criteria were used. For instance, while Cochrane (Caldwell et al. 2013b) suggests that dry bed training is effective for enuresis based solely on the study of Bennett et al. (1985), NICE (2010) recommends that dry bed training not be used for the treatment of enuresis in children and young people based on five studies, among them the study of Bennett et al. (1985). The same applies to fluid restriction. Thus, while Cochrane (Caldwell et al. 2013b) concludes that there is evidence to suggest that this therapy is effective based on the study of Bhatia et al. (1990), NICE (2010) concludes that no evidence for fluid restriction was found based on the same study.

The fact that some reviews of existing evidence were conducted in different time periods may also explain the discrepancies found. For this reason, it is advisable that lists reporting effective psychological treatments be updated on a regular basis, since a substantial number of these lists, reviews and guides are currently out of date (Moriana et al. 2017). Moreover, the fact that NICE (2005) suggests that the evidence of EMDR for the treatment of PTSD in children is inconclusive, while D53 (Dorsey et al. 2017) confers a Level Two of evidence to this treatment based on three RCTs published after 2007 and APS (2010) grants a Level I of evidence to EMDR, indicates that these discrepancies in the observed evidence may be due to the different time periods in which the reviews were conducted.

Hence, as in adults, the discrepancies in the effectiveness of psychological treatments in children and adolescents can be explained by the combination of the issues discussed above. These results reinforce the argument of Moriana et al. (2017) that it would be advisable to unify the criteria for assessing evidence and improve coordination between organizations in order to verify that a treatment is truly effective using high-quality reproducibility studies performed by independent teams.

The four organizations examined in this work are not the only sources that provide information on evidence of psychological treatments for mental disorders in children and adolescents. In many cases, these organizations do not include information contributed by other reviews that have been independently published, such as Davis et al. (2011), who reviewed evidence-based treatments for anxiety and phobias in children and adolescents. These authors considered that CBT in the form of a one-session treatment (Davis et al. 2009) is the best overall treatment option (well established) for specific phobias; that either behavior therapy or group CBT would be optimal (probably efficacious) for SAD; that CBT is the treatment of choice (well established) for OCD; that CBT is the most efficacious choice (well established) for PTSD; and that group CBT merits well-established status for childhood anxieties (combined), while individual CBT and family-focused CBT merit probably efficacious status for this last disorder. Additionally, the recent meta-analysis of Öst and Ollendick (2017) has shown that brief, intensive and concentrated CBT is effective for anxiety disorders, and that there is strong support for specific phobia, modest support for PTSD and OCD and minimal support for panic disorder, SAD, separation anxiety disorder and mixed anxiety disorders. Another recent review of a meta-analysis of CBT in children and adolescents (Crowe and McKay 2017) has obtained overall medium effect sizes for anxiety, small-to-medium effect sizes for depression, a large effect size for OCD and a small-to-medium effect size for PTSD. Focusing on PTSD, the recent meta-analysis of Brown et al. (2017) has shown a medium-to-large effect size for CBT, EMDR, narrative exposure therapy and classroom-based interventions. Another meta-analysis (Gutermann et al. 2016) showed a medium-to-large effect size for CBT and a small-to-large effect size for EMDR, concluding that CBT is the most promising treatment for this disorder.

As regards effective treatments for depression, a meta-analysis in preadolescent children (12 years and younger) indicated that evidence on the effectiveness of CBT, FFT and psychodynamic therapy is inconclusive for this age group as the number of participants in the trials was relatively small (Forti-Buratti et al. 2016). In contrast, other meta-analyses have shown that CBT is effective in children with depression (Yang et al. 2017) and that behavioral activation may be effective for these patients, although this last conclusion should be interpreted with caution (Martin and Oliver 2018; Tindall et al. 2017). In the case of bipolar disorder, a narrative review (Weinstein et al. 2013) considered that FFT, psychoeducational psychotherapy, child- and family-focused CBT, dialectical behavior therapy, interpersonal and social rhythm therapy and CBT are effective treatments for children and adolescents. Although evidence of the effectiveness of psychological treatments in pediatric psychotic disorders is limited, Stevens et al. (2014) suggested in their review that CBT and psychoeducation are available treatments for these patients.

Concerning ADHD, Fabiano et al. (2015) conducted a review of meta-analyses to investigate the degree to which some narrative reviews (Evans et al. 2014; Pelham and Fabiano 2008; Pelham et al. 1998) that use operationalized criteria to grade the effectiveness of psychological treatments were consistent with the meta-analytic literature. The authors concluded that the recommendations of the narrative reviews about the effectiveness of behavioral parent training and school-based contingency management were consistent with the meta-analytic literature; in turn, no meta-analysis calculated the effect sizes for training- and peer-focused interventions, which the narrative reviews determined to be effective. For disruptive behavior, a recent meta-analysis has pointed out that parent–child interaction therapy, multicomponent intervention and parent-focused intervention are effective treatments, although there is not enough evidence to determine which of them is superior (Bakker et al. 2017). Another meta-analysis suggested that treatments categorized as multicomponent interventions and treatments with only a parent component are similar in their effectiveness, while therapies with only a child component are less effective (Epstein et al. 2015).

Brunner and Seung (2009) conducted a literature review on evidence-based treatments for autism spectrum disorder. The authors concluded that there is solid evidence regarding the efficacy of applied behavior analysis (ABA), milieu teaching, pivotal response treatment (PRT), developmental interventions (including parent training), video modeling and augmentative and alternative communication (PECS and sign language training), and that the evidence on classroom-based treatments, social skill interventions and functional communication treatment remains in an exploratory stage of investigation.

As regards eating disorders, several systematic reviews and meta-analyses consider that CBT is an effective treatment for anorexia nervosa, although it is not superior to other treatments such as dietary counseling, non-specific supportive management, interpersonal therapy or behavioral family therapy (Galsworthy-Francis and Allan 2014), that behavioral family therapy for adolescents with eating disorders is superior to individual therapy at follow-up, while there is no difference at the end of the treatment (Couturier et al. 2013), and that cognitive remediation therapy has potential as a supplementary treatment for young people with anorexia nervosa (Tchanturia et al. 2017). Another recent review also recommends the use of CBT and family-based therapy to treat eating disorders, anorexia and bulimia in children and adolescents (Herpertz-Dahlmann 2017).

Regarding nocturnal enuresis, Caldwell et al. (2013a) affirm that although behavioral therapies (such as fluid restriction or rewards) are superior to no active treatment, they are inferior to alarm training, which is the first-line treatment for this disorder. Another review suggests that alarm training alone or combined with dry bed training increases the number of dry nights compared to no treatment, while the evidence for acupuncture, hypnotherapy and dry bed training alone is weak (Kiddoo 2013). Lastly, a recent meta-analysis on insomnia has provided evidence that CBT is an efficacious treatment for adolescents with sleep and mental health problems (Blake et al. 2017).

The lists of ESTs for different disorders are an important source of consultation, information and guidance for professionals who work with patients, as well as for professors and students in the higher education setting and in the qualification and ongoing training of professionals. The lack of consensus among the lists of ESTs provided by the different organizations suggests the need to better identify these treatments. A first step would be to guarantee the quality of all the RCTs included in the systematic reviews and meta-analyses. Currently, several institutions have taken steps to ensure the quality of RCTs through prior registration in a database and subsequent monitoring. This is the case of the US National Library of Medicine and its ClinicalTrials.gov database (https://clinicaltrials.gov/). Likewise, it would be advisable to guarantee the quality of the systematic reviews and meta-analyses by registering them in the International Prospective Register of Systematic Reviews (PROSPERO; https://www.crd.york.ac.uk/prospero/) of the Centre for Reviews and Dissemination of the University of York (UK), which is funded by the UK’s National Institute for Health Research. Although RCTs are considered to provide the most reliable evidence on the effectiveness of interventions (Akobeng 2005) and the existence of one or two RCTs with a quality methodological design is usually a requirement to reach the first level in the different evidence classification systems, it is recommended that the results of individual trials be endorsed by systematic reviews and meta-analyses, taking into account that the samples used in this type of study in psychology are usually not very large.

Given the importance of scientific research on psychological treatments and its significant repercussions for the mental health of the population, international consensus should be promoted through the creation of working groups formed by various organizations in order to establish common criteria to grade the quality of the evidence and select RCTs, systematic reviews and other empirical studies that ensure minimum quality standards. In this regard, it seems that the GRADE system for rating clinical guidelines (Atkins et al. 2004) has met with increasing international support. These working groups should establish measures to improve the methodological aspects of RCT design and the inclusion and exclusion criteria of studies in systematic reviews and meta-analyses, in addition to controlling the biases produced by competing theoretical models in order to improve and ensure the objectivity of the scientific method in psychology.

Due to the difficulty of interventions with children or adolescents when complex techniques or proper programs are used (e.g., therapies based on relaxation training or problem solving), it is even harder to determine to what extent each treatment played a part in the individual’s improvement. Most ESTs are packages comprising several techniques. In many cases, there are no explanations for the causal mechanism and we cannot know which component of the treatment is responsible for the effect. Comprehensive treatment programs have often been evaluated without identifying their causal mechanisms. Because programs are designed prior to being evaluated, we do not know whether the design of a chosen program is superior to the multiple possible variants (O’Donohue and Yater 2003). This raises doubts concerning the causal mechanisms of the treatment (Primero and Moriana 2011). The next generation of research could analyze procedures (techniques, strategies) that are simpler units of analysis to determine what is useful, harmful or harmless in each treatment guide and thus make changes that will improve treatment efficacy (Westen et al. 2004). Along these lines, a recent review of 136 published RCTs of youth CBT treatments by Rith-Najarian et al. (2017) has proposed the need to use multi-parameter filtering in treatment selection and clinical decision making with different types of evidence. However, although we believe that the analysis of techniques or strategies is very positive for research on evidence-based psychological treatments, studies which jointly apply several techniques are also recommended. That is, it is equally important to determine the efficacy of both a single technique and the interaction of several techniques packaged into a treatment.

In addition, RCTs with children and adolescents pose an ethical and legal challenge to clinicians and researchers due to several factors (Hoagwood and Cavaleri 2010). One of them involves the informed consent of parents who must authorize experimental therapies with their children or the possibility of being assigned to a control group or waiting list, which usually involves a higher level of resistance than that normally found in research with adults. Another aspect is the cultural and ethnic diversity of children and their families (Kazdin 2008). It is also necessary to consider the therapist’s abilities, the context in which the treatments are developed and the specific characteristics of each developmental stage. Moreover, in the context of child psychology there is a basic differentiating component compared to adult treatments: In many of the interventions the direct or indirect participation of the parents and/or relatives is essential, thus adding complexity to the process.

Limitations

First, the heterogeneity of levels of evidence established by the different organizations greatly hinders a comparative assessment. Second, our objective has been to compile and compare the information provided by the four organizations exactly as it is provided by them. Thus, it is possible that some of the treatments included in our review share several components. Third, although we have reviewed and compared data provided by four international organizations, many other organizations confer grades and levels of evidence whose inclusion would have made our review more robust. And lastly, the disorders examined in this study only comprise a small part of the spectrum of mental disorders in children and adolescents.

Future Directions

Future studies should aim to reach a consensus on the scientific methods used to validate psychological treatments in order to unify the criteria among organizations, researchers and professionals on levels of evidence and methodological approaches for improving the quality of the studies that support them. Moreover, performing studies similar to ours on addictions, health psychology and other related areas not addressed in this study is both necessary and of interest.

Conclusions

This study is the first to compare evidence provided by four leading international organizations on different psychological treatments for the principal child and adolescent mental disorders. From the main findings, it should be highlighted that there is no consensus regarding the evidence presented to support the effectiveness of psychological treatments for most mental disorders in children and adolescents. In addition, although there are numerous treatments for many of the disorders addressed here, not all provide the same quality of evidence or studies to support them. As a result, we need to contribute to improving the quality of RCTs through more independent studies that promote and contemplate reproducibility as a much more important criterion than envisaged so far. Finally, as regards the comparison, we found that while similar evidence exists for some disorders (e.g., OCD), for others there is a significant number of treatments for which the level of evidence varies greatly depending on the organization (e.g., autism), and there are some notable divergences between organizations regarding the evidence presented for treatments for certain disorders (e.g., enuresis).