Introduction

Conduct disorder (CD) is a childhood-onset, prevalent, and morbid psychiatric disorder estimated to affect up to 7% of youth in the United States [1,2,3]. CD is characterized by “a repetitive and persistent pattern of behavior in which the basic rights of others or major age-appropriate societal norms or rules are violated,” as defined in the Diagnostic and Statistical Manual of Mental Disorders, fifth edition (DSM-5) [4]. CD is more prevalent in males than in females [1, 5], and frequently co-occurs with other psychiatric disorders [6,7,8]. When CD co-occurs with other psychiatric disorders in the clinical setting, the diagnosis of CD may not always be considered because of the higher acuity of, and therapeutic opportunities for, the co-occurring disorders.

Yet, CD is associated with serious complications. Youth with CD are at five times greater risk of developing a substance use disorder (SUD) compared to youth without CD [9], and this risk is even higher in youth with CD and comorbid disorders such as attention deficit hyperactivity disorder (ADHD) and bipolar (BP) disorder [10,11,12]. CD in adolescents has also been associated with increased risk for premature death in young adulthood [3]. Given the increased morbidity associated with CD, greater efforts are needed to help identify children with CD in the clinical setting.

One potential screening method for CD is the Child Behavior Checklist (CBCL), a simple to use, inexpensive, empirically derived, broad band assessment of psychopathology with excellent psychometric properties [13] and norms for both sexes. One of the scales from the CBCL, the Rule-Breaking Behavior scale (previously referred to as the Delinquency scale) bears strong conceptual resemblance to the clinical diagnosis of CD. Questions that contribute to the Rule-Breaking Behavior scale score are similar to those used to diagnose CD, for example, questions about the frequency with which the child lies or cheats, sets fires, and steals at home or outside the home.

Previous research found the Rule-Breaking Behavior scale to be significantly associated with a clinical diagnosis of CD in largely male samples of children with ADHD [14, 15], community samples [16], and psychiatrically referred samples [17,18,19]. However, uncertainties remain as to the optimal cut-off point for the CBCL Rule-Breaking Behavior scale, and whether the scale can identify girls with a possible diagnosis of CD with high accuracy. It is particularly important to examine the accuracy of the CBCL Rule-Breaking Behavior scale in the identification of girls with CD because this diagnosis was for many years underrecognized and understudied in girls relative to boys [20].

The main aim of this study was to identify the optimal cut-off point of the CBCL Rule-Breaking Behavior scale to help identify children of both sexes with a clinical diagnosis of CD. To this end, we applied conditional probability and receiver operating characteristic (ROC) analysis to a large sample composed of four data sets of children of both sexes with and without ADHD and BP-I disorder. Children enrolled in these studies were systematically and thoroughly assessed with the same methodology, which allowed us to combine the samples to evaluate this research question. Based on the literature and our previous work, we hypothesized that modest elevations of the CBCL Rule-Breaking Behavior scale would predict a structured-interview-derived diagnosis of CD. To the best of our knowledge, this is the largest and most comprehensive evaluation of the diagnostic efficiency of the CBCL Rule-Breaking Behavior scale to help identify children of both sexes with a clinical diagnosis of CD.

Methods

Sample

The sample was derived from four independent studies using identical assessment methodology: (1) and (2) were prospective controlled family studies of boys and girls 6 to 17 years of age with and without DSM-III-R ADHD (Boys Study: N = 140 ADHD and N = 120 Controls; Girls Study: N = 140 ADHD and N = 122 Controls) [21, 22]; (3) was a prospective controlled family study of youth 10 to 18 years of age with (N = 105) and without (N = 98) DSM-IV pediatric BP-I disorder [23]; and (4) was a prospective family study of youth 6 to 17 years of age of both sexes with active symptoms of DSM-IV BP-I disorder (N = 105) [24]. The ADHD studies recruited participants from pediatric and psychiatric clinics. The BP disorder studies recruited participants from referrals to the Clinical and Research Programs in Pediatric Psychopharmacology at the Massachusetts General Hospital and through advertisements in the community. Controls were recruited from pediatric clinics, advertisements to hospital personnel and community newspapers, and Internet postings. Potential participants, including controls, were excluded from all four studies if they had been adopted, if their nuclear family was not available for study, or if they had major sensorimotor handicaps, autism, inadequate command of the English language, or Full-Scale IQ < 70 (< 80 for the ADHD studies). Potential participants were also excluded from the ADHD studies if they had psychosis, and from the BP disorder studies if their BP-I disorder was due solely to a medication reaction. For all four studies, parents provided written informed consent to participate. Children and adolescents provided written assent to participate. The Partners Human Research Committee approved these studies.

Assessment Procedures

In all four studies, psychiatric assessments of participants were made with the Kiddie Schedule for Affective Disorders—Epidemiologic Version [25, 26]. Diagnoses were based on independent interviews with parents and direct interviews with children older than 12 years of age. Data were combined such that endorsement of a diagnosis by either reporter resulted in a positive diagnosis.

Extensively trained and supervised psychometricians with undergraduate degrees in psychology conducted all interviews. For the ADHD studies and the controlled BP disorder study, raters were blind to the ascertainment status of the families. For the BP disorder Family study, raters were blind to the study assignment and whether the subject was a proband or sibling.

To assess the reliability of our overall diagnostic procedures, we computed kappa coefficients of agreement by having experienced, blinded, board-certified child and adult psychiatrists and licensed experienced clinical psychologists diagnose participants from audiotaped interviews made by the assessment staff. Based on 500 assessments from interviews of children and adults, the median kappa coefficient was 0.98 for the ADHD studies and the controlled BP disorder study, and 0.99 for the BP disorder Family study.
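As an illustration of the agreement statistic described above, the following is a minimal Python sketch of a two-rater kappa. It assumes Cohen's kappa for categorical labels, which the text does not specify; the function name `cohens_kappa` and the example labels are hypothetical and are not study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical labels on the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters gave the same label.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal rates, summed over labels.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical labels (1 = diagnosis present, 0 = absent); not study data.
interviewer = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
clinician = [1, 1, 0, 0, 1, 0, 0, 1, 1, 0]
kappa = cohens_kappa(interviewer, clinician)
```

In this toy example the raters agree on 9 of 10 cases while chance agreement is 0.5, so kappa is well below the raw agreement rate. This is why kappa, rather than percent agreement, is typically reported for diagnostic reliability.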

Socioeconomic status was measured using the 5-point Hollingshead scale [27]. A higher score indicates being of a lower socioeconomic status.

Child Behavior Checklist

The parent of each participant completed the 1991 version of the CBCL for ages 4 to 18 years. The CBCL queries the parent about the child’s behavior in the past 6 months with a three-point Likert scale ranging from “not true” to “very true or often true,” and aggregates these data into behavioral problem t-scores [28]. A computer program calculates the t-scores for each scale. Raw scores are converted to gender- and age-standardized scores (t-scores having a mean of 50 and standard deviation of 10). A minimum t-score of 50 is assigned to scores that fall at percentiles of ≤ 50 on the syndrome scales to permit comparison of standardized scores across scales. T-scores above 70 (2 standard deviations) indicate clinical disorder. Scales include Anxious/Depressed, Withdrawn/Depressed, Somatic Complaints, Social Problems, Thought Problems, Attention Problems, Rule-Breaking Behavior, and Aggressive Behavior.

Statistical Analysis

We first compared demographic characteristics between participants with and without CD in each of the studies separately and then in the combined sample using Student’s t-test for continuous outcomes, Pearson’s chi-square test or Fisher’s exact test (for expected counts < 5) for binary outcomes, and the Wilcoxon rank-sum test for ordinal outcomes. Next, we calculated conditional probabilities for each of the studies separately using a conservative cut-off point of > 60 (1 standard deviation) for the CBCL Rule-Breaking Behavior scale. We subsequently combined the data from the four studies and subjected them to ROC analysis, using a nonparametric approach, to examine the ability of the CBCL Rule-Breaking Behavior t-scores to distinguish those with and without a structured interview diagnosis of CD. ROC analysis uses each value across the entire range of CBCL scale t-scores as the cutoff for defining a case and compares this classification to the “true” diagnosis, as defined by the clinical interview. The ROC analysis then plots the false positive rate (1 − specificity) and the true positive rate (sensitivity) for each CBCL Rule-Breaking Behavior t-score on the x- and y-axis, respectively, to create the ROC curve. Starting in the upper right-hand corner of the plot, where the lowest cutoff classifies every participant as a case, each successive point corresponds to an increase of one point in the CBCL Rule-Breaking Behavior t-score. ROC analysis summarizes diagnostic efficiency with the area under the curve statistic. An area under the curve of 0.5 means the test discriminates no better than chance, and an area under the curve of 1.0 means the test predicts the disorder perfectly. The area under the curve statistic is useful in that it is equivalent to the Mann–Whitney U-statistic computed from a comparison of the CBCL Rule-Breaking Behavior score between the CD and non-CD groups [29]. We used conditional probabilities to examine the diagnostic utility of various cutoff points. For each cutoff, we calculated sensitivity, specificity, the positive predictive value, negative predictive value, and the percent correctly classified. Based on the information from the ROC curve analysis, we used the nearest-to-(0,1) method to calculate the optimal cut-point to identify those with and without CD. This method identifies the cut-point on the ROC curve that is closest to the upper left corner of the graph, i.e. the point plotted at (0,1), representing perfect sensitivity and specificity. All analyses were performed using Stata® (Version 14).
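The ROC procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Stata code used in the study: the function names, the toy scores, and the convention that a score at or above the cutoff counts as screen-positive are all assumptions made for the example.

```python
def roc_points(scores, has_cd):
    """Return (cutoff, sensitivity, specificity) for every observed cutoff.

    Assumption: a participant is screen-positive when score >= cutoff.
    """
    n_pos = sum(has_cd)
    n_neg = len(has_cd) - n_pos
    points = []
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, d in zip(scores, has_cd) if d and s >= cutoff)
        tn = sum(1 for s, d in zip(scores, has_cd) if not d and s < cutoff)
        points.append((cutoff, tp / n_pos, tn / n_neg))
    return points


def nearest_to_corner(points):
    """Pick the cutoff whose ROC point lies closest to (0, 1).

    The ROC point is (1 - specificity, sensitivity), so the squared distance
    to (0, 1) is (1 - specificity)^2 + (1 - sensitivity)^2.
    """
    return min(points, key=lambda p: (1 - p[2]) ** 2 + (1 - p[1]) ** 2)


def auc_mann_whitney(scores, has_cd):
    """AUC as the share of (case, non-case) pairs in which the case scores
    higher, counting ties as one half (the normalized Mann-Whitney U)."""
    pos = [s for s, d in zip(scores, has_cd) if d]
    neg = [s for s, d in zip(scores, has_cd) if not d]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical t-scores and diagnoses (1 = CD); not study data.
t_scores = [50, 50, 52, 55, 58, 60, 62, 65, 70, 75]
cd_status = [0, 0, 0, 0, 0, 1, 1, 0, 1, 1]

best_cutoff, best_sens, best_spec = nearest_to_corner(roc_points(t_scores, cd_status))
auc = auc_mann_whitney(t_scores, cd_status)
```

Raising the cutoff trades sensitivity for specificity. The nearest-to-(0,1) rule selects the point where the combined shortfall from perfect sensitivity and perfect specificity is smallest.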

Results

Demographic Characteristics of the Sample

Participants from the four original independent samples were only included in this sample if a CBCL was completed. Table 1 shows the demographic details from the four contributing samples. There were no meaningful demographic differences between children with and without a structured diagnostic interview diagnosis of CD within each of the four samples. Children with CD were more likely to be male and older when compared to children without CD. No other meaningful sociodemographic differences were identified in socioeconomic status or race (Table 1).

Table 1 Demographic characteristics of those with and without CD from the individual studies and all studies combined

Conditional Probability Analysis

As shown in Table 2, similar values of sensitivity [77–89%], specificity [75–86%], and percent correctly classified with CD [80–86%] were observed in the Boys ADHD study, Girls ADHD study, and BP disorder controlled study using the conservative cut-off point of > 60 (1 SD) for the CBCL Rule-Breaking Behavior scale. The BP disorder family study had similar sensitivity (87%) to the other studies, but lower specificity (38%) and percent correctly classified (64%).

Table 2 Sensitivity, specificity, and percent correctly classified using the conservative cut-off of > 60 on the Rule-Breaking subscale of the CBCL to identify youth with conduct disorder in each study

ROC Analysis

Since all the studies used identical methodology and assessments and had mostly similar conditional probability analysis results, we combined data from the four samples for this analysis to improve statistical power. Thus, our combined sample consisted of 674 participants, of whom 114 (16.9%) had CD. CBCL Rule-Breaking Behavior t-scores ranged from 50 to 87 with a median of 51 and a mean (standard deviation) of 56.5 (8.8). First, we ran a model to test for an interaction between CBCL Rule-Breaking Behavior scores and sex to see if the scores identified males and females with CD differently. This interaction was not significant (z = −0.46, P = 0.65) and thus we removed it from our model. Next, we ran a model predicting CD from only the CBCL Rule-Breaking Behavior scale. Figure 1 depicts the ROC curve for the combined CBCL Rule-Breaking Behavior t-scores from the four studies, which yielded an area under the curve of 0.9.

Fig. 1 Receiver operating characteristic (ROC) curve of the CBCL Rule-Breaking Behavior t-scores in subjects from the total sample with and without CD (N = 674)

Further examination of specific cut-off t-scores at 0.5 SD increments on the CBCL Rule-Breaking Behavior scale showed that a cut-point of 60 had the best properties, as determined from the ROC analysis, for correctly identifying participants with a structured diagnostic interview diagnosis of CD, with 85% sensitivity, 81% specificity, 48% positive predictive value, 96% negative predictive value, and 82% correctly classified with CD (Table 3). This cut-point offers the best trade-off between sensitivity and specificity.
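The relationship among the reported statistics can be checked with a short calculation. The Python sketch below applies Bayes' rule to the rounded sensitivity, specificity, and CD prevalence reported in the text; the function name `predictive_values` is an assumption for this illustration, and because the inputs are rounded the outputs are approximate.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Derive PPV, NPV, and overall accuracy from sensitivity, specificity,
    and prevalence (all expressed as proportions)."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    accuracy = true_pos + true_neg
    return ppv, npv, accuracy


# Rounded values reported in the text: 85% sensitivity, 81% specificity,
# and 114 of 674 participants with CD.
ppv, npv, accuracy = predictive_values(0.85, 0.81, 114 / 674)
print(f"PPV={ppv:.2f} NPV={npv:.2f} accuracy={accuracy:.2f}")
# prints "PPV=0.48 NPV=0.96 accuracy=0.82"
```

With CD present in about 17% of the sample, false positives among the much larger non-CD group outnumber true positives, which is why the positive predictive value is modest even though sensitivity and specificity are both above 80%.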

Table 3 Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and percent correctly classified in the use of the CBCL Rule-Breaking subscale t-scores to identify youth with CD in the total sample from all four studies (N = 674)

Discussion

ROC analysis using data from children of both sexes showed that a modestly elevated t-score of one standard deviation on the CBCL Rule-Breaking Behavior scale very efficiently identified children with a structured interview diagnosis of CD (area under the curve = 0.9). These results confirm and extend previously reported findings and provide strong evidence that the CBCL Rule-Breaking Behavior scale is a useful tool to identify youth who may have CD in both sexes in the clinical setting.

Our results showing the very high efficiency of the CBCL Rule-Breaking Behavior scale in identifying CD in clinical samples of both sexes extend to females the findings previously reported in largely male samples of referred youth with ADHD [15], community samples [16], and psychiatrically referred samples [17,18,19]. The absence of an interaction effect by sex suggests that sex does not moderate the efficiency of the CBCL Rule-Breaking Behavior scale in identifying children who may have a clinical diagnosis of CD.

Our finding documenting that a modest t-score of 60 (1 standard deviation) on the CBCL Rule-Breaking Behavior scale had the best properties for identifying youth with CD is consistent with our previous work [14] in a sample of psychiatrically referred boys with ADHD. It is also consistent with findings reported by Kazdin et al. who also found an association between a score of 60 on the CBCL Delinquency/Rule-Breaking Behavior scale and a diagnosis of CD in a sample of psychiatrically referred males [19].

However, our conditional probability analysis showing high (81%) specificity associated with a 1 standard deviation elevation on the CBCL Rule-Breaking Behavior scale with a clinical diagnosis of CD is discrepant with the low (7.3%) specificity findings reported by Kazdin et al. [19]. Although the reasons for the discrepancy are unknown, it may be related to differences in sample composition (psychiatric inpatient versus outpatient and community referrals), as well as differences in assessment methodology (open label clinical diagnosis versus structured diagnostic interview by blinded raters). More work is needed to reconcile these differences.

The high negative predictive value (96%) for the CBCL Rule-Breaking Behavior scale suggests that a clinician can be very confident that a child does not have CD if the scale is not elevated. On the other hand, the modest positive predictive value suggests some children with an elevated CBCL Rule-Breaking Behavior scale will not meet criteria for CD when evaluated clinically. The modest positive predictive value for the CBCL Rule-Breaking Behavior scale supports the need for clinical judgement in weighing the presence or absence of this diagnosis.

Our findings documenting the high efficiency of the CBCL Rule-Breaking Behavior scale in the identification of children of both sexes with a likely clinical diagnosis of CD have important clinical implications. The identification of children with CD can lead to targeted intervention efforts, such as individual or group parent behavioral therapy [30], aimed at mitigating the well-known CD-associated risk for SUD; such efforts can benefit the affected youth, their family, and society at large. For youth with CD co-occurring with other psychiatric disorders such as ADHD and BP disorder, the CBCL Rule-Breaking Behavior scale can be an invaluable, simple, and low-cost tool to alert clinicians to youth who may be at very high risk of developing SUD due to the presence of CD.

Strengths of our study include the large sample size, equal number of participants of both sexes, structured diagnostic interviews, and raters who were blinded to subject ascertainment status and the CBCL t-scores. Our study also needs to be viewed with consideration of some methodologic limitations. Since the sample was largely referred and mostly Caucasian, our findings may not generalize to non-treatment-seeking community samples and other ethnic groups. However, the findings do generalize to clinical samples including community and primary care settings. While we focused on the Rule-Breaking Behavior scale because of its congruence with the DSM-5 definition of CD, we cannot rule out the possibility that other CBCL scales such as the Aggression scale may have diagnostic utility. Prior research that has examined the Aggression scale and disruptive diagnoses including ODD, CD, and ADHD, however, found high correspondence between the Aggression scale and ODD [31, 32]. Additionally, we do not know whether the CBCL Rule-Breaking Behavior scale performs differently for children with ADHD, BP-I, or no diagnosis (i.e. whether there is measurement invariance). Future studies would benefit from performing multi-group confirmatory analysis to evaluate this.

Despite these limitations, our work strongly suggests that modest elevations of the CBCL Rule-Breaking Behavior scale can very efficiently help identify children of both sexes who may have a clinical diagnosis of CD. Such children may benefit from appropriate early intervention strategies to avert the well-documented risks associated with CD.

Summary

Since CD is associated with significant morbidity, identification of CD could help guide early intervention strategies to improve prognosis. We evaluated whether the Rule-Breaking Behavior scale from the parent-reported CBCL could identify males and females with CD using a large sample of children (N = 674) of both sexes with and without a structured-interview-derived clinical diagnosis of CD. We found that a modestly elevated t-score on the Rule-Breaking Behavior scale very efficiently identified children with CD. We recommend that clinicians utilize the CBCL Rule-Breaking Behavior scale as a low-cost tool to identify children who may have a clinical diagnosis of CD and who might benefit from early referral for mental health services and treatment.