In recent years, interest in understanding the early precursors of adult psychopathy has been increasing. Among the several dimensions that have emerged in the construct of adult psychopathy, Callous/Unemotional (CU) traits, which refer to specific deficiencies in affective experience and interpersonal style, are believed to be the core features of childhood and adolescent psychopathy (Frick and White 2008). These traits are important for designating a distinct subgroup of antisocial and delinquent children and adolescents who show a more severe, stable, and instrumentally aggressive pattern of behavior (e.g., Essau et al. 2006). These individuals are at higher risk for early-onset delinquency and later antisocial behavior (Frick and White 2008), and also show a relatively poorer response to treatment (e.g., Frick and Dickens 2006). Finally, longitudinal studies have shown that CU traits are relatively stable from late childhood to early adolescence (e.g., Munoz and Frick 2007). It is therefore important to assess these traits reliably and validly.

The Inventory of Callous and Unemotional Traits (ICU) was developed to specifically assess the CU traits in youth (Frick 2004). It includes 24 items coded on a 4-point Likert scale, ranging from 0 (not at all true) to 3 (definitely true). There are Youth Self-Report, Parent Report, Teacher Report, Parent Report (Preschool), and Teacher Report (Preschool) versions. Studies have generally indicated that there are three aspects of CU traits: Callousness, capturing behavior that includes a lack of empathy, guilt, and remorse; Uncaring, indicating a lack of caring about one’s performance in tasks and for others’ feelings; and Unemotional, representing a lack of emotional expression (e.g., Essau et al. 2006).

These three dimensions have generally been supported in different samples using different translations (e.g., Byrd et al. 2013; Ezpeleta et al. 2013; Kimonis et al. 2013). For example, Essau et al. (2006), using exploratory factor analysis in a sample of 1443 13- to 18-year-old Germans from the general population, were the first to identify these three factors plus a common general dimension comprising all the items (i.e., a bifactor model) for the self-report version; the factor structure was invariant for boys and girls. Kimonis et al. (2008) confirmed this factor structure with the self-report version in a sample of 248 12- to 20-year-old American juvenile offenders. However, in their model two Callousness items (items #2 and #10) were deleted due to low item-total correlations, and the internal consistency for the Unemotional factor was low (0.53). Ciucci et al. (2014) confirmed the bifactor model with the self-report version in 540 Italian children (mean age = 12.7 years), and the factor structure was invariant across sex and grade.

Recently, a few studies have raised questions about the bifactor model and the validity of the scales (Benesch et al. 2014; Hawes et al. 2014; Willoughby et al. 2015). For example, Willoughby et al. (2015) examined parent ratings on the ICU in 1078 American first-graders (mean age = 7.3 years) and found that the best-fitting model was a two-factor model distinguishing a “callous-unemotional” dimension (including nine Callousness and two Unemotional items from the original scale) from an “empathic-prosocial” dimension (including eight Uncaring, two Callousness, and three Unemotional items), with no item cross-loading on more than one factor. Hawes et al. (2014), in a sample of 250 American boys with significant conduct problems (ages 6–12 years), obtained a two-factor structure (Callousness and Uncaring) using 12 of the original 24 items of the parent-report version. Houghton et al. (2013a) examined the factor structure of the self-report ICU in 268 Australian children aged 7 to 12 years and found that a two-factor model was superior to the three-factor model. In this two-factor model, all Unemotional items were deleted due to poor model fit and poor internal reliability, and the Callousness and Uncaring factors each consisted of 8 items with internal reliabilities of 0.77 and 0.85, suggesting satisfactory reliability. In addition, eight pairs of error terms were correlated to improve model fit.

In summary, the bifactor model (with three intercorrelated factors and an overarching general dimension) has obtained the strongest support from studies focusing on adolescent and adult samples (above 12 years), whereas in children aged 7–12 years a two-factor structure (without the overarching general dimension) seems to be a better fit. However, the factor structure of the ICU varies for different raters and various samples, and in most studies either the self-report or parent-report was investigated (although Roose et al. (2010) examined both). In Roose et al.’s study, the same bifactor structure was supported for self, parent, and teacher versions of the ICU in a community sample of adolescents. However, for the convergent and criterion validity analyses, the authors only reported results for a teacher/parent composite rating, utilizing the highest score on each item from these different raters. It is generally believed that different informants are likely to report different aspects of behavior, and children are less capable than adults as informants for disruptive behavior problems (Loeber et al. 1991). However, it can also be argued that children are more aware of their own behaviors across all settings and are also most intimately familiar with their own motivations and feelings (particularly important for assessing the characteristics such as lack of empathy and remorse). To shed further light on this issue, the present study included both self-report and parent ratings in order to provide a more comprehensive account of CU traits.

Another limitation in this research is that only two studies have explicitly tested the invariance of the factor structure of the scale across sex, with both reporting that the factor structure of the ICU was invariant for male and female adolescents (Essau et al. 2006) and children at ages 11 to 14 (Ciucci et al. 2014). Thus, the factor structure of the ICU appears to be invariant across sex but further testing among children under 11 is lacking.

In terms of the validity of the ICU, many studies have demonstrated the significant associations between CU traits and antisocial, aggressive, and delinquent behavior (e.g., Fanti et al. 2009; Kimonis et al. 2008; Roose et al. 2010). However, most studies have mainly relied on self-report for the validation measures of the ICU, and more importantly, less is known about the relationships between CU traits and internalizing behavior problems. Although some studies have shown that CU traits are unrelated or negatively related to internalizing symptoms including anxiety and depression (Frick et al. 1999; Pardini et al. 2012), others have linked CU traits to higher levels of internalizing behavior problems (Berg et al. 2013; Essau et al. 2006). These inconsistent findings highlight the need for further examination of the associations between CU traits and internalizing symptoms.

The main aim of the study was to investigate the psychometric properties of the self-report and parent versions of the ICU in boys and girls drawn from the community. Specifically, for each version of the ICU, we aimed to (1) compare the bifactor, three-factor, and two-factor models reported in previous studies by conducting a series of confirmatory factor analyses (CFA), (2) investigate the item-total correlations to identify poorly discriminating items, (3) directly test whether the factor structure and scale levels were invariant across sex, and (4) investigate the convergent and discriminant validity of the refined measure’s factor and total scores by relating them to the number of Conduct Disorder (CD) and Oppositional Defiant Disorder (ODD) symptoms and to internalizing symptoms. Based on prior research (Roose et al. 2010), we predicted that the two-factor structure (Callousness and Uncaring) would be replicated in this sample for both the self-report and parent versions, and that the structure of the ICU would be invariant across sex. In addition, we predicted that the symptoms of ODD and CD would be associated with CU traits (Feilhauer et al. 2012). Finally, we also explored the relationships between ICU subscale scores and internalizing symptoms.

Method

Participants

The sample was drawn from participants in the Healthy Childhood Study, an ongoing longitudinal study examining the interplay of biological and social factors in the development of antisocial behavior in middle childhood. It consisted of 8- to 10-year-old boys and girls (mean age = 9.06, SD = 0.60) living in Brooklyn, New York. Within the study area, fliers soliciting enrollment were placed in public areas, and targeted mailings were sent to parents of 8- to 10-year-old children living in the geographic catchment area. Children with a diagnosed psychiatric disorder, mental retardation, or a pervasive developmental disorder were excluded. The sample included 164 boys (48.2 %); 11 % of participants were Hispanic (n = 38), 21 % Caucasian (n = 71), 52 % African-American (n = 176), 2 % Asian (n = 8), and the remaining 14 % of mixed/other ethnicity (n = 48). Compared to the Kings County or New York population, our sample included slightly more African-Americans and had a lower median family income ($43,200 compared to $45,215). Participants and their main caregivers were invited to the university for a 2-h laboratory assessment including behavioral interviews, neurocognitive testing, psychophysiological recording, and social risk factor assessment. Incentives were provided to the participating families at the end of the assessment. All procedures were approved by the university Institutional Review Board, and both parental consent and child assent were obtained.

Measures

Callous-Unemotional Traits

The parent and the child completed the 24-item parent-report and self-report versions of the ICU, respectively (Frick 2004). A trained research assistant accompanied the child during the whole testing session and answered any questions the child had. The parent filled out the ICU in a different room. The items are rated on a 4-point scale from 0 (not true at all) to 3 (very true) and are summed to score each subscale. There are 11 items for the Callousness subscale (e.g., Does not seem to know “right” from “wrong”), 8 items for the Uncaring subscale (e.g., Feeling bad or guilty when he/she has done something wrong), and 5 items for the Unemotional subscale (e.g., Expresses his/her feeling openly). In this sample, using the original ICU items, means were 18.89 (SD = 7.03, α = 0.64) and 17.71 (SD = 8.81, α = 0.85) for the self- and parent-reported ICU total scores, respectively. For the Callousness, Uncaring, and Unemotional subscales, respectively, means were 5.87 (SD = 3.94, α = 0.56), 5.68 (SD = 3.92, α = 0.70), and 7.34 (SD = 2.80, α = 0.39) for the self-report version and 4.29 (SD = 3.46, α = 0.66), 9.37 (SD = 4.97, α = 0.83), and 4.04 (SD = 2.54, α = 0.64) for the parent version.
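To make the reliability coefficients reported above concrete, the following is a minimal sketch of how Cronbach's α can be computed from item-level responses. It is an illustration only: the function name `cronbach_alpha` and the toy response matrix are hypothetical, the study itself does not describe its software routine for α, and reverse-keyed items are assumed to have been recoded beforehand.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items matrix of item scores.

    `responses` is a list of rows, one per respondent; each row holds that
    respondent's scores (e.g. 0-3) on the k items of one (sub)scale.
    Assumes reverse-keyed items were already recoded.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents
    item_variances = [pvariance(col) for col in zip(*responses)]
    # Variance of the summed scale score
    total_variance = pvariance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total score has zero variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy data: four respondents, three perfectly consistent items
toy_scale = [[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(toy_scale)
```

Because α depends on both the inter-item covariances and the number of items, short subscales such as the 5-item Unemotional scale tend to yield lower values even when items are moderately related.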

ODD/CD Symptoms

Children’s ODD/CD symptoms were assessed using the Diagnostic Interview Schedule for Children (DISC-IV; Shaffer et al. 2000). The reliability and validity of the DISC-IV are well established (Shaffer et al. 2004). Caregivers were administered the DISC-IV to assess the lifetime number of CD and ODD symptoms. For CD, 44 % of the boys had at least one symptom, and 22 % of boys had at least 2 symptoms in their lifetime. In contrast, only 28 % of girls had one or more CD symptoms. For ODD, 84 % of the boys had at least one symptom, and 75 % of boys had at least 2 symptoms in their lifetime. In girls, 81 % had one or more ODD symptoms. None of the participants met diagnostic criteria for CD or ODD.

Internalizing Problems

Information on internalizing problems was collected using the Child Behavior Checklist (CBCL; Achenbach 1991) and Youth Self-Report (YSR; Achenbach 1991). The CBCL is a caregiver rating scale composed of 112 items concerning a child’s behavior within the past 12 months. Items are rated on a 3-point scale (0 = not true, 1 = sometimes or somewhat true, 2 = very true or often true). Children completed the YSR, a 3-point scale containing a list of 118 specific problems in children and adolescents (Achenbach 1991). For the purposes of the present article, only the Anxious/Depressed, Withdrawal/Depressed, and Somatic Complaints subscales of the CBCL and YSR were used in our analyses.

Statistical Analyses

First, the raw scores of the 24 ICU items were subjected to CFA with maximum likelihood estimation using AMOS 21.0. A unidimensional model with all items loading on a single ICU factor (Model 1) was tested first. Next, a three-factor model in which items loaded on three distinct but correlated latent factors, Callousness, Uncaring, and Unemotional (Model 2), was examined. The third model (Model 3) was a three-factor bifactor model with an overarching general ICU dimension in addition to the three ICU factors. Five commonly used fit indices were used to assess the goodness of fit of all measurement models: the χ², the comparative fit index (CFI; Bentler 1990), the root mean square error of approximation (RMSEA; Steiger 1990), the chi-square/df ratio (CMIN/DF; Carmines and McIver 1981), and the goodness-of-fit index (GFI; Jöreskog and Sörbom 1989; Tanaka and Huba 1985). A non-significant χ² indicates a good fit. An adequate model fit is indicated by CFI > 0.90, RMSEA < 0.08, CMIN/DF < 3, and GFI > 0.90, and a good model fit is indicated by CFI > 0.95, RMSEA < 0.05, CMIN/DF < 2, and GFI > 0.95 (Hu and Bentler 1995; Schermelleh-Engel et al. 2003). Internal reliability was assessed using Cronbach’s α (< 0.60 indicates insufficient reliability, 0.60 to 0.69 marginal, 0.70 to 0.79 acceptable, 0.80 to 0.89 good, and > 0.90 excellent reliability; Barker et al. 1994). Second, items with an item-total correlation below 0.30 were considered to discriminate poorly and were eliminated (Nunnally and Bernstein 1994). The prior models were then again subjected to CFA with the remaining items. Third, multiple-group analyses were conducted to evaluate whether the factor structure of the revised ICU was equivalent across sex. The factor loadings, factor variances, factor covariances, and error variances and covariances were assessed incrementally. The change in CFI (ΔCFI) was used to evaluate the merits of the competing models; ΔCFI > 0.01 indicates that the unconstrained model should be accepted, meaning that sex differences exist (Cheung and Rensvold 2002). Independent-samples t-tests were then conducted to determine the effects of sex on the factor scores and total score. Finally, the validity of the ICU total score and its subscales was assessed by examining their correlations with the number of ODD/CD symptoms and internalizing problems. Partial correlations were also computed to examine these associations after controlling for sex. Separate analyses were conducted for the self-report and parent versions of the ICU.
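The decision rules in this section can be written down compactly. The sketch below encodes the fit cutoffs, the Cronbach's α bands, and the ΔCFI criterion as plain Python functions. All function names are hypothetical, and two simplifications are assumptions rather than statements of the authors' procedure: a model is labeled "adequate" or "good" only when all four indices meet the respective cutoffs, and the α bands are treated as contiguous (so 0.90 and above counts as excellent).

```python
def classify_fit(cfi, rmsea, cmin_df, gfi):
    """Label model fit using the cutoffs quoted in the text.

    Assumption: all four indices must meet a cutoff for the label to apply.
    """
    if cfi > 0.95 and rmsea < 0.05 and cmin_df < 2 and gfi > 0.95:
        return "good"
    if cfi > 0.90 and rmsea < 0.08 and cmin_df < 3 and gfi > 0.90:
        return "adequate"
    return "inadequate"


def alpha_band(alpha):
    """Map Cronbach's alpha onto the verbal bands quoted in the text.

    Assumption: bands are made contiguous (e.g. 0.695 counts as marginal).
    """
    if alpha < 0.60:
        return "insufficient"
    if alpha < 0.70:
        return "marginal"
    if alpha < 0.80:
        return "acceptable"
    if alpha < 0.90:
        return "good"
    return "excellent"


def invariance_supported(cfi_unconstrained, cfi_constrained, cutoff=0.01):
    """Delta-CFI rule: a change larger than `cutoff` favors the unconstrained
    model, i.e. the tested parameters differ between groups."""
    return abs(cfi_unconstrained - cfi_constrained) <= cutoff


# Hypothetical index values, for illustration only
label = classify_fit(cfi=0.93, rmsea=0.06, cmin_df=2.1, gfi=0.92)
```

In practice, researchers often weigh the indices jointly rather than mechanically, so a function like `classify_fit` is best read as a summary of the thresholds, not a substitute for judgment.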

Results

Self-Report ICU

A unidimensional model (Model 1) with all items loading on a single ICU factor showed an unsatisfactory fit. The three-factor model (Model 2) fit significantly better (Δdf = 3, Δχ² = 228.938, p < 0.001), although several fit indices were unacceptable (see Table 1 for model indices). Further, the three-factor bifactor model (Model 3) showed a better fit than the three-factor model (Δdf = 21, Δχ² = 143.651, p < 0.001), yet continued to provide an inadequate fit to the data. In addition, the internal consistencies for the original three subscales were inadequate. The item-total correlations were examined to identify poorly discriminating items (item-total correlation r < 0.30). Item 2 (r = 0.07), item 4 (r = 0.08), item 8 (r = 0.01), and item 10 (r = 0.11) for the Callousness factor; item 5 (r = 0.24) and item 13 (r = 0.22) for the Uncaring factor; and item 6 (r = 0.15) and item 22 (r = 0.11) for the Unemotional factor were removed in a stepwise fashion (Footnote 1). With these items eliminated, CFAs for the modified three-factor model (Model 4) and the modified three-factor bifactor model (Model 5) were performed. Model 4 fit significantly better than the original three-factor model (Model 2; Δdf = 148, Δχ² = 365.1, p < 0.001) and fit the data adequately. Similarly, Model 5 showed a better fit than the prior bifactor model (Δdf = 128, Δχ² = 209.284, p < 0.001) and fit the data satisfactorily. Model 5 did not fit better than Model 4 (Δdf = 1, Δχ² = −12.165, p = 1.000) and, compared to Model 4, had poorer values on all fit indices, items with low factor loadings (below 0.35), and loadings whose sign was opposite to the item content. The revised three-factor model without a hierarchical factor (Model 4) was therefore considered superior and retained for further analyses (Ezpeleta et al. 2013).
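The item screening described above can be sketched as follows. This is a hedged illustration, not the authors' procedure: it assumes the corrected item-total correlation (each item correlated with the sum of the remaining items of its subscale), removes the single weakest item per step, and recomputes after each removal. The function names and the toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def corrected_item_total(responses):
    """Correlate each item with the sum of the *other* items in the scale."""
    k = len(responses[0])
    result = []
    for j in range(k):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]
        result.append(pearson(item, rest))
    return result


def stepwise_prune(responses, cutoff=0.30):
    """Drop the weakest item below `cutoff`, recompute, and repeat.

    Returns the column indices of the retained items.
    """
    keep = list(range(len(responses[0])))
    while len(keep) > 2:
        subset = [[row[j] for j in keep] for row in responses]
        r = corrected_item_total(subset)
        worst = min(range(len(keep)), key=r.__getitem__)
        if r[worst] >= cutoff:
            break
        del keep[worst]
    return keep


# Hypothetical toy subscale: items 0-2 agree, item 3 runs against them
toy = [[0, 0, 0, 3], [1, 1, 1, 0], [2, 2, 2, 3], [3, 3, 3, 0], [1, 2, 1, 2]]
retained = stepwise_prune(toy)
```

Recomputing after every removal matters because dropping one weak item changes the "rest" score against which every other item is evaluated.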

Table 1 Fit indices comparing the measurement models for self-report (Top) and parent-report (Bottom) ICU

For the revised three-factor model (Model 4), internal consistency was insufficient for the Unemotional factor (α = 0.53), compared with α = 0.65 for the Callousness factor and α = 0.77 for the Uncaring factor. Following Houghton et al. (2013a, b), we clustered the Callousness and Unemotional items together and tested a model in which the Unemotional items loaded on the Callousness factor (Model 6). However, this model did not provide a sufficient fit, and factor loadings for the three Unemotional items on the Callousness factor were low (between −0.11 and 0.01). As a result, the entire Unemotional factor was deleted, and the resulting two-factor model (Model 7) showed an adequate fit. To further improve the fit, modification indices for Model 7 were reviewed to evaluate whether certain pairs of error terms with similar item content could be correlated. As a result, the error terms of two Uncaring items were correlated in Model 8 (Footnote 2). This model fit the data significantly better than Model 7, Δχ² (1) = 23.881, p < 0.001, and this final model (Model 8) fit the data well (see Table 1 for model indices). Table 2 shows factor loadings and reliabilities for the final model.
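The model comparisons above rest on the chi-square difference test: the difference in χ² between two nested models is referred to a chi-square distribution with degrees of freedom equal to the difference in df. The following standard-library sketch computes the corresponding p-value for integer degrees of freedom. The function name `chi2_sf` is hypothetical, and the closed-form recurrence used here is a convenience for illustration; dedicated statistical software (AMOS in this study) reports these values directly.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with an
    integer number of degrees of freedom `df`.

    Uses Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1), starting
    from the closed forms for df = 1 (erfc) and df = 2 (exponential).
    """
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)               # df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))    # df = 1
    while a < df / 2.0:
        # Work in log space so large df do not overflow
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return min(q, 1.0)


# Model 8 versus Model 7 as reported in the text: delta chi-square = 23.881, delta df = 1
p_value = chi2_sf(23.881, 1)
significant = p_value < 0.001
```

The test is only meaningful when the two models are nested and fitted to the same data, so a negative or near-zero Δχ² should be read as "no improvement" rather than interpreted numerically.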

Table 2 Factor loadings for final self-report (top) and parent-report (bottom) ICU models

Effect of Sex

A series of model comparisons indicated that the factor structure was invariant across sex (all ΔCFIs < 0.008; not shown for brevity, but available on request). Means for the revised Callousness factor, Uncaring factor, and total ICU were 2.37 (SD = 3.11), 3.40 (SD = 3.50), and 5.77 (SD = 4.70) for boys, and 2.66 (SD = 3.31), 2.41 (SD = 2.71), and 5.08 (SD = 4.84) for girls, respectively. Independent-samples t-tests showed that boys and girls did not differ significantly on the Callousness factor (t (321) = −0.81, p = 0.419, d = −0.09) or the total ICU (t (321) = 1.31, p = 0.190, d = 0.15). However, boys showed significantly higher Uncaring factor scores than girls, t (321) = 2.84, p < 0.05, d = 0.32.
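The effect sizes reported with these t-tests can be related to the test statistics through a common approximation. The sketch below converts an independent-samples t value into Cohen's d, assuming the two groups are of roughly equal size. The function name `d_from_t` is hypothetical, and the text does not state how the authors computed d; the approximation is shown only because it agrees with the reported values to two decimals.

```python
import math


def d_from_t(t, df):
    """Approximate Cohen's d from an independent-samples t statistic.

    Assumption: the two groups have (nearly) equal sizes, in which case
    d is approximately 2 * t / sqrt(df).
    """
    return 2.0 * t / math.sqrt(df)


# t values and df as reported in the text for the self-report ICU
d_callousness = d_from_t(-0.81, 321)
d_total = d_from_t(1.31, 321)
d_uncaring = d_from_t(2.84, 321)
```

By conventional benchmarks, the Uncaring difference is a small effect, which is worth keeping in mind when interpreting the statistically significant sex difference.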

Parent-Report ICU

The unidimensional model (Model 1), the three-factor model (Model 2), and the three-factor bifactor model (Model 3) did not provide sufficient fit to the data (see Table 1 for model indices). Item 2 (r = 0.26), item 8 (r = 0.16), item 10 (r = 0.08), and item 21 (r = 0.25) for the Callousness factor and item 6 (r = 0.26) for the Unemotional factor had low item-total correlations and factor loadings, and thus were removed in a stepwise fashion. With these items deleted, the modified three-factor model (Model 4) was superior to the original three-factor model (Δdf = 100, Δχ² = 454.058, p < 0.001) and the original three-factor bifactor model (Δdf = 79, Δχ² = 135.487, p < 0.001) but did not reach an acceptable fit. The modified three-factor bifactor model (Model 5) did not provide a better fit than the original bifactor model (Δdf = 81, Δχ² = 78.501, p = 0.558) and fit the data inadequately. In comparing the two modified models (Model 4 and Model 5), analyses suggested that some of the parameters for Model 5 were unsatisfactory (Δχ² was statistically non-significant, and some factor loadings were below 0.35 or had a sign opposite to the item content); thus the modified three-factor model without a hierarchical factor (Model 4) was considered superior. Internal consistency was sufficient for all factors in Model 4 (Callousness factor, α = 0.71; Uncaring factor, α = 0.83; Unemotional factor, α = 0.63). To further improve the fit, modification indices for Model 4 were reviewed, and 8 pairs of error terms were correlated (Footnote 3). This model with correlated error terms (Model 6) achieved an acceptable fit and fit significantly better than Model 4, Δχ² (8) = 159.021, p < 0.001. See Table 2 for factor loadings.

Effect of Sex

The factor structure was invariant across sex (all ΔCFIs < 0.005), although there was a sex difference in error variances (ΔCFI = 0.032). Specifically, a sex difference was found for the error variance of Callousness item 12 (Seems very cold and uncaring), suggesting that the reliability of this item was lower in boys. Means for the revised Callousness, Uncaring, and Unemotional factors and the total ICU were 2.05 (SD = 2.36), 9.77 (SD = 4.81), 4.04 (SD = 2.22), and 15.85 (SD = 7.57) for boys, and 1.82 (SD = 2.52), 8.92 (SD = 5.01), 3.64 (SD = 2.38), and 14.38 (SD = 7.81) for girls, respectively. Boys and girls did not differ on total ICU scores or any of the subscale scores, t < 1.75, p > 0.08.

Associations with Behavioral Problems

The associations between the revised ICU scale scores (self- and parent-report), internalizing behaviors, and ODD/CD symptoms are presented in Table 3. In general, higher levels of CU traits were related to more internalizing and ODD/CD symptoms, and these correlations remained largely unchanged after sex was controlled for. The cross-informant correlations were 0.23, 0.23, and 0.15 for the total ICU, Uncaring, and Callousness subscale scores, respectively (all p < 0.01). For the parent version, the Uncaring scale correlated with the Callousness scale at 0.42 (p < 0.001) and with the Unemotional scale at 0.49 (p < 0.001), and the Callousness scale correlated with the Unemotional scale at 0.25 (p < 0.001). For the self-report version, the Callousness scale correlated with the Uncaring scale at 0.13 (p < 0.05).
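Controlling for sex in these associations amounts to computing a first-order partial correlation. A minimal sketch, assuming the three zero-order Pearson correlations are already available (with sex coded as a 0/1 variable), is shown below. The function name `partial_corr` and the example values are hypothetical and are not taken from Table 3.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z.

    r_xy: zero-order correlation between the two variables of interest
    r_xz, r_yz: their correlations with the control variable z (e.g. sex)
    """
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0:
        raise ValueError("control variable is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denom


# Hypothetical values: an ICU score (x), a symptom count (y), and sex (z)
r_partial = partial_corr(r_xy=0.30, r_xz=0.10, r_yz=0.15)
```

When the control variable is only weakly related to both measures, the partial correlation stays close to the zero-order value, which is consistent with associations changing little once sex is controlled for.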

Table 3 Correlations between the revised ICU scales and behavioral problems

Discussion

In this study, we examined the psychometric properties of the self-report and parent versions of the ICU in a community sample of boys and girls. The main findings are (1) for the self-report ICU, a two-factor structure comprising one Callousness factor (7 items) and one Uncaring factor (6 items) fit adequately to the data and was found to be superior to the bifactor or three-factor model; the factor structure was invariant across sex, although boys scored higher on the Uncaring factor than girls; (2) for the parent-report ICU, a three-factor structure (Callousness, 7 items; Uncaring, 8 items; and Unemotional, 4 items) was supported; boys and girls did not differ on factor structure or mean levels of the three factors; (3) the revised ICU total score and factor scores were positively associated with ODD/CD symptoms and internalizing behavior problems. Findings suggest that future refinement of the ICU should take various raters into account.

In general, our findings on the self-report version bear some similarities to the results of Houghton et al. (2013a). First, all the Unemotional items were excluded from the final model. This factor had poor internal reliability, and none of the items making up the Unemotional factor loaded onto the other two factors. Houghton and colleagues argued that children younger than 12 years old might not be able to “feel” the (affective) emotions of others, and that “they may not have had the experience to be able to attribute these emotions to themselves or others” (Houghton et al. 2013a). Thus, self-report Unemotional items may not be useful in differentiating children with high or low CU traits. In fact, it has been argued that the Unemotional factor may be tapping into a construct distinct from CU traits, and further examination of this factor is needed to determine whether these items should be included in future (self-report) versions of the ICU (Kimonis et al. 2013).

At the same time, our findings differ from those of Houghton et al. (2013a) in a few important ways. First, we found boys to have higher scores than girls on Uncaring (i.e., a lack of caring about one’s performance in tasks and for others’ feelings), whereas Houghton et al. failed to find any sex difference in the mean levels of factor scores. Second, only one pair of error terms, rather than 8 pairs, was correlated in our final model. These differences may be due to several factors, including the different characteristics of the samples in the two studies. Houghton et al.’s study included Australian children from grades 3 to 7, with a wide age range from 7 to 13 years, and it was purely school-based. In contrast, our sample was community-based; it was composed of American children aged 8 to 10 years, the majority of whom were African-American and from low-SES families. These inconsistencies call for further study of the potential effects of ethnicity, social background, and developmental factors on the factorial structure of the self-report ICU.

For the parent-report version, we found that the three-factor model fit the data adequately after four Callousness items (Footnote 4) and one Unemotional item were deleted due to low item-scale correlations, and 8 pairs of error terms were correlated. The internal reliability was acceptable for two of the three subscales, with coefficient alphas of 0.83 for the Uncaring factor and 0.71 for the Callousness factor. The internal reliability was marginal (0.63) for the Unemotional factor. This is consistent with prior studies that also showed lower levels of reliability for the Unemotional factor than for the other two factors, partly due to its smaller number of items (e.g., Essau et al. 2006). Also in line with prior literature (Frick et al. 2003), the internal reliability estimates for the self-report scales were lower than those for the parent version (Table 2). Note that the cross-informant correlations for the ICU total score and subscales ranged from 0.15 to 0.23, slightly lower than the correlations found in past studies of CU traits (r = 0.29–0.57; Frick et al. 2003; Roose et al. 2010). Taken together, ratings from two different informants seem to provide overlapping but different perspectives on CU traits. Although youth self-reports may provide an introspective and private view of their own behaviors, our findings seem to suggest that parent reports on these traits may be more accurate and reliable, at least in children in this age range.

As expected, total ICU score and factor scores were all significantly associated with the number of ODD/CD symptoms. Positive relationships were also found between CU scores and internalizing behaviors, indicating that CU traits may relate to parents or children themselves endorsing them as being socially withdrawn, isolated, or low in mood. These findings are in line with prior research that links the occurrence of CU traits with ODD and CD symptoms (Barry et al. 2000; Feilhauer et al. 2012) and internalizing problems (Hawes et al. 2014; Waller et al. 2015), and advance this past work in showing that these associations extend to both self- and parent-report ICU in children under 12 years. The high correlations between ICU scores and counts of ODD/CD symptoms provide some support to the ICU’s convergent validity. However, its discriminant validity seems to be questionable because psychopathy and anxiety/depression are believed to be unique concepts and valid measures of each should not correlate too highly (Bagozzi et al. 1991). One possible explanation for these positive associations is that we did not examine the entire concept of psychopathy. Although CU traits are considered the central features of psychopathy, it is important to examine a broader range of psychopathic traits, including the grandiose-manipulative and daring-impulsive dimensions, to get a more comprehensive understanding of psychopathic traits in children and adolescents (Salekin 2015). Finally, it is worth noting that significant correlations emerging with self-reported ICU scores are predominately YSR scores, and those emerging with parent-reported ICU are predominately CBCL and DISC-IV scores (Table 3), indicating the influence of shared method variance.

The current findings should be considered in light of several limitations. First, the present study was community-based, so the findings may not generalize to children presenting with elevated CU traits, including those in clinical, institutional, or referral-based settings. In addition, African-American participants and low-income families were slightly overrepresented in our sample. Second, the reliabilities for some subscales were relatively low, although even lower reliabilities (around 0.40) have been reported in some studies (see review by Waller et al. 2013). These low reliabilities highlight the need for continued investigation of the construct of CU traits among youth samples. Third, the current study was cross-sectional. Additional time points of measurement would have allowed the investigation of trajectories of change over time. Future research should investigate how CU traits and their subfactors change with age, and link these changes to behavioral changes during adolescence and adulthood.

In summary, the current study compared two-factor, bifactor, and three-factor structures for the self-report and parent versions of the ICU in a community sample of children. Findings indicate that the factorial structure is the same for boys and girls, but differs across informants (i.e., children themselves vs. parents). In addition, the ICU total and factor scores were positively associated with ODD/CD symptoms and internalizing behavioral problems, demonstrating the ICU’s satisfactory convergent but questionable discriminant validity. Together with previous research, the current findings demonstrate better reliability for the parent version than for the self-report version of the ICU, and underscore the need to further refine this instrument for the two informants separately.