Psychopathy refers to a personality disorder that is characterized by affective (e.g., shallow affect, lack of empathy), interpersonal (e.g., grandiosity, manipulation), and behavioral (e.g., antisocial behavior, impulsivity) characteristics (Hare and Neumann 2008). Research indicates that psychopathic traits have been associated with violence, aggression, and severe and chronic criminal behavior in adult samples (Douglas et al. 2006; Porter and Woodworth 2006). Recently, the concept of psychopathy has been extended to youth and research indicates that the presence of callous-unemotional (CU) traits, tapping the affective features of psychopathy (Hare and Neumann 2008), may designate a group of antisocial youth with a more severe and chronic form of antisocial behavior that shows a poor response to typical treatments (Edens et al. 2007; Frick 2009; Frick and White 2008; Salekin and Lynam 2010). Further, CU traits have been associated with distinct cognitive and affective characteristics, which could suggest that children and adolescents with these traits have different causal factors leading to their behavior problems compared to other antisocial youths. For example, children with serious conduct problems who are elevated on CU traits tend to show blunted emotional reactivity to stimuli involving negative emotions, especially signs of distress in others, and they show an insensitivity to punishment cues; these characteristics are not found in children with serious conduct problems who show normative levels of CU traits (Frick et al. 2013). Thus, the presence or absence of non-normative levels of CU traits appears to designate important subgroups of youth with serious conduct problems, and this distinction is not presently captured in existing classification systems for diagnosing children with behavior problems.

Based on this research, the Fifth Edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) has added to the diagnostic criteria for Conduct Disorder (CD) a specifier to designate those individuals with the disorder who also show significant levels of CU traits (American Psychiatric Association, 2013). Specifically, persons with CD who also show two or more of the following four characteristics involving CU traits can be designated with the specifier “with Limited Prosocial Emotions”: lack of remorse or guilt, callous-lack of empathy, unconcern about performance in important activities, and shallow or deficient affect. With this change in the diagnostic criteria, it is likely that the assessment of CU traits in both research and practice will increase. As a result, evaluating measures which assess these traits is a critically important focus of research.
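The counting rule for the specifier can be written as a small check. The following Python sketch is illustrative only: the function name and the dictionary keys are hypothetical labels for the four characteristics listed above, and the code covers only the "two or more of four" count, not the CD diagnosis itself or any further requirement of the manual.

```python
# Illustrative sketch of the "two or more of four" counting rule.
# The key names below are hypothetical labels, not official identifiers.
LPE_CHARACTERISTICS = (
    "lack_of_remorse_or_guilt",
    "callous_lack_of_empathy",
    "unconcern_about_performance",
    "shallow_or_deficient_affect",
)


def meets_lpe_count(has_cd: bool, present: dict) -> bool:
    """Return True when CD is present and at least two characteristics are endorsed."""
    count = sum(1 for name in LPE_CHARACTERISTICS if present.get(name, False))
    return has_cd and count >= 2
```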

To date, CU traits have been assessed using several different formats, including parent and teacher rating scales (Frick et al. 2000; Lynam 1997), self-report scales (Andershed et al. 2002; Muñoz and Frick 2007), parent and youth structured interviews (Lahey et al. 2008), and clinician ratings (Forth et al. 2003). Some of the measurement formats are not amenable to large scale assessments in community samples, because they are time intensive (e.g., require extensive training and time consuming interviews) and require use of institutional records. Also, most of these measures have included only a limited number of items specifically assessing CU traits, often with as few as 4 (Forth et al. 2003) or 6 (Frick et al. 2000) items. Further, and possibly owing to this limited item pool, measures of CU traits often have had some significant psychometric limitations, such as displaying poor internal consistency in some response formats (Poythress et al. 2006).

To provide a more comprehensive assessment of CU traits that can be used in various settings, the Inventory of Callous-Unemotional Traits (ICU; Kimonis et al. 2008) was developed from the 4 items that had been found to be highly indicative of the construct across different assessment methods and samples (Forth et al. 2003; Frick et al. 2000). These four items also directly correspond to the symptoms included in the proposed specifier for the diagnosis of CD. Next, for each of these 4 items, 6 new items were created which were equally divided into positively and negatively worded items to avoid potential response biases caused by ratings all being made in a single direction. The resulting 24 items are rated on a 4-point Likert scale, which avoids the possibility of a middle rating.

To date, several studies have tested the construct validity of the ICU using factor analyses. In a sample of 1,443 German adolescents aged 13–18 (Essau et al. 2006), factor analyses produced three factors: callousness (capturing a lack of empathy and remorse), uncaring (capturing an uncaring attitude about performance on tasks and others’ feelings), and unemotional (capturing deficient emotional affect). Further, the results indicated that a bi-factor model fit the data best, indicating that the three subfactors loaded onto a broad overarching CU factor. Kimonis et al. (2008) replicated the same bi-factor structure in a sample of 248 juvenile offenders aged 12–20 from the United States, as did Fanti et al. (2009) in a community sample of Greek Cypriot adolescents aged 12–18. In both of these studies, only the self-report version of the ICU was used. Roose et al. (2010) investigated the factor structure of the ICU using both self- and other (i.e., parents and teachers) reports in a community sample of 455 Dutch adolescents between the ages of 14 and 20. Again, the same factor structure was found and it generalized across informants.

Thus, the factor structure of the ICU appears to be fairly robust across languages and raters (see Feilhauer et al. 2012 for an exception in another Dutch sample). However, there are several notable limitations in this research that would be important to address, especially given the potential inclusion of CU traits in diagnostic classification. First, to date only one study has explicitly tested the invariance of the factor structure of the scale across gender. Specifically, Essau et al. (2006) reported that the factor structure of the ICU was invariant for boys and girls. Also, the samples used to test the structure of the ICU have largely been confined to adolescent samples. Thus, further testing of the construct validity of the ICU, and thus, the structure of CU traits, across gender and age groups is important.

Several studies have tested the differential correlations of the subscales of the ICU (i.e., callousness, uncaring, and unemotional) with important external criteria to clarify the commonalities and differences across these dimensions. Specifically, in past research both the callousness and uncaring subscales have been correlated with antisocial, aggressive, and delinquent behavior (Essau et al. 2006; Fanti et al. 2009; Kimonis et al. 2008; Roose et al. 2010). In contrast, the uncaring and unemotional scales show the strongest negative associations with measures of empathy (Kimonis et al. 2008; Roose et al. 2010). These studies are important for understanding how the dimensions which form the construct of CU traits are differentially related to various aspects of psychopathological and personality functioning. However, this past research has a number of limitations.

First, most studies have relied mainly on self-report for the validation measures of the ICU. Thus, some of the correlations with the ICU may be inflated due to shared method variance in correlating the self-report of the ICU with other self-report measures. Second, some measures important for a child’s adjustment in school classrooms (i.e., bullying; academic achievement; classroom misbehavior) have not been tested extensively in terms of their association with the various dimensions of CU traits, possibly because past research has focused largely on older adolescent samples. One exception was the study by Fanti et al. (2009), which reported that both the callousness and uncaring dimensions, but not the unemotional dimension, were associated with bullying. Also, tests with other measures of school adjustment, including academic achievement, are important because many indicators of CU traits included on the ICU, especially on the uncaring factor, focus on a child’s attitudes toward schoolwork (e.g., “I care about how well I do at school or work”). Third, past research has not consistently tested the associations between scores on the ICU and important external criteria, controlling for the correlations among the subscales. Thus, it is not clear how much unique variance each subscale contributes to the prediction of important measures of adjustment.

To begin to address these limitations in the existing research, the current study tested the factor structure of another translation (i.e., Italian) of the ICU in a large sample (n = 540) of middle school children (M = 12 years and 7 months, SD = 1 year and 3 months). Based on past research, we predicted that the factor structure found consistently in past samples of an overarching CU dimension with three sub-factors would be replicated in this sample. However, we directly tested whether the factor structure of the ICU, and thus the structure of CU traits, was invariant across age and gender. Further, this study included external criteria to validate the ICU that have been neglected in past studies but are important for a child’s school adjustment, including academic achievement, school discipline, aggression, and bullying. Based on past research we predicted that school discipline problems, bullying, and aggression would be associated with CU traits but this would be stronger for the callousness and uncaring dimensions than for the unemotional dimension. Importantly, in the current study, we included measures of bullying from both self and peer reports and we included measures of various types of bullying (i.e., direct, indirect, and cyber) to both provide a more comprehensive measure of bullying and to provide an assessment that did not rely solely on the child’s self-report. Although not tested in past research, we predicted that school achievement would be most strongly and negatively associated with the uncaring subscale of the ICU, due to this dimension capturing a lack of motivation to perform in important activities up to others’ expectations. Finally, both the overall association (i.e., zero-order correlations), as well as the unique associations controlling for the shared variance of the other subscales, were tested for all of the measures of adjustment.

Method

Participants and Procedures

The sample consisted of 540 children in middle school (grades 6 and 8) in Tuscany (Central Italy). Girls comprised 52.6 % of the sample and participants ranged in age from 10 years and 6 months to 16 years and 2 months, although the majority of the sample was between the ages of 11 and 14 (M = 12 years and 7 months, SD = 1 year and 3 months). Almost all of the sample was Italian (90.93 %); the other students were East-European or Balkan (5.93 %), African (1.85 %), and South-American or Asian (less than 1 % each). The sample was diverse with regard to parental educational level but representative of families in the school district. For father’s highest level of education, 3.5 % reported primary school, 41.3 % reported middle school, 38.3 % reported high school, and 9.4 % had obtained a university degree. For mother’s highest educational level, 1.9 % reported primary school, 36.7 % reported middle school, 41.9 % reported high school, and 13.7 % had obtained a university degree.

Prior to data collection, institutional review board approval was obtained for all study procedures. Students were contacted through letters that were sent home with attached consent forms for parents. Written parental consent was obtained and children’s participation was voluntary. Parents and children were not compensated in any way for study participation. Students completed the questionnaires individually in their classroom and the order of administration was counterbalanced across classrooms. The questionnaires were administered by trained assistants who ensured the anonymity of answers.

Measures

Callous-Unemotional Traits

Callous-Unemotional (CU) traits were measured using the Inventory of Callous-Unemotional Traits (ICU; Kimonis et al. 2008). As noted above, the ICU is a 24-item self-report measure with items scored on a four-point scale (0 = “not at all true,” 1 = “somewhat true,” 2 = “very true,” and 3 = “definitely true”). The reliability and construct validity (i.e., factor structure, correlations with aggression and delinquency) of the ICU have been supported in several different samples using different translations (Essau et al. 2006; Fanti et al. 2009; Kimonis et al. 2008; Roose et al. 2010). Across samples and languages, the best fitting factor structure shows a general callous-unemotional factor and three subfactors: callousness (e.g., “the feelings of others are unimportant to me”), unemotional (e.g., “I hide my feelings from others”), and uncaring (e.g., “I try not to hurt others’ feelings”, a reverse-scored item).
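As an illustration of how a 0–3 rated scale with reverse-scored items can be scored, a minimal Python sketch follows. It is not the authors' scoring script. The function name, the item numbers in the usage note, and the choice of a mean item score (rather than a sum) are assumptions made for the example; the source does not list which ICU items are reverse-scored.

```python
# Sketch of scoring a questionnaire rated 0-3 with reverse-keyed items.
# Which items are reverse-keyed is supplied by the caller; the source text
# does not list them, so any set passed in is an assumption.
MAX_RATING = 3


def score_scale(responses: dict, items: list, reversed_items: set) -> float:
    """Mean item score over `items`, flipping reverse-keyed items (0<->3, 1<->2)."""
    values = []
    for item in items:
        raw = responses[item]
        if not 0 <= raw <= MAX_RATING:
            raise ValueError(f"item {item}: rating {raw} is outside 0-{MAX_RATING}")
        values.append(MAX_RATING - raw if item in reversed_items else raw)
    return sum(values) / len(values)
```

For example, with hypothetical responses `{1: 0, 2: 3, 3: 1}` and item 1 treated as reverse-keyed, the three scored values are 3, 3, and 1.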

Academic Achievement

In Italian middle schools, each teacher evaluates each student’s achievement on the basis of written and oral tests. A summary evaluation is made using a 10-point Likert scale, with 6 representing the cutoff for sufficient academic performance. Twice within a school year students receive an official document called a “report card” with this summary evaluation for all academic subjects (Italian, History, Geography, Math, Science, English, Technology, Art, Music, Physical Education). The measure of academic achievement that we used in analyses is the mean rating across all subjects obtained in the previous 3 months.

Formal Warnings

Conduct problems were assessed through the number of formal warnings each child received by teachers on the class register during the previous 3 months. In Italian schools, formal warnings are given in the cases of severe and/or repeated violation of school rules, as defined by a Presidential Decree called “Lo Statuto delle studentesse e degli studenti” (24 June 1998, n. 249, subsequently supplemented and amended in 2007). Behaviors that can lead to a formal warning are: physical and verbal aggressions against adults and peers, school and classmates’ property damage, repeated disruption during academic lessons, repeated failure to turn in homework, and unexcused absences.

Self-reported Traditional Bullying and Cyberbullying

Involvement in traditional bullying was measured by an 11-item self-report questionnaire (Menesini et al. 2012). Students indicated whether they had bullied others by any of 11 behaviors during “the previous 2 or 3 months” (e.g., having hit or beaten someone up; having called someone bad or nasty names) using a 5-point Likert-type scale: never, only once or twice, two or three times a month, about once a week, several times a week. A similar section consisting of 10 items (e.g., nasty text messages; phone pictures/photos/videos of violent scenes) was used to assess involvement in cyberbullying (Menesini et al. 2011). Past work has found a mono-factorial structure for both measures (alpha = .76 and alpha = .75). Within the present sample the alphas were .71 for traditional bullying and .71 for cyberbullying. For the latter, one item (i.e., verbal or texted pranks using a cell phone) was removed in order to improve the reliability of the scale (from .66 to .71). The deleted item was then left out of the cyberbullying score in all subsequent analyses.

Bullying Nominations

Using a time frame of the previous 2–3 months, children were asked to indicate up to 6 classmates they believed directly bullied other children. The same procedure was used to assess indirect bullying. The description of direct and indirect bullying behaviors was adapted from Wolke et al. (2000) and Woods et al. (2009). For direct bullying, children were asked to nominate classmates who frequently “hit/beat up, stole belongings, threatened, blackmailed, played nasty tricks”; for indirect bullying children were asked to nominate classmates who frequently “called nasty names, deliberately left out of games, withdrew friendship, and spread nasty rumors”. Participants produced the names of nominated peers in their classrooms with the aid of classroom rosters. A direct and indirect bullying score was computed for each student by summing the number of bullying nominations received and standardizing each score within grades to a mean of 0 and standard deviation of 1.
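The within-grade standardization described above can be sketched as follows. This is an illustration in Python rather than the authors' procedure: the record layout and function name are hypothetical, and the population standard deviation is assumed because the text does not say which estimator was used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_within_grade(records):
    """records: list of (student_id, grade, nominations_received) tuples.

    Returns {student_id: z-score computed within the student's own grade}.
    Assumption: population SD (pstdev); the source does not specify this.
    """
    by_grade = defaultdict(list)
    for _, grade, count in records:
        by_grade[grade].append(count)
    stats = {g: (mean(v), pstdev(v)) for g, v in by_grade.items()}

    z_scores = {}
    for student, grade, count in records:
        m, sd = stats[grade]
        # When every student in a grade has the same count, set z to 0.
        z_scores[student] = 0.0 if sd == 0 else (count - m) / sd
    return z_scores
```

Standardizing within grade puts nomination counts from grades of different sizes on a common metric before they are correlated with other measures.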

Peer Interpersonal Assessments

Peer interpersonal assessments were used to determine classmates’ perceptions of peers’ social and behavioral characteristics, popularity, and emotionality. Using a time frame of the previous 2–3 months, students were asked to nominate up to 6 classmates who best fit descriptors for eighteen items. They were told that they could nominate the same person for more than one item. In the present study, five items were used in analyses. One item, labeled reactive aggression, described classmates who frequently “react aggressively when teased”. In addition, four items, derived by Kaukiainen et al. (1999) and labeled prosocial emotions and behaviors, are nominations for classmates who frequently “…helps classmates in trouble”; “…is able to feel joy about the success of others”, “… comforts others when they are sad”, “…gets upset when she/he sees another child being hurt”.

As was the case for the bullying measure, participants produced the names of nominated peers in their classrooms with the aid of classroom rosters. Each child obtained a score for the number of nominations for reactive aggression or the number of nominations across the four items assessing prosocial behaviors. Again, each score was standardized within grades to a mean of 0 and standard deviation of 1. Cronbach’s alpha for the measure of prosocial emotions and behaviors was .85; a mean score of this measure was computed for each participant.
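For readers unfamiliar with the internal consistency coefficient reported here, the standard formula for Cronbach's alpha is alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. A minimal Python sketch is given below; the function name and input layout are assumptions for illustration, and sample variances are used throughout.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """item_scores: list of per-item lists, each with one score per respondent."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Total score for each respondent across all items.
    totals = [sum(person) for person in zip(*item_scores)]
    item_var_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))
```

Applied to four nomination items, the function returns a single coefficient comparable in form to the value reported above.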

Data Analyses

Prior to conducting the factor analyses, the English version of the ICU was translated into Italian and then back translated by a native English speaker. In order to examine whether the structure of the Italian version was similar to the one which previously emerged, a series of confirmatory factor analyses were performed. The analyses were conducted using R statistical software (R-Core Team 2012).

Prior to the factor analysis, the distribution of the 24 ICU items was examined and several items (4, 7, 9, 11, 12, 15, 16, 17, 18, 20, 21) were log-transformed because their skewness and kurtosis values fell well outside the normal range of −1.00 to +1.00. Next, we explored whether any items showed low item-total correlations that could contribute to poor fit; item 2 (i.e., “What I think is right and wrong is different from what other people think”) and item 10 (i.e., “I do not let my feelings control me”), both from the callousness dimension, showed low values (r = .11 and r = .09, respectively) and were deleted from the subsequent analyses. Importantly, the same items were also found to be unrelated to the other items in past samples (Kimonis et al. 2008).
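The distribution screen described above can be illustrated with a short Python sketch. It assumes moment-based skewness and excess kurtosis; the estimator actually used by the authors is not stated, and the function names are hypothetical.

```python
def skew_kurtosis(values):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("all values are identical; moments are undefined")
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def outside_range(values, limit=1.0):
    """Flag a variable whose skewness or excess kurtosis falls outside +/- limit."""
    skew, kurt = skew_kurtosis(values)
    return abs(skew) > limit or abs(kurt) > limit
```

An item on which nearly all respondents answer 0 and a few answer 3 would be flagged by this check because of its strong positive skew.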

Next, several factor models were compared. The first model was a single-factor model in which all items load onto a general factor representing the CU traits (Model 1). The second model specified the presence of three intercorrelated factors (Callousness, Uncaring, and Unemotional) (Model 2). The third model was a bifactor model in which all items load onto a general factor, as well as on the three above-mentioned factors (Model 3). The final model was a hierarchical G-model, in which the correlations of the three first order factors were subsumed by a second order factor (with modification indices) (Model 4). To compare the fit of each model, several fit indices were used to overcome the limitations of each index (Hooper et al. 2008; Marsh et al. 1996). From the family of absolute fit indices (which determine how well an a priori model reproduces the sample data), we chose the relative chi-square (χ2/df), the goodness-of-fit index and the adjusted goodness-of-fit index (GFI and AGFI; Jöreskog and Sörbom 1989), the root mean square error of approximation (RMSEA; Steiger and Lind 1980), and the standardized root mean square residual (SRMR; Bentler 1995). From the family of comparative (or incremental) fit indices (which compare the tested model to a baseline one) we chose the normed fit index (NFI; Bentler and Bonett 1980), the non-normed fit index (NNFI; Bentler and Bonett 1980) and the comparative fit index (CFI; Bentler 1990). From the family of parsimony fit indices (which indicate, when different models are compared, which one is the most parsimonious) we adopted the Akaike information criterion (AIC; Akaike 1987). A good model fit is indicated by a relative chi-square value between 2 and 3, GFI and AGFI values exceeding .90, RMSEA and SRMR of .08 or lower, and NFI, NNFI and CFI values exceeding .90; moreover, the model with the minimum value of AIC is regarded as the best fitting model (Byrne 1994; Hooper et al. 2008).
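The cutoffs listed above can be collected into a simple screening function. The Python sketch below is illustrative: the dictionary keys and function names are hypothetical, the relative chi-square criterion is read as "no greater than 3" (an assumption about the intended upper bound), and no values from Table 1 are reproduced.

```python
def evaluate_fit(ind: dict) -> dict:
    """Apply the cutoffs described in the text to one model's fit indices.

    `ind` must provide: chi2, df, GFI, AGFI, RMSEA, SRMR, NFI, NNFI, CFI.
    Returns a dict mapping each criterion to True (met) or False (not met).
    """
    return {
        "chi2/df": ind["chi2"] / ind["df"] <= 3.0,  # assumed upper bound of 3
        "GFI": ind["GFI"] > 0.90,
        "AGFI": ind["AGFI"] > 0.90,
        "RMSEA": ind["RMSEA"] <= 0.08,
        "SRMR": ind["SRMR"] <= 0.08,
        "NFI": ind["NFI"] > 0.90,
        "NNFI": ind["NNFI"] > 0.90,
        "CFI": ind["CFI"] > 0.90,
    }


def best_by_aic(aic_by_model: dict) -> str:
    """Return the name of the model with the smallest AIC."""
    return min(aic_by_model, key=aic_by_model.get)
```

Because the indices have different limitations, a model is usually judged on the overall pattern of criteria met rather than on any single one.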

After determining the best fitting model in the overall sample, we examined the generalizability of the factor structure across gender and age. This was done by using two multi-group analyses, following the methodology presented by Evermann (2010). Specifically, we compared a model which allowed the factor coefficients to differ across groups to a model which constrained them to be equal. Because the multiple-group procedure used in R allowed comparisons only between sub-samples of equal size, we randomly extracted two sub-samples of 200 boys and 200 girls to test for gender differences and two sub-samples of 200 students in 6th grade and 200 students in 8th grade to test for grade differences. Next, to explore gender and grade differences we performed a 2 × 2 (Gender × Grade) ANOVA for the ICU total score and a 2 × 2 (Gender × Grade) MANOVA for the three ICU subscale scores.

Finally, the validity of the ICU total score and its subscales was further explored by examining their correlations with the measures of behavioral and academic adjustment. Spearman’s Rho was used to test the associations because several scales were not normally distributed. These associations were explored overall in the sample and then separately for boys and girls. The associations with the subscales of the ICU were also tested controlling for the other subscales to determine each scale’s unique contribution to the prediction of the various measures of adjustment.
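One common way to obtain a rank-based partial correlation is to convert every variable to ranks (averaging ties), compute Pearson correlations on the ranks, and then remove the control variables with the recursive partial-correlation formula. The Python sketch below follows that approach as an assumption; the source does not describe the exact computation, and all function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, controls):
    """Recursive partial correlation of x and y given a list of control variables."""
    if not controls:
        return pearson(x, y)
    z, rest = controls[0], controls[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_spearman(x, y, controls):
    """Partial correlation computed on ranks (assumed form of partial Spearman's rho)."""
    return partial_corr(average_ranks(x), average_ranks(y),
                        [average_ranks(c) for c in controls])
```

In the present design, `x` would be one ICU subscale, `y` an adjustment measure, and `controls` the two remaining subscales.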

Results

Confirmatory Factor Analyses

The fit indices for the various factor models tested are provided in Table 1. The fit indices for the single factor Model 1 were generally not acceptable (CFI, NFI and NNFI values were lower than .90, whereas χ2/df was 4.00). Model 2, which specified three factors, showed a significantly improved fit compared to Model 1 (Δχ2 = 219.51, Δdf = 3, p < .001) but the fit indices still did not show support for acceptable model fit. Model 3, the bifactor model, showed a significant improvement in fit (Δχ2 = 190.41, Δdf = 19, p < .001) and demonstrated overall acceptable fit indices; nevertheless, several factor loadings were not statistically significant (see Footnote 1). Considering this, we tested Model 4, a hierarchical factor model in which modification indices were utilized (seven covariances between error variables of items were allowed). Overall, the fit indices indicated adequate fit and all factor loadings were significant (p < .01) and exceeded .35. The factor loadings for this final model are provided in Table 2. The high and positive correlations between the three first order factors (see Table 4) further supported our use of this hierarchical model as the best fitting model.

Table 1 Model Fit indices for confirmatory factor analyses
Table 2 Factor loadings from a three factor hierarchical model

Thus, the factor analyses suggested that the best fitting model was one with three factors of callousness, uncaring, and unemotional loading on a general callous-unemotional factor. Further, this factor structure proved to be invariant across gender (Δχ2 = 33.28, Δdf = 25, p = .12) and grade (Δχ2 = 8.89, Δdf = 25, p = .99). Descriptive statistics showing the internal consistency and distribution of the ICU scales are provided in Table 3. The internal consistency of the ICU total score (alpha = .81) and the uncaring dimension (alpha = .72) were respectively good and acceptable, whereas the coefficients of callousness and unemotional (respectively, alpha = .66 and alpha = .64) were marginal but sufficient (Barker et al. 1994). In this community sample, the distribution of the ICU scales did not deviate significantly from normality.
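The invariance tests above are chi-square difference tests, in which the p value is the upper-tail probability of Δχ2 on Δdf degrees of freedom. The Python sketch below recomputes that probability from the reported Δχ2 and Δdf values using only the standard library. The series used for the incomplete gamma function is a standard result, but the function name is hypothetical and precision is limited when the p value is extremely small.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable with df degrees of freedom.

    Uses the series for the regularized lower incomplete gamma function:
    P(a, z) = z**a * exp(-z) / Gamma(a) * sum_{n>=0} z**n / (a * (a+1) * ... * (a+n)).
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-16 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


# Reported chi-square differences for the two invariance tests.
p_gender = chi2_sf(33.28, 25)
p_grade = chi2_sf(8.89, 25)
```

A non-significant difference means that constraining the factor coefficients to be equal across groups does not significantly worsen model fit, which is the basis for the invariance conclusion.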

Table 3 Descriptive statistics for study variables

Finally, we explored gender and grade differences. From an ANOVA testing differences for the total ICU, boys (M = .93, SD = .37) showed higher levels than girls (M = .72, SD = .33), F(1, 538) = 48.04; η2 = .08; p ≤ .001. No grade differences emerged: younger students (M = .81, SD = .37), older students (M = .82, SD = .36), F(1, 538) = .01; η2 = .00; p = .92. From a MANOVA testing differences across the three ICU subscales, main effects of gender (Pillai’s Trace = .09; F(3, 533) = 16.42; η2 = .09; p ≤ .001) and grade (Pillai’s Trace = .02; F(3, 533) = 4.04; η2 = .02; p < .05) both emerged. Boys showed higher scores than girls on the uncaring subscale: boys (M = .87, SD = .50), girls (M = .66, SD = .42), F(1, 538) = 28.75; η2 = .05; p ≤ .001; and on the unemotional subscale: boys (M = 1.52, SD = .61), girls (M = 1.31, SD = .62), F(1, 538) = 15.68; η2 = .03; p ≤ .001. Students enrolled in 8th grade showed higher uncaring scores than students in 6th grade: younger students (M = .72, SD = .49), older students (M = .80, SD = .46), F(1, 538) = 3.85; η2 = .01; p ≤ .05.
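The effect sizes reported with the univariate F tests can be checked against the identity for partial eta squared, η2 = (F × df1) / (F × df1 + df2). The Python sketch below applies it to the F values reported above; treating the reported η2 values as partial eta squared is an assumption, and the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values reported in the text, each with (1, 538) degrees of freedom.
reported_f = {
    "ICU total, gender": 48.04,
    "uncaring, gender": 28.75,
    "unemotional, gender": 15.68,
    "uncaring, grade": 3.85,
}
effect_sizes = {name: round(partial_eta_squared(f, 1, 538), 2)
                for name, f in reported_f.items()}
```

Under this assumption the recomputed values round to .08, .05, .03, and .01, matching the effect sizes reported for these four tests.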

Associations with School Outcomes and Behavioral Adjustment

The correlations between the ICU scores and the measures of academic and behavioral adjustment were tested using Spearman’s Rho; they are reported in Table 4 for the full sample and in Table 5 for boys and girls separately. The ICU total score, as well as the callousness and uncaring subscales, showed negative associations with academic achievement and prosocial behavior and positive associations with formal warnings and with all the self- and peer-report measures of bullying and aggression. In contrast, the unemotional subscale was negatively associated with prosocial behaviors, and positively associated (although modestly) with peer-nominated direct bullying.

Table 4 Bivariate correlations (Spearman’s Rho) in the full sample (n = 540)
Table 5 Bivariate correlations (Spearman’s Rho) for boys (n = 254) and girls (n = 286)

When these associations were examined for boys and girls separately (Table 5), for boys, the correlations were generally very similar to those found in the full sample, although the uncaring scale was the only one to be significantly associated with formal warnings. For girls, however, there were several differences from those found with boys and the total sample. Specifically, none of the ICU scales were associated with aggression in girls and the callousness subscale was not associated with either self-report of cyberbullying or peer report of direct bullying.

Finally, partial correlations were used to examine the unique associations of each of the ICU subscales with the measures of academic and behavioral adjustment, while controlling for the other subscales. These partial correlations are reported in Table 6 for the full sample and for boys and girls separately. Overall, the uncaring dimension accounted for unique variance in all of the adjustment measures. In contrast, the callousness subscale accounted for unique variance in the measure of academic achievement, formal warnings, self-report of bullying, and peer-report of prosocial behaviors. The unemotional scale only contributed significantly to the prediction of peer-reported prosocial behaviors.

Table 6 Partial correlations (Spearman’s Rho) for the total sample (n = 540) and for boys (n = 254) and girls (n = 286) separately

The relative importance of the different subscales was not, however, consistent across boys and girls. Specifically, in boys callousness contributed uniquely to the prediction of both self-reported and peer-reported bullying. Further, in boys the uncaring subscale contributed uniquely to the prediction of self-reported bullying and cyberbullying, as well as to the prediction (negatively) of peer-reported prosocial behaviors. Importantly, in boys the uncaring dimension was the only subscale to contribute uniquely to the prediction of academic achievement (negatively), as was predicted. However, in girls, callousness was the only scale to contribute to the prediction of academic achievement, as well as to the prediction of formal warnings and prosocial behaviors. Further, the uncaring dimension was uniquely associated with all of the measures of bullying and aggression in girls. Also in girls, the unemotional dimension was uniquely and negatively associated with peer-nominated reactive aggression.

Discussion

The first aim of this study was to examine the factor structure of a comprehensive measure of CU traits (the ICU; Kimonis et al. 2008) for the first time using an Italian translation. Consistent with past research, the confirmatory factor analyses largely supported the factor structure found in other samples with other translations (Essau et al. 2006; Fanti et al. 2009; Kimonis et al. 2008; Roose et al. 2010; see Footnote 2). Specifically, the best fitting model was one that specified an overarching callous-unemotional dimension and three sub-dimensions of callousness, uncaring, and unemotional. Importantly, although boys tended to score higher on most of the ICU scales, consistent with past research (e.g., Essau et al. 2006; Viding et al. 2009), the structure of the ICU was invariant for boys and girls and for students in grade 6 and students in grade 8. These results, combined with the results from past factor analyses, provide strong support for the structure of the ICU across languages, types of samples, gender, and age. In fact, a recent publication supported this factor structure for parent report on the ICU in a sample as young as ages 3 and 4 (Ezpeleta et al. 2012).

Given this consistent support for this factor structure of the ICU and its implications for understanding the structure of CU traits, it is important that research continues to explore the differential associations of the total score and subscales with theoretically and practically important variables. In the current study the total scale, as well as the callousness and uncaring scales, were positively associated with school behavior problems, bullying, and reactive aggression consistent with much past research (Essau et al. 2006; Fanti et al. 2009; Kimonis et al. 2008; Roose et al. 2010). The current results advance this past work in showing that these associations extend to cyberbullying and to bullying reported by both self-report and by peer nominations. This latter finding is particularly important for showing that the associations are not solely due to shared method variance across self-report measures. Further, the current study was the first to show that, in addition to being associated with problems in behavioral adjustment in the classroom, the ICU scales were also associated with lower levels of academic achievement, although again this was largely accounted for by the callousness and uncaring dimensions.

These associations with behavior and academic adjustment were largely similar for both boys and girls. A notable exception was that none of the ICU scales were associated with reactive aggression in girls in the current sample. This finding may have been due to the aggression measure focusing on only reactive aggression (i.e., in response to being teased) and not proactive aggression which is not in response to provocation (Crapanzano et al. 2010). It could also be due to the failure to distinguish between physical and relational (i.e., aggression designed to hurt others’ peer relationships) aggressive responses to teasing. Specifically, previous research has found significant associations between CU traits and relational aggression in samples of girls (Marsee et al. 2011; Marsee and Frick 2007). Also consistent with this explanation is the finding that ICU scores were more consistently correlated with measures of indirect (verbal) bullying, as compared to direct bullying, in girls.

A final focus of the current study was to explore the unique variance accounted for (i.e., association controlling for the other dimensions) in measures of behavioral and academic adjustment by the three dimensions of the ICU. Two results are of note from these analyses. First, in the full sample, the uncaring dimension was the subscale that most consistently accounted for significant unique variance in the measures of behavioral and academic adjustment. This finding would support the greater weighting of this dimension in the proposed specifier for CD, which includes two symptoms related to the uncaring dimension (i.e., lack of remorse or guilt, unconcern about performance in important activities) but only one symptom related to either the callousness (i.e., callous-lack of empathy) or unemotional (i.e., shallow or deficient affect) dimensions (Frick and Nigg 2012). Second, problems in academic achievement were best predicted by the uncaring dimension in boys, whereas the callousness dimension showed the strongest unique association with problems in academic achievement in girls. This finding in boys was predicted, given that the uncaring items focus on a lack of concern about the consequences of behavior, as well as a failure to put forth the effort to perform well in important activities. However, the association with callousness in girls was unexpected. Interestingly, callousness was also uniquely associated with lower levels of peer-reported prosocial behaviors in girls. One interpretation of this finding is that performing well in academic subjects and behaving in a prosocial manner are more expected or valued in girls (Watt et al. 2012) and, as a result, failing to do so requires a more callous disregard for the expectations of others in girls than in boys.
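The notion of unique variance used above can be illustrated with a minimal sketch. It assumes that unique variance is computed as the squared semipartial correlation, that is, the drop in R² when one predictor is removed from a regression containing all three dimensions; the analyses reported in this study may have used a different procedure. The data are simulated, and the variable names, sample size, and effect size are hypothetical and are not taken from the study.

```python
import numpy as np


def r_squared(X, y):
    """Proportion of variance in y explained by an OLS fit on the columns of X."""
    X1 = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def unique_variance(X, y):
    """Squared semipartial correlation per predictor: full R^2 minus R^2 without it."""
    full = r_squared(X, y)
    return np.array(
        [full - r_squared(np.delete(X, j, axis=1), y) for j in range(X.shape[1])]
    )


# Simulated (hypothetical) scores: callousness and uncaring share variance,
# and only uncaring has a direct effect on the outcome.
rng = np.random.default_rng(0)
n = 500
shared = rng.normal(size=n)
callousness = shared + rng.normal(size=n)
uncaring = shared + rng.normal(size=n)
unemotional = rng.normal(size=n)
X = np.column_stack([callousness, uncaring, unemotional])
outcome = 0.5 * uncaring + rng.normal(size=n)

uv = unique_variance(X, outcome)
for name, value in zip(["callousness", "uncaring", "unemotional"], uv):
    print(f"{name}: unique R^2 = {value:.3f}")
```

In this simulation callousness correlates with the outcome only through its overlap with uncaring, so its unique contribution is close to zero once uncaring is controlled. This is the pattern that a zero-order correlation alone cannot reveal.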

A consistent finding in the current study was how few correlations emerged between the unemotional subscale and the measures of behavioral and academic adjustment. Such findings could argue against the importance of this dimension when trying to predict problems in adjustment and could call into question its inclusion as part of the proposed specifier for the diagnosis of CD (Ezpeleta et al. 2012). Importantly, however, this scale did show consistent negative associations with prosocial behaviors, which is consistent with past studies showing this scale to be negatively correlated with measures of empathy (Kimonis et al. 2008; Roose et al. 2010). Thus, this dimension could be important for being more specifically related to an absence of prosocial emotions and behaviors rather than to the presence of antisocial behaviors. Further, the unemotional subscale has also been negatively associated with negative affectivity (Essau et al. 2006; Roose et al. 2010) and with emotional reactivity to distressing stimuli (Kimonis et al. 2008), which could suggest that this dimension captures an aspect of deficient affect that is not captured as well by the other dimensions. This possibility would also be consistent with the unique negative correlation with reactive (i.e., emotional) aggression found in girls in the current sample. In summary, the unemotional dimension appears to show the most divergent correlations with indices of emotional, behavioral, and academic adjustment compared to the other subscales of the ICU, and much more research is needed to clarify its contribution to the construct of CU traits.

All of these results need to be interpreted in light of several limitations. First, the cross-sectional nature of the current study means that temporal or causal relationships between variables cannot be determined. Thus, it would be important for future research to test the predictive validity of the dimensions of CU traits over time. In addition, the current study was conducted with a sample of Italian school children who were relatively homogeneous with respect to ethnicity. As a result, the generalizability of the findings to children in other countries and with other ethnic backgrounds needs to be tested. Moreover, only the self-report version of the ICU was tested in the current sample and, as a result, more research is needed on the validity of the reports of CU traits from other informants (see Ezpeleta et al. 2012; Roose et al. 2010). Finally, it is important to recognize that the correlations between CU traits and the various academic outcomes reported in the current study were similar in magnitude to the correlations between CU traits and other important outcomes, such as delinquent and antisocial behavior, reported in past research (e.g., Leistico et al. 2008). However, they still indicate that CU traits only account for a modest amount of variance in these outcomes and, as a result, other contributors to these outcomes need to be considered.
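To make the point about modest variance concrete, recall that the proportion of variance two variables share is the square of their correlation. The value of r below is purely hypothetical, chosen for illustration, and is not a result from the current study:

```latex
r = .30 \quad\Longrightarrow\quad r^{2} = (.30)^{2} = .09
```

Under this assumed value, 9% of the variance in the outcome would be shared with CU traits, leaving 91% to be accounted for by other contributors.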

Within the context of these limitations, the consistency of the factor structure of CU traits across languages, gender, and age supports the contention that these traits measure an overarching construct with three sub-dimensions of callousness, uncaring, and unemotional. Such findings would support the proposed specifier for the DSM-5, which includes indicators of each of these dimensions contributing to a single overarching construct (American Psychiatric Association, 2013). Further, the current results support the utility of CU traits in predicting a large number of problems in school adjustment, including both behavioral and academic adjustment. Importantly, these problems seem largely predicted by the callousness and uncaring dimensions, supporting their relative strength in predicting behavior problems. In contrast, the unemotional dimension appears to be largely related to deficits in prosocial emotions and behaviors, which may be important for defining the construct of CU traits but may be less important for predicting problems in classroom adjustment. However, future research should continue to explore the unique associations among the different dimensions of CU traits to better understand this important multi-dimensional construct.