Interpersonal aggression represents a major threat to public health and is a serious mental health concern. To effectively reduce and prevent future violence, it is necessary to understand the factors involved in its development. A wealth of research has demonstrated that children exhibiting callous-unemotional (CU) traits/tendencies are at significant risk for severe violence later in life (Frick et al., 2014a, 2014b; Waller & Hyde, 2017). CU traits refer to an interpersonal style characterized by a lack of guilt, a dampened concern for others’ welfare, shallow or superficial expression of affect, and disinterest in the performance of important activities. Individual differences in CU traits emerge in early childhood (Kimonis et al., 2016) and remain relatively stable into adolescence and adulthood (Byrd et al., 2012; Lynam et al., 2007). Based on this body of evidence, CU was included as a subtype specifier (referred to as “limited prosocial emotions”) for the diagnosis of conduct disorder in the 5th edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5; American Psychiatric Association, 2013).

Given its demonstrated utility for identifying youth at risk for aggression, researchers have called for greater attention to the specific social-emotional correlates of CU tendencies early in life (Waller & Wagner, 2019). Towards this end, the current study focused on the relation between guilt and CU traits in early and middle childhood. Although a dearth of guilt is central to the definition of CU (Frick et al., 2014a), surprisingly few studies have empirically examined the association between these two constructs prior to adolescence. Moreover, the broad operationalization of guilt in the CU literature as a “negative” or “aversive” affective response to wrongdoing fails to consider the myriad of reasons underlying children’s emotional experiences. This is a critical limitation in light of recent research indicating that guilt is most protective against aggressive behavioral outcomes when it is motivated by ethical principles of fairness, justice, and care (i.e., ethical guilt; Colasante et al., 2021; Jambon & Smetana, 2020).

Differentiating Ethical and Non-Ethical Forms of Guilt

In the developmental literature, guilt refers to a negative emotional state characterized by feelings of regret, remorse, and/or discomfort over one’s own wrongdoing (Kochanska & Aksan, 2006; Malti, 2016; Tangney et al., 2007; also see Tilghman-Osborne et al., 2010). It is considered both a self-evaluative and self-conscious emotion, requiring an individual to recognize and accept responsibility for violating a norm or expectation. The aversive nature of guilt is hypothesized to function as a deterrent against future misconduct and as a motivator of acts aimed at repairing and maintaining relationships (Malti, 2016; Tangney et al., 2007; Vaish & Hepach, 2020). Consistent with this notion, decades of cross-sectional, longitudinal, and experimental research indicate that the tendency to experience negative emotions in response to misbehavior is associated with reduced behavioral problems and higher rates of prosocial and adaptive functioning across childhood and adolescence (Cryder et al., 2012; Donohue & Tully, 2019; Goffin et al., 2018; Kochanska et al., 2009; Malti & Krettenauer, 2013).

Despite this body of evidence, scholars have recently begun to call for a more differentiated approach to the study of guilt (Jambon & Smetana, 2020; Malti, 2016). This stems from the proposition that affective discomfort over wrongdoing may reflect a variety of concerns and motivations, with different implications for children’s subsequent behavior and development. For example, a child who shoves a playmate in order to obtain a toy may subsequently feel bad due to a genuine sense of concern for the victim’s wellbeing. Because this ethically grounded form of guilt is a direct consequence of the act itself (i.e., stemming from the victim’s pain), over time it may lead to the formation of generalized expectations about the intrinsic emotional costs of harming others. This, in turn, should facilitate the child’s ability and willingness to behave in accordance with internalized ethical principles and refrain from aggressive impulses (Arsenio, 2014; Kochanska & Aksan, 2006; Malti, 2016). By contrast, a different child in this situation could have little regard for their victim’s suffering, yet still experience a strong aversive emotional response stemming from a fear of being caught and punished by an adult. Although this non-ethical form of guilt may aid the child’s ability to inhibit aggressive impulses in certain contexts (e.g., when the chances of getting caught are high), their reasons for doing so would largely be driven by factors external to the self. Research in early childhood indicates that this type of “situational compliance” is indicative of an underdeveloped conscience and constitutes a risk factor for future behavioral problems (Kochanska & Aksan, 2006). Narrowly focusing on the mere presence or absence of negative emotions in these situations, as much of the developmental literature on guilt has done, limits researchers’ ability to disentangle qualitatively distinct emotional experiences.

This differentiated conceptualization of guilt also aligns with a large body of social-cognitive research demonstrating that, from preschool age onward, children think and reason about actions that violate ethical principles involving others’ rights or wellbeing (e.g., aggression) differently from non-ethical rule violations such as failing to comply with an authority figure’s commands or adhere to conventional norms (e.g., social etiquette; for a review, see Smetana et al., 2018). Although most children consider it unacceptable to violate both types of norms without good reason, evaluations of ethical violations typically center on the negative consequences of harm for others. In contrast, children reason about non-ethical violations by appealing to the importance of rules, the need for social order, or the potential for punishment over misbehavior. Broadly focusing on negative emotional responses to vague instances of “misbehavior” or “wrongdoing” fails to account for this distinction in children’s understanding of the world.

In support of this theorizing, Tani and Ponti (2018) found that preadolescents who rated themselves as possessing a stronger disposition towards “reparative” guilt grounded in other-oriented concern (e.g., “I feel the need to make amends for the wrongs I did to others”) were less likely to report behaving aggressively towards their peers, whereas greater proneness towards experiencing “persecutory” guilt involving feelings of anxiety or fear over punishment (e.g., “The thought of being punished for my mistakes worries me a lot”) was associated with higher levels of aggression. Studies employing interview-based methods aimed at eliciting children’s emotion attributions in response to hypothetical transgressions have similarly found that, compared to guilt ratings in non-ethical norm violation contexts, stronger feelings of guilt in response to ethical violations (e.g., hitting, shoving) are more closely linked to reduced aggression in early and middle childhood (Colasante et al., 2021; Jambon & Smetana, 2020). Although this emerging work has thus far focused on aggressive behavior, differentiating between ethical and non-ethical forms of guilt may be especially relevant for understanding CU traits in young children.

Guilt and CU Traits

A lack of guilt has long been a defining characteristic of the adult psychopath (Hare & Neumann, 2008). Extending the adult model of psychopathy to childhood, Frick et al. (1994) coined the phrase callous/unemotional traits to describe a constellation of behavioral and affective tendencies involving a “lack of guilt, lack of empathy, and superficial charm” (p. 704). Despite refinements to the conceptualization and assessment of CU in the intervening years, guilt deficits remain central to the CU construct (Frick et al., 2014b; Waller & Hyde, 2017). Indeed, a lack of guilt after wrongdoing is referenced in virtually all validated measures of CU (e.g., Frick & Hare, 2001; Frick, 2004), and is listed as the first criterion for the “limited prosocial emotions” specifier of a DSM-5 diagnosis of conduct disorder.

Despite this centrality to the conceptualization and measurement of CU, surprisingly few studies have employed separate assessments to empirically examine links between guilt and CU traits in childhood. A meta-analysis by Waller et al. (2020) identified four studies examining correlations between self- and parent-report measures of guilt and CU tendencies, revealing a moderate-to-large negative effect size (r = − .40). However, only two studies relied on different informants to assess the constructs, and none differentiated between ethical and non-ethical dimensions of guilt. Related research has shown that, when presented with hypothetical scenarios involving aggressive conflicts, adjudicated adolescents self-reporting higher CU tendencies are also less likely to care about the consequences of harming others (Pardini, 2011). Similar studies in community samples have found negative associations between the amount of guilt adolescents expect to feel after engaging in hypothetical aggressive and delinquent acts and their self-reported CU tendencies (Feilhauer et al., 2013; Fragkaki et al., 2016).

These findings indicate that guilt-proneness and CU tendencies (assessed via parent-reports) are interrelated, yet the reliance of past studies on single informant designs and questionnaire measures that do not differentiate ethical and non-ethical concerns limits the conclusions that can be drawn from this work. For instance, are CU tendencies associated with a general inability to feel negative emotions after wrongdoing, or are these deficits specific to ethical guilt grounded in a concern for others’ wellbeing? Moreover, despite calls for greater research into the social-emotional factors involved in the early development of CU tendencies in the general population, research in this area has predominantly focused on clinical or forensic samples of adolescents (Frick et al., 2014a; Waller & Hyde, 2018; Willoughby et al., 2011; for an exception, see Waller et al., 2015).

Current Study

Our central aim was to examine whether ethical and non-ethical forms of guilt were differentially associated with CU tendencies in early and middle childhood. We tested this using data collected from an ethnically diverse community sample of Canadian children participating in a larger longitudinal study of social-emotional development. At study onset, we conducted semi-structured interviews with two cohorts of children (aged 4 and 8 years) to assess their emotion attributions and reasoning in response to hypothetical acts of aggression. Drawing on recent theorizing and research (Colasante et al., 2021; Jambon & Smetana, 2020; Malti, 2016), we operationalized ethical guilt as the degree to which children expected to feel negative emotions due to concerns for others’ rights/wellbeing, whereas non-ethical guilt reflected negative emotions stemming from other concerns (e.g., rule breaking; anxiety over punishment). Consistent with the traditional conceptualization of guilt in the CU literature as “feeling bad after wrongdoing”, we also calculated an undifferentiated guilt score reflecting the degree of negative affect children expected to feel after engaging in aggression, regardless of their underlying reasons. Primary caregivers completed a battery of questionnaires assessing various aspects of children’s social-emotional and behavioral functioning. Based on procedures used in past studies (Viding et al., 2007; Willoughby et al., 2011), we selected a subset of theoretically relevant items from different scales to create a proxy measure of CU. In order to test whether the relations between guilt and CU traits would generalize to a valid and reliable measure of CU, caregivers also completed the Inventory of Callous-Unemotional Traits (Frick, 2004; Hawes et al., 2014) 3 years later when children were approximately 7 and 11 years of age, respectively.

Based on meta-analyses by Malti and Krettenauer (2013) and Waller et al. (2020), we expected undifferentiated guilt to be negatively associated with CU tendencies concurrently and 3 years later. Consistent with recent research on the relations between different forms of guilt and aggression (Colasante et al., 2021), when undifferentiated scores were broken down by children’s underlying reasons, we hypothesized that negative associations would be driven by individual differences in ethical guilt ratings. We did not expect non-ethical guilt to explain variability in CU traits.

We assessed demographic and family-level covariates (e.g., child age, gender, income) to ensure that any effects were independent of the confounding influence of these variables. We also tested whether relations between guilt and CU traits differed for younger vs. older children and between boys and girls. Although age and gender differences in mean levels of guilt and CU have been documented (Malti & Krettenauer, 2013; Frick et al., 2014b), we had no theoretical or empirical basis to expect that associations between guilt and CU traits would be moderated by these variables.

Method

Sample

The sample consisted of 300 children (50% female) and their primary caregivers (85% female; 98% biological parent). At study onset, the sample was equally divided between 4- (n = 150; M age = 4.53 years, SD = 0.30, Range = 4.03 to 4.99; 50%) and 8-year-olds (n = 150; M age = 8.53, SD = 0.29, Range = 8.01 to 9.78). Families were drawn from an existing university database and were originally recruited from community centers, events, and summer camps in Mississauga, Ontario, Canada. Approximately 93% of caregivers were married or in a domestic partnership. Caregivers reported their highest level of education as 5% high school or less, 18% college/apprenticeship/trade school, 49% bachelor’s degree, 24% graduate degree; 4% chose not to answer. The ethnic composition of the sample was 33% European, 27% Asian, 4% Central/South American, 6% other, and 19% multi-ethnic; 11% chose not to answer.

Procedure

Families were invited to attend the university laboratory for a total of four annual testing sessions. Data for the current study were drawn from the first (collected November 2015 to July 2017) and last waves (collected December 2018 to March 2020). For simplicity, we refer to these data collection points as Time 1 (T1) and Time 2 (T2) in this study. Prior to data collection, children provided verbal assent and caregivers provided written informed consent. During the visits, trained undergraduate research assistants (RAs) completed a battery of assessments with children in a designated room while caregivers remained in a waiting area and completed questionnaires on a touchscreen tablet. At the end of each one-hour session, children were gifted a book.

We implemented multiple barrier-reduction (e.g., flexible scheduling), reminder (e.g., birthday cards), and tracing strategies (e.g., collecting alternative contact info) to maximize retention. Caregivers were emailed one month prior to their child’s expected testing date. Weekly follow-up phone calls were made to families who did not respond to the initial contact attempt. Once scheduled, families were sent reminder emails 1 week and 1 day prior to their visit. Hard-to-schedule families were emailed a link to complete an online version of the questionnaire at home. As a final strategy, paper questionnaires and prepaid return envelopes were mailed to all remaining families.

At the T2 assessment, 217 caregivers (72% of the full sample) completed questionnaire ratings when children were approximately 7 (n = 120 [80% of younger cohort]; M age = 7.56 years, SD = 0.31, Range = 7.02 to 8.13; 49% female) and 11 years old (n = 97 [65% of older cohort]; M age = 11.60 years, SD = 0.31, Range = 11.04 to 13.00; 51% female). Most caregivers completing T2 questionnaires did so during the lab visit (n = 190), whereas the remaining surveys were completed at home online (n = 25) or on paper (n = 2).

CU Trait Measures (T1 and T2)

T1 CU (Ages 4 and 8)

A validated CU measure was not included in the T1 assessments. Following procedures used by other researchers (Viding et al., 2007; Willoughby et al., 2011), we selected conceptually relevant items from available scales to create a proxy measure. This included 2 items from the Child Behavior Checklist aggression syndrome scale (CBCL; Achenbach & Rescorla, 2000; “doesn’t seem to feel guilty after misbehaving”; “punishment doesn’t change his/her behavior”), 3 items from the prosocial behavior subscale of the Strengths and Difficulties Questionnaire (SDQ; Goodman, 1997; “helpful if someone is hurt, upset, or feeling ill”; “considerate of other people’s feelings”; “kind to younger children”), 2 items from the reparation/amends subscale of the My Child questionnaire (Kochanska et al., 1994; “after breaking someone else’s belongings during play, simply moves to another game or activity”; “after hurting another child’s feelings, does not seem concerned about making them feel better”), and the 3-item academic motivation subscale from the Holistic Student Assessment questionnaire (HSA; Malti et al., 2018; “wants to be a good student”; “does well on class activities [e.g., homework, arts, etc.]”; “works hard in school”). Caregivers rated the CBCL, SDQ, and My Child items using a 7-point rating scale (0 = never, 3 = about half of the time, 6 = always), and rated the HSA academic motivation items using a 4-point rating scale (0 = not at all true, 3 = almost always true). All items were scored such that higher values reflected higher levels of CU. After transforming the HSA ratings into a 7-point scale, all 10 items were averaged to create a composite CU trait proxy variable in the 4-year-old cohort (α = 0.80). The “punishment” item from the CBCL is not included in the 6–18 version of the aggression syndrome scale and therefore was not assessed in older children. As such, the CU composite for 8-year-olds comprised 9 items (α = 0.76).
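The construction of the proxy composite can be illustrated with a minimal sketch. The text does not specify how the HSA ratings were transformed or which items were reverse-keyed, so two assumptions are made here: the 0–3 HSA ratings are mapped linearly onto 0–6, and the positively worded SDQ and HSA items are reverse-scored so that higher values reflect higher CU. The function names and the example ratings are hypothetical.

```python
from statistics import mean

def rescale(value, old_max, new_max):
    """Linearly map a rating on 0..old_max onto 0..new_max (assumed transform)."""
    return value * new_max / old_max

def reverse(value, scale_max):
    """Reverse-score a rating on a 0..scale_max scale."""
    return scale_max - value

def cu_proxy(cbcl, sdq, my_child, hsa):
    """Average all items after placing them on a common 0-6 metric.

    cbcl, my_child: 0-6 ratings, already keyed so that higher = more CU.
    sdq: 0-6 prosocial ratings (assumed to need reverse-scoring).
    hsa: 0-3 academic motivation ratings (rescaled to 0-6, then reversed).
    """
    items = list(cbcl) + list(my_child)
    items += [reverse(x, 6) for x in sdq]
    items += [reverse(rescale(x, 3, 6), 6) for x in hsa]
    return mean(items)

# Hypothetical 4-year-old: 2 CBCL, 3 SDQ, 2 My Child, and 3 HSA items.
score = cu_proxy(cbcl=[1, 2], sdq=[5, 6, 4], my_child=[0, 1], hsa=[3, 2, 3])
print(round(score, 2))  # 0.9
```

Because the function simply averages whatever items it receives, the same sketch covers the 9-item composite for 8-year-olds by passing a single CBCL item.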

T2 CU (Ages 7 and 11)

Caregivers completed the well-validated 12-item short form of the ICU (Frick, 2004; Hawes et al., 2014; e.g., “seems cold and uncaring towards others”; “does not care who s/he hurts”; “Tries not to hurt others’ feelings” reverse scored). Items were scored on a 4-point scale ranging from 0 (not at all true) to 3 (definitely true). We averaged the items to create a composite T2 CU trait scale, with higher scores reflecting greater CU (α = 0.80).
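For readers less familiar with the internal consistency coefficients reported for these composites, the following is a minimal sketch of Cronbach's alpha computed from complete item-level data, with one positively worded item reverse-scored first. The ratings are hypothetical, and the sketch ignores missing data, which the reported coefficients may have handled differently.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for complete data.

    rows: list of respondents, each a list of k item scores.
    """
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical caregiver ratings on three 0-3 items. The third item is
# positively worded, so it is reverse-scored (3 - x) before computing alpha.
raw = [[0, 1, 3], [1, 1, 2], [2, 3, 1], [3, 2, 0], [1, 0, 3]]
keyed = [[a, b, 3 - c] for a, b, c in raw]
alpha = cronbach_alpha(keyed)
print(round(alpha, 2))
```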

Guilt

Interview Procedures

At T1, trained undergraduate RAs conducted semi-structured interviews with children to elicit their emotions and reasoning in response to hypothetical vignettes involving interpersonal transgressions (Malti et al., 2009). Prior to the start of the interview, children were instructed to use an emotion intensity rating scale (see below) and were then engaged in a brief training exercise to familiarize them with the task. After being shown an image on a computer screen of a hand holding a tennis ball, children were given a real tennis ball to hold and were asked to imagine that they were the person in the picture. Comprehension checks ensured that children adopted this first-person perspective (i.e., that they understood they were the person holding the ball in the picture). Children who failed the checks were corrected and presented with the exercise again. After passing the checks, children were asked to imagine themselves as the main characters in the upcoming stories.

Interviewers then presented children with four hypothetical vignettes involving acts of overt aggression (shoving a classmate to get a lollipop; stealing a candy bar from a classmate’s bag) and social exclusion (not allowing a child from a different school to paint with you; not allowing an economically disadvantaged child to sit next to you on the bus). All stories were told from the first-person perspective of the transgressor and consisted of two illustrated slides projected on the computer screen accompanied by a pre-recorded narration of the events. Story order was counterbalanced across participants, with vignette characters matched to the child’s gender and skin tone. Interviews were video recorded and transcribed for later coding.

Emotion Assessments

After each story, children were asked, “How would you feel if you did this?” to assess their open-ended anticipated emotions. Children who could not verbalize an emotion after several prompts were presented with the forced choice question, “If you had [committed transgression], would you feel good, bad, or good and bad?” After stating an emotion, children were then asked to explain the rationale for their response (“Why would you feel [emotion]?”) to determine their reasoning. Non-codable responses were probed by the interviewer. Finally, emotion intensity ratings were assessed for each emotion by asking children to indicate how strongly they would feel that emotion using a 3-point scale depicting squares of increasing size (1 = not strong, 2 = somewhat strong, 3 = very strong).

Initial Emotion Coding

Two trained undergraduate RAs coded anticipated emotions and reasoning responses. Prior to coding, interrater reliability was established on a randomly selected subset of protocols (n = 209) drawn from a larger existing database of children’s interview responses to these and similar vignettes. The RAs were then randomly assigned to each code half of the interview protocols from the current study. To prevent drift, coders met on a biweekly basis to review 10% of interviews and discuss ambiguous or difficult responses. Coder disagreements were discussed with the first author until a consensus was reached.

Anticipated emotions were initially assigned to one of 11 discrete emotion categories. Up to three emotions were coded for each story. Interrater reliability was perfect (κ = 1.00). Most (97%) children provided at least one codable emotion across the four stories. Approximately 36% of children reported two emotions for at least one story, but very few (1%) ever reported three emotions for any story. The full emotion coding scheme is provided in the Online Supplemental Material.

Children’s reasoning for each emotion was initially coded into one of 12 categories based on prior research (Colasante et al., 2021). Up to two reasons were coded for each emotion. Average interrater reliability across the reasoning categories was good (κ = .83). Five categories pertained to ethical concerns, including references to 1) principles of fairness/rights (e.g., “the chocolate belonged to him”), 2) others’ physical or psychological wellbeing (e.g., “she’ll be sad”), 3) universal principles, obligations, or character (e.g., “it is always wrong to steal”; “That’s not me. I’m not a selfish person”), 4) counterfactual behaviors the child should have done (e.g., “I could have let him sit with me. It wouldn’t have been a big deal”), and 5) concerns with building and maintaining relationships (e.g., “She might think I hate her and don’t wanna [sic] be friends anymore”). The remaining six categories pertained to non-ethical concerns, including references to 6) peer-based retaliation (e.g., “He might shove me back”), 7) authority-based sanctions (e.g., “I’ll have to go to the principal’s office”), 8) social rule violations (e.g., “Because you’re not supposed to do that stuff at school; there are rules”), 9) self-centered concerns (e.g., “[feel good] because I like lollipops”; “[feel bad] because I might get a tummy ache if I eat lollipops”), 10) disruptions to group-functioning (e.g., “he’s not from our school so he doesn’t belong here”), and 11) appeals to personal choice (e.g., “I don’t have to let her sit there if I don’t want to”). An “other” category was used for all non-codable responses (e.g., “Because”, “It’s bad”; “It’s not nice”).

Undifferentiated Guilt Ratings

Consistent with the traditional operationalization of guilt in the CU literature as a global, negative emotional response to wrongdoing, we first calculated an undifferentiated guilt score for each story. References to positive/neutral emotions were scored as 0, whereas children who reported a negative emotion (e.g., sad, bad, guilty) were assigned their corresponding intensity rating. This resulted in a 4-point undifferentiated guilt rating scale ranging from 0 (no negative emotion) to 3 (very strong negative emotion). Only the negative emotion rating was calculated for responses entailing a mix of positive and negative emotions. Responses that did not contain a codable emotion or reason (regardless of content) were treated as missing. Requiring a codable reason helped to ensure that children understood the interviewer’s questioning. Scores across the four stories were averaged to create a composite rating, with higher scores reflecting greater undifferentiated guilt (α = 0.89).
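The scoring rule for undifferentiated guilt can be sketched as follows. The data layout (one tuple per reported emotion) and the function names are hypothetical. Two further assumptions are not stated in the text: when a child reports more than one negative emotion for a story, the highest intensity is used, and the composite is averaged over whichever stories are not missing.

```python
from statistics import mean

def story_score(emotions):
    """Score one story on the 0-3 undifferentiated guilt scale.

    emotions: list of (valence, intensity, has_codable_reason) tuples, one per
    reported emotion; valence is "negative", "positive", or "neutral".
    Returns None when the response is treated as missing.
    """
    codable = [e for e in emotions if e[2]]
    if not codable:
        return None  # no codable emotion or reason
    negative = [intensity for valence, intensity, _ in codable
                if valence == "negative"]
    # Mixed responses keep only the negative emotion rating; purely
    # positive or neutral responses score 0.
    return max(negative) if negative else 0

def composite(story_scores):
    """Average the available story scores (assumed handling of missing stories)."""
    valid = [s for s in story_scores if s is not None]
    return mean(valid) if valid else None

# Hypothetical child: four stories, the last with no codable response.
stories = [
    [("negative", 3, True)],
    [("positive", 2, True), ("negative", 1, True)],
    [("positive", 3, True)],
    [],
]
scores = [story_score(s) for s in stories]
undifferentiated = composite(scores)
```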

Differentiated Guilt Ratings

Children’s interview responses were then re-coded to disentangle specific reasons underlying their emotion attributions.

Ethical Guilt. Responses entailing a negative emotion based on ethical concerns (reasoning categories 1–5) were assigned their corresponding intensity rating, whereas responses referencing positive/neutral emotions or negative emotions based on non-ethical concerns were assigned a score of 0. This resulted in a 4-point ethical guilt score for each story (0 = no ethical guilt to 3 = very strong ethical guilt). Following the procedures used in past interview studies (Colasante et al., 2021), for responses entailing a mix of ethical and non-ethical emotions (e.g., feeling bad for making another child cry, but also good because they got a lollipop/bad because they would get in trouble), only the ethical guilt rating was used. Responses that did not contain a codable emotion or reason were treated as missing. Ratings across the four stories were averaged to create a composite, with higher scores reflecting greater ethical guilt (α = 0.82).

Non-Ethical Guilt. Responses referencing negative emotions based on non-ethical concerns (reasoning categories 6–11) were assigned their corresponding intensity rating, whereas responses reflecting positive/neutral emotions or negative emotions based on ethical concerns were assigned a score of 0. Consistent with the procedures outlined above, children who referenced a mix of ethical and non-ethical negative emotions were assigned a non-ethical guilt score of 0. This resulted in a 4-point non-ethical guilt score for each story (0 = no non-ethical guilt to 3 = very strong non-ethical guilt). The reliability of this composite measure based on all four stories was unacceptably low (α = 0.47). A principal components analysis (PCA) with varimax rotation conducted on emotion scores across all stories revealed a two-factor solution: ratings for the shoving and stealing stories loaded onto the first factor (βs = 0.81, 0.84), ratings for the painting exclusion story loaded onto the second factor (β = 0.93), and ratings for the school bus exclusion story cross loaded (βs = 0.41 and 0.45). An examination of the interitem correlations indicated that emotion ratings across the two aggression stories (shoving and stealing) were moderately correlated (r = .40, p < .001), whereas emotion ratings for the two exclusion stories were weakly correlated with each other (r = .10, p = .27) and with shoving/stealing ratings (rs = 0.05–0.19, ps = 0.03–0.50). Compared to social exclusion, physical aggression and property violations involve more direct consequences for the victim’s rights and welfare, and children evaluate overt aggression as both more wrong and more serious than relational aggression (Murray-Close et al., 2006). As such, scores for the two exclusion stories were dropped and ratings across the shoving and stealing stories were averaged to create a non-ethical guilt composite, with higher scores reflecting greater non-ethical guilt.
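The differentiated scoring rules described in this section can be sketched in the same spirit. The data layout and names are hypothetical, and the sketch simplifies the coding scheme by attaching a single reasoning category to each emotion (up to two were coded in the study). As an additional assumption, the highest intensity is used when several qualifying emotions are reported for one story.

```python
from statistics import mean

ETHICAL = set(range(1, 6))       # reasoning categories 1-5
NON_ETHICAL = set(range(6, 12))  # reasoning categories 6-11

def score_story(responses):
    """Return (ethical, non_ethical) scores for one story, each 0-3.

    responses: list of (valence, intensity, reason_category) tuples;
    reason_category is 1-11, or None for a non-codable ("other") reason.
    Returns (None, None) when the response is treated as missing.
    """
    codable = [r for r in responses if r[2] is not None]
    if not codable:
        return None, None
    eth = [i for v, i, c in codable if v == "negative" and c in ETHICAL]
    non = [i for v, i, c in codable if v == "negative" and c in NON_ETHICAL]
    ethical = max(eth) if eth else 0
    # Mixed ethical and non-ethical negative emotions: only the ethical
    # rating is used, so the non-ethical score is 0.
    non_ethical = max(non) if non and not eth else 0
    return ethical, non_ethical

def composites(stories):
    """stories: dict mapping a story name to its list of responses."""
    scored = {name: score_story(r) for name, r in stories.items()}
    eth = [e for e, _ in scored.values() if e is not None]
    # The non-ethical composite uses only the shoving and stealing stories.
    non = [scored[s][1] for s in ("shove", "steal")
           if s in scored and scored[s][1] is not None]
    return (mean(eth) if eth else None, mean(non) if non else None)

# Hypothetical child.
child = {
    "shove": [("negative", 3, 2)],                      # victim's wellbeing
    "steal": [("negative", 2, 7)],                      # authority sanctions
    "paint": [("negative", 2, 2), ("negative", 3, 7)],  # mixed response
    "bus": [("positive", 3, 11)],                       # personal choice
}
ethical, non_ethical = composites(child)
```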

Control Variables (T1)

We included eight conceptually relevant control variables in order to evaluate their overlap with guilt and CU tendencies and identify potential correlates of missing data. Caregiver reports of their own and their partner’s highest level of education (7-point scale: 1 = less than high school, 4 = college diploma, 7 = Ph.D. or equivalent) were averaged to create a composite family education score (r = .43). Caregivers also reported on the number of children living in the home (5-point scale: 1 = one to 5 = five or more), annual household income (9-point scale: 1 = less than $10k, 5 = $40k-$49k, 9 = $125k or more), and the number of books in the home (5-point scale: 1 = 0 to 10 books to 5 = More than 200 books). Children’s verbal ability was assessed with the verbal subtest of the Kaufman Brief Intelligence Test 2nd edition (KBIT-2; Kaufman & Kaufman, 2004). Scores were calculated by subtracting participants’ number of errors from the total correct responses (Ms = 13.89 and 29.04, SDs = 4.21 and 5.39, for younger and older children, respectively). Given expected age differences (Cohen’s d = 3.13), verbal ability scores were centered within age group. We also included variables representing children’s age group (0 = 4-year-olds/younger, 1 = 8-year-olds/older), exact age (centered within each age group), and child gender (0 = girl, 1 = boy).
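As a check on the reported age difference in verbal ability, Cohen's d can be recomputed from the group means and standard deviations given above. The sketch assumes equal group sizes when pooling the standard deviations; the within-group centering helper and its example values are hypothetical illustrations of the centering step, not the authors' code.

```python
from math import sqrt
from statistics import mean

def cohens_d(m1, sd1, m2, sd2):
    """Cohen's d using a pooled SD that assumes equal group sizes."""
    pooled = sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (m2 - m1) / pooled

d = cohens_d(13.89, 4.21, 29.04, 5.39)
print(round(d, 2))  # 3.13

def center_within_group(scores, groups):
    """Subtract each group's own mean from its members' scores."""
    group_means = {
        g: mean([s for s, gg in zip(scores, groups) if gg == g])
        for g in set(groups)
    }
    return [s - group_means[g] for s, g in zip(scores, groups)]

# Hypothetical verbal scores for two younger (0) and two older (1) children.
centered = center_within_group([10, 14, 30, 28], [0, 0, 1, 1])
```

Centering within age group removes the large mean difference between cohorts, so the centered scores reflect a child's standing relative to same-age peers.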

Attrition and Missing Data Analyses

Reasons for missing T2 data included: caregivers started but did not complete questionnaires (n = 3), were busy or declined participation (n = 48), could not be contacted (n = 24), moved (n = 8), or were experiencing personal issues (e.g., family illness; n = 3). Little’s missing-completely-at-random (MCAR) test conducted on all study variables described above was significant, χ2 (203, N = 300) = 257.71, p = .006, indicating that the likelihood of having missing data was associated with our measured variables. Younger children were more likely than older children to have missing guilt ratings at T1 (21% vs. 3%), t (298) = 142.43, p < .001, d = 0.58, whereas dropout at T2 was greater among older compared to younger children (37% vs. 21%), t (298) = 3.10, p = .002, d = 0.36. Given these discrepant attrition rates and missing data patterns, follow-up analyses were conducted separately for younger and older children.

For younger children, missing T1 guilt ratings were associated with lower verbal ability (r = − .26, p = .002) and lower household income (r = − .26, p = .003). Younger children with lower verbal skills were somewhat more likely to have missing T2 data (r = − .15, p = .07). For older children, missing T1 guilt ratings were associated with a greater number of children in the home (r = .19, p = .002). Older children living in homes with fewer books (r = − .22, p = .007) and with (marginally) lower verbal ability (r = − .14, p = .10), ethical guilt (r = − .15, p = .07), and undifferentiated guilt ratings (r = − .14, p = .09) were more likely to have missing T2 data. Family education was not associated with missing data in either age group. These results indicated that missing data could be explained partially by children’s language abilities and sociodemographic characteristics of the family, particularly in the older cohort. As such, we retained the full sample in all analyses, and missing data were estimated under the missing-at-random (MAR) assumption using full information maximum likelihood estimation with robust standard errors (MLR).

Analysis Plan

We conducted path models in Mplus 8.4 to address our research questions. After screening for multivariate outliers and examining descriptive statistics and bivariate correlations, we estimated a series of cross-sectional models to examine the associations between T1 CU traits and undifferentiated negative emotions (Model 1U), ethical guilt (Model 1E), and non-ethical guilt (Model 1N). We then followed a similar procedure to examine relations between T1 guilt ratings and T2 CU traits assessed with the ICU (Model 2U, Model 2E, and Model 2N). Additional exploratory analyses examining the effects of guilt on changes in CU traits are reported in the Online Supplemental Materials.

In addition to testing direct effects, we used multigroup modeling to examine whether age and gender (categorical variables) moderated the association between guilt and CU traits. This was accomplished by comparing the χ2 value of a model in which the regression parameters were constrained to equality across groups (younger vs. older; girls vs. boys) with that of a model in which the parameters were freely estimated. A significant change in χ2 indicated that the strength of the regression path differed between groups (i.e., evidence that the effect was moderated by age or gender).
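
The logic of the nested-model comparison described above can be sketched as a one-degree-of-freedom χ2 difference test. This is a minimal illustration, not the authors' Mplus procedure: the function names and the fit values in the usage example are hypothetical. It also assumes an unscaled difference test. With robust (MLR) estimation the difference between two scaled χ2 values is not itself χ2-distributed, so a scaling correction is typically applied, and that correction is not shown here.

```python
import math


def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 df.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)), so only the standard library is needed.
    """
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(x / 2.0))


def chi2_difference_test(chi2_constrained: float, df_constrained: int,
                         chi2_free: float, df_free: int) -> tuple[float, float]:
    """Compare a constrained (equal paths) model with a freely estimated one.

    Returns the chi-square difference and its p value. Only the case of a
    single constrained path (difference of 1 df) is handled in this sketch.
    """
    if df_constrained - df_free != 1:
        raise ValueError("this sketch only handles a 1-df difference")
    delta = chi2_constrained - chi2_free
    return delta, chi2_sf_df1(delta)


# HYPOTHETICAL fit statistics, chosen so the difference equals 3.73.
delta, p = chi2_difference_test(10.23, 5, 6.50, 4)
print(f"delta chi2(1) = {delta:.2f}, p = {p:.3f}")
```

A small p value would indicate that constraining the path to equality worsens model fit, which is the evidence of moderation referred to above.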

Results

Preliminary Analyses

Overall descriptive statistics are reported in Table 1. An initial examination of various outlier indices (including Mahalanobis distance, Cook’s D, and the loglikelihood distance influence measure) revealed no evidence of multivariate outliers in the data. Compared to younger children, older children reported higher levels of undifferentiated, χ2 (1) = 167.48, p < .001, d = 1.58, and ethical guilt, χ2 (1) = 207.10, p < .001, d = 1.78, but not non-ethical guilt, χ2 (1) = 0.29, p = .59, d = 0.06. Older children were also rated by caregivers as being lower in CU tendencies at T1, χ2 (1) = 11.75, p < .001, d = 0.42, and T2, χ2 (1) = 3.99, p = .04, d = 0.25. No gender differences were found for undifferentiated, χ2 (1) = 0.51, p = .48, d = 0.09, ethical, χ2 (1) = 0.54, p = .47, d = 0.09, or non-ethical guilt ratings, χ2 (1) = 0.17, p = .67, d = 0.05, but caregivers did rate males as being higher in CU tendencies at T1, χ2 (1) = 5.62, p = .02, d = 0.28, and (marginally) at T2, χ2 (1) = 2.72, p = .10, d = 0.20.

Table 1 Descriptive Statistics and Correlations Among Study Variables

Bivariate correlations are also reported in Table 1. As expected, undifferentiated and ethical guilt were associated with lower CU traits at both time points. Unexpectedly, non-ethical guilt was associated with higher CU tendencies at each wave. Consistent with the stability estimates reported in past longitudinal studies during childhood (mean r = .59; Frick et al., 2014a), CU ratings were moderately to strongly correlated over time. Number of books in the home was negatively correlated with CU traits at T1 and (marginally) at T2, and household income was positively correlated with T2 CU traits.

To facilitate the estimation of missing data under the MAR assumption, we retained number of children in the home and child verbal ability (i.e., the variables correlated with missing data) as auxiliary variables via the auxiliary command in Mplus. Because age group, gender, number of books in the home, and yearly household income were correlated with our outcomes, these variables were included as controls in all models. Caregiver education and children’s exact age at T1 (centered within age group) were not associated with missing data or any observed scores and were dropped from subsequent analyses.

Model 1: Concurrent Associations at T1

Parameter estimates for the analyses at T1 are provided in Table 2. Controlling for demographic variables, undifferentiated guilt was no longer significantly associated with CU tendencies at T1 (Model 1U; p = .11). In contrast, and as hypothesized, lower ethical guilt remained significantly associated with higher T1 CU tendencies (Model 1E; p = .008). The positive association between non-ethical guilt and T1 CU tendencies identified in the correlational analyses was reduced to a non-significant trend in the regression model (Model 1N; p = .067).

Multigroup modeling indicated that the effects of undifferentiated and ethical guilt did not vary by age group, χ2s (1) = 0.05, 0.02, ps = .82, .89, but a significant age difference was found for non-ethical guilt, χ2 (1) = 3.73, p = .05. Follow-up analyses indicated that non-ethical guilt was positively associated with T1 CU tendencies in 8-year-olds (β = 0.16, p = .009, 95% CI [0.04, 0.29]), but not in 4-year-olds (β = 0.00, p = .99, 95% CI [-0.14, 0.15]). Gender did not moderate any of the paths, χ2s (1) = 0.78, 0.93, 0.02, ps = .38, .34, and .88 (for undifferentiated, ethical, and non-ethical guilt, respectively).

Table 2 Parameter Estimates for Model 1

Model 2: Associations Between T1 Guilt and T2 CU Traits

Parameter estimates for Model 2 are provided in Table 3. Controlling for demographic variables, undifferentiated guilt was not significantly associated with T2 CU as assessed via the ICU (Model 2U; p = .19). Consistent with the cross-sectional results, ethical guilt remained a significant negative predictor of CU tendencies 3 years later (Model 2E; p = .001). The positive association between non-ethical guilt and T2 CU tendencies was significant (Model 2N; p = .027).

Table 3 Parameter Estimates for Model 2

Age group did not moderate the paths from undifferentiated or ethical guilt to T2 CU, χ2s (1) = 3.00, 1.25, ps = .09, .27. In contrast to the age group differences observed at T1, the strength of the positive association between non-ethical guilt and later CU scores did not differ between younger and older children, χ2 (1) = 0.14, p = .70. Similarly, gender did not moderate any of the paths, χ2s (1) = 0.51, 0.01, 0.51, ps = .48, .93, and .48 (for undifferentiated, ethical, and non-ethical guilt, respectively).

Discussion

Drawing on recent developmental theorizing and research (Colasante et al., 2021; Jambon & Smetana, 2020), we examined whether adopting a more differentiated conceptualization of guilt would clarify its role in the presentation of CU tendencies in childhood. Child-reported guilt reflecting ethical concerns for the welfare and rights of others was negatively associated with caregiver-reported CU concurrently and 3 years later, whereas non-ethical guilt revolving around concerns over rule-breaking, disobedience, or punishment was associated with higher CU tendencies. Importantly, children’s general tendency to experience affective discomfort after wrongdoing was not associated with CU traits after controlling for demographic variables. These findings underscore the benefit of moving beyond a broad focus on the presence or absence of negative emotional responses to misbehavior to consider the reasons underlying children’s experiences of guilt in specific contexts (Tilghman-Osborne et al., 2010).

Our results align with and extend recent research linking variations in the tendency to experience ethical guilt to aggression in childhood and adolescence (Colasante et al., 2021; Jambon & Smetana, 2020; Tani & Ponti, 2018). Engaging in actions that negatively impact others is an inevitable fact of children’s social lives (Eisner & Malti, 2015). In the aftermath of these events, the discomfort that arises from the recognition and acceptance that one has harmed others serves to highlight the consequences of one’s actions, thereby triggering the desire to repair relationships and avoid future aggression (Arsenio, 2014; Malti, 2016). Without carefully attending to the specific nature of children’s affective responses, however, it is impossible to disentangle genuine other-oriented guilt from other types of negative emotions after wrongdoing. This has important implications for our understanding of the social-emotional correlates of CU tendencies in young children (Frick et al., 2014a; Waller & Hyde, 2019).

Researchers have identified temperamental fearlessness as an important precursor to the development of CU tendencies (Waller & Wagner, 2019). Low fearful arousal is theorized to contribute to CU by reducing children’s capacity to experience negative emotions in the aftermath of wrongdoing. This dampened affective response, in turn, interferes with children’s ability to learn from external cues of punishment or threat, which would otherwise be expected to inhibit misconduct and promote prosocial and caring responses to others (Blair et al., 2006; Frick et al., 2014a; Waller & Wagner, 2019). This implies that children higher in CU traits have difficulty “feeling bad” after wronging others, and that this generalized emotional numbness contributes to poor behavioral outcomes. Our findings paint a more nuanced picture. Children reported similar levels of undifferentiated negative emotions regardless of their CU levels, and those reporting higher non-ethical guilt were rated higher in CU traits by caregivers. This suggests that children exhibiting heightened CU tendencies are at least capable of expressing, and are sometimes more likely to express, generalized or rudimentary bad feelings after wrongdoing. Nonetheless, our findings indicate that simply “feeling bad” was not good enough; only guilt rooted in other-oriented, ethical concerns was associated with lower CU tendencies concurrently and over time.

The positive link between non-ethical guilt and CU tendencies, although unexpected, is consistent with the notion that individuals exhibiting CU and psychopathic tendencies may “know the words but not the music” (Johns & Quay, 1962, p. 217). A recent systematic review by Northam and Dadds (2020) concluded that individual differences in CU traits are robustly associated with emotional responsiveness in other-oriented contexts (e.g., when viewing images of people being harmed) in studies employing physiological indicators, but not in those based on observational measures and self-reports. The authors surmised that youth high in CU may be able to feign other-oriented emotions for self-serving purposes, which aligns with research indicating that children who deliberately and selectively harm others for personal gain are often intelligent, socially skilled, and adept at appearing ethically adroit in the presence of others (Hawley, 2014). Children become aware of social norms during the second year of life, and by 2 to 3 years of age, readily appreciate that aggressive actions will elicit disapproval from adults and peers (Smetana et al., 2018). Thus, even children whose physiological and neurological impairments hamper their ability to experience and learn from other-oriented affective cues are nevertheless likely to develop an understanding of what constitutes appropriate conduct, which may inhibit their willingness to express socially undesirable views.

Interestingly, we did not find a significant positive association between non-ethical guilt and CU traits in 4-year-olds. This may partly reflect the fact that children who expected to feel neutral or positive emotions received a non-ethical guilt score of zero. Because happy victimizer responses are common during the preschool years, but exceedingly rare by middle childhood (Arsenio, 2014), younger (but not older) children exhibiting CU tendencies may have been equally likely to have high or low scores on this variable. Moreover, dysregulated and aggressive behaviors are ubiquitous during the toddler and preschool years before quickly subsiding at later ages (Eisner & Malti, 2015). With increasing cognitive and regulatory abilities, many children exhibiting elevated CU-like tendencies during this period will eventually grow out of these behaviors (Waller & Hyde, 2017). Nevertheless, the finding that non-ethical guilt at age 4 did predict higher levels of CU in middle childhood suggests that a preoccupation with external rules, norms, and sanctions in ethical contexts may constitute an early risk factor for future antisocial conduct (cf. Kochanska & Aksan, 2006).

In addition to the novel contributions of the present study, it is also important to discuss limitations. We relied on caregivers as the sole informant of CU traits. Although parents are reliable informants of their children’s CU tendencies (Frick, 2004; Frick & Hare, 2001), incorporating information from other sources, such as teachers and peers, would provide a more comprehensive assessment of the construct. We did not include a measure of CU at the initial assessment, which necessitated the creation of a composite measure based on items drawn from different scales and prompted us to examine whether the observed baseline associations would replicate when a validated measure was assessed at a later time point. Although scores on this proxy measure were strongly correlated with ratings derived from the widely used ICU 3 years later, the proxy measure did not capture the full breadth of the CU construct (e.g., a general dearth of emotional responsiveness). Our sample consisted primarily of ethnically diverse, middle-SES families drawn from local communities. Although this approach advances our understanding of the social-emotional correlates of early CU tendencies in the general population (Frick et al., 2014a; Waller & Hyde, 2019), additional research is needed to determine whether this pattern of results generalizes to high-risk or clinic-referred populations. Finally, we relied on children’s reports of their own affect in response to acts of aggression and victimization. Although asking children to explain the reasons for their emotions does alleviate some concerns with self-presentation biases (Jambon & Smetana, 2020), emotional experiences are inherently complex and require an assessment from multiple perspectives and levels (e.g., physiology, observation). As such, our study provided one view on how children’s emotional experiences link to CU traits; future work may extend this by incorporating additional methods and informants.

In conclusion, this study illustrates how adopting a more differentiated conceptualization of guilt advances our understanding of the presentation of CU traits in young children. Although children exhibiting elevated CU traits are commonly viewed as lacking the general capacity to experience guilt in response to wrongdoing, our findings indicate that this deficit was specific to guilt grounded in ethical concerns for others’ wellbeing. Future research targeting the affective mechanisms involved in CU may benefit from considering the ethicality underlying children’s emotional responses to harmful behavior.