Introduction

Multi-decade longitudinal studies show considerable stability in aggressive behavior across childhood, adolescence, and adulthood (Farrington et al. 2009; Huesmann et al. 2009). Although aggressive children do not always become aggressive adults, early aggressive behavior conveys substantial risk for violence later in life (Loeber and Stouthamer-Loeber 1998; Piquero et al. 2012). Persistent patterns of aggressive behavior early in life, in particular proactive aggressive behavior (i.e., controlled, callous, and used as a means to achieve goals such as gaining rewards, profits and dominance), are considered a gateway to subsequent substance abuse disorder, violent crime, worse health outcomes, and socioeconomic problems (Fite et al. 2010; Odgers et al. 2007, 2008). Identifying predictors and moderators of individual differences in the expression of aggressive behavior during early adolescence is key to advancing our understanding of aggression as well as to interrupting its trajectory by individualizing prevention and intervention efforts.

A large volume of empirical evidence links individual differences in testosterone levels to aggressive behavior. Most of the evidence is derived from animal studies in which experimental administration or depletion of testosterone was shown to have direct effects on the expression of physical aggression with conspecifics (Allee et al. 1939; Beeman 1947; Svare 1983). In comparison, studies with humans suggest a far more complex picture (Booth et al. 2006; Eisenegger et al. 2011). Research with prison inmates has shown that higher testosterone levels are associated with violence (Dabbs et al. 1995; Dabbs and Hargrove 1997; Dabbs et al. 2001). Although individual studies are cited as confirmatory evidence, meta-analyses reveal that the association between testosterone and aggressive behavior in humans is weak at best, with weighted effect sizes ranging from 0.08 to 0.14 (Archer et al. 2005; Book and Quinsey 2005; Book et al. 2001). To explain the disparity between these literatures, some investigators emphasize the species-specific complexity of the nature and measurement of aggression and suggest that testosterone in humans is linked with dominance rather than with aggressive behavior (Mazur and Booth 1998). Others raise the possibility of a biosocial explanation and postulate that the expression of the association is highly dependent on social contextual forces (e.g., Booth et al. 2003; Booth and Osgood 1993; Raine 2002). A third line of research focuses on the possibility that the effect is reciprocal; for instance, studies show that an actual or perceived challenge to status, or competition for status, has the potential to regulate testosterone dynamics in humans (e.g., Archer 2006; Carre and Olmstead 2015).

The current study furthers the exploration of these alternative explanations. We aim to fill the knowledge gap by examining testosterone’s links with two distinct subtypes of aggression in youth and by considering the moderating effects of the immediate social context of the family. Prior studies on the testosterone-aggression association often do not distinguish between subtypes of aggression (e.g., proactive and reactive aggression), which have distinct antecedents and outcomes. This may be an important oversight because testosterone could be more relevant to one functional form of aggression than the other. Additionally, studies on the testosterone-aggression relationship have typically been conducted with males in late adolescence and early adulthood (Archer et al. 2005; Book and Quinsey 2005; Book et al. 2001), whereas the current study includes both males and females in early adolescence (11–12 years old). The early adolescent transitional period, ages 11–12 years, is especially interesting because a) there is considerable variation in testosterone levels, b) sex differences in testosterone are becoming more developmentally salient, c) youth who show high levels of aggression at this age are at higher risk for future problem behaviors, and d) parent-child conflicts peak between the ages of 10 and 14 years (Hill et al. 1985a, b; Steinberg 1987). Last, few studies on testosterone have focused on youth from urban settings, where the social ecology is very different from that of rural areas, and conclusions drawn from youth in rural areas (Booth et al. 2003; Rowe et al. 2004) may not generalize to urban settings. In a large (N = 445) urban sample of mostly minority (80% African-American) boys and girls (ages 11 to 12 years) living in predominantly single-parent households, we test a biosocial model of the relationships between testosterone, harsh discipline, and proactive and reactive aggression.

Aggression and Testosterone

One potential source for a weak testosterone-aggression association could stem from the nature of human behavior and the complexity of its measurement. Growing evidence supports divergent construct validity between proactive and reactive aggression and their links to a host of different biological and psychological antecedents and outcomes (Card and Little 2006; Polman et al. 2007). Proactive aggression is controlled, callous, and is driven by the achievement of goals or gaining rewards, profits and dominance (Dodge and Coie 1987; Raine et al. 2006). In contrast, reactive aggression is emotionally charged, reacting to provocation, poorly planned and seeking to harm others as the goal (Dodge and Coie 1987; Raine et al. 2006). Developmentally, children exhibiting more reactive aggression tend to develop more internalizing problems such as depression and anxiety (Fite et al. 2010; Raine et al. 2006; Vitaro et al. 2002); whereas proactively aggressive children are more likely to have conduct disorder, delinquency and disruptive behavior (Atkins and Stoff 1993; Vitaro et al. 1998). Prospective studies report that proactive, but not reactive aggression is a more stable trait and associated with later delinquency, psychopathy, and more serious criminality (Fite et al. 2010; Pulkkinen 1996; Vitaro et al. 1998; Woodworth and Porter 2002).

Higher testosterone levels may convey a latent tendency toward dominating, competing, and seeking status, which is more likely to manifest as proactive aggression. Individuals high in proactive aggression tend to have a more positive view of aggression and expect positive outcomes through the use of aggressive behaviors (Smithmyer et al. 2000). Studies have shown that proactive aggression is associated with self-reported efficacy in carrying out aggressive acts; that is, proactively aggressive youth felt more competent in their aggressive prowess (Crick and Dodge 1996; Dodge et al. 1997). Proactively aggressive youth also had a higher expectation that aggressive behavior would result in material gain, respect from others (Dodge et al. 1997; Schwartz et al. 1998; Smithmyer et al. 2000), and even happy feelings (Arsenio et al. 2004).

Very little research has investigated individual differences in testosterone levels and proactive aggression specifically (see Yildirim and Derksen 2012 for a review). Nonetheless, there is tentative evidence suggesting a relationship. For example, Dabbs et al. (2001) reported that among male prison inmates who committed homicide, those with higher testosterone levels more often committed a premeditated act. Moreover, Carney and Mason (2010) showed that in moral decision-making scenarios (i.e., the trolley problem: to stop a trolley from running down five people on the track, would one flip a switch that kills one person, and would one push a heavy man off a bridge), intransigent utilitarians, who were always willing to endorse trading one life to save five, had higher testosterone levels than avoiders (who chose no action in both scenarios) and fair-weather utilitarians (who chose to flip the switch but not to push the heavy man). Individuals with higher testosterone appeared to make decisions focused disproportionately on outcome (Carney and Mason 2010), which is critical in the formation of proactive aggression (Blair 2006). We propose that disparities between the animal and human literatures linking testosterone to aggression may be due, at least partially, to species-specific differences in the motives for aggression and possibly to insufficient empirical attention to this inherent complexity in measurement with human participants.

Context-Contingency Effects of Testosterone: the Role of Harsh Discipline

As noted above, research on the testosterone-behavior relationship in humans has pointed to the importance of social contextual forces. Consistent with this perspective, Mazur (1995) reported that interactive models of testosterone and age, education, and income, predicted delinquent behavior better than a simple additive model of testosterone and these social factors. In a sample of over 4400 military veterans, the positive relationship between testosterone and delinquency/deviance was much stronger in men of low socioeconomic status (Dabbs and Morris 1990) and in men with low social integration (Booth and Osgood 1993). To the best of our knowledge, only two secondary data analyses have tested a biosocial model of testosterone with youth. In the Penn State University Family Relations Project, using a large sample (N = 400; 97% Caucasian) of established rural middle- and working class families with normally developing children and adolescents, Booth et al. (2003) showed that when parent-child relationship quality decreased, the association between testosterone and risky behavior increased. In the Great Smoky Mountains study (a sample of youth from the rural Southern United States; less than 10% African Americans), testosterone was associated with nonaggressive symptoms of conduct disorder in boys with deviant peers, but testosterone was associated with leadership behaviors in boys with non-deviant peers (Rowe et al. 2004). These two sets of findings are often cited as support for the context-contingency effect in that testosterone does not cause behavior but instead it increases the probability of expressing pre-existing behavioral tendencies given the appropriate contextual demands (Sapolsky 1998).

Within the social ecology of mostly minority, single-parent, low-income households in urban settings, and as viewed from the perspective of the main tenets of the biosocial model of the family (Booth et al. 2000), harsh discipline may serve as a salient contextual factor that moderates the association between testosterone and aggression in youth. Harsh discipline includes physically and verbally aggressive behavior towards offspring, such as corporal punishment, shouting, and threats (Reid et al. 2002). Parents’ harsh verbal and physical behavior towards youth may provide a socialization context for implicit approval of the use of aggression through their own modeling/demonstration. Harsh discipline has been associated with higher levels of aggressive behavior in youth (Gershoff et al. 2012; Wang and Kenny 2014). Youth may adopt their parents’ explosive temper in dealing with frustration (reactive aggression) and/or they may adopt their parents’ use of aggression to solve problems (proactive aggression).

We propose that harsh discipline moderates the association between testosterone and aggression, particularly proactive aggression because of its characteristics and the distinct social-cognitive processes involved. Proactive aggression has been interpreted typically in the context of social-cognitive learning theory (Bandura 1976), whereby individuals learn to use aggression as means to achieve their goals in an instrumental and planned manner. The association of testosterone and proactive aggression is likely to be exacerbated by harsh discipline which supplies social learning material for proactive aggression. In contrast, reactive aggression is typically interpreted under Berkowitz’s frustration-aggression model (1962) which theorizes that aggression is a hostile, angry reaction to perceived frustration, and it focuses heavily on the adverse triggering of an aggressive reaction, including threat, heightened anger, and frustrated expectations. Without the dominance or status seeking characteristics in reactive aggression, harsh discipline may not augment the relationship between testosterone and reactive aggression. Taken together, higher testosterone in the context of harsh discipline may increase the probability of the expression of proactive aggression and augment proactive aggression proportionally.

Diversity: Considerations and Caveats

With some exceptions (e.g., Booth et al. 2003; Fang et al. 2009; Pajer et al. 2006; Platje et al. 2015), research on the testosterone-behavior link has focused on males. Relatively little is known about the association of testosterone, or its joint effects with social factors, with aggression among females. The behavioral biology literature reveals clear and consistent sex differences in a) testosterone levels, with post-pubertal males having higher levels than females; b) the source of testosterone production, with the Leydig cells as the primary source in post-pubertal males and the peripheral metabolism of dehydroepiandrosterone (DHEA) as the primary source in females; and c) the stability of testosterone levels within and across days (see Granger et al. 2004 for a review). Similarly, the developmental psychopathology literature consistently highlights sex differences in the rates and types of aggressive behavior, with males engaging in higher levels of aggressive behavior (Moffitt et al. 2001; Odgers et al. 2008). Thus, it is clearly important to consider sex-related differences in the association between testosterone and aggression, and sex differences in the interplay between testosterone and harsh discipline on aggression.

It is noteworthy that the only studies on the context contingency effect of testosterone in youth were conducted either in the context of the family, with a rural (central PA) community sample of normally developing Caucasian youth from intact middle- and working-class households (Booth et al. 2003), or in the context of peer relationships, with a higher-risk sample of youth from the rural (western NC) southern United States (Rowe et al. 2004). These sample characteristics raise questions about how robust and applicable the context contingency effect might be across studies and across diverse families, social ecologies, and cultures.

Current Study

Using a very different sample from either Booth et al. (2003) or Rowe et al. (2004), namely a large (N = 445) community-based sample of mostly minority (80% African American) urban youth aged 11 to 12 years from predominantly single-parent households, this study advances our understanding by exploring the relationship between testosterone and two distinct types of aggression and by considering the moderating effects of harsh discipline, one of the important immediate family contexts at this developmental stage. We hypothesize that the positive association between testosterone and proactive aggression (but not reactive aggression) will be stronger among those exposed to higher levels of harsh discipline. We anticipate that this pattern of results will be more pronounced in males than in females.

Methods

Participants

Data were drawn from the initial assessment of the Philadelphia Healthy Brains and Behavior (HBB) project. The HBB project sought to identify risk and protective factors for aggression. The HBB project was approved by the institutional review boards of the University of Pennsylvania and of the Philadelphia Department of Health. Participants were recruited through advertisements in the urban community in Philadelphia. Exclusion criteria were a diagnosis of a psychotic disorder, mental retardation, pregnancy, a pervasive developmental disorder, or current use of medication with the potential to interfere with the measurement of salivary analytes, such as steroid-based anti-inflammatory medication (for more details, see Granger et al. 2009). Participants visited the university laboratories where data were collected. Caregivers gave informed consent and youth gave assent after a description of the study was given. The HBB project had a treatment component that was administered to a subsample after the initial assessment, and the selection of participants at ages 11 and 12 years was predicated on the project goal of conducting a treatment study before participants entered their teen years. For a comprehensive description, see Liu et al. (2013). The data used in the present analyses were collected at the initial assessment.

Of the 446 participants who were enrolled in the HBB project, one participant skipped items on both the reactive and proactive aggression subscales, three skipped items only on the reactive aggression subscale, and six skipped items only on the proactive aggression subscale. Because proactive and reactive aggression were the outcome variables (see Analytical Strategy) and cases missing on an outcome would not contribute to model estimates (Allison 2009), these participants were excluded from the corresponding analyses, resulting in 442 participants for the analysis of reactive aggression and 439 participants for the analysis of proactive aggression. The final analytic sample comprised 445 participants (50.80% male) who had non-missing data on at least one of proactive and reactive aggression (only one participant was missing on both). On average, they were 11.93 years old (SD = 0.60). The sample included participants who identified themselves as African-American (N = 358), White (N = 53), and other/mixed-race ethnicity (N = 34). On average, the household monthly income was $2807.26 (SD = $3064.78). Regarding caregivers’ marital status (1.35% missing data), 15.73% were divorced or separated, 1.80% were widowed, 56.63% were never married, and 24.49% were married and living with their spouses.

Measures

Reactive and Proactive Aggression

Proactive aggression and reactive aggression were measured with the 23-item Reactive-Proactive Aggression Questionnaire (RPQ; Raine et al. 2006). Participants rated each item on a 3-point scale (0 = never; 1 = sometimes; 2 = often). A sum score of 11 items assesses reactive aggression (e.g., “reacted angrily when provoked by others”; “had temper tantrums”) and a sum score of the remaining 12 items assesses proactive aggression (e.g., “had fights with others to show who was on top”; “hurt others to win a game”). Both subscales had good internal consistency in this sample (Cronbach’s α = 0.80 and 0.81 for proactive and reactive aggression, respectively). To evaluate the divergent validity of proactive and reactive aggression, we tested their correlations with self-reported and parent-reported callous-unemotional (CU) traits and impulsivity, measured with the corresponding subscales of the Antisocial Process Screening Device (Frick and Hare 2001). Overall, proactive, but not reactive, aggression was associated with CU traits, and reactive aggression was associated with impulsivity (see Online Resource 1). It has also been shown elsewhere that the RPQ has adequate convergent validity, discriminant validity, criterion validity, and construct validity (Raine et al. 2006). Proactive aggression was log-transformed because of non-normality; reactive aggression was normally distributed.
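For readers who wish to reproduce an internal-consistency estimate of the kind reported above, the following is a minimal sketch of Cronbach’s α computed from a respondents-by-items score matrix. The function name and the toy responses are hypothetical illustrations and are not study data.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) matrix of item scores."""
    x = np.asarray(items, dtype=float)
    k = x.shape[1]                                   # number of items
    sum_item_var = x.var(axis=0, ddof=1).sum()       # sum of item variances
    total_var = x.sum(axis=1).var(ddof=1)            # variance of sum scores
    return (k / (k - 1)) * (1.0 - sum_item_var / total_var)

# Hypothetical responses on a 0-2 scale (illustration only, not study data)
toy = [[0, 0, 1],
       [1, 1, 1],
       [2, 1, 2],
       [0, 1, 0],
       [2, 2, 2]]
alpha = cronbach_alpha(toy)
```

Applied separately to the 11 reactive items and the 12 proactive items, the same function would yield one coefficient per subscale.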

Harsh Discipline

Harsh discipline was measured with the Conflict Tactics Scale (CTS; Straus 1979). Physically harsh discipline was measured with the CTS minor assault/corporal punishment subscale (Straus et al. 1998). This subscale has 3 items: “parents throwing something at you”; “pushing, grabbing or shoving you”; and “slapping or spanking you”. Following Wang and Kenny (2014), we used three items from the psychological aggression subscale of the CTS to measure verbally harsh discipline: “parents insulting or swearing at you”; “doing or saying something to spite you”; and “threatening to hit or throw something at you”. Parents self-reported the frequency of using these discipline practices in the past 12 months on a 6-point scale (0 = never; 5 = most of the time). Youth responded on the same 6-point scale regarding their parents’ use of harsh discipline towards them. The harsh discipline scale had good internal consistency (Cronbach’s α = 0.77 and 0.80 for parent report and child report, respectively). Parent-reported and child-reported harsh discipline were significantly correlated, although the effect size was small (r = 0.12, p = 0.013), and parents reported more use of harsh discipline than youth did (paired-sample t = 2.40, p = 0.017). The harsh discipline measures were averaged across informants to reduce bias and to weight both informants equally rather than relying more heavily on one than the other.

Pubertal Development

Boys were presented with drawings illustrating the five Tanner stages of pubic hair and genital development. Girls were presented with drawings illustrating the five Tanner stages of pubic hair and breast development. Youth were then instructed to choose the drawing closest to their stage of development (Morris and Udry 1980). The pubic hair stage and the breast/genital stage were averaged to yield an overall score (range: 1 to 5). Means and standard deviations by sex are reported in Table 1.

Table 1 Means, standard deviations and correlations among main variables by sex

Collection of Saliva and Determination of Testosterone

Saliva samples were collected in the morning from each participant in the laboratory at the University. The youth were instructed to refrain from food and drink (except water) for at least 60 min prior to sample donation (Granger et al. 2012). Participants provided whole, un-stimulated, saliva by passive drool method (Granger et al. 2007). The saliva sample was collected, on average, at 9:18 a.m., with 95% of the saliva samples collected between 8:56 a.m. and 9:58 a.m. Immediately after collection, specimens were frozen and stored at −80 °C until assay. Samples were assayed for salivary testosterone using a commercially available enzyme immunoassay (Salimetrics, State College, PA) without modification to the recommended protocol. The assay had the sensitivity of 1 pg/ml, range of calibrators from 6.1 to 600 pg/ml, and on average intra- and inter-assay coefficients of variation were 4.50% and 6.94%. Means and standard deviations by sex are reported in Table 1.

Analytical Strategy

Analyses were conducted in Mplus 8 (Muthén and Muthén 1998–2017) using maximum likelihood estimation with robust standard errors. Missing data, ranging from 0 to 4% on each predictor, were handled with maximum likelihood under the missing-at-random assumption. Given the potential sex differences in the testosterone-behavior association, we conducted a multi-group linear regression allowing all the estimates and variances to vary across sex. We tested two separate models, one with proactive aggression as the outcome and one with reactive aggression as the outcome. Both models included the effects of testosterone and harsh discipline and their interaction as predictors. We controlled for pubertal development because we were interested in estimating the relationship between testosterone, harsh discipline and aggression beyond what was accounted for by pubertal development (Granger et al. 2004). We adjusted for household income and race (dichotomized; 1 = African American, 0 = other) in the models because household income was linked to aggression and harsh discipline, and race was linked to aggression and testosterone (at least among females). Saliva sample collection time was not significantly correlated with testosterone level (r = 0.03, p = 0.59), harsh discipline (r = 0.03, p = 0.56), reactive aggression (r = 0.03, p = 0.57) or proactive aggression (r = 0.02, p = 0.64). Thus, saliva sample collection time was not included in the models. Continuous predictors were centered at their means to facilitate interpretation.
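The core of the model described above, a regression of aggression on mean-centered testosterone, mean-centered harsh discipline, and their product, estimated separately by sex, can be sketched as follows. This is an illustration only: it uses ordinary least squares in NumPy rather than Mplus maximum likelihood with robust standard errors, it omits the covariates and missing-data handling, and all variable names and simulated values are hypothetical. With complete data, fitting each group separately gives the same point estimates as a multi-group model in which every parameter is free to vary across groups.

```python
import numpy as np

def fit_moderated_regression(y, x, m):
    """OLS of y on centered x, centered m, and their product.

    Returns [intercept, b_x, b_m, b_interaction].
    """
    y = np.asarray(y, dtype=float)
    xc = np.asarray(x, dtype=float) - np.mean(x)   # center the predictor
    mc = np.asarray(m, dtype=float) - np.mean(m)   # center the moderator
    X = np.column_stack([np.ones_like(xc), xc, mc, xc * mc])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Simulated placeholder data (not study data)
rng = np.random.default_rng(0)
n = 200
sex = rng.integers(0, 2, n)              # hypothetical coding: 0 and 1
testosterone = rng.normal(50, 15, n)
harsh = rng.uniform(0, 10, n)
aggression = rng.normal(0, 1, n)

# One set of coefficients per group, all free to differ across groups
estimates = {
    g: fit_moderated_regression(aggression[sex == g],
                                testosterone[sex == g],
                                harsh[sex == g])
    for g in (0, 1)
}
```

Because both predictors are centered, the coefficient for testosterone is its slope at the mean level of harsh discipline, which is the interpretation used in the Results.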

Interactions were probed with the Johnson-Neyman technique (J-N technique; Hayes and Matthes 2009; Johnson and Fay 1950; Johnson and Neyman 1936) to reveal the regions of significance of harsh discipline wherein the relationship between testosterone and proactive/reactive aggression was significant. The region of significance was defined by the 95% confidence interval. If the 95% confidence interval enclosed the value of zero, then the association between testosterone and proactive/reactive aggression was not significant. The plots for regions of significance also demonstrate the changes in magnitudes of the slope of testosterone on aggression. Simple slopes were also plotted for significant interactions at the level of 1 standard deviation below and above the mean of harsh discipline (Aiken and West 1991; Cohen et al. 2002) to illustrate not only the magnitude of the slopes of testosterone on aggression but also the relative level of the aggression.
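The logic of the region-of-significance analysis described above can be sketched in a few lines. Given the coefficient for testosterone (b1), the interaction coefficient (b3), and their sampling variances and covariance, the simple slope at moderator value m is b1 + b3·m, and the Johnson-Neyman boundaries are the moderator values at which the confidence band for that slope touches zero. The function names and the numeric inputs below are hypothetical, a large-sample critical value of 1.96 is assumed, and the moderator is assumed to be on the same (centered) scale as in the fitted model.

```python
import math

def simple_slope(b1, b3, v11, v13, v33, m, crit=1.96):
    """Slope of the predictor at moderator value m, with its confidence limits."""
    slope = b1 + b3 * m
    se = math.sqrt(v11 + 2 * m * v13 + m * m * v33)
    return slope, slope - crit * se, slope + crit * se

def johnson_neyman_bounds(b1, b3, v11, v13, v33, crit=1.96):
    """Moderator values where the slope's confidence band crosses zero.

    Solves (b1 + b3*m)^2 = crit^2 * (v11 + 2*m*v13 + m^2*v33) for m.
    """
    a = b3 ** 2 - crit ** 2 * v33
    b = 2 * (b1 * b3 - crit ** 2 * v13)
    c = b1 ** 2 - crit ** 2 * v11
    if a == 0:                       # degenerate case: equation is linear
        return [] if b == 0 else [-c / b]
    disc = b * b - 4 * a * c
    if disc < 0:                     # no real boundary exists
        return []
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

# Hypothetical estimates (illustration only, not values from this study)
bounds = johnson_neyman_bounds(b1=0.01, b3=0.02, v11=0.0004, v13=0.0, v33=0.00004)
```

Evaluating simple_slope on a grid of moderator values reproduces the kind of slope-and-confidence-band plot used to display regions of significance.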

Results

Descriptive Analyses

As shown in Table 1, testosterone was not directly correlated with either proactive aggression or reactive aggression among boys or girls. Household income was inversely correlated with both reactive and proactive aggression among girls but not boys. As expected, girls were at more advanced Tanner Stages of puberty than boys and testosterone was positively correlated with pubertal development in both boys and girls. Boys had significantly higher levels of testosterone than girls. We further examined the testosterone level by pubertal stages in each sex and results are reported in Table 2. Consistent with the prior literature, sex differences in testosterone levels were mainly observed among those at Tanner Stage 3 and above, and the effect size of sex differences as measured in Cohen’s d was much larger at later Tanner Stages. The correlation coefficient between testosterone and pubertal ratings was larger in boys than girls (z = 3.61, p < 0.001; r = 0.53 and 0.23 in boys and girls). Testosterone was positively associated with chronological age in boys (r = 0.47, p < 0.001) but not in girls (r = 0.10, p = 0.14), and coefficients of the testosterone-age correlation differed significantly between boys and girls (z = 4.25, p < 0.001).
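Comparisons of correlation coefficients between independent groups, such as the z statistics reported above, are commonly made with Fisher’s r-to-z transformation; whether that exact procedure was used here is an assumption. The sketch below shows the calculation. The group sizes passed in the example are hypothetical round numbers, because exact per-sex sample sizes for each correlation are not given in the text, so the result need not match the reported statistics.

```python
import math

def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for the difference between two independent correlations.

    Returns the z statistic and its two-sided p value.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher transformation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # SE of the difference
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # two-sided normal p value
    return z, p

# Correlations from the text; group sizes of 220 are hypothetical
z_stat, p_value = compare_independent_correlations(0.53, 220, 0.23, 220)
```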

Table 2 Testosterone by sex and by pubertal stages

Testosterone, Harsh Discipline and Proactive Aggression

Multi-group regression on proactive aggression revealed a significant effect of harsh discipline when testosterone was held at the mean level, and a significant interaction between harsh discipline and testosterone for both boys and girls (see Table 3, Model 1). We plotted the region of significance for the slope of testosterone on proactive aggression (see Fig. 1). Higher testosterone levels were associated with more proactive aggression among boys who experienced high levels of harsh discipline (at levels greater than 8.44), and among girls who experienced high levels of harsh discipline (at levels greater than 7.80). For individuals experiencing high levels of harsh discipline, as indicated by the area where the 95% confidence band did not include zero in Fig. 1, the positive association between testosterone and proactive aggression became stronger as harsh discipline increased. Among girls who experienced low to average levels of harsh discipline (at levels below 3.36), higher testosterone levels were associated with less proactive aggression, and this inverse relationship grew stronger as harsh discipline levels decreased. The simple slopes in Fig. 2 revealed that individuals had similar levels of proactive aggression at low levels of testosterone regardless of sex and harsh discipline experiences. At a high level of harsh discipline, those with higher levels of testosterone displayed more proactive aggression than those with lower levels of testosterone; a similar pattern was evident in both boys and girls. At a low level of harsh discipline, girls with higher levels of testosterone displayed much less proactive aggression than those with low levels of testosterone, but this pattern was not observed in boys.

Table 3 Multi-group analysis of the effects of testosterone and harsh discipline on proactive and reactive aggression
Fig. 1

The association between testosterone and proactive aggression at all levels of harsh discipline in boys (top panel) and girls (bottom panel). Caption: When harsh discipline is above 8.44 for boys and above 7.80 for girls, testosterone is positively linked to proactive aggression; and when harsh discipline is below 3.36, testosterone is inversely linked to girls’ proactive aggression

Fig. 2

The association between testosterone and proactive aggression by harsh discipline and sex. Caption: * indicates the 95% confidence interval (CI) of the slope does not include zero. Proactive aggression is positively associated with testosterone among youth who experienced high level of harsh discipline (Boys: b = 0.05, 95%CI = [0.003, 0.106]; Girls: b = 0.09, 95%CI = [0.02, 0.16]); proactive aggression is negatively associated with testosterone among girls who experienced low level of harsh discipline (b = −0.09, 95%CI = [−0.14, −0.04]), but not among boys who experienced low level of harsh discipline (b = −0.03, 95%CI = [−0.08, 0.02])

A Wald test suggested no sex differences in the interaction between harsh discipline and testosterone on proactive aggression (see Table 3). We examined statistical power with Monte Carlo simulation analyses. With the current sample size, we had 70% and 97% power to detect an interaction with a standardized effect size of 0.16 in boys and 0.25 in girls, respectively, and only 20% power to detect sex differences in the interaction. With a sample size of 800 boys and 800 girls, we would have had 99% and 100% power to detect the interactive effect in boys and in girls, respectively, and 50% power to detect sex differences in the interaction on proactive aggression. It is likely that the interaction of testosterone and harsh discipline on proactive aggression does not differ between boys and girls (see Online Resource 2 for additional details).
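The general idea behind a Monte Carlo power analysis for an interaction term can be sketched as follows: simulate many data sets under an assumed effect, fit the model to each, and count how often the interaction is detected. This is not the simulation reported above (see Online Resource 2 for that); it assumes standard-normal predictors, unit residual variance, ordinary least squares, and a critical value of 1.96, and the raw coefficient `beta_int` only approximates a standardized effect size. All names are hypothetical.

```python
import numpy as np

def interaction_power(n, beta_int, n_sims=500, crit=1.96, seed=0):
    """Share of simulated samples in which the interaction term is significant."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        x = rng.standard_normal(n)                      # simulated predictor
        m = rng.standard_normal(n)                      # simulated moderator
        y = beta_int * x * m + rng.standard_normal(n)   # outcome with unit noise
        X = np.column_stack([np.ones(n), x, m, x * m])
        xtx_inv = np.linalg.inv(X.T @ X)
        b = xtx_inv @ X.T @ y                           # OLS coefficients
        resid = y - X @ b
        sigma2 = resid @ resid / (n - X.shape[1])       # residual variance
        se_int = np.sqrt(sigma2 * xtx_inv[3, 3])        # SE of the interaction
        hits += int(abs(b[3] / se_int) > crit)
    return hits / n_sims

# Example: estimated power at two hypothetical per-group sample sizes
power_small = interaction_power(n=100, beta_int=0.16)
power_large = interaction_power(n=800, beta_int=0.16)
```

Power to detect a sex difference in the interaction would require simulating both groups and testing the difference between their interaction coefficients, which generally demands larger samples than detecting the interaction within a single group.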

Testosterone, Harsh Discipline and Reactive Aggression

Multi-group regression on reactive aggression revealed a significant interaction of harsh discipline and testosterone among boys but not girls. The effect of harsh discipline on reactive aggression was significant for both sexes when testosterone was held at the mean level. As shown in Fig. 3, among boys who experienced low to average levels of harsh discipline (at levels below 4.20), higher testosterone levels were associated with less reactive aggression, and the effect size of this inverse relationship became larger when harsh discipline levels decreased. There was no association between testosterone and reactive aggression among girls at any level of harsh discipline. The simple slope effect in Fig. 4 revealed that boys have similarly high levels of reactive aggression at a high level of harsh discipline regardless of testosterone levels, whereas at a low level of harsh discipline, those with higher levels of testosterone have less reactive aggression than those with lower levels of testosterone.

Fig. 3

The association between testosterone and reactive aggression at all levels of harsh discipline by boys (left panel) and girls (right panel). Caption: When harsh discipline is below 4.20 for boys, the higher the testosterone, the lower the reactive aggression. There is no association between testosterone and reactive aggression at any level of harsh discipline among girls

Fig. 4

The association between testosterone and boys’ reactive aggression by harsh discipline. Caption: * indicates the 95% confidence interval (CI) of the slope does not include zero. Testosterone was negatively associated with reactive aggression among boys who experienced a low level of harsh discipline (b = −0.38, 95%CI = [−0.69, −0.09])

A Wald test revealed no sex difference in the interaction between harsh discipline and testosterone on reactive aggression (see Table 3). Simulation analyses assuming small effect sizes revealed that we had 56% power to detect an interaction in boys, and only 31% power to detect sex differences in the interaction on reactive aggression. With a sample size of 800 boys and 800 girls, we would have had 97% power to detect the interactive effect in boys and 80% power to detect sex differences in the interactive effect. It appears that we were under-powered to determine whether the interaction of harsh discipline and testosterone on reactive aggression varies by sex (see Online Resource 2 for additional details).

Discussion

This study tested the context contingency effect of testosterone on aggression. Our analyses reveal an interaction between testosterone and harsh discipline on proactive aggression in both boys and girls, and an interaction between testosterone and harsh discipline on reactive aggression in boys only. More specifically, among youth who experienced high levels of harsh discipline, testosterone was positively associated with proactive aggression, and the association became stronger as harsh discipline increased. Unexpectedly, among youth who experienced below-average levels of harsh discipline, there was an inverse relationship between testosterone and boys’ reactive aggression and between testosterone and girls’ proactive aggression; we term these protective effects of testosterone against aggression. As harsh discipline decreased, the protective effect of testosterone against girls’ proactive aggression and boys’ reactive aggression became stronger. The association between testosterone and reactive aggression was non-significant for girls, regardless of exposure to harsh discipline. As noted above, at the operational level, the context contingency effect implies that testosterone does not cause behavior, but instead that it increases the probability that pre-existing behavioral tendencies will be expressed given the appropriate contextual demands (Sapolsky 1998). Our observations confirm this basic assumption. To our knowledge, this is the first study to document an interaction of testosterone and harsh discipline on proactive aggression in both boys and girls and on reactive aggression in boys. These findings advance our understanding in important ways and are discussed in relation to the emerging scientific narrative describing the complex relationship between testosterone, social context, and the display of aggression in early adolescence.

The Link between Testosterone and Aggression at High Levels of Harsh Discipline

Consistent with our prediction, testosterone was positively associated with proactive aggression at high levels of harsh discipline. This is in line with prior findings on adverse social contexts: when youth had deviant peers, testosterone was positively associated with conduct problems (Rowe et al. 2004). Harsh discipline may influence youth at both cognitive and behavioral levels. First, there could be a behavior modeling effect (Bandura 1976; Polman et al. 2007) in that youth are likely to adopt aggression as a means of achieving goals, obtaining power, and getting their way, just as their parents use verbal and physical aggression to discipline them. Second, proactive aggression is generally linked to a more positive attitude and expectancy about aggression (Schwartz et al. 1998; Smithmyer et al. 2000), and a household with high levels of harsh discipline may subtly instill social cognitive processes that frame aggression as acceptable, more “normative,” or even positive. As a result, harsh discipline could permit the expression of the link between testosterone and proactive aggression; in this family context, youth with high testosterone may express their status-seeking tendency in aggressive forms and consider aggressive acts “best” for seeking status and dominance. Future studies testing these possible explanations are needed to elucidate the cognitive/behavioral mechanisms involved.

The Link between Testosterone and Aggression at Low Levels of Harsh Discipline

Within the context of low harsh discipline or its absence, the higher the testosterone, the lower the boys’ reactive aggression and the lower the girls’ proactive aggression. We did not predict such protective effects, although the context contingency hypothesis does not preclude them. Our findings are not the first to suggest a protective effect of testosterone in specific social contexts. Careful examination of Booth et al. (2003) shows that when the mother-son relationship was of high quality, testosterone was inversely associated with boys’ risky behavior. In addition to the exacerbating effect of deviant peers on the link between testosterone and conduct problems, Rowe et al. (2004) also showed that at low levels of deviant peers, testosterone was associated with positive outcomes such as leadership. Together, these findings highlight the importance of social context in conditioning a bidirectional (positive and negative) association between testosterone and behavioral outcomes.

At low levels of harsh discipline, parents may adopt prosocial and constructive means of disciplining youth. Youth with high testosterone could adopt prosocial behavior more readily and consequently display less aggression. Tentative evidence for this hypothesis emanates from the literature on testosterone dynamics in laboratory settings, which shows that testosterone prompts the types of behaviors that are needed to maintain status (Eisenegger et al. 2011, 2010; Sapolsky 2017). If the situational context requires prosocial behavior to maintain status, then higher testosterone would be associated with higher levels of prosocial behavior. For example, in the Ultimatum Game, where participants decide how to split money with another player (an experiment confederate) and where status and reputation depend on being fair, injection of testosterone was found to result in more generous offers (Eisenegger et al. 2010). Extending these experimental findings to baseline testosterone, it is possible that in a broader family context where prosocial behaviors are promoted and demonstrated and are considered good strategies for maintaining social status, high testosterone would be associated with high prosocial behavior and, in turn, low aggression. Alternatively, if low levels of harsh discipline mean that parents adopt an uninvolved/permissive discipline approach, our findings would be difficult to interpret and would require further exploration. The protective effect of testosterone appeared to diverge between boys and girls across forms of aggression. It is possible that different socialization processes for boys and girls play a role in determining which form of aggression is more strongly deterred for each sex in modern society, leading to the divergent sex patterns in the protective effect of testosterone against aggression. We caution that replication is needed and that our explanations here are only tentative. Future studies need a more comprehensive assessment of family discipline practices, ideally including both prosocial/constructive and harsh discipline dimensions, to replicate the current findings and shed light on the mechanisms underlying the protective effect of testosterone against aggression at low levels of harsh discipline.

The inverse association between testosterone and girls’ proactive aggression and boys’ reactive aggression was not predicted but was nevertheless consistent with the context contingency effect of testosterone. Interestingly, our findings also fit well with another theoretical framework that has not garnered much attention in the testosterone-behavior literature, namely, the differential susceptibility hypothesis (Belsky et al. 2007). This evolutionary-inspired theory postulates that individuals with high susceptibility are those most susceptible to environmental influences, “for better and for worse” (Belsky and Pluess 2009). In other words, individuals with high susceptibility would reap the best of the environment if they are in a supportive environment, but they are also the ones who could have the worst outcomes if they are exposed to adversity; individuals with low susceptibility would exhibit much less fluctuation in their behavioral outcomes when placed in different environments. Belsky and Pluess (2009) explicitly stated that which factors confer susceptibility is an empirical question that needs to be answered. Our findings suggest that testosterone serves as one such susceptibility factor rather than, as generally conceptualized, a vulnerability factor with a direct link to high aggression only. Our findings reveal both an exacerbating and a protective effect of testosterone, depending on context, that warrant further examination in future studies.

Measurement Issues on Testosterone

The validity of the testosterone measure was examined in two ways. First, we examined the sex differences in the correlation between testosterone and pubertal stages, and between testosterone and chronological age. Consistent with previous findings (Granger et al. 2004), testosterone in our study was more strongly linked to both pubertal stages and chronological age in boys than in girls. Second, given the relatively small, though statistically significant, sex differences in testosterone level, we further examined the sex differences in testosterone by the five Tanner pubertal development stages. The findings were comparable to those found in a recent study (Table 4 in Shirtcliff et al. 2009) in that sex differences in testosterone were found only at Tanner Stages 4 and 5, and that at Tanner Stage 5 boys’ testosterone level was about 60% higher than girls’. Some differences between Shirtcliff et al.’s (2009) study and ours are noteworthy: a) Shirtcliff et al. (2009) averaged several samples collected between 9:38 a.m. (waking) and 9:04 p.m. (bedtime) across five days, which would generate lower testosterone values than the morning testosterone level in our study (collected at 9:18 a.m. on average, with 95% of the samples collected between 8:56 a.m. and 9:58 a.m.), because testosterone levels decrease continuously throughout the day; and b) participants in Shirtcliff et al.’s (2009) study were primarily White whereas our participants were predominantly African American. Given the limited research on testosterone in African American youth, our study provides, for the first time, a comparable sample for future studies focused on African Americans.

We relied on a single time point assessment of testosterone, which raises a potential concern. However, testosterone is in general stable across the day and across developmental stages in terms of relative rank rather than absolute values (Granger et al. 2004). Conservatively assuming that our observed testosterone measures the true score at a reliability of 0.5 (meaning that only 50% of the variance in the observed testosterone levels could be attributable to true individual differences in testosterone), we conducted additional analyses (available upon request) with a latent factor for testosterone accounting for measurement error, and we replicated all the main interaction findings. Additionally, our findings are unlikely to be attributable solely to random noise in testosterone because samples were collected at the same hours of day across individuals, and moment-to-moment influence (i.e., state influence) on testosterone is unlikely to generate systematic covariation with other measures. Nonetheless, future studies should replicate and extend the current findings with several indicators of testosterone.
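One common way to build an assumed reliability into a single-indicator latent factor is to fix the indicator's error variance at (1 − reliability) × observed variance. The sketch below shows that arithmetic, together with Spearman's classic correction for attenuation of a correlation. It illustrates the general technique only and is not a description of the additional analyses reported here; the function names and input values are hypothetical.

```python
import math

def fixed_error_variance(observed_variance, reliability):
    """Error variance to fix for a single-indicator latent factor."""
    return (1.0 - reliability) * observed_variance

def disattenuate(r_observed, reliability_x, reliability_y=1.0):
    """Spearman's correction: estimated correlation between true scores."""
    return r_observed / math.sqrt(reliability_x * reliability_y)

# Hypothetical inputs: observed variance of 2.0 and an observed correlation of 0.20.
print(fixed_error_variance(2.0, 0.5))
print(round(disattenuate(0.20, 0.5), 3))
```

With a reliability of 0.5, half of the observed variance is treated as error, and an observed correlation understates the true-score correlation by a factor of the square root of 0.5, assuming the other variable is measured without error.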

Limitations

Findings should be interpreted in light of some limitations. First, the cross-sectional design precludes causal inference. Future longitudinal research is needed to address the temporal order and to incorporate components that explicitly examine within-individual change and prosocial discipline practices, in order to test the pathway we proposed. Second, the participants in this study were drawn from a very narrow age span of 11 to 12 years old and were mostly African American. This was a convenience sample. Thus, the conclusions drawn may not generalize to youth at different pubertal developmental stages and of other racial backgrounds. It is of note that prior research on testosterone comprised mostly White participants (e.g., Booth et al. 2003; Carre and McCormick 2008), and our findings from a sample of predominantly African American youth add to the literature on testosterone. The findings could, however, be specific to the diverse features and social ecology of inner cities in the U.S., and generalization warrants caution. Third, we did not collect data on menstrual cycles and birth control use for girls in this sample, and future studies should include these potential confounders, especially for mid- and late-adolescent girls.

Conclusion

Findings of this study support some basic assumptions of the biosocial model: links between testosterone and aggression were contingent upon social contextual forces (i.e., harsh discipline). Specifically, harsh discipline exacerbates the positive link between testosterone and proactive aggression in both boys and girls. Novel findings include a protective effect of testosterone against girls’ proactive aggression and boys’ reactive aggression, which is in line with the context contingency effect of testosterone but requires replication. Testosterone may be a susceptibility factor that renders individuals most sensitive to both the good and the bad elements of the environment.