Men’s facial width-to-height ratio (fWHR) has recently emerged as an important variable in social cognition, as evidence suggests that it predicts both perceptions of aggressiveness and actual aggressive behavior (for a review, see Geniole et al. 2015). However, the physiological variables that produce and explain variance in fWHR remain unknown. Here, we test whether a specific polymorphism in the androgen receptor (AR) gene (the number of cytosine-adenine-guanine repeats, hereafter CAGn) may explain variance in men’s fWHR, as well as links between fWHR and behavior.

fWHR Predicts Aggressive Behavior

fWHR has been shown to positively predict reactive aggression in the laboratory, as well as aggressive behavior in hockey players (Carré and McCormick 2008; cf. Deaner et al. 2012). Evidence suggests that violent men have wider faces than non-violent men (Christiansen and Winkler 1992), and fWHR has positively predicted success among professional fighters (Zilioli et al. 2015) and negatively predicted estimates of the odds of dying from contact violence (Stirrat et al. 2012). In laboratory tasks, fWHR positively predicted rates of cheating (Geniole et al. 2014), and the odds of exploiting a partner’s trust (Stirrat and Perrett 2010). fWHR did not predict personality variables in a recent large sample (Kosinski 2017); however, this study relied on self-report measures and did not focus directly on aggressive behavior.

Consistent with their use of more aggressive behavioral strategies, men with higher fWHRs have been perceived as more aggressive (Carré et al. 2009), dominant (Alrajih and Ward 2014) and intimidating (Hehman et al. 2013), but as less trustworthy (Stirrat and Perrett 2010), prosocial and desirable as a friend (Eisenbruch et al. 2016). (See Geniole et al. 2015; Haselhuhn et al. 2015 for meta-analytic reviews.) These perceptions have important social consequences: women preferred to keep greater physical distance between themselves and wide-faced men (Lieberz et al. 2017), and fWHR negatively predicted how generously men were treated by other men in a cooperation game (Eisenbruch et al. 2016), but positively predicted the odds of receiving the death penalty if convicted of murder (Wilson and Rule 2015).

What Explains Individual Differences in fWHR?

Given this suite of associated behaviors and perceptions, it has been suggested that male fWHR may have evolved to serve as a signal of intrasexual threat and/or dominance (Geniole et al. 2015). Men in ancestral environments who engaged in more dominant behavioral strategies may have benefited from the intimidation associated with a wider face, more so than men who played more cooperative strategies. If fWHR is a component of an intrasexually-competitive mating strategy, and testosterone calibrates mating effort overall (Bribiescas 2001; Roney and Gettler 2015), one might expect variance in fWHR to be related to, and perhaps caused by, variance in testosterone concentrations. Essentially, elevated testosterone might both promote development of a wider face and calibrate brain mechanisms toward more aggressive behaviors, thereby producing a functional coordination between behavioral strategies and the morphological features that promote their effectiveness (see Roney 2016). In support of this possibility, there is some evidence of relationships between other measures of facial masculinity and dominance and both prenatal testosterone concentrations and adult testosterone reactivity (e.g. Neave et al. 2003; Pound et al. 2009), but this literature is mixed (e.g. Burriss et al. 2007).

Recent studies have tested relationships between testosterone and fWHR. Lefevre et al. (2013) reported weak, but positive and statistically significant, relationships between men’s fWHR and their baseline testosterone concentrations, as well as with the magnitude of their changes in testosterone after a speed-dating event. However, a meta-analysis with 9 studies and over 1000 participants found no relationship between fWHR and either baseline testosterone or the magnitude of testosterone response to competition (Bird et al. 2016; see also Whitehouse et al. 2015).

Since facial shape may be relatively fixed by early adulthood, it has been suggested that fWHR may reflect pubertal testosterone exposure more so than adult circulating concentrations (Carré and McCormick 2008), though evidence for this is mixed. A cross-sectional study of males ages 8 to 23 among the Tsimane of Bolivia found a positive correlation between testosterone and fWHR, controlling for age (Hodges-Simeon et al. 2016), and this relationship was strongest among males ages 12–16 (see Welker et al. 2016). This relationship may not be causal, however; secondary sex traits that develop under the influence of pubertal testosterone (e.g. deep voice) typically increase across adolescence and exhibit a growth spurt that co-occurs with the pubertal testosterone surge (e.g. Hodges-Simeon et al. 2013). fWHR in the Tsimane sample did not show these characteristics, suggesting that the pubertal testosterone surge is unlikely to directly cause fWHR growth (Hodges-Simeon et al. 2016). The observed correlation between fWHR and testosterone during puberty might instead be the result of face width being determined by prenatal androgen exposure, which itself could be positively correlated with adolescent testosterone production (Hodges-Simeon et al. 2016). However, the evidence for prenatal testosterone influences on fWHR is mixed as well: although some research has supported a correlation between the second-to-fourth digit ratio (a proposed proxy of prenatal androgen exposure) and facial width (Fink et al. 2005; Weinberg et al. 2015), another study both failed to replicate that effect and reported a null relationship between umbilical cord blood testosterone and adult fWHR (Whitehouse et al. 2015). In addition, there is not compelling evidence of sexual dimorphism in fWHR, which might be expected if it were promoted by testosterone (Kramer 2017; Lefevre et al. 2012).

In sum, there is mixed and generally weak evidence for a positive relationship between testosterone and fWHR, which leaves largely unexplained both the physiological factors that may account for individual differences in fWHR, as well as how this feature becomes linked to behaviors. Because variability in the androgen receptor (AR) gene has been linked to degree of androgenization of multiple traits, polymorphisms in this gene might explain variability in both fWHR and the adoption of androgen-linked behavioral strategies.

The AR Gene Moderates the Effects of Testosterone

The AR acts as a ligand-activated transcription factor – i.e. it regulates the expression of various genes when bound to androgens – and thus variability in its functioning has the potential to exert coordinated effects on multiple components of the phenotype. The human AR gene contains a variable number of CAG codon repeats (CAGn) in the first exon, normally distributed between 9 and 31 in healthy populations (e.g. Alevizaki et al. 2003; Edwards et al. 1992). Lower CAGn results in greater transcriptional activity and protein expression of the AR (Chamberlain et al. 1994; Choong et al. 1996), which causes greater androgenic effects per unit of testosterone (Zitzmann and Nieschlag 2007; but see also Ryan et al. 2017). Thus, although it is only a single gene, variation in the AR gene can affect the expression of many other genes and cause widespread phenotypic effects.

Because CAGn moderates the effects of testosterone, and testosterone has been linked to mating effort, CAGn may directly predict variability in somatic, physiological, and behavioral components of such effort. Consistent with this, men with lower CAGn exhibited a greater testosterone increase in response to social interactions with women (Roney et al. 2010), and men who combined low CAGn and high circulating testosterone were more likely than more moderately-androgenic men to experience relationship instability and (among fathers) to provide only minimal childcare (Gettler et al. 2017). CAGn inversely predicted men’s strength and dominance (Simmons and Roney 2011), their levels of extraversion (Lukaszewski and Roney 2011), and their likelihood of committing violent crime (Rajender et al. 2008), suggesting that CAGn may be related to intrasexual competitive effort in particular. In one study of hunter-gatherers, men’s CAGn negatively predicted aggressiveness, which in turn positively predicted their number of children (Butovskaya et al. 2015), but a relationship between CAGn and reproductive success was not found in other cultural contexts (e.g. Gettler et al. 2017; Gray et al. 2009).

The Present Study

If fWHR is an accurate predictor of competitive behavioral tendencies, and CAGn calibrates individual differences in intrasexual competitiveness, then a pleiotropic effect of CAGn on both facial morphology and behavioral proclivities might explain why fWHR is associated with specific behavioral patterns. Because CAGn moderates the effects of circulating testosterone, furthermore, AR gene sequence and measured testosterone concentrations may combine to explain variance in fWHR, and both Bird et al. (2016) and Hodges-Simeon et al. (2016) speculated that an interaction between CAGn and testosterone may partially explain the null relationships they found between fWHR and adult and pubertal testosterone, respectively. Here, we used an existing dataset (see Roney et al. 2010; Simmons and Roney 2011) to provide an initial test of three predictions: (1) CAGn is negatively related to fWHR; (2) CAGn negatively moderates the relationship of baseline testosterone to fWHR; (3) CAGn negatively moderates the relationship of reactive testosterone to fWHR. In addition, we were able to provide further tests of whether fWHR alone is correlated with either baseline testosterone or testosterone reactions to interactions with potential mates.

Methods

Participants

One hundred forty-nine men participated in the study for partial fulfillment of undergraduate course requirements. Facial photographs suitable for reliable measurements were obtained from 141 of these men, and failures of DNA extraction reduced the sample size for CAGn to 138; 133 participants had data for both CAGn and fWHR, comprising our final sample. Ages of these 133 men ranged from 18 to 24 (mean = 18.94, s.d. = 1.37). Eighty-five men reported being white, 29 Asian, 18 Latino, and 1 African-American.

Procedures

Participants were part of a larger study testing hormonal responses of men to social interactions with either women or men (Roney et al. 2010). These interactions took place in an initial testing session, during which men provided saliva samples before and 40 min after the social interactions. Saliva was collected by passive drool into polypropylene vials. Two-thirds of the male participants were allocated to a condition in which they interacted with female confederates who attempted to be flirtatious during the interactions (for further details, see Roney et al. 2010). Change in testosterone from before to after the interactions with women (“raw T change”) was used to test whether reactive testosterone increases were associated with men’s fWHR.

Participants returned approximately one week later, and provided a saliva sample via passive drool at the start of this second session. Since both this sample and the pre-conversation sample in session 1 were collected prior to any other tasks, we averaged the testosterone concentrations assayed from these two samples as our measure of baseline testosterone. Participants next completed a number of surveys, within which they reported their age, height, and ethnicity. Weight was recorded from a scale and this value and self-reported height were used to compute body mass index (BMI). Face photographs were taken under standardized lighting conditions and using a tripod in a standardized position. Participants were instructed to look directly at the camera with a neutral expression; only photographs containing all facial landmarks needed to calculate fWHR (see below) were used. Near the end of the second testing session, participants swished with alcohol-based mouthwash before expectorating into collection vials. These samples contained buccal cells from which DNA could be extracted in order to determine the number of CAG repeats in the AR gene. Testing sessions took place between 1 PM and 5 PM.

Participants provided written consent to participate in the larger study and for analysis of their genotypes, hormone concentrations and facial photographs. Procedures were approved by the University of California, Santa Barbara Human Subjects Committee.

Face Measurements

fWHR was measured from face photographs using ImageJ. Photos were first rotated so that a line drawn between the pupils was horizontal. Next, fWHR was measured as the ratio of the distance between the left and right edges of the face (i.e. the distance between the right and left zygion) to the distance between the top of the upper lip and the top of the eyelids (see Lefevre et al. 2013 for illustration of these distances). Two research assistants made independent measurements; because their fWHR measurements were highly correlated (r = .90), we averaged these as our measure of fWHR.
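
The measurement described above reduces to simple arithmetic on landmark coordinates. The Python sketch below is illustrative only: the function, argument names, and pixel coordinates are hypothetical and are not part of the ImageJ workflow used in the study. It assumes the photograph has already been rotated so that the interpupillary line is horizontal.

```python
from statistics import mean

def fwhr(left_zygion_x, right_zygion_x, upper_lip_y, eyelid_y):
    """Bizygomatic width divided by upper-face height (both in pixels)."""
    width = abs(right_zygion_x - left_zygion_x)
    height = abs(upper_lip_y - eyelid_y)
    if height == 0:
        raise ValueError("upper-face height must be non-zero")
    return width / height

# Hypothetical landmark coordinates marked by two independent raters
rater_a = fwhr(120.0, 520.0, 610.0, 410.0)
rater_b = fwhr(118.0, 526.0, 612.0, 410.0)

# The two raters' values are averaged to give one fWHR per participant
participant_fwhr = mean([rater_a, rater_b])
```

Because the measure is a ratio of two distances taken from the same image, it does not depend on the photograph's absolute scale, only on consistent landmark placement.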

Genotyping and Hormone Assays

Mouthwash samples were stored at −80 °C until being delivered to the Biological Samples Processing Core at UCLA for DNA extraction. Genotyping and sequencing were subsequently performed by the UCLA Sequencing and Genotyping Core. Full details of the extraction and sequencing procedure appear in Roney et al. (2010) and Simmons and Roney (2011). Consistent with previously reported values, CAGn in this sample ranged from 14 to 31 (mean = 21.74, s.d. = 2.92).

Saliva samples were shipped on dry ice for assay at the Biomarkers Core Laboratory at the Yerkes National Primate Research Center, Atlanta, GA. Salivary testosterone was assayed in triplicate using procedures described in Roney et al. (2007). Following Roney et al. (2010), the mean of the closest two out of three triplicate values was used in data analyses in order to improve the precision of testosterone estimates. Doing so improved the intra-assay coefficient of variation (CV) from 10.95 to 6.91%; the inter-assay CV was 15.60%.
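
The closest-two-of-three rule can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the laboratory's actual procedure: the function names and replicate values are hypothetical, and the coefficient of variation is computed here for a single set of replicates, whereas the reported intra-assay CV summarizes all samples.

```python
from itertools import combinations
from statistics import mean, stdev

def closest_pair_mean(triplicate):
    """Mean of the two replicate values that lie closest together."""
    if len(triplicate) != 3:
        raise ValueError("expected exactly three replicate values")
    a, b = min(combinations(triplicate, 2), key=lambda p: abs(p[0] - p[1]))
    return (a + b) / 2

def cv_percent(values):
    """Coefficient of variation: sample SD as a percentage of the mean."""
    return 100 * stdev(values) / mean(values)

# Hypothetical triplicate readings for one saliva sample
replicates = [92.0, 100.0, 131.0]
estimate = closest_pair_mean(replicates)  # averages 92.0 and 100.0
```

Dropping the most discrepant replicate lowers the within-sample CV by construction, which is why the intra-assay CV improves when this rule is applied.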

Data Analyses

Baseline testosterone values were significantly skewed but became normal after logarithmic transformation, as assessed by inspection of Q-Q plots and the Shapiro-Wilk test (p = .28). CAGn and fWHR were both approximately normal without transformation. We first tested zero-order associations between these variables using Pearson correlation. We tested for an interaction between CAGn and baseline testosterone in predicting fWHR by first standardizing the two predictor variables and then entering their product along with the individual variables in a regression model with fWHR as the dependent variable. For completeness, we tested whether men with wider faces had larger testosterone responses to encounters with potential mates in two ways: by testing whether raw change from baseline after conversations with women correlated with fWHR, and using a regression approach that controlled for variability in change scores associated with baseline values by testing whether fWHR predicted post-conversation testosterone when pre-conversation testosterone was also included as a predictor variable.
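
The interaction test described above (standardize both predictors, then regress fWHR on the two predictors and their product) can be sketched as follows. The data are hypothetical values for eight participants, not the study data, and the outcome is synthesized from known coefficients purely so that the fit can be checked. The small least-squares solver is a stand-in for whatever statistics package was actually used, which the text does not specify.

```python
from math import log
from statistics import mean, pstdev

def zscore(values):
    """Standardize to mean 0 and SD 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def ols(predictors, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        pivot = a[c][c]
        a[c] = [v / pivot for v in a[c]]
        for r in range(k):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [a[i][k] for i in range(k)]

# Hypothetical CAG repeat counts and raw baseline testosterone values
cag = [18, 19, 20, 21, 22, 23, 24, 26]
baseline_t = [60.0, 99.0, 74.0, 134.0, 55.0, 110.0, 81.0, 121.0]

z_cag = zscore(cag)
z_t = zscore([log(t) for t in baseline_t])  # log-transform, then standardize
product = [c * t for c, t in zip(z_cag, z_t)]  # interaction term

# Synthetic outcome built from known coefficients so the fit can be checked
true_b = [2.05, -0.02, 0.03, 0.01]
fwhr = [true_b[0] + true_b[1] * c + true_b[2] * t + true_b[3] * ct
        for c, t, ct in zip(z_cag, z_t, product)]

b0, b_cag, b_t, b_interaction = ols([z_cag, z_t, product], fwhr)
```

Standardizing the predictors before forming the product centers them, which reduces the collinearity between the product term and its components and makes the lower-order coefficients interpretable at the mean of the other predictor. The outcome is left in raw units here, so these coefficients are not directly comparable to the standardized β values reported in the Results.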

Additional models controlled for effects of ethnicity and BMI. A one-way ANOVA demonstrated differences in fWHR across ethnic groups, F(3, 137) = 3.55, p = .02 (means and SDs for the respective groups were: white (2.02, .14), Asian (2.11, .16), and Latino (2.06, .13); the one African American participant had an fWHR of 1.96). We controlled for ethnicity by adding ethnic group as a categorical variable in regression models, and by re-computing effects in white participants only, as the largest ethnic subgroup. Adiposity has been previously related to fWHR (e.g. Coetzee et al. 2010), and thus BMI was added to regression models to control for this influence.

Results

Table 1 demonstrates that CAG repeat number was not correlated with fWHR in either the full sample of men or among the subgroup of white participants (see also Fig. 1). Effects were in the predicted direction, but were very weak. A multiple regression model in the full sample that added controls for ethnicity and BMI still produced a null result for CAGn predicting fWHR, β = −.08, p = .34 (also in the predicted direction, but very weak). Baseline testosterone was also uncorrelated with fWHR in the full sample, although a moderate positive correlation (i.e. in the predicted direction) was found among the white participants (see Table 1 and Fig. 2). This correlation among white participants dropped to marginal significance after controlling for BMI, β = .21, p = .07. Baseline testosterone did not differ by ethnicity (p = .70), and the direction of the relationship between testosterone and fWHR was negative (i.e. contrary to prediction) in the small samples of Asian (r = −.19, n = 29, p = .30) and Latino (r = −.10, n = 18, p = .70) men. When ethnicity and BMI were both entered as controls in a regression model, there was no effect of baseline testosterone as a predictor of fWHR in the full sample, β = .09, p = .29.

Table 1 Zero-order correlations between facial width to height ratio and CAG repeat length, testosterone variables, and BMI

Fig. 1 Relationship between fWHR and CAGn among the full sample

Fig. 2 Relationship between fWHR and baseline testosterone among the white participants

In a regression model that included controls for ethnicity and BMI, there was no significant interaction between CAG repeat number and baseline testosterone in predicting fWHR, β = .13, p = .16. Notice that the positive regression coefficient is opposite in sign to the conjecture of Bird et al. (2016) that men with more sensitive androgen receptors (as indexed by fewer CAG repeats) may have a stronger relationship between circulating testosterone and fWHR. The null effect for this interaction was also found among only white participants (β = .14, p = .25) and when control variables were removed from the model (β = .16, p = .09).

Table 1 also illustrates null results for the correlation between men’s fWHR and their change in testosterone from baseline after social interactions with young women. (At the request of a reviewer, we also tested the association between change in testosterone and fWHR among men who interacted with other men, but this relationship was also null, r = .04, n = 47, p = .78.) This null result persisted when using a regression approach to assess change in testosterone (see Data Analyses), whether computed from the full sample with or without controls for BMI and ethnicity, or in the white-only subsample (all ps > .60). Likewise, CAGn did not moderate the relationship between fWHR and testosterone reactivity, as addition of the CAGn by fWHR product to the above regression models produced only null effects for the interaction term (ps > .40).

Discussion

The present study provided no evidence that CAGn is related to men’s fWHR, either directly or via an interaction with baseline testosterone concentrations. Our sample size of 133 gave us 75% power to observe a true zero-order correlation of r = .23, which represented the magnitude of the relationship between CAGn and men’s physical strength in this same sample (Lukaszewski and Roney 2011). For the regression testing the CAGn by testosterone interaction, we had 75% power to observe a small-to-medium sized interaction (f² = .055; both power estimates computed using G*Power 3.1, Faul et al. 2009). These effect sizes may be optimistic, however, given the small impact that any single gene is likely to have on face morphology (e.g. Liu et al. 2012). Therefore, while the present study provides evidence that CAGn does not have a moderate-to-strong association with fWHR, larger samples are required to rule out smaller effect sizes. Note that the nonsignificant, negative correlation between CAGn and fWHR was in the predicted direction (see Table 1), but the nonsignificant interaction between CAGn and baseline testosterone in predicting fWHR was in the direction opposite to prediction. Given prior speculation regarding a role for CAGn in explaining variability in fWHR (Bird et al. 2016; Hodges-Simeon et al. 2016), we think it is important that these data appear in the literature and be available for possible future meta-analyses, even if the current study does not have sufficient power to provide definitive evidence on this question.
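
As a rough cross-check on the zero-order power figure, power for a correlation can be approximated with the Fisher z transformation. The Python sketch below is an approximation under that assumption, not the exact routine G*Power uses, and the function name is hypothetical. With r = .23, n = 133, and a two-tailed alpha of .05 it lands in the mid-0.70s, in line with the reported value.

```python
from math import atanh, sqrt
from statistics import NormalDist

def power_correlation(r, n, alpha=0.05):
    """Approximate two-tailed power to detect a true correlation r with n pairs."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    # Fisher z of r has approximate standard error 1 / sqrt(n - 3)
    shift = atanh(r) * sqrt(n - 3)
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)

power = power_correlation(0.23, 133)
```

The same function shows how quickly power falls for smaller effects at this sample size, which is the concern raised about single-gene effects on face morphology.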

Our findings also failed to replicate a previously reported association between magnitude of testosterone responses to potential mates and men’s fWHR (Lefevre et al. 2013). That relationship is of theoretical interest since it may be that men generally reduce baseline testosterone production outside of competitive or mating contexts in order to achieve energy savings, but that wider-faced, more intrasexually competitive men exhibit larger testosterone responses to contextual triggers. Combined with a null finding for a relationship between fWHR and testosterone responses to competitions (Bird et al. 2016), our failure to replicate an association with hormone responses to young women casts further doubt on an association between men’s fWHR and their testosterone reactivity.

We did find a weak-but-significant positive relationship between fWHR and baseline testosterone concentrations among our white sub-sample. This is similar to Lefevre et al.’s (2013) finding in a European sample, and Hodges-Simeon et al.’s (2016) finding (after controlling for age) in a Tsimane adolescent sample. With a much larger sample size based on combining data across multiple samples, however, Bird et al. (2016) failed to find this relationship even among their white subsample. Inconsistent and weak relationships between fWHR and circulating testosterone might emerge if both variables are partly calibrated by other variables in similar ways (see discussion below), but the overall evidence is mixed and suggests that fWHR is not consistently related to adult baseline testosterone.

The present results deepen the mystery surrounding the source of fWHR variation and the correlation between fWHR and behavior. Components of face morphology, including width, are at least moderately heritable (e.g. Baydaş et al. 2007), and variability in genes that directly encode for face growth may explain some of the individual differences in fWHR. However, it seems unlikely that genes that encode for facial structure would also pleiotropically affect neural structures in ways that cause correlated behavioral strategies, and thus this source of variability leaves correlations between fWHR and aggressive behavior largely unexplained. In what follows, we speculate on two possible pathways whereby behaviors may become linked to facial morphology: (1) developmental processes that have not yet been tested adjust both facial width and behavioral proclivities, and (2) behavioral patterns are partially responsive to social treatment from others that is triggered by physical appearance.

On the developmental process account, facial width may be calibrated away from a genetically encoded baseline based on factors experienced early in development, such as cues of risk, harshness, or conflict. For example, gestational stress in rats has been shown to affect the face shape of male offspring (Aminabadi et al. 2016). Likewise, evidence suggests that men show increased testosterone in response to conflict and hierarchical instability (Zilioli and Watson 2014). Thus, if men are calibrated to pursue more aggressive behavioral strategies in harsher or more unstable environments, and fWHR and adult testosterone levels are partly calibrated by the same cues, then positive associations between these variables should sometimes occur. Because Hodges-Simeon et al. (2016) showed that fWHR does not increase at adolescence concomitant with the pubertal testosterone surge, any such calibration would have to affect fWHR earlier in childhood, and could well be affected by signals other than androgens (e.g. glucocorticoids). In addition, prenatal or childhood testosterone concentrations could affect fWHR via an interaction with CAGn, which has not yet been tested given that our sample assessed circulating testosterone only in young adults.

The second conjecture is that men with wider faces are treated differently by others, and this in turn triggers more aggressive or selfish behaviors in such men. Haselhuhn et al. (2013) proposed and provided evidence for this process by showing that men with higher fWHR acted and were treated more selfishly in a set of economic games, but also that other subjects who received the same treatment as high fWHR men responded selfishly to that treatment. Such responses, however, cannot explain why people would mistrust wide-faced men in the first place if fWHR were not already a valid cue of aggressive or exploitative behavior. One possibility is that wide faces resemble components of facial expressions of anger; anger faces involve raising the chin, lowering the brow, and widening the nostrils and cheeks, producing a configuration that both resembles high-fWHR faces and enhances rater perceptions of physical strength (Sell et al. 2014). If wider-faced men tend to be over-perceived as angry due to this resemblance, they could receive deferential, mistrustful, and/or aggressive treatment that over time leads to the adoption of more exploitative and aggressive behavioral strategies, and perhaps also in some cases to higher levels of testosterone production. This is similar to the processes of overgeneralization of emotion cues and self-fulfilling prophecies that have been observed elsewhere (e.g. Jussim 1986; Zebrowitz and Montepare 2008). Since the testosterone responses are likely probabilistic and context-dependent, they may be only weakly associated with fWHR, perhaps consistent with the mixed findings in this literature.

Conclusion

fWHR appears to predict the adoption of an intrasexually-competitive behavioral strategy, but the mechanism whereby facial width becomes associated with that strategy remains unclear. The present results fail to support an explanatory link between fWHR and CAGn of the AR gene, although a small association between these variables cannot be ruled out given our sample size. Beyond our null findings for CAGn, we did find a small, positive correlation between fWHR and baseline testosterone concentrations among the white participants in our sample. Further research is necessary to test conjectures regarding how fWHR becomes linked to behavior and perhaps also to testosterone production.