When it comes to men’s perceptions of women as potential military leaders, people might assume that women who appear more masculine, including those with more masculine facial features, have an advantage over their more feminine-looking counterparts (Grabo & van Vugt, 2018; Luo et al., 2023; Sczesny et al., 2006). A typical masculine female face, characterized by a broader nose, thinner lips, and stronger jawline, more closely resembles the prototypical look of a soldier than a typical feminine female face, characterized by softer facial lines, a thinner nose, and fuller lips (Grabo & van Vugt, 2018; Olivola et al., 2014). More masculine faces are associated with traits such as strength, bravery, and dominance (Johnson et al., 2008; Walker & Wänke, 2017), all of which are valued in a military setting and associated with mission success. More feminine faces are associated with warmth and submissiveness, but not competence (Walker & Wänke, 2017), yielding lower ratings for leadership potential and perhaps even suggesting women need protection rather than provide protection. However, recent research has found that when rating sexually dimorphic faces for both men and women on leadership potential, male military cadets gave the lowest ratings to women with masculine faces, while both feminine female and feminine male faces received intermediate ratings and masculine male faces garnered the highest ratings (Korenman et al., 2019).

One possible explanation for this counterintuitive finding with respect to the female faces is that some male military members may feel threatened professionally by the masculine woman leader, especially those men who endorse hostile sexist beliefs, leading them to rate her leadership potential less favorably relative to a feminine-faced woman or either man. In their review of research on reactions to people who disconfirm stereotypes, known as vanguards, Rudman et al. (2012) found evidence of backlash against female leaders who were agentic even though they were perceived as competent. These women were rated lower in terms of likability and hirability than agentic men. Similarly, Bareket and Fiske (2023) concluded that the way in which a person disconfirms a stereotype also matters. They found that negative reactions towards women who defied stereotypes were associated with varying forms of punishment, or backlash, as may be seen in the power dynamics of hostile sexism. To the extent that women who exhibit traits associated with masculinity may face punitive reactions for defying expected gender norms, masculine-faced women might be penalized more severely than feminine-faced women in a military context, as their masculine appearance may serve as a visible cue of norm violation. Thus, the question guiding the current study is: who will garner more favorable military leadership potential ratings from men who are high in dispositional sexism: women with more masculine facial features or more feminine facial features?

Background Research

Facial Perception and Leadership Evaluation

Research shows that people use their perceptions of a person’s body and face to make inferences about personality traits and leadership ability (Antonakis & Eubanks, 2017; Nana et al., 2010; Re et al., 2013). This is consistent with social categorization based on observable characteristics such as race, gender, and age (e.g., Fiske & Neuberg, 1990). To the extent that such judgments favor some desired qualities, and therefore some people over others, actual skills and abilities may be overlooked. Although some individuals may resist allowing their first impressions to drive later judgments, this may be especially difficult for people who lack internal motivation to do so. For example, people high in dispositional prejudice may be unmotivated to make corrections when first encountering a member of a stereotyped group, failing to seek additional information about the person beyond initial appearance (Willis & Todorov, 2006; Zebrowitz, 2017). This is known to happen when participants are asked to rate familiar and unfamiliar faces on characteristics such as competence, trustworthiness, or aggressiveness (Willis & Todorov, 2006).

With respect to judging leadership potential directly from faces, men are typically rated higher than women (Chiao et al., 2008; Korenman et al., 2019). Whether in the context of job fit or leadership potential, both the sex and the gender characteristics of a face tend to influence perceptions (Grabo & van Vugt, 2018; Little et al., 2007a; Olivola & Todorov, 2010; Todorov et al., 2005). Research suggests that people in a more competitive setting (such as wartime) prefer a more masculine-looking leader, whereas people in a more cooperative setting (such as peacetime) prefer a more feminine-looking leader (Spisak et al., 2014). This pattern is also evident when people in wartime or peacetime are asked about their preferences for political leaders based on facial features (Ferguson et al., 2019): during a time of peace (a cooperative time), a leader with a more feminine face is preferred, whereas during a time of war (a more competitive time), a leader with a more masculine face is preferred.

When asked to make judgments regarding the competence, dominance, attractiveness, and approachability of candidates for political office, male candidates were rated higher in terms of competence and dominance compared to the female candidates who were rated higher in approachability and attractiveness (Ferguson et al., 2019; Little et al., 2012). However, when asked to assign ratings of competence to photographs of men and women (identified as such by the labels “Mr.” and “Mrs.”), participants attributed higher levels of competence to faces possessing more masculine characteristics, regardless of sex of the face. Subtle physical characteristics, such as the gender of a face, may influence people’s judgments without their awareness, leading to more gender-stereotypic judgments based on those gender differences. This is likely because characteristics indicating sex tend to be more obvious and processed at a higher level of awareness, allowing people to correct their own stereotyped biases before making judgments of ability or worth (Sczesny et al., 2006).

These biases become even more pronounced when taking into consideration the schemas people hold regarding the job in question. von Stockhausen et al. (2013) found that participants showed a distinct preference for masculine-looking faces when the job was stereotypically masculine and for feminine-looking faces when the job was stereotypically feminine, but found no interactions between the sex and gender of faces. Interestingly, results from eye tracking data point to differences in the processing of faces, with early fixations lasting longer for incongruent faces (feminine-looking men and masculine-looking women) than congruent faces (Valuch et al., 2015). This suggests that participants’ initial analysis of faces may have focused on the congruence between sex and gender of a face, but their final hiring decisions were ultimately based on a second stage analysis, where participants likely considered the candidate in terms of the appropriate gender role.

However, in a military context, when rating leadership competencies for officers, participants showed a preference for men’s faces over women’s faces, yet only male participants exhibited a preference for the feminine version of women’s faces over the masculine version of women’s faces (Korenman et al., 2019). The less favorable evaluation of women’s faces with masculine features may reflect gender role incongruity in the military domain. This is a direct contradiction of previous results which found that women who look quite feminine may be liked but not respected, and thus not be rated high on leadership potential (Boyce & Herd, 2003; Gloor et al., 2018; Silva, 2008).

Gender Stereotypes in Leadership Evaluation

If an individual displays one or more gender traits that are incongruent with what is expected of their sex, perceptions of their ability may be negatively impacted. Heilman and Chen (2005) showed that women in leadership positions who displayed characteristics more typically associated with men were perceived negatively in terms of leadership ability. These attitudes are especially prevalent in institutions and professions that are male dominated, such as the military. However, long-held and strong beliefs about whether women belong in the military, especially in combat positions, may contribute more to the perception of whether a woman has leader potential than her apparent competence.

Matthews et al. (2009) found negative attitudes toward women in the military from multiple samples, including civilian students, as well as both ROTC and service academy cadets. This was particularly the case for male cadets at the United States Military Academy, who showed lower approval for female leaders, specifically in combat leadership roles, when compared to both their civilian and ROTC cohorts. Similarly, Looney et al. (2004) found that midshipmen at the United States Naval Academy were less accepting of women who held stereotypically male-dominated leadership roles, such as those found in combat, than of similarly aged men in the same roles. Even at the nation’s military academies, recognized for their diversity and leadership development, cadets prefer women to remain in support, logistics, and medical services, which reflect roles historically reserved for women (Field & Nagl, 2001). The fact that until recently women were excluded from combat roles suggests that military service members may be conditioned to believe that these positions are better suited exclusively for men, a view that is propagated in society. A potential explanation for these beliefs could be sexist attitudes towards women in the military and in society at large.

Sexism and Leadership

Contemporary understanding of sexism is that there exist two components that have far reaching effects (Glick & Fiske, 1996; Glick et al., 2015). Hostile sexism refers to negative attitudes, beliefs, and behaviors toward women that are overtly antagonistic and derogatory. It involves expressions of disdain, contempt, and prejudice toward women, often manifesting as aggression, discrimination, or devaluation based solely on gender. Hostile sexism is characterized by a belief in the superiority of men and the inferiority of women, as well as the enforcement of traditional gender roles and norms that limit women's autonomy and opportunities. Conversely, benevolent sexism subjugates women through less hostile means, specifically through emphasizing women’s supposed need to be protected and cared for by men, as well as women’s role in the home and as a fulfiller of men’s sexual desires within relationships that are based upon mutual interdependence between men and women. Although benevolent sexism may not appear to be as harmful or demeaning, it is problematic when the excessive chivalry displayed by benevolent sexists reduces women’s agency and causes them to doubt themselves and their abilities and relegates them to subservient roles (Bareket & Fiske, 2023; Glick & Fiske, 1996; Rudman et al., 2012; Zaikman & Marks, 2014). Hostile sexists, on the other hand, openly disparage women and are easily identified by their negative opinions and treatment of women.

Both hostile sexism and benevolent sexism might affect how men rate military leadership potential for women in the military. Given that hostile sexism refers to negative and antagonistic attitudes toward women, particularly those who challenge traditional gender roles (Glick et al., 2015; Little et al., 2007b; Little et al., 2011; Rule & Ambady, 2009; Silva, 2008), men who endorse hostile sexist beliefs may feel threatened by the idea of a woman who challenges traditional gender norms in the military, perceiving her as a professional threat. In contrast, the paternalistic attitudes of benevolent sexism may be directed towards the woman with more feminine facial features. Men who endorse benevolent sexism might view her as needing protection rather than being a leader. Ultimately, both hostile and benevolent sexism can hinder women's progress in the military by influencing men's perceptions and ratings of their leadership potential, regardless of their facial features or abilities. Addressing these biases is crucial to ensuring that military leadership positions are based on merit rather than gender stereotypes.

While sexism itself may help explain ratings of masculine and feminine male and female faces in a military context because it helps justify and maintain the status quo, there are reasons to expect that hostile sexism may provide greater explanatory potential than benevolent sexism due to perceptions of women pushing themselves where they are not wanted or needed. The focus on hostile sexism is also consistent with prior research by Masser and Abrams (2004), who found that male participants who scored high in hostile sexism formed more negative evaluations of a female candidate for a managerial role, and were less likely to recommend her, than a male candidate. They did not find benevolent sexism to be related to either evaluations or recommendations. Additionally, recent research by Schaefer et al. (2021) in a military context found that hostile sexism negatively affected peer evaluations of military readiness, physical ability, and social ability, factors crucial in leadership development in military contexts. These studies bolster the rationale for examining hostile sexism's moderating effect on leadership ratings for gendered facial features.

The Current Study

Using an established paradigm and previously validated stimulus materials, we asked male cadets at a military service academy to rate leadership potential for male and female faces that had been masculinized or feminized, and we also measured participants’ levels of dispositional hostile and benevolent sexism. The resulting design of the study was a 2 (sex of face: male, female) × 2 (gender of face: masculine, feminine) × 2 (type of sexism: hostile, benevolent) × 2 (level of sexism: low, high) mixed model design, with the first three variables being within subjects and the level of sexism variable being between subjects. Thus, the current study allowed us to replicate findings from Korenman et al. (2019) and, critical to this study, assess the potential moderating role of hostile and benevolent sexism in male cadets’ ratings of leadership potential for dimorphic male and female faces.

H1: There will be a main effect for sex of face such that ratings for leadership potential for male faces will be significantly higher than ratings for female faces, replicating results from Korenman et al. (2019).

H2: There will be a significant interaction between sex of face and gender of face such that masculine male faces will receive the highest ratings and masculine female faces will receive the lowest ratings for leadership potential, with feminine male and female faces falling in the middle, replicating findings from Korenman et al. (2019).

H3: There will be a significant interaction among sex of face, gender of face, and level of hostile sexism, such that participants with higher levels of hostile sexism will have more polarized ratings of leadership potential for masculine male faces and masculine female faces than participants with lower levels of hostile sexism or any level of benevolent sexism.

In all, we expected that dispositional hostile sexism would moderate the previously observed interaction wherein male cadets rated male faces with masculine features as the highest in leadership potential and female faces with masculine features as the lowest, with feminine female faces and feminine male faces falling in between the other two groups. Evidence of a moderating role of hostile sexism would indicate that antipathy toward women who are perceived to be encroaching on men’s territory could help explain the less positive evaluations. Women with masculine faces in the military may represent agency and competence, signaling a potential threat to the status quo and triggering backlash in the form of lower leadership potential ratings from men high in hostile sexism.

Method

Participants

Participants included 224 male cadets at a military service academy, ranging from 18–26 years of age, with an average age of 19.88 years (SD = 1.35). The decision to include only male cadets was based on prior research indicating that ratings of leadership potential for sexually dimorphic male and female faces differed among male but not female military cadet participants (Korenman et al., 2019). The racial composition of the sample reflected the overall composition of cadets enrolled at the academy: African American (n = 20, 8.9%), Caucasian (n = 157, 70.1%), Asian (n = 24, 10.7%), Hispanic (n = 13, 5.8%), and other (n = 10, 4.5%). All cadets were enrolled in one of two psychology courses and received extra credit in exchange for their participation.

Materials

Stimulus materials included eight pairs of composite male and female faces, each in a masculinized and feminized form (see Fig. 1 for examples). These faces were used in previous research investigating perceptions of masculinized and feminized faces (see DeBruine et al., 2010; Little et al., 2007b; Penton‑Voak et al., 2006). The original faces were rated as average in attractiveness and symmetry. Each face was transformed into a masculinized and a feminized version according to the Perrett et al. (1998) sexual dimorphism dimension protocol. Faces that were masculinized had broader noses, thinner lips, and squarer jawlines, whereas the feminized faces had softer facial lines, narrower noses, and fuller lips. Each face was presented against a black background. The stimuli included only facial features and did not include aspects such as ears, hairstyles, or shape of the neck. Validity for this set of faces has been established through previous studies in which participants were asked to rate the faces according to how masculine or feminine they perceived each face to be (see Korenman et al., 2019; Little et al., 2007b; Little et al., 2012). Because each face was distinct, participants rated all 16 faces; faces were presented in random order with the caveat that the masculinized and feminized versions of the same face never appeared in immediate succession. The reason for using the same faces as the original study, which included only White faces, was twofold: 1) to replicate the findings of Korenman et al. (2019) and 2) to provide facial stimuli that approximate those seen most often in this environment.
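The constrained random ordering described above can be illustrated with a short sketch. This is not the procedure used in the survey software; it is a minimal example that assumes a simple reshuffle-until-valid approach, and the function name `constrained_order` and the integer face identifiers are hypothetical.

```python
import random

def constrained_order(face_ids, versions=("masculinized", "feminized"),
                      rng=None, max_tries=10_000):
    """Return a random order of (face, version) trials in which the two
    versions of the same face never appear back to back.

    Hypothetical helper: reshuffles until the constraint holds.
    """
    rng = rng or random.Random()
    trials = [(face, version) for face in face_ids for version in versions]
    for _ in range(max_tries):
        rng.shuffle(trials)
        # Check every adjacent pair of trials for a repeated base face.
        if all(a[0] != b[0] for a, b in zip(trials, trials[1:])):
            return list(trials)
    raise RuntimeError("no valid order found within max_tries")

# Eight base faces, two versions each, giving 16 trials per participant.
order = constrained_order(range(8), rng=random.Random(0))
```

Reshuffling is adequate here because a large share of random orders of 16 trials already satisfy the constraint, so a valid order is typically found within a few attempts.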

Fig. 1. Examples of Masculinized and Feminized Versions of Male and Female Faces

Participants rated leadership ability for each face based on 14 characteristics and skills identified in the Army Leader Development Manual as necessary for being a successful and competent leader (Department of the Army, 2012). These characteristics and skills reflect core leadership competencies for the Army and would be highly familiar to cadet participants in the study. The statements are that the person a) builds trust, b) fosters teamwork, c) manages resources, d) maintains and enforces professional standards, e) balances requirements of the mission with the welfare of others, f) displays character, g) leads with confidence in adverse conditions, h) demonstrates technical and tactical knowledge and skill, i) fosters teamwork, j) encourages fairness, k) maintains mental and physical health and well-being, l) facilitates ongoing development, m) effectively manages resources, and n) recognizes and rewards good performance.

With the exception of two characteristics/skills that may have been perceived as more stereotypically masculine (i.e., leads with confidence in adverse conditions and demonstrates technical and tactical knowledge and skill), the statements were relatively gender neutral. Ratings were made on a scale of 1 to 7, with 1 indicating that participants strongly disagreed that the face they were rating represented the specific quality and 7 indicating that they strongly agreed that the face they were rating represented the specific quality. Ratings on the 14 dimensions were averaged, yielding a single score for leadership potential. Coefficient alpha for the composite score was very high at .98. The decision to consolidate into a single score follows prior work by Korenman et al. (2019) and Korenman et al. (2023).
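For readers less familiar with these two computations, the sketch below shows how item ratings can be averaged into a composite and how coefficient alpha is obtained from the standard formula. The function names and the small ratings matrix are made up for illustration and are not the study data.

```python
import numpy as np

def composite_score(ratings):
    """Average the item ratings in each row into a single composite score."""
    return np.asarray(ratings, dtype=float).mean(axis=1)

def cronbach_alpha(ratings):
    """Coefficient alpha for a (rows = respondents, columns = items) matrix."""
    ratings = np.asarray(ratings, dtype=float)
    k = ratings.shape[1]                             # number of items
    item_vars = ratings.var(axis=0, ddof=1)          # variance of each item
    total_var = ratings.sum(axis=1).var(ddof=1)      # variance of the summed score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Illustrative data only: 4 respondents rating 3 items on a 1-7 scale.
demo = np.array([[5, 6, 5],
                 [3, 3, 4],
                 [6, 7, 6],
                 [2, 3, 2]])
scores = composite_score(demo)
alpha = cronbach_alpha(demo)
```

With 14 items instead of 3, the same functions apply unchanged; only the number of columns differs.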

Dispositional levels of hostile sexism and benevolent sexism were assessed using the Ambivalent Sexism Inventory (ASI; Glick & Whitehead, 2010; Rollero et al., 2014), a psychometrically sound measure that asks respondents to indicate the extent to which they agree with a variety of statements on a scale of 1 to 6 concerning the relation between men and women, yielding separate scores for both the hostile sexism and benevolent sexism subscales (Glick & Fiske, 1996). Higher scores reflected higher levels of sexism. An example statement that taps into hostile sexism is “Women seek to gain power by getting control over men.” Participants who strongly agree with this item tend to score higher on hostile sexism than participants who strongly disagree with the statement. An example of a statement that taps into benevolent sexism is “Women should be cherished and protected by men.” Participants who strongly agree with this item tend to score higher on benevolent sexism than participants who strongly disagree with the statement. Coefficients alpha for the hostile and benevolent sexism subscales were .81 and .88 respectively. The two subscales were significantly correlated (r = .50, p < .001).

Finally, participants answered several demographic questions, which allowed us to confirm that the composition of the sample was representative of male cadets at USMA. All participants accessed the survey and submitted their responses through a Qualtrics link on their personal laptops.

Procedure

Participants completed the procedure individually on their own laptop computers. Once participants entered the Qualtrics site and consented to participate, they rated each of the 16 faces in a random order. Participants examined each face individually and rated the extent to which that face represented each of the 14 leadership statements before moving on to the next face, and they continued this process for all 16 faces. Once ratings for the faces were complete, participants responded to the ASI statements and demographic questions. Participants had unlimited time to respond. Upon completion of the study, participants were debriefed and dismissed. The researchers remained unaware of participants’ ratings of faces and sexism scores throughout the data collection period. This research was approved by the Institutional Review Board at the United States Military Academy with project control number: 17–087 Korenman-Rodeo.

Results

A 2 (sex of face: male, female) × 2 (gender of face: masculine, feminine) × 2 (type of sexism: hostile, benevolent) × 2 (level of sexism: low, high) mixed model analysis of variance (ANOVA) was conducted to determine how levels of hostile and benevolent sexism affected ratings of leadership for sexually dimorphic faces, where sex of face, gender of face, and type of sexism (hostile or benevolent) served as within-subjects variables and level of sexism (low or high) served as the between-subjects variable. This analysis allowed us to test for all main effects, two-way interactions, and the focal three-way interaction. We present our results in this order. Prior to this analysis, scores on the ASI were categorized as high and low for both benevolent and hostile sexism. Although the ASI subscales are often analyzed as continuous variables, our preference was to follow the approach of Acker (2009) and Hogg et al. (2006) of using a median split, which is also supported in the psychological literature (Iacobucci et al., 2015a, b; McClelland et al., 2015; Rucker et al., 2015). We also retained the original scores for use as continuous variables in repeated measures multiple regression analyses; results from both analytic approaches were similar. For ease of interpretation, we report results using the median split approach. To divide participants into groups low and high in each type of sexism, we performed two separate median splits, one each for benevolent and hostile sexism. Participants who scored at or below the median of 3.50 on benevolent sexism (n = 116) were designated as low in benevolent sexism (M = 2.91, SD = 0.49) and those who scored above the median (n = 108) were designated as high in benevolent sexism (M = 4.03, SD = 0.36). Participants who scored at or below the median of 3.27 on hostile sexism (n = 114) were designated as low in hostile sexism (M = 2.70, SD = 0.49) and those who scored above the median (n = 110) were designated as high in hostile sexism (M = 3.77, SD = 0.42).
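The grouping rule described above (scores at or below the median are "low", scores above it are "high") can be sketched as follows. The function name `median_split` and the example scores are hypothetical and serve only to show how ties at the median are assigned.

```python
import numpy as np

def median_split(scores):
    """Label each score 'low' (at or below the sample median) or 'high' (above it)."""
    scores = np.asarray(scores, dtype=float)
    median = float(np.median(scores))
    labels = np.where(scores <= median, "low", "high")
    return median, labels

# Illustrative subscale means only, not the study data.
demo_scores = [2.0, 3.5, 3.5, 4.0, 4.5]
median, labels = median_split(demo_scores)
```

Because ties at the median go to the low group, the two groups need not be equal in size, which is consistent with the slightly unequal group sizes reported for both subscales.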

Results from the 2 × 2 × 2 × 2 mixed model ANOVA showed no main effect of sexism on ratings of faces, indicating that high versus low levels of hostile and benevolent sexism did not affect overall ratings of faces, Fs(1, 220) = 3.47 and 0.78, respectively, both ps > .050. Although we did not find a main effect for the gender of face, F(1, 220) = 0.332, p = .565, ηp² = .002, we did find that participants differed in their ratings of faces based on the sex of the face, F(1, 220) = 4.30, p = .039, ηp² = .01, with male faces (M = 4.80, SD = 0.66) receiving higher leadership potential ratings than female faces (M = 4.72, SD = 0.70). In addition, results indicated a significant two-way interaction between sex of face and gender of face, F(1, 220) = 22.01, p < .010, ηp² = .09; however, this result is more informative when considered in the context of the different types of sexism, benevolent or hostile.
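As an aid to interpreting the effect sizes, partial eta squared can be recovered from an F statistic and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to the sex of face × gender of face interaction; the function name is hypothetical, and this is a reader's check under that assumption rather than the authors' analysis code.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Sex of face x gender of face interaction: F(1, 220) = 22.01
interaction_effect = partial_eta_squared(22.01, 1, 220)
print(round(interaction_effect, 2))  # 0.09
```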

The three-way interaction among sex of face, gender of face, and level of sexism was not significant for benevolent sexism, F(1, 220) = 0.101, p = .749, ηp² < .01, but was significant for hostile sexism, F(1, 220) = 4.11, p = .044, ηp² = .02, indicating that hostile sexist beliefs affected ratings of sexually dimorphic faces, but benevolent sexist beliefs did not. To better understand this interplay between the sex and gender of the faces and level of hostile sexism, we conducted a simple effects analysis looking at how differences in the sex and gender of face affected ratings for each of the levels of hostile sexism separately.

Results from participants rated low in hostile sexism yielded no main effect for sex of face, F(1, 113) = 0.21, p = .645, ηp² < .01, or gender of face, F(1, 113) = 0.97, p = .328, ηp² < .01, but did indicate a significant interaction between sex of face and gender of face, F(1, 113) = 4.91, p = .029, ηp² = .04. A simple effects analysis showed that when observing a male face, those low in hostile sexism did not significantly differ in their ratings of the masculine face (M = 4.83, SD = 0.64) compared to the feminine face (M = 4.81, SD = 0.65). Interestingly, when observing a female face, participants low in hostile sexism favored the feminine face (M = 4.89, SD = 0.64) over the masculine face (M = 4.80, SD = 0.73).

When looking at the results for those participants rated high in hostile sexism, we see a similar, albeit more polarized, pattern of responses. Participants higher in hostile sexism favored male faces (M = 4.78, SD = 0.70) significantly more than female faces (M = 4.59, SD = 0.72), F(1, 109) = 9.74, p < .002, ηp² = .08, but did not differ in their ratings of masculine (M = 4.71, SD = 0.66) versus feminine faces (M = 4.67, SD = 0.66), F(1, 109) = 1.80, p = .183, ηp² < .01. A significant interaction between sex of face and gender of face points to the potentially polarized views of those participants high in hostile sexism, F(1, 109) = 20.85, p < .001, ηp² = .16. To further elucidate these results, we conducted another simple effects analysis to examine the effect of gender on the male and female faces separately. Ratings of leadership potential by participants high in hostile sexism differed by gender of face for both male faces, F(1, 109) = 18.86, p < .001, ηp² = .15, and female faces, F(1, 109) = 5.74, p < .050, ηp² = .05; however, the direction of the difference depended on the sex of the face: the masculine male faces were rated the highest of all faces (M = 4.87, SD = 0.66), and the masculine female faces were rated the lowest (M = 4.54, SD = 0.74). Feminine male faces (M = 4.69, SD = 0.70) and feminine female faces (M = 4.65, SD = 0.78) were about equal. Finally, the four-way interaction among sex of face, gender of face, level of benevolent sexism, and level of hostile sexism was not significant, F(1, 220) = 0.46, p = .500, ηp² < .01. All means, standard deviations, and post hoc pairwise comparisons can be found in Table 1.

Table 1. Mean Leadership Ratings of Faces With Standard Deviations

Discussion

The purpose of this study was to investigate the potential role of dispositional sexism in the observed tendency for male military cadets to rate the leadership potential of women with masculine faces as lower than they rate women with feminine faces or men with either face. Consistent with Hypothesis 1 and prior research conducted in a similar context (Korenman et al., 2019), a significant main effect of sex of face revealed that ratings of leadership potential for male faces were significantly higher than female faces, regardless of whether they were masculine or feminine and regardless of amount or type of sexism. Also replicating findings by Korenman et al. (2019) and consistent with the predicted interaction between sex of face and gender of face in Hypothesis 2, participants gave masculine women the lowest ratings and masculine men the highest ratings for leadership potential, with the feminine versions of both male and female faces falling in the intermediate range. Most important, and in support of Hypothesis 3, the previously described two-way interaction between sex of face and gender of face was qualified by a higher order interaction involving hostile (but not benevolent) sexism. In short, men who were high in hostile sexism responded with the most polarized ratings of faces, favoring the masculine male face, and providing the least favorable ratings for the masculine female face; ratings by participants with lower hostile sexism scores did not follow this pattern.

The finding that participants rated male faces higher than female faces regardless of sexual dimorphism or sexism is congruent with research showing that men rather than women tend to be perceived as well suited for leadership positions, especially in masculine domains (Boldry et al., 2001; Heilman et al., 2004; Yukl, 2012). Research on the stability of implicit leadership theories over the past two decades found that the heuristic of “think leader, think male” persists (Offermann & Coats, 2018). The varied responses to different types of female faces, however, are inconsistent with results from studies in other contexts showing higher leadership ratings for faces with masculine features regardless of sex of face (Ferguson et al., 2019; Walker & Wänke, 2017; Watkins & Jones, 2016), but consistent with prior research in a military context (Korenman et al., 2019), and similar to research on selection of political candidates (Carpinella et al., 2016). It appears that when male cadets do see leadership potential for women in the military it is reserved for the women with feminine faces.

Alternatively, Cuddy et al. (2008) suggest that within the Stereotype Content Model (SCM), the warmth dimension has primacy in face ratings, with considerations of competence playing a secondary role (Imhoff et al., 2013). To the extent that men high in hostile sexism may have little motivation to make corrections to their initial impressions of the women they encounter, the primacy of the warmth dimension may be what is reflected in the observed ratings in the current study. This could also explain why men low in sexism did not show a similar pattern, as they would have had the motivation to consider competence in the secondary step. However, it is less clear how the primacy of warmth could account for the lack of differences observed with respect to benevolent sexism, unless men high in hostile sexism perceive the warmth of the two female faces differently than do men high in benevolent sexism or men who are low in either type of sexism.

Also consistent with this explanation is the possibility that the masculine and feminine female stimulus faces activated different stereotypes altogether. Research has shown that various subtypes of women fall into different quadrants within the SCM (Cuddy et al., 2008). For example, housewives are rated as high in warmth but low in competence and career women are rated as low in warmth but high in competence. Given that participants were asked to rate leadership potential for both feminine and masculine female faces with respect to leadership in the Army, both types of faces could be presumed to represent a single subtype: career women (in the Army). However, Army women may be further subtyped, and both military men and women may be frequently exposed to subtype references. For example, an article published on the official U.S. Army website (https://www.army.mil/) profiled several highly successful West Point women under the subheading of “Badass Ladies” (O’Connor, 2020), and an article on Today.com profiled inspiring military women under the title of “9 badass women in the military who have made history — and why you should know them” (Hanson, 2023). These labels both implicitly and explicitly communicate that the type of women respected and revered in the military are the more stereotypically masculine “badass” ones, and that they presumably are the type expected to rise to higher levels of leadership. Review of the content of the articles and descriptions of the women profiled in them shows these women to be portrayed in stereotypically masculine terms as being high in determination, leadership, resilience, and bravery, as well as pioneering spirit.

Other subtypes of women in the military, however, may be viewed much less favorably. For example, military women may also be subtyped as lesbians or feminists (or both). Conceivably, the masculine female face may have activated one or more subtypes evaluated in a particularly negative way by male cadets. Although research suggests that within the general population and cross-culturally people tend to perceive feminists as competent but cold, falling into the same SCM quadrant as career women, Cuddy et al. (2008) also show perceptions of feminists in one study to fall into the quadrant of low warmth and low competence. The perception of an apparent feminist in the male-dominated Army may evoke uniquely high levels of contempt from military men high in hostile sexism. These men may perceive feminists to be the driving force behind integrating (unwanted) women into traditionally masculine domains, including combat arms.

Consistent with this interpretation, Glick et al. (2015) found that higher scores in hostile sexism were significantly correlated with negative evaluations of subtypes of masculine women and feminist women, but not feminine women. Moreover, research by Gundersen and Kunst (2018) found that feminist women were visually masculinized by perceivers, while feminist men were feminized. To the extent that lesbian women may also be masculinized via the “angry butch” lesbian stereotype (Geiger et al., 2006) and simultaneously assumed to hold feminist viewpoints, women who appear to fit into both categories may encounter backlash on multiple fronts; Wilkinson (2008) discusses how when a woman identifies as a feminist she is also implicated as being lesbian. In contrast, a feminine-faced woman may be perceived as detracting from the mission due to being attractive (a distraction) or physically weaker (perceived lower ability level).

This study afforded the chance to investigate a possible interaction between qualities of both the target (i.e., features of the face to be judged) and the perceiver (i.e., qualities of the participant, such as sexist beliefs), which Hehman et al. (2019) suggest may be less well understood in research on facial social perception than either target or perceiver qualities alone. Indeed, high levels of hostile sexism among some perceivers appear to be driving the effect of male cadets rating the leadership potential of women with masculine faces as the least favorable of the four groups. Even though other unmeasured participant characteristics may have played a role in the ratings, including information about hostile sexism yields a more complete understanding of how gendered facial information may be used by male military cadets to determine who would make a good military leader.

Limitations and Future Research Directions

Although the current study suggests that higher levels of hostile sexism may contribute to less positive perceptions of leadership potential for women with masculine faces, there are several limitations that should be considered. First, all stimulus faces depicted White men and women. Although the demographic report of the total U.S. Army indicates that 54% of military members are White, 46% are from other groups (U.S. Department of the Army, 2022). Perceptions of leadership potential of masculine and feminine male and female faces may differ based on the apparent race of the individual in the photo. For example, in a study on Black Chief Executive Officers (CEOs), Livingston and Pearce (2009) found that the baby-facedness of Black CEOs conveyed warmth and may have helped disarm perceptions of threat as compared with mature-faced Black CEOs. To the extent that being mature-faced may be seen as more masculine and being baby-faced may be seen as more feminine, the potential warmth elicited by the feminine female faces in our study may have served to counteract any potential threat that was elicited by the target being a woman in the Army. Future research should examine leadership potential ratings for faces that represent other races.

We also did not vary the age of people in the photos or include any participants who were older than the typical age of cadets at the academy. All participants were aged 18–26 and were therefore in young adulthood, with little military experience and few encounters with women leaders. Research addressing age-related changes in hostile and benevolent sexism in New Zealand found that levels of hostile sexism decreased from initially higher levels in young adulthood to lower levels for men in middle adulthood before rising again in late adulthood, whereas benevolent sexism increased in a linear manner over time (Hammond et al., 2018). Also, participants may infer different levels of leadership potential for men and women with older-looking faces. After all, anyone serving in the Army long enough to reach middle age may be assumed to already possess the requisite leadership traits and attributes to be a successful military leader. Investigating how the apparent age of the stimulus faces and age of the participants contribute to perceptions of leadership potential with respect to the military seems necessary for a more complete understanding of these interrelationships.

Similarly, we asked about leadership using traits and characteristics the Army wants from its leaders in general, but the Army has highly specialized branches, some of which are perceived to be more masculine, such as infantry and armor, and others that are potentially viewed as more feminine, such as medical services or finance. Now that women are fully integrated into combat roles, the specific branch that participants are thinking about when rating either male or female faces for leadership potential may matter. Perhaps the prototypical Army leadership role is in combat arms, which may contribute to a reluctance to acknowledge more leadership potential for the women they rated. Without exit interviews to ask what Army branch participants may have been thinking about as they rated the faces, we do not know. Exploring these topics should contribute to a better understanding of the factors that affect perceptions of leadership potential for women in a military context.

While we have identified a pattern in which masculine female faces are rated lower in terms of leadership potential, especially by men high in hostile sexism, the underlying reasons for this perception warrant further exploration. For example, individuals who are high in hostile sexism may experience threat when perceiving masculine women or discomfort due to the clash between the expected warmth associated with women and the competence typically associated with men. As previously stated, backlash theory (Rudman et al., 2012) suggests that women who deviate from gender norms may face penalties for doing so, which also may be particularly pronounced in fields traditionally dominated by masculine values. Exploring whether similar biases against gender-atypical appearances manifest in other male-dominated (e.g., engineering) or female-dominated (e.g., nursing) fields could illuminate whether these perceptions are unique to the hierarchical and power-laden environment of the military or whether they reflect broader societal prejudices against individuals who deviate from gender norms.

Practice Implications

Results from this study have implications relevant to the military and other organizations that may be male-dominated or have a masculine culture, such as law enforcement or first responders, as well as fields relying heavily on science, technology, engineering, or math. Given that men comprise 82% of current total Army forces (U.S. Department of the Army, 2022), advancement for women at any rank is highly likely to be dependent on their male colleagues’ impressions of them. Based on numerical representation alone, men are much more likely to be a female service member’s immediate supervisor than are other women, and men are more likely to be the decision-makers on promotion and leadership selection boards. Biased impressions formed by men high in hostile sexism may have serious consequences for women’s careers.

Even if the decision-makers for promotion and leader selection boards are men who are low in sexism, it does not preclude the possibility that raters at the immediate supervisory level may give lower performance ratings to a masculine-faced woman if the rater is a man who is high in hostile sexism. If that is the case, then even open-minded reviewers on promotion and selection boards may pass over the woman’s application because they are reviewing (biased) ratings that suggest she may be less qualified than others. Numerous studies have shown that women in nontraditional roles and/or masculine cultures receive backlash and may be penalized in performance ratings (Boldry et al., 2001; Heilman et al., 2004; Looney et al., 2004; Smith et al., 2019), and within a military sample of basic training instructors, men with higher hostile sexism and authoritarianism engaged in more maltreatment of female trainees and provided less effective mentoring to them (Barron & Ogle, 2014). Thus, men in the military who are high in hostile sexism also may be less likely to mentor women to become future leaders, leaving them dependent on low sexist men or other women as mentors. However, women remain underrepresented at the higher officer and enlisted ranks (Department of the Army, 2012), which furthers their dependence on men for continued career advancement.

Fortunately, the U.S. Army recently changed a long-standing policy that required inclusion of a soldier’s official Department of the Army photograph with other promotion materials. In a memo signed June 26, 2020, the former Secretary of the Army, Ryan D. McCarthy, mandated that official photos be eliminated from promotion materials submitted for officers, warrant officers, and enlisted soldiers and that any data contained within evaluation records identifying race, ethnicity, or gender also be redacted. Elimination of the photograph marks significant progress towards minimizing the potential subtle influence of the information conveyed in the picture, such as facial masculinity or femininity. However, redaction of the other information may ironically backfire, as adopting a colorblind (and genderblind) approach to remedy inequality can sometimes further inequality (Plaut et al., 2018). For example, the actual performance ratings of enlisted soldiers and officers are given by supervisors who have full knowledge of the racial, gender, and age categories of the ratee, as well as the person’s physical appearance. As Correll et al. (2020) caution, biases within written evaluations can affect future decision-making, especially if those evaluations are assumed to be unbiased by subsequent evaluators.

Similarly, people who are treated as though they may be leaders may come to believe more in their own leader potential and be more willing to act in a leaderlike manner, perpetuating the cycle. Numerous studies support this dynamic process in the leadership domain with respect to perceived displays of dominance (e.g., Haselhuhn et al., 2013; McArthur & Baron, 1983; Re & Rule, 2017). Future research conducted on the facial characteristics of the women in the military who have ascended to leadership positions might yield information about whether female leaders’ faces have masculine or feminine characteristics, but it would not sort out the relative contributions of their own behavior versus other people’s expectations and treatment of them. There also may be differences in expectations for women serving in leadership roles within the enlisted versus officer ranks. Given what is known about the powerful nature of self-fulfilling prophecies, treating all women as though they have the potential to develop as leaders and to serve in leadership roles may yield tangible benefits for both the women themselves and the organizations they serve. The discovery that women with more dominant facial features were viewed as less likely to be leaders, evident in judgments based solely on facial characteristics, suggests that these subtle appearance cues may evoke meaningful penalties, especially from men who endorse hostile sexist attitudes.

Taken altogether, these findings may offer valuable insights for professionals, policymakers, educators, and other parties interested in promoting gender equity and combating sexism within male-dominated or masculine-culture organizations. For example, organizations with a gender imbalance in leadership positions might use these findings to inform leader development and mentoring programs. Emphasizing the importance of recognizing and addressing biases related to gender and physical appearance may help organizations develop more inclusive leadership pipelines. Organizations can critically evaluate their leadership selection processes with a goal of reducing reliance on subjective judgments, such as facial appearance, and placing greater emphasis on objective performance metrics to mitigate the influence of biases. Implementation of gender-blind assessments in selection processes may increase the chances that individuals are evaluated based on their qualifications and achievements. By acknowledging and addressing biases related to appearance and gender, including bias evoked from facial features alone, organizations and individuals can contribute to more equitable workplaces and leadership structures.

Conclusion

The current study provides evidence of the moderating role of dispositional sexism, specifically hostile sexism, in men’s ratings of the leadership potential of more masculine and more feminine male and female faces in a military context. Although men low in hostile sexism rated the different types of faces all quite similarly, men high in hostile sexism had polarized ratings, with lower ratings of masculine female faces. This pattern was obtained in the absence of any other diagnostic information regarding ability or performance. Results from this study underscore the importance of considering how characteristics of both the target and the perceiver may interact when studying impressions of leadership potential that are formed when participants are asked to make judgments based solely on faces. Future research may determine whether facial masculinity or femininity, when paired with other information, such as diagnostic information about prior performance, still dictates ratings of leadership potential.