More than 40 years ago the conventional understanding of gender and gender roles in psychology took a momentous turn. Before 1973, a simple reasoning model informed psychological theorizing such that femininity and masculinity were viewed as opposite poles on the same dimension in order to conceptualize and make sense of most psychological differences between women and men in the United States (Constantinople 1973; Ellis 1905; Hathaway and McKinley 1940; Strong 1935, 1936; Terman and Miles 1936) and Europe, specifically Germany (Hirschfeld 1919/2000). Constantinople (1973) was among the first to challenge this simple reasoning model by proposing the radical perspective that femininity and masculinity may be independent constructs, with the ability for a single individual to vary independently on both. In this intellectual milieu, Bem (1974) produced a progressive conceptual apparatus (for that time) by arguing that women and men should engage in a psychological state that she termed psychological androgyny (i.e., strongly endorsing both stereotypically feminine and masculine attributes within the same individual). This conceptual advance stood in contrast to the traditional assumption of that time: that it is the feminine woman and the masculine man who embody optimal mental health (Bem 1975). Accordingly, although well-known today and indispensable to much of the clinical, developmental, personality, and social psychological theorizing on gender in the United States for the past four decades, Bem’s early work contained radical and progressive ideas for that time.

For those trained in psychological science, it can be said that the field as a whole relies on a philosophy of science that features the interplay of three fundamental pieces of the scholarly enterprise: theory construction, statistical methodology, and research methodology (Cozby 2009; Gravetter and Wallnau 2013; Myers 2011). Yet, we believe that this common description leaves out an important piece of the knowledge creation and subsequent evaluation process: conceptual advances. Conceptual advances, as we define them, are not truly theories or theory construction. Traditionally, proper psychological theories are tightly wrought chains or webs of propositions, each intimately connected to the others and each coherent on its own terms—which means that our view of psychological theory is ultimately similar to (or at least aspires to) the equations of theoretical physics. With such features, psychological theories can be tested empirically and evaluated using statistics. Conceptual advances, in our view, are more akin to ideas, suppositions, or innovations. In this way, we believe that conceptual advances are loose, sometimes metaphorical, and often incomplete propositions that nonetheless serve as ways to push science in another direction. From this understanding, we argue that Bem’s idea of psychological androgyny is best described as a conceptual advance—not a proper theory. (Her proper theory can be regarded as gender schema theory; Bem 1981a.) Thus, in our view, one of Bem’s demonstrable legacies is that she innovated psychology’s understanding of how to deal with the many constructs that fall under the rubric gender by advancing the notion of psychological androgyny.

Bem (1974) created the Bem Sex Role Inventory (BSRI) as the psychological measurement tool for both the traditional and progressive concepts that she wished to study. The progressive concept—psychological androgyny—along with the measurement possibility for it, infused new energy into the discussion of social influences on gender roles, as well as the extents to which those roles are internalized and to which it was psychologically healthy to rigidly adhere to them. We focus our analysis mostly on Bem’s work through the 1970s and into the 1980s, noting that Bem and colleagues (Bem 1974, 1975, 1977, 1979, 1981a, 1981b; Bem and Lenney 1976; Bem et al. 1976) developed a series of works that centrally focused on illustrating the advantages of leading a psychologically androgynous life—one that featured cognitive flexibility and steered clear of strict adherence to femininity for women or strict adherence to masculinity for men. Unless otherwise stated, all cited studies were based on samples of adults and sometimes children in the United States.

By focusing on the formative years of the concept of psychological androgyny, we cast any use of the BSRI as a way to measure the concept as secondary to our points. Thus, the many criticisms of the BSRI as a valid measurement tool for this and related concepts are not the focus of our article. We believe the validity of the BSRI to be an important issue on its own, and, for those interested in critiques of its validity, many authors have provided detailed arguments and empirical demonstrations. We provide a non-exhaustive list of these critiques for the reader’s consideration here, along with some of Bem’s responses: Abrahams et al. (1978); Auster and Ohm (2000); Bem (1979, 1981a); Harris (1994); Hoffman and Borders (2001); Holt and Ellis (1998); Pedhazur and Tetenbaum (1979); Spence (1984); Spence and Helmreich (1981); Tate et al. (2015); and Tobin et al. (2010).

With our focus on the concept of psychological androgyny, we provide an analysis of how Bem’s larger scholarly project of that time has filtered into the current understandings of gender-associated phenomena overall. Throughout our analysis, we focus on the conceptual pieces that are often hidden or overlooked within modern retellings of psychological androgyny. In our view, Bem’s project was one of the first and most enduring attempts to pull psychological theorizing out of a simplified view of phenomena associated with gender and is thus a touchstone for the modern understanding of gender as a multidimensional or multifaceted construct. We describe the functions that we believe psychological androgyny served from Bem’s own writing. Additionally, we describe how these functions allowed for true advances in the psychological study of gender, but were also beset by backslides into the simplified view. Consequently, although Bem may be best remembered as an advocate for feminism and social justice within psychology, at a time when little of either existed, she should be equally remembered for providing the science of psychology with the innovative conceptual tools to create more nuanced and intricate reasoning about a variety of phenomena associated with gender.

A Single Dimension: Masculinity–Femininity

As gender differences became a compelling area of psychological investigation in the United States in the 1950s and 1960s (Kagan 1964; Kohlberg 1966), it became clear that advances in theorizing about gender had outpaced the methodology that had been used to measure gender differences since the time of Terman and Miles (1936). Crystallizing these considerations, Constantinople (1973) provided a review of the literature in which she was able to make explicit some of the implicit assumptions psychological science had made while building the scales to measure gender and gender differences from the 1930s into the early 1970s. The central assumption was that all important aspects of gender existed along a single continuum called masculinity–femininity (M–F) (cf. Terman and Miles 1936). According to Constantinople (1973, p. 389), the construct of M–F had three features: “(a) that it (M–F) is best defined in terms of [gender] differences in item response; (b) that it is a single bipolar dimension ranging from extreme masculinity at one end to extreme femininity at the other; and, (c) that it is unidimensional in nature and can be adequately measured by a single score.” According to Constantinople, these three features were only assumptions, and ones that lacked evidence. Consequently, Constantinople argued that the psychological construct of M–F was built from conviction more than theory. And, although Constantinople (1973) noted that this construct of M–F was useful at least to the layperson (of that time) as an organizational tool for social experience, she questioned whether that same conceptual tool was helpful in a scientific understanding of gender.

To further explain Constantinople’s (1973) first and second points, with femininity and masculinity as two ends of the same continuum, the measurement assumption relies on the logical constraint that feminine items are opposites of masculine items (and vice versa) and the score that is obtained reflects this underlying opposition. This is, for example, exactly how Terman and Miles (1936) treated women’s and men’s responses: a difference between the groups was thought to be evidence of this continuum. The third point in Constantinople’s review suggests that the M–F concept could be exposed as inadequate and more adequately measured by a multidimensional approach. If one considers the possibility that there could be, at minimum, two separate dimensions within the conceptual space (one dimension measuring low-to-high femininity and the other dimension measuring low-to-high masculinity), then researchers would have at least one way to determine whether the assumptions of the unidimensional M–F understanding were warranted. If those assumptions were true, then people would almost always indicate being high on one dimension and simultaneously low on the other, revealing that two dimensions were unnecessary. Thus, asking separately about femininity and masculinity as constructs can support either the multidimensional view or the unidimensional view of M–F.
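The logic of this check can be sketched computationally. The following is a minimal illustration, assuming that each respondent has one femininity score and one masculinity score on a 1-to-7 scale; the scores, variable names, and the choice of a Pearson correlation are our assumptions for illustration and are not taken from Constantinople (1973) or from any BSRI data set. If the unidimensional view held, the two sets of scores would be strongly negatively correlated; if the two dimensions were independent, the correlation would be near zero.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a scale has no variance")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for six respondents (illustrative only, not real data).
# Pattern expected under the unidimensional view: high on one scale, low on the other.
femininity_bipolar = [6.5, 5.8, 5.0, 3.1, 2.4, 1.9]
masculinity_bipolar = [1.5, 2.2, 3.0, 4.9, 5.6, 6.1]

# Pattern expected if the dimensions vary independently:
# all four combinations of high and low occur.
femininity_mixed = [6.0, 6.2, 2.0, 2.3, 5.5, 3.0]
masculinity_mixed = [2.0, 6.1, 6.3, 2.1, 4.0, 5.0]

print("bipolar pattern r =", round(pearson_r(femininity_bipolar, masculinity_bipolar), 2))
print("mixed pattern r   =", round(pearson_r(femininity_mixed, masculinity_mixed), 2))
```

A strongly negative coefficient in real data would favor the single-continuum view, whereas a weak one would favor treating femininity and masculinity as separate dimensions.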

Introducing Psychological Androgyny

Bem (1974) created a new way of understanding the previous research on gender in a manner that was aligned with the points that Constantinople (1973) raised. Bem’s construction of the BSRI reflected the new idea that gender-associated stereotypes are not bipolar; instead, these social concepts of femininity and masculinity should be measured on separate scales, allowing a person to move freely on both dimensions. Importantly, with this two-dimensional measurement, the BSRI could still provide empirical evidence of whether some women and men were adhering to traditional gender stereotypes in addition to whether others were adhering to some flexible participation in stereotypically feminine and masculine qualities simultaneously (viz., psychological androgyny). According to Bem (1979), the BSRI was designed to assess the culturally-defined, desirable qualities for men and women (viz., gender stereotypes) that are then reflected in one’s self-description by personally endorsing whether they have these qualities on a continuum of never true to almost always true (which are the response anchors on the BSRI; Bem 1974). This method allowed for the measurement of psychological androgyny by observing that both feminine and masculine qualities were almost always present, at least to some degree, within the same individual.

In some of her early articles, Bem focused on telling the conceptual story of psychological androgyny (Bem 1974, 1975, 1977, 1979, 1981a, 1981b; Bem and Lenney 1976; Bem et al. 1976). Because the measurement properties of the BSRI allowed respondents to vary along a femininity dimension (from low-to-high) and separately on a masculinity dimension (also from low-to-high), this allowed for a two-dimensional space with four idealized outcomes: (a) high/low (e.g., high femininity/low masculinity), (b) low/high (e.g., low femininity/high masculinity), (c) high/high (i.e., high femininity/high masculinity), and (d) low/low (i.e., low femininity/low masculinity). Given that this two-dimensional understanding of gender-associated stereotypes allowed for free expression of both femininity and masculinity within a single individual, Bem argued that people should strive for androgyny. Thus, it appears that Bem (1974) introduced psychological androgyny with several functions. Below, we describe what we believe to be three of the central functions of this concept in psychological literature: (a) improving psychological well-being, (b) undercutting gender role polarization, and (c) expanding psychology’s focus beyond gender roles by implication.
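The four idealized outcomes of the two-dimensional space described above can be sketched as a simple classification. This is a minimal sketch under stated assumptions: the function name, the 1-to-7 score range, and the fixed cutoff of 4.0 are hypothetical choices made only for illustration and do not reproduce the BSRI scoring procedure.

```python
def classify_quadrant(femininity, masculinity, cutoff=4.0):
    """Return the idealized pattern as 'femininity level/masculinity level'.

    cutoff is a hypothetical threshold chosen for illustration only;
    it is not the scoring rule of the BSRI.
    """
    fem_level = "high" if femininity >= cutoff else "low"
    masc_level = "high" if masculinity >= cutoff else "low"
    return f"{fem_level}/{masc_level}"


# Hypothetical respondents (not real data): (femininity, masculinity).
for scores in [(6.1, 2.3), (2.0, 5.8), (5.9, 6.2), (2.4, 2.1)]:
    print(scores, classify_quadrant(*scores))
```

Because the two scores are classified separately, a respondent can land in any of the four cells, which a single bipolar M–F score cannot represent.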

Improving Psychological Well-Being

Research by Kagan (1964) and Kohlberg (1966) provided a foundation for Bem to begin identifying the behavioral motivations of typical gender role-typed individuals (who were formerly called “sex-typed” by Bem 1974, p. 155; Kagan 1964, p. 137; Kohlberg 1966, p. 83). By Kagan’s and Kohlberg’s definitions, typical gender role-typed individuals were driven to keep their behavior consistent with internalized gender role standards. From that previous research, Bem inferred that those with a strong self-concept as one gender would avoid engaging in behavior prescribed for the other gender (Bem 1974). The BSRI could identify these typical gender role-typed individuals, conditional on the high/low or low/high pattern aligning with the respondent’s gender self-categorization as female or male, respectively. That is, for Bem, typical gender role-typed women were those who were high in feminine and low in masculine stereotyped traits. Likewise, typical gender role-typed men were those who were low in feminine and high in masculine stereotyped traits.

Additionally, the creation of the BSRI can be viewed as giving a voice to two experiences relative to gender stereotypes that had yet to be accounted for in the psychological literature. One experience was atypical gender role-typed (which Bem described as “sex-reversed”; Bem 1975, p. 634). These individuals showed the opposite high/low pattern to the typical gender role-typed individuals. Atypical gender role-typed women were low in feminine and high in masculine stereotyped traits (i.e., were low/high women). Likewise, atypical gender role-typed men were high in feminine and low in masculine stereotyped traits (i.e., were high/low men). Importantly, Bem considered atypical gender role-typed individuals to fall into the same narrow and restrictive self-concept as typical gender role-typed individuals, only “differing from their more conventional counterparts primarily in their being ‘inappropriately’ rather than ‘appropriately’ [gender role]-typed” (Bem 1975, p. 634). (Yet Bem and Lenney 1976 later found only mixed support for this claim.)

The other experience was the concept of psychological androgyny. According to Bem (1974, p. 155), psychological androgyny is the idea that “individuals may be both masculine and feminine, both assertive and yielding, both instrumental and expressive—depending on the situational appropriateness of these various behaviors” (i.e., high/high). Until that point in modern psychology’s history, there were virtually no previous research studies on this idea or psychological experience. Interestingly, without any prior research (and likely her own intuitions as a guide), Bem boldly claimed that there were multiple benefits of a psychologically androgynous self-concept.
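The labels described above depend jointly on a respondent's gender self-categorization and on the pattern of femininity and masculinity levels. The following is a minimal sketch of that mapping; the function name and the string values are hypothetical, the levels are assumed to have been determined beforehand by some scoring rule, and the low/low pattern is returned without a label because this section does not name it.

```python
def role_type(self_label, fem_level, masc_level):
    """Map a self-label and a (femininity, masculinity) pattern to a label.

    self_label: 'woman' or 'man' (the two self-categorizations discussed here).
    fem_level, masc_level: 'high' or 'low', assumed to be scored beforehand.
    """
    if self_label not in ("woman", "man"):
        raise ValueError("this sketch only covers the labels 'woman' and 'man'")
    for level in (fem_level, masc_level):
        if level not in ("high", "low"):
            raise ValueError("levels must be 'high' or 'low'")

    pattern = (fem_level, masc_level)
    if pattern == ("high", "high"):
        return "psychologically androgynous"
    if pattern == ("low", "low"):
        return "low/low (not named in this section)"

    # Typical pattern: high/low for women, low/high for men.
    typical = ("high", "low") if self_label == "woman" else ("low", "high")
    if pattern == typical:
        return "typical gender role-typed"
    return "atypical gender role-typed"


print(role_type("woman", "high", "low"))
print(role_type("man", "high", "low"))
print(role_type("woman", "high", "high"))
```

Note that the self-label is an input to the mapping rather than something derived from the two scores, which mirrors the distinction between self-categorization and the endorsement of stereotyped traits.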

Bem (1975) attempted to show that women and men who exhibited psychological androgyny would perform better than those who scored as either typical or atypical gender role-typed in a variety of social contexts, especially those related to mental health. In general, to demonstrate this point, Bem had participants complete the BSRI and then later engage in what she called “feminine” and “masculine” activities (Bem 1975, p. 635; Bem et al. 1976, p. 1017; Bem and Lenney 1976, p. 48). The “feminine” activities attempted to evoke nurturing and playful behavior, for instance, by interacting with a tiny kitten (Bem 1975). The “masculine” activities assessed a willingness to remain independent, such as in a conformity paradigm where accomplices provided opposite statements to the participant’s point of view but encouraged the participant to agree with them (Bem 1975). Importantly, Bem’s (1975) participants engaged in both tasks. Bem theorized that those with a strong typical gender role-typed self-concept or a strong atypical gender role-typed self-concept should do well at only one activity, but those with a psychologically androgynous self-concept should do well at both activities. If one takes the BSRI as a valid measurement, then Bem and colleagues found mostly supportive results (Bem 1975), but a few non-supportive ones as well—especially the lack of difference between psychologically androgynous (high/high) and atypical gender role-typed (low/high) women (Bem 1975; Bem and Lenney 1976). With these mostly supportive (though never replicated) findings, Bem argued that psychological androgyny truly does provide a cognitive flexibility that generally allows for free engagement in either type of activity.

Bem (1975, 1977, 1981a; Bem and Lenney 1976; Bem et al. 1976) continued through the 1970s and early 1980s to advocate for the advantages of psychological androgyny. In fact, a review of her commentaries and articles after 1974 shows an increasing commitment to the virtues and benefits of psychological androgyny. For instance, Bem (1975, p. 643) stated:

It is clear that whatever psychological barriers may turn out to be responsible for the behavioral rigidities of the [typical gender role] typed and [atypical gender role-typed] subjects the current set of studies nevertheless provided the first empirical demonstration that there exists a distinct class of people who can appropriately be termed androgynous, whose [gender] role adaptability enables them to engage in situationally effective behavior without regard for its stereotype as masculine or feminine.

Bem and Lenney (1976, p. 48) even went so far as to argue, “rather, it is now the [psychologically] ‘androgynous’ person, one capable of incorporating both masculinity and femininity into his or her personality, who is emerging as a more appropriate [gender] role ideal for contemporary society.”

Bem also continued to argue for the disadvantages of typical gender role-typed self-concepts. For example, Bem and Lenney (1976) observed that typical gender role-typed individuals not only actively avoided a wide variety of simple, everyday activities (like nailing two boards together or winding a package of yarn into a ball) just because these activities happen to be stereotyped as more appropriate for a different gender, but they also reported discomfort and even some temporary loss of self-esteem when actually required to perform such activities. The restrictive nature of the typical gender role-typed individuals became the less desirable path because of these limiting effects. Bem and Lenney summarized their thoughts as follows: “We can only speculate about the specific repercussions that this pattern of avoidance must have in one’s daily life, but it seems clear that [typical gender role] typing does restrict one’s behavior in unnecessary and perhaps even dysfunctional ways” (p. 53).

From this targeted review of Bem’s early influential works, one can see a commitment to the social justice mission of gender equality through advancing the concept of psychological androgyny. Bem’s advocacy for psychological androgyny through the 1970s and 1980s in the United States can be placed in a historical and sociological context of the rise of the second wave of feminism, a push for women’s equality in the workforce, and a move away from the older U.S. societal belief that women were appropriately feminine and that men were appropriately masculine—with the expectation that neither gender should invest psychological energy in what was deemed appropriate for the other gender. In fact, psychological androgyny can be viewed as a clarion call that all people should freely express both stereotypically feminine and stereotypically masculine attributes to achieve optimal well-being.

Undercutting Gender Role Polarization

Importantly, Bem (1995) reminded her readers that another motive for advancing the concept of psychological androgyny was to highlight the importance of social constructions in the dynamics of gender roles. In her autobiographical essay, Bem (1995, p. 45) explicitly stated that:

By the late 1970s and early 1980s, however, I had begun to see that the concept of androgyny inevitably focuses so much more attention on the individual’s being both masculine and feminine than on the culture’s having created the concepts of masculinity and femininity in the first place that it can legitimately be said to reproduce precisely the gender polarization that it seeks to undercut. Accordingly, I moved on to the concept of gender schematicity because it enabled me to argue even more forcefully that masculinity and femininity are merely the constructions of a cultural schema—or lens—that polarizes gender.

Bem’s statement that she introduced the concept of psychological androgyny to undercut gender polarization is worth noting because it reveals that the concept was meant to have another function, namely, one of change within psychological (if not societal) approaches to gender roles. We argue that when Bem (1995) explicitly stated that she focused on gender schema theory more after the early 1980s, this shift was an effort to make clearer her commitment to the social justice concern that gender roles themselves are problematic, but ultimately malleable, social forces. This function of psychological androgyny as undercutting gender polarization is distinguishable from the first function of psychological androgyny as a push for equality because it focuses a scholar’s attention on the nature of gender roles themselves, rather than simply being flexible and comfortable within them. By exposing gender roles as a socially constructed lens, psychological androgyny was meant to challenge and ultimately erode that polarizing view of gender as a single M–F dimension.

Expanding the Field’s Focus by Implication

We argue that by addressing only one aspect of gender—namely, gender roles (or what Bem 1974, p. 162, called “sex-role self-concept,” now more appropriately referred to in modern psychology as gender role identity)—Bem focused scientific attention on a subset of experiences and social phenomena that had previously been clustered together under the heading “gender” in the previous single dimension M–F understanding. Bem’s (1974, 1979, 1981a) definition of femininity and masculinity as located within socially constructed gender roles can be contrasted with the generalized way that M–F was used since Terman and Miles (1936). Other scholars (Dahlstrom et al. 1972; Hathaway and McKinley 1940; Lippa 1991; Strong 1935, 1936) used M–F to refer to job aptitudes, social interests and hobbies, self-labels, clothing choices, attitudes toward ingroup or outgroup members, and sexual orientation.

The foregoing list of experiences does not appear to be included in gender roles as Bem understood them. For instance, Bem’s (1975) definition of a typical gender role-typed individual is a person whose behavior is restricted in situational contexts, contingent on the stereotype of the behavior as feminine or masculine. In our view, Bem (1995) used gender schematicity and psychological androgyny as ways to showcase that gender roles were not only socially constructed (and thus malleable) but also only one part of the experiences to which gender has been tied. Thus, through Bem’s efforts to single out gender roles and only these associated phenomena (e.g., gender role identity, gender role schemas) from other aspects of gender, we believe that she was able to show by implication (and admittedly not explicitly) that gender itself has different layers (or different aspects) to it, even while she focused on just one of those aspects—namely, gender roles.

As with any sufficiently new idea for its time, Bem’s conceptual innovation of psychological androgyny resulted in a mixed set of consequences within psychological research. In particular, psychological scientists engaged in both backslides into and advances against the previous M–F understanding of gender using Bem’s work as a starting point.

Backslides into the Previous M–F Understanding

Even while Bem hoped that scholars would use psychological androgyny to undercut previous understandings of gender, a number of subsequent researchers slid back into the older thinking. Rather than provide an exhaustive list of researchers who contributed to these backslides, we instead provide only three lines of research that showcase the backslides in action while using the BSRI itself, or work that considers itself to focus on femininity or masculinity (even if it did not use the BSRI). These three lines of research are: (a) predicting woman and man self-labels from BSRI scores, (b) using BSRI scores to support earlier versions of the simplified M–F understanding of gender, and (c) collapsing femininity and masculinity into heteronormativity. We highlight these specific backslides because they appeared in early responses to Bem’s 1970s work and have thereby created confusion about Bem’s actual conceptual analysis, a confusion that, in our opinion, persists into the present.

Predicting Woman and Man Self-Labels

As should be clear from our foregoing discussions, Bem intended for a participant’s self-labels of woman and man to be distinct from endorsing feminine and masculine gender role-related desirable traits. As we argued previously and as others have also argued (e.g., Keener 2015; Tate 2012, 2014), the only way that labels such as typical gender role-typed and atypical gender role-typed make sense is if one grants that self-labeling is a different psychological process from the personal endorsement of either the femininity or the masculinity construct. Nonetheless, some of Bem’s critics in the 1970s and some contemporary critics have failed to realize or understand the import of this distinction. This failure led to a procedure of using the femininity and masculinity scores from the BSRI to predict gender self-labeling as woman or man, respectively. This procedure first appeared in the Pedhazur and Tetenbaum (1979) article and was repeated by Lippa (1991) and then by Choi et al. (2008).

However, we argue that this procedure is theoretically meaningless given Bem’s original intentions. In order for this procedure to make sense, one would have to believe either (a) that self-labeling as woman or man is a consequence of rating oneself on stereotypically feminine and masculine traits or else (b) that personal endorsement of gender role stereotypes is a subcomponent in the larger phenomenon of self-labeling (in a similar way, for example, that gregariousness is a subcomponent of the larger phenomenon of extraversion). However, neither of these positions is what Bem believed. Instead, it appears that Pedhazur and Tetenbaum (1979), Lippa (1991), and Choi et al. (2008) showcase a belief in the previous M–F understanding that both Constantinople (1973) and Bem (1974) problematized.

The BSRI and Simplified Masculinity–Femininity

Another set of studies has focused on correlating the BSRI with earlier measures of masculinity–femininity constructs, the very measures that Constantinople (1973) problematized as being unclear and uninformative. Bernard (1981), for example, attempted to show the so-called multidimensionality of masculinity–femininity by factor analyzing existing M–F scales (viz., Minnesota Multiphasic Personality Inventory [MMPI], California Psychological Inventory [CPI], Guilford-Zimmerman Temperament Survey [GZTP], and the Strong Vocational Interest Blank [SVIB]) to determine subfactors of masculinity and femininity. However, those existing scales used the same general method introduced by Terman and Miles (1936) of calculating differences between women’s and men’s responses to the items as evidence for masculinity–femininity. This is the general method that Constantinople problematized and the one that, as we have argued, Bem’s work opposed. Consequently, Bernard’s analysis neither is consistent with nor advances any point made by Constantinople.

Even so, Bernard (1981) correlated the scores from the previously named inventories with the BSRI to determine how well the BSRI measured the so-called multidimensionality of masculinity–femininity. Bernard summarized these latter results as follows:

These results also suggest that the BSRI scores are related to the multidimensional factors underlying traditional scales. However, although this may support the construct validity of the BSRI, the M[asculinity] and F[emininity] scores’ loadings are only moderate and therefore information may be lost by relying on the BSRI scales alone to represent the entire domain of [gender role] identity. (p. 801)

It should be clear from the prior quote that Bernard was assuming that the so-called traditional scales (i.e., the MMPI, CPI, GZTP, and SVIB) validly and appropriately measured masculinity–femininity, whereas the BSRI might do this with information-loss relative to those scales. Yet, we argue this line of reasoning is inconsistent with Bem’s arguments; Bem was trying to use the BSRI to undercut the previous understanding, not to contribute to it (e.g., Bem 1995).

Femininity, Masculinity, and Heteronormativity

Another backslide into the previous M–F understanding while using the language and concepts facilitated by Bem’s BSRI (e.g., Bem 1974) comes from research into hypermasculinity and hyperfemininity. Although ostensibly about gender roles, the hypermasculinity and hyperfemininity constructs developed by Mosher and Sirkin (1984) and Murnen and Byrne (1991), respectively, really have more to do with heteronormativity than gender as such. Heteronormativity is sometimes used to describe a body of lifestyle norms in which heterosexuality is privileged and taken for granted to the point that it is normalized and naturalized (Herz and Johansson 2015). It is true that most U.S. research participants identify as heterosexual (Diamond 1993), but part of the previous M–F understanding since the 1930s (and before) has been to conflate masculinity–femininity with heterosexuality in the following way: masculine men are heterosexual men and feminine women are heterosexual women (Dahlstrom et al. 1972). This conflation of gender roles and sexual orientation was the basis for sexual inversion theory (Ellis 1905), in which masculine women are presumed to be lesbian women and feminine men are presumed to be gay men (for a similar argument from the perspective of social perceivers’ lay theories, see Kite and Deaux 1987). Although heteronormativity was not discussed explicitly in Bem’s early works, the heteronormative bias is one of the many conflations Bem appeared to be avoiding in her methods of assessing only one aspect of gender: gender roles.

Nevertheless, Mosher and Sirkin (1984) developed the Hypermasculinity Scale and credited Bem’s ideas as part of the inspiration for this measure. Murnen and Byrne (1991) developed the Hyperfemininity Scale but argued that it should not be correlated with the BSRI because the latter assesses positive gender-associated traits. An immediate issue of note for both these scales is that each required that only one gender group respond to it (i.e., men for the Hypermasculinity Scale and women for the Hyperfemininity Scale). This already shows an important departure from Bem’s original intentions with the BSRI—that both gender groups could respond to all the items. In what can be viewed as a rectification of that issue, Hamburger et al. (1996) combined the two previous scales into a single Hypergender Ideologies Scale that allowed both women and men to respond to the same items. However, the item content is important to discuss for all scales because it shows a particular theoretical bias that seems to favor heteronormativity rather than anything to do with gender as such—especially because all authors describe their respective scales as assessing gender roles.

To illustrate the heteronormativity of the items on the hypergender ideologies scales, consider just two representative items: (a) “A real man can get any woman to have sex with him” and (b) “Homosexuals can be just as good at parenting as heterosexuals” (Hamburger et al. 1996, pp. 164–165). Thus, the backslide is that hypergender ideologies are really hyper-heteronormative ideologies: they focus on heterosexual dynamics, which necessarily involve different gender groups (namely, women and men), but they are really about sexual orientation and the behavioral dynamics within it. Consequently, crediting Bem for providing some foundation for hypergender ideologies, whether explicitly, as Mosher and Sirkin (1984) did, or as a foil, as Murnen and Byrne (1991) did, muddles the message. In the abstract, the study of gender roles should extend beyond their application to heterosexuals because childhood socializations into gender-typed behaviors affect adult social interactions across sexual orientations and ethnicities (cf. Wood and Eagly 2010), even though there is likely nuance based on intersecting identities (cf. Cole 2009).

Research Advances: Extending Bem’s Gender Role Focus

There are researchers who have successfully extended Bem’s focus on some aspect of gender roles. We highlight two areas of research in which these advances have happened: (a) the modeling of social and self-perceptions of agency and communion as gendered and (b) the measurement of gender role beliefs.

Social and Self-Perceptions of Agency and Communion

Given the existing problems of validity for the BSRI, some researchers have recast the core concepts of Bem’s ideas as agency (formerly, masculinity) and communion (formerly, femininity), treating both as gender-related (and ultimately malleable) perceptions of self and other. Some researchers have called this gender identity when it has been focused on self-perceptions (e.g., Laurent and Hodges 2009; Witt and Wood 2010, p. 635), whereas others have simply labeled the attributes as either communion or agency and noted that social perceivers tie these attributes differentially to women and men, respectively (e.g., Diekman and Eagly 2008; Rudman and Glick 1999, 2001).

One notable line of research has examined these attributes as stereotypically associated with men (agency) and with women (communion) in leadership contexts to uncover sources of interpersonal prejudice against women leaders (Eagly and Karau 2002; Sczesny 2003). Another line has demonstrated that agentic women leaders are perceived less positively than either agentic male leaders (Rudman and Glick 1999) or equally communal and agentic female leaders (Rudman and Glick 2001) by both women and men as perceivers. Communal male leaders are evaluated negatively by both women and men as perceivers, but not more so than communal female leaders (Rudman and Glick 1999).

Another line of research has examined how people’s perceptions of self in relation to agency and communion attributes influence different social outcomes, such as attraction to science, technology, engineering, and mathematics (STEM) academic fields (Diekman et al. 2011). Still other research has used role-congruity theory (Diekman and Eagly 2008), which focuses on how stereotypical gender roles can be used both as social perceptions and as self-perceptions. Considering self-perceptions, Brown and Diekman (2010), for example, found that women and men college students in the U.S. Midwest hoped for role-congruent future selves and feared role-incongruent future selves. All this research can be viewed as extending Bem’s work insofar as it suggests that gender-schematicity (in the definition Bem proposed) still affects U.S. women’s and men’s self-attitudes and attitudes about others.

The Measurement of Gender Role Beliefs

Whereas we argued that some versions of gender role beliefs are really tantamount to heteronormativity (rather than gender roles as such), Kerr and Holden (1996), at least, have provided researchers with a way to assess participants’ beliefs in traditional gender roles that is connected to, but separable from, heteronormativity. Kerr and Holden (1996) created the Gender Role Beliefs Scale (GRBS), which provides a measure of attitudes toward both traditional and what the authors called “feminist” (p. 3) (or, what we call progressive) beliefs about gender roles. It should be noted that the GRBS is not a measure of personal endorsement of gender roles; instead, it assesses agreement with statements about whether traditional behaviors are acceptable or appropriate (e.g., “It is disrespectful for a man to swear in the presence of a lady”; Kerr and Holden 1996, p. 10). To our knowledge, however, even this belief in traditional or more progressive gender roles has not been correlated with any mental health outcome (e.g., self-esteem). Such a correlation would provide a good test of one of Bem’s (1974) assumptions.

Theoretical Advances: Focusing Beyond Gender Roles

We have argued that by focusing specifically on gender roles, Bem made a subtle move to focus on one piece of the larger whole that could be described as gender. In this section, we note that at least two recent analytic models (and associated lines of research) have proposed the idea that gender itself is either multidimensional (e.g., Egan and Perry 2001) or multi-faceted (e.g., Tate et al. 2014)—with differences intended between the meanings of these different terms. We argue that both models can be viewed as among the most modern attempts to erode the previous M–F understanding and ones that have used Bem’s initial work in some way as the conceptual springboard to do this.

The Multidimensional Understanding of Gender

Perry and colleagues (e.g., Egan and Perry 2001; Tobin et al. 2010) have focused on five different topics to which the term gender has been tied and used these topics as foci to build their multidimensional model of gender. The five dimensions of gender for Perry and colleagues are: (a) gender membership knowledge (or the awareness that the self has been or is categorized into a gender group), (b) gender typicality (or the self-rating of being similar to others with the same gender label), (c) contentment with one’s gender assignment (or the attitude toward one’s categorization into that specific group), (d) felt-pressure to conform to gender stereotypes or expectations for one’s gender ingroup, and (e) superiority feelings toward outgroups (or traditional gender bias). It is noteworthy that Perry and colleagues have studied this model with children (under 18 years old) in the United States, but the dimensions could also be extended to U.S. adults. (See Tate et al. 2015, for a first extension of the Egan and Perry gender typicality construct to U.S. adults as well as Lemaster et al. 2015, for a second extension with different response options.)

It is worth noting that in previous works, Perry and colleagues (e.g., Egan and Perry 2001; Tobin et al. 2010) have critiqued the BSRI in particular as part of the way to argue for the merits of their multidimensional understanding. Although these critiques have evolved (see Pauletti et al. 2016), the starting point deserves attention because it showcases engagement with Bem’s ideas as part of the conceptual foundation for the development of this larger, more expansive understanding of gender. The upshot of these critical engagements with Bem’s work on the BSRI is that Perry and colleagues have pointed to the need for characterizing gender as more than simply gender roles as Bem defined them (e.g., Tobin et al. 2010). In their gender self-socialization model, for instance, Tobin et al. (2010) argue that fully understanding how self-socialization into gender happens, and how it differs across individual children, requires understanding how children view their own knowledge that they have been assigned to a gender category, how content they are with their gender assignment, how similar they feel to others assigned to that same gender category, how much pressure they feel to conform to stereotypes of that gender category, and how superior (or inferior) they feel when compared to other children in a different gender category. Each of the foregoing foci for this socialization is, of course, one of the dimensions on which Perry and colleagues focus to argue for how children experience the multidimensionality of gender.

As a related aside, even the framing of the short titles for the dimensions of gender makes it difficult for a scholar to place the terms masculinity and femininity onto these dimensions. In this way, Perry and colleagues invite scholars to fully break out of the previous M–F understanding by focusing on the discrete topics (each with distinct, non-overlapping names) in a way that achieves both clarity and precision in the study of gender overall.

The Multi-faceted Understanding of Gender

Inspired by both Perry and colleagues’ (e.g., Egan and Perry 2001; Tobin et al. 2010) multidimensional work and Bem’s (1974) original work on psychological androgyny, Tate et al. (2014, p. 303) recently proposed a multi-faceted construal of gender-related phenomena in an analytic model they called “the gender bundle.” In particular, Tate et al. describe the five facets of gender as: (a) birth-assigned gender category (which comes from cultural authorities [e.g., medical professionals in industrialized countries]); (b) current gender identity or self-assigned gender category (which is the individual’s own, self-reported sense of being categorizable into a gender group, such as female, male, or nonbinary); (c) gender roles and gender expectations, which explicitly include the ideas presented by Bem (1974, 1981a) in her work on psychological androgyny (as well as other work in manhood and womanhood; Vandello and Bosson 2013), but do not adhere to any specific measurement of these constructs; (d) social presentation of gender, which comprises the interpersonal social signals that are associated with one’s sense of self-categorization, including attire-based presentation and appellations (e.g., names), but that other people also use to infer the self’s categorization; and (e) gender evaluations as ingroup-directed attitudes (e.g., women evaluating other women) and outgroup-directed attitudes (e.g., women evaluating men or nonbinary individuals), as well as other types of self-other similarity judgments (e.g., a woman reporting her typicality with respect to other women). Thus, the gender bundle understanding has both shared and unique features with respect to the multidimensional understanding of gender, and this is why a difference is intended in the meanings of facets as compared to dimensions. Nonetheless, both analytic models are demonstrations that gender itself appears to have layers that can be meaningfully distinguished from one another.

Importantly, Tate et al. (2014) based part of the idea for the gender bundle upon Bem’s original work, which noted that self-labeling as woman or man was distinct from endorsement of what Bem believed were gender stereotypes of femininity and masculinity (see also Tate 2012, 2014). If such a separation was required to argue for psychological androgyny, then this distinction could (and likely should) extend to other types of phenomena studied under the umbrella of gender. In fact, the reason that the self-categorization facet is separated from every other facet is inspired by Bem’s original work. Accordingly, Tate and colleagues used Bem’s original work in a non-oppositional way as scaffolding to create all the facets of the gender bundle itself. Although Tate and colleagues do not believe that Bem’s tool, the BSRI, actually measures gender role expectations or self-reported gender typicality (see Tate et al. 2015), they do appreciate the conceptual insight to separate gender self-categorization from gender role endorsement. In this way, the gender bundle can also be viewed as providing scholars with the conceptual tools to completely erode the previous M–F understanding.

Summarizing Advancements to Bem’s Conceptual Work

The previous examples demonstrate that Bem’s work on psychological androgyny has been successfully extended into present-day scholarship with fidelity to the original issues as we construed them. What is more, both sets of advances we described accomplish this development without backsliding scholarship into the narrower, previous understanding of gender as M–F. Finally, all advances allow for the extension of Bem’s work into the newest frontier of a more intersectional understanding of psychological processes to which gender is tied (e.g., Cole 2009; Davis 2008; Stuanes 2003; Yuval-Davis 2006), by focusing scholars on (a) the nature of gender roles and their consequences and (b) the fact that gender roles are one among many aspects of what can be meant by gender.

Summary and Conclusions

We have argued in our article that Bem’s development of the concept of psychological androgyny can be viewed as a truly seminal contribution to the study of gender in the 1970s and 1980s in the United States—one that has influenced many different lines of research and scholarship into the present. Here, we synthesize all the presented information to highlight three notable and interrelated impacts that Bem’s introduction of the concept of psychological androgyny had on how psychology’s researchers approach the study of gender. We believe that these three impacts are intimately tied to the functions that Bem appears to have intended for the concept.

One impact concerned social justice. Specifically, psychological androgyny served as a catalyst for, and later an impetus toward, gender equality in the psychological study of gender in the United States. Contained within the first two functions of the concept that we described (i.e., improving well-being and undercutting gender role polarization) is the idea that women and men need to become equal in psychological androgyny (at least), which allows for further considerations as to where else these gender groups should also be equal. In this way, we argue that Bem can and should be regarded as a trailblazer and advocate for gender equality in psychological science and U.S. social science in general.

A second impact was Bem’s ability to use the concept of psychological androgyny to focus research attention on gender roles as ultimately malleable, socially constructed phenomena. The first two functions of psychological androgyny that we described cannot be achieved unless one tacitly believes that gender role-associated characteristics are truly malleable within an individual and (likely) coming from a source outside biology.

A third and less obvious impact was that Bem’s use of psychological androgyny, by implication, helped scholars divest themselves of the previous, narrow understanding of gender as ultimately collapsible into a single dimension of masculinity–femininity (M–F). As we noted earlier, with the third function of focusing squarely on gender roles as one among many associates of gender, Bem, with psychological androgyny, seemed to implicitly acknowledge the need to expand the ideas of how gender was conceived in that time period.

These three impacts are noteworthy precisely because each stood in contrast to the previous understanding of gender in psychology. Without these impacts, it would have been more difficult to develop the insights from which all gender scholars currently benefit. In this regard, psychological androgyny is an important and sometimes misunderstood touchstone in understanding the modern scholarship on gender in psychological science.

In that light, it is unfortunate that some scholars have not yet fully appreciated these three impacts. For instance, more than 25 years ago, in a validation study of the psychometric properties of the BSRI, Wong, McCreary, and Duffy (1990, p. 258) mischaracterized, from our perspective, Bem’s intent for creating the psychological androgyny construct when they stated:

Despite that Bem herself declared that the concept of androgyny “contains an inner contradiction and hence the seeds of its own destruction” ([Bem,] 1979, p. 1053), we feel that the concept has not self-destructed but rather is thriving, at least in the psychological literature.

Wong et al. (1990) then go on to cite 419 entries on an electronic psychology reference database as support for the “thriving” statement.

Yet, in the context in which Bem wrote the 1979 (p. 1053) quote mentioned by Wong et al. (1990), psychological androgyny contains “the seeds of its own destruction” by undercutting the idea of using a single, narrow M–F dimension to define or reify gender. (This sentiment was repeated by Bem 1995.) As Bem (1979) argued, it would be through the cultural adoption of psychological androgyny in the United States that all women and men would be free of gender stereotypical thinking. At that point, the concepts of femininity and masculinity as gender roles would cease to have any social power, and, in this sense, self-destruct. Thus, for Bem, when psychological androgyny becomes a reality, then one aspect of the social construction of gender polarization will have been eroded completely in both science and society.

Before writing this article, like many scholars before us, we had a particular (though ultimately inaccurate) picture of what Bem’s scholarly project was and what she achieved based on the dominant critiques. Simultaneously, we understood that without the provocativeness of Bem’s work, the study of gender within U.S. psychology would not have progressed as it did. By consulting the original source material and situating our analysis in Bem’s own epoch (rather than just accepting the most repeated current narratives), we came to understand just how transformative Bem’s scholarly project actually was—at least in the three impacts discussed here. The critiques of Bem’s work notwithstanding, we believe that Bem provided a true conceptual advance to the study of gender (at least in the United States). As with any conceptual advance, criticism and derision are just as likely as praise and excitement because conceptual advances contain subtle and incomplete elements together with the sparks of innovation. Yet, such innovations seem indispensable to the conduct of any science because these big ideas perturb the status quo—even if they are somewhat vague for method, theory, and statistics. Bem’s concept of psychological androgyny is a perfect example of this kind of big idea in psychological science. We hope that through our article we have shown scholars just how important this idea was in its own time, as well as how it can still inspire innovations in gender scholarship now and into the future.