1 Introduction

Academic environments are important for student learning and retention. A number of studies have demonstrated a correlation between warmer perceptions of an academic environment and student success (e.g., Edman and Brazil 2007; Farley 2002; Thompson et al. 2007). For example, college students’ perceptions of the academic environment are related to their persistence in higher education (Gloria and Ho 2003) and exam performance (McKinney et al. 2006). Furthermore, academic environments can cue who does—and does not—belong (Cheryan et al. 2011, 2009), which can lead to segregated patterns of participation in certain fields (Cheryan and Plaut 2010). To encourage student success and retention, researchers and educators alike should assess academic environments and intervene when necessary to shape a more supportive community.

Such environmental scrutiny might most benefit traditionally underrepresented populations, such as women majoring in science, technology, engineering, and math (STEM) fields. Previous research demonstrates that female students in STEM fields are frequently exposed to unsupportive academic environments. Although examples of blatant discrimination and stereotyping of women in these settings exist (Cohn 2000; Lyness and Heilman 2006), more often these environments are characterized by subtle behavioral biases. Examples include men receiving more challenging materials, positive feedback, and time to respond to questions than women, which selectively promotes men’s development of STEM skills (Beaman et al. 2006; Becker 1981; Kelly 1988; Merrett and Wheldall 1992; Sadker et al. 2009). Women may also perceive an environment as unwelcoming when men greatly outnumber women (Murphy et al. 2007). Over time, these environments may lead women to develop negative beliefs about their participation in STEM, inhibiting their success and encouraging them to choose other fields for their major or career.

Unwelcoming academic environments may influence three kinds of beliefs that matter for women’s STEM outcomes. First, they may encourage women to implicitly link STEM more strongly with men, their outgroup, than with women, their ingroup (Nosek et al. 2002). Women and girls also tend to identify with math and science domains less than men and boys do (Jacobs et al. 2002; Nosek et al. 2002). These beliefs are shaped by repeated exposure to more stereotypic than counterstereotypic messages in the environment (Han et al. 2006; Karpinski and Hilton 2001). That is, women’s implicit beliefs about self, gender, and STEM are shaped by greater exposure to male-STEM pairings than female-STEM pairings. Second, unwelcoming STEM environments may highlight how numerically scarce women are in STEM fields (Murphy et al. 2007). Finally, they may remind women that others expect women to have weaker math and science ability than men (Steele et al. 2002), and thus heighten awareness that they may be judged in terms of this stereotype (Spencer et al. 1999).

Negative implicit beliefs, expectations of underrepresentation, and concerns about stereotyping each have troubling consequences. Students who hold negative beliefs about an academic domain, deeming it personally unimportant and feeling that their self-esteem is not contingent on their performance in that domain (i.e., who are less domain-identified) show less participation and interest in those fields (Crocker and Major 1989; Simpkins et al. 2006). Feeling outnumbered and concerned about self-relevant stereotypes can diminish women’s math test performance (Quinn and Spencer 2001; Schmader 2002; Sekaquaptewa and Thompson 2003; Spencer et al. 1999) and motivation to participate in STEM-related activities (Murphy et al. 2007). Increasing women’s identification with STEM, increasing their expectations for female representation, and reducing their stereotyping concerns may be important for increasing women’s participation in STEM as students and professionals.

Encouragingly, these changes can be achieved by reshaping environments. The “stereotype inoculation model” argues women can develop stronger implicit STEM identities through exposure to positive cues in their surroundings (Dasgupta 2011; Stout et al. 2011). U.S. undergraduate women in environments that featured ingroup role models and experts reported stronger implicit STEM identity, which in turn increased motivation and persistence in STEM fields. Importantly, these positive outcomes emerged even as women’s male-STEM stereotypes stayed the same. Although it is possible to change implicit stereotypes, it often requires targeted training on counterstereotypic pairings (e.g., Dasgupta and Asgari 2004; Dasgupta and Greenwald 2001; Karpinski and Hilton 2001). Implicit stereotypes may be more resilient against subtler, briefer, or more naturalistic examples (Stout et al. 2011). After all, implicit stereotypes reflect deep-seated associations encountered repeatedly throughout one’s lifetime (e.g., “math is not for girls, therefore math is not for me;” Nosek et al. 2002), and participants may not even be aware that they hold them.

Instead of weakening implicit stereotypes, environmental messages that highlight women’s presence in STEM may dissuade women from defining themselves in terms of those stereotypes. These messages would thus yield not just stronger STEM identity, but also expectations of a female presence and less concern about being judged in terms of gender stereotypes. Research suggests that small changes to women’s environments can improve these outcomes. For instance, undergraduate women feel less out-of-place in a computer science classroom featuring neutral nature posters than in one featuring “geeky” Star Trek posters (Cheryan et al. 2009). Studies at primarily White universities show that undergraduate women are less affected by negative stereotypes in environments that include other women than in otherwise all-male settings (Inzlicht and Ben-Zeev 2000; Sekaquaptewa and Thompson 2003). Testing situations believed to be irrelevant to gender stereotypes (e.g., a math test described as “gender fair”) yield weaker concerns about being stereotyped and raise undergraduate women’s math performance to match that of men (Quinn and Spencer 2001; Schmader 2002; Spencer et al. 1999).

In order to understand how academic environments could be altered to promote these outcomes, the present research examined an academic environment that is infused with counterstereotypic pairings and welcoming cues. At the University of Michigan, first year women pursuing STEM majors can enroll in a residential program called “Women in Science and Engineering (WISE).” WISE women share a living environment that offers academic resources and social support for women entering male-dominated STEM fields. They are surrounded by incidental messages emphasizing women’s presence in science, from flyers advertising STEM career resources to “WISE Night” discussions with female STEM faculty. Peer role models, including senior WISE mentors and student-run study groups, provide additional counterstereotypic examples. Finally, by virtue of living with female STEM majors, WISE members simply see more women in science and engineering than a typical STEM major would. An environment rich in counterstereotypic messages may strengthen these female students’ identification with STEM.

Previous research has examined the effects of such STEM-focused living-learning communities across the country with racially diverse samples of students, revealing that these programs are effective in encouraging the participation of undergraduate women in STEM fields (though they may particularly benefit majority rather than underrepresented minority students; Hathaway et al. 2001; Soldner et al. 2012). For instance, the WISE program was shown to increase female undergraduates’ retention in science fields compared to male and female control groups matched on ethnicity, intended major, GPA, and SAT scores (Hathaway et al. 2001). These effects likely occur because the programs increase participants’ perceptions of the compatibility between STEM fields and being female, and because they provide social support for women in STEM (Rosenthal et al. 2011). However, previous research has not investigated which features of the environment offered in these programs contribute to these positive outcomes.

Therefore, the goal of Study 1 was to identify aspects of the academic climate that differed between women in the WISE program and those not in the program; that is, to identify the specific environmental elements that make the WISE environment more supportive for women in STEM. Study 2 then tested an intervention that combines these positive environmental elements with techniques from previous research (e.g., Dasgupta and Asgari 2004). Our ultimate goal was to develop an effective intervention that could be translated for more traditional academic environments to reach women who do not have access to such an extensive program.

2 Study 1

2.1 Method

2.1.1 Participants and procedure

An online survey was administered to female college students from the University of Michigan WISE program (\(n = 29\) “WISE women”) and women not enrolled in WISE but taking introductory-level STEM courses (e.g., biology, chemistry, calculus, physics, and engineering) commonly taken by WISE members (\(n = 41\) “non-WISE women”). All participants were paid ten dollars in compensation.

Applicants are selected for the WISE program based on academic record and an essay on motivation for joining the program. More than 90 % of applicants are accepted annually. Because this study used a quasi-experimental design, it is important to compare the groups on a variety of demographic and background variables to gauge whether there were pre-existing differences between them. The racial composition of the two samples did not differ significantly, \(\chi ^{2} (4) = 5.02, p = .29\). Among the WISE women, there were 24 White/Caucasian, 1 Asian American/Asian/Pacific Islander, 2 African American/Black, and 2 Latina/Hispanic. Among the non-WISE women, there were 26 White/Caucasian, 8 Asian American/Asian/Pacific Islander, 3 African American/Black, 3 Latina/Hispanic, and 1 Native American/Alaskan Native.
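As an illustration of the comparison reported above, the following minimal sketch recomputes a Pearson chi-square test of independence from the racial composition counts given in the text. It is not the authors' analysis script; the function names are hypothetical, and the p-value helper relies on a closed form that holds only when the degrees of freedom are even. Under these assumptions, the result should land within rounding of the reported statistic.

```python
import math

# Counts per category as reported in the text (columns: White, Asian American,
# African American, Latina, Native American).
observed = [
    [24, 1, 2, 2, 0],  # WISE women (n = 29)
    [26, 8, 3, 3, 1],  # non-WISE women (n = 41)
]


def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Expected count under independence of group and category.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df


def chi_square_p_value_even_df(statistic, df):
    """Upper-tail p-value; this closed form is valid only for even df."""
    if df % 2 != 0:
        raise ValueError("closed form requires an even number of degrees of freedom")
    half = statistic / 2.0
    series = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * series


chi2, df = pearson_chi_square(observed)
p_value = chi_square_p_value_even_df(chi2, df)
print(f"chi2({df}) = {chi2:.2f}, p = {p_value:.2f}")
```

With several expected cell counts below five, an exact test might be preferable in practice; the sketch only mirrors the statistic reported here.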

To roughly gauge the participants’ socioeconomic background, we asked them to report their parents’ highest level of education completed. Roughly equivalent numbers of participants in WISE and non-WISE groups reported that their mother’s highest level of education was some high school or less than high school \((n_{\mathrm{non}\text{- } \mathrm{WISE}} \!=\! 1, n_{\mathrm{WISE}} \!=\! 0)\), high school graduate \((n_{\mathrm{non}\text{- }\mathrm{WISE}} \!=\! 2, n_{\mathrm{WISE}} = 7)\), some college \((n_{\mathrm{non}\text{- }\mathrm{WISE}} = 6, n_{\mathrm{WISE}} = 3)\), college graduate \((n_{\mathrm{non}\text{- }\mathrm{WISE}} = 21, n_{\mathrm{WISE}} = 11)\), or advanced degree \((n_{\mathrm{non}\text{- }\mathrm{WISE}} = 10, n_{\mathrm{WISE}} = 3\); one non-WISE student did not report her mother’s highest education level), \(\chi ^{2}(5) = 7.28, p = .20\). Similarly, roughly the same number of WISE and non-WISE women reported that their father’s highest education level was some high school or less than high school \((n_{\mathrm{non}\text{- }\mathrm{WISE}} = 1, n_{\mathrm{WISE}} = 0)\), high school graduate \((n_{\mathrm{non}\text{- }\mathrm{WISE}} = 2, n_{\mathrm{WISE}} = 4)\), some college \((n_{\mathrm{non}\text{- }\mathrm{WISE}} = 6, n_{\mathrm{WISE}} = 7)\), college graduate \((n_{\mathrm{non}\text{- }\mathrm{WISE}} = 14, n_{\mathrm{WISE}} = 6)\), or advanced degree \((n_{\mathrm{non}\text{- }\mathrm{WISE}} = 16, n_\mathrm{WISE} = 12), \chi ^{2}(4) = 4.13, p = .39\).

Participants who were in the WISE program were statistically significantly younger than participants not in the WISE program \([M_\mathrm{WISE}= 18.17,\, SD_\mathrm{WISE}= .54;\, M_{\mathrm{Non}\text{- }\mathrm{WISE}}= 18.85,\, SD_{\mathrm{Non}\text{- }\mathrm{WISE}}= .85;\, t(67.28) = -4.09,\, p < .001]\). Because of this statistically significant difference, we included age as a covariate in ANCOVA analyses testing for differences between the WISE and non-WISE women on the variables of interest. However, including age as a covariate did not alter the pattern of findings, and so we present more parsimonious \(t\) tests in our results section below.
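The fractional degrees of freedom reported for the age comparison are consistent with a Welch (unequal-variance) \(t\) test. The sketch below shows how such a test could be recomputed from the summary statistics alone. The function name is hypothetical, and the group sizes (29 and 41) assume that no participant had a missing age.

```python
import math


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    var_of_mean1 = sd1 ** 2 / n1
    var_of_mean2 = sd2 ** 2 / n2
    t_stat = (mean1 - mean2) / math.sqrt(var_of_mean1 + var_of_mean2)
    df = (var_of_mean1 + var_of_mean2) ** 2 / (
        var_of_mean1 ** 2 / (n1 - 1) + var_of_mean2 ** 2 / (n2 - 1)
    )
    return t_stat, df


# Age summary statistics as reported; the group sizes are an assumption
# (full samples, no missing ages).
t_age, df_age = welch_t_from_summary(18.17, 0.54, 29, 18.85, 0.85, 41)
print(f"t({df_age:.2f}) = {t_age:.2f}")
```

Because the inputs are rounded to two decimals, the recomputed degrees of freedom can differ slightly from the published value.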

Participants indicated their SAT or ACT math scores, and there were no significant differences between the WISE and non-WISE women on either \([\text{SAT: } n_\mathrm{WISE} = 28, M_\mathrm{WISE} = 673.57, SD_\mathrm{WISE} = 70.30; n_{\mathrm{Non}\text{- }\mathrm{WISE}} = 40, M_{\mathrm{Non}\text{- }\mathrm{WISE}} = 679.75, SD_{\mathrm{Non}\text{- }\mathrm{WISE}} = 81.93; t(66) = -.32, p = .75; \text{ACT: } n_\mathrm{WISE} = 21, M_\mathrm{WISE} = 28.76, SD_\mathrm{WISE} = 3.65; n_{\mathrm{Non}\text{- }\mathrm{WISE}} = 28, M_{\mathrm{Non}\text{- }\mathrm{WISE}} = 29.71, SD_{\mathrm{Non}\text{- }\mathrm{WISE}} = 3.95; t(47) = -.86, p = .39]\). Similarly, WISE and non-WISE women did not show significant differences in the number of math courses that they reported taking in high school \([M_\mathrm{WISE} = 4.25, SD_\mathrm{WISE} = .80; M_{\mathrm{Non}\text{- }\mathrm{WISE}} = 4.19, SD_{\mathrm{Non}\text{- }\mathrm{WISE}} = .96; t(66) = .28, p = .78]\).
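The integer degrees of freedom for the test-score comparisons suggest a pooled-variance (Student's) \(t\) test. As a rough check, the sketch below recomputes the SAT and ACT statistics from the reported group sizes, means, and standard deviations. The function name is hypothetical, and the use of pooled variance is an inference from the degrees of freedom rather than something the text states.

```python
import math


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic with pooled variance, plus its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# WISE group first, non-WISE group second; all values as reported in the text.
t_sat, df_sat = pooled_t_from_summary(673.57, 70.30, 28, 679.75, 81.93, 40)
t_act, df_act = pooled_t_from_summary(28.76, 3.65, 21, 29.71, 3.95, 28)

print(f"SAT: t({df_sat}) = {t_sat:.2f}")
print(f"ACT: t({df_act}) = {t_act:.2f}")
```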

2.1.2 Materials

The main goal of the survey was to assess students’ perceptions of their academic environments, specifically targeting STEM student experiences that could differ between WISE and non-WISE women. The survey included several measures requiring students to rate statements from 1 (strongly disagree) to 7 (strongly agree). Messages about women in STEM (eight items; \(\upalpha = .83\)) assessed how frequently participants saw positive messages about women in STEM (e.g., “I often read articles about high-achieving women scientists and engineers.”). Living environment (nine items; \(\upalpha = .88\)) assessed participants’ attitudes toward their living environment (e.g., “I like my current living environment.”). STEM markers (one item) assessed whether students wore or carried signifiers or markers of their academic field (i.e., “I wear or carry something that identifies me as part of my field of study.”). Friends in STEM (five items; \(\upalpha = .67\)) assessed whether students felt close to their STEM classmates (e.g., “I feel close to others in my field of study.”). Peer role models in STEM (four items; \(\upalpha = .76\)) assessed whether students had supportive peer role models in STEM (e.g., “I look up to the leader of my study group as a role model.”).

Predictors of STEM motivation and performance were also measured in order to assess whether the environmental features above were related to important indicators of success for women in STEM. These included two measures of STEM-gender stereotypes. Explicit STEM-gender stereotyping was assessed using four items \((\upalpha = .79)\) adapted from Schmader et al. (2004) to reference STEM (e.g., “It is possible that men have more math and science ability than women”). Expectations of female STEM representation was assessed with two items asking participants to select the proportion of women among United States engineers (1, 4, 7, 13, 16, or 19 %) and computer systems analysts (9, 16, 23, 37, 44, or 51 %). The answers were coded such that the highest percentage received the highest score (3) and the smallest percentage received the lowest score (\(-\)3). The correct percentages at the time were not offered as answer choices; they fell between the third and fourth options (Hammond 1948). Thus, a positive score reflected expectations of more women than there currently are in STEM careers. These items were positively correlated (\(r = .27,\, p = .03\)), and so they were combined into a single measure of expectations for female STEM representation (see Footnote 1).
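The coding scheme for the representation items can be sketched as follows. Only the endpoint codes (\(-\)3 and 3) and the answer options come from the text; the intermediate codes, the omission of a zero point, and the use of a simple mean to combine the two items are assumptions made for illustration, as are all names in the code.

```python
# Answer options in percent, ordered from smallest to largest (from the text).
ENGINEER_OPTIONS = [1, 4, 7, 13, 16, 19]
ANALYST_OPTIONS = [9, 16, 23, 37, 44, 51]

# ASSUMPTION: only the endpoint codes (-3 and 3) are stated in the text.
# The intermediate codes and the missing zero point are illustrative guesses,
# chosen so that options above the true value score positive.
ASSUMED_CODES = [-3, -2, -1, 1, 2, 3]


def code_response(selected_percent, options):
    """Map a selected answer option to its assumed numeric code."""
    return ASSUMED_CODES[options.index(selected_percent)]


def representation_score(engineer_choice, analyst_choice):
    """Combine both items; ASSUMPTION: the composite is a simple mean."""
    engineer_code = code_response(engineer_choice, ENGINEER_OPTIONS)
    analyst_code = code_response(analyst_choice, ANALYST_OPTIONS)
    return (engineer_code + analyst_code) / 2


# A participant choosing the largest option on both items gets the top score.
print(representation_score(19, 51))
```

Under this assumed scheme, the third and fourth options straddle the true value, so any positive composite indicates an overestimate of women's representation.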

Explicit STEM identity was assessed using four items \((\upalpha = .81)\) adapted from the Collective Self-Esteem Scale (Luhtanen and Crocker 1992) to reference membership in a STEM field (e.g., “Being a student in science/engineering/math is an important part of my self-image”). Finally, a version of the Intrinsic Motivation Inventory (IMI; Ryan 1982) adapted for STEM courses was administered. The five IMI subscales measured interest/enjoyment, perceived competence, effort, pressure/tension, and value/importance. All five subscales demonstrated acceptable reliability (alphas ranged from .80 to .95).

2.2 Results

2.2.1 Differences in academic environment

Results revealed that women in the WISE program saw more messages about women in STEM \([M_\mathrm{WISE}= 4.7, SD_\mathrm{WISE}= 1.1; M_\mathrm{Non\text{- }WISE }= 3.8, SD_\mathrm{Non\text{- }WISE }= 1.1; t(68) = 3.61, p = .001]\), were more likely to wear or carry markers of their major \([M_\mathrm{WISE }= 4.93, SD_\mathrm{WISE }= 2.05; M_\mathrm{Non\text{- }WISE }= 3.37, SD_\mathrm{Non\text{- }WISE }= 1.53; t(67) = 3.62, p = .001]\), and had more peer STEM role models \([M_\mathrm{WISE }= 5.2, SD_\mathrm{WISE }= .99; M_\mathrm{Non\text{- }WISE }= 4.6, SD_\mathrm{Non\text{- }WISE }= .95; t(68) = 2.55, p = .01]\) compared to STEM women who were not in the WISE program. However, they did not have significantly different attitudes toward their living environment \([M_\mathrm{WISE }= 5.07, SD_\mathrm{WISE }= 1.29; M_\mathrm{Non\text{- }WISE }= 4.74, SD_\mathrm{Non\text{- }WISE }= 1.35; t(68) = 1.02, p = .31]\), nor did they have different numbers of friends in STEM fields \([M_\mathrm{WISE }= 4.74, SD_\mathrm{WISE }= .95; M_\mathrm{Non\text{- }WISE }= 4.44, SD_\mathrm{Non\text{- }WISE }= 1.14; t(68) = 1.20, p = .24]\).

2.2.2 Differences in predictors of STEM success

Women in the WISE program were more explicitly identified with STEM fields \([M_\mathrm{WISE }= 5.3, SD_\mathrm{WISE }= .94; M_\mathrm{Non\text{- }WISE }= 4.7, SD_\mathrm{Non\text{- }WISE }= .96; t(68) = 2.57, p = .01]\), compared to STEM women who were not in the WISE program. No significant group differences emerged on explicit STEM-gender stereotyping, expectations for female STEM representation, or any of the subscales of the Intrinsic Motivation Inventory (all ps \(>\) .11).

Given that WISE women were more identified with STEM than non-WISE women, it could be that the group differences regarding academic environment between the WISE and non-WISE women were actually due to STEM identification rather than the WISE program. Therefore, we re-tested the group differences regarding academic environment controlling for STEM identification. Results indicated the same patterns of statistical significance, with WISE women receiving more messages about women in STEM, being more likely to carry a STEM marker, and having more peer role models in STEM.

2.2.3 Correlations between academic environment and predictors of STEM success

All correlations can be found in Table 1. Of note, all of the academic environment measures (except for living environment) positively correlated with STEM identification. Wearing markers that identified students’ major was marginally negatively correlated with explicit stereotyping, and having friends in STEM was marginally positively correlated with expectations for female STEM representation. Furthermore, messages about women in STEM, having friends in STEM, and having peer role models were significantly correlated with several subscales of the IMI.

Table 1 Bivariate correlations, preliminary study

2.3 Discussion

Study 1 revealed three environmental factors that differentiated WISE and non-WISE women: messages about women in STEM, visible STEM markers, and peer role models in STEM. Each of these three factors is an important aspect of the WISE program, and thus might contribute to the positive outcomes experienced by WISE women (such as increased retention in STEM). The groups did not differ in terms of their perceptions of their living environment, which suggests that a residential program is not necessarily needed to provide women in STEM with a supportive environment. The groups also did not have different numbers of friends in STEM, which suggests that friendships with classmates can be found in and out of the WISE program. This study was thus able to identify where women in the WISE program do—and do not—differ from other women enrolled in STEM courses.

Study 1 was consistent with previous research showing a connection between perceptions of the environment and academic success (e.g., Edman and Brazil 2007; Gloria and Ho 2003; McKinney et al. 2006). Women’s perceptions of their academic environment correlated with known predictors of STEM outcomes. Women who received messages about women in STEM, wore or carried STEM markers, had friends in STEM, and had peer role models in STEM were more likely to be identified with their field. Additionally, several of these measures correlated with aspects of the participants’ intrinsic motivation in STEM fields, including enjoyment of their STEM courses, the value they saw in STEM courses, their perceived competence in STEM, and the pressure they felt in STEM classes, all of which are related to achievement and engagement in STEM (Gottfried et al. 2007; Simpkins et al. 2006).

It is important to keep in mind that this study utilized a quasi-experimental design, and so there was no random assignment to the WISE and non-WISE groups. Therefore, it is possible that pre-existing differences between the groups were responsible for perceived differences in academic environments. However, there were no differences between the groups in terms of their standardized testing math scores or the number of math courses they took in high school, two indicators of STEM achievement that might lead students to notice more women-in-STEM-themed events or otherwise perceive their STEM environments more positively. This supports our argument that reported differences in STEM messages, markers, and peer role models are unlikely to be driven by pre-existing group differences, and rather are an accurate reflection of the WISE and non-WISE STEM environments.

Furthermore, it is possible that Study 1 isolated the elements of the WISE program that provide a uniquely welcoming academic environment for women in STEM. This substantially extends previous work that has highlighted the importance of these kinds of intensive, residential interventions (e.g., Edwards and McKelfresh 2002; Hathaway et al. 2001; Soldner et al. 2012; Szelenyi and Inkelas 2011). Given that the data for this study were collected in the fall semester, the WISE women had only been exposed to the program for a couple of months, suggesting that the program has a swift impact on the academic environment experienced by its students. It may be the case that more differences between the WISE and non-WISE women would emerge later in the school year, particularly regarding the predictors of STEM success, such as their stereotyping and intrinsic motivation to succeed in STEM fields. However, if these aspects of the academic environment can be felt fairly quickly, then perhaps they could be translated into an intervention for women who are not enrolled in an intensive residential program. Study 2 aimed to develop and test such an intervention.

3 Study 2

Brief environmental interventions can improve female and ethnic minority students’ academic outcomes (Cohen et al. 2006; Good et al. 2003; Miyake et al. 2010). Study 2 tests whether a short-term intervention modeled after the WISE environment can improve outcomes relevant to women’s STEM participation. An intervention that strengthens outcomes such as implicit STEM identity and that targets women already in STEM may offer particular benefits. First, even undergraduate women who explicitly express strong interests in math show weaker implicit math identification than undergraduate men do (Nosek et al. 2002). Further, implicit measures may be more predictive for these women because they are relatively immune to demand characteristics. Women in STEM know that their explicitly stated identification with STEM and consciously held stereotypes have the potential to confirm self-relevant stereotypes; these explicitly considered responses are thus subject to impression management concerns (Greenwald et al. 2009). Second, recent research suggests that environmental interventions can increase implicit STEM identity more easily than they decrease implicit stereotypes. Strengthening implicit STEM identity is sufficient to improve STEM outcomes because it can buffer undergraduate women from relatively entrenched stereotypes (Stout et al. 2011).

The goal of Study 2 was to expose a new sample of non-WISE women to an intervention based on three environmental factors examined in Study 1: messages about women in STEM, visible STEM markers, and peer role models in STEM. These factors were chosen because they are what differentiated WISE and non-WISE women and because they were correlated with at least one predictor of STEM success in Study 1. The intervention incorporated exposure to messages about women in STEM by including flyers in the experimental setting advertising an event featuring female mathematicians. Participants also received a pencil with a message about female students in STEM, imitating the markers carried by WISE women and simultaneously reminding women of their peers in STEM.

The role models presented on the intervention flyer likely comprise only a fraction of the successful women in STEM that WISE women see. Therefore, the intervention supplemented exposure to messages about women in STEM by modifying a procedure used by Dasgupta and Asgari (2004) to decrease implicit gender stereotypes about women’s leadership abilities. In Dasgupta and Asgari’s study, a community sample of women in New York City who were diverse in both age and ethnicity were exposed to photos and descriptions of famous women leaders (experimental condition) or flowers (control condition). They then were given a quiz where they saw each of the photos of the women leaders (or flowers) twice with an abbreviated correct and incorrect description, and the participants had to identify the correct description. The correct answer was then provided, even if the participant chose the incorrect description.

The present intervention made two key changes to this procedure. First, rather than featuring female leaders, the quiz featured famous women in science or math and famous men in the arts or humanities, both of which are gender-counterstereotypic exemplars. Second, the quiz was briefer: whereas Dasgupta and Asgari’s participants first learned about the exemplars’ achievements, the present study’s participants merely took the quiz. Because the goal of the quiz for the present study was to supplement the exposure to messages about women in STEM provided by the experimental flyer, the shortened version of the procedure seemed sufficient. Furthermore, these two changes made our intervention similar to an earlier version of this procedure used by Dasgupta and Greenwald (2001) to influence implicit racial bias.

Another important feature of the WISE environment is that the messages it sends likely feel self-relevant for WISE women, since they knowingly joined a program explicitly focused on supporting female STEM students. Self-relevance may be important because, according to a cognitive consistency framework (Greenwald et al. 2002), beliefs about the self, one’s social group (e.g., gender), and one’s roles (e.g., participation in STEM) are interdependent. For example, a woman who has a strong association with her gender group (i.e., strong gender identity) and associates math with men more than women (i.e., strong math-gender stereotype) does not tend to associate herself with math (i.e., weak math identification; Nosek et al. 2002). This pattern of beliefs often exists at an implicit level (without intention or awareness) even more strongly than at an explicit level (Greenwald et al. 2002; Stout et al. 2011).

We expected that our short-term intervention would be more effective if it felt similarly relevant to our non-WISE participants. Messages are considered more carefully when they are self-relevant (Petty and Cacioppo 1984) and evidence suggests that implicit attitudes are subject to the same persuasive cues that influence explicit attitude change (Smith et al. 2012). Further, positive environmental messages, including role models, are only effective if participants can relate to or identify with them (Asgari et al. 2012; Cheryan et al. 2011; Stout et al. 2011). We therefore predicted that making our intervention feel more personally relevant to the participants would increase its effectiveness.

To induce self-relevance, we included a questionnaire about students’ own experiences in STEM. Answering questions about one’s own experiences directly after receiving the intervention messages might encourage participants to draw connections between themselves and the women in the messages. It might also encourage rehearsal of self-STEM associations, and implicit attitude shifts are strengthened when participants have time and motivation to rehearse the targeted association (Briñol et al. 2008). We manipulated the order of the self-relevant questionnaire and implicit measures to compare the effects of the intervention when it was made self-relevant versus not. Making the intervention feel more self-relevant was thus expected to yield more positive outcomes, particularly on implicit measures.

3.1 Pretest of the self-relevance manipulation

Eighty-two women \((M_{\mathrm{age}} = 20.0, SD_{\mathrm{age}} = 1.3)\) who had recently enrolled in introductory STEM courses completed the STEM Experience Questionnaire (which consisted of 23 questions about participants’ experiences in STEM classes, in addition to the explicit dependent measures described in Sect. 3.2.3 below) as well as a word fragment completion measure of self-activation. Participants were asked to complete 40 word fragments, 25 of which could yield either neutral words (e.g., “be,” “the,” “silence”) or words relevant to the self (e.g., “me”; six stems adapted from Eichstaedt and Silvia 2003), women (e.g., “she”; six stems adapted from Ambady et al. 2004; Nosek et al. 2002), or STEM (e.g., “science”; thirteen stems adapted from Nosek et al. 2002; IAT target words from the main study). Possible neutral word completions had similar or higher English language word frequencies than target word completions (Kučera and Francis 1967) and were equivalently likely to appear as answers in pilot testing.

The order of the questionnaire and the word fragments was counterbalanced to determine whether participants were more conscious of their self-concepts after they completed the STEM Experience Questionnaire than before. A one-way ANOVA revealed a significant effect of order on the number of self-relevant word completions, \(F(1,80) = 10.75, p = .002, d = .73\). Participants who completed the STEM Experience Questionnaire before the self-activation measure provided an average of 3.96 (\(SD = 1.35\)) self-relevant word completions, while those who had not yet completed the STEM Experience Questionnaire provided an average of 2.88 (\(SD = 1.61\)) self-relevant word completions. Order had no significant effect on the number of STEM- \((p = .16)\) or female-relevant word completions \((p = .19)\). This provided evidence that the STEM Experience Questionnaire did indeed activate students’ self-concept, and therefore it was included in the main study to use as a manipulation of self-relevance.
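The effect size for the order manipulation can be approximately recovered from the reported means and standard deviations. The sketch below computes Cohen's \(d\) and, for two groups, the corresponding \(F\) as the square of \(t\). The text does not report how the 82 participants were split across the two orders, so the equal split of 41 per group is an assumption, as are the function and variable names.

```python
import math


def cohens_d_equal_n(mean1, sd1, mean2, sd2):
    """Cohen's d using the pooled SD; this form assumes equal group sizes."""
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (mean1 - mean2) / pooled_sd


# Questionnaire-first group versus word-fragments-first group, as reported.
d = cohens_d_equal_n(3.96, 1.35, 2.88, 1.61)

# With two equal groups of size n, t = d * sqrt(n / 2) and F = t ** 2.
# ASSUMPTION: 41 participants per order; the actual split is not reported.
n_per_group = 41
f_approx = (d * math.sqrt(n_per_group / 2)) ** 2

print(f"d = {d:.2f}, approximate F(1, {2 * n_per_group - 2}) = {f_approx:.1f}")
```

A small gap between the approximate and published \(F\) would be expected if the groups were not exactly equal in size or if the reported descriptives were rounded.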

3.2 Method

3.2.1 Participants and procedure

One hundred thirty-seven female students participated in exchange for 20 dollars. Participants were recruited from first-year STEM-major courses in biology, physics, chemistry, engineering, or math. The sample consisted of 86 women who identified as Caucasian/White, 27 as Asian American/Asian/Pacific Islander, 13 as African American/Black, 2 as Latina/Hispanic, and 9 who selected “other” as their racial identity. The ages represented in the sample ranged from 17 to 34 (\(M = 19.07\), \(SD = 1.66\)). On the whole, the participants reported having well-educated parents. Only three participants reported their mother’s highest level of education completed as some high school or less and only 19 reported high school graduate, whereas 25 reported some college, 42 reported college graduate, and 47 reported advanced degree (1 participant indicated don’t know/not applicable). Only four participants reported their father’s highest level of education completed as some high school or less and only 16 reported high school graduate, whereas 22 reported some college, 34 reported college graduate, and 59 reported advanced degree (2 participants indicated don’t know/not applicable).

The study used a 2 (condition: intervention vs. control) \(\times \) 2 (self-relevance: high vs. low) factorial design. Participants were randomly assigned to condition (intervention vs. control) and self-relevance condition (high vs. low). A white female experimenter ran groups of one to eight participants, with each student seated at an individual computer station. Everyone first received the condition manipulation (either intervention or control, as described below). Then, participants in the low self-relevance condition completed implicit stereotyping and identity measures, followed by the STEM Experience Questionnaire. For these participants, the intervention was not made self-relevant. For participants in the high self-relevance condition, the STEM Experience Questionnaire preceded the implicit measures, thus making the intervention self-relevant. Finally, all participants finished with a general attitude and demographic survey, including manipulation check items.
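The design and task order described above can be sketched in a few lines. This is only an illustration: the text says assignment was random but does not describe the randomization scheme, so simple independent assignment is assumed here, and the names `assign_participants` and `task_order` are hypothetical.

```python
import random

CONDITIONS = ("intervention", "control")
SELF_RELEVANCE = ("high", "low")


def task_order(self_relevance):
    """Return the sequence of tasks for one participant.

    High self-relevance: the questionnaire precedes the implicit measures.
    Low self-relevance: the implicit measures come first.
    """
    if self_relevance == "high":
        middle = ["STEM Experience Questionnaire", "implicit measures"]
    else:
        middle = ["implicit measures", "STEM Experience Questionnaire"]
    return ["condition manipulation", *middle, "attitude and demographic survey"]


def assign_participants(participant_ids, seed=0):
    """Assign each participant to one cell of the 2 x 2 design.

    Simple independent random assignment is an ASSUMPTION of this sketch.
    """
    rng = random.Random(seed)
    plan = {}
    for pid in participant_ids:
        relevance = rng.choice(SELF_RELEVANCE)
        plan[pid] = {
            "condition": rng.choice(CONDITIONS),
            "self_relevance": relevance,
            "order": task_order(relevance),
        }
    return plan


plan = assign_participants(range(1, 9))
for pid, cell in plan.items():
    print(pid, cell["condition"], cell["self_relevance"])
```

Note that the only procedural difference between the two self-relevance cells is the position of the questionnaire relative to the implicit measures.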

3.2.2 Condition manipulation (Intervention vs. Control)

The intervention consisted of three different parts. The first part manipulated whether the participants received messages about women in STEM. One of two flyers was posted at each computer station. To ensure that the flyers would seem like a general interest announcement unrelated to the experiment, they were posted prior to the participants’ arrival and accompanied by an additional neutral flyer advertising a graduate student social gathering. The intervention flyer emphasized women’s presence in STEM by advertising a math department event featuring mostly female professors. Control participants saw a nearly identical flyer advertising an evenly mixed-gender psychology department event, along with the neutral flyer.

The second part of the intervention concerned both identifiers and peers in STEM. Participants received a custom-printed pencil (i.e., a portable STEM identifier) to use in the experiment and keep afterwards. Intervention pencils were inscribed with the following statistic: “In 2004, women earned more than half of all degrees awarded in science, math and engineering. –National Science Foundation”, a message that both identified the participants as members of their academic field and highlighted peer role models in their fields. Control participants received a “University of Michigan” pencil.

The third part of the intervention was the counterstereotypic exemplar quiz based on the work of Dasgupta and Asgari (2004). Intervention participants completed a “famous people” quiz featuring famous women in STEM fields. Twelve famous women in math and science and twelve famous men in art or humanities (counter-stereotypic fields) were presented twice each in random order. Exemplars ranged from well-known (e.g., Marie Curie, Humphrey Bogart) to obscure (e.g., Leona Woods, Fred Gipson). Participants viewed each exemplar’s name and picture and selected his/her real accomplishment from two options. All female exemplars were paired with math- or science-related options, and all men with arts or humanities (e.g., “Leona Woods is famous for a) solving a theorem in mathematics previously thought to be unsolvable, or b) helping to build the first nuclear reactor;” “Fred Gipson is famous for a) starring in two Academy Award-winning films, or b) writing classic works of fiction, including Old Yeller.”). After selecting an answer, participants always immediately received the correct answer for each question. Control participants completed an identically structured insect/flower recognition quiz (inspired by the practice categorization task in the Implicit Association Test; Greenwald et al. 1998).

3.2.3 Dependent measures

Participants completed four explicit questionnaires. Concerns about stereotyping were assessed with two items rated on 7-point Likert scales (1 = Not at all likely to 7 = Very likely; 1 = Strongly disagree to 7 = Strongly agree; e.g., “I am concerned that others will judge people of my gender as a whole based on my performance in my math/science/engineering classes”; \(\upalpha = .91\)). Explicit STEM-gender stereotyping, expectations for female STEM representation, and explicit STEM identity were assessed as in Study 1 (\(\upalpha = .80, r = .46 (p < .001)\), and \(\upalpha = .81\), respectively).

Implicit identification with STEM fields was measured using an Implicit Association Test (IAT; Greenwald et al. 1998). This test infers implicit associations by comparing response times for categorizing words when certain categories are paired (here, “sciences/humanities” and “self/other”). In order to assess global STEM identification, the “sciences” list included several different STEM disciplines (e.g., chemistry, mathematics, and engineering; see Footnote 2). If the participant was slower to categorize a word when the concepts “self” and “sciences” shared the same response key than when “other” and “sciences” shared a response key, she was said to have low implicit STEM identification. The test was scored following the standard algorithm (Greenwald et al. 2003), with higher scores indicating greater implicit STEM identification.
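To make the scoring logic concrete, here is a deliberately simplified sketch of a D-type IAT score: the difference between mean latencies in the two pairing conditions, divided by the standard deviation of all retained trials. It is not the full Greenwald et al. (2003) procedure, which also handles practice and test blocks separately, error trials, and participant-level exclusions; the function name, the argument names, and the example latencies are hypothetical.

```python
from statistics import mean, stdev


def iat_d_score(self_sciences_ms, other_sciences_ms, max_latency_ms=10_000):
    """Simplified D-type score from two lists of response latencies (ms).

    self_sciences_ms:  trials where "self" and "sciences" share a response key
    other_sciences_ms: trials where "other" and "sciences" share a response key

    Positive values mean faster responding when "self" is paired with
    "sciences", i.e. stronger implicit STEM identification.
    """
    # Drop extremely slow trials before computing means and the SD.
    paired_self = [t for t in self_sciences_ms if t <= max_latency_ms]
    paired_other = [t for t in other_sciences_ms if t <= max_latency_ms]

    # "Inclusive" SD: computed over all retained trials from both pairings.
    inclusive_sd = stdev(paired_self + paired_other)
    return (mean(paired_other) - mean(paired_self)) / inclusive_sd


# Made-up latencies for one participant, for illustration only.
fast_with_self = [610, 655, 700, 640, 590]
slow_with_other = [820, 790, 905, 860, 15_000]  # last trial is discarded
print(iat_d_score(fast_with_self, slow_with_other) > 0)  # True
```

Dividing by the participant's own latency variability is what lets scores be compared across people who differ in overall response speed.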

Another IAT measured implicit STEM-gender stereotyping using the categories “male/female” and “sciences/humanities”. Higher scores indicated greater implicit STEM-male stereotyping. Computer error prevented the collection of IAT scores from four participants.

3.3 Results

Overall, scores ranged from one to seven on explicit STEM-gender stereotyping \((M = 4.88, SD = 1.29)\), explicit STEM identification \((M = 4.79, SD = 1.25)\), and stereotyping concerns \((M = 4.77, SD = 1.69)\). Scores ranged from negative three to positive three on expectations of female STEM representation \((M = .64, SD= 1.31)\). Participants’ implicit STEM-gender stereotyping \((M = .35, SD = .35)\) differed significantly from zero, \(t(132) = 11.38, p < .001, d = 1.98\), indicating an implicit male-STEM association. Implicit STEM identification \((M = .02, SD= .42)\) did not differ significantly from zero, \(t(132) = -.45, p = .66, d = .09\), indicating that the sample as a whole did not preferentially associate themselves with either STEM or humanities fields.

Implicit STEM identification correlated significantly with implicit STEM-gender stereotyping, \(r(133) = -.31, p < .001\); women who were more implicitly identified with STEM showed less implicit stereotyping. Implicit and explicit STEM identification were also correlated, \(r(133) = .25, p < .01, d = .52\).

3.3.1 Manipulation check

Questioning during debriefing revealed that twelve participants (eight intervention, four control) failed to notice both the wording on the pencil and the flyer. These participants were therefore excluded from the following analyses (see Footnote 3).

3.3.2 Concerns about stereotyping

A 2 (condition: intervention vs. control) \(\times \) 2 (self-relevance: high vs. low) ANOVA showed a significant main effect of condition, \(F(1,121) = 4.59, p = .03, d = .39\). Participants felt less concerned about stereotyping in the intervention \((M = 4.38, SD = 1.74)\) than in the control condition \((M = 5.03, SD = 1.63)\). Neither the main effect of self-relevance \((F(1,121) = 1.44, p = .23, d = .22)\) nor the condition by self-relevance interaction \((F(1,121) = 2.60, p = .11, d = .29)\) was significant.
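The text does not say how the d values accompanying each F test were obtained. As an assumption, the sketch below applies a common conversion for an F statistic with one numerator degree of freedom and roughly equal group sizes, \(d = 2\sqrt{F}/\sqrt{df_{error}}\), and recomputes the three effect sizes reported for stereotyping concerns; `d_from_f` is an illustrative name.

```python
import math


def d_from_f(f_value, df_error):
    """Convert a 1-numerator-df F statistic to Cohen's d.

    Uses d = 2 * sqrt(F) / sqrt(df_error), which ASSUMES roughly equal
    group sizes. Whether the authors used this formula is not stated.
    """
    return 2 * math.sqrt(f_value) / math.sqrt(df_error)


effects = [("condition", 4.59), ("self-relevance", 1.44), ("interaction", 2.60)]
for label, f_value in effects:
    print(label, round(d_from_f(f_value, 121), 2))
# condition 0.39
# self-relevance 0.22
# interaction 0.29
```

All three values agree with those reported above, which suggests, but does not confirm, that this conversion was used.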

3.3.3 Expectations of female STEM representation

A 2 (condition) \(\times \) 2 (self-relevance) ANOVA showed a significant effect of condition, \(F(1,121) = 4.75, p = .03, d = .40\). Participants reported a greater expectation for female STEM representation in the intervention \((M = .87, SD= 1.31)\) than the control condition \((M = .35, SD= 1.27)\). That is, participants in the intervention condition expected a higher proportion of women in STEM careers than those in the control condition. The main effect of self-relevance was marginally significant \((F(1,121) = 2.95, p = .09, d = .31)\) such that participants in the non-self-relevant condition reported greater expectations for female STEM representation \((M = .40, SD= 1.40)\) than those in the self-relevant condition \((M = .80, SD= 1.20)\). The interaction \((F(1,121) = .04, p = .84, d = .04)\) was nonsignificant.

3.3.4 Implicit STEM identification

The 2 (condition) \(\times \) 2 (self-relevance) ANOVA revealed a significant two-way interaction, \(F(1, 117) = 6.01, p = .02, d = .45\). A simple effect analysis showed that the condition effect was significant in the high self-relevance condition, \(F(1, 120) = 7.89, p < .01, d = .51\), but not in the low self-relevance condition, \(F(1, 120) = .46, p = .50, d = .12\). In the high self-relevance condition, participants showed stronger implicit STEM identification in the intervention \((M = .14, SD = .40)\) than in the control condition \((M = -.16, SD = .42)\), whereas implicit STEM identification did not differ between conditions in the low self-relevance condition \((M_{\text{intervention}} = -.01, SD = .38; M_{\text{control}} = .06, SD = .44)\). See Fig. 1.

Fig. 1 Implicit STEM identification by condition and self-relevance, Study 2. Positive scores indicate self-STEM association is greater than self-humanities association, whereas negative scores indicate self-humanities association is greater than self-STEM association.

3.3.5 Implicit STEM-gender stereotyping

The 2 (condition) \(\times \) 2 (self-relevance) ANOVA revealed that condition \((F(1, 117) = .001, p = .97, d < .01)\), self-relevance \((F(1, 117) = 2.68, p = .11, d = .24)\), and their interaction \((F(1, 117) = .04, p = .84, d = .04)\) were nonsignificant.

3.3.6 Explicit science identification and stereotyping

Results of the 2 (condition) \(\times \) 2 (self-relevance) ANOVA revealed no significant main effects or interactions for either variable, all \(F\text{s} < .66\), \(p\text{s} > .41\).

3.4 Discussion

After identifying three aspects of positive STEM environments for women in Study 1, this study tested an intervention combining small-scale versions of these environmental aspects with an intervention used previously in other domains (Dasgupta and Asgari 2004). The results showed that female STEM students from traditional academic environments benefited from the intervention. Regardless of whether the intervention was made self-relevant, they reported fewer stereotyping concerns and perceived that women held a greater percentage of computer analyst and engineering jobs in the United States. They also showed greater implicit STEM identification when the intervention was made self-relevant.

These results offer insight into relationships between implicit associations involving the self, gender, and STEM. Although our intervention affected concerns about stereotyping and expectations for female STEM representation regardless of self-relevance, it affected implicit identity (i.e., self-STEM associations) only when participants first reflected on their STEM experiences. This suggests that the intervention needed to be made self-relevant to influence implicit beliefs, a finding that echoes evidence that students benefit more from ingroup role models when they personally identify with them (Cheryan et al. 2011; Stout et al. 2011). This finding also recalls evidence that implicit attitude changes are strengthened when participants rehearse the targeted association (Briñol et al. 2008). Activating participants’ self-concepts may have highlighted the messages’ personal relevance and motivated the rehearsal of STEM-self associations, both of which should promote changes in implicit identity. Finding the intervention self-relevant may have also encouraged more careful consideration of it, allowing its message to sink in more deeply (Petty and Cacioppo 1984; Smith et al. 2012). Educators and researchers implementing academic environment interventions should take steps to activate students’ sense of self, perhaps by encouraging students to reflect on their own STEM experiences in relation to those of female STEM role models and peers.

Unified theory may explain why we observed changes in implicit STEM identification but not implicit stereotypes (Greenwald et al. 2002). Theories of cognitive consistency, including unified theory, hold that associations between gender, STEM, and the self are interdependent (“me = female, STEM = male, therefore STEM \(\ne \) me”; Nosek et al. 2002). One would thus predict that a shift in STEM identification (self-STEM associations) would be accompanied by a shift in either female-STEM associations or self-female associations. In this case, participants may have maintained cognitive consistency by weakening their implicit gender identities rather than their STEM-gender stereotypes. Weakened gender identification in response to strengthened STEM identification may reflect woman-scientist identity interference, which is the degree to which a female scientist may feel that being a woman and being a scientist are incompatible (Settles 2004; Settles et al. 2009). Future research could incorporate measures of gender identification and woman-scientist identity interference to gain a more nuanced understanding of the influence of this intervention.

Alternatively, our intervention may have simply buffered female students’ implicit STEM identities against the otherwise detrimental impact of deep-seated implicit stereotypes. As discussed, the stereotype inoculation model argues that role models can strengthen implicit STEM identities without changing stereotypes (Stout et al. 2011). We did not require participants to rehearse counterstereotype training, so changes in implicit stereotypes may have been unlikely from the outset.

Our pattern of explicit outcomes is also predicted by the stereotype inoculation model. According to this model, messages that strengthen implicit identity should also make women seem more visible in STEM domains and thus make those fields seem less threatening to female students. In our study, environmental reminders of ingroup success made women seem more prevalent in STEM careers and reduced participants’ stereotyping concerns. In contrast, explicit identity and stereotyping were unaffected. Because the women in our sample were STEM students, they were probably motivated to assert a strong STEM identity and to reject stereotypes about ability. Yet, they might still have come into our study with expectations of being outnumbered in STEM or being judged on stereotypes that they personally rejected (Huguet and Régner 2009; Shapiro 2011). Our environmental messages highlighting fellow women in STEM increased expectations for women’s representation in STEM, which may have made it seem less likely that their performance would be the only representation of women in STEM. This could have lifted some pressure to defy the stereotype and accurately represent all women. These belief shifts could have occurred even as explicit identification and stereotyping remained unchanged.

Another reason why explicit identity and stereotypes may have remained unchanged comes from Greenwald et al. (2002). Their work demonstrated that the relationship between interconnected implicit associations is stronger than the relationship between interconnected explicit associations. In other words, implicit associations about the self may be more subject to cognitive consistency constraints than explicit associations. Because we interpret our findings from a cognitive consistency perspective, the lack of change in explicit identity and stereotyping measures is not surprising. Furthermore, evidence of change in implicit measures is a strength of this work, since they are more resistant to demand effects (Greenwald et al. 2009).

Our three-pronged intervention proved successful here, although this research has some limitations. First, all of the counterstereotypic female exemplars used in the intervention materials were White. We did not observe any participant race differences in our data, but it is possible that women of color would have reacted more strongly to the intervention if it had included exemplars who matched not only their gender but also their race. Future research may wish to manipulate the characteristics of the counterstereotypic exemplars using a more intersectional approach. Second, the present studies did not manipulate the three elements of the intervention separately, so future research should test whether fewer than three intervention elements could have similar effects.

Nevertheless, the implications of our results remain important. First, each of the three components shares the underlying message that women in STEM are normative. The intervention quiz gives examples of women in STEM, the flyer advertises female mathematicians, and the pencil explicitly states that there are many women in STEM. Therefore, although the relative contribution of each specific element was not assessed in the current study, one could infer the positive implications of simply sending normative messages about women in STEM. Second, the goal of the study was to test an environmental intervention based on the success of an existing program. The three components of the intervention represent aspects of the WISE program that could be imported into a typical classroom: learning about successful women in STEM can be smoothly incorporated into existing curricula (e.g., Rios et al. 2010), as can STEM markers (e.g., pencils or T-shirts) and messages about the number of female peers in STEM. These components thus constitute a small-scale version of a more intensive intervention program.

There is significant room for future research to build upon this intervention. For example, the present research only tested the short-term effects of this intervention, but a longer-term follow-up could offer a more complete picture of the impact of the intervention. It may be the case that the intervention (or some variation thereof) needs to be administered multiple times to produce lasting changes. Future research could directly compare the effects of this intervention to the effects of longer-term interventions such as the WISE program. Finally, this intervention should be tested for use with younger age groups to encourage the retention of women in STEM fields even earlier in the pipeline.

The present work extends past successful academic interventions by combining lab methods with real practices used by women-in-STEM programs. It adds to evidence that short-term interventions can change implicit beliefs to potentially improve academic outcomes (e.g., Dasgupta and Asgari 2004). It further verifies the effectiveness of counterstereotype exposure by demonstrating its effects on a previously unaddressed set of implicit beliefs (STEM-self associations) in a distinct population (female STEM majors). Furthermore, it tests new ways of approximating large-scale supportive academic environments in a small-scale way through its use of incidental messages and portable markers. Finally, these studies help to identify the key elements that contribute to the success of living-learning communities in encouraging women in STEM. Future research should continue to investigate interventions to increase women’s participation in STEM that can be implemented in a variety of settings.