Introduction

In an ideal world, children would be raised in a society free of gender stereotypes. Such freedom from these stereotypes would allow children to exhibit behaviors and acquire skills based solely on their personal preferences devoid of the constraints of the societal norms that surround their particular gender (Bem 1983). However, in the real world, from the moment of their birth, children are placed into either a “boy” or “girl” category (Bem 1983; Fagot and Leinbach 1993; Kimmel 2004). This seemingly fundamental physiological distinction is automatically surrounded by a system of societal expectations that determine which behaviors are appropriate for “boys” and which are appropriate for “girls” (Fagot and Leinbach 1993) and facilitate the creation and maintenance of gender role stereotypes (Ridgeway and Correll 2004). Based on data collected from United States samples, many have argued that the consequences of such stereotypes are detrimental because they are thought to contribute to the current occupational segregation (Cejka and Eagly 1999; Correll 2004; Hochschild 1989) and well-established pay gap between men’s and women’s wages (Blau et al. 2002). Eagly and colleagues (Eagly and Steffen 1984; Eagly and Diekman 2003; Eagly et al. 2000) have proposed that occupational segregation may not be the result of gender role stereotypes, but rather the cause of gender role stereotypes. More specifically, Eagly’s social role theory states that gender stereotypes are bound to social roles and reflect current occupational and societal trends (Diekman and Eagly 2000; Diekman and Goodfriend 2006; Eagly and Diekman 2003; Eagly et al. 2000). However, the majority of the research testing social role theory has examined adults’ gender role stereotypes by using explicit judgment or evaluative measures. 
Thus, the purpose of the current study was to test social role theory by examining children’s gender role stereotypes via implicit measures of their information processing of and memory for male and female occupational roles. While social role theory is not limited to the United States, the majority of the applications of and research supporting this theory have been from U.S. populations. Therefore, the research cited and conducted in the current report is drawn from U.S. samples.

Generally speaking, social role theory states that gender role stereotypes are dynamic and malleable because they emerge from role-bound activities and characteristics (Diekman and Eagly 2000; Eagly et al. 2000). As a consequence, if the distribution of men and women in stereotypical activities and occupations changes, then the gender role stereotypes surrounding these activities and occupations should reflect that change. Already, due to powerful economic and societal influences, the number of women entering the paid labor force has doubled since the 1950s (Diekman and Eagly 2000) and is steadily increasing. This drastic increase in the number of women not only entering the work force, but also pursuing more traditionally masculine career paths has resulted in more flexible attitudes and perceptions pertaining to the female gender role (Diekman and Goodfriend 2006). These perceptions now include both feminine (e.g., household responsibilities) and masculine (e.g., paid laborer) activities (Diekman and Goodfriend 2006; Hayghe 1990). Unfortunately, a similar trend is not evident for young boys and the male gender role (Diekman and Goodfriend 2006; Halpern 2000). This may be due to the fact that men have not moved into traditionally feminine occupations at nearly the rate that women have moved into traditionally masculine occupations (Diekman and Eagly 2000; Diekman and Goodfriend 2006). Moreover, young boys are also not typically encouraged (or required) to engage in more feminine activities (e.g., babysitting), classes and majors (e.g., home economics, creative writing, child development), or occupations (e.g., nurse, homemaker, elementary school teacher; Eccles et al. 1999).

Are these trends reflected in our gender role stereotypes? In a series of experiments, Diekman and Eagly (2000) found that adults’ evaluations and judgments of men’s and women’s participation in both masculine and feminine activities, behaviors, or occupations were much more reflective of current trends than past trends. More specifically, the findings revealed that masculine and feminine gender roles appear to be converging for women, but not for men. Diekman and Eagly (2000) contended that this increased flexibility for the female gender role compared to the male gender role provides additional evidence for their claim that current sociological trends maintain or change gender role stereotypes and not the other way around. Although compelling, most of the research guided by social role theory has focused on adults, leaving open the question of whether social role theory also applies to children’s gender role stereotypes.

Research has shown that as early as 2 years of age, children are aware of and affected by gender role stereotypes (Bauer 1993; Ruble et al. 2006; Martin et al. 2002). During the early years (e.g., 2–6 years) as children are acquiring gender role knowledge (e.g., “Who typically becomes a doctor and/or who typically wears pink?”), they can be fairly rigid in their application of gender role stereotypes (e.g., “Can a man grow up to be a nurse?”) and harsh in their evaluations of gender norm violations (e.g., “Should a man be a nurse?”; Levy et al. 2000; Levy et al. 1995; Ruble et al. 2006; Signorella et al. 1993; see Signorella and Liben 1985 for a discussion regarding important distinctions between children’s knowledge, attitudes, and evaluations of gender stereotypes). Bigler and Liben (1992) posited that this rigidity may ultimately be the result of children’s difficulty in sorting people into multiple categories simultaneously (e.g., woman + masculine occupation). However, around age seven, children begin to acquire more complex categorization skills (Bigler and Liben 1992) and often exhibit a phase-like shift from rigidity to flexibility in their adherence to gender role stereotypes on some measures (e.g., “can they” questions; Blakemore 2003; Levy et al. 1995; Ruble et al. 2006). Yet on other measures, particularly those that assess more evaluative or personal reactions to gender norm violations (e.g., “should they” questions), trends are less clear (Blakemore 2003; Carter and McCloskey 1983–1984; Fagot 1977, 1993; Levy et al. 1995; Ruble et al. 2006; Stoddard and Turiel 1985). Several studies have found that, compared to younger children, children between the ages of 7 and 9 years are consistently more negative in their evaluations of cross-gender violations, particularly for males engaging in counterstereotypic behaviors (Blakemore 2003; Garrett et al. 1977; Ruble et al. 2006; but see also Signorella et al. 1993).
For instance, Blakemore (2003) found that 7- to 10-year-olds were particularly negative in their evaluations of male nurses relative to female doctors.

It is important to note that much of the research exploring gender role stereotypes with children has utilized explicit judgment or evaluative measures (Bigler and Liben 1990; Cann 1993; Ruble et al. 2006; Trice and Rush 1995). Cann (1993) has long argued that one issue to consider when using these types of explicit measures is that they may lend themselves to more reactive and biased responding because children may be more consciously aware of socially appropriate responses. Few studies have used more implicit measures when investigating children’s gender role schemas (Bennett et al. 2000; Kail and Levine 1976; Koblinsky et al. 1978; Liben and Signorella 1980; Martin and Halverson 1981, 1983; Most et al. 2007; Nadelman 1974). For instance, in a pioneering study, Koblinsky and colleagues (1978) tested children using an implicit memory task where participants first read a story about a boy and a girl who exhibited both stereotypic and counterstereotypic behaviors and traits. The findings revealed that children remembered significantly fewer feminine traits from the story when those traits were associated with the male character than when associated with the female character. Similarly, Liben and Signorella (1980) found that when highly stereotyped 7- and 8-year-olds were shown pictures of males and females engaging in counterstereotypic activities or occupations, they recognized significantly fewer pictures of male actors violating gender norms than female actors. Liben and Signorella (1980) concluded that children perceived male gender role violations as more incongruent with gender role norms than female gender role violations. Taken together, these findings suggest that implicit measures highlight the complexities of children’s differential stereotypic views for male and female gender roles.

However, many have proposed that using information processing/reaction time tasks to tap into individuals’ gender role stereotypes might provide additional insights into the complexity by which stereotypes are cognitively accessed and maintained (Cann 1993; Devine 1989, 2001; Ruble et al. 2006). The ability to rapidly encode, access, and utilize stored information is often referred to as knowledge base access (Fischer et al. 1994; Kee and Guttentag 1994). Knowledge base accessibility, as it relates to gender stereotypes, has been studied in adults (Banaji and Hardin 1996; Blair and Banaji 1996; Most et al. 2007) and, to a much lesser extent, in children (Most et al. 2007). Banaji and Hardin (1996) tested college students’ knowledge base access using a stereotype-priming task that relied on the premise that when presented with two words consecutively, the processing of the second word will be influenced by how closely it relates to the first word (Macrae et al. 2002; Martin 1993; Sherman et al. 1997). Participants were first presented with a gender-related noun (e.g., “engineer” or “nurse”) and asked to respond as quickly as possible as to whether a subsequent proper name was male (e.g., Frank) or female (e.g., Mary). Participants made significantly faster judgments if the noun-name pair was congruent (e.g., engineer - Frank), rather than incongruent (e.g., engineer - Mary), with gender role stereotypes. Likewise, Most et al. (2007) reported similar findings using an auditory Stroop task with both adults and children (i.e., 8- to 9-year-olds). In this study, participants heard both male and female voices saying stereotypically masculine or feminine names (e.g., “Amy”) or words (e.g., “tough”) and were required to rapidly respond as to whether the voice was that of a man or a woman.
The findings revealed that both adults and children were significantly faster to respond and more accurate when the voice and name/word were congruent (e.g., male voice saying “pirate”) than when they were incongruent (e.g., female voice saying “football”).

Similarly, Kee and Guttentag (1994) used an information-processing task that incorporated an implicit memory measure with 8- and 9-year-old children to explore knowledge base accessibility and memory of non-gender related information. In this study, children were orally presented with both congruent (e.g., “cowboy-ranch”) and incongruent (e.g., “frog-chair”) noun pairs (see Rohwer et al. 1982) and asked to silently create a sentence incorporating both nouns and orally produce it as fast as possible. Response latency for sentence generation, the children’s strategy use, and subsequent recall of the noun pairs were recorded. To explore strategy use, the researchers characterized an elaborative strategy as a sentence that had a direct interaction between the noun pairs (e.g., “The fish swam through the seaweed”). A sentence was classified as an associative strategy if the sentence did not interactively incorporate both nouns (e.g., “Fish and seaweed are in the ocean”). In line with the previous research (Banaji and Hardin 1996; Most et al. 2007), Kee and Guttentag (1994) found that children (1) were significantly faster at creating a sentence, (2) used more elaborative strategies, and (3) remembered significantly more pairs when they were presented with congruent noun pairs (e.g., “bush-garden”) than when they were presented with incongruent noun pairs (e.g., “glass-elbow”).

Taking these findings into consideration, we decided to use a modified version of the Kee and Guttentag (1994) and Banaji and Hardin (1996) paradigms in order to test social role theory with children. More specifically, we sought to implicitly measure children’s knowledge base access and memory for occupational stereotypes to determine whether these stereotypes were more rigid for the male gender role when compared to the female gender role, as social role theory would predict (Eagly and Steffen 1984). Thus, children were presented with both stereotypic (e.g., “Mark-Dentist/Allison-Librarian”) and counterstereotypic (e.g., “Henry-Nurse/Patricia-Janitor”) name-occupation pairs and asked to create sentences incorporating both the name and occupation as fast as they could. They were also later asked to recall these pairings. Response latencies for sentence generation, the number of sentences children omitted, and whether or not the sentences incorporated aspects of the occupation (e.g., elaborative versus associative strategies; Kee and Guttentag 1994) were all used as measures of knowledge base access. Children’s overall memories for the name-occupation pairings as well as the types of errors made during recall were also examined.

We decided to test children between the ages of 8–9 years (i.e., third graders) because numerous studies have shown that during this particular period in gender development, children (1) have a well-established knowledge base regarding occupational stereotypes (e.g., who typically is a doctor?), (2) are cognitively advanced and flexible enough to understand that both genders can violate gender role norms (e.g., Can a woman be a doctor?), yet (3) often have differential evaluations of male and female gender role norm violations (e.g., Should a man be a nurse?), with more negative reactions to males engaging in cross-gender behaviors or occupations compared to females (Blakemore 2003; Garrett et al. 1977; Helwig 1998; Liben and Signorella 1980; Most et al. 2007; Ruble et al. 2006).

Based on the previous research and in conjunction with social role theory, several hypotheses were made. The first hypothesis was that children would take significantly longer to create sentences for the counterstereotypic male pairings (e.g., Henry-Nurse) relative to the stereotypic male pairings (e.g., Mark-Dentist). The second hypothesis was that children would have significantly more sentences omitted (i.e., child could not come up with a sentence) and sentences that did not incorporate the relevant aspects of the occupations (i.e., more associative strategies; Kee and Guttentag 1994) for the male counterstereotypic pairs relative to male stereotypic pairings.

Predictions regarding the relationship between knowledge base accessibility (i.e., response time) and memory are often quite complicated with stereotypic and counterstereotypic gender-related information because numerous factors come into play (Sherman et al. 1997; Stangor 1988; Stangor and Ruble 1989; Stangor and McMillian 1992). For example, previous research has found that when counterstereotypic or incongruent pairings take longer to process, those pairings are less likely to be remembered relative to the stereotypic or congruent pairs (Kee and Guttentag 1994; Liben and Signorella 1980; Martin and Halverson 1981). However, based on their meta-analysis, Stangor and McMillian (1992) report that when (1) the procedure includes a distracter task between the stimulus presentation and the memory task and (2) the memory task requires recall as opposed to recognition, the likelihood of counterstereotypic information being correctly recalled actually increases. Given the design of the current study and the utilization of a distracter task immediately before the recall test, our third hypothesis was that the male counterstereotypic pairings (i.e., male names-feminine occupations) would be correctly recalled more often than the other categories (i.e., male stereotypic, female stereotypic, and female counterstereotypic). Our fourth and final hypothesis was that children would make significantly more recall errors that maintained gender stereotype congruency (e.g., Henry - Doctor) than errors that reflected gender role incongruency (e.g., Henry - Nurse).

Method

Participants

Fifty-seven third-grade students (25 boys, 32 girls) between the ages of 8 and 9 years participated in the study in the United States. The children were recruited from a public elementary school that serviced middle- and low-income communities in southern California. The sample consisted of children from a variety of racial and ethnic groups. The children volunteered to participate in the study and were not compensated. In each third-grade classroom, the children were given the option of participating in the “Memory Game.” With the exception of two absent students and one student who chose not to participate, all of the remaining third graders voluntarily agreed to participate. Parental informed consent was not required because the Institutional Review Board (IRB) deemed the procedure used in the current study exempt, given that the school had incorporated the procedure into its ongoing Learning Enhancement and Assessment program. As a result of this exemption, no identifying information (e.g., name, race, ethnicity) was obtained from the child participants. Participants were (1) given the option of not participating, (2) told that they would not be rewarded or punished for their decision, and (3) told they could stop at any time with no penalty.

Materials and Stimuli

A standard voice recorder was used to audiotape each experimental session for verification of the procedure, subsequent response time coding, and transcription of the children’s created sentences. A stopwatch was used during the experimental procedure to time the allocated 15-second interval from the presentation of the stimuli by the experimenter to the sentence creation by the child. This stopwatch was also used to measure response times for sentence generation from the previously recorded audiotapes.

Sentence Generation Task

The stimuli for the sentence generation task were two lists. Each list comprised 20 names (10 typical male, 10 typical female) and 20 occupations (10 stereotypically masculine, 10 stereotypically feminine). Each name was paired with an occupation that was either consistent with a gender stereotype (e.g., Henry-Doctor) or inconsistent with a gender stereotype (e.g., Henry-Nurse). Several of the typical male or female names used were taken from previous work done by Blair and Banaji (1996). Each male name had a female name counterpart that began with the same letter (e.g., Andrew/Alison).

The stereotypical masculine and feminine occupations were derived from The United States Department of Labor’s 2000 online listings of nontraditional occupations for women in 1999 (U.S. Department of Labor 1999) and from the listing of the 20 leading occupations of employed women in 1996 (U.S. Department of Labor 1996). See Table 1 for the traditional male and female occupations used as stimuli. An independent sample of approximately 25 third-grade students verified the names and occupations as being typical “boy” or “girl” names and occupations. Any name or occupation that was classified as both typically male and typically female was eliminated from the set of stimuli.

Table 1 Lists of name-occupation pairings.

Lists of Name-Occupation Stimuli

The two lists of name-occupation pairings were created with the constraint that no more than 2 male or 2 female names could be presented consecutively. The presentation order of the names was held constant for both lists. Within a list, an occupation was paired with a corresponding name with the constraint that no more than two consecutive pairings were either consistent or inconsistent with a gender stereotype. If a particular name was paired with an occupation that was consistent with a gender stereotype on List A, that name was paired with an occupation that was inconsistent with a gender stereotype on List B. This procedure was used to counterbalance each name-occupation pairing and to minimize potential order effects. Both lists contained five pairings of a male name and masculine occupation (hereafter, Mm), five pairings of a male name and a feminine occupation (hereafter, Mf), five pairings of a female name and a feminine occupation (hereafter, Ff), and five pairings of a female name and a masculine occupation (hereafter, Fm; see Table 1).
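The ordering constraints described above can be checked mechanically. The following is a minimal sketch, assuming each pairing is represented as a (name gender, occupation gender) tuple. The function names, the tuple representation, and the example ordering are hypothetical illustrations and do not reproduce the study's actual lists, which appear in Table 1.

```python
from itertools import groupby


def max_run_length(values):
    """Return the length of the longest run of identical consecutive values."""
    return max((len(list(group)) for _, group in groupby(values)), default=0)


def satisfies_constraints(pairings):
    """Check the two ordering constraints for one list.

    pairings: list of (name_gender, occupation_gender) tuples, each 'M' or 'F'.
    Constraint 1: no more than 2 male or 2 female names in a row.
    Constraint 2: no more than 2 consecutive pairings that are all consistent
    or all inconsistent with a gender stereotype.
    """
    name_genders = [name for name, _ in pairings]
    # A pairing is stereotype-consistent when name and occupation genders match.
    consistency = [name == occupation for name, occupation in pairings]
    return max_run_length(name_genders) <= 2 and max_run_length(consistency) <= 2


# Hypothetical ordering for illustration only (5 each of Mm, Fm, Mf, Ff).
example_list = [("M", "M"), ("F", "M"), ("M", "F"), ("F", "F")] * 5
is_valid = satisfies_constraints(example_list)
```

A check of this kind would be run once per list; counterbalancing across List A and List B would additionally require that each name's consistency flag be reversed between the two lists.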

The name-occupation pairings were orally presented with the modifier “the” to eliminate the possibility that the children would create a sentence that incorporated both words but did not activate the gender stereotype they may have for the pairing. For instance, when presented with the pair Henry-Nurse without the word “the”, a child may have created a sentence like, “Henry knows a nurse” instead of “Henry the nurse helps people.”

Distracter Task

After the sentence generation task, the children were asked to sort twenty-five white 3 × 5 index cards numbered from 10 to 50 into two piles: one high (above 30) and one low (below 30). For a detailed description of this procedure see Perez and Kee (2000).

Cued-recall Test

The cued-recall test was created using the same 20 names presented during the sentence generation task. The presentation order of the names for the cued-recall test was constructed randomly with the stipulation that no more than three typical male or female names were presented consecutively. This was done to avoid a response bias during recall.

Procedure

Each child was tested individually by the same female experimenter, assigned a participant number, and randomly assigned to either List A (N = 27) or List B (N = 30). At the beginning of each session, the child was told that he or she would be playing three different games one right after the other: (1) the “make up a sentence” game (i.e., sentence generation task), (2) the “card sorting” game (i.e., distracter task), and (3) the “memory game” (i.e., cued-recall test). At this point the child was told that he or she could choose not to play the game or could stop at any time if he or she did not want to play anymore.

To familiarize the children with the experimental procedure, four practice trials were administered. The first practice trial presented the children with a visual aid (i.e., a sheet of paper with the practice phrase on it). The remaining three practice trials were administered without the visual aid to familiarize the participant with the oral testing conditions. The practice trials were word pairs similar to the testing condition, but conceptually unrelated to the stimuli (e.g., Sparky the dog). The children were instructed to think of a sentence about Sparky the dog in their heads and to produce it orally to the experimenter as soon as they created it. After completion of the practice trials, the child was informed that items from the “make up a sentence” game would be part of the memory game and was told to try to make up sentences that he or she would remember later. The children were reminded of the instructions and asked if they had any questions. Next, the children were orally presented with the first stimulus pair from either List A or List B. The children were positively reinforced regardless of their responses after 5 name-occupation pairs were presented. If the child could not create a sentence within 15 seconds after the experimenter finished orally presenting the stimulus pair (i.e., cue offset), the next pair was presented and the experimenter positively encouraged the participant to keep going.

After the 20 name-occupation pairs were presented during the sentence generation task, the card sorting game (i.e., the distracter task) was administered. The children were asked to sort 25 two-digit numbers into two piles, one high (above 30) and one low (below 30), as fast as they could (see Perez and Kee 2000) with no time constraint. Once the child completed this task, the cued-recall test (i.e., the memory game) was administered. The administration of this cued-recall test was participant paced. Each child was presented with the 20 names presented during the sentence generation task. The child was orally given a name (e.g., Henry) and was asked to recall the corresponding occupation that was presented during the sentence generation task. The entire procedure took approximately 20 min to complete.

Data Coding

Response Latency for Sentence Generation

Response time (in hundredths of a second) was measured from cue offset (i.e., completion of the stimulus phrase by the experimenter) to sentence generation (i.e., the first word after the presented phrase given by the participant). If the child simply repeated the phrase (e.g., “Henry the Nurse”) and paused, the response time was coded at the beginning of the first word generated after the phrase. Two different coders measured the time between cue offset and sentence generation. The average of both raters’ response time measures was used for the analyses. Inter-rater reliability (using an intraclass correlation) was .947.

Sentence Omissions

Sentence omissions were recorded if a child forgot one of the items from the presented pair, could not generate the sentence within the allocated 15-second period, or indicated that he or she could not think of a sentence. The percentage of sentence omissions for each category was calculated by tallying the total number of omissions for each category (i.e., Mm, Mf, Ff, Fm) and dividing by the total number possible (i.e., 20).
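The percentage calculation described above can be illustrated with a short sketch. It assumes that one child's omissions are recorded as a list of category labels, one per omitted sentence; the function name, the data representation, and the sample values are hypothetical and are not data from the study. The denominator of 20 follows the description in the text.

```python
from collections import Counter

CATEGORIES = ("Mm", "Mf", "Ff", "Fm")
TOTAL_PAIRS = 20  # "total number possible" as stated in the text


def omission_percentages(omitted_categories):
    """Return the percentage of omissions per category for one child.

    omitted_categories: list of category labels, one per omitted sentence.
    """
    counts = Counter(omitted_categories)
    return {category: 100.0 * counts[category] / TOTAL_PAIRS
            for category in CATEGORIES}


# Hypothetical child who omitted two Mf sentences and one Ff sentence.
example = omission_percentages(["Mf", "Mf", "Ff"])
```

The same arithmetic would apply to any per-category tally expressed as a percentage of the 20 presented pairs.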

Analysis of Created Sentences

A sentence was coded as Occupation Incorporated when the child included aspects or elements of the occupation into the sentence with the presented male or female name. An example of an Occupation Incorporated sentence is “Henry the nurse takes care of sick people.” A sentence was coded as Occupation Unincorporated if the child failed to include aspects of the occupation into the sentence with the presented name or if the child created or altered a sentence to fit a gender stereotype. An example of an Occupation Unincorporated sentence is “Henry the nurse is nice” or “Henry the nurse is a man.” Two raters coded each of the sentences. The interrater reliability for the raters was Kappa = .992 (p < .001). The percentage of unincorporated sentences was calculated by dividing the number of sentences in each category that did not incorporate aspects of the occupation by the total number of sentences created by the child. The total number of sentences created varied among children as not all of the children created all 20 sentences. See the Appendix for additional examples of sentences that were created by the children and coded as Occupation Incorporated or Occupation Unincorporated.

Cued-recall Test

The number of correctly recalled occupations was used as an estimate of the children’s memory of the previously presented name-occupation pairs. A percentage was calculated by dividing the total number of correctly recalled occupations within each category (Mm, Mf, Ff, Fm) presented during the sentence generation task by the total number possible (i.e., 20).

Recall Errors

A recall error was coded as Gender Congruent if the child incorrectly recalled an occupation that was consistent with the gender role stereotype of the name presented. For example, if the child was presented with “Henry the Nurse” during the sentence generation task and incorrectly recalled a masculine occupation (e.g., doctor), that error was considered to be Gender Congruent. A recall error was coded as Gender Incongruent if the child incorrectly recalled an occupation that was incongruent with a gender role stereotype of the name presented.

Results

For purposes of analysis, all percentage scores were arcsine transformed to normalize the data (Neter et al. 1996). For ease of interpretation, the original percentage scores are reported for all arcsine-transformed measures. An initial analysis was conducted to determine whether there were differences across all of the measures as a function of gender. A 2 (Gender: boys vs. girls) by 2 (Name: male vs. female) by 2 (Occupation: masculine vs. feminine) by 5 (Dependent Measures: sentence response latency, sentence omissions, unincorporated sentences, correct recall, type of recall errors) mixed-model multivariate analysis of variance (MANOVA) was conducted. Gender was a between-subjects variable while Name, Occupation, and the Dependent Measures variables were within-subjects variables. The analysis failed to yield a significant main effect of Gender, F (1, 51) = .36, p = .55, or significant interactions with Gender, all ps > .05. See the Appendix for a breakdown of the means and standard deviations by gender for each dependent variable.
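The arcsine transformation mentioned above can be sketched as follows. This is an illustrative sketch only, assuming the common variance-stabilizing form arcsin(sqrt(p)) applied to proportions; the exact variant used in the analyses is not specified in the text, and the function name and sample values are hypothetical.

```python
import math


def arcsine_transform(proportion):
    """Return arcsin(sqrt(p)) in radians for a proportion p in [0, 1].

    Assumption: the common arcsine square-root form; the text does not
    state which variant of the transformation was used.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    return math.asin(math.sqrt(proportion))


# Hypothetical proportions, not data from the study.
example_proportions = [0.0, 0.05, 0.25, 0.5, 1.0]
transformed = [arcsine_transform(p) for p in example_proportions]
```

Percentage scores would first be divided by 100 to obtain proportions. The transformation stretches values near 0 and 1, which is why it is often applied to proportions that cluster near the floor, as low omission rates do.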

Hypothesis 1 Children will take significantly longer to create sentences for the counterstereotypic male pairings (e.g., Henry-Nurse) relative to the stereotypic male pairings (e.g., Mark-Dentist).

To test this hypothesis, a 2 (Gender: boys vs. girls) by 2 (Name: male vs. female) by 2 (Occupation: masculine vs. feminine) mixed-model analysis of variance (ANOVA) was conducted on the children’s reaction times (in seconds) during the sentence generation task. Gender was a between-subjects variable while Name and Occupation were analyzed as within-subjects variables. The analysis revealed a significant Name by Occupation interaction, F (1, 55) = 4.16, p < .05. Planned comparisons revealed that, as predicted, children yielded longer response times for the counterstereotypic male pairings (Mf) than for the stereotypic male pairings (Mm), F (1, 55) = 4.23, p < .05. Children also yielded significantly longer response times for the Mf pairs compared to both of the female pairings, all Fs > 1.0, all ps < .05. See Fig. 1 for mean response times and standard errors for each category. There was also a subsumed main effect of Name, F (1, 55) = 4.18, p < .05, indicating that the children took significantly longer to create a sentence when presented with a male name (M = 3.3 s, SE = .28 s) than when presented with a female name (M = 2.9 s, SE = .23 s). This was due to the Mf category. No other main effects or interactions were significant.

Fig. 1 Mean response latency in seconds (+Standard Error) of sentence generation as a function of the name-occupation pairings.

Hypothesis 2 Children will omit significantly more sentences and create sentences that do not incorporate the relevant aspects of occupation for the counterstereotypic Mf pairs relative to the stereotypic Mm pairs.

Sentence Omissions. A 2 (Gender) by 2 (Name) by 2 (Occupation) mixed-model ANOVA was conducted on the proportion of sentences children omitted during the sentence generation task. A significant Name by Occupation interaction emerged, F (1, 55) = 6.70, p < .05. Planned comparisons revealed that, as predicted, the counterstereotypic Mf pairings were omitted significantly more often than the stereotypic Mm pairings, F (1, 56) = 4.68, p < .05. Moreover, the Mm category was least likely to be omitted from the sentence generation task compared to all of the other categories, all Fs > 1.0, all ps < .05. See Fig. 2 for the mean proportion of sentences omitted in each category. A significant main effect of Occupation revealed that the feminine occupations overall were more likely to be omitted (M = 10.4%, SE = .01%) than masculine occupations (M = 6.3%, SE = .01%), F (1, 55) = 14.60, p < .001. This was due to the fact that sentences in the Mm category were significantly less likely to be omitted overall. The analyses did not yield any other main effects or interactions.

Fig. 2

Mean percentages (+Standard Error) of sentences omitted as a function of the name-occupation pairings.

Analysis of Created Sentences

A 2 (Gender) by 2 (Name) by 2 (Occupation) mixed-model ANOVA was conducted on the proportion of sentences that failed to incorporate relevant aspects of the occupation. An average of 12% of the children’s created sentences were coded as Occupation Unincorporated. The analysis did not yield a significant Name by Occupation interaction, but did suggest a trend, F (1, 55) = 2.51, p = .12. In line with the response time findings, planned comparisons indicated that the counterstereotypic Mf pairings had significantly more Occupation Unincorporated sentences than both the Mm and Fm pairings, all Fs > 1.0, ps < .05. See Fig. 3 for the mean percentages and standard errors of unincorporated sentences in each category. No other main effects or interactions yielded significant results.

Fig. 3

Mean percentages (+Standard Error) of Occupation Unincorporated sentences as a function of the name-occupation pairings.

Hypothesis 3 Children will correctly recall significantly more male counterstereotypic pairings (i.e., male names-feminine occupations) than the other categories.

A preliminary analysis was conducted to ensure that the children’s recall of the name-occupation pairings did not vary as a function of which list they received (i.e., List A or List B) during the sentence generation task. A 2 (List: A vs. B) by 2 (Gender: boys vs. girls) by 2 (Name) by 2 (Occupation) mixed-model ANOVA was conducted on the proportion of correctly recalled name-occupation pairings, with List and Gender treated as between-subjects variables and Name and Occupation treated as repeated measures. The analysis did not yield a significant main effect of List, F (1, 53) = .266, p = .61, or any significant interactions, all ps > .10.

The subsequent 2 (Gender) by 2 (Name) by 2 (Occupation) mixed-model ANOVA conducted on the proportion of correctly recalled pairs revealed a significant main effect of Name, F (1, 55) = 12.50, p < .001, indicating that more occupations were correctly recalled when they had been paired with a male name (M = 10.5%, SE = 1.5%) than when they had been paired with a female name (M = 5.2%, SE = 1.0%). Planned comparisons revealed that children remembered significantly more counterstereotypic Mf pairs than either Fm pairs, F (1, 55) = 4.38, p < .05, or Ff pairs, F (1, 55) = 4.14, p < .05. Figure 4 depicts the mean percentages of correctly recalled pairs within each of the four categories. No other main effects or interactions yielded significant findings.

Fig. 4

Mean percentages (+Standard Error) of correctly recalled occupations as a function of the name-occupation pairings.

Hypothesis 4 Children will make significantly more recall errors that maintain gender stereotype congruency (e.g., Henry - Doctor) than gender role incongruency (e.g., Henry - Nurse).

A 2 (Gender) by 2 (Type of Error: Stereotype Congruent or Incongruent) mixed-model ANOVA was conducted on the number of errors each child made during the cued-recall test. Gender was a between-subjects variable, whereas Type of Error was a within-subjects variable. Children made significantly more stereotype congruent errors (M = 5.6 errors, SE = .32) than incongruent errors (M = 3.4 errors, SE = .27), as evidenced by a significant main effect of Type of Error, F (1, 55) = 29.42, p < .001. Further inspection of the differences between the types of errors within each category revealed that children made significantly more congruent errors than incongruent errors for three of the categories (Mm, Mf, Fm), all Fs > 1.0, ps < .01, but not for the female name-feminine occupation pairings (Ff), ns. No other main effects or interactions emerged as significant.
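To make the congruent/incongruent distinction concrete, the following is a minimal sketch of how a single cued-recall response could be coded. It assumes that an error counts as stereotype congruent when the wrongly recalled occupation matches the gender of the cue name (e.g., Henry - Doctor) and as incongruent otherwise (e.g., Henry - Nurse). The lookup tables contain only names and occupations quoted in this report, and the function names are hypothetical. The authors' actual coding scheme, including how altered names or unlisted occupations were handled, is not specified here and may differ.

```python
# Gender typing limited to examples quoted in this report (not the full stimulus lists).
NAME_GENDER = {
    "Henry": "M", "Mark": "M", "James": "M",
    "Patricia": "F", "Julie": "F", "Debbie": "F",
}
OCCUPATION_TYPE = {
    "doctor": "m", "police officer": "m", "truck driver": "m",
    "nurse": "f", "secretary": "f", "babysitter": "f",
}


def pairing_category(name: str, occupation: str) -> str:
    """Return the pairing label used in the report: Mm, Mf, Fm, or Ff."""
    return NAME_GENDER[name] + OCCUPATION_TYPE[occupation.lower()]


def classify_recall(name: str, presented: str, recalled: str) -> str:
    """Code one cued-recall response (hypothetical scheme, see assumptions above)."""
    if recalled.lower() == presented.lower():
        return "correct"
    # An error is congruent when the recalled occupation matches the name's gender.
    if pairing_category(name, recalled) in ("Mm", "Ff"):
        return "congruent error"
    return "incongruent error"


print(classify_recall("Henry", "nurse", "doctor"))
print(pairing_category("Debbie", "Truck Driver"))
```

Under this scheme, a child who was shown "Henry the Nurse" and later recalled "doctor" would be coded as making a stereotype congruent error.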

Discussion

The purpose of the current study was to test social role theory by exploring children’s occupational stereotypes using implicit information processing and memory measures. We were specifically interested in whether children’s gender role stereotypes were less restrictive for female occupational roles compared to male occupational roles. We presented eight- and nine-year-olds with name-occupation pairings that were either stereotypic (e.g., Patricia-Nurse) or counterstereotypic (e.g., Mark-Secretary) and measured how long it took for them to create a sentence for each pairing. After this sentence generation task, the children’s memories for these pairings were tested.

The findings from the current study confirmed our predictions and provided additional support for Eagly’s social role theory. First, the results showed that children were equally efficient at processing information pertaining to women participating in either stereotypically feminine occupations (e.g., nurse) or stereotypically masculine occupations (e.g., doctor). This efficiency highlights that children’s occupational stereotypes for women are organized in such a way that both stereotypic and counterstereotypic occupations are readily accessible and include both masculine and feminine occupations (Diekman and Eagly 2000). In fact, children were more efficient processing the counterstereotypic Female name-masculine occupation (Fm) pairings relative to the counterstereotypic Male name-feminine occupation (Mf) pairings, suggesting that accessing counterstereotypic information pertaining to female occupational roles was easier than accessing counterstereotypic information pertaining to male occupational roles. The findings also revealed that, as predicted, children took significantly longer to process and create sentences for the Mf pairings compared to the stereotypic male name-masculine occupation (Mm) pairings. Children’s difficulty in processing the Mf pairs suggests that children clearly delineate stereotypic versus counterstereotypic occupational choices for men and struggle cognitively when required to access and process information outside of those predefined gender role boundaries.

Participants’ difficulty in processing the Mf pairs also confirmed our second prediction that children would (a) be more likely to omit sentences for the counterstereotypic Mf pairs than for the stereotypic Mm pairs and (b) use significantly fewer elaborative strategies (i.e., incorporate relevant aspects of the occupations into the sentence) when they did create sentences with the Mf pairings. Moreover, children’s difficulty in processing information pertaining to deviations from male occupational stereotypes was also apparent in the details of their created sentences. For example, during sentence generation, children were more likely to alter or adjust the male name-feminine occupation (Mf) pairings to be congruent with their gender role stereotypes. Liben and colleagues (Liben et al. 2001, 2002; Liben and Signorella 1980) similarly found that children in their studies would often alter information that was incongruent with their gender stereotypes so that it became congruent. In the current study, participants often altered the first name or pronoun to fit with their gender stereotypes or completely disregarded the occupation when given a counterstereotypic Mf phrase. For instance, when given the phrase, “James the Babysitter”, one child created the sentence, “James the babysitter likes babysitting because she likes kids.” Another interesting pattern was when a child would simply add a masculine occupation to a male name, as in “Mark the secretary is also a principal,” “Henry the nurse is a doctor, too,” or “Henry the nurse is a children’s doctor,” ignoring the feminine occupation altogether. This was also evident, but to a much lesser degree, with the Fm pairings.
When given the phrase “Julie the Police Officer,” one child created the sentence, “Julian the police officer fights crime.” Another child, when given the phrase “Debbie the Truck Driver,” created the sentence, “Derek the truck driver made a soccer field.” This sentence is particularly interesting because not only did the child alter the female name to make it male, but he/she also appears to have struggled with what a truck driver’s tasks may actually entail. Taken together, it is apparent that the counterstereotypic Mf pairings were the most inaccessible and challenging for children to create sentences with compared to all of the other pairings. These findings suggest that children’s occupational stereotypes are more conservative for the male gender role compared to the female gender role and provide additional support for Eagly et al.’s (2000) claim that gender role stereotypes for males are restrictive and may be more resistant to change than for females.

The children’s recall of the stereotypic and counterstereotypic pairs also provides support for social role theory. Considering the design of the study (i.e., distracter task + recall; Stangor and McMillian 1992), our third prediction was that children’s longer response latency when presented with the Mf pairs would enhance their memories for the counterstereotypic Mf pairs (Stangor and McMillian 1992; Srull and Wyer 1989). The findings confirmed this prediction and revealed that children remembered significantly more Mf pairs than Fm pairs. Thus, the extra processing time (i.e., depth of processing) for the Mf pairs may have allowed for richer encoding of the information. In addition, it is possible that because children’s male gender role stereotypes are more restricted with a limited number of appropriate options, the seemingly inappropriate Mf pairings were more salient due to the perceived social consequences of such violations. A male’s gender role violation may be particularly noteworthy relative to a female’s gender role violation because male gender role violations may be perceived as more “deviant” (Liben and Signorella 1993; Levy et al. 1995) because they occur much less frequently (Diekman and Eagly 2000). As a result, once this type of counterstereotypic information was processed, it may have been marked with a mental “tag” of sorts that ultimately allowed it to be more richly represented and uniquely stored (Stangor and McMillian 1992).

Overall, children in the current project were more likely to correctly remember an occupation if it was paired with a male name regardless of whether it was a stereotypically masculine or feminine occupation. More specifically, children did not provide evidence of remembering significantly more Mf pairs than Mm pairs, counter to what we would have expected given the increased processing time for the counterstereotypic Mf pairs (Stangor and McMillian 1992). Also, in the current study, children were less likely to correctly recall an occupation when it was paired with a female name regardless of whether it was a stereotypically masculine or feminine occupation. Similar to these findings, previous research has shown that children typically have lower accuracy in memory for females participating in feminine occupations (Liben and Signorella 1993; O’Brien et al. 2000). For example, Liben and Signorella (1993) found that children had the most difficulty correctly identifying pictures on an identification task that presented women in feminine occupations. In some ways, the finding that the counterstereotypic female name-occupation pairs were as likely to be forgotten as the stereotypic pairs suggests that, from a cognitive perspective, the counterstereotypic pairs were neither more salient than the stereotypic pairs nor treated differently from them, providing additional evidence that the children were equally flexible and comfortable with females participating in both masculine and feminine occupations.

It is also possible that the children’s overall memory for occupations paired with male names was higher than for occupations paired with female names because of the gender differences found in the characteristics of the names themselves (Barry and Harper 1995; Bauer and Coyne 1997). For example, Barry and Harper (1995) report that people assign different phonetic attributes (e.g., attractiveness, strength, etc.) to male names than to female names. In general, male names are much more common, are fewer in number, and have remained relatively stable over time (Barry and Harper 1995; Bentley et al. 2004). Bentley and colleagues (2004) contend that possibilities for male names are more constrained than for female names because fewer neutral variants exist for males. Therefore, the presentation of the male names in general may have required less processing time because they were more quickly recognized as “male,” allowing more working memory resources to be allocated to the occupations as opposed to the name. On the other hand, since there is more neutrality and ambiguity for female names, the children may have had to make a more conscious judgment call of “male” versus “female,” reducing the amount of working memory resources available for the occupations. As a consequence, children’s processing of the male or female names may have been an artifact of male and female naming patterns and more general gender stereotyping as opposed to gender role stereotyping. It is important to note that although in the current study the male and female first names were (1) derived from previous research (Blair and Banaji 1996), (2) matched on first initial, and (3) evaluated by an independent sample of 3rd graders (8- and 9-year-olds) to assess the masculinity and femininity of each name, the first names utilized were not matched in terms of familiarity and frequency in contemporary usage.
The relative weight of each of these components on children’s gender role stereotyping is an empirical question that warrants further investigation.

Our fourth and final prediction was also confirmed: the children in the current study were more likely to generate recall errors that were congruent with their existing gender role stereotypes. Several studies have demonstrated that when the memory task is cognitively taxing, children will often default to using their gender schemas or strategies (Liben and Signorella 1993; Stangor and McMillian 1992) and misremember, distort, or alter information that is incongruent with these schemas (Conkright et al. 2000; Frawley 2008; Hughes and Seta 2003; Liben et al. 2002; Ruble et al. 2006). Given the cognitively taxing nature of the task used in the current study, it is not surprising that children demonstrated these types of errors during the recall task.

Beyond providing further support for social role theory, the current study also offers some insight into the relation between implicit and explicit measures used to assess children’s gender role stereotypes. For example, many of the studies that have used explicit measures assessing children’s gender role stereotypes using “can they” or “should they” questions have reported that children between the ages of 7–10 years are consistently more negative in their evaluations of gender norm violations, particularly if a male is the violator (Blakemore 2003; Garrett et al. 1977; Ruble et al. 2006; Signorella et al. 1993). Similarly, research using implicit memory measures has shown that children are less likely to remember (or recognize) a picture or elements of a story if a male is engaging in more traditionally feminine behaviors or occupations (Koblinsky et al. 1978; Liben and Signorella 1980). In the current study, one consistent pattern found across each implicit measure was that the children processed and remembered the counterstereotypic Mf pairs differently from the other pairings (i.e., Mm, Ff, Fm). What makes the sentence generation task used in the current study particularly informative is that it not only required children to rapidly access their gender role stereotypes, but also necessitated more deliberate processing of the gender-related stimuli in order to create an appropriate and meaningful sentence. This type of deliberate processing may be why the findings of the current study mirror those of previous research using explicit evaluation measures (Blakemore 2003; Carter and McCloskey 1983–1984; Levy et al. 1995; Ruble et al. 2006; Stoddard and Turiel 1985).
The suggestion that the type of implicit measures used in the current study may reflect children’s attitudes and evaluations is supported by Signorella et al.’s (1993) meta-analysis positing that certain types of implicit measures may actually tap into children’s attitudes more so than directly tapping into their gender role stereotype knowledge. Conversely, other types of implicit measures such as the stereotype priming task used in the Banaji and Hardin (1996) study and the auditory Stroop task used in the Most et al. (2007) study may actually be more reflective of children’s gender stereotype knowledge due to the forced-choice (e.g., masculine vs. feminine word or male vs. female voice) nature of the task (M. Signorella, personal communication, December 23, 2009). Given that the responses (i.e., sentence generation) required from the children in the current study were more cognitively involved than a male vs. female choice, it is more likely that children’s attitudes and judgments played a larger role in their responses in addition to their gender role knowledge. Nonetheless, the task for future research is to better understand the meaning and complexity of children’s implicit responses in relation to their explicit responses.

Although the findings of the current study support social role theory in that children’s gender role stereotypes for females are (1) less constrained than for males, (2) include both masculine and feminine occupations, and (3) mirror current sociological occupational trends, the question remains as to why it is still so detrimental for males to engage in more traditionally feminine activities or occupations. It is plausible that, within the last two decades, our more overt encouragement of young girls and women to aspire to and participate in more traditionally masculine occupations (e.g., doctor, lawyer, firefighter), coupled with our relative silence towards young boys and men (Halpern 2000), has inadvertently sent the message that “lower status,” traditionally feminine occupations (e.g., teacher, homemaker, nurse) are less desirable to both men and women. In other words, maybe we have spent so much time focusing on affording women opportunities to participate in traditionally male-dominated occupations that we have lost sight of the importance and need to change the perceived value and contribution of traditionally female-dominated occupations, such as teaching and nursing. Shouldn’t we expect and work toward more equal representation in these occupations for both men and women?

Obviously, there is still much more work to be done in order to truly equate the genders and readjust our perspectives of both men’s and women’s gender roles. Considering that there has been so much progress for women in the past 50 years (Halpern 2000; Diekman and Goodfriend 2006), maybe it is time that we start discussing with young boys the overall value of being well rounded in their behaviors, activities, and occupational aspirations. Perhaps children’s increased exposure to men participating in more nontraditional roles (e.g., caregiver), activities (e.g., laundry), and occupations (e.g., elementary school teacher) will not only alter the way counterstereotypic information about men is processed, but also alter the way we evaluate both men’s and women’s social roles more generally. Maybe if we start actively encouraging young boys to aspire to be what they want to be and not what society deems “appropriate” for them, we will afford them an opportunity to engage in both masculine and feminine roles without such harsh and negative social consequences. Then, maybe someday soon, Henry the nurse will not have to be a doctor, too.