Introduction

Women’s participation in higher education and science as an indicator of social and economic progress has attracted considerable attention from numerous researchers and national and international organizations. A variety of initiatives and reports have been undertaken to analyze the participation of women in science and higher education and to promote gender equality. Among these initiatives we can mention: the Association for Women in Science (AWIS), founded in 1971; the Helsinki Group on Women and Science, set up in 1999; the Korea Federation of Women’s Science Associations, set up in 2003; the WIRDEM (Women in Research Decision Making) expert group established in 2006; and the EU-funded genSET project, which ran from September 2009 to February 2012. Among the most recent reports are: the She Figures by the European Commission; the Global Gender Gap Report, introduced by the World Economic Forum in 2006; the annual World Development Report: Gender Equality and Development, published by the World Bank; the UNESCO World Atlas of Gender Equality in Education (2012); and the National Science Foundation’s reports on Women, Minorities, and Persons with Disabilities in Science and Engineering.

The latest data published by the OECD (2013) indicate that, despite some progress, gender inequalities in higher education and science persist. In OECD countries, younger women have higher attainment rates than younger men in upper-secondary and tertiary education. In 2011, an average of 84 % of younger women attained at least upper-secondary education while 81 % of younger men did. While the proportion of women is relatively high at the level of tertiary education, that proportion diminishes in the later stages of academic careers, especially in top-level positions; and women receive lower wages than those of similarly qualified men. As is also indicated in the UNESCO World Atlas of Gender Equality in Education (2012), enhanced access to higher education by women has not always translated into enhanced career opportunities, including the opportunity to use their doctorates in the field of their research. In addition to working conditions, including differences in salary, women encounter bias at many levels in their academic careers: they receive less funding through research grants; they are significantly underrepresented on the boards of research institutions, funding organizations, scientific councils and academies; and they are rarely found among the heads of higher education institutions (LERU 2012).

The persistent gender gap has prompted many studies seeking to identify different explanatory factors in various areas of science, across different time periods, and in diverse national settings. Much of this research has identified factors related to family formation and childrearing as being the most influential causes of women’s under-representation in academia (Wennerås and Wold 1997; Sax et al. 2002; Stack 2004; Fox 2005; Ginther and Kahn 2009; Hunter and Leahey 2010). Along with fertility choices (that weigh more heavily on the career goals of women) and issues of work-home balance (female scientists are more likely than males to bear domestic duties), there are also significant gender differences in hours worked and lifestyle preferences (Ledin et al. 2007; Ferriman et al. 2009; Fox et al. 2011; Shen 2013).

Traditionally, gender disparities in career attainment have been explained largely by differences in research productivity (Cole and Zuckerman 1984; Prpic 2002; Fox 2005; Leahey 2006). At the institutional level, there is also a considerable body of literature suggesting that differences are caused by: structural factors such as the type of institution, insofar as women are more likely than men to work at teaching-intensive colleges (Allison and Long 1990; Xie and Shauman 1998); the teaching load, which is traditionally higher for women than for men (Taylor et al. 2006); the degree of specialization (Leahey 2006); financial resources, since women tend to occupy positions offering fewer resources (Xie and Shauman 1998); academic status, insofar as women tend to occupy lower academic positions (Leta and Lewison 2003); and research assistance (Ceci and Williams 2011). Some studies have also shown that the lower the percentage of women on selection committees, the less likely women are to be appointed (European Commission 2009; Zinovyeva and Bagues 2011). Additionally, research has shown that academic assessment systems have traditionally ignored factors that especially affect women. Examples include the way in which scientific excellence is defined (Van den Brink and Benschop 2011), the fact that selection criteria tend to value quantity of research output over quality, when men tend to produce more publications (Symonds et al. 2006), and the lesser importance attached to female characteristics (Lawrence 2006).

As a complement to the above, the psychological literature has explained gender disparities in terms of women’s lower levels of career orientation, ambition, and aggressiveness (Sonnert 1996).

In addition to all the above-mentioned factors that place women at a disadvantage in all fields, career preferences, ability, and biological differences have been the main variables proposed in the literature to explain their underrepresentation in STEM (science, technology, engineering, and mathematics) disciplines. Empirical research in these fields has pointed to career preferences and choices, both freely made and constrained, as important causes of women’s underrepresentation in academia (Ceci and Williams 2011), and it is suggested that some of these choices originate before or during adolescence (Ginther and Kahn 2009; Ferriman et al. 2009; Mason and Goulden 2009). Hence, adolescent girls frequently prefer careers linked to the humanities and social sciences as opposed to STEM-based fields.

Beyond all these explanatory factors, impediments to women scientists may also be a consequence of the overt or unconscious gender bias that still persists at most universities (Dewandre 2002; Moss-Racusin et al. 2012; Shen 2013). However, some research has suggested that after controlling for structural, family, and discipline variables, there is no evidence of discriminatory treatment, because women and men in the same circumstances (e.g., same type of institution, discipline, and amount of experience) fare equivalently (Ceci and Williams 2011).

One of the problems in relation to these findings is that the large body of research in this area does not provide the kind of systematic and comprehensive overview of factors related to gender differences that would help to guide future research and practices in the field. In response to this situation, the present study uses co-word analysis to describe the evolution and current state of the literature on gender differences in science, focusing on factors that influence gender inequality in higher education and science. This bibliometric technique, proposed by Callon et al. (1983), will help us to visualize the division of the field (in this case, the explanatory factors for gender differences in science) into several subfields and show the relationships between them, thereby providing insights into the evolution of the main topics discussed in the field over the years. The technique will also help us to identify the major research topics in the domain, as well as to suggest issues to be addressed or strengthened in further work. The results obtained through this process will be of interest to policy makers, funders, and academic administrators who are seeking to provide adequate facilities and to steer research activities in an appropriate direction (Sudhier and Abhila 2011).

Method

Data collection

The data were extracted from the Thomson Reuters Web of Science in February 2013, using a search that combined the principal terms related to the subject. Figure 1 shows the steps followed to collect the data. First, in order to retrieve the available scientific literature on the subject, we reviewed the related literature to identify the relevant key terms. A preliminary combination of key terms was used to extract the papers related to the subject. Next, after reviewing the keywords of these preliminary papers, we added more specific terms to the query in order to check whether these new terms increased the number of records retrieved; if they did, they were included in the query, and if not, they were eliminated. A total of 50,970 records were initially retrieved. In the next step, records were refined by subject area, such that those papers classified in research areas not directly related to the topic were discarded (e.g., history, zoology, toxicology, allergy, and transportation). Titles and abstracts from the remaining pool of papers (n = 12,743) were then manually checked to find related records. A corpus of 651 articles and reviews dealing with factors related to gender differences in science, published between 1991 and 2012, was finally retained. In order to study the evolution of the topic and to see how the results changed over time, the records were divided into three consecutive sub-periods: 1991–2001, 2002–2007, and 2008–2012. The time spans were selected based on the number of target documents published per period; following Cobo et al. (2011) and Muñoz-Leiva et al. (2012), we fixed a longer first sub-period in order to obtain a representative number of published papers and keywords. Thus, the first period (1991–2001) spans 11 years (and includes a total of 164 documents: 25 %), the second period (2002–2007) spans 6 years (and 147 documents: 23 %), and the last period (2008–2012) spans 5 years (and 340 documents: 52 %).
In addition, an important event in women’s access to higher education and science is associated with each period. Thus, the “World Conference on Education for All” took place in 1990, and in 2002 and 2008 UNESCO launched its “Gender Equality Action Plans” for the periods 2002–2007 and 2008–2013, respectively.
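The period lengths and document shares quoted above can be re-derived from the counts given in the text. The following Python sketch is illustrative only: the function name `period_summary` and the documents-per-year figure are assumptions introduced here and are not part of the study.

```python
# Document counts per sub-period, as reported in the text.
period_counts = {"1991-2001": 164, "2002-2007": 147, "2008-2012": 340}


def period_summary(counts):
    """Return span in years, rounded share of the corpus, and documents per year."""
    total = sum(counts.values())
    summary = {}
    for label, n_docs in counts.items():
        start, end = (int(year) for year in label.split("-"))
        years = end - start + 1  # inclusive span, e.g. 1991-2001 covers 11 years
        summary[label] = {
            "years": years,
            "share_pct": round(100 * n_docs / total),
            # Hypothetical extra indicator, not reported in the paper.
            "docs_per_year": round(n_docs / years, 1),
        }
    return summary


summary = period_summary(period_counts)
for label, values in summary.items():
    print(label, values)
```

Normalizing by the number of years makes it easier to see that the growth in the last sub-period is not an artifact of the unequal period lengths.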

Fig. 1 Flow chart illustrating the process of data collection

Data process

Co-word analysis is a content analysis technique based on the assumption that the subject of a paper can be summarized in a small number of key terms that reflect its core contents. The frequency of word occurrence within a subject can reflect the importance of themes, and the co-occurrence of keywords across papers can be interpreted as indicating similarity between publications. According to Börner et al. (2003), the more keywords two publications have in common, the more similar the two publications are. Therefore, the main purpose of a co-word analysis is to map the dynamics of a subject and identify the core research topics based on the pattern of co-occurrence of pairs of keywords, which represent the different themes in a selected body of literature (He 1999).
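The basic counting behind co-word analysis can be sketched in a few lines of Python. This is a minimal illustration on invented toy data, not the procedure used in the study; the function names `cooccurrence_counts` and `shared_keywords` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(papers):
    """Count, for each unordered keyword pair, the number of papers containing both."""
    pair_counts = Counter()
    for keywords in papers:
        # De-duplicate and sort so that each pair has one canonical ordering.
        for pair in combinations(sorted(set(keywords)), 2):
            pair_counts[pair] += 1
    return pair_counts


def shared_keywords(paper_a, paper_b):
    """Number of keywords two publications have in common."""
    return len(set(paper_a) & set(paper_b))


# Toy keyword lists (illustrative only).
papers = [
    ["research productivity", "promotion", "salary"],
    ["research productivity", "promotion"],
    ["mentorship", "promotion"],
]

counts = cooccurrence_counts(papers)
print(counts[("promotion", "research productivity")])
print(shared_keywords(papers[0], papers[1]))
```

Pairs that co-occur in many papers become candidate links within a theme, while the shared-keyword count reflects the publication-level similarity described by Börner et al. (2003).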

The co-word analysis conducted in the present study involved five sequential steps: extraction of the data, standardization of keywords, construction of the co-occurrence matrix, clustering, and visual presentation of keyword groups. First, author-provided keywords were extracted from papers, with KeyWords Plus being used in those instances where no author-provided keywords were available. Once the data had been extracted, keywords and phrases were standardized manually in order to refine the dataset (e.g., keywords occurring in different forms, plural and singular forms, uppercase and lowercase words). Keywords denoting the same concept were replaced by the most frequent key term occurring in the data set. For instance, the terms research productivity, scientific productivity, publication productivity, academic publishing, scholarly productivity, medical publication, publication rates, publications, and research output were considered synonymous and were all recorded as research productivity, which was the most frequent term. By contrast, keywords that were closely related but different in meaning were kept separate, for example, gender issues, children, family, marriage, and motherhood, or salary, salary gap, and promotion. Any keywords that were unrelated to the topic were also eliminated in this step (for instance, names of countries and statistical tests). After standardization, a total of 170 unique keywords or phrases were selected.
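Although the standardization in this study was carried out manually, the rule described above (map every variant of a concept to its most frequent form, and drop off-topic terms) can be sketched programmatically. The Python example below is a hedged illustration: the helper names, the synonym group, and the stoplist are assumptions chosen for the example, and the toy data do not come from the corpus.

```python
from collections import Counter


def normalize(keyword):
    """Lowercase a keyword and collapse repeated whitespace."""
    return " ".join(keyword.lower().split())


def build_canonical_map(raw_keywords, synonym_groups):
    """Map each variant in a synonym group to the group's most frequent variant."""
    freq = Counter(normalize(k) for k in raw_keywords)
    mapping = {}
    for group in synonym_groups:
        variants = [normalize(v) for v in group]
        canonical = max(variants, key=lambda v: freq[v])
        for variant in variants:
            mapping[variant] = canonical
    return mapping


def standardize(keywords, mapping, stoplist=()):
    """Apply the canonical mapping, drop stoplisted terms, and remove duplicates."""
    cleaned = []
    for keyword in keywords:
        keyword = mapping.get(normalize(keyword), normalize(keyword))
        if keyword not in stoplist and keyword not in cleaned:
            cleaned.append(keyword)
    return cleaned


# Toy data (illustrative only).
raw = ["Research productivity", "research productivity",
       "Scientific Productivity", "publications", "Spain", "promotion"]
groups = [["research productivity", "scientific productivity",
           "publications", "research output"]]
mapping = build_canonical_map(raw, groups)

result = standardize(["Scientific productivity", "Publications", "Spain", "Promotion"],
                     mapping, stoplist={"spain"})
print(result)
```

Deciding which terms belong to the same synonym group remains a judgment call, which is why the grouping is passed in explicitly rather than inferred.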

The word-document occurrence matrix was automatically built using SPSS v20. Only those keywords and phrases with a frequency greater than or equal to 5 in each temporal sub-period were considered in the analysis. The total number of keywords for each sub-period is shown in Fig. 1. The resulting matrix for each sub-period was then exported to Ucinet (Borgatti et al. 2002) in order to calculate the word co-occurrence matrix. The similarities between items were also calculated using the Jaccard similarity index. Hierarchical clustering analysis was then conducted using Ward’s method, and squared Euclidean distance was applied as the distance measure using SPSS v20. Ward’s method involves an agglomerative clustering algorithm. It starts with n clusters of size 1 and continues until all the observations are included in one cluster. In contrast to other agglomerative clustering algorithms, such as the single-linkage clustering used in Callon’s original proposal of co-word analysis, Ward’s method tends to produce spherical clusters of similar size (Everitt et al. 2011). The result of the clustering was then visualized in a two-dimensional diagram, known as a dendrogram, which displays the steps in the clustering process and illustrates how individual words are combined to form gradually larger clusters. The clusters were then transformed into networks in Ucinet. Finally, in order to identify and visualize the importance and position of clusters considered as themes, as well as their relational patterns, strategic diagrams were built for each sub-period. A strategic diagram is a two-dimensional space built by plotting themes according to their centrality and density, where the abscissa axis represents the centrality, the ordinate axis represents the density, and the origin is given by the median or mean value of centrality and of density (Callon et al. 1991; Cobo et al. 2011).
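The frequency threshold, the Jaccard index, and Ward's agglomeration described above can be illustrated with a small pure-Python sketch. It is not a reproduction of the SPSS and Ucinet workflow: the toy matrix, the function names, and the lowered threshold are assumptions, and the sketch clusters the rows of the occurrence matrix because the text does not state which matrix was submitted to the clustering routine.

```python
def filter_frequent(labels, matrix, min_freq=5):
    """Keep keywords (rows) that occur in at least min_freq documents."""
    kept = [(lab, row) for lab, row in zip(labels, matrix) if sum(row) >= min_freq]
    return [lab for lab, _ in kept], [row for _, row in kept]


def jaccard(row_a, row_b):
    """Documents containing both keywords divided by documents containing either."""
    both = sum(1 for a, b in zip(row_a, row_b) if a and b)
    either = sum(1 for a, b in zip(row_a, row_b) if a or b)
    return both / either if either else 0.0


def centroid(rows):
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def ward_cost(rows_a, rows_b):
    """Increase in within-cluster sum of squares if the two clusters are merged."""
    ca, cb = centroid(rows_a), centroid(rows_b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))  # squared Euclidean
    na, nb = len(rows_a), len(rows_b)
    return na * nb / (na + nb) * sq_dist


def ward_clusters(matrix, n_clusters):
    """Start from singletons and merge the cheapest pair until n_clusters remain."""
    clusters = [[i] for i in range(len(matrix))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost([matrix[k] for k in clusters[i]],
                                 [matrix[k] for k in clusters[j]])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Toy keyword-by-document matrix (rows: keywords, columns: 6 documents).
raw_labels = ["promotion", "salary", "motherhood", "family", "geography"]
raw_matrix = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0, 0],
]

# The study uses a threshold of 5; it is lowered to 2 here for the toy data.
labels, matrix = filter_frequent(raw_labels, raw_matrix, min_freq=2)
clusters = ward_clusters(matrix, 2)
print(labels)
print(jaccard(matrix[0], matrix[1]))
print([[labels[k] for k in cluster] for cluster in clusters])
```

Stopping the loop at a chosen number of clusters corresponds to cutting the dendrogram at a given height; running it down to a single cluster would reproduce the full merge sequence that a dendrogram displays.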
The density, or the internal cohesion index, indicates the strength of the linkage that each word has with other words within the same cluster (or theme). It is an indicator of the internal strength of a cluster and represents the conceptual development of a theme. The centrality, or the external cohesion index, indicates the strength of the linkage that each keyword has with other keywords in other clusters. It is a measure of the strength of a subject area’s interaction with other subject areas and represents the central position of a theme within the overall network. The value of the density and the centrality of a given cluster can be measured in several ways (He 1999). In our study density was computed as the average value (mean) of the internal links (Turner et al. 1988) and centrality was computed as the sum of squares of all external link values (Bauin et al. 1991). The origin of the strategic diagram is calculated by the mean value of centrality and the mean value of density. The strategic diagram divides the space into four quadrants, such that there are four types of themes according to their location (Callon et al. 1991; He 1999). Themes located in the upper-right quadrant are considered to be well-developed and important themes for the structure of a research field. They are known as the motor themes of the specialty, given that they present strong centrality and high density. The placement of themes in this quadrant implies that they are externally related to concepts applicable to other themes that are conceptually closely related. Themes in the upper-left quadrant have well-developed internal ties (high density) but unimportant external ties (weak centrality), and so are of only marginal importance for the field. These themes are very specialized and peripheral in nature. Themes placed in the lower-left quadrant are both weakly developed (low density) and marginal (weak centrality), and are considered as emerging or disappearing themes. 
Finally, themes in the lower-right quadrant are important for a research field (strong centrality) but present low internal development (low density). Therefore, this quadrant comprises transverse and general or basic themes.
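The density and centrality measures and the resulting quadrant assignment can be sketched as follows. This Python example follows the definitions given above (mean of internal links, sum of squared external links, origin at the mean of each index), but several details are assumptions: zero-valued internal links are included in the mean, themes lying exactly on an axis are assigned to the higher side, and all names and numbers are illustrative.

```python
from itertools import combinations


def density(cluster, sim):
    """Mean link value over all keyword pairs inside the cluster (zeros included)."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 0.0
    return sum(sim[a][b] for a, b in pairs) / len(pairs)


def centrality(cluster, all_keywords, sim):
    """Sum of squared link values between cluster members and outside keywords."""
    members = set(cluster)
    return sum(sim[a][b] ** 2
               for a in cluster
               for b in all_keywords if b not in members)


def classify(themes):
    """Assign each theme (centrality, density) to a strategic-diagram quadrant."""
    mean_c = sum(c for c, _ in themes.values()) / len(themes)
    mean_d = sum(d for _, d in themes.values()) / len(themes)
    labels = {}
    for name, (c, d) in themes.items():
        if c >= mean_c and d >= mean_d:
            labels[name] = "motor"                     # upper-right
        elif d >= mean_d:
            labels[name] = "specialized"               # upper-left
        elif c < mean_c:
            labels[name] = "emerging or disappearing"  # lower-left
        else:
            labels[name] = "basic or transverse"       # lower-right
    return labels


# Toy symmetric similarity matrix for four keywords (illustrative values).
sim = [
    [1.0, 0.5, 0.1, 0.0],
    [0.5, 1.0, 0.2, 0.0],
    [0.1, 0.2, 1.0, 0.4],
    [0.0, 0.0, 0.4, 1.0],
]
d = density([0, 1], sim)
c = centrality([0, 1], range(4), sim)
print(d, c)

# Toy themes given as (centrality, density) pairs.
themes = {"A": (6.0, 0.30), "B": (1.0, 0.25), "C": (0.5, 0.05), "D": (5.0, 0.04)}
quadrants = classify(themes)
print(quadrants)
```

Because the origin is computed from the themes of each sub-period, the same theme can change quadrant between periods even if its own indices change little.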

After calculating density and centrality for each cluster, the themes were then displayed, using Excel, in a strategic diagram according to their internal and external cohesion indices. The themes were represented by spheres of different sizes, which were proportional to the number of papers that they each represented.

Results

A total of 170 keywords were obtained from the 651 documents. In what follows, we show the dendrograms, strategic diagrams for each sub-period, and tables containing the names of clusters, the number and percentage of documents by cluster, the centrality and density values, and a brief explanation of each theme.

Period 1: 1991–2001

The dendrogram shows that the 29 keywords of the documents are divided into four clusters (Fig. 2). Table 1 gives the names and descriptive values of each cluster, while Fig. 3 shows the corresponding strategic diagram. The origin of the strategic diagram is based on the centrality value (5.750) and density value (0.117).

Fig. 2 Dendrogram for the first sub-period (1991–2001)

Table 1 Descriptive values of clusters for the first sub-period (1991–2001)

Fig. 3 Strategic diagram for the first sub-period (1991–2001)

“Gender discrimination in labor markets and universities” (C1) is located in the upper-right quadrant. This means that this cluster contains close internal connections and is also widely connected to other clusters. Given its position and the number of papers that deal with this theme, it can be considered as the motor theme of this period. Because of its high density and only medium centrality (upper-left quadrant), “Mobility of women academics” (C3) was regarded as a specialized theme with high conceptual development but weak external interconnection with other themes.

A further two themes, namely “Institutional issues” (C2) and “Sex differences in promotion” (C4) (lower-left quadrant), were regarded as either emerging or disappearing themes because of their showing both low density and low centrality.

Period 2: 2002–2007

In this period, the 35 keywords of the documents were divided into ten major themes, as shown in Table 2. The dendrogram of the cluster analysis and the strategic diagram are shown respectively in Figs. 4 and 5. The origin of the strategic diagram is based on the centrality value (0.953) and the density value (0.130).

Table 2 Descriptive values of clusters for the second sub-period (2002–2007)

Fig. 4 Dendrogram for the second sub-period (2002–2007)

Fig. 5 Strategic diagram for the second sub-period (2002–2007)

In this period, two new motor themes appeared: “Career satisfaction in medicine” (C1) and “Academic career in sociology” (C9). Besides being a motor theme, “Career satisfaction in medicine” (C1) was the cluster with the highest number of documents.

The clusters “Mobility of women academics” (C6), “Sex differences in promotion” (C2), and, to some extent, “Gender stereotypes and discrimination” (C3), all present in the previous period, also appeared in this period. “Mobility of women academics” (C6) showed a decrease in density but a higher percentage of documents compared with the previous period. It was now relocated to the lower-left quadrant, suggesting that it is either an emerging or a disappearing theme. In contrast, “Sex differences in promotion” (C2) and “Gender discrimination in labor markets and universities” (C1), which became “Gender stereotypes and discrimination” (C3) in this second period, showed an increase in density and a lower percentage of documents compared with the previous period, and they were relocated to the upper-left quadrant as specialized themes with a higher conceptual development but weak external interconnections with other themes.

Compared with the previous period, the number of emerging (or disappearing) themes increased from two to six. In addition to “Mobility of women academics” (C6), five new themes appeared: “Gender roles in management” (C4), “Mentorship” (C5), “Racial discrimination at universities” (C7), “Work-life balance in academia” (C8), and “Gender issues in geography” (C10).

Period 3: 2008–2012

Based on the hierarchical clustering of 106 keywords, 16 clusters of keywords (themes) were identified in the last period, as shown in Table 3. The dendrogram of the cluster analysis and the strategic diagram are shown respectively in Figs. 6 and 7. The origin of the strategic diagram is based on the centrality value (1.500) and density value (0.099).

Table 3 Descriptive values of clusters for the third sub-period (2008–2012)

Fig. 6 Dendrogram for the third sub-period (2008–2012)

Fig. 7 Strategic diagram for the third sub-period (2008–2012)

In this period, just one motor theme was found: “Advancement in academic medicine” (C9). This theme includes articles related mainly to success and progression in medicine. This cluster is similar to the cluster labeled “Career satisfaction in medicine” (C1), identified in the second period as a motor theme. “Gender discrimination in labor markets and universities” (C1), which was present as a motor theme in the first period and as a specialized theme in the second, reappeared in this third period, where it showed a decrease in the percentage of documents compared with the previous periods and an increase in centrality with respect to the second period. It thus moved from the upper-right quadrant in the first period to the upper-left quadrant in the following two periods, where it is a specialized theme with a peripheral character. Additionally, six new themes appeared in this quadrant as specialized themes: “Gender differences in productivity” (C2), “Employment stratification” (C3), “Personal factors” (C5), “Stereotypes in mathematics” (C7), “Institutional issues” (C10), and “Women’s studies” (C15). “Institutional issues” (C10), which appeared as an emerging theme in the first period but was absent in the second period, reemerged in the third period as a specialized theme, although it had a lower percentage of documents.

The theme of “Mobility, career choice, and sex composition” (C6), similar to “Mobility of women academics”, had been present in the two previous periods and appeared again in the third period. It corresponded to a similar percentage of documents in the three periods, although it went from being a specialized theme in the first period to an emerging or disappearing theme in the second and third periods.

“Senior positions in medicine” (C12) and “Bibliometric indicators” (C14) were new themes that also appeared in the lower-left quadrant as emerging or disappearing themes.

Finally, five themes, namely “Glass ceiling barriers” (C4), “Work-life balance in engineering” (C8), “Climate and staff composition in academia” (C11), “Inequality and diversity in higher education” (C13), and “Work-life balance in psychology” (C16), appeared in the lower-right quadrant.

It is interesting to see how the theme “Work-life balance in academia” (C8), which was present in the second period, reappears twice in the third period and in the same quadrant in the form of “Work-life balance in psychology” (C16) and “Work-life balance in engineering” (C8), indicating that the topic of work-life balance has attracted the attention of researchers from different research fields.

Finally, “Inequality and diversity in higher education” (C13), similar to the cluster labeled “Racial discrimination at universities” in the second period, showed a significant increase both in centrality and the percentage of documents compared with the previous period. Consequently, it was relocated to the lower-right quadrant. As can be seen in Fig. 7, this theme had the largest number of documents among all themes in all periods.

Conclusion and discussion

Using co-word analysis, the present study describes the evolution and current state of the literature on gender differences in higher education and science, and more specifically of those papers that deal with factors that cause these differences. It also examines the evolution of this topic by dividing the literature into three sub-periods (i.e., 1991–2001, 2002–2007, and 2008–2012). Regarding the evolution of the number of documents, the results reveal that more than fifty percent of the total body of literature was published in the last five years (2008–2012), suggesting that this is a current topic which has aroused the interest of researchers. Specifically, “Inequality and diversity in higher education” is the theme with the largest number of documents over this period. This broad topic addresses gender and other types of inequalities in higher education, as well as diversity issues. While some papers in this cluster mainly evidence gender and race inequalities related to academic degree, salary, socio-economic status, disciplines, rank, tenure, or mentoring etc., others focus on the potential value of diversity in terms of enhancing work processes and organizational mechanisms through the incorporation of women and members of other underrepresented groups such as racial/ethnic minority groups (Homan et al. 2008; Gonzalez and DeNisi 2009; Rosser 2012).

The results also showed that the number of themes has increased significantly over the years, ranging from four in the first period to ten in the second and sixteen in the third period. This suggests a greater interest in the study of factors related to gender differences in higher education and science, as well as a diversification and specialization of the research field over time. “Work-life balance in academia” provides a good illustration of the latter issue: this theme appeared for the first time in the second period, mainly in relation to the issue of work-life balance in universities, while in the third period it became specialized and was covered by specific fields of study such as engineering and psychology (i.e. “Work-life balance in engineering” (C8) and “Work-life balance in psychology” (C16)). The relevance of this topic has recently been underlined in the latest release of Education at a Glance by the OECD (2013). According to this report, the issue remains a key element for achieving gender equality, since women still bear the main burden of care and domestic work.

In terms of trends in the evolution of themes, the strategic diagrams reveal that many themes are still immature in the studied field. Only four motor themes appeared in the upper-right quadrant of the diagrams, the location for those that could be regarded as mature and well-developed themes. The specific themes in each period were “Gender discrimination in labor markets and universities” in the first period, “Career satisfaction in medicine” and “Academic career in sociology” in the second period, and “Advancement in academic medicine” in the third period. Moreover, only two themes, “Mobility of women academics” and “Gender discrimination in labor markets and universities”, were present in all three periods. Some themes emerged and remained in subsequent periods: “Work-life balance in academia” and “Advancement in academic medicine” appeared in both the second and third periods, while “Sex differences in promotion” appeared in both the first and second periods. Other themes such as “Institutional issues” emerged (first period), disappeared (second period), and then reemerged (third period).

The results also indicate that gender differences in higher education and science have been considered by specific research disciplines such as medicine, psychology, geography, sociology, engineering, and mathematics, suggesting important research-field-specific variations. Indeed, after the second period a number of specific research disciplines can be seen to show an interest in gender issues. Notably, medicine is a discipline that appears in both of the two most recent periods (2002–2007 and 2008–2012) as a motor theme related to satisfaction and success in an academic medical career. Furthermore, an additional cluster in the field of medicine appears in the third period as an emerging theme related to senior positions in medicine. Particular research fields related to STEM disciplines, such as mathematics and engineering, also appear in the third period. It is worth noting that while in engineering and mathematics the main problem is located at the entry point (i.e., a problem of convincing girls to undertake these studies and embark on a research career), the challenge in the humanities and social and health sciences is not so much one of attraction but of retention, such that in these research fields the particular pipeline is relatively more leaky (LERU 2012).

It is worth mentioning that although the study aims to identify the main explanatory factors that could account for gender differences in higher education and science, several of the themes identified refer to the differences themselves rather than to explanatory factors. For instance, “Sex differences in promotion” and “Gender discrimination in labor markets” correspond to differences described in the literature, but they are not factors that explain gender differences. In our view, this result may have two main causes: the topic of the papers and the selection of keywords. On the one hand, most of the papers in the sample focus on the analysis of gender differences (e.g., salary, promotion, publication rates, etc.) and seek to explain these differences via some possible factors. On the other hand, authors need to summarize their research in a limited number of keywords, and this is in fact the main weakness attributed to co-word analysis (He 1999). In co-word analysis, the keywords used to describe the content of a publication are used as the unit of analysis to map the structure of the research field. Indeed, Law and Whittaker (1992) point out that some keywords are too general and that indexers sometimes put the wrong emphasis on keywording; this has been called the “indexer effect”. However, as Courtial et al. (1984) note, there is a general structure in each specific field which underlies the co-occurrence of the keywords, and this structure does not seem to be sensitive to variations or redundancies of terms used by indexers. In order to partially solve this issue and to improve the validity of the data, the recommendation is to normalize the keywords or to use a combination of words from abstracts, titles, or full text (He 1999; Wang et al. 2012).

In our view, the evidence presented in this paper allows the most prominent themes at different time periods to be identified together with possible gaps in the literature. For instance, “Teaching load differences” and “Funding support” are examples of institutional factors that do not appear in our results, indicating that these issues generate little interest among researchers, despite the fact that some studies report clear gender differences based on these issues (LERU 2012).

To the best of our knowledge, this is the first bibliometric study based on co-word analysis to have focused on gender differences in science. The results obtained through the cluster analysis and strategic diagrams complement and confirm previous findings (LERU 2012; European Commission 2013), adding new information and bringing a new perspective to the subject. The value of the strategic diagrams is that they identify the motor themes for the topic and also provide information about the less visible and emerging themes. Furthermore, studying the evolution of results across the three considered periods provides information about specific transient trends, for example, themes that have emerged, then disappeared, and perhaps emerged again. These data illustrate the utility of co-word analysis for understanding the dynamic structure of a subject, and they could serve to anticipate future development or to identify gaps that can be taken into account when setting out the priorities for research policy. In this sense, researchers, governments, and funding agencies could draw upon this type of analysis in order to promote research in emerging areas.