Introduction

Higher education institutions (HEIs) undergo a continuous process of adaptation in order to create a population capable of adjusting to new advances, with the ultimate goal of building a sustainable society. Their mission to spread knowledge and the transfer of research and technology constitute the fundamental pillars of HEIs (Berghaeuser & Hoelscher, 2020). Innovation capability at the national level is linked to the research potential of universities (Kang & Liu, 2021), as they are a cornerstone of companies' transformative process (Liu et al., 2023). In Europe, the aspiration is to create a joint European education and research space to promote the excellence of HEIs while guaranteeing gender equality, inclusion and equity (European Council, 2021).

More than a decade ago, some authors still argued that men produced a greater volume and higher quality of research (Abramo et al., 2009; Larivière et al., 2013; van Arensbergen et al., 2012), in some cases without considering that the unequal presence of men and women in the academic world could condition the results (Nygaard et al., 2022). More recently, authors such as Sá et al. (2020) or Haghani et al. (2022) continue to report such evidence, demonstrating the persistence of the pattern of productivity and recognition based on gender. In this respect, Abramo et al. (2021) have contended that the major differences are found in the top 10% of scientists, where men predominate; in the other 90% there is greater homogeneity, with a certain reduction in the research gender gap being observed. Others such as Huang et al. (2020) claim that the disparities do not lie in the number of publications and their impact, but in the length of professional careers and the dropout rate of women in academia. According to Thelwall (2020), when it comes to citation impact, the inequality is imperceptible; indeed, women's citation impact is even slightly higher in countries such as Australia, Canada, Ireland, Jamaica, New Zealand and the United Kingdom.

The need for an evaluation of HEI performance has prompted a proliferation of rankings that seek to offer a comprehensive overview of the activities carried out, with research playing a decisive part in a university's prestige. However, the marked heterogeneity of these institutions somewhat complicates the task (Miotto et al., 2020). The scientific community has thus produced a substantial body of literature aimed at identifying the factors associated with universities' research performance (Fox & Nikivincze, 2021; Hermanu et al., 2022). Some of the most commonly used parameters include the volume of publications and citations, the number of doctoral students or the monetary amount of grants (Szuwarzyński, 2019; Xia et al., 2021). Authors such as Aboagye et al. (2021) and Mohd Rasdi et al. (2022) have found evidence that organizational culture, individual effort and transformational leadership are potential predictors of research performance. González Ramos et al. (2015), in a study of Spanish researchers, identify an absence of gender differences in the pattern of performance, revealing that the lower presence of women in the most relevant studies is due to their priorities and work style. Santos et al. (2021) support this conclusion, demonstrating that the disparity lies in the way of working; women are risk averse and focus their work on fields that offer greater certainty of success, valuing more the autonomy offered by HEIs.

The 2030 Agenda promotes women's equality and empowerment (Sustainable Development Goal 5, SDG5) as a way to achieve the sustainable development of society. However, the university world still has a long way to go to correct the underrepresentation of women among academic research staff (ARS) (European Commission, 2019). There has traditionally been a wide gender gap in disciplines related to science and technology, otherwise known as STEM (Science, Technology, Engineering and Mathematics). In Latin American countries, specific national programmes have been developed aimed at attracting, improving access, orienting and retaining women in STEM (García-Holgado & García-Peñalvo, 2022). In the same vein, the European Commission is allocating substantial resources to boost the participation of women in STEM, passing regulations and approving research frameworks, agencies, and funding systems to encourage their presence (Fatourou et al., 2019). Verdugo-Castro et al. (2022), after conducting a systematic review of the literature, confirm that gender stereotypes are the drivers of female underrepresentation in the STEM field, with women not even reaching 30% of the total. The authors propose targeting action at modifying family influences, the educational environment and even the culture, fostering self-confidence so that the choice of studies is based only on the objectives pursued. In Spain, Dos Santos et al. (2022) advise activities aimed at society as a whole, with special attention paid to the students and their families, to bring about a shift in the culture that has kept women away from STEM.

The present study seeks to evaluate whether there is a gender gap in the Research and Development and Innovation (R&D&I) results—understood in terms of efficiency—of a technological university, where the presence of women accounts for just over 30% of the total, in line with the trend pointed out by Verdugo-Castro et al. (2022). One problem lies in the question of how to measure productivity and performance when in many cases it relates to unequal academic positions that reflect a lingering situation inherited from the past, when the gender gap was more obvious. If the impact indicators of the scientific output produced by men are still higher in absolute terms than those of women (Morgan et al., 2021; Squazzoni et al., 2021), is this due to greater performance or is it a temporary gap that will close in a few years? In this paper, we propose the use of the term “efficiency” instead of "performance", analysing the gender gap by measuring the research efficiency at a Spanish university included in the Shanghai ranking, Universitat Politècnica de València (UPV). In short, our aim is to replace the traditional assessment of research—where the parameters used focus on measurements of volume (number of articles, citations, patents, projects, etc.)—with an efficiency-based assessment, which accounts for the relationship between the available resources and the results obtained. This article analyses the possible gender gap in research efficiency, and by so doing seeks to provide answers to the following questions:

  • Q1. Are there gender differences in research efficiency depending on the different disciplines?

  • Q2. Who achieves the best results in the knowledge areas that make up each discipline?

The approach used is useful for several reasons. First of all, it provides HEIs with a methodological tool to be able to determine women’s position in terms of efficiency. This study does not seek to estimate the historical evolution but rather to define a baseline that can be updated periodically in years to come. Second, it incorporates the gender dimension in the answer to the question about the efficiency of research units. In this respect, what is relevant is the comparison of the efficiency of different units at the time of the analysis, thus addressing the question of the gender dimension within a university organization. We analyse the four disciplines of the UPV: sciences; engineering and architecture; social sciences and law; and arts and humanities. Each of them consists of different knowledge areas with widely differing proportions of men and women, an issue that must be taken into account to avoid any distortion in the results. To answer the first question, we use a variant of data envelopment analysis (DEA), a non-concave metafrontier. This model allows us to measure the efficiency of comparable scientists corresponding to the research profile of each discipline, and then evaluate them with the rest of the sample. For the second question, we use the cross-efficiency method to produce a synthetic index that allows us to establish a ranking of the knowledge areas. There are virtually no studies in the literature that have analysed this gender issue from the perspective of efficiency. The results will help to encourage debate on equality in HEIs, where research productivity is sometimes overshadowed by scientists’ higher professional category.

The paper is structured as follows: the "Literature review" section identifies the conceptual framework within which the study is developed; the "Methodology and data" section presents the methods and variables used; the "Results and discussion" section analyses the results of the empirical analysis and discusses the applicable paradigm; and, finally, the "Conclusions" section summarizes the conclusions, contribution and limitations of the study.

Literature review

Universities must become instruments that drive the response to the social challenges we are currently facing; they can do so by calling on members of the university community to be more engaged in technology and research efforts. While women's participation in this type of activity is increasing (Huang et al., 2020), gender gaps persist in this environment, and universities, as wellsprings of knowledge, are no exception. The most prestigious academic positions are still mainly held by men (Regner et al., 2019), despite numerous diversity policies (Khan et al., 2019). According to the report published by the European Commission (2019), women account for only 22% of HEI directors and 24% of full professors, a figure that drops to 15% when it comes to STEM. In countries such as Italy and Norway, the volume of scientific output produced by men is still more than 30 percentage points higher, although the differences are smaller when other parameters such as citations received or the scientific category of the journal are considered (Abramo et al., 2021). However, while this gap has started to narrow, there remains much to be done. There is a need to change deeply-rooted social roles where family responsibilities limit female scientists' dedication to their work.

Sometimes the solution lies in collaboration between genders. Authors such as De Saa-Perez et al. (2017) or Maddi and Gingras (2021) confirm that research teams perform better when there is more representative participation of the two sexes. Once again, however, not all researchers agree on this conclusion. According to Nielsen and Börjeson (2019), in no case does gender diversity improve the indicators of research productivity. Shen et al. (2022) even find evidence that collaboration between genders reduces the performance of female academics and improves that of their male counterparts. As can be seen, contradictory conclusions have been drawn from recent studies assessing the productivity of men and women (Table 1).

Table 1 Gender diversity and research performance

The scientific community has not been able to reach a unanimous conclusion regarding the possible link between research performance and gender. Reasons for this lack of clarity could include bias in the indicators used, the lesser presence of women in the academic world, the length of their research careers, or an analytical focus on a particular discipline. For example, although Finland has the highest proportion of high-level female scientists, the figure barely exceeds 20%, with Saudi Arabia registering the lowest representation (2.08%); furthermore, in terms of the most cited scholars in mathematics & statistics, engineering, and physics & astrophysics, women account for slightly more than 5% (Chan & Torgler, 2020). Women's underrepresentation can also be perceived in illustrious events such as the Nobel Prizes (Lunnemann et al., 2019; Modgil et al., 2018). According to Agarwal (2018), the cause of this gap lies in the age of the laureates; therefore, women's lesser presence in academia in those later decades means a lack of equality in the possibility of being selected.

Could it be the case that family responsibilities are the source of this divide? In essence, one of the goals of liberal democracies is to achieve gender equality at all levels of HEIs. Progress has been achieved in various countries through family-oriented work programmes, incentives for retaining female scientists, and gender-balanced organizational structures (Barron & Kattan, 2022; CohenMiller et al., 2022). Nevertheless, the reality is that there is still much to be done to ensure that domestic responsibilities do not interfere with the performance of female scientists (Chan & Torgler, 2020; Defazio et al., 2022). It has recently been shown that in extreme situations such as that caused by the COVID-19 lockdown, household obligations exacerbate the gender disparity in research activities; specifically, women submitted proportionally fewer manuscripts than men (Squazzoni et al., 2021; Caldarulo et al., 2022). However, according to Horta et al. (2022), in terms of hours spent on research in science and higher education, the effect of the pandemic has not led to changes by gender. All this highlights the need to encourage academia to account for these aggravating factors when setting policy. By so doing, these factors can be taken into consideration when establishing the performance assessment guidelines for the university community, enhancing the focus on concepts such as efficiency rather than performance.

Methodology and data

The particular features of the analysed sample, where the observations (decision-making units, DMUs) correspond to branches of science with very different research profiles, require the use of the metafrontier. Although this paper focuses on R&D&I, the intention is not to downplay teaching. It is important to use a broad approach that includes not only research but also outputs in terms of the transfer and dissemination of knowledge, which are considered increasingly vital for validating the functions of universities (Bornmann, 2013; Morgan-Jones et al. 2017; Kamenetzky & Hinrichs-Krapels, 2020). Recent evidence suggests that the presence of research-oriented universities is valuable but not crucial for building dynamic regional economies (Garcia-Alvarez-Coque et al., 2021). As such, the proposed measurement recognizes the multifaceted nature of the university's results, in line with the proposals of a number of authors (Lindgreen et al., 2021; Davison & Bjorn-Andersen, 2019; McKenna, 2020). The proposed methodology makes it possible to calculate the efficiency of heterogeneous DMUs, thus avoiding the limitation inherent to traditional DEA, where only DMUs with similar features can be included in the analysis. In addition, the cross-efficiency model will be used to construct a synthetic index and thereby produce a ranking of the knowledge areas corresponding to each discipline and gender. Both methods rely on information provided by the UPV, evaluating different items about its ARS.

Methodology: non-concave metafrontier and cross-efficiency

DEA is a non-parametric linear programming method that provides a measure of efficiency by comparing the inputs and outputs of each DMU with the rest of the observations that make up the sample. Depending on the orientation chosen for the model, the efficient frontier will be formed by the DMUs whose inputs enable them to obtain the maximum level of outputs (output orientation) or, vice versa, by those that can achieve a certain volume of outputs while using the minimum amount of resources (input orientation). The original proposal by Charnes et al. (1978) was under the assumption of constant returns to scale, a restriction that was relaxed by Banker et al. (1984), who allowed inputs and outputs to vary non-proportionally to one another through variable returns to scale (VRS). This method has enjoyed broad acceptance in the literature, having been applied to a range of very different fields, from tourism (Puertas et al., 2022) to sustainability (de Castro-Pardo et al., 2022), irrigation systems (García-Mollá et al., 2021), and even comparative analyses of regional policy (Bresciani et al., 2021), among others.
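For reference, a standard textbook statement of the output-oriented VRS model (the envelopment form associated with Banker et al., 1984) is sketched below. The notation is an assumption chosen to match the symbols used in Eq. (2): n DMUs, m inputs \({x}_{ij}\), s outputs \({y}_{rj}\), intensity weights \({\lambda}_{j}\), and DMU o as the unit under evaluation. The source does not print this program, so it should be read as an illustration rather than as the exact specification solved by the authors.

$$\begin{aligned} \max_{\varphi,\lambda}\ & \varphi \\ \text{s.t.}\ & \sum_{j=1}^{n}{\lambda}_{j}{x}_{ij}\le {x}_{io}, && i=1,\dots,m \\ & \sum_{j=1}^{n}{\lambda}_{j}{y}_{rj}\ge \varphi\,{y}_{ro}, && r=1,\dots,s \\ & \sum_{j=1}^{n}{\lambda}_{j}=1,\quad {\lambda}_{j}\ge 0, && j=1,\dots,n \end{aligned}$$

The optimal \(\varphi\) is at least 1, and \(\varphi = 1\) places DMU o on the frontier. Dropping the convexity constraint \(\sum_{j}{\lambda}_{j}=1\) gives the constant-returns model of Charnes et al. (1978).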

The metafrontier, developed in the works of Battese and Rao (2002) and Battese et al. (2004), is used here because we need to analyse DMUs in which the relationships between inputs and outputs follow different patterns. This entails constructing a global frontier, like an umbrella, that encompasses the individual frontiers of each of the homogeneous groups of DMUs that make up the sample. Each DMU is compared twice: once with the observations in its own discipline, calculating the technical efficiency in relation to the group (\({TE}^{K}\)); and again with the rest of the sample, that is, its efficiency relative to the metafrontier (TE). Therefore, it is necessary to construct as many efficient frontiers as there are disciplines, and then another that envelops all of them, namely the metafrontier. However, constructing the latter frontier involves combinations of inputs and outputs that are not present in the sample, referred to by Tiedemann et al. (2011) as infeasible input–output combinations (Fig. 1). In an effort to solve this problem, Tiedemann et al. (2011) propose a variant called a non-concave metafrontier, where only DMUs from the sample are included, avoiding combinations that are not feasible. The concave metafrontier, by contrast, is one that envelops Technology A and Technology B and therefore admits combinations of inputs and outputs not present in the sample (Fig. 1).

Fig. 1 Concave and non-concave metafrontier

Since all the groups are enveloped by the metafrontier, no DMU can appear more efficient relative to the metafrontier than relative to its own group; with the output-oriented scores used in this study (values greater than or equal to 1), this means that TE is never lower than \({TE}^{K}\). The ratio between the two measures indicates the proximity of group k to the metafrontier, defined as the metatechnology ratio (also known as the technology gap ratio), \({TGR}^{K}\) (Battese et al., 2004).

$${TGR}^{K}=\frac{TE}{{TE}^{K}}$$
(1)

\({TGR}^{K}\) shows the efficiency derived from the way group k is managed, \({TE}^{K}\) shows the local efficiency of group k and TE the efficiency at the metafrontier. The group with the lowest \({TGR}^{K}\) corresponds to the most efficient form of management. To summarize, the procedure for calculating the non-concave metafrontier involves the following steps:

  1. Assess the heterogeneity of the observations that prevents the use of traditional DEA.

  2. Calculate the efficiency of each of the homogeneous groups that make up the sample (\({TE}^{K}\)). \({TE}^{K}\) determines the level of inefficiency within each group due to inadequate use of resources.

  3. Determine the metafrontier by calculating the efficiency of each homogeneous group in comparison with the rest of the observations belonging to the other groups, avoiding the infeasible input–output combinations (TE). TE determines the level of global inefficiency, in other words, that caused by the inefficiency of the group (\({TE}^{K}\)) and by not using the most appropriate technology (\({TGR}^{K}\)).

  4. Calculate the technology gap ratio (\({TGR}^{K}\)). \({TGR}^{K}\) determines the level of inefficiency derived from not using the appropriate technology.
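The steps above can be sketched in a few lines of Python for a deliberately simplified case with one input and one output, where the output-oriented VRS score can be found by enumeration instead of a linear programming solver. Everything in the block is an assumption made for illustration: the function names, the toy groups "A" and "B", and the reading of the non-concave metafrontier as the union of the group frontiers (a DMU's TE is its largest feasible score across the separate group frontiers). The study itself uses three inputs, three outputs and the deaR library.

```python
from itertools import combinations


def vrs_output_score(x0, y0, frontier):
    """Output-oriented VRS score of DMU (x0, y0) against a reference set.

    Valid for one input and one output only. Returns None when no
    reference point uses input <= x0 (the underlying LP is infeasible).
    """
    best = None
    # Single reference points that use no more input than the evaluated DMU.
    for x, y in frontier:
        if x <= x0:
            best = y if best is None else max(best, y)
    if best is None:
        return None
    # Convex combinations of two reference points that use exactly x0.
    for p, q in combinations(frontier, 2):
        low, high = (p, q) if p[0] <= q[0] else (q, p)
        if low[0] < x0 < high[0]:
            lam = (high[0] - x0) / (high[0] - low[0])
            best = max(best, lam * low[1] + (1 - lam) * high[1])
    return best / y0


def metafrontier_scores(groups):
    """Return {(group, index): (TE_K, TE, TGR)} for every DMU."""
    results = {}
    for name, dmus in groups.items():
        for idx, (x0, y0) in enumerate(dmus):
            te_k = vrs_output_score(x0, y0, dmus)  # step 2: own-group frontier
            # Step 3: compare against each group frontier separately, so that
            # no convex combination mixes DMUs from different groups.
            candidates = [vrs_output_score(x0, y0, ref) for ref in groups.values()]
            te = max(s for s in candidates if s is not None)
            results[(name, idx)] = (te_k, te, te / te_k)  # step 4: TGR = TE / TE_K
    return results


# Hypothetical toy data: (input, output) pairs for two groups.
groups = {
    "A": [(1, 2), (2, 4), (4, 5)],
    "B": [(1, 1), (3, 6), (4, 4)],
}
scores = metafrontier_scores(groups)
for key, (te_k, te, tgr) in sorted(scores.items()):
    print(key, round(te_k, 3), round(te, 3), round(tgr, 3))
```

Because each DMU is always compared with its own group as well, TE can never fall below the group score, so the ratio is at least 1 in this sketch.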

In this research, the proposed model has been solved with an output orientation and VRS, meaning that the resulting values are greater than or equal to 1, where the amount in excess of unity represents the amount by which the outputs should increase when using the available resources in order to achieve the maximum level of efficiency. The metafrontier approach has been applied to the context of universities by authors such as Wongchai et al. (2012), Villano and Tran (2019), Liu and Kuo (2020), and Agasisti et al. (2021), among others.

To construct the ranking, we apply the cross-efficiency method, which has been widely used in the literature (Martí et al., 2022; Puertas & Martí, 2023; Calafat-Marzal et al., 2023). This method enables a complete ranking of all the observations, avoiding the issue of equal values, which typically occurs with traditional DEA when several DMUs achieve maximum efficiency. It is carried out in two stages: in the first stage, self-evaluated efficiency is calculated for each DMU on the basis of its own set of optimal weights, while the second stage involves the comparison of peer-evaluated efficiency scores calculated based on the optimal weights of other DMUs (Liu et al., 2019). This yields a cross-efficiency matrix, in which each of the elements is calculated using the following expression (Sexton et al., 1986):

$${E}_{kj}=\frac{\sum_{r=1}^{s}{u}_{rk}{y}_{rj}}{\sum_{i=1}^{m}{v}_{ik}{x}_{ij}},\qquad j=1,\dots,n;\; k=1,\dots,n$$
(2)

where m and s represent the number of input and output variables, respectively; \({y}_{rj}\) the value of output r of the jth DMU; \({x}_{ij}\) the value of input i of the jth DMU; \({u}_{rk}\) the weight that DMU k assigns to output r; and \({v}_{ik}\) the weight that DMU k assigns to input i. Thus \({E}_{kj}\) is the efficiency of DMU j when evaluated with the weights of DMU k. The ranking is based on the average value of the scores obtained in each column of the matrix,

$${E}_{j}=\frac{1}{n}\sum_{k=1}^{n}{E}_{kj}\,\,\,\,\,\,\,\,j=1, \cdots , n$$
(3)

where \({E}_{j}\) is the average efficiency of the observation j. Both methods have been implemented with the deaR package for R, run in RStudio (Coll-Serrano et al., 2018).
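A minimal Python sketch of the second stage, Eqs. (2) and (3), is given below. It assumes that the stage-one weights are already available: the weights in the example are hand-picked values that respect the usual constraint that no DMU scores above 1, and they are not the output of the optimisation solved in the study. The function names and the three toy DMUs are likewise hypothetical.

```python
def cross_efficiency_matrix(inputs, outputs, v, u):
    """Build E[k][j] as in Eq. (2): DMU j evaluated with the weights of DMU k.

    inputs[j] and outputs[j] hold the input and output values of DMU j;
    v[k] and u[k] hold the input and output weights chosen by DMU k.
    """
    n = len(inputs)
    matrix = []
    for k in range(n):
        row = []
        for j in range(n):
            weighted_output = sum(w * y for w, y in zip(u[k], outputs[j]))
            weighted_input = sum(w * x for w, x in zip(v[k], inputs[j]))
            row.append(weighted_output / weighted_input)
        matrix.append(row)
    return matrix


def average_cross_efficiency(matrix):
    """Column averages as in Eq. (3): one score per evaluated DMU j."""
    n = len(matrix)
    return [sum(matrix[k][j] for k in range(n)) / n for j in range(n)]


def rank_dmus(scores):
    """Indices of the DMUs ordered from highest to lowest average score."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Hypothetical data: three DMUs, two inputs, one output.
inputs = [[2, 1], [1, 2], [2, 2]]
outputs = [[2], [1.5], [2]]
# Hypothetical weights, one row per evaluating DMU k.
v = [[0.0, 1.0], [1.0, 0.0], [0.25, 0.25]]
u = [[0.5], [2 / 3], [0.375]]

matrix = cross_efficiency_matrix(inputs, outputs, v, u)
avg = average_cross_efficiency(matrix)
order = rank_dmus(avg)
print([round(a, 4) for a in avg], order)
```

The diagonal of the matrix holds the self-evaluated scores, so two DMUs that both reach a self-evaluated score of 1 can still be separated by their column averages.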

Variables and data

The empirical analysis has been carried out on a sample of 3639 ARS from the UPV corresponding to the year 2020. All the detailed information for each observation has been provided by the UPV management; these are data used internally to calculate the research performance of each ARS. An agreement was signed with the university to ensure that personal data could be anonymized. In addition, the data were treated by subdividing them into four disciplines so that the information could be grouped, thereby maintaining the confidentiality of individual observations. Each of these disciplines is made up of different knowledge areas, as shown in Table 8 in the appendix. The selection of disciplines was based on the classification used by the Spanish Ministry of Universities for the knowledge areas to which teaching staff are assigned (Royal Decree 822/2021). The choice of the year of study was determined by the availability of the detailed data provided by the university, which have been filtered through a rigorous anonymization process. However, according to the aggregate information, the gender composition of the UPV staff has not changed substantially in recent years in a way that could materially alter the efficiency results (Table 2). The study provides a baseline for the evaluation of efficiency that could be applicable to medium-term studies rather than to two consecutive years, where changes tend not to be significant.

Table 2 Evolution of the gender composition of academic positions at the UPV (%)

To standardize the initial profile of the four UPV disciplines, we first had to remove all the staff who did not have a doctorate degree, leaving 2062 ARS; the breakdown by gender is shown in Fig. 2. In order to ensure that this structure does not distort the results, the scientists in each area have been replaced by the average values corresponding to both genders, thus neutralizing the greater presence of male scientists. Women represent only 30% of the UPV researchers. When calculating efficiency on an individual basis, they are compared with a group of mostly male scientists; hence, using the average values avoids distortions due to the differences in quantity. The UPV is an institution of recognized prestige as attested to by the Academic Ranking of World Universities (2022), where it ranks as the only Spanish polytechnic among the top 500 universities in the world. For its part, Times Higher Education (https://www.timeshighereducation.com) lists this HEI as one of the 300 universities with the greatest social and economic impact in the world, and also includes it in the top 100 for educational quality, innovation and infrastructure, and responsible production and consumption.
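The averaging step described above can be sketched as follows. This is only one possible reading of the procedure: each knowledge area is represented by one averaged observation per gender, so that a group with many members does not weigh more than a group with few. The record layout, the field names and the area label are hypothetical and do not come from the UPV database.

```python
from collections import defaultdict
from statistics import mean


def average_by_area_and_gender(records, fields):
    """Collapse individual records into one averaged DMU per (area, gender)."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[(rec["area"], rec["gender"])].append(rec)
    return {
        key: {field: mean(r[field] for r in members) for field in fields}
        for key, members in grouped.items()
    }


# Hypothetical individual records for a single knowledge area.
records = [
    {"area": "Area X", "gender": "M", "experience": 10, "research_outputs": 4},
    {"area": "Area X", "gender": "M", "experience": 20, "research_outputs": 6},
    {"area": "Area X", "gender": "M", "experience": 30, "research_outputs": 8},
    {"area": "Area X", "gender": "F", "experience": 12, "research_outputs": 5},
]
dmus = average_by_area_and_gender(records, ["experience", "research_outputs"])
print(dmus)
```

In this toy case three male records and one female record yield exactly two DMUs, one per gender, which is what keeps the comparison from being driven by headcount.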

Fig. 2 UPV disciplines and their gender distribution

For the efficiency analysis, we have to define the variables identified as inputs and outputs (Table 3). The inputs constitute the available resources, that is, the capacity of the ARS, while the outputs represent the results obtained during the year under study. The inputs of the model, sourced from the database, include the academic position of the doctoral-level professors (Academic position), the years of experience since the defence of their doctoral thesis (Experience), and a quality indicator based on the external peer accreditation of six-year terms of scientific or innovation activities (Recognized 6-year terms) by the National Commission for the Evaluation of Spanish Research Activity (NCESRA). NCESRA is a Spanish public body responsible for the promotion and quality of HEIs, a function performed through processes of orientation, evaluation, certification and accreditation of teaching work, study, knowledge transfer and research. As the functions of the university are multifaceted, its production model can be understood as being composed of several outputs that include academic production (Research outputs), transfer and innovation (Technological development and innovation), and dissemination activities (Knowledge dissemination).

Table 3 Inputs and outputs used in the efficiency models

Regarding the inputs, it should be clarified that the ARS categories have been converted into numerical scores to indicate the differences between them. Recognized 6-year terms, whether for knowledge transfer or for research, are recognition of the quality of the research and projects carried out by the ARS over a six-year period. This recognition is granted by NCESRA after an exhaustive evaluation of the six years chosen by the researcher, with the determining factors in the assessment being the quality of the journals where the work has been published and the volume of citations received. Furthermore, only articles published in journals indexed in the Journal Citation Reports (JCR) are taken into consideration. Table 4 shows the descriptive statistics of the variables.

On average, male ARS have more resources than their female counterparts in terms of the three inputs; in particular, they have longer research careers, with a differential of more than two years. Similarly, men’s output is higher on average, except in Knowledge dissemination, where women register slightly higher values. A similar pattern is repeated in the rest of the descriptive statistics, where it can be seen that the volume of male ARS drives the higher dispersion and maximum values of the sample. Therefore, in terms of performance, the information provided in Table 4 places men in a predominant position. Nevertheless, the objective of the proposed research goes beyond numerical data, instead centring around the concept of efficiency; that is, it seeks to discern which gender best maximizes its outputs using the available resources (inputs).

Table 4 Descriptive statistics of the variables (2020)

Results and discussion

Q1. Are there gender differences in research efficiency depending on the different disciplines?

To apply the metafrontier, we first need to check for DMUs with different profiles in each of the four disciplines at the UPV. The results of the Kruskal–Wallis test shown in Table 5 confirm the presence of such differences. Therefore, it is not possible to apply traditional DEA to the sample as a whole, given the significant differences in the variables for each discipline (p-value < 0.05).
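For readers unfamiliar with the test, the sketch below computes the Kruskal–Wallis H statistic by hand for four groups and compares it with 7.815, the 5% critical value of the chi-square distribution with three degrees of freedom. The data are invented for illustration, the function name is hypothetical, and the chi-square approximation is rough for samples this small; in practice a statistical package (for example `scipy.stats.kruskal`) would normally be used.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with average ranks and tie correction."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)
    rank_of = {}   # average rank of each distinct value
    tie_term = 0   # sum of t^3 - t over runs of t tied values
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_term = sum(
        sum(rank_of[value] for value in group) ** 2 / len(group)
        for group in groups
    )
    h = 12 / (n_total * (n_total + 1)) * rank_sum_term - 3 * (n_total + 1)
    # Tie correction (undefined if every observation is identical).
    return h / (1 - tie_term / (n_total ** 3 - n_total))


# Hypothetical values of one variable in four disciplines.
samples = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
h_stat = kruskal_wallis_h(samples)
CRITICAL_5PCT_DF3 = 7.815  # chi-square quantile, 3 degrees of freedom
reject_equal_profiles = h_stat > CRITICAL_5PCT_DF3
print(round(h_stat, 4), reject_equal_profiles)
```

A rejection, as in this toy case, is what motivates building separate group frontiers rather than a single pooled DEA frontier.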

Table 5 Kruskal–Wallis test to check for differences between the four disciplines

After confirming that there are four different research categories corresponding to each of the disciplines, an efficient production frontier has been constructed for each one as well as another that encompasses them all, using the non-concave metafrontier to prevent infeasible combinations. Table 6 shows the results separated into the following columns: \({TE}^{K}\), technical efficiency of group k; TE, technical efficiency with respect to the metafrontier; \({TGR}^{K}\), distance between the efficiency of group k and the metafrontier; EFF_TEK, percentage of efficient DMUs on the frontier of group k; EFF_TE, percentage of efficient DMUs on the metafrontier. Furthermore, \({TGR}^{K}\) represents the efficiency of the way of managing each of the analysed groups; the one registering the lowest value has the most appropriate management.

Table 6 Mean values of the efficiencies of the group (\({TE}^{K}\)), the metafrontier (TE) and the metatechnology ratio (\({TGR}^{K}\))

The efficiency of the university staff of each discipline compared with that of their group (\({TE}^{K}\)) reveals the noteworthy position of Arts & humanities with an inefficiency level of 3.6%, followed by Social sciences & law (12.3%), Engineering & architecture (20.6%) and Sciences (31.4%). These percentages measure how much the outputs need to increase using the available resources in order to achieve complete efficiency. According to Agasisti and Shibanova (2022), advanced staff management practices can help to increase publishing activity and institutional efficiency in general. In addition, the superior efficiency of women is noted in all cases except Engineering & architecture, a field traditionally dominated by men, at least in technological universities like the one in our case study. As established by the theory, all disciplines show a higher level of inefficiency with respect to the metafrontier (TE) than that found in the group; the increase is greater in disciplines where the inefficiency of the group is lower (Arts & humanities and Social sciences & law).

The following column shows that Engineering & architecture followed by Sciences are the disciplines showing the best management of their research (\({TGR}^{K}\)); they would only have to improve by 1.5% and 3.5%, respectively, compared to 6.8% and 19.3% for Social sciences & law and Arts & humanities, respectively. The UPV, as its name suggests, is a fundamentally technical university, and the vast majority of its ARS are engineers whose research projects receive substantial public and private grants, which could explain the better performance of Engineering & architecture compared to the rest. The last columns (EFF_TEK and EFF_TE) show the proportion of fully efficient DMUs both in group k and relative to the metafrontier, with the highest percentage of fully efficient ARS found in Social sciences & law (42.31% and 23.08%, respectively).

Another aspect worth highlighting is the greater inefficiency of men, except in Engineering & architecture, where women register worse results for both the group (19.4% for men versus 21.9% for women) and the metafrontier (22.0% versus 22.7%). However, women's way of managing research in Engineering & architecture is more appropriate (2.2% for men compared to 0.7% for women). In short, we can confirm the better relative position of women in the UPV compared to men, both in the level of efficiency in most disciplines and in the organization of research in all the disciplines analysed. These results are difficult to compare with the literature due to the bias in gender studies, almost all of which are focused on the humanities and social sciences (Silander et al., 2022). Some have analysed the gender gap in research performance in STEM disciplines, with the scales tipping in favour of male scientists (Cidlinská, 2019; Sarabi & Smith, 2023). The current reality is that men publish more and receive a higher volume of citations. However, our study shows that this does not translate into greater efficiency across the board, because productivity should be evaluated on the basis of similar resources, in our case measured by academic position (internally recognized quality), years of experience, or six-year terms (externally recognized quality).
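As a consistency check on the Engineering & architecture figures quoted above, the metatechnology ratio in Eq. (1) can be recomputed from the reported inefficiency percentages. The calculation assumes that an inefficiency of p% corresponds to an output-oriented score of 1 + p/100 and that the first figure in each pair refers to men; it is only approximate, because Table 6 reports mean values and a ratio of means need not equal the mean of the individual ratios.

$$\begin{aligned} {TGR}^{K}_{\text{men}} &= \frac{1.220}{1.194}\approx 1.022, \\ {TGR}^{K}_{\text{women}} &= \frac{1.227}{1.219}\approx 1.007 \end{aligned}$$

Under these assumptions the recomputed ratios match the reported 2.2% and 0.7% to rounding, which illustrates how a group can be less efficient overall and still sit closer to the metafrontier.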

The STEM disciplines are mainly populated by male ARS, whereas men have a much smaller presence in social and political sciences. However, according to Abramo et al. (2021), this does not determine their performance. Focusing on life sciences, Lerchenmueller and Sorenson (2018) show that the gender gap emerges early in researchers' careers, which turns out to be key for subsequent outcomes: women become principal investigators on research projects at a 20% lower rate than men. For their part, Casad et al. (2021) indicate that the lack of progress towards gender parity in STEM is due to discrimination in hiring and reduced opportunities for women's career advancement.

Gender equality has begun to go beyond its intrinsic value and is acquiring a critical instrumental value as a way to achieve other objectives (Silander, 2019). These days, the struggle for the empowerment of women is a reality not only at the national level, but also in the international objectives established to ensure global sustainability. Numerous studies confirm that the neoliberal university environment is having a negative effect on the existing gap in HEIs (Rosa, 2022; Tzanakou & Pearce, 2019). Public policies should seek to bolster and properly value the position of women in the academic world, granting direct assistance to enhance their work and support their inclusion in an environment traditionally hostile to female scientists.

Q2. Who achieves the best results in the knowledge areas that make up each discipline?

To produce a ranking of the knowledge areas, the cross-efficiency method is applied to each of the disciplines. The ranking lists these areas in order, noting the corresponding most efficient gender, with the aim of identifying the position of women scientists in the different fields analysed. Table 7 shows the top 10 positions of the resulting ranking.
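The cross-efficiency idea behind such a ranking can be illustrated with a minimal sketch. It assumes a basic input-oriented CCR multiplier model solved with `scipy.optimize.linprog`, and it averages each column of the cross-efficiency matrix (self-appraisal included) to obtain a score; the model actually used in the study may differ in orientation, returns to scale, or the treatment of alternative optimal weights. The data are hypothetical (four units, two inputs, one output) and the function names are illustrative.

```python
import numpy as np
from scipy.optimize import linprog


def ccr_weights(X, Y, d):
    """Solve the input-oriented CCR multiplier LP for unit d.

    X: (n, m) strictly positive inputs, Y: (n, s) strictly positive outputs.
    Returns (u, v): the output and input weights chosen by unit d.
    """
    n, m = X.shape
    s = Y.shape[1]
    # Decision vector is [u (s outputs), v (m inputs)].
    # linprog minimises, so the weighted output of unit d is negated.
    c = np.concatenate([-Y[d], np.zeros(m)])
    A_ub = np.hstack([Y, -X])  # u.y_j - v.x_j <= 0 for every unit j
    b_ub = np.zeros(n)
    A_eq = np.concatenate([np.zeros(s), X[d]]).reshape(1, -1)  # v.x_d = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0], method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:s], res.x[s:]


def cross_efficiency(X, Y):
    """Return the n x n matrix E, where E[d, j] rates unit j with unit d's weights."""
    n = X.shape[0]
    E = np.empty((n, n))
    for d in range(n):
        u, v = ccr_weights(X, Y, d)
        E[d] = (Y @ u) / (X @ v)
    return E


# Hypothetical data: units 0 and 1 use few resources, unit 2 uses more,
# and unit 3 uses the most, all for the same output.
X = np.array([[2.0, 3.0], [3.0, 2.0], [4.0, 4.0], [5.0, 6.0]])
Y = np.array([[10.0], [10.0], [10.0], [10.0]])

E = cross_efficiency(X, Y)
scores = E.mean(axis=0)        # average rating received by each unit
ranking = np.argsort(-scores)  # unit indices from best to worst

print(np.round(E, 3))
print(np.round(scores, 3))
print(ranking)
```

The diagonal of `E` holds each unit's own CCR efficiency, while the column means reward units that perform well under the weights preferred by their peers. Because optimal weights need not be unique, cross-efficiency scores can vary across solvers unless a secondary goal is imposed, which is one reason to treat this sketch as illustrative only.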

Table 7 Ranking of the knowledge areas of each discipline

The rankings obtained contradict the conclusions of Abramo et al. (2021), who find a much higher proportion of men among the top 10% of performing scientists. As can be seen in Table 7, men and women appear in very similar proportions in the top positions. Even in Engineering & architecture, where there is a much larger presence of male ARS, female researchers still hold important positions. In short, it can be seen that the underrepresentation of women at university does not harm their results, once again contradicting the existing literature on performance. According to Casad et al. (2021), progress towards gender equality in STEM is very slow, particularly when it comes to management positions in universities. There are some fairly widespread negative stereotypes that hinder progress towards parity. Even in the United Arab Emirates, where the position of women is problematic to say the least, Patterson et al. (2021) confirm a decreasing trend in gender discrimination, although there are still very few articles published by women.

This underrepresentation has its origins in the lower graduation rates of women in certain STEM disciplines. A study conducted in Australia revealed that learning preferences, the masculine culture in these areas, and scientific identity could be behind these results (Fisher et al., 2020). Spanish HEIs have promoted the introduction of various initiatives aimed at raising the visibility of female scientists, thus seeking to prevent androcentrism in various fields of science and engineering. These include summer schools on physics and gender, asynchronous virtual courses on mathematical co-education, and certain pilot projects, all of which have been developed at universities in Barcelona. The goal is to break the power structures that have such a strong hold on the scientific community (Calvo-Iglesias et al., 2022). According to Greider et al. (2019), there are still a number of social and cultural factors deeply rooted in society that hinder the advancement of women's research careers, with the distribution of domestic work being a prominent issue.

The results obtained reflect the better position of women in terms of efficiency, despite their lesser presence in almost all the disciplines analysed. For decades they have had to work doubly hard to make their way in this unforgiving terrain—simply because they are women, not because of their proficiency.

Conclusions

Advances in science are happening continuously in response to the growing needs of society, making the research carried out by HEIs particularly relevant. It is universities that bring together the largest number of scientists, dedicated not only to the transmission of knowledge, but also to scholarship and to providing results to a society that needs an inexhaustible flow of innovation to meet the demands of a population thirsting for progress. However, there is a blight that we have carried with us since humanity's earliest days and that has yet to be overcome: the gender gap at all social and scientific levels continues to be a reality that limits the development of female researchers.

The scientific community has produced a body of literature aimed at measuring research performance, providing a gender perspective to identify the source of differences, but reporting contradictory results that prevent accurate conclusions. In this study, we replace the term “performance” with “efficiency” in order to assess the capacity of women in STEM disciplines—fields with a longstanding masculine tradition. The empirical analysis conducted here reveals that, using the available resources, women scientists obtain slightly better results than men, meaning they are able to manage their research more effectively. The results on performance reported by other studies, in cases where men far outperform women, could be corrected by granting more resources to female researchers, as such resources are sometimes limited by the smaller volume of scientific output.

Female ARS are in a bind from which it is difficult to free themselves without help. They need support policies that allow them to break out and demonstrate their ability to a world notoriously dominated by men. In this research, the efficiency of female scientists is clearly demonstrated. However, their prestige is overshadowed by parameters that cannot capture the existing reality. They need access to resources similar to those received by male researchers; however, these resources are currently beyond their reach because the evaluation relies on certain items of scientific production that do not reflect the value of their research work, a situation that prevents them from climbing the scientific career ladder.

This paper suffers from the limitations typical of efficiency studies. (1) We do not have access to individualized information on the hours spent teaching. We believe that the conclusions could be reinforced by including the teaching load of the evaluated ARS as an “undesirable input”, as it limits their dedication to research projects. In Spanish universities, teaching is assigned according to one's professional category; since men occupy the most prestigious positions, women ARS have a higher volume of teaching work, adversely affecting their research career. (2) The analysis refers to a single period, due to the procedural demands and limitations posed by the anonymized information provided by the UPV. We have checked that the gender composition did not undergo significant changes in previous years (2017–2019) that could have brought about changes in efficiency; to detect such changes, it would be necessary to go back further in time. (3) It would be interesting to incorporate the funds received by each researcher into the assessment. This is a very “sensitive” variable, and the UPV management considers this to be private information. However, the perception is that introducing this information would further strengthen the results obtained.