Introduction

Gender biases in medicine and health care are pervasive and are one of the key drivers of health-related inequities (Hamberg 2008; Humphries et al. 2017; Mosca et al. 2011; Poinhos 2011; Wei et al. 2017). Some authors have argued that increasing health care professionals’ gender awareness, i.e., positive attitudes towards considering sex and gender issues in health and illness and the knowledge and skills necessary to incorporate them into medical practice, may help promote gender equity in health care (e.g. Verdonk et al. 2009). Developing reliable and valid measures of gender awareness is the cornerstone for empirically supporting the contention that increasing physicians’ gender awareness will help prevent gender biases in health care and, ultimately, for assessing the effectiveness of intervention programs aimed at increasing health care professionals’ gender awareness (e.g., Dielissen et al. 2014; Eisenberg et al. 2013). The general goal of this paper is to contribute to this endeavor by adapting and validating for the Portuguese population one of the main measures developed so far to assess health care professionals’ gender awareness: the Nijmegen Gender Awareness in Medicine Scale (N-GAMS; Verdonk et al. 2008). In doing so, we also seek to further validate this measure of gender awareness in medicine.

Measuring gender awareness in health care

Since gender awareness in health care was conceptualized for the first time (Miller et al. 1999), several measures have been used to operationalize it. These measures are very different in their characteristics and in what they intend to assess, in part reflecting an ongoing debate surrounding the gender awareness construct (e.g., Miller et al. 1999; Khoury and Weisman 2002; Verdonk et al. 2009). However, to the best of our knowledge, so far only two scales have been developed and validated to provide a theoretically grounded, multi-dimensional assessment of health care professionals’ gender awareness: (1) the Gender Awareness Inventory—Veterans Administration (GAI-VA, Salgado et al. 2002) and (2) the Nijmegen Gender Awareness in Medicine Scale (N-GAMS, Verdonk et al. 2008).

Drawing upon Miller et al.’s (1999) Model of Gender Awareness, the GAI-VA was developed and validated for the United States of America veteran population, where women are a minority in a context traditionally marked by men. The GAI-VA assesses health care professionals’: (1) gender sensitivity, i.e., the degree to which they are aware of and sympathetic towards the needs and requirements of female veteran patients; (2) gender ideology, i.e., their attitudes towards these patients and (3) knowledge, i.e., accurate information about these patients and their specific needs. Despite its reasonable psychometric properties (Salgado et al. 2002), its specific focus on female veteran patients hampers a more generalized use of the measure to assess health care professionals’ gender awareness towards both women and men in other health care contexts.

The N-GAMS overcomes this limitation by assessing medical students’ gender awareness towards male and female patients in general, and expanding it towards male and female physicians. Its original validation study (Verdonk et al. 2008) suggests that it assesses three dimensions: (1) gender sensitivity, i.e., the extent to which medical students are sensitive and sympathetic to the impact of gender in medical practice (14 items); (2) gender-role ideology towards patients, i.e., medical students’ stereotypical views towards male and female patients (11 items) and (3) gender-role ideology towards doctors, i.e., medical students’ stereotypical views towards male and female doctors (8 items). All sub-scales showed good reliability (alphas equal to or above .80). Findings also suggested good criteria-related validity. Indeed, as hypothesized: (1) as compared to male medical students, female students held fewer gender stereotypes towards patients and doctors and (2) patient centeredness, i.e., being more involved in psychological issues and holding more open, empathic and democratic attitudes, was positively associated with gender sensitivity among male and female medical students and negatively associated with gender-role ideologies towards patients only among female medical students. In sum, previous findings suggest that the N-GAMS may be a reasonably good measure of (future) physicians’ gender awareness.

Since its development, the N-GAMS has been used to: (a) assess and compare Dutch and Swedish medical students’ gender awareness (Andersson et al. 2012); (b) assess the effect of an intervention program about female reproduction, clinical practices of gynecology and obstetrics, and other women’s health-related issues on medical students’ levels of gender awareness (Eisenberg et al. 2013); and (c) compare differences in General Practitioner trainees’ gender awareness following different gender medicine programs (Dielissen et al. 2014). These studies emphasize the relevance and applicability of this scale in several contexts, namely, to assess cultural differences in gender awareness and the efficacy of training programs focused on increasing gender awareness.

Study aims and hypotheses

The main goal of the present study is to adapt and validate the N-GAMS to the Portuguese population. As far as we know, there are currently no validated instruments to assess Portuguese (future) health care professionals’ gender awareness. While pursuing this objective, we also aimed to further validate the N-GAMS, by addressing some limitations of its original study (Verdonk et al. 2008). First, regarding the N-GAMS construct validity, the three-fold underlying structure of the scale was never tested, as the original study only presented the results of a principal component analysis. Therefore, our first goal was to test the underlying 3-factor structure found by Verdonk et al. (2008), where gender awareness is composed of gender sensitivity and two correlated factors, i.e., gender-role ideology towards patients and gender-role ideology towards doctors (model 1). This model was tested against two alternative models: (1) gender awareness as a unique and first-order factor (model 2) and (2) gender awareness as a second-order factor with gender sensitivity and gender-role ideology as first-order factors (model 3). We hypothesized that model 1 would show a better fit to the data than models 2 and 3 (Hypothesis 1). Also, and in line with the results of Verdonk et al. (2008), we expected that gender-role ideology towards patients would be positively correlated with gender-role ideology towards doctors (Hypothesis 2), providing empirical support to N-GAMS construct validity.

Second, we aimed at extending the study of the measure’s criteria-related validity, as in the original study it was only tested against the following criteria: students’ sex and patient centeredness. As such, we aimed to assess the relationship between gender awareness and physician empathy, sexism and years of medical education. Empathy has been an important construct in the context of patient care, generally defined as the ability of physicians to understand patients’ emotions and perspectives, expressing their care and concerns about them (Hojat et al. 2003). It is our contention that such empathic ability may be positively associated with doctors’ sensitivity to the impact of gender in medical practice, as both constructs require perspective taking skills. Therefore, we hypothesized that medical students’ empathy would be positively correlated with their gender sensitivity (Hypothesis 3.1). Conversely, we expected that more empathic medical students would uphold less stereotypical views of both patients and doctors (Hypothesis 3.2).

Also, as gender-role ideologies assess individuals’ adherence to stereotypical views of patients and doctors, these constructs may to some extent be associated with sexism. Two types of sexist attitudes have been identified in the literature: hostile and benevolent sexism (Glick and Fiske 1996). Hostile sexism reflects hostility towards women, whereas benevolent sexism reflects a stereotypical attitude towards women that is subjectively positive in feeling tone (for the observer) and includes behaviors typically categorized as prosocial or intimate. Therefore, our hypothesis (Hypothesis 4.1) was that sexism (hostile and benevolent) would be positively correlated with gender-role ideologies, but, in turn, negatively associated with gender sensitivity (Hypothesis 4.2).

As for years of medical education, on one hand, we could expect that, given the still dominant biomedical model in medical training (Engel 1977), more years of education would make medical students less aware of psychosocial influences and diversity issues, hence, decreasing their gender awareness. On the other hand, previous studies showed that older medical students showed higher gender awareness (Andersson et al. 2012), which may in part be due to a role played by medical education. Given these conflicting expectations, our aim was to explore the relationship between years of medical education and gender awareness.

Finally, we intended to replicate the hypothesis postulated by Verdonk et al. (2008) regarding sex-related differences in gender awareness; namely, we expected that female students would have higher levels of gender sensitivity and lower levels of gender-role ideologies as compared to male students (hypothesis 5).

Method

Participants

This study was conducted with a convenience sample of 1048 medical students (67.1% women; 27.1% men and 5.8% did not mention their sex) from 8 Portuguese medical schools. The female/male proportion of medical students in our sample was similar to the female (65.9%)/male (34.1%) proportion of students enrolled in Portuguese medical schools in the year the data collection took place (2016; PORDATA 2019). Their ages ranged from 18 to 55 years (M = 22.90; SD = 4.38). Participants were attending different course years: first year (12%), second year (14.6%), third year (18.1%), fourth year (19.1%), fifth year (17.4%) and sixth year (18.8%); on average, they reported 3.72 years (SD = 1.64) of medical education. Although 39.4% of the students did not yet know which medical specialty they would like to pursue, some pointed to surgery (14.9%) and internal medicine (11.1%) as their preferred medical specialties. Most students (99.29%) had Portuguese nationality, were single (92.3%) and did not have any children (97.2%). Most students’ fathers (79.7%) and mothers (83.2%) had a paid professional activity. Also, 45% of students’ fathers and 54.2% of their mothers had a higher education degree (e.g., bachelor, master). There were no differences between female and male participants regarding the majority of sociodemographic characteristics. However, male participants were significantly older (M = 23.42, SD = 5.24) than female participants (M = 22.69, SD = 3.97; t(984) = 2.377, p = .018).

Instruments

The Nijmegen Gender Awareness in Medicine Scale (N-GAMS)

To adapt and validate the N-GAMS to the Portuguese population we followed international guidelines for the adaptation and cross-cultural validation of instruments for measuring psychological constructs (Beaton et al. 2000; Guillemin et al. 1993). Two bilingual researchers, familiarized with the N-GAMS, and one bilingual researcher not familiarized with it were asked to independently translate the instrument from English to Portuguese. The three translations were compared to achieve a final consensual translation. The final translation was sent to a bilingual professional translator to perform the back translation, which was then compared with the English version of the N-GAMS for semantic equivalence. Small changes were made to linguistic expressions as to facilitate their understanding in Portuguese. Finally, the instructions were slightly adapted to an online questionnaire. Participants were asked to rate the extent to which they agreed with each item on a scale ranging from 1 (Totally disagree) to 5 (Totally agree).

Jefferson scale of physician empathy: students version (JSPE-spv)

Medical students were asked to fill out the Portuguese version of the JSPE-spv (Aguiar et al. 2009; Magalhães et al. 2011). This measure was used to assess physician empathy as to support N-GAMS criteria-related validity. The Portuguese version of the JSPE-spv is a reliable (α > .76), valid and stable instrument.

The JSPE-spv is composed of 20 items, answered on a Likert scale ranging from 1 (totally disagree) to 7 (totally agree), which assess 3 dimensions of physician empathy: (1) perspective taking (10 items, e.g. physicians should try to think like their patients in order to render better care); (2) compassionate care (8 items, e.g. I believe that emotions have no place in the treatment of medical illness; reversed) and (3) standing in the patient’s shoes (2 items, e.g. because people are too different it is difficult to see things from patients’ perspective; reversed). To assess some of the psychometric properties of this instrument in our sample, a principal axis factoring analysis (orthogonal rotation) was conducted [KMO = .899; Bartlett’s χ2 (171) = 5372.202, p < 0.001]. One item, “physicians should not allow themselves to be influenced by strong personal bonds between their patients and their family members”, was eliminated because it loaded on one separate factor and had the lowest communality (.073). Based on the Kaiser criterion, three factors were extracted, accounting for 46.53% of the total variance: (1) compassionate care (n = 7 items, α = .739), (2) perspective taking (n = 10 items, α = .673) and (3) standing in patient’s shoes (n = 2 items; rsb = .767). A Confirmatory Factor Analysis (CFA) showed satisfactory fit indexes for this factorial structure model (χ2 [149] = 484.071, p < 0.001; CFI = 0.936; NFI = 0.911; IFI = 0.936; RMSEA = 0.047). It should be noted, however, that only the perspective taking and compassionate care subscales (r = .496, p < .001) were used in the analyses since they are the dimensions which explain the majority of variance, making the third dimension “standing in patient’s shoes” a residual two-item factor. Scores on the perspective-taking and compassionate care subscales were computed by calculating the average of their respective items; higher scores indicate higher perspective-taking and compassionate care.
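The reliability coefficients reported for these subscales can be illustrated with a minimal sketch. It assumes that α is Cronbach's alpha computed from item and total-score variances and that rsb denotes the Spearman-Brown coefficient for a two-item scale; the function names and the toy data are hypothetical and are not the study's data or its SPSS procedure.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one inner list of scores per item, with respondents
    in the same order in every inner list (hypothetical layout).
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


def spearman_brown(r):
    """Reliability of a two-item scale from the inter-item correlation r."""
    return 2 * r / (1 + r)


# Toy example: two items answered by five respondents.
toy_items = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6]]
alpha = cronbach_alpha(toy_items)
two_item_reliability = spearman_brown(0.5)
```

For two-item factors such as “standing in patient’s shoes”, the Spearman-Brown coefficient is often preferred over alpha, which is presumably why a different index is reported for that factor.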

Ambivalent Sexism Inventory (ASI)

The ASI was also used to assess sexism in order to support N-GAMS concurrent validity. The Portuguese version of the ASI (Costa et al. 2015) is composed of 22 items that assess sexist attitudes towards women. Eleven items assess hostile sexism and 11 items assess benevolent sexism. The items were answered on a Likert scale from 1 (totally disagree) to 5 (totally agree). To avoid an excessively lengthy data collection protocol, we sought to reduce the number of items to 14; 7 of hostile sexism (e.g. most women interpret innocent remarks as being sexist) and 7 of benevolent sexism (e.g. women should be cherished and protected by men). The items kept in the present study were the ones that presented the highest factorial loadings in Costa et al.’s study.

To assess some of the psychometric properties of this instrument in our sample, a principal axis factoring analysis (oblimin rotation) was conducted [KMO = .901; Bartlett’s χ2 (78) = 4423.513, p <0.001]. The item “in a disaster, women need not to be rescued first” was previously eliminated because it had a difference below .30 between the loadings on at least two factors and the lowest communality (.116). Based on the Kaiser criterion, two factors were extracted, accounting for 51.30% of the total variance: (1) hostile sexism (n = 7 items, α = .868) and (2) benevolent sexism (n = 6 items, α = .752), which were significantly correlated (r = .509; p < .001). CFA showed satisfactory fit indexes for this factorial structure model [χ2 (64) = 397.197, p < 0.001; CFI = 0.924; NFI = 0.911; IFI = 0.924; RMSEA = 0.074].

Procedure

This study was carried out online using Qualtrics software (Qualtrics, Provo, UT) and following the ethical and deontological guidelines of ISCTE-Instituto Universitário de Lisboa (ISCTE-IUL) and the Portuguese Board of Psychologists (Ordem dos Psicólogos Portugueses 2011). First, we asked the Boards of all Medical Schools in Portugal (eight schools) for permission to conduct an online study about gender issues in Medicine. The Boards of every school approved the data collection protocol and one person from the administrative staff at each school was responsible for distributing the online protocol among students, through their institutional e-mails. To increase the sample size, the first author sent weekly reminders to the staff members responsible for distributing the protocol over a period of 2 months (between February and April of 2016).

Participants were invited to collaborate on a study about gender issues in Medicine. Participation was voluntary, and students were assured that their responses were anonymized and treated confidentially. The protocol included the questionnaires in the following order: the N-GAMS, the JSPE-spv, the ASI and, finally, a set of sociodemographic questions. The online protocol took an average of 10 min to complete; participants who spent < 5 min filling it out were excluded from the final sample (n = 39). Two 50€ vouchers were randomly allotted among participants.

Data analysis

Data were analyzed with version 23 of IBM SPSS (IBM Corp. 2015) and IBM AMOS (Arbuckle 2014). First, N-GAMS item distributions for the total sample (n = 1048) were analyzed. Before starting the analyses, all the items of the gender sensitivity subscale were reversed except items GS-1, GS-2 and GS-13 (duly marked in Table 1). Afterwards, we ran a Parallel Analysis (O’Connor 2000), commonly used to determine the number of components to retain in an Exploratory Factor Analysis (EFA). Then, we ran a Principal Axis Factoring (PAF) analysis with oblique rotation with all N-GAMS items in a random subsample of about half of the original sample (n = 509). Items with a difference below .30 between the loadings on at least two factors, with the lowest communalities (< .20) and higher levels of asymmetry (skewness/SE skewness > |2.0|) were gradually eliminated.
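Two of the preparatory steps described above, reversing items on the 1 to 5 response scale and screening items by the ratio of skewness to its standard error, can be sketched as follows. This is an illustration under stated assumptions rather than the SPSS procedure used in the study: it assumes the bias-adjusted sample skewness and its usual standard error formula, and the function names and toy responses are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def reverse_code(score, low=1, high=5):
    """Reverse a response on a low..high scale (1 <-> 5, 2 <-> 4, 3 stays 3)."""
    return low + high - score


def skewness_ratio(xs):
    """Adjusted sample skewness divided by its standard error.

    Absolute values above 2.0 would flag an item as markedly asymmetric
    under the criterion described in the text.
    """
    n = len(xs)
    if n < 3:
        raise ValueError("skewness needs at least three observations")
    m, s = mean(xs), stdev(xs)
    # Bias-adjusted sample skewness (assumed variant).
    g1 = n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in xs)
    # Standard error of skewness for a sample of size n.
    se = sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return g1 / se


# Toy responses to a single hypothetical item.
raw = [1, 1, 2, 1, 5]
reversed_item = [reverse_code(x) for x in raw]
ratio = skewness_ratio(raw)
```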

Table 1 Descriptive statistics for the total sample (n = 1048) and exploratory factor analysis for the random subsample (n = 509) of the N-GAMS.pt

Then, a CFA was performed using maximum likelihood estimation with the second random subsample (n = 539) with no missing data. The CFA was run to test the hypothesized three-fold factorial structure of the Portuguese version of the N-GAMS (henceforth N-GAMS.pt) against two other alternative models: (1) gender awareness as a unique and first-order factor and (2) gender awareness as a second-order factor with gender sensitivity and gender-role ideology as first-order factors. The latent variables’ variance was constrained to one and correlated errors were kept fixed, observed variables were free, and the degree of freedom was greater than zero. Multiple fit indexes were chosen reflecting different features of model fit. Criteria for a good fit were established following the guidelines by Hu and Bentler (1999), Maroco (2010) and Schermelleh-Engel et al. (2003). Given that most N-GAMS.pt items did not present a normal distribution, a nonparametric method (bootstrap) with 5000 subsamples was used to validate the previously obtained results.
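The logic of checking a parametric estimate against bootstrap resamples can be sketched in a few lines. The example below is a simplified illustration, not the AMOS bootstrap applied to the CFA: it resamples respondents with replacement, recomputes a simple statistic (here the mean, standing in for a model parameter), and reports the bias as the difference between the mean of the bootstrap estimates and the original estimate. The function name, the seed and the toy data are assumptions made for illustration.

```python
import random
from statistics import mean


def bootstrap_bias(data, statistic, n_resamples=5000, seed=0):
    """Return (original estimate, mean bootstrap estimate, bias)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    original = statistic(data)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the observed sample.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    boot_mean = mean(estimates)
    return original, boot_mean, boot_mean - original


# Toy scores on a 1-5 scale (not the study's data).
toy_scores = [1, 2, 3, 4, 5] * 20
original, boot_mean, bias = bootstrap_bias(toy_scores, mean, n_resamples=1000)
```

A bias close to zero indicates that the bootstrap and the original estimate agree, which is the kind of comparison summarized for the CFA estimates.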

N-GAMS.pt factors were then obtained by calculating the average of their respective items; the higher the scores the higher medical students’ gender sensitivity and adherence to gender-role ideologies. Afterwards, we investigated the relationship between these factors and sociodemographic characteristics (i.e., age, preferred medical specialty, father’s professional situation, mother’s professional situation, father’s education level, mother’s education level and number of children). No significant relationships were found. In line with the Central Limit Theorem, and given the large sample size, we used Pearson correlations to investigate the relationship between the three gender awareness dimensions to test N-GAMS.pt construct validity (hypothesis 2), and between gender awareness dimensions, physician empathy (hypotheses 3.1 and 3.2), sexism dimensions (hypotheses 4.1 and 4.2) and years of medical education to test N-GAMS.pt criteria-related validity. Also to investigate N-GAMS.pt criteria-related validity, we performed a one-way MANOVA to analyze sex-related differences in gender-role ideology towards patients and doctors and a t test to analyze sex-related differences in gender sensitivity (hypothesis 5).
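The scoring and correlation steps described above can be sketched as follows. This is a minimal illustration, assuming each respondent's answers are stored in a dictionary keyed by item code; the item codes, the helper names and the toy respondents are hypothetical, and the study itself ran these analyses in SPSS.

```python
from math import sqrt
from statistics import mean


def subscale_score(responses, item_codes):
    """Average of one respondent's answers to the items of a subscale."""
    return mean(responses[code] for code in item_codes)


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two score lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical item codes for two subscales.
PATIENT_ITEMS = ["GRIP_1", "GRIP_2"]
DOCTOR_ITEMS = ["GRID_1", "GRID_2"]

# Three toy respondents (not the study's data).
sample = [
    {"GRIP_1": 1, "GRIP_2": 2, "GRID_1": 1, "GRID_2": 1},
    {"GRIP_1": 3, "GRIP_2": 3, "GRID_1": 2, "GRID_2": 2},
    {"GRIP_1": 4, "GRIP_2": 5, "GRID_1": 3, "GRID_2": 4},
]
patient_scores = [subscale_score(r, PATIENT_ITEMS) for r in sample]
doctor_scores = [subscale_score(r, DOCTOR_ITEMS) for r in sample]
r_patient_doctor = pearson_r(patient_scores, doctor_scores)
```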

Results

Descriptive analysis of N-GAMS items

As shown in Table 1, participants’ answers covered the entire scale range (min = 1 and max = 5) for almost every item. The means ranged between 1.54 and 4.39. Most of the items did not present a normal distribution, especially the items of the gender sensitivity subscale, showing high levels of Skewness (Skewness/SE of Skewness > |2.0|) and Kurtosis (Kurtosis/SE of Kurtosis > |2.0|).

Construct validity

Parallel analysis and exploratory factor analysis

A parallel analysis was conducted suggesting that only factors with eigenvalue of 1 or more should be retained, corroborating a 3-factor structure. As for the EFA, the sampling adequacy was guaranteed [KMO = .890; Bartlett’s χ2 (153) = 3002.829, p < .001]. Based on the Kaiser criterion, three factors were extracted accounting for 52.15% of the total variance: (1) gender sensitivity (n = 6 items), (2) gender-role ideology towards patients (n = 7 items) and (3) gender-role ideology towards doctors (n = 5 items).
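As a quick consistency check on the reported statistics, the degrees of freedom of Bartlett's test of sphericity depend only on the number of items p entered in the analysis. With the 18 retained items (6 + 7 + 5), the value matches the one reported for the final EFA solution:

```latex
\[
df = \frac{p(p-1)}{2} = \frac{18 \times 17}{2} = 153
\]
```

The same formula gives 171 for the 19 JSPE-spv items and 78 for the 13 ASI items, in line with the values reported in the Instruments section.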

It should be noted that 15 items were removed from the final EFA solution due to their low communalities and/or high cross loadings: eight from the gender sensitivity subscale, four from the gender-role ideology towards patients subscale and three from the gender-role ideology towards doctors subscale. All eliminated items are presented in Table 1.

Regarding the correlations between N-GAMS.pt factors, gender-role ideology towards patients and gender-role ideology towards doctors were positively correlated (r = .570; p < .001; n = 1048). Also, gender sensitivity showed a negative and very weak correlation with gender-role ideology towards doctors (r = − .079; p = .010; n = 1048). However, no significant correlations were found between gender sensitivity and gender-role ideology towards patients.

Confirmatory factor analysis

The first model tested was the one obtained from the previous EFA, i.e., the 3-factor model (hypothesized model; Fig. 1), with gender-role ideology-patients and gender-role ideology-doctors correlated.

Fig. 1 Confirmatory factor analysis of the N-GAMS.pt (n = 539). Note: see Table 1 for correspondence between item codes and full items

As shown in Table 2, the hypothesized model showed better fit indexes than the two alternative models, neither of which improved the fit to the data. The second alternative model, i.e., gender awareness as a second-order factor with gender sensitivity and gender-role ideology as first-order factors, showed the worst fit to the data.

Table 2 Fit indexes comparison between hypothesized and alternative models (n = 539)

Because of the underlying non-normality of the items, a nonparametric method (bootstrap) was subsequently used to validate the results obtained by the parametric method (maximum likelihood). As can be seen in Table 3, the bias between the 2 methods was minimal, showing that the difference between the results obtained by the parametric and nonparametric methods is almost nonexistent.

Table 3 Comparison of estimates obtained from maximum likelihood and bootstrap methods (n = 539)

N-GAMS.pt reliability and sensitivity

Participants’ gender sensitivity (n = 1048) scores covered the full scale range, from 1 to 5. On average, participants presented moderate-to-high levels of gender sensitivity (M = 3.91, SD = .60). This factor presented a negatively skewed (− 8.90) and leptokurtic (7.69) distribution and good internal reliability (αGS = .713).

Participants’ gender-role ideology towards patients scores ranged from 1 to 4.43 and, on average, participants held low-to-moderate gender stereotypical views of patients (M = 2.54, SD = .73). This factor presented a symmetric (− 3.05) and platykurtic (− 2.84) distribution and showed very good internal reliability (αGRI-p = .858).

Finally, participants’ gender-role ideology towards doctors scores ranged from 1 to 3.80 and, on average, participants held weak gender stereotypical views of doctors (M = 1.79, SD = .60). This factor presented a positively skewed (6.73) and platykurtic (− 2.21) distribution and showed very good internal reliability (αGRI-d = .837).

Criteria-related validity

Physician empathy and gender awareness

Gender sensitivity presented a very weak and positive correlation with perspective taking and a weak and positive correlation with compassionate care (see Evans 1996 for reference values of correlation size). Gender-role ideology towards patients was very weakly and negatively correlated with compassionate care. Also, gender-role ideology towards doctors was very weakly and negatively correlated with perspective taking and weakly and negatively correlated with compassionate care (see Table 4).

Table 4 Pearson correlations between gender awareness, years of medical education and physician empathy and sexism (n = 1048)

Sexism and gender awareness

Gender sensitivity and hostile sexism were very weakly and negatively correlated. Gender-role ideology towards patients and doctors presented moderate and positive correlations with hostile and benevolent sexism (see Table 4). No significant correlation was found between gender sensitivity and benevolent sexism.

Years of medical education and gender awareness

Years of medical education presented a very weak and positive correlation with gender sensitivity and a very weak and negative correlation with gender-role ideology towards patients and towards doctors (see Table 4).

Sex-related differences in gender awareness

Multivariate tests showed significant sex-related differences in gender-role ideologies [F(2,984) = 9.616; p < .001]. More specifically, significant sex-related differences were found in gender-role ideology towards patients and doctors. As compared to male students, female students held slightly fewer gender stereotypes towards patients [Mfemales = 2.51, SD females = .72; Mmales = 2.62, SD males = .75; F(1,985) = 4.674; p = .031] and doctors [Mfemales = 1.74, SD females = .55; Mmales = 1.92, SD males = .68; F(1,985) = 19.131; p < .001]. No significant sex-related differences were found in gender sensitivity [t(985) = 1.024; p = .306].

Discussion

This study aimed to adapt and validate the Nijmegen Gender Awareness in Medicine Scale (N-GAMS; Verdonk et al. 2008) to the Portuguese population, while also addressing some limitations of its original study by testing its underlying 3-factor structure and further assessing its criteria-related validity. To achieve this goal, a large sample of medical students from all medical schools in Portugal was used.

N-GAMS.pt construct validity and reliability

Our first goal was to test the hypothesized N-GAMS.pt underlying 3-factor structure (Verdonk et al. 2008), in which gender awareness is composed of gender sensitivity and two correlated factors, i.e., gender-role ideology towards patients and gender-role ideology towards doctors. We hypothesized (hypothesis 1) that this model would show a better fit to the data than two alternative models, namely: (1) gender awareness as a unique and first-order factor and (2) gender awareness as a second-order factor with gender sensitivity and gender-role ideology as first-order factors.

Our preliminary analyses (parallel and exploratory factor analysis) suggested retaining the expected three factors (Verdonk et al. 2008), after the removal of eight items from the gender sensitivity subscale, four items from the gender-role ideology towards patients subscale and three items from the gender-role ideology towards doctors subscale. One of the reasons behind the exclusion of this number of items from the original N-GAMS (Verdonk et al. 2008) may pertain to differences in the extraction methods that were used. Whereas Verdonk et al. (2008) reported using a principal component analysis, in the present paper we used principal axis factoring, which is the most appropriate method to extract latent factors based upon variables’ common variance, considering error variance (Schmitt 2011). In other words, items that in the original version (Verdonk et al. 2008) might have loaded onto a component due to shared error variance could have been easily eliminated from the N-GAMS.pt. Most of the eliminated items were, to some extent, redundant with the final pool of items, which makes the N-GAMS.pt a more parsimonious measure than the original N-GAMS.

Our first hypothesis was confirmed by (parametric and non-parametric) Confirmatory Factor Analyses; the three-factor model indeed showed a better fit to the data than the two alternative models. Our results also showed that gender-role ideologies are a construct directed at different targets (patients and doctors), which is congruent with previous research (e.g. Anderson and Johnson 2003; Cuddy, Fiske and Glick 2004; Verdonk et al. 2008). Also in line with the original study (Verdonk et al. 2008), and supporting our second hypothesis, students who reported stronger endorsement of gender stereotypical views of patients also showed a stronger endorsement of gender stereotypical views of physicians. This result suggests a common ground for gender stereotypes towards patients and doctors (Verdonk et al. 2008). Also replicating previous findings, students’ gender sensitivity showed little to no association with the endorsement of gender-role ideologies, with only a very weak negative correlation with gender-role ideology towards doctors (Verdonk et al. 2008). This means that these are largely independent subdimensions of the attitudinal component of gender awareness that eventually need to be specifically and independently targeted in interventions.

As for the N-GAMS.pt reliability and sensitivity, all three factors showed good internal consistency and were sensitive to participants’ differences in gender sensitivity, gender-role ideology towards patients and gender-role ideology towards doctors.

On the whole, these findings support the construct validity, reliability and sensitivity of the N-GAMS.pt and suggest that its three-fold structure is a robust psychological model for medical students’ gender awareness. The stability and robustness of this conceptual model are stressed by the fact that it has been replicated in a sample of participants with cultural backgrounds different from those of the original sample (Verdonk et al. 2008). It should be noted, however, that like its original version, the N-GAMS.pt only measures the attitudinal component of gender awareness. Indeed, although the N-GAMS.pt may show that a medical student reports high scores on gender awareness, i.e., high on gender sensitivity and low on gender-role ideologies, he/she might lack the knowledge (e.g., know the influence of sex and gender on cardiovascular diseases) and the skills (e.g., reflexivity) necessary to promote gender equity in his/her practice (Verdonk et al. 2009).

Criteria-related validity

We expected that medical students’ empathy would be positively associated with gender sensitivity (hypothesis 3.1) and negatively associated with gender-role ideologies (hypothesis 3.2). Our findings have, to some extent, supported our hypotheses. Medical students endorsing higher perspective-taking and compassionate care also showed slightly more sensitivity to the relevance of considering sex and gender issues in medical practice and lower endorsement of gender stereotypical views of doctors. Higher endorsement of compassionate care was also slightly associated with lower endorsement of stereotypical views of the patient. Interestingly, it was the emotional dimension of empathy (compassionate care) that showed the strongest association with gender sensitivity. Also, it was only this dimension that was associated with lower endorsement of gender stereotypical views regarding patients. These results are in line with our assumption that empathy and gender sensitivity are both concepts that require perspective taking skills but, above all, feeling patients’ emotions as one’s own. Although the cognitive dimension is dominant in empathic medical relationships (Hojat 2009; Hojat et al. 2003), these results suggest that interventions directed at the emotional component of empathy may help increase medical students’ gender awareness.

Regarding sexism, we expected that both hostile and benevolent sexism would be negatively associated with gender sensitivity (hypothesis 4.1) and positively associated with gender-role ideologies (hypothesis 4.2). Again, our results seemed to support our hypotheses. Higher levels of hostile and benevolent sexism were moderately associated with a stronger endorsement of gender-role ideologies towards patients and doctors. These associations seemed to be stronger between hostile sexism and gender-role ideologies. Indeed, the positive tone of benevolent sexism, which is often less perceived as sexism per se than hostile sexism (Barreto and Ellemers 2005), might account for its weaker association with gender-role ideologies, as the latter present a more explicit devaluing tone towards female patients and doctors. Overall, these associations are not surprising as gender-role ideologies represent stereotypical views towards patients and doctors and gender stereotypes are indeed the basis of sexism (Swann et al. 1999).

Although sexism showed considerable associations with the stereotypical components of gender awareness, its association with gender sensitivity was much weaker: gender sensitivity showed only a barely significant negative association with hostile sexism. These findings are consistent with the previously mentioned lack of association between students’ gender sensitivity and their endorsement of gender-role ideologies. Again, this suggests that, although medical students may hold strong sexist attitudes towards women, they may also hold positive attitudes towards taking sex and gender issues into consideration in their medical practice. This would indeed be a worst-case scenario, in which the integration of sex and gender issues into practice would be based on gender stereotypical beliefs, thus reinforcing gender disparities in medicine. This highlights the need to devise specific interventions that address gender sensitivity and sexism/gender-role ideologies independently.

As for the role of medical training, Portuguese medical students seem to become slightly more gender sensitive and adhere less to gender-role ideologies as their years of medical education increase. Given the generally high association between students’ age and years of medical education, these findings are partially in line with previous studies showing that older Dutch and Swedish medical students have higher levels of gender awareness (Anderson et al. 2012). One explanation for these results may be that, as medical education progresses, physicians’ thinking becomes more diverse and complex and their clinical experience more varied; this contact with diversity may account for more positive attitudes towards the consideration of sex and gender issues in medical practice and a lower adherence to a binary view of male/female patients/doctors.

Finally, sex-related differences were found in gender-role ideologies towards patients and doctors but, contrary to what was expected, not in gender sensitivity (hypothesis 5). Portuguese female students showed a lower endorsement of gender-role ideologies towards patients and doctors than male students. These findings replicate the results found among Dutch and Swedish medical students (Anderson et al. 2012) and are in line with many other studies showing that, on average, men endorse gender stereotypes more strongly than women (Anderson et al. 2012; Verdonk et al. 2008; Ridgeway and Correll 2004). It should be noted, however, that rates of endorsement of gender-role ideologies were, overall, relatively low. This might be explained by the fact that we used an explicit measure of stereotype endorsement, which is therefore more susceptible to social desirability. It is also interesting to note that students reported lower endorsement of gender-role ideologies towards doctors than towards patients. This reveals an ingroup favoritism bias that is a natural part of social categorization processes and serves the goal of promoting a positive social identity (Brewer 1979; Cadinu and Rothbart 1996; Stangor and Leary 2006; Zebrowitz et al. 2007).

In sum, most of our findings showed the expected associations between the N-GAMS.pt subscales and four main criteria (students’ sex, empathy, sexist attitudes and years of medical education), reflecting the measure’s good criteria-related validity.

Strengths, limitations and implications for future research

One of the major strengths of this study is that it supports the construct validity of the N-GAMS.pt by replicating and confirming its underlying factor structure in a large sample of medical students from all Portuguese medical schools. This not only speaks to the study’s ecological validity but also, to some extent, to the cross-cultural stability of the measure. This study also extends knowledge of the criteria-related validity of the N-GAMS.pt by showing how its subscales are associated with students’ sex, empathy, sexism and years of medical education.

Like any other study, however, this one has some limitations. First, the fact that we conducted this study only with medical students, who have little or no clinical practice, precludes any conclusions about the suitability of the measure for assessing trained physicians’ gender awareness. Although we may assume that the N-GAMS.pt will be a valid and reliable measure of trained physicians’ gender awareness, future studies are needed to extend its use to medical professionals. Second, although we present a large sample of medical students with a female/male proportion similar to that of the Portuguese medical student population, our sample is not representative. This curtails our ability to generalize our findings and to draw norms for gender awareness assessment. Third, the N-GAMS.pt only measures the attitudinal component of gender awareness. A comprehensive assessment of medical students’ gender awareness would also entail assessing their knowledge of how sex and gender may influence individuals’ health and health-care and their skills in incorporating such knowledge into their clinical practice. Therefore, assessing medical students’ gender awareness will necessarily require methods beyond pencil-and-paper instruments (e.g., tests and scales), namely, the observation of gender awareness skills (e.g., Dielissen et al. 2012).

Despite these limitations, this study has important implications for research and practice. As for research, this study showed that the N-GAMS.pt is a parsimonious, valid and reliable tool to assess the attitudinal components of medical students’ gender awareness in future research and intervention projects. Given the lack of scales to assess gender awareness, this is an important methodological contribution. This measure may be useful to advance knowledge about the relationship between medical students’ gender awareness and quality of care.

As for the implications for practice, the N-GAMS.pt may be particularly useful for monitoring the effectiveness of medical education projects or specific training programs aimed at increasing medical students’ or physicians’ gender awareness. The fact that gender sensitivity and gender-role ideologies were shown to be independent suggests that interventions must be directed at both subdimensions simultaneously, so as to promote effective changes in gender awareness. Thus, interventions should, on the one hand, seek to make medical students’ attitudes towards sex and gender issues in medicine more positive and, on the other hand, help them identify their own gender stereotypes and how and when these influence their medical practice. The N-GAMS.pt may be useful in tapping the effects of training and intervention on these different subdimensions of gender awareness.

In sum, the N-GAMS.pt is a short, valid and reliable tool to assess the attitudinal component of medical students’ gender awareness, which makes an important contribution to medical education and to future research on the role of gender awareness in health-care quality and equity.