BACKGROUND

Medical knowledge and clinical performance ratings are major criteria for assessing the competence of resident physicians,1 so it is essential that these standards are trustworthy. Assessments of medical knowledge and clinical performance should primarily reflect residents’ abilities to care for patients, but these assessments may be influenced by other factors. For example, research has shown that resident empathy affects assessments of faculty members,2 female teachers receive lower ratings than males with similar skills,3 and assessment scores for a single instrument may vary substantially between learning environments within the same institution.4

It has been theorized that physician health and well-being could have an impact on medical knowledge acquisition.5 While some studies have linked resident well-being and medical knowledge,6,7 other research has failed to demonstrate this relationship.8–10 It is unknown whether resident well-being shapes overall assessments of residents’ competency.

Research has shown that internal medicine residents’ well-being is affected by features of the work environment such as duty-hours11 and the perception of experiencing medical errors.12 Because it is known that learning environments11,12 and interpersonal relationships13–16 play crucial roles in learning, we postulated that residents’ well-being and empathy would influence assessments of their medical knowledge and clinical performance.

Studies have indicated that well-being affects residents’ attitudes towards patients and the quality of care that they provide,11,12,17 but no studies have examined interactions between residents’ well-being and the global assessments they receive from other members of the health care team. Therefore, we used a prospective longitudinal study design to investigate the hypothesis that resident well-being (measured by the Maslach Burnout Inventory, Linear Analog Self-Assessment item, Medical Outcomes Study Short Form health survey, and a standardized two-question depression screen) and empathy (measured by the Interpersonal Reactivity Index) are associated with numerous dimensions of competency including assessments of knowledge on the medical in-training examination (ITE), clinical performance on the mini-clinical evaluation exercise (mini-CEX), and multi-source assessments by peers, supervising residents, and allied health professionals.

METHODS

Learning Environment and Participants

This study involved Mayo internal medicine residents in training between January 2009 and August 2010, and included scores from all knowledge and clinical performance assessments performed on the 264 resident physicians enrolled during this time period in the Mayo Clinic Internal Medicine Well-being (IMWELL) Study, which is described below. This study was approved by the Mayo Institutional Review Board.

The Mayo IMWELL Study

Resident characteristics were obtained from the Mayo Clinic-Rochester IMWELL study, a longitudinal study of resident physician well-being. Since the 2003 academic year, all categorical and preliminary residents in the Mayo Clinic Rochester Internal Medicine Residency program have been invited to participate in the IMWELL study during their first-year orientation. For the time period of this study, 264 of 312 (84.6%) eligible residents volunteered to participate. All residents provided written consent and were surveyed at regular intervals throughout their residency training. An instrument that measures quality of life (QOL) was administered quarterly, and instruments that measure burnout, empathy, and depression were administered biannually. To maintain anonymity, study participants’ identities were blinded by using numerical codes during data collection and analyses.

Instruments Comprising the Mayo IMWELL Study

The IMWELL study utilized a linear analog self-assessment (LASA) scale of QOL and survey items from the Maslach Burnout Inventory (MBI), Interpersonal Reactivity Index (IRI), Medical Outcomes Study Short Form (SF-8) health survey, and a depression screen by Spitzer et al. Notably, these instruments are supported by sources of validity evidence18,19 that are essential to medical education studies.20,21

The LASA for measuring QOL is a single item with scores ranging from 0 (as bad as it can be) to 10 (as good as it can be). LASA QOL scores have been validated in varied populations including the general public,22 cancer patients,23,24 and physicians.25

The MBI is a 22-item instrument with Likert scales ranging from 0 (never) to 6 (daily).26 Content validity has been demonstrated by reviewing established scales and surveying professionals who are at risk of experiencing burnout, including physicians.26 Many studies have shown that the MBI is an effective measure of burnout in resident physicians.12,27–30 Factor analysis demonstrated that the MBI consists of three dimensions: depersonalization, emotional exhaustion, and sense of low personal accomplishment.26 As has been done in previous studies of physicians, we considered residents with a high score on either the depersonalization or emotional exhaustion subscale as having at least one manifestation of professional burnout.25,27 Additionally, the MBI has high internal consistency, acceptable test-retest reliability, moderate correlation with other measures of burnout, and poor correlation with constructs that are likely confounded with burnout.26
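The "either subscale" burnout rule described above can be sketched in a few lines of Python. This is an illustration only: the function name and the numeric cutoffs for a "high" subscale score (27 for emotional exhaustion, 10 for depersonalization) are assumptions, since the text does not state the thresholds that were applied.

```python
# Assumed cutoffs for a "high" subscale score; the article does not state them.
ASSUMED_HIGH_EMOTIONAL_EXHAUSTION = 27
ASSUMED_HIGH_DEPERSONALIZATION = 10


def has_burnout_manifestation(
    emotional_exhaustion: int,
    depersonalization: int,
    ee_cutoff: int = ASSUMED_HIGH_EMOTIONAL_EXHAUSTION,
    dp_cutoff: int = ASSUMED_HIGH_DEPERSONALIZATION,
) -> bool:
    """Return True when either subscale score reaches its 'high' cutoff."""
    return emotional_exhaustion >= ee_cutoff or depersonalization >= dp_cutoff


# Hypothetical residents: (emotional exhaustion, depersonalization) subscale sums.
examples = {"A": (30, 4), "B": (12, 11), "C": (20, 6)}
flags = {name: has_burnout_manifestation(ee, dp) for name, (ee, dp) in examples.items()}
print(flags)
```

Because the rule is a logical OR, a resident is flagged when only one of the two subscales is high; the personal accomplishment subscale plays no part in this classification.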

The IRI is a 28-item instrument for measuring empathy with Likert scales ranging from 0 (does not describe me well) to 4 (describes me well). Factor analysis revealed four dimensions: perspective-taking, personal distress, empathic concern, and fantasy.31 These seven-item subscales correspond to the two-dimensional cognitive and emotive model, and may be evaluated separately.31,32 A model focusing on empathic concern and perspective-taking has proven especially useful when evaluating empathy among resident physicians.17,30,33,34 On this basis, these two subscales were assessed in this study. Additional validity evidence for the IRI includes good internal consistency,32,35,36 and significant correlations between IRI subscales and other recognized measures of empathy.36,37

The Medical Outcomes Study Short Form (SF-8) health survey has eight items with 5- and 6-point Likert scales.38 Content validity is supported in that survey items represent ideas that are commonly included in widely used health measures.38 The SF-8 generates scores that are assigned to domains of mental and physical health.38 Further SF-8 validity evidence includes score reliability, high correlation with existing measures of the same concepts,38 and convergence between measurements of patients with migraine and those with other conditions.39

Spitzer et al. described a depression screening method consisting of two questions: “During the past month, have you often been bothered by feeling down, depressed and hopeless?” and “During the past month, have you often been bothered by little interest or pleasure in doing things?”40 These two questions perform well in screening for depression relative to several widely used depression inventories, including the Beck Depression Inventory, the Center for Epidemiological Studies Depression Scale, and the Medical Outcomes Study depression measure.41 Spitzer et al.’s screening questions have been used to identify depression in various populations, including resident physicians.11,27

Outcome Measures

Strong validity evidence supports the outcome measures used in this study. ITE scores, the validity and use of which are well described,42,43 have been shown to correlate strongly with resident conference attendance and self-directed reading,44,45 and to have no association with resident physician empathy.10 The mini-CEX has impressive validity and reliability as demonstrated by previous studies at the Mayo Clinic46,47 and elsewhere.48–52 Our institution uses the traditional version of the mini-CEX, except that the items are on 5-point scales.

Clinical performance assessments of resident physicians at the Mayo Clinic consist of forms completed by peers, senior medical residents, and non-physician professionals. After critically analyzing the items for content most plausibly related to resident well-being, the following items (scale: 1 = needs improvement, 3 = average, 5 = top 10%) were chosen for this study: (1) desirability as a physician for one of your family members, (2) desirability as a future co-worker or team member, (3) effectiveness and completeness of sign-outs, (4) coverage of cross-cover issues and completeness of tasks when on call, (5) demonstrates empathy and compassion for patients, and (6) communication skills with patients, family, allied health, and other providers. Content validity for these clinical performance assessments is based on assessment elements that are represented in previously published instruments and were selected by experts with experience in scale design. A factor analytic study revealed that several items within these Mayo clinical performance assessments are multi-dimensional and have excellent internal consistency reliability.53

Data Analysis

A repeated measures design, analyzed using multivariate generalized estimating equations, was employed to evaluate associations between resident clinical performance assessments, mini-CEX evaluations and ITE examinations, and residents’ QOL, burnout, empathy, and depression over the 4 points in time. Scores from the clinical performance assessment items were evaluated individually and also averaged within assessor group to form an overall score ranging from 1 to 5. Covariates included resident well-being (QOL, burnout, depression), empathy, gender, year of training, program (categorical or preliminary), debt, relationship status (single, married, divorced, partner), and children (yes or no). Univariate associations were examined, and a multivariate model was developed using standard forward and backward stepwise selection techniques. The threshold for statistical significance was set at p < 0.01 to account for multiple comparisons. The study sample of 202 residents provided 80% power for a medium-to-small Cohen’s f² effect size of 0.04 for a univariate association between clinical performance assessment scores, mini-CEX evaluations, and ITE examinations, and any well-being or empathy variable. Statistical analyses were conducted using SAS version 9.1 (SAS Institute Inc., Cary, NC).
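As a rough illustration of the power statement, the following Python sketch approximates power for a single-predictor association from Cohen's f² and the sample size. It is not the authors' calculation: it assumes a two-sided test of one predictor, uses a large-sample normal approximation with noncentrality f² × n, and assumes an alpha of 0.05 because the text does not say which alpha the power statement used. The function name `approx_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(f2: float, n: int, alpha: float = 0.05) -> float:
    """Large-sample power for a two-sided test of a single predictor.

    The noncentrality parameter is taken as f2 * n, and the test statistic
    is approximated by a normal variate shifted by sqrt(f2 * n).
    alpha = 0.05 is an assumption; the article does not state the value.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = sqrt(f2 * n)
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


power = approx_power(f2=0.04, n=202)
print(f"approximate power: {power:.2f}")
```

Under these assumptions the result is close to the 80% reported for 202 residents. Because `alpha` is a parameter, the same function shows how the estimate changes under a stricter significance threshold.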

RESULTS

We studied 730 clinical performance assessments completed by peers, supervisors, and allied health professionals for Mayo Clinic internal medicine residents in January 2009, August 2009, January 2010, and August 2010. We also examined 193 mini-clinical evaluation exercise (mini-CEX) evaluations and 260 in-training examinations (ITE) during the same time frame. Of 312 eligible residents, 202 (64.7% of all eligible) provided well-being and at least one category of assessment data for this study. Demographic characteristics for this sample are shown in Table 1. Data for responders were similar to those of non-responders on measured factors including year of training, age, sex, and program type (categorical or preliminary). In addition, United States Medical Licensing Examination Step 1 and Step 2 Clinical Knowledge (CK) scores were similar for responders and non-responders (Step 1 mean scores 232.3 vs 231.2, respectively, p = 0.74; Step 2 CK mean scores 241.3 vs 236.1, respectively, p = 0.10). As the overall results did not differ for categorical and preliminary residents, data were pooled across these categories. Baseline well-being, empathy, assessment, mini-CEX, and ITE scores at the start of the current study are shown in Table 2. In addition, response rates for each outcome are detailed in Table 2.

Table 1 Baseline Characteristics of Resident Physicians Providing Both Well-Being and Evaluation Data from January 2009 Through August 2010
Table 2 Initial Well-Being, Empathy, Assessment, Mini-CEX, and ITE Scores for Resident Physicians Providing Data from January 2009 to August 2010

Univariate associations of measures of well-being and empathy with summary scores in each resident performance assessment domain are shown in an on-line appendix. In multivariate models, there were no statistically significant associations between resident ITE or mini-CEX scores and QOL, burnout, depression, empathy, or demographic characteristics. Both forward and backward model selection approaches yielded the same results. However, residents’ scores on the IRI Perspective Taking scale, a measure of “the tendency to adopt the psychological view of others,” were associated with higher peer ratings on “desirability as a physician for a family member” (multivariate beta = 0.023, 95% CI = 0.007–0.039, p = 0.004). Consequently, a 5-point increase in this empathy score was associated with a small but statistically significant 0.12-point increase in residents’ ratings by their peers as desirable physicians, as shown in Table 3. Additionally, having at least one manifestation of professional burnout was associated with higher resident supervisor ratings of communication with patients, families, allied health, and other providers (multivariate beta = 0.309, 95% CI = 0.100–0.517, p = 0.004). Hence, burnout was associated with a 0.3-point increase in resident communication score, as shown in Table 3.
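The 0.12-point figure follows from scaling the per-point coefficient linearly, as this short Python sketch shows. It applies the same scaling to the confidence limits; the helper names are hypothetical, and `Decimal` is used only so that 0.115 rounds half up instead of being distorted by binary floating point.

```python
from decimal import Decimal, ROUND_HALF_UP


def scale_effect(beta: str, ci_low: str, ci_high: str, units: int):
    """Scale a per-point coefficient and its CI limits to a change of `units` points."""
    factor = Decimal(units)
    return tuple(Decimal(value) * factor for value in (beta, ci_low, ci_high))


def round2(value: Decimal) -> Decimal:
    """Round to two decimal places, with halves rounded up."""
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Reported coefficient for IRI Perspective Taking vs. peer-rated desirability.
change, low, high = scale_effect("0.023", "0.007", "0.039", units=5)
print(round2(change), low, high)
```

A 5-point increase therefore gives 5 × 0.023 = 0.115, which rounds to 0.12, with scaled confidence limits of 0.035 and 0.195 on the 1-to-5 rating scale.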

Table 3 Univariate Associations Between Well-Being and Empathy, with Desirability as a Physician and Effective Communication

DISCUSSION

To our knowledge, this is the first study to show that multiple dimensions of resident performance assessment are not significantly influenced by various aspects of well-being including QOL and depression. Nonetheless, the detected correlations between resident empathy and burnout and assessments completed by other doctors suggest that observation-based ratings of residents may be influenced by both intrapersonal and interpersonal factors.

Resident well-being has received national attention because the Accreditation Council for Graduate Medical Education has recently mandated further restrictions on resident duty hours,54 based on the assumption that these restrictions will enhance patient care by reducing resident fatigue and improving well-being. Consequently, it is important to examine the potential relationships between resident well-being and assessments of resident competency. Furthermore, many residency programs assess resident performance on tests of knowledge and clinical skill using the ITE and mini-CEX, respectively. Thus, it is encouraging that we identified no associations between resident well-being and performance on the ITE and mini-CEX. Our findings build upon research that has demonstrated validity of these measures,43,46–52 while underscoring the fact that these measures should not be viewed as reliable means for detecting variations in resident empathy and well-being. Our findings also suggest that increasing resident well-being through duty hour restrictions may not be a comprehensive, stand-alone strategy for improving resident performance and enhancing patient care.

It has been theorized that resident performance is partly determined by depression, burnout, and distress related to the work environment.5 Girard et al. found that American Board of Internal Medicine (ABIM) Certification Examination score variation among internal medicine residents was largely attributable to residents’ psychological states.6 Filho et al. have shown that anesthesiology residents’ knowledge of basic science was associated with academic performance anxiety, but was not associated with QOL.9 However, other research has identified no relationship between resident well-being and knowledge on specific topics or standardized examinations.8 Likewise, the current study found no association between resident well-being and medical knowledge on the ITE. A potential explanation for this lack of association is that too few residents in our study sample displayed the extremes of low well-being that would be required to impair the acquisition or display of medical knowledge; indeed, West et al. identified an association between resident well-being and medical knowledge in a much larger sample with wider ranges in well-being scores.7 Another explanation for our negative findings may be that the ITE, unlike the ABIM certification examination, is a lower-stakes examination intended only for formative feedback. This may explain why Girard et al. identified a relationship between resident mood disorders and score variation in the ABIM certification examination.6

We found an association between resident empathy and desirability as a physician. In particular, residents who scored high on the IRI Perspective-Taking scale, which is “the tendency to adopt the psychological views of others,” were considered to be desirable physicians for a family member. It is common wisdom that to be entrusted with the care of a colleague’s loved one is the highest compliment. This finding also reflects two core virtues of medicine, empathy and compassion.55,56 Additionally, this finding extends previous studies that have shown relationships between scores on the IRI Perspective-Taking scale and higher performance among medical students,57 as well as decreased likelihood of medical errors by residents.12

We observed that supervising residents perceived interns with higher burnout to have better communication with patients, families, allied health, and other providers. Although this finding could seem counterintuitive, many physicians who experience burnout may sustain high levels of professional achievement for long durations. Furthermore, the most dedicated physicians might be more likely to place professional duties—including the time-consuming task of effectively communicating with patients, family members, support staff, and colleagues—above all other aspects of personal life.58 Therefore, such physicians could be viewed favorably by supervisors in the workplace, even though the personal aspects of these physicians’ lives may suffer. Ultimately, this finding should prompt residency programs to reflect on the optimal balance between patient care responsibilities and resident burnout.

This study has limitations. It involved participation by only 64.7% of eligible residents. Of the 264 of 312 (84.6%) residents volunteering for the IMWELL study, there were additional missing data due to incomplete surveys, missing evaluation measures at the same point in time as well-being assessment, and the fact that not all rotations have clinical evaluations. Nonetheless, the participation rate in this study was favorable relative to that typically seen in physician studies.59,60 Our study sample was composed largely of first-year residents, and resident work-load, which may influence well-being, varies by postgraduate year (PGY); yet, our study results were adjusted for PGY level in the multivariate analyses, and none of the results were significantly different for the various PGY levels. Since this was a single-institution study, one should generalize the findings to other settings with caution. However, the range of well-being and empathy scores in this study is similar to those reported in previous studies at other institutions.28,33,34 Furthermore, the outcome variables in this study for medical knowledge (ITE) and clinical skill (mini-CEX) are widely used among US residency programs, which should broaden the importance and relevance of the study findings. While the results may have been affected by non-response bias, data for responders were similar to those of non-responders on measured demographic factors and USMLE scores. The small but statistically significant associations between resident empathy and burnout and assessments by other physicians might be viewed as clinically insignificant; however, the actual range of scores among residents at the Mayo Clinic for communications competency is very narrow, so the observed 0.3-point change in communication score with the presence of burnout (yes/no) could have a substantial impact on a Mayo resident’s relative standing within this competency. Additionally, the positive associations between resident empathy and burnout and assessments by other physicians persisted after multivariate adjustments that incorporated a number of potential confounding covariates, so we believe that these findings are educationally relevant and add new knowledge regarding the relationships between resident empathy and desirability as a physician, as well as between resident burnout and communication. Finally, we acknowledge that this study did not examine the influence of several potential confounders, including residents’ learning styles, personality types, and major life events.

In this study sample, multiple dimensions of resident performance were generally not influenced by various aspects of well-being, which lends credibility to standardized measures of knowledge (ITE) and clinical performance (mini-CEX). However, the lack of association between well-being and medical knowledge requires further study involving high-stakes examinations. The association between empathy and desirability as a doctor for a family member, which is widely accepted as a characteristic of excellence among physicians, suggests the need to emphasize the identification and promotion of empathy in medical learners. At the same time, the positive association between burnout and the perception of excellent clinical performance should stimulate discussion about the best ways to engage residents in meaningful clinical experiences without compromising their overall well-being.