BACKGROUND

The Institute of Medicine’s landmark report, To Err is Human, focused attention on patient safety.1 To improve patient care, many healthcare organizations review sentinel events,2 study patient mortality data,3 and convene morbidity and mortality conferences.4 The goal of these endeavors is to prevent adverse events by identifying root causes and solutions. However, there is little conclusive research regarding solutions that influence physicians’ practice behaviors.

Electronic case-based curricula on patient safety and systems-based practice have been studied among residents and faculty members in specialties including emergency medicine, family medicine, and internal medicine.5–8 All of these studies have shown that case-based curricula improve physician knowledge. However, we are unaware of research on the use of reflection on adverse events to change future practice behaviors.

Kirkpatrick’s outcomes hierarchy is a common framework for education intervention studies.9,10 The following are the hierarchical levels: 1) Reaction (learner satisfaction), 2) Learning (acquired knowledge or skills), 3) Behavior (transferring learning to the workplace), and 4) Results (patient outcomes).9,10 When progressing up this hierarchy from Reaction to Results, the outcomes become increasingly meaningful, yet more difficult to measure with respect to feasibility and methodological rigor. Although most education interventions involve outcomes at the levels of Reaction and Learning,11 experts have observed that Behavior outcomes strike the optimal balance between feasibility and meaningfulness.12 One way to change physicians’ behaviors is to encourage critical reflection on adverse patient events.13 Critical reflection is defined as “a meta-cognitive process that occurs before, during and after situations with the purpose of developing greater understanding of both the self and the situation, so that future encounters with the situation are informed from the previous encounters.”14 Remarkably, validated systems to measure physician reflection on adverse events have not been described.

Therefore, we created an electronic system for presenting cases of adverse patient events in order to stimulate and measure faculty physician reflection on those events. In order to stratify physicians into low versus high levels of reflection, we created a measure of reflection based on a previously validated instrument, which separates learners into four increasing levels of reflection ranging from habitual action (no reflection) to critical reflection (highest reflection). We hypothesized that the degree of physician reflection would be associated with characteristics of physicians (e.g., age, gender) and patient cases (e.g., event relevance). The objectives of this study were to 1) develop an electronic case-based learning curriculum regarding actual systems failures and adverse patient events, 2) create a reflection instrument based on previous research15,16 and validate it in a population of faculty physicians, and 3) determine associations between physician reflection and characteristics of physicians and patient cases.

METHODS

Study Population

This study was conducted in 2009 and involved department of internal medicine faculty at the Mayo Clinic Rochester. All faculty members in the divisions of hospital internal medicine (generalists), pulmonary diseases and critical care medicine (specialists), and non-interventional cardiovascular diseases (sub-specialists) were invited to participate. This study was deemed exempt by the Mayo Institutional Review Board.

Case-Based Learning System (CBLS)

The CBLS (available in an online Appendix) was developed by Mayo Clinic specialists in the sections of information technology, medical education, and quality improvement (QI). The system was modified based on multiple rounds of feedback from small test groups. In August 2009, the final system was administered to the study sample via e-mail messages that contained links to the web-based system, which in turn presented three patient cases. Participants who had not responded were sent two e-mail reminders. Participants were instructed to read each case and answer the multiple choice question that followed it. After completing all three cases, the participants were asked to complete the reflection questions. Data collection was completed within four weeks.

After the participants accessed links to the CBLS, they reviewed the patient cases and answered multiple choice questions (MCQ) that tested understanding of the healthcare system (not understanding of medical knowledge), and required commitment to the next step in management. Subsequently, participants were provided with the case resolution, in which the adverse event was revealed. Participants then rated their perception of the case’s generalizability and relevance, and completed the reflection questionnaire. Finally, the participants read an informational case discussion and completed a satisfaction survey, which asked them to rate, on a five-point scale (unsatisfactory, needs improvement, average, above average, excellent), whether the CBLS met their needs for CME.

Three cases, representing real adverse events at the Mayo Clinic, were selected based on the most common error types (systems, medication, and diagnostic) encountered by internal medicine physicians.17 The systems error case involved a hospitalized patient who developed an ST-elevation myocardial infarction. The team did not follow hospital protocol to activate the cardiac catheterization laboratory, which led to delayed treatment and a poor patient outcome. The medication error case involved a patient whose home medications were not reconciled at the time of admission to the hospital. This error resulted in a potassium overdose, intensive care unit transfer, and prolonged hospital stay. The failure to diagnose case involved an elderly woman who had provided a urine culture on the day of hospital discharge. After discharge the culture turned positive, but no one reviewed the result. The patient returned to the hospital three days later with urosepsis. All cases were reviewed and edited by a generalist (author CMW), a specialist (author FLJ), and a systems expert (author TIM).

The reflection instrument used in the CBLS was adapted from a previously validated tool by Kember et al., which comprised four levels of reflection: habitual action, understanding, reflection, and critical reflection.15,16 Habitual action is a perfunctory feat that through repetition has become automatic.15,16 Understanding is using existing knowledge without critically appraising that knowledge.15,16 Reflection is exploration of past experiences to develop new understandings.15,16,18 Critical reflection is a deeper form of reflection where a person’s perspective is changed.15,16 We adapted Kember’s tool to our setting by creating eight items (two for each level of reflection) structured on five-point Likert scales (1 = Disagree, 2 = Disagree with reservation, 3 = Neutral, 4 = Agree with reservation, 5 = Agree) (see Table 1).

Table 1 Faculty Physician Reflection on Adverse Patient Events: Item Loadings

Data Analysis

Satisfaction survey, multiple choice, and item score data were presented using standard descriptive statistics. Confirmatory factor analysis with Varimax rotation was performed to determine dimensions of physician reflection on adverse events. Specifically, we wished to confirm whether items clustered into conceptual groups representing low versus high reflection, as previously demonstrated by Kember.15,16 Factors were extracted using the minimal proportion criterion, which is based on the proportion of the common variance (defined as the sum of the initial communality estimates) that is explained by successive factors. In this study, we set the threshold at 90%: factors were extracted until the sum of Eigenvalues for the retained factors exceeded 90% of the common variance. The final model was confirmed by examining the scree plot, which shows relative magnitudes of the factors’ Eigenvalues and can aid in determining how many factors to retain by showing where the declining Eigenvalues level off.19 Items with factor loadings ≥ 0.30 were retained. Internal consistency reliability for items comprising each factor and overall was calculated using Cronbach’s coefficient α, where α > 0.7 is considered acceptable.19,20
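As an illustration of the two quantitative rules described above, the following is a minimal Python sketch and not the authors' code (the study used SAS). The function names `cronbach_alpha` and `factors_to_retain`, the use of sample variances with an n − 1 denominator, and the toy numbers in the usage note are all assumptions made for this example. Cronbach's α is computed with the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of summed scores), and the retention rule counts factors until their cumulative Eigenvalues exceed 90% of the common variance.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of questionnaires.

    responses: list of rows, one per completed questionnaire,
    each row holding the k item scores in the same item order.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def factors_to_retain(eigenvalues, common_variance, threshold=0.90):
    """Number of factors kept under a proportion-of-common-variance rule.

    Factors are added in order of decreasing Eigenvalue until their
    cumulative sum exceeds threshold * common_variance.
    """
    running = 0.0
    for n, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += value
        if running > threshold * common_variance:
            return n
    return len(eigenvalues)
```

For example, with hypothetical Eigenvalues of 2.5, 0.9, 0.2, and -0.1 and a common variance of 3.5, `factors_to_retain` returns 2, because 2.5 + 0.9 = 3.4 is the first cumulative sum above 0.9 × 3.5 = 3.15. These values are invented for illustration and are not results from this study.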

Associations between reflection scores and learner or case variables were determined. Overall reflection scores were reported as the mean and standard deviation of all eight instrument items. For reporting and all association calculations, values for items comprising Factor 2 were reverse-scored, given the negative phrasing of these items. Case relevance (yes/no), event generalizability (yes/no), event preventability (yes/no), event root cause (personal/system), physician gender (female/male), and multiple choice answer correct (yes/no) were treated as binary variables. Event severity (near miss, minor impact, moderately severe impact, severe impact, death) was treated as an ordinal scale. Physician age was treated as a continuous variable. ANOVA and linear regression analysis were used to determine associations between overall reflection score and categorical or continuous variables, respectively. Statistical significance was set at p < 0.05. Statistical analyses were performed using SAS version 9.2 (SAS Institute Inc., Cary, NC).
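The scoring and comparison steps described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the SAS code used in the study: the function names are hypothetical, the set of reverse-scored items is passed in by the caller rather than taken from the paper, and only the one-way ANOVA F statistic is computed (obtaining a p-value additionally requires the F distribution, which a statistics package such as SciPy provides via `scipy.stats.f_oneway`).

```python
from statistics import mean

LIKERT_MAX = 5  # 1 = Disagree ... 5 = Agree


def reverse_score(value, scale_max=LIKERT_MAX):
    """Reverse a Likert response so that 1 <-> 5, 2 <-> 4, 3 stays 3."""
    return scale_max + 1 - value


def overall_score(item_scores, reversed_items):
    """Mean of all item scores after reverse-scoring the chosen items.

    item_scores: dict mapping item number -> raw Likert response.
    reversed_items: set of item numbers to reverse (an input assumption).
    """
    adjusted = [
        reverse_score(score) if item in reversed_items else score
        for item, score in item_scores.items()
    ]
    return mean(adjusted)


def one_way_anova_f(groups):
    """F statistic comparing overall scores across two or more groups."""
    all_scores = [x for group in groups for x in group]
    grand_mean = mean(all_scores)
    k, n = len(groups), len(all_scores)
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

As a usage example with invented responses, `overall_score({1: 5, 2: 4, 3: 3, 4: 3, 5: 4, 6: 4, 7: 2, 8: 3}, {1, 2})` reverses items 1 and 2 to 1 and 2 and returns 2.75. Note that this simple F statistic treats every case reflection as independent, whereas participants in this study completed several cases apiece.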

RESULTS

Faculty Satisfaction and Responses to CBLS Cases

The CBLS was accessed by 44 (38%) of the 116 faculty physicians invited to participate. The 44 participants were given three cases apiece and completed a collective total of 107 case reflections. The CBLS was rated as average to excellent in 95 of 104 (91.3%) completed satisfaction surveys. In the ST-segment elevation myocardial infarction case, which asked how to activate the cardiac catheterization laboratory, 7 of 29 (24.1%) identified the correct answer. In the potassium overdose case, which asked about the current policy regarding medication reconciliation, 15 of 38 (39.5%) identified the correct answer. In the missed urine culture case, which asked about the protocol regarding who is responsible for pending laboratory data at the time of patient discharge, 25 of 40 (62.5%) identified the correct answer.

Reflection Instrument Validation

Factor analysis revealed a two-dimensional model for measuring faculty physicians’ reflections on adverse patient events. The identified factors were 1) Minimal Reflection (items 1 and 2) and 2) High Reflection (items 3 through 8). These factors support Kember’s model by distinguishing between low and high levels of reflection (Table 1). Specifically, the Minimal Reflection factor comprised two items corresponding to Kember’s Habitual Action level, and the High Reflection factor comprised items that correspond to Kember’s Understanding, Reflection, and Critical Reflection levels. Overall, the extracted factors accounted for 100% of the shared variance among the original variables.

Item mean scores ranged from 2.89 to 3.73 on a five-point scale. The overall reflection score was 3.41 (standard deviation 0.64). Remarkably, mean item scores were highest for items representing the lowest levels of reflection (e.g., “When I do activities like in this case, I complete them without thinking about what I am doing;” mean score = 3.66), and lowest for items representing the highest levels of reflection (e.g., “During this case, I discovered faults in what I had previously believed to be right;” mean score = 2.89), indicating that, on average, high-level reflection was less commonly achieved in this study sample (Table 2). Regarding internal consistency reliability, Cronbach’s α for Factor 1 was 0.85, for Factor 2 was 0.58, and overall was 0.77 (Table 2).

Table 2 Faculty Physician Reflection on Adverse Patient Events: Factors, Mean Scores and Reliability

Reflection Score Associations

ANOVA indicated that reflection scores were associated with physicians’ perceptions of case relevance (p = 0.02) and event generalizability (p = 0.001). There were no statistically significant associations between physicians’ reflection scores and event severity, event preventability, root cause, physician gender, physician age, or multiple choice answer response (Table 3).

Table 3 Associations between Reflection Scores and Characteristics of Faculty and Patient Cases

DISCUSSION

To our knowledge, this is the first study of a case-based learning system for measuring faculty physicians’ reflections on adverse patient events. The CBLS reflection instrument scores were reliable and stratified faculty members across two levels of reflection, which should prove useful for future research regarding QI among low versus high-reflecting physicians. Furthermore, reflection scores were positively associated with case generalizability and case relevance, suggesting that reflection improves with the use of cases that represent actual patient encounters.

We found that reflection scores correlated positively with case generalizability and relevance. These findings suggest that cases stimulating the richest reflections are those having obvious bearing on one’s own practice. In this study, generalizability was enhanced by selecting cases that have been shown to occur commonly in the setting of adverse events.17 Relevance was optimized by selecting cases that represented real adverse events at the Mayo Clinic. Physicians may be most comfortable when contemplating familiar ideas, even though, arguably, healthcare improvement is driven by reflecting on the unknown. While cases in this study comprised familiar content, physicians generally scored poorly regarding knowledge of critical management steps. Therefore, we intend to use the CBLS to electronically disseminate adverse patient events to the entire faculty, with the aim of educating faculty members regarding key quality initiatives.

Reflection has been described as a process of thinking critically about all aspects of a situation, including the self, and has even been described as “thinking about thinking.”14 Schön further observes that artful practitioners representing diverse disciplines share the ability to deal with unique circumstances by having “reflective conversations with the situation.”21 Previous studies on reflection among healthcare workers exist. The Mayo Evaluation of Reflection on Improvement Tool (MERIT) assesses resident physicians’ reflections on adverse events encountered in practice.22 However, MERIT reflections are scored by external raters, and are thus cumbersome and biased. The Kember instrument, which was the basis for this study’s assessment tool, measures health sciences students’ reflections on their courses.15,16 The Groningen Reflection Ability Scale (GRAS)23 and the Self-Reflection and Insight Scale (SRIS)24 are additional measures of reflection in educational settings. The Mayo CBLS reflection assessment instrument adds to this body of literature, because it specifically measures faculty physicians’ reflections on adverse patient events. Additionally, we found that faculty members scored lowest on critical reflection compared with the other categories. This finding provides preliminary evidence that, among faculty physicians, the highest level of reflection on QI may be difficult to achieve.

Our measure of faculty physician reflection on adverse events is supported by validity evidence. An established validity paradigm states that construct validity is upheld by evidence from the following sources: content, response process, internal structure, relations to other variables (criterion), and consequences.25–30 In this study, content evidence draws from items that were created based on a previously validated reflection assessment instrument,15,16 findings from our prior research,22 and revision by a panel of experts with experience in teaching and measuring reflection and QI. Internal structure evidence is supported by factor analysis showing a two-dimensional assessment of physician reflection that generally verifies the reflection levels found in the original Kember instrument,15,16 and by acceptable overall internal consistency reliability (Cronbach’s α = 0.77). Relations to other variables (criterion) evidence is established by associations between reflection scores and other meaningful variables including case characteristics. Our previous review of the literature indicated that the aforementioned sources of validity evidence are the categories most commonly reported in the literature.29 Nonetheless, in the future it will be important to establish consequences evidence by determining whether our assessment of reflection actually has an impact on faculty physicians’ abilities to improve patient care through effective QI endeavors.

This study has limitations. It was conducted at a single academic institution, which may limit external validity. However, the CBLS scenarios were real cases that represented the most common categories of adverse patient events.17 The response rate was low and data were missing for some participants who did not answer all the multiple choice questions, which may limit the sensitivity of our analyses. The web-based technology developed for this system is not available to other institutions. Our model did not map perfectly to Kember’s in that we observed only a distinction between Habitual Action and the remaining items. Yet, previous research has emphasized the potential for factor instability and the importance of repeating factor analysis when using the same instrument in new educational settings, so this finding is not surprising.31 However, the basic contribution of this study is a validated method for measuring faculty physicians’ reflections on adverse patient events, which could be utilized in future research and replicated elsewhere using either paper or electronic formats. Finally, study participants completed several cases apiece; the resulting data could be considered clustered, which may limit our interpretation of the factor analysis.

In summary, reflection on adverse events is a crucial step in practice improvement.13 We describe what may be the first validated method for measuring degrees of faculty physicians’ reflections on adverse patient events. This method will aid future research to compare quality outcomes among low versus high-reflecting physicians. We also found that reflection is enhanced by case material that is relevant and generalizable, which should be useful information when developing QI curricula for faculty physicians, as such curricula should strive to use actual, as opposed to hypothetical, case examples. The next challenge will be to determine how to stimulate reflection among practicing physicians in order to improve the quality of healthcare.