Abstract
Background
Surveillance of burnout by the gold-standard Maslach Burnout Inventory (MBI) is hindered by cost and length. The validity and benchmarking of the commonly recommended and used single-item burnout question (SIBOQ) are unknown. We sought to (1) derive an equation for predicting the gold standard MBI from the SIBOQ and (2) measure the correlation of the SIBOQ with the full MBI and its subscales.
Methods
We sought studies in PubMed along with citations by and to included studies. We included studies that either correlated the SIBOQ and the MBI or reported the rates of burnout measured by both instruments. Two reviewers extracted data and assessed risk of bias with the CLARITY tool. We used generalized linear mixed regression to separately quantify the predictive (benchmarking) and explanatory (hot-spotting) capabilities of the SIBOQ. We created a regression equation for converting SIBOQ scores to MBI scores. We meta-analyzed correlation coefficients (r) for the SIBOQ and MBI subscales. For all analyses, we considered an r of 0.7 as acceptable reliability for group-level comparisons.
Results
We included 17 studies reporting 6788 respondents. All studies had a high risk of bias, as no study had a response rate over 75% and no study was able to examine non-responders. The correlations (r) of the SIBOQ with the overall MBI were explanatory r = 0.82 and predictive r = 0.56. Regarding MBI subscales, the correlation of the SIBOQ with emotional exhaustion was adequate (r = 0.71; 95% CI 0.67–0.74; I2 = 89%), whereas the correlation with depersonalization was not (r = 0.44; 95% CI 0.34–0.52; I2 = 90%). Overall, in 8 of 15 comparisons, the r was less than 0.70.
Discussion
The SIBOQ’s usually adequate explanatory ability allows “hot-spotting” to identify subgroups with high or low burnout within a single, homogeneous survey fielding. However, the predictive ability of the SIBOQ is insufficiently reliable for comparing local results to external benchmarks.
Introduction
Healthcare workers, both physicians and nonphysicians, report high levels of burnout that affect their own well-being and that of their patients.1,2,3,4 Thus, for the well-being of both healthcare workers and patients, we must reliably measure workforce burnout. The best-studied measurement of burnout is the Maslach Burnout Inventory (MBI), introduced in 1981.5 However, the proprietary status of the MBI and its length of 22 items both handicap organizations that embed the MBI within larger surveys on employee well-being and experiences.6,7,8,9
These limitations of the MBI have led organizations to use the single-item burnout question (SIBOQ), derived from the “Z” Clinical Questionnaire and introduced in 1994.8 The SIBOQ is widely used, including incorporation into the Physician Worklife Study,10 the MEMO study,11 and the Mini-Z survey in 2016.12 The American Medical Association’s (AMA) Steps Forward program has recommended the SIBOQ since 2015.13 The National Academy of Medicine14 recommends the Dolan9 variation from 2015 as valid, reliable, and freely available. The SIBOQ is worded by the AMA as,13 “Using your own definition of ‘burnout,’ please circle one of the answers…” with five Likert responses ranging from “I enjoy my work. I have no symptoms of burnout” to “I feel completely burned out. I am at the point where I may need to seek help.”
Unfortunately, the two studies often cited as establishing the validity of the SIBOQ validated it only against the emotional exhaustion (EE) subscale of the MBI and not the full MBI.9,15 Thus, the ability of the SIBOQ to measure overall burnout is uncertain despite its frequent use.
We aim to create an equation to convert group rates of burnout as measured by the SIBOQ to estimated rates measured by the MBI. By using generalized linear mixed regression, we can separately quantify the predictive (benchmarking) and explanatory (hot-spotting) capabilities of the SIBOQ. A prediction equation, if reliable, will allow organizations that use the SIBOQ to benchmark their results against studies that report MBI from large sample populations. Our second aim is to meta-analyze existing studies to correlate the overall SIBOQ with the full MBI (MBI-full) and its subscales. This second aim will help focus further development of the SIBOQ.
Methods
We followed the PRISMA reporting guidelines for systematic reviews (checklist in eTable 1).16 We did not prospectively register the review for reasons detailed in the Online Supplement: Registration and reporting standards. We followed the methods of openMetaAnalysis and the Cochrane Collaboration.17,18
Eligibility Criteria
We included studies that met either of two criteria.
For the study’s first aim, to support benchmarking, we included studies reporting rates and respondent counts of burnout for both the SIBOQ and the overall MBI. In addition, we included studies that used the MBI dual items (MBI-DI) as a gold standard. This criterion was unplanned and the rationale for accepting the MBI-DI is in the Supplement: Inclusion Criteria.
For our second aim, we included studies reporting correlations (r) between the SIBOQ and any version of the MBI, including overall correlation as well as correlations with one or more of the main subscales of the MBI: emotional exhaustion (EE) and depersonalization (DP).
Information Sources and Search Strategy
Our literature search was first developed in 2021 with the final search before submission occurring on April 7, 2023. Since that date, our search continues to be executed daily at PubMed as done in our first living systematic review.19 Details of our search are in the Online only supplement.
One of two reviewers screened studies; if the first reviewer required the full text of an article to determine inclusion, the article was shared with the second reviewer.17 We included such studies only if the second reviewer, unaware of the first reviewer’s decision, also approved.
Data Collection Process and Effect Measures
Two reviewers abstracted for each study: the numbers of surveys distributed and respondents; rates of burnout as measured by the SIBOQ, MBI, and its subscales; rates of confounders; and lastly, correlation coefficients between measures.
We abstracted any version of the MBI for our criterion standard.1 The initial MBI-Human Services Survey (MBI-HSS) was published in 1981 and contains 22 items, including eight items in the personal accomplishment (PA) scale.5 The second version, the MBI-General Survey (MBI-GS), was published in 1996 with 16 items and replaced the PA scale with a 6-item professional efficacy (PE) scale.20 Both versions measure depersonalization (DP) with 5 items, but the emotional exhaustion (EE) scale was reduced from 9 items on the MBI-HSS to 5 items on the MBI-GS. We also collected descriptions of study populations, survey methods and incentives, and response rates.
Studies varied in the cutoff scores used to deem burnout. The MBI originally required three criteria to deem burnout: abnormal EE, abnormal DP, and lack of personal accomplishment.21 Most studies deemed burnout if either the EE or DP scale was abnormal. Thus, we accepted the following definitions of an abnormal MBI: abnormal EE accompanied by abnormal DP or PA scales,22 abnormal EE or DP scales, or abnormal MBI dual items (MBI-DI: either EE or DP single item abnormal).23
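The accepted definitions above amount to a small decision rule, sketched below for illustration only; the function name, argument names, and criterion labels are ours, not part of any instrument:

```python
def deem_burnout(ee_abnormal, dp_abnormal, pa_abnormal=False,
                 criterion="either_scale"):
    """Apply one of the accepted MBI burnout definitions (illustrative).

    criterion:
      "original"     - all three scales abnormal (EE, DP, and lack of PA)
      "updated"      - abnormal EE plus either abnormal DP or low PA
      "either_scale" - abnormal EE or DP scale (the most common definition;
                       the MBI-DI applies the same logic to single items)
    """
    if criterion == "original":
        return ee_abnormal and dp_abnormal and pa_abnormal
    if criterion == "updated":
        return ee_abnormal and (dp_abnormal or pa_abnormal)
    if criterion == "either_scale":
        return ee_abnormal or dp_abnormal
    raise ValueError(f"unknown criterion: {criterion}")
```

Because the "either scale" rule is the most permissive, applying it to the same responses can only raise, never lower, the rate of burnout relative to the stricter definitions.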
Risk of Bias Assessment
We based our assessment on the CLARITY tool (eTable 3).24 We did not include CLARITY’s fourth item, “Is the survey clinically sensible,” as we are comparing the administration of two surveys and not comparing their development.
Statistical Methods
Aim 1: create an equation for converting a group rate of burnout measured by the SIBOQ to a predicted rate measured by the MBI. As noted by Debray et al., validation of prediction models is highly recommended and thus meta-analysis is needed to summarize predictive performance of the model across different settings and populations.25 For our first aim, we meta-regressed the rates of burnout by the SIBOQ against those reported by the MBI. If a study reported subgroups of respondents, we used these results rather than the overall rates reported by the study. We used the intercept and coefficient from the regression to derive an equation for predicting the MBI from the SIBOQ.
The equation for converting between the SIBOQ and the MBI was created with unweighted, linear regression models using both a fixed effects regression model as well as a generalized linear mixed-effects regression model using restricted maximum likelihood (REML). To increase validity of our regression model, we presented the results with prediction intervals that are more conservative than confidence intervals.25,26,27
Aim 2: correlate the overall SIBOQ with the full MBI (MBI-full) and its subscales. We used generalized linear mixed regression to separately quantify the predictive (benchmarking) and explanatory (hot-spotting) capabilities of the SIBOQ. In the mixed regression model, we treated the study as a random effect. For the mixed regression model, we calculated the explanatory power (total, conditional) R2 as well as the predictive (marginal, fixed) R2. We compared the explanatory powers of the two regression models by using analysis of variance (ANOVA) of their Akaike information criterion (AIC). We calculated the heterogeneity of the results of the mixed regression by dividing the variance between studies by the total variance.
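The variance partition behind the marginal and conditional R2 can be sketched as follows. This is a simplified Nakagawa-style decomposition, not the lme4 output, and the variance components are placeholders chosen only to echo the R2 values reported later:

```python
def variance_partition(var_fixed, var_study, var_residual):
    """Partition variance for a mixed model with study as a random intercept.

    Returns (marginal_r2, conditional_r2, heterogeneity):
      marginal (predictive) R2     = fixed-effect variance / total variance
      conditional (explanatory) R2 = (fixed + between-study) variance / total
      heterogeneity                = between-study variance / total variance
    """
    total = var_fixed + var_study + var_residual
    return (var_fixed / total,
            (var_fixed + var_study) / total,
            var_study / total)

# Placeholder components chosen to echo the reported 32% and 69% R2 values
marginal, conditional, heterogeneity = variance_partition(32.0, 37.0, 31.0)
```

The gap between conditional and marginal R2 is the share of variance absorbed by the study-level random effect, which is why a measure can hot-spot well within one fielding yet benchmark poorly across organizations.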
To explore modulators of the relationship between the SIBOQ and the overall MBI, we added to the mixed regression terms for the number of respondents, the response rate, the rate of DP, and the rate of physicians among responders. In a post hoc revision of the analysis, we replaced the rate of DP with the ratio of the rates of DP to EE to predict the MBI. The reason for the revised analysis was our realization that the planned analysis using the absolute rate of DP would not be informative unless compared to the rate of EE.
For our second aim, we meta-analyzed the correlation coefficients with random effects estimation with inverse variance weighting. As noted by Borenstein et al., the correlation coefficient can serve as the effect size index.28 However, as recommended by Borenstein for validity, we transformed the correlation coefficients to Fisher’s z scale.
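The Fisher z workflow can be sketched in a few lines. This is a minimal fixed-effect version for illustration; the review pooled with random effects and the Hartung-Knapp adjustment via the `metacor` function of the R package `meta`:

```python
import math

def pool_correlations(rs, ns):
    """Pool correlation coefficients on the Fisher z scale using
    inverse-variance weights, where var(z) = 1 / (n - 3)."""
    zs = [math.atanh(r) for r in rs]            # Fisher z transform
    weights = [n - 3 for n in ns]               # inverse of var(z)
    z_pooled = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    return math.tanh(z_pooled)                  # back-transform to r
```

Pooling on the z scale before back-transforming avoids the bias that arises from averaging correlation coefficients directly, because the sampling distribution of r is skewed while that of z is approximately normal.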
In all analyses, we deemed correlation coefficients to indicate reliability when the correlation coefficient was at least 0.7 29,30 or the corresponding proportion of variance explained (R2) was at least 50%.
We performed all statistics with the R software.31 Random effects meta-analyses with the Hartung and Knapp approach32 were done with the metacor function of the R package meta,33 while the mixed effects analysis for creating the conversion equation was done with the lmer function of the package lme4.34 The dominance analysis was performed with the package domin.35
All data and statistical code are available online at https://ebmgt.github.io/well-being_measurement/.
Role of the Funding Source
We received no funding for this review.
Results
Study Selection and Characteristics
We included 17 studies reporting 15,600 survey recipients and 6788 respondents (PRISMA Flow diagram; Fig. 1). The daily PubMed alert identified two included studies36,37 before our manuscript submission and the study by Song38 after our submission. The 17 included studies are described in Table 1.
As previously found by Rotenstein,1 studies defined burnout with the MBI in many ways: MBI:EE and MBI:DP scales both abnormal in two studies,39,40 MBI:EE and MBI:DP single items positive in two studies,39,41 and MBI:EE or MBI:DP scales abnormal in the remaining studies. Only the study by Ong used the updated Maslach criteria of a high EE scale and either a high DP or a low personal accomplishment scale.39 Studies used from 6 to 7 anchors for the MBI questions.42 Even between the two studies fielded in different years by the same organization, the Sierra Sacramento Valley Medical Society, the MBI anchors changed. The first survey did not include the conventional anchor of “a few times a month,”41,42 which was corrected in the second survey.43 The addition of the new anchor was associated with an 8% drop in the rate of burnout, as 15% of respondents chose the new anchor of “a few times a month” for each MBI question.
The lowest rates of burnout tended to occur in the two studies that required both the MBI:EE and MBI:DP scales to be abnormal (20 to 26%).39,40 These were followed by studies requiring either single item to be abnormal (27 to 48.1%),39,43 while studies requiring either scale to be abnormal tended to report the highest rates of burnout (6.7 to 85%).29,44
Risk of Bias in Studies
No studies used standard reporting recommendations for surveys; however, all studies were published before the availability of the CROSS guideline for reporting of survey studies.24
A high risk of bias was present in all studies, as no study met the CLARITY items of a response rate over 75% or a comparison of respondents and non-respondents (online supplement, eTable 3). However, six of the 17 studies had response rates between 50 and 75% (Hansen,28 Hansen,29 Kemper,32 Knox,2 Ong,39 Waddimba45), and all of these studies met the other four CLARITY items for quality. The response rate, pooled by random effects analysis, was 35% (95% CI 24 to 49%) and was significantly heterogeneous across studies (I2 = 99%).
Results of Syntheses
There were insufficient data to meta-analyze studies that required both MBI:EE and MBI:DP subscales to be abnormal; therefore, our analysis focuses on surveys that deem burnout if either scale is abnormal. The summary of the three regressions is in Table 2.
For the first study aim, the fixed regression analysis of rates of abnormal SIBOQ versus rates of abnormal MBI yielded an equation for converting rates (Fig. 2).
As shown in Fig. 2, the prediction limits are wide. The coefficient of determination, indicating the proportion of variance explained (R2) by the regression model, was 32%.
Across 37 comparisons, the median difference between the rates predicted from the SIBOQ and the observed MBI rates was less than 1% (range, 24% lower to 31% higher than the MBI). In comparison, the median difference between the reported SIBOQ rates and the MBI rates was 6% below the MBI (range, 25% lower to 35% higher).
The mixed regression, with the study treated as a random effect, yielded a substantial explanatory R2 at 69%. The fixed regression yielded a predictive R2 of only 32% (Table 2). The proportion of variance explained by the generalized linear mixed regression model is over the 50% threshold to indicate an adequate fit for explanatory ability. However, the mixed model’s predictive (marginal) ability is lower than the threshold for adequacy.
For the second study aim, the forest plots of the random effects meta-analyses are shown in Figs. 3 and 4 and summarized in Table 2. Among studies that fielded surveys in English, the correlations of the SIBOQ with the MBI subscales were 0.71 for MBI:EE (95% CI 0.67 to 0.74) and 0.44 for MBI:DP (95% CI 0.34 to 0.52). The studies whose surveys were not in English36,38 reported a significantly lower correlation with MBI:EE (Fig. 3) and a nonsignificantly higher correlation with MBI:DP (Fig. 4). The I2 values, 88% and 89%, respectively, indicated substantial heterogeneity of results (Figs. 3, 4).
We could not assess publication bias as cross-sectional studies are not generally registered in advance.
For a sensitivity analysis, we repeated the mixed effects analysis, limited to the studies that used the full MBI for the gold standard, and the results were very similar (Online only supplement, eTable 4).
The DP/EE ratio, which ranged from 0.74 to 1.53, tended to modulate the correlation between the SIBOQ and the MBI, with a beta-coefficient of 11.0 (95% CI −2.9 to 24.9). Dominance analysis of the fixed regression yielded R2 values of 52%, 42%, and 10% for the complete model, the SIBOQ alone, and the DP/EE ratio alone, respectively. The p value for the DP/EE ratio was 0.114.
We planned to measure the impact on the fixed regressions from study size, response rates, and the rates of women and physicians among respondents. However, these data were only available at the study level, giving too few data points to analyze.
Discussion
The SIBOQ has statistically significant and borderline adequate reliability for predicting the MBI:EE. However, the SIBOQ has statistically significant but insufficient reliability for predicting the overall MBI or the MBI-DI. Accordingly, we found substantial heterogeneity of correlations between the SIBOQ and the two MBI scales across studies. Our work complements that of Brady, who created a crosswalk between the SIBOQ and MBI at the level of the individual respondent,29 while we compared measures at the organizational level.
Unfortunately, we cannot recommend the SIBOQ as a short, reliable measure of burnout that can be embedded in larger workplace surveys to achieve the twin goals of identifying local bright spots and comparing results to external benchmarks. The mixed meta-regression, with its ability to separate predictive and explanatory abilities, supports that the SIBOQ can stratify levels of burnout across subgroups in a single fielding of the survey to help organizations identify bright spots; however, it finds insufficient predictive ability of the SIBOQ against external benchmarks. Accordingly, the median difference of the SIBOQ rate below the MBI rate was 6% (range, 25% lower to 35% higher).
The heterogeneity in response rates across studies is substantial (I2 = 100%). One explanation for variable response rates may be the recipients’ expected impact of the survey as found by Brosnan.46 If recipients have completed well-being surveys in the past and their effort did not lead to meaningful organizational change, the recipients may be less likely to repeat burnout screening measures. The response rate to the survey may modulate correlations. Brady used the AMA Masterfile and was the only study that reported a lower rate of burnout using the MBI than using the SIBOQ.29 While the difference was less than 1%, no other studies approached equivalence between the two measures. This is revealed in Fig. 2 by the Brady study (black points) being lower on the plot than the other studies. One possible explanation is that when response rates are low, survey recipients who do respond may be more influenced by the subjective anchors in the SIBOQ than the objective anchors in the MBI. In a prior study of respondents to the Staff Surveys of the English National Health Service (NHS), we found that sites with low response rates reported more work stress and less engagement in a survey that used subjective Likert anchors.47 This hypothesis would be difficult to study as it would require an additional study like Brady’s in which both the SIBOQ and MBI were asked of multiple sites with a range of response rates.
The DP/EE ratio may also modulate the ability of the SIBOQ to predict the MBI. This ratio varied from 0.68 among the pediatricians in Brady’s study to 1.91 among the emergency medicine physicians in the same study.29 Prior research identified two modulators of the DP/EE ratio. First, women respondents may report EE, whereas men tend toward DP or cynicism.5,48,49 Second, the nature of worksite demands may affect the ratio: Leiter found that workload contributes to exhaustion, whereas lack of values congruence with management contributes to cynicism.49 Lastly, the previously reported lower internal consistency of MBI:DP compared with MBI:EE50 may attenuate the ratio’s effect. The role of the DP/EE ratio should be further explored with larger studies or with an individual participant data meta-analysis.
The study by Kemper supports the inadequacy of the SIBOQ for predicting rates in external populations. Kemper was the only study with a negative correlation between the rates of abnormal SIBOQ and MBI across the subgroups within the study (Fig. 2 online, purple and largest points).51 Kemper was also the only study with subgroups based on the year of survey administration: in the second year, the MBI rate was lower but the SIBOQ rate was higher. In the second year, the sample size increased by 20% as the number of study sites increased from 34 to 46. While the differences between years are small, the absence of a positive relationship raises the question of whether the new sites had a different profile of stressors, specifically a lower rate of value incongruence relative to workload stress.
Due to the cost of the MBI and the low predictive ability of the SIBOQ, organizations might choose to switch to another non-proprietary scale. However, aside from the SIBOQ and MBI, few surveys reviewed by the National Academy of Medicine have extensive benchmarks, and all have at least six items. Increasingly, organizational psychologists recognize the need to combine short scales that measure issues relevant to organizations.52
We encourage survey developers to acknowledge the difficulty of survey creation and to support collaborative evolution of their work. A successful example is the measurement of psychological engagement at work with the UWES-3,53 which has derivatives used by national surveys of the CDC,54 National Health Service,55 and American Psychological Association.56 One way to promote collaboration is to publish surveys with a Creative Commons “copyleft” license without a “NoDerivatives” feature.57 Funders of surveys developed with public money may require and support open-access publication with a Creative Commons license. Authors who cannot afford open-access fees can still publish their survey items prior to journal submission with a Creative Commons license in repositories that can create a digital object identifier (DOI), such as GitHub or the Open Science Framework. We strongly commend the studies in this review that display a Creative Commons copyright in their publications.36,37,39,58,59,60,61,62 These include three studies that validated items measuring DP that could be studied alongside the SIBOQ.39,58,63
A copyleft license with a NonCommercial feature allows the developer to separately negotiate commercial licenses with payments. The entertainment industry provides a precedent. Many songs or films have been created, performed, and recorded by one artist, then covered years later by a different one, sometimes obtaining similar or greater success than the original. Imagine if Simon and Garfunkel in 1964 refused to allow modifications to their song, “Sounds of Silence,” which has been covered or sampled over 100 times, including Disturbed’s cover 50 years later or the more recent sample by Eminem in “Darkness.”64 Similar approaches in academics could give survey developers continued success and financial gain if their work evolves in future surveys.
In conclusion, the SIBOQ can stratify levels of burnout across subgroups in a single fielding of the survey to help organizations identify bright spots whose success can inform the improvement of struggling hot spots. However, the SIBOQ is less able to compare local results to external benchmarks. For survey developers, we encourage using copyrights that allow surveys to evolve.
References
Rotenstein LS, Torre M, Ramos MA, et al. Prevalence of Burnout Among Physicians: A Systematic Review. JAMA. 2018;320(11):1131-1150. https://doi.org/10.1001/jama.2018.12777
Knox M, Willard-Grace R, Huang B, Grumbach K. Maslach Burnout Inventory and a Self-Defined, Single-Item Burnout Measure Produce Different Clinician and Staff Burnout Estimates. J Gen Intern Med. 2018;33(8):1344-1351. https://doi.org/10.1007/s11606-018-4507-6
Hodkinson A, Zhou A, Johnson J, et al. Associations of physician burnout with career engagement and quality of patient care: systematic review and meta-analysis. BMJ. 2022;378:e070442. https://doi.org/10.1136/bmj-2022-070442
Tawfik DS, Scheid A, Profit J, et al. Evidence Relating Health Care Provider Burnout and Quality of Care: A Systematic Review and Meta-analysis. Ann Intern Med. 2019;171(8):555-567. https://doi.org/10.7326/M19-1152
Maslach CT, Jackson SE. The measurement of experienced burnout. J Organ Behav. 1981;2(2):99-113. https://doi.org/10.1002/job.4030020205
Lucas R, Dandar V, Scott J. Engagement and Workplace Satisfaction of Emergency Medicine Faculty in the United States. AEM Educ Train. 2021;5(2):e10474. https://doi.org/10.1002/aet2.10474
Koranne R, Williams ES, Poplau S, et al. Reducing burnout and enhancing work engagement among clinicians: The Minnesota experience. Health Care Manage Rev. 2022;47(1):49-57. https://doi.org/10.1097/HMR.0000000000000298
Schmoldt RA, Freeborn DK, Klevit HD. Physician burnout: recommendations for HMO managers. HMO Pract. 1994;8(2):58-63.
Dolan ED, Mohr D, Lempa M, et al. Using a single item to measure burnout in primary care staff: a psychometric evaluation. J Gen Intern Med. 2015;30(5):582-587. https://doi.org/10.1007/s11606-014-3112-6
Williams ES, Konrad TR, Linzer M, et al. Physician, practice, and patient characteristics related to primary care physician physical and mental health: results from the Physician Worklife Study. Health Serv Res. 2002;37(1):121-143.
Linzer M, Manwell LB, Williams ES, et al. Working conditions in primary care: physician reactions and care quality. Ann Intern Med. 2009;151(1):28–36, W6–9. https://doi.org/10.7326/0003-4819-151-1-200907070-00006
Linzer M, Poplau S, Babbott S, et al. Worklife and Wellness in Academic General Internal Medicine: Results from a National Survey. J Gen Intern Med. 2016;31(9):1004-1010. https://doi.org/10.1007/s11606-016-3720-4
Linzer M, Guzman-Corrales L, Poplau S. Preventing physician burnout - STEPS Forward. American Medical Association. Published June 2015. Accessed September 24, 2023. https://edhub.ama-assn.org/steps-forward/module/2702509
Anonymous. Valid and Reliable Survey Instruments to Measure Burnout, Well-Being, and Other Work-Related Dimensions. National Academy of Medicine. Accessed October 3, 2021. https://nam.edu/valid-reliable-survey-instruments-measure-burnout-well-work-related-dimensions/
Rohland BM, Kruse GR, Rohrer JE. Validation of a single-item measure of burnout against the Maslach Burnout Inventory among physicians. Stress Health. 2004;20(2):75-79. https://doi.org/10.1002/smi.1002
Page MJ, McKenzie JE, Bossuyt PM, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71. https://doi.org/10.1136/bmj.n71
Chandler J, Cumpston M, Li T, Page MJ, Welch VA. Cochrane Handbook for Systematic Reviews of Interventions. (Higgins JPT, Thomas J, eds.); 2023. Accessed September 24, 2023. https://training.cochrane.org/handbook/current
openMetaAnalysis Contributors. openMetaAnalysis: methods. Published online 2023. Accessed September 24, 2023. http://openmetaanalysis.github.io/methods.html
Badgett RG, Vindhyal M, Stirnaman JT, Gibson CM, Halaby R. A Living Systematic Review of Nebulized Hypertonic Saline for Acute Bronchiolitis in Infants. JAMA Pediatr. 2015;169(8):788-789. https://doi.org/10.1001/jamapediatrics.2015.0681
Taris TW, Schreurs PJG, Schaufeli WB. Construct validity of the Maslach Burnout Inventory-General Survey: A two-sample examination of its factor structure and correlates. Work Stress. 1999;13(3):223-237. https://doi.org/10.1080/026783799296039
Eckleberry-Hunt J, Kirkpatrick H, Barbera T. The Problems With Burnout Research. Acad Med J Assoc Am Med Coll. Published online August 16, 2017. https://doi.org/10.1097/ACM.0000000000001890
Dyrbye LN, West CP, Shanafelt TD. Defining burnout as a dichotomous variable. J Gen Intern Med. 2009;24(3):440; author reply 441. https://doi.org/10.1007/s11606-008-0876-6
West CP, Dyrbye LN, Satele DV, Sloan JA, Shanafelt TD. Concurrent validity of single-item measures of emotional exhaustion and depersonalization in burnout assessment. J Gen Intern Med. 2012;27(11):1445-1452. https://doi.org/10.1007/s11606-012-2015-7
CLARITY Group. Risk of Bias Instrument for Cross-Sectional Surveys of Attitudes and Practices. Published online September 2017. https://www.evidencepartners.com/wp-content/uploads/2017/09/Risk-of-Bias-Instrument-for-Cross-Sectional-Surveys-of-Attitudes-and-Practices.pdf
Debray TPA, Damen JAAG, Snell KIE, et al. A guide to systematic review and meta-analysis of prediction model performance. BMJ. 2017;356:i6460. https://doi.org/10.1136/bmj.i6460
Willis BH, Riley RD. Measuring the statistical validity of summary meta-analysis and meta-regression results for use in clinical practice. Stat Med. 2017;36(21):3283-3301. https://doi.org/10.1002/sim.7372
Riley RD, Ahmed I, Debray TPA, et al. Summarising and validating test accuracy results across multiple studies for use in clinical practice. Stat Med. 2015;34(13). https://doi.org/10.1002/sim.6471
Borenstein M, Hedges LV. Effect Sizes for Meta-Analysis. In: Cooper HM, Hedges LV, Valentine JC, eds. The Handbook of Research Synthesis and Meta-Analysis. 3rd edition. Russell Sage Foundation; 2019:220.
Brady KJS, Ni P, Carlasare L, et al. Establishing Crosswalks Between Common Measures of Burnout in US Physicians. J Gen Intern Med. Published online March 31, 2021. https://doi.org/10.1007/s11606-021-06661-4
Anonymous. PROMIS® Instrument Development and Validation Scientific Standards, Version 2.0, (revised May 2013). HealthMeasures. Published May 2013. Accessed June 5, 2022. https://www.healthmeasures.net/images/PROMIS/PROMISStandards_Vers2.0_Final.pdf
R Core Team. R: A language and environment for statistical computing. Published online 2020. Accessed January 31, 2021. https://www.R-project.org/
Cornell JE, Mulrow CD, Localio R, et al. Random-Effects Meta-analysis of Inconsistent Effects: A Time for Change. Ann Intern Med. 2014;160(4):267-270. https://doi.org/10.7326/M13-2886
Schwarzer G. Package “meta.” Published online February 5, 2022. https://cran.r-project.org/web/packages/meta/
Bates D, Maechler M, Bolker B, et al. lme4: Linear Mixed-Effects Models using “Eigen” and S4. Published online April 7, 2022. Accessed June 18, 2022. https://CRAN.R-project.org/package=lme4
Luchman J. domir: Tools to Support Relative Importance Analysis. R Package Documentation. Published July 11, 2022. Accessed March 11, 2023. https://rdrr.io/cran/domir/
Nagasaki K, Seo E, Maeno T, Kobayashi H. Diagnostic accuracy of the Single-item Measure of Burnout (Japanese version) for identifying medical resident burnout. J Gen Fam Med. 2022;23(4):241-247. https://doi.org/10.1002/jgf2.535
Houdmont J, Daliya P, Adiamah A, et al. Identification of Surgeon Burnout via a Single-Item Measure. Occup Med Oxf Engl. 2022;72(9):641-643. https://doi.org/10.1093/occmed/kqac116
Song HI, Yun JA, Ahn YS, Choi KS. Validating a Korean Version of the Single-Item Burnout Measure for Evaluating Burnout Among Doctors. Psychiatry Investig. 2023;20(7):681-688. https://doi.org/10.30773/pi.2022.0339
Ong J, Lim WY, Doshi K, et al. An Evaluation of the Performance of Five Burnout Screening Tools: A Multicentre Study in Anaesthesiology, Intensive Care, and Ancillary Staff. J Clin Med. 2021;10(21):4836. https://doi.org/10.3390/jcm10214836
Aguilar-Nájera O, Zamora-Nava LE, Grajales-Figueroa G, Valdovinos-Díaz MÁ, Téllez-Ávila FI. Prevalence of burnout syndrome in gastroenterologists and endoscopists: results of a national survey in Mexico. Postgrad Med. 2020;132(3):275-281. https://doi.org/10.1080/00325481.2019.1707486
Yellowlees P, Coate L, Misquitta R, Wetzel AE, Parish MB. The Association Between Adverse Childhood Experiences and Burnout in a Regional Sample of Physicians. Acad Psychiatry J Am Assoc Dir Psychiatr Resid Train Assoc Acad Psychiatry. 2021;45(2):159-163. https://doi.org/10.1007/s40596-020-01381-z
Coate L, Misquitta R, Wetzel AE, Yellowlees P, Parish MB. Joy of Medicine: Assessing Physician Well-Being in the Sacramento Region. Published online July 15, 2019. http://joyofmedicine.org/physician-burnout-research/
Coate L, Wetzel AE, Yellowlees P, Misquitta R, Gree L. Joy of Medicine: A Regional Approach to Improving Physician Well-Being. Published online March 2021. http://joyofmedicine.org/physician-burnout-research/
Olson K, Sinsky C, Rinne ST, et al. Cross-sectional survey of workplace stressors associated with physician burnout measured by the Mini-Z and the Maslach Burnout Inventory. Stress Health. 2019;35(2):157-175. https://doi.org/10.1002/smi.2849
Waddimba AC, Scribani M, Nieves MA, Krupa N, May JJ, Jenkins P. Validation of Single-Item Screening Measures for Provider Burnout in a Rural Health Care Network. Eval Health Prof. 2016;39(2):215-225. https://doi.org/10.1177/0163278715573866
Brosnan K, Kemperman A, Dolnicar S. Maximizing participation from online survey panel members. Int J Mark Res. 2021;63(4):416-435. https://doi.org/10.1177/1470785319880704
Boyle R, Badgett R. Healthcare Work Force: Survey Response Rates Impact on Response Patterns. 2022. https://doi.org/10.1007/s11606-022-07653-8
Leiter MP, Frank E, Matheson TJ. Demands, values, and burnout: relevance for physicians. Can Fam Physician. 2009;55(12):1224–1225, 1225.e1–6.
Leiter MP, Maslach C. Areas of worklife: a structured approach to organizational predictors of job burnout. In: Emotional and Physiological Processes and Positive Intervention Strategies. Vol 3. Research in Occupational Stress and Well-being. Emerald Insight; 2003:91–134. https://doi.org/10.1016/S1479-3555(03)03003-8
Schaufeli WB, Bakker AB, Hoogduin K, Schaap C, Kladler A. On the clinical validity of the Maslach Burnout Inventory and the Burnout Measure. Psychol Health. 2001;16(5):565-582. https://doi.org/10.1080/08870440108405527
Kemper KJ, Wilson PM, Schwartz A, et al. Burnout in Pediatric Residents: Comparing Brief Screening Questions to the Maslach Burnout Inventory. Acad Pediatr. 2019;19(3):251-255. https://doi.org/10.1016/j.acap.2018.11.003
Matthews RA, Pineault L, Hong YH. Normalizing the Use of Single-Item Measures: Validation of the Single-Item Compendium for Organizational Psychology. J Bus Psychol. 2022;37(4):639-673. https://doi.org/10.1007/s10869-022-09813-3
Schaufeli WB, Shimazu A, Hakanen J, Salanova M, De Witte H. An Ultra-Short Measure for Work Engagement: The UWES-3 Validation Across Five Countries. Eur J Psychol Assess. 2019;35(4):577-591. https://doi.org/10.1027/1015-5759/a000430
The National Institute for Occupational Safety and Health. NIOSH Worker Well-Being Questionnaire (WellBQ). U.S. Department of Health and Human Services, Public Health Service, Centers for Disease Control and Prevention, National Institute for Occupational Safety and Health; 2021. https://doi.org/10.26616/NIOSHPUB2021110revised52021
United Kingdom National Health Service Survey Coordination Centre. NHS Staff Surveys. NHS Staff Surveys. Accessed October 6, 2019. http://www.nhsstaffsurveys.com
APA Center for Organization Excellence. 2014 Work and Well-Being Survey. Published online April 2014. Accessed July 17, 2018. http://www.apaexcellence.org/assets/general/2014-work-and-wellbeing-survey-results.pdf
Newman JC, Feldman R. Copyright and open access at the bedside. N Engl J Med. 2011;365(26):2447-2449. https://doi.org/10.1056/NEJMp1110652
Trockel M, Bohman B, Lesure E, et al. A Brief Instrument to Assess Both Burnout and Professional Fulfillment in Physicians: Reliability and Validity, Including Correlation with Self-Reported Medical Errors, in a Sample of Resident and Practicing Physicians. Acad Psychiatry. 2018;42(1):11-24. https://doi.org/10.1007/s40596-017-0849-3
Li-Sauerwine S, Rebillot K, Melamed M, Addo N, Lin M. A 2-Question Summative Score Correlates with the Maslach Burnout Inventory. West J Emerg Med. 2020;21(3):610-617. https://doi.org/10.5811/westjem.2020.2.45139
Hansen V, Girgis A. Can a single question effectively screen for burnout in Australian cancer care workers? BMC Health Serv Res. 2010;10:341. https://doi.org/10.1186/1472-6963-10-341
Hansen V, Pit S. The Single Item Burnout Measure is a Psychometrically Sound Screening Tool for Occupational Burnout. Health Scope. 2016;5(2):e32164. https://doi.org/10.17795/jhealthscope-32164
Houdmont J, Daliya P, Theophilidou E, et al. Burnout Among Surgeons in the UK During the COVID-19 Pandemic: A Cohort Study. World J Surg. 2022;46(1):1-9. https://doi.org/10.1007/s00268-021-06351-6
Demerouti E, Bakker AB. The Oldenburg Burnout Inventory: A good alternative to measure burnout and engagement. Handbook of stress and burnout in health care. Published online September 25, 2007. https://www.isonderhouden.nl/doc/pdf/arnoldbakker/articles/articles_arnold_bakker_173.pdf
WhoSampled. The Sounds of Silence by Simon & Garfunkel on WhoSampled. WhoSampled. Accessed June 22, 2022. https://www.whosampled.com/Simon-%26-Garfunkel/The-Sounds-of-Silence/
Flickinger TE, Kon RH, Jacobsen R, Owens J, Schorling J, Plews-Ogan M. Single-Item Burnout Measure Correlates Well with Emotional Exhaustion Domain of Burnout but Not Depersonalization Among Medical Students. J Gen Intern Med. 2020;35(11):3383-3385. https://doi.org/10.1007/s11606-020-05808-z
Guyatt G, Oxman AD, Sultan S, et al. GRADE guidelines: 11. Making an overall rating of confidence in effect estimates for a single outcome and for all outcomes. J Clin Epidemiol. 2013;66(2):151-157. https://doi.org/10.1016/j.jclinepi.2012.01.006
Acknowledgements:
We very much appreciate the additional analyses of their studies provided by Dr. John Ong39 and Dr. Jonathan Houdmont37. Likewise, we are very thankful to Tim Carter of the Tim Carter Band (https://www.facebook.com/timcarterband/) and the Tree House Studio (https://www.facebook.com/thetreehousestudioridgetop/) for sharing his experience and guidance regarding owner rights in the music industry. The authors very much appreciate the excellent comments and direction from the Journal's editors, which substantially enhanced the manuscript.
Ethics declarations
Conflict of Interest:
The authors declare no competing interests.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
Cite this article
Hagan, G., Okut, H. & Badgett, R.G. A Systematic Review of the Single-Item Burnout Question: Its Reliability Depends on Your Purpose. J GEN INTERN MED 39, 818–828 (2024). https://doi.org/10.1007/s11606-024-08685-y