Introduction

The medical school experience, with its focus on standardized examinations and subjective evaluations that can shape career prospects, significantly affects medical students [1, 2]. Although the United States Medical Licensing Examination (USMLE) Step 1 is now pass-fail, Step 2 Clinical Knowledge continues to report a numeric score, which program directors use as a metric [3]. The need to achieve high academic performance leads to unchecked stress, resulting in anxiety and depression affecting 1 in 3 students [1], and suicidal ideation affecting 1 in 10 [2]. A recent review of the literature reported a higher prevalence of anxiety among medical students than among non-medical students [4]. Chronic stress, intense academic workloads, and pressure to achieve high academic standards are antecedents to test anxiety [5].

Test anxiety has been characterized as impairing the working memory available for test-taking and has been linked to poor academic performance [6, 7]. This phenomenon has been studied since the 1950s across many educational settings [8]. Test anxiety is defined as the physiological and behavioral responses that accompany concerns of failure. Highly test-anxious individuals report sensations of tension and worry before, during, and after examinations [9]. In fact, research has demonstrated that test anxiety involves autonomic responses and cognitive components [10]. These anxious responses have been shown to be more influential for high-stakes tests than for regular coursework [9].

Test anxiety in higher education involves both an emotional and a cognitive dimension of anxiety [11, 12]. Notably, Spielberger, developer of the State-Trait Anxiety Inventory (STAI), posited state and trait elements of anxiety, which may co-occur or function independently: trait anxiety is personality driven and chronic, while state anxiety is acute in nature [13]. Studies have indicated that chronic anxiety has a large effect on examination performance, while acute test anxiety has a smaller, though demonstrable, effect on examination performance in healthcare professional programs [14, 15].

Other studies of test anxiety in higher education have included medical students, engineering students, physical therapy students, and nursing students [16, 17, 18, 19, 20]. Schwartz et al. [19] compared timed and untimed examinations and reported that untimed testing mitigated anxiety and improved examination performance. However, these results need to be interpreted carefully since students took a timed test first and spent an average of only 8 additional minutes on the untimed examination, raising the question of how helpful that intervention was. A systematic review explored open- versus closed-book examinations and found no significant differences in examination performance. It has been reported that test anxiety peaks in higher education, which merits continued exploration of interventions that mitigate test anxiety and thereby improve examination performance [21, 22].

The varying results of interventions impacting student examination performance led us to question what the literature has reported about medical student interventions for test anxiety. The aim of this review was to explore interventions designed to mitigate medical students’ test anxiety, ideally leading to improved examination performance. The following questions guided our research: (1) What tests are used to measure and evaluate test anxiety in medical students? (2) What types of interventions have been used to mitigate test anxiety? (3) What impact do test anxiety interventions have on academic performance?

Methods

Our procedure for this literature review applied Arksey and O’Malley’s five-stage framework, which includes the following: (1) identifying the research question(s); (2) identifying relevant studies; (3) selecting studies; (4) extracting data; and (5) summarizing and reporting results [23]. Stage 1, identifying the research questions, is addressed in the Introduction.

Stage 2. Identifying Relevant Studies

We provided key terms to a medical librarian, who then conducted an extensive literature search. The librarian has extensive experience conducting literature reviews, having participated in nearly 100 systematic and scoping reviews. Search strategies for each database are provided in Table 1.

Table 1 Search strategies and databases

The databases used for this search included PubMed, Embase, PsycINFO, ERIC, Scopus, and CINAHL. The search included English-language articles published between 2010 and 2021. This time period was chosen because medical education has continued to evolve, with new teaching methods, technologies, and pedagogical approaches, and it allowed us to study the interventions most relevant to current medical students. The search results were loaded into a reference manager to screen out duplicate articles. During the full-text screen and data extraction phase, reference lists were also inspected to identify additional articles for inclusion in the study.

Stage 3. Selecting Studies

The results of the literature search were loaded into Covidence (Melbourne, Australia), which is a cloud-based software program designed for literature reviews across multiple sites. Each abstract was reviewed by two members of the research team (CW, GLBD) for inclusion or exclusion. If there were disagreements, the team reviewed the abstract together to make a final determination. Abstracts were included if they indicated the article included a discussion about test anxiety and some type of intervention for test anxiety in US medical schools. We included assessments of knowledge and clinical skills. Abstracts were excluded if they were from review articles and commentaries, involved non-US medical schools, or clearly did not address test anxiety. Review articles were checked to ensure we did not inadvertently omit a study. Although we recognize the prevalence of test anxiety among international medical students, our familiarity with US medical student education and services merited a US focus to accurately report our findings.

After the abstract review process, all selected papers underwent full-text reviews. Each article was reviewed by two members of the research team. Any disagreements were resolved by the team reviewing the article and coming to consensus.
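
The dual-review step described above can be sketched in a few lines of Python. This is a minimal illustration only: the record identifiers and decisions are hypothetical, `find_conflicts` is an assumed helper name, and the sketch does not represent how Covidence implements conflict detection.

```python
def find_conflicts(reviewer_a, reviewer_b):
    """Return sorted record IDs on which two reviewers' decisions differ.

    Each argument maps a record ID to "include" or "exclude".
    Only records screened by both reviewers are compared.
    """
    shared = reviewer_a.keys() & reviewer_b.keys()
    return sorted(rid for rid in shared if reviewer_a[rid] != reviewer_b[rid])


# Hypothetical screening decisions, for illustration only.
reviewer_a = {"rec1": "include", "rec2": "exclude", "rec3": "include"}
reviewer_b = {"rec1": "include", "rec2": "include", "rec3": "include"}

# Records listed here would go to the team for a consensus decision.
conflicts = find_conflicts(reviewer_a, reviewer_b)
print(conflicts)
```

Records on which both reviewers agree pass through unchanged; only the conflicting records require team discussion.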

Stage 4. Extracting Data

The research team identified criteria for the data extraction form. An extraction form was set up in Covidence. Both researchers completed the extraction form on all articles chosen for the study by individually describing the objective or research question, medical student year of training, sample size, summary of intervention, examination type, measurement tools used, and primary outcome of study. Once data had been extracted from each article, the research team met to resolve any conflicts in the data collection process.

Results

A total of 883 papers were identified using the search criteria and subjected to the first level of review (titles and abstracts). Of the 76 studies selected for full-text review, 54 were excluded because they were from other countries, lacked an intervention, or lacked other relevant data identified for inclusion. This resulted in 22 papers chosen for review and data extraction (see Fig. 1).

Fig. 1 Manuscript review process
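
As a quick consistency check, the screening counts reported above can be tallied as follows. The three input values come from the text. The number excluded at the title and abstract stage is not stated in the text and is derived here under the assumption that no records entered the full-text stage from reference-list searching; the variable names are illustrative.

```python
# Counts reported in the Results section.
identified = 883          # records screened by title and abstract
full_text_reviewed = 76   # records advanced to full-text review
excluded_full_text = 54   # records excluded at full-text review

# Derived count; assumes no records were added from reference lists.
excluded_at_abstract = identified - full_text_reviewed

# Should match the number of included papers reported in the text.
included = full_text_reviewed - excluded_full_text

print(f"Excluded at title/abstract screen: {excluded_at_abstract}")
print(f"Included for data extraction: {included}")
```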

Demographics of the Studies

Study characteristics are displayed in Table 2. Briefly, first-year medical students were most frequently included (15 of 22 studies), with second-year students included in 12 studies, and third- and fourth-year students included in 4 and 5 studies, respectively. One study uniquely intervened among post-baccalaureate students pursuing medical school following degree completion [24], and another included young physicians [16].

Table 2 Study characteristics and measurement tools used

Reporting of demographics such as race, ethnicity, or gender was largely absent among the studies, with only 4 describing participant demographics [16, 24, 25, 26]. Sample sizes among the cross-sectional and cohort studies varied widely (from a low of 41 students [27] to a high of 297 [25]), as did response rates for the surveys used (as low as 8% [28]).

Test anxiety was measured through various means, with no single scale or tool used most commonly. The majority of studies used self-report questionnaires (20 of 22), with 6 studies using Likert-type scales in locally developed questionnaires [24, 29, 30, 31, 32, 33]. Three studies used the STAI questionnaire [27, 33, 34] and two studies used the Westside Test Anxiety Scale [28, 35]. Other tools included the Learning and Study Strategies Inventory (LASSI) version 2, Test Anxiety Inventory (TAI), Perceived Stress Scale, Profile of Mood States (POMS), and Brief Symptom Inventory (BSI). One study used biometric markers, including cortisol, an unspecified hormone, and IgA levels [36], and another used a salivary cortisol test alongside other measurement tools [28].

Interventions

To address our second research question, we explored the articles to determine what interventions were reported to mitigate test anxiety. There were a variety of approaches, some more formal than others. Table 2 details findings from each study.

Self-help or wellness interventions were the most commonly reported. Deep breathing techniques [24] and hypnosis [26, 37] were used to help reduce anxiety in the moment. Similarly, three studies reported creating a mind-body-spirit program designed to help students relax [16, 36, 38]. The final wellness activity was dog therapy, in which students could spend 20 minutes with a therapy dog [28].

A formal course on test-taking and study strategies was reported [35]. Three studies described courses that provided students with exposure to upcoming content or skills as a means of ameliorating uncertainty about the testing experiences [29, 34, 39]. Finally, Moore [25] explained a 3-year longitudinal small group course intended to foster mutual collaboration and peer instruction.

Peer instruction was another intervention identified to help with anxiety. Tutors were used in a physiology course [31] and another study described using second-year medical students as small group facilitators [32]. Additionally, second-year students were paired with junior students during simulations to reduce anxiety about performing skills [27]. Manning-Geist et al. [40] reported hosting panel discussions with upper-level students to demystify key transition points along the medical education journey.

In another information-based approach, one study provided students with predictive scores, along with practice questions, before they took licensure examinations [41]. This approach was similar to that of Dogairiu [33], in which students received systematic desensitization treatments coupled with study skills sessions to reduce anxiety.

Only one study reported supplementation of some type. Students were asked to record food intake, sleep quality, and physical activity. The experimental group was given fish oil supplements to help with their anxiety [42].

The final articles did not specifically identify interventions, but rather administered self-report questionnaires and compared the questionnaire findings with examination results [30, 32]. One described the conversion from a tiered grading system to a pass-fail framework and its self-reported impact on students’ anxiety [43].

Impact of Interventions

Many of the articles reported that their interventions had the desired impact on levels of anxiety. Eleven of the articles indicated that as a result of their respective interventions, students self-reported feeling less anxious [24, 26, 27, 28, 29, 31, 33, 35, 39, 42, 43]. One study noted that immediate anxiety levels reportedly dropped, but when re-evaluated later, little long-term impact was identified [27]. Another study measuring hormone levels found levels to be stable until the end of the semester when examinations were looming, at which time the intervention did not help [36]. Dyrbye et al. [38] and Vontver et al. [34] did not report a significant change in anxiety.

Only five articles reported on examination performance. None of the interventions was associated with improved examination performance [25, 28, 29, 32, 43].

Discussion

The results of this study call into question whether current research efforts are addressing the right issue. If addressing test anxiety does not generate examination performance improvement, is there a mediating factor we are not accounting for and should prioritize in future research? In this literature review exploring test anxiety intervention outcomes in US medical schools, a variety of tools were used to measure test anxiety. To mitigate these feelings, self-help and wellness interventions were described most frequently [16, 24, 26, 28, 38]. The most significant finding of this literature review was the small number of studies reporting examination performance post-intervention [25, 28, 29, 35, 43], none of which found improved performance.

Several important themes emerged from this literature review regarding ongoing research on medical student test anxiety. First, the subscales and methods used to evaluate test anxiety are not uniform and range from standardized tools to locally developed questionnaires. This variance makes it difficult to compare intervention impact across studies or to generalize findings. Several studies have analyzed the convergence and divergence of results between various anxiety scales, finding that the Trait scale of the STAI does not measure pure anxiety, but instead provides a score that is influenced by depressive or negative affect symptoms [44, 45]. Bieling et al. [45] noted that the State scale of the STAI exhibited more specificity for anxiety than the Trait scale. In 2010, Bados et al. [46] further supported a revision of the STAI-T scale due to its correlation with scales of depression, offering an alternative scale, the State-Trait Inventory for Cognitive and Somatic Anxiety (STICSA), which correlates more strongly with anxiety scales and weakly with depression scales. One is left to wonder whether interventions delineate which type of anxiety, state or trait, is being addressed. Future studies should be judicious in selecting anxiety subscales and take care not to conflate anxiety with depression through imprecise measurement.

Participant demographics have largely been neglected in the published literature. It is known that the prevalence of depression and anxiety differs among medical student populations based on race and gender [47]. This difference is also associated with a difference in test outcomes [48, 49]. Leiner et al. [48] reported higher general test anxiety among women, with a related drop in overall examination performance. Milam et al. [49] identified a relationship between discrimination and mental health symptoms among Black medical students, finding symptoms were alleviated with specific interventions such as addressing discrimination and increasing students’ sense of connectedness.

The most striking finding was how few studies reported examination performance; among those that did, none showed significantly improved scores. It is likely that, although anxiety may improve following an intervention, without a change in academic performance the cycle stagnates, with a concordant lack of long-term improvement in anxiety, performance, or motivation. Previous studies have reported examination performance improvements for test-anxious students [16, 50, 51]. However, we found five studies within the medical education field that showed no improvement in examination performance with mitigation of test anxiety. If the goal of mitigating test anxiety is to improve examination performance, theoretical models that focus on that goal should be considered. For instance, the control-value theory [52, 53, 54] led to the development of the Achievement Emotions Questionnaire [55]. Studies exploring achievement emotions have identified other emotions that are more strongly associated with examination performance than anxiety; specifically, pride in test-taking was a better predictor of examination performance [53, 56, 57]. These findings suggest anxiety may be a symptom and not a cause of poor performance. Future studies should explore students’ examination preparation more deeply rather than focusing solely on anxiety mitigation.

This review has focused on US medical student education. Given the impact on career potential for US medical students if they fail high-stakes examinations such as USMLE Step 1, we chose to limit our study to US medical education. Though the purpose of this study was to examine interventions in medical education, expanding the analysis to other health professional fields may yield interventions useful to all health professions learners given the shared experience of test anxiety in graduate programs. The authors also recognize student well-being is an important element of productive learning; however, incorporating this aspect of medical student psychology was outside the scope of this review. The search was restricted to the period from 2010 to 2021 in large part because of the evolving pressures of medical education and the differences in the diversity of students from previous decades. The authors recognize that this review may have omitted studies that addressed this topic. It is important to note, however, that this limitation does not affect our suggestions for future study given the results of our review of the literature from 2010 to 2021.

Conclusions

This review further reinforces the need to broaden the lens through which the medical education field views test anxiety mitigation. While important for medical student well-being, mitigating test anxiety is not enough if institutions and their students hope to improve academic performance. Ongoing research efforts should consider the underlying cause of test anxiety while strengthening students’ preparation for examinations to enhance their confidence and motivation.