Introduction

Medical schools strive to admit a diverse population of students through holistic admission processes that consider life experiences and interpersonal attributes, in addition to measures of academic aptitude (i.e., undergraduate grade point average [GPA] and Medical College Admission Test [MCAT] scores). One of the challenges associated with this approach is that students’ backgrounds vary in how well they prepare them for the academic rigor and demands of the preclinical years. Consequently, the relationship between pre-matriculation variables and academic performance in the preclinical years and on the United States Medical Licensing Examination (USMLE) Step 1 has been studied extensively, and many institutions have developed prediction models to guide academic support for students identified as being in need [1,2,3,4,5,6]. However, there is a lack of predictive models that incorporate multiple elements of both academic and behavioral components of student learning.

Performance on the USMLE Step 1 examination is an important milestone for medical students in their preclinical years. USMLE Step 1 performance was cited as an important factor across all specialties for selecting candidates for interviews and for ranking applicants for the match [7] (National Resident Matching Program, Results of the 2016 NRMP Program Director Survey). In the same survey, 30% of the specialty programs indicated that they never consider applicants who fail USMLE Step 1 on the first attempt. The financial consequences of not matching to a program include increased student debt, while not matching to a program of choice may lead to career dissatisfaction.

To predict performance on the USMLE Step 1 examination, we developed three prediction models that assess the progress of medical students during preclinical years. A number of instruments have been developed to monitor progress in student learning [8,9,10] and to identify behavioral attributes correlated with academic success (e.g., Learning and Study Strategies Inventory [LASSI], [11]; Self-directed Learning Readiness Scale [SDLRS], [12]). However, there have been very few predictive models that incorporate multiple elements of both academic and behavioral components of student learning [13].

The LASSI is a 10-scale assessment of students’ awareness of the skill (information processing, selecting main idea, and test strategies), will (attitude, motivation, and anxiety), and self-regulation (concentration, time management, self-testing, and study aids) components of strategic learning. The ten LASSI subscales have been found to be associated with academic performance [13,14,15,16,17]. Few studies have examined the relationship between study strategies and medical students’ performance in internal and external examinations [13, 16]. In one study, the subscale Concentration was the only study strategy found to predict USMLE Step 1 performance [13]. The LASSI subscales Concentration, Anxiety, Selecting Main Idea, and Test Strategies were found to be significant predictors of performance in National Board of Chiropractic Examiners (NBCE) assessments [15]. Time Management and Self-Testing were observed to be strong predictors of medical students’ performance in their first semester [17].

The present study developed and validated multiple prediction models in the preclinical years that include both behavioral attributes (LASSI subscale scores) and performance measures (pre-matriculation GPA and MCAT, internal summative examinations, and National Board of Medical Examiners [NBME] Comprehensive Basic Sciences Examinations [CBSE]) for early identification of students at risk for failure in the institutional program and on the USMLE Step 1 examination. The validated models enable medical schools to predict student performance on USMLE Step 1, develop and implement targeted interventions to increase the rate at which students pass USMLE Step 1, and assist students in achieving a targeted score that increases their competitiveness for their chosen specialty and residency program.

Methods

Participants

Development of the predictive models at the University of South Carolina School of Medicine Greenville (USCSOMG) included a sample of 180 medical students, comprising the graduating classes of 2016 (n = 52), 2017 (n = 53), and 2018 (n = 75) in their preclinical years. Participants were 56% female and 44% male and ranged in age from 21 to 39 years, with an average age of 23 years. Participants’ average MCAT score was in the 67th percentile, and their average undergraduate GPA was 3.65 on a 4-point scale. Their average USMLE Step 1 score was 225. Because USCSOMG is a new medical school, class size increased from approximately 50 students in each of the first 2 years to about 75 students in the third year, and thereafter reaches the maximum of 100 students for subsequent classes.

Design

The study includes the development and analysis of three predictive models, each developed and tested at a different stage of the student’s progression through the preclinical years of the medical school curriculum. This three-model design provides an opportunity for early identification of students at risk of failing the USMLE Step 1 examination. Student performance on USMLE Step 1 is the dependent variable, and the predictive value of combinations of behavioral measures, internal measures of student academic performance, and external measures of student academic performance was assessed using multivariate analyses. Three predictive models were proposed utilizing a combination of different independent variables:

  • Pre-matriculation model: MCAT scores, and undergraduate subject and overall GPA

  • End of M1 model: LASSI subscale scores, Weighted M1 Biomedical Sciences Performance, and Score on End M1 NBME CBSE progress tests

  • End of M2 model: LASSI subscale scores, Weighted M1 Biomedical Sciences Performance, Weighted M2 Biomedical Sciences Performance, and Scores on end of M1 and end of M2 NBME CBSE progress tests
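The staged design above can be illustrated with a small lookup that picks the latest model a given student record can support. This is only a sketch: the column names (`mcat`, `gpa_subject`, `m1_weighted`, and so on) are hypothetical labels for the candidate predictors listed above, not the study's variable names, and `lassi` stands in for the full set of subscale scores.

```python
# Candidate predictors for each model, following the three bullets above.
# All keys and column names are hypothetical labels chosen for illustration.
MODEL_PREDICTORS = {
    "pre_matriculation": ["mcat", "gpa_subject", "gpa_overall"],
    "end_of_m1": ["lassi", "m1_weighted", "cbse_end_m1"],
    "end_of_m2": ["lassi", "m1_weighted", "m2_weighted",
                  "cbse_end_m1", "cbse_end_m2"],
}

# Latest stage first, so the most informed usable model is chosen.
STAGE_ORDER = ["end_of_m2", "end_of_m1", "pre_matriculation"]


def latest_usable_model(record):
    """Return the name of the latest-stage model whose predictors are all
    present (not None) in `record`, or None if no model can be applied."""
    for name in STAGE_ORDER:
        if all(record.get(col) is not None for col in MODEL_PREDICTORS[name]):
            return name
    return None


# Example: a student who has finished M1 but not M2 (made-up values).
student = {
    "mcat": 30, "gpa_subject": 3.5, "gpa_overall": 3.6,
    "lassi": {"anxiety": 25, "test_strategies": 30},
    "m1_weighted": 82.0, "cbse_end_m1": 55,
}
print(latest_usable_model(student))
```

A record with only pre-matriculation data falls back to the first model, which mirrors the idea that predictions are refined as more information about the student becomes available.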

Data Collection

The behavioral measure comprises the scores on the ten subscales of the LASSI [11]. The LASSI (Version Two) is an 80-item inventory that contains 8 items for each of the ten LASSI subscales. Participants answer each item on a 5-point Likert scale: 1 = not at all typical of me, 2 = not very typical of me, 3 = somewhat typical of me, 4 = fairly typical of me, and 5 = very much typical of me. The LASSI instrument shows good reliability (Cronbach’s alpha of 0.73–0.89) and demonstrates good validity [11]. The LASSI instrument was administered to the students during orientation at the beginning of their second (M2) year.

Performance measures include internal summative examinations administered during the first 2 years of the medical school curriculum, covering five modules during the M1 year and seven modules during the M2 year. The summative examination questions were vignette-style, multiple-choice questions. Results from the 12 modules and the overall weighted biomedical sciences performance at the end of the M1 and M2 years were obtained. Students’ performance in biomedical sciences modules was weighted based on the duration of the module relative to the duration of the academic year, and these weighted values were then averaged. The external performance measures comprise the scores on four NBME CBSE progress tests administered at the beginning of M1, end of M1, midpoint of M2, and end of M2 academic years, in addition to student performance on the USMLE Step 1 examination.
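One plausible reading of the duration weighting described above is a duration-weighted mean, in which each module's score counts in proportion to its share of the total module time. The sketch below assumes that reading; the function name and the example scores and durations are made up for illustration and are not taken from the study.

```python
def weighted_performance(modules):
    """Duration-weighted mean of module scores.

    `modules` is a list of (score, duration_weeks) pairs. Each score is
    weighted by its module's share of the total duration.
    """
    total_weeks = sum(weeks for _, weeks in modules)
    if total_weeks <= 0:
        raise ValueError("total module duration must be positive")
    return sum(score * weeks for score, weeks in modules) / total_weeks


# Illustrative values only: three modules lasting 10, 5, and 5 weeks.
example_modules = [(80.0, 10), (90.0, 5), (70.0, 5)]
print(weighted_performance(example_modules))  # (800 + 450 + 350) / 20 = 80.0
```

Under this reading, a long module moves the overall score more than a short one, which a simple unweighted average of module scores would not capture.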

Data Analysis

Data were analyzed using IBM Statistical Package for the Social Sciences (SPSS) software (IBM Corporation, Armonk, NY, USA). Pearson product-moment correlation coefficients and multivariate regression modeling were used to determine optimal combinations of independent variables representing behavioral attributes and student performance outcomes (i.e., ordinary least squares to optimize variation explained and standard error of the estimate). Student performance on USMLE Step 1 is the dependent variable, and the predictive value of combinations of behavioral measures, internal measures of student academic performance, and external measures of student academic performance was assessed using regression modeling.
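The quantities named above (Pearson correlation, variation explained, and standard error of the estimate from an ordinary least squares fit) can be sketched in plain Python. The study itself used SPSS; the code below is not the authors' procedure, only a minimal illustration of the same statistics, and the function names and toy data are assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares with an intercept.

    `rows` holds one list of predictor values per student. Returns
    (coefficients with intercept first, R squared, standard error of
    the estimate).
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, row)) for row in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    see = math.sqrt(ss_res / (len(y) - k))  # residual df = n - predictors - 1
    return beta, r2, see


# Toy data: with a single predictor, R squared equals the squared Pearson r.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
_, r2, see = fit_ols([[v] for v in x], y)
print(round(pearson_r(x, y) ** 2, 4), round(r2, 4))
```

Comparing candidate predictor sets then amounts to calling `fit_ols` on each combination and preferring the one with higher variation explained and lower standard error of the estimate, which is the selection criterion stated above.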

Pre-matriculation Model

The development of the pre-matriculation model included the analysis of MCAT scores, and undergraduate subject and overall GPA.

End of M1 Model

At the end of the M1 year, a second model was developed following the analysis of students’ performance on internal summative examinations, overall weighted end of M1 biomedical sciences performance, end of M1 NBME CBSE progress tests, and LASSI subscale scores.

End of M2 Model

A third model was developed at the end of the M2 year based on the analysis of the data representing students’ performance on internal summative examinations, overall weighted end of M1 and M2 biomedical sciences performance, NBME CBSE progress tests, and LASSI subscale scores.

Results

Pre-matriculation Model

Initial linear regression analysis identified predictors of performance on the USMLE Step 1 examination. The results showed a limited but statistically significant association of undergraduate GPA and MCAT scores with USMLE Step 1 examination performance. Of the individual measures, the MCAT combined score explained the most variation (19.45%) in USMLE Step 1 scores (Table 1). The pre-matriculation model (MCAT Bio + MCAT Phys + Cumulative GPA) explained 24% of the variation in USMLE Step 1 scores.

Table 1 USMLE Step 1 score variation explained (r²) by pre-matriculation academic measures

End of M1 Model

Results of linear regression analysis indicated that the LASSI subscale scores of Anxiety, Information Processing, Motivation, Selecting Main Idea, and Test Strategies are significantly associated with USMLE Step 1 scores (Table 2). However, the regression model with the greatest percentage of variation explained and lowest standard error of the estimate included only the Anxiety (r² = 9.67%) and Test Strategies (r² = 14.21%) subscales. Scores on the NBME CBSE progress tests were significantly associated with USMLE Step 1 scores, and the variation explained increased by the end of the M1 year (Table 3). Therefore, only the end of M1 NBME CBSE scores (r² = 47.47%) were included in the multiple regression analysis for this model. Analysis of the performance on internal summative examinations and overall M1 biomedical sciences is summarized in Table 4. The overall M1 biomedical sciences score (r² = 43.56%) and MCAT combined score (r² = 19.45%) were also included in the model regression analysis. This end of M1 model [MCAT Combined + End M1 NBME CBSE + Overall M1 Biomedical Sciences + LASSI (Anxiety + Test Strategies)] explained 62% of the variation in USMLE Step 1 scores.

Table 2 USMLE Step 1 score variation explained (r²) by LASSI 10-scale scores (2016, 2017, and 2018)
Table 3 USMLE Step 1 score variation explained (r²) by NBME Comprehensive Basic Science Exam (CBSE)
Table 4 USMLE Step 1 score variation explained (r²) by Pre-clinical Biomedical Sciences modules

End of M2 Model

The end of M2 model retained the predictors of the end of M1 model, including the overall M1 biomedical sciences score (r² = 43.56%) and MCAT combined score (r² = 19.45%), and added the overall M2 biomedical sciences score (r² = 51.41%) and the end of M2 NBME CBSE score (r² = 68.06%) [MCAT Combined + End M1 NBME CBSE + End M2 NBME CBSE + Overall M1 Biomedical Sciences + Overall M2 Biomedical Sciences + LASSI (Anxiety + Test Strategies)]. The end of M2 model explained 81% of the variation in USMLE Step 1 scores.

To summarize, the pre-matriculation model explained 24%, the end of M1 model explained 62%, and the end of M2 model explained 81% of the variation in USMLE Step 1 scores (Table 5). The average difference between predicted and actual USMLE Step 1 scores for all three classes using the end of M2 model is 0.46 points. The inclusion of LASSI subscales improves the percentage of the variation explained in USMLE Step 1 scores: the end of M1 model improved from 60% to 62%, and the end of M2 model improved from 79% to 81%.
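Two of the summary figures above are simple arithmetic and can be sketched briefly: the gain in variation explained from adding LASSI subscales, and the average difference between predicted and actual scores. The text does not state whether the 0.46-point average is a signed or an absolute difference, so the sketch computes both; the function names and the example scores are hypothetical.

```python
def r2_gain(r2_without, r2_with):
    """Increase in proportion of variation explained when predictors are added."""
    return r2_with - r2_without


def mean_signed_difference(predicted, actual):
    """Average of (predicted - actual); over- and under-predictions cancel."""
    return sum(p - a for p, a in zip(predicted, actual)) / len(actual)


def mean_absolute_difference(predicted, actual):
    """Average of |predicted - actual|; measures typical miss size."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Gains reported in the text: 60% to 62% (end of M1), 79% to 81% (end of M2).
print(round(r2_gain(0.60, 0.62), 2), round(r2_gain(0.79, 0.81), 2))

# Made-up scores for four students, for illustration only.
predicted = [230, 210, 245, 220]
actual = [225, 216, 240, 223]
print(mean_signed_difference(predicted, actual))    # 0.25
print(mean_absolute_difference(predicted, actual))  # 4.75
```

The toy numbers show why the distinction matters: a signed average can sit close to zero even when individual predictions miss by several points, so the two measures answer different questions about a model's accuracy.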

Table 5 Comparison of the three predictive models

Discussion

We developed three predictive models at different stages of students’ progress through the medical school curriculum. It is apparent that the more information about the students that is known, the greater the accuracy of our predictive models. With only pre-matriculation information (e.g., GPA, MCAT), our first model explained only 24% of the variation in USMLE Step 1 scores. With the addition of students’ performance data from the M1 year, NBME CBSE at the end of M1, and their study strategies (e.g., LASSI scores), the second model improved to explain 62% of variation in USMLE Step 1 scores. Further additions of performance data from the M2 year and NBME CBSE at the end of M2 year improved the predictability of the third model to explain 81% of the variation in USMLE Step 1 scores.

Studies have shown that undergraduate GPA [2, 3, 6, 18] and MCAT scores [2, 19] are strong predictors of preclinical academic performance. However, other studies indicated weak associations between MCAT scores and academic performance [3, 4, 20]. In the present study, there was a statistically significant but limited correlation of GPA and MCAT scores with USMLE Step 1 score, which is in accord with the studies by Saguil et al. [5] and Gauer et al. [20]. Similar findings were reported by Roy et al. [4] in predicting performance on the Medical Council of Canada Qualifying Examination Part 1 (MCCQE-1). The predictive validity of MCAT for USMLE Step 1 score is in the mid-0.40s [21], and it ranges from small to medium for preclinical academic performance and USMLE Step 1 [1, 20]. Since the pre-matriculation model explains only 24% of the variation in USMLE Step 1 scores, GPA and MCAT scores should not be overemphasized as factors in admission decisions. However, students with low GPA and MCAT scores should be flagged for additional support with behavioral factors including, but not limited to, anxiety, study skills, and test-taking strategies.

Performance in the basic medical sciences in the preclinical years has also been used to predict performance on the USMLE Step 1 examination. Students’ performance on a gross anatomy comprehensive examination was found to correlate with USMLE Step 1 performance [22], and participation in an Applied Anatomy master’s program enhanced students’ performance on the USMLE Step 1 examination [23]. Pre-matriculation variables and achievement in basic science courses significantly correlate with USMLE Step 1 scores [24], and significant correlations were also observed between M1 and M2 GPAs and Step 1 performance [25]. Our results indicated that performance in individual M1 and M2 modules was significantly correlated with USMLE Step 1 scores, with variation explained ranging from r² = 19% to 37%. Overall performance in M1 (r² = 44%), M2 (r² = 51%), and M1 + M2 (r² = 57%) explained successively more of the variation in USMLE Step 1 scores.

The relationship between students’ performance on basic science examinations and USMLE Step 1 scores has been investigated [4, 26, 27]. These studies found that the NBME Comprehensive Basic Science Self-Assessment (CBSSA) explained 62–67% of the variation in USMLE Step 1 scores [26, 27] when the self-assessment was taken closest in time to the first Step 1 attempt. Our results showed that at the end of the M1 year, the NBME CBSE explained 47% of the variation in USMLE Step 1 scores, whereas at the end of the M2 year, it explained 68%. That is, how close in time the NBME CBSE is taken to the student’s first attempt at the USMLE Step 1 examination is an important factor to consider for a consistent prediction.

Similar to our findings, the LASSI subscales of Anxiety, Selecting Main Ideas, and Test Strategies were observed to significantly predict performance on the National Board of Chiropractic Examiners assessments [15] and to correlate with academic performance [28]. Additionally, the LASSI subscale Anxiety was found to negatively correlate with performance on the USMLE Step 1 examination [29, 30]. A previous study demonstrated that a more accurate and valid assessment of the LASSI subscales is achieved at the beginning of the M2 year than at the beginning of the M1 year [31]. That is, assessing students’ learning and study strategy skills is more accurate after they have experienced the learning environment, stresses, and performance requirements inherent in medical education. Therefore, early identification of students who lack the skills in the above-mentioned LASSI subscales could easily be achieved by administering the LASSI instrument at the beginning of the second half of the M1 year [31]. Consequently, early support should be available for those identified students at the midpoint of the M1 year, since developing these skills takes time.

Our prediction models that combined performance and behavioral measures predicted students’ performance on the USMLE Step 1 examination more accurately than the same models without the behavioral measures. The end of M2 model explained 81% of the variation in USMLE Step 1 scores, and the average difference between predicted and actual USMLE Step 1 scores for all three classes using the end of M2 model is 0.46 points. This information provides students with a realistic assessment of their readiness to take the USMLE Step 1 examination. Using this model before taking the USMLE Step 1 examination would give students who are predicted to perform poorly the opportunity to seek help and modify their study strategies for improved outcomes.
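One way a predicted score could be turned into a readiness flag is to combine it with the model's standard error of the estimate and ask how likely the actual score is to fall below a chosen cutoff. The sketch below is an assumption-laden illustration, not a procedure described in this study: it assumes prediction errors are roughly normal with standard deviation equal to the standard error of the estimate, and the cutoff, standard error, risk tolerance, and student scores shown are placeholders rather than values from the study or the actual passing standard.

```python
from statistics import NormalDist


def probability_below(predicted_score, see, cutoff):
    """Approximate probability that the actual score falls below `cutoff`,
    assuming normal prediction errors with standard deviation `see`."""
    if see <= 0:
        raise ValueError("standard error of the estimate must be positive")
    return NormalDist(mu=predicted_score, sigma=see).cdf(cutoff)


def flag_at_risk(predictions, see, cutoff, max_risk=0.10):
    """Return the IDs of students whose estimated probability of scoring
    below `cutoff` exceeds `max_risk`.

    `predictions` maps a student ID to a predicted score.
    """
    return [
        student_id
        for student_id, predicted in predictions.items()
        if probability_below(predicted, see, cutoff) > max_risk
    ]


# Placeholder inputs for illustration only.
predictions = {"student_a": 240.0, "student_b": 195.0, "student_c": 205.0}
print(flag_at_risk(predictions, see=10.0, cutoff=190.0, max_risk=0.10))
```

Framing the output as a probability rather than a single predicted score keeps the uncertainty of the model visible when deciding which students to approach with support.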

The prediction models are useful tools for identifying students at risk of failing the USMLE Step 1 examination. Although the end of M2 model is a better predictor of USMLE Step 1 performance, the end of M1 model is a useful tool for the early identification of, and targeted intervention for, at-risk students. Accordingly, intervention strategies should be selected to address both the knowledge and the skills each student needs in preparation for taking the USMLE Step 1 examination. Students should be advised to receive professional counseling and coaching regarding study strategies to improve Information Processing, Selecting Main Ideas, and Test Strategies skills, as well as to address their anxiety. Multiple strategies have been used to lower stress and address anxiety. The incorporation of mindfulness-based stress reduction (MBSR) intervention programs [32, 33], the use of yoga [34], and changing to a pass-fail grading system [35, 36] have all been found to improve medical student coping strategies and student satisfaction, and to reduce anxiety.

Conclusion

Three prediction models have been developed to assist in early identification of, and intervention for, students at risk for failure in the institutional academic program and on the USMLE Step 1 examination. The pre-matriculation, end of M1, and end of M2 models explain 24%, 62%, and 81% of the variation in USMLE Step 1 performance, respectively. These models are tools to assist in the continuous monitoring of student performance and support a holistic perspective of students’ progress to assist them in achieving a targeted score that increases their competitiveness for their chosen specialty and residency program.