Introduction

Physicians in training require skills and attitudes beyond medical knowledge in order to mature into successful clinical practitioners. Excellent interpersonal and communication skills, appropriate professional behaviors and attitudes, the ability to identify and implement learning strategies needed to improve performance, and the ability to work effectively within teams are all prerequisites for effective physicians in the twenty-first century [1]. Ideally, students should begin to develop these attributes early in their medical training, and pre-clerkship faculty members have the opportunity to play a critical role in this process. However, because of the heavy focus on medical knowledge during the pre-clerkship phase, programs often struggle to design experiences and assessments that capture student behaviors and offer coaching to help prepare them for all aspects of clinical work [2].

As part of a larger curricular revision in the Vanderbilt University School of Medicine, we wanted to re-design the pre-clerkship curriculum to address the above challenges and to better equip our students to become physicians [3]. To this end, two critical features were incorporated into the pre-clerkship, or Foundations of Medical Knowledge (FMK), phase. First, to promote and enable observation of diverse facets of student performance that previously were under-emphasized during early stages in training and to provide settings that allowed for faculty- and peer-based assessments, we incorporated regular small group team-based active learning sessions and patient-focused activities across the phase. Second, we transitioned from grading pre-clerkship performance primarily on knowledge-based examinations to evaluating students across four of the Accreditation Council for Graduate Medical Education (ACGME) core competency domains: medical knowledge, practice-based learning and improvement, systems-based practice, and professionalism [4].

It is important to emphasize that the purpose of competency-based education is not to ensure that students attain “minimal competence” in a given domain. Rather, it is used to establish a trajectory that trainees can follow to achieve excellence across multiple domains of performance [5–8]. The competency-based approach also emphasizes the need for life-long learning and continual development throughout one’s medical career.

To support the transition to competency-based assessment, the FMK phase leadership team, composed of a cell biologist/anatomist (PhD), a biochemist (PhD), an anatomical pathologist (MD, PhD), and a pediatric neurogeneticist (MD), devised a grading scheme for all of the pre-clerkship science blocks that included standardized rubrics with qualitative milestone assessments in each of the competency domains and quantitative assessments in medical knowledge [9]. In addition, a digital platform was developed to collect and review individual, aggregate, and longitudinal views of student performance in each domain [10]. We are currently in the third year of our new pre-clerkship phase. Over this time, we have evolved from a theoretical goal of holistic student assessment to the implementation of a practical evaluation strategy, which is described below.

Pre-clerkship Phase

The FMK phase begins with a one-week block that introduces students to the medical profession followed by a series of five highly integrated foundational science blocks. The first two science blocks, Human Blueprint and Architecture (proteins/enzymology, nucleic acid processes, signal transduction, metabolism, basic cell biology, genetics, and an introduction to histology, anatomy, pathology, and pharmacology), and Microbes and Immunity (bacteriology, virology, immunology), are each 6 weeks in length. The next three organ-based blocks, Homeostasis (cardiology, hematology, renal, pulmonology); Endocrine, Digestion, and Reproduction (endocrinology, gastrointestinal, genitourinary); and Brain, Behavior, and Movement (musculoskeletal and head anatomy, neuroanatomy, neurology, psychology), are each 12 weeks in length. All of the foundational science blocks are built around a similar weekly template of learning activities in order to implement consistent assessment events across the FMK phase. Running concurrently with the science blocks are three longitudinal courses: Physical Diagnosis, Foundations of Healthcare Delivery, and Learning Communities/Research. To the extent possible, the longitudinal blocks were designed and scheduled to align and integrate with the materials being taught in the science blocks throughout the FMK phase. As just three specific examples, the genetics content of Human Blueprint and Architecture was matched with an ethics section in the Learning Communities, and students in Physical Diagnosis learned to perform chest examinations and listen to heart sounds during the cardiovascular unit of Homeostasis and learned pediatric and adult neurological examinations during the Brain, Behavior, and Movement block.

Unless stated otherwise, the assessment strategies discussed below were designed specifically for the science blocks.

Competency-Based Assessment

Pre-clerkship faculty members educate trainees at a critical and formative stage in their progression from students to physicians and can profoundly influence the formation of professional identities [11]. However, because assessments in pre-clerkship curricula historically have focused almost exclusively on medical knowledge, faculty contributions to student development often have been limited. Only in cases of egregious unprofessional behavior were “other considerations” taken into account when assigning grades or providing feedback.

Fortunately, the advent of modern curricula, with more active and group learning formats, affords an opportunity for pre-clerkship science faculty to observe and foster a broader diversity of behaviors. Consequently, these faculty members are in a better position to provide early-stage students with rich holistic feedback on many specific aspects of their performance (Table 1) that is designed to stimulate their overall professional growth and development. In order to sustain the continuity of student evaluation across the curriculum, we used four ACGME competency domains (medical knowledge, systems-based practice, practice-based learning and improvement, and professionalism) as a framework for assessing learner performance. At this early stage of training, we framed the systems-based practice domain around each student’s role in the learning (rather than care delivery) system. Because students were engaged in learning teams, interpersonal, communication, and teamwork skills were emphasized under this domain. Similarly, the practice-based learning and improvement domain was presented to students as the need to monitor learning outcomes to guide both group and personalized learning plans.

Table 1 Competencies used to assess pre-clerkship students

Competencies represent trainable attributes of an individual that must be developed in order to successfully perform professional duties and are grouped within domains encompassing similar skills [12]. Trainee performance in each of the competency domains was evaluated using qualitative milestone assessments, which describe the typical developmental pathway for a given competency [13, 14]. Students were rated across 18 specific competencies associated with the four domains described above (Table 1). Assessors were provided with digital milestone forms that included six observable behavioral anchors for each competency that described escalating levels of performance from unacceptable to aspirational. Examples of the milestones used for two of the individual competencies are shown in Table 2. The behaviors described under “entry” reflect the minimal standards expected of students at the beginning of the FMK phase. Over the course of the phase, minimally acceptable levels of performance were raised to parallel the maturation of the students. Identical milestone descriptors were utilized across the entire four-year undergraduate medical curriculum. Thus, “aspirational” behavior took on a different context depending upon the particular learning environment. Whereas aspirational behavior for a senior student was intended to model expectations of a mid-level intern, students in the FMK phase were still able to display aspirational behaviors in a less-complex non-workplace environment.

Table 2 Milestones used to assess two individual competencies

In addition to filling out the milestones, both peer and faculty assessors were encouraged to provide specific comments for each of the four competency domains and were required to answer the following two global questions:

  • What is at least one valuable contribution this person has made to your team?

  • What is at least one important thing this person could have done to more effectively contribute to your team?

The goal of the assessment program was to collect numerous “low-stake” data points for every student, which when compiled, could provide a personal “developmental growth chart” for each of the competency domains [15]. To accrue richer information, a variety of educational settings were used to observe student behavior.

First, every science block included 6 hours per week of case-based learning sessions. Case-based learning groups included seven to eight students and a faculty facilitator and utilized an inquiry-based approach to dissect the biosciences that underlie a clinical case. Groups were student-run and utilized a format similar to that of problem-based learning sessions [16]. Case-based learning groups were formed initially to distribute students with various scientific and non-scientific backgrounds [17] and were re-formed every 12 weeks in a manner that ensured that all members of the group (including the facilitator) were new to one another. Groups were re-formed to help prepare students for the multiple team-based activities that they encounter in the clinical workplace and to ensure that students were able to interact intellectually with a range of peers and faculty.

Second, most of the science blocks incorporated team-based learning sessions into the learning structure [18, 19]. When team-based learning sessions were employed, groups had the same composition of students as was used in the concurrent case-based learning groups.

Third, all of the science blocks included dissection teams of four to five students in the gross anatomy laboratory. These groups were distinct from the case-based and team-based learning groups described above and, for the same reasons, were re-formed three times during the FMK phase.

Fourth, block directors retained the ability to fill out a milestone-based assessment for any student.

Facilitators were explicitly trained in performing milestone-based assessments in a series of workshop and development sessions run by Vanderbilt and external faculty. Facilitators also were provided forums to share experiences and offer feedback on the milestone language in an effort to standardize the review process and reduce rater variability. Furthermore, the FMK leadership team and assessment office reviewed facilitator comments and assessments to ensure that all groups were evaluated fairly and received appropriate levels of feedback.

Students also attended workshop and development sessions on peer-assessment run by Vanderbilt faculty and administration. In addition, initial self-assessment exercises were used to help students understand the intent and use of the milestones.

Students were assessed using the milestones at regular intervals across the FMK phase [20]. A schematic of the type (self-, peer-, or faculty-based) and frequency of assessments is shown in Fig. 1. All of the science blocks incorporated mid- and end-of-block assessments that included conferences with faculty facilitators. The mid-block conferences were used to provide students with feedback regarding their strengths and weaknesses and to offer coaching aimed at correcting undesired behaviors prior to the end-of-block assessment. As one example of the holistic feedback provided, a student who was not contributing sufficiently to team efforts might be described by the milestone in systems-based practice: “requires reminders from team or supervisor to complete responsibilities or to participate.”

Fig. 1

Frequency and type of competency-based milestone assessments used for the foundational science blocks of the pre-clerkship Foundations of Medical Knowledge phase. The phase comprises five science blocks: Human Blueprint and Architecture (HBA) and Microbes and Immunity (M&I), which are each 6 weeks in length, and Homeostasis; Endocrine, Digestion, and Reproduction (EDR); and Brain, Behavior, and Movement (BBM), which are each 12 weeks in length. Arrows pointing up represent milestone assessments by students (red: self; green: peer). The red asterisk indicates that the self-assessment was part of the students’ reviews with their portfolio coaches. Students receive assessments from a minimum of two of their peers. Arrows pointing down represent assessments by faculty facilitators (orange: comments and conferences; blue: milestone assessments, comments, and conferences). Three dissection team faculty conferences that included milestone assessments were held during the Homeostasis and EDR blocks.

All of the milestone information was compiled and made available to the block directors, the FMK phase leadership team, and members of the administration. In addition to the designated conferences, targeted interventions designed to address inappropriate student behaviors were performed as needed.

In each of the group types, except for the dissection teams, students assessed (and were assessed by) two of their peers and were given the option to assess more peers if desired. Given the small size of the dissection teams, students assessed all of their peers in the group. Finally, periodic faculty conferences were held with these teams to review dissections and apply material learned to clinical scenarios, and the faculty used milestones to assess student performance.

Quantitative Assessment for the Medical Knowledge Competency Domain

In addition to the qualitative assessment approach described above, quantitative assessment metrics were applied to the medical knowledge competency domain. Once again, a consistent strategy was applied across all five science blocks that included weekly low-stake assessments in conjunction with end-of-block examinations. Students completed weekly multiple-choice quizzes (graded electronically) and on-line essay assignments (graded by students’ case-based learning faculty facilitators) to provide practice for end-of-block examinations, to help develop rigorous study habits, and to identify individuals who were struggling with learning the material. End-of-block assessments included integrated on-line essay and short-answer questions (graded by block faculty) that allowed students to demonstrate a deeper understanding of foundational concepts and apply them to clinical scenarios. Assessments also included practical examinations that tested laboratory-based content and customized National Board of Medical Examiners examinations designed to test knowledge and provide practice for Step 1 board examinations.

In general, weekly assignments (which in some blocks also included individual and group readiness assessment quizzes in team-based learning sessions) constituted 20–25 % of the medical knowledge quantitative score and the National Board of Medical Examiners examinations contributed an additional 25 %. The remaining percentage was distributed across the essay, short-answer, and practical examinations in a manner consistent with the topics taught in the block. Passing scores for the quantitative assessment of medical knowledge (based on a total of 100 %) were set and validated using a bootstrap statistical analysis.
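The weighting described above amounts to a weighted average on a 100 % scale. The following is a minimal sketch of that arithmetic, not the scoring code used in the program. The function name, the component names, and the specific split of the remaining 50 % across essay, short-answer, and practical examinations are hypothetical; the text states only that weekly work contributed 20–25 % and the National Board of Medical Examiners examination 25 %.

```python
def medical_knowledge_score(components, weights):
    """Return a weighted composite on a 0-100 scale.

    components: component name -> percent score (0-100)
    weights:    component name -> fractional weight (must sum to 1)
    """
    if set(components) != set(weights):
        raise ValueError("components and weights must have the same keys")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(components[name] * weights[name] for name in weights)


# Hypothetical block: weekly work 25 %, NBME examination 25 %, and an
# assumed split of the remaining 50 % (the real split varied by block).
example_weights = {
    "weekly": 0.25,
    "nbme": 0.25,
    "essay": 0.20,
    "short_answer": 0.15,
    "practical": 0.15,
}
# Hypothetical scores for one student.
example_scores = {
    "weekly": 90.0,
    "nbme": 80.0,
    "essay": 85.0,
    "short_answer": 70.0,
    "practical": 100.0,
}

composite = medical_knowledge_score(example_scores, example_weights)
print(f"Composite medical knowledge score: {composite:.1f} %")
```

The composite would then be compared with the passing score for the block; the bootstrap procedure used to set that score is not described here and is not modeled in this sketch.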

Integration of Qualitative and Quantitative Assessments, Assigning Grades, and Remediation

At the end of each block, qualitative and quantitative assessments were reviewed by the block directors in consultation with the FMK leadership team. The leadership team oversaw the use of qualitative and quantitative metrics from block to block in order to provide continuity in grading standards across the pre-clerkship phase. The team also monitored student progression to watch for trends in behaviors.

In order to assign end-of-block grades, milestone and quantitative data for each competency were analyzed and converted to a scale of sub-threshold, threshold, and target for each competency domain. Greater weight was assigned to milestone data when there was concordance between scores provided by peer and faculty assessors and when scores and written comments agreed. Milestone scores rated as “unacceptable” in any domain or quantitative scores in medical knowledge that fell below the passing cutoff mark were equated to “sub-threshold.” Milestone ratings below expected developmental norms, but still in the acceptable range, were equated to “threshold.” Threshold scores in medical knowledge also could be assigned for quantitative assessments that were in the marginal, but passing, range. Milestone scores in or above the expected developmental range were deemed “target.”
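The conversion rules in this paragraph can be sketched in code. This is an illustrative simplification, not the procedure used by the block directors. It assumes the six milestone anchors are numbered 1 to 6 with 1 as "unacceptable", that the expected developmental level is supplied as a number, and that the medical knowledge domain takes the lower of its milestone and quantitative ratings. The cutoff and marginal band values are hypothetical parameters, and the extra weight given to concordant peer, faculty, and written assessments is not modeled.

```python
from enum import IntEnum
from typing import Optional


class Rating(IntEnum):
    SUB_THRESHOLD = 0
    THRESHOLD = 1
    TARGET = 2


UNACCEPTABLE = 1  # assumption: anchors are numbered 1-6, lowest first


def rate_milestone(level: int, expected_level: int) -> Rating:
    """Map a milestone anchor to the three-point scale."""
    if not 1 <= level <= 6:
        raise ValueError("milestone level must be between 1 and 6")
    if level == UNACCEPTABLE:
        return Rating.SUB_THRESHOLD
    if level < expected_level:
        return Rating.THRESHOLD  # acceptable, but below developmental norms
    return Rating.TARGET


def rate_quantitative(score: float, passing_cutoff: float,
                      marginal_band: float) -> Rating:
    """Map a 0-100 medical knowledge score to the three-point scale."""
    if score < passing_cutoff:
        return Rating.SUB_THRESHOLD
    if score < passing_cutoff + marginal_band:
        return Rating.THRESHOLD  # passing, but in the marginal range
    return Rating.TARGET


def rate_domain(milestone_level: int, expected_level: int,
                quantitative: Optional[float] = None,
                passing_cutoff: Optional[float] = None,
                marginal_band: float = 0.0) -> Rating:
    """Rate one domain; only medical knowledge has a quantitative score."""
    rating = rate_milestone(milestone_level, expected_level)
    if quantitative is not None:
        if passing_cutoff is None:
            raise ValueError("passing_cutoff is required with a quantitative score")
        # Assumption: the lower of the two ratings determines the domain rating.
        rating = min(rating, rate_quantitative(quantitative, passing_cutoff,
                                               marginal_band))
    return rating


# Hypothetical student: milestone at the expected level, examination score
# passing but within an assumed 5-point marginal band above an assumed cutoff.
print(rate_domain(3, 3, quantitative=72.0, passing_cutoff=70.0,
                  marginal_band=5.0).name)
```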

Given the importance of the four competency domains, we decided against assigning weights to individual domains that could be summed into a final score for the block. We felt strongly that high passing marks in three domains (for example) should not be able to compensate for a deficiency in a fourth domain. Consequently, in the final grading rubric, each competency domain was considered equal and students had to achieve a target score in each domain in order to pass the block.

With the above in mind, the following rubric for determining final block grades was devised: All blocks were graded on a pass/fail basis. Students who received target scores in all domains passed the block. Students who received a threshold score in one competency domain generally received a passing grade, but had to set learning goals in that domain. These learning goals were established and monitored in conjunction with designated portfolio coaches. Students who received one sub-threshold score in any domain or multiple threshold scores across domains generally failed the block, or in some cases (see below), were placed in a targeted longitudinal coaching and monitoring program.
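As a rough illustration, the rubric above can be written as a small decision function. This is a sketch under stated assumptions rather than the program's actual procedure: the function name, domain keys, and outcome labels are hypothetical, and the word "generally" in the rubric signals faculty judgment that a fixed rule cannot capture.

```python
from collections import Counter

VALID_RATINGS = {"sub-threshold", "threshold", "target"}


def block_outcome(domain_ratings):
    """Return a pass/fail outcome from per-domain ratings.

    domain_ratings: domain name -> "sub-threshold", "threshold", or "target"
    """
    unknown = set(domain_ratings.values()) - VALID_RATINGS
    if unknown:
        raise ValueError(f"unknown ratings: {sorted(unknown)}")

    counts = Counter(domain_ratings.values())
    # No weighted sum is used, so strength in one domain
    # cannot offset a deficiency in another.
    if counts["sub-threshold"] >= 1 or counts["threshold"] >= 2:
        return "fail or longitudinal coaching"
    if counts["threshold"] == 1:
        return "pass with learning goals"
    return "pass"


# Hypothetical student with a single threshold rating.
ratings = {
    "medical knowledge": "target",
    "practice-based learning and improvement": "target",
    "systems-based practice": "threshold",
    "professionalism": "target",
}
print(block_outcome(ratings))
```

Counting ratings instead of summing weighted scores reflects the stated design choice that every domain is considered equal.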

Deficiencies in the medical knowledge domain required that the student retake the block or (at a minimum) pass a remediation exam. In contrast to the medical knowledge domain, behaviors associated with the other three domains were related to personal development rather than the science content specific to the block. Therefore, deficiencies in practice-based learning and improvement, systems-based practice, and professionalism were remediated in a longitudinal manner across the remainder of the phase by additional coaching and by monitoring performance on peer and faculty milestone assessments.

For the vast majority of students, the milestone framework provided detailed information about relative strengths and weaknesses and was used to support optimal performance (as opposed to pass/fail decisions). Thus far, fewer than 10 % of the students in each class have failed to meet standards in at least one competency domain and have been required to complete some level of targeted remediation. Most of these students received threshold (as opposed to sub-threshold) scores in a single competency domain and were obligated to set appropriate learning goals in the domain of concern. To date, all students who were required to set learning goals, pass a remediation examination, or retake one or more blocks have completed their remediation successfully.

Challenges

Converting to a competency-based milestone assessment program comes with a number of challenges [21]. These challenges can be overcome, but the process requires trust, significant faculty and student development, and a strong collaborative effort between all of the major stakeholders.

First, development of assessors (both faculty and students) is essential [22]. Assessors need to be apprised as to how to use the milestones properly and how to apply them in a consistent manner. In this regard, it is highly desirable to have a cadre of experienced and well-trained faculty assessors [22].

Second, students have to “buy in” to the program and understand the purpose of the milestones. Medical students are high achievers who are used to successful outcomes. Any comments by assessors that indicate relative weaknesses in performance may be rejected or viewed as threatening [23]. In either case, if students view comments only as summative, they can be detrimental to progression. It needs to be strongly communicated to students that in the vast majority of cases, milestones provide the basis for a coaching strategy designed to make them better and more complete physicians. As opposed to simply receiving a grade of “pass,” milestones allow students to identify their current status across multiple domains and represent a holistic roadmap for achieving higher levels of performance and enhancing professional growth [24].

Third, assessors also have to “buy in” to the program. Initially, faculty can be reluctant to provide students with “realistic” assessments for fear that they will harm trainee morale or generate information that is damaging on student records. Once again, faculty members need to understand that for most students, milestones are less about assessment and more about coaching.

Fourth, block directors who are used to grading solely on the basis of medical knowledge may be reluctant to view qualitative milestone data as being on par with quantitative examination scores. The notion that qualitative data are not “real” needs to be dispelled [25]. It is helpful to have a group that oversees the pre-clerkship phase and can help block directors convert from the idea that quantitative scores in medical knowledge are the only standard by which students should be judged. If this conversion is to take place, it is critical that the milestone data be consistently generated and applied.

Fifth, having a digital platform for student quizzes, essays, and examinations greatly enhances the ability of faculty to grade these aspects of the blocks. Furthermore, having an appropriate digital platform to record milestone assessments and help accumulate and track milestone data is extremely useful and decreases rater fatigue [10].

Successes

The switch to competency-based assessment has provided a number of benefits for our learners as well as our faculty.

First, just as assessment drives learning, it also can drive the development of students’ attitudes and behaviors [26]. Competencies afford students a mechanism to chart the trajectory of their professional development and provide an organized pathway for feedback [27]. Initially, our students are hesitant to receive and provide feedback. However, by the end of the pre-clerkship phase, they expect, appreciate, and even crave this feedback as part of their coaching to become better physicians. As a result, we have observed that our students acquire professional skills and attitudes considerably earlier in their training than they did in our previous curricula. Moreover, students who have moved on to the later phases of our curriculum are now actively soliciting feedback from clinical faculty.

Second, the assessment strategy has allowed us to identify competency challenges for individual pre-clerkship students that traditionally would have gone undetected at this early stage of training. Whereas feedback to students during our previous curriculum was heavily skewed toward deficiencies in medical knowledge, issues in communication, professionalism, and systems-based practice are now recognized. This has created opportunities for coaching and remediation before students enter the clinical workplace. To this point, since transitioning to competency-based assessment, approximately two thirds of the personal learning goals set by our students are now distributed across non-knowledge-based domains.

Third, on the basis of two years of aggregate data, milestones have proven to be a rich and accurate framework for describing performance and predicting student outcomes in the clerkship phase of the curriculum. Our validation process has given us confidence in using this information to make decisions regarding student progress and promotion.

Finally, our pre-clerkship faculty has always had a vested interest in nurturing student growth and development but has struggled to find a mechanism to formalize “anecdotal” observations of behaviors outside the realm of medical knowledge. The use of milestones has provided us with a scaffold to assess a much fuller range of student behaviors. As a result, we now are able to make substantial contributions to the professional growth of our students in ways that were not possible in previous curricular models. This has enriched relationships between students and educators and has reinforced the essential role of pre-clerkship faculty in the continuum of undergraduate medical education.