1 Introduction

The Objective Structured Clinical Examination (OSCE) is a central component of assessing medical students' clinical skills, and because its results provide information about the competencies of the students being assessed, the process must be rigorous and accurate [1]. However, several factors interfere with OSCE assessment: inconsistent checklists and differences in how each item, including the global rating scale, is scored [2]; inequalities in how checklists and their constructs are developed [3]; the difficulty of the material tested in the OSCE [4]; and the use of simulated patients, which had a positive impact on student performance during the OSCE compared with student role-plays [5].

The OSCE blueprint plays an important role in OSCE assessment, ensuring that exam candidates are comprehensively tested for competence [6]. However, hidden patterns among examiners may influence how they conduct OSCE assessments [7]. These hidden patterns include the examiner's perception of doctor-patient communication [8], various cultural factors of the examiner [9], the contrast effect, in which the previous student becomes a benchmark for judging the next one [10], and the use of different assessment references in the OSCE [11]. Inaccuracy of OSCE results can thus stem from imprecision of the test, examiner variability, and the other psychometric properties involved (simulated patients, assessment materials, scoring guides, etc.) [12].

Video-based assessment is considered a reliable method for testing clinical skills performance. Students can also learn and prepare clinical skills with the help of examination videos, which serve as a benchmark for clinical skills competency [13]. Video-based assessment of simulated examinations has been shown to provide a valid and reliable method for testing students' clinical performance [14]. The examiner's background, related to social and psychological processes, clinical practice experience, experience in assessing the OSCE, and examiner-candidate gender match, plays a major role in assessment inaccuracy even when the OSCE is administered under the most standard conditions [11, 12]. However, studies of this aspect have reported differing results owing to various sources of bias. In this study, the examiner background characteristics considered were gender, education, clinical practice experience and its duration, OSCE experience, and OSCE training. This study aimed to describe the development of the videos and to analyze the examiners' scoring of the developed videos with respect to their backgrounds. The findings can offer a reliable way to foster the objectivity of the OSCE.

2 Methods

This study describes the process of developing a video-based assessment (Fig. 1). First, Cardio-Pulmonary Resuscitation (CPR) skill assessment was chosen because it already has a specific guideline from the American Heart Association [15, 16]. To make it usable by our OSCE examiners, two cardiologists rewrote the guideline, adapted it into Bahasa Indonesia, and revised the assessment rubrics to match the OSCE requirements. Content validity of the rubric and assessment guide was achieved when the assessment instruments were reviewed by the cardiologists. We then developed standardized simulated CPR procedure videos under their supervision, based on the guideline. Our students served as the actors in both videos: one video portrayed CPR performed according to the guidelines, and the other did not comply with them. The cardiologists gave feedback after watching the videos, and revisions were made where appropriate.

Fig. 1
The development of the clinical skill videos. The flowchart shows CPR guideline rewriting by cardiologists, adaptation into Bahasa Indonesia, preparation of assessment rubrics and guidelines, script development, video recording, revision feedback on the videos, and video validation by OSCE examiners.

A total of 51 OSCE examiners from the Faculty of Medicine, Duta Wacana Christian University (UKDW) were enrolled through total sampling to assess the CPR performance in both videos using the standardized assessment guidelines. These examiners were pre-clinical and clinical teaching lecturers from various scientific groups in the medical faculty. The Faculty of Medicine at UKDW uses the OSCE as a regular clinical skills examination every semester in undergraduate medical education.

This was a quantitative, cross-sectional study of the OSCE examiners' assessments of the CPR competency videos. The examiners' scores, grouped by each background characteristic, were analyzed with the Kruskal–Wallis test because the data were not homogeneously distributed. The study was submitted to the Health Research Ethics Committee, Faculty of Medicine, Duta Wacana Christian University, and data collection began after approval was received (Reference No. 1068/C.16/FK/2019).
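
For readers who wish to reproduce this type of comparison, the sketch below shows how such a Kruskal–Wallis test could be run in Python with SciPy. The scores and group labels are hypothetical placeholders, not the study data; the actual results are summarized in Table 2.

```python
# Minimal sketch of a Kruskal-Wallis comparison of video scores across
# one examiner background characteristic. All values are hypothetical.
from scipy.stats import kruskal

# Hypothetical video scores grouped by, e.g., educational background.
scores_by_group = {
    "bachelor": [30.00, 33.33, 36.67, 33.33],
    "master":   [33.33, 40.00, 33.33, 36.67],
    "doctoral": [26.67, 33.33, 30.00, 33.33],
}

# kruskal() takes one sample per group and returns the H statistic and
# the p-value; degrees of freedom = number of groups - 1.
h_stat, p_value = kruskal(*scores_by_group.values())
df = len(scores_by_group) - 1

print(f"H = {h_stat:.2f}, df = {df}, p = {p_value:.3f}")
```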

3 Results

3.1 Script Development

Both video scripts were written and acted out based on the American Heart Association's standardized rubrics and scoring guidelines. The two scripts were compiled by the researchers, then reviewed and revised by two cardiologists, who further developed them into rubrics and assessment guides for evaluating student performance in CPR competencies. Content validity of the rubrics and assessment guides was achieved when the assessment instruments were reviewed by the cardiologists as experts. The rubric and assessment guide coherently assessed three competencies, namely the primary survey, CPR procedures, and professional behavior, each defined in detail with specific explanations in the assessment guide and anchored to each value scale. In the script that did not follow the guidelines, the standardized examinee performed fewer than 70% of the clinical skills in the rubric, while in the guideline-compliant script, the standardized examinee performed more than 70% of the clinical skills on the checklist.
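
As a simple illustration of the 70% threshold described above, the fraction of checklist items performed can be computed and classified as follows. This is a hypothetical sketch; the actual checklist items and their scoring come from the AHA-based rubric.

```python
# Sketch of the 70% checklist threshold; the item counts are hypothetical.
def checklist_completion(items_done):
    """Return the percentage of checklist items performed."""
    return 100.0 * sum(items_done) / len(items_done)

# A hypothetical guideline-compliant performance: 8 of 10 items done.
compliant = [True] * 8 + [False] * 2
# A hypothetical non-compliant performance: 5 of 10 items done.
non_compliant = [True] * 5 + [False] * 5

for label, items in [("compliant", compliant), ("non-compliant", non_compliant)]:
    pct = checklist_completion(items)
    verdict = "follows guidelines" if pct > 70 else "does not follow guidelines"
    print(f"{label}: {pct:.0f}% of items -> {verdict}")
```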

3.2 Video Recording

Two videos, one showing CPR performed according to the guidelines and the other showing CPR that did not follow them, were recorded; both covered the primary survey, CPR procedures, and professional behavior. All videos were recorded in the Skills Laboratory of the Faculty of Medicine, Duta Wacana Christian University, with a Canon digital camera. Filming of the video scripts was supervised by the researchers. The scripts were filmed by Medical Information Technology (IT) staff and re-shot several times to best capture the situations written in the scripts. The cardiologists gave revision feedback on the two videos, and the videos were then re-produced based on their feedback.

3.3 Video Validation

Validation of the two videos was conducted by the OSCE examiners who participated in this study. The 51 examiners are described in Table 1.

Table 1 Study subjects’ characteristics

The median scores for the two videos, broken down by each examiner background characteristic, together with the significance values from the Kruskal–Wallis analysis, are shown in Table 2.

Table 2 Video assessment results

For the CPR video showing guideline-compliant performance, the assessment results did not differ significantly across any of the examiner characteristics. Significant differences occurred for two examiner characteristics, namely education and clinical experience, when examiners assessed the CPR performance that did not follow the guidelines. The median score for both groups was the same (33.33), with p = 0.04 (df = 3) and p = 0.03 (df = 2), respectively.

4 Discussion

Creating a video for assessment involves several steps, all of which were followed in this study: planning (pre-production), recording (production), and editing (post-production) [17]. Planning is important to ensure that the subsequent steps of video development proceed as expected; in this study, it covered developing and validating two video scripts, one depicting CPR performed appropriately and one depicting CPR not done properly according to the guidelines [18]. The recording step needs to be supervised by the scriptwriter, and the shooting must be done by a professional, as was done here. This step is important so that the recordings capture all relevant and objective information, are clearly visible, and do not let viewers miss important details [19]. Post-production is likewise important as a final filter before the video reaches its viewers, as in the development of this assessment video. Submitting the edited video to an expert as its first viewer allows the expert to identify potential gaps that could affect assessment with the video and provides an opportunity for adjustments before the video is implemented [18].

The validation analysis of the two CPR videos showed that, although variations in the examiners' backgrounds allowed for differences in cognitive processes and in examiner behaviors that could affect the assessment results, the examiners were still able to reach consistent decisions. This suggests that the assessment results for the two videos were influenced only by differences in the performance of the students themselves. These results are consistent with previous studies [20, 21]. Examiners tend to judge more easily, and with greater accuracy, when rating excellent performances and when failing low-quality ones, because they base their assessments on quantitative checklists of clinical skill performance [22, 23]. The tendency to rate students who perform well according to the assessment guidelines more readily arises because examiners base their assessments on quantitative measurements of the student's performance, such as counting the number of correct points, rather than on a global judgment of pass or fail; the examiner thus judges based on the fulfillment of checklist components [23]. This tendency can also be explained by the fact that, when assessing a good performance, it is easier for the examiner to select the highest checklist point [22]. A video-based assessment accompanied by specific assessment instruments based on the newest and most detailed evidence can increase the assessment's reliability [24].

As a reflection for the future, examiners find it easier to assess good performance, and assessment deviations can be reduced by using specific cases, which indicates that a learning process occurs when examiners evaluate specific cases [24]. A video-based assessment using specific cases will therefore be more effective than one using general cases.

To minimize or avoid examiner biases, this example of video-based assessment can foster the highest objectivity of the OSCE when applied in OSCE role-play scoring training. Through such training, we hope examiners will develop a shared perception of how to use the assessment tools and references, minimizing the effect of background variability. Examiners' knowledge about their assessment performance, including the availability of clear checklists, understanding of the scoring rubric, and a clear global rating scale and how to apply it, can be targeted in OSCE examiner training to minimize bias [25, 26].

One limitation of this study is that its results cannot be applied to skills with more complicated cases, such as communication skills and clinical reasoning, because those skills are assessed differently from a procedural skill with highly standardized cases such as the CPR used here. Generalizability is another drawback, because the examiners came from a single institution. However, the examiners were standardized in the same way and are comparable to examiners at other institutions, so this approach can also be applied elsewhere.

Future research may use other clinical skills, such as communication and clinical reasoning. Both are assessed differently and are more complex than the procedural skill in this study, so they could answer with more certainty how examiners' backgrounds influence clinical skills assessment.

5 Conclusions

There were no significant differences in scoring between OSCE examiners, except across the clinical practice experience and educational background categories. Video-based assessment can foster the objectivity of the OSCE and can therefore be applied in OSCE assessor scoring training. However, this study shows that there remain sources of examiner bias of which academics need to be aware.