Introduction

The Internet is an important tool for both patient and physician education. Several recent studies investigating web-based orthopaedic information available to patients have identified wide variability in quality and accuracy [2, 6, 12–14, 17]. Furthermore, these studies demonstrated that access to high-quality content depends largely on the search terms used [12, 14, 17]. Much like patients, healthcare providers are increasingly turning to web-based content for clinical instruction and medical information [27].

Many open-access video tutorials focused on resident and physician education have emerged rapidly and gained popularity due, in large part, to their convenience, zero cost, and accessibility. Websites such as YouTube (www.YouTube.com) and VuMedi (www.vumedi.com) provide access to thousands of educational videos intended for orthopaedic healthcare providers. It was previously reported that more than 7200 educational websites dedicated to orthopaedics and orthopaedic-related issues could be found on the Internet [11]. Notably, while recent studies in non-orthopaedic disciplines espoused the utility of open-access websites such as YouTube for medical education [19, 25], others reported that the quality of the video content was suboptimal [10, 16, 23, 28].

To our knowledge, no studies have investigated the use of orthopaedic open-access video tutorials by trainees or the quality of information available. The purpose of this study was to determine the accuracy of internet-based instructional videos featuring the shoulder physical examination.

Materials and Methods

On April 4, 2012, the video databases of four open-access websites (VuMedi, G9MD, Orthobullets, and YouTube) were searched using the search terms “shoulder,” “examination,” and “shoulder exam.” The shoulder physical examination was selected because it is a routine component of resident education and comprises a variety of maneuvers that can be studied. The inclusion criteria were (1) videos involving physical examination of the shoulder, whether comprehensive or focused on a specific aspect of the exam, and (2) videos featuring a healthcare professional demonstrating the exam. The term “healthcare professional” included medical doctors (any specialty), physical therapists, and athletic trainers. Only videos that identified the individual in the video as a healthcare professional by verbal or written introduction (either in the video information or embedded in the video itself) were included. Videos were excluded if they did not address the physical exam of the shoulder or were not posted by a healthcare professional. In the case of high-volume search results (pertaining only to YouTube), only the first ten pages of results for each search term were screened. Because YouTube ranks results by relevance to the search term, we reasoned that the first ten pages of any search would provide the highest concentration of videos relevant to our study and fitting the inclusion criteria. Duplicates were eliminated. Each video was independently assessed for accuracy by three orthopaedic surgeons at various stages of training: a chief resident (rater 1), a junior resident (rater 2), and an intern (rater 3) (SAT, EYU, and EC).

To improve reviewer consistency and to minimize subjectivity, reviewers were provided a description of each maneuver as it was originally described in the literature (Table 1). Although several scoring systems had previously been used in the non-orthopaedic literature, none was appropriate for the video content assessed here, as each was tailored to the specific clinical examination in question [10, 23, 28]. As such, included shoulder examination videos were assessed for accuracy with a customized ordinal grading system based on previously established grading criteria [2, 6, 12–14, 17, 20, 21]. An accuracy grade of 1 represented agreement with less than 25% of the information provided in the video, 2 represented agreement with 25–50%, 3 with 51–75%, and 4 with 76–100%. Scores were assigned according to the degree of accuracy with which each maneuver was performed, as compared with the aforementioned published standards. This included the description of the test or maneuver, the completeness of the demonstration, and the accuracy of the interpretation of the test. In the case of the O’Brien sign, evaluation of the maneuver was deconstructed into three components: (1) arm position, (2) examiner action, and (3) verbal description and interpretation of the test. An individual score was assigned to each component, so the test was not assigned an overall cumulative score.
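For illustration only, the grading scheme can be expressed as a simple threshold function. The sketch below is written in Python (which was not used in the study) and takes a hypothetical percent-agreement value as input; the raters assigned grades by direct review rather than by computing a percentage.

```python
def accuracy_grade(percent_agreement: float) -> int:
    """Map percent agreement with the published standard to the 1-4 ordinal grade.

    Illustrative sketch of the grading scheme described above; the raters
    assigned grades by direct review, not via a computed percentage.
    """
    if percent_agreement < 25:
        return 1  # grade 1: <25% agreement
    elif percent_agreement <= 50:
        return 2  # grade 2: 25-50% agreement
    elif percent_agreement <= 75:
        return 3  # grade 3: 51-75% agreement
    else:
        return 4  # grade 4: 76-100% agreement
```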

Table 1 Description of proper technique for physical exam maneuvers evaluated in videos

Inter-rater agreement between the reviewers’ scores for every exam maneuver was assessed by calculating a linearly weighted Cohen’s kappa with 95% confidence intervals [26]. Kappa values ≥0.70 indicate good agreement, and kappa values ≥0.80 indicate excellent agreement [18]. Video accuracy was assessed for each examination maneuver by calculating the frequency and percentage of videos that included the maneuver and, when present, the frequency and percentage at each of the aforementioned accuracy grades. A similar analysis was conducted for overall video accuracy by considering each video in its entirety. All statistical analyses were performed with SAS version 9.3 (SAS Institute, Cary, NC, USA).
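As an illustration of the agreement analysis, the following Python sketch computes pairwise linearly weighted Cohen’s kappa for three raters and summarizes the mean and range. The grades shown are hypothetical placeholders, and the study itself performed all analyses in SAS; this is a sketch of the statistic, not the study’s code.

```python
from itertools import combinations

import numpy as np
from sklearn.metrics import cohen_kappa_score

# Hypothetical accuracy grades (1-4) assigned by three raters to a set of
# examination maneuvers; the study's actual data are not reproduced here.
ratings = {
    "rater1": [4, 3, 1, 2, 4, 4, 2, 3],
    "rater2": [4, 3, 2, 2, 4, 3, 2, 3],
    "rater3": [4, 2, 1, 2, 4, 4, 1, 3],
}

# Linear weighting penalizes larger disagreements on the ordinal 1-4 scale
# more heavily than adjacent-grade disagreements.
kappas = [
    cohen_kappa_score(ratings[a], ratings[b],
                      labels=[1, 2, 3, 4], weights="linear")
    for a, b in combinations(ratings, 2)
]

print(f"mean kappa = {np.mean(kappas):.2f}, "
      f"range = {min(kappas):.2f}-{max(kappas):.2f}")
```

Confidence intervals, as reported in the study [26], could be obtained analytically or by bootstrapping over videos; that step is omitted here for brevity.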

There was no external funding for this study.

Results

Thirty-six unique open-access video tutorials met the inclusion and exclusion criteria and were independently reviewed. The healthcare professionals featured in the reviewed videos included medical doctors and physical therapists. Inter-rater reliability was excellent (mean kappa 0.80, range 0.79–0.81).

Of the 36 videos, 25 were from YouTube, seven from VuMedi, three from Orthobullets, and one from G9MD. YouTube videos had the highest overall number of inaccuracies, with 60% receiving a grade of 1 or 2 (Fig. 1a). VuMedi had the highest number of accurate ratings, with 37% receiving a grade of 4. When all videos were combined, 9.2% were scored grade 3 and 29.7% grade 4; the remaining 61.1% received a grade of 1 or 2, indicating low (<50%) accuracy. No single component of the exam received a perfect score across all 36 videos. The most consistently accurate maneuvers (grade 4) were acromioclavicular (AC) joint palpation (97.6% of videos), bicipital groove palpation (92.5%), and the biceps Popeye sign (91.7%; Table 2 and Fig. 1b). Range of motion testing accuracy was reduced because the test was incomplete in several videos. Provocative maneuvers such as the Neer impingement, Hawkins, and belly press tests were least often performed accurately, receiving a grade of 4 only 26.1%, 38.1%, and 44.4% of the time, respectively (Fig. 1c). The most common error in the Neer and Hawkins tests was failure to perform the maneuver in the scapular plane. The most common error in the belly press test was arm positioning, with the elbow falling posterior to the mid-axillary line of the body. The most common error in the Yergason test was demonstrating resisted pronation instead of supination. The active compression test (O’Brien sign) received a grade of 4 in only 42.5% of videos for arm position, 60.3% for performance of the maneuver, and 56% for interpretation. The most common errors identified for the active compression test were failure to adduct the arm 15° and omission of the “palm-up” portion of the exam.

Fig. 1
figure 1

a Percent accuracy of examination maneuvers broken down by website. b Overall percent accuracy of common shoulder examination maneuvers (1 = <25% accurate; 2 = 25–50% accurate; 3 = 51–75% accurate; 4 = >75% accurate). c Overall percent accuracy of provocative maneuvers (1 = <25% accurate; 2 = 25–50% accurate; 3 = 51–75% accurate; 4 = >75% accurate).

Table 2 Summary of frequency and accuracy of individual components of the physical exam of the shoulder

Discussion

This study examined the accuracy of orthopaedic internet-based educational video content. Our results suggest that the quality of educational orthopaedic videos on the internet is inconsistent.

This study has some limitations that should be addressed. First, the grading system, while modeled closely after similar scales used to grade Internet content [12, 14, 17], is novel with regard to video content and subjective in nature. The grading method, however, demonstrated excellent inter-rater agreement. Second, the small sample size and the wide variability in the number of videos drawn from each site preclude us from drawing firm conclusions about content accuracy among the different sites. Furthermore, the video cohort was predominantly from YouTube, with far fewer videos from the other included sites, which further limits our ability to comment definitively on the accuracy of videos from those sites. Third, limiting the search to the first ten pages of YouTube results introduces a potential selection bias: high-quality videos may have been inadvertently excluded because of their position in the results queue. Finally, the study represents a snapshot of video content available online at the time of the search and may not reflect the dynamic nature of the information stream; it is possible that the content available on the internet has changed significantly since the search was performed. We sought to mitigate this limitation by selecting the shoulder examination for collection and analysis: basic shoulder examination techniques are relatively well conserved and have not changed dramatically in recent years, so we felt they represented the most stable tutorial content for evaluation.

A review of the relevant literature reveals that similar studies have been performed in other medical and surgical disciplines [1, 3–5, 7–10, 15, 16, 22–24, 28]. One study evaluated the quality of YouTube videos addressing cardiopulmonary resuscitation (CPR) [23]. The authors queried YouTube on a single day using four search terms and evaluated the videos for accuracy and “view-ability.” They determined that only 63% of the 52 videos demonstrated the correct compression-ventilation ratio and that 19% incorrectly recommended checking for a pulse. The authors concluded that although YouTube videos were a potentially valuable source of learning material, they frequently omitted crucial information and presented wholly inaccurate information. Another study evaluated the quality of cardiac auscultation tutorial videos on YouTube [10]. The authors analyzed a total of 22 videos for audiovisual quality, teaching quality, and comprehensiveness, and found the quality of the content to be highly variable. They concluded that few of the many cardiac auscultation tutorials available on YouTube were accurate. More recently, the same group performed a similar analysis of YouTube videos focused on respiratory auscultation [28]. Of the 6022 videos located, only 36 met the inclusion criteria. The quality of these videos, as one might expect, was highly variable, and no video achieved the highest score. The authors emphasized the high volume of widely available, poor-quality, factually incorrect educational content and the difficulty of locating the small number of valuable, informative videos. They urged institutions engaged in healthcare education to guide their trainees to quality internet educational resources and called for a standardized moderating system to improve content quality. Similar results were reported by other investigators for videos focused on the cardiovascular and respiratory physical examination [5] and knee arthrocentesis [16]; those authors concluded that the vast majority of educational videos on YouTube are unsuitable for teaching purposes. Our results were consistent with this assessment, demonstrating that YouTube videos in particular have an unacceptably high frequency of inaccuracies. Given the difficulty of regulating content on YouTube, these results are not surprising. Video content on more regulated sites such as VuMedi and Orthobullets appears to be more accurate, although the quality of those videos was also inconsistent. Our results suggest that VuMedi videos are the most consistently accurate.

Open-access video platforms not only facilitate the distribution of educational material but also allow the swift dissemination of misinformation; quality control is therefore of paramount importance. This study suggests that the quality of the information available is inconsistent at best. We recognize that regulating public-domain sites such as YouTube is nearly impossible, and thus the quality of content will always be beholden to those who choose to submit it. As such, the task of ensuring that trainees have access to high-quality instructional content falls squarely on the shoulders of the orthopaedic community, be it via the national and international societies, individual academic centers, or a combination of the two.

The first step in addressing this issue is to acknowledge the central role that video-based education has assumed in resident education, driven by the popularity of advanced mobile devices. Next, the absence of a large, well-regulated video database should be recognized. This information gap is demonstrated in our study by the prevalence of YouTube-based videos and the relative paucity of videos from orthopaedic websites. Orthopaedic trainees will inevitably seek out web-based information as it is needed. Unfortunately, as it stands now, the majority of instructional orthopaedic videos are found on YouTube. This fact should be the single most influential driving force behind improving and standardizing available educational video content. One way to address this issue is to establish internal educational platforms where content is regulated by a select group of specialists. One such example is the Hospital for Special Surgery eAcademy website (https://hss.classroom24-7.com), which is accessible to both trainees and patients. In reality, however, the solution most likely lies with pre-existing educational platforms whose major purpose is trainee education, such as Orthobullets or Wheeless (www.wheelessonline.com). These sites’ main goal is to convey high-yield orthopaedic information in an easily accessible format. Additionally, such websites have the advantage of not being bound by specialty or geographic location, which allows them to cover a large spectrum of information and reach a large population of trainees.

Ultimately, whether educational content is produced by individual entities or by public websites is less important; the key is pooling of information, which is then evaluated against predetermined standards by experts in the field. The result would then be a collection of reliable, standardized videos provided on a single platform (website, phone application, etc.) to a population of trainees who have been directed to this platform by their respective institutions.

The results of this study suggest that the information presented in open-access video tutorials featuring the physical examination of the shoulder is inconsistent. Internet-based learning has become an established component of medical student, resident, and fellow education and has been recognized as a powerful teaching tool. These resources are likely to penetrate the education market further as mobile devices and smartphone applications continue to advance. Given the ubiquitous nature of online educational media and the inconsistency of its content, we believe that academically oriented orthopaedic institutions have a responsibility both to acknowledge this method of learning and to guide orthopaedic trainees to accurate educational resources.