Introduction

First proposed by Gartland et al. in 1959 and subsequently refined by Wilkins in 1984, the modified Gartland classification has become a widely accepted and valuable grading scale for communication and initial treatment guidance of supracondylar humeral fractures [1,2,3,4,5]. This modified classification system stratifies supracondylar humeral fractures into three types: Gartland type I fractures, non-displaced or minimally displaced (Fig. 1a); Gartland type II fractures, having intact posterior cortex, with or without rotational displacement or translation (Fig. 1b); and Gartland type III fractures, having complete displacement with no posterior cortical contact (Fig. 1c). Type II fractures are further subdivided into type IIA (sagittal extension deformity/angulation) and type IIB (IIA findings with the addition of axial and coronal malalignment) [2].

Fig. 1

Radiographs demonstrating modified Gartland classification of supracondylar humerus fractures. (a) Left lateral radiograph of a 9-year-old boy displaying a Gartland type I supracondylar fracture with a nondisplaced transverse fracture best appreciated along the anterior humeral cortex (arrow). (b) Left lateral radiograph of an 11-year-old girl displaying a Gartland type II supracondylar fracture with posterior angulation of the distal fracture fragment, with the capitellum posterior to the anterior humeral line, and posterior humeral cortical buckling (arrow) without displacement. (c) Left lateral radiograph of a 9-year-old boy displaying a Gartland type III supracondylar fracture with both posterior displacement and angulation of the distal fracture fragment, with non-contact of the posterior cortical column of the distal humerus

Our institution employs the modified Gartland classification system for supracondylar humeral fracture communication and initial management guidelines. At our institution, Gartland type I supracondylar fractures are immobilized but do not require operative management or urgent orthopedic consultation in the emergency department (ED). A diagnosis of Gartland type II or III supracondylar humeral fracture triggers an immediate orthopedic consultation to ensure timely surgical intervention if necessary. This algorithmic approach relies on strong inter-departmental consensus on modified Gartland classification among pediatric emergency medicine providers, radiologists, and orthopedic surgeons to ensure appropriate management of these fractures. Therefore, the purpose of our study is to evaluate the interobserver reliability of the modified Gartland classification system among pediatric radiologists, pediatric orthopedic surgeons, and pediatric emergency medicine physicians.

Materials and methods

This study was approved and granted a waiver of informed consent by our institutional review board. Pediatric patients who were diagnosed with a supracondylar humeral fracture at a single tertiary pediatric hospital system from January to November 2022 were identified for retrospective review using the ICD code S42.41* (simple supracondylar fracture without intercondylar fracture of humerus). Three hundred six patients were consecutively reviewed until we reached our goal of 100 supracondylar humeral fractures meeting inclusion criteria, with a breakdown of 40 type I, 30 type II, and 30 type III fractures. Modified Gartland fracture grading for inclusion was based on the formal pediatric orthopedic faculty consultation note, which was written either in the clinic or in the emergency department. These grades were extracted from the official consultation note, which includes pediatric orthopedic faculty review of imaging and their clinical exam, and were considered the reference standard. Orthopedic providers include fellowship-trained pediatric orthopedic surgeons, as well as orthopedic physician assistants and orthopedic surgery residents under the direct supervision of an attending pediatric orthopedic surgeon.

We excluded children whose fractures lacked a modified Gartland grade specified by the orthopedic department in the medical record at the time of the initial encounter; were categorized as occult, nondisplaced, or suspected; had been casted prior to radiographs; or already showed signs of healing at the time of the first radiograph.

Radiographic elbow series, including standard 3-view (anterior–posterior, lateral, and oblique view) and 2-view (anterior–posterior and lateral view), acquired at the point of entry to our institution, were systematically collected for each patient. Most radiographic sets in the study cohort were 3-view elbow series. Radiographic sets were compiled into a PowerPoint (Microsoft, Redmond, WA) document and initially reviewed by six separate pediatric subspecialty-trained physicians: two pediatric radiologists, two pediatric emergency medicine physicians, and two pediatric orthopedic surgeons. Graders were provided with a figure from Teo et al. containing definitions and a pictorial for each modified Gartland grade [6]. Each grader assigned each radiographic set a single modified Gartland grade of type I, type II, or type III (Fig. 1). Our study did not subcategorize type II fractures as IIA or IIB. When the initial two graders disagreed on a case, a third grader of the same subspecialty served as the tie-breaker to generate consensus for the overall subspecialty grade. Graders were blinded to the scores of the other graders as well as to the final reference standard diagnosis and treatment outcome. All graders were affiliated with a quaternary care academic freestanding children’s hospital, which has regularly used the modified Gartland classification since 2016. All graders had prior experience with the routine use of modified Gartland fracture classification of supracondylar fractures. Initial orthopedic graders had 0.5 (JCW) and 7 years (AZG) of post-fellowship practice experience, and the orthopedic tie-breaker had 32 years of post-fellowship practice experience (BGS). Initial pediatric radiology graders had 3.5 (ESB) and 25 years (SJK) of post-fellowship practice experience, and the pediatric radiology tie-breaker had 19 years of post-fellowship practice experience (JHK). Initial pediatric emergency medicine graders had 8 (EBH) and 20 years (JYA) of post-fellowship practice experience, and the pediatric emergency medicine tie-breaker had 15 years of post-fellowship practice experience (ATC).
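
The tie-breaker rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's actual workflow: the function name `consensus_grade` and the integer encoding of grades (1, 2, 3) are assumptions, as is the choice to accept the tie-breaker's grade as final even when it matches neither initial grade, a case the text does not address.

```python
from typing import Optional


def consensus_grade(first: int, second: int, tie_breaker: Optional[int] = None) -> int:
    """Return one subspecialty's consensus Gartland grade (1, 2, or 3) for a case.

    `first` and `second` are the two initial graders' scores; `tie_breaker`
    is the third grader's score and is consulted only on disagreement.
    """
    if first == second:
        # The initial graders agree, so no third read is needed.
        return first
    if tie_breaker is None:
        raise ValueError("initial graders disagree; a tie-breaker grade is required")
    # Assumption: the tie-breaker's grade is taken as the consensus,
    # even if it matches neither of the two initial grades.
    return tie_breaker
```

For example, `consensus_grade(1, 2, tie_breaker=2)` yields 2, while `consensus_grade(3, 3)` yields 3 without a third read.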

Statistical analysis

Interobserver reliability was assessed, and kappa values were generated with 95% confidence intervals using IBM SPSS Statistics for Windows, version 29.0 (Armonk, NY: IBM Corp). Cohen’s weighted kappa was used to calculate agreement between the initial two observers within the same subspecialty. Fleiss’ kappa was used to calculate reliability among the three subspecialty consensus grades across Gartland type I, type II, and type III. Fleiss’ kappa was also used to assess reliability when discriminating Gartland type I from type II and type III combined. The kappa values were interpreted using the system proposed by Landis and Koch: values less than 0.00 indicate poor agreement; 0.00 to 0.20, slight agreement; 0.21 to 0.40, fair agreement; 0.41 to 0.60, moderate agreement; 0.61 to 0.80, substantial agreement; and 0.81 to 1.00, excellent or almost perfect agreement [7]. The mean grade was calculated for the reference standard (based on the formal orthopedic consultation note, which included clinical exam and imaging) and for each subspecialty consensus grade (pediatric emergency medicine, radiology, and orthopedics), which was based on imaging alone. Two-tailed t-tests were used to compare each subspecialty’s mean grade to the reference standard mean grade. A fracture consensus grade was defined as under-triaged if a Gartland II/III fracture was misclassified as a Gartland I fracture. A fracture consensus grade was defined as over-triaged if a Gartland I fracture was misclassified as a Gartland II/III fracture.
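
To make the inter-specialty statistic concrete, the following is a minimal pure-Python sketch of Fleiss' kappa together with the Landis and Koch interpretation bands listed above. It is not the SPSS procedure used in the study and does not compute confidence intervals; the function names (`fleiss_kappa`, `landis_koch`, `dichotomize`) and the integer grade encoding are assumptions made for illustration.

```python
from collections import Counter
from typing import Sequence


def fleiss_kappa(ratings: Sequence[Sequence[int]], categories: Sequence[int]) -> float:
    """Fleiss' kappa for a table of ratings: one row per case, one column per rater."""
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("every case needs the same number (>= 2) of ratings")
    counts = [Counter(row) for row in ratings]
    # Observed agreement: mean proportion of agreeing rater pairs per case.
    p_bar = sum(
        (sum(c[cat] ** 2 for cat in categories) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_cases
    # Chance agreement: sum of squared overall category proportions.
    p_e = sum(
        (sum(c[cat] for c in counts) / (n_cases * n_raters)) ** 2 for cat in categories
    )
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_bar - p_e) / (1.0 - p_e)


def landis_koch(kappa: float) -> str:
    """Map a kappa value to the Landis and Koch band quoted in the text."""
    if kappa < 0.0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


def dichotomize(grade: int) -> int:
    """Collapse grades to type I (1) versus type II/III combined (2)."""
    return 1 if grade == 1 else 2
```

Under these assumptions, each row would hold the three subspecialty consensus grades for one case, and the dichotomized analysis would apply `dichotomize` to every grade before calling `fleiss_kappa` with `categories=(1, 2)`.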

Results

The mean age of the 100 included patients was 6.1 ± 2.5 years (range, 2–13 years), and 47% were female. According to the reference standard, there were 40 Gartland type I, 30 Gartland type II, and 30 Gartland type III fractures in the study sample.

Agreement between initial readers of the same specialty

There was substantial interobserver agreement between the initial two radiology readers (kappa = 0.67 [95% CI, 0.55 – 0.79]) and substantial interobserver agreement between the initial two orthopedic surgery readers (kappa = 0.63 [95% CI, 0.52 – 0.74]). There was moderate interobserver agreement between the initial pediatric emergency medicine readers (kappa = 0.60 [95% CI, 0.49 – 0.72]).
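
The within-specialty values above come from Cohen's weighted kappa, which penalizes a type I versus type III disagreement more heavily than a disagreement between adjacent grades. The sketch below shows one way to compute it in plain Python. The text does not state which weighting scheme was used, so linear weights are assumed by default, with quadratic weights available as an option; the function name `weighted_kappa` and its parameters are hypothetical.

```python
from typing import Sequence


def weighted_kappa(
    rater_a: Sequence[int],
    rater_b: Sequence[int],
    categories: Sequence[int],
    quadratic: bool = False,
) -> float:
    """Cohen's weighted kappa for two raters grading the same cases.

    `categories` must list the ordered grades, e.g. (1, 2, 3).
    Assumption: linear disagreement weights unless `quadratic` is True.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same non-empty set of cases")
    index = {cat: i for i, cat in enumerate(categories)}
    k = len(categories)
    n = len(rater_a)
    # Cross-tabulate the two raters' grades.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1
    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    power = 2 if quadratic else 1
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) ** power  # 0 on the diagonal (full agreement)
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    if weighted_expected == 0.0:
        raise ValueError("kappa is undefined when both raters use a single grade")
    return 1.0 - weighted_observed / weighted_expected
```

With three ordered grades, the choice between linear and quadratic weights changes only how much a two-grade disagreement costs relative to a one-grade disagreement.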

Inter-specialty agreement on fracture classification

Total numbers of each grade assigned by each subspecialty group and inter-specialty agreement on individual fracture grades are shown in Table 1. Overall, there was substantial interobserver agreement (kappa = 0.77 [95% CI, 0.69 – 0.85]) on consensus fracture grade between the three subspecialties. Similarly, when discriminating between Gartland type I and any higher fracture grade, there was substantial interobserver agreement with kappa of 0.77 (95% CI, 0.66 – 0.89).

Table 1 Consensus diagnosis for supracondylar fracture grading by pediatric subspecialty with kappa statistics for inter-specialty agreement on individual fracture grades

Subspecialty agreement with the reference standard

There was substantial agreement between radiology readers and the reference standard with kappa of 0.79 (95% CI, 0.67 – 0.92). There was almost perfect agreement between orthopedic graders and the reference standard with kappa of 0.90 (95% CI, 0.81 – 0.96). There was substantial agreement between pediatric emergency medicine and the reference standard with kappa of 0.67 (95% CI, 0.53 – 0.82).

There was no significant difference between the average reference standard grade and the average grade assigned by the orthopedic surgery readers (p = 0.096) or by the radiology readers (p = 0.198). However, the average fracture grade assigned by pediatric emergency medicine was significantly higher than the average reference standard grade (p = 0.038) (Table 1).

For 14/100 (14%) patients, the pediatric emergency medicine consensus grade was higher than the reference standard grade (Fig. 2) while in 5/100 (5%) cases, the pediatric emergency medicine consensus grade was lower (Fig. 3). For 5/100 (5%) patients, the radiology consensus grade was higher than the reference standard while in 10/100 (10%) cases the radiology consensus grade was lower (Fig. 4). For 2/100 (2%) patients, the orthopedic consensus grade was higher than the reference standard grade while in 7/100 (7%) cases the orthopedic consensus grade was lower (Fig. 3).

Fig. 2

A 2-year-old boy with Gartland type I fracture per the clinical reference standard. (a) Lateral, (b) AP, and (c) oblique radiographs show a nondisplaced buckle-like supracondylar fracture (arrow) and large elbow joint effusion (arrowheads). Consensus grading by pediatric emergency medicine providers was Gartland type II. Consensus grading by orthopedics and radiology was Gartland type I

Fig. 3

An 8-year-old boy with Gartland type III fracture per the clinical reference standard. (a) Lateral, (b) AP, and (c) oblique radiographs show transverse fracture with posterior angulation and displacement of the distal fracture fragment with non-contact of the posterior humeral cortical column (arrow). Consensus grading by pediatric emergency medicine and orthopedics was Gartland type II. Consensus grading by radiology was Gartland type III

Fig. 4

A 4-year-old boy with Gartland type II fracture per the clinical reference standard. (a) Lateral, (b) AP, and (c) oblique radiographs show a transverse fracture line (arrow), an anterior humeral line that intersects the most anterior aspect of the capitellum (white line), and posterior cortical buckling of the humerus (arrowhead) without displacement. Consensus grading by both pediatric emergency medicine and orthopedics was Gartland type II. Consensus grading by radiology was Gartland type I

When compared to the reference standard, under-triage of Gartland II/III fractures (n = 60) as Gartland I fractures was seen in 2/60 (3%) instances by pediatric emergency medicine providers, 6/60 (10%) by radiologists, and 3/60 (5%) by orthopedic surgeons. Over-triage of Gartland I fractures (n = 40) as Gartland II/III fractures was seen in 13/40 (33%) instances by pediatric emergency medicine, 4/40 (10%) by radiologists, and 2/40 (5%) by orthopedics.
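
The triage tallies above follow directly from the definitions in the statistical analysis section. As a rough sketch, assuming per-case reference and consensus grades encoded as integers 1 to 3, they could be counted as follows; the function name `triage_counts` and the dictionary keys are hypothetical, and the example data in the usage note are invented for illustration, not taken from the study.

```python
from typing import Dict, Sequence, Tuple


def triage_counts(
    reference: Sequence[int], consensus: Sequence[int]
) -> Dict[str, Tuple[int, int]]:
    """Count under- and over-triaged cases against the reference standard.

    Returns (number misclassified, denominator) for each direction:
    - under-triage: reference grade II/III, consensus grade I
    - over-triage: reference grade I, consensus grade II/III
    """
    if len(reference) != len(consensus):
        raise ValueError("reference and consensus must cover the same cases")
    pairs = list(zip(reference, consensus))
    type_one = [(ref, con) for ref, con in pairs if ref == 1]
    higher = [(ref, con) for ref, con in pairs if ref >= 2]
    under = sum(1 for _, con in higher if con == 1)
    over = sum(1 for _, con in type_one if con >= 2)
    return {
        "under_triaged": (under, len(higher)),
        "over_triaged": (over, len(type_one)),
    }
```

For instance, with the invented inputs `reference = [1, 1, 2, 3, 3]` and `consensus = [2, 1, 1, 3, 2]`, one of three higher-grade fractures is under-triaged and one of two type I fractures is over-triaged. A type III fracture graded as type II counts in neither direction, because it would still trigger orthopedic consultation.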

Discussion

Supracondylar humerus fractures account for nearly 60% of pediatric elbow fractures [8]. Agreement on proper identification and classification of supracondylar humerus fractures among pediatric providers is important because grading guides fracture management. Further, efficient management of these fractures is one of the quality metrics for pediatric orthopedic care used by U.S. News & World Report, which requires an 18-h benchmark from admission to treatment for operative supracondylar fractures [9]. We have shown that the modified Gartland classification for supracondylar humeral fractures has substantial to excellent reliability among pediatric emergency medicine physicians, pediatric radiologists, and pediatric orthopedic surgeons.

Prior studies have established the modified Gartland classification to be a reliable tool among orthopedic surgeons, reporting moderate to substantial agreement with kappa values between 0.475 and 0.77 [3, 4, 6, 10]. However, except for a study by Barton et al., which included a radiologist, to our knowledge no studies have assessed the reliability of the modified Gartland classification among physicians other than orthopedic surgeons. In their study, Barton et al. showed that the modified Gartland classification had substantial interobserver agreement with kappa values ranging from 0.59 to 0.77 between five observers [4]. This study also demonstrated substantial to almost perfect intra-rater agreement on fracture grading with kappa values between 0.72 and 0.93. Interobserver agreement across three pediatric subspecialties in our study is similar to what has been reported among orthopedic surgeons, though agreement in our study may be higher than in broad clinical practice because our institution has been routinely using the modified Gartland classification since 2016. Other studies of interobserver agreement around the Gartland classification, such as the study by Teo et al., reported lower kappa values than our study, but these studies are inherently different and may not be directly comparable to our results. Teo et al. specifically chose radiographs that border two modified Gartland grades and had graders subcategorize type II fractures as IIA and IIB, potentially resulting in the lower kappa [6]. Leung et al. also reported lower kappa values; however, their study also subdivided type II fractures into IIA and IIB and included both resident and attending physicians [10]. While our study did not assess intra-observer reliability, other studies have reported intra-observer kappa values between 0.652 and 0.84 [3, 4, 10].

The modified Gartland classification subdivides type II fractures into type IIA and type IIB, a distinction we did not explore in the current study. Prior studies have shown high levels of disagreement and low reliability when discriminating between type IIA and type IIB fractures, with kappa values ranging from 0.240 to 0.430 [6, 10, 11]. Type IIA fractures are generally more stable, with an intact supracondylar posterior cortex and angulation occurring in the sagittal plane. Type IIB fractures are generally more unstable, with some axial rotatory component and/or coronal malalignment superimposed on sagittal angulation deformity. Compared to studies that did distinguish between types IIA and IIB, kappa values were higher in our study, likely because we did not subcategorize Gartland type II fractures. It is reasonable for identification of a Gartland type II fracture by a radiologist or pediatric emergency medicine physician to activate emergent formal orthopedic consultation. Formal orthopedic consultation allows the surgeon, rather than the non-surgeon, to subcategorize fractures as type IIA (mostly non-surgical) or type IIB (mostly surgical) after the clinical exam.

There are notable trends in our data related to specific provider groups. Pediatric emergency medicine physicians were relatively more cautious, with a statistically significant trend of providing a higher grade for fractures when compared with the clinical reference standard. Specifically, the largest discrepancy was Gartland type I fractures being classified as Gartland type II fractures. This trend could potentially lead to over-triaging and unnecessary activation of emergent orthopedic surgery consultation, which, in centers without on-staff pediatric orthopedic surgeons, might result in unnecessary hospital transfers. Past studies have shown that Gartland type I supracondylar humerus fractures are among the most common fracture types with unnecessary transfer to an outside emergency department [12]. Collaborative educational and interdepartmental quality improvement initiatives among pediatric emergency medicine, radiology, and orthopedics may help improve supracondylar fracture triage, particularly in settings where triage is implemented by pediatric emergency medicine alone.

Our study had limitations. First, all subspecialists were based at a single high-volume academic pediatric quaternary care center that routinely uses the modified Gartland classification for triage. Consequently, our results may not be generalizable to institutions or urgent care centers where providers are less acquainted with or infrequently employ the modified Gartland classification system. Second, our results also may not be generalizable to other institutions that do not employ experienced pediatric radiologic technologists. If radiographs are suboptimally obtained (e.g., poor lateral), this will negatively affect correct modified Gartland classification. Third, we did not assess intra-observer reliability. Fourth, although the modified Gartland classification uses objective radiographic findings, in clinical practice, the actual classification could be influenced by the physical examination findings by pediatric emergency medicine physicians and orthopedic surgeons. Radiographs on patients whose classification borders two categories might be categorized as higher or lower based on the patient’s pain level, neurovascular status, mechanism of injury, or other clinical indicators, and this was not included during consensus grading.

Conclusion

The modified Gartland classification is a reproducible grading system for supracondylar fracture description among pediatric emergency medicine physicians, radiologists, and orthopedic surgeons. Supracondylar fracture structured reporting should include the modified Gartland classification to help improve inter-departmental communication.