Introduction

The formation of Hill–Sachs lesions (HSLs) in anterior shoulder dislocation is a well-known phenomenon [10, 30, 32]. In most cases, these osseous lesions are small and, after Bankart repair, do not require additional surgical treatment [8, 16, 33]. Rarely, however, HSLs of sufficient size and location can engage despite a Bankart repair [11, 13, 15, 31]. To avoid re-dislocation, surgery may be considered as a treatment option for these “engaging” HSLs [1, 22, 24, 26, 43]. Consequently, the reliable detection, localisation, and measurement of HSLs are fundamental for planning adequate surgical treatment, avoiding unnecessary interventions, and reducing health care costs [2, 3, 34, 45]. Different diagnostic imaging modalities, including anterior–posterior (AP)-view radiographs, Stryker Notch (SN)-view radiographs, computed tomography (CT), and magnetic resonance imaging (MRI), have been proposed for the detection of HSLs [6, 20, 49]. So far, however, only a few studies have investigated the clinical value of these imaging techniques, and those studies have mostly obtained contradictory results [11, 31, 48]. The ability of these imaging methods to accurately measure HSLs remains a subject of debate [5, 19, 38, 44]. Several measurement methods have been developed to quantify the amount of humeral bone loss [17, 18, 23, 37, 39, 49]. However, the accuracy and reproducibility of these methods when using different imaging modalities have been underinvestigated, and no consensus has been established to date [4, 41, 46].

The aim of this study was to compare the diagnostic precision of the measurement methods proposed by Franceschi et al. [18], Calandra et al. [9], Richards et al. [37], Hall et al. [23], Rowe et al. [39], Flatow et al. [17] and Di Giacomo et al. [14] in quantifying the amount of humeral bone loss using MRI versus CT. The null hypothesis postulates that the investigated measurement methods for quantifying humeral bone loss using CT are more reliable than those using MRI. This is the first study directly comparing all the above-mentioned measurement methods through the use of MRI and CT. The authors hypothesise that CT and MRI do not differ significantly in the measurement of HSLs.

Materials and methods

Independent institutional review board approval was obtained from the Ethics Committee of the local university (University of Ulm, ID number: 154/20). Eighty consecutive patients with anterior shoulder instability who were scheduled for arthroscopy in our department from 2013 to 2017 were retrospectively enrolled postoperatively in this study. The following inclusion criteria were considered: (1) arthroscopic or open shoulder stabilisation and (2) available CT and MRI scans of the affected shoulder. The following were applied as exclusion criteria: (1) concomitant rotator cuff tear, (2) incomplete imaging diagnostics, and (3) insufficient CT or MRI scan quality. Thirty patients were excluded according to these latter criteria: 25 due to incomplete CT scans and 5 due to insufficient CT scan quality. Ultimately, a total of 50 patients were enrolled in the present study. Table 1 presents the demographic data of the patients included in the study.

Table 1 The demographic data of the analysed patients

Preoperative radiological setup

For all patients, CT (Siemens Somatom Emotion, ST: 1.0 mm, pitch: 0.8, 130 kV) and MRI scans of the shoulder were performed as part of their preoperative diagnostic screening according to our routine clinical setup. MRI was performed with a 1.5-Tesla MRI scanner (Siemens Symphony, Germany).

Intraoperative assessment

For all included patients, the decision for surgery had been made previously and independently of the study. The indication for shoulder stabilisation was based on the presence of clinical shoulder instability with an MRI-confirmed Bankart lesion. Two experienced high-volume shoulder surgeons who were not involved in the radiological examination of the present study performed the arthroscopies. All arthroscopies were performed under brachial plexus block and general anaesthesia. All patients were positioned in the beach chair position, and a Trimano hydraulic support system (Arthrex, Naples, FL, USA) was used to hold their arms. Posterior and anterolateral portals served as standard approaches. Diagnostic arthroscopy was performed first, during which all analysed HSLs were confirmed. HSLs were distinguished from the anatomical humeral groove by identifying a more cranial position in relation to the longitudinal humeral axis and assessing macroscopic morphological characteristics (indentation, cartilage involvement, etc.).

Postoperative radiological assessment

Study-related radiological analysis of all patients was conducted postoperatively at 34.7 ± 11.4 months (range: 24.1–52.0 months). To determine test–retest reliability, two orthopaedic trainees re-analysed and re-evaluated preoperative CT and MRI scans. One of the trainees, who was blinded to the initial measurements, repeated measurements after 6 weeks. The intra- and inter-rater reliabilities of the respective analysed measurement methods on CT and MRI are presented in the results section. CT and MRI scans had been previously anonymised; all scans were analysed in the same period consecutively. The postoperative radiological assessment consisted of measurement of the HSL using different methods with different imaging modalities. In cases where an HSL was detected, the following measurement methods were conducted with both imaging modalities:

  • Width and depth of the HSL [12].

  • Franceschi grading [18].

  • Calandra classification [9].

  • Richards grading [37].

  • Hall grading [23].

  • Rowe grading [39].

  • Flatow percentage [17].

  • Glenoid track assessment [13].

Width and depth of the HSL

The width of the HSL was measured by drawing a line between both of its edges. The depth of the HSL was obtained by placing a virtual circle on the humeral head. The longest perpendicular line from the ground of the lesion to the surface of the circle was defined as the depth of the HSL [12].
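The two measurements described above can be sketched in a few lines. This is a minimal illustration only, assuming that the lesion edges, the lesion floor, and the best-fit circle have already been digitised as 2D coordinates in millimetres on a single transaxial slice; the function names and sample coordinates are hypothetical and not part of the study protocol.

```python
import math


def hsl_width(edge_a, edge_b):
    """Straight-line distance between the two lesion edges."""
    return math.dist(edge_a, edge_b)


def hsl_depth(centre, radius, floor_points):
    """Largest radial gap between the lesion floor and the best-fit circle.

    A line perpendicular to the circle runs along a radius, so the gap at
    each floor point is the radius minus its distance from the centre.
    """
    return max(radius - math.dist(centre, p) for p in floor_points)


# Hypothetical digitised points (mm); both edges lie on a 25 mm circle.
centre, radius = (0.0, 0.0), 25.0
edges = ((15.0, 20.0), (24.0, 7.0))
floor = [(16.0, 12.0), (20.0, 10.0)]

print(round(hsl_width(*edges), 2))
print(round(hsl_depth(centre, radius, floor), 2))
```

Sampling several floor points and taking the maximum mirrors the definition of depth as the longest perpendicular line from the base of the lesion to the circle.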

Franceschi grade

Franceschi et al. described an arthroscopic method for grading HSLs. According to this method, the HSL is classified through the posterolateral portal view into three categories: Grade I, cartilaginous; Grade II, bony scuffing; and Grade III, hatchet fracture. Although this method was originally performed arthroscopically, it has also been applied using transaxial CT and MRI scans [7, 18, 36].

Calandra classification

Following the Calandra et al. approach, the HSL was classified as follows: Grade I, lesion confined to articular cartilage; Grade II, extension into subchondral bone; and Grade III, large subchondral defect [9].

Richards grading

This measurement method was performed as recommended by Richards et al. on the transaxial view. A concentric circle was drawn on the humeral head. The centre of the circle was defined as the intersection of the diameter lines, and 0 degrees was defined as the anterior edge of the articular surface. Following this method of measurement, the size of the HSL was determined by establishing its location on the humeral head using the circle reference to define an axial frame of 360° (Fig. 1) [37].

Fig. 1

A schematic representation of the measurement method according to Richards et al. A best-fit circle is placed on the humeral head. Zero degrees is defined as the anterior edge of the articular surface. The size of the HSL is determined by establishing its position on the humeral head using the circle reference, which defines an axial frame of 360°
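The angular frame used in the Richards method can be illustrated as follows. This is a sketch under stated assumptions: points are 2D image coordinates, the zero reference is the digitised anterior edge of the articular surface, and angles increase counter-clockwise (the sense of rotation is an assumption made here, not something the method description fixes). All names are hypothetical.

```python
import math


def angle_from_reference(centre, reference, point):
    """Angle of `point` about `centre` in degrees (0 to 360),
    measured counter-clockwise from the `reference` direction."""
    ref = math.atan2(reference[1] - centre[1], reference[0] - centre[0])
    ang = math.atan2(point[1] - centre[1], point[0] - centre[0])
    return math.degrees(ang - ref) % 360.0


def lesion_extent(centre, reference, edge_start, edge_end):
    """Start angle, end angle, and angular size of the lesion in degrees."""
    start = angle_from_reference(centre, reference, edge_start)
    end = angle_from_reference(centre, reference, edge_end)
    return start, end, (end - start) % 360.0


# Hypothetical example: lesion edges a quarter turn and a half turn
# away from the anterior reference point.
print(lesion_extent((0, 0), (1, 0), (0, 1), (-1, 0)))
```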

Hall method

The Hall quotient was defined according to Hall et al. on the transaxial view. The HSL was graded according to the percent involvement of the humeral articular surface. A 180° arc was drawn on the humeral articular surface (Fig. 2). Bone loss involving the articular arc was measured. The percentage of involvement of the articular arc was calculated using the following formula: \(\left(\frac{\text{width of the articular HSL } (^\circ)}{180^\circ}\right) \times 100\) [23].

Fig. 2

The measurement method performed according to Hall et al. A 180° arc is drawn on the humeral articular surface. Bone loss involving the articular arc is determined. In this figure, the smaller arc corresponds to the affected portion of the total articular arc
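As a worked illustration of the Hall formula, the percentage follows directly from the angular width of the lesion. The function name and the input validation are assumptions added for this sketch.

```python
def hall_percentage(lesion_arc_deg, articular_arc_deg=180.0):
    """Percent involvement of the articular arc: (lesion arc / 180 degrees) * 100."""
    if not 0.0 <= lesion_arc_deg <= articular_arc_deg:
        raise ValueError("lesion arc must lie within the articular arc")
    return lesion_arc_deg / articular_arc_deg * 100.0


# A lesion spanning 45 degrees of the 180 degree articular arc.
print(hall_percentage(45.0))  # 25.0
```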

Rowe method

To apply the measurement method proposed by Rowe et al., the length and depth of the HSL were measured on the transaxial view. The length was measured from the most dorsal to the most ventral edge of the HSL. The depth was determined with a line running from the deepest point of the HSL perpendicular to the line connecting the most dorsal and most ventral edges of the HSL (Fig. 3). Based on the measured length and depth, the lesions were classified in the following manner: mild, 2.0 cm long × ≤ 0.3 cm deep; moderate, 2.0–4.0 cm long × 0.3–1.0 cm deep; and severe, 4.0 cm long × ≥ 1.0 cm deep [39].

Fig. 3

The measurement method performed according to Rowe et al. The length of the HSL is measured from the most dorsal to the most ventral edges of the bone defects (yellow line). The depth is measured from the deepest point of the HSL perpendicular to a line connecting the most dorsal and the most ventral edges of the HSL. Based on the measured length and depth, the lesions are classified according to Rowe et al.
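The geometric part of the Rowe measurement, a chord between the lesion edges and a perpendicular dropped onto it from the lesion floor, can be sketched as below. The coordinates are hypothetical values in centimetres, and the function names are illustrative. The sketch stops at the two measurements and deliberately does not encode the grading step.

```python
import math


def rowe_length(dorsal_edge, ventral_edge):
    """Length of the line from the most dorsal to the most ventral edge."""
    return math.dist(dorsal_edge, ventral_edge)


def rowe_depth(dorsal_edge, ventral_edge, floor_points):
    """Largest perpendicular distance from a floor point to the edge-to-edge line."""
    (ax, ay), (bx, by) = dorsal_edge, ventral_edge
    chord = math.dist(dorsal_edge, ventral_edge)
    if chord == 0:
        raise ValueError("lesion edges must be distinct points")
    # |cross product| / chord length gives the point-to-line distance.
    return max(
        abs((bx - ax) * (py - ay) - (by - ay) * (px - ax)) / chord
        for px, py in floor_points
    )


dorsal, ventral = (0.0, 0.0), (3.0, 0.0)
floor = [(1.0, -0.4), (2.0, -0.2)]
print(rowe_length(dorsal, ventral), round(rowe_depth(dorsal, ventral, floor), 2))
```

Unlike the circle-based depth, this depth is referenced to the straight line between the lesion edges, so the two values are not interchangeable.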

Flatow method

Following the first description by Flatow et al., this measurement method assesses the relevance of the HSL based on the compromised humeral cartilage surface in relation to the total diameter of the humeral head using a transaxial view. In this method, the diameter of the humeral head is first determined using a line parallel to the articular surface of the glenoid, without taking the HSL into account. Then, the actual diameter of the humeral head is measured using a line parallel to the first diameter, now considering the bone loss from the HSL (Fig. 4). Finally, the quotient of the two diameters is calculated. Lesions with a quotient smaller than 20% are defined as not clinically relevant [17].

Fig. 4

The measurement technique according to Flatow et al. The diameter of the humeral head is first measured without the HSL using a line parallel to the articular surface of the glenoid. Then, the diameter of the humeral head with the HSL is determined using a line parallel to the articular surface of the glenoid. Finally, the quotient of both diameters is calculated
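The Flatow calculation can be illustrated with a short sketch. The description does not state which diameter is the numerator, so the code below assumes that the reported percentage is the relative reduction of the diameter caused by the lesion; this reading, the function names, and the example diameters are assumptions for illustration only.

```python
def flatow_percentage(full_diameter_mm, actual_diameter_mm):
    """Relative reduction of the humeral head diameter in percent.

    Assumption: percentage = (full - actual) / full * 100.
    """
    if full_diameter_mm <= 0:
        raise ValueError("full diameter must be positive")
    if not 0 <= actual_diameter_mm <= full_diameter_mm:
        raise ValueError("actual diameter must lie between 0 and the full diameter")
    return (full_diameter_mm - actual_diameter_mm) / full_diameter_mm * 100.0


def is_clinically_relevant(percentage, threshold=20.0):
    """Values below the 20% threshold are treated as not clinically relevant."""
    return percentage >= threshold


pct = flatow_percentage(48.0, 36.0)
print(pct, is_clinically_relevant(pct))
```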

Glenoid track

The glenoid track method was performed as recommended by Di Giacomo et al. First, the diameter (D) of the lower glenoid and the extent of glenoid bone loss (GBL) were measured using the best-fit-circle method. Second, the glenoid track (GT) was extrapolated using the following formula: GT = (0.83 × D) − GBL. Finally, the Hill–Sachs interval (HSI) was defined as the sum of the width of the HSL and the extent of intact bone between the rotator cuff insertion and the lateral rim of the HSL. The HSL was defined as off-track if the HSI was greater than the glenoid track (HSI > GT); otherwise, it was defined as on-track [13].
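The on-track/off-track decision described above reduces to two sums and one comparison. The following sketch uses hypothetical measurements in millimetres; the function and parameter names are illustrative.

```python
def glenoid_track(diameter_mm, glenoid_bone_loss_mm):
    """GT = (0.83 * D) - GBL."""
    return 0.83 * diameter_mm - glenoid_bone_loss_mm


def hill_sachs_interval(hsl_width_mm, bone_bridge_mm):
    """HSI = lesion width + intact bone between cuff insertion and lateral lesion rim."""
    return hsl_width_mm + bone_bridge_mm


def classify_lesion(hsi_mm, gt_mm):
    """Off-track if the HSI exceeds the glenoid track, otherwise on-track."""
    return "off-track" if hsi_mm > gt_mm else "on-track"


gt = glenoid_track(diameter_mm=30.0, glenoid_bone_loss_mm=4.0)
print(classify_lesion(hill_sachs_interval(15.0, 4.0), gt))  # on-track
print(classify_lesion(hill_sachs_interval(18.0, 5.0), gt))  # off-track
```

Because the HSI includes the bone bridge next to the rotator cuff insertion, any modality-dependent difference in locating that insertion carries straight into the classification.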

Statistical analysis

Statistical analysis was conducted using SPSS (version 26, IBM Corporation, New York, USA). A Shapiro–Wilk test was performed to check the distribution of the results. To assess each measurement method’s ability to quantify the dimension of the detected HSL depending on the imaging modality used, the results of each measurement method achieved using MRI were compared with those obtained on CT. The differences between the MRI and CT measurements were calculated for interval-scaled measurements using the Wilcoxon signed-rank test and for ordinal-scaled dimensions with the sign test. To obtain the intra- and inter-rater correlations, two orthopaedic trainees repeated all evaluations and measurements; one of the trainees, who was blinded to the first measurements, took repeated measurements after 6 weeks. The intra-rater reliability was calculated for nominal-scaled measurements with the Chi-square test, for ordinal-scaled measurements with Spearman’s rank correlation coefficient, and for interval-scaled measurements with Pearson’s correlation coefficient. Inter-rater reliability was assessed with the intraclass correlation coefficient (Cronbach’s alpha). Intraclass correlation coefficients less than 0.40 were considered poor, 0.40–0.59 fair, 0.60–0.74 good, and 0.75–1.00 excellent [27]. The sample size was calculated assuming a confidence interval of 95% and an effect size of 0.3, resulting in a sample size of at least 50 patients with a power of 0.8 [21]. Differences were considered significant for p values < 0.05.
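For readers unfamiliar with the sign test applied to the paired ordinal grades, the following sketch shows one common exact, two-sided formulation using only the Python standard library. It is not the SPSS procedure used in the study; the function name and the example grades are hypothetical, and ties are dropped, which is a conventional choice assumed here.

```python
from math import comb


def sign_test_p(ct_grades, mri_grades):
    """Exact two-sided sign test for paired ordinal data (ties dropped)."""
    diffs = [c - m for c, m in zip(ct_grades, mri_grades) if c != m]
    n = len(diffs)
    if n == 0:
        return 1.0  # no informative pairs, so no evidence of a difference
    positives = sum(d > 0 for d in diffs)
    k = min(positives, n - positives)
    # Probability of k or fewer of the rarer sign under a fair coin.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical grades for the same five lesions on CT and MRI.
ct = [1, 2, 2, 3, 1]
mri = [1, 2, 3, 3, 2]
print(sign_test_p(ct, mri))  # 0.5
```

With only two non-tied pairs, even a fully one-sided pattern cannot reach p < 0.05, which illustrates why ordinal classifications with few categories have limited power to show modality differences.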

Results

Table 2 presents the results of the different methods for measuring HSLs using both MRI and CT. No significant differences between MRI and CT were found for any of these measurement methods.

Table 2 The results from the measurement methods performed with CT and MRI

Table 3 presents the results of the measurements obtained using the method according to Di Giacomo et al. To allow a precise comparison of the measurements between the CT and MRI scans, the measurements of the glenoid track and the HSI, as well as the interpretations thereof, are presented separately. No significant difference was found between the CT and MRI measurements of the glenoid track, but a significant difference was found between the two measurements of the HSI.

Table 3 The results concerning the comparison of the measurements of the glenoid track using CT and MRI

Concerning the intra- and inter-rater reliabilities for the measurements performed on CT, the Franceschi (ICC = 0.359) and Calandra (ICC = 0.361) classifications achieved poor results, while all other measurements showed good (glenoid track, α = 0.632) or excellent reliabilities.

Regarding the intra- and inter-rater reliabilities for the measurements performed on MRI, all measurement techniques, with the exceptions of the Franceschi (ICC = 0.120) and Calandra (ICC = 0.154) classifications, showed fair (glenoid track, α = 0.413) or excellent reliabilities.
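The reliability labels in the two preceding paragraphs follow from the interpretation bands given in the statistical analysis. A small sketch makes the mapping explicit; the cut-offs are implemented as half-open intervals (below 0.40, below 0.60, below 0.75), which is an assumption about how values between the published bands would be handled, and the dictionary simply restates the coefficients reported above.

```python
def icc_band(icc):
    """Map a reliability coefficient to its qualitative band."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


reported = {
    "CT Franceschi": 0.359,
    "CT Calandra": 0.361,
    "CT glenoid track": 0.632,
    "MRI Franceschi": 0.120,
    "MRI Calandra": 0.154,
    "MRI glenoid track": 0.413,
}

for name, value in reported.items():
    print(f"{name}: {value:.3f} -> {icc_band(value)}")
```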

Discussion

The most important finding of this study was that no significant differences in the diagnostic validity of the measurement of HSLs were found between the gold-standard CT and MRI, regardless of the measurement technique, with the exception of the HSI.

The diagnostic validity of CT in the detection and measurement of HSLs has seldom been investigated to date [19, 25, 29, 40]. Different measuring methods have been proposed in the literature; however, no consensus has been reached in this context. For instance, in a laboratory study, Kobali et al. investigated the use of two-dimensional (2D) CT for measuring HSLs in six anatomic bone substitute models and found a good diagnostic validity and inter-rater reliability for depth and width measurements of 0.879 and 0.721, respectively [29]. In a similar laboratory study, Ho et al. used three-dimensional (3D) CT and found inter-rater reliabilities for length, width, and the Hill–Sachs interval of 0.880, 0.975, and 0.856, respectively [25]. In a retrospective study with 35 patients, Saito et al. measured the width and depth of HSLs using 2D CT and obtained an intra-rater reliability of 0.954–0.998 [40]. Cho et al. analysed the measurements of HSLs with the fit circle method in 107 patients using 3D CT and found intra- and inter-rater reliabilities of 0.845–0.998 and 0.629–0.992, respectively [12]. In a study with 142 shoulders, Ozaki et al. compared the diagnostic validity of 3D CT in the detection of HSLs with arthroscopic findings and found 28 false-negative results [35]. The results of the present study are mostly in agreement with those of the studies mentioned above. Regarding the measuring methods, in the current study, excellent intra- and inter-rater reliabilities were also achieved for all methods, with the exceptions of the Calandra and Franceschi classifications.

Studies investigating the diagnostic validity of MRI for the measurement of HSLs are also scarce [2, 46, 47]. Gyftopoulos et al. found an intra- and inter-rater correlation for the measurement of HSLs with the on-track, off-track method with MRI of 0.86 and 0.79, respectively. In a prospective study, Stillwater et al. compared the diagnostic validity of 3D CT and MRI in terms of assessing HSLs in 12 shoulders with recurrent instability and found no significant differences between the obtained measurements [42]. In a study with 16 patients, Kirkley et al. investigated the ability of MRI to detect HSLs depending on their size, using arthroscopic findings as a reference. In this study, the authors found only moderate agreement concerning the estimation of the sizes of the HSLs [28]. The findings of the present study concerning the diagnostic validity of MRI reflect the results obtained by Gyftopoulos et al. and Stillwater et al. Regarding the analysed measurement methods, no significant difference was observed between MRI and CT, except in terms of the determination of the HSI. This finding may be attributed to the better exposure of the rotator cuff and the resulting higher measurement precision with MRI. In summary, the better accuracy of MRI in determining HSI, the avoidance of radiation exposure, and the equivalent diagnostic validity of all other methods investigated suggest that MRI is more appropriate than CT for measuring HSLs.

The current study is subject to several limitations. First, the measurements concerning the size of the HSLs taken preoperatively with the different imaging procedures were not compared with the arthroscopically estimated values. This study had a retrospective design; the surgeries were performed by two different surgeons, meaning that the arthroscopic measurements were taken in different settings and with various techniques. Therefore, a comparison with the intraoperative measurements might have caused significant bias. Accordingly, the present study could only compare the reliability of different measurement methods, but the true size of the HSL remains unknown. Therefore, the measurement methods investigated in the present study were not compared in terms of their accuracy in quantifying bone loss. The results of the present study show the reliability of the measurement methods in determining the size of the HSLs using CT or MRI, but do not indicate which of the measurement methods investigated should be preferred. Second, only a small number of patients were analysed because the exclusion criteria were defined as strictly as possible. However, the post hoc sample size calculation showed sufficient power. Third, no 3D imaging methods were included. The lack of 3D imaging may have played a role, especially due to the localisation of the HSLs. Nevertheless, no study to date has been able to show the advantages of 3D imaging procedures over traditional methods. It should also be noted that not all clinics have the ability to perform 3D imaging. Fourth, the HSL measurements were not performed by radiologists. Evaluation by experienced radiologists would certainly have further increased the reliability of the measurements. However, as measurements are always performed by orthopaedic surgeons in our clinic, the authors decided to better reflect routine clinical procedure and to have the measurements performed by orthopaedic surgeons. Furthermore, in the present study, the second measurement was performed by one examiner. This may have influenced the determination of intra-rater reliability. An additional second measurement performed by another examiner could have influenced the reliability. Finally, the contralateral healthy side was not included in the analysis. Some of the previously proposed methods for measuring HSLs are based on comparisons with the contralateral side. However, it must be kept in mind that performing CT of the contralateral side is associated with additional radiation exposure, while in the case of MRI, higher costs and logistic efforts are involved. For these reasons, in their clinical practice, the authors of the present study use only images of the affected side.

This is the first study to compare various HSL measurement methods using MRI and CT. The results of the present study may have a relevant impact on the diagnostic approach to measuring HSLs in clinical practice. In view of these results, the measurement of humeral bone loss with MRI can be performed with the same diagnostic accuracy as with CT, which has so far been considered the “gold standard”. The primary use of MRI may reduce radiation exposure and improve the detection of labral pathologies. Future studies are needed to confirm the results presented in the current study.

Conclusions

While the determination of the HSI with MRI was more accurate, all other analysed techniques for measuring the amount of humeral bone loss showed the same diagnostic precision. With regard to intra- and inter-rater reliabilities, all measurement techniques analysed, with the exception of the Franceschi and Calandra classifications, showed good to very good reliabilities with both CT and MRI.