Introduction

Patellar instability, characterised by recurrent lateral patellar dislocations, is a relatively common pathology of the knee in adolescents and young adults [1]. Patients with recurrent patellar dislocations that are refractory to conservative treatment (i.e. physiotherapy) may be considered for surgery, for which various procedures exist, including lateral release, medial patellofemoral ligament (MPFL) reconstruction, tibial tubercle (TT) transfer and trochleoplasty [2]. The surgical option(s) of choice should be tailored to the key underlying pathology contributing to the patellar instability symptoms, which is identified through thorough clinical and radiological evaluation. As part of the pre-operative assessment, the height of the patella should be assessed; patella alta (high riding patella) is a well-recognised risk factor for patellar instability [3, 4]. In patients with patella alta, the patella may not fully engage in the proximal trochlear groove of the femur, thus making it more susceptible to dislocation [4,5,6].

There are several validated indices commonly used to assess patella height including the Insall-Salvati (IS) ratio, the modified Insall-Salvati (mIS) ratio, the Caton-Deschamps (CD) index and the Blackburne-Peel (BP) index [7,8,9,10]. These indices all use bony landmarks seen on both plain radiographs (X-ray) and magnetic resonance imaging (MRI) to calculate a ratio, which when compared to a reference range will identify patella alta as well as patella baja (low riding patella). There are some more novel methods of measuring patella height such as the patellotrochlear (PT) index and the patellophyseal (PP) index [11, 12]. The PT index directly assesses chondral articulation and can therefore only be carried out on MRI. The PP index indirectly measures the chondral articulation on the assumption that the anterior femoral physeal line equates approximately with the proximal end of the trochlear cartilage. The physeal line can be seen on both imaging modalities; therefore, the PP index can be applied to both X-ray and MRI.

Due to the various strengths and weaknesses of each of the indices, there is no universally recognised ‘gold standard’ measurement for diagnosing patella alta [5, 13]. It is often suggested that the IS ratio is the most reliable assessment of patellar height [14, 15]. However, with the increasing use of MRI in the pre-operative assessment of patients with patellar instability, an investigation which looks at the direct articulation of the patellofemoral joint is often hypothesised to be more accurate in identifying causes of patellar instability (i.e. MPFL tear, trochlea dysplasia, patella alta, increased tibial tuberosity–trochlea groove distance (TT-TG), etc.) [11, 12].

The primary aim of this study was to evaluate the intra- and inter-observer reliability of six commonly used patella height indices in patients with and without patella instability symptoms. The secondary aim of this study was to compare the measurements between plain radiographs and MRI scans of the knee joint.

Materials and methods

This was a retrospective case–control study. All the patients included in this study were treated by a single fellowship trained consultant orthopaedic surgeon with a specialist interest in knee surgery at a district general hospital. This radiological study included patients who attended a specialist knee clinic and underwent arthroscopic knee surgery following clinical assessment and radiological investigation. This study was exempt from Institutional Review Board (IRB)/Ethics Committee approval as it was a pragmatic study evaluating the existing clinical practice of the senior author (consultant surgeon).

Patient cohort

Theatre records were used to identify patients who underwent an MPFL reconstruction, TT transfer or trochleoplasty between January 2016 and November 2018 for recurrent patella instability that was confirmed on clinical and radiological assessment and was refractory to first-line conservative treatment (i.e. physiotherapy); these patients were assigned to the patella group. Patients who did not have any patella instability symptoms but had other knee pathology (e.g. meniscal tears, loose bodies), and who also underwent clinical and radiological assessment followed by arthroscopic knee surgery in the same time period, were assigned to the control group. The patient demographics (age, gender, BMI and laterality of symptomatic knee) were taken from the case notes of the initial outpatient consultation.

Case notes for both groups were reviewed, ensuring all patients in the patella group had a confirmed diagnosis of recurrent patellar instability and those in the control group did not have a past medical history of patellar instability/dislocation.

Exclusion criteria consisted of previous patella surgery to the index knee and lack of pre-operative lateral knee X-ray and MRI scan of the same knee. No patients in the patella group had undergone any previous procedures to the patella in the index knee. The control group included 9 patients who had previously undergone arthroscopic meniscectomy of the knee, which was acceptable as this procedure would not have affected patella height.

Imaging

The Picture Archiving and Communication System (PACS) (Centricity version 6, GE Healthcare, Chicago) was utilised for all the radiological data collection. Plain X-rays were performed using a Siemens Ysio Fluorospot with weight bearing antero-posterior (AP), lateral and skyline views. The majority of the lateral films were performed in the turned/decubitus position on the bed; however, where patients were in too much pain (i.e. from acute trauma), the lateral films were performed supine on the bed with a small pad under the knee to add some flexion. There was no pre-determined angle of knee flexion aimed for by the radiographer. All standard views have a source-image distance (SID) of 100–115 cm. MRI images were obtained using a 1.5-T GE Healthcare SIGNA Artist MRI scanner. The MRI sequences were obtained as per standard knee protocols used by the radiology department which included sagittal, coronal and axial proton density (PD) fat-saturated sequences alongside a sagittal T1-weighted sequence. The sagittal T1 slice with the greatest patella length was then selected to measure the patellar height. The parameters for sagittal T1 images were slice thickness: 3.5 mm; slice gap: 0.5 mm; matrix: frequency 384/phase 256; TR: 650; TE: 8; echo train length: 3; number of excitations (NEX): 2; FOV: 15 × 15 and Bandwidth: 35.71. The slice image number was recorded to ensure that the same image was used throughout the data collection.

Image analysis

Two observers measured all the patella height indices: one observer was a Clinical Research Fellow (Senior House Officer level) and the other was a Fourth-Year Medical Student, both of whom were allocated to the department of Trauma and Orthopaedic surgery in the host institution of this study. The 6 patella height indices (illustrated in Figs. 1 and 2) that were analysed included:

  • Insall-Salvati (IS) ratio [7]: length of patellar tendon divided by maximum length of the patella.

  • Modified Insall-Salvati (mIS) ratio [8]: distance from insertion of patellar tendon on tibial tuberosity to the inferior edge of the patellar articular surface divided by the length of patellar articular surface.

  • Caton-Deschamps (CD) index [9]: distance from anterior edge of tibial plateau to the inferior edge of the patellar articular surface divided by the length of patellar articular surface.

  • Blackburne-Peel (BP) index [10]: the perpendicular distance from the inferior edge of the patellar articular surface to a line drawn along the tibial plateau divided by the length of patellar articular surface.

  • Patellophyseal (PP) index [12]: height of patella above anterior physeal line divided by the length of patella articular surface.

  • Patellotrochlear (PT) index [11]: length of trochlear cartilage overlapping patellar cartilage divided by the length of patella articular surface.
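The six definitions above are all simple ratios of distances measured on one lateral image. As a minimal sketch (not the measurement software used in this study), they can be written out as follows; the class name, field names and function name are hypothetical and chosen only for illustration, and all distances are assumed to be in the same unit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KneeMeasurements:
    """Distances from one lateral image, all in the same unit (e.g. mm).

    Field names are illustrative only; they are not taken from the study.
    """
    patellar_tendon_length: float         # IS numerator
    patella_max_length: float             # IS denominator
    articular_surface_length: float       # denominator of mIS, CD, BP, PP, PT
    tuberosity_to_articular_edge: float   # mIS numerator
    plateau_to_articular_edge: float      # CD numerator
    perpendicular_to_plateau_line: float  # BP numerator
    height_above_physeal_line: float      # PP numerator
    trochlear_overlap_length: Optional[float] = None  # PT numerator, MRI only


def patella_height_indices(m: KneeMeasurements) -> dict:
    """Return each index as numerator / denominator, per the definitions above."""
    if m.patella_max_length <= 0 or m.articular_surface_length <= 0:
        raise ValueError("patella lengths must be positive")
    art = m.articular_surface_length
    indices = {
        "IS": m.patellar_tendon_length / m.patella_max_length,
        "mIS": m.tuberosity_to_articular_edge / art,
        "CD": m.plateau_to_articular_edge / art,
        "BP": m.perpendicular_to_plateau_line / art,
        "PP": m.height_above_physeal_line / art,
    }
    # The PT index needs cartilage landmarks, so it is only added when an
    # MRI-derived overlap length has been supplied.
    if m.trochlear_overlap_length is not None:
        indices["PT"] = m.trochlear_overlap_length / art
    return indices
```

Leaving `trochlear_overlap_length` unset mirrors an X-ray measurement, where the PT index cannot be evaluated.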

Fig. 1 Patella height indices: X-ray

Fig. 2 Patella height indices: magnetic resonance imaging

The observers measured all indices on both lateral X-rays and sagittal T1 MRI scans, with the exception of the PT index which can only be evaluated on MRI. The indices were measured as described in their original publications [7,8,9,10,11], with the exception of the PP index which was carried out as described by Ali et al. [12]; this description of the PP index was used because no prior detailed instructions on how to measure this index could be found in the literature. A handbook was created using the images in Figs. 1 and 2 as a point of reference for the observers to ensure consistency. Measurements were obtained on PACS by each observer at two separate time points: the first during December 2019 and the second, 8 weeks later, in February 2020. During each data collection time period, the observers were blinded to each other’s results and their own previous measurements. During the first round of data collection, the angle of knee flexion on both X-rays and MRI scans was measured on PACS by drawing lines along the anterior cortex of the femur and tibia, avoiding both the femoral condyles and tibial tuberosity.
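The knee flexion angle described above is the angle between two drawn lines. How PACS computes this internally is not stated here, so the following is only a sketch of the underlying geometry, assuming each line is defined by two (x, y) points in image coordinates; the function and parameter names are hypothetical.

```python
import math


def knee_flexion_angle_deg(femur_proximal, femur_distal,
                           tibia_proximal, tibia_distal):
    """Unsigned angle in degrees between the femoral and tibial cortical lines.

    Each argument is an (x, y) point. Both lines are taken to run from
    proximal to distal, so collinear lines (full extension) give 0 degrees.
    The result is unsigned, so hyperextension is not distinguished from flexion.
    """
    fx = femur_distal[0] - femur_proximal[0]
    fy = femur_distal[1] - femur_proximal[1]
    tx = tibia_distal[0] - tibia_proximal[0]
    ty = tibia_distal[1] - tibia_proximal[1]
    if (fx == 0 and fy == 0) or (tx == 0 and ty == 0):
        raise ValueError("each line needs two distinct points")
    cross = fx * ty - fy * tx  # proportional to the sine of the angle
    dot = fx * tx + fy * ty    # proportional to the cosine of the angle
    return math.degrees(math.atan2(abs(cross), dot))
```

Using `atan2` on the cross and dot products avoids the loss of precision that `acos` suffers near 0 degrees, which matters for the near-extended knee positions seen on MRI.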

Statistical analysis

The power calculation for this study was derived from the findings of a related previously published radiological study [16]. The sample sizes were based on a conventional type I error of 5% and a type II error rate of 10% (i.e. 90% power). The calculation revealed that a sample size of approximately 32 subjects per group was required for a clinically relevant patella height ratio between group mean difference of 1.2, based on a within-group standard deviation of 0.20. Plotted histograms with fitted curve lines, boxplots, normal Q-Q plots and the Kolmogorov–Smirnov statistic were used to confirm that a normal distribution was an appropriate assumption for all the continuous variables in the study. The independent-sample Student’s t-test was used for the between-group statistical analyses and the paired Student’s t-test was used for the within-group analyses. The inter-observer reliability and the intra-observer reliability of all the patella height ratios (continuous data) were determined using intraclass correlation coefficients (ICC). The ICC analysis was based on a consistency type two-way mixed model. The ICC results were further interpreted and categorised on the basis of the values proposed by Shrout and Fleiss [17] with a score of 0–0.4 indicating poor reliability, a score of 0.4–0.75 indicating moderate reliability and a score of more than 0.75 indicating excellent reliability. All of the patella height ratios were further classified into ‘patella baja’, ‘normal patellar height’ and ‘patella alta’ according to the commonly accepted [5] cutoff points (Figs. 1 and 2). It should be noted that there is no reference range for ‘patella baja’ when using the mIS ratio, as it was not designed to detect ‘patella baja’ but rather to detect ‘patella alta’ in individuals with an elongated patella [8]. Therefore, this index was only placed into the ‘normal patellar height’ or ‘patella alta’ categories. 
The inter-observer reliability and the intra-observer reliability of all the patella height classifications (ordinal categorical variables) were determined using the kappa measure of agreement. The kappa results were further graded based on the strengths of agreement recommended by Landis and Koch [18] with a kappa statistic < 0.00 considered poor agreement, 0.00–0.20 slight agreement, 0.21–0.40 fair agreement, 0.41–0.60 moderate agreement, 0.61–0.80 substantial agreement and 0.81–1.00 almost perfect agreement. The level of statistical significance was set at p < 0.05. Statistical analysis was performed using SPSS for Windows version 26.0 (IBM Corp., Armonk, New York). The power calculation was performed using Minitab statistical software version 17 (Minitab LLC, State College, PA).
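The reliability analysis above was run in SPSS. Purely as an illustration of the arithmetic, the sketch below computes a consistency-type two-way mixed ICC and applies the two grading schemes quoted above. It assumes the single-measures form, ICC(3,1), which is not specified in the text, and it assigns the shared boundary values (0.4 and 0.75) to the 'moderate' band; both choices, and all function names, are assumptions.

```python
def icc_consistency_single(ratings):
    """ICC(3,1): two-way mixed model, consistency type, single measures.

    ratings: one row per subject, one column per observer (or per time
    point, for intra-observer reliability).
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least 2 subjects")
    k = len(ratings[0])
    if k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 columns and rows of equal length")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Between-subject sum of squares and residual sum of squares.
    ss_rows = k * sum((rm - grand) ** 2 for rm in row_means)
    ss_error = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n) for j in range(k)
    )
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denominator = ms_rows + (k - 1) * ms_error
    if denominator == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_rows - ms_error) / denominator


def grade_icc(icc):
    """Bands as quoted in the text from Shrout and Fleiss [17]."""
    if icc < 0.4:
        return "poor"
    if icc <= 0.75:
        return "moderate"
    return "excellent"


def grade_kappa(kappa):
    """Bands as quoted in the text from Landis and Koch [18]."""
    if kappa < 0.0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"
```

Because the consistency type ignores systematic offsets between columns, two observers whose ratios differ by a constant amount still score an ICC of 1; an absolute-agreement ICC would penalise that offset.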

Results

There were 44 patients included in the patella group and 50 patients included in the control group. The demographics for both groups are shown in Table 1. The mean age of subjects in the patella group was 19 years younger than that of the control group. There was a higher proportion of females in the patella group. Due to the difference in gender ratios between the two groups, further statistical analysis (independent-sample Student’s t-test) of patellar height in men compared to women was carried out; this showed no statistically significant difference (p > 0.05) in any of the indices in either of the two study groups on both X-rays and MRI scans. All other demographic variables (i.e. laterality, BMI, etc.) were comparable between the two groups.

Table 1 Demographics of subjects

The mean knee flexion angle measured on both X-ray and MRI is shown in Table 2. In both groups, the mean knee flexion angle (and range of angle measurements) is significantly smaller when measured on MRI. The difference in knee flexion angle between the patella group and the control group was statistically significant in the X-ray images (p = 0.011) but not in the MRI images (p = 0.768).

Table 2 Knee flexion angle

Table 3 demonstrates the between-group statistical analysis of the patellar height ratios as measured by both observers during time 1 (data collection in December 2019) and time 2 (data collection in February 2020). The PP index is significantly different between the patella group and the control group on all occasions except for observer A during time 2 on X-ray (p = 0.192); this is likely to be a type II statistical error. Conversely, the PT index only showed a statistically significant difference between the groups on one occasion. This was for observer A during time 1 (p = 0.042), which likely represents a type I statistical error. All other analyses showed a significant between-group difference.

Table 3 Between-group analysis

Table 4 compares the patella height ratios between the two imaging modalities; the PT index is not included as it was only measured on MRI. Both the IS and PP indices had significantly (p < 0.001) higher mean ratios on MRI when compared with X-ray on all occasions. The difference in mIS ratio was not statistically significant on all but two occasions: in the control group, the results differed significantly between the imaging modalities when measured by observer B in both rounds of data collection. The difference in CD index was statistically significant on all but one occasion: there was no significant difference in the first round of data collection in the patella group when measured by observer A. For the BP index, there was a significant difference between MRI and X-ray on all but two occasions.

Table 4 Within-group analysis

Table 5 displays the intra-observer reliability of both groups for both imaging modalities. The intra-observer ICC of the IS ratio was graded as excellent on all but one occasion, when observer A measured the control group on MRI. The intra-observer reliability of mIS and CD is more consistent on X-ray than on MRI; for both indices, intra-observer reliability ranges from poor to excellent on MRI. The BP ratio measurements are more reproducible on X-ray compared with MRI. However, the intra-observer reliability is at least moderate on all occasions. The PP index performed better overall on MRI than on X-ray. The PT index performed better in the patella group than in the control group for both observers.

Table 5 Intra-observer reliability

Table 6 shows the inter-observer reliability at both time periods of data collection for X-ray and MRI. The ICC of the IS ratio was graded excellent on all occasions. The results of the mIS, CD, BP and PP indices were comparable for both imaging modalities. The PT index again performed better in the patella group than in the control group at both time periods.

Table 6 Inter-observer reliability

Table 7 shows a comparison of the mean patella height ratio on X-ray and MRI for 5 of the 6 indices used. Patients were included in this comparison if the X-ray was taken between 20 and 40 degrees of flexion (n = 21) so as to best approximate the original measurement technique descriptions which were performed on a lateral X-ray at 30 degrees flexion. Patients were then allocated into grades depending on the measurement on X-ray for the particular index being assessed. The patients were graded as per the X-ray measurements as the normal ranges are well validated on X-ray compared to more novel ranges proposed on MRI.

Table 7 Comparison between X-ray and MRI (X-ray knee flexion range 20 to 40°)

Table 8 shows proposed normal ranges for X-ray and MRI based on the findings shown in Table 7. The ranges for all grades using the IS ratio have been increased by 0.1 as this was the consistent difference between X-ray and MRI across all grades. When utilising the mIS ratio, the measured ratio was lower on MRI for ‘patella alta’ patients and higher on MRI for ‘normal patella’ patients. The mean difference across all grades is 0.02; this value is not clinically significant, so no changes were made to the normal ranges. The ranges for the CD ratio have only been adjusted for the ‘patella alta’ group; it was increased by 0.1. The ‘patella baja’ value has not been altered; there were no patients in this sample graded as ‘patella baja’ using CD, and as the results were not consistent across the other grades, the results cannot be extrapolated to include the ‘patella baja’ grade. The ranges for the BP ratio have been altered in a similar fashion to the IS ratio, by increasing all values by 0.1, as this was the consistent difference across all grades.

Table 8 Proposed normal ranges

Discussion

This study aimed to ascertain the reliability of the commonly used measurements of patellar height in patients with patella instability and the difference in measurement between X-ray and MRI. There was a difference between measurements obtained from X-rays as compared to those from MRI scans. Overall, the Insall-Salvati ratio was found to be the most reliable index.

The ICC of both the intra- and inter-observer reliability was slightly higher on X-ray for the IS ratio; this is in keeping with other studies [14, 16]. The differences in reliability between MRI and X-ray are minimal. The ICC is graded as excellent on all occasions on X-ray compared to all but one occasion on MRI, where it is graded as moderate. The reliability based on the kappa grading system (k) on MRI was better than that on X-ray, particularly in the patella group. The k grades are fairly similar between MRI and X-ray in the control group. This is important clinically as it shows that the classification of ‘normal’, ‘patella alta’ and ‘patella baja’ is reproducible in patients without patella instability especially if the IS ratio is measured on MRI.

In clinical practice, MRI is taken in a more consistent anatomical position with less variation in flexion angles due to the patient always lying supine within a relatively constrained knee coil in the scanner; this is substantiated by Table 2 of the present study. It is likely that having the leg in extension resulted in a higher IS ratio, causing more patients with patellar instability to be classified as ‘patella alta’. Becher et al. [19] suggested that having the leg in extension when measuring the IS ratio results in a significant rise in the ratio in patients with patella instability as compared to a control group. This phenomenon is confirmed in Table 4 of the present study. We can see that the mean value of the IS ratio was significantly higher on MRI than on X-ray, almost 1 standard deviation higher than the cutoff value for patella alta. This has clinical implications as more patients in the patella group were graded as having patella alta when using the IS ratio as compared to the other indices.

A recent study by Ye et al. [20] suggests that all measurements for patellar height on MRI showed adequate intra- and inter-observer reliability. The present study would contradict this as there was variability in the reliability of the indices assessed.

The mIS ratio performed poorly overall in terms of intra- and inter-observer reliability. The ICC grade was moderate or lower on most assessments. Similarly, the k grading was fair or lower on most occasions. The mean value of the mIS ratio in the patella group was approximately 1 standard deviation lower than the cutoff value for patella alta, meaning that when using mIS, the majority of patients in the patella group were categorised as having a normal patella.

The BP ratio appears to have comparable inter- and intra-observer reliability to the IS ratio when measured on X-ray, both in terms of ICC grading and k value; however, when BP is measured on MRI, this drops to an average rating of moderate. The observers often found that the measurement line along the tibial plateau was difficult to draw when viewing the slice with the maximum patella length; this may be due to this slice lying in the same plane as the insertion of cruciate ligaments, meaning there was not a smooth surface to measure along. In clinical practice, we would only recommend using this index on X-ray. Similarly, the CD ratio had better inter- and intra-observer reliability on X-ray as compared to MRI. The observers often found it difficult to identify the apex of the tibial plateau on MRI, making the CD ratio more subjective.

The observers also found that the PP index was often hard to measure on X-ray due to the differences in rotation and flexion. The observers found it easier to measure on MRI; this is supported by better intra- and inter-observer reliability on MRI. This is in keeping with the results of Ali et al. [12] who measured this index. Although the ICC values for both intra- and inter-observer reliability were similar to that of the IS ratio, the k gradings were better for IS. As this is quite a novel method of measuring patellar height, further investigation is needed into the normal ranges. For this reason, we feel the IS ratio is preferable to the PP index.

The PT index was the only variable used exclusively on MRI. The ICC for intra- and inter-observer reliability ranges from moderate to excellent. Conversely, when we categorise the results and apply the kappa measurement of agreement, the intra- and inter-observer reliability is more variable, ranging from poor to substantial. As this is a comparably novel index, more research needs to be carried out to determine the optimum cutoff values for ‘patella alta’ and ‘patella baja’. The observers found that due to the modest number of measures obtained for this particular index (i.e. MRI only), a small variation in measurements could result in a disproportionate change in the ratio. In this study, sagittal T1-weighted MRI images were used, but if sagittal PD fat-saturated images were used instead, the articular cartilage would have appeared clearer, theoretically making it easier to measure. A future study comparing the MRI measurements on T1-weighted and PD fat-saturated sagittal images may be beneficial as the PT index may be more reliable on the latter.

It is recognised that the knee flexion angle influences patella height. The closer the knee is to extension, the higher the patella; a more flexed position leads to a lower patella height. In this study, X-ray images had a large variation of knee flexion angles whereas in the MRI scans, the knee position had much smaller variation but a more extended position. Therefore, MRI scans can potentially overdiagnose patella alta as the knee is always in a more extended position, and so it is important to create a new set of normative values for patella height measurements performed on MRI scans. Several studies have considered the adjustments required to the traditional normal ranges on X-ray in order to apply these indices to MRI. Lee et al. [21] have recommended corrections to the cutoff points for ‘patella alta’ and ‘patella baja’ when using both the IS and BP ratios in the order of 0.13 and 0.09 units, respectively. Yue et al. [16] have suggested that IS, mIS, CD and BP all require an adjustment of roughly 0.1 units to the cutoff values for these four ratios.

This study is the first, to our knowledge, to set out specific ranges for MRI for IS, mIS, CD and BP across all grades of patella height. The adjustments to both IS and BP are similar to the two studies mentioned previously [16, 21]; all cutoff values are raised by 0.1 units across all grades on MRI. However, this study cannot recommend a change to all grades using the CD index. The cutoff value for ‘patella alta’ can be raised by 0.1 on MRI but there is not enough evidence to do the same for ‘patella baja’. A change from 0.06 to 0.16 would be too large without investigating a large group of individuals with ‘patella baja’. This study recommends no changes to the mIS ranges on MRI. In patients graded as ‘patella alta’ on X-ray, the mIS is smaller on MRI. Conversely, in patients graded as ‘normal’ on X-ray, the mIS is higher on MRI. When considering the average across both groups, there is a negligible adjustment factor of 0.03 calculated. A larger sized study would be useful to investigate this further.

For all the indices measured on MRI, the observers used a single standardised image slice; this is a validated method as described in similar studies [11, 12, 16, 21, 22]. As this slice was selected based on the maximum length of the patella, this may not correlate with other landmarks such as the apex of tibial plateau or the tibial tuberosity. Occasionally, this sagittal image may not have been taken perfectly in plane, resulting in a slightly oblique image. The natural rotational alignment of the tibia relative to the femur within the native knee joint could also bear an influence in this regard. This may also cause the above problems of identifying landmarks on a single sagittal image slice and could yield higher measurements, particularly when measuring patellar tendon length, which may also explain the higher ratios observed on MRI.

The two observers in this study have no formal Radiology training; this may have impacted on the interpretation of MRI images, particularly the more skilled measurements such as the PP and PT indices. If this study were repeated with Consultant Radiologists as observers, we may find different results. Ye et al. [20] carried out a similar study using Musculoskeletal Radiologists; the ICC values for inter- and intra-observer reliability were higher than our values for all indices. In particular, the reliability of the PT index on MRI in our control group was much lower than that of the measurements made by the Consultant Radiologists. However, it is worth noting that the two studies are not identical in design, as Ye et al. did not separate patients into a control group and a patella instability group. Furthermore, the results of the current study are more generalisable to a larger audience as opposed to a smaller cohort of specialist radiologists alone.

The limitations of this retrospective study primarily relate to the image acquisition. As the imaging had been performed as part of the routine clinical diagnostic process rather than to specifically measure patellar height, the images were occasionally suboptimal. Ideally, all X-rays and MRIs would have been taken at 30 degrees of flexion. Furthermore, all X-rays would have been true laterals (femoral condyles directly superimposed upon one another) and all MRI images would have been true sagittal images. A prospective study where the ratios are measured on images with more consistent knee flexion angles would be beneficial to mitigate these potentially confounding factors. Another area for future research is the establishment of a validated reference range for ‘patella baja’ when using the mIS ratio, as it currently only includes values for ‘normal patellar height’ and ‘patella alta’.

Conclusion

This study recommends the use of the Insall-Salvati ratio on MRI to radiologically evaluate patella height. The IS ratio was the most consistently reliable across both patient groups and imaging modalities. In addition, the IS ratio measured on MRI more accurately identified patients with patella alta. Patella height measurements significantly differed between X-rays and MRI scans. This may imply that a different set of normative values is required for each radiological modality, which we have proposed in this study.