Introduction and rationale

Measurement of patellar height is important in evaluating knee conditions and can be important in planning treatment [1–4]. Patellar height can be assessed on radiographs using numerous methods and corresponding ratios [5–8]; however, none is regarded as the gold standard [5]. These methods relate the position of the patella either to the femur (direct assessment) or to the tibia (indirect assessment). The direct assessment methods are not widely applied because they proved too complex to use [5]. The indirect assessment methods, such as the Insall-Salvati (IS) [9], modified Insall-Salvati (MIS) [10], Blackburne-Peel (BP) [11] and Caton-Deschamps (CD) [12] ratios, are the most widely applied radiographic assessment methods.

With respect to the widely used indirect assessment methods, there is no consensus about which method should be routinely used to assess patellar height. Few studies have analysed and compared the inter-observer or intra-observer reliability of the most widely applied patellar height ratios [6–8, 13–17]. One of these studies included patients with total knee arthroplasty [15], one study [14] included skeletally immature patients, and in one study [16] it was not clear whether the entire study cohort was skeletally mature. Both knee arthroplasty and skeletal immaturity can affect the landmarks used by the ratios. Furthermore, one of the studies [13] did not report whether a digital system was used for evaluating the radiographs, which is customary in current clinical practice. In two studies [15, 17] it was unclear whether the measurements were performed using a digital system. The status of the assessors (e.g., orthopaedic surgeon, radiologist or resident) was also unclear in half of the studies [7, 13, 14, 17]. In total, two studies [7, 8] reported the reliability of patellar height ratios for mature native knees, using a digital system to evaluate routine radiographs. Of these, only one study [7] compared all four ratios, but its authors proposed using their newly introduced direct method to measure patellar height. None of these studies reported the inter-observer and intra-observer reliability of classifying the four patellar height ratios into ‘patella baja’, ‘normal patellar height’ and ‘patella alta’.

The objective of this study was to evaluate the inter-observer and intra-observer reliability of four patellar height ratios (IS, MIS, BP and CD) using a digital system for evaluating standard radiographs of mature native knees. The secondary objective was to evaluate the inter-observer and intra-observer reliability of classifying the four patellar height ratios into ‘patella baja’, ‘normal patellar height’ and ‘patella alta’.

Material and methods

A retrospective study was conducted. Institutional review board approval was obtained. Given the retrospective design, informed patient consent was not deemed necessary.

Patient selection

We collected radiographs from patients aged 15–40 years who were registered in the hospital information system as having visited our outpatient clinic for patellofemoral pain syndrome between January 2011 and July 2012. Inclusion criteria were: availability of a standard weight-bearing lateral knee radiograph in at least 30° of flexion, a closed physis, and availability of an MRI report, compiled within one year before or after the radiograph, that excluded knee deformities, meniscus injury, anterior cruciate ligament injury, collateral ligament injury or other traumatic injury. Technically poor radiographs, i.e., those with obscured landmarks, were excluded.

Image acquisition and assessment

Standard weight bearing lateral knee radiographs in approximately 30° flexion were used to measure the patellar height ratios. A goniometer was not used when obtaining standard weight bearing lateral knee radiographs.

The patellar height was assessed in millimetres on the lateral knee radiographs using the built-in ruler of our picture archiving and communication system (IMPAX 6.4, Agfa HealthCare N.V.). The diagnostic images were independently interpreted in an identical sequence by four experts (two radiologists specialized in the musculoskeletal system and two orthopaedic surgeons specialized in knee surgery) who were blinded to the patients’ data and the other examiners’ results. The measurements were performed independently on two occasions separated by an interval of at least six weeks. The sequence of the radiographs was different for the second assessment.

Figure 1 shows the four methods for assessing patellar height [9–12]. Patellar length (A) and patellar tendon length (B) were measured to obtain the IS ratio (B/A). For the MIS ratio (D/C), the length of the patellar articular surface (C) and the distance from the inferior edge of the patellar articular surface to the insertion of the patellar tendon (D) were assessed. The length of the patellar articular surface (E) and the perpendicular distance from the inferior edge of the patellar articular surface to the line from the tibial plateau surface (F) were used for the BP ratio (F/E). The length of the patellar articular surface (G) and the distance from the inferior edge of the patellar articular surface to the anterosuperior angle of the tibial plateau surface (H) were measured for the CD ratio (H/G). Depending on the ratio used, patellar height was classified into ‘patella baja’, ‘normal patellar height’ and ‘patella alta’ according to the original [9–12] cutoff points (Table 1).

Fig. 1 From left to right: Insall-Salvati (b/a), modified Insall-Salvati (d/c), Blackburne-Peel (f/e), Caton-Deschamps (h/g)

Table 1 The cutoff points of the four patellar height ratios
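The ratio definitions and the three-way classification described above can be sketched in a few lines of Python. This is an illustration only, not the procedure used in the study (measurements were taken with the PACS ruler). The class and function names are hypothetical, the example distances are invented, and the cutoff values passed to `classify` are placeholders that stand in for the values of Table 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LateralKneeMeasurements:
    """Distances in millimetres, named after segments A-H in Fig. 1."""
    a: float  # patellar length
    b: float  # patellar tendon length
    c: float  # patellar articular surface length (MIS)
    d: float  # inferior articular edge to patellar tendon insertion
    e: float  # patellar articular surface length (BP)
    f: float  # perpendicular distance to the tibial plateau line
    g: float  # patellar articular surface length (CD)
    h: float  # inferior articular edge to anterosuperior tibial plateau angle


def patellar_height_ratios(m: LateralKneeMeasurements) -> dict:
    """Return the four ratios as defined in the text."""
    return {"IS": m.b / m.a, "MIS": m.d / m.c, "BP": m.f / m.e, "CD": m.h / m.g}


def classify(ratio: float, lower: Optional[float], upper: float) -> str:
    """Classify a ratio using caller-supplied, method-specific cutoffs.

    Pass lower=None for a method that defines no 'patella baja' cutoff.
    """
    if lower is not None and ratio < lower:
        return "patella baja"
    if ratio > upper:
        return "patella alta"
    return "normal patellar height"


# Invented measurements, for illustration only.
example = LateralKneeMeasurements(
    a=44.0, b=48.4, c=30.0, d=51.0, e=30.0, f=27.0, g=30.0, h=33.0
)
ratios = patellar_height_ratios(example)

# Placeholder cutoffs: substitute the method-specific values from Table 1.
is_class = classify(ratios["IS"], lower=0.8, upper=1.2)
print(round(ratios["IS"], 2), is_class)
```

Keeping the cutoffs as arguments rather than constants makes explicit that each ratio has its own reference range.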

Statistical analysis

Descriptive statistics were calculated according to standard methods, including frequencies, means, medians, standard deviations and ranges when appropriate.

The inter-observer reliability and the intra-observer reliability of the four ratios were determined using intraclass correlation coefficients (ICC) for single measures [18]. ICCs and 95 % confidence intervals were based on a two-way random model utilizing absolute agreement using IBM SPSS Statistics 20.0 (IBM Corporation, Armonk, NY, USA). Inter-observer reliability was calculated with use of the data from the first acquisition session only. Intra-observer reliability was calculated with use of the data from both acquisitions for each observer. The individual ICCs were averaged subsequently. Scores were interpreted on the basis of the values suggested by Shrout and Fleiss [18] with a score of 0–0.4 indicating poor reliability, 0.4–0.75 indicating moderate reliability and a score of more than 0.75 indicating excellent reliability.
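For readers who want to see what the single-measures, two-way random, absolute-agreement ICC computes, the following is a minimal sketch in plain Python based on the usual two-way ANOVA mean squares. It is not the SPSS routine used in this study and it omits the confidence intervals. The function names are hypothetical, the small data set is illustrative and unrelated to the study data, and the handling of values lying exactly on 0.4 or 0.75 is an assumption, since the text does not specify it.

```python
def icc_2_1(scores):
    """Single-measures ICC, two-way random model, absolute agreement.

    scores[i][j] is the value observer j assigned to subject i.
    """
    n = len(scores)      # subjects (radiographs)
    k = len(scores[0])   # observers
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares of the two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def interpret_icc(icc):
    """Map an ICC to the bands quoted in the text (boundary handling assumed)."""
    if icc < 0.4:
        return "poor"
    if icc <= 0.75:
        return "moderate"
    return "excellent"


# Illustrative data: six subjects rated by four observers.
demo = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
demo_icc = icc_2_1(demo)
print(round(demo_icc, 2), interpret_icc(demo_icc))
```

Because the between-observer mean square enters the denominator, systematic differences between observers lower this coefficient, which is what "absolute agreement" means here.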

For each of the four ratios, Fleiss’ kappa values and confidence intervals were calculated for the classification into ‘patella baja’, ‘normal patellar height’ or ‘patella alta’. Fleiss’ kappa and confidence intervals were calculated using http://www.statstodo.com (Fleiss's kappa from rating scores). Inter-observer reliability was calculated using the data from the first acquisition session only. Intra-observer reliability was calculated using the data from both acquisitions for each observer. The individual Fleiss’ kappa values were subsequently averaged. A kappa of <0.2 is considered poor agreement, 0.21–0.4 fair agreement, 0.41–0.6 moderate agreement, 0.61–0.8 strong agreement and more than 0.8 near complete agreement [19].
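The agreement statistic for the categorical classification can be illustrated with a minimal sketch of Fleiss' kappa in plain Python. This is not the online calculator used in this study and it does not compute confidence intervals. The function names and the example ratings are hypothetical, and the handling of values lying exactly on a band boundary is an assumption.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa; ratings[i] lists the category labels given to subject i.

    Every subject must receive the same number of ratings. The result is
    undefined (division by zero) if all ratings fall in a single category.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if any(len(r) != n_raters for r in ratings):
        raise ValueError("every subject needs the same number of ratings")

    counts = [Counter(r) for r in ratings]
    categories = set().union(*ratings)

    # Mean observed agreement across subjects.
    p_bar = sum(
        sum(c * (c - 1) for c in cnt.values()) / (n_raters * (n_raters - 1))
        for cnt in counts
    ) / n_subjects

    # Agreement expected by chance from the overall category proportions.
    total = n_subjects * n_raters
    p_e = sum(
        (sum(cnt[cat] for cnt in counts) / total) ** 2 for cat in categories
    )

    return (p_bar - p_e) / (1 - p_e)


def interpret_kappa(kappa):
    """Map kappa to the bands quoted in the text (boundary handling assumed)."""
    for upper, label in [(0.2, "poor"), (0.4, "fair"),
                         (0.6, "moderate"), (0.8, "strong")]:
        if kappa <= upper:
            return label
    return "near complete"


# Invented example: three radiographs, each classified by four observers.
demo_ratings = [
    ["normal", "normal", "normal", "normal"],
    ["alta", "alta", "alta", "normal"],
    ["baja", "baja", "normal", "normal"],
]
demo_kappa = fleiss_kappa(demo_ratings)
print(round(demo_kappa, 3), interpret_kappa(demo_kappa))
```

For the intra-observer case, the same function can be applied per observer by treating the two assessment sessions as the two "raters" of each radiograph.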

Results

Between January 2011 and July 2012, 269 patients visited our outpatient clinic for patellofemoral pain syndrome. Of these 269 patients, 45 were included in this study. Mean age at the time of the radiograph was 24 (SD 8) years and 73 % were female. When the measurements were averaged over all observers and both assessment moments (360 measurements), the average ratios were 1.1 (SD 0.2) for the IS, 1.7 (SD 0.2) for the MIS, 0.9 (SD 0.2) for the BP and 1.1 (SD 0.2) for the CD. All average ratios indicate normal patellar height.

The inter-observer reliability of the first measurement for the four ratios is presented in Table 2. The IS showed excellent reliability, the other ratios showed moderate reliability. When the measurements were categorized in “patella baja”, “normal patellar height” and “patella alta”, the IS showed a strong inter-observer reliability, the MIS and BP had moderate inter-observer reliability, and the CD showed poor inter-observer reliability.

Table 2 The inter-observer reliability of patellar height ratios

The intra-observer reliability for the four ratios is presented in Table 3. The IS, MIS and CD had excellent reliability, and the BP had strong reliability. When the ratio values were categorized, the IS and MIS had strong intra-observer reliability, while the BP and the CD showed moderate intra-observer reliability.

Table 3 The intra-observer reliability of patellar height ratios

Discussion

In this study the IS showed excellent reliability and had the best reliability of all the ratios. The MIS showed moderate to excellent reliability and had the second best reliability, except for the inter-observer ICC, where it had the third best score. There was no evident difference between the intra-observer reliability of the orthopaedic surgeons and that of the radiologists. The high reliability of the IS might be explained by the patellar and tibial landmarks used by the ratio. The superior and inferior patellar poles, as used by the IS, are easy to identify on plain radiographs. However, the shape of the inferior pole can be affected by Sinding-Larsen-Johansson disease. The inferior ridge of the patellar articular surface, as used by the MIS, BP and CD, is less easy to determine than the inferior patellar pole [5]. It is not always clear where the articular surface starts and ends distally, and the distal articular border is especially hard to identify when the radiograph is not perfectly lateral. In our opinion, radiographs are frequently not perfectly lateral in daily practice, and patellar height assessment methods should therefore tolerate a certain degree of imperfection in the quality of the radiographs. The tibial tuberosity, as used by the IS and MIS, is easily determined, because the patellar tendon is usually visible on contemporary radiographs. However, the shape of the tibial tuberosity and the patellar morphology can be affected in Osgood-Schlatter disease [20]. The horizontal line projected anteriorly from the tibial plateau in the BP ratio (Fig. 1) is less easily determined. When the radiograph is not obtained with a perfectly lateral view, the medial and lateral plateaus will not be parallel on the image and it will be difficult to determine where to place the line. The antero-superior articular tibial margin in the CD ratio is not always easily determined and is not found in 10 % of cases [21]. Again, the identification of this landmark is highly dependent on the quality of the radiograph. In addition, this landmark is not always clearly identifiable due to bone morphology. Considering the arguments above, we believe that the anatomical landmarks of the IS ratio are the most easily determined in a patella without patellofemoral pathology.

Two studies reported on the reliability of patellar height ratios for evaluating routine radiographs of mature native knees using a digital system [7, 8]. In accordance with the present study, Nizić et al. [7] reported the highest inter-observer and intra-observer reliability for the IS when assessing the IS, MIS, BP and CD. However, the reliability of the four ratios reported in that study differed only marginally, and according to those findings the reliability of all ratios could be considered excellent. Portner et al. [8] also reported the highest inter-observer and intra-observer reliability for the IS when compared with the BP and CD; the MIS was not assessed in that study. Chareancholvanich et al. [6] also reported the reliability of the four most widely used patellar height ratios. However, they used standardized, non-weight-bearing radiographs. In accordance with the present study, Chareancholvanich et al. [6] reported the highest inter-observer reliability for the IS and the second best reliability for the MIS.

Ideally, patellar height ratios use a distal femoral reference point to represent the true patellofemoral articulation. However, all ratios evaluated in this study related the patellar height to a reference point on the tibia. Anatomic variability in the position of the landmarks used to calculate the patellar height when using a tibial reference point can compromise the validity of the ratio. Grelsamer et al. [21] defined three types of sagittal morphology for the patella. These types represent different proportions of the inferior pole relative to the articular surface. Both lines used by the IS (Fig. 1) are influenced by these proportions. The distribution of the morphologic types differs between healthy subjects and patients with patellofemoral symptoms [21]. Grelsamer et al. [21] showed that patients with patellofemoral symptoms had a significantly higher percentage of type II patellae, i.e., these patients showed a patella with a short distance between the cranial and the inferior articulating surface. However, the patella was normal when measured from the cranial to the inferior pole. Because the IS uses the inferior pole as an anatomic landmark, it does not give a good representation of the position of the articulating surface of the patella relative to the trochlea in “long-nosed” patellae and thus has poor validity, especially in patients with patellofemoral symptoms. In our opinion, patellar height is mainly assessed in patients with patellofemoral symptoms and, therefore, the IS should not be used despite its excellent reliability. Furthermore, the inclination of the tibial plateau varies considerably among patients [22] and this influences the “horizontal line”. Minor errors in determining the horizontal line over the tibial plateau, or differences in the slope of the “horizontal line”, will lead to significant differences in the point where the horizontal line and the vertical line from the patella meet. This will influence the length of the vertical line (Fig. 1, line F) and thus the ratio. Even if the BP were highly reliable, the variation in tibial plateau inclination makes its validity questionable. The validity of patellar height ratios also depends on the use of proper reference values. These reference values should be based on asymptomatic knees because knee symptoms can be due to abnormalities of the patellofemoral relationship [5]. Only the CD had reference values that were based on a healthy population [5, 9–12]. New reference values could be extracted from Chareancholvanich et al. [6], because they included only non-symptomatic volunteers and calculated normal values for the IS, MIS, BP and CD. However, this was in a non-weight-bearing situation. It is also important that the ratios maintain accuracy in varying degrees of knee flexion. Chareancholvanich et al. [6] showed that knee angles between 0 and 60° of flexion did not influence the measurement of the four patellar height ratios in non-weight-bearing healthy subjects. Yiannakopoulos et al. [23] showed that weight-bearing radiographs in 30° of flexion resulted in higher patellar height ratios in healthy subjects than non-weight-bearing radiographs. Furthermore, biomechanical analysis of weight-bearing healthy subjects showed that the patellar tendon length increased significantly between 0 and 30° of flexion, and did not increase further beyond 30° [24]. In patients with patella alta, the effect of weight-bearing and knee flexion might be even greater because of a potentially longer tendon, which might be slack in the extended knee, or a potentially more stretchable patellar tendon. We advise obtaining the lateral knee radiograph in some degree of flexion and in a weight-bearing position to ensure that the patellar tendon is tensed (i.e., at full length).
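The sensitivity of the BP ratio to the orientation of the tibial plateau line, as argued above, can be illustrated with a simple geometric sketch. All numbers are invented and the geometry is a deliberate simplification: the plateau line is assumed to rotate about a hypothetical pivot point placed at the origin, and the inferior edge of the patellar articular surface is given as coordinates relative to that pivot. The function name is hypothetical.

```python
import math


def bp_ratio(x_mm, y_mm, articular_length_mm, tilt_deg):
    """BP ratio (F/E) when the plateau line is tilted by tilt_deg.

    (x_mm, y_mm) locates the inferior edge of the patellar articular
    surface relative to a hypothetical pivot on the tibial plateau
    (x anterior, y proximal). A positive tilt raises the anterior end
    of the plateau line.
    """
    theta = math.radians(tilt_deg)
    # Perpendicular distance from the point to a line through the origin
    # with direction (cos(theta), sin(theta)); this is segment F.
    f_mm = abs(y_mm * math.cos(theta) - x_mm * math.sin(theta))
    return f_mm / articular_length_mm


# Invented geometry: point 40 mm anterior and 27 mm proximal to the pivot,
# articular surface length E = 30 mm.
tilt_results = {tilt: bp_ratio(40.0, 27.0, 30.0, tilt) for tilt in (-5, 0, 5)}
for tilt, ratio in tilt_results.items():
    print(tilt, round(ratio, 2))
```

Under these assumed dimensions, a 5° change in the slope of the line in either direction moves the ratio from 0.90 to roughly 0.78 or 1.01, which shows how a modest difference in line placement or plateau inclination could shift a knee between categories.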

The main limitation of this study was the moderate sample size [25]. However, we do not expect that the inclusion of more patients would have influenced our results. A strength of this study was the assessment of the radiographs by both orthopaedic surgeons and musculoskeletal radiologists, because these clinicians would be the assessors of the radiographs in daily practice. Furthermore, we consider the use of imaging obtained as part of the normal diagnostic evaluation a strength. These radiographs were of varying quality (e.g., not always perfectly lateral). In our opinion, these radiographs give a good representation of the imaging a clinician will encounter in daily practice. The reliability obtained using these radiographs is therefore more valid than the reliability obtained from perfectly lateral radiographs made for study purposes only.

In conclusion, when comparing the most widely applied ratios to assess patellar height, the IS proved to be the most reliable. However, we advise using the MIS: although it showed only the second best reliability, it has better validity according to the literature.