Introduction

Pelvic incidence (PI) is a key parameter for assessing sagittal spinopelvic balance. Most parameters measuring pelvic sagittal balance, such as sacral slope and pelvic tilt, are positional parameters, which must be evaluated in a standing position. However, PI has the advantage of being an anatomical parameter that is independent of patient position and posture. PI is defined as the angle between the line connecting the midpoint of the sacral endplate to the bicoxofemoral axis (the axis passing through the centers of the left and right femoral heads) and the line perpendicular to the sacral endplate [1]. Elevated PI is associated with the development and progression of isthmic spondylolisthesis [2,3,4], adolescent idiopathic scoliosis [5], degenerative spondylolisthesis [6, 7] and facet arthritis [8]. In addition to its involvement in spinal pathologies, PI is considered an important parameter in preoperative planning and has been shown to be associated with the clinical outcome of several spine surgeries, including posterior intervertebral fusion for isthmic spondylolisthesis [9] and surgical correction of scoliosis/kyphosis [10, 11]. With regard to the hips, PI has been associated with cam- and pincer-type femoroacetabular impingement [12, 13], slipped capital femoral epiphysis (SCFE) and retroversion deformities [14, 15] and hip osteoarthritis [16].

PI was first introduced by Duval-Beaupere et al. in 1992 and was measured on a true lateral spine radiograph [17]. To measure PI, a large number of steps have to be performed to identify anatomic landmarks, including the centers of the femoral heads and the sacral endplate, which leads to a high degree of variability. PI has been shown to have the lowest intraclass correlation coefficient (ICC) scores of all pelvic parameters (including sacral slope, thoracic kyphosis, sagittal vertical axis and pelvic tilt) measured on full spine radiographs [18].

Due to this variability, there has been investigation into measuring PI using CT scans and CT 3D reconstructions [11, 19, 20]. In a study comparing PI measurement among full spine standing radiographs, standing radiographs of the pelvis and standard CT scans, the intra- and interobserver reliability rates were 0.84 and 0.79 for whole spine radiographs, and 0.98 and 0.97 for pelvis standing radiographs. The correlation coefficient was 0.81 between full spine radiograph and CT and 0.95 between pelvis radiograph and CT, suggesting that full spine PI is comparatively less accurate [11]. The reliability of measuring pelvic incidence on CT with 3D reconstruction is considered high, with an ICC of 0.95 according to Vrotec et al. [20]. Chen et al. created 3D reconstructed pelvic models using in-house software, which had similarly high intra- and interobserver reliability rates of 0.96 and 0.98, compared to 0.86 and 0.90 when PI was measured from full spine radiographs [19].

To the best of our knowledge, no previous study has directly compared PI measurements across all three modalities of radiography (XR), CT and 3D CT, and, in particular, standard CT scans have not been compared to 3D CT. This comparison is relevant because of the extra steps and learning curve necessary to perform a 3D reconstruction and measurement. The objective of this study is to compare the reliability of measuring PI across these three modalities.

Materials and methods

Subjects

Institutional review board approval was obtained. Patients over 18 years old with imaging between 1/1/2011 and 12/31/2019 were randomly selected from our picture archiving and communication system (PACS). Sixty-two patients had both a lateral XR of either the sacrum/coccyx or lumbar spine and a CT scan of the abdomen/pelvis or the pelvis alone. To obtain 28 patients each with repeated XR and CT imaging, we included 15 additional patients, for a total of 77 patients. Images without adequate visualization of the femoral heads and sacral endplate were excluded. Patients whose XR and CT images were obtained more than 3 years apart were excluded. XRs in which the femoral heads were not at least partially overlapping were excluded, as larger amounts of axial rotation of the image can introduce some inaccuracy [21].

Measurements

To measure PI on lateral XRs, a line was drawn between the centers of the two femoral heads. A line was then drawn between the center of the sacral endplate and the midpoint of the two femoral heads. PI was measured as the angle between this second line and the line orthogonal to the sacral endplate (Fig. 1). To measure PI on standard CT scans, sagittal cuts were utilized and circles were drawn to fit each femoral head on separate cuts. The center of each circle was marked and a line was drawn connecting these two points. The midpoint of this line was then marked on the sagittal cut that incorporated the center of the sacrum. A line was drawn from the midpoint of the femoral heads to the center of the sacral endplate. PI was measured between this line and the line orthogonal to the sacral endplate (Fig. 2). To measure PI using CT 3D reconstructions, the MPR (multiplanar reconstruction) function in PACS was used to produce 3D projections from the 2D slices on standard CT. First, circles were drawn around each femoral head (Fig. 3a). The rotation of the x-, y- and z-axes was adjusted so that the two femoral heads were lined up exactly from a sagittal view (Fig. 3b). Using this method, the bicoxofemoral axis remained orthogonal to the sagittal plane. With the outline of the overlapping femoral heads present, the slab thickness of the 3D reconstruction was then decreased to 0.5 mm to visualize the center of the sacral endplate more accurately. PI was measured as the angle between the line connecting the previously marked femoral head centers to the center of the sacrum and the line orthogonal to the sacral endplate (Fig. 3c).
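
The geometric definition shared by all three modalities can be illustrated with a short sketch. The Python function below is not the tool used in this study (measurements were made manually in PACS). It assumes that the four landmarks have already been placed in a common sagittal plane as hypothetical (x, y) coordinates, and that PI is below 90°, so the direction chosen for the endplate normal does not affect the result.

```python
import math

def pelvic_incidence(hip_left, hip_right, endplate_anterior, endplate_posterior):
    """Return PI in degrees from 2D sagittal-plane landmarks.

    All four arguments are hypothetical (x, y) tuples in the same units.
    Assumption: PI < 90 degrees, so the acute angle to the normal is returned.
    """
    # Midpoint of the bicoxofemoral axis (between the two femoral head centers).
    hip_mid = ((hip_left[0] + hip_right[0]) / 2.0,
               (hip_left[1] + hip_right[1]) / 2.0)
    # Center of the sacral endplate.
    sacral_mid = ((endplate_anterior[0] + endplate_posterior[0]) / 2.0,
                  (endplate_anterior[1] + endplate_posterior[1]) / 2.0)
    # Direction of the endplate and a vector orthogonal to it.
    ex = endplate_anterior[0] - endplate_posterior[0]
    ey = endplate_anterior[1] - endplate_posterior[1]
    if ex == 0 and ey == 0:
        raise ValueError("endplate landmarks coincide")
    nx, ny = -ey, ex
    # Line from the endplate center to the hip axis midpoint.
    vx = hip_mid[0] - sacral_mid[0]
    vy = hip_mid[1] - sacral_mid[1]
    if vx == 0 and vy == 0:
        raise ValueError("hip axis midpoint coincides with endplate center")
    dot = vx * nx + vy * ny
    cross = vx * ny - vy * nx
    # Absolute values make the result independent of the normal's sign.
    return math.degrees(math.atan2(abs(cross), abs(dot)))

# Made-up landmarks: a horizontal endplate with the hip axis one unit
# anterior and one unit inferior to its center gives an angle of about 45 degrees.
example_pi = pelvic_incidence((1.9, -1.0), (2.1, -1.0), (2.0, 0.0), (0.0, 0.0))
```

Using `atan2` on the cross and dot products avoids the loss of precision that `acos` shows near 0° and does not require normalizing either vector.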

Fig. 1

PI measured on lateral radiograph. The centers of the femoral heads are marked, and a line from their midpoint is drawn to the center of the sacral endplate. A second line is drawn orthogonal to the sacral endplate. The angle between the two lines is pelvic incidence

Fig. 2

PI measured on standard CT scan. The centers of the femoral heads are marked on sagittal slices, and these positions are translated onto the image slice at the mid aspect of the sacrum. The sacral endplate is then well visualized and pelvic incidence is measured

Fig. 3

PI measured on CT MPR 3D reconstruction. The femoral heads are clearly identified (a), and then the entire image is adjusted until the femoral heads overlap (b). The locations of the femoral heads are marked, and a thin slab at the center of the sacrum is used to better visualize the sacral endplate and complete the measurement of pelvic incidence (c)

Statistics and measurement validation

A power analysis was performed to detect a 2° difference between modalities, with a standard deviation of 10° based on preliminary data, an alpha of 0.05 and a power (1 − β) of 0.9. The G*Power test for ANOVA (repeated measures, within factors), using the default parameters other than those stated above, produced a projected sample size of 55 [22]. ICC values were calculated comparing different modalities (XR vs standard CT, XR vs 3D CT and standard CT vs 3D CT) for the full set of 62 patients. A one-way repeated measures ANOVA with Bonferroni correction for the post hoc tests was performed in these 62 subjects to determine whether there was any significant difference among the mean PI measurements on XR, standard CT and 3D CT. To determine the reliability of PI measurements across different time points within the same individual, PI was also measured on serial images in 28 individuals with two separate scans using the same imaging modality within 3 years of each other and evaluated using ICC. Of these 28 individuals, 13 were part of the original cohort of 62 subjects and an additional 15 subjects were recruited.
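
For readers less familiar with the Bonferroni correction, the sketch below shows its arithmetic for a family of pairwise post hoc comparisons: each raw p-value is multiplied by the number of comparisons and capped at 1. The function name and the p-values are hypothetical and are not taken from this study's output. The statistical package's own implementation was not reproduced here.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values for a family of comparisons."""
    m = len(p_values)
    if m == 0:
        raise ValueError("at least one p-value is required")
    # Multiply each raw p-value by the family size and cap the result at 1.
    return [min(1.0, p * m) for p in p_values]

# Hypothetical raw p-values for three pairwise comparisons
# (XR vs standard CT, XR vs 3D CT, standard CT vs 3D CT).
adjusted = bonferroni_adjust([0.0002, 0.0003, 0.4])
```

Equivalently, the raw p-values can be compared against alpha divided by the number of comparisons (0.05 / 3 for three modalities).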

To assess measurement reproducibility, two investigators measured PI on XR, standard CT and 3D CT using a subset of 22 subjects, and repeat measurements were made by the primary measurer 3–6 weeks after initial measurement for this subset. ICC was used to quantify intra- and interobserver reliability for the PI measurements. Intra- and interrater reliability were all in the excellent range (Table 1).
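
The text does not state which ICC form was used. As an illustration only, the sketch below computes one common form, the two-way random-effects, absolute-agreement, single-measure ICC, often written ICC(2,1), from a table with one row per subject and one column per rater or modality. The function name and the example values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `ratings` is a list of rows (subjects), each a list of k measurements
    (raters or modalities). This form is an assumption, not the study's method.
    Not handled: a table where every value is identical (zero variance).
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 subjects and 2 raters, equal row lengths")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical PI values (degrees): the second column is the first plus 3.
offset_example = icc_2_1([[50.0, 53.0], [60.0, 63.0], [40.0, 43.0]])
```

In this made-up example the two columns are perfectly correlated, yet the absolute-agreement ICC falls below 1 because a constant offset between columns counts as disagreement. A consistency-type ICC would not penalize such an offset, so the choice of form matters when one modality reads systematically higher than another.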

Table 1 Intra- and interrater reliability based on a subset of 22 subjects

Results

The primary 62 subjects (mean age ± SD between XR and CT 60 ± 20 years, range 21–96) consisted of 20 males and 42 females. PI measurements on XR, standard CT and CT with 3D reconstruction were 56° ± 13°, 53° ± 12° and 53° ± 12°, respectively. Post hoc tests following repeated measures ANOVA on these 62 subjects showed that mean PI on XR was higher than on standard CT (p < 0.001) and on 3D CT (p < 0.001). However, there was no significant difference between the mean PI on standard CT and 3D CT (p = 0.283). The mean absolute differences between XR and standard CT, XR and 3D CT, and standard CT and 3D CT were 3.7° (ranging from 0 to 10°), 3.7° (ranging from 0.3 to 11°) and 1.6° (ranging from 0.1 to 4.4°), respectively.
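
The mean absolute difference and its range are simple to reproduce from paired measurements. The sketch below shows the arithmetic. The function name and the values are hypothetical and are not data from this study.

```python
def abs_difference_summary(first, second):
    """Return (mean, min, max) of absolute paired differences."""
    if len(first) != len(second) or not first:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [abs(a - b) for a, b in zip(first, second)]
    return sum(diffs) / len(diffs), min(diffs), max(diffs)

# Hypothetical PI values (degrees) for three subjects on two modalities.
xr_values = [56.0, 50.0, 60.0]
ct_values = [53.0, 51.0, 56.0]
mean_abs, smallest, largest = abs_difference_summary(xr_values, ct_values)
```

Because signs are discarded, the mean absolute difference can exceed the difference between the two modality means, which is consistent with a 3.7° mean absolute difference alongside a 3° difference in mean PI.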

ICC values demonstrated higher correlation between standard CT and 3D CT than between XR and either CT modality (Table 2, Fig. 4). Furthermore, ICC values calculated from the 28 cases of repeated imaging of the same individuals also demonstrated that the CT methods produce more reliable measurements (Table 3).

Table 2 ICCs between modalities based on primary 62 subjects recruited
Fig. 4

Correlations and Pearson’s coefficients of PI measurements between different modalities

Table 3 ICCs among repeated imaging of the same individual using the same modality

Discussion

This study found the strongest correlation between standard CT and 3D CT (ICC 0.986), with a slightly lower, albeit still high, correlation between XR and standard CT (ICC 0.934) and between XR and 3D CT (ICC 0.937). While this demonstrates the strength of correlation between modalities, there remains the question of which modality is more valid. Definitive conclusions about validity are difficult to draw because there is no clearly established gold standard modality for measuring PI. Therefore, this study also evaluated the consistency of serial PI measurements within individual subjects using each modality. Comparison of two separate images of the same modality taken within 3 years showed that standard CT and 3D CT were more reproducible on serial images (ICC of 0.986 and 0.981, respectively), while XR had relatively lower reproducibility (ICC of 0.935). The fact that the CT values matched better with each other and consistently provided more reproducible results, together with their more anatomic nature, argues that standard CT and 3D CT are more valid modalities for measuring PI.

Additionally, using a repeated measures ANOVA analysis, this study determined that there was a slight elevation in mean PI on XR (56° ± 13°) when compared to the mean PI on standard CT and 3D CT (53° ± 12° and 53° ± 12°), which was statistically significant. In comparison, a study by Yamada et al. compared PI measurements among 120 subjects who underwent full spine lateral standing XR, pelvis lateral standing XR and CT pelvis and reported mean PI measurements of 55.8° ± 9.8°, 51.5° ± 9.4° and 50.6° ± 9.0°, respectively [11]. They reported lower ICC reliability values of measuring PI on the full spine XR compared to other modalities, which was attributed to the large angle of the radiograph’s projection to capture the full spine, causing difficulty in identifying landmarks such as the sacral endplate and resulting in higher PI values. Also of note, in a study measuring PI on 880 cadaveric specimens, the mean PI was found to be 46.0° ± 11.0°, which is slightly lower than mean reported values on most radiographic studies [23]. These findings all suggest that measuring PI on XR may yield a slightly higher value compared to the true anatomic PI, especially when capturing a wide projection angle.

Several reasons could explain why measuring PI on XR could have reduced reliability compared to standard CT and 3D CT. In the previously mentioned study by Yamada et al., the results indicate that the error of measuring PI on XR was primarily due to difficulty in precisely identifying the sacral endplate, rather than the bicoxofemoral axis [11]. In our reliability test subgroup of 22 subjects, PI measurements differed by more than 3° on four radiographs, three of which were found to have sacral endplates with obscured posterior boundaries (Fig. 5). In several XRs from the total 77 subjects, projection of the XR beam from a more inferior position caused the sacral endplate to appear circular in shape and more difficult to precisely delineate (Fig. 6a), leading to a discrepancy in PI measurement on XR compared to standard CT and 3D CT (Fig. 6b, c). In a study by Legaye comparing the reliability of measuring spinopelvic parameters, including PI, in patients with and without a dome-shaped sacrum, there was a larger spread of PI values in patients with dome-shaped sacra than in patients with a flat sacral plate (standard deviation of 5.4 and 1.3, respectively), likely due to ambiguity in evaluating the tilt of the sacral endplate with dome-shaped sacra [24]. In another study, Chen et al. reported that, depending on whether the sacral endplate was concave or convex anteriorly, PI measured on XR would be smaller or larger than on CT, respectively, due to the projectional nature of the XR [19]. CT imaging allows the user to measure the sacral endplate at the midline sagittal slice, which avoids several issues by excluding projections created from other components of the sacrum. However, if irregularity of the sacral endplate persists at the midsagittal slice on CT, some ambiguity in identification remains. For example, degenerative changes leading to osteophytes or irregular S-shaped or dome-shaped endplates still caused difficulty in precisely identifying the sacral plate on both XR and CT in our study.

Fig. 5

Sacral endplate with unclear posterior boundary due to degenerative changes

Fig. 6

PI measurements on (a) XR, (b) standard CT and (c) CT with 3D reconstruction for the same individual. Projectional issues from the XR beam potentially cause the sacral endplate to appear as an ellipse rather than a straight line, making it difficult to precisely annotate. This imprecision leads to overestimation of the PI when measured on XR, compared to CT

While a 3° difference when measuring PI has minimal clinical implications on an individual basis, this difference has a more substantial impact in a research setting. Gebhart et al. reported that the mean PI for hips with a cam lesion versus normal hips was 43.1° versus 47.7°, respectively (p = 0.02) [12]. Another study investigating the relationship between PI and facet joint arthritis reported a median PI of 51.7° versus 49.8° in the highest and lowest grade facet arthritis groups, respectively (p = 0.07) [25]. These studies show that differences in PI on the scale of a few degrees have been linked to both spine and hip pathologies. The small differences in these studies argue for using an accurate modality to measure PI, even when accuracy is improved by only a few degrees. There is also the question of the time needed to measure PI on each modality. Based on a timed measurement of five random subjects within this study, the average time to measure PI on XR, standard CT and 3D CT was 128, 177 and 207 s, respectively. Note that although there is a learning curve associated with measuring PI on all modalities, this curve is slightly steeper for 3D CT, and the times were recorded after this initial period of acclimation. Although CT is associated with an increase in measuring time and increased radiation to the patient, these factors could be justified by the increase in reliability that CT offers when measuring PI in a research setting.

This study has several important limitations. As mentioned previously, although measuring PI on XR has been extensively described and is routinely used, there is no absolute gold standard modality. Our study demonstrates a stronger correlation between standard CT and 3D CT than between either CT modality and XR, as well as slightly higher intra- and interobserver ICC values and greater reproducibility on serial imaging of the same patients with standard CT and 3D CT, all providing support for CT. Another limitation is the female-to-male ratio (52 vs. 25). However, numerous studies comparing PI between females and males have shown no significant differences between the two groups [20, 22, 26]. Finally, we did not insist on perfect lateral radiographic views of the spine. There is some minor inaccuracy secondary to rotational position, as the approximation of the center of two femoral heads does introduce a small amount of mathematical error [21]. This is a practical reality of using radiographs, and thus we felt it was reasonable to include radiographs with minor rotation. However, we mitigated this effect by only including images in which the femoral heads had some overlap. The relatively high ICC values for radiographs in this study may be secondary to this inclusion criterion.

Conclusion

In summary, this study showed that while XR still demonstrated a high level of reliability for measuring pelvic incidence, standard CT and CT with 3D reconstruction are both more reliable methods. There was no clear improvement in using 3D CT in place of standard CT. In addition, PI measurements on XR were slightly elevated when compared to those on standard CT and 3D CT. Overall, XR remains a reasonable technique but has lower reliability than CT and may overestimate true PI.