Introduction

Hip microinstability (HMI), defined as supra-physiologic hip motion, has gained acceptance as a distinct clinical entity that may cause or contribute to hip pain [1]. HMI is thought to be due to subtle bone deficiency, peripelvic soft tissue weakness, and/or ligamentous laxity [2]. The clinical presentation may be subtle. Many patients will not report hip joint unsteadiness. Rather, they may demonstrate the “C” sign, a common sign in patients presenting with pain arising from within the hip joint: the patient holds his/her hand in a C shape over the superior lateral aspect of the hip, with the thumb positioned posterior to the trochanter and the fingers extending into the groin. The patient may also report groin pain. Both the “C” sign and groin pain are non-specific findings for hip joint pathology [3]. Certain populations, such as athletes, may be at increased risk. Specifically, dancers have been documented to have a high prevalence of hip dysplasia and hip pain (prevalence as high as 27%) and have been described as more hypermobile than controls [4]–[6].

Objective findings characterizing HMI with various imaging modalities are limited, though certain radiographic and MRI findings associated with hip instability have been reported. For example, Akiyama et al. described femoroacetabular translation in 2 positions (neutral and Patrick position), comparing a cohort of normal female hips to dysplastic hips. They reported an average posterior-infero-medial femoral head translation of 1.12 mm in normal hips v. 1.97 mm in hips with dysplasia [7]. However, these findings have been thought to be suggestive but not diagnostic [8]–[10]. Because hip motion is dynamic, studies have assessed hip translation at various positions and at extreme range of motion using plain radiographs in dancers [11], MRI [12], and 3D CT [13]. Dual fluoroscopy is another potential dynamic diagnostic tool, as it has been used to study the biomechanics of the walking gait cycle in patients with cam-type femoroacetabular impingement [14]. Nevertheless, the only imaging modality that has been shown to reliably assess the degree of femoral head translation is ultrasound (US) [15, 16]. The assessment of anterior femoroacetabular translation using dynamic US has been shown to have excellent intra-rater and good to excellent inter-rater reliability [15].

Characterization of posterior hip translation with US has yet to be described. The ability to objectively quantify femoroacetabular posterior translation (FAPT) may lead to a better understanding of the clinical implications of common hip structural abnormalities such as the cam deformity (femoroacetabular impingement (FAI)) and acetabular dysplasia. For example, Philippon et al. and Krych et al. found that 75% and 82%, respectively, of athletes with posterior hip instability episodes had FAI [17, 18]. In addition, posterior hip instability has been associated with acetabular morphology such as acetabular retroversion and decreased posterior acetabular coverage [19]. We therefore aim to present a protocol for measuring FAPT using dynamic hip ultrasonography (DHUS) and to determine the inter- and intra-rater reliability of hip ultrasound measurements of FAPT.

Methods

The study protocol was approved by the institutional review board. The study was conducted at a tertiary pediatric hospital, which has a high-volume hip preservation program. This study utilized a test–retest analysis to analyze intra- and inter-rater reliability of DHUS in assessing FAPT.

A total of 4 attending-level, board-certified primary sports medicine physicians served as ultrasound scanners. The physicians had varying levels of musculoskeletal (MSK) US experience: > 10 years (two senior-level scanners), 5–10 years (one intermediate scanner), and 1 year out of sports medicine fellowship training (one junior scanner). Because of scheduling availability, the two senior physicians were each assigned one group to scan. Specifically, the cohort was divided into two groups: group 1 consisted of participants #1–7 and group 2 consisted of participants #8–14. The intermediate and junior scanners performed the US scans for both groups. The ultrasound scans were obtained over a 2-week time period, with each week comprising 2 days of scanning. In total, there were 4 days of scanning. In week 1, participants #1–7 attended day 1 and participants #8–14 attended day 2. Each group was asked to return 2 weeks later to complete the scans in a similar fashion.

The DHUS assessments were performed and data were collected on a total of 4 days over a 2-week period, as described above. Thirteen participants were included in the study; 1 participant was excluded, and the reason for exclusion is described below. Each participant had both hips scanned by 3 providers. Each hip was considered a unique ultrasound examination, and in turn, 26 hips were assessed. Three scans were obtained for each hip by each provider, and the full set was obtained at 2 sessions held 2 weeks apart. A total of 468 scans were obtained for the study.
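The reported total can be checked with simple arithmetic. The short sketch below multiplies the counts given in the paragraph above; the variable names are illustrative, and reading "three scans for each hip" as three scans per hip per provider is an assumption that happens to reproduce the stated total.

```python
# Counts taken from the Methods text; variable names are illustrative only.
participants = 13            # final cohort after one exclusion
hips_per_participant = 2     # both hips scanned
providers_per_hip = 3        # junior, intermediate, and one senior scanner
scans_per_hip_per_provider = 3  # assumption: three scans per hip per provider
sessions = 2                 # two sessions, 2 weeks apart

hips = participants * hips_per_participant
total_scans = hips * providers_per_hip * scans_per_hip_per_provider * sessions

print(hips, total_scans)  # prints: 26 468
```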

Study participants

Fourteen participants, all employees at the pediatric hospital, were recruited for the study using an internal advertisement. None of the participants were involved with the study design, data collection, data analysis, or manuscript preparation. All were in good health with no known history of hip pathology. Demographics collected for each participant included age, sex, height, weight, BMI, and 9-point Beighton score. Hypermobility was defined as Beighton ≥ 5/9 [20, 21]. Participants with hip pain or a history of prior hip surgery were excluded. One participant was able to attend only 1 visit and therefore was not included in the data analysis. The final cohort consisted of 13 participants. Each hip of each participant was treated as an independent data point, resulting in a total of 26 hips. Each participant received a $50 Amazon gift card per visit as an incentive.

FAPT US protocol

The measurements of FAPT were collected for 3 patient positions. The first position is the neutral (PN) or baseline position (Fig. 1A and B). The patient is in the lateral decubitus position with the hip being scanned facing up. The side being scanned has both the hip and knee in neutral. The contralateral hip is flexed to 90° to neutralize the pelvis and lumbar spine. The second position (PFADIR) simulates the posterior apprehension test. The position begins with the PN position as described above. The scanned hip is then passively flexed to 110°, adducted, and internally rotated (Fig. 2A and B). The third position (PStand) is weight bearing and loads the posterior hip. The individual stands with their feet facing forward and shoulder width apart. They then flex their spine and reach over to the contralateral foot with their hands (Fig. 3A). Ultrasound measurements of FAPT were obtained using the ruler tool built into the ultrasound machine's software. Measurements were obtained by determining the position of the femoral head in relation to the acetabulum (Figs. 1C, 2C, and 3B). A negative reading indicates that the femoral head is below the acetabulum.

Fig. 1

A The first ultrasound position, termed posterior neutral or baseline position (PN), shows the subject lying in the lateral decubitus position. The scanned hip and knee on the ipsilateral side are both in the neutral position, while the contralateral hip is flexed to 90°. B Illustration showing the placement of the ultrasound probe, which is placed parallel to the femur. C Example of the measurement of posterior femoroacetabular translation in the PN position. The ultrasound machine's built-in software was used to calculate the position of the femoral head in relation to the acetabulum. The vertical distance from the tip of the acetabulum to the sclerotic margin of the femoral head represents femoral head translation.

Fig. 2

A The second ultrasound position, termed flexion, adduction, and internal rotation (PFADIR), is intended to simulate the posterior apprehension test. The position begins with the posterior neutral (PN) position (Fig. 1A). The scanned hip is then passively flexed to 110°, adducted, and internally rotated. B Illustration showing the placement of the ultrasound probe, which is placed parallel to the femur. C Example of the measurement of posterior femoroacetabular translation in the PFADIR position.

Fig. 3

A The third ultrasound position is termed the stand and load position (PStand). The individual is standing with their feet facing forward and shoulder width apart. The patient flexes their spine and reaches over to the contralateral foot with their hands. B Example of the measurement of posterior femoroacetabular translation in the PStand position.

Statistical analysis

Demographic characteristics were summarized for the cohort by frequency and percent, mean and standard deviation (SD), or median and interquartile range (IQR), as appropriate. Each of the three DHUS measurements across all hips was summarized by mean and SD by rater level (junior, 0–5 years of experience; intermediate, 5–10 years of experience; and senior, 10 or more years of experience) and by first and second read. Intra-rater reliability was assessed by calculating the intraclass correlation coefficient (ICC) along with 95% confidence intervals (CIs) for each rater and across all raters, using a two-way mixed effects model that assesses consistency for a single rater. Inter-rater reliability was assessed by estimating ICCs across all three raters for the first read, the second read, and across all reads along with 95% CIs, using a two-way random effects model that assesses absolute agreement over the average of three random raters. Interpretations of reliability coefficients were based on the cutoffs by Fleiss and by Cicchetti and Sparrow: < 0.40, poor; 0.40–0.59, fair; 0.60–0.74, good; and > 0.74, excellent [22]. Power analysis found that a sample of 26 measurements across three raters provided more than 80% power to test for ICC values of 0.4 to 0.8 against null hypotheses 0.30 units lower than the sample estimate, with alpha set to 5%.
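To make the two ICC forms described above concrete, the following is a minimal sketch, not the analysis code used in the study. It assumes the standard ANOVA-based formulas for a single-rater consistency ICC and an average-rater absolute-agreement ICC, computed from a complete hips-by-raters table. The function names and the toy data are hypothetical, confidence intervals are omitted, and the handling of values falling between the published cutoffs (for example 0.595) is an assumption.

```python
from statistics import mean

def anova_mean_squares(ratings):
    """ratings: list of rows (one per hip), each a list of k measurements."""
    n, k = len(ratings), len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between hips
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters/reads
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return n, k, msr, msc, mse

def icc_consistency_single(ratings):
    """Two-way model, consistency, single measurement."""
    n, k, msr, msc, mse = anova_mean_squares(ratings)
    return (msr - mse) / (msr + (k - 1) * mse)

def icc_agreement_average(ratings):
    """Two-way model, absolute agreement, average of k raters."""
    n, k, msr, msc, mse = anova_mean_squares(ratings)
    return (msr - mse) / (msr + (msc - mse) / n)

def interpret(icc):
    """Cutoffs quoted in the text; boundary handling is an assumption."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"

# Toy data (hypothetical, in mm): 3 hips, 2 reads with a constant 1 mm offset.
toy = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
consistency = icc_consistency_single(toy)
agreement = icc_agreement_average(toy)
print(round(consistency, 3), interpret(consistency))
print(round(agreement, 3), interpret(agreement))
```

In the toy data the second read is always exactly 1 mm higher than the first, so the consistency ICC is perfect while the absolute-agreement ICC is penalized for the systematic offset. This illustrates why the choice between the two forms matters when comparing intra-rater and inter-rater results.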

Results

Demographics

DHUS measurements for thirteen subjects (54% female), with a mean age of 26 years (range, 19–38), taken at two separate time points by three independent raters, were reviewed. More than one-third of the cohort was hypermobile. The mean FAPT measurements for the three US positions of neutral, PFADIR, and PStand were 3.6 mm (SD 3.2 mm), 10.5 mm (SD 3.9 mm), and 8.2 mm (SD 4.1 mm), respectively (Table 1).

Table 1 Cohort summary (n = 13)

US measurements by rater experience

The US measurements of each rater are highlighted in Table 2. Due to the variability in measurements, no differences were statistically discernible across rater experience level (Table 2). For the junior rater, the absolute difference in measurement between read 1 and read 2 was 1.5 mm, 1.5 mm, and 0.7 mm for the neutral, PFADIR, and PStand, respectively. For the intermediate rater, the absolute difference in measurement between read 1 and read 2 was 0.7 mm, 0.8 mm, and 1.9 mm for the neutral, PFADIR, and PStand, respectively. For the senior raters, the absolute difference in measurement between read 1 and read 2 was 1.3 mm, 0.2 mm, and 0.8 mm for the neutral, PFADIR, and PStand, respectively (Table 2).

Table 2 Measurement summary by rater level and read

Inter- and intra-rater reliability

Inter-rater reliability ranged from good to excellent for read one and read two (Table 3).

Table 3 Inter-rater reliability for the first read, second read, and across all reads

Inter-rater reliability was lowest, although good, for the neutral measurement and was highest and excellent for the PFADIR measurement (Table 3).

Intra-rater reliability ranged from fair to good (Table 4). Intra-rater reliability was good and highest for the PFADIR measurement (Table 4). The intra-rater reliability was fair for the neutral and PStand positions (Table 4). Intra-rater reliability ranged from poor to fair for the junior rater, from poor to excellent for the intermediate rater, and from poor to fair for the senior rater (Table 4).

Table 4 Intra-rater reliability by rater level

Discussion

This study is the first to propose an ultrasound protocol to assess femoroacetabular posterior translation (FAPT) using 3 measurements. The measurements and positions were designed to capture posterior femoroacetabular motion and its dynamic changes across positions: a baseline position (posterior neutral), a position simulating the posterior hip apprehension test (posterior flexion, adduction, and internal rotation), and a weight-bearing position (posterior standing).

The use of DHUS has been proposed as a point-of-care tool in diagnosing hip microinstability (HMI), a condition that remains challenging to diagnose, especially in the athlete population, in whom the demands on the hip joint are often much greater than in non-athletes. Moreover, posterior femoroacetabular instability and impingement in athletes and performing artist athletes are thought to contribute to the posterior acetabular chondral damage seen uniquely in some of these groups. By gaining a better understanding of posterior hip microinstability measured on dynamic US, we can correlate this finding with pain profile, with articular and femoral cartilage damage patterns seen on MR imaging, and with bony changes seen on radiographs. Together, this information will serve to guide hip preservation efforts when treating complex hip pain.

The main finding of our study was that DHUS of FAPT demonstrated good to excellent inter-rater reliability and fair to good intra-rater reliability. Another notable finding was that US measurements did not vary across physician rater experience. The findings of our study offer a strategy to objectively quantify hip translation. Furthermore, our study will support clinical integration of DHUS for FAPT when evaluating complex hip pain.

The use of DHUS in evaluating joint motion is not novel. d’Hemecourt et al. introduced an US protocol to evaluate anterior femoral head translation, demonstrating excellent inter- and intra-rater reliability for neutral and anterior apprehension patient positions [15]. The use of US in quantifying joint mobility has been described in shoulders. Krarup et al. reported a significant difference in anterior shoulder translation when comparing affected shoulders to individuals without shoulder instability; 4.9 mm v. 1.9 mm (P < 0.01) [23]. In addition, Henderson et al. characterized the end range glenohumeral translation with application of an accessory passive force and ultrasound imaging [24].

Outside of ultrasonography, other imaging modalities have been investigated for their ability to provide objective findings for the diagnosis of HMI. The use of x-rays was illustrated by Mitchell et al., who measured hip subluxation in elite ballet dancers using anteroposterior radiographic views by calculating the difference in the distance of the hip center position at the neutral position v. split position. They identified a 1.41 mm subluxation distance difference between the 2 positions [11]. Other imaging modalities such as 3D CT and MRI have also been used to describe in vivo hip translation. Cvetanovich et al. used 3D CT software to quantify hip translation in adults with symptomatic FAI. The 3D software quantified the femoral head translation between the neutral and FABER positions to be 0.84 ± 0.37 mm. Moreover, they reported a posterior translation of 0.10 ± 0.54 mm in their cohort [13]. This amount of translation is much less than our findings of 3.6 mm (SD 3.2 mm) in the neutral and 10.5 mm (SD 3.9 mm) in the posterior apprehension position. This difference may stem from the patient position used to obtain each measurement. The use of the FABER position by Cvetanovich et al. may underestimate FAPT, as this position has been described to cause the femoral head to translate anteriorly, thus stressing the anterior hip joint and labrum [25, 26]. The cohort of Cvetanovich et al. consisted primarily of participants with cam-type femoroacetabular impingement. The exact location of the impingement, however, was not characterized; this may have contributed to the underestimation of FAPT in situations where posterior impingement is present. In addition, 3D MRI has characterized hip translation in patients with hip dysplasia. Akiyama et al. reported a mean translation of 4.10 ± 1.41 mm between the neutral and FABER position in patients with hip dysplasia [27]. Similarly, using 3D MRI, Gilles et al. and Charbonnier et al. reported a mean translation of 2.12 ± 0.79 mm and 5.14 ± 1.28 mm, respectively, at extreme range of motion when performed by professional dancers [12, 28]. The direction of the translation, however, was not specified. Although x-rays, 3D CT, and 3D MRI have demonstrated their ability to assess in vivo hip translation, ultrasound allows for a dynamic assessment that permits real-time patient feedback and correlation of symptom location with imaging findings. Ultrasound also eliminates radiation exposure and is more cost-effective than CT and MRI [29, 30].

There are a few limitations to this study to consider. We assumed that each hip was an independent data point, even though we scanned both hips of each participant. As this was a study aimed at evaluating the reliability of a novel protocol, we felt this assumption would not affect our findings. In addition, we used 2 senior-level physician scanners who divided the cohort, with each scanning half of the participants. Although having an additional senior scanner may theoretically improve the reliability results, it strengthens our data, as more scanners, and therefore more chances for variability, were included in the data collection. Lastly, ultrasonography is operator-dependent, and findings may be difficult to replicate. To address this limitation, the study utilized 4 US scanners with a range of MSK US experience.

To our knowledge, this is the first study to describe an ultrasound protocol to evaluate FAPT. HMI can be difficult to diagnose, but dynamic hip ultrasound may provide an objective clinical measurement of hip motion for the athlete with complex hip pain. Future studies are needed to establish normative values of hip translation for athletes in various sports, including, for example, ice hockey players, swimmers, gymnasts, and figure skaters. In addition, variables such as sex, hip morphology, and ligamentous laxity should be considered in those studies.