Introduction

Developmental dysplasia of the hip (DDH) comprises a range of disorders in which an abnormal relationship between the head of the femur and the acetabulum leads to abnormal development of the hip joint [1,2,3]. The reported incidence of DDH ranges from one to eight cases per 1000 live births when detected by physical examination and rises to one to three per 100 live births when detected by sonographic screening [3,4,5].

Several risk factors have been implicated in the development of DDH, including first-born status, female gender, family history, fetal malposition (particularly extended breech presentation), fetal packaging disorders (congenital torticollis and metatarsus adductus), and some geographical and racial factors (DDH is more common in Native American and European populations than in Asian and African populations). Two meta-analyses of over a million babies showed that the main independent risk factors are breech presentation, female gender, and family history [6, 7].

Screening includes a detailed history and a clinical examination that assesses infants for limb length discrepancy, thigh fold symmetry, and any limitation in hip abduction [4]. Diagnosis is confirmed by ultrasound examination using the Graf, Harcke, Terjesen, or Suzuki method [8, 9]. The quantitative classification of Graf is the most commonly used (Table 1) [10].

Table 1 Graf sonographic grading for DDH

Pavlik harness (PH) treatment is one of the most widely used treatments for DDH in infants younger than six months, with a 70–90% success rate [1, 11,12,13,14]. The PH aims to relocate the hip to the normal anatomical position non-invasively [10, 15,16,17] (Fig. 1).

Fig. 1 Child treated with a Pavlik harness

We conducted a systematic review and identified 15 studies that investigated Pavlik harness treatment in DDH [16, 18,19,20,21,22,23,24,25,26,27,28,29,30,31]. Seven of these studies reported failure rates ranging from 3.3 to 40% (crude mean 11.6%) [11] (Table 2). Graf type IV hips, age at treatment initiation, and male sex have been reported as risk factors for treatment failure [4, 11]. Most case series were small, did not follow a standard protocol, or were based on clinical rather than sonographic grading, which is less specific and less sensitive. In addition, not all the potential causes of treatment failure were studied.

Table 2 The characteristics of studies of Pavlik harness treatments

The purpose of the study is to identify independent prognostic factors that could predict PH treatment failure in patients with DDH.

Methods

This is a retrospective study of infants younger than six months of age who were treated with a Pavlik harness. The study was conducted and reported in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies [32].

Demographic, sonographic, and clinical data were recorded in a purpose-designed Microsoft Access database. Referred patients were assessed according to the baby hip clinic initial assessment proforma. Based on the clinical and radiological findings, patients were discharged, monitored, or treated with the PH. A few patients who were close to six months of age were treated with surgery as first-line treatment; these patients were excluded from the study. Graf type I hips were considered normal, and these patients were discharged. Patients with Graf type IIa hips were monitored monthly. Hips of Graf type IIb or worse were treated with the PH.
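The triage rule described above can be written out as a short sketch. The function name `initial_plan` and the list `GRAF_TYPES` are hypothetical, and the ordering of Graf types follows the standard Graf scheme, which is an assumption here because Table 1 is not reproduced. The sketch does not model the few infants close to six months of age who went directly to surgery.

```python
# Graf types in increasing severity. This ordering is an assumption
# (the standard Graf scheme), since Table 1 is not reproduced here.
GRAF_TYPES = ("I", "IIa", "IIb", "IIc", "D", "III", "IV")


def initial_plan(graf_type: str) -> str:
    """Return the first-line plan described in the text for a Graf type."""
    if graf_type not in GRAF_TYPES:
        raise ValueError(f"unknown Graf type: {graf_type!r}")
    if graf_type == "I":
        return "discharge"          # normal hip
    if graf_type == "IIa":
        return "monitor monthly"    # followed up, not treated
    return "Pavlik harness"         # type IIb or worse


for graf_type in GRAF_TYPES:
    print(graf_type, "->", initial_plan(graf_type))
```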

Patients were provided with individual follow-up plans. Each follow-up visit involved completing the baby hip clinic follow-up assessment and an ultrasound examination to monitor progress. Pavlik harness treatment was considered successful, and was stopped, when the hip became normal (Graf type I). Treatment was considered to have failed when the hip worsened or did not improve, the PH was abandoned, and surgical treatment was offered. When a hip was subluxed or not centered (β angle above 77°), a maximum of two weeks of Pavlik harness treatment was allowed for the hip to reduce. If it did not reduce, the harness was abandoned. This is to prevent Pavlik harness syndrome, which is damage to the posterior rim of the acetabulum and which complicates not only future treatment but also the Pavlik harness treatment itself. In contrast, for a well-centered but dysplastic hip (α angle < 60°), treatment was continued until the hip normalized.
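As an illustration only, the follow-up rule above can be sketched as a small decision function. The name `harness_decision`, its parameters, and the returned strings are hypothetical; the sketch covers only the centered versus decentered rule and the two-week limit, and it does not model clinical judgments such as worsening or lack of improvement in a centered hip.

```python
def harness_decision(graf_type: str, beta_angle_deg: float,
                     weeks_in_harness: float) -> str:
    """Simplified follow-up decision for a hip already in a Pavlik harness."""
    if graf_type == "I":
        # Hip has normalized: treatment is successful and stops.
        return "stop: success"
    decentered = beta_angle_deg > 77  # subluxed or not centered
    if decentered and weeks_in_harness >= 2:
        # Two-week limit reached without reduction.
        return "abandon harness: offer surgery"
    if decentered:
        return "continue: reassess within two weeks"
    # Well-centered but still dysplastic hip.
    return "continue until normal"


print(harness_decision("D", 80.0, 1.0))
print(harness_decision("IIb", 60.0, 8.0))
```

Keeping the two-week limit as an explicit branch mirrors the stated aim of avoiding damage to the posterior acetabular rim.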

The main objective of this research was to identify the success and failure rates and the prognostic factors for failure, and to establish the strength of their association with failure (or success). Our null hypothesis is that there are no prognostic factors that could predict the failure of PH treatment in DDH. Data were collected using the baby hip clinic database. Most data were collected retrospectively, but some were collected prospectively using the initial assessment form and follow-up assessment form of the baby hip clinic (see Appendices 1 and 2). The variables obtained are quantitative and qualitative. Quantitative data include continuous variables such as age and duration of treatment. Qualitative data include nominal variables such as gender, first-born status, family history, foot deformity, plagiocephaly, breech presentation, Galeazzi sign, hip abduction, and femoral nerve palsy. Ordinal variables include the hip ultrasound Graf classification and hip stability.

Hip stability was classified into five grades: stable, dislocatable, dislocated but reducible, dislocated but irreducible, and unable to assess. However, for the purpose of this analysis, hip stability was recoded as not dislocatable or dislocatable to maximize the power of the statistical test while keeping a clinically meaningful message [33].

Hip abduction was coded as full or limited, as used clinically. Graf hip types were classified as listed in Table 1. This variable was not condensed to a binary outcome because it is a radiological finding. However, if a child had bilateral hip dysplasia, we used the worse side as the predictor for treatment failure. In our experience, this is almost always correct.
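The recoding steps above can be sketched as follows. The exact mapping from the five stability grades to the binary variable is not stated in the text, so `STABILITY_BINARY` is an assumed mapping (with "unable to assess" treated as missing), and `GRAF_ORDER` assumes the standard Graf severity ordering. All names are hypothetical.

```python
from typing import Optional

# Assumed severity ordering of Graf types (Table 1 is not reproduced here).
GRAF_ORDER = ("I", "IIa", "IIb", "IIc", "D", "III", "IV")

# Assumed mapping of the five clinical grades to the binary variable.
# "unable to assess" is treated as missing (None); this is an assumption.
STABILITY_BINARY = {
    "stable": "not dislocatable",
    "dislocatable": "dislocatable",
    "dislocated but reducible": "dislocatable",
    "dislocated but irreducible": "dislocatable",
    "unable to assess": None,
}


def recode_stability(grade: str) -> Optional[str]:
    """Collapse the five-grade stability scale to a binary variable."""
    return STABILITY_BINARY[grade]


def worse_graf(left: str, right: str) -> str:
    """Return the more severe Graf type of the two hips of one child."""
    return max(left, right, key=GRAF_ORDER.index)


print(recode_stability("dislocated but reducible"))
print(worse_graf("IIa", "III"))
```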

Data were checked for normality formally, using the Kolmogorov-Smirnov test (KST), and visually. Continuous variables were described using measures of central tendency and dispersion, while categorical variables were described using proportions and frequencies. Dichotomous categorical independent variables were cross-tabulated with the outcome and tested using the chi-square or Fisher exact test, depending on sample size. Categorical independent variables with more than two categories were compared visually using a bar chart, in adherence to the conditions of the chi-square test. Continuous independent variables were all non-normally distributed and were therefore analyzed using the Mann-Whitney test. Independent variables that were significant in the univariate analysis were then entered into the multivariate analysis to identify prognostic factors for failure. Multiple logistic regression was used to assess the interplay between the significant variables. A P value of ≤ 0.05 was chosen to indicate statistical significance. Missing data were excluded from the analysis.
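For readers unfamiliar with the Fisher exact test, a minimal pure-Python sketch for a 2 × 2 table is given here. It is not the software used in the study, which is not named in the text, and the counts in the usage example are illustrative only, not study data. The function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)  # all ways to fill the first column

    def weight(x: int) -> int:
        # Number of tables with x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    extreme = sum(weight(x) for x in range(lowest, highest + 1)
                  if weight(x) <= observed)
    return extreme / total


# Illustrative counts only (not taken from the study).
print(round(fisher_exact_two_sided(8, 2, 1, 5), 4))  # 0.035
```

Working with integer table counts and dividing once at the end avoids floating-point ties when deciding which tables are at least as extreme as the observed one.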

Results

Success rate of Pavlik harness

Of the 3885 patients referred to the baby hip clinic, 265 were treated with the PH in the five-year period of analysis. All patients completed their treatment and there was no loss to follow-up during the treatment phase; however, two patients were lost to follow-up after treatment was finished and were excluded from the study. Two hundred and twenty-one patients (83.4%) were successfully treated, while 44 (16.6%) failed the treatment.
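The proportions above follow directly from the counts. The sketch below reproduces them and adds a Wilson 95% confidence interval for the success rate; the interval is not reported in the study and is shown only to indicate the precision that a sample of this size gives. The function name `wilson_interval` is hypothetical.

```python
from math import sqrt
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


treated, succeeded = 265, 221          # counts from the text
failed = treated - succeeded           # 44

# prints "success 83.4%, failure 16.6%"
print(f"success {succeeded / treated:.1%}, failure {failed / treated:.1%}")

low, high = wilson_interval(succeeded, treated)
print(f"Wilson 95% CI for success: {low:.1%} to {high:.1%}")
```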

Prognostic factors for failure of Pavlik harness: univariate analysis

Thirteen factors of interest were assessed to identify the ones that could predict failure of PH treatment (Table 3). Age, plagiocephaly, positive Galeazzi sign, hip instability, higher grades on Graf hip classification, and the development of femoral nerve palsy were shown to be significant on univariate analysis (P ≤ 0.05). The mean age of successful treatment was 6.73 weeks in comparison to 8.84 weeks for failure of treatment.
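The age comparison above relies on the Mann-Whitney test named in the Methods. A minimal sketch of the U statistic is given here, counting for every pair of patients whether the first group's value exceeds the second's. The function name `mann_whitney_u` is hypothetical, the ages are invented for illustration and are not study data, and the P value step (exact distribution or normal approximation) is omitted.

```python
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U_x, U_y) for two independent samples.

    U_x counts the pairs in which a value from x exceeds a value from y;
    tied pairs contribute one half to each group.
    """
    u_x = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u_x += 1.0
            elif xi == yj:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Invented ages in weeks, for illustration only.
failure_ages = [8, 9, 10, 11]
success_ages = [4, 5, 6, 8]
u_fail, u_success = mann_whitney_u(failure_ages, success_ages)
print(u_fail, u_success)  # 15.5 0.5
```

Dividing U_x by the number of pairs gives the probability that a randomly chosen patient from the first group is older than one from the second, which is often easier to interpret than U itself.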

Table 3 Univariate analysis of predictors of Pavlik harness treatment failure

The influence of the Graf grade on failure rates is shown in Table 4. The radiological type of hip according to the Graf classification was a significant factor in predicting failure of PH treatment. The higher the grade, the more likely it was that PH treatment failed (Fisher exact test, P = 0.027).

Table 4 Pavlik harness treatment success and failure associated with the grades of the Graf infant hip classification

Multivariate analysis

The significant variables were entered into a multiple logistic regression analysis to obtain confounder-adjusted odds ratios for the prognostic factors (Table 5). Four variables remained significant independent predictors of failed PH treatment: the odds of failure increased 4.43-fold with a positive Galeazzi sign, 5.27-fold with hip instability, 1.3-fold with a higher Graf grade, and 4.724-fold with the development of femoral nerve palsy.
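An adjusted odds ratio of this kind is the exponential of the corresponding logistic regression coefficient, and its confidence interval is obtained by exponentiating the limits on the log-odds scale. The sketch below shows that conversion. The function name `odds_ratio_with_ci` is hypothetical, and the standard error in the example is an invented value, since the coefficients and standard errors are in Table 5, which is not reproduced here.

```python
from math import exp, log
from typing import Tuple


def odds_ratio_with_ci(beta: float, se: float,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Convert a logistic regression coefficient to an odds ratio and CI.

    beta is the coefficient on the log-odds scale and se its standard
    error; the returned tuple is (odds ratio, lower limit, upper limit).
    """
    return exp(beta), exp(beta - z * se), exp(beta + z * se)


# Coefficient implied by the reported odds ratio for a positive Galeazzi
# sign (4.43); the standard error of 0.5 is hypothetical.
beta_galeazzi = log(4.43)
odds_ratio, lower, upper = odds_ratio_with_ci(beta_galeazzi, se=0.5)
print(f"OR {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the interval is symmetric on the log-odds scale, it is asymmetric around the odds ratio itself, which is why the upper limit lies further from the estimate than the lower limit.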

Table 5 Multivariate analysis of predictors of Pavlik harness treatment failure

Discussion

The Pavlik harness is a commonly used treatment device for developmental dysplasia of the hip in patients younger than six months of age [1, 34]. The crude failure rate has been reported to be 11.6% (3.3–40%) [14]; however, a comprehensive assessment of the possible prognostic factors for treatment failure is lacking. Of the patients in this study, 83.4% (n = 221/265) were successfully treated, while 16.6% (n = 44/265) failed, which is within the published range.

Our study identified several predictors of PH treatment failure. Of these, a positive Galeazzi sign, hip instability (on clinical examination), Graf classification, and the development of femoral nerve palsy were identified as independent prognostic factors for failed PH treatment.

The age at initiation of treatment was significant in the univariate analysis; however, in the multivariate analysis, the adjusted OR was no longer significant. This could be because most of the older patients had more severe findings in hip instability, limited hip abduction, and Graf type. Age therefore appears to act as a confounding factor rather than as an independent predictor. Previous studies reported that failure increased in infants older than three to six months [11, 35], whereas in our study, the mean age of the patients who failed treatment was 8.8 weeks. This is consistent with some published studies that reported a higher failure rate in children above seven weeks [36, 37].

Gender was not significantly associated with PH failure on univariate analysis. Males constituted 18.3% of our study population (n = 48/262), and 11 of them failed treatment (22.9% of males; 4.2% of the study population), while females comprised 81.7% (n = 214/262), and 33 of them failed (15.4% of females; 12.6% of the study population). Many studies show no association between gender and failure [11, 38,39,40]. One study, however, indicates that males who are Graf type IV and Ortolani positive (dislocated but reducible) are more likely to fail [41]. Another study claims that males who remain in the PH past seven weeks will require alternative therapy [42]. The significance in these studies could be attributed to the severity of displacement and the prolonged duration of treatment rather than to gender. A multivariate analysis in those studies would have better uncovered the role of confounding factors.

First-born status, family history, and breech presentation have been hypothesized to be predictors of treatment failure; however, our study shows no association between these three factors and PH failure. This is consistent with the study by Omeroglu and colleagues [11]. Some studies have reported that the presence of a foot deformity is a prognostic factor for failure [11, 43]; however, no significant association was established in our study. This could be a genuine finding, but the small number of failed patients with foot deformities precludes a strong recommendation. In our series, patients with foot deformities comprise only 4.6% of the population (n = 12/257) and 4.5% of all failures (n = 2/44).

Intrauterine packaging disorders such as plagiocephaly have been described as risk factors for developing DDH but not for treatment failure. In our univariate analysis, there was a significant association between plagiocephaly and treatment failure; however, after adjusting for confounding factors, the association was no longer significant. There is no literature on plagiocephaly and its prognostic value for PH failure; thus, more studies with larger sample sizes are required.

Galeazzi sign is a clinical finding that signifies the presence of a truly dislocated hip. Univariate and multivariate analyses indicated a strong association between Galeazzi sign and PH treatment failure. To the best of our knowledge, there are no studies that evaluated the presence of Galeazzi sign as a prognostic factor for PH treatment failure.

Like the Galeazzi sign, clinical hip instability and Graf types III and IV are signs of an actually dislocated hip and not of mere dysplasia. Both were significant independent prognostic factors for failure of PH treatment. Several studies report that 60% of clinically dislocated and irreducible hips fail PH treatment, while the failure rate of dislocated but reducible hips is 40% [15, 37, 38, 44, 45]. In our study, the highest failure rate was found in patients whose hips were assessed as dislocated and irreducible (71%, n = 5/7), which is similar to the results of another study in which all cases with dislocated irreducible hips failed Pavlik harness treatment (n = 6/6) [38].

The Graf hip classification has been widely accepted as a radiological prognostic factor for PH treatment failure, with Graf type IIa having the highest success rate and Graf types III and IV the highest failure rate [8, 38, 41, 44]. This is consistent with our findings. However, one intriguing finding is that nine patients (9.4%) with Graf type IIa failed PH treatment. This is more than expected when compared with treatment failure in type IIb. Therefore, we studied these nine patients individually. The median age was eight weeks (1, 2, 5, 8, 8, 9, 10, 10, 11), with six of them at or above the median. Seven were females. Seven were born by caesarean section, and two developed femoral nerve palsy. We caution against drawing any conclusion from these nine patients because the number is small and this may be a chance finding. Moreover, we do not recommend treating type IIa hips because most would resolve naturally.

Femoral nerve palsy is a complication of PH treatment and an early sign that can predict treatment failure [25]. The relationship between femoral nerve palsy and treatment failure is complicated because most clinicians abandon treatment when the palsy occurs. This area will be the focus of ongoing research, which we aim to publish separately.

This study has some limitations. The treatment protocol for DDH changed over time as more evidence became available that most type IIa hips normalize naturally. Initially, type IIa hips were treated with the PH; later, treatment was advocated only for type IIb and worse. The lack of long-term follow-up precludes quantifying some of the long-term complications of PH treatment, such as avascular necrosis of the femoral head and residual dysplasia. However, the study also has several strengths. It is one of the largest published series, with 265 patients drawn from a larger population of 3885 patients who were referred to our baby hip clinics. A thorough analysis of a range of variables has been conducted, including ones that have rarely or never been assessed before, such as plagiocephaly, Galeazzi sign, foot deformities, and femoral nerve palsy.

Conclusions

Pavlik harness treatment is a successful treatment method with an average success rate of 85%. Several independent predictors of PH treatment failure have been identified. These include a positive Galeazzi sign, a frankly dislocated hip, Graf types III and IV, and the development of femoral nerve palsy.