Introduction

Hallux valgus (HV) is a progressive, complex tri-planar deformity of the forefoot, characterized by valgus deviation and occasionally pronation of the great toe, varus angulation of the first metatarsal, lateral displacement of the sesamoids and the extensor tendons, and bunion formation at the first metatarsophalangeal (MTP) joint [1]. Although a wide variation in the prevalence of HV has been reported, there is a strong correlation of this deformity with female sex and advancing age [2,3,4,5]. Mann and Coughlin classified HV into three types based on the hallux valgus angle (HVA) and the intermetatarsal angle (IMA): mild (HVA < 20°, IMA < 11°), moderate (HVA 20–40°, IMA 11–16°), and severe (HVA > 40°, IMA > 16°) [6].

Numerous surgical procedures have been described for correction of HV. The conventional treatment of severe deformity includes proximal metatarsal osteotomies (PMOs). On the other hand, distal metatarsal osteotomies (DMOs) are recommended for mild and moderate HV deformity [6, 7]. However, recent evidence indicates that the outcomes after distal metatarsal osteotomies for moderate to severe HV deformity appear to be satisfactory.

We hypothesized that extending the indications for DMOs may result in clinical and radiological outcomes comparable to those of PMOs. The aim of this study was to compare the efficacy of proximal and distal metatarsal osteotomies for correction of moderate to severe HV deformity by means of a random-effects meta-analysis.

Methods

The present meta-analysis was prospectively registered with PROSPERO (CRD42017068312). We also followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [8].

Inclusion and exclusion criteria

We enrolled studies that compared the results of proximal and distal metatarsal osteotomies in adults with moderate to severe hallux valgus deformity. This review included studies that compared any type of PMO with any type of DMO.

We excluded patients treated with scarf osteotomy, as this is considered a mid-shaft rather than a proximal or distal metatarsal osteotomy [9, 10]. We also did not consider studies assessing the efficacy of the Lapidus procedure, since this is classified as an arthrodesis of the tarsometatarsal (TMT) joint [11].

Outcome assessment

The primary outcome measures were the American Orthopaedic Foot and Ankle Society (AOFAS) scoring system [12] and first IMA. The secondary outcomes were the HVA, tibial sesamoid position, participants’ satisfaction, and post-operative complications [13]. We assessed all the above outcomes in the following time periods: short term (≤ 1 year) and medium term (> 1 and < 10 years).

Literature search

We performed a literature search including the following electronic databases up to 25 July 2017: PubMed, Scopus, and Cochrane Central Register of Controlled Trials (CENTRAL). In these database searches, we applied no language restrictions. We also considered reference lists of relevant studies. Furthermore, we searched the following registries for completed unpublished comparative studies: ClinicalTrials.gov; Australian New Zealand Clinical Trials Registry (ANZCTR); and the International Standard Randomized Controlled Trial Number (ISRCTN) Register.

For the search strategy, we used the following terms: “hallux valgus,” “random*,” “comparative study,” “retrospective study,” “distal osteotomy,” “proximal osteotomy.” We adapted this search to each included database. In electronic supplementary material (ESM) 1, we provide the detailed search strategy we used for PubMed.

Study selection

Two review authors (KT and DK) searched for records in a blinded fashion. Then, the titles and abstracts of these records were screened for eligibility. For the remaining articles, we obtained the full texts and assessed them for potential inclusion. We considered only RCTs in the meta-analysis.

Data extraction

Two reviewers (KT and DK) extracted the data independently. We abstracted information including the year of publication, comparators in the control group, and the number and demographics of patients in the included intervention groups. We also extracted information about the intervention characteristics, study outcomes, follow-up, and complications.

Quality assessment

Two investigators (KT and DK) independently performed the quality assessment of individual trials using the Cochrane Collaboration’s “risk of bias” and ROBINS-I tools for randomized and non-randomized studies, respectively [14, 15]. For the assessment of the included trials, we considered the following domains: randomization; allocation concealment; blinding of patients; blinding of personnel; blinding of outcome assessors; incomplete outcome data; selective reporting; and other bias. We judged each domain as either low, unclear, or high risk of bias.

Furthermore, we assessed the quality across studies. For each domain of the Cochrane’s risk of bias tool, if more than half of the information was from studies at a low risk of bias, we judged the domain to be at a low risk of bias. If most information was from studies at an unclear/high risk of bias, we considered the domain to be at an unclear/high risk of bias, respectively. For all study outcomes, we considered the domain of masking of outcome assessors to be crucial [16].

For the assessment of non-randomized trials, we considered three major domains:

  1. Pre-intervention

  2. At intervention

  3. Post-intervention

In the first domain, we assessed the risk of bias due to confounding and in the selection of patients for the study. In the second domain, we evaluated the risk of bias in the classification of interventions. In the third domain, we assessed the risk of bias due to deviations from intended interventions and missing data, as well as the risk of bias in the measurement of outcomes and the selection of the reported results. The possible judgments were “low risk,” “moderate risk,” “serious risk,” and “critical risk” of bias. The exploration of small-study effects and publication bias was conditional on the number of included studies (i.e., 10 as a minimum); where feasible, it was to be performed using funnel plots and statistical models [17, 18].

Finally, we contacted the corresponding authors of the included studies to request additional information in regard to the quality assessment. We resolved any discrepancies about the risk of bias assessment through discussion.

Statistical analysis

We used the Review Manager (RevMan) Software (version 5.3) to perform pair-wise meta-analysis [19]. For continuous outcomes, we conducted random effects quantitative synthesis utilizing the effect size of standardized mean difference (SMD) and calculated 95% confidence intervals (CIs) according to the inverse variance method. For dichotomous outcomes, we conducted a random effects meta-analysis using the Mantel-Haenszel method and considered the effect measure of odds ratio (OR).
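As an illustration of the pooling approach described above, the following minimal Python sketch computes a bias-corrected SMD (Hedges' g) from group summaries and pools study-level effects with inverse-variance weights under a DerSimonian-Laird random-effects model. It is not the RevMan implementation: the function names, the variance approximation, and the normal-based confidence interval are assumptions made for illustration, and RevMan's exact formulas may differ slightly.

```python
import math
from statistics import NormalDist


def hedges_g(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Bias-corrected SMD (Hedges' g) and an approximate variance."""
    n = n_a + n_b
    pooled_sd = math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n - 2))
    d = (mean_a - mean_b) / pooled_sd
    j = 1 - 3 / (4 * n - 9)  # small-sample correction factor
    var_d = n / (n_a * n_b) + d**2 / (2 * n)
    return j * d, j**2 * var_d


def pool_random_effects(effects, variances, level=0.95):
    """DerSimonian-Laird random-effects pooling with inverse-variance weights."""
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures the spread of study effects around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi**2 for wi in w) / sw
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    p = 2 * (1 - NormalDist().cdf(abs(pooled / se)))
    return {
        "pooled": pooled,
        "se": se,
        "ci_low": pooled - z_crit * se,
        "ci_high": pooled + z_crit * se,
        "tau2": tau2,
        "p": p,
    }
```

For dichotomous outcomes, the same pooling function could be applied to log odds ratios and their variances; the Mantel-Haenszel weighting used in the review is not reproduced in this sketch.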

In this review, a p value of less than 0.05 indicated statistical significance. We explored for statistical heterogeneity using the Q statistic and measured the extent of heterogeneity using the I2 statistic. We considered the following classification of statistical heterogeneity [20]:

  • I2 = 0–40%: not important heterogeneity

  • I2 = 30–60%: moderate heterogeneity

  • I2 = 50–90%: substantial heterogeneity

  • I2 = 75–100%: considerable heterogeneity
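The heterogeneity statistics named above can be sketched as follows. This is a minimal illustration, assuming study effects and their variances are already available; the function names are hypothetical. Because the four bands overlap by design, the helper returns every label whose range contains the observed I2 value rather than forcing a single category.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, its degrees of freedom, and I2 (in percent)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    # I2 is the share of total variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# Bands as listed in the text; they overlap intentionally.
I2_BANDS = [
    (0, 40, "not important"),
    (30, 60, "moderate"),
    (50, 90, "substantial"),
    (75, 100, "considerable"),
]


def describe_i2(i2):
    """Return all band labels whose range includes the given I2 value."""
    return [label for low, high, label in I2_BANDS if low <= i2 <= high]
```

In overlapping regions the interpretation also depends on the magnitude and direction of the effects and on the strength of evidence for heterogeneity (for example, the p value of the Q test).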

Synthesis of the results

For clinical and radiological outcomes, we avoided combining data from different study designs [21]. Instead, we provided a qualitative presentation of the results from non-randomized studies with the aim to supplement the results of our meta-analysis [21].

For the reported complications, we proceeded with a quantitative comparison between the intervention groups, after accounting for the randomization of the included studies. Briefly, we classified unintended events into complications involving bone and soft tissues. We thereafter conducted a post hoc meta-analysis, which was requested by the peer reviewers, using subgroup analyses for randomized and non-randomized studies.

Subgroup and sensitivity analyses

We accounted for the impact of the soft tissue release on the outcomes of the included operations by conducting a pre-specified subgroup analysis. Moreover, we performed a sensitivity analysis, in which we excluded trials at an unclear and high risk of bias. Finally, we conducted a post hoc sub-analysis, in which we kept only studies assessing patients with an IMA of > 16°.

Clinical interpretation of the results

For the classification of effect sizes in this meta-analysis, an SMD of 0.2 was taken to indicate a small effect, 0.5 a moderate effect, and 0.8 a large effect [22]. For the clinical interpretation of the results, we accounted for the level of evidence and the statistical power of the analyses.
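A minimal sketch of this classification is given below. The text specifies three reference values rather than ranges, so treating them as lower bounds, and labelling anything below 0.2 as "negligible", are assumptions made here for illustration only.

```python
def classify_smd(smd):
    """Label the magnitude of an SMD using 0.2 / 0.5 / 0.8 as lower bounds."""
    size = abs(smd)  # the sign only indicates which group is favored
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "small"
    return "negligible"  # hypothetical label, not defined in the text
```

Under these assumptions, an SMD of − 0.38 would be labelled small and an SMD of − 0.95 large.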

Results

The literature search yielded 568 potentially relevant studies. Duplicates were removed and the remaining 431 studies were screened according to the information provided in their title and abstract. After the exclusion of 420 records, we assessed the remaining 11 articles for eligibility. In two articles, the patients did not present moderate to severe HV deformity [23, 24]. We enrolled 9 published studies in the qualitative synthesis [25,26,27,28,29,30,31,32,33]. Finally, we statistically pooled the results from 5 randomized trials (Fig. 1).

Fig. 1 Flow diagram of the study selection procedure. HV, hallux valgus

Study characteristics

In this systematic review, we considered nine comparative studies comprising a total of 696 feet. We abstracted information from 604 patients. Of the six randomized trials considered in the present review, five were eligible for inclusion in the quantitative synthesis [26, 28, 30, 32, 33]. The enrolled studies were published between 1991 and 2015. Four trials were conducted in Korea [29,30,31,32], one in the USA [25], one in Sweden [33], one in Turkey [26], one in China [27], and one in Thailand [28]. The mean age of participants in the intervention groups ranged between 31 and 65 years of age (Table 1).

Table 1 Baseline characteristics of the enrolled studies

In the proximal osteotomy group, we considered the following procedures: proximal crescentic, proximal chevron, Ludloff (30° proximal oblique diaphyseal), opening wedge, Mau (proximal oblique), and proximal closing wedge metatarsal osteotomies. On the other hand, in the distal metatarsal osteotomy group, we enrolled the following interventions: distal chevron, Bösch (subcapital linear), Hohmann, and Lindgren-Turan (30° subcapital transverse) osteotomies. For a thorough presentation of the intervention-related characteristics of the included studies, please see Table 2.

Table 2 Intervention-related characteristics of the included studies

Quality assessment

For the RCTs, the results of the quality assessment of individual studies are shown in Table 3. Adequate randomization was performed in two trials of the present systematic review [28, 30]. No registration protocols were available for any of the enrolled studies. We highlight that, due to the different anatomical sites of the osteotomies, it was not possible for the participants to be blinded to the interventions they received. Successful blinding of the outcome assessors was reported in three trials [28, 30, 32]. For the risk of bias assessment across trials, most domains were at an unclear risk of bias (ESM 2). The findings from the quality assessment of non-randomized studies are presented in ESM 2. It should be noted that the number of included studies did not allow us to create funnel plots for the assessment of publication bias.

Table 3 Risk of bias assessment of the individual randomized trials

Synthesis of the results from randomized studies

Short-term assessment

Four studies were eligible for inclusion in the short-term outcome assessment of the present SR [28, 29, 31, 33]. Statistical pooling was possible for 185 feet. For the first IMA and HVA, there was a statistically significant difference in favour of the PMO group (SMD was − 0.71, 95% CIs − 1.02 to − 0.41, p < 0.05; and − 0.95, 95% CIs − 1.26 to − 0.64, p < 0.05, respectively) (ESM 3). In these analyses, heterogeneity was insignificant.

Medium-term assessment

For the assessment of the IMA, there was a statistically significant difference in favor of PMO group in the medium term (SMD = − 0.38, 95% CIs − 0.65 to − 0.12, p < 0.05) (Fig. 2). For this analysis, we detected low heterogeneity levels (I2 = 21%, p = 0.28).

Fig. 2 Forest plot of standardized mean differences for the assessment of the first intermetatarsal angle in the medium term. Three different subgroups are considered. Vertical line demonstrates no difference between the two comparison groups. An overall statistically significant difference in favour of the proximal metatarsal osteotomy group is shown. SMD standardized mean difference, IV inverse variance, SD standard deviation, CI confidence interval, PMO proximal metatarsal osteotomy, DMO distal metatarsal osteotomy

Regarding the AOFAS evaluation in the medium term, there were no statistically significant differences between the intervention groups (SMD = 0.18, 95% CIs − 0.21 to 0.57, p = 0.37) (Fig. 3). In this analysis, statistical heterogeneity levels were considered to be moderate, albeit not significant (I2 = 46%, p = 0.16).

Fig. 3 Forest plot of standardized mean differences for the assessment of the American Orthopaedic Foot and Ankle Society scoring system in the medium term. Two different subgroups are considered. Vertical line demonstrates no difference between the two comparison groups. In the overall analysis, no statistically significant differences are shown. SMD standardized mean difference, IV inverse variance, SD standard deviation, CI confidence interval, PMO proximal metatarsal osteotomy, DMO distal metatarsal osteotomy

For the assessment of the HVA, we did not detect any significant differences between the intervention groups (SMD = − 0.25, 95% CIs − 0.57 to 0.06, p = 0.12) (Fig. 4). Heterogeneity was not statistically significant (I2 = 43%, p = 0.15).

Fig. 4 Forest plot of standardized mean differences for the assessment of the hallux valgus angle in the medium term. Three different subgroups are considered. Vertical line demonstrates no difference between the two comparison groups. In the overall analysis, no statistically significant differences are shown. SMD standardized mean difference, IV inverse variance, SD standard deviation, CI confidence interval, PMO proximal metatarsal osteotomy, DMO distal metatarsal osteotomy

For the assessment of the participants’ satisfaction, and tibial sesamoid position, we did not detect any significant differences between the intervention groups (Fig. 5 and ESM 4, respectively). Statistical heterogeneity was low in all cases.

Fig. 5 Forest plot of odds ratios for the assessment of the participants’ satisfaction in the medium term. Two different subgroups are considered. No statistically significant differences are shown. M-H Mantel-Haenszel, CI confidence interval, PMO proximal metatarsal osteotomy, DMO distal metatarsal osteotomy

Results from non-randomized studies

A descriptive presentation of the main clinical and radiographic findings from non-randomized studies is provided in ESM 5. For these studies, we did not detect any significant clinical differences between the intervention groups. It should be noted that in one study, statistically significant differences in favour of the DMO group were reported regarding the first IMA and tibial sesamoid position changes [27].

Complications

For a qualitative presentation of the reported complications, please see ESM 6. For the complications involving bone and soft tissues, we did not detect any significant differences between the intervention groups (OR = 1.37, 95% CIs 0.62 to 3.06, p = 0.44, I2 = 32%; and OR = 1.30, 95% CIs 0.40 to 4.24, p = 0.66, I2 = 41%, respectively) (ESM 7).
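As a consistency check, a two-sided p value can be approximately recovered from a point estimate and its 95% confidence interval. The sketch below assumes a normal approximation and a symmetric interval (on the log scale for ratio measures such as the odds ratio); the function name and its parameters are hypothetical, and results computed from rounded published values are only approximate.

```python
import math
from statistics import NormalDist


def p_from_ci(estimate, lower, upper, ratio=False, level=0.95):
    """Approximate two-sided p value from an estimate and its confidence interval."""
    if ratio:
        # Odds ratios are analyzed on the log scale, where the CI is symmetric.
        estimate, lower, upper = (math.log(x) for x in (estimate, lower, upper))
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    se = (upper - lower) / (2 * z_crit)  # CI width is 2 * z_crit * SE
    z = estimate / se
    return 2 * (1 - NormalDist().cdf(abs(z)))
```

Applied to the two odds ratios reported for bone and soft tissue complications, this approximation yields p values that agree with the reported ones to two decimal places.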

Sensitivity analyses

We conducted a pre-determined sensitivity analysis, in which we excluded trials at an unclear and high risk of bias and detected insignificant heterogeneity levels (ESM 8). Moreover, we did not detect any significant differences between our primary and sensitivity analyses when we accounted for the severity of the deformity (ESM 8).

Discussion

In this systematic review, we explored the efficacy of proximal and distal metatarsal osteotomies for the surgical management of moderate to severe HV deformity. Proximal metatarsal osteotomy offers a greater correction of the deformity and minimal shortening of the metatarsal, because the correction is performed near the first metatarsocuneiform joint. However, osteotomy at this site is relatively unstable and takes longer to heal than a DMO. Proximal metatarsal osteotomy is also likely to heal in dorsiflexion, which could cause transfer metatarsalgia [31]. On the other hand, DMO offers a relatively smaller amount of correction of the deformity and causes some post-operative shortening of the first metatarsal, with a risk of avascular necrosis of the metatarsal head. However, DMO requires a shorter incision and improves pain and functional ability in a wider range of deformities [28]. Recent studies have shown that the extension of indications of distal metatarsal osteotomies may be a viable option. In the present systematic review, we used information from 696 feet with the aim of clarifying this controversy. For patient-reported and clinician-oriented outcomes, the quantitative synthesis showed that there were no significant differences between the intervention groups in the medium term. These findings were supported by data from non-randomized studies and remained robust after controlling for the severity of the deformity.

Dealing with clinical diversity

We highlight that the scope of a systematic review determines the extent to which the included studies are diverse [20]. In the present study, we performed a comparison linked to the anatomical location of the metatarsal osteotomies using a pair-wise meta-analysis study design. Taking into account that the optimal proximal and distal metatarsal osteotomies for moderate to severe hallux valgus deformity have yet to be defined, we considered that a meta-analysis study design was appropriate to test our hypothesis. After analyzing data, we observed insignificant heterogeneity indicating that the intervention effects were not significantly affected by factors that varied across studies [20].

Medium-term clinical and radiographic findings

For the assessment of the first IMA, we detected a slight superiority in favor of the PMO group. Among the PMOs and DMOs, evidence from randomized and non-randomized studies showed that none was favoured in terms of the patients’ satisfaction and AOFAS assessment. These findings remained robust after adjusting for the risk of bias. Finally, we highlight that complication rates were numerically higher in the PMO group, although the difference was not statistically significant.

Clinical implications

For the management of HV deformity, PMOs are considered to be more technically demanding and require a more restricted post-intervention protocol compared to DMOs [9, 34]. As we did not detect any significant differences between the therapeutic efficacy of proximal and distal metatarsal osteotomies, this study suggests that the decision between these two interventions depends on a surgeon’s skills and/or preference. We also highlight that health policy makers should take into account the higher number of reported complications after proximal metatarsal osteotomies and cost-effectiveness of different techniques for bunion surgery [35].

Strengths and limitations of the present systematic review

In this SR, the sample size was adequate to allow us to test our hypothesis. However, half of the included studies were judged to be at an unclear risk of bias. Thus, we suggest that the results of the present systematic review be interpreted with caution. It should be noted that in 89 cases of the present systematic review, a simultaneous bilateral correction was performed. We also noticed variability in the definition of moderate and severe hallux valgus deformity among the included studies. Finally, we found limited reporting of long-term results in the enrolled studies (i.e., at 10 years and beyond).

Implications for future research

For a thorough pre-operative assessment, we recommend that the authors of future studies report not only the pre-operative HVA and IMA measurements, but also the interphalangeal angle, first metatarsal protrusion distance, and sesamoid rotation angle [36, 37]. Moreover, we suggest that more emphasis be placed on the reporting of long-term observations.

Conclusions

In conclusion, for the management of moderate to severe HV deformity, we found no significant clinical and radiographic differences between patients treated with proximal and distal metatarsal osteotomies. This was also the case for the reported complications. Accordingly, we underline that the extension of the indications for distal metatarsal osteotomies may be a viable option. Between proximal and distal metatarsal osteotomies, we recommend that an orthopedic surgeon decide not only on the basis of his/her personal skills and/or preference but also on the cost-effectiveness of the available techniques.