Introduction

Freezing of gait (FOG) is among the most disabling gait disorders of Parkinson’s disease (PD), affecting 26% of people with mild PD and 80% of those with severe PD. FOG is characterized by sudden, relatively brief episodes of inability to initiate or continue effective forward stepping; patients often describe it as if their feet were ‘stuck to the floor’ while their upper body continues its original trajectory. As the disease progresses, FOG occurs more frequently and becomes more resistant to dopaminergic medications. It places a major burden on patients’ daily living and is associated with impaired function and mobility, increased risk of falls and related injuries, poor quality of life, loss of independence, and high rates of institutionalization and mortality [1].

Recent neuroscience studies have suggested that impairment of the basal ganglia in PD affects the activation of supplementary motor areas, generating deficits in activity preparation and causing abnormal movements such as FOG [2]. FOG is associated with impaired dual-tasking ability, that is, an inability to set-shift attention among the motor, limbic, sensory and cognitive networks. As summarized by Gilat and colleagues [3], FOG is more frequently triggered by turning; by cognitive challenges while walking, such as dual-tasking; by environmental challenges, such as negotiating doorways and approaching a destination; and by reduced visual input, such as walking in the dark. Greater anxiety is also associated with worse FOG.

The pathophysiology underlying FOG is multifaceted, with an interplay between motor elements (dysregulated stepping mechanisms) and non-motor elements (anxiety and cognitive decline) [1]. A narrative review [4] identified 59 unique behavioral compensation strategies used or self-invented by patients with PD to overcome FOG, including changing balance and gait patterns, using external stimuli and internal triggers, and applying cognitive training and motor imagery techniques. In clinical and research settings, a wide range of behavioral modalities have likewise been designed and applied to target various FOG triggers and/or determinants, including physiotherapy [5, 6], external cueing [1], attentional exercises [7], and cognitive training [8].

Two recently published systematic reviews/meta-analyses examined the evidence for non-pharmacological interventions on FOG. The first included 19 RCTs testing physiotherapy interventions with FOG as a primary outcome [5]. It concluded that physical therapy improved subjective FOG compared to both active (n = 10, Z = 3.90, p < 0.001) and passive control groups (n = 9, Z = 3.42, p < 0.001). Significant residual effects were found in eight studies comparing action observation to an active control intervention (n = 4, p = 0.002), but not cueing (n = 2, p = 0.78). However, this meta-analysis included only studies that tested physiotherapy-related interventions and excluded RCTs of alternative mind–body interventions such as dance and Taichi.

The second review, conducted by Gilat and colleagues [3], examined a broader spectrum of exercise- and training-based interventions for FOG with a primary focus on freezers, and included 41 studies and 1838 patients. It adopted a conceptual model to categorize diverse behavioral interventions into three subgroups based on their relevance to FOG: (1) FOG-specific interventions, which mostly targeted triggers (such as cueing offered to help patients overcome FOG episodes, and action-observation training strategies designed to overcome FOG-provoking situations); (2) FOG-relevant interventions, which targeted the underlying motor and/or non-motor determinants of FOG with the aim of reducing FOG severity rather than immediately alleviating imminent FOG episodes (such as cognitive training, cognitive-motor dual-task training, and treadmill training with cueing aimed at improving gait parameters other than FOG); and (3) generic exercises, which included conventional physical therapy or generic exercises that might not be FOG-specific (e.g., dance, yoga, Taichi, gait training, and muscle-power training). The summarized evidence revealed a favorable small–moderate effect size (ES = −0.37) of a wide variety of training modalities for reducing subjective FOG severity (p < 0.00001) compared to any type of control condition, though several interventions were not directly aimed at FOG and some included non-freezers. The review also found that FOG-specific targeted training, such as cueing and action-observation training, demonstrated moderate effects in helping patients overcome imminent FOG episodes, whereas generic exercises did not [3]. However, significant heterogeneity was identified across study effects and intervention designs within the three intervention categories. Thus, the results should be interpreted cautiously.

Pragmatically, each exercise/training modality might target multiple triggers and/or determinants of FOG, with or without purposeful intention. For example, a treadmill may itself act as a rhythmic auditory cue for alleviating imminent FOG episodes. Hence, questions remain about how strong the evidence is for each behavioral intervention to reduce FOG.

Although many trials and several reviews have compared treatments for FOG, each has compared only two or a few treatments. There is a lack of integrated and systematic evidence to inform the relative efficacy of all tested behavioral strategies for FOG. This integration is important because different strategies vary in both cost and efficacy. Existing evidence has typically relied on conventional pairwise meta-analysis, so conclusions were limited to pairwise comparisons of subsets of these treatments. In this systematic review, we adopted a network meta-analysis approach to compare multiple treatments simultaneously in a single analysis by combining direct and indirect evidence within a network of randomized controlled trials (RCTs). Conventional meta-analysis can only pool studies designed for direct comparisons of interventions; thus, inadequate direct comparisons limit its credibility. In contrast, a network meta-analysis also considers indirect comparisons and thus allows more reliable comparisons among different interventions [9]. Given the diverse variety of behavioral intervention trials for FOG to date, network meta-analysis is an appealing alternative approach to provide much-needed evidence to inform future clinical and research directions in FOG rehabilitation and to support decision-making by patients, clinicians, and service commissioners.

To our knowledge, no review has yet compared different behavioral interventions relative to each other using network meta-analysis, in which all interventions that have been tested in RCTs can be simultaneously compared and their effect can be estimated relative to each other and to a common reference condition (e.g., usual care). In this review, we undertook a network meta-analysis of all behavioral interventions that have been tested in RCTs for FOG management. We aimed to examine the comparative effectiveness and treatment ranking probabilities of existing behavioral strategies for FOG management. Such information would help to support evidence-based research directions and recommendations regarding FOG management for PD.

Methods

Search strategy

The protocol of this systematic review was registered in the International Prospective Register of Systematic Reviews (PROSPERO), National Institute for Health Research [Protocol no. CRD42021226951]. The Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) and its extension statement for network meta-analysis were adopted in this network meta-analysis [10]. The objectives were achieved by keeping the search broad and including the following: (1) all kinds of behavioral interventions; (2) the target population comprising people with idiopathic PD, regardless of disease severity and disease stage; and (3) any outcome measures related to mobility and gait.

We systematically searched the following databases from 1990 to December 2021: Cochrane Library, PubMed, CINAHL, Ovid Medline, EMBASE and PsycINFO. The search terms were ‘Parkinson’, ‘Parkinson’s’ OR ‘Parkinson’s disease’ AND ‘gait’, ‘balance’, ‘aquatic’, ‘exercise’, ‘cognitive training’, ‘mindfulness’, ‘cue’, ‘cueing’, ‘dance’ OR ‘mind body’ AND ‘freezing’, ‘freezing of gait’. Full-text articles published as abstracts and in conference proceedings were included. We reviewed the reference lists of relevant systematic reviews and of all included studies. We also searched ClinicalTrials.gov and the World Health Organization International Clinical Trials Registry Platform. For studies with incomplete data, we wrote to the authors to request the missing data. The full search strategy for each database is shown in Online Resource 1.

Study selection

Controlled clinical trials (CCTs) were selected to ensure a high level of evidence. Cross-over trials were incorporated by including only data from the first period. We included studies published in English. The inclusion criteria were defined according to the PICO framework. Previous literature suggested that freezers exhibiting very mild episodes may be misdiagnosed as non-freezers based on clinical observation, which may result in selection bias in many trials focused only on FOG [11]. Many patients who experience mild FOG symptoms may be underdiagnosed or may not seek or receive any active treatment. To comprehensively evaluate the evidence for behavioral interventions for FOG management, this review purposely included not only studies with clinically diagnosed freezers, but also all studies that reported patient-perceived FOG outcomes, regardless of disease severity and disease stage. For interventions, all types of behavioral interventions were included, regardless of intervention modality and intensity. For comparisons, all types of control were considered eligible, such as usual care, no-intervention control, waitlist, or other behavioral interventions (comparative effectiveness trials). For outcomes, all types of gait-specific outcome measures were included to provide a comprehensive evaluation of the effects of behavioral interventions on FOG for patients with PD.

Screening, data extraction and quality assessment

Citations were screened by two researchers independently (LC, CL), and citations that were not related to trials or to mobility and gait for PD were excluded; potentially relevant citations were checked by a third researcher (JK). Using a standardized data extraction Excel template, study characteristics and outcomes were extracted by two researchers (LC, YS) and checked by a third researcher (JK). Two reviewers (LC, YS) independently appraised each article using the revised Cochrane risk-of-bias tool 2.0 (RoB 2) for randomized controlled trials, and discussed any disagreements until consensus was reached with a third reviewer (JK) [12]. RoB 2 is the recommended tool to assess the risk of bias in randomized trials included in Cochrane Reviews, which assesses different aspects of trial design, conduct and reporting, and categorizes the risk of bias into ‘low’, ‘some concerns’, or ‘high’ risk of bias. Certainty of evidence contributing to the network estimate of the main outcome was evaluated through the GRADE method [13].

Statistical analysis

The primary outcome was FOG severity measured by the Freezing of Gait Questionnaire (FOGQ) or the New Freezing of Gait Questionnaire (nFOGQ), which provide a global measure of the severity and impact of FOG on patients’ daily life. To date, the FOGQ and nFOGQ are considered the only validated and reliable clinical tests available to subjectively assess FOG in PD patients [5, 14]. The FOGQ is a 6-item scale (range 0–24) consisting of four items assessing FOG severity and two items assessing gait difficulties in general [15]. The nFOGQ (range 0–28) consists of three parts: part I consists of an initial item distinguishing freezers from non-freezers; part II assesses FOG severity (frequency and duration); and part III assesses its impact on daily life [16]. Higher scores denote more severe FOG symptoms [3]. Continuous data were extracted for all included studies. Standardized mean differences (SMDs) were calculated for the treatment effect as the between-arm difference in change scores from baseline to post-intervention. The SMD was calculated as the difference in FOG change between groups divided by the pooled standard deviation (SD) of FOG change. Negative SMD values indicate a reduction in FOG symptoms in the treatment arm compared to the comparison arm. For studies that reported only baseline and post-intervention means and error scores, the pooled SD of change for FOG was estimated following the Cochrane Handbook [17]. A correlation coefficient (r = 0.86) was used to impute the pooled SD of change, based on the correlation between baseline, follow-up and change data in studies that reported all three [18,19,20]. Where both were reported, we used results that accounted for missing data (e.g., multiple imputation) rather than results from only those participants who completed the study. Study results were extracted on an intention-to-treat basis where possible.
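
The change-score calculation described above can be illustrated with a short sketch. It is written in Python for illustration only and is not the analysis code used in this review (which was run in R). The function names and the example trial values are hypothetical. The SD of change uses the standard Cochrane Handbook imputation formula with r = 0.86; the use of a plain pooled-SD SMD without a small-sample correction is an assumption.

```python
import math


def sd_of_change(sd_baseline, sd_post, r=0.86):
    """Impute the SD of change scores from baseline and post-intervention SDs.

    Cochrane Handbook formula:
    SD_change = sqrt(SD_b^2 + SD_p^2 - 2 * r * SD_b * SD_p)
    """
    return math.sqrt(sd_baseline**2 + sd_post**2 - 2.0 * r * sd_baseline * sd_post)


def smd_of_change(change_trt, sd_trt, n_trt, change_ctl, sd_ctl, n_ctl):
    """SMD = difference in mean change divided by the pooled SD of change.

    Assumption: no small-sample (Hedges) correction is applied.
    """
    pooled_var = ((n_trt - 1) * sd_trt**2 + (n_ctl - 1) * sd_ctl**2) / (n_trt + n_ctl - 2)
    return (change_trt - change_ctl) / math.sqrt(pooled_var)


# Hypothetical two-arm trial reporting only baseline and post-intervention SDs.
sd_trt = sd_of_change(sd_baseline=4.0, sd_post=4.0)
sd_ctl = sd_of_change(sd_baseline=4.0, sd_post=4.0)
smd = smd_of_change(change_trt=-3.0, sd_trt=sd_trt, n_trt=20,
                    change_ctl=-1.0, sd_ctl=sd_ctl, n_ctl=20)
print(f"SD of change = {sd_trt:.3f}, SMD = {smd:.3f}")
```

In this hypothetical example the treatment arm improves by more than the control arm, so the SMD is negative, which matches the sign convention used in this review.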

A random-effects Bayesian network meta-analysis was conducted to account for between-trial effects and effects of trials with more than two arms [21]. The goodness of fit of the random-effects model was assessed by comparing its deviance information criterion (DIC) with that of a fixed-effects model. The network meta-analysis was conducted with a Bayesian Markov chain Monte Carlo (MCMC) method fitted using Just Another Gibbs Sampler (JAGS) software within R Statistical Software using the GEMTC package (R Core Team 2020). We set non-informative priors and ran four MCMC chains simultaneously. The Bayesian model ran 5000 burn-in iterations and 100,000 simulation iterations. Convergence was assessed using the potential scale reduction factor (PSRF), which was expected to fall below 1.05, and using Gelman–Rubin diagnostic trace plots. Heterogeneity (direct evidence) and model consistency (consistency between direct and indirect effect sizes) were assessed using the node-split function of the GEMTC package, and sources of heterogeneity between studies were explored. Heterogeneity in direct evidence comparisons was assessed using the I² statistic for the pooled SMD, with a value of 0% indicating no observed heterogeneity and values greater than 50% considered substantial [22]. Results for each possible comparison are reported as SMDs with 95% credible intervals (Crls), which can be considered Bayesian equivalents of confidence intervals. As a measure that reflects ranking and uncertainty, we used the surface under the cumulative ranking curve (SUCRA) [23]. The SUCRA score, which ranges from 0 to 1, shows the relative probability of an intervention being among the best options. We also estimated the probability of each intervention being the most effective (first best), the second best, the third best, and so on. Ranked probability represents the probability of the ranking performance of each intervention type.
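
To make the SUCRA score concrete, the sketch below computes it from a matrix of rank probabilities using the standard definition: the mean of the cumulative ranking probabilities over the first a − 1 ranks, where a is the number of treatments. It is a Python illustration only, not the GEMTC implementation used in this review, and the treatment labels and rank probabilities are hypothetical.

```python
def sucra(rank_probs):
    """Compute SUCRA for each treatment.

    rank_probs[k][j] is the probability that treatment k has rank j + 1,
    where rank 1 is the best. Each row is assumed to sum to 1.
    """
    n_treatments = len(rank_probs)
    scores = []
    for row in rank_probs:
        cumulative = 0.0
        area = 0.0
        # Sum cumulative probabilities over ranks 1 .. a-1 (the last rank always gives 1).
        for p in row[:-1]:
            cumulative += p
            area += cumulative
        scores.append(area / (n_treatments - 1))
    return scores


# Hypothetical rank probabilities for three treatments (rows) and three ranks (columns).
labels = ["treatment A", "treatment B", "usual care"]
probs = [
    [0.7, 0.2, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
]
for label, score in zip(labels, sucra(probs)):
    print(f"{label}: SUCRA = {score:.2f}")
```

A treatment that is certain to rank first has a SUCRA of 1, and one that is certain to rank last has a SUCRA of 0, so values closer to 1 indicate a higher likelihood of being among the best options.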

Network meta-regressions were conducted to evaluate study characteristics that may influence the effect sizes of interventions within the network model. Factors for exploring the meta-regressions included year of publication, study sample age, study sample’s baseline FOG (if possible), proportion of males within sample and follow-up number of weeks. Publication bias was assessed using a funnel plot of action observation training vs usual care, as this comparison contained the most studies to make an estimation of publication bias, with a trim and fill analysis conducted. Publication bias was only assessed using studies that compared an active treatment to usual care so that there is a consistent comparison between all studies, while retaining the most studies from the network meta-analysis in the analysis.

Results

Study selection

The PRISMA 2020 flow diagram (Fig. 1) shows the selection process of this systematic review. The full search strategy used for each database is shown in Online Resource 1. A total of 15,802 papers were initially identified. After removing duplicates, 9483 abstracts were screened for potential inclusion. Titles and abstracts were used to determine whether an article matched the inclusion criteria. Five hundred and sixty full-text papers were then retrieved to confirm their eligibility. Twenty-one articles were identified through a manual search of the papers’ reference lists. Forty-seven articles on behavioral interventions for patients with PD were included for qualitative and quantitative synthesis. A total of 37 studies were included in the network meta-analysis. Three studies were excluded because they reported FOG outcomes in a format that could not be pooled [24,25,26]. Seven studies were excluded because their comparison groups did not match the purpose of this network meta-analysis [29,30,31,32,33], for instance, those testing interventions within the same category (e.g., tango versus mixed-genre dance) and multimodal interventions. The findings of one study were published in two papers [34, 35]. The included trials assessed 72 interventions or control conditions. Guided by the European Physiotherapy Guideline for PD [36], we further categorized the included interventions into 12 categories. Table 1 shows the operational definition of each category.

Fig. 1 PRISMA 2020 flow diagram

Table 1 Operational definition of behavioral interventions

Summary of network geometry

Figure 2 shows a network graph comparing 12 categories of behavioral interventions for FOG management. A total of 1454 PD patients were included in the network geometry. Among them, 1028 received active behavioral interventions and 426 received usual care as control. The most commonly studied interventions were conventional physiotherapy (n = 14 trials, patients receiving treatment = 246), generic exercises (n = 8 trials, patients receiving treatment = 93), and mind–body exercises (n = 9 trials, patients receiving treatment = 212). The most frequent comparisons were mind–body exercises versus usual care (n = 7), conventional physiotherapy versus usual care (n = 6), and generic exercises versus usual care (n = 4).

Fig. 2 Network diagram of direct comparisons among all behavioral interventions for FOG. Key: lines represent treatments with direct comparisons. The size of treatment nodes reflects the number of patients randomly assigned to each treatment. The thickness of edges represents the number of studies underlying each comparison.

Study characteristics

Online Resources 2 and 3 show a systematic presentation of information regarding the patient and study characteristics. A total of 39 studies reported data on gender, in which 55.4% of the participants were male. The mean age of participants was 68.8 years, ranging from 63.1 to 79.85 years. Forty-two studies reported disease duration as the number of years since clinical diagnosis. The mean (SD) disease duration was 8.25 (2.19) years. A total of 44 studies reported disease staging measured by the Hoehn and Yahr scale, the median stage was 2.35, with a range of 1.61–3.15 across all groups. As for severity of motor symptoms, a total of 39 trials reported the scores on the motor part of the Unified Parkinson’s Disease Rating scale (UPDRS-III) (n = 29) and Movement Disorders Society Unified Parkinson’s Disease Rating scale (MDS-UPDRS-III) (n = 10). The mean (SD) UPDRS-III and MDS-UPDRS-III scores were 28.3 (8.1) and 31.2 (11.5).

As for study characteristics, 42 studies had a parallel design and the other four were crossover studies. Twenty-two studies were conducted in Europe, 11 in the Americas, 8 in Australasia and 5 in Asia. Eighteen (39%) had usual care/no treatment as passive control, and 28 (61%) had an active comparator. The median and mean duration of treatment were 8 weeks and 10.3 weeks, respectively (range 2–48 weeks).

The potential effect modifiers (including age, gender, disease duration and staging, severity of motor symptoms, and severity of FOG) were comparable across the studies. Within each intervention category, intervention and sample characteristics were deemed to be balanced across all trials. Therefore, we assumed the validity of transitivity for this network meta-analysis.

Risk of bias within studies

Online Resource 4 presents a summary of the methodological quality assessment of the 46 studies included in the qualitative synthesis, using the Cochrane risk-of-bias tool (version 2). The overall rating indicated 16 studies (35%) with some concerns and 30 studies (65%) with high risk of bias. For domain 1, the randomization process was well described in 29 studies (63%), which were therefore rated as low risk of bias; 14 studies (30%) were rated with some concerns and 3 studies (7%) with a high risk of bias. Domain 2 assesses bias due to deviations from the intended interventions. Four studies (8.7%) were rated as low risk of bias, 12 studies (26%) as some concerns, and 30 studies (65%) as high risk of bias. The reasons are twofold: (1) owing to the nature of behavioral interventions, it is infeasible to blind the participants; and (2) the majority of the studies (78.3%) did not follow or violated the intention-to-treat principle, with only 10 studies (21.7%) applying intention-to-treat analysis properly [19, 37,38,39,40,41,42,43,44,45]. Domain 3 assesses bias due to missing outcome data: 19 studies (41%) were rated as low risk of bias and 27 studies (59%) as high risk of bias, mostly because no information on attrition was provided. A number of trials were not pre-registered in any clinical trial registry or did not publish their study protocols [19, 24, 27, 37,38,39,40,41, 46,47,48,49,50,51,52,53,54,55,56]. Hence, we are uncertain whether there were any protocol or analysis deviations. Discrepancies between the planned and published outcome measures were noted in five studies, without any explanation [28, 34, 57,58,, 58]. Domain 4 assesses bias due to measurement of the outcome; since this review only included studies examining FOG severity as measured by the validated, self-reported FOGQ or nFOGQ, all studies were rated with some concerns. Domain 5 assesses bias in selection of the reported result: 19 studies (41%) were rated as low risk of bias and 27 studies (59%) with some concerns.

Figures 3 and 4 show the certainty of evidence of each category contributing to the network estimate of the main outcome according to GRADE. In summary, when compared to usual care, the certainty of evidence for external cueing and mind–body exercises was considered moderate; that for action observation training, conventional physiotherapy, and general exercises was considered low; and that for the remaining active interventions was very low (Fig. 3).

Fig. 3 Forest plot for estimated effects on freezing of gait of each intervention compared with usual care control. Key: SMD (95% Crl) = standardized mean difference (95% credible intervals); negative value indicates reduced FOG severity.

Fig. 4 Forest plot for estimated effects on freezing of gait of each intervention compared with conventional physiotherapy. Key: SMD (95% Crl) = standardized mean difference (95% credible intervals); negative value indicates reduced FOG severity.

Synthesis of results

When compared to usual care/control, obstacle training (SMD = −2.20; 95% Crl: −3.40, −0.94), gait training on treadmill (SMD = −0.88; 95% Crl: −1.70, −0.09) and general exercises (SMD = −0.77; 95% Crl: −1.30, −0.27) showed significant benefits, with 95% Crls that excluded zero; obstacle training displayed the largest effect size (Fig. 3). The mean SMDs for all other interventions indicated benefits with a small–moderate effect size compared to usual care/control, yet the 95% Crls for these comparisons included zero. When compared to conventional physiotherapy, only obstacle training showed a difference beyond the 95% Crl (SMD = −1.9; 95% Crl: −3.20, −0.56) (Fig. 4). When comparing relative effect sizes between all interventions, obstacle training showed benefits with a large effect size compared to most of the active interventions. No other differences beyond the 95% Crls were found between active interventions (Table 2).

Table 2 League table presenting network meta-analysis estimates (lower triangle) and direct estimates (upper triangle) of effects of behavioral interventions for FOG in PD

The ranked probability of each treatment arm within the network meta-analysis showed that obstacle training was most likely to be the best performing (91% probability of best performing intervention, SUCRA = 0.98), gait training on treadmill was the second ranked intervention (SUCRA = 0.74), followed by general exercises (SUCRA = 0.67), action observation training (SUCRA = 0.58), real-time feedback (SUCRA = 0.50), psychoeducation (SUCRA = 0.47), robotic assisted walking (SUCRA = 0.45), mind–body exercises (SUCRA = 0.40), external cueing (SUCRA = 0.39), conventional physiotherapy (SUCRA = 0.35), dual task training (SUCRA = 0.28), and usual care (SUCRA = 0.10) (Online Resource 5, 6).

Exploration for inconsistency

Inconsistency between observed evidence (direct evidence) and indirect evidence was substantial in comparisons that included direct evidence (Online Resource 7). For comparison between general exercises and action observation training, the direct evidence favored action observation training, while the indirect evidence favored general exercises. Even though there was some degree of inconsistency between direct and indirect evidence, all comparisons with both direct and indirect evidence were within 95% Crl.

Risk of bias across studies

The random-effects network meta-analysis model showed a better fit to the data than the fixed-effects model (DIC = 64.52 vs 86.54, where lower values indicate a better fit). The network model was judged to have converged, with a potential scale reduction factor (PSRF) of 1.00, below the 1.05 cutoff, and a normally distributed posterior effect density plot. There was some inconsistency between the direct and indirect models; however, none of the differences were significant, and the mean effect sizes did not differ in direction for any of these comparisons (Online Resource 7). In the direct model, I² values ranged from 0 to 83%, with general exercises vs usual care (83%), conventional physiotherapy vs usual care (65%), and gait training on treadmill vs external cueing (72%) showing substantial heterogeneity (I² above 50%) (Online Resource 7). Substantial heterogeneity was also observed in the network model for the comparisons of real-time biofeedback vs external cueing, gait training on treadmill vs external cueing, usual care vs conventional physiotherapy, general exercises vs conventional physiotherapy, real-time biofeedback vs conventional physiotherapy, gait training on treadmill vs conventional physiotherapy, and general exercises vs usual care (Online Resource 7).
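
For readers less familiar with the I² statistic reported here, the following sketch shows how it can be derived from Cochran's Q for a single pairwise comparison, using I² = max(0, (Q − df) / Q) × 100. It is a generic inverse-variance illustration in Python, not the GEMTC routine used in this review, and the study effects and standard errors are hypothetical.

```python
def i_squared(effects, std_errors):
    """Return I^2 (in percent) for one pairwise comparison.

    effects: per-study SMDs; std_errors: their standard errors.
    Uses inverse-variance weights and Cochran's Q.
    """
    weights = [1.0 / se**2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0.0:
        return 0.0  # identical study effects: no observed heterogeneity
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical SMDs and standard errors for two studies of the same comparison.
value = i_squared(effects=[-0.2, -0.8], std_errors=[0.2, 0.2])
label = "substantial" if value > 50.0 else "not substantial"
print(f"I^2 = {value:.1f}% ({label})")
```

The 50% threshold in the example mirrors the cutoff for substantial heterogeneity used in this review.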

Publication bias

Online Resource 8 shows the funnel plots comparing active interventions to usual care. For studies comparing mind–body exercises to usual care (n = 8), a moderate degree of asymmetry was noted within the funnel plot, which indicates potential publication bias. The trim-and-fill analysis did not estimate any missing studies on the right-hand side of the funnel and removed four of these studies. Some asymmetry remained, with study points skewed toward the right side of the funnel plot; the SMD and SE for mind–body exercises compared to usual care remained unchanged from those of a random-effects meta-analysis of mind–body exercises versus usual care.

Meta-regression analysis

Individual meta-regressions were conducted for studies’ follow-up time, mean age of the study sample, proportion of females within the study sample, mean baseline FOG of the sample, and year of publication, together with a multivariate regression of all covariates. Risk of bias could not be evaluated in the meta-regression, as all studies were scored overall as ‘of some concern’ or ‘high risk of bias’. Studies with missing values for covariates were excluded from the corresponding meta-regressions: for gender, Zhu et al. [45]; for weeks of follow-up, Carpinella et al. [59]; and for baseline FOG, Duncan and Earhart [34], Ginis et al. [39], Paul et al. [20], Song et al. [60], Martin et al. [51] and Silva-Batista et al. [61]. Because the FOG outcome measure varied between studies, only studies that used the FOG questionnaire were retained for the baseline FOG meta-regression. When adjusting for the proportion of females, year of publication, age, or follow-up weeks, the DIC remained similar to that of the network meta-analysis (proportion of females DIC = 64.54, year of publication DIC = 66.26, age DIC = 64.74, follow-up weeks DIC = 63.07, FOG DIC = 55.31). Overall, the baseline FOG model showed the best fit. SMD values for the meta-regression models for year of publication, age, proportion of females and study follow-up time (weeks) were similar to those of the network meta-analysis. In the FOG model, SMD values differed, with obstacle training (SMD −2.1), gait training with treadmill (SMD −1.2), action observation training (SMD −1.0), conventional physiotherapy (SMD −0.70) and general exercise (SMD −0.64) showing differences from usual care beyond the 95% Crl when adjusting for studies’ baseline FOG values (Fig. 3; Fig. 5).

Fig. 5 Adjusted forest plot for the estimated effects on freezing of gait of each intervention compared with usual care control. Key: SMD (95% Crl) = standardized mean difference (95% credible intervals); negative value indicates reduced FOG severity

Subgroup analysis

Subgroup analysis was conducted for studies with samples above the overall median baseline FOG score across all studies (FOG = 7.65). Subgroup analysis for studies below the median baseline FOG could not be conducted because the network of interventions did not have enough comparisons to form a connected network. Eighteen studies were included for samples above the median FOG; model fit was relatively good compared to the model with all studies (DIC = 31.21). The SUCRA values indicated that obstacle training (SUCRA = 0.99) was the best-performing intervention for studies with samples of high FOG (Online Resource 9). Obstacle training, gait training on treadmill, and general exercise displayed effectiveness compared to usual care beyond the 95% Crl (Online Resource 10).

Discussion

To the best of our knowledge, this is the first network meta-analysis to compare behavioral interventions for FOG management among PD patients. For patients with mild–moderate PD, the findings suggest that FOG symptoms are most likely to respond to obstacle training, gait training on treadmill and general exercises, with moderate–large effect sizes. After adjusting for the moderating effects of baseline FOG severity, action observation training and conventional physiotherapy also appeared to be effective for managing FOG symptoms. However, the positive effects of some commonly prescribed compensation strategies for gait rehabilitation, including external cueing, dual task training, and mind–body exercises (including dance), were not evident when implemented as a single compensation strategy.

Effectiveness of behavioral interventions vs control treatment on FOG

After adjusting for the moderating effects of baseline FOG severity, obstacle training, gait training on treadmill, general exercises, action observation training and conventional physiotherapy showed beneficial effects on subjective FOG outcomes compared to control conditions (beyond the 95% CrIs). A previous meta-analysis [62] reported that treadmill training, hydrotherapy, action observation, Nordic walking, and conventional physiotherapy demonstrated moderate–large effects in improving objective gait outcomes, including gait speed and step length, in a laboratory setting. Our findings further confirm that the improvement measured in such controlled settings could be translated into patients’ daily living.
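The criterion "beyond the 95% CrI" can be illustrated with a minimal Python sketch: an equal-tailed credible interval is read off the posterior draws of an SMD, and the effect is treated as evident when the interval excludes zero. The posterior draws below are simulated from hypothetical normal distributions and are not estimates from this review; the helper names `credible_interval` and `excludes_zero` are assumptions for this example.

```python
import random


def credible_interval(draws, level=0.95):
    """Equal-tailed credible interval from posterior draws (nearest-rank quantiles)."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(round(tail * (n - 1)))]
    upper = ordered[int(round((1.0 - tail) * (n - 1)))]
    return lower, upper


def excludes_zero(interval):
    """True when the whole interval lies on one side of zero."""
    lower, upper = interval
    return lower > 0 or upper < 0


rng = random.Random(0)  # fixed seed so the illustration is reproducible
# Hypothetical posterior draws of an SMD versus usual care (negative = less FOG).
draws_clear = [rng.gauss(-1.2, 0.4) for _ in range(4000)]
draws_unclear = [rng.gauss(-0.3, 0.4) for _ in range(4000)]

for label, draws in (("clear effect", draws_clear), ("unclear effect", draws_unclear)):
    interval = credible_interval(draws)
    print(label, tuple(round(x, 2) for x in interval), excludes_zero(interval))
```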

Only one study examined obstacle training: Zhu et al. [45] examined the effects of obstacle training delivered in an aquatic setting. The intervention dose was high, i.e. 40 min per session, five sessions per week for six weeks. The turbulence, hydrostatic pressure and buoyancy present in an aquatic environment may increase sensory stimulation and elicit balance reactions that can improve postural and gait control in PD patients. Meanwhile, the obstacle training simulated circumstances that demanded switching between motor actions (FOG-triggering situations such as frequent turning and narrow passages), and the repeated training might alter gait patterns and eventually avert the occurrence of FOG.

Gait training on treadmill typically involved walking tasks on a motorized treadmill supervised by physical therapists. Frenkel-Toledo [63] and Frazzitta, Maestri [38] suggested that the treadmill may itself act as an external cue that reinforces neuronal circuits and modulates walking patterns. During treadmill training, patients are required to focus their attention on gait while following the enforced external pacing [38, 52]. Such a progressive and repeated motor-cognitive training process further facilitates skill acquisition in gait control, enabling patients to ‘internalize’ the external pacing and translate motor skillsets into daily applications. This process is similar to that of obstacle training and action observation training, in which progressive learning of motor skillsets with attentional/cognitive requisite is emphasized through observation and practice. Engagement in high-level multitasking of planning and executing motor movements promotes the deep learning of motor control to enhance body coordination and gait performance [34].

The benefits of general exercises were evident when indirectly compared to usual care control, despite the non-significant findings of three direct comparisons [53, 57, 64]. The promising relative effect estimates of general exercises are likely to be driven by the two studies examining Nordic walking and adapted resistance training [55, 61]. In the study by Wroblewska, Gajos [55], Nordic walking (biweekly 60-min sessions for 12 weeks) demonstrated significant long-lasting benefits on FOG outcomes against usual care. Silva-Batista, de Lima-Pardini [61] examined the effects of a 12-week triweekly adapted resistance training programme compared to conventional physiotherapy, and concluded that exercises with high motor complexity demonstrated a moderate clinically important difference in FOG against traditional motor rehabilitation.

Surprisingly, the positive effects of some commonly prescribed interventions for gait rehabilitation, such as external cueing, dual task gait training and mind–body exercises (including dance [42, 65]), were not evident in this analysis (estimates lay within the 95% CrIs when compared to usual care). Although these interventions are suggested for conventional gait and balance rehabilitation, applying these compensation strategies alone appears inadequate to produce clinically meaningful improvement in FOG. Previous research concluded that treadmill training with an external cueing strategy was more effective in reducing FOG symptoms than external cueing alone [38, 63]. Presumably, effective interventions should simultaneously target the motor and attentional/cognitive pathophysiology underlying FOG. The sole practice of exercises without high complexity motor training or a cognitive compensation strategy might be inadequate to ameliorate FOG in PD patients. Meanwhile, Cassimatis, Liu [66] suggested that the effects of continuous cueing through external pathways often diminish over time, probably because gait control shifts back from a goal-directed strategy to being automatically processed by the malfunctioning basal ganglia network. This hypothesis highlights the importance of investigating compensation strategies that could promote long-lasting attentional/cognitive requisite in addition to motor skillset training. Future research should further examine the skill acquisition process to identify optimal compensation strategies and modalities, assess the retention effects of these behavioral interventions, and establish how sustained practice of these lifestyle interventions can be implemented.

Based on the relative effect estimates and ranked probability in this network meta-analysis, obstacle training, gait training on treadmill and general exercises seemed to be the most effective interventions for reducing immediate FOG severity. However, this review does not endorse the superiority of any single behavioral intervention. The ranked probability obtained by this network meta-analysis cannot be considered conclusive because of the lack of high-quality evidence underlying most comparisons. Feasibility, patients’ preference and cost should be considered when prescribing or recommending FOG interventions. Although obstacle training ranked first and showed benefits when compared to most of the other active interventions, the intervention was delivered in an aquatic setting with a frequency of 5 times per week for 6 weeks, which is difficult and costly to implement in clinical or community settings. By contrast, given their high feasibility and relatively low cost of implementation, gait training on treadmill, general exercises (in particular, Nordic walking, or exercises with high complexity motor training), action observation training and conventional physiotherapy appeared to be the most feasible behavioral prescriptions for improving patient-reported FOG outcomes.

Clinical implications and recommendations

Gait training with treadmill, action observation training and conventional physiotherapy demonstrated evident moderate–large benefits (effect sizes of 0.7–1.2) compared to usual care (beyond the 95% CrI). Rehabilitation institutions could consider adopting these gait-specific training protocols as a complementary rehabilitation approach for PD patients experiencing gait disorders such as FOG. Based on the studies with positive findings, the suggested dosage for treadmill gait training ranges from 20–45 min per session, 2–7 times per week for 4–6 weeks; for action observation training, 45–60 min per session, 2–3 times per week for 4–8 weeks; and for conventional physiotherapy, 40–90 min per session, 2–3 times per week for 4 weeks to 6 months [18, 19, 25, 37, 38, 41, 44, 52, 54, 67, 68, 69]. These interventions should be supervised or delivered by trained physiotherapists. Future work should further identify the optimal prescription dosage and the effective components of these promising interventions.

As for community rehabilitation, general exercises demonstrated an evident medium effect size of 0.51 compared to usual care (beyond the 95% CrI). Based on the trials with positive findings [27, 41, 48, 52, 69], the suggested training time is 60–90 min per session, 2–3 times per week for 12 weeks. Compared to allied health professional-led interventions, general exercises were delivered in a group and required fewer tangible resources (such as equipment, nonmedical professionals, space and flexible venue). It is worth noting that only studies with relatively high complexity motor tasks (i.e., Nordic walking, adapted resistance training) exerted positive effects on FOG symptoms compared to the control conditions. To facilitate long-term implementation of such community-based FOG rehabilitation, future studies could integrate psychosocial synergy and telehealth strategies to enhance participants’ motivation and compliance [70,71,72,73,74].
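To help compare the suggested dosages across interventions, the following sketch converts the per-session and per-week ranges quoted in this section into a range of weekly training minutes. The `Dose` class and its field names are hypothetical, introduced only for this illustration. Pairing the shortest session with the lowest frequency, and the longest with the highest, is an assumption, since individual trials need not have combined the extremes in this way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dose:
    """Suggested dose range as (minimum, maximum) pairs taken from the text."""

    minutes_per_session: tuple
    sessions_per_week: tuple

    def weekly_minutes(self):
        # Assumption: extremes are combined (min with min, max with max).
        low = self.minutes_per_session[0] * self.sessions_per_week[0]
        high = self.minutes_per_session[1] * self.sessions_per_week[1]
        return low, high


suggested = {
    "treadmill gait training": Dose((20, 45), (2, 7)),
    "action observation training": Dose((45, 60), (2, 3)),
    "conventional physiotherapy": Dose((40, 90), (2, 3)),
    "general exercises": Dose((60, 90), (2, 3)),
}

weekly = {name: dose.weekly_minutes() for name, dose in suggested.items()}
for name, (low, high) in weekly.items():
    print(f"{name}: {low}-{high} min per week")
```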

Limitations

This comprehensive review adopted a network meta-analysis approach to compare different behavioral interventions and evaluate the relative effects of each intervention type. First, we acknowledge that this approach does not allow an analysis of more specific or detailed components of interventions investigated in a small number of trials, such as conventional physiotherapy, which might include multiple compensation strategies. We considered adding categories or investigating each compensation strategy separately, but this further complicated the analysis and strongly reduced the statistical power of the network meta-analysis. Second, the review focused on evaluating clinical effectiveness using patient-reported FOG outcomes, which reflect the real-life experience perceived by PD patients in home and community settings. Ideally, these results need to be complemented and validated using objective instrumental gait analyses. In addition, we did not attempt to assess the cost-effectiveness of preferentially using a particular behavioral intervention. Future research may consider mHealth innovations to capture real-life FOG experience, including objective data on FOG severity and frequency of occurrence, as well as cost-effectiveness analysis. Last but not least, many trials had methodological limitations introducing some or high risk of bias due to protocol deviations, non-compliance with intention-to-treat analysis, and/or small sample size [median = 39; range = 17–231]. Hence, the synthesized findings should be interpreted with caution. Consistent with the concluding remarks of Cugusi, Manca [75], although promising data have been obtained in well-controlled experimental settings from individual studies, they do not provide definitive evidence of relative effect estimates. High-quality evidence regarding the superiority of each behavioral intervention for FOG management is still missing. These promising findings need to be confirmed in robust, large-scale clinical trials, preferably with a pragmatic design to establish their real-life effects for clinical application. To improve the quality of evidence in the field of behavioral science, reporting of clinical trials in accordance with international guidelines such as the CONSORT statement [76] is strongly advised.

Conclusion

This network meta-analysis found that obstacle training, gait training on treadmill, general exercises, action observation training and conventional physiotherapy demonstrated immediate, real-life benefits on FOG symptoms among patients with mild–moderate PD. However, the superiority of each intervention remains inconclusive. The effects of high complexity motor training combined with attentional/cognitive strategies should be further explored. Future trials with rigorous research designs using both subjective and objective outcome measures, long-term follow-up and cost-effectiveness analysis are warranted to establish effective compensation strategies for PD patients experiencing FOG.