1 Introduction

Patellofemoral pain (PFP) has a high prevalence within both sporting and recreationally active populations [1, 2]. Among 2,002 patients presenting to a sports medicine clinic with running-related injuries, 842 (42.1 %) reported knee pain, with 331 (46 %) being diagnosed with PFP [2]. PFP is characterised by the gradual onset of diffuse pain in the retropatellar or peripatellar region that is aggravated during tasks that increase patellofemoral joint (PFJ) loading (e.g. running, jumping, squatting) [3].

Development and persistence of PFP is widely considered to be multifactorial [4], with both extrinsic and intrinsic factors thought to contribute. Proposed extrinsic factors include excessive training load, altered training surface and/or inappropriate footwear. Proposed intrinsic factors can be divided into local (around the knee), proximal (thigh, hip, trunk or pelvis), and distal (foot and lower leg) characteristics [5, 6].

Larger quadriceps angle, sulcus sign, patella tilt angle, and lower peak torque knee extension, hip abduction and external rotation strength have a proven association with PFP [7]. However, these studies have methodological weaknesses and the cross-sectional design inhibits determination of causality. Prospectively, limited quadriceps and gastrocnemius flexibility, knee extension weakness and increased knee valgus moment at initial contact when landing have been identified as predictors of PFP development [6]. Most of these studies utilised military populations, which limits generalisability to most clinical populations. Put together, the findings from these reviews highlight both the multifactorial nature of PFP and the diversity of presenting characteristics that could be addressed by treatment [6, 7].

Proximal [8–11], distal [12] and local [13, 14] interventions have all demonstrated favourable PFP treatment outcomes. Multimodal physiotherapy, including a combination of patella taping, vasti retraining, gluteal strengthening, patella mobilisation and stretches, remains the gold standard treatment option with the strongest reported evidence base [15]. Considering the multifactorial nature of PFP, greater intervention efficacy could be achieved through better selection of treatment for a given patient, therefore improving clinical outcomes and future research. Furthermore, identification of outcome predictors that guide tailored intervention packages may reduce recurrence, known to be high [16, 17].

It is important to consider overall prognosis differently from outcome prediction. For example, a retrospective analysis of two high-quality (HQ), randomised controlled trials (RCTs) described the characteristics of the 55 % of individuals with PFP who had unfavourable overall outcome from multimodal packages of care at 3 months and 40 % at 12 months [18]. This prognostic analysis would not guide clinicians’ specific intervention choice as a function of positive outcome. Evaluating outcome predictors to identify subgroups likely to respond to specific interventions has therefore received increased attention in the literature in recent years [19–32]. Consequently, the aim of this review was to identify potential outcome predictors for conservative interventions in the management of PFP in order to guide clinicians when considering the likelihood of intervention success and steer the direction of future research in this area.

2 Methods

2.1 Inclusion and Exclusion Criteria

Eligibility criteria were modified from a published review of musculoskeletal clinical prediction rules [33]. These included peer-reviewed journal publication, the primary study aim being development or evaluation of outcome predictors, application to treatment selection for patients with PFP, and clear evidence that the measurement tool was appropriate to the evaluated outcome predictor (e.g. use of the Kujala pain questionnaire as an outcome measure for individuals with PFP [34]). Unpublished work was not sought. Only papers published in English were considered.

2.2 Search Strategy

The AMED, CINAHL, EMBASE, MEDLINE and Web of Science databases were searched from inception up to April 2013. The keyword ‘predict*.ti ab.’ was used in combination with keywords relating to PFP to capture papers relating to the development of clinical prediction rules. The search criteria were modified from a previous PFP systematic review that evaluated the scope and quality of systematic reviews on non-pharmacological conservative treatment for PFP [35]. The search strategy and results are reported in the electronic supplementary material (ESM) Table S1. Citing, and cited, references were surveyed in Google Scholar and at source, respectively.

2.3 Review Process

All titles and abstracts found were downloaded into EndNote X4 (Thomson Reuters, Philadelphia, PA, USA), search returns were collated and duplicates were removed. Potential papers were assessed by two independent reviewers (SL and CB) using an inclusion criteria checklist. If sufficient information could not be obtained from the title and abstract, the full text was obtained for further evaluation. Any disagreements were resolved by consensus; a third reviewer (DM) was available if needed, but was not required.

2.4 Quality Assessment of Reviews

Methodological quality was assessed with a scale (ESM Table S2) used previously for a PFP systematic review [36], and applied by two reviewers independently (SL and CB), with discrepancies resolved by discussion; a third reviewer (DM) was available if required. The quality assessment scale consisted of 19 items divided into four components: participants, interventions, outcome measures and data presentations. With RCTs considered the gold standard of predictor analysis, the scale is scored out of 40, with the total score expressed as a percentage of the possible score. Scores ≥70 % were considered to be ‘high quality’ (HQ) and scores <70 % were considered to be ‘low quality’ (LQ).
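As a minimal sketch of the scoring rule described above, a raw score out of 40 can be converted to a percentage and classified against the 70 % threshold. The function name and the two raw scores are hypothetical and are used only for illustration; they are not part of the published scale.

```python
def classify_quality(raw_score, max_score=40, threshold_pct=70.0):
    """Convert a raw quality score to a percentage and an HQ/LQ label.

    Assumes the rule stated in the text: scores >= 70 % are 'high quality'
    (HQ) and scores < 70 % are 'low quality' (LQ).
    """
    if not 0 <= raw_score <= max_score:
        raise ValueError("raw_score must lie between 0 and max_score")
    pct = 100.0 * raw_score / max_score
    label = "HQ" if pct >= threshold_pct else "LQ"
    return pct, label


# Hypothetical raw scores, for illustration only.
print(classify_quality(24))  # (60.0, 'LQ')
print(classify_quality(28))  # (70.0, 'HQ')
```

On this reading, a study needs at least 28 of the 40 available points to be rated HQ.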

2.5 Data Extraction and Analysis

Study design characteristics were extracted and tabulated to enable methodological comparison (Table 1). Treatment ‘success’ was defined within eight studies [19–23, 28, 30, 32], and not defined in a further six studies [24–27, 29, 31]. In studies where ‘success’ was defined, continuous and dichotomous baseline outcome predictor data for both ‘successful’ and ‘unsuccessful’ subgroups was extracted to allow univariate statistical analysis of effect size (ES) [standardised mean difference] and risk ratio calculations, respectively, using Review Manager (RevMan v5.1, 2011, The Cochrane Collaboration, Copenhagen, Denmark). ES and the associated 95 % confidence intervals (CIs) were presented as forest plots to facilitate visual comparison. Where two or more outcome predictors and success determinants were consistent between studies, data was pooled. Pooled results were reported as significant when the test for overall effect (Z score) was p < 0.05, and as a trend when p < 0.1. Determinants of success were considered consistent if a justified, clinically meaningful measure was used in the two or more studies pooled (e.g. ‘Marked improvement’ on a 5-point Likert scale). If adequate data was not available to complete calculations from published reports, attempts were made to contact corresponding authors. Where treatment success was not defined, baseline measures of potential outcome predictors reported to significantly predict change of the primary outcome through multivariate statistical analysis were extracted. The primary outcome used for each study is presented in Table 1.
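As a rough illustration of the univariate calculations described above, the sketch below computes a standardised mean difference and a risk ratio, each with a 95 % CI, from summary data for ‘successful’ and ‘unsuccessful’ subgroups. It is not RevMan's code: the function names are hypothetical, and the small-sample correction (Hedges' adjusted g) and its standard error are assumed here to match the formulas commonly documented for RevMan's standardised mean difference.

```python
import math

Z_95 = 1.959964  # two-sided 95 % quantile of the standard normal


def smd_hedges(mean_s, sd_s, n_s, mean_u, sd_u, n_u):
    """Standardised mean difference ('successful' minus 'unsuccessful').

    Returns (estimate, lower 95 % limit, upper 95 % limit).
    Assumption: Hedges' adjusted g with its usual approximate standard error.
    """
    n = n_s + n_u
    pooled_sd = math.sqrt(
        ((n_s - 1) * sd_s ** 2 + (n_u - 1) * sd_u ** 2) / (n - 2)
    )
    g = (mean_s - mean_u) / pooled_sd * (1 - 3 / (4 * n - 9))
    se = math.sqrt(n / (n_s * n_u) + g ** 2 / (2 * (n - 3.94)))
    return g, g - Z_95 * se, g + Z_95 * se


def risk_ratio(events_s, n_s, events_u, n_u):
    """Risk ratio of a dichotomous baseline characteristic.

    Returns (estimate, lower 95 % limit, upper 95 % limit), with the CI
    built on the log scale. Zero cells are not handled in this sketch.
    """
    if events_s == 0 or events_u == 0:
        raise ValueError("zero event counts need a continuity correction")
    rr = (events_s / n_s) / (events_u / n_u)
    se_log = math.sqrt(1 / events_s - 1 / n_s + 1 / events_u - 1 / n_u)
    lower = math.exp(math.log(rr) - Z_95 * se_log)
    upper = math.exp(math.log(rr) + Z_95 * se_log)
    return rr, lower, upper


# Made-up summary data, for illustration only.
print(smd_hedges(6.0, 2.0, 20, 5.0, 2.0, 20))
print(risk_ratio(10, 20, 5, 20))
```

A CI for the standardised mean difference that excludes 0, or a CI for the risk ratio that excludes 1, corresponds to a significant univariate result at the 5 % level.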

Table 1 Study design characteristics for each included study

Interpretation of calculated individual or pooled ES were categorised based on those used by Hume et al. [37] as small (≤0.59), medium (0.60–1.19), or large (≥1.20). The level of statistical heterogeneity, defined as p < 0.05, for pooled data was established using the Chi-square and I² statistics. Definitions for ‘levels of evidence’ were guided by recommendations made by van Tulder et al. [38].
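The categorisation and pooling steps can be sketched as follows. This is an illustrative inverse-variance, fixed-effect calculation with hypothetical function names and made-up inputs, not the RevMan implementation. Applying the ES thresholds to the absolute value of the ES is an assumption, made because several reported ES are negative.

```python
import math


def categorise_es(es):
    """Label an effect size as small, medium or large.

    Thresholds follow the text: small (<=0.59), medium (0.60-1.19),
    large (>=1.20). Using the absolute value is an assumption.
    """
    size = abs(es)
    if size < 0.60:
        return "small"
    if size < 1.20:
        return "medium"
    return "large"


def pool_fixed(effects, std_errors):
    """Inverse-variance fixed-effect pooling with heterogeneity statistics."""
    weights = [1 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_w
    se_pooled = math.sqrt(1 / total_w)
    z = pooled / se_pooled
    # Two-sided p value for the test of overall effect (Z score).
    p_overall = math.erfc(abs(z) / math.sqrt(2))
    # Cochran's Q (the chi-square heterogeneity statistic) and I-squared.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"pooled": pooled, "se": se_pooled, "z": z,
            "p_overall": p_overall, "q": q, "df": df, "i2": i2}


# Made-up study-level effect sizes and standard errors.
result = pool_fixed([0.30, 0.55], [0.25, 0.30])
print(categorise_es(result["pooled"]), result)
```

The heterogeneity p value would be read from a chi-square distribution with `df` degrees of freedom; that step is omitted here because the Python standard library does not provide the distribution.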

Strong evidence = pooled results derived from three or more studies, including a minimum of two HQ studies, which are statistically homogenous (p > 0.05)—may be associated with a statistically significant or non-significant pooled result.

Moderate evidence = statistically significant pooled results derived from multiple studies, including at least one HQ study, which are statistically heterogeneous (p < 0.05), or from multiple LQ studies which are statistically homogenous (p > 0.05).

Limited evidence = results from multiple LQ studies which are statistically heterogeneous (p < 0.05), or from one HQ study.

Very limited evidence = results from one LQ study.

Conflicting evidence = pooled results insignificant and derived from multiple studies, regardless of quality, which are statistically heterogeneous (p < 0.05, i.e. inconsistent).
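The definitions above can be read as a simple decision rule. The sketch below encodes one literal reading of them; the function name, argument names and order of the checks are assumptions, and combinations that the definitions do not cover are returned as ‘unclassified’ rather than guessed.

```python
def evidence_level(n_hq, n_lq, significant, homogeneous=True):
    """Assign a level of evidence from study counts and pooled statistics.

    n_hq, n_lq  : number of high- and low-quality studies
    significant : True if the (pooled) result is statistically significant
    homogeneous : True if the heterogeneity test gave p > 0.05
                  (ignored when there is only one study)
    """
    n = n_hq + n_lq
    if n < 1:
        raise ValueError("at least one study is required")
    if n == 1:
        return "limited" if n_hq == 1 else "very limited"
    # From here on, results come from multiple studies.
    if not homogeneous and not significant:
        return "conflicting"
    if n >= 3 and n_hq >= 2 and homogeneous:
        return "strong"
    if significant and n_hq >= 1 and not homogeneous:
        return "moderate"
    if significant and n_hq == 0 and homogeneous:
        return "moderate"
    if n_hq == 0 and not homogeneous:
        return "limited"
    return "unclassified"


print(evidence_level(n_hq=0, n_lq=1, significant=True))   # very limited
print(evidence_level(n_hq=2, n_lq=1, significant=False))  # strong
```

Checking ‘conflicting’ before ‘limited’ is a design choice in this sketch: a non-significant, heterogeneous pooled result from multiple LQ studies satisfies both definitions as written, so some order of precedence has to be assumed.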

3 Results

3.1 Review Selection and Identification

The initial search yielded 1,888 citations. Following application of the inclusion/exclusion criteria to citation title, abstract and full text, 15 cohort studies remained (Fig. 1). No RCTs were found. Two studies included data from the same PFP population [26, 31]; however, they reported findings from different follow-up durations and were therefore both included in the review. Two further studies, one that presented short- and long-term predictors of outcome without differentiating predictors for specific interventions [39], and the other that reported post hoc baseline foot mobility measures [40], could not be used within this review.

Fig. 1

Flow diagram summarising study selection for inclusion. PFP patellofemoral pain

3.2 Additional Data

Additional data required for ES calculation was provided by authors for one paper [19].

3.3 Quality Assessment

Results from the quality assessment scale are shown in the ESM Table S2. Scores ranged from 15 to 24 out of a possible 40. Of the 15 included studies, all were scored as LQ.

3.4 Summary of Findings

3.4.1 Pain

Very limited evidence identified higher baseline functional index questionnaire scores (mean 0.82, 95 % CI 0.18–1.46) predicted improved outcome following 12-week orthoses intervention in one LQ study [19]. Pooled results from two LQ studies [19, 20] showed a trend towards less usual (mean −0.45, 95 % CI −0.93 to 0.03, p = 0.07) and worst pain (mean −0.45, 95 % CI −0.93 to 0.03, p = 0.07) being associated with foot orthoses success (Fig. 2).

Fig. 2

Baseline pain characteristics for ‘successful’ and ‘unsuccessful’ groups following lumbopelvic manipulation, foot orthoses and taping interventions. AKP Sc anterior knee pain score, FIQ Sc functional index questionnaire score, SDatBase step-down at baseline, SLRsitbase single-leg rises from sitting at baseline, U.P. usual pain, W.P. worst pain, SD standard deviation, IV inverse variance, CI confidence interval, df degrees of freedom. Barton 2011a [20], Barton 2011b [32]

Multiple stepwise regression identified shorter symptom duration predicted positive changes in Kujala scores associated with successful exercise intervention in one LQ study at 5-week and 3-month follow-up (p = 0.045 and p = 0.019, respectively) [27]. Lower frequency of pain at baseline, when identified with concurrent greater quadriceps cross-sectional area (CSA) and reduced eccentric quadriceps torque (see Sect. 3.4.3), was predictive of successful outcome after a quadriceps exercise programme combined with patella mobilisation and lower-limb stretches tailored to the individual’s mobility/flexibility restrictions (p = 0.012) in one LQ study [25].

Very limited evidence indicated greater usual pain (mean 0.43, 95 % CI 0.01–0.85) significantly predicted taping intervention success in one LQ study [30].

3.4.2 Demographics

Limited evidence showed patient height (mean −0.17, 95 % CI −0.60 to 0.27, p = 0.45) and weight (mean −0.09, 95 % CI −0.52 to 0.34, p = 0.68) did not predict foot orthoses intervention success [19, 20]. Pooled results from three LQ studies [19, 20, 23] showed a trend for older age to predict successful outcomes from foot orthoses intervention (mean 0.29, 95 % CI −0.06 to 0.65, p = 0.1) (Fig. 3).

Fig. 3

Baseline demographic characteristics for ‘successful’ and ‘unsuccessful’ groups following lumbopelvic manipulation, foot orthoses, exercise and taping interventions. BMI body mass index, SD standard deviation, IV inverse variance, CI confidence interval, df degrees of freedom. Barton 2011a [20], Barton 2011b [32]

Younger age predicted positive changes in pain (decreased visual analogue scale score), Tegner and Lysholm scores at 6 weeks, and Tegner and Lysholm scores at 6 months’ follow-up after exercise intervention [31].

3.4.3 Knee

No local knee characteristics were shown to predict foot orthoses intervention success (Fig. 4).

Fig. 4

Baseline knee characteristics for ‘successful’ and ‘unsuccessful’ groups following lumbopelvic manipulation, foot orthoses and taping interventions. LPA lateral patellofemoral angle, LPD lateral patellar displacement, TibTor tibial torsion, TibValgum tibial valgum, SD standard deviation, IV inverse variance, CI confidence interval, df degrees of freedom

Faster vastus medialis oblique (VMO) reflex response time (p = 0.041 and p = 0.026, respectively) predicted positive changes in Kujala scores following exercise intervention [27]. Multiple stepwise regression (forward stepping) identified negative patella apprehension at baseline to predict positive changes in Tegner and Lysholm scores at 7-year follow-up in one LQ exercise intervention study [26]. An absence of chondromalacia patella and tibial tubercle deviation <14.6 mm on magnetic resonance imaging (MRI) predicted resolution of symptoms at 5 weeks following exercise intervention in one LQ study [27]. A further LQ exercise intervention study identified a lack of self-reported ‘cold legs’ (p = 0.019) predicted delayed onset of pain during a treadmill test [29]. Single variables added to a linear regression model identified greater CSA of the total quadriceps at mid-thigh level (p = 0.01) and reduced eccentric average quadriceps peak torque at 60°/s (p = 0.015) at baseline as predictors of successful outcome, when identified with concurrent lower frequency of pain (see Sect. 3.4.1), after a tailored exercise and mobilisation programme in one LQ study [25].

Limited evidence indicated an increased Q-angle was a significant predictor of a successful outcome following patellar taping intervention (two LQ studies [22, 40], mean 0.38, 95 % CI 0.05–0.72, p = 0.03) [Fig. 4]. Very limited evidence identified reduced lateral patellofemoral angle (LPA) [mean −0.47, 95 % CI −0.89 to −0.05] predicted patellar taping success [30].

Most pain squatting (mean 2.27, 95 % CI 1.57–3.28), greater patella glide (mean 1.59, 95 % CI 1.18–2.26), less stiffness (mean 0.43, 95 % CI 0.3–0.61) and fewer episodes of giving way (mean 0.65, 95 % CI 0.49–0.86) and clicking (mean 0.64, 95 % CI 0.47–0.88) were shown to be significant predictors of lumbopelvic manipulation success [21]; however, these findings were not replicated in a follow-up study using the same methodological design [41].

3.4.4 Hip and Pelvis

No significant predictors at the hip or pelvis for foot orthoses, exercise, patellar taping or lumbopelvic manipulation were identified (Fig. 5).

Fig. 5

Baseline hip characteristics for ‘successful’ and ‘unsuccessful’ groups following lumbopelvic manipulation, foot orthoses and taping interventions. Craig’s Craig’s test, Hip IR hip internal rotation range, Hip IR Diff hip internal rotation range difference, Pel. Crest pelvic crest height, LL Diff leg-length difference, SD standard deviation, IV inverse variance, CI confidence interval

3.4.5 Foot and Ankle

Limited evidence showed great toe extension (mean −0.16, 95 % CI −0.59 to 0.27, p = 0.46) and ankle dorsiflexion range with the knee bent (mean −0.25, 95 % CI −0.68 to 0.18) did not significantly predict orthoses success [20, 23]. Very limited evidence reported greater forefoot valgus (mean 0.67, 95 % CI 0.05–1.28) predicted successful outcomes following 20–23 days of wearing prescribed prefabricated foot orthoses in one LQ study [23]. Greater rearfoot eversion magnitude peak predicted orthoses intervention success (mean −0.93, 95 % CI −1.84 to −0.01) in one LQ study [32] (Fig. 6).

Fig. 6

Baseline foot and ankle characteristics for ‘successful’ and ‘unsuccessful’ groups following lumbopelvic manipulation, foot orthoses and taping interventions. A.df/KE ankle DF with knee extended, A.df/KF ankle DF with knee flexed, CalcSt relaxed calcaneal stance, FFalign forefoot alignment, GtToeEx great toe extension, NavDrop navicular drop, RF STJN rearfoot in subtalar joint neutral position, MFW WB mid-foot width (weight-bearing), MFW NWB mid-foot width (non-weight-bearing), MFW Dif mid-foot width difference (MFW WB−MFW NWB), AchHght arch height, AH Rat arch height ratio, STJN NWB subtalar joint neutral non-weight-bearing, RF.TIB.EV.MP rearfoot relative to tibia eversion magnitude peak, RF.TIB.EV.TP rearfoot relative to tibia eversion timing to peak, RF.TIB.EV.ROM rearfoot relative to tibia eversion range of motion, FF.RF.DF.MP forefoot relative to rearfoot motion dorsiflexion magnitude peak, FF.RF.DF.TP forefoot relative to rearfoot motion dorsiflexion timing to peak, FF.RF.DF.ROM forefoot relative to rearfoot motion dorsiflexion range of motion, FF.RF.AB.MP forefoot relative to rearfoot motion abduction magnitude peak, FF.RF.AB.TP forefoot relative to rearfoot motion abduction timing peak, FF.RF.AB.ROM forefoot relative to rearfoot motion abduction range of motion, RF.LAB.EV.MP rearfoot relative to laboratory eversion magnitude peak, RF.LAB.EV.TP rearfoot relative to laboratory eversion timing peak, RF.LAB.EV.ROM rearfoot relative to laboratory eversion range of motion, SD standard deviation, IV inverse variance, CI confidence interval, df degrees of freedom. Barton 2011a [20], Barton 2011b [32]

4 Discussion

The intent of this review was to identify outcome predictors for specific conservative interventions in the management of PFP in order to guide clinicians when considering the likelihood of intervention success. With an absence of RCTs prospectively validating outcome predictors, significant findings should only be considered as preliminary indicators of successful outcome prediction. Additionally, the potential for this review to categorically differentiate between predictors of success following specific interventions and indicators of the probable course of PFP symptoms (prognostic factors) is limited by an absence of control groups.

We identified evaluation of 205 conservative management outcome predictors within 15 LQ cohort studies. Of this comprehensive range, 19 (9 % of total) were found to be significant. Of the 15 included studies, none have reached the validation stage of prediction development important for ensuring predictors accurately identify individuals who will benefit from specific interventions [33]. We found all studies used a single-arm design, without the inclusion of a control group, and did not recruit adequate participants relative to the number of variables investigated [42]. Although this single-arm design can be a useful tool in the derivation stage of outcome prediction, it is not powerful enough to provide definitive information on factors that can modify treatment effects. As such, the outcome predictors identified in these studies have a high risk of being non-specific predictors of outcome—that is, predictive of outcome regardless of management care plan rather than response to specific interventions—or prognostic factors [33]. Variability in outcome measures and follow-up duration was evident across the included studies, further limiting evidence synthesis and therefore the strength of conclusions drawn.

4.1 Potential Predictors

4.1.1 Pain

Higher functional index questionnaire scores [19] and a trend towards less ‘usual’ and less ‘worst’ pain [19, 20] predicting orthoses intervention success suggest that lower symptom severity may be predictive of a favourable outcome. Similar findings were also evident following exercise intervention, with shorter symptom duration [27] and lower frequency of pain [25] predicting better outcomes. When compared with a multicentre PFP prognostic study showing symptom duration over 2 months and Anterior Knee Pain Scale score less than 70/100 (more severe symptoms) predicted poor outcomes [18], the findings from this review further implicate pain variables as prognostic factors irrespective of orthoses or exercise intervention. Of interest, higher pain severity at baseline and longer pain duration have also shown association with poor prognosis in other musculoskeletal pain conditions [43]. Irrespective of being predictive or prognostic, these findings highlight the clinical importance of implementing an effective intervention programme early in the pain experience in order to increase the likelihood of intervention success and reduce the risk of chronicity.

In contrast, greater usual pain was identified within one LQ study to be predictive of patellar taping success (mean 0.43, 95 % CI 0.01–0.85) [30]. The most significant limitation of these findings is that only immediate effects were assessed. With literature pertaining to the mechanisms and effect of taping beyond the short term being limited [13], the strength of clinical inference for this predictor is also limited. Further research exploring longer-term taping efficacy and the ability of greater usual pain to predict its outcome is needed.

4.1.2 Demographics

Consistent with prognosis following physiotherapy intervention including foot orthoses application [39], limited evidence showed patient height and weight were not predictive of a successful outcome following foot orthoses intervention [19, 20]. In contrast with prognostic data, a trend towards older age was identified as a predictor of foot orthoses success [20, 23], and younger age was significantly predictive of exercise intervention success [31]. There are plausible explanations for both of these results, although they remain speculative. First, movement patterns may be more entrenched in older individuals, requiring an external adjunct to facilitate changes that can lead to symptom reduction. Second, younger individuals with pain may have a greater capacity for muscular adaptation (both neuromuscular adaptation and strength development) following exercise intervention. Validation of demographic characteristics predicting orthoses and exercise intervention warrants further investigation; however, consideration of patient age in the clinical setting may be an important characteristic for determining foot orthoses or exercise intervention success.

4.1.3 Knee

To our knowledge, no prognostic studies have investigated clinical measures of the knee as predictors of outcome. The findings from this review identify derivation level indicators of outcome prediction that require validation using case-control study design. Some of the potential predictors require expensive equipment (MRI) or cannot be easily obtained within the clinic environment (VMO reflex response time), limiting applicability to all clinical settings. Identification of additional predictors of both exercise and patellar taping intervention success that can be easily applied within the clinical setting requires further work to ensure maximal clinical utility.

A lack of sound clinical evidence for the role of lumbopelvic manipulation in the management of PFP, when compared with foot orthoses, exercise and patellar taping, questions the suitability of this modality undergoing an outcome prediction derivation study. Furthermore, a subsequent single-arm cohort study reported none of the initially identified predictors for lumbopelvic manipulation success were predictive when the same methods were repeated [41]. Further good-quality case-control studies, exploring the effectiveness of this intervention within PFP populations should be sought prior to attempting to identify subgroups of individuals who may benefit.

4.1.4 Hip and Pelvis

The absence of significant indicators of prognosis or successful outcome prediction at the hip and pelvis highlights an area within the current literature where further research is clearly needed. The role of the hip and pelvis in PFP development [44] and maintenance of symptoms [7] has received significant attention within recent literature. Interventions focused at this area have also shown favourable outcomes [45]. Therefore, identification of predictors that can inform clinical reasoning concerning hip and pelvis treatment has the potential to significantly increase treatment efficacy.

4.1.5 Foot and Ankle

The presence of excessive foot pronation has traditionally formed the basis of foot orthoses prescription. Despite multiple measures of foot posture reported in this review, greater forefoot valgus (forefoot-to-rearfoot angle measured in subtalar joint neutral) [23] and peak rearfoot eversion relative to laboratory [32] were the only identified significant predictors of foot orthoses intervention success. Although the specific interventions on which the predictor was evaluated could not be extracted, Collins et al. [39] reported weight-bearing arch height did not significantly predict prognosis. Conversely, a change in mid-foot width has been identified in two studies to predict foot orthoses success [19, 46]. Vicenzino et al. [19] reported a mid-foot width difference from non-weight-bearing to weight-bearing >10.96 mm significantly predicted success when a significance level of p < 0.20 was used and in subsequent regression analysis. Similarly, Mills et al. [40] reported a difference in mid-foot width of >11.25 mm correctly predicted orthoses success in 7 of 10 individuals using a classification tree model. Variability in clinical measures prevents direct comparison between prognostic and predictor studies; however, considering dynamic rearfoot eversion has been identified as a potential predictor of foot orthoses success [32], there is clear merit for further exploration of dynamic foot posture measures in predicting orthoses intervention outcomes.

4.2 Future Directions

More robust study design, including the use of control groups, would permit stronger conclusions to be made about the predictive capacity of the variables measured and allow differentiation from prognostic factors. Future studies should aim to address this evidence gap.

Determinants of treatment ‘success’ varied between studies and researchers, and development of consensus is warranted in future research. It is acknowledged that variability in the measure of success between studies can influence the significance of the findings presented in this review.

Further prediction studies for an evidence-based multimodal physiotherapy intervention [46] should be conducted given this approach is the gold standard of therapy management for PFP [15]. Although this may seem contradictory to attempting to deliver a more tailored intervention from the appropriate use of outcome predictors, a multimodal approach still yields poor long-term outcomes. Studies predicting individuals who do improve may help to identify subgroups that require a novel intervention approach.

Some of the predictors identified within this review required the use of expensive (MRI scanning), sometimes inaccessible (VMO reflex response and rearfoot eversion magnitude peak) equipment to administer within the clinic. To ensure maximal clinical utility of the outcome predictors investigated, future studies should aim to assess potential predictors that are easy to administer, take minimal time, are repeatable, and provide useful information that is relevant to the intervention.

Lastly, for outcome predictors to be accurately integrated into a clinically reasoned and tailored intervention approach, studies to progress the evidence base from derivation stage of design to validation level are clearly warranted.

5 Conclusion

This systematic review provides a contemporary summary of derivation level studies identifying indicators of prediction for conservative PFP management. Without quality randomised clinical trials to categorically prove any of these identified predictors, this review is unable to differentiate between predictors of success and prognostic factors. The identified indicators of prediction should be considered non-specific prognostic factors and need to undergo further investigation before being applied clinically with confidence. The findings from this review do however highlight important potential predictors, which can be cautiously applied within clinical reasoning paradigms, and give important direction for future research. With appropriate caution, clinicians should consider patellar taping for those with greater usual pain, foot orthoses for older individuals and exercise for younger individuals, and foot orthoses intervention for patients with greater forefoot valgus and rearfoot eversion magnitude peak. RCTs to validate indicators of prediction are clearly warranted to provide clinicians with robust evidence to deliver a tailored intervention to this heterogeneous patient population.