Introduction

Total knee arthroplasty (TKA) is a common surgical procedure in ageing populations. The US National Center for Health Statistics reported in 2014 that the overall TKA incidence rate had increased from 5.5 to 8.7 per 1000 population [28], and that demand had risen substantially over the past decade [24]. Surgery is just one part of the overall TKA process, with physical therapy (PT) and rehabilitation playing an integral role in successful outcomes.

As healthcare costs rise and more patients pay their own healthcare bills, PT is under critical scrutiny to justify its effectiveness. Some research on home exercise programs has found that they may be just as effective as supervised PT and a viable cost-conscious option [6]. Rehabilitation programs have shown efficacy in restoring functional status, thus enhancing the clinical and social benefits of TKA [20, 30]. Rehabilitation practices after hospital discharge vary across and within countries, although some form of early rehabilitation (0–6 weeks) after hospital discharge appears to be the norm. After the surgical procedure, an early inpatient rehabilitation program helps to restore function and range of motion (ROM). This rehabilitation should be continued after hospital discharge (Fig. 1). However, these post-hospitalization programs are highly variable and some are very costly [4]. They may include anything from supervised PT with numerous techniques to home-based exercises taught to patients by physiotherapists. Controversy still exists over the need for supervised PT or exercise [1, 5, 32].

Fig. 1 Rehabilitation phases in the recovery of total knee arthroplasty (TKA)

If a well-structured home-based exercise regime were developed, costly individualized supervised outpatient PT might not be necessary. However, solid evidence is needed to support such a recommendation, as most supervised PT programs are already well established in centres that perform TKA.

The aim of the present study was to evaluate the effectiveness and safety of outpatient PT delivered by physiotherapists in a clinic-based setting versus non-supervised home-based exercise for the functional recovery immediately after discharge from a primary TKA procedure. As a secondary objective, we aimed to describe the effect of techniques added to the unsupervised exercise.

Materials and methods

A systematic review and meta-analysis following the PRISMA statement was conducted and registered prospectively on PROSPERO. The clinical question above was translated into an epidemiological one using the Patient, Intervention, Comparator, Outcome, and Type of study (PICOT) approach. Inclusion criteria were as follows: (1) patients should be adults with primary TKA; (2) interventions should include one-to-one or individualized clinic-based outpatient PT and should be compared to unsupervised home-based programs, alone or in addition to other strategies, such as telerehabilitation or any device that can be used by the patient without supervision (e.g. transcutaneous electrical nerve stimulation or a continuous passive motion machine); (3) outcomes should include active knee ROM in flexion, which was our primary outcome, functional knee limitation, pain or perceived pain intensity, physical conditioning or physical function, quality of life, muscle strength, patient satisfaction with the intervention, or adverse events/complications; and (4) by type of design, only randomized clinical trials (RCTs) were admitted. Studies were excluded if they focused on revision or bilateral surgeries.

The following databases were screened: Medline (1966 to April 2015), Embase (1974 to April 2015), Cochrane Library (1982 to April 2015), and the Physiotherapy Evidence Database, PEDro (to April 2015). The search strategy (available upon request) included the terms “home exercise program”, “unsupervised physical therapy”, “post surgical knee” or “physical therapy”. We searched only published articles, and limited languages to English, Spanish, French, and German. The reference lists of the included articles were also reviewed.

Study selection

Two authors (RC & IPP) independently assessed the electronic search results. They screened first by title and then by abstract in sessions aided by Covidence®. When an article title seemed relevant, the abstract was reviewed for eligibility. If there was any doubt, the full text of the article was retrieved and appraised for possible inclusion. Any differences between the two authors were discussed and, if necessary, referred to a third author (LC) for arbitration. A reason for exclusion was recorded for every article that was not eligible.

Quality assessment

Two authors (BN & RC) independently assessed the risk of bias of the articles selected for detailed review. The methodological domains of the assessment, namely randomization sequence, allocation concealment, blinding and conflicts of interest, were graded according to the PEDro scale checklist [14]. The PEDro scale considers two aspects of trial quality, namely the “believability” (or “internal validity”) of the trial and whether the trial contains sufficient statistical information to make it interpretable. It does not rate the “meaningfulness” (or “generalizability” or “external validity”) of the trial, or the size of the treatment effect.

Data extraction

Two authors (RC & BN) independently extracted the data from the included articles using forms previously pilot-tested for feasibility and comprehensiveness, and differences were discussed. Reasons for exclusion at this stage are summarized in Fig. 2 (the full list of excluded articles with reasons is available upon request). Results were recorded in an Excel spreadsheet. Data were extracted from each trial regarding participants (group size according to intention-to-treat analysis, age and sex), content of intervention and comparison, setting and timing of intervention, time from surgery and outcomes. When a trial employed two variations of a PT intervention (e.g. Ko et al. [21]), only one group was included.

Fig. 2 Selection process and study flow criteria

For outcomes reported as continuous variables, the mean and standard deviation were extracted. If outcomes were reported as means and confidence intervals, or medians and inter-quartile ranges, appropriate conversions were applied [2, 37]. Authors were contacted for missing data. If authors had provided information to other reviewers, these data were included in our analysis and acknowledged appropriately [25, 27]. In two studies [23, 36], data were provided only in figures, and numerical data were therefore extracted from the figures using image editing programs.
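The conversions mentioned above can be illustrated with a minimal sketch. It assumes two widely used large-sample approximations (a standard deviation recovered from a 95 % confidence interval of a group mean, and a mean and standard deviation estimated from the median and quartiles of roughly normal data); these may not be exactly the formulas of references [2, 37]. The function names and the input values are hypothetical.

```python
import math


def sd_from_ci(lower, upper, n, z=1.96):
    """Approximate the SD of a group from the CI of its mean.

    Assumes a large-sample normal interval: CI = mean +/- z * SD / sqrt(n).
    """
    return math.sqrt(n) * (upper - lower) / (2 * z)


def mean_sd_from_quartiles(q1, median, q3):
    """Approximate mean and SD from the median and inter-quartile range.

    Assumes roughly normal data, for which the IQR spans about 1.35 SD.
    """
    mean = (q1 + median + q3) / 3
    sd = (q3 - q1) / 1.35
    return mean, sd


# Hypothetical example: flexion of 105 degrees (95 % CI 100 to 110) in 25 patients
print(round(sd_from_ci(100, 110, 25), 2))
# Hypothetical example: median 100 degrees, IQR 90 to 110
print(mean_sd_from_quartiles(90, 100, 110))
```

For small samples or clearly skewed data these approximations become unreliable, which is one reason converted values add imprecision to a pooled estimate.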

Statistical analysis

Data from knee ROM, separated by active extension and flexion, were obtained in all studies in similar form, by standard goniometry, and thus were combined as mean differences, whereas functional status was combined as standardized mean differences because it was collected using different scales, such as the Western Ontario and McMaster Universities Arthritis Index (WOMAC) or the Knee Society Score (KSS). Pooled effects were obtained from random effects meta-analyses [18] in Stata® version 14. For all outcomes we carried out subgroup analyses by whether a co-intervention was added to the home-based group or not.
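As an illustration of the pooling described above, the following sketch computes a mean difference, a standardized mean difference and a random-effects pooled estimate. It assumes the DerSimonian-Laird estimator and Hedges' small-sample correction, neither of which is specified in the text (the analysis itself was run in Stata®), and all function names and inputs are hypothetical.

```python
import math


def mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Mean difference between two groups and its variance."""
    return m1 - m2, sd1 ** 2 / n1 + sd2 ** 2 / n2


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference (Hedges' g) and its approximate variance."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / pooled_sd
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return j * d, j ** 2 * var_d


def pool_random_effects(effects, variances):
    """DerSimonian-Laird pooled effect, 95 % CI limits and tau squared."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero (and zero for a single study)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if len(effects) > 1 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se, tau2


# Hypothetical flexion data (degrees): home-based vs supervised group per study
studies = [(110, 10, 25, 108, 10, 25), (105, 12, 40, 107, 11, 38)]
effects, variances = zip(*(mean_difference(*s) for s in studies))
print(pool_random_effects(effects, variances))
```

Mean differences keep the original unit (degrees), whereas Hedges' g expresses each study on a common scale, which is why it suits outcomes collected with different questionnaires.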

Statistical heterogeneity was assessed with the I² statistic; we considered values greater than 50 % to indicate important variability that required explanation. To explore the sources of this variability, we performed sensitivity analyses [18].
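The heterogeneity statistic can be sketched as follows, using the standard definition based on Cochran's Q with inverse-variance weights. The function name and the input values are hypothetical, and the 50 % threshold is the one stated in the text.

```python
def i_squared(effects, variances):
    """I-squared (%) from Cochran's Q with inverse-variance weights."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    # Share of total variation beyond what chance alone would produce
    return max(0.0, (q - df) / q) * 100


# Hypothetical mean differences (degrees) and their variances
value = i_squared([0.0, 4.0], [1.0, 1.0])
print(value, "important variability" if value > 50 else "acceptable")
```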

Given that no difference between groups was anticipated in most outcomes, a non-inferiority hypothesis was established. To set the non-inferiority margin, the minimum clinically important change (MCIC) and difference (MCID) for each outcome were explored. The MCIC values used were those reported by Collins et al. [12], Busija et al. [7], Julian et al. [19], Smarr et al. [34], and Dowsey et al. [13]. Based on these, the difference between groups had to be no larger than four points in either direction to be considered non-inferior.
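The decision rule above can be sketched in a few lines. The text does not state whether the point estimate or the whole confidence interval was compared with the margin; this sketch assumes the stricter convention that the entire 95 % CI of the pooled difference must lie within the margin. The function name and the example intervals are hypothetical.

```python
def is_non_inferior(ci_lower, ci_upper, margin=4.0):
    """True when the whole CI of the between-group difference lies within +/- margin."""
    return -margin < ci_lower and ci_upper < margin


# Hypothetical pooled differences with 95 % CI limits
print(is_non_inferior(-2.1, 1.5))   # interval inside the margin
print(is_non_inferior(-5.0, 1.0))   # lower limit crosses the margin
```

Comparing the interval rather than the point estimate means that an imprecise pooled result is reported as inconclusive instead of non-inferior.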

Results

The electronic search strategy yielded 2301 articles. After screening titles and abstracts, 75 full papers were retrieved. In addition, one article was found through manual search [33]. There was no need for arbitration by a third author or for contact with the authors of the original studies during the screening process. After detailed scrutiny, 11 studies were included [5, 16, 21, 23, 25, 27, 31–33, 36, 39]. The review process is summarized as a flow diagram in Fig. 2.

Characteristics of the included studies

All studies were RCTs with sample sizes greater than 10 and follow-up between 6 weeks and 24 months. A summary of the studies is presented in Table 1. In terms of quality, the mean PEDro score of the studies was 6 out of 10 (see Table 2). Four studies did not clearly report eligibility criteria [5, 27, 36, 39], and in one, randomization was unclear [32]. Blinding was a complicated issue given the nature of the interventions. In four studies, the assessor was unblinded [5, 25, 33, 36]. Nevertheless, blinding is not as limiting in non-inferiority hypotheses as in superiority studies. Half of the studies reported losses to follow-up of more than 15 %.

Table 1 Summary of the eleven included RCTs
Table 2 PEDro scores for included studies (n = 11)

Participants were very homogeneous, with mean ages around 65 years. However, the interventions varied widely across studies. PT included thermotherapy or cryotherapy (n = 7), electrical stimulation (n = 2), joint mobilization (n = 6), and strengthening or progressive resistance exercises (n = 8). Five studies reported standard or conventional PT, with no specification of which interventions were provided. Home programs varied across trials (Table 1) and were mainly exercise protocols. Collectively, the length of the home exercise intervention ranged from 4 to 52 weeks. One study [32] did not report the length of the intervention. The frequency of exercises ranged from 1 to 7 times per week, with no information on regime intensity. The intervention started between 2 and 6 weeks after TKA.

Knee range of motion

Knee ROM was measured with goniometry and reported in all studies. ROM measurements covered flexion and extension, both active and passive. Two studies reported quadriceps lag [21, 39]. Different positions were used during assessment and are summarized in Table 3.

Table 3 Range of movement (ROM) measured and position for included studies (n = 11)

Active extension ROM data suitable for meta-analysis were available from seven studies totalling 707 patients [5, 16, 25, 27, 31, 36, 39], and active flexion ROM data were available from nine studies totalling 983 patients [5, 16, 23, 25, 27, 31–33, 36, 39]. Most studies showed no difference between groups. The pooled difference was within the non-inferiority margin at all time points selected (3, 6 and 12 months). Heterogeneity was larger in the active flexion meta-analyses, ranging from 36 % at 12 months up to 60 % at 6 months, whereas the active extension meta-analyses showed heterogeneity (23 %) only at 6 months after surgery. For details on the meta-analyses, see Table 4 and Figs. 3, 4 and 5 for knee extension and Table 5 and Figs. 6, 7 and 8 for knee flexion. Subgroup meta-analyses by co-interventions showed no differences between groups.

Table 4 Pooled estimates of the mean difference in active knee extension at 3, 6 and 12 months after discharge for total knee replacement between home-based unsupervised and clinic-based supervised physical therapy groups
Fig. 3 Forest plot of the mean difference in knee extension at 3 months after discharge for total knee replacement between home-based unsupervised and clinic-based supervised physical therapy groups

Fig. 4 Forest plot of the mean difference in knee extension at 6 months after discharge for total knee replacement between home-based unsupervised and clinic-based supervised physical therapy groups

Fig. 5 Forest plot of the mean difference in knee extension at 12 months after discharge for total knee replacement between home-based unsupervised and clinic-based supervised physical therapy groups

Table 5 Pooled estimates of the mean difference in active knee flexion at 3, 6 and 12 months after discharge for total knee replacement between home-based unsupervised and clinic-based supervised physical therapy groups
Fig. 6 Forest plot of the mean difference in knee flexion at 3 months after discharge for total knee replacement between home-based unsupervised and clinic-based supervised physical therapy groups

Fig. 7 Forest plot of the mean difference in knee flexion at 6 months after discharge for total knee replacement between home-based unsupervised and clinic-based supervised physical therapy groups

Fig. 8 Forest plot of the mean difference in knee flexion at 12 months after discharge for total knee replacement between home-based unsupervised and clinic-based supervised physical therapy groups

Patient-reported physical function

Data on physical function were available from all the studies except for Rajan et al. [32] and were measured with WOMAC (n = 7), KSS (n = 3) and Oxford Knee Score (OKS) (n = 2) (see Table 6 for details on questionnaires used). Although the differences between groups on function were all in favour of home-based exercise, they were not statistically significant (see Table 7; Figs. 9, 10, 11). Heterogeneity was only moderate in studies reporting function outcomes at 3 months.

Table 6 Instruments used to measure functional status in the included studies
Table 7 Pooled estimates of the mean difference in functional status at 3, 6 and 12 months after discharge for total knee replacement between home-based unsupervised and clinic-based supervised physical therapy groups
Fig. 9 Forest plot of the mean difference in physical function at 3 months after discharge for total knee replacement between home-based unsupervised and clinic-based supervised physical therapy groups

Fig. 10 Forest plot of the mean difference in physical function at 6 months after discharge for total knee replacement between home-based unsupervised and clinic-based supervised physical therapy groups

Fig. 11 Forest plot of the mean difference in physical function at 12 months after discharge for total knee replacement between home-based unsupervised and clinic-based supervised physical therapy groups

Pain

Pain was measured in seven studies, using the VAS pain score (n = 2), the KSS pain score (n = 3) or the 5-item pain subscale of the WOMAC (n = 3). Pain was not meta-analysed.

Safety

Safety was described in three of the included articles [16, 21, 23]. In the study by Kramer et al. [23], two patients in each group needed knee manipulation under anaesthesia between 2 and 7 weeks after surgery; the same event occurred in three patients in the home exercise group and two in the supervised PT group in the study by Ko et al. [21]. In the study by Han et al. [16], the rate of hospital admissions in the first 6 weeks after surgery was similar between groups (7 vs. 9 %).

Discussion

The most important finding of the present study was the non-inferiority of home-based exercise when compared to usual PT activities with respect to knee ROM and functional status at 3, 6 and 12 months after surgery.

In a previous systematic review, Coppola et al. [11] found no evidence to support supervised PT over home exercises following knee surgery. However, they included studies with young patients, who consequently had few comorbidities, and could not draw conclusions about the effect of home exercises compared with supervised PT in older populations with comorbidities or with complicated knee surgical procedures, such as TKA.

Another review by Artz et al. [3] showed that patients receiving PT exercise, supervised or not, improved physical function at 3–4 months compared to any other PT technique, with a standardized mean difference of −0.37 (95 % CI −0.62, −0.12). The present meta-analysis shares 5 of the 18 studies in the review by Artz et al. [3]; however, that review included PT treatments performed in the patient’s home, and thus supervised, as well as other treatment modalities not compared specifically to unsupervised home-based exercise.

Knee ROM is the main follow-up outcome after TKA and is believed to reflect the patient’s progress, although it has been found to be a poor marker of implant success [29]. In addition, high heterogeneity was observed in the meta-analyses of ROM. Possible causes are the use of different measurement positions across studies, difficulties in extracting data from studies (especially for active extension, where negative values can be misleading) and the low reliability of the instruments employed [10]. Five degrees has been chosen as the MCID, reflecting a difference that is both larger than the measurement error and detectable by the subject in daily activities [8].

Another concern is the extent to which patients care about flexion beyond the point needed to perform daily activities after TKA. To address this, Thomsen et al. [35] designed a study using a high-flex PS prosthesis that achieved very high degrees of knee flexion. Although knee flexion increased, patient-perceived outcomes showed no significant differences. This suggests that differences in knee flexion matter little to patients, provided that flexion reaches a minimum magnitude, as pain-free ROM and high patient satisfaction were achieved with both types of prostheses [35].

Poor medium-to-long-term patient outcomes after TKA are consistent with other studies reporting that only 50 % of patients may have a clinically important improvement in WOMAC score a year after surgery [17]. In addition, the poor reporting of both the WOMAC index and the KSS in the individual studies resulted in significant uncertainty in the interpretation of the combined results and limited their contribution to evidence synthesis [38]. Miner et al. [26] found a low correlation between knee ROM and WOMAC functional status (r < 0.34). Patients with flexion <95° had significantly worse WOMAC function scores than patients with 95° or higher a year after surgery, and both WOMAC pain and function scores correlated with patient satisfaction and perceived improvement in quality of life after a year, whereas knee flexion did not [26]. A recent systematic review [38] of 76 articles from 22 countries pointed out that WOMAC reliability was consistently high (≥0.90) for the function scale and acceptable (≥0.70) for the pain and stiffness scales [15]. Therefore, we should move towards measurements of functional outcomes, which may reflect the real state of the patient better than knee ROM.

Two other outcomes are worth mentioning in this discussion: pain and safety. Although pain was reported in seven studies, it was not included in the meta-analysis because exercise regimes and PT do not directly manage pain after TKA. An effect on pain is the aim of pharmacological and other non-pharmacological therapies, such as neuromuscular electrical stimulation, transcutaneous electrical nerve stimulation or cryotherapy [9], but not clearly of exercise. Regarding safety, a common argument against the use of a monitored home program is the risk of delayed detection of adverse effects, and consequently of serious complications. For this reason, we analysed safety, which unfortunately was described in detail in only three of the eleven studies [16, 21, 23], and we found that both treatment groups (home-based vs outpatient PT) were similar with respect to the number of hospital readmissions for knee-related issues.

The main limitations of this review include heterogeneity, especially in the results on knee ROM, and imprecision, the latter due to the small sample sizes of the studies and to the need to extract data from figures. Publication bias could not be adequately assessed because of the small number of included studies, but it is possible in a field that mainly consists of small trials. Nonetheless, this systematic review has several strengths, including a comprehensive search of multiple databases, study selection by two independent reviewers and the use of co-intervention subgroup analyses within trials. Although our ability to answer our research questions was hampered by the inadequate reporting of outcome data in primary studies, the results of this study can help with decision-making after TKA.

Conclusion

Despite the limitations of the data, the improvement in physical function and knee ROM does not seem to clearly differ with the use of interventions including outpatient PT or home-based exercise regimes after primary TKA for knee osteoarthritis.