Introduction

Osteoarthritis (OA), characterized by narrowing of the joint space and remodeling of adjacent bone, is the most common form of arthritis and a major cause of disability and impaired quality of life [1, 2]. The currently recommended nonpharmacological and pharmacological treatments of OA aim to control pain and physical dysfunction while avoiding therapeutic adverse events (AEs) [3–5]. Because inflammatory processes contribute substantially to OA pain [6], non-steroidal anti-inflammatory drugs (NSAIDs) are the most frequently prescribed remedy. However, oral administration of NSAIDs carries a substantial risk of clinical AEs, including renal toxicity and gastrointestinal (GI) effects (ranging from mild heartburn to serious obstruction, ulceration, perforation, and bleeding) [7–9].

Topical NSAIDs were developed to address the need for safer treatment of OA [10, 11]. They are a possible alternative to oral therapy for relieving the symptoms of OA with fewer AEs, especially in the GI tract [12–14]. The guidelines of the European League Against Rheumatism (EULAR) recommended acetaminophen (paracetamol) as the first-line treatment for OA pain, while for patients who do not respond adequately to acetaminophen, either oral or topical NSAIDs were suggested [15]. The American College of Rheumatology also recommended topical analgesics for patients who do not respond to acetaminophen and want to avoid systemic therapy [3]. NSAIDs can be applied to the skin in various forms, such as gels, creams, sprays, and foams. For example, Pennsaid, which consists of diclofenac sodium in a patented carrier containing dimethyl sulphoxide (DMSO), is an effective product of this sort. The DMSO moiety is thought to facilitate site-specific delivery of topical diclofenac through the skin to the pain-generating sites in the joint [16–18]. Other forms of topical diclofenac also decreased pain and stiffness and improved physical function and patient global assessment (PGA) in patients with primary OA, with minimal systemic AEs and only minor skin irritation at the application site [19–21].

Several studies have explored the efficacy of topical diclofenac for the treatment of OA [22–27]. However, controversy has been raised regarding its long-term efficacy and safety [13, 28, 29]. A previous meta-analysis examined topical diclofenac for the treatment of OA but included only four randomized controlled trials (RCTs) [30]; its evidence was limited by the small number of included trials and by the lack of attention paid to physical function and AEs. In light of newly emerged evidence, the objective of this study was to evaluate the efficacy and safety of topical diclofenac for OA patients by conducting a quantitative meta-analysis. We hypothesized that, compared with control, topical diclofenac is more effective in pain relief and function improvement for OA patients without inducing side effects.

Materials and methods

Search strategy

This quantitative meta-analysis was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement [31]. We searched MEDLINE/PubMed, the Cochrane Central Register of Controlled Trials (CENTRAL), and EMBASE in September 2014 for relevant RCTs that compared topical diclofenac with placebo or vehicle in the treatment of OA, using logical combinations of keywords and text words related to OA, the interventions of interest, and RCTs (ESM 1). No restrictions were imposed, and the reference lists of retrieved articles and reviews were also searched.

Study selection

Two researchers independently reviewed all the retrieved abstracts and full texts. Disagreements were resolved through discussion with a third researcher. The inclusion criteria for this meta-analysis were (1) patients diagnosed with OA, (2) experimental group received topical diclofenac, (3) RCTs, (4) control group received placebo or vehicle, and (5) English literature. The exclusion criteria were (1) case reports, reviews, meta-analyses, animal trials, letters, retrospective studies, and other non-RCTs; (2) trials not controlled by placebo or vehicle; (3) experimental group mixed with other analgesics; (4) unavailability of data for extraction; and (5) unavailability of full texts.

Quality assessment

The methodological quality of the included studies was assessed using the Cochrane Collaboration’s Risk of Bias tool [32], based on the following items: random sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessment, incomplete outcome data, and selective reporting. Each item was categorized as low, high, or unclear risk of bias.

Data extraction

The primary goal of this study was to assess the effectiveness of topical diclofenac therapy in pain management and function improvement. The treatment effect was measured by the change scores of pain and function at the last follow-up time point. The change score equals the baseline value minus the follow-up value; thus, the greater the change score of pain or function, the better the effect. In addition, the end-point scores of pain and function were pooled. If a study reported multiple pain scales, the highest one on the hierarchy of pain-related outcomes was selected, as described by Jüni and colleagues [33]. The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) function subscale was the preferred measure for function. If a study did not measure or report WOMAC function, the WOMAC total, Lequesne Index, or other functional measurement scales were used instead. If a study reported outcomes at multiple time points after treatment, only data from the final follow-up time point were extracted for analysis. The effect on pain management and function improvement was expressed as the standardized mean difference (SMD) between treatment arms. The standard deviations (SDs) of absolute changes from baseline, if not available in an individual trial, were computed in accordance with the Cochrane Handbook [34].
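The change-score and SMD calculations described above can be sketched in Python. This is only an illustration under stated assumptions, not the computation actually performed in this study: the function names are hypothetical, the baseline-to-follow-up correlation `corr=0.5` is an assumed value (the text does not state which value was used), and the small-sample correction (Hedges' g) with the variance approximation shown is one common way of computing an SMD.

```python
import math

def change_sd(sd_baseline, sd_followup, corr=0.5):
    """Impute the SD of a change score from baseline and follow-up SDs.

    corr is the assumed correlation between baseline and follow-up
    measurements (hypothetical default; not reported in the text).
    """
    return math.sqrt(sd_baseline ** 2 + sd_followup ** 2
                     - 2 * corr * sd_baseline * sd_followup)

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """SMD (Hedges' g) and its standard error from mean change scores.

    Change score = baseline minus follow-up, so a positive g favors
    the treatment arm.
    """
    df = n_t + n_c - 2
    sd_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / sd_pooled
    j = 1 - 3 / (4 * df - 1)          # small-sample correction factor
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    return j * d, j * math.sqrt(var_d)

# Hypothetical trial: mean pain change 3.0 (treatment) vs 2.0 (control)
g, se = hedges_g(3.0, 2.0, 50, 2.0, 2.0, 50)
print(f"SMD = {g:.2f}, 95% CI {g - 1.96 * se:.2f} to {g + 1.96 * se:.2f}")
```

The numbers in the usage line are invented for illustration and do not come from any included trial.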

Statistical analyses

Quantitative analyses were performed for pain relief reported on the WOMAC, VAS, and Australian/Canadian Osteoarthritis Hand Index (AUSCAN) scales, and for function improvement reported on the WOMAC, Lequesne Index, and AUSCAN scales. We calculated an SMD and its corresponding 95 % confidence interval (CI) for quantitative data. As outcomes were reported at different time points in different studies, to facilitate and standardize pooling of data, the effect sizes were analyzed using the difference between the baseline and the last follow-up time point. Sensitivity analyses were conducted to explore potential sources of heterogeneity by excluding each single study in turn and by excluding studies of short duration (<4 weeks), studies of sites other than the knee, and studies with small sample sizes (<100 per group). Subgroup analyses were also conducted by stratifying by drug formulation and by follow-up time point. Dichotomous data on AEs and the number of requests for analgesia were summarized using the risk ratio (RR) and its corresponding 95 % CI. These outcomes were also analyzed at the last follow-up period.
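For the dichotomous outcomes, the RR and its 95 % CI can be illustrated as follows. This is a minimal sketch assuming the usual log-scale normal approximation; the function name and the event counts are hypothetical, and arms with zero events (which would need a continuity correction) are not handled.

```python
import math

def risk_ratio(events_t, n_t, events_c, n_c, z_crit=1.96):
    """Risk ratio with a 95% CI computed on the log scale.

    Assumes at least one event in each arm.
    """
    rr = (events_t / n_t) / (events_c / n_c)
    se_log = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    low = math.exp(math.log(rr) - z_crit * se_log)
    high = math.exp(math.log(rr) + z_crit * se_log)
    return rr, low, high

# Hypothetical counts: 20/100 with dry skin on treatment vs 10/100 on control
rr, low, high = risk_ratio(20, 100, 10, 100)
print(f"RR = {rr:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

An interval that includes 1 indicates no statistically significant difference in AE incidence between arms at the 5 % level.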

The homogeneity of effect sizes across trials was tested with the Q statistic (P ≤ 0.05 was considered heterogeneous). If there was significant heterogeneity among the studies, the random-effects model was used; otherwise, the fixed-effects model was used. We also examined the I² statistic, which measures the percentage of total variation across studies that results from heterogeneity rather than chance (I² ≥ 50 % was considered heterogeneous). A sensitivity analysis was conducted to examine the influence of various exclusion criteria on the overall effect sizes.
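The heterogeneity assessment and model choice described above can be sketched as follows. The sketch makes several assumptions: the function name is hypothetical, the model is chosen on the I² threshold alone (the P value of the Q test needs a chi-square tail probability, omitted here), and the between-study variance uses the DerSimonian-Laird estimator, which the text does not name explicitly.

```python
import math

def pool_effects(effects, ses, i2_threshold=50.0):
    """Inverse-variance pooling with Cochran's Q and I-squared.

    Uses a random-effects model (DerSimonian-Laird tau^2, an assumption)
    when I^2 >= i2_threshold, otherwise a fixed-effects model.
    """
    w = [1 / se ** 2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    if i2 >= i2_threshold:
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
        w_re = [1 / (se ** 2 + tau2) for se in ses]
        est = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
        return {"model": "random", "estimate": est,
                "se": math.sqrt(1 / sum(w_re)), "Q": q, "I2": i2}
    return {"model": "fixed", "estimate": fixed,
            "se": math.sqrt(1 / sum(w)), "Q": q, "I2": i2}

# Hypothetical SMDs and standard errors from three trials
print(pool_effects([0.10, 0.45, 0.80], [0.10, 0.12, 0.15]))
```

With heterogeneous inputs the random-effects weights become more nearly equal across trials, which widens the pooled confidence interval relative to the fixed-effects result.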

Begg’s test and funnel plots were used to assess publication bias [36]. We used Review Manager 5.2 software (RevMan 5.2, The Cochrane Collaboration, Oxford, UK) and STATA, version 12.0 (Stata Corp LP, College Station, TX) to perform the statistical analyses. A P value <0.05 was considered statistically significant, unless otherwise specified.
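The idea behind Begg's rank correlation test can be illustrated with a short sketch: standardized effect sizes are correlated with their variances using Kendall's tau, and a strong correlation suggests funnel plot asymmetry. The function name and data below are hypothetical, and the sketch applies no continuity correction or tie adjustment, which statistical packages may add, so its P values need not match those reported by STATA.

```python
import math

def begg_test(effects, variances):
    """Rank correlation between standardized effects and their variances.

    Returns Kendall's tau, the normal-approximation z score, and the
    two-sided P value (no continuity correction, no tie adjustment).
    """
    k = len(effects)
    inv_sum = sum(1.0 / v for v in variances)
    pooled = sum(t / v for t, v in zip(effects, variances)) / inv_sum
    # Standardize each deviation from the pooled estimate
    std = [(t - pooled) / math.sqrt(v - 1.0 / inv_sum)
           for t, v in zip(effects, variances)]
    s = 0  # concordant pairs minus discordant pairs
    for i in range(k):
        for j in range(i + 1, k):
            prod = (std[i] - std[j]) * (variances[i] - variances[j])
            if prod > 0:
                s += 1
            elif prod < 0:
                s -= 1
    tau = s / (k * (k - 1) / 2)
    z = s / math.sqrt(k * (k - 1) * (2 * k + 5) / 18)
    p = math.erfc(abs(z) / math.sqrt(2))
    return tau, z, p

# Hypothetical data in which less precise trials report larger effects
tau, z, p = begg_test([0.1, 0.2, 0.3, 0.4, 0.5],
                      [0.01, 0.02, 0.03, 0.04, 0.05])
print(f"tau = {tau:.2f}, z = {z:.2f}, P = {p:.3f}")
```

A non-significant P value, as reported for both outcomes in this study, means the test found no evidence of such asymmetry; with only nine trials its power is limited.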

Results

A total of 236 papers met the search criteria of this meta-analysis. The researchers reviewed 31 full texts and eventually identified 9 papers (covering a total of 2642 patients, including 1333 in the topical diclofenac group and 1309 in the control group) for the final analysis [19–27]. Figure 1 shows a flow diagram of the literature search and the study selection procedure. The characteristics of the nine included studies are presented in Table 1.

Fig. 1 Flow diagram of screened, excluded, and analyzed publications

Table 1 Characteristics of nine included RCTs

A total of six RCTs [19–21, 23, 24, 27] had high methodological quality (low risk of bias). Two [25, 26] had unclear random sequence generation, one [26] had a high risk of bias in allocation concealment, and one [22] had an unclear risk of bias in allocation concealment. One RCT [22] did not report the specific method for blinding of participants and outcome assessment. All the included RCTs adopted the intention-to-treat analysis method, so the risk of bias from incomplete outcome data was generally low. Meanwhile, all RCTs had a low risk of selective reporting. The details are shown in Table 2.

Table 2 Methodological quality assessment according to the Cochrane Collaboration’s Risk of Bias tool

Change pain score

Figure 2 shows the results of the 9 included RCTs (2642 patients), combining all the SMDs for change pain scores. Overall, the combined data indicated that OA patients who received topical diclofenac therapy had significantly higher change pain scores (SMD = 0.40; 95 % CI 0.19 to 0.62; P = 0.0003) compared with the control group. Substantial heterogeneity was observed (I² = 86 %; P < 0.00001). Sensitivity analysis was conducted to examine the potential sources of heterogeneity between topical diclofenac and the control group and to investigate the impact of various exclusion criteria on the overall estimate. The overall combined SMD ranged from 0.30 (95 % CI 0.13 to 0.46; P = 0.0004) to 0.44 (95 % CI 0.20 to 0.68; P = 0.0003), suggesting that it was not materially altered by the exclusion of any single study. When studies with short duration (<4 weeks), sites other than the knee, or small sample sizes (<100 per group) were excluded, the results for change pain scores remained positive.
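The relation between a pooled SMD, its 95 % CI, and its P value can be cross-checked with a few lines of Python. This is a rough sketch assuming a symmetric normal-approximation interval; the function name is hypothetical, and because the published CI limits are rounded, the recovered P value is only approximate.

```python
import math

def p_from_ci(estimate, ci_low, ci_high, z_crit=1.96):
    """Recover the standard error, z score, and two-sided P value
    from a point estimate and its 95% confidence interval."""
    se = (ci_high - ci_low) / (2 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Pooled change pain score reported above: SMD = 0.40, 95% CI 0.19 to 0.62
se, z, p = p_from_ci(0.40, 0.19, 0.62)
print(f"SE = {se:.3f}, z = {z:.2f}, P = {p:.4f}")
```

The recovered value should land close to the reported P = 0.0003, which indicates that the estimate, interval, and P value are mutually consistent.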

Fig. 2 Forest plot of meta-analysis: pain intensity of nine included RCTs. CI confidence interval

Because the active drug ingredient was delivered in three formulations across the nine included studies, namely, patch [22], solution [19–21, 24], and gel [23, 25–27], a subgroup analysis was subsequently conducted to account for differences between formulations (Forest plots in ESM 2). Table 3 shows that topical diclofenac still had a significantly higher change pain score compared with placebo in both the solution subgroup (SMD = 0.18; 95 % CI 0.06 to 0.31) and the gel subgroup (SMD = 0.40; 95 % CI 0.14 to 0.67). The subgroup analysis by follow-up time point (week 1, 2, 3, 4, 6, 8, 12) suggested that the effect on pain relief remained positive (ESM 2). In addition, only 4 RCTs reported the end-point pain score, and the combined result remained positive (SMD = −0.30; 95 % CI −0.43 to −0.18; P < 0.00001) (ESM 3). Begg’s funnel plot did not indicate substantial asymmetry on visual review. The Begg rank correlation test did not identify any evidence of publication bias among the included studies (P = 0.917).

Table 3 Results of sensitivity analysis

Change function score

Figure 3 shows the results of the 9 included RCTs (2642 patients), combining all the SMDs for change function scores. Similarly, the topical diclofenac group achieved a significantly better treatment effect (SMD = 0.23; 95 % CI 0.03 to 0.43). Substantial heterogeneity was observed (I² = 85 %; P < 0.00001). When studies with short duration (<4 weeks), sites other than the knee, or small sample sizes (<100 per group) were excluded, the result for change function scores became non-significant. In the subgroup analysis by formulation, the improvement also became non-significant (Tables 3 and 4) (Forest plots in ESM 2). The subgroup analysis by follow-up time point (week 1, 2, 3, 4, 6, 8, 12) suggested that the effect on function improvement remained positive (ESM 2). In addition, only 4 RCTs reported the end-point function score, and the combined result remained positive (SMD = −0.42; 95 % CI −0.65 to −0.19; P = 0.0003) (ESM 3). Begg’s funnel plot did not indicate substantial asymmetry on visual review. The Begg rank correlation test did not identify any evidence of publication bias among the included studies (P = 0.466).

Fig. 3 Forest plot of meta-analysis: physical function of nine included RCTs. CI confidence interval

Table 4 Results of subgroup analysis

Adverse events

All nine included studies reported the rates of various complications. There were 21 types of AE mentioned by at least 2 papers (ESM 4): dry skin, rash, dermatitis, paresthesia, pruritus, GI events (abdominal pain, constipation, diarrhea, dyspepsia, flatulence, melena, nausea, vomiting), upper respiratory tract infection, edema, headache, taste perversion, pain, pain in extremity, arthralgia, back pain, neck pain, nasopharyngitis, sinusitis, cough, severe AEs, serious AEs, and withdrawal due to AEs. Pooled data analysis revealed that the incidence of five types of AE was higher in the topical diclofenac group: dry skin, rash, dermatitis, neck pain, and withdrawal due to AEs (Table 5). None of these AEs was serious. Other types of AE, including conjunctivitis, halitosis, pharyngolaryngeal pain, accidental injury, and abnormal vision, were mentioned only once and therefore should not be considered common complications.

Table 5 Adverse events mentioned ≥2 times in the included studies

Discussion

This meta-analysis assessed the efficacy and safety of topical diclofenac for the management of pain and physical dysfunction in OA patients. The results show that topical diclofenac is effective in pain relief, but its potential effect on function improvement needs to be further verified. In addition, topical diclofenac could greatly reduce renal toxicity and GI effects, but patients may suffer from slight skin irritation (mainly manifested as dry skin, rash, dermatitis, and neck pain).

A majority of the included studies suggested that topical NSAIDs are effective analgesics that can be used for pain relief in OA patients [19–27]. A meta-analysis conducted by Moore concluded that topical NSAIDs were effective in pain relief for both acute and chronic conditions, including OA (relative benefit of 2.0; 95 % CI 1.7 to 2.2), and were as safe as control in terms of both local and systemic AEs [13]. In another meta-analysis, conducted by Manson, data from the first 2 weeks also showed that topical NSAIDs were significantly more effective than placebo for subjects with painful chronic conditions, including OA (relative benefit of 1.9; 95 % CI 1.7 to 2.2) [28]. Lin found that topical NSAIDs were superior to placebo in function improvement as well as in pain relief for OA patients, but the findings were based only on the first 2 weeks of treatment [29]. Biswal conducted a meta-analysis in 2006 that included four RCTs with a duration of 4 weeks or more and compared topical NSAIDs with placebo or vehicle [35]. The pooled effect of topical NSAIDs, measured at the 4th week or beyond, was superior to that of placebo or vehicle in pain relief (mean effect size −0.28; 95 % CI −0.42 to −0.14). He concluded that topical NSAIDs were effective in pain relief for knee OA over a longer duration.

Since all these studies involved only a few types of preparations, extrapolating their results to all NSAIDs may be erroneous. Towheed focused his study on Pennsaid (a topical diclofenac solution), a topical NSAID widely used for OA analgesia [30]. His systematic review and meta-analysis included four high-quality RCTs with a mean duration of 8.5 weeks. In comparison with the vehicle control or placebo, the SMDs of the WOMAC pain, stiffness, and physical function subscales (ranging from 0.30 to 0.39) were all significantly in favor of Pennsaid. Except for minor skin dryness at the application site (RR 1.7), Pennsaid was as safe as placebo.

NSAIDs are associated with dose-related risks of renal toxicity, GI effects, and cardiovascular disease, the incidence of which increases with age [7–9]. Epidemiologic data indicate that even the young population has an elevated risk of cardiovascular disease. Consequently, it is essential to mitigate the risk of NSAID-related AEs in both older and younger populations [36]. Topical NSAIDs showed a lower incidence of GI effects compared with similar drugs taken orally and with cyclo-oxygenase 2 inhibitors [12, 37, 38]. Therefore, replacing systemic NSAID exposure with topical administration can benefit patients of any age who suffer from superficial joint pain due to OA. The side effects of topical diclofenac were mainly local skin reactions (dryness, dermatitis, rash), which are generally well tolerated and self-limiting [19, 20, 29]. Two studies mentioned neck pain as an AE but did not give a detailed explanation [26, 27]. The low incidence of systemic AEs with topical NSAIDs is probably due to the lower plasma concentrations reached when similar doses are applied topically rather than administered orally [39, 40]. However, it is noteworthy that the risk of withdrawal due to AEs was higher in the topical diclofenac group than in the control group. Such a finding is plausible given the higher incidence of skin reactions (including dry skin, rash, and dermatitis) with topical diclofenac, but it needs further examination.

The present study has several strengths. First, this meta-analysis included only randomized controlled trials, which improves the comparability between the two groups and reduces the risk of selection bias. It also reports the effects of topical diclofenac on OA analgesia comprehensively. The subjects’ characteristics, such as baseline illness status, residential background, and ethnicity, varied greatly with geographical location, suggesting that the conclusions may have adequate external validity to be extrapolated to a broader population. Furthermore, the search strategies of this meta-analysis were comprehensively designed around three fundamental databases, which should cover almost all relevant papers. Most of the included studies were high-quality RCTs, which supports the robustness of the conclusions. Finally, both the sensitivity analysis and the subgroup analysis supported the stability of the results for pain relief. The present study thus adds a comprehensive conclusion to the current literature base that can inform clinical practice.

Limitations of the present meta-analysis should also be acknowledged. First, the search strategies did not cover unpublished trials, which might result in selection bias, as trials with positive results were more likely to be included. Second, language bias cannot be completely avoided, because some non-English papers were not indexed in the databases [41]. Third, the results may have been confounded by the different efficacy endpoints used in different studies. This dissimilarity was reduced in this study by calculating the absolute change scores from baseline to the end of follow-up, provided that all the subjects had similar conditions before therapy. Finally, the conclusions of this study are not fully applicable to OA of joints other than the knee, because of the limited number of included trials involving patients with OA of other joints.

Conclusions

In conclusion, this meta-analysis of randomized controlled trials suggests that topical diclofenac is effective in pain relief for OA patients and may also improve function, although this potential effect needs further study. Although several adverse effects were observed with the use of topical diclofenac, none of them was serious.