Introduction

Migraine is a recurrent primary headache characterized by unilateral, throbbing pain lasting 4–72 h per attack; migraine headaches, with or without aura, are usually accompanied by nausea, photophobia, or phonophobia [1]. Migraine is the third most common disease globally, affecting 14.7% of the general population [2, 3]. It ranked as the third-highest cause of disability in males and females under the age of 50 (GBDS 2015) [1]. Migraine also imposes a heavy economic burden. In the United States, $9.2 billion was spent annually on the medical management of migraine between 2004 and 2013 [4].

The prophylaxis of migraine is unsatisfactory. Although guidelines mainly recommend prophylactic drug therapies, these drugs, such as beta-blockers, calcium antagonists, or antiepileptic drugs, were not developed specifically to treat migraine [1]. Some of them often cause adverse events (AEs) such as nausea, weight gain, and dizziness. Owing to the unsatisfactory effect of prophylactic drugs and their AEs, patients with migraine find it difficult to comply with these treatments [5, 6]. Moreover, the effect of drugs for prophylaxis of migraine without aura requires further study [7].

Acupuncture is reported to be effective for migraine prophylaxis in lessening headache intensity [8], reducing migraine days [9], and improving quality of life [10]. It seems to be safer than prophylactic drugs, since it rarely causes severe adverse events [11]. However, several meta-analyses on the effectiveness of acupuncture for migraine prophylaxis reported different pooled effect sizes [12,13,14,15]. New RCTs therefore had to be conducted, since a larger sample size makes the results more precise, and the meta-analysis teams accordingly had to update their data, although repeated updating increases the likelihood of type I errors [16].

Trial sequential analysis (TSA) is a method that combines several techniques to quantify the evidence needed; it provides the required information size together with the thresholds for statistical significance and for futility of the intervention effect. It can reduce early false-positive results caused by imprecise meta-analyses and repeated significance testing. As trials are successively added and analyzed, the effectiveness of a treatment can be summarized in a timely manner. If TSA shows the treatment to be ineffective, it can be stopped promptly, which reduces unnecessary costs; if it is effective, its use can be promptly expanded [17,18,19,20]. To clarify whether acupuncture is efficacious in migraine prophylaxis and whether current RCTs are adequately powered to detect that efficacy, we conducted a meta-analysis with TSA comparing acupuncture with sham acupuncture or conventional therapy in the prevention of migraine.

Methods

The meta-analysis was designed and performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [21, 22].

Inclusion and exclusion criteria

RCTs were included when they met all the following criteria: (1) assessing the efficacy of acupuncture in migraine prophylaxis by comparing it with sham acupuncture or conventional therapy; (2) recruiting participants aged over 18 years who were diagnosed with migraine according to the International Classification of Headache Disorders (ICHD) developed by the International Headache Society (IHS); (3) reporting any of the following outcome measures: migraine episodes (migraine frequency or migraine days), responder rate (a responder being defined as a participant with a reduction of ≥ 50% in monthly migraine attacks), or adverse events as defined in the original articles.
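The responder definition in criterion (3) can be sketched in a few lines of Python. This is only an illustration: the comparison against a pre-treatment baseline month, and the names `is_responder` and `responder_rate`, are assumptions introduced here, and individual trials may define the reference period differently.

```python
def is_responder(baseline_attacks, followup_attacks, threshold=0.5):
    """Return True if monthly attacks fell by at least `threshold` (50% by default).

    Assumption: the reduction is measured relative to a pre-treatment baseline.
    """
    if baseline_attacks <= 0:
        raise ValueError("baseline_attacks must be positive")
    reduction = (baseline_attacks - followup_attacks) / baseline_attacks
    return reduction >= threshold


def responder_rate(pairs):
    """Proportion of responders among (baseline, follow-up) attack counts."""
    if not pairs:
        raise ValueError("pairs must not be empty")
    responders = sum(is_responder(b, f) for b, f in pairs)
    return responders / len(pairs)
```

For example, a participant going from 8 to 4 attacks per month counts as a responder, whereas one going from 8 to 5 does not.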

RCTs were excluded when they met any of the following criteria: (1) using a crossover design, N-of-1 design, or non-randomized controlled design; (2) reporting no clear diagnostic criteria; (3) assessing the efficacy of an intervention in the management of acute migraine attacks; (4) being a duplicate publication; (5) lacking necessary data.

Search strategy

We searched PubMed, EMBASE, and the Cochrane Library from inception to April 23rd, 2020, with publication language restricted to English and Chinese. We combined the following keywords or MeSH terms to develop the search strategy: “acupuncture”, “electroacupuncture”, “migraine”, “migraine disorders”, “headache”, “cephalalgia”, and “randomized controlled trials”. We also read the reference lists of the retrieved articles to search for potentially eligible RCTs. The details of the search strategies are shown in the supplements.

Screening and data extraction

Two reviewers (Shi-Qi Fan and Tai-Chun Tang) independently read the titles and abstracts of the retrieved articles against the inclusion and exclusion criteria, and then screened full-text copies of potentially eligible RCTs. Disagreements between the two reviewers on the inclusion of RCTs were resolved by group discussion and arbitrated by a third reviewer (Song Jin). For RCTs with missing data, we contacted the authors by email to ask for the original data, and when data were unavailable from the authors we tried to calculate them from the available coefficients [23].

Extracted data included: (1) trial characteristics such as name of the first author, publication year, and country; (2) participant characteristics such as mean age, proportion of females, and duration of migraine; (3) intervention and control: name of the intervention or control, dosage and frequency of treatment, and treatment duration; (4) outcome measures: name of the outcome, the number of participants allocated to the intervention or control, and parameters such as mean, standard deviation, and the number of events.

Two reviewers assessed the risk of bias in the included RCTs in six domains: sequence generation, allocation concealment, blinding (blinding of participants and personnel and blinding of outcome assessment), incomplete data, selective reporting, and other bias. Disagreements between the two reviewers were resolved by discussion with a third reviewer.

Statistical analysis

We used Review Manager 5.3 and TSA 0.9.5.10 beta (https://www.ctu.dk/tsa/) to perform the analysis. We calculated the effect size of the interventions in reducing migraine episodes using the standardized mean difference (SMD), and the effect size for dichotomous outcomes (responder rate and adverse events) using the risk ratio (RR). We also calculated the 95% confidence interval (95% CI) for each SMD or RR. We assessed heterogeneity between RCTs using the I2 statistic. SMDs and RRs were pooled using the fixed-effect model (inverse variance method) when I2 < 50%, or using the random-effects model (DerSimonian-Laird method) when I2 ≥ 50%. The cumulative meta-analysis was divided into two parts, one for results after treatment (usually evaluated 12–16 weeks after the start of acupuncture) and one for follow-up. We performed TSA to reduce the risk of false-positive findings owing to multiple statistical testing [24]. We calculated the information size (an estimate of the optimum sample size for statistical inference from a meta-analysis) after taking the heterogeneity of the included RCTs into account. We calculated the required information size (RIS) allowing for a type 1 error of 0.05 and a type 2 error of 0.2, and we presented significance boundaries (which adjust the threshold for statistical significance so that the overall risk of type 1 error remains below 5%) based on the O’Brien-Fleming alpha-spending function. In the TSA software, we set the information axis to “Sample Size” and estimated the values of effect type mean, effect type variance and effect type intervention based on studies with a low risk of bias. The correction for heterogeneity was based on “Model Variance”. For the responder rate, we used the results of the most heavily weighted studies [25,26,27] to estimate the incidence in the control arm.
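The pooling rule described above can be illustrated with a short Python sketch. It is not the Review Manager implementation; it only shows, under textbook formulas, how inverse-variance weights, Cochran's Q, I2, and the DerSimonian-Laird between-study variance fit together. The function name `pool_effects` and its inputs (per-study effects such as SMDs or log RRs, with their variances) are assumptions introduced for illustration.

```python
import math


def pool_effects(effects, variances, i2_threshold=50.0):
    """Pool per-study effects (SMD or log RR) by inverse variance.

    Fixed-effect model when I^2 < i2_threshold, otherwise the
    DerSimonian-Laird random-effects model (the rule stated in the text).
    """
    if len(effects) != len(variances) or not effects:
        raise ValueError("effects and variances must be non-empty and equal in length")
    k = len(effects)
    w = [1.0 / v for v in variances]                      # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q and the I^2 statistic (in percent)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    tau2 = 0.0
    if i2 < i2_threshold:
        estimate, se, model = fixed, math.sqrt(1.0 / sum(w)), "fixed"
    else:
        # DerSimonian-Laird moment estimator of the between-study variance
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c)
        w_re = [1.0 / (v + tau2) for v in variances]
        estimate = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
        se = math.sqrt(1.0 / sum(w_re))
        model = "random"
    return {
        "model": model,
        "estimate": estimate,
        "ci95": (estimate - 1.96 * se, estimate + 1.96 * se),
        "i2": i2,
        "tau2": tau2,
    }
```

For RRs, the inputs would be log RRs and their variances, and the pooled estimate and interval would be exponentiated afterwards.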

Results

Characteristics of the included RCTs

We found 1077 potentially eligible articles, and we finally included 20 studies [8,9,10, 25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41] after excluding studies that were not RCTs, pediatric studies, crossover studies, studies with unclear diagnostic criteria, studies with ineligible interventions or outcome measures, duplicated records, and articles without necessary data (Fig. 1). The included RCTs were conducted in eight countries; ten of them recruited only patients with episodic migraine, three recruited only patients with chronic migraine, and seven recruited both. Overall, 79.35% of the population was female, and mean ages ranged from 29.94 to 47.85 years. Eleven studies compared acupuncture with sham acupuncture, eight studies compared acupuncture with conventional drugs (flunarizine, venlafaxine, valproic acid and metoprolol), and one study compared acupuncture with both. The treatment duration ranged from 4 to 24 weeks. Table 1 summarizes the main characteristics of the included RCTs. Figure 2 shows the risk of bias in the included RCTs; the main risk lies in blinding.

Fig. 1 Study flowchart

Table 1 Characteristics of the included RCTs

Fig. 2 Risk of bias

Results of meta-analysis and TSA

Migraine episodes

Acupuncture vs. sham acupuncture

We pooled eleven studies (n = 1727, I2 = 59%) comparing acupuncture and sham acupuncture (Fig. 3a). Acupuncture was superior to sham acupuncture (SMD = −0.29, 95% CI −0.47 to −0.11, P = 0.002) in reducing migraine episodes after treatment. Eight RCTs reported follow-up results, which indicated that acupuncture was statistically superior to sham acupuncture (SMD = −0.24, 95% CI −0.47 to −0.01, P = 0.004).

Fig. 3 Forest plot of migraine episodes

In this comparison for migraine episodes, the Z-curve of the TSA crossed the traditional level of statistical significance (P = 0.05) before we added the study of Wang 2015, but it intersected neither the trial sequential monitoring boundary favoring acupuncture nor the vertical line representing the required information size (RIS = 4324; Fig. 4a).
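To make the roles of the monitoring boundary and the RIS more concrete, the following Python sketch shows two ingredients under stated assumptions. It is not the TSA software's implementation. The first function is the Lan-DeMets O'Brien-Fleming-type alpha-spending function, which gives the cumulative two-sided type 1 error allowed at a given information fraction; deriving the actual boundary z-values from it requires numerical integration and is omitted. The second is the conventional sample-size formula for a mean difference, inflated by a heterogeneity (diversity) adjustment. All function and parameter names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def obf_alpha_spent(info_fraction, alpha=0.05):
    """Cumulative two-sided alpha spent at `info_fraction` (0 < t <= 1)
    under the Lan-DeMets O'Brien-Fleming-type spending function."""
    if not 0 < info_fraction <= 1:
        raise ValueError("info_fraction must be in (0, 1]")
    z = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 * (1 - _STD_NORMAL.cdf(z / info_fraction ** 0.5))


def required_information_size(delta, sd, alpha=0.05, beta=0.20, diversity=0.0):
    """Total participants needed to detect a mean difference `delta`
    (standard deviation `sd`, two equal arms), inflated by 1 / (1 - diversity).

    Assumption: `diversity` is a proportion in [0, 1) describing
    between-trial heterogeneity.
    """
    if not 0 <= diversity < 1:
        raise ValueError("diversity must be in [0, 1)")
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(1 - beta)
    n_fixed = 4 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2
    return n_fixed / (1 - diversity)
```

With the numbers reported here (1727 participants accrued against an RIS of 4324), the information fraction is about 0.40, and `obf_alpha_spent(1727 / 4324)` returns only a small fraction of the nominal 0.05. This is why a Z-curve can cross the conventional P = 0.05 line without crossing the monitoring boundary.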

Fig. 4 TSA graph: a migraine episodes (acupuncture vs. sham); b migraine episodes (acupuncture vs. drugs); c responder rate (vs. sham); d responder rate (vs. drugs). The blue curve represents the Z-curve, the red curves above and below represent the trial sequential monitoring boundaries, the dashed red line represents the traditional level of statistical significance, and the red vertical line represents the RIS; the red lines on the sides closest to the horizontal line are the boundaries for futility

Acupuncture vs. conventional drugs

In this comparison, seven studies (n = 1044, I2 = 56%) were eligible for pooling. The difference between acupuncture and prophylactic drugs was not significant (SMD = −0.21, 95% CI −0.42 to 0.00, P = 0.06) after the treatment process (Fig. 3b). However, at follow-up the acupuncture group had statistically significantly fewer migraine episodes than the conventional drugs group (SMD = −0.14, 95% CI −0.28 to −0.00, P = 0.04).

As for the TSA results, the cumulative Z-curve crossed the traditional level of statistical significance before adding the first study of Allais 2002, but it neither intersected the trial sequential monitoring boundaries nor reached the vertical line of the required information size (RIS = 7816; Fig. 4b).

Responder rate

Acupuncture vs. sham acupuncture

We pooled nine studies (n = 1640, I2 = 43%) in this comparison. After treatment, acupuncture was statistically significantly better than sham acupuncture (RR 1.30, 95% CI 1.09 to 1.55, P = 0.003), while there was no statistically significant difference favoring acupuncture during the follow-up period (RR 1.21, 95% CI 0.89 to 1.64, P = 0.22).
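The link between a reported RR, its 95% CI, and its P value can be checked with a short calculation on the log scale. The sketch below assumes a normal approximation for the log RR and a symmetric CI on that scale; the function name `p_from_ratio_ci` is hypothetical, and the result is an approximate reconstruction rather than the exact output of Review Manager.

```python
import math
from statistics import NormalDist


def p_from_ratio_ci(rr, lower, upper, level=0.95):
    """Approximate z statistic and two-sided P value from a ratio and its CI.

    Assumption: log(RR) is approximately normal, so the CI is symmetric
    around log(rr) and its width equals 2 * z_crit * SE.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - (1 - level) / 2)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(rr) / se
    p = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p


# Post-treatment responder rate, acupuncture vs. sham (values from the text)
z, p = p_from_ratio_ci(1.30, 1.09, 1.55)
```

For the post-treatment estimate above, this gives a P value of roughly 0.003, in line with the reported value. The same check can be applied to any RR and CI pair in the Results.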

The cumulative Z-curve crossed both the traditional level of statistical significance and trial sequential monitoring boundaries for the benefit of acupuncture (RIS 1760; Fig. 4c).

Acupuncture vs. conventional drugs

Four studies (n = 1021, I2 = 25%) were eligible for pooling. The responder rate with acupuncture was statistically significantly higher than that with conventional drugs (RR 1.24, 95% CI 1.04 to 1.48, P = 0.01). There was no significant difference at follow-up (RR 1.10, 95% CI 0.98 to 1.24, P = 0.12).

The Z-curve crossed the traditional level of statistical significance before adding the study of Allais 2002, and it also intersected the trial sequential monitoring boundaries favoring acupuncture (RIS 1270; Fig. 4d).

Safety evaluation

Adverse events such as dizziness or nausea occurred in these studies, all of which gave a descriptive analysis of the safety of the interventions. Ten of the studies reported the number of patients who had adverse events. In the comparison of acupuncture and sham acupuncture, the incidence was 16.32% (141/864 participants) in the acupuncture group and 16.23% (98/604 participants) in the sham group, with no statistically significant difference (RR 1.15, 95% CI 0.91 to 1.46, P = 0.24). In the comparison of acupuncture and conventional drugs, the incidence in the acupuncture group (13.70%; 85/621 participants) was statistically significantly lower than that in the conventional drugs group (25%; 121/484 participants) (RR 0.34, 95% CI 0.14 to 0.81, P = 0.01; Fig. 5).

Fig. 5 Forest plot of adverse events

Discussion

Summary of evidence

Based on our study, we consider acupuncture an optional prophylactic treatment for people troubled by frequent and unmanageable migraine attacks, especially those who refuse conventional drug therapy because of intolerable side effects.

A statistically significant reduction in migraine episodes was shown after acupuncture treatment and at the follow-up stage compared with sham acupuncture. The responder rate with acupuncture was statistically significantly higher than with either sham acupuncture or active drugs. These findings indicate that acupuncture brings greater benefit than sham acupuncture and conventional prophylactic drugs, at least for a period after the treatment course is completed. Some previous meta-analyses [12, 13, 42] also reported the superiority of acupuncture over no acupuncture, but they did not establish whether the sample size was already sufficient, and our study has some advantages in this regard. For migraine episodes, the TSA graphs for the comparisons with sham acupuncture and with prophylactic drugs indicated that more trials are still needed. For responder rate, however, both cumulative Z-curves intersected the trial sequential monitoring boundaries favoring acupuncture; for this outcome the sample size was sufficient and the efficacy of acupuncture was evident. In addition, one study found that, compared with drug prophylaxis, differences at follow-up were no longer statistically significant [12]. In our study, in terms of migraine episodes, the acupuncture group performed better than the conventional drug group during the follow-up period (P = 0.04).

Implication for practice

Several studies have suggested that sham acupuncture may have a stronger effect than placebo pills, which may be associated with the special ritual of acupuncture and a better patient–doctor relationship during treatment [43,44,45]. Many randomized controlled trials comparing acupuncture with sham acupuncture found a slight difference between them [12, 25, 46]; although an individual patient data meta-analysis found a statistically significant difference between them in the treatment of chronic pain, the difference was clinically irrelevant [47]. However, acupuncture was found at least as effective as conventional treatments in migraine prophylaxis [12]. These facts indicate that sham acupuncture is not inert as a placebo control, and that the effect of acupuncture might mostly rely on non-specific effects.

Our findings were similar to those of the above studies. When acupuncture was compared with sham acupuncture in reducing migraine episodes, the effect size was small (SMD = −0.29). A large individual patient data meta-analysis of acupuncture for chronic headaches calculated an SMD of 0.15, which is close to ours [47]. In the TSA graph for this result, the Z-curve has not yet intersected the trial sequential monitoring boundaries; adding more clinical trials may confirm the efficacy of acupuncture, or may shift the Z-curve to intersect the boundaries for futility and yield the opposite result. The effect size of acupuncture is also small when compared with conventional drugs, indicating that acupuncture is at least not less effective than these active drugs. In the future, studies of acupuncture vs. other active drugs could be conducted, such as comparisons with newly marketed CGRP antagonists, and their results could inform further decisions about whether acupuncture should be used to treat migraine.

In these comparisons, heterogeneity between studies was substantial (I2 > 50%), even though we excluded some trials with a higher risk of bias; the exception was the comparison between acupuncture and drugs in terms of responder rate. We attempted to identify the sources of heterogeneity using subgroup analyses by age, number of treatments per week, and treatment duration, but did not find covariates that contributed significantly to it. One review suggested that there was little evidence that the effects were modified by any acupuncture characteristics, such as the number, frequency or duration of sessions [48]. We speculate that the quality of the studies may be an essential factor causing heterogeneity.

Limitations

The first limitation is common to TSA. Definitive conclusions can be drawn when the Z-curve reaches the RIS or intersects the monitoring or futility boundaries, at which point stopping further trials may be recommended. But if a high-quality trial is still in progress, can its results be ignored and the meta-analysis left without an update? We cannot rely solely on sequential analysis to make judgments and should consider it together with other evidence.

Blinding is difficult in comparisons of acupuncture with drugs, which may introduce a risk of bias. In addition, as acupuncture becomes more popular, blinding in the sham acupuncture group may be compromised, because participants may doubt whether the sham procedure is effective.

Acupuncture treatment varies widely in duration (from 4 to 24 weeks) and in the number of sessions per week. Which regimen works better or is more readily accepted by patients? This has rarely been explored in studies. Moreover, the choice of acupoints or of conventional drugs may also influence the results, but we were unable to analyze carefully how these differences affect the outcomes. Finally, we would have liked to include more studies, but we excluded many because of an apparent lack of rigorous design and a high risk of bias. Rigorous, well-designed clinical trials are needed to provide better evidence.

Conclusions

Acupuncture can reduce migraine episodes compared with sham acupuncture and can be a safe alternative to conventional drug therapy for prophylaxis, but this should be further verified through more RCTs. Available studies suggest that acupuncture is superior to sham acupuncture and conventional drugs in terms of responder rate, as verified by TSA.