Introduction

Over the last decades, the cesarean section rate has steadily increased worldwide, exceeding 35% in 2016 in the United States alone [1]. This increase has been accompanied by greater use of neuraxial anesthesia, with single-shot spinal anesthesia being the technique most commonly used for elective cesarean section, irrespective of the country [1,2,3,4,5].

Besides local anesthetics, opioids have often been administered in the subarachnoid space [6, 7]. Intrathecal opioids can improve intraoperative anesthesia quality and prolong postoperative analgesia [8]. Conversely, they can cause undesirable effects, including nausea, vomiting, pruritus, and respiratory depression [9,10,11]. Opioid effects depend on the dose administered and opioid physicochemical properties, particularly lipid solubility. Lipophilic opioids have more rapid onset and shorter duration of action than hydrophilic opioids, whereas hydrophilic opioids may provide longer analgesia duration but have greater risks of late respiratory depression [12,13,14]. Various opioids have been used as adjuncts to local anesthetics, but their comparative effectiveness remains unknown.

This Bayesian network meta-analysis aimed to compare the beneficial and harmful effects of opioids used as adjuncts to local anesthetics in patients undergoing cesarean section under spinal anesthesia.

Methods

This study was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, Cochrane methodology, and the PRISMA extension statement for reporting network meta-analyses [15,16,17]. The protocol was pre-registered in the International Prospective Register of Systematic Reviews (PROSPERO; CRD42018108364).

Search strategy

Two independent researchers (H.K. and H.S.) searched electronic databases, including PubMed, the Cochrane Central Register of Controlled Trials, EMBASE, and Web of Science, up to Nov 30, 2018, with no language restrictions; the search was later updated (Mar 01, 2021). The full PubMed search strategy is described in the Electronic Supplementary Material (Supplementary Text S1). We also checked ClinicalTrials.gov for ongoing and unpublished clinical trials. Reference lists of all identified studies and those of previous meta-analyses on similar topics were checked.

Study selection

This meta-analysis focused on patients undergoing elective cesarean section under spinal anesthesia. Only randomized controlled trials (RCTs) comparing intrathecal local anesthetics combined with a single opioid or a combination of opioids (intervention arm) with placebo or local anesthetics alone (comparison arm) for spinal anesthesia in elective cesarean section were included. We excluded trials with no comparison or intervention arms; trials in which patients received epidural anesthesia, peripheral nerve blocks, or continuous wound infiltration of local anesthetics; and trials that included non-elective cesarean section. Conference abstracts, reviews, letters, retrospective studies, case reports/series, and trials with no relevant outcomes were also excluded. Opioids included here were morphine, diamorphine, hydromorphone, fentanyl, sufentanil, and meperidine. Eight reviewers (T.O., K.K., R.O., H.S., Y.H., S.H., T.F., and H.K.) working in four teams independently screened titles and abstracts of obtained references and collected full-text articles if potentially relevant. Disagreements were resolved by discussion or consultation with a third author (S.I.).

Data extraction

Data extracted included study setting, study population, details of intervention and control conditions, recruitment and completion rate, results and measurement time, and information for evaluating the risk of bias.

The primary outcome was the complete analgesia duration, defined as the duration of time until visual analog scale (VAS) pain score (0 = no pain and 100 = worst pain imaginable) became > 0 after injecting solution into the subarachnoid space. The numerical rating scale was converted to VAS score.

The secondary outcomes were (1) incidence of nausea and vomiting within 24 h after spinal anesthesia, (2) incidence of respiratory depression within 24 h after spinal anesthesia as defined by study authors, (3) cumulative postoperative opioid consumption within 24 h after spinal anesthesia, (4) incidence of pruritus within 24 h after spinal anesthesia, (5) duration of effective analgesia defined as the duration of time until VAS score became ≥ 4 or time to first analgesic use, and (6) pain scores at 12 and 24 h after spinal anesthesia. Cumulative postoperative opioid consumption was transformed to morphine-equivalent dose with previously published equianalgesic conversion factors [intravenous (iv) morphine, 10 mg = iv meperidine, 100 mg = iv fentanyl, 0.1 mg = iv sufentanil, 0.01 mg = subcutaneous diamorphine, 10 mg] [18,19,20]. Primary data sources were numerical data reported in the tables and main text of the included studies. Graphically reported data were converted to numerical data using Plot Digitizer (http://plotdigitizer.sourceforge.net). When a study presented data as medians with ranges, we converted the values to means and standard deviations (SDs) using a previously described method [21]. When a range was not reported, SDs were estimated as interquartile range/1.35.
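The two conversions described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the factors come directly from the equianalgesic equivalences quoted in the text, while the function names, dictionary keys, and example doses are hypothetical.

```python
# Morphine-equivalent factors derived from the equivalences quoted in the text:
# iv morphine 10 mg = iv meperidine 100 mg = iv fentanyl 0.1 mg
#   = iv sufentanil 0.01 mg = subcutaneous diamorphine 10 mg.
# Keys and function names are illustrative, not taken from the original study.
MORPHINE_EQUIV_PER_MG = {
    "morphine_iv": 10.0 / 10.0,
    "meperidine_iv": 10.0 / 100.0,
    "fentanyl_iv": 10.0 / 0.1,
    "sufentanil_iv": 10.0 / 0.01,
    "diamorphine_sc": 10.0 / 10.0,
}


def to_morphine_equivalent_mg(drug: str, dose_mg: float) -> float:
    """Convert a dose in mg to iv morphine-equivalent mg."""
    return dose_mg * MORPHINE_EQUIV_PER_MG[drug]


def sd_from_iqr(q1: float, q3: float) -> float:
    """Approximate an SD as interquartile range / 1.35, as stated in the text."""
    return (q3 - q1) / 1.35
```

Under these factors, a hypothetical 50 mg of iv meperidine corresponds to 5 mg of iv morphine equivalent. The median-and-range conversion [21] is not reproduced here because the text does not give its formula.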

Statistical analysis

A network meta-analysis was performed in the Bayesian framework. The Bayesian random-effects and consistency model was used to combine direct and indirect evidence [22, 23]. We constructed network plots for each outcome, in which node size corresponded to the sample size and edge width corresponded to the number of studies, with the risk of bias also indicated. Statistical analyses were performed using the GeMTC R package (gemtc.drugis.org).

We first used non-informative priors with normal, binomial, and uniform prior distributions for mean, odds ratios (ORs), and SD, respectively. The median posterior weighted mean difference or ORs with corresponding 95% credible intervals (CrIs) were calculated using the Markov chain Monte Carlo method. For sampling, 30,000 posterior samples were first discarded (burn-in), after which another 100,000 posterior samples were saved with a thinning interval of 10. Convergence of iterations was diagnosed using the Gelman-Rubin-Brooks statistic [24]. Potential scale reduction factors < 1.05 indicated good convergence. Goodness-of-fit was evaluated with the residual deviance (\(\overline{D}_{\text{res}}\)) and leverage (pD). A node-splitting model was used to evaluate inconsistency between direct and indirect estimates [25]. We estimated rank probabilities to assess the probability for each intervention to obtain each possible rank in terms of their relative effects. The surface under the cumulative ranking curve (SUCRA) was estimated for each intervention and each outcome [26]. The SUCRA value represents the percentage of efficacy or safety achieved by an intervention compared with an imaginary best intervention. The larger the SUCRA, the better the rank of the treatment. The summary of findings was presented for each outcome according to a previous publication [27].
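As an illustration of how a SUCRA value follows from rank probabilities, the sketch below uses the standard definition: for a treatments, SUCRA is the sum of the cumulative probabilities of being among the best j ranks, for j = 1 to a − 1, divided by a − 1. The function name and input layout are hypothetical, and the rank probabilities in the usage note are made-up numbers, not results from this study.

```python
from typing import Dict, List


def sucra(rank_probs: Dict[str, List[float]]) -> Dict[str, float]:
    """Compute SUCRA from rank probabilities.

    rank_probs[t][j] is the probability that treatment t has rank j + 1,
    where rank 1 is best. Each list must have one entry per treatment.
    """
    n = len(rank_probs)
    if n < 2:
        raise ValueError("SUCRA needs at least two treatments")
    result = {}
    for treatment, probs in rank_probs.items():
        if len(probs) != n:
            raise ValueError("need one probability per possible rank")
        cumulative = 0.0
        area = 0.0
        # Sum the cumulative probabilities of ranks 1 .. n-1.
        for p in probs[:-1]:
            cumulative += p
            area += cumulative
        result[treatment] = area / (n - 1)
    return result
```

A treatment that is certain to rank first gets a SUCRA of 1, one that is certain to rank last gets 0, and the SUCRA values across all treatments average 0.5.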

Risk of bias assessment

Pairs of reviewers independently assessed risk of bias in the included studies using the Cochrane risk of bias tool for RCTs [17]. We assessed seven domains, including the random sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessment, outcome data completeness, selective reporting, and other biases. The estimated risk of bias for each domain was rated as “low,” “unclear,” or “high.” Disagreements among reviewers over the assessment of risk of bias were resolved by discussion with a third reviewer.

Data handling and assessment

Unit of analysis included the individual participants. If there were multiple relevant intervention groups in one study, we partitioned the control group. For indirect comparisons, we partitioned the comparator group into two or more groups according to how many times it was used for indirect comparisons.
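The partitioning of a shared control group can be sketched as follows. The text does not state how the group was divided, so this example assumes an approximately even split of the sample size, which is a common convention for multi-arm trials; the function name is hypothetical.

```python
from typing import List


def split_shared_arm(n_participants: int, n_comparisons: int) -> List[int]:
    """Split a shared arm's sample size into near-equal integer parts.

    Assumption: an even split. The original study may have used
    a different rule.
    """
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    if n_participants < n_comparisons:
        raise ValueError("too few participants to split")
    base, remainder = divmod(n_participants, n_comparisons)
    # The first `remainder` parts receive one extra participant so the
    # parts always sum to the original sample size.
    return [base + 1 if i < remainder else base for i in range(n_comparisons)]
```

For a continuous outcome, each part would typically keep the original mean and SD and differ only in sample size; for a binary outcome, the event count would need to be divided in the same way.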

For missing data, we assessed whether the measured outcomes had been reported; all randomized participants, except those excluded with stated reasons, were included in the outcome data.

We assessed heterogeneity for each direct comparison within the network meta-analysis using the estimated common between-study variance τ² [28].

Egger’s test was performed to assess publication biases [29]. P values < 0.1 were considered statistically significant. A comparison-adjusted funnel plot was used to evaluate the presence of small-study effects in the network meta-analysis.
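For readers unfamiliar with Egger's test, the sketch below shows the underlying regression: the standardized effect (effect divided by its standard error) is regressed on precision (1 divided by the standard error), and an intercept far from zero suggests funnel plot asymmetry. This is a generic pairwise illustration with hypothetical names; the text does not specify which effect estimates were entered for each network comparison.

```python
import math
from typing import Dict, List


def egger_regression(effects: List[float], std_errors: List[float]) -> Dict[str, float]:
    """Ordinary least-squares fit of effect/SE on 1/SE (Egger's regression)."""
    n = len(effects)
    if n != len(std_errors) or n < 3:
        raise ValueError("need at least three studies with matching SEs")
    x = [1.0 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all standard errors are equal; slope is undefined")
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual variance with n - 2 degrees of freedom.
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = ssr / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + x_mean ** 2 / sxx))
    return {"intercept": intercept, "slope": slope, "se_intercept": se_intercept}
```

The test statistic is the intercept divided by its standard error, compared against a t distribution with n − 2 degrees of freedom; the P < 0.1 threshold above would then be applied to the resulting P value.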

Certainty of evidence

The Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) approach was used to rate the quality of evidence for each network estimate [30]. In this approach, the rating of direct evidence from RCTs starts at “high” quality and can be downgraded to “moderate,” “low,” or “very low” based on the following six domains: within-study bias, across-studies bias, indirectness, imprecision, heterogeneity, and incoherence (inconsistency). We used the Confidence In Network Meta-Analysis (CINeMA) web application based on the framework previously developed [31].

Subgroup analysis and investigation of heterogeneity

We performed two subgroup analyses. The first evaluated the efficacy of lipophilic compared with hydrophilic opioids, with fentanyl, sufentanil, and meperidine classified as lipophilic and morphine and diamorphine classified as hydrophilic. The second compared a single opioid with a combination of two opioids: a morphine and fentanyl combination was compared with either morphine or fentanyl alone.

Post-hoc sensitivity analysis

We performed post-hoc sensitivity analyses for all outcomes using a weak prior with an inverse gamma distribution and compared the results with those obtained using the non-informative priors.

Results

Search results and study characteristics

The search strategy identified 6439 citations, of which 238 were assessed in full text and 66 studies with 4400 patients were included in the network meta-analysis (Fig. 1, Supplementary Table S1) [32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97]. Fifty-nine studies were written in English, two in Japanese, two in Turkish, one in Korean, one in Spanish, and one in Portuguese. Non-English studies, excluding those in Japanese, were translated using Google Translate. There were no ongoing clinical trials that fulfilled the inclusion criteria for this study.

Fig. 1 PRISMA flow diagram

Table 1 displays the characteristics of the included studies. Most studies reported only a subset of outcomes. Forty-six studies compared a single opioid (fentanyl, 24; morphine, 9; sufentanil, 6; diamorphine, 4; and pethidine or meperidine, 3) with control substances (local anesthetic and saline or local anesthetic alone), while other studies compared several opioids or adjuvants with control substances. No studies with hydromorphone were included. The dose of opioids added to local anesthetics varied among studies; fentanyl, 2.5–50 µg; morphine, 0.025–0.5 mg; sufentanil, 1.5–10 µg; diamorphine, 0.1–0.375 mg; pethidine or meperidine, 25–35 mg.

Table 1 Study characteristics and outcomes of interest assessed in included studies

In most studies, long-acting local anesthetics were used (bupivacaine, 56; levobupivacaine, 3; ropivacaine, 2), while short-acting local anesthetics, including lidocaine and lignocaine, were used in five studies. We conducted a post-hoc sensitivity analysis for the primary outcome including only trials that used bupivacaine, as bupivacaine was the most common local anesthetic among eligible trials. The results were similar to those obtained after including all trials (Supplementary Figure S1). Therefore, we did not conduct bupivacaine-only sensitivity analyses for the other outcomes, as we did not expect them to yield different results.

Thirty-five (53%) RCTs were at low risk of bias, while the other 47% were at moderate risk (Supplementary Figure S2).

Primary outcome

Duration of complete analgesia (time to VAS > 0)

Table 2 shows the summary of findings for the primary outcome. Eight studies examining three opioids (fentanyl, sufentanil, and morphine) and one combination (fentanyl and morphine) with 572 patients were selected [37, 39, 44, 53, 54, 62, 63, 82]. Eight trial arms with direct comparisons were identified. All Bayesian parameters were well converged (Supplementary Table S3A). Compared with the placebo, when fentanyl, sufentanil (lipophilic), and morphine (hydrophilic) were added alone to local anesthetics, the duration of complete analgesia was significantly prolonged by 96 (95% CrI: 29–170) min, 96 (4.9–190) min, and 190 (29–360) min, respectively. The differences were not significant among the three opioids (Supplementary Table S4A). The fentanyl and morphine combination did not prolong the duration of complete analgesia. The node-splitting model revealed no evidence of incoherence (inconsistency) between the direct and indirect comparisons (Supplementary Table S5A). The comparison-adjusted funnel plot and Egger’s test results indicate that publication bias was unlikely (Supplementary Figure S3A and Table S6A). The SUCRA results showed that the intervention with the highest probability of achieving the longest duration of complete analgesia was morphine (87.1%), followed by the fentanyl and morphine combination (65.3%), sufentanil (48.0%), fentanyl (47.5%), and placebo (2.2%) (Table 2). For model fit, \(\overline{D}_{\text{res}}\) was 20.0 and pD was 18.7, suggesting that the model represented the data well (Supplementary Table S7A). For risk of bias, all comparisons were rated as low or unclear (Supplementary Figure S4A). For imprecision, 60% of interval estimates in comparisons exceeded null values, indicating major concerns in data precision (Supplementary Table S8A). For heterogeneity, 33% of confidence and prediction interval estimates in comparisons extended into clinically important effects in both directions, indicating major concerns in heterogeneity (Supplementary Table S9A). The GRADE outcome results are detailed in Supplementary Table S10A. The certainty of evidence was low in 5 of 10 comparisons.

Table 2 Summary of findings table for primary outcome

Secondary outcomes

Incidence of nausea and vomiting within 24 h after spinal anesthesia

As the incidence of nausea and vomiting was reported as a separate outcome or combined as “nausea and vomiting,” we examined these outcomes separately. To examine the nausea incidence, 34 studies examining five opioids (diamorphine, fentanyl, sufentanil, meperidine, and morphine) and one combination (fentanyl and morphine) with 2,345 patients were selected [32, 36,37,38, 40,41,42, 45,46,47, 49, 51, 54,55,56, 60, 61, 63, 64, 71,72,73,74, 76, 78,79,80,81,82, 85, 86, 90, 92, 94]. Eleven trial arms with direct comparisons were identified (Supplementary Table S2A). Among the 34 studies, 11 determined the incidence of nausea 24 h after spinal anesthesia [32, 41, 55, 56, 61, 64, 74, 81, 82, 85, 92], six determined intraoperative nausea [36, 51, 63, 73, 78, 79], and three determined early (up to 6 h) postoperative nausea [54, 76, 80], while the observation period was not described in the other 14 studies. None of the opioids affected the nausea incidence (Supplementary Table S2A). No evidence of incoherence (inconsistency) between direct and indirect comparisons (Supplementary Table S5B) was demonstrated. The results of imprecision indicate some or major concerns (Supplementary Table S8B). The certainty of evidence was low in 19 of 21 comparisons (Supplementary Table S10B). Regarding vomiting, 32 studies examining five opioids (diamorphine, fentanyl, sufentanil, meperidine, and morphine) and one combination (fentanyl and morphine) with 2,318 patients were selected [32, 36,37,38, 40,41,42, 45, 46, 49, 54,55,56, 59,60,61, 63, 64, 66, 70, 71, 73, 74, 76, 80, 82, 84,85,86, 90, 92, 94] (Supplementary Table S2B). Twelve studies determined the incidence of vomiting during the 24 h after spinal anesthesia [32, 41, 55, 56, 61, 64, 66, 70, 74, 82, 85, 92], four determined early postoperative (up to 6 h) vomiting [54, 76, 80, 84], and three examined intraoperative vomiting [36, 63, 73], while the observation period was not described in the other 13 studies. 
Among the five opioids, only sufentanil was associated with reduced risks of vomiting (OR 0.35, 95% CrI: 0.11–0.94) (Supplementary Table S2B). No evidence of inconsistency was observed between direct and indirect comparisons (Supplementary Table S5C). The results of imprecision indicate some or major concerns (Supplementary Table S8C). The certainty of evidence was low in 20 of 21 comparisons (Supplementary Table S10C). Finally, 15 studies examining four drugs (fentanyl, sufentanil, meperidine, and morphine) with 991 patients were selected for examining the incidence of nausea and vomiting [34, 39, 44, 48, 50, 51, 57, 62, 65, 68, 77, 91, 93, 95, 97] (Supplementary Table S2C). Six studies determined the incidence of nausea and vomiting during the 24 h after spinal anesthesia [39, 50, 57, 62, 68, 91, 95, 97], three investigated intraoperative nausea and vomiting [51, 93, 95], and one examined early postoperative nausea and vomiting [44], while the observation period was unclear in the other five studies. Fentanyl was associated with significant decrease in nausea and vomiting (OR 0.39, 95% CrI: 0.18–0.88), while meperidine was associated with significant increase in nausea and vomiting (4.8, 1.3–19) (Supplementary Table S2C). No evidence of inconsistency existed between direct and indirect comparisons (Supplementary Table S5D). The results of imprecision indicate some or major concerns (Supplementary Table S8D). Certainty of evidence was mixed: high (1/10), low (7/10), or very low (2/10) (Supplementary Table S10D).

Respiratory depression incidence

A summary of findings table for respiratory depression is shown in Supplementary Table S2D. Forty-two studies examining five opioids (diamorphine, fentanyl, sufentanil, meperidine, and morphine) and one combination (fentanyl and morphine) with 2,740 patients were selected [32, 34, 35, 37, 39, 41, 44,45,46, 48, 49, 51,52,53,54,55, 57,58,59,60,61,62,63,64,65,66, 71,72,73, 77,78,79, 82, 84,85,86,87, 90, 91, 93, 96, 97]. Twelve trial arms with direct comparisons were identified. The definition of respiratory depression varied among studies (Table 1). The most frequently used definition was a respiratory rate of < 10 breaths/min with or without a threshold oxygen saturation (SpO2) value, which was used in 20 studies. Conversely, no definition was provided in 16 studies. Most studies (34/42) reported zero events in both the placebo and intervention groups. Respiratory depression was reported in the placebo group in two studies (n = 1 and 1, respectively; overall incidence, 0.2%) [78, 91], fentanyl group in five (n = 1, 1, 3, 1, and 2, respectively; overall incidence, 1.0%) [55, 72, 87, 91, 97], sufentanil group in two (n = 14 and 9, respectively; overall incidence, 6.9%) [39, 72], and morphine group in one (n = 1, overall incidence, 0.3%) [84]. No respiratory depression was described in the diamorphine, meperidine, and combination of fentanyl and morphine groups. Sufentanil and morphine were associated with significant increases in the incidence of respiratory depression (OR 240, 95% CrI: 7.8–74,000 and 2.3 × 10¹⁰, 1.6–7.4 × 10⁴⁰, respectively). Other opioids were not associated with elevated risks of respiratory depression. For risk of bias, all comparisons were rated as low or unclear (Supplementary Figure S4E). The results of imprecision indicate major concerns (Supplementary Table S8E). Certainty of evidence was low in 19 of 21 comparisons (Supplementary Table S10E).

Cumulative postoperative opioid consumption within 24 h after spinal anesthesia

Eighteen studies examining four opioids (diamorphine, fentanyl, meperidine, and morphine) with 1031 patients were selected [35, 50, 52, 58, 60, 66, 69, 70, 74,75,76,77, 81, 83, 84, 88, 92, 93]. Five trial arms with direct comparisons were identified (Supplementary Table S2E). Compared with the placebo, diamorphine and morphine reduced 24-h opioid consumption by 22 (95% CrI: 11–33) and 32 (22–42) mg, respectively. Compared with fentanyl, diamorphine and morphine were associated with lower amounts of opioid consumption by 24.0 (8.9–38.8) and 33.5 (18.5–49.8) mg, respectively (Supplementary Table S4F). Neither fentanyl nor meperidine reduced the 24-h opioid consumption compared with the placebo and morphine (Supplementary Table S4F). No evidence of inconsistency was found in any of the comparisons (Supplementary Table S5F). For risk of bias, all comparisons were rated as low or unclear (Supplementary Figure S4F). The results of imprecision indicate some or major concerns (Supplementary Table S8F). Certainty of evidence was mixed: high (1/10), moderate (5/10), or low (4/10) (Supplementary Table S10F).

Pruritus incidence

Forty-seven studies examining five opioids (diamorphine, fentanyl, sufentanil, meperidine, and morphine) and one combination (fentanyl and morphine) with 3245 patients were selected [32, 34,35,36,37,38, 40,41,42, 44,45,46,47,48,49,50,51, 54,55,56, 59,60,61,62,63,64,65,66, 68, 70,71,72,73,74, 76,77,78,79,80,81,82, 84,85,86, 90,91,92, 95]. Eleven trial arms with direct comparisons were identified (Supplementary Table S2F). The incidence of pruritus was determined 24 h after spinal anesthesia in 17 studies [32, 41, 50, 55, 56, 61, 62, 64, 66, 68, 70, 74, 81, 82, 85, 91, 92], during surgery in seven [35, 36, 51, 63, 73, 78, 80], during the early postoperative period (up to 6 h) in four [44, 54, 76, 84], and by 2 days after surgery in one [79], while no observation period was described in the other 19 studies. Except for diamorphine, all opioids were associated with a significant increase in the incidence of pruritus (Supplementary Table S2F). No evidence of inconsistency was found in any of the comparisons (Supplementary Table S5G). For risk of bias, all comparisons were rated as low or unclear (Supplementary Figure S4G). The results of imprecision indicate major concerns (Supplementary Table S8G). Certainty of evidence was low in 12 and very low in 9 comparisons (Supplementary Table S10G).

Effective analgesia duration

We defined the duration of effective analgesia as the duration of time until VAS score reached ≥ 4 or the time to first analgesic use. For VAS score, 12 studies examining three opioids (fentanyl, sufentanil, and meperidine) with 931 patients were selected [37, 40, 41, 47, 48, 53, 55, 56, 65, 71, 89, 93]. Five trial arms with direct comparisons were identified (Supplementary Table S2G). Compared with the placebo, fentanyl, meperidine, and sufentanil prolonged the duration of time until VAS score reached ≥ 4 (150 min, 95% CrI: 81–230 min; 240 min, 110–370 min; 170 min, 63–280 min). There was no evidence of inconsistency among these three opioids (Supplementary Table S5H). For risk of bias, all comparisons were rated as low or unclear (Supplementary Figure S4H). The results of imprecision indicate some or major concerns (Supplementary Table S8H). Certainty of evidence was moderate in 2 of 6 comparisons and low in 4 of 6 (Supplementary Table S10H).

Time to first analgesic use was the most frequently examined outcome. Thirty-nine studies examined this outcome, but one study [85] was excluded from the analysis because the reported SD in the treatment arm was 0, which made the data unusable. We attempted to contact the authors but were unsuccessful. Thus, 38 studies examining five opioids (diamorphine, fentanyl, sufentanil, meperidine, and morphine) and one combination (fentanyl and morphine) with 2,453 patients were selected [31, 33, 35, 38, 43,44,45,46, 49,50,51, 57, 58, 60, 62,63,64, 66, 67, 69, 72,73,74, 76,77,78,79, 81,82,83, 86, 87, 90,91,92, 95,96,97]. Ten trial arms with direct comparisons were identified (Supplementary Table S2H). Hydrophilic opioids, including diamorphine and morphine, were associated with significant increases in time to first analgesic use compared with the placebo by 230 (95% CrI: 26–440) and 660 (510–810) min, respectively. The fentanyl and morphine combination also prolonged the time to first analgesic use compared with the placebo by 520 (100–950) min. Conversely, lipophilic opioids, including fentanyl, meperidine, and sufentanil, had no prolonging effects. Morphine showed a significantly longer time to first analgesic use than fentanyl and sufentanil (Supplementary Table S4I). Inconsistency was not observed in 6 comparisons but was observed in 15 comparisons (Supplementary Table S5I). The results of imprecision indicated some or major concerns (Supplementary Table S8I). For risk of bias, all comparisons were rated as low or unclear (Supplementary Figure S4I). Certainty of evidence was very low in 17 of 21 comparisons (Supplementary Table S10I).

Pain scores at 12 and 24 h after spinal anesthesia

Three studies examining three opioids (diamorphine, fentanyl, and morphine) with 194 patients [52, 60, 77] and seven studies examining three opioids (diamorphine, fentanyl, and morphine) with 432 patients [52, 58, 60, 77, 84, 88, 96] were selected for pain evaluation at 12 and 24 h, respectively. Four trial arms with direct comparisons were identified for each outcome (Supplementary Tables S2I, J). No opioid examined affected the pain score at either 12 or 24 h. Analysis by CINeMA was unavailable for pain scores at 12 h. Inconsistency was not found in any of the comparisons (Supplementary Table S5J). For risk of bias, all comparisons were rated as low or unclear (Supplementary Figure S4J and K). Certainty of evidence was low in 5 of 6 comparisons (Supplementary Table S10K).

Subgroup and sensitivity analysis

Significant differences between lipophilic and hydrophilic opioids were found in the incidence of vomiting, cumulative postoperative opioid consumption, and time to first analgesic use (Supplementary Table S11). Lipophilic opioids were associated with significant decreases in the incidence of vomiting (OR 0.3, 95% CrI: 0.11–0.77) compared with hydrophilic opioids (Supplementary Table S11C). Hydrophilic opioids were associated with significant reductions in postoperative morphine-equivalent opioid consumption by 29.22 (95% CrI: 17.02–41.31) mg and with significant increases in the time to first analgesic use by 442.24 (95% CrI: 290.47–602.23) min compared with lipophilic opioids (Supplementary Table S11I). There were no differences in other outcomes.

The morphine and fentanyl combination was not associated with a significant increase in the complete analgesia duration, incidence of nausea and vomiting, incidence of respiratory depression, incidence of pruritus, or time to first analgesic use when compared with either morphine or fentanyl alone (Supplementary Table S4A, B, C, E, G, I).

Sensitivity analysis showed no differences between a non-informative prior and a weak prior with inverse gamma distribution in all outcomes (Supplementary Table S12).

Discussion

Our Bayesian network meta-analysis evaluated the effect of five opioids and one combination of opioids used as adjuncts to spinal anesthesia in cesarean section on seven clinically important outcomes. We confirmed that all had favorable effects on analgesic outcomes. However, no clear conclusions could be drawn regarding opioid-associated adverse outcomes.

Regarding the primary outcome (duration of complete analgesia, defined as time to VAS > 0), all investigated opioids, including fentanyl, sufentanil, and morphine, significantly prolonged the duration of complete analgesia compared with the placebo when added alone to local anesthetics. Although morphine ranked first, the results should be interpreted with caution since no statistically significant difference in the duration of complete analgesia was observed among the three opioids. The fentanyl and morphine combination showed no prolonging effect on the duration of complete analgesia compared with the use of fentanyl alone. However, this finding is inconclusive, as the effect of combining the opioids was determined by only one small study [82].

Other analgesic outcomes were also investigated. In terms of time until VAS ≥ 4, fentanyl, meperidine, and sufentanil equally extended the time compared with the placebo. However, lipophilic opioids neither reduced the cumulative 24 h opioid consumption nor delayed the time to first analgesic use. Although no studies included here examined the effect of diamorphine and morphine on the time until VAS ≥ 4, these hydrophilic opioids were associated with reduced 24 h opioid consumption and extended time to first analgesic use compared with the placebo. While 24 h opioid consumption was not different between diamorphine and morphine, morphine was associated with a longer time to first analgesic use than diamorphine, fentanyl, and sufentanil. These results suggest that hydrophilic opioids are more suitable than lipophilic opioids for improving postoperative analgesia, which was supported by the subgroup analyses. Among the five opioids, morphine ranked first on both cumulative postoperative opioid consumption and time to first analgesic use; thus, morphine seems to be the most suitable agent for improving postoperative analgesia.

A network meta-analysis could not be performed for other secondary outcomes, including pain scores at 12 and 24 h, due to the small number of studies or patients with available pain scores at 12 h. Moreover, the certainty of evidence for pain scores at 24 h was low to moderate. Therefore, these data are inconclusive.

Here, clinically relevant opioid-related adverse effects were investigated. With respect to emetic outcomes, the results were inconsistent. The incidence of nausea was not affected by the addition of intrathecal opioids. Vomiting was reduced by sufentanil. While “nausea and vomiting” was decreased by fentanyl, it was increased by meperidine. These confusing results may partially be explained by the variation in the definition of the outcome measured and the observation period. While lipophilic opioids may decrease intraoperative nausea and/or vomiting associated with uterine exteriorization, hydrophilic opioids may increase postoperative nausea and/or vomiting. However, the majority of the studies did not specify whether the emetic outcome occurred during the intra- or postoperative period. Furthermore, GRADE results on nausea and vomiting remained mostly at a “low” certainty of evidence. Therefore, these results are not convincing.

One of the most serious adverse events of intrathecal opioids is respiratory depression. Most studies (34 of 42; 81%) reported zero events in both placebo and intervention groups. However, our results showed that sufentanil and morphine were associated with significant increases in the incidence of respiratory depression. Although respiratory depression by morphine was reported in only one case in our study, the OR was inexplicably high (2.4e+8) with a very wide CrI (1.3–2.3e+30). Similarly, other opioids, including diamorphine, meperidine, sufentanil, and the fentanyl and morphine combination, showed very wide CrIs. A possible reason is that, unlike the frequentist approach, the Bayesian GeMTC algorithm does not exclude studies with zero cells in both the placebo and intervention arms, and an appropriate solution is yet to be established. Therefore, the results on respiratory depression obtained here are inconclusive.

Opioid-induced pruritus is an uncomfortable complication caused by intrathecal opioids. Our results showed that all opioids, except for diamorphine, were associated with a significant increase in the incidence of pruritus; however, the GRADE certainty of evidence for the incidence of pruritus was “low” or “very low” in all comparisons. Thus, these results must be considered suggestive.

Strengths and limitations

One strength of this review is that it focused on a wide range of clinically relevant outcomes, including four analgesic and three adverse outcomes, and on every conceivable intrathecal opioid used alone or in combination. Another strength is the use of network meta-analysis, which combines direct and indirect comparisons. In addition, Bayesian network meta-analysis allows more flexible statistical modeling through the use of prior distributions.

This review had some limitations. First, most comparisons were judged as having a moderate or low certainty of evidence, mainly because of major concerns with imprecision and partially because of major concerns with heterogeneity. The major concern in imprecision was a wide 95% CrI, possibly reflecting small sample sizes and low event rates in individual studies. A large-scale multicenter RCT is needed to address this. Second, we did not consider the opioid dose or the local anesthetic dose. Thus, dose–response relationships for each opioid and local anesthetic remain unclear. Third, we included only studies with a no-opioid arm to address the effects of intrathecal opioids. This restriction may have reduced the number of eligible studies. Fourth, we did not consider co-interventions, such as PONV prophylaxis. Each trial employed different strategies for the prophylaxis and treatment of PONV, which can affect the incidence of PONV in addition to intrathecal opioids. Fifth, the studies included here were published over a long period of time (1988–2019), and two-thirds of them were published before 2010; changes in clinical practice over this period might have affected the outcomes.

Comparison with previous studies

Four systematic reviews have been published with regard to the effects of intrathecal opioids on various outcomes in cesarean section. However, no reviews have extensively compared the effects of various opioids administered to the subarachnoid space. A 1999 systematic review [6] reported that intrathecal morphine prolonged the time to first postoperative analgesic requirement and reduced postoperative pain, while fentanyl and sufentanil did not have any clinical effects; however, the number of included studies was small (15 reports, 535 patients), raising concerns regarding the statistical power. A 2016 meta-analysis [7] compared low-dose (0.05–0.1 mg) with high-dose (>0.1–0.25 mg) morphine to examine their effects on analgesic and adverse outcomes and showed that high-dose morphine was associated with more prolonged analgesia, but a higher incidence of nausea, vomiting, or pruritus. Yet, that meta-analysis focused only on intrathecal morphine and lacked a control group despite its dose–response approach, and the morphine threshold separating low and high doses was somewhat arbitrary. A 2018 systematic review [98] focused on the prevalence of respiratory depression caused by intrathecal morphine and diamorphine, concluding that the prevalence was low (5.96–8.67/10,000 cases). That review focused only on neuraxial morphine and diamorphine and the associated respiratory depression. Its results are comparable with ours in that the incidence of respiratory depression was very low. More recently, a pairwise meta-analysis [99] found that fentanyl added to intrathecal bupivacaine alone or in combination with morphine provided excellent analgesia; however, the analysis focused more on intraoperative than postoperative outcomes. Thus, the results cannot be compared with ours.

In this Bayesian network meta-analysis, we confirmed that intrathecal opioids benefit postoperative analgesia. Morphine was the most appropriate agent. However, some results were inconsistent, and the certainty of evidence was often moderate or low, especially for adverse outcomes. Therefore, we cannot confidently determine which opioid is the best adjunct to local anesthetic for cesarean section. Well-designed randomized clinical trials providing a high certainty of evidence are needed to determine the most appropriate opioid for cesarean section.