Introduction

Erectile dysfunction (ED) affects around 50% of men between 40 and 70 years and may cause significant humanistic and economic burden [1,2,3]. Phosphodiesterase type 5 inhibitors (PDE5i) are strongly recommended as primary treatment or complementary therapy for most men with ED [4, 5].

Sildenafil citrate was the first PDE5i, introduced in 1998, followed by vardenafil and tadalafil (2003) and avanafil (2013). These drugs are available in most countries worldwide. Other PDE5i such as udenafil and mirodenafil are approved for use in Korea, while lodenafil is only marketed in Brazil [6, 7].

Clinical studies demonstrated that PDE5i are effective and safe in more than 80% of patients with ED when compared to placebo [5]. However, dropout rates are still high (30–70%), mostly due to treatment failure and adverse events (e.g. headache, flushing) [8, 9].

Studies directly comparing PDE5i are limited, and it is still unclear which PDE5i is the safest. Only two systematic reviews using broader statistical techniques have been published [10, 11]. To synthesize the current evidence on all marketed PDE5i (sildenafil, vardenafil, tadalafil, avanafil, udenafil, mirodenafil, lodenafil) at different dosages in men with ED, and thereby identify the best option, we performed a systematic review with network meta-analyses (NMA). To better establish the clinical profile of each drug, we quantitatively assessed their benefit–risk ratio through a stochastic multicriteria acceptability analysis (SMAA).

Methods

This study was conducted according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [12, 13] and Cochrane Collaboration recommendations [14] (PROSPERO CRD42017079308).

Search strategy and selection criteria

Systematic searches were conducted in PubMed, Scopus and Web of Science without limits for time-frame or language (September 2017) (see supplemental material). Trial registration databases (ClinicalTrials.gov) and reference lists of reviews were manually searched. Titles and abstracts of retrieved articles were screened for eligibility. Relevant articles were read in full and those fulfilling inclusion criteria had their data extracted. Two authors performed all the literature selection steps independently and then resolved differences through discussion with a third author.

We included randomized controlled trials (RCT) of parallel or cross-over design that evaluated efficacy or safety of any PDE5i compared to placebo or to other PDE5i in adult males with ED (with or without comorbidities). Outcomes of interest were: efficacy evaluated through the change from baseline to study end of the International Index of Erectile Function (IIEF) Erectile Function domain; safety and tolerability reported as: occurrence of any adverse event related to therapy (medication-related AE); serious AE; most common AE (e.g. flushing, headache, nasal congestion, visual disorders); discontinuation of treatment due to AE; discontinuation of treatment due to failure/inefficacy.

Articles published in non-Roman characters, trials without a control group, and trials not assessing the outcomes of interest were excluded.

Data extraction, quality assessment, and statistical analysis

The following data were independently extracted by two researchers: (1) study characteristics (e.g. authors’ names, year, sample size, patients’ age, comorbidities), (2) treatments, (3) methodological aspects, (4) clinical outcomes. Methodological quality of studies was evaluated by Jadad Scale [15] and Cochrane Collaboration’s tool for assessing risk of bias [14].

Network meta-analyses were performed for each outcome of interest using a Bayesian framework based on the Markov chain Monte Carlo simulation method. Transitivity was assessed by comparing population, interventions, control, and outcome definitions among studies (i.e. a qualitative evaluation to confirm homogeneity). A common heterogeneity parameter was assumed for all comparisons. Consistency models were built for each network. We performed a conservative analysis with non-informative priors [16, 17]. Odds ratios (OR) and mean differences (MD), expressed with 95% credibility intervals (CrIs), were used as effect size measures for dichotomous and continuous variables, respectively. Fixed- and random-effects models were tested, and the one with the lowest deviance information criterion was selected. Ranking probabilities were calculated via surface under the cumulative ranking analysis (SUCRA) for each outcome, providing a hierarchy of the treatments with values ranging from 0 to 100% (see supplemental material for further information). Robustness of networks was estimated by node-splitting analysis, which detects inconsistency between the pooled direct and indirect evidence for a comparison (p < 0.05 indicates inconsistency) [18, 19]. Sensitivity analyses with hypothetical removal of studies were conducted when discrepancies were identified in the network. Subgroup analyses considering patients' clinical conditions (patients without described comorbidities; patients with cardiovascular disorders; patients with prostate hyperplasia) were performed. All analyses were conducted using Addis (v.1.1.6.7; GeMTC package) and Gephi v.0.9.1 [20,21,22].
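The SUCRA ranking described above can be illustrated with a minimal sketch. It assumes a hypothetical matrix of rank probabilities (the numbers are invented for illustration and are not taken from this review), and the `sucra` function is our own illustration rather than part of Addis or GeMTC. For each treatment, the cumulative probability of occupying one of the first k ranks is averaged over k = 1, ..., a − 1, where a is the number of treatments.

```python
def sucra(rank_probs):
    """Return the SUCRA value (between 0 and 1) for each treatment.

    rank_probs[i][k] is the probability that treatment i takes rank k + 1
    (rank 1 is the top of the ordering). Each row must sum to 1.
    """
    n_ranks = len(rank_probs[0])
    scores = []
    for row in rank_probs:
        cumulative = 0.0
        total = 0.0
        # Sum the cumulative rank probabilities over the first a - 1 ranks.
        for p in row[:-1]:
            cumulative += p
            total += cumulative
        scores.append(total / (n_ranks - 1))
    return scores


# Illustrative (made-up) rank probabilities for three hypothetical treatments.
example = [
    [0.7, 0.2, 0.1],  # hypothetical treatment A
    [0.2, 0.6, 0.2],  # hypothetical treatment B
    [0.1, 0.2, 0.7],  # hypothetical treatment C
]
scores = sucra(example)
```

A value close to 1 (100%) means the treatment almost always sits at the top of the ordering used for that outcome, and a value close to 0 means it almost always sits at the bottom.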

Multicriteria analysis

Stochastic multicriteria acceptability analysis (SMAA) allows the benefit–risk ratio of treatments to be assessed according to several criteria simultaneously. Benefit is described as a potential effect that moves the condition of the patient from disease towards health. Risk is a potential effect that moves the condition from health towards disease. This tool provides a holistic and quantitative assessment of the relative profile of treatments using evidence from a network of trials with unknown or partially known preferences [23, 24]. We used SMAA to estimate the benefit–risk of PDE5i. One benefit criterion (efficacy as IIEF) and one risk criterion (medication-related AE) were initially considered (scenario I). The model was built with missing preferences, i.e. without a previously established order of importance for the two outcomes. Different models were also built considering either placebo or sildenafil 50 mg (the most common active comparator) as baseline. Additional scenarios were built considering other risk criteria (discontinuation due to AE and discontinuation due to failure; scenario II) or individual AE (headache, flushing, nasal congestion, visual disorder; scenario III). Models were run using Monte Carlo iterations with measurements derived from the consistency models from the NMA (Addis v.1.1.6.7).
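The idea of rank acceptability under missing preferences can be sketched as follows. This is a simplified illustration and not the Addis implementation: the function name, the normal measurement distributions, the rescaling of partial values within each draw, and all input numbers are assumptions made for the example. In each Monte Carlo iteration, criterion weights are drawn uniformly from all possible weightings, criterion measurements are drawn for every treatment, and the treatments are ranked by their weighted value.

```python
import random


def rank_acceptability(criteria, n_iter=5000, seed=0):
    """Estimate SMAA rank acceptabilities by Monte Carlo simulation.

    criteria: list of dicts, one per criterion, with keys
      'means', 'sds'       per-treatment measurements (assumed normal here)
      'higher_is_better'   True for a benefit, False for a risk
    Returns acc[i][r]: share of iterations in which treatment i took rank r + 1.
    """
    rng = random.Random(seed)
    n_treat = len(criteria[0]["means"])
    counts = [[0] * n_treat for _ in range(n_treat)]
    for _ in range(n_iter):
        # Missing preferences: weights drawn uniformly from the simplex.
        raw = [rng.expovariate(1.0) for _ in criteria]
        total = sum(raw)
        weights = [x / total for x in raw]
        utility = [0.0] * n_treat
        for w, crit in zip(weights, criteria):
            draws = [rng.gauss(m, s) for m, s in zip(crit["means"], crit["sds"])]
            lo, hi = min(draws), max(draws)
            for i, d in enumerate(draws):
                # Linear partial value, rescaled to [0, 1] within this draw
                # (a simplification made for this sketch).
                v = (d - lo) / (hi - lo) if hi > lo else 0.5
                utility[i] += w * (v if crit["higher_is_better"] else 1.0 - v)
        order = sorted(range(n_treat), key=lambda i: utility[i], reverse=True)
        for rank, i in enumerate(order):
            counts[i][rank] += 1
    return [[c / n_iter for c in row] for row in counts]


# Hypothetical inputs: one benefit and one risk criterion, three treatments.
criteria = [
    {"means": [0.0, 8.0, 6.0], "sds": [0.5, 1.0, 1.0], "higher_is_better": True},
    {"means": [0.0, 1.2, 0.4], "sds": [0.1, 0.3, 0.3], "higher_is_better": False},
]
acc = rank_acceptability(criteria)
```

Here `acc[i][0]` estimates the probability that treatment i is ranked first and `acc[i][-1]` the probability that it is ranked last, which corresponds to the kind of best and worst rank probabilities reported for each scenario.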

Results

This systematic review retrieved 3403 citations after removing duplicates, of which 3038 were excluded after screening. Of the 365 records read in full, 184 articles (179 different RCTs) fulfilled eligibility criteria (50,620 patients) (Fig. 1). Median age of men (considering the median provided by each study) was 55.5 years (IQR 52.5–58.0). Patients with broad-spectrum ED were included, mostly organic (35–45% of patients), psychogenic (25–35%) or mixed (10–20%). Most studies (55.9%) assessed men without concomitant clinical conditions. When reported, the main comorbidities were the following: cardiovascular/metabolic (e.g. diabetes, hypertension) (31 studies); benign prostatic hyperplasia (17 studies); depression/psychosis (9 studies); bilateral nerve-sparing radical prostatectomy (7 studies). Median trial duration was 12 weeks (IQR 8–12). Regimens were considered flexible in 40 studies. Placebo was the comparator in 175 trials (see supplemental material).

Fig. 1
PRISMA flow diagram

Methodological quality of studies was moderate (mean Jadad score: 3.27), with an overall unclear risk of bias (55%) (supplementary material). All studies were randomized and the majority (93.3%) were blinded, although around 80% lacked a description of allocation concealment. Most trials were of parallel design (85.4%). Despite the lack of standardization in reporting results (unclear risk of bias for blinding of outcome assessment in 92.2%), no further issues were detected in attrition and reporting bias. Most studies (77.1%) were supported by pharmaceutical companies.

We were able to build nine NMA, one for each outcome of interest. All networks were robust with no significant discrepancy between direct and indirect evidence in node-splitting analyses (supplemental material).
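The direct versus indirect comparison underlying this check can be illustrated with a rough frequentist analogue. The sketch below uses an adjusted indirect comparison in the style of Bucher, not the Bayesian node-splitting model run in Addis; the function name and all numbers are hypothetical. It derives an indirect A versus C estimate through a common comparator B and tests whether it differs from the direct estimate.

```python
import math


def indirect_vs_direct(d_ab, se_ab, d_cb, se_cb, d_ac, se_ac):
    """Compare a direct A-vs-C estimate with the indirect one obtained via B.

    All effects are on an additive scale (mean difference or log odds ratio).
    d_ab, d_cb: estimates of A vs B and C vs B; d_ac: direct A vs C estimate.
    Returns (indirect estimate, direct minus indirect, two-sided p-value).
    """
    d_indirect = d_ab - d_cb
    se_indirect = math.sqrt(se_ab ** 2 + se_cb ** 2)
    diff = d_ac - d_indirect
    se_diff = math.sqrt(se_ac ** 2 + se_indirect ** 2)
    z = diff / se_diff
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return d_indirect, diff, p


# Hypothetical mean differences (and standard errors) for illustration only.
d_indirect, diff, p = indirect_vs_direct(8.0, 1.0, 6.0, 1.0, 2.5, 1.2)
```

With these made-up inputs the direct and indirect estimates differ by 0.5 points and the p-value is well above 0.05, which would be read as no evidence of inconsistency for that comparison.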

Efficacy

A total of 103 studies (26,845 patients) provided quantitative data on IIEF and were included in the NMA (Fig. 2a). A forest plot of 22 treatments for overall efficacy against placebo is shown in Fig. 3a. Drugs are ordered according to their approval dates (from oldest to newest). All treatments were significantly more efficacious than placebo. Sildenafil 25 mg was statistically superior to all other interventions in enhancing IIEF (MD ranging from 5.06 [95% CrI 1.36; 8.78] to 13.08 [10.06; 16.02]). Considering SUCRA (Fig. 4), sildenafil 25 mg exhibited a probability of 99% of being the best treatment, followed by sildenafil 50 mg (probability of 80%). Tadalafil 10 mg and 20 mg also showed a good profile (73% and 76%, respectively). Placebo was the worst option, followed by avanafil 50 mg. In the NMA of cardiovascular disorders (n = 15 RCT), sildenafil 50 mg presented the best efficacy profile (84% probability in SUCRA) for enhancing IIEF (MD compared to placebo of 7.41 [95% CrI 4.82–9.35]) (see supplemental material for complete analyses).

Fig. 2
The network diagram of IIEF (a) and medication-related AE (b). Each node represents an intervention. The thickness of the lines is proportional to the number of studies for each pair of comparison. A: Avanafil 50 mg; B: Avanafil 100 mg; C: Avanafil 200 mg; D: Vardenafil 5 mg; E: Vardenafil 10 mg; F: Vardenafil 20 mg; G: sildenafil 50 mg; H: sildenafil 100 mg; I: tadalafil 20 mg; J: tadalafil 10 mg; K: tadalafil 2.5 mg; L: tadalafil 5 mg; M: sildenafil 25 mg; N: lodenafil 40 mg; O: lodenafil 80 mg; P: udenafil 75 mg; Q: udenafil 50 mg; R: udenafil 200 mg; S: udenafil 100 mg; T: mirodenafil 150 mg; U: mirodenafil 50 mg; V: mirodenafil 100 mg; X: placebo

Fig. 3
Forest plot of overall efficacy as IIEF (a) and overall safety as medication-related AE (b) for PDE5i at different dosages. Data are shown as effect size (mean difference and odds ratio, respectively) and 95% credibility interval. Drugs are ordered according to the approval dates (i.e. from the oldest to the newest)

Fig. 4
Ranking plot based on the surface under the cumulative ranking curve analysis (SUCRA): values of overall efficacy as IIEF and overall safety as medication-related AE. Treatments lying in the upper-right corner are more effective and safer than the other treatments. No data on medication-related AE were available for avanafil 50 mg, lodenafil 40 mg, lodenafil 80 mg, tadalafil 2.5 mg, and tadalafil 20 mg

Safety and tolerability

Medication-related AE was addressed by 26 studies (7,237 patients). The network plot is shown in Fig. 2b. A forest plot of 17 treatments for overall safety against placebo is shown in Fig. 3b. Significant differences between almost all treatments and placebo were obtained (OR ranging from 0.09 [95% CrI 0.01; 0.49] to 0.40 [0.15; 1.08]), confirming that placebo was the safest option (SUCRA around 1.5%; Fig. 4). Mirodenafil 150 mg was the least safe therapy (98%) compared to almost all other treatments. Higher doses of sildenafil (100 mg) were also significantly related to AE (86%) (supplemental material).

Thirty trials reported the outcome of serious AE (11,794 patients). For this network, although no significant differences between interventions were found, SUCRA indicated that sildenafil 50 mg (24%) and udenafil (29%) may be safer options. Vardenafil 5 mg (90%) and avanafil 100 mg (75%) were more associated with serious AE. Flushing and headache were reported by 94 (26,791 patients) and 118 studies (33,662 patients), respectively. Mirodenafil 150 mg presented the highest probability of causing both AEs (SUCRA 95% and 90%, respectively). Tadalafil 5 mg and 10 mg were less associated with flushing (SUCRA around 17%), while low doses of this drug (2.5 mg and 5 mg) produced less headache (13% and 22%, respectively). Nasal congestion and visual disorders were reported by 41 (12,700 patients) and 34 trials (8,208 patients), respectively. Avanafil 50 mg was the safest regimen for both outcomes (8% and 20%, respectively). Vardenafil (10 and 20 mg) was more associated with nasal congestion (around 85%), while sildenafil 100 mg was more prone to cause visual disorders (89%).

Rates of discontinuation due to AE and due to inefficacy were reported in 82 (26,300 patients) and 58 trials (19,334 patients), respectively. Sildenafil 100 mg was more related to discontinuation due to AE (SUCRA of 95%), while udenafil was best tolerated. Mirodenafil 100 mg and placebo were more related to discontinuation due to inefficacy (92% and 80%, respectively), while vardenafil (10 and 20 mg) and sildenafil 25 mg were the best alternatives for this outcome. The NMA of cardiovascular disorders (n = 15 RCT) showed udenafil and vardenafil to be the better tolerated drugs (supplemental material). For the other comorbidities, no significant differences among drugs were observed. However, few studies could be statistically analyzed due to the lack of reported data on patients' clinical conditions.

Additional analyses and SMAA

Subgroup analyses of patients without described comorbidities, patients with cardiovascular disorders, or patients with prostate hyperplasia presented similar results to the original NMA.

Results of SMAA were similar to those obtained by the individual NMA. The acceptability rank of scenario I (IIEF and medication-related AE criteria with missing preferences and placebo as baseline) is shown in Fig. 5 (17 therapeutic options and placebo). This scenario favored sildenafil 25 mg (benefit–risk ratio of 78%) followed by tadalafil 10 mg (20%). Mirodenafil 150 mg presented the worst benefit–risk ratio (56%). Placebo and sildenafil in higher doses (100 mg) were also disadvantaged options. When establishing ordinal preferences for the two criteria (first with IIEF as the most important outcome, then with medication-related AE as the most important outcome), sildenafil 25 mg continued to be the best option (98% and 60%, respectively), followed by tadalafil 10 mg (17% in both cases). Placebo had a 63% chance of being the worst alternative when the efficacy criterion was considered first, while mirodenafil 150 mg had an 82% chance of being the worst option when primarily accounting for safety.

Fig. 5
Rank acceptabilities from the stochastic multicriteria acceptability analysis. Each intervention has a probability of being the best treatment (rank 1) or the worst treatment (rank 18) considering its overall benefits (achievement of IIEF) and risks (medication-related AE) (missing preferences scenario; placebo as baseline)

Fourteen drug regimens and placebo were included in scenario II (discontinuation due to AE and discontinuation due to failure). Sildenafil 25 mg was the best alternative (benefit–risk ratio of 78%), followed by tadalafil 10 mg (25%). Placebo, sildenafil 100 mg, and mirodenafil were the worst options. Scenario III, accounting for IIEF as the benefit criterion and the four individual AE as risks, showed sildenafil 50 mg, tadalafil 20 mg and udenafil 100 mg with higher probabilities of acceptability (20%). Placebo was again the last option (supplemental material). Similar results were obtained when sildenafil 50 mg was defined as baseline for all models.

Discussion

We updated and synthesized further evidence on the efficacy, safety, and tolerability of PDE5i at different dosages through NMA of more than 100 RCTs. Additionally, we performed multi-criteria decision analysis by SMAA to weigh the benefits and risks of treatments. A previous NMA conducted by Yuan et al. [10] was criticized due to the lack of consistency in literature selection and because drug dosages were not considered in the analyses [25, 26]. In the NMA conducted by Chen et al. [11], the authors considered all available PDE5i and their dosages. However, no network plot was made available and only parallel-design studies with overall outcomes were evaluated, which may not reflect the true profile of PDE5i [11].

Data from individual studies and some systematic reviews suggest that PDE5i have similar efficacy in the general ED population [8, 27,28,29], which may be explained by the small structural differences between them [6]. However, similarly to the findings of Chen et al. [11], our analyses confirm that sildenafil at 25 mg or 50 mg is the most effective option compared to the other PDE5i regimens and should be considered as the very first line for ED, especially for patients requiring immediate stronger efficacy. Usually, the lowest recommended dose of sildenafil is reserved for special populations (e.g. elderly or those with hepatic or renal impairment) or for those who experience adverse events at a higher dose [4, 5]. In our analyses, most studies evaluating sildenafil 25 mg included men with a median age of 55 years without other clinical conditions, supporting the initial selection of this drug at 25 mg, especially because the dose–response effect of PDE5i may be small and non-linear [4, 30]. Lower doses of sildenafil were related to higher acceptability rates, as demonstrated in our SMAA. Conversely, sildenafil at 100 mg was more prone to cause AE, especially visual impairment, and discontinuation due to safety concerns.

Tadalafil 10 and 20 mg showed intermediate efficacy, with low rates of AE. This drug should be indicated for men wishing to optimize tolerability and prolonged erection [6, 7]. Tadalafil provides the longest therapeutic effect (up to 36 h) among all PDE5i, with first effects appearing 60–120 min after administration. Udenafil has a similar onset of action (60–240 min), but a reduced duration of effects (12 h). Sildenafil and vardenafil also present shorter plasma half-time (10–12 h), but clinical effects appear earlier (30–60 min). The duration of action for avanafil, mirodenafil, and lodenafil is between 6 and 12 h, with avanafil presenting the fastest onset of action (15–30 min) among all PDE5i [31,32,33,34,35,36].

Despite the rapid onset of action, avanafil may cause anxiety, resulting in ineffectiveness [37, 38]. Our results are consistent with this hypothesis, showing that avanafil had the least efficacious profile. This drug presented the safest profile for ED, causing significantly fewer medication-related and serious AE at low dosage, probably because of its short half-life. Mirodenafil 150 mg presented significantly more risks among PDE5i, being the worst therapeutic option in SMAA. Similarly, intermediate to high doses of vardenafil and udenafil were related to more AE, especially nasal congestion.

Treatment discontinuation of PDE5i is common and mostly due to therapeutic failure [8, 39]. The main reasons for therapeutic failure include incorrect use of PDE5i, lack of sexual stimulation, and lack of adherence depending on regimen (on-demand or daily dosing). High-fat meals decrease the efficacy of sildenafil and vardenafil by about 30% due to delayed absorption of the drug; intake of alcohol delays the absorption of lodenafil and mirodenafil, but may enhance their bioavailability [6]. Thus, men who are prescribed a PDE5i should be instructed in the appropriate use of the medication. Some patients who fail to achieve an erection when taking PDE5i on-demand can benefit from a daily dosing regimen or vice versa. Of men who initially do not respond to therapy, between 30 and 50% may be converted to responders through simultaneous counselling with their partners [5, 40].

Efficacy of PDE5i depends on the integrity of the nitric oxide pathway in producing cGMP. Patients with impairment of this pathway (e.g. diabetes, radical prostatectomy, metabolic syndrome) will probably benefit less from PDE5i [6]. In our study, subgroup analyses by medical conditions did not reveal a significantly different response compared to the overall population. However, few studies properly accounting for comorbidities were found, which may limit further conclusions.

As limitations, we are aware of the potential introduction of bias caused by studies of poor methodological quality or with an insufficient wash-out period. However, few studies of cross-over design were included. The low reporting quality of some trials and the variance between efficacy endpoints prevented further analyses from being performed. IIEF has been recommended as a primary endpoint for clinical trials in ED because it is a widely used, multi-dimensional self-report instrument. However, other measures (e.g. quality of life, vascular parameters, rigidity testing) are also important to evaluate patients' improvement. We evaluated daily dosing regimens because few trials properly reported the results for on-demand regimens. We analyzed some of the most common and most frequently reported adverse events; however, they may vary widely among patients. As with any other method, NMA is not free of limitations. The validity of NMA depends on the distribution of relative treatment effect modifiers across comparisons. The included RCTs differ in terms of size, risk of bias, and external validity. We tried to avoid systematic errors by performing transitivity and sensitivity analyses. Treatment rankings should not be interpreted in isolation from the relative treatment effects. We opted to perform different SMAA scenarios to avoid inconsistencies or selective bias, and to potentially increase the informative value of existing evidence for prescribing decisions.

Conclusions

A correct diagnosis of ED and associated health conditions, along with a patient-tailored therapy to restore sexual satisfaction and improve quality of life, seems the most beneficial strategy to manage this condition. We suggest sildenafil at low doses (25 or 50 mg) followed by tadalafil (10 or 20 mg) as first therapeutic options for ED in any case. Patients requiring a different onset of action and duration of effects may use vardenafil or udenafil, provided they are aware of the AE. The use of avanafil, lodenafil and mirodenafil is hardly justified given the lack of substantial efficacy or the high rates of AE. For patients with cardiovascular disorders, sildenafil (low doses), vardenafil and udenafil have the best benefit–risk profile.