Introduction

Hyperuricemia is a physiological abnormality of increased serum uric acid (sUA) concentration owing to urate under-excretion, overproduction, or both [1,2,3,4]. This precipitates the deposition of monosodium urate crystals in the joints and generates tophi, leading to inflammatory reactions manifesting as stabbing pain, swelling, and limb deformation [3,4,5,6,7]. In addition to being an independent precursor of gout [8], hyperuricemia also promotes the incidence or progression of metabolic disorders in various organs [9,10,11,12], impairing somatic, mental, and social well-being [13, 14]. Its incidence is increasing across subgroups of age, sex, socioeconomic level, and geographic area [14,15,16,17]. This underlines the need for efficacious, high-quality hyperuricemia management worldwide in both public health and clinical medicine settings.

At present, hyperuricemia is managed using urate-lowering therapies (ULTs) to decrease the sUA level [1,2,3,4,5,6], and they are usually prescribed lifelong [18]. Acute gout flares (GFs), induced by rapid lowering of the sUA, are commonly encountered on initiation or dose up-titration of ULTs [19]. Accordingly, adherence attrition in ULT has been a long-standing issue and remains challenging. Since poor adherence dilutes therapeutic effectiveness [20,21,22], it is essential for clinical decision-makers to understand whether particular agents themselves, rather than patient attributes or administration modalities, are more likely to give rise to adverse events (AEs).

Overall, the most substantial impact of low efficacy of, or poor adherence to, ULTs has been a long-term burden on health care cost and manpower [23, 24]. Evidence accumulated from 18 observational studies from 1974 to 2016 reported non-adherence rates of 21.5–82.6%, non-persistence rates (temporarily suspending for at least 30 days during therapy) of 54–87%, and post-discontinuation gouty arthritis relapse rates of 36.4–81%, with a higher likelihood of relapse in patients with poor pre-discontinuation sUA management [20, 25]. Irrespective of the extent of medical resources allocated to non-adherent patients, poor efficacy and low cost-effectiveness continue to be concerns. Therefore, the current study assessed the efficacy and safety of ULTs with a focus on the occurrence of adherence attrition by type-specific AEs. In addition, a re-verification of ULT efficacy is necessary because several ULTs were newly approved or were left out of the previous meta-analyses [26,27,28] (e.g., arhalofenate, lesinurad, topiroxostat, Terminalia, and dual agents). Hence, using a Bayesian network meta-analysis, we aimed to comprehensively compare all the market-approved ULTs for the treatment of hyperuricemia.

Materials and methods

The operational hypothesis and outcomes

This meta-analysis focused on three outcomes: (i) ULT efficacy, measured as the proportion of patients achieving the therapeutic target level of sUA (≤ 6 mg/dL, or ≤ 5 mg/dL in patients with severe gout); (ii) safety of ULT, measured as the proportions of patients reporting overall AEs, serious AEs (SAEs), and death; and (iii) occurrence of adherence attrition events (AAEs), measured as the proportions of patients reporting discontinuation of study medication owing to AEs (DCE), gout flare attacks (GFs), drug-related AEs (dAEs), and skin-related AEs (skAEs). Reports of withdrawals with no definite statement of the reason were not included in DCE. The last outcome was based on the postulate that a ULT associated with a higher occurrence of AAEs in the context of an RCT would be associated with lower adherence in real-world clinical practice, provided equivalent dosage titration was administered.

Search strategy and selection criteria

This meta-analysis included only peer-reviewed RCTs. We searched the following databases: PubMed, EMBASE, and Medline. For unpublished trials, trials published only as conference abstracts, or protocol-only studies, we checked the websites of trial registry systems (Appendix 1) for the most recent progress and for any peer-reviewed publications. The trial registry systems, search logic, and searching and screening processes are shown in Fig. 1 and Appendix 1. The search was extended by examining the “related articles” of the eligible articles shown in the side menu of PubMed and in Google Scholar.

Fig. 1 Flow chart of literature search and study selection

Studies were eligible for inclusion if (i) they were published between the inception of the databases and February 28, 2019; (ii) the study followed an RCT design with random allocation implemented at the individual patient level; (iii) at least one arm administered a currently marketed or newly announced agent (as listed in Appendix 2); (iv) the study population comprised patients with primary hyperuricemia, gout, or both; and (v) the study endpoints included the proportion of patients whose sUA was controlled under the target level by ULT. We excluded studies that (i) had no arms identical to the arms of other trials, (ii) were duplicated or non-relevant studies (e.g., RCT extension studies or governmental reports based on RCT results), (iii) were designed to evaluate only acute symptom alleviation (e.g., pain score), (iv) were conducted in non-human subjects, non-adults, healthy individuals, or patients with secondary hyperuricemia (e.g., tumor lysis syndrome and pyrazinamide-induced hyperuricemia), (v) did not publish an efficacy assessment based on the aforementioned international consensus standard for effective therapy, (vi) performed an efficacy assessment using a standard other than the consensus [1,2,3,4,5], or (vii) were non-English publications with inaccessible full text. Eventually, 39 articles remained for subsequent analyses [29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67].

Data extraction and quality assessment

Data extraction and methodological quality assessment were performed by one reviewer (Y-JL) and double-checked by another reviewer (S-SC). Any inconsistency was resolved through discussion following the ad hoc guidelines for assessing the risk of bias in RCTs. A data extraction template was established based on extraction from the first 10 eligible articles. After data had been extracted from all 39 articles, the extraction was repeated in a different order from the initial extraction to make the extraction rules as consistent as possible across all 39 articles. Finally, a round of visual inspection was performed for confirmation (by reviewer Y-JL). Then, the other reviewer (S-SC) provided the final confirmation and consulted all other authors to integrate and resolve inconsistencies. Quality assessments of all the eligible studies were carried out using the Cochrane Collaboration’s tool for assessing the risk of bias [68].

Statistical analysis

RevMan 5.3 (Cochrane Review Manager, Cochrane Collaboration, Oxford, UK) was used to visualize the quality assessment results. The quantitative synthesis was performed by Bayesian network meta-analysis with a random-effects model using R (version 3.5.2). The numbers of studies and patients contributing to head-to-head comparisons were visualized using network geometry plots. Pooled odds ratios (ORs) and their 95% credible intervals (CrIs) were reported as the effect size estimates and their precision for the comparison of efficacy, safety, and AAEs. Consistency between direct and indirect comparisons was tested using a node-splitting method and summarized with forest plots by direct and indirect evidence.
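As a rough illustration of how a pooled OR and its 95% CrI can be read off a Bayesian fit, the sketch below summarizes posterior draws of a log odds ratio by their median and 2.5th/97.5th percentiles. It is not the R code used in this study: the function names `percentile` and `summarize_or` are hypothetical, and the draws are simulated rather than taken from any trial.

```python
import math
import random

def percentile(sorted_values, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 1]."""
    if not sorted_values:
        raise ValueError("need at least one draw")
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

def summarize_or(log_or_draws):
    """Return (median OR, lower, upper) for a 95% equal-tailed credible interval."""
    draws = sorted(log_or_draws)
    return tuple(math.exp(percentile(draws, q)) for q in (0.5, 0.025, 0.975))

# Simulated posterior draws of a log odds ratio (illustrative only, not study data).
rng = random.Random(2019)
draws = [rng.gauss(math.log(2.0), 0.3) for _ in range(20000)]
or_median, or_low, or_high = summarize_or(draws)
print(f"OR {or_median:.2f} [{or_low:.2f}, {or_high:.2f}]")
```

Under this convention, a comparison is read as favoring one agent when the whole interval lies on one side of 1.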

Of the 39 included articles, four trials evaluated dual-agent ULTs; all others evaluated single prescriptions. We first evaluated the efficacy of ULTs under a network meta-analysis model incorporating all 39 articles as a pooled model (model MP), on the assumption that the control groups in studies of dual-agent regimens, allopurinol + placebo and febuxostat + placebo, were equivalent to the active-control groups for single prescriptions, allopurinol and febuxostat, respectively. Sensitivity analyses were then performed under three scenarios of separated synthesis: (i) evaluation based on evidence solely from trials of single prescriptions (model MS) and solely from trials of dual-agent prescriptions (model MC); (ii) evaluation of whether the findings under the aforementioned pooled-model assumption were altered relative to the separate-model evaluations; and (iii) since the estimates for the efficacy of pegloticase were not stable in the assessment, another analysis based on evidence without this agent (model MP1) to assess the influence of pegloticase in MP.

Results

The 39 eligible RCTs comprised a total of 19,401 patients. The characteristics of the study population are summarized in Table 1 and Appendix 3. The included studies were published from 1999 through 2019 and covered placebo and 14 ULTs (allopurinol, febuxostat, febuxostat immediate release (IR) formulation, febuxostat extended release (XR) formulation, Terminalia bellerica, Terminalia chebula, topiroxostat, arhalofenate, benzbromarone, lesinurad, probenecid, pegloticase, lesinurad + allopurinol, and lesinurad + febuxostat), which yielded 33 active arms across formulations and dosages.

Table 1 Characteristics of patients in the RCTs included in the data synthesis analysis

Characteristics of studies and quality assessment

In general, more male patients were enrolled in all the studies, with male patients making up > 80% of the total patient population. Of all studies, 83.8% had participants with a mean age of ≥ 50 years, and 21.6% had participants with a mean age of ≥ 60 years. Almost all studies had an average BMI of > 25 kg/m2 and a basal sUA level of > 6 mg/dL, except for one study [67]. Four studies [64,65,66,67] enrolled patients using dual-agent ULTs, and patients in these dual-agent ULT trials had a lower basal sUA. Of the 39 studies, 8 (20.5%) were multinational trials, 12 (30.8%) were conducted solely in North America, 17 (43.6%) were conducted in Asia (of which Japan alone contributed 12 studies (30.8%)), and 2 (5.1%) were conducted in Australia. Over half of the RCTs (25/39 = 64.1%) were registered ex ante on a public website. It was also observed that trials conducted in Europe and North America enrolled more obese patients than those conducted in Asia.

Open-label, single-blind, and unclear-blinding trials were rated as high risk of performance bias, detection bias, or both (Appendix 4). Most studies explained the blinding of patients and investigators, but few explained the blinding of assessors. Reporting bias arose mainly from incomplete AE reports. Only a few trials explicitly stated that they received no funding from the pharmaceutical industry and reported ex ante registration, and were therefore rated as low risk of other bias.

Arms administering allopurinol, febuxostat, and placebo contributed the majority of the evidence for direct comparison: 21 (55.3%) trials had at least one arm using allopurinol (Al and Alc in Appendix 5) [29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44, 46, 47, 64,65,66], and 22 (57.9%) trials used febuxostat [29,30,31,32, 35,36,37,38,39, 43, 44, 46,47,48,49,50,51,52,53,54,55,56, 67]. The other ULTs contributed less evidence. The study by White WB enrolled the largest number of patients, accounting for nearly 32% (= 6190/19,401) of the total sample size [45]. However, this study was included only in the efficacy analyses, because it reported AEs only for cardiovascular events.

Network meta-analyses of efficacy, safety, and AAE occurrence

The network geometry plots for evaluating efficacy and safety with MP and for evaluating DCEs, GFs, dAEs, and skAEs are displayed in Fig. 2. Of the included trials, 32 (84.2%) reported 670 DCEs (5.6% of 11,887), 27 (71.1%) reported 1781 GFs (20.5% of 8677), 17 (44.7%) reported 1245 dAEs (17.7% of 7040), and 16 (42.1%) reported 307 skAEs (4.7% of 6480). The reporting of AAEs differed substantially by publication year. Classifying studies by publication year (1999–2009 vs. 2010–2019), reporting rates declined for GFs (6/7 = 85.7% vs. 21/31 = 67.7%) and skAEs (6/7 = 85.7% vs. 10/31 = 32.3%), whereas an increase was observed for dAEs (1/7 = 14.3% vs. 16/31 = 51.6%).
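The reporting rates by publication period quoted above follow directly from the trial counts. As a small check, the sketch below recomputes them and the change in percentage points; the counts are those given in the text, while the helper names `pct` and `era_rates` are hypothetical.

```python
# Counts of trials reporting each event type, taken from the text above:
# (reporting trials, total trials) for 1999-2009 and for 2010-2019.
reporting = {
    "GFs":   ((6, 7), (21, 31)),
    "skAEs": ((6, 7), (10, 31)),
    "dAEs":  ((1, 7), (16, 31)),
}

def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)

def era_rates(table):
    """Map each event type to (early %, late %, change in percentage points)."""
    out = {}
    for event, ((e_n, e_d), (l_n, l_d)) in table.items():
        early, late = pct(e_n, e_d), pct(l_n, l_d)
        out[event] = (early, late, round(late - early, 1))
    return out

for event, (early, late, delta) in era_rates(reporting).items():
    print(f"{event}: {early}% -> {late}% ({delta:+} points)")
```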

Fig. 2 Network geometry plots of the assessments analyzed under the pooled model (MP) in this study. (1) The edges connecting the nodes indicate head-to-head comparisons. The thicker the edges, the higher the number of studies contributing to the evidence. The sizes of nodes are proportional to the sample sizes of the administered ULT in this meta-analysis. Code 911 indicates placebo. The other codes of ULTs are noted below by pharmacologic attributes. (2) Xanthine oxidase inhibitors (XOIs): 111 = Allopurinol, 121 = Febuxostat 20 mg/day, 122 = Febuxostat 40 mg/day, 123 = Febuxostat 60 mg/day, 124 = Febuxostat 80 mg/day, 125 = Febuxostat 120 mg/day, 126 = Febuxostat 240 mg/day, 127 = Febuxostat 40/80 mg/day, 128 = Febuxostat XR 40 mg/day, 129 = Febuxostat XR 80 mg/day, 131 = Terminalia bellerica 250 mg/day, 132 = Terminalia bellerica 500 mg/day, 133 = Terminalia chebula 500 mg/day, 141 = Topiroxostat 40 mg/day, 142 = Topiroxostat 60 mg/day, 143 = Topiroxostat 80 mg/day, 144 = Topiroxostat 120 mg/day, 145 = Topiroxostat 160 mg/day, 211 = Allopurinol + colchicine, 221 = Allopurinol + placebo, 231 = Febuxostat 80 mg/day + placebo. (3) Uricosuric drugs: 311 = Arhalofenate 600 mg/day, 312 = Arhalofenate 800 mg/day, 321 = Benzbromarone, 331 = Lesinurad 400 mg/day, 341 = Probenecid 2 g/day. (4) Recombinant porcine-like uricase drugs: 411 = Pegloticase 4 mg/2 weeks, 412 = Pegloticase 8 mg/2 weeks, 413 = Pegloticase 8 mg/4 weeks, 414 = Pegloticase 12 mg/4 weeks. (5) Uricosuric combined with XOI prescription: 811 = Lesinurad 200 mg/day + allopurinol, 812 = Lesinurad 400 mg/day + allopurinol, 813 = Lesinurad 600 mg/day + allopurinol, 821 = Lesinurad 200 mg/day + febuxostat 80 mg/day, 822 = Lesinurad 400 mg/day + febuxostat 80 mg/day.
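The two quantities that drive a network geometry plot, the number of studies per head-to-head comparison (edge thickness) and the number of patients per treatment (node size), can be tallied from arm-level data. The sketch below is a minimal illustration under the assumption that each trial is recorded as a mapping from treatment code to arm size; the function name `network_geometry` and the arm sizes are invented, and only the treatment codes follow the legend above.

```python
from collections import Counter
from itertools import combinations

def network_geometry(trials):
    """Return (edge_weights, node_sizes) for a list of trials.

    Each trial is a dict mapping a treatment code to the number of
    patients randomized to that arm.
    """
    edges = Counter()   # number of studies per head-to-head comparison
    nodes = Counter()   # total patients per treatment
    for arms in trials:
        for code, n in arms.items():
            nodes[code] += n
        # Every pair of arms within a trial is one direct comparison.
        for a, b in combinations(sorted(arms), 2):
            edges[(a, b)] += 1
    return edges, nodes

# Hypothetical trials; the arm sizes are invented for illustration.
# Codes: 911 = placebo, 111 = allopurinol, 124 = febuxostat 80 mg/day.
trials = [
    {911: 50, 111: 100, 124: 100},
    {111: 80, 124: 85},
    {911: 40, 124: 45},
]
edges, nodes = network_geometry(trials)
for pair, n_studies in sorted(edges.items()):
    print(pair, n_studies)
```

A multi-arm trial contributes one study to every pair of its arms, which is why a three-arm trial adds three edges here.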

The pooled OR estimates and the ranking probabilities of the efficacy assessments are displayed in Tables 2, 3, and 4 (the results for the overall AE and AAE assessments are listed in Appendices 6 and 7; the corresponding 95% CrIs and forest plots with selected reference ULTs are shown in Fig. 3). All active agents exhibited significantly better efficacy than placebo (see the last row of Table 2 and Fig. 3a), except the extracts of Terminalia bellerica 250 mg/day and T. chebula 500 mg/day (pooled ORs [95% CrI] were 0.79 [0.1, 8.6] and 0.33 [0.1, 2.2], respectively). Across the whole sampling history of the estimation, placebo had a 54% probability of being ranked as the least efficacious agent and a 99.9% probability of being ranked among the lowest three (Table 3).
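Ranking probabilities of this kind are typically obtained by ranking the treatments within each posterior sample and counting how often each treatment lands in each position. The following sketch shows the idea with simulated draws; the function name `rank_probabilities`, the treatment labels, and the effect values are all hypothetical and do not reproduce the study's estimates.

```python
import random

def rank_probabilities(draws):
    """Estimate rank probabilities from posterior samples.

    draws: list of dicts {treatment: effect}, one dict per posterior sample,
    where a larger effect means better efficacy.
    Returns {treatment: [P(rank 1), P(rank 2), ...]}.
    """
    treatments = sorted(draws[0])
    counts = {t: [0] * len(treatments) for t in treatments}
    for sample in draws:
        ordered = sorted(treatments, key=lambda t: sample[t], reverse=True)
        for rank, t in enumerate(ordered):
            counts[t][rank] += 1
    n = len(draws)
    return {t: [c / n for c in row] for t, row in counts.items()}

# Simulated effects on the log-odds scale (illustrative only, not study data).
rng = random.Random(7)
means = {"agent_A": 1.0, "agent_B": 0.8, "placebo": 0.0}
samples = [{t: rng.gauss(m, 0.3) for t, m in means.items()} for _ in range(5000)]
probs = rank_probabilities(samples)
print("P(placebo ranked last) =", probs["placebo"][-1])
```

The last entry of each row is the probability of being ranked lowest, and summing the final three entries would give the probability of being among the lowest three.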

Table 2 Bayesian network meta-analysis estimates of odds ratios (ORs) under the pooled model (MP) for the efficacy and safety assessments of ULTs
Table 3 Ranking probabilities under the pooled model (MP) for the efficacy assessments of ULTs
Table 4 Summary of the patterns exhibited in the pooled estimates and ranking probabilities
Fig. 3 Forest plots for efficacy/safety assessments

Figure 3b shows the forest plot comparing the efficacy of the different ULTs with that of allopurinol. Among xanthine oxidase inhibitors (XOIs), febuxostat was found to be significantly superior to allopurinol (OR estimates: 1.97–20.41), except for the lowest dosage and the varying dosage (20 mg/day and mixed 40/80 mg/day in an arm). The two newly launched formulations (XR, 40 and 80 mg/day) were equivalent to the corresponding standard formulations (OR for XR vs. IR: 1.57 [0.9, 2.9] and 1.09 [0.6, 2] for febuxostat 40 and 80 mg/day). Topiroxostat was non-inferior to allopurinol, had a lower efficacy than febuxostat 120–240 mg/day, and was superior to T. bellerica 250 mg/day or T. chebula 500 mg/day. Among uricosuric agents, benzbromarone was superior to all of the other four but was inferior to febuxostat 120 mg/day (0.24 [0.07, 0.83]) and 240 mg/day (0.09 [0.02, 0.37]). All dual-agent ULTs had superior efficacy to monotherapy with lesinurad (ORs: 10.66–39.78), allopurinol (ORs: 4.7–15.47), and febuxostat (ORs: 2.09–7.84). In general, the dual agents had superior efficacy to XOI agents and uricosuric agents (rankings: 6–14 vs. 5–34 and 18–31), and most XOI agents had superior efficacy to the uricosuric agents (Table 3).

Patients receiving placebo had lower overall AEs than those receiving probenecid 2 g/day, pegloticase 4 mg/2 weeks and 8 mg/4 weeks, and lesinurad 400 mg/day + allopurinol (see the last column of Table 2; Fig. 3c).

Of the 38 eligible RCTs, 29 (76.3%) reported 239 (1.85%) SAEs and 34 (89.5%) reported 11 (0.085%) deaths in 12,900 patients (see Appendix 9). Given the scarcity of SAEs and deaths, the data are insufficient for definitive conclusions. However, dual-agent regimens were observed to carry a substantially higher risk of SAEs (lesinurad 400 mg/day + allopurinol vs. placebo, allopurinol, febuxostat 40 mg/day, and lesinurad 200 mg/day + allopurinol: 3.2 [1.4, 7.5], 1.97 [1, 4], 2.56 [1.1, 6.1], and 2.08 [1.1, 4.2]), and the dual agents had a higher risk of all-cause AEs than most XOIs (rankings: 7–15 vs. 6–29; Appendix 7 (a)).

The three single agents with the most frequent DCEs were pegloticase, probenecid, and lesinurad (ranking: 2–6). Compared with allopurinol, febuxostat, topiroxostat, arhalofenate, or benzbromarone monotherapy (ranking: 7–29), dual agents appeared to have more DCEs. The occurrence rates of DCE were also observed to increase with the dose of lesinurad. Fx8 (the febuxostat 80 mg/day XR formulation) tended to be associated with fewer DCEs than the febuxostat IR formulation (rankings: 25 vs. 15). This could result from the extended-release pharmacological characteristics of Fx8. It was also observed that DCEs increased with dosage for febuxostat ≥ 40 mg/day (trend in ORs: 0.91–1.86; in rankings: 20–10). The detailed assessment results of DCEs are in Appendix 6 (a), Appendix 7 (b), and Appendix 8 (b) and (f).

An increased risk of GFs was observed in three comparisons (see Appendix 6 (a), Appendix 7 (c), and Appendix 8 (b) and (g) for detailed results): (1) febuxostat > 80 mg/day and dual agents vs. placebo (OR < 1 in the last column of Appendix 6 (a); rankings: 4–12 and 5–10 vs. 18); (2) febuxostat ≥ 120 mg/day vs. topiroxostat (OR: 0.05–0.77; rankings: 7–12 vs. 6–26); and (3) topiroxostat > 80 mg/day vs. ≤ 60 mg/day (OR: 8.29 [1.1, 242.4] and 8.2 [1.1, 229.2]).

Compared with placebo, a higher risk of dAEs was observed with dual-agent ULTs, lesinurad, and febuxostat 40, 80, 40/80, 80 XR, and 120 mg/day (rankings 14 vs. 2–6, 1, 11, 2–14, and 15–18). The data on skAEs were too scarce to draw a conclusion. The detailed assessment results of dAEs are in Appendix 6 (b), Appendix 7 (d), and Appendix 8 (c) and (h).

Heterogeneity and inconsistency

Substantial heterogeneity was observed for the following assessments: (i) efficacy assessments, in the comparison of febuxostat 60 mg/day and topiroxostat 160 mg/day vs. allopurinol (p = 0.049 and < 0.001) and of topiroxostat 160 mg/day and placebo vs. topiroxostat 120 mg/day (p = 0.00025 and 0.00025); (ii) all-cause AE assessments, topiroxostat 120 mg/day vs. allopurinol (p = 0.00175) and placebo vs. topiroxostat 120 mg/day (p = 0.001), and DCE assessments, febuxostat 80 mg/day vs. febuxostat 40 mg/day (p = 0.035) and topiroxostat 160 mg/day vs. topiroxostat 120 mg/day (p = 0.004); and (iii) dAE assessments, febuxostat 40 mg/day vs. allopurinol (p = 0.028) and placebo vs. febuxostat 40 mg/day and topiroxostat 120 mg/day (p = 0.013 and 0.029) (see Appendix 10).
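The idea behind comparing direct with indirect evidence can be illustrated with a simple frequentist analogue. The sketch below is not the Bayesian node-splitting procedure used in this study; it is a swapped-in z-test on the difference between a direct and an indirect log OR, assuming the two estimates are independent and approximately normal. The function name `inconsistency_test` and the input values are hypothetical.

```python
import math

def inconsistency_test(d_direct, se_direct, d_indirect, se_indirect):
    """Two-sided z-test for the difference between direct and indirect log ORs.

    Returns (difference, z statistic, p-value).
    """
    diff = d_direct - d_indirect
    se = math.sqrt(se_direct ** 2 + se_indirect ** 2)
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return diff, z, p

# Hypothetical log ORs and standard errors (illustrative only, not study data).
diff, z, p = inconsistency_test(0.90, 0.25, 0.20, 0.30)
print(f"difference = {diff:.2f}, z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests that the direct and indirect estimates for that comparison disagree more than chance alone would explain.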

Sensitivity analysis

The discrepancies between the full pooled model (MP) and the separated models (MS and MC; Appendix 11.A) were negligible for most comparisons except those involving pegloticase. Because the pegloticase estimates were derived from fragile evidence, broad variation among the iterated estimates contributed to this large discrepancy. Nevertheless, most differences between the estimates of the models with and without pegloticase (MP and MP1; Appendix 11.B) were very minor. Moreover, incorporating pegloticase into the full model did not alter the findings for other agents; therefore, the results based on model MP are retained in our main text, and the rest are attached as Supplementary Information.

Discussion

In general, the main result of this study is in line with the previous meta-analysis comparing the efficacy of different ULTs. Similar to the network meta-analysis published by Li S in 2016 [26], febuxostat was found to be associated with the best urate-lowering efficacy among all the single-agent ULTs investigated. In clinical practice, both allopurinol and febuxostat are recommended as first-line drugs, but febuxostat is prescribed only when allopurinol is contraindicated or not tolerated. This is because febuxostat is far more expensive than allopurinol [69], and the AE profile of febuxostat is less well characterized than that of allopurinol. Allopurinol was approved by the US Food and Drug Administration (FDA) in 1965, whereas febuxostat was approved by the US FDA only in 2009. Hence, febuxostat has had a much shorter period of post-marketing surveillance than allopurinol, and some rare AEs might not have been reported in the smaller patient population. In fact, a recent study by White WB [45], with a sample size of 6190 patients, reported higher cardiovascular mortality with febuxostat than with allopurinol. Therefore, the routine clinical practice of first prescribing allopurinol when it is not contraindicated for patients newly diagnosed with hyperuricemia should be continued.

However, this study offers several improvements in design over previous reviews [20, 25,26,27,28]. First, a Bayesian network meta-analysis was used to facilitate exhaustive mutual comparison, which allows the incorporation of zero-event observations. Second, the analysis was based on data only from RCTs, where potential confounding factors could be controlled as much as possible. Third, the synthesis included newly launched agents [27, 64,65,66,67], innovative agents [56], new formulations [48, 49], and agents used only in Japan [33, 34, 57,58,59]. Fourth, since poor adherence is related to worse sUA control [21, 22] and null efficacy, adherence attrition-related AEs were also evaluated to elucidate the necessity and direction of future cost-effectiveness analysis for ULTs.

The profiles of patients with gout differ by sex: among both prevalent and incident cases, women were approximately 6–10 years older at initial diagnosis [70], had a higher burden of comorbidities, had different comorbidity profiles, were more obese [71], had fewer dietary triggers (seafood, red meat, hard liquor, wine, and beer) [71], and had more diuretic-triggered gout flares [70, 71]. Nevertheless, current evidence is predominantly based on male individuals.

The reporting of AAEs was somewhat inadequate. First, the disclosure rate of dAEs was < 50% and that of GFs was merely 71%, although both events considerably influence patient adherence to therapy [72]. Second, comparing reporting rates by publication year (1999–2009 vs. 2010–2019), reporting declined for GFs and skAEs and increased for dAEs. The lower reporting could be attributed to several reasons, including null findings, favorable-outcome selection, or changing viewpoints. The lower reporting of AAEs made evidence retrieval and clinical decision-making difficult. To make future comprehensive utility and cost-effectiveness analyses of ULTs more feasible, emphasis should be put on sophisticated AE reporting (e.g., frequency and time of GFs, time to DCEs), especially for RCTs involving chronic diseases that necessitate long-term medication use, where interferences can be controlled.

The results of our study should be interpreted in light of both its strengths and weaknesses. The main strength of this study is the inclusion of only peer-reviewed RCTs to reduce the effects of confounding factors as much as possible. Apart from sUA-lowering-induced GFs [19], intolerance, or allergy [17, 73, 74], interference with ULT adherence could come from the following sources: heterogeneous demographics [18]; socioeconomic level, health care capability, and health literacy [75]; physical disability [25, 76,77,78,79]; strategies on patient management [75]; physicians’ prescription habits, specialties, and competence [17, 18, 76, 77, 80, 81]; and information accessibility [80]. Thus, given the study design, the differences in adherence attrition occurrence observed here were likely attributable to the ULT agents.

In addition, the Bayesian methods applied here, in contrast to the frequentist method (e.g., computation in STATA), facilitated the incorporation of zero-event observations without requiring a technical correction of the data, while imposing more variation on the pooled estimates. The included evidence incorporated direct comparisons with active controls (allopurinol or febuxostat) for every ULT, except for pegloticase. However, the unique bridge connecting pegloticase and other agents presented a zero-event observation, i.e., no patients achieved the target therapeutic effect in the placebo arm [63]. This resulted in poor precision of the assessment for pegloticase. Zero-event observations were found in 11 of the 39 included RCTs. This introduced a discrepancy in study estimates between this and a previous review [26].
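To make the "technical correction" concrete: with a zero cell, the empirical OR from a 2x2 table is zero or undefined, so frequentist pooling commonly adds a small constant to every cell. The sketch below shows the widely used 0.5 continuity (Haldane-Anscombe) correction as an example of the workaround that the Bayesian approach avoids; it was not applied in this study, and the function name `log_odds_ratio` and the counts are hypothetical.

```python
import math

def log_odds_ratio(events_t, n_t, events_c, n_c, correction=0.5):
    """Log OR and its standard error from a 2x2 table.

    If any cell is zero, `correction` is added to every cell
    (Haldane-Anscombe); with correction <= 0 a zero cell raises an error.
    """
    a, b = events_t, n_t - events_t   # treatment: events, non-events
    c, d = events_c, n_c - events_c   # control: events, non-events
    if min(a, b, c, d) == 0:
        if correction <= 0:
            raise ZeroDivisionError("zero cell: the empirical log OR is undefined")
        a, b, c, d = (x + correction for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se

# Hypothetical trial: 20/50 responders on active treatment, 0/25 on placebo.
log_or, se = log_odds_ratio(20, 50, 0, 25)
print(f"corrected OR = {math.exp(log_or):.2f}, SE(log OR) = {se:.2f}")
```

The corrected estimate depends on the arbitrary constant chosen, which is one reason estimates from corrected and uncorrected analyses can differ.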

There are several important limitations to this study. First, non-pharmacological hyperuricemia management interventions were excluded from this study, for example, weight loss, modification of economic and fundamental disease risks, lifestyle modification [7], and a healthier diet [82]. These interventions were excluded for lack of comparability, and patient adherence remains the key issue for their real-world effectiveness. Second, despite our attempt to be as comprehensive as possible in investigating all trials of ULTs, some ULTs (e.g., azapropazone, benziodarone, sulfinpyrazone, ethebencid, zoxazolamine, and ticrynafen) were not included owing to the scarcity of evidence. Reasons for the limited available evidence include side effects (e.g., ticrynafen and benzbromarone users are prone to blood pressure disorders, hepatotoxicity, and nephrotoxicity), relatively recent approval (e.g., arhalofenate), an unfavorable mode of administration (e.g., pegloticase, administered by intravenous infusion), factors related to cost or the standard practices of the regional physician community, and market factors such as profitability under patents in force (e.g., azapropazone and benzbromarone). Third, the published RCTs were predominantly conducted in economically developed areas, where citizens are more susceptible to hyperuricemia [15, 24]. It is unclear whether the same results would be obtained in patients from developing countries. Fourth, treatment options are not equally available in different parts of the world, and it is unclear whether the same efficacy/AE profile would be obtained in a different ethnic group. For example, in the US, trials of ULTs are limited to febuxostat, allopurinol, probenecid, and pegloticase, while trials of topiroxostat can be found only in Japan. Finally, pharmaceutical companies supported most RCTs, and it is hard to completely rule out profit considerations in the design of the trials [83, 84].

To conclude, RCT evidence regarding the second-line agents and the XOIs launched after febuxostat is scarce and uneven across nations. We cannot overemphasize the need for more sophisticated reporting of adherence attrition AEs to allow cost-effectiveness analysis. Comparisons of efficacy, safety, and adherence attrition occurrence across the various ULTs led to the following conclusions: (i) febuxostat (≥ 40 mg/day) and the dual regimens (XOIs + uricosuric agents) were superior to the others in efficacy, but lesinurad-based dual regimens require further surveillance of their AE pattern when lesinurad is up-titrated; (ii) evaluation of long-established second-line agents (probenecid and pegloticase) remains insufficient; (iii) T. bellerica 500 mg/day, a novel natural fruit extract-based XOI, could be a cost-effective alternative given its superior efficacy to placebo and lower AE occurrence; (iv) topiroxostat ≥ 80 mg/day could be equivalent to febuxostat, although the evidence is largely dependent on a single nation (Japan); and (v) more evidence is required for arhalofenate.