Introduction

Dementia is a progressive neurodegenerative syndrome characterized by cognitive impairment, memory loss, and behavioral and psychological disturbances, and it places a great burden on both human health and the global economy. Over 46 million people were living with dementia in 2015, and this number is estimated to reach 131.5 million by 2050 [1]. Meanwhile, the total worldwide cost of dementia was estimated at 818 billion dollars and was projected to reach one trillion dollars by 2018 [1]. Behavioral and psychological symptoms of dementia (BPSD) are among the most prominent and distressing manifestations and greatly impair the quality of life of patients, families, and caregivers. BPSD covers a wide spectrum of symptoms including mood disorders, depression, agitation, psychosis, sleep disturbances, anxiety, apathy, dysphoria, aberrant motor activity, hallucinations, and delusions [2]. The prevalence and severity of BPSD vary, are associated with faster progression of dementia, and may eventually affect nearly all patients; BPSD often underlies the decision to institutionalize [3]. Given that the neuropathology and neurobiology of dementia remain unclear, no consensus on etiological treatment has been reached. Thus, alleviation of BPSD is the main medical intervention for improving the quality of life of patients and caregivers [4].

Although various pharmacological and non-pharmacological therapies targeting BPSD have been proposed and discussed, previous descriptive reviews neither provided a quantitative summary across all available interventions nor reached consistent conclusions. According to the literature, non-pharmacological interventions such as exercise, cognitive stimulated training, music therapy, light therapy, aromatherapy, and reminiscence therapy are generally recommended as first-line treatments, including in guidelines. Among pharmacological strategies, antipsychotics are often given priority in spite of their well-known adverse effects [5], as are antidepressants [6]. Unlike the drugs licensed for cognitive impairment, such as cholinesterase inhibitors (ChEIs) and memantine (an N-methyl-d-aspartate (NMDA) receptor antagonist), many “off-label” drugs lack convincing evidence, including some psychotropics, mood stabilizers, stimulants, anticonvulsants, and traditional medicines.

Because of the complexity of the broad range of interventions and the lack of head-to-head trials, the present data on this issue cannot be synthesized with the traditional pairwise meta-analysis method alone. Network meta-analysis (NMA) rises to this challenge, because it not only analyzes findings quantitatively but also evaluates direct and indirect evidence simultaneously. With NMA, comparative efficacy and safety can therefore be assessed and interpreted more precisely [7]. NMA has become a powerful and reliable method for exploring a broader set of evidence, and it is widely accepted and increasingly used [8].

Hence, we conducted this systematic review and hierarchical Bayesian NMA, including only randomized controlled trials (RCTs), to provide comparative evidence and quantitative hierarchies on the efficacy and safety of all available therapies for patients with BPSD.

Methods

We performed a series of NMAs using a Bayesian model, in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) extension statement for reporting systematic reviews incorporating NMAs of health care interventions.

Eligibility criteria

Participants

Participants in our NMAs were all diagnosed with dementia of various types, including Alzheimer’s disease (AD), vascular dementia (VD), dementia with Lewy bodies (DLB), Parkinson’s disease dementia (PDD), mixed dementia, etc. The diagnosis of dementia was defined by the study authors according to the corresponding diagnostic criteria, such as the Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV) [9] for dementia, the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer’s Disease and Related Disorders Association (NINCDS–ADRDA) criteria for AD [10], and the consensus guidelines for the clinical and pathologic diagnosis of DLB [11].

Treatments

Comprehensive therapies, including pharmacological and non-pharmacological treatments of BPSD, were considered for our NMAs when assessing efficacy and safety. To fully explore the potential treatments for BPSD, we included all available interventions, although some of them may be “off-label”. Accordingly, we searched several classes of pharmacological and non-pharmacological therapies including antipsychotics, antidepressants, cognitive enhancers, benzodiazepines, anticonvulsants, reminiscence therapy, validation therapy, aromatherapy, and exercise. The specific pharmacological and non-pharmacological interventions we searched are listed in Table 1.

Table 1 Searching keywords of pharmacological and non-pharmacological therapies

Comparators

Placebo, usual care or therapy, and any other corresponding pharmacological or non-pharmacological interventions were eligible as comparators in our NMAs.

Outcomes

After investigating the availability and validity of the scales used to evaluate the efficacy of treatments for BPSD, we chose the Neuropsychiatric Inventory (NPI) [12] and the Cohen–Mansfield Agitation Inventory (CMAI) [13, 14] to appraise the efficacy of the included interventions. Among all adverse events (AEs), we selected the risk of total AEs, diarrhea, dizziness, falls, headaches, nausea, vomiting, and cerebrovascular diseases (CVDs) as the primary safety outcomes because they occurred most frequently. The data we extracted were the results for the intent-to-treat population with the last observation carried forward method, unless these were unavailable.

Information source and literature search

The systematic electronic literature search was performed in PubMed, EMBASE, the Cochrane Library, and CINAHL, and covered English-language articles published as full text from inception until 1 August 2018. We retrieved unpublished studies via conference proceedings, clinical trial registries, and author contact. Reference lists of included studies and related systematic reviews were also scanned to identify additional studies for inclusion in our NMAs.

Study selection

Our NMAs included only RCTs that appraised the efficacy or safety of any pharmacological or non-pharmacological intervention in the treatment of patients with any type of dementia. Observational studies (prospective and retrospective), single-arm noncomparative studies, review articles, nonhuman studies, and studies with an ineligible comparator were excluded.

Data extraction and quality

According to the eligibility criteria described above, two of us independently screened and evaluated the articles. Disagreements were first discussed, and a third reviewer made the final decision when any controversy remained. In addition, the baseline characteristics of each RCT, such as the age and sex distribution of patients, the design and sample size of trials, the name and dosage of drugs, efficacy outcomes, and occurrences of AEs, were collected and analyzed to avoid bias. Overall, data extraction was carried out rigorously, and no discrepancy was left unresolved.

Statistical analysis

Our systematic review and NMAs were conducted across all types of dementia to derive the overall efficacy and safety of comprehensive therapies for BPSD, following the PRISMA extension statement for reporting systematic reviews and NMAs of health care interventions [15]. First, we summarized and analyzed the baseline characteristics and outcomes of the included studies and patients. Odds ratios (ORs) for dichotomous outcomes and mean differences (MDs) for continuous outcomes, each with 95% credible intervals (CrIs), were used as effect measures.
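The two effect measures named above can be illustrated with a minimal sketch for a single two-arm trial. It uses a simple normal-approximation (frequentist) interval rather than the Bayesian credible intervals reported in this paper, and the function names and the example numbers are hypothetical, chosen only for illustration.

```python
import math


def odds_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Odds ratio of arm A versus arm B with a normal-approximation 95% interval."""
    a, b = events_a, total_a - events_a
    c, d = events_b, total_b - events_b
    if min(a, b, c, d) == 0:
        # Continuity correction: add 0.5 to every cell to avoid division by zero.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


def mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z=1.96):
    """Mean difference (A minus B) with a normal-approximation 95% interval."""
    md = mean_a - mean_b
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return md, md - z * se, md + z * se


# Hypothetical trial: 30/100 patients with an adverse event on drug, 20/100 on placebo.
print(odds_ratio(30, 100, 20, 100))
# Hypothetical change in a symptom score (lower is better): drug -5.0, placebo -2.0.
print(mean_difference(-5.0, 10.0, 50, -2.0, 10.0, 50))
```

An OR above 1 for an adverse event indicates higher risk in arm A, and a negative MD on a symptom scale where lower scores are better indicates greater improvement in arm A.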

We conducted two types of meta-analyses. First, we conducted traditional pairwise meta-analyses using a random-effects model, through which heterogeneity and publication bias among the trials were examined before the NMAs. In these analyses, heterogeneity was evaluated with the I² statistic and publication bias was judged using funnel plots; all steps were performed in RevMan version 5.3. Second, NMAs were performed to obtain estimates for all outcomes of interest. The random-effects model was adopted because it is the most appropriate methodology when between-study heterogeneity is considered [16, 17]. Within the Bayesian hierarchical model framework, Markov chain Monte Carlo estimation was applied with four chains. The models were run for 100,000 iterations, with the first 20,000 iterations discarded and a thinning interval of 10. Convergence was checked by visual inspection of the mixing of chains in the iteration plots and by the potential scale reduction factor [18]. Network diagrams were constructed using GEMTC and JAGS in R version 3.0.3, and the ranks of the efficacy and safety of therapies were expressed as surface under the cumulative ranking (SUCRA) probabilities displayed in rank plots [19].
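As a rough illustration of the pairwise step, the sketch below pools study-level effects with a random-effects model and computes the I² statistic. It assumes the DerSimonian–Laird estimator of the between-study variance, which is a common default but is not confirmed by the text as the exact method used here; the function name and input values are hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooling with Cochran's Q and I^2 (in percent)."""
    k = len(effects)
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {"pooled": pooled, "se": se, "tau2": tau2, "q": q, "i2": i2}


# Hypothetical mean differences from three trials, each with variance 1.0.
result = random_effects_pool([-2.0, -4.0, -1.0], [1.0, 1.0, 1.0])
print(result)
```

I² expresses the share of total variability that is attributable to between-study heterogeneity rather than chance, so values near 0% suggest that the trials estimate a similar effect.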

Finally, the comparative efficacy and safety of comprehensive therapies for BPSD were, for the first time, quantitatively analyzed through our NMAs. The modelled outcomes combined direct and indirect evidence to specify the relations and comparisons among all the available trials, which is the key strength of the NMA method.
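The SUCRA values used for the hierarchies can be derived from posterior draws of the treatment effects. The following minimal sketch assumes that each draw holds one effect per treatment relative to a common reference and that lower values are better; the draws, treatment labels, and function names are hypothetical and are not taken from the fitted models.

```python
def rank_probabilities(draws, lower_is_better=True):
    """Share of posterior draws in which each treatment takes each rank (index 0 = best)."""
    treatments = list(draws[0])
    n = len(treatments)
    counts = {t: [0] * n for t in treatments}
    for draw in draws:
        ordered = sorted(treatments, key=lambda t: draw[t], reverse=not lower_is_better)
        for rank, t in enumerate(ordered):
            counts[t][rank] += 1
    return {t: [c / len(draws) for c in counts[t]] for t in treatments}


def sucra(rank_probs):
    """Mean of the cumulative rank probabilities over the first (a - 1) ranks."""
    cumulative = 0.0
    total = 0.0
    for p in rank_probs[:-1]:
        cumulative += p
        total += cumulative
    return total / (len(rank_probs) - 1)


# Four hypothetical posterior draws of mean differences versus a reference.
draws = [
    {"A": -3.0, "B": -1.0, "PLA": 0.0},
    {"A": -2.0, "B": -2.5, "PLA": 0.0},
    {"A": -4.0, "B": -1.0, "PLA": 0.5},
    {"A": -1.0, "B": 0.2, "PLA": 0.0},
]
probs = rank_probabilities(draws)
for name in probs:
    print(name, sucra(probs[name]))  # A 0.875, B 0.5, PLA 0.125
```

A SUCRA of 1 means a treatment is certain to be the best and 0 means it is certain to be the worst, which is why the values can be read directly as a hierarchy.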

Throughout, we carefully considered and assumed the transitivity of the NMA, which means that we can learn about treatment A versus treatment B via treatment C [20]. This assumption was adopted after reviewing all data on study and participant characteristics and examining potential effect modifiers such as age, timing of exposure, and risk of bias. A common within-network between-study variance (τ²) across therapies was assumed, because with the large number of treatment comparisons some comparisons may have too few studies to estimate a separate variance. The design-by-treatment interaction model was applied to examine the consistency of the NMAs. In addition, if inconsistency was detected without an identifiable source, additional analyses would be performed, such as sensitivity analysis (SA) on baseline characteristics including study design, dose, and imputation, as well as subgroup analysis (SCA) on influential differences and meta-regression on duration.
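The idea of learning about A versus B via C can be sketched with the simple adjusted indirect comparison (Bucher method). This is a simplified stand-in and not the design-by-treatment interaction model applied in this study; the function names and all numbers are hypothetical.

```python
import math


def indirect_estimate(d_ac, se_ac, d_bc, se_bc):
    """Indirect A-versus-B effect through the common comparator C."""
    return d_ac - d_bc, math.sqrt(se_ac ** 2 + se_bc ** 2)


def combine(direct, se_direct, indirect, se_indirect):
    """Inverse-variance pooling of direct and indirect evidence."""
    w_d, w_i = 1 / se_direct ** 2, 1 / se_indirect ** 2
    pooled = (w_d * direct + w_i * indirect) / (w_d + w_i)
    return pooled, math.sqrt(1 / (w_d + w_i))


def inconsistency_z(direct, se_direct, indirect, se_indirect):
    """z statistic for the difference between direct and indirect estimates."""
    return (direct - indirect) / math.sqrt(se_direct ** 2 + se_indirect ** 2)


# Hypothetical mean differences: A vs C = -3.0 (SE 1.5), B vs C = -1.0 (SE 2.0).
ind, se_ind = indirect_estimate(-3.0, 1.5, -1.0, 2.0)
# Hypothetical head-to-head result for A vs B: -2.5 (SE 2.5).
print(ind, se_ind)
print(combine(-2.5, 2.5, ind, se_ind))
print(inconsistency_z(-2.5, 2.5, ind, se_ind))
```

The indirect estimate is only trustworthy when transitivity holds, that is, when the A-versus-C and B-versus-C trials are similar in their effect modifiers; a large absolute z value would signal disagreement between direct and indirect evidence.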

In the following sections, if the results of the NMAs for efficacy were statistically significant, the favored treatment is described as “superior”, meaning better and more beneficial. If the results of the NMAs for safety were statistically significant, the treatment is also described as “superior”, but this indicates that it was worse and associated with a higher risk of adverse events. The opposite is described as “inferior”. All the NMAs below converged adequately (potential scale reduction factor = 1.00–1.01), and the derived hierarchies (relative ranks) are described from the best efficacy to the worst and from the best tolerated to the least.
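The potential scale reduction factor quoted above compares the variance between chains with the variance within chains. The sketch below implements the basic Gelman–Rubin formula on simulated chains; the simulated data, seed, and helper names are hypothetical, and software such as JAGS front ends may report a refined variant of this statistic.

```python
import math
import random


def psrf(chains):
    """Basic Gelman-Rubin potential scale reduction factor for equal-length chains."""
    m = len(chains)
    n = len(chains[0])
    chain_means = [sum(chain) / n for chain in chains]
    grand_mean = sum(chain_means) / m
    # Between-chain variance (scaled by chain length) and mean within-chain variance.
    between = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in chain_means)
    within = sum(
        sum((x - mu) ** 2 for x in chain) / (n - 1)
        for chain, mu in zip(chains, chain_means)
    ) / m
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


def simulate_chain(rng, center, n_draws):
    """Hypothetical stand-in for MCMC output: independent normal draws."""
    return [rng.gauss(center, 1.0) for _ in range(n_draws)]


rng = random.Random(0)
mixed = [simulate_chain(rng, -3.0, 2000) for _ in range(4)]
stuck = mixed[:3] + [simulate_chain(rng, 0.0, 2000)]
print(round(psrf(mixed), 3), round(psrf(stuck), 3))
```

Values close to 1, such as the 1.00–1.01 reported here, indicate that the four chains are sampling the same distribution, whereas a chain stuck in a different region pushes the factor well above 1.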

Data availability statement

All data are fully available without restriction.

Results

Literature search and description of studies

The electronic literature search yielded 85,081 potentially relevant articles: 32,747 from PubMed, 37,841 from EMBASE, 5888 from CINAHL, and 8605 from the Cochrane Library. After removal of duplicates and screening of titles, 8185 abstracts were reviewed. We then excluded 5070 articles that did not meet the inclusion criteria. The remaining 3115 studies underwent full-text review, which ultimately yielded 146 RCTs, 133 on pharmacological interventions and 13 on non-pharmacological interventions. A summary of the literature search is presented in Fig. 1a, and the weighted network is shown in Fig. 2a.

Fig. 1

a Flowchart of literature review. b Quality assessment of studies included. Each methodological quality item is presented as percentages across included studies

Fig. 2

Network diagram of the efficacy and safety of therapies for behavioral and psychological symptoms of dementia. a General network diagram; b Neuropsychiatric Inventory (NPI); c Cohen–Mansfield Agitation Inventory (CMAI); d total adverse events; e diarrhea; f dizziness; g falls; h headache; i nausea; j vomiting. The nodes are linked by a line when the treatments were directly compared. The size of the nodes (blue circles) corresponds to the number of patients who received the particular treatment, and the width of the lines is proportional to the number of trials comparing the treatments they connect. ARI aripiprazole, ARO aromatherapy, CIT escitalopram, CST cognitive stimulated training, DIS discontinuation of antipsychotics, DON donepezil, EXE exercise, GAL galantamine, HAL haloperidol, LIG light therapy, MEM memantine, MET methylphenidate, OLA olanzapine, PLA placebo, QUE quetiapine, REM reminiscence therapy, RIS risperidone, RIV rivastigmine, RIV P rivastigmine patch, SER sertraline, TRA trazodone, VAL valproate, YOK Yokukansan

All studies included in our NMAs were designed as randomized controlled trials (RCTs). Their characteristics can be summarized as follows: publication dates ranged from 1998 to 2016, the number of participants from 12 to 2048, the mean age from 61 to 90.1 years, and the percentage of female participants from 16 to 100%. The total sample size was 44,873, and the duration ranged from 2 weeks to 12 months, with one exception of 104 weeks. Full details and citations are provided in ESM Appendix 1.

Risk of bias

Two of us independently assessed the risk of bias and quality of the studies, rigorously following the Cochrane risk of bias tool for each individual study. Specifically, the criteria were adequate sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessment, incomplete outcome data, selective reporting, and other sources of bias. In general, the studies included in our NMAs showed an acceptably low risk of bias across the items scored (Fig. 1b). Allocation concealment was difficult to ascertain for most studies, which could introduce potential selection bias.

Efficacy

The NMA on NPI was performed to evaluate the efficacy of 18 therapies (aripiprazole, escitalopram, donepezil, galantamine, memantine, rivastigmine, rivastigmine patch, haloperidol, methylphenidate, olanzapine, risperidone, quetiapine, valproate, Yokukansan, discontinuation of antipsychotics, cognitive stimulated training, exercise, and reminiscence therapy) based on 82 RCTs with 21,224 patients; the network is shown in Fig. 2b. Six treatments were significantly superior to placebo: aripiprazole (MD −3.65, 95% credible interval (CrI) −6.92 to −0.42), escitalopram (MD −6.79, 95% CrI −12.91 to −0.60), donepezil (MD −1.45, 95% CrI −2.70 to −0.20), galantamine (MD −1.80, 95% CrI −3.29 to −0.32), memantine (MD −2.14, 95% CrI −3.46 to −0.78), and risperidone (MD −3.20, 95% CrI −6.08 to −0.31) (Figure 1A in ESM Appendix 2, Fig. 3a). In addition, valproate was significantly inferior to aripiprazole (MD −5.03, 95% CrI −9.37 to −0.68), escitalopram (MD −8.17, 95% CrI −14.91 to −1.19), haloperidol (MD −4.81, 95% CrI −9.82 to −0.13), memantine (MD −3.50, 95% CrI −6.79 to −0.31), and risperidone (MD −4.58, 95% CrI −8.62 to −0.50) (Figure 1A in ESM Appendix 2, Fig. 3a). The derived hierarchy showed that the therapies other than valproate and reminiscence therapy may have better efficacy than placebo, and that the discontinuation of antipsychotics might be harmful. The detailed ranking from the best efficacy to the worst is shown in Figure 2A in ESM Appendix 3.

Fig. 3

Forest plots of network meta-analyses demonstrating the benefits and risks of interventions against placebo. a Neuropsychiatric Inventory (NPI); b Cohen–Mansfield Agitation Inventory (CMAI); c total adverse events; d diarrhea; e dizziness; f falls; g headache; h nausea; i vomiting. ARO aromatherapy, CIT escitalopram, CST cognitive stimulated training, DIS discontinuation of antipsychotics, DON donepezil, EXE exercise, GAL galantamine, HAL haloperidol, LIG light therapy, MEM memantine, MET methylphenidate, OLA olanzapine, PLA placebo, QUE quetiapine, REM reminiscence therapy, RIS risperidone, RIV rivastigmine, RIV P rivastigmine patch, SER sertraline, TRA trazodone, VAL valproate, YOK Yokukansan, CrI credible interval

The NMA on CMAI was conducted to estimate the efficacy of 14 treatments (aripiprazole, escitalopram, galantamine, rivastigmine, haloperidol, olanzapine, risperidone, quetiapine, valproate, trazodone, sertraline, exercise, light therapy, and aromatherapy) based on 31 RCTs with 4541 patients; the network is shown in Fig. 2c. Aripiprazole (MD −4.00, 95% CrI −7.39 to −0.54) and risperidone (MD −2.58, 95% CrI −5.20 to −0.6) were superior to placebo (Figure 1B in ESM Appendix 2, Fig. 3b). The derived hierarchy indicated that all treatments except rivastigmine, trazodone, and quetiapine may be more efficacious than placebo; the detailed ranking from the best efficacy to the worst is shown in Figure 2B in ESM Appendix 3.

Safety

Total AEs

The NMA on total AEs included 11 treatments (escitalopram, donepezil, galantamine, haloperidol, memantine, olanzapine, quetiapine, risperidone, rivastigmine, rivastigmine patch, and Yokukansan) based on 62 RCTs with 27,061 patients; the network is shown in Fig. 2d. Donepezil (OR 1.27, 95% CrI 1.07–1.50), galantamine (OR 1.91, 95% CrI 1.58–2.36), risperidone (OR 1.48, 95% CrI 1.13–1.97), and rivastigmine (OR 2.02, 95% CrI 1.53–2.70) carried a higher risk than placebo (Figure 1C in ESM Appendix 2, Fig. 3c). Quetiapine was safer than donepezil (OR 1.69, 95% CrI 1.09–2.57), galantamine (OR 2.54, 95% CrI 1.64–3.93), haloperidol (OR 2.06, 95% CrI 1.13–3.77), risperidone (OR 1.97, 95% CrI 1.26–3.10), and rivastigmine (OR 2.69, 95% CrI 1.65–4.32) (Figure 1C in ESM Appendix 2, Fig. 3c). There was no statistical evidence that the remaining seven therapies were more harmful than placebo. In addition, Yokukansan and quetiapine may be safer than placebo according to the derived hierarchy, which is shown from the best tolerated to the least in Figure 2C in ESM Appendix 3.

Diarrhea

The NMA on diarrhea included 12 treatments (aripiprazole, escitalopram, donepezil, galantamine, haloperidol, memantine, olanzapine, risperidone, rivastigmine, rivastigmine patch, sertraline, and valproate) based on 57 RCTs with 26,123 patients; the network is shown in Fig. 2e. Six treatments were less well tolerated than placebo: escitalopram (OR 2.53, 95% CrI 1.11–5.84), donepezil (OR 1.89, 95% CrI 1.50–2.33), galantamine (OR 1.34, 95% CrI 1.08–1.63), rivastigmine (OR 2.30, 95% CrI 1.76–2.95), sertraline (OR 2.84, 95% CrI 1.28–6.67), and valproate (OR 3.17, 95% CrI 1.60–6.55) (Figure 1D in ESM Appendix 2, Fig. 3d). Surprisingly, risperidone was safer than placebo (OR 5.21, 95% CrI 1.92–15.05) and all the other therapies except haloperidol (Figure 1D in ESM Appendix 2, Fig. 3d). More details of comparative safety are displayed in Figure 1D in ESM Appendix 2 and Fig. 3d. The derived hierarchy suggested that haloperidol and memantine may be safer than placebo; the detailed ranking from the best tolerated to the least is shown in Figure 2D in ESM Appendix 3.

Dizziness

The NMA on dizziness included 10 treatments (donepezil, galantamine, memantine, methylphenidate, olanzapine, quetiapine, risperidone, rivastigmine, rivastigmine patch, and sertraline) based on 52 RCTs with 21,564 patients; the network is shown in Fig. 2f. Five treatments were “superior” to placebo, meaning more harmful: galantamine (OR 1.58, 95% CrI 1.15–2.20), memantine (OR 1.65, 95% CrI 1.14–2.50), methylphenidate (OR 14.72, 95% CrI 4.52–49.52), rivastigmine (OR 1.94, 95% CrI 1.34–2.81), and sertraline (OR 4.03, 95% CrI 1.55–11.24) (Figure 1E in ESM Appendix 2, Fig. 3e). No treatment was safer than placebo. Methylphenidate differed significantly from all others except sertraline, indicating a higher risk. In addition, sertraline was “superior” to donepezil, olanzapine, placebo, and risperidone, which means that it was more harmful. More data on comparative safety are displayed in Figure 1E (ESM Appendix 2) and Fig. 3e. The derived hierarchy suggested that only quetiapine had the possibility of being safer than placebo; the detailed ranking from the best tolerated to the least is shown in Figure 2E in ESM Appendix 3.

Falls

The NMA on falls included 13 treatments (escitalopram, donepezil, galantamine, haloperidol, memantine, olanzapine, quetiapine, risperidone, rivastigmine, rivastigmine patch, sertraline, valproate, and Yokukansan) based on 44 RCTs with 14,016 patients; the network is shown in Fig. 2g. Only rivastigmine (OR 1.57, 95% CrI 1.04–2.46) and rivastigmine patch (OR 3.26, 95% CrI 1.63–6.19) were less well tolerated than placebo (Figure 1F in ESM Appendix 2, Fig. 3f). Rivastigmine differed significantly from donepezil (OR 0.59, 95% CrI 0.39–0.90), galantamine (OR 0.53, 95% CrI 0.32–0.87), and quetiapine (OR 0.48, 95% CrI 0.29–0.84), as did rivastigmine patch from donepezil (OR 0.29, 95% CrI 0.14–0.57), galantamine (OR 0.25, 95% CrI 0.13–0.53), haloperidol (OR 0.27, 95% CrI 0.13–0.62), memantine (OR 0.31, 95% CrI 0.16–0.64), quetiapine (OR 0.24, 95% CrI 0.11–0.52), risperidone (OR 0.29, 95% CrI 0.13–0.63), and rivastigmine (OR 0.49, 95% CrI 0.29–0.83), which indicated that rivastigmine and rivastigmine patch were considerably more harmful than these other therapies and placebo (Figure 1F in ESM Appendix 2, Fig. 3f). The derived hierarchy indicated that Yokukansan, galantamine, quetiapine, and donepezil may carry a lower risk than placebo. The detailed ranking from the best tolerated to the least is shown in Figure 2F in ESM Appendix 3.
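Because some comparisons in the safety results are reported with ORs above 1 and others with ORs below 1, it can help to re-express an OR in the opposite direction. The minimal sketch below inverts an OR and its interval; the helper name is hypothetical, and which treatment forms the numerator of a published OR is an assumption that should be checked against the source tables.

```python
def invert_or(odds_ratio, lower, upper):
    """Re-express an odds ratio and its interval with the comparison direction reversed."""
    # Taking reciprocals swaps the interval bounds.
    return 1 / odds_ratio, 1 / upper, 1 / lower


# Example with the rounded values reported for rivastigmine and donepezil on falls.
flipped = invert_or(0.59, 0.39, 0.90)
print([round(x, 2) for x in flipped])  # [1.69, 1.11, 2.56]
```

Since the published values are already rounded, the inverted numbers are approximations rather than exact model output.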

Headache

The NMA on headache included 12 treatments (aripiprazole, escitalopram, donepezil, galantamine, memantine, methylphenidate, olanzapine, quetiapine, risperidone, rivastigmine, sertraline, and valproate) based on 46 RCTs with 19,273 patients; the network is shown in Fig. 2h. Only rivastigmine (OR 1.97, 95% CrI 1.25–3.13) carried a higher risk than placebo (Figure 1G in ESM Appendix 2, Fig. 3g). The other treatments did not differ significantly from each other or from placebo (Figure 1G in ESM Appendix 2, Fig. 3g). The derived hierarchy indicated that memantine and quetiapine may be safer than placebo. The detailed ranking from the best tolerated to the least is shown in Figure 2G in ESM Appendix 3.

Nausea

The NMA on nausea included 12 treatments (escitalopram, donepezil, galantamine, haloperidol, memantine, methylphenidate, olanzapine, quetiapine, risperidone, rivastigmine, rivastigmine patch, and sertraline) based on 63 RCTs with 28,145 patients; the network is shown in Fig. 2i. Donepezil (OR 2.06, 95% CrI 1.51–2.80), galantamine (OR 2.60, 95% CrI 1.89–3.56), and rivastigmine (OR 5.49, 95% CrI 3.65–8.25) were more harmful than placebo. Donepezil (OR 3.42, 95% CrI 1.55–7.84), galantamine (OR 4.32, 95% CrI 1.94–10.02), and rivastigmine (OR 14.68, 95% CrI 2.36–108.59) displayed more risk than memantine, whereas the difference for haloperidol (OR 4.83, 95% CrI 0.95–25.22) did not reach statistical significance. Galantamine (OR 6.97, 95% CrI 1.14–50.15) showed more risk than methylphenidate (Figure 1H in ESM Appendix 2, Fig. 3h). Furthermore, rivastigmine appeared to carry more risk than donepezil (OR 0.37, 95% CrI 0.24–0.60), memantine (OR 0.11, 95% CrI 0.05–0.25), and methylphenidate (OR 0.07, 95% CrI 0.01–0.42) (Figure 1H in ESM Appendix 2, Fig. 3h). The derived hierarchy indicated that memantine and methylphenidate may be better tolerated than placebo. The detailed ranking from the best tolerated to the least is shown in Figure 2H in ESM Appendix 3.

Vomiting

The NMA on vomiting included 12 treatments (aripiprazole, donepezil, galantamine, haloperidol, memantine, olanzapine, quetiapine, risperidone, rivastigmine, rivastigmine patch, sertraline, and valproate) based on 52 RCTs with 22,977 patients; the network is shown in Fig. 2j. Donepezil (OR 1.96, 95% CrI 1.35–2.92), galantamine (OR 3.23, 95% CrI 2.31–4.55), and rivastigmine (OR 6.94, 95% CrI 4.59–10.44) were “superior” to placebo, indicating a higher risk (Figure 1I in ESM Appendix 2, Fig. 3i). In addition, rivastigmine differed significantly from other treatments including aripiprazole (OR 0.23, 95% CrI 0.08–0.65), donepezil (OR 0.28, 95% CrI 0.17–0.48), galantamine (OR 0.47, 95% CrI 0.28–0.79), haloperidol (OR 0.17, 95% CrI 0.04–0.79), memantine (OR 0.12, 95% CrI 0.05–0.30), and risperidone (OR 0.16, 95% CrI 0.04–0.64), which means that rivastigmine may be less well tolerated and less safe (Figure 1I in ESM Appendix 2, Fig. 3i). The derived hierarchy indicated that memantine and olanzapine may be safer than placebo. The detailed ranking from the best tolerated to the least is shown in Figure 2I in ESM Appendix 3.

Cerebrovascular diseases

The NMA on cerebrovascular diseases included 9 treatments (donepezil, haloperidol, memantine, olanzapine, quetiapine, risperidone, rivastigmine, rivastigmine patch, and sertraline) based on 27 RCTs with 4352 patients; the network is shown in Fig. 2k. Olanzapine (OR 4.06, 95% CrI 1.25–15.43) and risperidone (OR 3.94, 95% CrI 1.85–10.73) were “superior” to placebo, indicating a higher risk of cerebrovascular diseases (Figure 1J in ESM Appendix 2, Fig. 3j). The derived hierarchy indicated that rivastigmine patch, sertraline, quetiapine, memantine, and donepezil may be safer than placebo. The detailed ranking from the best tolerated to the least is shown in Figure 2J in ESM Appendix 3.

Discussion

Given the high prevalence of BPSD, the efficacy and safety of all potential interventions, both pharmacological and non-pharmacological, are an urgent concern. Overall, our NMAs were based on 146 RCTs involving 44,873 patients with an acceptable risk of bias. We searched all available treatments for BPSD (Table 1) and finally collected evidence on pharmacological treatments comprising antipsychotics (aripiprazole, haloperidol, olanzapine, quetiapine, and risperidone), antidepressants (escitalopram, sertraline, and trazodone), ChEIs (donepezil, galantamine, and rivastigmine), an NMDA receptor antagonist (memantine), a stimulant (methylphenidate), and a traditional medicine (Yokukansan), as well as non-pharmacological treatments comprising reminiscence therapy, light therapy, aromatherapy, exercise, and cognitive stimulated training. The effects of discontinuation of antipsychotics and the differences between oral rivastigmine and rivastigmine patch were also assessed. The objective of this study was to determine quantitatively whether all 22 treatments actually benefited patients with BPSD, whether the discontinuation of antipsychotics did significant harm to patients, and whether the efficacy of rivastigmine differed between dosage forms. More importantly, the comparative efficacy and safety of the interventions were synthesized and interpreted through the rank probabilities and derived hierarchies.

First, our NMA on NPI showed that six therapies (aripiprazole, escitalopram, donepezil, galantamine, memantine, and risperidone) were significantly superior to placebo, which means that they would provide greater improvement in BPSD. In contrast, valproate was markedly inferior to several interventions including aripiprazole, escitalopram, haloperidol, memantine, and risperidone, which indicated that it may be much less efficacious. In the derived hierarchy, all 18 included treatments except valproate and reminiscence therapy ranked better than placebo, suggesting possible benefits. Although escitalopram ranked best, more data are needed to confirm its position in the hierarchy, because the inclusion of only two RCTs with 217 patients may give rise to lower precision in the estimate of its relative efficacy. The rank probabilities supported probable benefits of non-pharmacological therapies, whereas they should in fact be the second recommended choice behind pharmacological interventions or serve only as adjunctive treatment. Antipsychotics performed quite well, four of which demonstrated significant differences (aripiprazole, haloperidol, quetiapine, and risperidone). In addition, three of the extensively used cognitive enhancers (donepezil, galantamine, and memantine) produced greater improvement than placebo, whereas, to our surprise, rivastigmine turned out to be less efficacious; sufficient data were available for these drugs, which gives considerable confidence in this efficacy profile. Furthermore, the discontinuation of antipsychotics caused no significant harm, and there was no statistical difference between rivastigmine and rivastigmine patch. Second, the NMA on CMAI showed that aripiprazole and risperidone were superior to placebo, although most of the therapies had the possibility of being more beneficial than placebo.
Here, we urge caution in interpreting the differences between the results for NPI and CMAI. On the one hand, studies of some drugs were lacking; for example, we found no RCTs of escitalopram, donepezil, or memantine reporting CMAI. On the other hand, NPI and CMAI may differ in their ability to detect behavioral and psychological symptoms, owing to the intrinsic characteristics of each tool and/or the heterogeneity among patients and studies.

Meanwhile, for the assessment of safety, we selected the risk of total AEs, diarrhea, dizziness, headache, nausea, falls, vomiting, and cerebrovascular diseases, given that these were the most common and important adverse events and may cause substantial numbers of participants to drop out. First, the outcomes for total AEs revealed that donepezil, galantamine, rivastigmine, and risperidone increased the risk of adverse events. Quetiapine displayed a significantly lower risk than several active treatments and, together with Yokukansan, ranked safer than placebo. Unlike rivastigmine patch, the oral form of rivastigmine had the highest probability of causing adverse events. For diarrhea, escitalopram, donepezil, galantamine, rivastigmine, sertraline, and valproate all increased the risk, whereas risperidone was safer than all others except haloperidol, which also ranked quite well. In the NMA on dizziness, galantamine, memantine, methylphenidate, rivastigmine, and sertraline differed significantly from placebo, which means that they carried a higher risk. In addition, methylphenidate carried a markedly higher risk than all other included treatments except sertraline. Only rivastigmine (patch or oral) was found to have a significantly higher incidence of falls than most other treatments, and the oral form was more harmful than placebo for headache. Furthermore, donepezil, galantamine, and rivastigmine increased the risk of both nausea and vomiting. According to the derived hierarchies, Yokukansan and quetiapine may be the safest for total AEs and falls, ranking better than placebo, as did quetiapine for dizziness and headache. Memantine ranked safer than placebo for headache and vomiting, and haloperidol ranked better than placebo for diarrhea.
Cerebrovascular disease is a major concern with pharmacological interventions for BPSD. We found that olanzapine and risperidone carried a significantly higher risk than placebo, and that rivastigmine and quetiapine may carry a higher risk than placebo according to the derived hierarchies of probabilities. Although the number of RCTs and patients reporting CVDs is limited, our results suggest that pharmacological interventions may carry a higher risk of CVDs than placebo. In summary, some treatments were statistically more likely than placebo to cause AEs, but overall safety was acceptable judging by the corresponding ORs. More RCTs are needed to provide evidence on the precise risk of CVDs.

According to our literature review, no previous meta-analysis has addressed the full range of treatments for BPSD, let alone compared so many interventions using NMA; only a few traditional descriptive reviews have offered experience-based conclusions, some of which were not accurate or evidence-based. Non-pharmacological approaches, including music therapy [21], light therapy [22], aromatherapy [23, 24], and exercise [25, 26], do ameliorate BPSD. However, non-pharmacological methods have been overrated as useful and potentially cost-effective and have even been recommended as the first choice [27, 28], while pharmacological treatment has accordingly been underestimated. Specifically, the efficacy and safety of memantine and cholinesterase inhibitors have long been controversial, and stronger evidence has been called for to establish their benefits [29]. Earlier studies argued that antipsychotics may carry greater risks than benefits in the treatment of BPSD and should therefore be used with great caution [30, 31], a conclusion that this analysis found to be misleading. Contrary to the older view that antidepressants have limited benefit for BPSD, more recent studies have reported that they are both efficacious and well tolerated [32], which is consistent with our results. In addition, owing to the lack of available data, some “off-label” medications such as stimulants, mood stabilizers, benzodiazepines, anticonvulsants, and traditional medicines could not be assessed on solid evidence.

Several distinctive findings emerged from the present analysis. The conclusions on efficacy, from the general to the specific, are as follows. First, pharmacological treatments mostly provided more benefit than non-pharmacological treatments. Second, antipsychotics and cognitive enhancers were significantly superior to placebo. Third, among the antipsychotics, aripiprazole, haloperidol, quetiapine, and risperidone were markedly superior to placebo in efficacy and ought to be the first recommended choice for BPSD. Fourth, memantine, donepezil, and galantamine may offer modest efficacy, and this judgement is the most convincing given their extensive use and the substantial data available. Fifth, contrary to previous findings, rivastigmine did not produce a statistically significant improvement in BPSD. Sixth, escitalopram showed great benefit and ranked best in the NMA for NPI, although the limited data leave some uncertainty about its efficacy. Seventh, risperidone was far more efficacious than placebo on CMAI, which suggests that it may be particularly effective in treating agitation. Eighth, discontinuation of antipsychotics showed no significant difference from continuation, so concern about harm from discontinuation appears unwarranted.

The findings on safety are as follows. First, donepezil, galantamine, rivastigmine, and risperidone could induce a higher incidence of adverse events. Second, treatments substantially increased the occurrence of diarrhea and dizziness, whereas other AEs such as falls, headache, nausea, and vomiting showed little excess harm compared with placebo. Third, based on a large amount of data, donepezil, galantamine, and rivastigmine were significantly less safe than placebo, whereas memantine was relatively safe. Fourth, as expected, the rivastigmine patch was better tolerated and caused fewer adverse events than the oral form. Fifth, quetiapine, memantine, and Yokukansan may be the best tolerated according to the rank probabilities. Sixth, olanzapine and risperidone may carry a higher risk of CVDs than placebo, and some other pharmacological interventions did not show satisfactory safety compared with placebo. In brief, the statistical differences between therapies and placebo warn of a higher risk with active interventions, although overall safety was acceptable.

This study has several strengths. First, because only descriptive reviews based on experience or expert consensus were previously available, this study is, to our knowledge, the first attempt to synthesize quantitatively the efficacy and safety of therapies for BPSD through NMA, and it comprehensively summarizes all the available data, whether pharmacological or non-pharmacological. Second, NMA allowed us not only to pool the head-to-head studies but also to make indirect comparisons, yielding the comparative evidence on efficacy and safety displayed in the derived hierarchies. Third, to strengthen the evidence base of our conclusions, we included only RCTs that strictly met the inclusion and exclusion criteria, which provide the highest-quality evidence. Fourth, we conducted our NMA in accordance with the PRISMA extension statement for reporting systematic reviews and NMAs of health care interventions to ensure scientific rigor. Fifth, our conclusions rest on a larger number of patients and RCTs than previous knowledge syntheses, which supports their precision and credibility.
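As a rough illustration of what an indirect comparison involves, the sketch below applies the adjusted indirect comparison (Bucher) approach to two treatments that were each compared with placebo. It is not the model used in this analysis, which is a full NMA; the function names are hypothetical and the input odds ratios and standard errors are invented for illustration only, not taken from our results.

```python
import math

def indirect_log_or(log_or_ac, se_ac, log_or_bc, se_bc):
    """Adjusted indirect comparison of A vs B through a common comparator C.

    The indirect log odds ratio is the difference of the two direct log odds
    ratios, and its variance is the sum of their variances.
    """
    log_or_ab = log_or_ac - log_or_bc
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
    return log_or_ab, se_ab

def or_with_ci(log_or, se, z=1.96):
    """Back-transform a log odds ratio to an OR with an approximate 95% CI."""
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))

# Hypothetical inputs (NOT study data): drug A vs placebo and drug B vs placebo.
log_or_ab, se_ab = indirect_log_or(math.log(1.8), 0.20, math.log(1.2), 0.25)
or_ab, ci_low, ci_high = or_with_ci(log_or_ab, se_ab)
print(f"Indirect OR (A vs B): {or_ab:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Because the variances add, an indirect estimate is always less precise than either direct estimate it is built from, which is one reason sparse networks yield wide intervals.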

Our study also has several limitations. First, although a dose–response relationship may exist, we could not account for differences in the dosages of pharmacological interventions, because these data were rarely available. Second, for some therapies the reported data were scarce, so the corresponding estimates of efficacy or safety may be unreliable or merely suggestive. Third, given the large number of participants and the wide range of interventions, some biases were inevitable, such as discrepancies in treatment duration and gender ratio, which we tried to mitigate through sensitivity analysis and other methods. Fourth, although adjusted funnel plots showed no evidence of publication bias or small-study effects, asymmetry may have been masked because several studies had multiple arms. To reduce most of the correlations induced by multi-arm studies, we plotted data points corresponding to the study-specific basic parameters. Fifth, although we searched carefully for all available non-pharmacological interventions, the number of RCTs remains limited, which leaves some uncertainty in the conclusions.

In conclusion, our NMAs on efficacy suggest that pharmacological intervention, rather than non-pharmacological methods, should be the first choice for BPSD. Specifically, among the antipsychotics, aripiprazole, haloperidol, quetiapine, and risperidone showed great superiority over placebo. Memantine, galantamine, and donepezil may provide the most appropriate efficacy, especially considering their overall therapeutic effects on dementia. The overall safety of most treatments was acceptable, although differences between placebo and active therapies were observed in the incidence of total AEs, diarrhea, and dizziness. The risk of CVDs associated with pharmacological interventions, especially olanzapine and risperidone, was unfavorable in this analysis, which calls for closer attention in clinical practice and for further RCTs on this question. Given the limitations noted above, we expect to update and revise our NMAs as new evidence becomes available.