Introduction

Depressive symptoms and major depressive disorder (MDD) are common among older adults, affecting up to 16% and 4% of people over the age of 65 in the community, respectively [1,2,3,4]. Depression is even more common among older adults in hospital and long-term care settings [5]. MDD in older adults is associated with a significantly increased risk of functional decline and all-cause mortality [5,6,7]. Guidelines for the management of MDD in older adults recommend that psychotherapy, antidepressants, or their combination be used as the first-line treatment for mild-to-moderate MDD [8].

In the past decades, a relatively small number of psychotherapies for MDD, such as cognitive behavior therapy [9, 10], interpersonal psychotherapy [11,12,13], behavioral activation [14,15,16], and psychodynamic therapy [17,18,19], have been well examined in ten or more randomized trials. The limited effect of antidepressants on mood, disability, and cognitive outcomes in older adults with cognitive deficits highlights the need for psychosocial interventions for this population [20,21,22]. Problem-solving therapy (PST) has been found to be effective in reducing both depression [23] and disability [24] in depressed elderly individuals with executive dysfunction. Currently, PST has emerged in several randomized controlled trials as another promising time-limited, manualized psychotherapy for MDD in older adults [25, 26].

PST was originally outlined by D'Zurilla and Goldfried [27], and the theory and practice of PST have been refined and revised over the years by D'Zurilla, Nezu, and their associates [28]. PST teaches individuals a systematic, stepwise approach to identifying and solving problems. The rationale is that developing the skills to address life stressors decreases the negative impact of these stressors on mood and wellbeing by helping individuals cope more effectively with stressful problems in daily life [29]. PST is also a feasible and acceptable treatment for depression in older Chinese adults based on the cultural themes of measurement methodology, stigma, hierarchical provider–client relationship expectations, and acculturation [30].

PST has been found to be as effective as other psychosocial treatments and significantly more effective than no treatment, treatment as usual, and attention placebo treatments [28]. PST has also been extensively studied in adults and mixed-age populations for a variety of mental health disorders, including MDD. Two previous meta-analyses [31, 32] evaluated the effects of PST on depression. Specifically, a meta-analysis of 30 studies [32] found that PST significantly reduced depression in adult patients. Another meta-analysis of seven randomized controlled trials (RCTs), conducted in 2015 [31], found that PST was an effective treatment in older adults with MDD; however, it included only a small number of studies, and relatively few of them were high-quality studies of PST in this population. The heterogeneity in that meta-analysis [31] was high, indicating that the effect sizes varied strongly across studies. Since 2015, four additional studies have examined the treatment effect of PST for MDD in older adults; thus, it may now be possible to better identify causes of heterogeneity. Some RCTs showed inconsistent results [21].

Therefore, we conducted an updated systematic review and meta-analysis to study the effectiveness of PST on older adults with MDD, to compare the efficacy of PST with other therapies in treating MDD, to examine the potential causes of heterogeneity and to explore the effect of PST duration on the therapeutic effects of PST on MDD.

Methods

This systematic review and meta-analysis was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [33].

Eligibility criteria

The PICOS framework was used to develop the basis of the literature search strategy. We included studies based on the following components:

  1. Study design: we included only RCTs.

  2. Participants and setting: older adults (mean age of the study population ≥ 60 years) diagnosed with major depressive disorder were the target population. We set no limitations on the types of depression or the setting in which the study was conducted (outpatient or inpatient). Depression could be established with a diagnostic interview or with a score above a cutoff on a self-reported assessment.

  3. Interventions and comparison: each trial comprised two or more groups in which one group received PST or adaptations of PST and the other group received antidepressant therapy, waitlist, or other therapies.

  4. Outcomes: the outcome of the eligible included studies was depression severity, which was assessed by any depression instrument, such as the Geriatric Depression Scale (GDS) [20], Hamilton Rating Scale for Depression (HRSD) [34], Beck Depression Inventory (BDI) [35], Patient Health Questionnaire (PHQ), Montgomery-Åsberg Depression Rating Scale (MADRS) [36], or Center for Epidemiological Studies-Depression scale (CES-D).

We included articles published in Chinese and English. Additionally, only the most recently published article was included if multiple articles from the same study were available.

Information sources and search strategy

We searched the literature in the Wanfang, CNKI, SinoMed, Cochrane Library, Embase, MEDLINE, UpToDate, Web of Science, PubMed, and PsycINFO databases with a combination of medical subject headings (MeSH) search, text word search and Boolean logic retrieval. The terms and keywords “major depression”, “PST”, “major depressive disorder”, “Problem solving”, “Randomized Controlled Trial” and “RCT” were used in various combinations during the search (Appendix). The key words for the Chinese literature retrieval included the following: (“重度抑郁” OR “重度抑郁症” OR “重度抑郁障碍”) AND (“问题解决” OR “问题解决疗法” OR “PST”) AND (“随机对照试验” OR “随机试验”), that is, (“major depression” OR “major depressive disorder”) AND (“problem solving” OR “problem-solving therapy” OR “PST”) AND (“randomized controlled trial” OR “randomized trial”). In addition, the reference lists of identified studies were manually evaluated to include other potentially eligible trials.

Two reviewers independently searched for articles. We searched for all literature published before Nov 1, 2019. The search was last updated on Feb 1, 2020.

Study selection and data extraction

Studies were independently screened by two reviewers. All papers that either reviewer judged might meet the inclusion criteria were retrieved as full texts. When the reviewers disagreed about the eligibility of a study, a third reviewer was consulted to resolve the disagreement.

We used a standardized table to extract the following information from all the included articles: first author(s), publication year, country, target group, setting, duration of PST, depression measurement instruments, sample sizes, etc.

Assessment of risk of bias

The Jadad scale was used to evaluate the methodological quality of each trial. Each study was examined with respect to the following four items: (1) generation of a random sequence (described and appropriate = 2, unclear = 1, inappropriate = 0); (2) allocation concealment (described and appropriate = 2, unclear = 1, inappropriate = 0); (3) double blind (described and appropriate = 2, unclear = 1, inappropriate = 0); and (4) withdrawals and dropouts (described = 1, no = 0). Therefore, the studies were scored in the range of 0–7, and a higher score indicated a better quality of research [37]. A Jadad score > 3 indicated high quality, while a score ≤ 3 was considered low quality.
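The scoring rule above can be written out as a short sketch. The function names and the string labels for each rating are hypothetical and chosen only for illustration; the point values and the high/low quality cutoff are those stated in the text.

```python
# Points for the three items rated on a three-level scale, as described in the text.
# The label strings are illustrative, not taken from the original scale.
THREE_LEVEL = {"appropriate": 2, "unclear": 1, "inappropriate": 0}


def jadad_score(randomization, concealment, blinding, dropouts_described):
    """Sum the four items; the result lies in the range 0-7."""
    score = (
        THREE_LEVEL[randomization]
        + THREE_LEVEL[concealment]
        + THREE_LEVEL[blinding]
    )
    # Withdrawals and dropouts: described = 1, not described = 0.
    score += 1 if dropouts_described else 0
    return score


def quality_label(score):
    """A score > 3 is treated as high quality, a score <= 3 as low quality."""
    return "high" if score > 3 else "low"


# Hypothetical trial: adequate randomization and concealment, no double blinding,
# dropouts described.
example = jadad_score("appropriate", "appropriate", "inappropriate", True)
print(example, quality_label(example))
```

Under this rule, a trial without double blinding can still be classified as high quality if the other three items are adequately reported.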

Statistical analysis

Effect sizes were calculated using standardized mean differences (SMDs). Statistical significance was defined as a p value of < 0.05. An SMD < 0 showed that the intervention group had greater improvements in the major depression outcomes than the control group, an SMD > 0 indicated that the intervention group had smaller improvements in the major depression outcomes than the control group, and SMD = 0 indicated that the intervention and control groups had similar changes in the scores on the depression scale. I2 statistics were calculated to assess the degree of statistical heterogeneity between studies [38]; I2 > 50% indicated significant heterogeneity across studies [39]. For analyses in which I2 was below 50%, a fixed-effects model was used, and if I2 was above 50%, a random-effects model was applied [40]. We conducted subgroup analyses when heterogeneity was substantial.
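The following sketch illustrates the kind of calculation described above. It is not the software or the exact estimator used in this review: it assumes Cohen's d as the SMD, Cochran's Q for I2, and the DerSimonian-Laird estimator for the random-effects model, and the function names `smd` and `pool` are hypothetical. The text does not say which model applies at exactly I2 = 50%; the sketch treats that case as fixed-effects.

```python
import math


def smd(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Cohen's d (intervention minus control) and its approximate variance."""
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    )
    d = (mean_t - mean_c) / pooled_sd
    var = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    return d, var


def pool(effects, variances):
    """Inverse-variance pooling; the model is chosen by the I2 > 50% rule."""
    w = [1.0 / v for v in variances]
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    if i2 > 50:
        # Assumed estimator: DerSimonian-Laird between-study variance (tau^2).
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c)
        w = [1.0 / (v + tau2) for v in variances]
        model = "random"
    else:
        model = "fixed"
    estimate = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    se = math.sqrt(1.0 / sum(w))
    return {"model": model, "estimate": estimate, "se": se, "i2": i2}


# Made-up group summaries (post-treatment depression scores), for illustration only.
d, var = smd(10, 2, 20, 12, 2, 20)
print(round(d, 2), round(var, 4))
```

A negative `d` here corresponds to the convention in the text: the intervention group ends with lower depression scores than the control group.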

Given the study heterogeneity, sensitivity analyses were performed using the leave-one-out approach to assess the robustness of the pooled estimates. A funnel plot was used to examine publication bias [41]. We summarized the extracted data in tables and performed a narrative synthesis of all the included studies.
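As an illustration of the leave-one-out approach, the sketch below re-pools the effect sizes with each study omitted in turn. It assumes a DerSimonian-Laird random-effects model, which may differ from the software actually used; the function names and the example data are hypothetical.

```python
import math


def dl_pool(effects, variances):
    """DerSimonian-Laird random-effects estimate with a 95% confidence interval."""
    w = [1.0 / v for v in variances]
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    est = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return est, est - 1.96 * se, est + 1.96 * se


def leave_one_out(labels, effects, variances):
    """Return (omitted study, pooled SMD, CI lower, CI upper) for each omission."""
    results = []
    for i, label in enumerate(labels):
        rest_e = effects[:i] + effects[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        results.append((label, *dl_pool(rest_e, rest_v)))
    return results


# Made-up effect sizes, for illustration only (not data from the included trials).
labels = ["Study A", "Study B", "Study C"]
effects = [-1.0, -1.0, -3.0]
variances = [0.04, 0.04, 0.04]
for label, est, low, high in leave_one_out(labels, effects, variances):
    print(f"omit {label}: SMD = {est:.2f} (95% CI {low:.2f} to {high:.2f})")
```

If the pooled estimate and its confidence interval change little regardless of which study is omitted, the overall result is not driven by any single trial.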

Results

Study selection

The detailed process of the study selection is shown in Fig. 1. In total, 1390 articles were initially identified, and 1294 articles were excluded either because of duplication or because they were deemed irrelevant to this meta-analysis after careful review of the titles and abstracts. In addition, of the 96 trials that remained, an additional 86 articles were excluded for various reasons. Thus, ten articles were ultimately selected for inclusion in the meta-analysis [21, 23,24,25,26, 42,43,44,45,46].

Fig. 1 Flowchart of the study selection process used for this review

Characteristics of studies

A summary of the characteristics of the studies included in the meta-analysis is presented in Table 1. In total, 892 subjects were included in the ten eligible studies, and the number of subjects in each study ranged from 22 to 221. All the participants were diagnosed with major depression by various criteria or assessments, such as the RDC [47], DSM-IV, SCID-IV [48] or scales assessing depression.

Table 1 Characteristics of the studies included in the meta-analysis

The ages of the study participants ranged from 65.2 to 80.5 years. The majority of the patients were recruited from community outpatient samples. All but one study [45] excluded patients with dementia as defined by an MMSE [49] of less than 24; however, a total of five out of nine studies included patients with some degree of executive cognitive impairment as measured by the Dementia Rating Scale Initiation/Perseveration subscale (DRS IP) and Stroop Color-Word (Stroop CW) [21, 23, 24, 43, 45]. All the studies included only individuals with MDD with the exception of one study that included a majority of individuals with MDD (65.2% of all study participants) along with some individuals with depressive disorder not otherwise specified (29.7%) and dysthymia (5.1%) [26].

Only one study provided PST in a group format [42]; in the other studies, PST was provided individually. The majority of the studies used in-person PST. One study included both an in-person PST group and a group of participants who received PST by video call [26]; however, only the in-person group was included in the meta-analysis. Several studies used variations of PST based on the treatment setting and patient characteristics, including PST adapted for a home care setting [44, 46] and PST within a home-administered intervention targeting individuals with depression, cognitive impairment, and disability [21, 45]. PST was administered weekly in most studies, and the length of treatment varied from 6 to 12 sessions. PST was compared to a control treatment consisting of supportive therapy (ST) in five studies. Other control groups were waitlist control or usual care (UC).

Risk of bias in the included studies

Seven of the included trials [23,24,25, 43,44,45,46] were classified as high quality (Jadad score > 3), and the remaining three trials [21, 26, 42] were classified as lower quality (Jadad score ≤ 3). All of the seven high-quality trials had adequate allocation concealment and reported the use of random number generation or a randomization list. None of the ten RCTs used a double-blind design. Details related to dropouts were reported in all of the studies unless there was no dropout.

Effects of problem-solving therapy on major depressive disorders

Ten studies were used to evaluate the effects of PST on MDD by assessing the scores on depression scales, such as the HRSD, MADRS, GHQ-12 and GDS. A random-effects model was applied due to the significant heterogeneity (I2 > 50%), and SMDs were chosen because of the different scales that were used. After PST, the scores on the depression scales in the intervention group were lower than those in the control group, and the differences were statistically significant (SMD = − 1.06, 95% CI − 1.52 to − 0.61). There was significant statistical heterogeneity between studies in this meta-analysis (p < 0.001, I2 = 88.4%). The results are presented in Fig. 2. Because of the considerable heterogeneity, subgroup analysis was carried out. All the results of the subgroup analysis are shown in Table 2.
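As a rough check on the pooled estimate, the standard error and z statistic can be recovered from the reported confidence interval. This is a worked approximation that assumes a symmetric, normal-theory 95% interval; it is not a calculation reported by the original analysis.

```latex
\begin{align*}
\mathrm{SE} &\approx \frac{\text{upper} - \text{lower}}{2 \times 1.96}
             = \frac{-0.61 - (-1.52)}{3.92}
             = \frac{0.91}{3.92} \approx 0.232 \\
z &= \frac{\mathrm{SMD}}{\mathrm{SE}} \approx \frac{-1.06}{0.232} \approx -4.57
\end{align*}
```

A |z| of this size corresponds to a two-sided p value well below 0.05, which agrees with the statement that the difference was statistically significant. The interval midpoint, (− 1.52 − 0.61)/2 ≈ − 1.07, also matches the reported SMD up to rounding.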

Fig. 2 Forest plot of studies on the efficacy of PST for major depression in older patients

Table 2 The results of subgroup analysis

Subgroup analysis

We analyzed the subgroups according to the type of control group, the duration and type of PST, the recruitment of patients, and the diagnosis of depression. Subgroup analysis was performed based on the types of intervention conducted in the control group to separately calculate the effect sizes for PST versus WL (waitlist) and PST versus other therapies for MDD. Given the significant heterogeneity of the studies (I2 > 50%), random-effects models were chosen for the subgroup analysis. The results showed that PST was significantly superior to other therapies (SMD = − 1.24, 95% CI − 2.00 to − 0.49; I2 = 93.4%; p < 0.001) and WL (SMD = − 0.85, 95% CI − 1.10 to − 0.60; I2 = 0.0%; p = 0.737) for improving major depression; the p values reported alongside I2 refer to the test of heterogeneity.

The duration of PST was 12 weeks in the majority of the included studies; thus, we conducted a subgroup analysis based on whether the duration of PST was shorter than 12 weeks (less than 12 weeks was considered a short-term duration; otherwise the duration was considered a long-term duration) to study the influence of PST duration on the therapeutic effects. A random-effects model was chosen owing to the significant heterogeneity (I2 > 50%). The pooled effect was statistically significant for long-term treatment (SMD = − 0.66, 95% CI − 0.86 to − 0.47; I2 = 30.4%; p = 0.196) but not for short-term treatment (SMD = − 1.82, 95% CI − 3.81 to 0.17; I2 = 95.2%; p < 0.001).

Most of the participants were from the community and home care, but some were from universities or research centers; thus, we conducted a subgroup analysis based on whether participants were recruited from the community and home care to study the influence of the recruitment source on the therapeutic effects. As shown in Table 2, the pooled effect was statistically significant for participants from the community and home care (SMD = − 1.15, 95% CI − 1.76 to − 0.54; I2 = 90.5%; p < 0.001) but not for participants from universities or research centers (SMD = − 0.66, 95% CI − 1.12 to 0.20; I2 = 34.5%; p = 0.216).

Subgroup analysis was performed based on whether the diagnosis of depression was a clinician diagnosis (such as the DSM-IV or RDC) or a depression rating scale diagnosis (such as an HRSD > 15 or CES-D > 22) to separately calculate the effect sizes for PST. The pooled effect was statistically significant when depression was diagnosed by a clinician (SMD = − 0.64, 95% CI − 0.82 to − 0.47; I2 = 19.6%; p = 0.274) but not when it was diagnosed by a depression rating scale (SMD = − 2.48, 95% CI − 5.18 to 0.22; I2 = 96.5%; p < 0.001).

PATH is a home-administered intervention, based on PST, designed to reduce depression and disability in depressed, cognitively impaired, disabled elderly patients. Subgroup analysis was performed based on whether the type of PST was PATH to separately calculate the effect sizes for PST. The pooled effect was statistically significant when the type of PST was PATH (SMD = − 0.93, 95% CI − 1.44 to − 0.41; I2 = 0.0%; p = 0.574) but not when the type of PST was not PATH (SMD = − 1.10, 95% CI − 1.64 to 0.57; I2 = 90.9%; p < 0.001).

Sensitivity analyses and publication bias

Given the study heterogeneity, sensitivity analyses were performed using the leave-one-out approach to evaluate the robustness of the pooled estimates. As shown in Fig. 3, four of the included trials were missed by the leave-one-out approach.

Fig. 3 Funnel plot

Begg’s tests and Egger’s tests showed no significant publication bias in the current meta-analysis of PST (Begg’s test: p = 0.283; Egger’s test: p = 0.106).
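For readers unfamiliar with Egger's test, the sketch below shows its core regression step: each study's standardized effect (effect divided by its standard error) is regressed on its precision (one over the standard error), and an intercept far from zero suggests funnel plot asymmetry. This is a minimal illustration in plain Python, not the procedure or software used in this review; the function name is hypothetical, and the p value (from a t distribution with k − 2 degrees of freedom) is deliberately left out. Begg's rank correlation test is not shown.

```python
import math


def egger_regression(effects, std_errors):
    """Regress effect/SE on 1/SE; the intercept measures funnel asymmetry.

    Needs at least three studies with differing standard errors.
    """
    x = [1.0 / s for s in std_errors]                  # precision
    y = [e / s for e, s in zip(effects, std_errors)]   # standardized effect
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    se_int = math.sqrt(resid_ss / df * (1.0 / n + x_bar ** 2 / sxx))
    # t statistic for H0: intercept = 0; undefined if the fit is exact.
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return {"intercept": intercept, "slope": slope, "t": t_stat, "df": df}


# Made-up effect sizes and standard errors, for illustration only.
result = egger_regression([-0.9, -1.2, -0.4, -1.6, -0.7], [0.15, 0.25, 0.10, 0.40, 0.20])
print(result)
```

In this formulation the slope estimates the underlying effect and the intercept captures small-study effects, which is why the test focuses on the intercept.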

Discussion

In this updated meta-analysis, we examined the effects of PST on MDD in older adults. A total of ten RCTs met the inclusion criteria of the meta-analysis. Our overall findings indicated that PST was more effective in treating MDD than other treatments. These findings are consistent with the existing literature [31], suggesting that PST is associated with reductions in depressive symptoms among older adults and indicating that PST appears to be as effective, or perhaps more effective, in treating MDD in older adults than MDD in younger populations. However, the effect size of PST on depression outcomes in the original meta-analysis was 1.15, which is higher than the effect size of 1.06 that we observed. The results of the heterogeneity test showed high heterogeneity, which may be due to the differences in the measurement scales, the diagnostic criteria for MDD, the recruitment of participants, the type of PST, the duration of the intervention and the difference in the control group.

The types of interventions administered in the control group varied considerably. Our subgroup analysis showed that there was a significant difference in the effects of PST on MDD compared with WL, and the heterogeneity was small (I2 = 0%); otherwise, no significant difference was observed. We are not sure whether PST was more effective than other therapies (ST, UC, EVO, and TSC) for treating MDD because the heterogeneity was large (I2 = 93.4%). Our subgroup analysis only separated WL; thus, we do not know if PST was more effective than ST, UC, EVO or TSC. It remains uncertain whether there is a difference between these types of PST. Therefore, further studies are needed to determine whether there are significant differences among these treatment methods and their efficacy.

According to our subgroup analysis based on PST duration, there was a significant difference in the effects of PST on MDD compared with other treatments if the duration of PST was 12 weeks (I2 = 30.4%); otherwise, no significant difference was observed (I2 = 95.2%). This result suggested that the duration of PST should be longer than or equal to 12 weeks when PST is used for the treatment of older adults with MDD to ensure a treatment effect. However, due to the large heterogeneity, this result still needs further verification.

The recruitment of participants in the studies varied considerably. Our subgroup analysis showed that there was no significant difference in the effects of PST on older adults with MDD from universities or research centers (I2 = 34.5%). We cannot definitively explain this finding.

There was no significant difference in the effects of PST on older adults diagnosed with depression by a depression rating scale, and the heterogeneity was large (I2 = 96.5%); when depression was diagnosed by a diagnostic interview, there was a significant difference, and the heterogeneity was small (I2 = 19.6%). This suggests that a treatment effect is more reliably observed in populations diagnosed with depression by diagnostic interview.

According to our subgroup analysis based on the type of PST, there was a significant difference in the effects of PATH on MDD (I2 = 0%), and there was no significant difference in the effects of other types of PST on MDD (I2 = 90.9%). The result showed that PATH is effective for older adults with MDD, but we do not know if there is a significant difference in the effects on MDD of a home-based model of PST (PST-HC) or of PST administered via primary care (PST-PC). Therefore, further studies are needed to determine whether there are significant differences among these treatment methods and their efficacy. PATH may provide relief and sustain better functional outcomes in a large number of older adults with depression who are at risk of developing dementia [21, 45].

Limitations

Our meta-analysis has several limitations. First, although we conducted a subgroup analysis according to the treatment used in the control groups, we were unable to rank the therapeutic effects of all the treatment methods in MDD. Second, the lack of a clear explanation for the source of the heterogeneity led to a lack of consistency in the results, which may have affected the overall results of our meta-analysis. Therefore, the results should be interpreted cautiously.

Implications for practice

PST should be considered for treating patients with MDD in the community, in the clinic, and elsewhere with a medical professional. PST has no known side effects, and many studies have shown that PST can effectively treat depression. The problem-solving skills that patients learn through PST intervention can prevent the recurrence of depression; thus, PST can be safely used for early prevention and late treatment of MDD. In addition, the results suggest that PST is more likely to have a positive effect if it lasts longer than 12 weeks for the treatment of MDD. Of course, the effect of PST on the treatment of MDD may also be related to other individual factors, such as age, religious beliefs, educational level and other factors that can aggravate or alleviate MDD.

Implications for future studies

More high-quality randomized controlled trials with larger sample sizes and more stringent designs are needed to examine the efficacy of problem-solving therapy for major depressive disorder in older adults. To identify the optimal intervention plan for problem-solving therapy, multi-arm designs that compare problem-solving therapy with different intervention periods, frequencies, and durations against a control are suggested.

Conclusion

In conclusion, PST had a positive effect on alleviating MDD among older adults, and it may be more effective than some forms of psychotherapy, although the effect sizes were small. The effect sizes were influenced by the types of intervention in the control group and the duration of the intervention. However, more rigorous, multicenter, high-quality RCTs are needed to verify the present conclusion, as our findings were limited by the low quality of the methodology and the small sample sizes.