Introduction

Cognitive impairment (CI) is prevalent in stroke patients. More than two thirds of stroke patients in the acute stage [1] and 57.7% of patients at 3–6 months after stroke [2] have some degree of CI. Post-stroke cognitive impairment (PSCI) impairs functional status [3]; increases disability [4], institutionalization [5, 6], and mortality [5]; and causes great distress to patients and caregivers [4]. PSCI is also associated with higher stroke recurrence [7], higher costs of care [4], and lower quality of life [8]. Furthermore, PSCI places a significant burden on the health care system [9].

Accurate and early detection of PSCI matters to both clinicians and stroke patients because it can inform rehabilitation and discharge planning [10], which in turn may increase patients’ chances of returning to work and improve their quality of life [11]. Cognitive screening in patients with stroke is endorsed by guidelines and best practice recommendations [12, 13].

Formal neuropsychological assessment is a reliable means of evaluating CI [2]. However, it is time-consuming and impractical in routine clinical assessment and large-scale studies [2, 14]. Therefore, a brief and sensitive screening tool is urgently needed [15].

Various approaches have been applied to PSCI screening, but there is no consensus on the optimal assessment [16]. The Mini-Mental State Examination (MMSE) is the most widely used screening tool, but its accuracy in screening for PSCI has been questioned [6, 16]. The Addenbrooke’s Cognitive Examination-Revised (ACE-R) has been widely disseminated in recent years [17], but it has poor specificity for the detection of PSCI [18]. The 5-min protocol of the National Institute of Neurological Disorders and Stroke-Canadian Stroke Network (NINDS-CSN 5-min protocol) consists only of verbally administered tests and may be suitable for the early identification and screening of the cognitive sequelae of stroke [19]. The Montreal Cognitive Assessment (MoCA) is an increasingly popular cognitive screening tool that has good sensitivity and specificity in detecting CI and assesses multiple cognitive domains [11, 14, 20, 21, 22, 23]. Moreover, the MoCA is recommended by the NINDS-CSN for use in stroke prevention clinics (SPCs) [24]. However, the thresholds of the MoCA were set to detect mild cognitive impairment (MCI) in community-dwelling older adults and should be revised for stroke settings [25]. The MoCA cutoffs used for PSCI vary widely, and there is no consensus on the optimal cutoff.

The aims of this systematic review were (1) to identify and quantify studies reporting the diagnostic accuracy of the MoCA in stroke survivors; (2) to compare the sensitivity and specificity of the MoCA under different cutoffs and identify the optimal cutoff at different stroke stages; and (3) to compare the MoCA with other screening tools (especially the MMSE) in stroke patients with PSCI determined by a neuropsychological evaluation.

Methods

Search strategies

This systematic review followed the guidance of the Test Accuracy Working Group of the Cochrane Collaboration and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [26, 27]. Two trained reviewers independently conducted a systematic literature search of multiple electronic databases (PubMed, Web of Science, Embase, CINAHL) from inception to February 26, 2017. The search was restricted to primary studies published in English. The search terms (“stroke” or “post-stroke dementia” or “PSD” or “post-stroke cognitive impairment” or “PSCI”) and (“Montreal Cognitive Assessment” or “MoCA”) were applied to titles and abstracts. The reference lists of the included studies were also searched to identify additional studies.
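The Boolean structure of the search can be sketched in code. This is only an illustration: the helper names `or_group` and `build_query` are hypothetical, the `[Title/Abstract]` field tag is PubMed-style syntax assumed here, and the exact strings submitted to each database are not reported in this review.

```python
# Illustrative sketch: compose the two term groups into one Boolean query.
# The field tag is an assumption (PubMed style); other databases use
# different syntax for restricting a search to title and abstract.

CONDITION_TERMS = [
    "stroke",
    "post-stroke dementia",
    "PSD",
    "post-stroke cognitive impairment",
    "PSCI",
]
INDEX_TEST_TERMS = ["Montreal Cognitive Assessment", "MoCA"]


def or_group(terms, field_tag):
    """Quote each term, append the field tag, and join the terms with OR."""
    return "(" + " OR ".join(f'"{term}"{field_tag}' for term in terms) + ")"


def build_query(field_tag="[Title/Abstract]"):
    """AND the condition group with the index-test group."""
    return (
        or_group(CONDITION_TERMS, field_tag)
        + " AND "
        + or_group(INDEX_TEST_TERMS, field_tag)
    )


if __name__ == "__main__":
    print(build_query())
```

Keeping the two term groups as separate lists makes it easy to adapt the same logic to the syntax of each database.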

Study selection

Observational studies assessing PSCI with the MoCA were eligible. Inclusion criteria were (1) studies recruiting stroke patients, (2) studies assessing PSCI with the MoCA, (3) studies using a neuropsychological evaluation as the reference standard, and (4) papers published in English.

Exclusion criteria were (1) studies using the MoCA as the gold standard and (2) studies without a gold standard or essential data.

Two reviewers (D.S. and X.C.) independently reviewed the titles and abstracts of studies. Papers that matched the predefined inclusion criteria, or on which the reviewers did not agree, were reviewed in full text. Disagreements were resolved through discussion and consultation with a third reviewer (Z.L.).

Quality assessment and data extraction

The quality of the studies was assessed with the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) [28]. Two independent reviewers (D.S. and X.C.) used the 11 items of the QUADAS-2 to assess the methodological quality of the 12 studies, and disagreements were again resolved through discussion and consultation with the third reviewer (Z.L.). The same two reviewers independently extracted the following data from the included studies: (1) descriptive aspects, including the primary author, the year of publication, the reference standard, the sample, the MoCA assessment time (the stage of stroke), and the comparison type; and (2) statistical aspects, including the rate of PSCI assessed by reference standards, the cutoffs of the MoCA, the true positives (TP), the false negatives (FN), the false positives (FP), the true negatives (TN), the sensitivity, the specificity, the positive predictive value (PPV), and the negative predictive value (NPV). Any of these data that could not be found or calculated in an included study were marked as “Unknown, UN.”
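The extracted accuracy measures all follow from the four cells of a 2 × 2 table. The sketch below uses the standard definitions; the function name and the counts in the usage example are hypothetical and do not come from any included study. It assumes no cell is zero (a continuity correction would otherwise be needed).

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Accuracy measures from a 2x2 table (index test vs reference standard).

    Assumes every cell is non-zero; zero cells would cause division by zero.
    """
    sensitivity = tp / (tp + fn)   # impaired patients correctly flagged
    specificity = tn / (tn + fp)   # unimpaired patients correctly cleared
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # positive predictive value
        "npv": tn / (tn + fn),     # negative predictive value
        "lr_pos": sensitivity / (1 - specificity),   # positive likelihood ratio
        "lr_neg": (1 - sensitivity) / specificity,   # negative likelihood ratio
        "dor": (tp * tn) / (fp * fn),                # diagnostic odds ratio
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in diagnostic_metrics(tp=40, fn=10, fp=20, tn=30).items():
        print(f"{name}: {value:.2f}")
```

Unlike sensitivity and specificity, PPV and NPV depend on the rate of PSCI in the sample, which is one reason the rate was extracted alongside them.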

Statistical analysis

The risk-of-bias figures were drawn with Review Manager 5.3 [29], and statistical analyses were conducted with MetaDisc 1.4 [30]. A random-effects model is recommended for pooled estimates in diagnostic meta-analyses to reflect inter-study heterogeneity [31]. Therefore, the pooled sensitivity, specificity, positive and negative likelihood ratios, and diagnostic odds ratios were all analyzed with a random-effects model. For the summary receiver operating characteristic (sROC) curve, the area under the curve (AUC) and the index Q* were used to measure test accuracy. The AUC was interpreted as follows: uninformative if AUC = 0.5; low accuracy if 0.5 < AUC ≤ 0.7; moderate accuracy if 0.7 < AUC ≤ 0.9; very high accuracy if 0.9 < AUC < 1; and perfect if AUC = 1 [32]. The index Q* is the point on the sROC curve at which sensitivity equals specificity, and it equals 1 when accuracy is 100% [33]. Heterogeneity between studies was first assessed visually on forest plots, by checking whether the confidence intervals of the individual study estimates overlapped [34]. In addition, it was investigated with the I2 test, with significance set at 5% and values interpreted as follows: low heterogeneity if I2 ≤ 25%; moderate heterogeneity if 25% < I2 ≤ 75%; and high heterogeneity if I2 > 75% [35].
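The interpretation bands above can be written as a short sketch. The function names are hypothetical. The I2 formula used, I2 = max(0, (Q − df)/Q) × 100, is the standard definition based on Cochran's Q; that the software computed it exactly this way, and that the χ2 values reported in the Results correspond to Q with df equal to the number of studies minus one, are assumptions.

```python
def auc_band(auc):
    """Map an sROC AUC to the accuracy band used in this review."""
    if not 0.5 <= auc <= 1.0:
        # Values below 0.5 are not covered by the bands described in the text.
        raise ValueError("AUC expected in [0.5, 1.0]")
    if auc == 0.5:
        return "uninformative"
    if auc <= 0.7:
        return "low"
    if auc <= 0.9:
        return "moderate"
    if auc < 1.0:
        return "very high"
    return "perfect"


def i_squared(q, df):
    """I2 (%) from Cochran's Q and its degrees of freedom (studies - 1)."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def heterogeneity_band(i2):
    """Map I2 (%) to the heterogeneity band used in this review."""
    if i2 <= 25:
        return "low"
    if i2 <= 75:
        return "moderate"
    return "high"


if __name__ == "__main__":
    i2 = i_squared(q=5.77, df=2)   # three studies, so df = 2
    print(round(i2, 1), heterogeneity_band(i2))
    print(auc_band(0.90))
```

Under these assumptions, a χ2 of 5.77 across three studies gives an I2 of about 65.3%, which falls in the moderate band.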

Results

Selection process

A total of 650 articles were retrieved in the systematic search, and 303 duplicates were removed. The titles and abstracts of the remaining 347 articles were examined, and 312 articles were excluded for irrelevant study content, unsuitable study design (including reviews, conference papers, letters, and case reports), ineligible study populations, lack of a reference standard, or no use of the MoCA. Finally, 11 articles [2, 11, 14, 16, 20, 22, 36, 37, 38, 39, 40] were included after full-text review, and 1 article [41] was added by reviewing the references of these 11 selected articles. The process and outcome of the literature selection are presented in detail in Fig. 1.
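The counts in this paragraph can be checked with simple arithmetic. The sketch below is a consistency check rather than part of the original analysis; the number of articles reviewed in full text and the number excluded at that stage are not stated in the text and are derived here on the assumption that no other step intervened.

```python
def selection_flow():
    """Recompute the study-selection counts from the figures stated in the text."""
    retrieved = 650
    duplicates = 303
    screened = retrieved - duplicates                    # titles and abstracts examined
    excluded_on_screening = 312
    full_text = screened - excluded_on_screening         # derived, not stated
    included_after_full_text = 11
    excluded_at_full_text = full_text - included_after_full_text  # derived, not stated
    added_from_references = 1
    return {
        "screened": screened,
        "full_text": full_text,
        "excluded_at_full_text": excluded_at_full_text,
        "total_included": included_after_full_text + added_from_references,
    }


if __name__ == "__main__":
    for step, count in selection_flow().items():
        print(f"{step}: {count}")
```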

Fig. 1 Study flow diagram of the study selection process

Risk of bias in included studies

Among the 12 included articles, 7 [11, 14, 36, 37, 38, 39, 40] had a low risk of bias in all domains and items, and all 12 had low applicability concerns. The risk of bias and applicability concerns in the different domains are shown in Figs. 2 and 3.

Fig. 2 Outcomes of quality assessment of each included study (by QUADAS-2)

Fig. 3 Overall quality assessment of included studies (by QUADAS-2)

Characteristics of selected studies

The twelve included studies were published from 2010 to 2017 and were all conducted in hospitals. All of them applied the MoCA and used formal neuropsychological assessment as the reference standard. Seven studies [2, 14, 16, 20, 38, 39, 41] included patients with stroke or transient ischemic attack (TIA), and five [11, 22, 36, 37, 40] included patients with stroke only. Seven studies [14, 22, 36, 37, 38, 40, 41] were from Western countries and five [2, 11, 16, 20, 39] were from Asian countries. The characteristics of the 12 studies are presented in Table 1.

Table 1 Characteristics of selected studies

Comparison of the MoCA and other scales

Seven of the included studies compared the MoCA with other scales [2, 11, 14, 16, 20, 36, 41].

MoCA vs ACE-R

One study [14] compared the MoCA and the ACE-R for detecting MCI at ≥ 1 year after TIA or stroke. The optimal sensitivity and specificity for the MoCA were achieved at cutoffs of approximately 25 to 26 (25v24, sensitivity = 0.77, specificity = 0.83; 26v25, sensitivity = 0.87, specificity = 0.63). The optimal sensitivity and specificity for the ACE-R were achieved at cutoffs between 92 and 94 (ACE-R < 92, sensitivity = 0.72, specificity = 0.79; ACE-R < 94, sensitivity = 0.83, specificity = 0.73). Both the MoCA and the ACE-R had good sensitivity and specificity for MCI.

MoCA vs NINDS-CSN 5-min protocol

One study [20] compared the MoCA with the NINDS-CSN 5-min protocol and found that the MoCA had significantly larger AUCs in the subacute stage [AUC (95% CI) 0.89 (0.85–0.93) vs 0.80 (0.74–0.87), P < 0.01] and at 3–6 months after stroke [AUC (95% CI) 0.90 (0.86–0.94) vs 0.83 (0.77–0.89), P < 0.01] for predicting moderate-severe CI at 1 year.

MoCA vs MMSE

Six studies [2, 11, 14, 16, 36, 41] compared the MoCA and MMSE. Three [11, 14, 41] addressed the detection of CI and three [2, 16, 36] the detection of moderate-severe CI.

The MoCA and MMSE had equivalent discriminatory ability in detecting MCI within 2 or 3 weeks after the stroke event [11, 41]. Compared with the MoCA, however, the MMSE lacked sensitivity for detecting MCI at ≥ 1 year after TIA or stroke [14] (only cutoffs of 29v28 or greater had sensitivities > 0.70).

Three studies [2, 16, 36] compared the MoCA and MMSE for detecting moderate-severe CI under different cutoffs. One study [16] showed that the MMSE lacked sensitivity after acute stroke. Two studies [2, 36] found that the MMSE and MoCA had equivalent discriminatory ability at 3–6 months after stroke. Taken together, these studies showed that the MMSE had higher specificity while the MoCA had higher sensitivity (details are given in Table 2).

Table 2 Comparison of the MMSE and the MoCA for detecting moderate-severe CI under different cutoffs within 3 months

Diagnostic test accuracy of the MoCA in stroke patients under different cutoffs

The 12 studies involved 2130 stroke or TIA patients in total. The cutoff points varied. Four studies [2, 37, 39, 40] set an optimal cutoff, and the other eight [11, 14, 16, 20, 22, 36, 38, 41] analyzed the diagnostic test accuracy of the MoCA under different cutoffs. The cutoff points ranged from 16v15 to 28v27. Studies with data under the same cutoff and stroke stage were synthesized. One study [20] provided two sets of data, at 2 weeks and at 3–6 months after stroke. One study [38] did not report the stroke stage and could not be synthesized with the other studies.
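The cutoff notation can be made concrete with a short sketch. It assumes that a label such as 26v25 means that a MoCA score of 26 or more is screen-negative and a score of 25 or less is screen-positive. This reading is consistent with sensitivity rising as the cutoff rises, but it is an assumption, and the function names are hypothetical.

```python
def parse_cutoff(label):
    """Split a label such as '26v25' into (negative_min, positive_max)."""
    upper, lower = (int(part) for part in label.split("v"))
    if upper - lower != 1:
        raise ValueError(f"unexpected cutoff label: {label!r}")
    return upper, lower


def screen_positive(score, label):
    """True if a MoCA total score (0-30) is at or below the lower value."""
    if not 0 <= score <= 30:
        raise ValueError("MoCA total score must be between 0 and 30")
    _, positive_max = parse_cutoff(label)
    return score <= positive_max


if __name__ == "__main__":
    for score in (19, 20, 25, 26):
        print(score, screen_positive(score, "20v19"), screen_positive(score, "26v25"))
```

Under this reading, a patient scoring 22 would be flagged at 26v25 but not at 20v19, which is why lower cutoffs trade sensitivity for specificity.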

Distinguish CI from no cognitive impairment (NCI) in stroke patients within 1 month after stroke

Only two studies [22, 41] reported the diagnostic test accuracy of the MoCA under the cutoffs of 16v15, 17v16, 27v26, and 28v27. The pooled sensitivity and specificity were [0.54 (95% CI 0.34–0.64), 0.94 (95% CI 0.89–1.00)], [0.61 (95% CI 0.50–0.70), 0.95 (95% CI 0.87–0.99)], [0.99 (95% CI 0.95–1.00), 0.16 (95% CI 0.15–0.37)], and [1.00 (95% CI 0.96–1.00), 0.19 (95% CI 0.10–0.30)], respectively.

The cutoff of 18v17 was used in three studies [20, 22, 41], and a total of 364 stroke patients were involved. The pooled sensitivity was 0.68 (95% CI 0.60–0.76) and heterogeneity between the articles was low, 15.0% (χ2 = 2.35, P > 0.05). The pooled specificity was 0.86 (95% CI 0.81–0.90), and heterogeneity between the articles was moderate, 65.3% (χ2 = 5.77, P > 0.05). The sROC AUC was 0.80 (SE = 0.07) while Q* value was 0.73 (SE = 0.06) (Fig. 4a).

Fig. 4 Sensitivity and specificity of the MoCA under different cutoffs

The cutoff of 19v18 was used in three studies [20, 22, 41], and a total of 364 stroke patients were involved. The pooled sensitivity was 0.75 (95% CI 0.67–0.81), and heterogeneity between the articles was low, 11.4% (χ2 = 2.17, P > 0.05). The pooled specificity was 0.83 (95% CI 0.78–0.87), and heterogeneity between the articles was moderate, 54.7% (χ2 = 4.41, P > 0.05). The sROC AUC was 0.88 (SE = 0.03) while Q* value was 0.81 (SE = 0.03) (Fig. 4b).

The cutoff of 20v19 was used in three studies [20, 22, 41], and a total of 364 stroke patients were involved. The pooled sensitivity was 0.80 (95% CI 0.73–0.86), and the pooled specificity was 0.78 (95% CI 0.72–0.82). The heterogeneity between the articles was moderate, 58.4% (χ2 = 4.81, P > 0.05) and 63.8% (χ2 = 5.52, P > 0.05), for the sensitivity and specificity, respectively. The sROC AUC was 0.90 (SE = 0.02) while Q* value was 0.83 (SE = 0.02) (Fig. 4c).

The cutoff of 21v20 was used in three studies [20, 22, 41], and a total of 364 stroke patients were involved. The pooled sensitivity was 0.85 (95% CI 0.78–0.90), and the pooled specificity was 0.72 (95% CI 0.66–0.77). Heterogeneity between the articles was moderate, 74.8% (χ2 = 7.94, P < 0.05) and 66.4% (χ2 = 5.95, P > 0.05), for the sensitivity and specificity, respectively. The sROC AUC was 0.90 (SE = 0.03) while Q* value was 0.83 (SE = 0.03) (Fig. 4d).

The cutoff of 22v21 was used in three studies [20, 22, 41], and a total of 364 stroke patients were involved. The pooled sensitivity was 0.87 (95% CI 0.81–0.92), and heterogeneity between the articles was high, 75.3% (χ2 = 8.09, P < 0.05). The pooled specificity was 0.65 (95% CI 0.60–0.71), and heterogeneity between the articles was moderate, 59.0% (χ2 = 4.87, P > 0.05). The sROC AUC was 0.85 (SE = 0.04) while Q* value was 0.79 (SE = 0.04) (Fig. 4e).

The cutoff of 23v22 was used in four studies [11, 20, 22, 41], and a total of 559 stroke patients were involved. The pooled sensitivity was 0.87 (95% CI 0.82–0.91), and the pooled specificity was 0.69 (95% CI 0.64–0.74). Heterogeneity between the articles was low, 17.3% (χ2 = 4.07, P > 0.05) and 0.0% (χ2 = 1.65, P > 0.05), for the sensitivity and specificity, respectively. The sROC AUC was 0.85 (SE = 0.05) while Q* value was 0.78 (SE = 0.04) (Fig. 4f).

The cutoff of 24v23 was used in three studies [11, 22, 41], and a total of 268 stroke patients were involved. The pooled sensitivity was 0.90 (95% CI 0.84–0.94), and the pooled specificity was 0.66 (95% CI 0.56–0.75). The heterogeneity between the articles was low, 0.0% (χ2 = 1.39, P > 0.05) and 18.0% (χ2 = 2.35, P > 0.05), for the sensitivity and specificity, respectively. The sROC AUC was 0.89 (SE = 0.06) while Q* value was 0.82 (SE = 0.06) (Fig. 4g).

The cutoff of 25v24 was used in three studies [22, 37, 41], and a total of 391 stroke patients were involved. The pooled sensitivity was 0.84 (95% CI 0.80–0.88), and the pooled specificity was 0.37 (95% CI 0.27–0.48). The heterogeneity between the articles was high, 79.5% (χ2 = 9.77, P < 0.05) and 75.9% (χ2 = 8.29, P < 0.05), for the sensitivity and specificity, respectively. The sROC AUC was 0.86 (SE = 0.59) while Q* value was 0.79 (SE = 0.57) (Fig. 4h).

The cutoff of 26v25 was used in three studies [11, 22, 41], and a total of 268 stroke patients were involved. The pooled sensitivity was 0.96 (95% CI 0.92–0.99), and heterogeneity between the articles was low, 0.0% (χ2 = 0.19, P > 0.05). The pooled specificity was 0.34 (95% CI 0.16–0.35), and heterogeneity between the articles was moderate, 53.5% (χ2 = 4.30, P > 0.05). The sROC AUC was 0.95 (SE = 0.11) while Q* value was 0.89 (SE = 0.14) (Fig. 4i).

The diagnostic test accuracy of the MoCA under different cutoffs within 1 month after stroke is summarized in Table 3. The cutoff 26v25 had the highest sROC AUC and therefore the highest overall diagnostic test accuracy. However, when sensitivity and specificity are both important, 20v19 is the optimal cutoff.
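One way to make the idea of balancing sensitivity and specificity explicit is sketched below, using the pooled values reported above for cutoffs 18v17 to 26v25. The review does not state which criterion was used to select 20v19. The sketch assumes the closest-to-corner criterion, which picks the cutoff whose point in ROC space lies nearest to perfect accuracy (sensitivity = 1, specificity = 1). The function names are hypothetical.

```python
import math

# Pooled (sensitivity, specificity) within 1 month after stroke, as reported
# in the text for each cutoff synthesized from three or four studies.
POOLED = {
    "18v17": (0.68, 0.86),
    "19v18": (0.75, 0.83),
    "20v19": (0.80, 0.78),
    "21v20": (0.85, 0.72),
    "22v21": (0.87, 0.65),
    "23v22": (0.87, 0.69),
    "24v23": (0.90, 0.66),
    "25v24": (0.84, 0.37),
    "26v25": (0.96, 0.34),
}


def distance_to_corner(sensitivity, specificity):
    """Euclidean distance from the point to perfect accuracy (1, 1)."""
    return math.hypot(1.0 - sensitivity, 1.0 - specificity)


def youden_j(sensitivity, specificity):
    """Youden's index, an alternative balance criterion."""
    return sensitivity + specificity - 1.0


def best_balanced_cutoff(pooled):
    """Cutoff with the smallest distance to the perfect-accuracy corner."""
    return min(pooled, key=lambda cutoff: distance_to_corner(*pooled[cutoff]))


if __name__ == "__main__":
    for cutoff, (sens, spec) in POOLED.items():
        print(cutoff, round(distance_to_corner(sens, spec), 3),
              round(youden_j(sens, spec), 2))
    print("closest to corner:", best_balanced_cutoff(POOLED))
```

Under this assumed criterion the selected cutoff is 20v19, in line with the text. On these rounded values, Youden's index is essentially tied between 19v18 and 20v19 (about 0.58 each), so the choice of criterion matters little here.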

Table 3 The diagnostic test accuracy of the MoCA under different cutoffs in 1 month after stroke

Distinguish CI from NCI in stroke patients within 3–6 months after stroke or TIA

Two studies [20, 39] reported the diagnostic test accuracy of the MoCA in this setting. The sensitivity and specificity under different cutoffs are shown in Table 1. The optimal cutoff in study [20] (21v20: sensitivity 0.83, specificity 0.80) was superior to the cutoff in study [39] (24v23: sensitivity 0.78, specificity 0.80). On current evidence, the optimal cutoff for distinguishing CI from NCI in stroke patients within 3–6 months after stroke or TIA is 21v20.

Distinguish CI from NCI in stroke patients at ≥ 1 year after stroke or TIA

One study [14] reported diagnostic test accuracy in this setting. The sensitivity and specificity under different cutoffs are shown in Table 1. In this study, the optimal cutoffs for distinguishing CI from NCI at ≥ 1 year after stroke or TIA were 24v23 and 26v25.

Distinguish moderate-severe CI from MCI and NCI in stroke patients within 1 month after stroke or TIA

Only one study [16] reported diagnostic test accuracy in this setting. The sensitivity and specificity under different cutoffs are shown in Table 1. The optimal cutoff was 22v21.

Distinguish moderate-severe CI from MCI and NCI in stroke patients within 3–6 months after stroke or TIA

Two studies [2, 36] reported diagnostic test accuracy in this setting. The sensitivity and specificity under different cutoffs are shown in Table 1. Both studies reported the test accuracy under the cutoff 22v21, and the pooled sensitivity and specificity were 0.84 and 0.65, respectively. Both pooled values were lower than those at the optimal cutoff in study [36] (24v23: sensitivity 0.92, specificity 0.67). On current evidence, the optimal cutoff for distinguishing moderate-severe CI from MCI and NCI in stroke patients within 3–6 months after stroke or TIA is 24v23.

Discussion

Early detection of PSCI is important, and a brief and accurate screening tool is urgently needed. The MMSE, the ACE-R, the NINDS-CSN 5-min protocol, and the MoCA are the most widely used tools for PSCI screening. We performed a systematic review and meta-analysis of 12 studies to assess the discriminant validity of the MoCA and the other tools for detecting PSCI determined by a neuropsychological battery and to find the optimal cutoff.

Only one study [14] compared the MoCA and the ACE-R for detecting MCI at ≥ 1 year after TIA or stroke, and it indicated that both tools have good sensitivity and specificity. In contrast, an earlier study [18] found that the ACE-R has poor specificity for the detection of cognitive impairment after stroke. Different types and stages of disease may account for the difference. More studies are needed to verify the test accuracy of the ACE-R for detecting PSCI.

Compared with the NINDS-CSN 5-min protocol, the MoCA administered in the subacute stage and at 3–6 months after stroke is superior in predicting moderate-severe CI at 1 year [20]. However, the NINDS-CSN 5-min protocol consists only of verbally administered tests and can be used in patients with weakness of the dominant hand or a visual field defect [19]. Therefore, it could serve as a supplement to the MoCA.

The MoCA and MMSE have a similar ability to detect PSCI. Previous studies found that the MMSE had a ceiling effect and low sensitivity [6, 23, 42, 43]. The MoCA may be more suitable than the MMSE for assessing CI in patients with traumatic brain injury (TBI) or stroke a long time after onset [21, 42]. Compared with the MMSE, the MoCA has higher sensitivity but lower specificity. Moreover, if the aim of screening is to detect only more severely impaired patients, the MoCA’s advantage in sensitivity is lost [36]. Therefore, although sensitivity is the most important quality of a screening tool [36], the MoCA and MMSE should be used in combination to optimize PSCI screening.

The results of this systematic review provide evidence for choosing the optimal cutoffs of the MoCA in PSCI screening. The optimal cutoff changes with the screening aim and the stroke stage. For distinguishing CI from NCI, the optimal cutoffs are 26v25 within 1 month, 21v20 at 3–6 months, and 24v23 and 26v25 at 1 year or more after the stroke or TIA onset. When sensitivity and specificity are both important, the optimal cutoff for distinguishing CI from NCI within 1 month is 20v19. For distinguishing moderate-severe CI from MCI and NCI, the optimal cutoffs are 22v21 within 1 month and 24v23 between 3 and 6 months after the stroke or TIA onset. The optimal cutoffs of the MoCA varied across previous studies, with no agreement. The results of this study can serve as a reference for subsequent research and clinical application.

The present study has several limitations. First, a considerable number of related conference papers were excluded, which may have caused a loss of information. However, conference papers lacked necessary data and full texts, so their exclusion seemed reasonable. Second, owing to a lack of studies, this review could not give the optimal cutoffs of the MoCA for all stroke stages.

Conclusion

Compared with the MMSE, the MoCA has higher sensitivity but lower specificity. Both the MMSE and MoCA are screening tools for PSCI, and their use should be in line with the aim of screening. The optimal cutoff differs with stroke stage and screening aim. There is a lack of research on the diagnostic test accuracy of the MoCA for detecting PSCI at different stages, especially at 3 and 6–12 months.