Introduction

Intracerebral hemorrhage (ICH) has high morbidity and mortality and is a leading cause of permanent disability [1, 2]. There is an unmet need to find specific and effective therapies for this devastating disease. In recent years, there has been increased attention on the role of secondary injury after ICH [3, 4] and investigations of therapies targeting secondary injury as potential treatments for ICH [5,6,7,8].

Perihematomal edema (PHE) is considered a radiological marker of secondary brain injury after ICH and is increasingly used as a surrogate measure to assess the potential efficacy of newly developed therapies in improving ICH outcomes [9, 10].

Several studies have investigated the association between PHE and outcomes in patients with ICH, but the results have been inconclusive [11,12,13,14,15]. To summarize the available data and better understand the prognostic value of PHE, we conducted a systematic review and meta-analysis of published studies assessing the association between PHE and outcome in patients with ICH.

Methods

Search Strategy and Data Sources

We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement for randomized controlled trials [16]. Five large databases (Medline [PubMed], Cochrane, Embase, Web of Science, and ScienceDirect) were searched from inception through December 2020. An additional search was done in Google Scholar, and we used bibliographies from relevant articles to identify further articles. We restricted the search to studies published in English. The following search terms were employed: (Intracerebral hemorrhage OR intracerebral haemorrhage OR cerebral hemorrhage OR cerebral haemorrhage OR brain hemorrhage OR brain haemorrhage OR hemorrhagic stroke OR haemorrhagic stroke OR ICH) AND (PHE OR PHO OR perihematomal edema OR perihaematomal edema OR perihematomal oedema OR perihaematomal oedema) AND (outcome OR predictor OR prognosis OR mortality OR death).

Study Selection

Studies that met all the relevant criteria were included in the analysis.

Inclusion criteria were the following:

  • Clinical trials, cohort, case–control, retrospective, and prospective studies in human participants reporting at least one of the following outcomes: functional outcome or mortality

  • Using logistic regression to examine the association between PHE and outcome and reporting the odds ratio

  • Using computed tomography (CT) or magnetic resonance imaging to assess ICH and PHE

Exclusion criteria were the following:

  • Inadequate data

  • Poor rating on quality assessment scale, as described below

  • Studies not reporting odds ratios

  • Studies on traumatic or secondary ICH, not spontaneous ICH

Data Extraction and Quality Assessment

Two of the authors (SM and MS) performed the literature search and subsequently screened the titles and abstracts for eligibility. Full texts of potentially eligible articles were reviewed for inclusion; SM then extracted data on study design, demographics, PHE measurement(s), and outcomes. After the final set of eligible studies was identified, three investigators (SH, JT, and SM) independently evaluated those articles for quality and study eligibility using the Newcastle–Ottawa quality assessment scale (NOS) for cohort studies, which is a tool recommended by the Cochrane Collaboration [17, 18]. All disagreements were resolved in a consensus meeting and 100% consensus was achieved.

The NOS is a validated and widely used instrument to assess the quality of nonrandomized studies. Its three categories (selection, comparability of the study groups, and ascertainment of outcome) comprise eight items on which the studies are judged. Comparability deals with the control variables used in the identified studies. For the purpose of our analysis, the most important variable to control for was hematoma volume. In addition, three of the following four variables had to be present as control factors: age, hematoma location, intraventricular hemorrhage, and Glasgow Coma Scale or National Institutes of Health Stroke Scale. The maximum NOS score is 9. However, because we only included prediction studies in the current analysis, item two in the study selection group (selection of nonexposed cohort) was omitted from our assessment, and the maximum awarded score for the highest quality studies in this meta-analysis was 8. Studies with a NOS score > 7 were considered of high methodological quality, those with a score of 4 to 7 moderate, and those with a score < 4 insufficient [19]. The latter studies were excluded from our analysis.
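The quality thresholds described above can be written as a small rule. The following Python sketch is illustrative only; the function name classify_nos is hypothetical and is not part of the NOS or of the original analysis.

```python
def classify_nos(score: int) -> str:
    """Map a Newcastle-Ottawa score (0-8 after omitting one item) to a quality tier."""
    if not 0 <= score <= 8:
        raise ValueError("score must be between 0 and 8")
    if score > 7:
        return "high"
    if score >= 4:
        return "moderate"
    return "insufficient"  # studies in this tier were excluded from the analysis


# Example: tiers for a few hypothetical scores
tiers = {score: classify_nos(score) for score in (3, 6, 7, 8)}
```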

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Statistical Analysis

All statistical analyses were performed with STATA 16 [20]. The primary outcome, labeled as outcome in the identified articles, comprised mortality and/or functional outcome as assessed by the modified Rankin Scale (mRS). The odds ratios and their corresponding confidence intervals (CIs) were extracted from each study. Before entering the data into the STATA 16 meta-analysis tool using the DerSimonian-Laird random-effects model, the natural logarithm of the odds ratios and of the lower and upper CIs was calculated because of the asymmetry of the CIs. The random-effects model is more conservative in dealing with heterogeneity than the fixed-effects model because it takes into account both within-study and between-study variances [21]. The odds ratio represents the increase in odds of the outcome per 1 unit increase in PHE. All odds ratios used for the analyses were adjusted for various common variables as presented in Supplementary Table S1. Three studies did not explicitly report the variables used for adjustment but clearly stated that potential confounders were used to adjust [22,23,24]. The forest plots all display odds ratios and CIs.
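The log transformation and random-effects pooling described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the STATA implementation used in this analysis: the function names and the three (odds ratio, lower CI, upper CI) triples are hypothetical, and the standard error is recovered by assuming a symmetric 95% CI on the log scale.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% CI


def log_or_and_se(odds_ratio, lower, upper):
    """Return ln(OR) and its standard error recovered from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_975)
    return math.log(odds_ratio), se


def dersimonian_laird(effects, ses):
    """Pool log-scale effects with the DerSimonian-Laird random-effects estimator."""
    w = [1 / se**2 for se in ses]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at zero
    w_star = [1 / (se**2 + tau2) for se in ses]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    return pooled, se_pooled, tau2


# Hypothetical (OR, lower CI, upper CI) triples, not data from the included studies
studies = [(1.10, 1.02, 1.19), (0.98, 0.90, 1.07), (1.20, 1.05, 1.37)]
ys, ses = zip(*(log_or_and_se(*s) for s in studies))
pooled, se_pooled, tau2 = dersimonian_laird(ys, ses)

# Back-transform to the odds ratio scale for reporting
pooled_or = math.exp(pooled)
ci = (math.exp(pooled - Z_975 * se_pooled), math.exp(pooled + Z_975 * se_pooled))
```

Pooling on the log scale is what makes the inverse-variance weights meaningful, because the sampling distribution of ln(OR) is approximately symmetric whereas that of the odds ratio itself is not.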

Secondary analyses were conducted to evaluate the impact of different components of PHE measurements (volume and growth) on different outcomes: mortality or functional outcome at any time, at hospital discharge, or at 90 days.

Heterogeneity across the studies used in the meta-analysis was estimated using the I² statistic, which quantifies the percentage of variability across studies that is due to heterogeneity rather than chance, and Cochran’s Q homogeneity test [25]. To test for publication bias, we used a simple funnel plot for visual inspection and Egger’s regression test for statistical confirmation [26].
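For readers less familiar with these statistics, the sketch below shows how Cochran's Q, I², and the Egger intercept can be computed from log odds ratios and their standard errors. It is a simplified illustration, not the STATA routine used here: the function names and input values are hypothetical, and the Egger function follows the classical formulation (ordinary least squares regression of the standardized effect on precision), which may differ in detail from the software implementation.

```python
import math


def cochran_q_i2(effects, ses):
    """Return Cochran's Q, its degrees of freedom, and I-squared (in percent)."""
    w = [1 / se**2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


def egger_intercept(effects, ses):
    """Classical Egger regression: OLS of effect/SE on 1/SE.

    Returns the intercept and its standard error; an intercept far from
    zero suggests funnel plot asymmetry. Requires at least three studies.
    """
    x = [1 / se for se in ses]
    y = [yi / se for yi, se in zip(effects, ses)]
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid_var = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    se_intercept = math.sqrt(resid_var * (1 / n + xbar**2 / sxx))
    return intercept, se_intercept


# Hypothetical log odds ratios and standard errors, not data from the included studies
log_ors = [0.05, 0.12, -0.02, 0.20, 0.08]
ses = [0.02, 0.05, 0.04, 0.10, 0.03]
q, df, i2 = cochran_q_i2(log_ors, ses)
b0, se_b0 = egger_intercept(log_ors, ses)
```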

We conducted a detailed sensitivity analysis using the leave-one-out method. This involves performing a meta-analysis on each subset of studies by sequentially leaving out one study at a time, which allows the influence of each study on the overall effect size to be determined. The sensitivity analysis was performed in STATA 16 with the “metaninf” command, using the odds ratio, lower CI, and upper CI of each study.
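The leave-one-out procedure can be sketched as follows. This Python example is an assumption-laden illustration rather than a reproduction of the “metaninf” command: the helper names, study labels, and input values are hypothetical, and the pooling step is a compact DerSimonian-Laird estimator.

```python
import math


def dl_pooled_or(log_ors, ses):
    """DerSimonian-Laird pooled odds ratio from log odds ratios and standard errors."""
    w = [1 / s**2 for s in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(ses) - 1)) / c) if c > 0 else 0.0
    w_star = [1 / (s**2 + tau2) for s in ses]
    return math.exp(sum(wi * yi for wi, yi in zip(w_star, log_ors)) / sum(w_star))


def leave_one_out(labels, log_ors, ses):
    """Re-pool the remaining studies after omitting each study in turn."""
    results = {}
    for i, label in enumerate(labels):
        rest_y = log_ors[:i] + log_ors[i + 1:]
        rest_s = ses[:i] + ses[i + 1:]
        results[label] = dl_pooled_or(rest_y, rest_s)
    return results


# Hypothetical studies, not data from the included publications
labels = ["Study A", "Study B", "Study C", "Study D"]
log_ors = [0.05, 0.12, -0.02, 0.20]
ses = [0.02, 0.05, 0.04, 0.10]
overall = dl_pooled_or(log_ors, ses)
influence = leave_one_out(labels, log_ors, ses)
```

Comparing each entry of influence with overall shows how far any single study moves the pooled estimate; a stable analysis is one in which none of these values departs materially from the overall estimate.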

Results

Search Results

The initial search across all databases resulted in a total of 1165 publications, of which 388 duplicate titles were removed with the help of Endnote’s duplicate finder tool [27]. The remaining 777 publications were filtered for relevance to our topic by reading the titles. This process excluded an additional 556 articles. A further 149 articles were excluded after reviewing the abstract because they were review articles or commentaries, the content was irrelevant, or they were conference abstracts. Full texts were acquired and assessed for the remaining 72 articles; 47 were eliminated for various reasons, such as inclusion/exclusion criteria not met, data duplicated in another study, no data available for our question, or no multivariate analysis done or proportions provided. The additional search in Google Scholar, bibliographies, and other sources did not add any new publications to the final selection. A total of 25 studies initially met our eligibility criteria; however, 5 of these studies were later excluded for the following reasons. One study assessed PHE using the apparent diffusion coefficient value in a voxel-based magnetic resonance imaging analysis. The authors reported cytotoxic and vasogenic edema separately, and hence the study was not comparable to the other studies, which primarily used CT [28]. Another study subdivided the data based on ICH location (deep vs. lobar) and did not report data for the combined cohorts [29]. A third study was excluded because the same sample was reported in another publication by the same group [30]. The fourth study assessed factors influencing 7-day mortality, but PHE growth did not survive multivariate testing [31]. The fifth study was excluded because the reported odds ratio was 0.00, making it impossible to calculate the beta-coefficients [32]. The final analysis included 20 studies. Altogether, these studies included a total of 6633 patients. Figure 1, the Preferred Reporting Items for Systematic Reviews and Meta-Analyses flowchart, illustrates the study selection process.
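The selection counts reported above can be checked with simple arithmetic. The short Python sketch below uses only the numbers stated in this section; the variable names and step labels are illustrative.

```python
identified = 1165  # publications from the initial database search

# (exclusion step, number of records removed at that step)
steps = [
    ("duplicates removed", 388),
    ("excluded on title", 556),
    ("excluded on abstract", 149),
    ("excluded on full text", 47),
    ("excluded after eligibility review", 5),
]

remaining = identified
trail = []
for label, removed in steps:
    remaining -= removed
    trail.append((label, remaining))  # records left after each step

included = remaining  # studies entering the final analysis
```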

Fig. 1 Preferred Reporting Items for Systematic Reviews and Meta-Analyses flowchart illustrating the study selection process

Quality Assessment

The average NOS score across all studies was 7.5, with a range from 6 to 8. Thirteen of the 20 included studies had a NOS score of 8, four had a score of 7, and only three had a score of 6, indicating that all studies had a moderate or high quality. None had to be excluded due to poor quality.

Systematic Review

Study Characteristics

Table 1 summarizes the characteristics of the included studies. Of the 20 included studies, published between 1989 and December 2020, 3 were prospective, 12 were retrospective, and 5 were secondary or post hoc analyses of prospective data. All studies used CT scans to assess PHE. Data were acquired in various international locations in five studies [33,34,35,36,37]; in China, South Korea, and Australia in one study [38]; in the USA in five studies [13, 15, 24, 39, 40]; in Germany in three studies [41,42,43]; in Iran in two studies [22, 23]; in China in two studies [14, 44]; in Turkey in one study [45]; and in Finland in one study [11].

Table 1 Study Characteristics

Participants

The included studies reported data from 6633 patients with spontaneous ICH, which included 2376 women and 3682 men. Four studies [24, 36, 40, 41] did not report a breakdown of sex, hence the difference in total number of patients versus number of men plus number of women. The overall mean age was 65.16 ± 14.62 years. The age was reported as median and interquartile range in 7 out of the 20 studies. We used the proposed estimation method by Luo et al. [46] and Wan et al. [47] (online calculator provided by Hong Kong Baptist University Department of Mathematics http://www.math.hkbu.edu.hk/~tongt/papers/median2mean.html) to convert median and interquartile range into mean and standard deviation for each study and subsequently combined them using the method described in the Cochrane handbook [48].
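The conversion and pooling steps described above can be sketched as follows. This Python example is a simplified illustration under stated assumptions: it uses the three-quantile mean and the interquartile-range-based standard deviation approximation from Wan et al., it does not reproduce the sample-size-weighted mean refinement of Luo et al., and the function names and input values are hypothetical. The combining step follows the standard formula for merging two groups given in the Cochrane handbook.

```python
import math
from statistics import NormalDist


def mean_sd_from_median_iqr(q1, median, q3, n):
    """Approximate mean and SD from median and quartiles (Wan et al. style)."""
    mean = (q1 + median + q3) / 3
    z = NormalDist().inv_cdf((0.75 * n - 0.125) / (n + 0.25))
    sd = (q3 - q1) / (2 * z)
    return mean, sd


def combine_groups(n1, m1, s1, n2, m2, s2):
    """Combine two groups' sample sizes, means, and SDs into one group."""
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2 + n1 * n2 / n * (m1 - m2) ** 2) / (n - 1)
    return n, mean, math.sqrt(var)


# Hypothetical study reporting age as median 65 (IQR 60-70) in 100 patients
est_mean, est_sd = mean_sd_from_median_iqr(60, 65, 70, 100)

# Combine with a second hypothetical study reporting mean 62 (SD 12) in 80 patients
n_all, mean_all, sd_all = combine_groups(100, est_mean, est_sd, 80, 62.0, 12.0)
```

Applying combine_groups repeatedly, one study at a time, yields an overall mean and standard deviation across any number of studies.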

Clinical Outcomes

Modified Rankin Scale and Mortality

Fifteen studies reported (dichotomized) mRS as the outcome measure and five reported mortality; three reported both. In the latter scenario, we used the outcome employed in the primary analysis. The mRS was dichotomized in all studies. Seven reported a dichotomization of 3–6 for poor outcome [14, 22, 33,34,35, 38, 40], five studies chose 4–6 [15, 37, 39, 42, 43], and in two studies poor outcome meant an mRS of 2–6 [13, 44]. One study used mRS but did not report the cutoff for dichotomization [36] (see Supplementary Table S1).

Outcome Assessment Period

Likewise, not all studies used the same time point for outcome assessment. Seven studies used in-hospital or discharge assessments, and 11 used 90-day outcomes. Of the remaining two studies, one used a 30-day [45] and the other a 6-month [11] outcome period. We excluded these two from the secondary analyses exploring outcomes based on assessment time points.

PHE

The studies employed a variety of measures and parameters for PHE, and the times of assessment were variable. Most studies (n = 13) assessed PHE volume [11, 14, 15, 23, 24, 33, 36, 39, 40, 42,43,44,45]. Six studies used baseline scans; two assessed PHE after 72 h from ictus [14, 36]; two studies used the peak absolute volume out of a cluster of five time points from day 1 through day 12 [42, 43]; one used relative PHE volume [40]; and two used edema extension distance [11, 33]. Seven studies assessed absolute or relative PHE growth over time, usually at 24 or 72 h after baseline scans [13, 22, 34, 35, 37, 38, 41]. One study utilized three different measures for PHE: absolute volume, relative volume, and PHE expansion rate [13]. For this study, we only used the PHE growth at 72 h for the purpose of this meta-analysis because this was part of their primary analysis and the results of association with functional outcome were reported. Seventeen of the included studies assessed absolute PHE volume and growth, whereas only three studies used other parameters: edema extension distance (n = 2) and relative PHE (n = 1).

Meta-Analysis

Overall Effect

The overall pooled effect size of PHE (all measures) on outcome was 1.05 (95% CI 1.02–1.08; p < 0.01). The I² was 71.87%, indicating very high heterogeneity between the studies. Four of the 20 studies showed an inverse association, while 16 showed a direct association. However, the majority of these studies had an effect that was either below or very close to one (see Fig. 2a).
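As a rough consistency check, the z statistic and two-sided p value implied by a pooled odds ratio and its 95% CI can be recovered on the log scale. The Python sketch below is illustrative only: the function name is hypothetical, and because the published values are rounded to two decimals the result is approximate.

```python
import math
from statistics import NormalDist


def z_and_p_from_ci(odds_ratio, lower, upper, level=0.95):
    """Recover the z statistic and two-sided p value implied by an OR and its CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Pooled estimate and CI as reported in the text above
z, p = z_and_p_from_ci(1.05, 1.02, 1.08)
```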

Fig. 2 Meta-analysis forest plot of the overall effects of PHE measures on outcome in ICH. a Overall effect. Depicted are odds ratios and confidence intervals (CIs) for each trial as well as the total pooled effect size of all trials. The size of the squares represents the weight of each study, and the diamond at the bottom represents the overall pooled effect size for the group. b Mortality and functional outcome. Secondary analysis that provides the effect sizes for mortality and functional outcome (mRS). c 90-day and in-hospital assessment. Secondary analysis that shows the effect sizes for the 90-day and in-hospital assessment time points. d PHE growth and PHE volume. Secondary analysis that shows the effect sizes for PHE growth and PHE volume. All data derived from a DerSimonian-Laird random-effects model

Secondary Analyses

Association Between PHE and Mortality and Functional Outcome

The pooled effect size was 1.01 (95% CI 0.90–1.14) for studies assessing mortality and 1.04 (95% CI 1.02–1.07) for studies that evaluated functional outcome (mRS). Heterogeneity for both of these groups was substantial, with an I² of 70.97% and 72.14%, respectively (Fig. 2b).

Association Between PHE and In-Hospital/Discharge Versus 90-Day Outcomes

When studies were grouped based on the timing of outcome assessment, the effect size was negligible both for in-hospital/discharge assessments (1.04; 95% CI 1.00–1.08) and for 90-day assessments (1.06; 95% CI 1.02–1.11). Heterogeneity was substantial for both time points, with an I² of 67.88% and 75.40%, respectively (Fig. 2c).

PHE Volume and PHE Growth

In studies that assessed PHE volume, the effect size was 1.04 (95% CI 1.01–1.07), and the I² was 75.46%. In studies that evaluated PHE growth, the effect size was 1.14 (95% CI 1.04–1.25), and heterogeneity was moderate to high (I² = 60.18%) (see Fig. 2d).

Publication Bias and Sensitivity Analysis

Visual inspection of our funnel plot (Fig. 3a) reveals a slightly asymmetrical rightward pattern, which appeared attributable to the substantial heterogeneity between studies. Because visual assessment can be subjective, we tested the funnel plot asymmetry statistically using Egger’s test, which yielded a z value of −0.75 and a p value of 0.45, indicating no statistically significant evidence of publication bias.

Fig. 3 Assessment of bias. a The funnel plot assesses publication bias in the 20 included trials. b The results of the sensitivity analysis reveal a stable analysis. CI, confidence interval

The leave-one-out sensitivity analysis indicates that the meta-analysis is stable. Removing any single study did not significantly change the overall effect size. Although in Gebel et al. 2002 [40] the lower CI was exactly 1.00 (the lower margin of statistical significance), this was a very small study with a very low odds ratio, and its inclusion is unlikely to change our overall assessment (Fig. 3b).

Discussion

This meta-analysis yielded a weak association between PHE and outcome in ICH with very high heterogeneity between the 20 included studies. Secondary analyses that investigated PHE volume and growth separately revealed that PHE volume on CT scan has only a weak impact on functional outcome and/or mortality after ICH. Conversely, PHE growth might adversely influence functional outcome and mortality, although the effect size is relatively small and the heterogeneity in measurement methods and adjustment for confounders is too high to derive definitive conclusions.

All studies included in this meta-analysis that investigated absolute PHE growth used a second imaging time point between 24 and 72 h after ictus. Most calculated interscan absolute PHE volume growth, but two calculated the rate of PHE growth between scanning time points [13, 37]. Using the rate of PHE growth has been proposed as an alternative to absolute change in PHE volume because the speed at which the hematoma mass lesion expands is important in determining neurological injury and likely outcome [49]. While one study [13] reported that PHE expansion rate at 72 h was significantly associated with poor outcome, another study [37] found an association with poor outcome only in basal ganglia ICH. The finding that PHE growth, but not absolute volume, is associated with functional outcome and mortality might, in part, be explained by the fact that PHE volume is a static measure whereas PHE growth mirrors the progression of the PHE and its evolution over time [50,51,52].

The formation of PHE involves multiple complex pathophysiological processes including clot retraction, thrombin formation, activation of the complement and coagulation cascade, hemolysis of erythrocytes and subsequent hemoglobin and iron-mediated toxicity, inflammation, and blood–brain barrier disruption [50,51,52,53]. These processes also contribute to secondary brain injury [3, 54, 55]. Perihematomal edema starts early after ICH onset, increases most rapidly during the first 2 days, and lasts for ~2–3 weeks [50,51,52,53]. The resulting mass effect can lead to increased intracranial pressure, ventricular compression, or even herniation. Therefore, there is pathophysiological plausibility to link PHE and its growth to poor outcomes after ICH. The temporal evolution of PHE development correlates with the development of secondary injury. Delayed PHE has been attributed to lysis of red blood cells and accumulation of hemoglobin degradation products, with resulting neuroinflammation and iron-mediated toxicity [56]. These pathophysiological mechanisms of secondary injury have been linked to recovery and behavioral outcomes in experimental models of ICH, and their role in humans is the subject of ongoing intense investigation. Published studies on the association between PHE and ICH outcomes have reported variable results [13,14,15, 40]. Interpreting these inconsistent findings is difficult because several of the clinical factors that influence ICH outcomes, such as ICH volume and hematoma expansion, may also influence PHE and its severity [15, 38, 57]. Our meta-analysis confirms that heterogeneity is high between these studies. Furthermore, most studies included in this analysis examined PHE within the first 24 h after ictus, with few studies extending to 72 h. Therefore, this meta-analysis does not fully capture the effects of delayed PHE and its potential impact on outcome.

As for the variability in the PHE measures used, absolute PHE was the most widely used in the studies included in this meta-analysis; however, the results for this group still revealed high heterogeneity. Furthermore, there were differences between studies in how absolute PHE volume was derived and when it was measured. While some measured the absolute PHE volume on baseline CT scan [15, 23, 24, 36, 39, 44], others measured PHE volume on CT scans obtained at ~72 h after ictus [13, 14] or measured the peak absolute PHE volume in a string of five cluster time points [42, 43]. In one study, the absolute volume on the slice showing the largest perihematomal lesion was chosen [45]. Considering the temporal evolution of PHE formation, longer assessment periods after ICH onset might seem intuitive and preferable because early assessment of PHE soon after ictus is likely to miss the PHE peak. To explore this possibility, we ran a supplementary analysis including the groups (1) absolute volume at baseline, (2) absolute volume at 72 h, (3) growth at 24 h, and (4) growth at 72 h. The results confirm the larger trend shown in the secondary analyses in that the growth measures at 24 and 72 h yield larger effect sizes than the baseline volume and a slightly higher result than the absolute volume measure at 72 h. These results indicate that, from a pathophysiological viewpoint, measuring later in the PHE evolution might be more beneficial (see supplementary results).

Some studies suggest that PHE might be associated with early neurological deterioration and poor in-hospital and short-term outcomes [42, 57, 58]. We performed exploratory secondary analyses to examine the associations between PHE and functional outcome or mortality at hospital discharge and at 90 days separately. We found only a weak association between PHE and either mortality or functional outcome at either time point. Heterogeneity was likewise high in these groups.

An additional layer of heterogeneity and complexity pertains to methods of PHE measurement. Previous studies have used several methods and parameters to assess PHE; each has its drawbacks and advantages [59]. The processing of images and measurement of PHE volumes relied on various software packages and algorithms, including the standard ABC/2 method and manual, semiautomated, automated, or threshold and edge detection methods. While the ABC/2 method was shown to be too inaccurate to measure PHE volume [60, 61], manual methods are time-consuming and show high intrarater and interrater variability [62]. Semiautomatic methods, despite better intraobserver and interobserver reliability, are limited by several issues such as lack of external validation, small derivation sample sizes, and omission of segmentation time comparisons [50]. A novel approach that uses an edge detection algorithm and reportedly takes the pathophysiology of PHE formation into consideration was proposed by Urday’s group [13], although, as the authors themselves point out, the study was based on retrospective data and a very small sample.

Our meta-analysis has limitations; most are inherent to all meta-analyses and include variability of the published studies as well as the different criteria used for assessment of clinical and radiological outcomes. The included studies used various PHE and outcome measures assessed at different time points. These differences in design increase the clinical as well as the statistical heterogeneity, as confirmed by the I² and Cochran’s Q results. To address the issue of heterogeneity, we applied a random-effects model, which is based on the assumption that the different study effects are not identical but follow some distribution (Cochrane handbook 9.1, 9.5.4). Furthermore, the evolution and progression of PHE can vary dramatically between patients. Our meta-analysis is based on overall results extracted from different studies and is not a patient-level meta-analysis. Therefore, the overall conclusions may not be applicable to all patients with ICH. Another limitation is the disparity in the adjustment for confounding prognostic variables between studies. To address this limitation, we decided a priori on a minimum set of prognostic variables that had to be adjusted for in our quality assessment of eligible studies. However, we were still not able to account for other potential predictors of outcome such as hematoma growth, withdrawal of care, acute treatments, or comorbidities. Lastly, our analysis predominantly included studies using measures of absolute PHE volume and growth. Other parameters of PHE such as edema extension distance [63], PHE growth rate [13], and peak PHE volume [43] were either not assessed or only assessed in a handful of studies. Therefore, the prognostic value of these parameters was not fully evaluated. The reported odds ratios for most of the studies included in this meta-analysis were small; all but three were between zero and one or slightly above one. These small individual effect sizes, together with the large heterogeneity, might be responsible for the negligible pooled effect size.

Despite rising interest in and publications on PHE and outcome after ICH and its role as a potential therapeutic target, there is a paucity of meta-analyses on this topic. We are aware of only one previous systematic review, which calculated a few smaller meta-analyses [64]. The authors probed 21 trials that investigated the prognostic role of PHE in ICH, but unlike our study they did not run an overall meta-analysis. Their few small meta-analyses, which never included more than three studies per analysis because they only combined those utilizing the exact same PHE measures, showed an advantage for PHE growth and expansion rate over PHE volume. Our meta-analysis represents an advance on the analysis by Yu et al. [64] because it incorporates new publications on this topic, particularly those from randomized clinical trials, and calculates an effect size across all eligible studies. Overall, these two investigations seem to have similar results, reaffirming the call for future studies to adopt more coordinated, systematic, and standardized assessment, design, and analyses [59].

Conclusions

To summarize, this meta-analysis demonstrates that PHE volume within the first 72 h after ictus has a weak effect on functional outcome and mortality after ICH, whereas PHE growth might have a slightly larger impact. Definitive conclusions are limited by the large variability of PHE measures, heterogeneity, and different evaluation time points between studies. This meta-analysis highlights the challenges related to interpretation of existing data on the relationship between PHE and outcome, given the variability in imaging techniques, timing of PHE and outcome assessments, and PHE parameters between studies. Our findings call attention to the need for future studies to use standardized timing, measures, and quantification of PHE to accurately assess the relationship between PHE and outcome after ICH.