Introduction

Intracerebral hemorrhage (ICH) accounts for 10–15% of all strokes [1] and is one of the leading causes of stroke-related morbidity, mortality, and disability worldwide [2, 3]. Despite advances in medical knowledge, treatment for this most deadly and disabling form of stroke remains strictly supportive, with no evidence-based interventions currently available [4]. Medical and surgical treatments, such as blood pressure control, osmotherapy, and hematoma evacuation, have not shown definitive benefits in improving outcome [4, 5]. Moreover, a number of promising neuroprotective agents for ICH have failed to demonstrate efficacy in Phase III trials. Importantly, debate continues over the development of a standardized and widely accepted clinical grading scale and outcome prediction model in ICH [6–8]. This has led to ongoing heterogeneity in ICH management and inconsistencies in patient enrollment criteria for clinical trials investigating new therapies [7].

A clinical grading scale is important for standardizing assessment and communication among clinicians. Moreover, a proper grading scale facilitates prognostication and treatment selection for patients as well as a clear comparison of clinical studies, quality of care, patient selection, and treatment efficacy for clinical trials [7, 9]. An efficient and effective scale will also minimize the cost, pain, and recovery effort resulting from medically futile treatments and may encourage a bedside, humanitarian approach [8]. Examples of widely accepted clinical grading scales include the Hunt–Hess Grade [10] and the World Federation of Neurological Surgeons (WFNS) scale [11] for aneurysmal subarachnoid hemorrhage, the National Institutes of Health Stroke Scale (NIHSS) for ischemic stroke, the Glasgow Coma Scale (GCS) for traumatic brain injury [12], and the Spetzler–Martin scale for cerebral arteriovenous malformation [13].

For a clinical grading scale to become widely used and effective, it must be accurate and reliable in risk-stratification and/or outcome prediction, but also easy to use without requiring special training or statistical expertise [7]. For convenience and efficacy, an ideal grading scale should consist of factors assessed during routine clinical evaluations. In 2001, Hemphill et al. [7] introduced the original ICH score (oICH), which represents one of the first simple and easily assessable clinical grading scales for ICH. Since then, a number of modifications to the oICH score have been proposed and other pragmatic ICH scales have been developed. With many ICH grading scales and modified versions available, it is becoming increasingly difficult for clinicians and researchers to determine which scales offer optimal predictability and reliability. In this article, we review the existing clinical grading scales for ICH with emphasis on their development and validation. PubMed and MEDLINE were searched extensively using the following keywords singly and in combination: “intracerebral hemorrhage,” “grading,” “score,” “scale,” “clinical,” “ICH Score,” “outcome,” “prediction,” “risk,” “stratification,” “validation,” “prognosis,” “mortality,” “functional,” and “withdrawal.” The reference lists of all eligible publications were reviewed for potentially relevant studies.

ICH Scores

The Original ICH Score

In 2001, Hemphill et al. [7] introduced the ICH score, now known as the oICH score, as a means to risk-stratify patients at presentation with regard to 30-day mortality. The oICH score consists of five clinical and radiographic factors that are commonly assessed in the setting of ICH: GCS, ICH volume, intraventricular hemorrhage (IVH), infratentorial location of ICH, and age (Fig. 1) [7]. The score ranges from 0 (e.g., a young patient with supratentorial ICH who has a GCS of 14 at presentation with a small hematoma and no IVH) to 6 (e.g., an elderly patient with infratentorial ICH who is comatose following ICH with a large hematoma and IVH). In the original cohort, a score of 3 and above was associated with a significant increase in the risk of 30-day mortality (72 vs. 26% for an oICH score of 2) [7]. In subsequent validation studies, scores of 0, 1, 2, 3, 4, 5, and 6 have been associated with 30-day mortality of 0–10, 7–13, 30–44, 56–78, 70–100, 100, and 100%, respectively [6, 14–18]. The oICH score represents the most widely validated ICH grading scale to date and has served as a control for a number of grading scale comparison studies [8, 19]. Recent applications to stratify patients for ICH clinical trials [20, 21] and to determine the effects of statins on mortality after ICH [22] evince the increasing acceptance of the oICH score.

Fig. 1 The original ICH score and modified versions

The oICH score was derived from a retrospective analysis of 152 spontaneous (i.e., nontraumatic) ICH patients who presented to two centers based in San Francisco, California, USA (Table 1) [7]. After identifying the five independent predictors of 30-day mortality, points were assigned to each component based on its correlation with outcome. GCS at time of transfer from emergency department to intensive care unit (ICU), hospital ward, or operating room was the strongest independent predictor of outcome and was therefore given the most weight. The GCS component was categorized into three subgroups in order to more accurately account for significant clinical variability [7]. The remaining four components of the oICH score had similar strength of association with outcome and hence were weighted the same. Age was included in the oICH score, although it has been reported as an inconsistent predictor of outcome [23–25]. Concerning ICH volume, a threshold of 30 cc was chosen through analysis of previous studies [23] to distinguish between small and medium hematomas, of which the latter were associated with increased risk of death in the cohort. In contrast, differentiating “medium” and “large” hematomas (i.e., 60 cc and above) failed to improve the scale, possibly due to the significant confounding effects of GCS, age, and the presence of IVH on large hematoma size [7]. The IVH component of the scale was dichotomized to minimize the subjectivity associated with quantifying IVH. In addition, factors associated with clinical and radiologic progression of ICH were not included because the oICH score was specifically developed to be used at presentation.
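The point assignment described above can be sketched as a simple sum. The following is a minimal illustration, not the authors' implementation: the 30 cc volume threshold, the dichotomized IVH component, the three GCS subgroups, and the 0 to 6 range come from the text, whereas the specific GCS bands (3–4, 5–12, 13–15) and the age cut-off of 80 years are assumptions taken from the commonly published form of the score and should be checked against Fig. 1. The function names are hypothetical.

```python
def gcs_points(gcs: int) -> int:
    """GCS component: three subgroups, the most heavily weighted factor.

    The band limits (3-4, 5-12, 13-15) are assumed; the text only states
    that three subgroups are used.
    """
    if not 3 <= gcs <= 15:
        raise ValueError("GCS must be between 3 and 15")
    if gcs <= 4:
        return 2
    if gcs <= 12:
        return 1
    return 0


def oich_score(gcs: int, volume_cc: float, ivh: bool,
               infratentorial: bool, age: int) -> int:
    """Sum the five oICH components (range 0-6)."""
    return (
        gcs_points(gcs)
        + (1 if volume_cc >= 30 else 0)     # 30 cc threshold from the text
        + (1 if ivh else 0)                 # IVH dichotomized: present/absent
        + (1 if infratentorial else 0)      # infratentorial origin
        + (1 if age >= 80 else 0)           # assumed age cut-off (80 years)
    )


low = oich_score(gcs=14, volume_cc=10, ivh=False, infratentorial=False, age=55)
high = oich_score(gcs=3, volume_cc=60, ivh=True, infratentorial=True, age=85)
print(low, high)  # 0 6
```

The two calls correspond to the extreme cases described for the score: a young patient with a small supratentorial hematoma, and an elderly comatose patient with a large infratentorial hematoma and IVH.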

Table 1 The original ICH score and modified versions

Evaluation and Validation of the Original ICH Score: 30-Day Mortality

The oICH score has been independently validated as an accurate risk-stratification tool for 30-day mortality in both spontaneous and primary (i.e., ICH due to chronic hypertension or amyloid angiopathy) ICH cohorts [14, 15, 17, 18]. In 2004, Clarke et al. [14] applied the oICH score to a retrospectively analyzed cohort of 175 spontaneous ICH patients treated at Stanford and Santa Clara Medical Centers in California and determined that the scoring system accurately stratified patients at presentation with regard to 30-day mortality. As in the Hemphill et al. cohort, an oICH score of 3 was associated with greater than 70% 30-day mortality. The similar predictive utility of the oICH score in the cohort of Clarke et al. [14] and in that of Hemphill et al. may partly be attributed to the geographic proximity of the study centers and, hence, to similar ICH populations or general management attitudes.

The oICH score has been validated in geographically, culturally, and socioeconomically distinct ICH cohorts. In 2002, Fernandes et al. [15] retrospectively applied the oICH score to 393 spontaneous ICH patients from the University of Newcastle, United Kingdom. Although mortality was variably assessed at 2–4 weeks after ICH onset, the oICH score accurately risk-stratified the patients with regard to 30-day mortality. The oICH score’s utility as a 30-day mortality risk-assessment tool was validated again when applied retrospectively to 70 spontaneous ICH patients from a rural, agricultural town in Argentina [16]. The oICH score has been validated prospectively as well. Jamora et al. [17] prospectively applied the oICH score to 302 consecutive spontaneous ICH patients admitted to two large centers in the Philippines. The scoring system was able to accurately risk-stratify the 243-patient cohort with regard to 30-day mortality. Although the overall mortality was lower (23%) compared to previous studies, an oICH score of 3 or above was again associated with high (>60%) 30-day mortality. In another study, Patriota et al. [18] applied the oICH score to a prospective cohort of 37 primary ICH patients from São Paulo, Brazil and found that the score accurately stratified the patients according to the risk of 30-day mortality. In the Brazilian cohort, sensitivity and specificity of the oICH score were 85.7 and 65.2%, respectively, when using the score cut-off >2 [18].
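Sensitivity and specificity at a given score cut-off, as reported for the Brazilian cohort, follow from a simple two-by-two tabulation. The sketch below illustrates the arithmetic only; the cohort in it is a hypothetical toy example and does not reproduce data from any cited study, and the function name is illustrative.

```python
def sensitivity_specificity(scores, died, cutoff):
    """Treat score > cutoff as a positive prediction of death.

    Returns (sensitivity, specificity) as fractions between 0 and 1.
    """
    pairs = list(zip(scores, died))
    tp = sum(1 for s, d in pairs if s > cutoff and d)        # predicted death, died
    fn = sum(1 for s, d in pairs if s <= cutoff and d)       # predicted survival, died
    tn = sum(1 for s, d in pairs if s <= cutoff and not d)   # predicted survival, survived
    fp = sum(1 for s, d in pairs if s > cutoff and not d)    # predicted death, survived
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy cohort (NOT data from any cited study).
toy_scores = [0, 1, 1, 2, 2, 3, 3, 4, 4, 5]
toy_died = [False, False, False, False, True, False, True, True, True, True]

sens, spec = sensitivity_specificity(toy_scores, toy_died, cutoff=2)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
# sensitivity=0.80, specificity=0.80
```

Raising the cut-off generally trades sensitivity for specificity, which is why reported values are only meaningful together with the cut-off used.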

Evaluation and Validation of the Original ICH Score: Functional Outcome

Although the oICH score was developed for 30-day mortality, studies have used it to risk-stratify patients with regard to both early and late functional outcome. Current published results are inconsistent and the predictive utility of the score is specific to the outcome measure of interest [15, 17, 18, 26]. In 2002, Fernandes et al. [15] reported that the oICH score may be suboptimal at predicting unfavorable outcome (i.e., severe disability, vegetative state, and mortality) at “neurosurgical discharge” because such outcome was seen in nearly all patients with oICH score ≥2. Although a similarly high rate (90%) of poor functional outcome (i.e., modified Rankin Scale (mRS) ≥4 [27]) was seen at discharge and at 1 month in patients with oICH scores of 3, the oICH score was predictive of functional outcome at both time points in the Filipino cohort [17]. The oICH score was also used to predict good functional outcome at 30 days (Glasgow Outcome Scale (GOS) ≥4 [28]) in the Brazilian cohort [18]; the sensitivity and specificity were 100 and 42.5%, respectively. In the same study, however, the oICH score correlated poorly with worse outcome (GOS 2, 3) and did not adequately discriminate this group of patients with regard to outcome.

The oICH score has also been used to risk-stratify patients with regard to long-term functional outcome, at 12 months [6, 26]. The authors applied the score to a prospective cohort of 243 spontaneous ICH patients who presented to two centers in San Francisco, California and assessed functional outcome at 12 months [26]. In the study, increasing oICH score was associated with a lower probability of favorable outcome at 12 months. In addition, the score accurately stratified patients with regard to 12-month functional outcome (mRS), regardless of the specific cut-off point for outcome definition. Such results are yet to be replicated in studies with similarly long-term follow-up.

Modification of the Original ICH Score

The oICH score has been externally validated in both independent and distinct populations as a cohesive and manageable risk-stratification tool for mortality and short-term functional outcome [7, 15, 17, 18, 26]. With an increased importance now placed on predicting mortality and functional outcome after ICH, a number of studies have sought to improve the predictive utility of the oICH score through modifications (Fig. 1; Table 1). Of note, these studies also validate the oICH score in their respective cohorts.

Modified ICH Score

In 2003, Cheung et al. [29] developed the modified ICH score by substituting NIHSS for GCS (Fig. 1). The NIHSS, a widely accepted assessment scale for ischemic stroke, may be advantageous over GCS in predicting outcome following ICH due to its wider range and the ability to assess neurological dysfunction in addition to level of consciousness [29]. Moreover, unlike GCS, NIHSS has not been associated with bias against aphasic patients [30].

The modified ICH score was developed from a retrospective analysis of 141 prospectively enrolled spontaneous ICH patients treated in a Hong Kong-based center (Table 1). Based on evaluations made at time of admission, NIHSS, and not GCS, was identified as an independent predictor for both 30-day mortality and good outcome (mRS ≤ 2) and was therefore included in the scale. The NIHSS, like the GCS component in the oICH score, was divided into three categories: 0–10 (0 points), 11–20 (1 point), and 21–40 (2 points). The ranges of the three NIHSS categories were chosen based on the results of a previous ischemic stroke study in which these thresholds optimized the correlation between initial neurologic deficits and outcome [31]. In the cohort, the oICH score accurately risk-stratified patients with regard to both 30-day mortality and good functional outcome. Compared to the oICH score, the modified ICH score was slightly worse in predicting 30-day mortality but performed better in predicting good outcome. The authors also reported that the modified ICH score is more reliable than the oICH score for predicting both outcome measures. Though promising, the modified ICH score needs external validation.
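The NIHSS component that replaces GCS in the modified ICH score can be expressed as a three-band lookup. This is a minimal sketch using only the category limits given in the text; the function name is hypothetical, and treating any value above 20 as the highest category is an assumption made for simplicity.

```python
def nihss_points(nihss: int) -> int:
    """Map an admission NIHSS to modified ICH score points.

    Bands follow the text: 0-10 -> 0, 11-20 -> 1, 21-40 -> 2.
    Values above 20 are all treated as the top band (assumption).
    """
    if nihss < 0:
        raise ValueError("NIHSS cannot be negative")
    if nihss <= 10:
        return 0
    if nihss <= 20:
        return 1
    return 2


print([nihss_points(n) for n in (4, 10, 11, 20, 21, 35)])
# [0, 0, 1, 1, 2, 2]
```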

mICH Scores

Godoy et al. [6] developed two modified versions of the oICH score, the mICH-A and mICH-B scores, to more accurately predict 30-day mortality and 6-month good functional outcome (GOS ≥ 4) (Fig. 1). Both mICH scores include four of the five components of the oICH score, with different cut-off points (Fig. 1; Table 1). Infratentorial location is excluded due to its uncertain prognostic value [14, 24]. Additional categories of age and inclusion of the APACHE II [32]-defined co-morbidities aim to more accurately estimate the health status of elderly patients across different cultural and socioeconomic populations [6]. The number of categories for GCS is increased from three to four to account for this component’s strong influence on outcome. Similarly, ICH volume is divided into three categories to account for the effects of hematoma size and growth (if assessed multiple times during hospital course) on outcome. The Graeb Score [33] is included to grade severity of IVH. Further characterization of IVH has also been recommended because it has been shown to provide additional prognostic information [34]. Hallevi et al. [35] demonstrated this by substituting ICH volume with total hemorrhage volume (i.e., ICH volume + IVH volume) to increase the oICH score’s specificity for in-hospital mortality.

The mICH scores were developed based on the analysis of a prospective cohort of 153 primary ICH patients admitted to ICUs located in Buenos Aires, Argentina (Table 1). Similar to the Hemphill study, only patients presenting directly to the study centers within 24 h of ICH onset were enrolled. The oICH score reliably predicted mortality though less so for good outcome (GOS ≥ 4) in the study cohort. In comparison, the mICH scores were equally predictive of mortality but were significantly better at predicting good outcome. Generally, greater predictive utility was seen with the mICH-B score despite having fewer categories for age and the Graeb scale [6].

New ICH Grading Scales

The oICH score and the modified versions of the score are simple, easy to use, and predict mortality and/or functional outcome with acceptable accuracy and reliability. Nevertheless, the oICH score may not be optimal for estimating functional outcome as it was originally intended to predict 30-day mortality and does not include factors that have been strongly associated with outcome following ICH [36]. A number of new ICH clinical grading scales have been developed to predict various outcome measures following ICH (Fig. 2; Table 2). These scores are based on independent predictors of outcome from the respective cohorts and some have attempted to include novel factors, such as pre-ICH neurological impairment and dialysis dependency, to improve outcome prediction [9, 36].

Fig. 2 ICH clinical grading scales independently developed since the original ICH score

Table 2 ICH grading scales independently developed since the original ICH score

ICH-GS

In 2007, Ruiz-Sandoval et al. [8] introduced the ICH grading scale (ICH-GS), which is assessed at initial evaluation to predict in-hospital mortality as well as 30-day mortality and good functional outcome (GOS ≥ 4) at 30 days. The score ranges from 5 to 13 and utilizes statistically derived cut-off points for the GCS component that are different from those of the oICH score (Fig. 2). Another notable feature of the ICH-GS is that the categorization and point assignment for ICH volume depends on the ICH location (i.e., infratentorial vs. supratentorial). This feature accounts for the difference in compliance of the two spaces. Moreover, this modification allows the scale to more accurately describe both ICH and patient characteristics. Thus, unlike the oICH score [patients with an oICH score of 6 (i.e., elderly patients with infratentorial ICH who present comatose with a large hematoma and IVH) are rarely seen] [6, 14–18], it may be more likely to find patients with every ICH-GS score. This may lead to a finer risk-stratification of patients and improved prediction of outcome.
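The location-dependent volume categories are the distinctive feature of the ICH-GS and are easiest to see in code. The sketch below is illustrative only: the text gives the five components and the 5 to 13 range but not the individual cut-offs, so every threshold in the code (age 45 and 65 years, GCS bands 13–15, 9–12, and 3–8, and volume limits of 40/70 cc supratentorial and 10/20 cc infratentorial) is an assumption based on the commonly published form of the scale and should be verified against Fig. 2. The function name is hypothetical.

```python
def ich_gs(age: int, gcs: int, infratentorial: bool,
           volume_cc: float, ivh: bool) -> int:
    """Illustrative ICH-GS sum (range 5-13). All cut-offs are assumptions."""
    age_pts = 1 if age < 45 else 2 if age < 65 else 3
    gcs_pts = 1 if gcs >= 13 else 2 if gcs >= 9 else 3
    loc_pts = 2 if infratentorial else 1
    # Volume bands depend on location: the infratentorial space is less
    # compliant, so smaller hematomas earn more points there.
    lower, upper = (10, 20) if infratentorial else (40, 70)
    vol_pts = 1 if volume_cc < lower else 2 if volume_cc <= upper else 3
    ivh_pts = 2 if ivh else 1
    return age_pts + gcs_pts + loc_pts + vol_pts + ivh_pts


# The same 15 cc hematoma scores differently depending on location.
supra = ich_gs(age=50, gcs=14, infratentorial=False, volume_cc=15, ivh=False)
infra = ich_gs(age=50, gcs=14, infratentorial=True, volume_cc=15, ivh=False)
print(supra, infra)  # 6 8
```

Because every component contributes at least one point, the minimum is 5 rather than 0, which matches the range stated in the text.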

The ICH-GS was developed from a retrospectively analyzed cohort of 378 primary ICH patients admitted to centers in Guadalajara, Mexico (Table 2). Although the ICH-GS consists of the same five components as the oICH score, it was independently derived rather than obtained through modification of the oICH score. In the Mexican cohort of 310 patients, the ICH-GS had significantly higher sensitivity than the oICH score in predicting in-hospital mortality and 30-day mortality. In addition, the ICH-GS more accurately predicted good functional outcome at 30 days as compared to the oICH score.

Recently, Di Napoli and Godoy directly compared the oICH score, ICH-GS, and mICH-B score in their abilities to predict 30-day mortality and good outcome (GOS ≥ 4) at 6 months in their Argentinean cohort [37]. Optimal cut-off values of the three scales were identified and compared. Although all three scoring systems demonstrated acceptable accuracy and reliability, both the oICH score and the mICH-B score were better at predicting 30-day mortality than the ICH-GS [37]. The mICH-B score was better at predicting 6-month good functional outcome than both the oICH score and the ICH-GS.

FUNC Score

In 2008, Rost et al. [36] proposed the FUNC score, to be assessed during initial evaluation to predict functional independence (GOS ≥ 4) at 3 months. The FUNC score is an 11-point scale based on five components: age, ICH location (lobar, deep, or infratentorial), ICH volume, GCS, and presence of pre-ICH cognitive impairment (Fig. 2). The cut-off points for GCS and ICH volume were determined based on clinical relevance, and categories for age were based on the distribution of the cohort [36]. ICH location was categorized and assigned points based on the strength of association with outcome. The presence of IVH was not included because it did not independently predict outcome in the cohort and its addition did not improve the score’s ability to predict outcome. Pre-ICH cognitive impairment is uniquely included to account for the significant effects of pre-morbid status on the success of rehabilitation and functional recovery [38].
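The structure of the FUNC score, in which higher totals indicate a greater chance of functional independence, can be sketched as follows. The five components and the 11-point maximum come from the text; the individual point values and cut-offs in the code (volume <30, 30–60, >60 cc; age <70, 70–79, ≥80 years; GCS ≥9) are assumptions based on the commonly published form of the score and should be checked against Fig. 2. The function and parameter names are hypothetical.

```python
# Assumed location points: lobar > deep > infratentorial.
LOCATION_POINTS = {"lobar": 2, "deep": 1, "infratentorial": 0}


def func_score(volume_cc: float, age: int, location: str, gcs: int,
               pre_ich_cognitive_impairment: bool) -> int:
    """Illustrative FUNC sum (0-11); higher means better expected function."""
    vol_pts = 4 if volume_cc < 30 else 2 if volume_cc <= 60 else 0
    age_pts = 2 if age < 70 else 1 if age < 80 else 0
    loc_pts = LOCATION_POINTS[location]
    gcs_pts = 2 if gcs >= 9 else 0
    cog_pts = 0 if pre_ich_cognitive_impairment else 1
    return vol_pts + age_pts + loc_pts + gcs_pts + cog_pts


best = func_score(20, 60, "lobar", 15, pre_ich_cognitive_impairment=False)
worst = func_score(70, 85, "infratentorial", 6, pre_ich_cognitive_impairment=True)
print(best, worst)  # 11 0
```

Note the direction of the scale: unlike the oICH score, points are awarded for favorable features, so the two scores cannot be compared numerically.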

The FUNC score was developed from a retrospective analysis of 629 primary ICH patients admitted to the emergency department at the Massachusetts General Hospital (Table 2). Unlike other studies, patients who were urgently transferred from other centers were included in the study. Patients with secondary causes of ICH or ICH due to excessive anticoagulation (INR > 3.0) were excluded from the study. Unlike the Hemphill study [7], the GCS component was assessed upon admission to the emergency department. Also, ICH volume was calculated using computer-assisted planimetric analysis rather than the ABC/2 method. Pre-morbid cognitive impairment was based on family interview and medical records as well as the Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE) [39]. Two-thirds of the entire cohort was randomly selected for the FUNC score development and the rest for validation of the score. In both subsets, the FUNC score accurately predicted functional independence; among patients who scored 1, more than 70% of patients demonstrated functional independence at 90 days whereas only 45% of patients with a score of 5 achieved the same status [36]. When applied only to those who survived to 90 days (55%), the FUNC score retained its predictability and reliability, suggesting that the score is minimally affected by withdrawal of care. The authors also reported that the FUNC score performed equally well even when applied to patients who were transferred from other centers. External validation of the FUNC score is necessary.

New ICH Score

In 2003, Cheung et al. [29] developed the new ICH score using the same Hong Kong cohort they used to validate the oICH score and develop the modified ICH score. The new ICH score is one of the earliest ICH grading scales to incorporate medical variables. The authors identified five admission-associated independent predictors of 30-day mortality and good outcome (mRS ≤ 2) and assigned points to develop the new scoring system: NIHSS (identical cut-off and points as the modified ICH score), temperature (0 point if <36°C; 1 point if >36°C), pulse pressure (0 points if ≥60 mmHg; 1 point if ≤60 mmHg), presence of IVH (0 point if no; 1 point if yes), and subarachnoid extension (0 point if no; 1 point if yes) (Fig. 2). One advantage of the new ICH score is that it does not require hematoma volume calculation, which can be inconvenient and imprecise even with the popular ABC/2 method [40]. The new ICH score was sensitive (91%) with a high negative predictive value (97%) for mortality and specific (92%) with high positive (85%) and negative predictive values (82%) for good outcome. Compared with the oICH score, the new ICH score was less accurate in predicting mortality but better at predicting good outcome. The new ICH score has not been validated in an independent cohort.
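For readers unfamiliar with the ABC/2 method mentioned above, it approximates the hematoma as an ellipsoid: A is the largest hemorrhage diameter on the CT slice with the largest area of hemorrhage, B is the diameter perpendicular to A on the same slice, and C is the vertical extent (approximately the number of slices showing hemorrhage multiplied by slice thickness). The sketch below shows only the basic arithmetic and omits the slice-weighting refinements used in practice; the function name is hypothetical.

```python
def abc2_volume_cc(a_cm: float, b_cm: float, c_cm: float) -> float:
    """Ellipsoid approximation of hematoma volume in cc (cm^3).

    Derived from (4/3) * pi * (A/2) * (B/2) * (C/2) with pi rounded to 3,
    which simplifies to A * B * C / 2.
    """
    if min(a_cm, b_cm, c_cm) < 0:
        raise ValueError("diameters must be non-negative")
    return a_cm * b_cm * c_cm / 2


# Example: 5 cm x 4 cm on the largest slice, 3 cm of vertical extent.
print(abc2_volume_cc(5.0, 4.0, 3.0))  # 30.0
```

In this example the estimate falls exactly on the 30 cc threshold used by the oICH score, which illustrates how small measurement differences in A, B, or C can move a patient across a volume category.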

Essen ICH Score

In 2006, Weimar et al. [19] introduced the Essen ICH score, to be assessed at admission to predict 100-day functional independence or complete recovery (Barthel Index (BI) [41] ≥95). The score ranges from 0 to 10 and consists of five categories of NIHSS (0–5, 6–10, 11–15, 16–20, >20 or coma), four categories of NIHSS level of consciousness (alert, drowsy, stupor, coma), and four categories of age (<60, 60–69, 70–79, and ≥80) (Fig. 2). The cut-off values were statistically derived to yield the best-fit model. No radiographic factors independently predicted outcome in the cohort, and none were included in the score. Consequently, the Essen ICH score does not require the calculation of hematoma volume. Importantly, since the score predicts functional recovery, it has the potential to help design clinical trials of novel therapies by correctly identifying and excluding patients with a high chance of full recovery under conventional management strategies [19].
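The three components of the Essen ICH score can be sketched as ordinal lookups. The category boundaries and the 0 to 10 range are taken from the text; the point values are an assumption, namely that each category earns one point more than the previous one (0–4 for NIHSS, 0–3 for level of consciousness, 0–3 for age), which is the simplest assignment consistent with the stated range and should be checked against Fig. 2. The function name is hypothetical.

```python
# Assumed ordinal points for NIHSS level of consciousness.
LOC_POINTS = {"alert": 0, "drowsy": 1, "stupor": 2, "coma": 3}


def essen_ich_score(age: int, nihss_total: int, loc: str) -> int:
    """Illustrative Essen ICH sum (0-10). Point values are assumptions."""
    age_pts = 0 if age < 60 else 1 if age < 70 else 2 if age < 80 else 3
    if loc == "coma" or nihss_total > 20:
        nihss_pts = 4          # top category: ">20 or coma"
    elif nihss_total <= 5:
        nihss_pts = 0
    elif nihss_total <= 10:
        nihss_pts = 1
    elif nihss_total <= 15:
        nihss_pts = 2
    else:
        nihss_pts = 3
    return age_pts + LOC_POINTS[loc] + nihss_pts


print(essen_ich_score(55, 3, "alert"))    # 0
print(essen_ich_score(65, 12, "drowsy"))  # 4
print(essen_ich_score(85, 25, "coma"))    # 10
```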

The Essen ICH score was initially developed as a logistic regression model [19, 30]. The German Stroke Study Collaboration group analyzed 207 of 327 prospectively followed spontaneous ICH patients admitted to 11 different neurological ICUs. In the initial study, independent predictors of functional recovery were identified [30]. Only patients admitted within 6 h of onset and prior mRS ≤2 were enrolled to ensure that the study cohort consisted of only patients with some level of functional independence. Patients who were intubated or comatose at admission were excluded, as their chance of achieving functional independence was deemed unlikely [30]. Admitting physician’s prediction for outcome [death, severe dependence (BI < 70), moderate dependence (BI 70–90), and functional independence (BI ≥ 95)] at 100 days was assessed within the first 24 h after admission. Due to the management preference at the center, all patients were assessed using the NIHSS but not GCS, and therefore, GCS was not tested in the cohort. The NIHSS component of the score was assessed both at admission and 48–72 h after admission. Subsequently, only age and NIHSS were identified as independent predictors of outcome based on data from patients who were neither comatose nor intubated at admission who also completed their 100-day follow-up. Based on these findings, the Essen ICH score was applied to another prospective cohort of spontaneous ICH patients from a German center who were admitted within 24 h of onset [19] (Table 2). At this point, NIHSS level of consciousness was added to the score to account for intubated or comatose patients that were initially excluded from the analysis. When applied to 304 patients who met the enrollment criteria and also completed follow-up, the Essen ICH score demonstrated accurate prediction of outcome with scores >7 and <3 predicting mortality and functional independence, respectively [19].

The Essen ICH score has been validated in an independent cohort from another German hospital and has been shown to be superior at predicting functional independence at 100 days, compared to the oICH and the modified ICH scores as well as admitting physicians [19]. The Essen ICH score has also been validated using the Virtual International Stroke Trials Archive (VISTA), which comprises patient data from 21 acute stroke randomized clinical trials [42]. The authors selected 564 patients from VISTA who were assessed within 24 h of onset with NIHSS and also had 3-month outcome data (either in BI or mortality status). In this very specific cohort, the Essen ICH score demonstrated 78% accuracy in predicting functional recovery [42].

Simplified ICH Score

In 2009, Chuang et al. [9] developed the simplified ICH (sICH) score to predict 30-day mortality in spontaneous ICH patients at presentation (Fig. 2). The sICH score ranges from 5 to 12 points and consists of GCS score, history of hypertension, admission serum glucose, dialysis dependency, and age. Of the five components, dialysis dependency was the strongest predictor of 30-day mortality, followed by GCS of 3 or 4 at initial evaluation. Points have been assigned to each component on the basis of the strength of association with outcome. Dialysis dependency as a predictor of outcome is novel, and its association with outcome is attributed to potential uremia-induced platelet dysfunction resulting in an increased risk of hemorrhage [9]. Chronic hypertension, a well-recognized risk factor for ICH, can contribute to mortality by impairing cerebrovascular auto-regulation and perfusion. Although it is unclear how hyperglycemia at admission contributes to outcome, potential mechanisms may involve exacerbation of cerebral edema and cerebral damage [9]. The sICH score, like the new ICH and Essen scores, does not include radiologic factors. In particular, the presence of IVH and hematoma volume were not independent predictors of outcome: the presence of IVH did not represent the severity of ICH in the cohort, and hematoma volume calculation using the ABC/2 method may have been imprecise [9].

The sICH score was developed from a retrospective analysis of 293 spontaneous ICH patients who were admitted within 24 h of ICH onset to a veterans’ hospital in Taiwan (Table 2). Only those who presented to the study center were included in the study. Through cross-validation, the authors demonstrated that the sICH score had high accuracy (81%), sensitivity (83%), and specificity (81%) for 30-day mortality and performed better than the oICH score and the ICH-GS. External validation of the sICH score is necessary.

MICH Score

Many authors have cautioned against using ICH grading scales to make treatment decisions and called for further investigation to test their utility in the clinical decision making process [7, 17]. In 2008, Cho et al. [20] developed a modified ICH (MICH) score aimed at guiding treatment decisions in the setting of basal ganglia ICH. More specifically, the MICH score is intended to determine when to proceed with endoscopic surgical intervention over conservative treatment. The MICH score consists of GCS score, ICH volume, and the presence of IVH or hydrocephalus (Fig. 2). The GCS component has the same cut-off points as in the oICH score, and the ICH volume is divided into three categories: ≤20 cc (0 points), 21–50 cc (1 point), and ≥51 cc (2 points). Twenty cc is used as the lower cut-off because an ICH volume >20 cc has an increased likelihood of surgical evacuation due to high intracranial pressure [43]. Moreover, 50 cc is the upper cut-off because the benefit from surgical intervention seems to be the greatest with an ICH volume >50 cc [23]. The presence of either IVH or hydrocephalus is assigned 1 point, while age is excluded due to lack of association with outcome.
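The MICH components described above can be summed as in the following sketch. The volume bands and the single point for IVH or hydrocephalus follow the text; the GCS bands (3–4, 5–12, 13–15) are an assumption carried over from the commonly published oICH score, and a volume of exactly 20 cc, as well as any value between 50 and 51 cc, is assumed to fall in the lower adjacent category. The function name is hypothetical, and the sketch is an illustration of the arithmetic rather than a decision tool.

```python
def mich_score(gcs: int, volume_cc: float, ivh_or_hydrocephalus: bool) -> int:
    """Illustrative MICH sum for basal ganglia ICH (range 0-5)."""
    if not 3 <= gcs <= 15:
        raise ValueError("GCS must be between 3 and 15")
    # GCS bands assumed identical to the published oICH score.
    gcs_pts = 2 if gcs <= 4 else 1 if gcs <= 12 else 0
    # Volume bands from the text; boundary handling is an assumption.
    vol_pts = 0 if volume_cc <= 20 else 1 if volume_cc <= 50 else 2
    return gcs_pts + vol_pts + (1 if ivh_or_hydrocephalus else 0)


print(mich_score(14, 15, False))  # 0
print(mich_score(10, 30, False))  # 2
print(mich_score(4, 60, True))    # 5
```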

The MICH score was based on data from a prospectively randomized trial of 226 basal ganglia ICH patients admitted to a Taiwanese center (Table 2) [20]. Unlike the original study by Hemphill et al., only primary ICH patients were included. Moreover, patients with very small (≤10 cc) hematomas were excluded on ethical grounds, while patients with large (≥100 cc) hematomas and/or signs of imminent herniation were excluded to minimize treatment bias [20]. Patients with ICH due to antithrombotic agents were also excluded. Outcome measures were 6-month mortality and good functional outcome (BI ≥ 55 and GOS ≥ 4) as well as 1-year good functional outcome. Based on the results of the study, the authors recommended that basal ganglia ICH patients with a low MICH score (0, 1) should undergo conservative treatment. Surgical treatment was recommended for patients with MICH score ≥2 to obtain good neurologic outcome, and for those with a high MICH score (3, 4) to decrease mortality [20]. External validation of the MICH score is necessary.

Current Issues and Future Directions

During the past decade, many ICH scales have been developed in an attempt to create a scale that is simple, generally applicable, and predictive of mortality and functional outcome. Many of them have been externally validated as practical assessment tools with acceptable predictive utility. These grading scales and their modified versions have been developed with clinically and statistically sound methodologies and have the potential to become universally accepted.

Identification of the most reliable and accurate ICH grading scale would be of great value to clinicians, patients, and researchers alike. Unfortunately, it remains unclear as to which scale(s) is the best for risk-stratification and outcome prediction after ICH. This is largely due to the absence of studies that directly compare the predictive abilities of all the ICH grading scales in one large, prospective independent cohort. Until such studies are undertaken, critical evaluation, selection and application of the grading scales are imperative.

Currently available ICH grading scales should be used upon careful consideration of the study design and outcome measures. First, understanding the patient selection criteria provides insight into the grading scales’ degree of general applicability and the intended, as well as appropriate, circumstances for use. For instance, grading scales developed from spontaneous (i.e., nontraumatic) ICH cohorts may significantly differ from those based on primary (i.e., hypertensive or amyloid-angiopathic) ICH patients. Similarly, grading scales for primary ICH patients may be less reliable for predicting outcome after secondary ICH, such as AVM-associated ICH, which is associated with distinct epidemiology, pathophysiology, and outcome [44]. Likewise, grading scales based on cohorts with urgently transferred patients or those without patients with anticoagulation-induced ICHs must be evaluated accordingly. Other considerations for grading scale evaluation should include discrepancies in age, sex distribution, ICH location, rate of surgical hematoma evacuation, and mortality [6–8, 20]. Also, location of hospitalization (e.g., general ward vs. ICU) should be taken into account as the level and quality of care may vary significantly depending on where patients are managed. Furthermore, one should be aware that there are complex social, cultural, and economic factors as well as regional management philosophies and preferences that are difficult to account for when grading scales are developed or applied to distinct populations.

It is also important to be cognizant of the specific outcome measure(s) for which a grading scale has been developed. The outcome measure(s) define the intended purpose of the scale, and understanding them therefore clarifies its appropriate application. Although optimal outcome measures for ICH remain incompletely defined, the most commonly selected are 30-day mortality [7, 9] and 30-day good functional outcome [8, 18, 20, 29], often defined as either GOS ≥ 4 [8, 18] or mRS ≤ 2 [29]. Similarly, understanding the intended timing of assessment for each grading scale is essential. While some grading scales have been developed for assessment at initial presentation in the emergency department [7], others have been developed for potential use throughout the clinical course [29]. Standardization of variables such as inclusion and exclusion criteria, scale assessment timing, hematoma measurement method, outcome measures, outcome assessment methods (e.g., GOS vs. BI vs. mRS), and outcome assessment timing in ICH clinical studies will greatly facilitate validation and comparison of grading scales.

Withdrawal of care has a significant impact on outcome and grading scale performance, and therefore must be taken into account when evaluating the scales [36, 45]. Care limitation and withdrawal are common after ICH and have been shown to independently predict poor outcome [35, 46]. Care is usually withdrawn from patients who are expected to have a poor outcome, which leads to self-fulfilling prophecies about the clinical and radiographic factors used for prognostication [46]. Likewise, the predictive ability of grading scales may become inflated in a clinical setting with high rates of care withdrawal. Despite its importance, the impact of care limitation and withdrawal has been inadequately addressed or inconsistently accounted for in grading scale development and validation [36, 47]. Currently, the ICH [47] and FUNC [36] scores are the two scales that have demonstrated robustness against early care limitation or withdrawal of care.

It must be stressed that the lack of a standard and widely used grading scale in ICH is not an isolated issue. Despite continued advances in both clinical and experimental research, our understanding of ICH pathophysiology remains incomplete. There is also an urgent need to elucidate optimal outcome measures and assessment timing in ICH patients. The long-term clinical course and the pattern of functional and cognitive recovery need to be better defined [26]. Development of an ideal grading scale depends on ongoing progress in these areas of ICH research and, undoubtedly, vice versa.

Refinement of the existing grading scales and development of new scores based on novel predictors must continue. An ideal grading scale would retain the simplicity and predictive utility of the existing scales and also (1) account for the duration between ICH onset and initiation of treatment, (2) provide up-to-date prognostic information throughout a patient’s hospital course (e.g., based on treatment response or post-bleed day), (3) remain unaffected by care withdrawal, and (4) incorporate novel factors such as previous medications (e.g., statins [22]), biological and genetic markers, and advanced imaging technology. Such a scale must strike a balance between simplicity and predictability. More than one grading scale is likely to be necessary to devise a truly ideal prognostic system for ICH that guides treatment from onset to long-term follow-up. Development of such a scale(s) will significantly contribute to the establishment of a standard management protocol and identification of novel and effective therapies in ICH.

Conclusion

At present, there is no universally accepted and routinely used grading scale or prognostic model for ICH. In the past decade, a number of grading scales have been developed that are both easy to use and highly predictive of outcome. To varying degrees, many of these grading scales have been externally validated in geographically, culturally, and socioeconomically distinct populations. These grading scales have significantly increased our understanding of ICH and have established a pathway towards an ideal grading scale. Nevertheless, studies vary with regard to aspects of study design, including outcome measures, outcome assessment methods, and timing of clinical follow-up. Understanding the strengths, limitations, and intended purpose of grading scales can greatly facilitate their continued validation and comparison. It will also help identify critical issues that must be addressed by new modifications or grading scales. The authors are confident that the ever-growing insight into the pathophysiology, clinical course, and outcome predictors of ICH will eventually lead to the establishment of a standard management protocol and identification of novel and effective therapies that can benefit many patients worldwide.