Introduction

Traumatic brain injury (TBI) remains a critical public health concern, with annual incidence rates reaching up to 700 per 100,000 individuals in Europe [3]. The clinical assessment of TBI severity, pivotal for guiding treatment strategies, relies on the Glasgow Coma Scale (GCS), which categorizes the condition as mild, moderate, or severe [10, 22]. This categorization correlates with long-term clinical outcomes [17], which underlines its clinical relevance. Nonetheless, the GCS is of limited use in forecasting individual prognoses, that is, in distinguishing between favorable and unfavorable outcomes, which indicates a gap in current assessment methods.

The COVID-19 pandemic underscored the critical role of effective triaging and outcome prediction in managing limited intensive care resources [15, 24], a scenario that resonates with TBI management challenges. While GCS remains the cornerstone for TBI assessment, its limitations become evident when considering long-term outcome predictions. Such prognostic insights are crucial for informed decision-making, particularly in avoiding situations where patients with limited recovery prospects are subjected to interventions that may not substantially improve their quality of life. The GCS, primarily a preclinical tool, is restricted to initial clinical observations and does not incorporate radiological data, patient history, or other potential biomarkers. In contrast, the prognosis of hypertensive intracerebral hemorrhage (ICH) has been enhanced with the ICH-score, which integrates age, GCS, and initial radiological findings to predict 30-day mortality with notable specificity and sensitivity [11, 13, 16]. This advancement highlights the potential benefits of incorporating a broader range of clinical data in prognostic models.

Recent decades have witnessed the development of various computed tomography (CT) scoring systems aimed at addressing the limitations of GCS in TBI prognosis [23]. Among these, the Helsinki CT scoring system (HS) has demonstrated notable accuracy in predicting the clinical outcomes of TBI patients [19, 23], suggesting its potential as a more comprehensive assessment tool.

This study aims to assess the predictive capabilities of the HS in comparison to the traditional GCS, within a large TBI patient cohort at a level 1 trauma center in Germany. The goal is to ascertain whether the HS can provide a more reliable basis for clinical decision-making and resource allocation in TBI management, potentially enhancing patient outcomes and optimizing healthcare resource utilization.

Materials and methods

Participants

In this retrospective cohort study with an open-cohort design, we selected a series of consecutive patients presenting with polytrauma accompanied by traumatic brain injury (TBI) of varying severity. Specifically, our inclusion criteria encompassed patients with moderate to severe TBI, indicated by an admission Glasgow Coma Scale (GCS) score ranging from 3 to 12, as well as individuals with complicated mild TBI, denoted by an admission GCS score between 13 and 14, yet necessitating intensive care unit (ICU) admission. This study focused on patients admitted to the Level 1 Trauma Center at the University Medical Center Göttingen over an extensive eleven-year timeframe, from May 2008 to May 2019.

For the purposes of this study, TBI was precisely defined according to the International Statistical Classification of Diseases and Related Health Problems, 10th Edition (ICD-10) codes S06.1 to S06.9, with the prerequisite that the injury was the result of an external force. Excluded from the study were individuals under the age of 18, patients who were deceased upon arrival, those who passed away prior to undergoing CT imaging or ICU admission, and cases where the TBI occurred more than 24 h prior to admission.
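The inclusion and exclusion criteria described above can be expressed as a simple eligibility check. The following is a minimal sketch only; the function name and its parameters are hypothetical and are not taken from the study's data pipeline, and the ICD-10 coding requirement (S06.1 to S06.9) is assumed to have been verified beforehand.

```python
def is_eligible(age_years, admission_gcs, icu_admission,
                hours_since_injury, died_before_ct_or_icu=False):
    """Return True if a patient meets the criteria as described in the text.

    All parameter names are illustrative assumptions. An ICD-10 diagnosis in
    the range S06.1 to S06.9 is assumed to have been confirmed separately.
    """
    if age_years < 18:               # adults only
        return False
    if died_before_ct_or_icu:        # dead on arrival, or before CT/ICU admission
        return False
    if hours_since_injury > 24:      # injury more than 24 h before admission
        return False
    if 3 <= admission_gcs <= 12:     # moderate to severe TBI
        return True
    if 13 <= admission_gcs <= 14:    # complicated mild TBI requires ICU admission
        return bool(icu_admission)
    return False


# Hypothetical example patients (not study data)
print(is_eligible(45, 8, icu_admission=True, hours_since_injury=2))    # True
print(is_eligible(45, 14, icu_admission=False, hours_since_injury=2))  # False
```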

The comprehensive data regarding the admission details and subsequent treatment of TBI patients were gathered from the digital patient records maintained by the hospital. This study was conducted with the approval of the local ethics committee of the University Medical Center Göttingen, under the registration number 18/3/20, ensuring adherence to ethical standards and patient confidentiality.

The Helsinki computed tomography scoring system

CT was performed immediately upon arrival in the emergency room for all patients. The HS is calculated from four main variables: the mass lesion type, its volume, the presence of intraventricular hemorrhage, and the condition of the suprasellar cisterns, as shown in Table 1 [19]. Regarding mass lesion type, a subdural hematoma (SDH) is scored 2, an intracerebral hematoma is scored 2, and an epidural hematoma is scored −3. A hematoma volume greater than 25 cm³ is scored 2. Three points are added in the presence of intraventricular hemorrhage (IVH). A compressed or obliterated suprasellar cistern receives a score of 1 or 5, respectively. The total score therefore ranges from −3 to 14, and a higher score correlates with a worse outcome. Figure 1 illustrates CT findings of patients with TBI and their respective HS calculations.

Table 1 The Helsinki computed tomography scoring system (HS) (from Raj et al., 2014). The score was recorded based on the CT scan performed in the emergency department
Fig. 1

The Helsinki computed tomography scoring system (HS). Top left: a small (< 25 cm³) right-sided frontal intracerebral hemorrhage. Helsinki CT score: 2 of 14 (+ 2 intracerebral hemorrhage). Top right: a large right-sided intracerebral hemorrhage with intraventricular hemorrhage and a thin right-sided subdural hematoma. Helsinki CT score: 9 of 14 (+ 2 subdural hematoma, + 2 intracerebral hemorrhage, + 2 large mass lesion, + 3 intraventricular hemorrhage). Bottom: White arrow shows normal suprasellar cisterns (left, 0 points), compressed suprasellar cisterns (middle, 1 point), and obliterated suprasellar cisterns (right, 5 points). From Raj et al., 2014
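The scoring rules described in this section can be written as a short function. This is a minimal sketch based only on the point values stated in the text; the function name, argument names, and the example volumes are illustrative assumptions, and lesion types are assumed to be additive, as in the worked examples of Fig. 1.

```python
def helsinki_ct_score(subdural=False, intracerebral=False, epidural=False,
                      volume_cm3=0.0, ivh=False, cisterns="normal"):
    """Sum the Helsinki CT score components (range -3 to 14).

    Argument names are hypothetical; point values follow the text above.
    """
    cistern_points = {"normal": 0, "compressed": 1, "obliterated": 5}
    if cisterns not in cistern_points:
        raise ValueError("cisterns must be normal, compressed or obliterated")

    score = 0
    if subdural:
        score += 2            # subdural hematoma
    if intracerebral:
        score += 2            # intracerebral hematoma
    if epidural:
        score -= 3            # epidural hematoma lowers the score
    if volume_cm3 > 25:
        score += 2            # large mass lesion (> 25 cm^3)
    if ivh:
        score += 3            # intraventricular hemorrhage
    score += cistern_points[cisterns]
    return score


# Worked examples from Fig. 1 (volumes are placeholder values)
print(helsinki_ct_score(intracerebral=True, volume_cm3=10))        # 2
print(helsinki_ct_score(subdural=True, intracerebral=True,
                        volume_cm3=40, ivh=True))                  # 9
```

Encoding the cistern status as a lookup table keeps the three categories explicit and rejects unexpected values instead of silently scoring them as zero.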

TBI treatment

The management of all patients in this study adhered to the best practice standards as outlined in the globally accepted Brain Trauma Foundation guidelines [2]. Surgical interventions, when necessary, were tailored to the specific needs of the patients: individuals presenting with symptomatic subdural or epidural hematomas underwent craniotomy, while those with severe traumatic injuries accompanied by sustained intracranial hypertension were treated with hemicraniectomy. For comatose patients, the protocol included routine invasive monitoring of intracranial pressure to guide therapeutic decisions effectively. In cases where intracranial hypertension persisted despite initial interventions, an external ventricular drain was strategically placed to manage the condition. The transition of patients from the intensive care unit (ICU) to a general ward or specialized rehabilitation facilities was determined based on their response to the acute phase of treatment and overall recovery trajectory.

Outcome measures

Treatment effectiveness and patient progress were assessed using the Glasgow Outcome Scale (GOS) and mortality at discharge and at 6 and 12 months post-treatment [12]. An independent physician (BS) retrospectively assessed these outcome measures based on patients’ charts and/or telephone interviews with the patients or their next of kin. Neurological outcomes were classified into two categories: ‘unfavorable’, encompassing death, vegetative state, or severe disability (GOS scores 1–3), and ‘favorable’, comprising moderate disability or good recovery (GOS scores 4–5), providing a standardized metric for assessing patient trajectories after TBI treatment.
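The dichotomization of the GOS described above amounts to a single threshold. A minimal sketch, with a hypothetical function name:

```python
def dichotomize_gos(gos):
    """Map a GOS score (1-5) to 'unfavorable' (1-3) or 'favorable' (4-5)."""
    if gos not in (1, 2, 3, 4, 5):
        raise ValueError("GOS must be an integer from 1 to 5")
    return "unfavorable" if gos <= 3 else "favorable"


# Hypothetical scores, for illustration only
print([dichotomize_gos(g) for g in (1, 3, 4, 5)])
```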

Statistical analysis

The data collected in this study were analyzed utilizing SPSS Statistics version 26 (IBM SPSS Statistics, Chicago, IL) and Microsoft Excel 2013 (Microsoft Inc, Seattle, Washington, USA). The central tendency and dispersion of scores were presented as median values accompanied by the interquartile range (IQR). To evaluate the predictive accuracy of HS and GCS regarding clinical outcomes and mortality rates, Receiver Operating Characteristic (ROC) curves were employed. These curves facilitated a comprehensive analysis of the specificity and sensitivity of the HS and GCS scores. Furthermore, to ascertain the strength and direction of the association between these predictive scores and the outcome measures under investigation, Kendall tau-b correlation analyses were conducted. This non-parametric test offered insight into the relationships without assuming normality of the data. The statistical significance threshold was set at a P-value of less than 0.05.
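For readers who wish to reproduce this type of analysis outside SPSS, the two statistics can be sketched in plain Python. The code below is an illustration under stated assumptions, not the software used in this study: function names are hypothetical, the data are toy values, and no p-values or standard errors are computed. Kendall tau-b is obtained by counting concordant and discordant pairs with a tie correction, and the AUC is computed as the probability that a patient with the event has a higher score than a patient without it (ties counted as one half).

```python
import math


def kendall_tau_b(x, y):
    """Kendall tau-b from pair counts, with correction for ties."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue                 # tied on both variables: ignored
            elif dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    informative = concordant + discordant
    denom = math.sqrt((informative + tied_x_only) * (informative + tied_y_only))
    return (concordant - discordant) / denom


def roc_auc(scores, events):
    """AUC as the share of (event, non-event) pairs ranked correctly."""
    pos = [s for s, e in zip(scores, events) if e == 1]
    neg = [s for s, e in zip(scores, events) if e == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not study data): score vs. unfavorable outcome
scores = [0, 3, 3, 5]
unfavorable = [0, 0, 1, 1]
print(kendall_tau_b(scores, unfavorable))
print(roc_auc(scores, unfavorable))      # 0.875
```

In this sketch a higher score is assumed to indicate a worse prognosis; for a scale such as the GCS, where lower values are worse, the scores would need to be negated before computing the AUC.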

An analysis of the Minimal Clinically Important Difference (MCID) for GOS was performed. The MCID for GOS was defined as an improvement of one point after discharge. A binary outcome based on whether the difference between GOS at 6 or 12 months and GOS at discharge meets or exceeds the MCID was created. Afterwards, a logistic regression using GCS and HS as predictors was performed and models were compared using Akaike Information Criterion (AIC) and classification accuracy. Lower AIC values indicate a better-fitting model.
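The MCID analysis can be illustrated with a small, self-contained sketch. It is not the SPSS procedure used in this study: the function names are hypothetical, the single-predictor logistic model is fitted with a plain Newton-Raphson loop, and the data are toy values chosen only to show that a more informative predictor yields a lower AIC, computed as AIC = 2k − 2 ln L with k = 2 parameters (intercept and slope).

```python
import math


def mcid_met(gos_discharge, gos_followup, mcid=1):
    """1 if GOS improved by at least `mcid` points after discharge, else 0."""
    return 1 if gos_followup - gos_discharge >= mcid else 0


def _prob(b0, b1, xi):
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))


def fit_logistic(x, y, iterations=25):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by Newton-Raphson; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = _prob(b0, b1, xi)
            w = p * (1.0 - p)
            g0 += yi - p                 # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                     # observed information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def log_likelihood(x, y, b0, b1):
    return sum(math.log(_prob(b0, b1, xi)) if yi == 1
               else math.log(1.0 - _prob(b0, b1, xi))
               for xi, yi in zip(x, y))


def aic(log_lik, n_params=2):
    return 2 * n_params - 2 * log_lik


# Toy data (hypothetical, not study data)
outcome = [0, 0, 0, 1, 0, 1, 1, 1]            # MCID met (1) or not (0)
informative = [1, 2, 3, 4, 5, 6, 7, 8]        # predictor related to outcome
uninformative = [1, 2, 2, 1, 1, 2, 2, 1]      # predictor unrelated to outcome

aic_informative = aic(log_likelihood(
    informative, outcome, *fit_logistic(informative, outcome)))
aic_uninformative = aic(log_likelihood(
    uninformative, outcome, *fit_logistic(uninformative, outcome)))
print(aic_informative < aic_uninformative)    # True
```

The Newton-Raphson loop has no step-size control and is adequate only for small, non-separable examples such as this one; an established statistics package would be preferable for real data.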

Results

Patient demographics and clinical data

This study encompassed a total of 544 participants. The cohort had an average age of 62.2 ± 21.5 years. The gender distribution showed a predominance of male patients, constituting 66.5% (362 individuals), whereas female patients made up 33.5% (182 individuals) of the study population. Initial assessment in the emergency department revealed a median GCS of 14 ± 12. The median HS at admission was 3 ± 3.

In-hospital mortality (death before discharge) was 8.6%. At discharge, the median GOS stood at 4 ± 1, indicating that the majority of patients had moderate disability or were on a trajectory towards good recovery.

Long-term follow-up was nearly complete, with data available for 99.6% (542 patients) and 99.1% (539 patients) of the cohort at 6 and 12 months post-injury, respectively. Mortality at 6 months post-injury increased to 17.5% (95 cases), and the median GOS remained at 4, albeit with wider variability (standard deviation of 2). By 12 months, mortality had risen slightly to 21% (113 cases), with the median GOS holding steady at 4, again with a standard deviation of 2, indicating ongoing recovery challenges among some patients.

Patient demographics, the nature of the injuries, and the treatments administered are detailed in Table 2.

Table 2 Patient demographics, TBI characteristics, treatment, and outcomes

Comparative analysis and longitudinal predictive power of GCS and HS in TBI outcomes

A Kendall tau-b statistical analysis was conducted to explore the correlation between the GCS and HS with GOS at various time points. At the point of discharge, the GCS exhibited a moderate correlation with the GOS, marked by a τb value of 0.32 and a highly significant p-value of 9.1e-16. The HS demonstrated a more substantial correlation, with a τb value of 0.40 and an even more significant p-value of 7.6e-26, indicating a stronger predictive relationship with patient outcomes (Fig. 2).

Fig. 2

Kendall tau-b analysis at discharge. HS correlated more strongly than GCS with mortality (top third), binary GOS (middle third) and GOS (lower third)

Extending the analysis to 6 months post-discharge, the GCS correlation with GOS slightly diminished to a τb value of 0.20 (p = 5.8e-7), whereas the HS maintained a robust correlation with a consistent τb value of 0.40 (p = 9.4e-27). At the 12-month mark, the GCS showed a τb value of 0.22 (p = 3.6e-8), while the HS correlation remained strong at a τb value of 0.41 (p = 1.8e-28). These findings underscore the HS’s superior and consistent correlation with GOS across all measured time points.

The predictive accuracy of GCS and HS for clinical outcomes according to GOS was assessed through ROC curve analysis. The GCS demonstrated its capacity as an independent predictor of adverse outcomes at discharge with an area under the curve (AUC) of 0.71 and a standard error of 0.023, achieving statistical significance (p < 0.0001). However, the HS outperformed the GCS in predictive capability at the time of discharge, evidenced by a superior AUC of 0.77 and a slightly lower standard error of 0.021 (p < 0.0001), establishing HS as a more effective tool for outcome prediction (Fig. 3).

Fig. 3

ROC analysis at discharge. For mortality prediction, the HS shows a significantly higher AUC (top right) than GCS (top left). For outcome prediction, the HS shows a significantly higher AUC (bottom right) than GCS (bottom left)

Longitudinal analysis extending to 6 months post-discharge showed the AUC for GCS at a modest 0.63 with a standard error of 0.025 (p < 0.0001), whereas the HS maintained its superior predictive performance with an AUC of 0.78 and a standard error of 0.02 (p < 0.0001). This pattern persisted at the 12-month evaluation, with the AUC for GCS slightly increasing to 0.64 and the HS maintaining a consistently high AUC of 0.78 (both with a standard error of 0.02 and p < 0.0001), further affirming the HS’s efficacy in long-term outcome prediction for TBI patients.

Mortality prediction in TBI: assessing the comparative efficacy of GCS and HS over time

In assessing the correlation with mortality, a Kendall tau-b analysis revealed that the GCS had a weak correlation with mortality at the time of discharge, indicated by a τb value of 0.11 and a p-value of 0.004. In contrast, the HS displayed a significantly stronger correlation, with a τb value of 0.25 and a highly significant p-value of 3.6e-11 (Fig. 2).

Further investigation into the correlation with mortality at 6 and 12 months post-discharge showed a continued pattern. The GCS’s correlation remained relatively low at 6 months (τb = 0.09, p = 0.028) and 12 months (τb = 0.09, p = 0.021). The HS, however, presented a significantly higher correlation at 6 months (τb = 0.27, p = 1.4e-13) and 12 months (τb = 0.29, p = 1.3e-14), reinforcing its efficacy in predicting mortality outcomes.

The capability of GCS and HS to predict mortality was evaluated through ROC curve analysis. At the point of discharge, the GCS demonstrated a moderate predictive accuracy with an AUC of 0.62 and a standard error of 0.039 (p = 0.006). In contrast, the HS showcased superior predictive performance, evidenced by a significantly higher AUC of 0.79 and a lower standard error of 0.032 (p < 0.0001, Fig. 3).

Further analysis extending to 6 months post-discharge revealed that the GCS maintained a modest AUC of 0.57 with a standard error of 0.03 (p = 0.03), whereas the HS continued to outperform it with an AUC of 0.74 and a standard error of 0.03 (p < 0.0001). This pattern persisted into the 12-month evaluation, with the AUC for GCS remaining at 0.57 with a standard error of 0.03 (p = 0.026) and the HS demonstrating a consistently higher AUC of 0.73 (p < 0.0001), further affirming the robustness of the HS in predicting long-term mortality.

Analysis of minimal clinically important difference

The MCID was met or exceeded by 27% of the patients at 6 months and by 32% of the patients at 12 months. The comparison between GCS and HS as predictors of the MCID revealed lower AIC values for the HS: 407 (GCS) vs. 401 (HS) at 6 months, and 433 (GCS) vs. 421 (HS) at 12 months.

Discussion

The interplay between the radiological characteristics of TBI observed through CT scans and the subsequent clinical outcomes has been a subject of study since the early 1980s [8]. It was then acknowledged that TBI patients presenting with identical GCS scores but differing underlying brain lesions might experience varied clinical trajectories. Consequently, the classification of head injuries evolved to distinguish between focal versus nonfocal lesions, and later, to differentiate diffuse injuries from focal or mass lesions [1, 8, 14]. By 1991, a more nuanced classification emerged, incorporating a broader spectrum of radiological features [14], which was subsequently integrated into the International Mission for Prognosis and Analysis of Clinical Trials in TBI (IMPACT) prognostic model for moderate to severe TBI cases [20].

Building on this foundation, the HS was developed to refine the predictive accuracy of early CT scans post-TBI by incorporating a comprehensive set of radiological indicators [19]. Our current study examines the predictive capabilities of the HS within a substantial patient cohort treated for TBI at a level 1 trauma center in Germany. The findings underscore the HS’s superior predictive power over the traditional GCS, demonstrating a stronger correlation with both mortality and unfavorable clinical outcomes at every time point assessed: at discharge, at 6 months, and at one year post-injury. This study not only validates the HS as a tool in the prognostication of TBI outcomes but also highlights the evolving landscape of TBI assessment, where integrating detailed radiological assessments can markedly improve outcome predictions.

Validation of the Helsinki score

The HS has been subject to validation across diverse cohorts, demonstrating its robust psychometric properties. A study conducted in China involving 302 patients revealed the HS’s efficacy, showcasing a specificity of 74.6% and a sensitivity of 74.1% in predicting mortality at a threshold of 4.5. When assessing for unfavorable outcomes, the HS exhibited a specificity of 81.2% and a sensitivity of 56.8% [26]. Similarly, research within the Lebanese demographic confirmed the HS’s strong correlation with long-term quality of life and clinical outcomes [21]. Our study, conducted in a German trauma center, represents one of the most extensive validations of the HS, encompassing over 500 participants. This research aligns with the findings from the aforementioned studies and further ones [5, 9], confirming the HS’s applicability and relevance in the German context. This cross-cultural consistency underscores the HS’s global applicability.

Enhancing the usability of the Helsinki score

The results indicate that the HS was a better predictor than the GCS alone of whether the MCID for GOS was met, both at 6 months and at 12 months. This suggests that the advantage of the HS over the GCS is clinically relevant and not merely statistical.

The HS stands out for its practicality, especially within the high-pressure environment of emergency care. Its straightforward application allows for the rapid and reproducible assessment of patient outcomes based solely on radiological findings. This capability ensures that healthcare providers can quickly gauge the prognosis of TBI patients, facilitating timely and appropriate interventions.

Despite its strengths, the HS does not incorporate vital clinical observations or other biomarkers, which are often pivotal in shaping the emergency treatment strategies for TBI. The inclusion of such parameters could enrich the prognostic value of the HS, making it an even more indispensable tool in emergency settings. For instance, research by Posti et al. highlighted the enhanced predictive accuracy of the HS when combined with initial levels of biomarkers like Interleukin 10 and Amyloid β 1–40 [18]. Although these specific tests are not routinely conducted in our emergency department, they represent a promising avenue for developing more holistic scoring systems that integrate both radiological and biochemical markers.

Looking forward, the potential integration of such biomarkers into the HS or similar scoring systems could revolutionize the assessment process, offering a more comprehensive view of patient prognosis. This evolution would not only improve the accuracy of outcome predictions but also inform more nuanced and effective treatment plans for TBI patients.

Study limitations and areas of further research

Our study’s approach, contrasting the CT-based HS with the more conventional GCS, inherently has its limitations. The GCS, despite its more basic nature, remains a universally recognized and foundational tool in the initial assessment of TBI patients, offering an essential snapshot of their immediate condition. Its enduring relevance underscores its role in clinical outcome prognostication, despite the evolving landscape of diagnostic and predictive methodologies.

The integration of various clinical, radiological, and laboratory markers, as seen in prognostic models like CRASH and IMPACT [4, 6, 7, 25], highlights the complexity of TBI outcomes. This complexity suggests the potential for artificial intelligence (AI) to develop a more comprehensive predictive tool by incorporating a wide range of indicators. Future research should explore AI-driven models to enhance TBI outcome prediction, aiming for a tool that balances sophistication with clinical usability.

Conclusion

This study confirms the effectiveness of the HS within a substantial German cohort, illustrating its superiority over GCS in forecasting outcomes of TBI based solely on radiological assessments. Nevertheless, the HS’s limitation lies in its exclusion of clinical evaluations, an essential aspect of comprehensive TBI management. This finding emphasizes the need for an integrated prognostic framework that synergizes radiological and clinical data, ensuring a more thorough and precise prediction of TBI outcomes.