Introduction

Trauma has emerged as the preeminent global cause of death and disability, accounting for an estimated 10 % of the world’s deaths. According to the World Health Organization (WHO), 5.8 million people die each year as a result of injuries, 32 % more than the deaths resulting from malaria, tuberculosis, and HIV/AIDS combined [1]. Millions more suffer the nonfatal consequences of injury, including lasting mental and physical disabilities. The global cost of road traffic crashes alone is estimated to be U.S. $518 billion; for some countries, this equates to 5 % of their gross national product [1]. Consequently, there exists an urgent need for dedicated research and health policy implementation to mitigate the disastrous societal and economic effects of trauma.

One successful strategy for improving trauma outcomes has been to improve the quality of care (QoC). Research from multiple areas in healthcare, including trauma, has demonstrated that quality improvement (QI) initiatives reduce morbidity, mortality, complications, and costs [2–7]. In a recent systematic review by the World Health Organization-International Association for Trauma Surgery and Intensive Care (WHO-IATSIC), Juillard et al. [8] concluded that hospital-based and system-wide trauma QI initiatives “have been consistently shown to improve the process of care, decrease mortality, and decrease costs.” While the majority of the publications reviewed were from high-income countries, experiential evidence from other areas of medicine, particularly obstetric care, strongly supports the feasibility and efficacy of QI programs in lower-middle income countries (LMICs) [9–13].

The cornerstone of any QI program is standardized collection of relevant healthcare information as databases [14, 15]. These databases typically enable measurement of the three Donabedian components of health quality: structure, process, and outcomes [16]. Analysis of these data helps to establish baselines, identify factors affecting QoC, monitor improvements temporally, and make interprovider comparisons. Injury-specific data recorded in trauma registries (but not in administrative hospital discharge datasets) are considered critical in improving QoC [17]. In the past two decades, numerous national trauma databases have been set up across Europe, North America, Israel, Japan, and Australia [18]. These databases have helped to identify and improve multiple areas of trauma care in their respective regions/countries [3, 19–23].

Recently, the American College of Surgeons Committee on Trauma (ACSCOT) suggested the development of a global trauma registry, the International Trauma Data Bank (ITDB), with contributions from trauma registries from across the world. First proposed by Raul Coimbra, MD, at the 2011 ACSCOT spring meeting (personal communication), the central idea of the ITDB is to establish a mechanism for global comparative assessments of quality of trauma care to identify potential areas for improvement and promote data-driven performance enhancement initiatives on a wider scale. However, a key problem remains the feasibility of data aggregation given the lack of standardized data collection practices across the world [24]. The objective of this study is to determine whether trauma data from across the world could be combined to explore the feasibility of performing international benchmarking using the proposed ITDB concept.

Methods

The goal of this study was to understand the opportunities and challenges associated with trying to compare trauma outcomes from different parts of the world, with dissimilar data collection practices. We compared mortality outcomes of two trauma centers (TCs) [one European high-income country (HIC) and one Asian lower-middle income country (LMIC)] with centers included in the United States/Canadian National Trauma Data Bank (NTDB). The European HIC center was an academic medical and TC (Lyon South Hospital) located in Lyon, France. This institution is one of two academic TCs in Lyon, serving a population of nearly 1.6 million. Injured patients are brought to the trauma center by prehospital physician providers. We included patients triaged to the center’s trauma resuscitation unit during 2002–2004.

The Asian LMIC center (Aga Khan University Hospital) is an academic medical and TC located in Karachi, Pakistan. This private, primary, and referral TC serves a population of 2.1 million. The hospital functions as part of a decentralized trauma system where patients receive no prehospital care. Over the past decade, as part of an institutional trauma quality improvement initiative, this center has hired dedicated trauma care providers, built new facilities, and implemented trauma care protocols based on adapted ACSCOT guidelines [13]. All patients meeting the institution’s trauma activation criteria during 2002–2010 were included [25, 26].

To explore the feasibility of international comparisons, these two centers were compared to one another and to data from North American centers in the NTDB. The NTDB is maintained by the American College of Surgeons and is the largest trauma database in the world, comprising annually submitted data on approximately 700,000 patients from more than 900 centers across the United States and Canada [27]. Since 2007, the quality of data in the NTDB has improved substantially with the institution of the National Trauma Data Standard (NTDS), which has standardized definitions, data collection, and reporting procedures [28]. Submission to the NTDB is voluntary; however, 97 % of level I and 75 % of level II TCs contribute data. Adult trauma patients (≥16 years of age) with blunt and/or penetrating injuries from all three datasets were included in the analysis. Patients who were dead on arrival were excluded. Hospitals in the NTDB missing >20 % data on covariates used to risk adjust were excluded [29]. Given the known association between hospital trauma volume and patient outcomes, the main analysis included NTDB hospitals with annual trauma volumes within 2 standard deviations of annual patient volumes at the HIC and LMIC, i.e., between 25 and 400 patients (Fig. 1) [30]. A sensitivity analysis, including all NTDB centers, also was performed.
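The hospital-level inclusion rules described above can be sketched in a few lines. This is only an illustration, not the authors' code: the `Hospital` record, its field names, and the way the missing-data share is computed are hypothetical, and treating the 25 and 400 volume bounds as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Hospital:
    facility_id: str
    annual_volume: float     # hypothetical field: mean eligible trauma patients per year
    missing_fraction: float  # hypothetical field: share of records missing risk-adjustment covariates


def select_hospitals(hospitals, min_volume=25, max_volume=400, max_missing=0.20):
    """Keep hospitals within the volume window whose covariate data are complete enough.

    Hospitals missing more than `max_missing` of covariate data are dropped;
    the volume bounds are treated as inclusive (an assumption).
    """
    return [
        h for h in hospitals
        if h.missing_fraction <= max_missing
        and min_volume <= h.annual_volume <= max_volume
    ]


if __name__ == "__main__":
    sample = [
        Hospital("A", 100, 0.05),  # kept
        Hospital("B", 500, 0.01),  # dropped: volume above the window
        Hospital("C", 50, 0.30),   # dropped: more than 20 % missing covariates
    ]
    print([h.facility_id for h in select_hospitals(sample)])
```

Under this sketch, the sensitivity analysis that includes all NTDB centers would correspond to widening the volume window while keeping the missing-data rule.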

Fig. 1 National Trauma Data Bank patient and trauma center selection

Multiple patient demographic and injury severity measures were recorded in each of the three datasets. To ensure robust performance assessment, we included patient-level covariates that were uniformly and consistently reported for all centers and considered to be the most important predictors of in-hospital mortality following injury [31]. These included age, gender, type and mechanism of injury, presence of hypotension on arrival, total Glasgow Coma Scale (GCS), and Injury Severity Score (ISS). Because the ISSs reported in the different registries were derived using various versions of the Abbreviated Injury Scale (AIS), data from the three sources were categorized and standardized as described in Fig. 2 before their synthesis into an aggregate dataset. Age was categorized into 10-year groups (16–25, 26–35, 36–45, 46–55, 56–65, 66–75, 76–85, and >85 years). Type and mechanism of injury were determined using the International Classification of Diseases, 9th edition, Matrix of External-cause-of-injury codes (E-codes), where possible. Type of injury was classified as blunt or penetrating, and mechanism of injury as stab, fall, gunshot wound, motor vehicle collision, pedestrian, struck-by/or against, or other mechanism. Hypotension at admission was defined as systolic blood pressure <90 mmHg on arrival and categorized as a binary variable (yes/no). Total GCS was categorized as 3–5, 6–8, 9–11, and 12–15. ISS was categorized as 0–8, 9–15, 16–24, and 25–75.
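The categorization rules in the preceding paragraph can be written out as a short sketch. The function names are hypothetical, ages and scores are assumed to be whole numbers, and the category labels are plain strings chosen for illustration; only the cut points come from the text.

```python
def age_group(age_years):
    """Map age in whole years to the age bands listed in the text (patients aged 16 or older)."""
    if age_years < 16:
        raise ValueError("analysis was restricted to patients aged 16 years or older")
    for upper in (25, 35, 45, 55, 65, 75, 85):
        if age_years <= upper:
            return f"{upper - 9}-{upper}"
    return ">85"


def gcs_group(total_gcs):
    """Map total Glasgow Coma Scale (3 to 15) to the four bands in the text."""
    if not 3 <= total_gcs <= 15:
        raise ValueError("total GCS must be between 3 and 15")
    if total_gcs <= 5:
        return "3-5"
    if total_gcs <= 8:
        return "6-8"
    if total_gcs <= 11:
        return "9-11"
    return "12-15"


def iss_group(iss):
    """Map Injury Severity Score (0 to 75) to the four bands in the text."""
    if not 0 <= iss <= 75:
        raise ValueError("ISS must be between 0 and 75")
    if iss <= 8:
        return "0-8"
    if iss <= 15:
        return "9-15"
    if iss <= 24:
        return "16-24"
    return "25-75"


def is_hypotensive(systolic_bp_mmhg):
    """Hypotension on arrival: systolic blood pressure below 90 mmHg."""
    return systolic_bp_mmhg < 90


if __name__ == "__main__":
    print(age_group(42), gcs_group(7), iss_group(29), is_hypotensive(85))
```

Applying the same functions to every registry before pooling is what would make the covariates comparable across sources; differences between AIS versions are a separate issue that this sketch does not address.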

Fig. 2 The recording of important patient-level information in each data set and its standardized aggregation as the International Trauma Data Bank (ITDB). *International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) External cause-of-injury codes. **Abbreviated Injury Scale (AIS). Facilities reporting data to the National Trauma Data Bank used any one of the following AIS standards (AIS-80 through AIS-05 or AIS-MAP)

Baseline demographic and injury severity characteristics of patients admitted to the HIC and LMIC TC were each compared to NTDB patients using univariate statistics. To profile centers on mortality outcomes, we adapted the American College of Surgeons Trauma Quality Improvement Program (ACS-TQIP) methodology and ranked hospitals on risk-adjusted observed-to-expected (O/E) mortality ratios [32]. A standard multivariable logistic regression analysis was performed adjusting for age, gender, type of injury (blunt versus penetrating), presence of hypotension at admission, total GCS at admission, ISS, year of admission, and annual hospital volume. These covariates were chosen because (1) they were consistently reported across the three datasets and (2) they included the basic set of covariates deemed necessary when risk-adjusting for trauma mortality [31, 33]. Model discrimination and calibration were assessed using the area under the receiver operating characteristics curve (AUROC) and calibration curves, respectively. Clustering by facility identifier was performed to account for correlated patient outcomes within individual hospitals. Subsequently, individual patient probabilities of mortality were estimated and summed to calculate the “expected” number of deaths at each center. The “observed” or actual number of deaths at a center was then divided by the expected number of deaths to calculate the O/E ratio along with its 95 % confidence interval. These O/E ratios (95 % CI) were plotted as a “caterpillar” graph and were used to classify individual hospitals as high performing (upper bound 95 % CI <1), average performing (95 % CI overlapping 1), or low performing (lower bound 95 % CI >1). Subset analyses for blunt and penetrating injury also were performed.
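The O/E calculation and the three-way classification described above can be illustrated with a minimal sketch. The text does not state how the 95 % confidence interval was derived, so the interval below uses a log-scale Poisson approximation purely as an assumption. The function names are hypothetical, the predicted probabilities are taken as given from an already fitted model, and clustering by facility is ignored.

```python
import math


def oe_ratio(observed_deaths, predicted_probs, z=1.96):
    """Return (O/E, lower, upper) for one center.

    Expected deaths are the sum of the model-predicted probabilities of death.
    ASSUMPTION: the interval treats observed deaths as Poisson and works on the
    log scale; the source does not specify its interval method.
    """
    expected = math.fsum(predicted_probs)
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    if observed_deaths <= 0:
        raise ValueError("this approximation is undefined when no deaths were observed")
    ratio = observed_deaths / expected
    se_log = 1.0 / math.sqrt(observed_deaths)
    return ratio, ratio * math.exp(-z * se_log), ratio * math.exp(z * se_log)


def classify(lower, upper):
    """Apply the performance rule from the text to a 95 % CI."""
    if upper < 1:
        return "high performing"
    if lower > 1:
        return "low performing"
    return "average performing"


if __name__ == "__main__":
    # Toy center: 100 patients, each with a 10 % predicted risk, and 10 observed deaths.
    ratio, lower, upper = oe_ratio(10, [0.1] * 100)
    print(round(ratio, 2), classify(lower, upper))
```

Applied to the intervals reported in the Results, `classify(0.92, 1.35)` returns "average performing" for the HIC center and `classify(1.23, 1.88)` returns "low performing" for the LMIC center, which matches the classification rule as stated.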

All three datasets contained deidentified patient information, and therefore this study was exempted from institutional review board approval. All analyses were performed using Stata12/MP (StataCorp, College Station, TX).

Results

From approximately 4.4 million patients available in the NTDB 2002–2010, a total of 375,433 patients from 301 centers were included in the main analysis (Fig. 1). The LMIC TC contributed 806 patients (2002–2010), whereas the HIC TC reported 1,003 patients (2002–2004). Figure 2 describes the recording of important patient-level information in each data set and its standardized aggregation as the ITDB. Most covariates were similarly reported. However, AIS scores (and subsequent ISS calculation) were derived using different AIS versions. Most NTDB centers reporting data for years 2007 onwards and the HIC center used AIS-98, compared with AIS-90 used by the LMIC center.

Table 1 compares the hospital-level characteristics of the European HIC center, the Asian LMIC center, and the NTDB centers. Nearly half of the NTDB centers (48 %) were level 1 centers. Both the HIC (>1,000 beds) and the LMIC center (542 beds) provided definitive patient care and hence were analogously classified as level 1 centers. The numbers of trauma, orthopedic, and neurosurgeons at these two centers were comparable to those at the NTDB centers, except that the LMIC center had only one core trauma attending.

Table 1 Characteristics of hospitals included in the International Trauma Data Bank

Table 2 compares baseline patient demographic, type of injury, and crude mortality rates of patients at NTDB centers versus those at HIC and LMIC centers. Both the HIC and LMIC centers had significantly lower proportions of elderly patients (>65 years) than the NTDB (29.4 % at NTDB centers vs. 11.7 % at HIC and 3.8 % at LMIC centers, p < 0.001 for both comparisons). Both non-NTDB centers had a significantly greater burden of penetrating injury (14.5 % for HIC and 36.8 % for LMIC vs. 9.0 % for NTDB, p < 0.001 for both comparisons) and higher crude mortality rates (16.3 and 4.8 % at HIC and LMIC respectively vs. 3.3 % at NTDB, p < 0.001 for both comparisons).

Table 2 Baseline demographics, type of injury, and crude mortality rate of patients treated at the high-income country (HIC) and lower-middle income country (LMIC) trauma centers compared with National Trauma Data Bank (NTDB) patients

Table 3 describes the mechanism of injury and injury severity characteristics. Motor vehicle collisions were the most frequent mechanism of injury in HIC (43.5 %) and LMIC (58.8 %) compared with falls (43.1 %) in the NTDB. Non-NTDB patients had more severe physiologic (23.4 and 5.1 % patients hypotensive on arrival at HIC and LMIC, respectively, vs. 2.9 % in the NTDB, p < 0.001 for both comparisons) and anatomic derangements (38.2 and 31.9 % patients with ISS ≥25 at HIC and LMIC, respectively, vs. 6.4 % at NTDB, p < 0.001 for both comparisons). The majority of NTDB and LMIC patients had GCS score above 12 (88.4 and 94.5 %, respectively) compared with only 11.2 % of the patients at the HIC center.

Table 3 Mechanism of injury and injury severity characteristics of patients treated at the high-income country (HIC) and lower-middle income country (LMIC) trauma centers compared with National Trauma Data Bank (NTDB) patients

Figure 3 shows the position of the HIC and LMIC centers’ risk-adjusted, O/E-based mortality performance on a caterpillar plot relative to NTDB centers. The HIC center’s performance was statistically no different from that of the average performing NTDB centers [O/E = 1.11 (95 % CI 0.92–1.35)]. However, the LMIC TC showed significantly worse survival [O/E = 1.52 (1.23–1.88)]. The multivariable logistic model used to benchmark hospitals demonstrated excellent discrimination between survivors and nonsurvivors (AUROC >0.90) and adequate model fit (as assessed using calibration curves). A sensitivity analysis comparing the LMIC and HIC centers to all centers in the NTDB did not significantly alter the results. Subset analyses stratified by injury type (blunt/penetrating) revealed a similar pattern; the LMIC center demonstrated significantly higher O/E [blunt 1.55 (1.25–1.92), penetrating 1.63 (1.07–2.50)] compared with the HIC center [blunt 1.18 (1.00–1.41), penetrating 0.70 (0.44–1.12)] (Fig. 4).

Fig. 3 Observed/expected (O/E) mortality ratios (95 % CI) for a hospital in a high-income and a lower-middle income country compared to trauma centers included in the NTDB; adjusted for age, gender, type of injury, presence of hypotension (systolic blood pressure <90 mmHg), Glasgow Coma Scale, Injury Severity Score, year of admission, and hospital volume. Black line at one indicates that the hospital is performing as expected given its patient case-mix

Fig. 4 Subset caterpillar plots by injury type; observed/expected (O/E) mortality ratios (95 % CI) for a hospital in a high-income and a lower-middle income country compared with trauma centers included in the NTDB; adjusted for age, gender, presence of hypotension (systolic blood pressure <90 mmHg), Glasgow Coma Scale, Injury Severity Score, year of admission, and hospital volume. Black line at one indicates that the hospital is performing as expected given its patient case-mix. Both caterpillar plots truncated at O/E = 7 for clarity. a Blunt injury. b Penetrating injury

Discussion

Using trauma data from three different continents, this study establishes a proof-of-concept that global benchmarking of trauma center performance is feasible using aggregated data from countries across the globe. The most important covariates predicting postinjury outcomes were found to be recorded adequately and reported in all three datasets. As few as seven variables can be used to reliably predict in-hospital mortality with excellent discriminative ability. Hospitals from both a European HIC and an Asian LMIC were successfully benchmarked against NTDB TCs using the well-accepted, observed-to-expected (O/E) mortality ratios. This work shows that comparing outcomes using global trauma data is feasible. Therefore, we strongly support creation of the ITDB as a pivotal step towards improving global trauma outcomes.

Trauma registries help to improve patient outcomes and are considered an integral component of regional, national, and local trauma QI initiatives [3, 13, 19–23]. Initially, registries were simply in-patient administrative hospital records of trauma patients [17]. With the growing understanding of the impact of outcomes data on trauma care, patient safety, and performance improvement processes, these gradually evolved into regional/national repositories and increasingly included trauma-specific clinical information. Concurrently, complex injury severity assessment systems and risk-adjustment methodologies were developed to predict postinjury outcomes accurately [33–49]. Currently, using these large datasets and robust statistical methodologies, observational trauma studies help to guide physicians, researchers, and policy makers to improve quality of trauma care [3, 13, 19–23].

One key barrier to establishing large trauma data repositories is the lack of standardized data. While uniform reporting procedures can be developed, implemented, and enforced locally, international standardization is difficult to achieve given the inherent differences in national health policies and medical practices. Although trauma systems within single countries have successfully established standardized reporting practices, similar endeavors at the international level have yet to occur. A recent study by the European Trauma Audit Research Network (EuroTARN) found that trauma registries across Europe differed sufficiently to rule out meaningful outcomes comparisons [50]. Similar findings were reported by a group exploring the possibility of a Scandinavian Major Trauma Outcome Study [51]. To mitigate these concerns, a consensus panel of European experts has proposed a uniform data reporting standard, the Utstein Trauma Template (UTT), containing 36 core variables [52–54]. Since introduction of the NTDS, the U.S./Canadian-based NTDB already contains the majority of these core elements.

We successfully demonstrate that trauma data from three different global regions can be aggregated to adequately perform mortality-based external benchmarking using only a few critical variables. These findings are similar to those reported by Nathens et al., who evaluated the patient and injury factors that most affected case-mix across NTDB TCs. They concluded that few variables are needed to risk-adjust adequately for mortality outcomes, obviating the need for extensive data collection [55]. This finding is important given the enormous costs associated with the implementation and maintenance of trauma registries. While standardized trauma data reporting initiatives are crucial and work well in HICs, they may not be feasible in resource-depleted LMICs [24]. Therefore, a small set of important predictors of trauma mortality could be considered for the proposed ITDB to perform international benchmarking.

Most variables used to perform risk-adjustment in the present study were recorded uniformly. However, individual hospitals differed in the AIS versions they used. Several studies have identified important differences between AIS versions and have advised against the use of ISS derived from these varied sources when comparing outcomes [56–58]. These differences can, at least partially, be resolved by using mapping software to standardize the reported ISSs [59]. However, a more pressing challenge remains in the development of a globally accessible injury severity assessment system. Most injury scoring systems are resource intensive and are difficult to implement and maintain in LMICs. While simple, low-cost alternatives, such as the Kampala Trauma Score, have been specifically developed for LMICs, using different injury scoring systems may undermine the standardization of a global trauma repository [60]. Benchmarking HICs and LMICs separately would again segregate and regionalize global trauma initiatives rather than bring all regions onto a level playing field to compare outcomes appropriately. Additionally, this geographic segregation would not account for the existence of highly variable healthcare settings, resources, and access within each region.

Rather than pursuing comparisons that are global geographically, a more reasonable approach would be to compare similarly resourced centers with one another, because not all HIC hospitals are abundantly resourced and not all LMIC hospitals are ill-resourced. Using this system, low-resourced centers could use simpler injury scoring systems while higher-resourced centers could use more elaborate systems. This would create resource-based global benchmarking tiers, perhaps similar to how trauma centers in the United States are designated, which may offer greater intra-tier homogeneity of trauma data and enable more appropriate comparisons without regionalizing trauma quality improvements. However, without specific data on hospital resource profile, we restricted our comparisons to similar-volume NTDB centers. This volume-based comparison alone enabled us to identify NTDB centers that performed worse than the LMIC center, raising red flags regarding efficiency of resource utilization.

We compared mortality-based hospital performance using the validated techniques currently used by ACS-TQIP [32]. The ACS-TQIP has been modeled to replicate the methodology and success achieved by the ACS National Surgical Quality Improvement Program (NSQIP), a program that has helped reduce morbidity and mortality rates after major surgery across U.S. hospitals [5]. We specifically chose this methodology, because it remains the most well-recognized and widely cited comparative assessment of TC performance. However, other regional/national performance evaluation systems should be considered in the future to determine the optimal methodology to ensure the global trauma quality improvement initiative remains objective, evidence-based, and data-driven.

Our study has several limitations. First, the non-NTDB centers were chosen based on convenience and we included only one center each from an HIC and an LMIC, which may potentially be a source of bias, because they are not necessarily representative of their respective country’s injury profile. However, the goal of the analysis was not to assess performance of TCs across the world but to explore the challenges associated with the future conglomeration of trauma data. Using limited data on a few important patient variables, we demonstrated a proof-of-concept in support of global trauma benchmarking. Second, only mortality was used as a quality endpoint. While several studies recommend other important quality metrics to corroborate mortality-based performance assessments, such as complication rates or failure-to-rescue, these are not uniformly reported even in well-established trauma datasets [61–63]. Hence, we restricted our evaluation to the most commonly used outcome measure: in-hospital mortality.

In conclusion, this study demonstrates the feasibility of aggregating predictors of trauma mortality from existing trauma registries from around the world to undertake comparative performance assessments. This study highlights key areas for future exploration, such as global injury severity scoring systems and resource-based benchmarking. These findings may have important implications as we enter the era of evidence-based global trauma care.