1 Introduction

Traumatic brain injury (TBI) remains a leading cause of death and disability among young people worldwide, and current methods for predicting long-term outcome remain limited. TBI initiates a cascade of events that can lead to secondary brain damage or exacerbate the primary injury; these events develop hours to days after the initial accident. Secondary brain damage is the focus of modern TBI management in Intensive Care Units. The imbalance between oxygen supply to the brain tissue and its utilization, i.e. brain tissue hypoxia, is considered the major cause of secondary brain damage and hence of poor neurological outcome. Monitoring brain tissue oxygenation after TBI using brain tissue \(O_{2}\) pressure (Pbt\(O_{2}\)) probes surgically inserted into the parenchyma may help clinicians to initiate adequate actions when episodes of brain ischemia/hypoxia are identified. The aggressive treatment of low Pbt\(O_{2}\) values (\(< 15\) mmHg for more than 30 min) was associated with better outcome compared to standard therapy in some cohort studies of severe head-injury patients [1]. However, another study was unable to find similar benefits to patient outcome [2]. We are in the process of starting a randomized controlled multi-center trial (23 centers, 400 patients) to assess the impact of such therapeutic strategies (standard vs Pbt\(O_{2}\)-based).

MRI is an excellent modality for estimating global and regional alterations in TBI and for following their longitudinal evolution [3]. To assess the complexity of TBI, several morphological sequences are required: FLAIR (Fluid Attenuated Inversion Recovery) and T2-weighted images for visualizing non-hemorrhagic and hemorrhagic lesions respectively, and a 3D T1-weighted image (such as MPRAGE) for assessing volume loss. Moreover, diffusion tensor imaging (DTI) offers the most sensitive modality for the detection of changes in the acute phase of TBI [4, 5] and increases the accuracy of long-term outcome prediction compared to the available clinical/radiographic prognostic scores [6]. Mean Diffusivity (MD) or Apparent Diffusion Coefficient (ADC) have been widely used to determine the volume of ischemic tissue and to assess intra- and extracellular conditions. A reduction of MD is related to cytotoxic edema (intracellular) while an increase of MD indicates vasogenic edema (extracellular). Changes of MD are expected with severe TBI. The volume of lesions on DTI shows a strong correlation with neurological outcome at patient discharge [6]. We consider the volume of vulnerable brain lesions after TBI to be a clinically relevant criterion, as previously suggested [7]. Consequently, we need an automatic segmentation method to assess tissue damage in severe trauma (Glasgow Coma Scale, GCS \(<9\)) in the acute phase, i.e. less than 10 days after the event.

Only a few studies have investigated alterations in TBI, mainly in moderate or mild TBI (Glasgow score \(>12\)) (see [8] for a review), and very few in severe TBI, either in the chronic stage, i.e. more than several months post-injury [9–13], or in the acute phase, i.e. less than 10 days post-injury [6, 14]. Clearly, currently proposed methods lack sufficient robustness to capture TBI-related changes without excessive user input [15]. Skull deformation, the presence of blood in the acute phase, the high variability of brain damage, which excludes the use of anatomical a priori information, and the diffuse aspect of brain injury, which potentially affects all brain structures, render TBI segmentation particularly demanding. To assess the diffuse aspect of the injury, the brain is first divided into ROIs using an atlas [6, 9, 16] or multiple atlases [17]. Then, a selection of the structures frequently implicated in TBI, such as thalamus, putamen, brainstem and occipital cortices, is considered [13, 17]. The methods proposed in the literature are mainly concerned with volumetric changes following TBI and scarcely report lesion load.

In this paper, we report on our methodological developments to assess lesion load over the entire brain in severe brain trauma. We use P-LOCUS [18] to provide brain tissue segmentation and exclude voxels labeled as CSF, ventricles and hemorrhagic lesion. We propose a fusion of several atlases to parcel cortical, subcortical and white matter (WM) structures into well identified regions where MD values can be expected to be homogeneous. Abnormal voxels are detected in these regions by comparing MD values with normative values computed from healthy volunteers. The preliminary results, evaluated in a single center, are a first step in defining a robust methodology intended for use in multi-center studies.

2 Materials and Methods

2.1 Patients

The patients (n = 5) had a GCS \(< 9\) with a diagnosis of severe trauma. The control group (n = 2) had no evidence of past or present brain trauma. The study was approved by the Institutional Review Board at the Hospital of Marseille and informed consent was obtained prior to participation directly from the participants (controls) or from next of kin (patients). Compared to the standard CT scan, MR imaging detects more brain lesions. For this reason, participation in this trial may offer benefits to each individual that largely outweigh the risks.

2.2 Data Acquisition

Images were acquired on a Siemens Verio 3 T whole-body scanner (CHU Marseille-Timone). The following morphological sequences were acquired: axial FLAIR (TR/TE/TI: 7840/96/2500 ms, 27 contiguous slices, 0.7\(\,\times \,\)0.7\(\,\times \,\)5 mm\(^{3}\)), T2 Susceptibility Weighted Imaging (TR/TE: 35/20 ms, 0.8\(\,\times \,\)0.8\(\,\times \,\)1.6 mm\(^{3}\)) and a 3D sagittal T1-weighted sequence (MPRAGE, TR/TE/TI: 2300/2.98/900 ms, 1\(\,\times \,\)1\(\,\times \,\)1 mm\(^{3}\)). In addition, DTI was acquired in an axial plane perpendicular to the main field B0. The DTI parameters used were: field of view of 300 mm, matrix size 96\(\,\times \,\)96, and slice thickness 2 mm (resulting in nearly isotropic voxels). Magnetic field gradients were applied in 63 directions with a value of 1000 mT/m.

2.3 Image Processing

Preprocessing. All MRI scans were reviewed to check for motion and other artifacts. T1-weighted and FLAIR images were processed using P-LOCUS, a Bayesian hidden Markov random field (HMRF) approach for tissue and lesion segmentation [18], and resampled at a resolution of 2\(\,\times \,\)2\(\,\times \,\)2 mm\(^{3}\). DTI images were first denoised [19] and preprocessed using the FSL software. The images were corrected for geometric distortions caused by eddy currents and for intensity inhomogeneity. The diffusion tensor was estimated, and the local diffusion parameter MD was calculated for the entire brain in each patient and control. MD was computed from the three estimated eigenvalues, which quantify water diffusion along three orthogonal directions. Brain extraction, coregistration and resampling were successfully performed using P-LOCUS, even in cases exhibiting large skull deformations.
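As a minimal sketch of the last computation, MD is the average of the three tensor eigenvalues, which equals one third of the tensor trace. The function name `mean_diffusivity` and the tensor values below are illustrative assumptions and are not part of the FSL pipeline used in the study.

```python
def mean_diffusivity(tensor):
    """Return MD = (l1 + l2 + l3) / 3 for a symmetric 3x3 diffusion tensor.

    The sum of the eigenvalues equals the trace of the tensor, so no
    eigen-decomposition is needed to obtain MD itself.
    """
    return (tensor[0][0] + tensor[1][1] + tensor[2][2]) / 3.0


# Hypothetical tensor in mm^2/s (values chosen for illustration only).
D = [
    [1.7e-3, 0.1e-3, 0.0],
    [0.1e-3, 0.4e-3, 0.0],
    [0.0, 0.0, 0.3e-3],
]
md = mean_diffusivity(D)
```

In practice the same quantity is obtained voxel by voxel from the fitted tensor map.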

Segmentation Model Specification. We consider a finite set V of N voxels on a regular 3D grid. We denote by \(\mathbf {y}=\{\mathbf {y}_1, \ldots , \mathbf {y}_N\}\) the intensity values observed at each voxel. Each \(\mathbf {y}_i= \{y_{i1}, \ldots , y_{iM}\} \) is itself a vector of \(M=2\) intensity values corresponding to the T1-weighted and FLAIR sequences. The segmentation task is to assign each voxel i to one of K classes given the observed data \(\mathbf {y}\). This assignment is considered latent data and is denoted by \(\mathbf {z}= \{\mathbf {z}_1, \ldots , \mathbf {z}_N\}\). The \(\mathbf {z}_i\)’s, which correspond to class memberships, take their values in \(\{e_1,\ldots , e_K\}\) where \(e_k\) is a K-dimensional binary vector whose \(k^{th}\) component is 1, all other components being 0. We denote by \(\mathcal {Z}= \{e_1,\ldots , e_K\}^N\) the set in which \(\mathbf {z}\) takes its values. We considered 5 classes, 4 for tissues: WM, grey matter (GM), and cerebrospinal fluid (CSF) divided into two classes (ventricular and extra-ventricular), plus an additional lesion class. The set of voxels V is associated with a neighborhood system. Spatial dependencies between voxels are modeled by assuming a Markov Random Field (MRF) prior. Denoting by \(\psi = \{\eta , \phi \}\) additional parameters, we assume that the joint distribution \(p(\mathbf {y}, \mathbf {z}; \psi )\) is an MRF with the following energy function:

$$\begin{aligned} H(\mathbf {y}, \mathbf {z}; \psi ) = H_{\mathbf {Z}}(\mathbf {z}; \eta ) - \sum _{i \in V} \log g(\mathbf {y}_i | \mathbf {z}_i ; \phi ), \end{aligned}$$
(1)

where the \(g(\mathbf {y}_i | \mathbf {z}_i; \phi )\)’s are probability density functions of \(\mathbf {y}_i\).

The energy decomposes into a data term and missing data term further specified below. For brain data, the data term \(\sum \limits _{i \in V} \log g(\mathbf {y}_i | \mathbf {z}_i ; \phi )\) in (1) corresponds to the modelling of tissue dependent intensity distributions. For our multi-dimensional observations, we consider M-dimensional Gaussian distributions with diagonal covariance matrices. For each class k, \((\mu _{k1}, \ldots , \mu _{kM})\) is the mean vector and \(\{s_{k1}, \ldots , s_{kM}\}\) the covariance matrix components. We will use the notation \(\mu _m = {}^t(\mu _{km}, k=1 \ldots K)\) and \(s_m= {}^t(s_{km}, k=1 \ldots K)\). When \(\mathbf {z}_i=e_k\) then \(\mathcal{G}(y_{im} ; {\mathbf{\langle } {\mathbf {z}_i},{\phi _m} \mathbf{\rangle }})\) and \(\mathcal{G}(y_{im} ; {\mathbf{\langle } {\mathbf {z}_i},{\mu _m} \mathbf{\rangle }}, {\mathbf{\langle } {\mathbf {z}_i},{s_m} \mathbf{\rangle }})\) both represent the Gaussian distribution with mean \(\mu _{km}\) and variance \(s_{km}\). The entire set of Gaussian parameters is denoted by \(\phi = \{\phi _{km}, k=1, \ldots K, m=1, \ldots , M\}\). Our data term is then defined by setting \(g(\mathbf {y}_i |\mathbf {z}_i; \phi ) \propto \prod \limits _{m=1}^M \mathcal{G}(y_{im} ; {\mathbf{\langle } {\mathbf {z}_i},{\phi _m} \mathbf{\rangle }}) \).
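The data term can be illustrated with a short sketch that evaluates \(\log g(\mathbf {y}_i | \mathbf {z}_i = e_k; \phi )\) as a sum of univariate Gaussian log-densities, one per sequence, which is what a diagonal covariance implies. The function names and all numerical parameters are hypothetical; they are not the values estimated by P-LOCUS.

```python
import math


def log_gaussian(y, mean, var):
    """Log-density of a univariate Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)


def log_data_term(y_i, k, mu, s):
    """log g(y_i | z_i = e_k): sum over the M sequences (diagonal covariance).

    mu[k][m] and s[k][m] are the mean and variance of class k in sequence m.
    """
    return sum(log_gaussian(y_i[m], mu[k][m], s[k][m]) for m in range(len(y_i)))


# Hypothetical parameters for K = 2 classes and M = 2 sequences (T1, FLAIR).
mu = [[100.0, 80.0], [60.0, 120.0]]
s = [[25.0, 36.0], [16.0, 49.0]]
y_i = [98.0, 83.0]
scores = [log_data_term(y_i, k, mu, s) for k in range(2)]
best_class = max(range(2), key=lambda k: scores[k])  # ignores the spatial prior
```

Taking the class with the highest score alone would amount to a voxel-wise classification; the MRF prior described next adds the spatial regularization.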

The missing data term \(H_{\mathbf {Z}}(\mathbf {z}; \eta )\) involving \(\mathbf {z}\) in (1) is set as follows. The dependencies between neighboring \(Z_i\)’s are modeled by further assuming that the joint distribution of \(\{Z_1, \ldots , Z_N\}\) is a discrete MRF on the voxel grid:

$$\begin{aligned} P(\mathbf {z}; \eta ) = W(\eta )^{-1} \, \exp \left( -H_{\mathbf {Z}}(\mathbf {z}; \eta )\right) \end{aligned}$$
(2)

where \(\eta \) is a set of parameters, \(W(\eta )\) is a normalizing constant and \(H_{\mathbf {Z}}\) is a function restricted to pair-wise interactions,

$$ H_{\mathbf {Z}}(\mathbf {z}; \mathbf {\eta }) = - \sum \limits _{i \in V} z_i^t \gamma - \sum \limits _{i,j \atop i \sim j} z_i^t \mathbb {B}z_j, $$

where we write \(z_i^t\) for the transpose of vector \(z_i\) and \( i \sim j \) when voxels i and j are neighbors. The set of parameters \( \mathbf {\eta }\) consists of two sets \( \mathbf {\eta }= (\gamma , \mathbb {B}) .\) Parameter \( \gamma \) is a \( K- \)dimensional vector which acts as weights for the different values of \(z_i\). When \( \gamma \) is zero, no tissue is favored, i.e. for a given voxel i , if no information on the neighboring voxels is available, then all tissues have the same probability. Then, \( \mathbb {B}\) is a \( K \times K \) matrix that encodes interactions between the different classes. If in addition to a null \(\gamma \), \( \mathbb {B}=b \times I_K \) where b is a real scalar and \( I_K \) is the \( K \times K \) identity matrix, parameters \( \mathbf {\eta }\) reduce to a single scalar interaction parameter b and we get the Potts model traditionally used for image segmentation.
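The following sketch evaluates this pairwise energy on a small 2D grid and shows the Potts special case. It assumes 4-connectivity and counts each neighbor pair once; both choices, as well as the function name `prior_energy`, are assumptions made for illustration (the model in the text is defined on a 3D grid).

```python
def prior_energy(labels, gamma, B):
    """H_Z(z) = -sum_i gamma[z_i] - sum_{i~j} B[z_i][z_j] on a 2D grid.

    labels is a list of rows of integer class indices (k stands for e_k).
    Assumptions of this sketch: 4-connectivity, each neighbor pair counted once.
    """
    rows, cols = len(labels), len(labels[0])
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            k = labels[r][c]
            energy -= gamma[k]
            if r + 1 < rows:  # pair with the voxel below
                energy -= B[k][labels[r + 1][c]]
            if c + 1 < cols:  # pair with the voxel to the right
                energy -= B[k][labels[r][c + 1]]
    return energy


K = 3
b = 1.0
gamma = [0.0] * K  # null gamma: no class is favored a priori
potts_B = [[b if k == l else 0.0 for l in range(K)] for k in range(K)]  # b * I_K

smooth = [[0, 0], [0, 0]]  # 4 neighbor pairs, all in the same class
mixed = [[0, 1], [2, 0]]   # 4 neighbor pairs, none in the same class
h_smooth = prior_energy(smooth, gamma, potts_B)
h_mixed = prior_energy(mixed, gamma, potts_B)
```

With a positive b, the homogeneous labeling has the lower energy and therefore the higher prior probability.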

Note that the standard Potts model is often appropriate for classification since it tends to favor neighbors that are in the same class. However, this model penalizes all pairs of differing classes with the same penalty, regardless of the tissues they represent. In practice, it may be more appropriate to encode higher penalties when the tissues are known to be unlikely neighbors. For example, the penalty for a white matter and CSF pair is expected to be greater than that for a grey matter and CSF pair, as grey matter and CSF are more likely to be neighbors.

In practice, these parameters can be tuned according to expert a priori knowledge, or they can be estimated from the data. More generally, when prior knowledge indicates that, for example, two given classes are likely to be next to each other, this can be encoded in the matrix with a higher entry for this pair. Conversely, when there is enough information in the data, a fully free \(\mathbb {B}\) matrix can be estimated; it will reflect the class structure (i.e. which class is next to which, as indicated by the data) and will then mainly serve as a regularizing term to encode additional spatial information.

For the distribution of the observed variables \(\mathbf {y}\) given the classification \(\mathbf {z}\), the usual conditional independence assumption is made. It follows that the conditional probability of the hidden field \(\mathbf {z}\) given the observed field \(\mathbf {y}\) is

$$P(\mathbf {z}| \mathbf {y}; \psi ) \propto \exp \left( -H_{\mathbf {Z}}(\mathbf {z}; \eta ) + \sum _{i \in V} \log g(\mathbf {y}_i | \mathbf {z}_i ; \phi )\right) .$$

Parameters are estimated using the variational EM algorithm which provides a tractable solution for non trivial Markov models [20].
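To give an idea of how such a variational scheme works, the sketch below implements one common choice, a mean-field approximation of the posterior over labels, on a 1D chain of voxels. It covers only the label-update part of the algorithm (the parameter updates are omitted) and is an assumption made for illustration; it is not necessarily the exact scheme of [20] or of P-LOCUS. All names and values are hypothetical.

```python
import math


def mean_field_sweep(q, log_g, gamma, B):
    """One in-place sweep of mean-field updates on a 1D chain of voxels.

    q[i][k]     current approximate posterior probability that voxel i is in class k
    log_g[i][k] log g(y_i | z_i = e_k), the data term
    Update: q_i(k) proportional to exp(gamma_k + sum_{j ~ i} (B q_j)_k + log_g[i][k]).
    """
    n, K = len(q), len(gamma)
    for i in range(n):
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < n]
        scores = []
        for k in range(K):
            interaction = sum(B[k][l] * q[j][l] for j in neighbors for l in range(K))
            scores.append(gamma[k] + interaction + log_g[i][k])
        top = max(scores)  # subtract the maximum for numerical stability
        weights = [math.exp(v - top) for v in scores]
        total = sum(weights)
        q[i] = [w / total for w in weights]
    return q


# Hypothetical example: 3 voxels, 2 classes; the middle voxel has ambiguous data.
log_g = [[0.0, -3.0], [-1.0, -1.0], [0.0, -3.0]]
q = [[0.5, 0.5] for _ in range(3)]
gamma = [0.0, 0.0]
B = [[1.0, 0.0], [0.0, 1.0]]  # Potts interaction with b = 1
for _ in range(10):
    mean_field_sweep(q, log_g, gamma, B)
```

In this toy example the middle voxel, whose data term does not discriminate between the two classes, is pulled toward the class of its neighbors by the interaction term.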

Atlas-Based Approach. Given the variability in the spatial extent and the magnitude of the injury in severe TBI, values averaged over large regions of WM would not allow the accurate detection of ‘abnormal’ values. Indeed, if the lesions are focal, the detection power is hampered by averaging with healthy tissue values. The standard way is to use an atlas-based approach where MD at each voxel is compared with normative values computed from homogeneous regions of interest (ROIs) of a healthy volunteer’s brain acting as a reference. We expect MD values to be homogeneous inside well identified brain regions, which define local normative values. In order to be as exhaustive as possible, we combined two atlases found in the literature. First, the Neuromorphometrics atlas, as provided with SPM12 for academic use, was used to demarcate cortical and sub-cortical regions (mainly GM). For WM regions, we used the ICBM DTI81 atlas, widely used in tractography studies to demarcate the principal fiber tracts. In the case of overlapping labels, the ICBM DTI81 label was selected. However, these tracts represent only a small part of the WM volume. To our knowledge, there is no atlas dividing the entire WM volume into anatomically meaningful subregions. Consequently, we automatically divided the remaining volume into cubes of 20 mm\(^{3}\). This size yields sufficiently local information while keeping WM regions large enough to compute reliable normative values. Our combined atlas defines 238 ROIs.
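The fusion rule described above can be sketched as follows: labels from the tract atlas override the cortical/sub-cortical labels where they overlap, and the WM voxels left unlabeled are grouped into regular cubes. The data layout (dictionaries keyed by voxel coordinates), the label names and the cube edge of 10 voxels are arbitrary illustrative assumptions, not the actual implementation.

```python
def fuse_atlases(gm_atlas, wm_tract_atlas, wm_mask, cube_size):
    """Combine two label maps and parcel the remaining WM into cubes.

    gm_atlas, wm_tract_atlas: dicts mapping a voxel (x, y, z) to a label string
    wm_mask: set of voxels classified as white matter
    cube_size: cube edge length in voxels (an assumption of this sketch)
    """
    fused = dict(gm_atlas)
    fused.update(wm_tract_atlas)  # on overlap, the tract label wins
    for voxel in wm_mask:
        if voxel not in fused:
            x, y, z = voxel
            cube = (x // cube_size, y // cube_size, z // cube_size)
            fused[voxel] = "wm_cube_%d_%d_%d" % cube
    return fused


# Hypothetical toy inputs.
gm = {(0, 0, 0): "thalamus", (1, 0, 0): "thalamus"}
tracts = {(1, 0, 0): "corpus_callosum"}
wm = {(1, 0, 0), (2, 0, 0), (12, 0, 0)}
fused = fuse_atlases(gm, tracts, wm, cube_size=10)
```

Each distinct label in the fused map then plays the role of one ROI in which normative MD values are computed.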

Fig. 1. Overview of the processing pipeline. After denoising, we used P-LOCUS for brain extraction and tissue segmentation, FSL for mean diffusivity (MD) map creation and SPM12 for realignment to a template (normalization). The atlas and the brain tissue maps were combined to define 238 ROIs where lesion detection was performed.

Our final combined atlas was then realigned (non-linear deformation using P-LOCUS) to our control subjects’ images and MD values were computed for each ROI. Figure 1 shows the different processing steps. In the literature, for lesion detection, authors usually transform DTI scalar maps (mostly fractional anisotropy, FA) into z-score maps to detect extreme values [3]. Because the distribution of MD values is not normal, the z-score would give a biased measure of extreme values. To avoid this effect, we chose to use two different thresholds: percentile-based and size-based. By fixing percentile thresholds \(\alpha _1\) for minimal and \(\alpha _2\) for maximal values, we identified clusters of extreme values. The distribution is skewed toward high values of MD; since these values are a marker of cell death and vasogenic edema, which are very frequent in severe TBI, we used a more lenient threshold for \(\alpha _2\). Figure 2 indicates the form of the MD distribution for our two control subjects.

Fig. 2. Histogram of MD values for control subjects. Percentile thresholds \(\alpha _1\) for minimal and \(\alpha _2\) for maximal values.
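The percentile-based detection can be sketched as below: thresholds are read from the control distribution of one ROI, then applied to patient voxels of the same ROI. The interpolation rule, the function names and all MD values are assumptions chosen for illustration; only the 2nd and 97.5th percentile levels come from the study.

```python
def percentile(values, p):
    """p-th percentile (0-100) with linear interpolation between order statistics."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    low = int(rank)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)


def flag_extremes(md_values, low_thr, high_thr):
    """Return -1 for abnormally low MD, +1 for abnormally high MD, 0 otherwise."""
    return [-1 if v < low_thr else (1 if v > high_thr else 0) for v in md_values]


# Hypothetical control MD values for one ROI (arbitrary units, skewed toward high values).
control_md = [0.70, 0.72, 0.74, 0.75, 0.76, 0.78, 0.80, 0.85, 0.95, 1.30]
low_thr = percentile(control_md, 2.0)    # alpha_1
high_thr = percentile(control_md, 97.5)  # alpha_2

patient_md = [0.50, 0.77, 1.60]
flags = flag_extremes(patient_md, low_thr, high_thr)
```

Unlike a z-score, these thresholds make no assumption of symmetry, so the long tail toward high MD values does not distort the lower cut-off.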

We considered lesions to be clusters with a size greater than a given threshold \(\beta \). P-LOCUS [18] uses T1-weighted and FLAIR images jointly to segment the brain into five classes: WM, GM, lesion and CSF (ventricular and extra-ventricular). Voxels labeled as CSF or hemorrhagic lesion were automatically excluded. The three thresholds were set empirically on control data to keep the detected lesion volume under 1\(\%\) of the brain volume: \(\alpha _1\) was fixed at the 2nd percentile, \(\alpha _2\) at the 97.5th percentile (i.e. 2.5\(\%\) for the highest values) and \(\beta \) at 21 contiguous voxels (i.e. 168 mm\(^{3}\) at the 2\(\,\times \,\)2\(\,\times \,\)2 mm\(^{3}\) resolution). These thresholds were then used for lesion detection on patient data.
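The size-based threshold can be sketched as a connected-component filter on the voxels flagged by the percentile thresholds. The 6-connectivity, the strict inequality on cluster size and the function names are assumptions of this sketch; the voxel volume follows from the 2 x 2 x 2 mm resampling.

```python
from collections import deque


def clusters(voxels):
    """Group voxel coordinates into 6-connected clusters (breadth-first search)."""
    remaining = set(voxels)
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    found = []
    while remaining:
        seed = remaining.pop()
        cluster, queue = [seed], deque([seed])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in steps:
                nb = (x + dx, y + dy, z + dz)
                if nb in remaining:
                    remaining.remove(nb)
                    cluster.append(nb)
                    queue.append(nb)
        found.append(cluster)
    return found


def keep_large_clusters(voxels, beta):
    """Keep only clusters whose size is strictly greater than beta voxels."""
    return [c for c in clusters(voxels) if len(c) > beta]


VOXEL_VOLUME_MM3 = 2 * 2 * 2  # resampled resolution of 2 x 2 x 2 mm^3

# Hypothetical flagged voxels: one 3 x 3 x 3 block (27 voxels) and one isolated voxel.
block = [(x, y, z) for x in range(3) for y in range(3) for z in range(3)]
flagged = block + [(10, 10, 10)]
lesions = keep_large_clusters(flagged, beta=21)
lesion_volume_mm3 = sum(len(c) for c in lesions) * VOXEL_VOLUME_MM3
```

Here the isolated voxel is discarded as noise while the 27-voxel block is kept, giving a lesion volume of 27 x 8 mm\(^{3}\).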

Manual Approach. To quantify the volume of lesions, three neuroradiologists (OH, CB and YT) with extensive experience in lesion assessment manually segmented the lesion area using the MRIcron software. They underwent specific intensive training to visually detect focal lesions in MD images. Focal lesions included any focal regions of abnormal signal in the MD map. The task was time-consuming: for each subject (n = 5) and each rater (n = 3), fifty slices were examined to detect high and low values of MD. The raters were not satisfied with their results: they were not familiar with such precise manual delineation and, despite training, the task remained particularly difficult because of the low contrast and low spatial resolution of the MD images compared to FLAIR and T1-weighted images. To obtain a reference from these segmentations we used the STAPLE algorithm [21]. The algorithm considers our collection of segmentations and computes a probabilistic estimate of the true segmentation and a measure of the performance level represented by each segmentation. To assess the inter-rater variability we also computed three STAPLE segmentation references using manual results in a leave-one-out strategy. We used four evaluation measures to assess the quality of the automatic segmentation compared to the reference ground-truth: the Dice coefficient (DC), which denotes the volume overlap (a DC value of 0 indicates no overlap, a value of 1 perfect similarity); the average symmetric surface distance (ASSD), which measures the surface fit (the lower the better); the Hausdorff distance (HD), which measures the maximum error (the lower the better); and precision and recall (see details on the computation of the evaluation measures at http://www.isles-challenge.org/).
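The overlap-based measures can be written compactly when each segmentation is represented as a set of lesion voxels. This is a minimal sketch under that assumption; the function name and the toy voxel sets are hypothetical, and the surface-based measures (ASSD, HD) are not covered.

```python
def overlap_measures(automatic, reference):
    """Dice coefficient, precision and recall for two sets of lesion voxels."""
    tp = len(automatic & reference)  # voxels labeled as lesion in both
    total = len(automatic) + len(reference)
    dice = 2.0 * tp / total if total else 1.0
    precision = tp / len(automatic) if automatic else 0.0
    recall = tp / len(reference) if reference else 0.0
    return dice, precision, recall


# Hypothetical segmentations: 4 automatic voxels, 6 reference voxels, 2 in common.
automatic = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)}
reference = {(2, 0, 0), (3, 0, 0), (4, 0, 0), (5, 0, 0), (6, 0, 0), (7, 0, 0)}
dice, precision, recall = overlap_measures(automatic, reference)
```

Precision penalizes automatic voxels absent from the reference, whereas recall penalizes reference voxels missed by the automatic method; the Dice coefficient balances the two.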

Fig. 3. Automatic and manual lesion delineation for five subjects (S1 to S5). For each subject: left: automatic delineation, right: manual delineation. Green: abnormal low MD values, Red: abnormal high MD values (Color figure online).

3 Results

Figure 3 shows, for our five patients, transverse views of the reference segmentation computed using STAPLE from the three rater segmentations and of the corresponding automatic segmentation. The normative values were computed from the two controls in each of the 238 ROIs, with thresholds set to keep the detected lesion volume below 1\(\%\) of the brain volume in controls.

Fig. 4. Left: Automatic vs manual high MD values in voxels. Right: Automatic vs manual low MD values in voxels. Using a leave-one-out strategy we obtained three values for each subject. Black line: correlation slope.

Figure 4 shows the volumetric comparison between manual and automatic delineation, for high MD and low MD values respectively, for our five patients. Using STAPLE, for each subject, we computed three references, each from the manual segmentations provided by two of the three raters. This highlights the substantial inter-rater variability (see for instance S2). Volume agreements between manual and automatic results are not perfect. Clearly, the automatic delineation minimizes high MD values (Fig. 4, left). This is confirmed by the low precision values with high recall values for high MD (see Table 1). Table 1 reports the values of our different evaluation measures for our five subjects. To our knowledge, no such values are available in the literature for comparison.

Table 1. Measures to evaluate the quality of the automatic segmentation compared to the reference ground-truth. DC: the Dice coefficient, ASSD: the average symmetric surface distance, HD: the Hausdorff distance

4 Discussion

In our study, vulnerable brain lesions were defined based on morphological images and on abnormal values of MD from DTI, in order to distinguish between cytotoxic and vasogenic edema. We analyzed each case individually because the spatial distribution of brain trauma lesions is highly heterogeneous and cannot be revealed by a group study. We compared lesion volume delineation using a multi-modal atlas-based automatic method to manual delineation by three neuroradiologists. Our results show that the proposed method identifies some lesions in severe TBI consistently with those defined by our experts. Several measures that assess the quality of the automatic segmentation compared to the reference ground-truth (see Table 1) reveal that some discrepancies exist between manual and automatic methods. Clearly, these results should be improved. To our knowledge, no such measures have been published yet for automatic lesion detection in severe TBI. These values may serve as a starting point for comparison with alternative techniques. Our trained experts reported that they were not totally confident in their final rating. We observed that lesions were particularly difficult to segment manually due to the low contrast and low spatial resolution of diffusion images compared to FLAIR or T1-weighted images, the latter being more familiar to the experts. This was reflected by the high inter-rater variability (see Fig. 4). We used STAPLE to compute a probabilistic estimate of the true segmentation. However, the low number of raters involved (n = 3) and the high inter-rater variability limit the validity of such a “ground truth”. This could explain in part the observed discrepancies between manual and automatic approaches. The manual task required specific training and was time-consuming. Consequently, it was difficult to involve more trained experts to define an “expert consensus” and limit bias.

In this study we considered mean diffusivity (MD), a physiological parameter extracted from DTI scans, to distinguish between vasogenic and cytotoxic edema. While MD is sensitive to sparse small lesions with low MD values (corresponding to high-intensity spots in FLAIR) and allows physical quantification of the lesion in terms of altered water molecule diffusivity, the high contrast of FLAIR images allows an easy delineation of large damaged regions with high MD values. Further work should be done to improve brain injury characterisation by exploiting such complementary information with our automatic method.

The methodological difficulties in performing MRI in the acute phase of severe TBI explain the rarity of studies for this period. Only two studies [6, 14] address the problem of severe trauma (Glasgow score \(<9\)) in the acute phase, i.e. less than 10 days post-injury. The former, a multi-center study, aimed to define a long-term outcome prediction from quantitative parameters extracted from DTI in specific ROIs in white matter. The latter was concerned with the evolution of ADC values in the traumatic lesions. No quantitative measurement of the lesion volume was reported in these two studies. Compared to tumor, stroke or multiple sclerosis, few papers have addressed automatic lesion segmentation in TBI. The majority of TBI studies report volume changes computed in specific ROIs. Few approaches report the lesion load [15, 22]. Because the spatial distribution of the lesion cannot be anticipated, our approach considered the entire brain without any a priori spatial hypothesis. We used two atlases to parcel the entire brain. MD was then computed in each ROI. An MD-driven alternative would be to search for homogeneous MD territories by clustering directly from the set of control DTI. Strangman et al. [13] reported inadequate skull stripping and poor subcortical structure segmentation with the most common method, FreeSurfer. Using non-linear deformation of a priori tissue probability maps on individual T1-weighted and FLAIR images, we successfully used P-LOCUS to provide brain tissue segmentation and exclude voxels labeled as CSF in ventricles and hemorrhagic lesions. To detect outlier/abnormal MD values, we defined normative values on normal controls. Such normative values are highly scanner and sequence dependent and, as in our multi-center study, should be defined for each center involved. Because of the influence of age on MD values, the age range of the normal controls should be matched with that of the TBI patients.

The influence of the size of the normal control population on the norm definition should be evaluated. Recently, [23] proposed a method to harmonize diffusion MRI data across multiple scanners. Several rotation-invariant features are computed from spherical harmonic basis functions and used to estimate a region-based linear mapping between signals from different scanners. Such a method might be used to define normative values by pooling normal controls from different sites. A poor estimation of the normative mean in each ROI of the control group biases the detection of aberrant values [22]. Instead of an atlas-based approach, a voxel-based approach that segments abnormal values directly from individual diffusion-weighted images could be introduced, avoiding the definition of normative values. However, such an approach remains difficult due to the low contrast present in these images.

In conclusion, this paper reports the image processing steps of, and the difficulties encountered in, the first program aiming to assess the impact of a Pbt\(O_{2}\)-based therapeutic strategy on the volume of severe post-trauma cerebral lesions and on neurological outcome in a randomized controlled trial. We hypothesise that early monitoring of brain oxygenation with Pbt\(O_{2}\) can reduce the volume of vulnerable brain lesions and, possibly, shift TBI patients from an unfavorable to a favorable neurological outcome. These preliminary results, obtained on a small number of subjects in one center, are encouraging, and a larger evaluation including more controls and patients is underway.