Introduction

Multiparametric MRI stroke protocols including diffusion-weighted imaging (DWI) and perfusion MRI (pMRI) have been increasingly used to characterize acute stroke patients, and can potentially provide an early, objective metric to help guide therapy [1]. Lesion volumes measured in patients with acute stroke using both DWI and pMRI have been shown to correlate with baseline stroke severity (NIH Stroke Scale, NIHSS, score) as well as outcome [2–4], and the pMRI/DWI mismatch appears to be a good estimate of the tissue at risk of infarction [5, 6]. However, numerous factors complicate the relationship between lesion volume and severity. For DWI, it has been shown that portions of the lesions are potentially reversible with early reperfusion [7, 8]; for pMRI, the hemodynamic delay in the mismatch region includes areas of benign oligemia as well as the true tissue at risk [9, 10]. Further obscuring the relationship between lesion volume and severity, the region of tissue identified as hypoperfused depends on the algorithm used to calculate the parametric map [10, 11].

In the absence of a technique to quantify perfusion on an absolute scale, we can at best expect a correlative relationship between the imaging metric of perfusion and a clinical scale. However, it is not obvious which of the algorithms currently used to calculate vascular transit time results in a volume that best correlates with clinical deficit. Several authors have presented data that increasing thresholds of perfusion delays help to differentiate between normal and ischemic brain [12, 13].

The purpose of our study was to determine which of four widely used pMRI algorithms, each with a variety of time delay thresholds, most accurately reflects the clinical deficit at the time of acute imaging. We hypothesize that the volume derived from a specific pMRI algorithm, which correlates best with the acute clinical deficit, most likely shows the actual critically hypoperfused brain area from a clinical perspective.

Materials and methods

Patients

Twenty patients aged 70.9±15.7 years (10 male, 10 female, 18 Caucasians, 1 Hispanic, 1 African American) were retrospectively selected from an acute stroke clinical and MRI database. All patients provided informed consent to participate in a natural history study of stroke including imaging and data analysis. Of the 20 patients, 11 had left and 9 had right hemispheric lesions; the median NIHSS score was 9.5 (interquartile range 13.5, range 1–21). Vessel occlusion localizations were as follows: internal carotid artery (n=7, 35%), middle cerebral artery mainstem (n=3, 15%), middle cerebral artery branch (n=8, 40%), posterior cerebral artery and anterior choroidal artery (each: n=1, 5%).

The inclusion criteria were (1) pMRI evidence of an acute ischemic hemispheric lesion, (2) MRI performed within 24 h of stroke onset, and (3) technically adequate imaging data. Stroke duration was determined from the time patients were last known to be without deficits. No patient was treated with fibrinolytics or investigational agents before acute imaging and clinical assessment. Patients with previous strokes or other preexisting neurological deficits were ineligible, since a prior history would confound NIHSS findings. NIHSS scores were recorded within the hour before imaging.

Imaging

For each patient, a complete stroke MRI protocol including pMRI was performed on a Siemens Vision 1.5 T scanner. For the purpose of this study only pMRI data are reported. All patients were imaged within 24 h of stroke onset (range 2–22 h). A circularly polarized head coil was used for excitation and signal reception. Dynamic susceptibility contrast imaging is one of the most commonly used techniques for obtaining pMRI [14]. With pMRI in particular, dynamic first-pass bolus tracking of gadolinium DTPA was accomplished under the following multislice, gradient echo, echoplanar scanning parameters: echo time (TE) 47.0 ms, repetition time (TR) 2000 ms, field of view (FOV) 260 mm, acquisition matrix 128×128 and slice thickness 7 mm with no gap in between the slices to obtain a total of 12 pMRI slices. The signal to noise ratio in a representative sample of our patients ranged from 45 to 70. The gadolinium bolus (0.1 mg/kg body weight) and a saline flush of 20 ml were administered intravenously by manual injection over 5 s using an 18- or 20-gauge intravenous catheter. It has been shown before that manual injection by an experienced investigator is indistinguishable from that done by a power injector, which is the current method of choice [15].
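In dynamic susceptibility contrast imaging, the signal-time curve of each voxel is usually converted to a relative concentration-time curve before any transit time is computed. The following is a minimal sketch of that standard conversion, assuming the change in relaxation rate, ΔR2* = −ln(S(t)/S0)/TE, is proportional to contrast agent concentration. The function name `delta_r2star` and the number of baseline samples are illustrative assumptions and are not taken from the software used in this study.

```python
import math

TE_S = 0.047  # echo time from the protocol above (47.0 ms), in seconds


def delta_r2star(signal, n_baseline=5, te=TE_S):
    """Convert a signal-time curve to a relative concentration-time curve.

    n_baseline is a hypothetical number of pre-bolus samples used to
    estimate the baseline signal S0.
    """
    # Baseline signal S0: mean of the first n_baseline (pre-bolus) samples.
    s0 = sum(signal[:n_baseline]) / n_baseline
    # Delta R2* = -(1/TE) * ln(S(t)/S0), assumed proportional to concentration.
    return [-math.log(s / s0) / te for s in signal]


# Illustrative (made-up) signal: flat baseline followed by a bolus-induced drop.
example_signal = [100.0] * 5 + [80.0, 60.0, 70.0, 90.0, 100.0]
example_curve = delta_r2star(example_signal)
```

The output is in units of 1/s and is only a relative measure of concentration; absolute quantification would require additional calibration.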

Four common algorithms for mean transit time (MTT) were generated for each case:

  1. MTT=cerebral blood volume (CBV)/cerebral blood flow (CBF) (deconvolution or Ostergaard's method with arterial input function (AIF) and singular value decomposition (SVD) [16])

  2. MTT=CBV/peak amplitude of the concentration-time curve (area/peak)

  3. MTT=difference between concentration peak time and arrival time (time to peak or TTP)

  4. MTT=1st/0th moment (first moment method) of the concentration-time curve [17]
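The three summary-parameter definitions in the list above (area/peak, peak time minus arrival time, and first moment) can be sketched for a single voxel as follows. This is an illustration only, not the MedX implementation: it works on the sampled curve without a gamma variate fit, uses trapezoidal integration, and defines arrival as the first sample reaching a hypothetical fraction (`arrival_fraction`) of the peak. The deconvolution (AIF/SVD) method is omitted because it requires an arterial input function and a matrix decomposition.

```python
def summary_parameters(times, conc, arrival_fraction=0.1):
    """Return three MTT surrogates for one concentration-time curve.

    times: sample times in seconds; conc: relative concentration values.
    arrival_fraction is an assumed cut-off for bolus arrival.
    """
    n = len(times)
    # Zeroth moment (area under the curve) by the trapezoidal rule.
    area = sum(0.5 * (conc[i] + conc[i + 1]) * (times[i + 1] - times[i])
               for i in range(n - 1))
    # First moment: trapezoidal integral of t * C(t).
    moment = sum(0.5 * (times[i] * conc[i] + times[i + 1] * conc[i + 1])
                 * (times[i + 1] - times[i]) for i in range(n - 1))
    peak = max(conc)
    t_peak = times[conc.index(peak)]
    # Arrival time: first sample at or above the chosen fraction of the peak.
    t_arrival = next(t for t, c in zip(times, conc)
                     if c >= arrival_fraction * peak)
    return {
        "area_over_peak": area / peak,           # algorithm 2
        "peak_minus_arrival": t_peak - t_arrival,  # algorithm 3
        "first_moment": moment / area,           # algorithm 4
    }


# Illustrative triangular bolus sampled every 2 s (TR = 2000 ms).
example_times = [0, 2, 4, 6, 8, 10, 12, 14, 16]
example_conc = [0, 0, 1, 2, 3, 2, 1, 0, 0]
example_params = summary_parameters(example_times, example_conc)
```

For this symmetric example curve the three surrogates already differ (6 s, 4 s and 8 s), which illustrates why lesion volumes derived from different algorithms need not agree.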

In this study, both map construction and lesion volume computation were completed in an automated fashion by a contributor blinded to the clinical symptoms (C.S.W.) using MedX (Sensor Systems, Sterling, Va.), a commercially available software program featuring the above MTT algorithms. For the AIF, approximately 10–20 voxels from within the major arteries of the hemisphere ipsilateral to the lesion were selected from an automatically presented choice of voxels, based on their TTP and maximum signal change, and averaged. Recirculation effects were avoided in the gamma variate fit by including only values greater than 10% of peak concentration during “wash-in” and greater than 60% during “wash-out”. We first computed the mean vascular transit time \(\overline{MTT}_{c}\) in contralateral control brain and used it as the reference to compute MTT values. We subsequently selected voxels in the ischemic hemisphere whose values exceeded \(\overline{MTT}_{c}\) by more than 0, 2, 4, 6 and 8 s. Thus, for each MTT algorithm, five sets of hypoperfusion volumes were derived based on increasing thresholds, accounting for a total of 20 sets of lesion volumes. The investigator performing the volume measurements was blinded to the NIHSS score. IRB approval was obtained before imaging.
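The thresholding step can be sketched as follows, assuming that each voxel contributes a fixed volume derived from the acquisition parameters reported above (260 mm field of view, 128×128 matrix, 7 mm slices) and that the field of view is square with no interpolation. The function `lesion_volumes` and its inputs are hypothetical and only illustrate the counting logic, not the MedX implementation.

```python
FOV_MM = 260.0    # field of view (assumed square)
MATRIX = 128      # acquisition matrix size
SLICE_MM = 7.0    # slice thickness, no gap

# Volume of one voxel in millilitres (1 ml = 1000 mm^3), about 0.029 ml.
VOXEL_ML = (FOV_MM / MATRIX) ** 2 * SLICE_MM / 1000.0


def lesion_volumes(ipsi_mtt, contra_mtt, thresholds=(0, 2, 4, 6, 8)):
    """Hypoperfusion volume (ml) per delay threshold (s).

    ipsi_mtt: MTT values (s) of voxels in the ischemic hemisphere.
    contra_mtt: MTT values (s) of voxels in contralateral control brain.
    """
    # Reference value: mean MTT of the contralateral hemisphere.
    reference = sum(contra_mtt) / len(contra_mtt)
    volumes = {}
    for thr in thresholds:
        # Count voxels whose delay relative to the reference exceeds thr.
        count = sum(1 for v in ipsi_mtt if v - reference > thr)
        volumes[thr] = count * VOXEL_ML
    return volumes


# Illustrative (made-up) voxel values in seconds.
example_volumes = lesion_volumes([5, 6, 8, 10, 12, 14], [4, 5, 6])
```

Because every higher threshold selects a subset of the voxels selected by the lower one, the volumes can only stay equal or shrink as the threshold increases.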

Statistical analyses

The 20 sets of lesion volumes were analyzed in a number of ways. First, mean and standard deviations for volume differentials between algorithms were computed to obtain a Bland-Altman description of the data. Second, algorithm versus algorithm Spearman correlation coefficients were derived to assess the degree to which volume estimates varied depending on the algorithm used. Combined, the Bland-Altman description and the algorithm versus algorithm correlation coefficients helped determine whether the four algorithms produced significantly different volume estimates for the various thresholds of MTT delay. Finally, the 20 sets of lesion volumes were tested for a correlation with acute NIHSS score. To evaluate the statistical significance of observed differences in correlation coefficients, Fisher's z transformation tests were performed.

Results

Hypoperfusion volumes

At the 0-s threshold (i.e., relative to the mean MTT of contralateral brain), the AIF/SVD and first moment algorithms yielded substantially lower MTT lesion volumes than the other two algorithms (one-way repeated-measures ANOVA, F(3,79)=12.69; P<0.0001) (Table 1). This disparity narrowed with increasing MTT delays.

Table 1 pMRI lesion volumes±SD (ml) for all algorithms and delay thresholds. Lesion volumes decreased with increasing delay thresholds (TTP time to peak, AIF arterial input function, SVD singular value decomposition)

The Bland-Altman description of the data and algorithm versus algorithm correlation coefficients confirmed that the AIF/SVD and the first moment algorithms generated markedly different lesion volumes than the area/peak and TTP algorithms, with the discrepancies decreasing with increasing MTT thresholds. For pMRI maps using AIF/SVD or first moment methods with a 0-s threshold, the mean difference of the means (Bland-Altman) was only 3.35 ml whereas the differences between AIF/SVD or first moment versus TTP or area/peak varied between 52.45 ml and 67.71 ml. Concomitantly, the TTP and area/peak algorithms yielded highly correlated (all r>0.89, all P≤0.0001) but, according to the Bland-Altman description, differing volumes (all mean differences >11.91 ml), illustrating that these two algorithms both render substantially larger lesion volumes than the AIF/SVD and first moment methods. As one could expect, with increasing delays, the lesion volumes decreased for all algorithms and so did their mean differences (Bland-Altman) leading to higher correlations between the different algorithms. The lesion volumes decreased more markedly with the first moment than with the AIF/SVD methods (Tables 2 and 3, Fig. 1).

Table 2 Bland-Altman plot-derived mean differences of the pairwise comparison of two pMRI methods, illustrating that, especially at lower delay thresholds, TTP- and area/peak-derived volumes as well as first moment- and AIF/SVD-derived volumes are quite similar (TTP time to peak, AIF arterial input function, SVD singular value decomposition)
Table 3 Correlation coefficients of the non-parametric Spearman rank correlation for all pairs of algorithms at the five different delay thresholds. The correlation consistently increased with increasing delay thresholds. See also Fig. 1 (TTP time to peak, AIF arterial input function, SVD singular value decomposition)
Fig. 1 Scatterplot of the four tested algorithms; time delay/threshold 0 s. The matrix scattergram illustrates the correlations between the four algorithms for the set of volumes without a time delay (0 s). The best correlation is found for TTP- and area/peak-derived volumes; however, both these algorithms overestimate the critically hypoperfused volume (see Figs. 2 and 3, and Table 4) (AIF arterial input function, SVD singular value decomposition)

Clinical correlation

The discrepancies in estimated volumes translated into noticeable differences in clinical correlation. The volumes at a threshold of 0 s for the TTP algorithm did not correlate significantly with NIHSS score (Spearman r=0.27, P=0.24), whereas those for the area/peak, AIF/SVD and first moment methods did, with the correlation increasing in that order. As lesion volumes were analyzed with ever-increasing MTT thresholds, NIHSS correlation for the AIF/SVD algorithm did not change considerably, while correlation with the area/peak and TTP algorithms improved. As for the first moment method, correlation with clinical deficits actually decreased slightly with increased MTT threshold but remained significant, and was better than that of any other algorithm at each MTT threshold. However, the Fisher's z transformation tests revealed that the differences in correlation only proved significant for first moment versus area/peak and for first moment versus TTP for volumes determined at an MTT threshold of 0 s (Table 4, Figs. 1, 2 and 3).

Table 4 Correlation coefficients of the non-parametric Spearman rank correlation for all algorithms and delay thresholds with the baseline NIHSS score. The correlation was highest for the first moment method at all delay thresholds. See also Fig. 2 (TTP time to peak, AIF arterial input function, SVD singular value decomposition)
Fig. 2 Scatterplots of NIHSS score versus volumes derived by all four algorithms with 0 s delay/threshold. The scattergrams illustrate the correlations of NIHSS scores with the volumes rendered by all four algorithms for the group with a 0-s delay. The best correlation and lowest intercept are seen for the first moment algorithm (CBF cerebral blood flow, AIF arterial input function, SVD singular value decomposition)

Fig. 3 Upper (TTP) and lower (first moment) row show two adjacent slices of the same patient's parameter maps calculated with different algorithms but without a time delay threshold. The area of hypoperfusion (shown as hyperintensity) is substantially larger on the TTP than on the first moment maps

Discussion

Dynamic susceptibility contrast imaging with T2*-weighted images and conversion of signal-intensity curves to concentration-time curves is one of the most commonly used techniques for obtaining pMRI [11, 14]. The inability of pMRI to reliably differentiate infarct core, tissue at risk (penumbra) and oligemia (benign hypoperfusion) [10, 18] may preclude a valid application of the pMRI/DWI mismatch concept in the decision-making process for acute stroke therapy [1, 5]. Despite evolution and refinement of the different pMRI techniques and algorithms [19], problems of quantitative measurement of CBF, MTT and CBV, such as inaccurate assessment of the AIF, tracer bolus dispersion and the confounding role of concomitant vascular disease, remain unsolved [19–21]. Therefore, while pMRI cannot quantify CBF directly, improvements in data generation and analysis should achieve better correlation between inferred perfusion deficits and clinically significant pathology such as the clinical deficit at the time of imaging [22]. This approach has been criticized because the clinical stroke scales in use are noisy and ordinal only; however, the NIHSS [23] is the scale most commonly used in therapeutic trials as well as in clinical practice.

Among the various hemodynamic parameters described, the most frequently used is the MTT due to its ability to delineate visually distinct lesions of larger size better than CBF or CBV maps [10, 11, 22]. Ways to estimate MTT include ratio of area to peak, first moment, AIF/SVD, TTP, time to arrival (TTA), and relative time to peak (TTP-TTA).

By applying varying delay thresholds to TTP maps, Neumann-Haefelin et al. aimed to improve the diagnostic sensitivity of TTP parameter maps and find parameters indicative of both infarct progression and functional impairment [12]. In 20 patients with nonlacunar ischemic stroke imaged within 24 h of symptom onset, they found that a TTP delay of ≥6 s in the mismatch region was associated with lesion enlargement between the initial and follow-up MRI scans. Also, the volume of the regions with TTP delays of ≥4 s correlated better with the European Stroke Scale than other pMRI lesion volumes, which indicated that a TTP delay of approximately 4 s might be the threshold for functional impairment of brain tissue. They concluded that TTP delays exceeding 4 s contribute to the acute clinical deficit, while severe perfusion deficits with TTP delays of more than 6 s depict the areas of brain at high risk of irreversible infarction. In agreement with Neumann-Haefelin et al., in our patients TTP-derived lesion volumes correlated increasingly with baseline NIHSS score when the delay was increased. Butcher et al. [13] compared MTT, CBV and CBF parameter maps in 17 thrombolysis and 18 control patients imaged within 6 h of stroke onset to assess the degree of relative perfusion disturbance which leads to final infarction or to tissue salvage. In this study, MTT was prolonged by 22% and CBF 10% lower in infarcted regions relative to salvaged tissue, while CBV did not differ significantly between infarcted and salvaged regions. When reperfusion occurred, tissue with more severely prolonged MTT was salvaged from infarction relative to patients with persistent hypoperfusion, although MTT threshold of salvage inversely correlated with time from onset.

The Bland-Altman description of our data and the algorithm versus algorithm correlation coefficients indicate that the differential volume estimates stem from actual differences between the algorithms as opposed to other sources of variation. Although these statistics did not establish a significant difference between the first moment method and the AIF/SVD algorithm, the higher correlations with clinical deficits at every MTT threshold suggest that the first moment method has an advantage over the metrics calculated using the AIF/SVD, area/peak and TTP algorithms. In contrast, NIHSS correlation with the AIF/SVD algorithm seemed independent of the MTT threshold, although the lesion volume diminished substantially with increasing delay thresholds. While not formally tested, the lower correlation with NIHSS score and higher difference of the means imply that TTP and area/peak inconsistently overestimate the degree of clinically meaningful hypoperfusion when the level of MTT delay is not adjusted.

We are aware of only two other studies that compared different pMRI algorithms in individual patients with nonlacunar ischemic strokes who were scanned within 24 h after onset of stroke symptoms. Yamada et al. [11] assessed the sensitivity and specificity of pMRI in six patients with and six patients without an extracranial internal carotid artery stenosis of >70% compared to follow-up MRI or CT. Four of the seven different postprocessing methods were identical to ours: first moment, ratio of area to peak, TTP and deconvolution methods. Other methods included relative TTP, TTA and full-width at half-maximum. They found that first moment, TTP and deconvolution methods had high sensitivity (sensitivities of 74%, 77% and 81–94%, respectively). In patients with concomitant carotid stenosis, only first moment and deconvolution methods maintained relatively high specificity. This is consistent with the findings of Neumann-Haefelin et al. [24] in 27 acute stroke patients with severe internal carotid artery stenosis. A considerably smaller fraction of the total pMRI/DWI mismatch was at risk than in patients without carotid disease according to TTP maps in particular, even when delay thresholds were applied. Both groups concluded that the calculation technique and presence of underlying vasculopathy have a direct impact on the results of pMRI [11, 24]. In the study by Yamada et al. [11], among the three algorithms with good accuracy, the deconvolution method was superior to the first moment method because of higher lesion conspicuousness. Although we found the first moment technique to be more reflective of the clinical deficit, the results of the two studies are not necessarily contradictory as both studies consistently show these two methods to be the most reliable. Furthermore, we primarily assessed the power of different pMRI algorithms to detect critical hypoperfusion as presumably reflected in clinical severity measured by the NIHSS.

Grandin et al. [10] retrospectively studied the predictive value and diagnostic strength of different relative, semiquantitative (relative to AIF) and quantitative pMRI algorithms to differentiate infarct core, tissue at risk and oligemia in 66 patients within 6 h of stroke onset. They found that a function of peak height and TTP (<54% or >5.2 s, 71% sensitivity, 98% specificity) and quantitative CBF (<35 ml/min per 100 g, 69% sensitivity, 85% specificity) were the best predictors of infarct growth. Quantitative and relative measurements (without or with deconvolution) worked equally well. However, in contrast to the study of Yamada et al. [11], the first moment method did not reliably predict final infarct. The better performance of the first moment method with regard to clinical deficit in our study may in part be explained by the fact that we studied patients in the 24-h time window, when reduction of the MTT volume compared to an earlier baseline scan has been seen despite persisting vessel occlusion [4]. In addition, in contrast to Grandin et al. [10], we validated the various pMRI algorithms against baseline stroke severity rather than final infarct volume, and introduced thresholds.

Limitations to our study include a lack of statistically significant differences except when comparing the first moment method against the area/peak and TTP algorithms without delay thresholds. However, future studies with larger sample sizes may very well validate the general trends illustrated by this study. A lack of interobserver comparisons may be offset by the blinded evaluation of volumes. Also, the automated lesion volume calculation more than compensates for the interobserver variability that normally accompanies manual tracing and computation of lesion volumes. Of note, this was an exploratory investigation assuming that the four algorithms vary in their capacity to identify clinically significant hypoperfusion, but there was no a priori indication as to the magnitude of this difference. First, a larger trial with multivariate analysis is needed to assess the validity of a given MTT algorithm (such as the first moment method) in determining the pMRI lesion volume which reflects the clinical deficit (i.e., critically hypoperfused brain tissue). In a second step, these algorithms then can be used to predict clinical and morphological outcome.

Another noteworthy problem is the recently reported insensitivity of the NIHSS to right hemispheric stroke symptoms such as hemineglect and spatial attention deficit [25], which leads to poorer correlations of NIHSS with pMRI volumes. On the other hand, despite this weakness, the NIHSS has been and still is the most widely applied stroke scale used in acute stroke trials. Correlations with other scales, such as the Scandinavian or European Stroke Scales, have been less extensively studied and show worse correlations with imaging findings than the NIHSS [4]. Furthermore, beyond the 6-h time window (our patients were assessed within 24 h of symptom onset), substantially higher correlations of scales and lesion volumes have been reported [3, 4], which may be due to a smaller area of critically hypoperfused but still viable, i.e., penumbral, brain tissue. On the other hand, pMRI/DWI mismatch [26] as well as penumbral tissue according to single photon emission computed tomography (SPECT) [27] beyond a time window of 6 h are not that rare and, as reported recently, may be successfully treated by recanalization therapy [28]. Finally, as newer MRI-based thrombolysis studies approach extended time windows up to 24 h from symptom onset [29, 30], our choice of a 24-h time window, rather than <12 or <6 h only, meets the clinical needs.

In conclusion, we found the highest correlations between parameter map and clinical deficit with the first moment and the AIF/SVD methods. The latter was also relatively independent of the delay threshold used to derive the volumes. For the other algorithms, moderate correlations were observed for volumes with delay thresholds of ≥4 s. As the first moment method, despite being a qualitative pMRI algorithm, seems to reflect clinical stroke severity equally well, a time-consuming quantitative analysis including definition of one or more AIFs may not be needed. Furthermore, arbitrarily defined time delays do not improve the correlation with baseline clinical deficit. The feasibility of quantitative perfusion assessment, including the time-consuming postprocessing, in the emergency setting of acute stroke needs to be validated. While a “quantitative” pMRI technique may be expected to provide more accurate results once problems such as the AIF and the numerical algorithm are solved, the first moment technique, despite being qualitative, performs well. First moment pMRI is a promising surrogate parameter candidate for the clinical deficit in acute stroke and is thus helpful for clinical (and investigative) decision-making. Once this has been firmly established in a larger series, the predictive (prognostic) power for final stroke lesion volumes and clinical outcomes should be tested.