Introduction

Pituitary adenomas are benign, epithelially derived neoplasms that represent 10–15% of intracranial tumors [1, 2]. Their location, infiltrative growth, and central role in hormonal activity combine to make successful treatment challenging. Lifelong assessment of pre- and post-surgical adenoma status is necessary for monitoring patient well-being and is typically performed using magnetic resonance (MR) imaging. One parameter that strongly influences treatment options is volumetric tumor change, both in terms of possible mechanical compression of sellar region structures including the optic chiasm and cavernous sinuses, and for the evaluation of treatment effects. While tumor volume change may be small between consecutive MR exams, the change may be decisive for the choice of treatment strategy.

Characterization of adenoma tumor growth from visual comparison of non-co-registered 2D MR images from different time points is inherently subjective. Volume estimates from 2D images may be calculated from the formula (0.5 × anteroposterior dimension × vertical dimension × horizontal dimension) [3], which, for a lesion with an irregular shape, should be considered only a rough estimate [4]. Using this method, a volume change is commonly considered significant when it exceeds 20% [5], a threshold that takes into account the limitations of estimating tumor diameter from separate slices through random parts of the tumor in sometimes slightly different projections. The method is therefore unable to detect subtle yet clinically important volume changes. As an example, using the formula above, a 1-mm decrease in each diameter of a 10-mm tumor would result in a decrease in volume of almost 30%. This 1-mm difference is almost impossible to detect with any certainty using visual inspection of conventional 2D MR images.
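
The arithmetic behind this example can be checked with a short sketch. It assumes, as our reading of the example, that the 1-mm decrease applies to all three orthogonal diameters; the function names are illustrative only and not part of any cited method.

```python
def ellipsoid_volume(ap_mm, vertical_mm, horizontal_mm):
    """Volume estimate (mm^3) from three orthogonal diameters: 0.5 * a * b * c."""
    return 0.5 * ap_mm * vertical_mm * horizontal_mm


def relative_change(v_before, v_after):
    """Signed relative volume change as a fraction of the earlier volume."""
    return (v_after - v_before) / v_before


# Assumption: a 10-mm tumor shrinks by 1 mm in each of the three diameters.
v0 = ellipsoid_volume(10.0, 10.0, 10.0)
v1 = ellipsoid_volume(9.0, 9.0, 9.0)
change = relative_change(v0, v1)
print(f"{v0:.1f} mm^3 -> {v1:.1f} mm^3 ({change:.1%})")
```

Under this assumption the estimated volume falls from 500.0 to 364.5 mm3, a decrease of about 27%, which lies above the commonly used 20% threshold even though the change in diameter is only 1 mm.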

Slice-by-slice summation of manually defined tumor areas has proven to be a more accurate method of estimating tumor volumes [3]; however, these manual measurements have been demonstrated to be less reliable than semi-automated methods when applied in other settings such as gliomas [6]. Manual measurements also have the disadvantage of being time consuming, especially with larger adenomas, resulting in limited utility in most radiological practices.

A standardized and fast 3D-based method for the direct detection of tumor volume changes may counteract these potential drawbacks by increasing diagnostic precision and enhancing the radiological workflow. It is our hypothesis that visual comparison of distortion-corrected and rigid body co-registered 3D MR images from different time points can aid the radiologist in making a more accurate and objective assessment of volumetric changes than can be achieved by visual inspection of unregistered 2D images. In view of the above, the purpose of our study was to evaluate the utility of visual inspection of affine-aligned longitudinal 3D MR images that have been corrected for both geometric distortion and intensity nonuniformity in assessing structural change of pituitary adenomas.

Materials and methods

This prospective study was approved by the regional ethics committee. Informed consent was obtained from all patients. Thirteen adult patients, with no inclusion criteria other than being referred to a routine follow-up MR study for a residual pituitary macroadenoma, were included: eight females and five males aged 29–76 years (mean age 53 years). Of these, seven patients were imaged at two different time points, four at three time points, and two at four time points, with a mean time between consecutive MR exams of 444 ± 147 days (±SD). All adenomas were assessed for volume change both since the last examination and since the first examination, producing 29 combinations of time points. No patient underwent surgery from 1 year prior to the first scan or during the period of follow-up scans.

MR imaging

MR imaging was performed at 1.5 Tesla (Siemens Sonata, Erlangen, Germany) using an eight-channel head coil. The MR imaging protocol was as follows: axial 2D turbo spin echo T2-weighted (w) images with repetition time (TR) = 4,000 ms, echo time (TE) = 97 ms, 0.6 × 0.4 × 5 mm3, and sagittal and coronal T1-w images (pre- and post-gadodiamide 0.05 mmol/kg administration) with TR = 516 ms, TE = 12 ms, 0.7 × 0.4 × 3 mm3. In addition, a radio frequency-spoiled sagittal gradient echo (GRE, FLASH) post-contrast T1-w 3D image sequence with TR = 20 ms, TE = 1.8–15.8 ms, 1.3 × 1.0 × 1.3 mm3 was acquired in all patients. The coronal 2D images were oriented perpendicular to the sella.

Image post-processing

Raw images, both 2D and 3D, suffer from geometric distortion due to nonlinearities in the gradient magnetic fields used for spatial encoding. This effect can be substantial, with distortion of the field of view up to a few centimeters, depending on the scanner model and head positioning within the scanner [7]. Correction for such geometric distortion is especially important when measuring subtle longitudinal structural changes from serial MRI scans [8].

All 3D scans acquired in the present study were first corrected for gradient nonlinearity distortions using custom software incorporating the exact specifications for the 3D MRI gradient field patterns as supplied by the scanner manufacturer [9]. After correction for nonlinear geometric distortions, each subject’s 3D follow-up images were then affine-aligned to the subject’s baseline image and finally corrected for mutual variation in intensity [10, 11].

Image analysis

In order to assess the diagnostic efficacy [12] of the proposed technique, two neuroradiologists with 22 and 3 years of experience, respectively, examined the patient images and reached consensus with respect to tumor growth (yes/no) using two approaches: (a) the conventional visual inspection of non-co-registered 2D images (excluding the 3D image series) and (b) visual comparison of the post-processed 3D images, where 3D image sets from different time points were used as underlay/overlay images and a transparency slider was employed to gradually switch between the images. The 3D images were visualized in a multiplanar reconstruction window showing all three orthogonal projections.
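
The underlay/overlay comparison in approach (b) can be illustrated with a minimal sketch. It assumes that the transparency slider performs a linear (alpha) blend of two co-registered slices; how the visualization software implements the slider is not described here, and the function and variable names are hypothetical.

```python
def blend(underlay, overlay, alpha):
    """Linear blend of two equally sized 2D slices (lists of rows).

    alpha = 0 shows only the underlay, alpha = 1 shows only the overlay.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [
        [(1.0 - alpha) * u + alpha * o for u, o in zip(u_row, o_row)]
        for u_row, o_row in zip(underlay, overlay)
    ]


# Hypothetical 2 x 2 intensity patches from two co-registered time points.
baseline = [[100.0, 100.0], [100.0, 200.0]]
follow_up = [[100.0, 100.0], [100.0, 300.0]]
halfway = blend(baseline, follow_up, 0.5)
```

With aligned images, voxels that are unchanged between time points keep the same intensity at every slider position, so only regions of structural change appear to vary as the slider is moved.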

Absolute and relative adenoma volume changes served as the gold standard and were estimated by summation of manually drawn regions of interest (ROIs) in 3D T1-w images. All ROIs were defined from contrast agent-enhanced images, which have previously been demonstrated to improve the delineation of an adenoma from the cavernous sinus from 47% in unenhanced scans to 91% in enhanced scans [13].
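
A volume estimate by ROI summation can be sketched as a voxel count multiplied by the volume of one voxel. The sketch below assumes binary ROI masks defined on the acquired 3D grid (1.3 × 1.0 × 1.3 mm3); the masks are invented for illustration, and the actual ROI tool and any resampling after registration are not specified here.

```python
def roi_volume_mm3(roi_slices, voxel_size_mm):
    """Sum ROI voxels over all slices and scale by the volume of one voxel.

    roi_slices: list of 2D binary masks (lists of rows of 0/1), one per slice.
    voxel_size_mm: (dx, dy, dz) voxel dimensions in millimetres.
    """
    dx, dy, dz = voxel_size_mm
    n_voxels = sum(sum(row) for mask in roi_slices for row in mask)
    return n_voxels * dx * dy * dz


VOXEL = (1.3, 1.0, 1.3)  # assumed: voxel size of the 3D sequence

# Hypothetical masks: 5 ROI voxels at baseline, 7 at follow-up.
roi_baseline = [[[0, 1, 1], [0, 1, 0]], [[0, 1, 0], [0, 1, 0]]]
roi_follow_up = [[[0, 1, 1], [0, 1, 1]], [[0, 1, 1], [0, 1, 0]]]

v_base = roi_volume_mm3(roi_baseline, VOXEL)
v_follow = roi_volume_mm3(roi_follow_up, VOXEL)
absolute_change = v_follow - v_base
relative_change_pct = 100.0 * absolute_change / v_base
```

Both the absolute change (in mm3) and the relative change (in percent) follow from the same two volumes, which is why a small adenoma can show a large relative change at a small absolute change.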

For each patient, the volume change between two consecutive exams and, in the case of more than two time points, the volume change between a given time point and the first time point were assessed, giving a total of 29 combinations of time points.
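
The total of 29 can be reproduced from the cohort description (seven patients with two time points, four with three, and two with four). The sketch below assumes that a comparison with the first time point is counted only when it is not already a consecutive pair; the helper name is illustrative.

```python
def comparisons(n_time_points):
    """Pairs of time points compared for one patient (1-based indices)."""
    # Each exam versus the exam immediately before it.
    consecutive = [(i, i + 1) for i in range(1, n_time_points)]
    # Later exams versus the first exam; (1, 2) is already a consecutive pair.
    from_first = [(1, j) for j in range(3, n_time_points + 1)]
    return consecutive + from_first


# Number of time points -> number of patients, as stated in the cohort description.
cohort = {2: 7, 3: 4, 4: 2}
total = sum(len(comparisons(k)) * n for k, n in cohort.items())
print(total)  # 29
```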

The image distortion correction, affine registration, and intensity normalization were performed using custom-designed software developed at the Multimodal Imaging Laboratory at the University of California, San Diego, and visualization was performed using nordicICE (NordicImagingLab AS, Norway).

Statistics

The correlation between the two methods in terms of qualitative assessment of tumor growth (‘change’ = 1, ‘no change’ = 0) at all longitudinal combinations of time points was evaluated using Spearman rank correlation. For the ‘change’ and ‘no change’ groups, the median and range of volume changes (in absolute values and relative percent) were obtained for both methods, and Mann–Whitney tests were used to assess any difference between the volume changes in the cohorts. To correct for multiple comparisons, P < 0.0125 was considered significant (Bonferroni corrected). In addition, median and range of volume changes for the time points in which the two methods gave different results were assessed and the volume changes were compared using Mann–Whitney tests.
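
As an illustration of the group comparison, the sketch below computes the Bonferroni-corrected threshold and the Mann–Whitney U statistic directly from its definition. The threshold of 0.0125 is consistent with four comparisons at α = 0.05, which is our assumption rather than a stated detail, and the volume changes are invented values, not study data. A P value would additionally require the null distribution of U, which statistical packages provide.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_comparisons


def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: count of (a, b) pairs with a > b, ties as 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


threshold = bonferroni_threshold(0.05, 4)  # assumed: four comparisons

# Hypothetical absolute volume changes (mm^3) in the two reader-defined groups.
no_change = [50.0, 120.0, 194.0, 260.0]
change = [150.0, 592.0, 800.0]
u_change = mann_whitney_u(change, no_change)
u_no_change = mann_whitney_u(no_change, change)
```

The two U values always sum to the product of the group sizes, so a value close to that product for the ‘change’ group indicates that its volume changes are systematically larger.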

Also, based on the readings performed by the neuroradiologists (change, yes/no), the diagnostic accuracies of the two methods were assessed by deriving the area under the receiver operating characteristic (ROC) curve (AUC). To assess any difference between the two methods, McNemar’s tests were performed using the same binary responses (change, yes/no) at the optimal trade-off points between sensitivity and specificity. Based on the image readings and relative volume (percentage) change, these cutoff points were obtained by minimizing the number of misclassifications and maximizing the average sensitivity and specificity values.
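
The ROC analysis can be sketched as follows, under the assumption that the measured volume change acts as the continuous score and the reader call (change = 1, no change = 0) as the binary label. The data are hypothetical, and the cutoff rule shown covers only the "maximize average sensitivity and specificity" part of the criterion.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff(scores, labels):
    """Cutoff (score >= cutoff means 'change') with the highest mean of
    sensitivity and specificity; the first such cutoff wins on ties."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        balanced = 0.5 * (tp / n_pos + tn / n_neg)
        if best is None or balanced > best[1]:
            best = (cutoff, balanced)
    return best


# Hypothetical volume changes (mm^3) and reader calls (1 = change seen).
volume_change = [100.0, 200.0, 300.0, 400.0, 500.0, 600.0, 700.0]
reader_call = [0, 0, 0, 1, 0, 1, 1]

auc = roc_auc(volume_change, reader_call)
cutoff, mean_sens_spec = best_cutoff(volume_change, reader_call)
```

Binary predictions obtained by thresholding each method at its own cutoff could then be compared pairwise, which is the role McNemar’s test plays in the analysis described above.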

Results

Examples of the volume change field produced by the nonlinear registration for a patient identified by both methods as having tumor growth between two consecutive time points, and for a patient identified as having no tumor growth by both methods, are shown in Figs. 1 and 2, respectively. Using conventional visual comparison of 2D images, the neuroradiologists observed a qualitative change in adenoma size in 7 of the 29 combinations of time points (in four patients). Using a slider to visually inspect corrected and aligned 3D image pairs, a qualitative change in adenoma size was observed in 16 of the 29 combinations of time points (in seven patients). The resulting qualitative change (yes/no) as a function of absolute volume change and relative percent change for the two methods investigated is shown in Fig. 3. Overall, when including the co-registered 3D images, the neuroradiologists changed their diagnosis regarding tumor growth in 5 of 13 patients (38%) at 9 of the 29 combinations of time points (31%). A summary of the results is shown in Table 1. Figure 4 shows scatter plots of the relative percent change in adenoma size as a function of absolute volume change for both methods investigated. A high relative percent change (>20%) in adenomas with a small absolute volume change (<200 mm3) determined by manually defined ROIs was not detected as a change in adenoma size (4 time points) by either of the two compared methods.

Fig. 1

Nonlinear co-registration of two images (a) and resulting displacement field (b) in a patient identified by both manual 2D visual inspection and the 3D based method as having tumor growth between two consecutive time points. Tumor growth (Yes/No) was assessed by using a transparency slider to toggle between the two time points. The net displacement field (Jacobian determinant) derived during the co-registration process allows visualization of volume change (red = expansion, blue = compression)

Fig. 2

Nonlinear co-registration of two images (a) and resulting displacement field (b) in a patient identified by both manual 2D visual inspection and the 3D-based method as not having tumor growth between two consecutive time points. Compared to Fig. 1, the net displacement field in (b) does not indicate a volume change

Fig. 3

For all consecutive time points, the plots show absolute volume changes (a, b) and relative percent changes (c, d) in the adenomas using the manual reference method (a, c) and the 3D-based method (b, d). Gray bars indicate no observed change in adenoma volume, whereas black bars indicate an observed change

Table 1 Qualitative change (0 = ‘no change’, 1 = ‘change’) in adenoma size for all patients and all time points investigated using the 2D- and the 3D-based methods

Fig. 4

a Scatter plots showing relative percentage change as a function of absolute volume change when using the manual reference method. Gray dots indicate no observed change in adenoma volume, whereas black dots indicate an observed change. b Scatter plots showing relative percentage change as a function of absolute volume change when using the 3D-based method. Gray dots indicate no observed change in adenoma volume, whereas black dots indicate an observed change

For the conventional visual comparison method, the absolute (relative) median volume change from baseline was 360 mm3 (13%) for the ‘no change’ group and 927 mm3 (37%) for the ‘change’ group. The corresponding results for the proposed 3D visual comparison method were 194 mm3 (12%) and 592 mm3 (18%), respectively. A significant difference in absolute volume change between the ‘no change’ and ‘change’ groups was observed using the proposed method (P = 0.006). The median volume change for the time points in which the two methods provided different results was 544 mm3 (16%). For all time points where the results of the two methods were different, the 2D method conveyed no visible change in tumor size, whereas the 3D method did yield a visible change in the tumor size. Here, the manually measured absolute volume changes supported the results of the 3D-based method as the volume changes were significantly larger than those in the group with ‘no change’ (P = 0.012). A summary of the statistical comparisons is shown in Table 2.

Table 2 Statistical comparisons between the 2D-based and 3D-based methods

Figure 5 shows the results of the ROC analysis. For absolute volume changes, the AUC values (±standard errors) for the volumetric approach and the reference method were 0.803 (±0.081) and 0.708 (±0.132), respectively. At optimal sensitivity and specificity, there was a significant difference in the resulting binary predictions between the two methods (McNemar’s test, P = 0.008). For relative percentage changes, the corresponding AUC values were 0.625 (±0.106) and 0.643 (±0.147), respectively, and McNemar’s test showed no significant difference between the two methods.

Fig. 5

ROC curves illustrating the diagnostic accuracies of the volumetric approach (black lines) and the reference method (red lines) in terms of detecting absolute volume changes (a) and relative percentage changes (b) between consecutive MR scans. Whereas the predictive values were similar in (b), the results of McNemar’s test on (a) showed a significant difference (P = 0.008) between the two methods at the optimal trade-off points between sensitivity and specificity

The use of co-registered 3D images was found to be helpful in determining the diagnosis in all patients and at all time points (100%). The time to perform the analysis in a single patient was practically equal for both the 2D conventional and 3D-based methods.

Discussion

Advances in the technical development of MR scanners and image processing promote the development of quantitative and qualitative end points with improved diagnostic value. It is anticipated that the traditional custom of subjective interpretation of image readings will evolve toward computer-aided methods for the quantification of morphological changes in medical images. One of the fields that may particularly benefit from computer-aided methods is the assessment of changes in a patient’s tumor burden, especially when the tumor evolution is slow. This issue has recently been addressed with the development of a computer-assisted semi-automated method applied to slow-growing meningiomas [14]. A benefit of a 3D volumetric approach compared to 2D-based volume analysis has recently been shown for vestibular schwannomas [15], while for glioblastomas, no significant difference has been shown between volumes measured on 2D and 3D scans [16]. Tumor volume change (objective response) is an important endpoint in clinical trials [17]. When volume change is used as a primary outcome in such trials, a method with increased sensitivity can directly affect the number of patients that need to be recruited and therefore influence trial costs.

However, the response evaluation criteria in solid tumors (RECIST) working group stated in its latest update [17] that there is so far not sufficient standardization or availability to recommend adoption of automated volumetric assessment methods. At present, the RECIST criteria require a 20% increase in longest diameter (LD) for a single solid tumor or in the sum of the LDs of multiple masses to define progressive disease. This 20% threshold must be considered a rough parameter that takes into account the limitations inherent in the RECIST group’s recommended method for volume estimates from 2D images.
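
To relate a diameter-based threshold to a volume change, the sketch below assumes a lesion that scales equally in all directions, so that volume grows with the cube of the diameter. This is an idealization that irregular macroadenomas will not satisfy, and the function names are illustrative.

```python
def volume_ratio_from_diameter_ratio(diameter_ratio):
    """Volume ratio of an isotropically scaled lesion (volume ~ diameter^3)."""
    return diameter_ratio ** 3


def diameter_ratio_from_volume_ratio(volume_ratio):
    """Inverse relation: the diameter ratio that yields a given volume ratio."""
    return volume_ratio ** (1.0 / 3.0)


# A 20% increase in diameter, expressed as a volume increase.
volume_growth = volume_ratio_from_diameter_ratio(1.20) - 1.0
# A 20% increase in volume, expressed as a diameter increase.
diameter_growth = diameter_ratio_from_volume_ratio(1.20) - 1.0
print(f"{volume_growth:.1%} volume, {diameter_growth:.1%} diameter")
```

Under this idealization, a 20% increase in diameter corresponds to a volume increase of roughly 73%, whereas a 20% increase in volume corresponds to a diameter increase of only about 6%.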

In our study, we have investigated the feasibility of a semi-automated co-registration routine of 3D images and assessed changes in tumor volume by the use of a slider to transparently merge images from two different time points in patients with pituitary macroadenomas. For these tumors, there are no definite criteria in common use regarding thresholds for clinically significant volume change, and in our experience, clinicians value information about any significant volume change that can be called by the radiologist. Growth of sellar soft tissue in the postoperative phase would strongly suggest the presence of residual tumor. Tumor proximity to the optic chiasm is an important factor in decision making for surgery, and diagnosing growth, especially in the direction of the optic chiasm, should be important in clinical decision making regarding both surgery and medical treatment.

However, pituitary macroadenomas are often characterized by their irregular shape and blurry demarcation towards the para-, supra-, and infrasellar structures, and the evaluation of minor, though clinically important, volume changes in these lesions may be challenging. Size estimation based on measures of assumed maximum diameters is based on the assumption that the overall lesion can be described by an ellipsoid. This seems less meaningful in macroadenomas. Adenoma size and growth are traditionally assessed by visual impression of 2D MR images where the MR slice thickness may be 3 mm, which limits the ability to detect small changes in tumor volume. At this slice thickness, a minimum change of 3 mm in one diameter is necessary for it to be significant [18]. The use of methods based on visual inspection alone is further complicated by the lack of standardized registration of images acquired at different time points.

One of the primary reasons for conducting follow-up MR imaging on patients with pituitary macroadenomas is to find out if the tumor has changed volumetrically.

Our results suggest that the proposed method, when compared to conventional image reading, may provide a more accurate and objective assessment of the sometimes subtle but still highly significant volumetric changes of these tumors and thereby potentially influence the clinical decision making. As previously discussed by Fryback et al. [12], inducing change in the clinician’s diagnostic thinking is a necessary prerequisite to having impact on patients. There are a number of clinical settings where information regarding volumetric change of a pituitary adenoma is directly applied to therapeutic strategies [19–24]. In this feasibility study, the results from employing the proposed method were not incorporated in radiology reports, and we have not explored the implications for follow-up treatment for patients where the use of this method led to a change in diagnosis. Evidence about the clinical impact of detecting even subtle volume changes should be collected in further studies.

The method presented here provides a tool for qualitative visual assessment of tumor volume change. Currently, quantitative volumetric change criteria have not been established to influence treatment options in pituitary lesions. However, the technique proposed here allows for quantitative ROI-specific volume changes to be calculated using the local displacement field. In the current work, we have not fully exploited this capacity but will explore it in future studies. We expect that automated measurement of volume change will further improve the clinical utility of the proposed technique. Quantitative assessment may be particularly important for detecting a change in adenoma size for tumors with a high relative percentage change but small absolute volume change.

Compared to conventional visual inspection of 2D images, the application of the proposed technique would change the diagnosis of volume change in 38% (5/13) of the patients. It has not been established whether merely using a higher resolution dataset (1-mm-thick slices instead of 3-mm-thick slices) would be instrumental in changing the radiologists’ assessment of tumor growth. However, 1-mm 3D GRE images are not routinely used in the daily diagnostic work-up of pituitary adenomas due to difficulties separating tumor tissue from the parasellar structures; the use of such images alone has therefore not been compared with their use in conjunction with the correction, alignment, and visualization methods described here.

Volumetric approaches to irregular tumors have also been shown to have substantially less intrareader and interreader variability than other methods [25]. Neither this nor the method’s reproducibility has been assessed here; both need to be determined from larger patient groups in future studies.

Volumes based on tumor ROIs manually defined by an experienced neuroradiologist were used as reference. Manual tracings have been demonstrated to have intra- and interoperator agreement indexes on the same level as semi-automated methods [26], and for our patient cohort, the manual tracing method was assumed to provide the highest level of accuracy due to the complex and irregular shape of the adenoma. However, this method is prone to user variability, and some of the smaller volumetric changes that were detected with this method may be due to inaccuracies of the manual tracings.

In future work, we will investigate the utility and precision of nonlinear image morphometry where a displacement field is calculated to locally align longitudinal images and from which a volume change field can be calculated [10]. The volume change field, when used as a visual overlay on a subject’s baseline image, should assist even further in making a yes/no determination of whether the tumor volume has changed; it should also, in principle, allow for precise measurement of that change.
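
How a volume change field follows from a displacement field can be sketched with the Jacobian determinant of the mapping x → x + u(x), where values above 1 indicate local expansion and values below 1 local compression. The sketch is a generic finite-difference illustration, not the registration software used in this work; the direction of the mapping (baseline to follow-up) and the synthetic displacement function are assumptions.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def jacobian_determinant(displacement, point, step=1e-3):
    """Determinant of d(x + u(x))/dx at `point`, using central differences.

    displacement: function mapping a 3D point [x, y, z] to a vector u(x).
    """
    jac = [[0.0] * 3 for _ in range(3)]
    for col in range(3):
        plus = list(point)
        minus = list(point)
        plus[col] += step
        minus[col] -= step
        u_plus = displacement(plus)
        u_minus = displacement(minus)
        for row in range(3):
            du = (u_plus[row] - u_minus[row]) / (2.0 * step)
            jac[row][col] = du + (1.0 if row == col else 0.0)
    return det3(jac)


def uniform_expansion(p, rate=0.05):
    """Synthetic displacement field: every coordinate moves outward by 5%."""
    return [rate * coord for coord in p]


j = jacobian_determinant(uniform_expansion, [10.0, 20.0, 30.0])
local_volume_change_pct = 100.0 * (j - 1.0)
```

For this synthetic field the determinant is 1.05³ ≈ 1.158 at every point, that is, a local volume increase of about 15.8%; averaging such values over an ROI would give an ROI-specific volume change.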

Conclusion

Visual comparison of geometrically corrected, intensity-normalized, and affine-aligned longitudinal 3D images may enable more accurate assessment of qualitative volumetric change in pituitary adenomas than conventional reading of 2D images. The proposed method was found to be helpful in making the diagnosis of pituitary adenoma volumetric change and resulted in a change of diagnosis in 38% of the patients compared to conventional radiological assessment. We hypothesize that this technique may potentially have an impact on diagnosis and treatment of patients with pituitary adenomas. The feasibility of the method is enhanced by its simplicity of use and ease of implementation in the hospital Picture Archiving and Communication System. The potential for application of this proposed approach in other sorts of tumors remains to be established.