Introduction

The quantification of tumor heterogeneity in medical imaging is a current research interest due to its potential relationship with tumor malignancy. High intratumoral heterogeneity has been related to poorer prognosis, which could be secondary to intrinsic aggressive biology or treatment resistance [1, 2]. However, intratumoral heterogeneity is not completely determined by biopsy samples, as they do not reflect the full extent of phenotypic or genetic variability [3].

Information derived from positron emission tomography/computed tomography with 18F-fluorodeoxyglucose (18F-FDG PET/CT) can predict treatment outcome [4,5,6,7]. In breast cancer (BC), an association between the maximum standardized uptake value (SUVmax) and response has been described, although results have been controversial [8,9,10,11,12]. Evidence on its relation to prognosis is limited [13,14,15,16].

Texture analysis refers to a variety of mathematical methods used to quantify the spatial variations in gray levels within an image to derive the so-called ‘textural features’, which provide a measurement of intralesional heterogeneity. Textural features offer global tumor information and approximate intratumoral heterogeneity, a biological tumor characteristic associated with aggressive tumor behavior, poor response to therapy and poor survival [17]. In summary, radiomics supports the use of different image characteristics, such as textural variables, derived from image-processing techniques and combined with statistical modeling techniques to predict a certain clinical end point (e.g., survival, local relapse or response) [18,19,20,21,22].

Evidence on the relations of textural parameters obtained by 18F-FDG PET/CT with prognostic factors in BC is scarce [23]. Moreover, no previous works have studied the relations of textural parameters with neoadjuvant chemotherapy (NC) response.

The aim of the present work was twofold: first, to assess the relations of textural features with risk categories based on molecular phenotypes, and second, to build a statistical model from 3D texture analysis of PET images, identifying the features that can predict NC response and outcome in patients with locally advanced breast cancer (LABC).

Materials and methods

Patients

All reported patients were participants of an ongoing prospective study initiated in September 2009 and approved by the Local Ethics Committee of our Institution and Research Board. Written informed consent was obtained from all patients.

The inclusion criteria were newly diagnosed BC with a size of at least 2 cm in diameter, clinical indication of NC, lesion uptake higher than background and absence of distant metastases confirmed by a PET/CT previously requested for staging purposes. Sixty-eight patients satisfied the inclusion criteria.

The histopathological analysis of the primary tumor was performed on specimens obtained by core aspiration biopsy. Tumor type, histopathological grade, estrogen and progesterone receptor (ER/PR) status, human epidermal growth factor receptor 2 (HER2) status and proliferation index based on the Ki-67 proliferation ratio were determined as described in previous works [24,25,26]. Overall, 71.2, 31.8 and 45.5% of patients were HER2, ER and PR negative, respectively, while the remaining patients were positive. TNM classification was grouped into stages, which were considered more relevant. All patients had an ECOG PS of 0. Information on specific adjuvant treatments after NC was not collected.

Additionally, an estimation of the molecular phenotypes and risk categories was performed depending on tumor phenotype: high risk [basal-like or HER2 (+) pure], intermediate risk [luminal B-HER2 (−) or luminal B-HER2(+)] and low risk [luminal A]. Moreover, for statistical analysis, the risk groups were set as categorical variables and divided into two groups: high risk [basal-like and HER2 (+) pure] and low/intermediate risk [luminal B-HER2(-), luminal B-HER2(+) and luminal A] [25].

Patients received a standard NC regimen consisting of a combination of anthracyclines, taxanes and anti-HER2 therapy [26].

Sixty patients underwent mastectomy or quadrantectomy and axillary lymph node dissection 4–6 weeks after NC. Breast and lymph node specimens were surgically removed, sliced, prepared and analyzed. For this study, only a binary breast histological response was considered, classifying lesions as responders (complete or nearly complete response) or non-responders (all other tumor regression grades), as previously stated [24].

After surgery, adjuvant treatment with/without radiotherapy was administered based on post-NC stage and tumor biology.

Patients underwent a minimum follow-up of 42 months. Disease-free survival (DFS) was defined as the time, in months, from the date of initial staging until tumor recurrence, death or last follow-up examination. Overall survival (OS) was defined as the time, in months, from the date at initial staging until death or last follow-up examination.
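Because patients without an event contribute their time to last follow-up, DFS and OS as defined above are right-censored times. As an illustration only, the product-limit (Kaplan–Meier) estimate for such data can be sketched in Python as follows. The function name and the example follow-up times are hypothetical and are not taken from the study cohort.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  : follow-up in months (to event or to last follow-up)
    events : 1 if the event (recurrence/death) occurred, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = sum(1 for tt, e in data if tt == t and e)
        n_leaving = sum(1 for tt, _ in data if tt == t)
        if n_events:
            # Multiply by the conditional probability of surviving time t.
            surv *= 1 - n_events / at_risk
            curve.append((t, surv))
        # Both events and censored cases leave the risk set after time t.
        at_risk -= n_leaving
        i += n_leaving
    return curve


# Hypothetical example: five patients, two of them censored.
km_curve = kaplan_meier([10, 20, 20, 30, 42], [1, 0, 1, 1, 0])
```

In this toy example the estimate drops only at the three event times (10, 20 and 30 months); the censored patients reduce the number at risk without lowering the curve.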

FDG PET/CT imaging and tumor segmentation

All PET/CT examinations, previous to NC and surgery, were performed on the same dedicated whole-body PET/CT machine (Discovery DSTE-16 s, GE Medical Systems) following a standardized protocol in three-dimensional (3D) mode [24]. The image voxel size was 5.47 mm × 5.47 mm × 3.27 mm with a slice thickness of 3.27 mm and no gap between slices. Matrix size was 128 × 128.

PET images in DICOM (Digital Imaging and Communication in Medicine) files were imported into the scientific software package Matlab (R2015b, The MathWorks, Inc., Natick, MA, USA) and pre-processed using in-house semi-automatic image segmentation software. The tumor was first manually located in a 3D box and then automatically segmented in three dimensions.

After semi-automatic segmentation, volume of interest was delineated using 40% of SUVmax. SUV-based variables [SUVmax, SUVmean, SUVpeak, metabolic tumor volume (MTV) and total lesion glycolysis (TLG)] were calculated as previously reported [27].
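The thresholding step and the SUV-based variables can be illustrated with a minimal Python sketch. It assumes that the SUVs inside the manually placed 3D box are available as a flat list, keeps every voxel at or above 40% of SUVmax without any connectivity check, and uses the common definition TLG = SUVmean × MTV. SUVpeak is omitted. The function name and the example values are hypothetical; the in-house Matlab software may differ in these details.

```python
from statistics import mean

# Voxel dimensions in mm, as reported for the PET images in this study.
VOXEL_MM = (5.47, 5.47, 3.27)
VOXEL_ML = VOXEL_MM[0] * VOXEL_MM[1] * VOXEL_MM[2] / 1000.0  # mm^3 -> mL


def suv_metrics(box_suv, fraction=0.40):
    """Threshold the SUVs of a 3D box at fraction * SUVmax.

    Assumption: all voxels at or above the threshold belong to the
    volume of interest (no connectivity constraint is applied).
    """
    suv_max = max(box_suv)
    threshold = fraction * suv_max
    voi = [v for v in box_suv if v >= threshold]
    suv_mean = mean(voi)
    mtv = len(voi) * VOXEL_ML   # metabolic tumor volume in mL
    tlg = suv_mean * mtv        # total lesion glycolysis
    return {"SUVmax": suv_max, "SUVmean": suv_mean,
            "MTV": mtv, "TLG": tlg, "n_voxels": len(voi)}


# Hypothetical SUVs, not patient data; the threshold is 0.40 * 10.0.
metrics = suv_metrics([1.0, 2.0, 3.9, 4.5, 6.0, 8.0, 10.0])
```

With the reported voxel size, each voxel contributes roughly 0.098 mL, so a lesion of 2 cm in diameter spans only a few dozen voxels.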

Texture analysis

Many methods have been proposed to quantify tumor heterogeneity from imaging data. For 3D heterogeneity measures, we only considered local and regional spatial textural methods, as we wanted to consider the spatial SUV distribution to assess heterogeneity.

We analyzed the local relations between voxels using the co-occurrence matrix (CM) and the regional relations using the run-length matrix (RLM). Thus, a set of sixteen 3D heterogeneity textural measures was computed automatically using Matlab software [28, 29]. For the computation of the textural features, we considered only the segmented volume of interest (40% of SUVmax). The SUV range within this volume was then discretized into 32 bins of equal size to construct the matrices.
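The discretization and the construction of a 3D co-occurrence matrix can be sketched as follows. This is an illustrative Python sketch, not the Matlab code used in the study. It assumes that the volume of interest is stored as a dictionary from voxel coordinates to SUV, that the 13 unique neighbor directions are pooled into a single symmetric matrix, and that homogeneity and contrast follow the standard Haralick-type definitions. All function names are hypothetical.

```python
from itertools import product


def discretize(voi, levels=32):
    """Map SUVs {(x, y, z): suv} to gray levels 0..levels-1 (equal-width bins)."""
    lo, hi = min(voi.values()), max(voi.values())
    width = (hi - lo) / levels or 1.0  # guard against a completely flat VOI
    return {p: min(int((v - lo) / width), levels - 1) for p, v in voi.items()}


# 13 unique neighbor directions in 3D (one from each opposite pair).
OFFSETS = [d for d in product((-1, 0, 1), repeat=3) if d > (0, 0, 0)]


def cooccurrence(levels_by_voxel):
    """Symmetric, normalized co-occurrence probabilities {(i, j): p}."""
    counts = {}
    for (x, y, z), i in levels_by_voxel.items():
        for dx, dy, dz in OFFSETS:
            j = levels_by_voxel.get((x + dx, y + dy, z + dz))
            if j is None:
                continue  # neighbor lies outside the volume of interest
            counts[(i, j)] = counts.get((i, j), 0) + 1
            counts[(j, i)] = counts.get((j, i), 0) + 1
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}


def homogeneity(p):
    """Assumed definition: sum of p(i, j) / (1 + |i - j|)."""
    return sum(v / (1 + abs(i - j)) for (i, j), v in p.items())


def contrast(p):
    """Assumed definition: sum of (i - j)^2 * p(i, j)."""
    return sum(v * (i - j) ** 2 for (i, j), v in p.items())


# Hypothetical 2 x 2 x 1 volume, discretized into 2 levels for readability.
toy_voi = {(0, 0, 0): 2.0, (1, 0, 0): 2.0, (0, 1, 0): 4.0, (1, 1, 0): 4.0}
toy_levels = discretize(toy_voi, levels=2)
toy_p = cooccurrence(toy_levels)
```

Pooling all directions is one possible design choice; computing one matrix per direction and averaging the resulting features is another, and the two can give slightly different values.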

Table 1 details the acronyms and description of the assessed textural parameters. Figure 1 shows an example of tumor segmentation and texture image analysis.
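To make the run-length features more concrete, the following Python sketch computes short run emphasis (SRE), long run emphasis (LRE) and run percentage (RPC) using their widely used textbook definitions. For brevity, it assumes that runs are counted along a single direction over rows of already discretized gray levels. The normalization used in this study may differ, and the function names are hypothetical.

```python
from itertools import groupby


def run_length_matrix(rows):
    """Count runs as {(gray_level, run_length): number_of_runs} along each row."""
    rlm = {}
    for row in rows:
        for level, group in groupby(row):
            length = sum(1 for _ in group)
            rlm[(level, length)] = rlm.get((level, length), 0) + 1
    return rlm


def run_features(rlm):
    """SRE, LRE and RPC from a run-length matrix (standard definitions, assumed)."""
    n_runs = sum(rlm.values())
    n_voxels = sum(length * n for (_, length), n in rlm.items())
    sre = sum(n / length ** 2 for (_, length), n in rlm.items()) / n_runs
    lre = sum(n * length ** 2 for (_, length), n in rlm.items()) / n_runs
    rpc = n_runs / n_voxels
    return {"SRE": sre, "LRE": lre, "RPC": rpc}


# Hypothetical rows: one coarse (a single long run) and one fine texture.
coarse = run_features(run_length_matrix([[3, 3, 3, 3]]))
fine = run_features(run_length_matrix([[1, 2, 1, 2]]))
```

Under these definitions, the fine texture yields the highest possible SRE and RPC, whereas the coarse texture yields a high LRE.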

Table 1 Textural features and description
Fig. 1

(a) Maximum intensity projection of a patient classified with stage IV due to bone metastases (arrows). Breast tumor segmentation (b) and voxel representation in 3D image reconstruction (c). Raw gray levels distribution for energy analysis (d) and after discretization (e)

Statistical methods

The statistical analysis was performed using SPSS software (v. 22.0.00). Categorical variables were described by frequency and percentage, while mean and standard deviation were used to describe quantitative variables. Normality was checked with the Kolmogorov–Smirnov test. The level of significance was set at p < 0.05.

Student's t-test was used to compare the means of textural variables between NC response groups, with the Mann–Whitney test used in the nonparametric case.
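The two test statistics involved in this comparison can be written out explicitly. The sketch below, in plain Python, computes the pooled-variance Student t statistic and the Mann–Whitney U statistic for two groups. It does not compute p values, which in this study were obtained with SPSS, and the example values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance (equal variances assumed) two-sample t statistic."""
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / n1 + 1 / n2))


def mann_whitney_u(a, b):
    """U statistic of group a: pairs with a > b count 1, ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


# Hypothetical feature values for two response groups.
group_a = [1, 2, 3]
group_b = [2, 3, 4, 5]
t_stat = student_t(group_a, group_b)
u_stat = mann_whitney_u(group_a, group_b)
```

The U statistics of the two groups always sum to the product of the group sizes, which offers a quick consistency check.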

The relation between phenotype and textural variables was assessed using one-way ANOVA, the Kruskal–Wallis test in the nonparametric case, and repeated measures ANOVA with a between-subjects factor. Tukey’s post hoc test was applied for the phenotype variable.

Cox proportional hazards analysis was used for each textural and SUV-based parameter individually and then in a multivariate analysis where PET stage (distant disease vs regional disease), NC response and molecular phenotype (risk categories) were also included. Hazard ratios were expressed per variable-specific increment (e.g., 0.10 or 100 units) so that differences appeared in the first significant digit.
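In a Cox model, the hazard is h(t | x) = h0(t)·exp(βx), so the hazard ratio for an increase of Δ units in a variable is exp(βΔ), that is, the per-unit hazard ratio raised to the power Δ. The short Python sketch below shows this rescaling; the function name and the numerical values are hypothetical and do not correspond to any coefficient estimated in this work.

```python
import math


def rescale_hazard_ratio(hr_per_unit, delta):
    """Hazard ratio for a change of `delta` units, given the per-unit HR.

    Follows from HR(delta) = exp(beta * delta) with beta = ln(HR per unit).
    """
    beta = math.log(hr_per_unit)
    return math.exp(beta * delta)


# Hypothetical per-unit HR of 1.0007 expressed per 100 units of the variable.
hr_100 = rescale_hazard_ratio(1.0007, 100)
percent_increase = (hr_100 - 1) * 100
```

Rescaling changes only how the effect is reported; the fitted model and its p value are unaffected.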

Finally, receiver operating characteristic (ROC) curves were used to compute the optimal cut-off of the textural features to predict prognosis (DFS or OS) by maximizing the sum of sensitivity and specificity. The statistical analysis of survival between groups was performed using Kaplan–Meier survival curves and the log-rank test.
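The cut-off criterion can be illustrated with a minimal Python sketch that scans every observed value as a candidate threshold and keeps the one maximizing sensitivity plus specificity. It assumes that higher values indicate higher risk (for protective variables the inequality would be reversed) and that the input contains at least one case with and one without the event. The function name and the example data are hypothetical.

```python
def best_cutoff(values, events):
    """Return (cutoff, sensitivity, specificity) maximizing their sum.

    values : feature value per patient
    events : 1 if the event occurred, 0 otherwise
    A patient is classified as positive when value >= cutoff
    (assumption: higher values indicate higher risk).
    """
    n_pos = sum(events)
    n_neg = len(events) - n_pos
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, e in zip(values, events) if e and v >= cutoff)
        tn = sum(1 for v, e in zip(values, events) if not e and v < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        # Strict '>' keeps the lowest cutoff when several are tied.
        if best is None or sens + spec > best[1] + best[2]:
            best = (cutoff, sens, spec)
    return best


# Hypothetical feature values and event indicators, not patient data.
cutoff, sensitivity, specificity = best_cutoff(
    [1.0, 1.5, 2.0, 2.5, 3.0, 3.5], [0, 0, 0, 1, 0, 1]
)
```

Maximizing the sum of sensitivity and specificity is equivalent to maximizing the Youden index (sensitivity + specificity − 1).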

Results

Sixty-eight patients were included. Patient characteristics are detailed in Table 2. Mean ± SD of DFS and OS obtained by Kaplan–Meier analysis were 70.87 ± 21.85 and 76.77 ± 18.80 months, respectively.

Table 2 Patient’s characteristics

Long run emphasis (LRE), LRHGE, LGRE and GLNU showed relations with phenotype risk categories, where higher heterogeneity was related to more aggressive molecular phenotypes (Table 3).

Table 3 Textural variables depending on the risk phenotype categories (*)

Only LGRE, LRHGE and RLNU showed associations with NC response, as shown in Fig. 2. Mean ± SD for LGRE was 0.17 ± 0.04 for responders and 0.20 ± 0.04 for non-responders (p = 0.048). LRHGE and RLNU showed marginally significant associations. Mean ± SD for LRHGE was 1178.81 ± 1399.86 for responders and 454.42 ± 374.69 for non-responders (p = 0.086). Finally, mean ± SD for RLNU was 31.43 ± 21.60 for responders and 20.49 ± 12.30 for non-responders (p = 0.061).

Fig. 2

Texture variables associated with neoadjuvant chemotherapy response. p values of 0.048, 0.086 and 0.061 were obtained for LGRE, LRHGE and RLNU, respectively

Results of the univariate Cox regression analysis used to examine the effects of textural parameters on prognosis are shown in Table 4. CON, DIS, SRE, HGRE, SRHGE and RPC were protective against the events of death or recurrence (HR < 1). For example, an increase of 0.10 units in SRE reduced the risk of death and of recurrence by factors of 3.8 and 3, respectively. On the contrary, HOM, LRE, LRHGE, LRLGE and GLNU increased the risk of death or recurrence (HR > 1). For example, for each 0.10-unit increase in HOM, the risk of death and of recurrence increased 15.6 and 9.4 times, respectively. With respect to LRHGE, the risk of experiencing an event increased by 7% for every 100-unit increase in the variable.

Table 4 Cox regression univariate analysis for textural features for OS and DFS

Kaplan–Meier analysis showed that OS and DFS were lower in patients with high HOM, SRE or RPC. For HOM, values lower than the most significant threshold had higher survival. On the contrary, higher values of SRE and RPC were associated with lower survival (Figs. 3, 4 and 5).

The best cut-off value, area under curve (AUC), sensitivity and specificity obtained in ROC analysis for the OS prediction were, respectively, 2.68, 0.808, 75 and 88% for HOM (p = 0.005); 5.90, 0.756, 88 and 67% for SRE (p = 0.019); and 4.69, 0.804, 88 and 67% for RPC (p = 0.006). For DFS, results were 2.43, 0.770, 67 and 74% for HOM (p = 0.010); 5.90, 0.703, 78 and 67% for SRE (p = 0.052); and 4.84, 0.725, 78 and 62% for RPC (p = 0.031).

Fig. 3

Kaplan–Meier plots for OS and DFS for patient subgroups with different values of the textural variable “homogeneity” (HOM). Log-rank p values are provided

Fig. 4

Kaplan–Meier plots for OS and DFS for patient subgroups with different values of the textural variable “short run emphasis” (SRE). Log-rank p values are provided

Fig. 5

Kaplan–Meier plots for OS and DFS for patient subgroups with different values of the textural variable “run percentage” (RPC). Log-rank p values are provided

In the multivariable Cox regression analysis including SUV-based and textural variables, PET stage (distant disease vs regional disease), NC response and risk categories (molecular phenotype), only two variables showed prognostic value for each end point. For OS, PET stage and SRHGE showed significant associations. Regarding PET stage, a patient with distant metastases had a 12.5 times greater risk of death than a patient with regional disease (HR 12.50; 95% CI 2.58 to 60.41; p = 0.002). On the contrary, SRHGE was a protective variable (HR 0.76; 95% CI 0.62 to 0.92; p = 0.006). For DFS, only PET stage (HR 4.72; 95% CI 1.21 to 18.37; p = 0.025) and HGRE (HR 0.89; 95% CI 0.80 to 0.99; p = 0.035) showed significant associations.

Discussion

Several textural features, in functional imaging of cancer using PET, have shown ability to differentiate tumor types, predict treatment response, or be associated with survival [7, 30,31,32].

In BC, more aggressive tumors are associated with a higher metabolism, tumor burden and also a better histological response to NC [25, 33]. Yoon et al. [34] evaluated the intratumoral metabolic heterogeneity by constructing cumulative SUV histograms. They concluded that lower AUC of those histograms was correlated with the invasive component of breast ductal carcinoma in situ. Soussan et al. [23] found associations between textural features and negative ER, negative PR and triple-negative breast cancer, concluding that tumor heterogeneity assessed on 18F-FDG PET/CT might be used to determine breast cancer aggressiveness.

In this work, we analyzed the relation between textural parameters and histopathological prognostic factors combined into phenotype risk categories. Textural features such as LRHGE, LRE and GLNU showed relations with risk phenotypes, pointing to a relationship between local tumor aggressiveness and tumor complexity. Variables based on the co-occurrence matrix were not good predictors of either risk phenotype or NC response. Both LGRE and LRHGE showed significant differences between risk phenotypes and NC response.

Our analysis presents similarities with the one developed by Soussan et al. [23] who first used 3D matrices to construct a set of textural measures, studied their association with biological features and found that the combination of HGRE and SUVmax identified triple-negative tumors with a sensitivity of 77% and specificity of 71%. On the contrary, Groheux et al. [35] found that none of the considered PET texture metrics could improve differentiation between the three main molecular subtypes of breast tumors beyond the standard clinical factors and SUV metrics.

Tixier et al. [7], in patients with esophageal carcinoma, found that responders to treatment showed greater local and regional heterogeneity at baseline, with better response stratification for the measures of regional tumor heterogeneity. Our results showed similarities, since more heterogeneous tumor values had a better NC response. The fact that not all the textural parameters were associated with response has been previously noted [36]. In the present work, only LGRE, LRHGE and RLNU showed significant or marginally significant associations. Thus, lesions with high gray level values (LGRE, LRHGE) and with a more heterogeneous distribution of the radiotracer (RLNU) had a better NC response compared to the other groups.

Previous reports on the association between 18F-FDG tumor uptake and prognosis have yielded controversial results, probably due to differences in tumor biology and methodology among the works [37,38,39]. Thus, SUVmax does not seem to be a strong variable to predict disease evolution. Regarding volume-based variables obtained in baseline PET, there is less evidence with respect to their prognostic value [13,14,15,16]. The addition of MTV to other variables, such as tumor biology, has potential benefits for identifying a subgroup of patients at higher risk for recurrence, although the results are controversial [13, 40].

The mainstay of radiomics is to build clinical models to predict patient outcome, thereby facilitating better patient management. Intratumoral heterogeneity assessed by PET has been described as an independent prognostic factor in several tumors [41,42,43]. In patients with non-small cell lung cancer, both heterogeneity and tumor size were predictive for disease-specific survival, but only texture determined by CM Entropy was identified as an independent factor in multivariate analysis. On the other hand, OS was not significantly correlated, most likely due to the high comorbidity in the cohort [41]. In pancreatic ductal adenocarcinoma, Hyun et al. [42] found that intratumoral heterogeneity of 18F-FDG uptake was an independent survival prognostic factor. However, other authors have found no significant associations with prognosis in multivariate analysis [32, 38]. Cook et al. [32] found that progression-free survival was longer in patients with high levels of contrast and busyness, although they did not find an association with OS. Our results are in accordance with these and may be explained by the fact that some textural variables were not representative of tumor heterogeneity. Moreover, the connection between biology and treatment response could influence the survival results, with more biologically aggressive tumors showing a better NC response and thus prognosis. However, we did not find a significant association between prognosis and treatment response, probably due to the low number of responders.

In BC, limited reported evidence exists regarding texture and prognosis. Previous authors have described that textural features performed better than SUV parameters, MTV and TLG in the determination of prognosis [16]. Son et al. [43] found an association between a heterogeneity volume-based parameter and OS.

Textural features give information about gray level distribution and the image-visible heterogeneity. However, not all texture variables convey information about tumor heterogeneity in the same way or have the same potential to describe it. Some variables distinguish between fine and coarse textures. Co-occurrence matrix variables give less robust texture information than RLM variables. Thus, the results obtained with CM variables should be considered cautiously.

The results of our prognostic model revealed that for a constant value of SRHGE, PET stage (distant metastases) was a strong predictor for OS. On the other hand, the SRHGE was the best texture variable in the OS prediction. The information of the high gray level voxel distribution supporting more robust information of tumor heterogeneity was comparable to other texture variables. However, when SRHGE was not included in the model, other texture variables appeared to be relevant. This fact can explain the overlapping information offered by some texture variables. For DFS, similar behavior was observed between PET stage and HGRE. Thus, texture and PET stage overcame SUV-based variables and tumor biology.

Regarding the limitations of the study, the small and biologically heterogeneous sample could affect the obtained results, although we considered it representative of the total population of patients included in our prospective study. Moreover, the results were focused on locally advanced tumors and thus may not be applicable to smaller tumors. Regarding tumor volume, although only tumors with a size of at least 2 cm in diameter were selected, textural variables might not reproduce real tumor heterogeneity because of the limited number of analyzed voxels.

Regarding the strengths, patients were normalized in a prospective study, tumors were segmented and textural parameters were computed in 3D, using a refined methodology.

The results obtained in this work offer evidence of the associations between textural parameters, treatment response and prognosis in breast cancer.

Conclusions

Texture variables obtained with 18F-FDG PET/CT were predictors of neoadjuvant chemotherapy response and prognosis, being almost as relevant as PET stage at diagnosis for OS and DFS prediction.

Normalized and well-computed heterogeneity parameters obtained by PET combined with the metabolic stage can help in the assessment of patient prognosis.