Introduction

Cervical cancer is one of the most common types of malignancy in women. An average of 80 % of patients with early-stage disease (FIGO stages IB and IIA) can be cured, but 30–50 % of patients with FIGO stages IIB to IVA at presentation experience treatment failure, and thus there is a high mortality rate among those with locally advanced cervical carcinoma [1, 2]. Heterogeneity is a recognized characteristic of malignant cervical tumors. It occurs at multiple levels from genetic to macroscopic, and has been reported to be associated with factors that influence resistance to therapy such as histological features including cellular proliferation, necrosis, noncellular accumulations, and physiological features including degree of vascularity, gene expression, energy metabolism, and oxygenation [3–11]. All these aspects of intratumoral heterogeneity may affect the biological behavior of tumors in patients, which may manifest as variability in prognosis and treatment response [12]. It is therefore of great importance to assess intratumoral heterogeneity before, during, and after a course of treatment, potentially allowing personalization of patient management and hence improvement in treatment outcomes.

FDG PET imaging is widely employed in tumor diagnosis and staging and in monitoring therapy response. Because it provides additional information on functional variations in tumors, its use for characterizing intratumoral heterogeneity has also been intensively investigated with the aim of improving prediction of treatment outcome. Indices derived from FDG PET predominantly used in studies of this type are the mean and maximum standardized uptake values (SUVmean, SUVmax) in the delineated tumors [13–17]. However, simple SUV measurements, which can be affected by several factors such as FDG dose and the duration of FDG uptake after administration, may not be sufficiently indicative of tumor behavior [18]. In addition, other tumor characteristics such as perfusion, hypoxia, aggressiveness, and proliferation rate can affect the degree and also the heterogeneity of tumor FDG uptake. Given these inherent limitations of SUV measurements, alternative strategies have been sought for the characterization of intratumoral heterogeneity.

Given that FDG activity distribution reflects underlying tumor biological heterogeneity, characterization of variations in FDG distribution has recently been advocated using various image characteristics, especially textural features. Spatial heterogeneity based on deviation from an idealized ellipsoid structure on pretreatment FDG PET scans has been shown to be significantly associated with survival in sarcoma patients [12]. Textural features at multiple scales ranging from local and regional to global have been evaluated on baseline FDG PET scans of patients with esophageal carcinoma to characterize the heterogeneity of FDG uptake [19]. In patients with cervical cancer, promising results with respect to characterizing tumor uptake heterogeneity for prediction of treatment outcome have been found using textural features based on co-occurrence matrices extracted from baseline FDG PET scans [18]. However, the temporal behavior of tumor heterogeneity characterized by textural features during the course of disease has never been assessed.

The hypothesis of this pilot study was that FDG uptake heterogeneity characterized by textural features evolves during the course of therapy. The values of each feature parameter extracted from FDG PET scans of patients with cervical carcinoma acquired before treatment, after 2 and 4 weeks of treatment, and 12 weeks after the completion of treatment are not static. We believe that the changes in tumor heterogeneity over time during therapy may be a better predictor of outcome in comparison with a single measurement prior to therapy.

Patients and methods

Patient description

Included in this study were 20 patients with a clinical diagnosis of cervical cancer treated exclusively with chemoradiotherapy between 2008 and 2010. The study was approved by the Washington University Human Research Protection Office (HRPO 08-0804). Written informed consent was obtained from all patients. Table 1 summarizes the demographics of all patients investigated. Their mean age at the time of diagnosis was 47.8 ± 11.5 years (median 44 years, range 25–74 years). Patients were staged clinically according to the FIGO staging system, as follows: one patient had stage IB1 disease, one stage IB2, one stage IIA, seven stage IIB, one stage IIIA, eight stage IIIB, and one stage IVA. Histopathological examination of all the tumors showed squamous cell carcinoma in 16, adenocarcinoma in 3 and clear-cell carcinoma in 1. In addition, 13 patients had lymph node metastasis at diagnosis. All patients underwent a pretreatment clinical whole-body FDG PET/CT scan. All patients were treated with concurrent radiotherapy and cisplatin chemotherapy. FDG PET/CT imaging was used to define the extent of the metabolic activity of the lesion for radiation treatment planning. The radiation was based on standard practices for external intensity-modulated radiation therapy and high dose-rate intracavitary brachytherapy for cervical cancer at our institution [20].

Chemotherapy consisted of weekly administration of cisplatin at a dose of 40 mg/m² for six cycles. During treatment, two research FDG PET/CT studies of the pelvis were performed during weeks 2 and 4 of therapy. A posttreatment whole-body FDG PET/CT scan was used to evaluate treatment response and to detect any residual or progressive disease at week 12 after completion of therapy [13, 21]. Patients were subsequently classified as showing a complete metabolic response (CMR), a partial metabolic response (PMR), or new disease (NEW; i.e. patients with residual disease at the primary site who also developed new disease based on the posttreatment PET scan). At the end of therapy, there was no residual primary tumor in patients with CMR, but residual primary tumor was present in patients with PMR and NEW; thus, we divided the patients into two groups, CMR and PMR/NEW.

Table 1 Patient characteristics (n = 20)

PET/CT protocol

FDG PET/CT was performed in all patients using a hybrid PET/CT scanner (Siemens Biograph 40 TruePoint). Patients fasted for a minimum of 4 h before injection of FDG. The administered dose of FDG ranged from 10 mCi to 15 mCi (median 14.3 mCi), depending on patient weight, and the uptake period after injection was on average 1 h. The uptake time of the PET studies during treatment was matched to that of the pretreatment study. The CT scan was performed in the craniocaudal direction from the base of the skull through the proximal thighs at 120 kVp and 111 effective mAs with a slice thickness of 5 mm. The PET scan was performed with the same anatomic extent with imaging times of 2 to 4 min/bed position. Serum glucose levels ranged from 86 to 152 mg/dl (median 103 mg/dl). Urinary tract activity was minimized by placing a Foley catheter before injection of FDG and by administration of furosemide and intravenous fluids after injection of FDG. The PET images were reconstructed according to the manufacturer’s guidelines.

Tumor extraction and image preprocessing

For each patient, PET/CT images were reviewed for abnormal FDG uptake at the primary tumor site, lymph node regions, and distant sites. The primary tumor was identified and then delineated using the 40 % thresholding method with the threshold determined by optimized correlation with the CT dimensions of the primary tumor [22]. In order to avoid misinterpretation of different intensity values and to obtain meaningful statistical comparisons during textural analysis among patients and over time, voxel intensity values (I) within the delineated tumor volume were subjected to a linear rediscretization process:

$$ \widehat{I}(x)=\left[ {{2^N}\times \frac{{I(x)-\min \left( {I(x)} \right)}}{{\max \left( {I(x)} \right)-\min \left( {I(x)} \right)+1}}} \right], $$
(1)

where Î denotes the resultant intensity values inside the delineated tumors and 2^N represents the number of levels after rediscretization, with N equal to 8 in the current study (i.e. 256 levels). In this study, only the primary tumors were considered for analysis, as textural analysis cannot be reliably performed on small lesions such as nodal and distant metastases due to the very limited number of voxels involved in these structures.
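The delineation and rediscretization steps can be illustrated with a minimal Python sketch. It is not the code used in the study. It assumes that `pet` is a NumPy sub-volume cropped around the primary tumor (so that the maximum is the tumor maximum), that the square brackets in Eq. 1 denote rounding down, and that no connected-component clean-up is required; the function names are hypothetical.

```python
import numpy as np

def threshold_mask(pet, fraction=0.40):
    """Fixed-threshold delineation: keep voxels at or above
    `fraction` of the maximum uptake in the cropped volume."""
    return pet >= fraction * pet.max()

def rediscretize(pet, mask, n_bits=8):
    """Linear rediscretization of Eq. 1 inside the tumor mask.

    Assumption: the brackets in Eq. 1 are read as floor, so the
    resulting levels lie in 0 .. 2**n_bits - 1.
    """
    vals = pet[mask].astype(float)
    lo, hi = vals.min(), vals.max()
    levels = np.zeros(pet.shape, dtype=np.int64)
    scaled = 2 ** n_bits * (vals - lo) / (hi - lo + 1)
    levels[mask] = np.floor(scaled).astype(np.int64)
    return levels
```

Because of the "+1" in the denominator of Eq. 1, the largest value always maps below 2^N, so with N = 8 every level fits in the range 0 to 255.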

Textural analysis

Two types of volumetric texture encoding methods with emphasis on characterizing regional heterogeneity, as listed in the left column of Table 2, were employed in this study and the following provides a brief description of each method.

Table 2 Texture encoding methods and associated features

Gray-level run length encoding method

The gray-level run length encoding method (GLRLM) is based on computing the number of gray-level runs of various lengths [23]. A gray-level run is a set of consecutive and collinear voxels possessing the same gray-level value, and the length of a run is the number of voxels in the run. A gray-level run-length matrix GLRLM(θ) is defined as follows: each element GLRLM(i,j|θ) represents the number of runs with voxels possessing gray-level intensity i and length of run j along a specific orientation θ. The size of the matrix GLRLM(θ) for a given delineated tumor is N by K with N identical to the maximum gray level inside the tumor and K equal to the longest run length possible within the tumor. Orientation θ in 3-D contains 13 different angular directions, and therefore 13 gray-level run-length matrices were generated for a given tumor with each matrix computed for one direction. A total of 11 textural features, as listed in Table 2, were extracted from each of these 13 matrices, and for each feature the average value of the 13 directions was taken.
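As an illustration of the run-length encoding described above, the following Python sketch builds one matrix per direction and averages a feature over the 13 directions. It is a simplified sketch under stated assumptions, not the implementation used in the study: `levels` is an integer volume of rediscretized gray levels starting at 0, `mask` marks the tumor voxels, and the two features use the standard textbook definitions with gray levels counted from 1 (the exact forms in Table 2 may differ). All function names are hypothetical.

```python
import numpy as np
from itertools import product

def directions_3d():
    """The 13 unique 3-D directions (one of each +/- pair of the
    26 neighbour offsets, chosen by lexicographic order)."""
    return [d for d in product((-1, 0, 1), repeat=3) if d > (0, 0, 0)]

def _inside(p, shape):
    return all(0 <= c < s for c, s in zip(p, shape))

def glrlm(levels, mask, direction, n_levels):
    """Element [i, j-1] counts runs of gray level i and length j."""
    shape = levels.shape
    mat = np.zeros((n_levels, max(shape)), dtype=np.int64)
    d = np.array(direction)
    for start in np.argwhere(mask):
        g = levels[tuple(start)]
        prev = start - d
        # Skip voxels that continue a run started earlier on this line.
        if (_inside(prev, shape) and mask[tuple(prev)]
                and levels[tuple(prev)] == g):
            continue
        length, nxt = 1, start + d
        while (_inside(nxt, shape) and mask[tuple(nxt)]
               and levels[tuple(nxt)] == g):
            length += 1
            nxt = nxt + d
        mat[g, length - 1] += 1
    return mat

def hgre(mat):
    """High gray-level run emphasis (levels counted from 1)."""
    i = np.arange(1, mat.shape[0] + 1)[:, None]
    return float((mat * i ** 2).sum() / mat.sum())

def lgre(mat):
    """Low gray-level run emphasis (levels counted from 1)."""
    i = np.arange(1, mat.shape[0] + 1)[:, None]
    return float((mat / i ** 2).sum() / mat.sum())

def mean_over_directions(levels, mask, n_levels, feature):
    """Average one feature over the 13 directional matrices."""
    return float(np.mean([feature(glrlm(levels, mask, d, n_levels))
                          for d in directions_3d()]))
```

The voxel-by-voxel loop favors readability over speed; for volumes of clinical size a vectorized or compiled implementation would be preferable.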

Gray-level zone size encoding method

Instead of counting the number of voxels with the same gray-level intensity along various orientations as in the gray-level run length encoding method, the gray-level zone length encoding method (GLZLM) takes account of the number of gray-level zones of various sizes [24]. A gray-level zone is a contiguous region with voxels having the same intensity value, and the size of a zone is the number of enclosed voxels. A gray-level zone size matrix GLZSM is formed with each entry GLZSM(i,j) denoting the number of zones of gray-level intensity i and of size j. The number of rows of the resultant matrix for a given delineated tumor is equal to the maximum gray level inside the tumor and the number of columns is identical to the size of the largest zone possible within the tumor. Once the matrix was obtained, the 11 textural features, as listed in Table 2, were computed for each delineated tumor.
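The zone-size matrix can be sketched in Python with a simple flood fill. This is an illustrative sketch only: it assumes 26-connectivity for deciding which voxels are contiguous (the connectivity is not stated in the text), integer gray levels starting at 0, and standard definitions for the two example features with gray levels counted from 1. All names are hypothetical.

```python
import numpy as np
from collections import deque
from itertools import product

# Assumed 26-connectivity: every offset except the voxel itself.
NEIGHBOURS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]

def glzsm(levels, mask, n_levels):
    """Element [i, j-1] counts zones of gray level i and size j."""
    shape = levels.shape
    visited = np.zeros(shape, dtype=bool)
    zones = []  # list of (gray level, zone size)
    for seed in map(tuple, np.argwhere(mask)):
        if visited[seed]:
            continue
        g = levels[seed]
        visited[seed] = True
        queue, size = deque([seed]), 0
        while queue:  # breadth-first flood fill of one zone
            p = queue.popleft()
            size += 1
            for d in NEIGHBOURS:
                q = tuple(a + b for a, b in zip(p, d))
                if (all(0 <= c < s for c, s in zip(q, shape))
                        and mask[q] and not visited[q]
                        and levels[q] == g):
                    visited[q] = True
                    queue.append(q)
        zones.append((int(g), size))
    if not zones:  # empty mask: no zones, no columns
        return np.zeros((n_levels, 0), dtype=np.int64)
    mat = np.zeros((n_levels, max(s for _, s in zones)), dtype=np.int64)
    for g, size in zones:
        mat[g, size - 1] += 1
    return mat

def hgze(mat):
    """High gray-level zone emphasis (levels counted from 1)."""
    i = np.arange(1, mat.shape[0] + 1)[:, None]
    return float((mat * i ** 2).sum() / mat.sum())

def zone_percentage(mat, n_voxels):
    """Number of zones divided by the number of tumor voxels."""
    return float(mat.sum() / n_voxels)
```

Unlike the run-length matrix, this matrix has no orientation, so a single matrix is produced per tumor and no averaging over directions is needed.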

Standardized uptake value analysis

In view of their widespread use in characterizing FDG uptake for therapy response monitoring together with the role they play in FDG PET-based investigations of therapy response prediction, SUV indices including SUVmax and SUVmean were measured for all the identified tumor volumes, with each of these SUV values normalized to the decay-corrected injected activity together with the patient’s body weight on the date of the PET examination, as most commonly used in clinical practice. Changes in SUVmax and SUVmean during the study period were assessed for each patient group. In addition, the ability of SUVmax and SUVmean at baseline to differentiate patients with regard to treatment outcome was also explored and was further compared with that of the textural features investigated.
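For orientation, the body-weight-normalized SUV calculation can be sketched as follows. This is a generic illustration rather than the software used in the study. It assumes that the PET voxel values are activity concentrations in Bq/mL already decay-corrected to the scan start, that the injected activity is decayed to the same time using the F-18 half-life of 109.77 min, and that tissue density is 1 g/mL; the function and parameter names are hypothetical.

```python
import numpy as np

F18_HALF_LIFE_MIN = 109.77  # physical half-life of fluorine-18

def suv_indices(conc_bq_per_ml, mask, injected_bq, weight_kg, uptake_min):
    """Return (SUVmax, SUVmean) inside the tumor mask.

    SUV = tissue concentration / (decayed injected activity / body weight).
    """
    decayed = injected_bq * 2.0 ** (-uptake_min / F18_HALF_LIFE_MIN)
    suv = conc_bq_per_ml * (weight_kg * 1000.0) / decayed
    tumor = suv[mask]
    return float(tumor.max()), float(tumor.mean())
```

Because the injected activity appears in the denominator, a longer uptake period raises the computed SUV for the same measured concentration, which is one reason the uptake times of the serial scans were matched.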

Statistical analysis

The differences in each feature extracted from the PET scans at different time points during the study period were evaluated with Friedman’s ANOVA test [25]. The capacity of each feature at baseline to distinguish patients with respect to treatment outcome was investigated using the Mann-Whitney test [26]. P values less than 0.05 were considered statistically significant. Trend analyses for features exhibiting significant changes during the study period were carried out using first-order linear regression.
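The two hypothesis tests can be reproduced in outline with SciPy. The sketch below is a hedged illustration, not the analysis script of the study: it assumes one feature arranged as a patients-by-time-points array with no missing scans, uses `scipy.stats.friedmanchisquare` and `scipy.stats.mannwhitneyu` with a two-sided alternative, and the demonstration array is made-up data, not patient data. The wrapper names are hypothetical.

```python
import numpy as np
from scipy import stats

def friedman_over_time(values):
    """Friedman test across time points.

    values: array of shape (n_patients, n_time_points) for one feature.
    """
    values = np.asarray(values, dtype=float)
    stat, p = stats.friedmanchisquare(*values.T)
    return float(stat), float(p)

def baseline_group_difference(group_a, group_b):
    """Two-sided Mann-Whitney test between two response groups."""
    stat, p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
    return float(stat), float(p)

# Made-up example (NOT study data): 15 patients, 4 time points,
# every patient decreasing monotonically over time.
demo = 100.0 - 20.0 * np.arange(4)[None, :] + 0.1 * np.arange(15)[:, None]
```

A feature would be flagged as changing significantly over the study period when the returned P value falls below the 0.05 threshold stated above.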

Results

Patient characteristics, including age, FIGO stage, histology, lymph node status at diagnosis, and response seen on the 3-month posttherapy PET scan are shown in Table 1. The mean tumor volume at diagnosis was 62.75 cm³ (median 36.64 cm³, range 10.38–248.33 cm³). At 3 months posttherapy, 15 patients were considered to have a CMR, 3 patients a PMR, and 2 patients new (NEW) disease.

In each patient, textural analysis for the primary tumor as delineated on the PET scans at each time point was performed to examine how the extracted textural features evolved during the study period. Median values for each feature derived from the GLRLM are shown in Table 3 for the two patient response groups (CMR and PMR/NEW). A comparison of the textural features based upon the GLZLM between the two patient response groups is presented in Table 4. Note that for the CMR group, because there was no abnormal FDG uptake in the region of the original tumor, there are no corresponding entries in Tables 3 and 4. Regarding the features derived from the GLRLM, HGRE, SRHGE, and LRHGE showed statistically significant changes during the study period in the CMR group (P < 0.001; Friedman’s ANOVA), and LGRE, SRLGE, LRLGE, GLNU, and RLNU showed significant changes in the PMR/NEW group (P = 0.0189, 0.0189, 0.0189, 0.0293, and 0.0070, respectively). Regarding the features based upon the GLZLM, SZE, HGZE, SZHGE, LZHGE and ZP showed statistically significant changes during the study period in the CMR group (P = 0.0013, 0.0004, 0.0004, 0.0004, and 0.0128, respectively; Friedman’s ANOVA), and LGZE, SZLGE, LZLGE, GLNU, and ZSNU showed significant changes in the PMR/NEW group (P = 0.0189, 0.0189, 0.0189, 0.0293, and 0.0136, respectively). The median values of the two SUV indices at the different time points during the study period are shown in Table 5 for the two response groups. There were significant changes in both indices in both response groups during the study period (P < 0.001 for SUVmax and SUVmean in the CMR group; P = 0.0109 and 0.0136, respectively, for SUVmax and SUVmean in patients with PMR/NEW; Friedman’s ANOVA).

Table 3 Median values of GLRLM-based features at different time points during the study period together with P values resulting from Friedman’s ANOVA in the CMR and PMR/NEW groups
Table 4 Median values of GLZSM-based features at different time points during the study period together with P values resulting from Friedman’s ANOVA in the CMR and PMR/NEW groups
Table 5 Median values of SUV indices at different time points during the study period together with P values resulting from Friedman’s ANOVA in the CMR and PMR/NEW groups

Six textural features (HGRE, SRHGE and LRHGE derived from the GLRLM, and HGZE, SZHGE and LZHGE derived from the GLZLM) showed significant changes during the study period in the CMR group (Figs. 1 and 2). In the first-order linear regression analysis, all six features showed high negative correlations with follow-up time (R² = 0.945, 0.946, 0.943, 0.999, 0.968 and 0.893 for HGRE, SRHGE, LRHGE, HGZE, SZHGE and LZHGE, respectively). No persistent temporal trends were observed for these textural features in the PMR/NEW group.
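A first-order trend fit of this kind can be sketched with NumPy as follows. The sketch assumes that the regression is run on the group mean of a feature at each time point and that time is expressed in weeks; the exact time coding used in the study is not stated, so any time values supplied are hypothetical, as is the function name.

```python
import numpy as np

def linear_trend(times, means):
    """Least-squares line through (time, mean feature value) pairs.

    Returns (slope, intercept, r_squared); a negative slope with a
    high r_squared corresponds to a strong decreasing trend.
    """
    t = np.asarray(times, dtype=float)
    y = np.asarray(means, dtype=float)
    slope, intercept = np.polyfit(t, y, 1)
    fitted = slope * t + intercept
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(slope), float(intercept), float(1.0 - ss_res / ss_tot)
```

With only four time points per group, a high R² should be read as a description of the trend in the means rather than as strong inferential evidence.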

Fig. 1 Changes in textural features HGRE (a, b), SRHGE (c, d) and LRHGE (e, f) in the CMR group. The first column (a, c, e) shows the respective line plots for the individual HGRE, SRHGE and LRHGE values before treatment, at weeks 2 and 4, and after treatment; the second column (b, d, f) shows the corresponding plots of the means with error bars for each feature with the trends over the study period in red

Fig. 2 Changes in textural features HGZE (a, b), SZHGE (c, d) and LZHGE (e, f) in the CMR group. The first column (a, c, e) shows the respective line plots for the individual HGZE, SZHGE and LZHGE values before treatment, at weeks 2 and 4, and after treatment; the second column (b, d, f) shows the corresponding plots of the means with error bars for each feature with the trends over the study period in red

The changes in SUVmax and SUVmean in the CMR group are shown in Fig. 3. Both indices showed highly significant decreases with time (R² = 0.913 and 0.999 for SUVmax and SUVmean, respectively). The changes in the SUV indices in the PMR/NEW group are shown in Fig. 4. Although no persistent overall trends were observed in either index over the study period, both indices showed high negative correlations with time through to the 4th week after treatment initiation (R² = 0.762 and 0.817 for SUVmax and SUVmean, respectively).

Fig. 3 Changes in SUVmax (a, b) and SUVmean (c, d) in the CMR group. The first column (a, c) shows the respective line plots for the individual SUVmax and SUVmean values before treatment, at weeks 2 and 4, and after treatment; the second column (b, d) shows the corresponding plots of the means with error bars for each index with the trends over the study period in red

Fig. 4 Changes in SUVmax (a, b) and SUVmean (c, d) in the PMR/NEW group. The first column (a, c) shows the respective line plots for the individual SUVmax and SUVmean values before treatment, at weeks 2 and 4, and after treatment; the second column (b, d) shows the corresponding plots of the means with error bars for each index with the trends over the study period in red

Distributions of various features at baseline, including SUVmax and SUVmean and two examples from each texture encoding category (LGRE and HGRE for the GLRLM-based method; LGZE and HGZE for the GLZLM-based method) in the CMR and PMR/NEW groups are shown in Fig. 5. At baseline, neither the SUV indices nor the textural features were able to differentiate the two response groups (P = 0.1513 and 0.2870 for SUVmax and SUVmean, respectively; P = 0.0577–0.287 for the GLRLM-based textural features; P = 0.0577–0.4313 for the GLZLM-based textural features).

Fig. 5 Values of various textural features and SUVmax and SUVmean derived from pretreatment FDG PET scans in relation to patient response group (CMR, PMR/NEW): a SUVmax (P = 0.1513), b SUVmean (P = 0.2870), c LGRE (P = 0.0577), d HGRE (P = 0.1513), e LGZE (P = 0.0577), f HGZE (P = 0.1511)

Discussion

Assessment of intratumoral metabolic heterogeneity as shown on FDG PET scans may be of importance in both treatment outcome prediction and in the development of biologically adapted strategies. Although heterogeneity in tumor FDG distribution has been demonstrated, there have been very few reports with regard to changes in tumor FDG uptake during the course of the disease. In the current work, heterogeneity in FDG uptake as shown by textural features at a regional scale was evaluated at four different time points including baseline, during the 2nd week and 4th week of therapy, and during the 12th week after completion of treatment to explore the temporal behaviors of tumor metabolic heterogeneity in patients with cervical cancer undergoing chemoradiotherapy.

In the cohort of 15 cervical cancer patients with CMR, six textural features, namely HGRE, SRHGE and LRHGE derived from the GLRLM method and HGZE, SZHGE and LZHGE derived from the GLZLM method, showed statistically significant differences at the time points examined and also showed a negative temporal trend in a first-order linear regression analysis of the changes in mean values with time. For the three GLRLM-based features, HGRE represents runs in tumors with high gray-level values, and the other two (SRHGE and LRHGE) also represent runs in tumors with high gray-level values but with emphasis, respectively, on short and long runs. Similarly for the three GLZLM-based textural features, gray-level zones rather than gray-level runs are used to describe contiguous regions of constant intensity in a tumor.

Thus, it can be seen that all six textural features showing significant temporal changes in the CMR group characterize, though through different aspects, the contiguous isointense regions of high activity inside a tumor. Negative temporal trends revealed using linear regression analysis in the six features indicate that contiguous regions of constant high activity appearing at baseline in tumors of the CMR group might either shrink and/or show lower intensity in response to chemoradiotherapy, and thereafter evolve into low-activity regions. Corresponding positive temporal trends were also seen in the textural features describing low-intensity regions (LGRE, SRLGE and LRLGE derived from the GLRLM method, and LGZE, SZLGE and LZLGE derived from the GLZLM method). In the PMR/NEW group, some textural features showed significant changes over the time period examined. These features were mainly those characterizing contiguous regions with constant low gray-level values and those measuring nonuniformity in either the sizes or the gray-level values of the contiguous regions with constant intensity in a tumor. However, no persistent overall temporal trends were observed for these features.

The evolution of SUVmax and SUVmean in each response group during the study period was also examined. In the CMR group, both SUV indices showed statistically significant decreases with time. In the PMR/NEW group, both indices (in contrast to the trend seen in the CMR group) showed significant decreases up to the 4th week of treatment, and then increased, as seen in Fig. 4. Considering SUV indices derived from FDG PET as indicators of tumor metabolism, the decreases with time of both indices seen in the CMR group may be an indication of the progressive eradication of tumor cells as a result of systemic suppression in response to the chemotherapy in conjunction with the targeted attack by radiation. In contrast, in the PMR/NEW group, decreases in the indices seen during the early stage of treatment, possibly reflecting elimination of the susceptible tumor cells in response to the chemoradiotherapy, were followed by a rebound in the indices on the posttreatment PET scans which may have corresponded to recovery and repopulation of tumor cells that were resistant to the administered treatment.

In contrast to the SUV indices which showed parallel temporal trends in both response groups during the early period of therapy (up to the 4th week of treatment), the six textural features that characterized the contiguous isointense regions of high activity inside the tumors showed significant changes only in the CMR group over the treatment period. This suggests, at least for the currently studied patient population, that changes in textural features characterizing isointense regions of high intratumoral activity during the early phase of treatment may be able to predict the outcome of therapeutic interventions better than changes in SUV indices. With regard to their abilities to differentiate response groups before treatment initiation, neither the SUV indices nor the textural features investigated was able to distinguish the two response groups at baseline (as analyzed using the Mann-Whitney test), in accordance with our initial hypothesis that the changes in tumor heterogeneity during therapy may serve as a better predictor of response than a single measurement prior to treatment.

Despite a great number of effective segmentation algorithms proposed for automatic tumor delineation [27–30], a strict thresholding scheme with a cut-off value of 40 % of the maximum activity was employed in the current study to define the lesion volume for the identified tumors on FDG PET scans. This method has been previously demonstrated to result in valid segmentations for cervical tumors with optimized correlations with the CT dimensions of the primary tumor and conformity of the resultant volumes with those revealed in surgical series [22, 31]. Given that the radiation treatments in the current study were planned using volumes derived by this fixed thresholding method as the targets, textural analysis was limited to the threshold-defined volume for each FDG PET scan. Considering the inability of fixed thresholding methods to handle FDG distribution heterogeneity as seen in PET images, one may argue that such a strategy for textural analysis would result in texture parameters that inaccurately and/or imprecisely describe the actual underlying FDG accumulation heterogeneity. While acknowledging this view, we nonetheless deem such a strategy relevant in the context of therapy-response prediction with the line of reasoning that the intended target volume of the radiation therapy and evaluation of FDG accumulation heterogeneity within this volume along with its evolution during the course of the disease would be more pertinent to the evaluation of the impact of a prescribed treatment on the final outcome. In addition, before the textural analysis of a given delineated lesion, the associated voxels were first subjected to an intensity rediscretization procedure as described above. It has been previously shown by Tixier et al. [19] that for the texture categories studied in the current work the number of rediscretization levels does not significantly affect the resultant parameter values. Given that the rediscretization procedure used in the current study was very similar to that used by Tixier et al., the effect of the rediscretization procedure on the observed temporal evolution of texture heterogeneity parameters would be expected to be limited or nonexistent.

It must be stressed that there were certain limitations to the present study. In particular, it should be acknowledged that the results were based on a small cohort of patients, and thus further validation of the current findings in a larger patient population and in a population with an even distribution of tumor histology is necessary. Moreover, the presented temporal trends of various features were assessed with patient data from a small number of time points, and thus provided only a coarse analysis of the temporal dynamics of the behavior of tumor metabolic heterogeneity during treatment. In addition, treatment responses seen on posttherapy PET scans were derived in a subjective manner by nuclear physicians.

Future studies of the temporal behavior of intratumoral metabolic heterogeneity could include assessment of the prognostic significance of the identified temporal trends in tumor heterogeneity, exploring the temporal behavior of tumor heterogeneity in terms of other image-derived indices, and investigating the underlying biological mechanisms responsible for the observed temporal characteristics of intratumoral heterogeneity seen on FDG PET. These endeavors may lead to the ability to predict which patients will respond to a given treatment and prospectively identify patients with highly heterogeneous tumors in whom more aggressive therapy should be considered.

Conclusion

The current study is the first to reveal the changes with time in the intratumoral heterogeneity of FDG accumulation at a regional scale in patients with cervical cancer undergoing chemoradiotherapy. The results suggest that image-derived regional textural features may allow further characterization of intratumoral FDG distribution, which may lead to a better understanding of FDG accumulation in malignant cervical tumors and how this relates to response to treatment.