Background

Neoadjuvant chemoradiotherapy (nCRT) followed by surgery has become the standard treatment for patients with locally advanced rectal cancer, reducing local recurrence and increasing the rate of sphincter preservation [1]. Even in patients with resectable metastases, nCRT is frequently employed as part of multimodality therapy. However, responses to nCRT vary considerably among individuals. Reportedly, 15% to 30% of patients are found to be free of disease at pathological examination [2, 3] and 18% to 74% show down-staging [1, 4, 5], while the rest are evaluated as having stable or even progressive disease. The degree of pathological response is usually categorized by Mandard’s tumor regression grade (TRG) criteria [6]. Grade 1, or pathological complete remission (pCR), is significantly correlated with prolonged overall survival [7], raising the question of whether surgery can be omitted in such circumstances for the sake of sphincter preservation and improved quality of life. According to Habr-Gama et al., patients with clinical evidence of complete remission achieved similarly excellent long-term survival with either radical resection or a “watch and wait” strategy [8]. Meanwhile, some researchers have suggested transanal local excision as an alternative to aggressive surgical resection in carefully selected patients presenting with post-nCRT pCR [9]. Therefore, to facilitate individualized therapeutic decisions, emphasis has been placed on accurate and early prediction of tumor regression. Generally, the evaluation consists of a series of work-ups, including endoscopic endorectal ultrasound (ERUS), computed tomography (CT), and magnetic resonance imaging (MRI), which are known to have low sensitivity and specificity in discriminating residual tumor from radiation-induced fibrosis and post-nCRT oedema [10].

As a functional imaging technique, 18fluorine-2-deoxy-d-glucose positron emission tomography–computed tomography (18F-FDG PET/CT) allows for both qualitative and quantitative measurements of cell glucose metabolism. Although the correlation between cancer cell glucose metabolic activity and tumor regression has been evaluated in several types of tumor, including rectal cancer, most studies have focused on the assessment of maximal standard uptake value (SUVmax). Other parameters reflecting tumor burden from different aspects have not been fully investigated.

Another promising predictive factor is carcinoembryonic antigen (CEA), a biomarker traditionally associated with prognosis and tumor recurrence. Growing evidence also suggests its utility for response prediction in rectal cancer [11].

Our study aimed to determine the value of PET/CT parameters and CEA for the prediction of tumor regression in rectal cancer in order to facilitate the early planning of individualized treatment.

Materials and methods

Study population

This study was reviewed and approved by the Human Ethics Approval Committee, in accordance with the ethical standards of the Declaration of Helsinki. Informed consent was obtained from each patient included in this study.

Consecutive patients diagnosed with rectal cancer and hospitalized in our institute from August 2010 to August 2013 were prospectively analyzed. The included patients were ≥18 years of age, pathologically diagnosed with rectal adenocarcinoma, clinically staged cT3–4 and/or N+ with or without distant metastasis, and did not have a previous history of any malignancy or anti-cancer therapy. The staging procedure was based on ERUS, abdominal and pelvic MRI, CT of the chest, and 18F-FDG PET/CT examination. The distance of tumor from the anal verge was assessed by ERUS.

All patients received neoadjuvant long-course radiotherapy (25 fractions of 1.84–2 Gy, 5 fractions per week) with concomitant chemotherapy (capecitabine 1000 mg/m2 twice a day, days 1–14, repeated every 3 weeks). Additional induction chemotherapy (capecitabine 1000 mg/m2 twice a day, days 1–14, plus oxaliplatin 130 mg/m2 on day 1, repeated every 3 weeks) was administered at the discretion of the patient’s attending oncologist. A comprehensive re-evaluation work-up, including ERUS, MRI, CT, and, in some patients, a second whole-body PET/CT, was performed. Patients underwent surgery at least 6 weeks after completion of nCRT.

Biomarkers

Serum CEA levels were evaluated at diagnosis (CEA-pre) and before surgical resection (CEA-post) at the same laboratory by the Architect i2000 chemiluminescent microparticle immunoassay (Abbott Diagnostics, CA, USA). The calculated parameters were as follows:

$$\Updelta {\text{CEA}} = {\text{CEA-pre}} - {\text{CEA-post}},$$
$${\text{RI-CEA}} = \Updelta {\text{CEA}}/{\text{CEA-pre}} \times 100\,\% .$$
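As a minimal sketch of the two CEA definitions above, the following Python snippet computes ΔCEA and RI-CEA for a single patient. The function name `cea_response` and the example values (10.0 and 4.0 μg/L) are hypothetical and are not taken from the study data; the guard against a non-positive baseline is an added assumption.

```python
from typing import Tuple


def cea_response(cea_pre: float, cea_post: float) -> Tuple[float, float]:
    """Return (delta_cea, ri_cea_percent) for one patient.

    delta_cea = CEA-pre - CEA-post
    ri_cea    = delta_cea / CEA-pre * 100 %
    """
    if cea_pre <= 0:
        # Assumption: a non-positive baseline makes the ratio undefined.
        raise ValueError("CEA-pre must be positive to compute RI-CEA")
    delta_cea = cea_pre - cea_post
    ri_cea_percent = delta_cea / cea_pre * 100.0
    return delta_cea, ri_cea_percent


# Hypothetical patient: CEA-pre = 10.0 ug/L, CEA-post = 4.0 ug/L.
delta_cea, ri_cea = cea_response(10.0, 4.0)
print(f"delta CEA = {delta_cea:.2f} ug/L, RI-CEA = {ri_cea:.1f} %")
```

A positive ΔCEA indicates a fall in serum CEA after nCRT, and RI-CEA expresses that fall relative to the baseline level.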

PET/CT imaging and processing

The patients were examined before and, in some cases, after nCRT using the same dedicated PET/CT system (Discovery ST-16, GE Health Care, Piscataway, NJ, USA), which has been used at our hospital since 2005, following the same protocol. Each patient was instructed to fast for at least 6 h and to avoid caffeine and cigarettes for 24 h before the administration of 18F-FDG. Blood glucose concentrations were confirmed to be less than 150 mg/dL prior to 18F-FDG administration. Then, 18F-FDG (3.7 MBq/kg) was injected intravenously, and the patient was asked to lie relaxed in a dark room for 60–90 min before the scan. To minimize the presence of the tracer in the urinary tract, the patient was asked to void his/her bladder immediately before the scan. The patient was scanned from the calvarium to the middle of the femur while lying in a supine position. CT was performed before PET, and the resulting data were used for attenuation correction of the PET data and for morphological examination through image fusion. Then, the PET scan was performed for 3 min per bed position. Two-dimensional PET images were reconstructed with a slice thickness of 3.75 mm using the ordered subset expectation maximization iterative image reconstruction method. PET, CT, and fused PET/CT images were generated for review with the syngo TrueD software.

Imaging analysis

Images in coronal, sagittal, and transaxial slices were visually interpreted by two experienced nuclear physicians who were blinded to the pathological results and who collaboratively reached a final consensus. Tumor lesions were defined as areas of pathologically elevated 18F-FDG uptake when other causes, such as physiological uptake and inflammation of the intestine, were excluded. Quantitative analysis was performed using the dedicated software syngo TrueD. Volumes of interest (VOIs) of 1.25 cm in diameter were positioned manually over the entire tumor lesion on the attenuation-corrected images of the baseline scan and the scan after nCRT (if it was performed). The parameters of SUVmax and SUVmean within the VOIs before and after nCRT, labeled as SUVmax-pre, SUVmax-post, SUVmean-pre, and SUVmean-post, were recorded. Metabolic tumor volume (MTV) was calculated with a fixed threshold value of 40% of SUVmax based on a previously published method [12]. However, when SUVmax was <5.5 g/ml, calculation of MTV was considered impossible, because the 40% threshold then falls below 2.2 g/ml. This value is commonly considered the threshold of physiological tissue uptake, so the VOI can no longer be measured accurately. Total lesion glycolysis (TLG) was calculated by multiplying the SUVmean value by the MTV value. The other parameters were calculated as follows:

$$\Updelta\text{SUVmax} = {\text{SUVmax-pre}} - {\text{SUVmax-post}}; $$
$$\text{RI-SUVmax} = {\Updelta\text{SUVmax}}/{\text{SUVmax-pre}} \times 100\,\%; $$
$${\Updelta\text{SUVmean}} = {\text{SUVmean-pre}} - {\text{SUVmean-post}} ; $$
$$\text{RI-SUVmean} = {\Updelta\text{SUVmean}}/{\text{SUVmean-pre}} \times 100\,\%; $$
$$\Updelta\text{MTV} = {\text{MTV-pre}} - {\text{MTV-post}}; $$
$$\text{RI-MTV} = {\Updelta\text{MTV}}/{\text{MTV-pre}} \times 100\,\%; $$
$$\Updelta\text{TLG} = {\text{TLG-pre}} - {\text{TLG-post}}; $$
$$\text{RI-TLG} = {\Updelta\text{TLG}}/{\text{TLG-pre}} \times 100\,\%.$$
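The threshold-based MTV and TLG definitions from the imaging analysis, together with the RI-TLG formula above, can be sketched in Python as follows. This is an illustration under stated assumptions, not the syngo TrueD implementation: the function name `mtv_and_tlg`, the flat list of voxel SUVs, the voxel volume of 0.5 ml, and all voxel values are hypothetical, and SUVmean is taken here over the thresholded voxels only.

```python
from typing import List, Optional, Tuple

MTV_THRESHOLD_FRACTION = 0.40  # fixed threshold: 40% of SUVmax (from the text)
MIN_SUVMAX_FOR_MTV = 5.5       # below this, the 40% threshold drops under 2.2


def mtv_and_tlg(voxel_suvs: List[float],
                voxel_volume_ml: float) -> Optional[Tuple[float, float, float]]:
    """Return (MTV in ml, SUVmean, TLG), or None when MTV is not computable."""
    suv_max = max(voxel_suvs)
    if suv_max < MIN_SUVMAX_FOR_MTV:
        # The threshold would overlap with physiological uptake.
        return None
    threshold = MTV_THRESHOLD_FRACTION * suv_max
    included = [s for s in voxel_suvs if s >= threshold]
    mtv = len(included) * voxel_volume_ml
    # Assumption: SUVmean is averaged over the thresholded voxels.
    suv_mean = sum(included) / len(included)
    tlg = suv_mean * mtv  # TLG = SUVmean x MTV
    return mtv, suv_mean, tlg


# Hypothetical voxel SUVs (g/ml) inside a manually placed VOI.
pre_scan = [10.0, 8.0, 6.0, 5.0, 2.0, 1.0]
post_scan = [6.0, 5.0, 2.0, 1.0]
low_uptake_scan = [3.0, 2.0, 1.0]  # SUVmax < 5.5, so MTV is skipped

mtv_pre, _, tlg_pre = mtv_and_tlg(pre_scan, voxel_volume_ml=0.5)
mtv_post, _, tlg_post = mtv_and_tlg(post_scan, voxel_volume_ml=0.5)

# RI-TLG = (TLG-pre - TLG-post) / TLG-pre x 100 %
ri_tlg = (tlg_pre - tlg_post) / tlg_pre * 100.0
print(f"MTV-pre = {mtv_pre} ml, TLG-pre = {tlg_pre}, RI-TLG = {ri_tlg:.1f} %")
print("Low-uptake scan:", mtv_and_tlg(low_uptake_scan, voxel_volume_ml=0.5))
```

Returning `None` for the low-uptake scan mirrors the handling described in the text, where MTV-dependent parameters are treated as missing rather than estimated.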

Pathological tumor response evaluation

Pathological tumor response was estimated using the TRG classification system proposed by Mandard et al. [6]. An experienced pathologist who was blinded to the PET/CT and CEA data was responsible for the evaluation using the following criteria: TRG 1, complete disappearance of tumor cells; TRG 2, rare residual cancer cells scattered throughout the fibrosis; TRG 3, an increased number of residual cancer cells with predominant fibrotic tissue; TRG 4, residual cancer outgrowing the fibrotic tissue; and TRG 5, no regressive changes [6]. Patients were then grouped as responders (TRG 1–2) and non-responders (TRG 3–5) for statistical analysis.
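The grouping rule can be written as a short Python sketch. The function name `classify_trg` is hypothetical; the grade counts are those reported in the Results (19, 12, 23, 10, and 0 patients for TRG 1 to 5), used here only to check the arithmetic of the responder and pCR proportions.

```python
from collections import Counter


def classify_trg(trg: int) -> str:
    """Map a Mandard TRG (1-5) to the study's response group."""
    if trg not in (1, 2, 3, 4, 5):
        raise ValueError("Mandard TRG must be an integer from 1 to 5")
    return "responder" if trg <= 2 else "non-responder"


# Number of patients per TRG, as reported in the Results.
trg_counts = {1: 19, 2: 12, 3: 23, 4: 10, 5: 0}

groups = Counter()
for grade, n_patients in trg_counts.items():
    groups[classify_trg(grade)] += n_patients

total = sum(trg_counts.values())
responder_pct = round(100 * groups["responder"] / total, 1)
non_responder_pct = round(100 * groups["non-responder"] / total, 1)
pcr_pct = round(100 * trg_counts[1] / total, 1)  # pCR corresponds to TRG 1

print(groups, responder_pct, non_responder_pct, pcr_pct)
```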

Statistical analysis

Comparisons of continuous and categorical variables between groups were performed with the Mann–Whitney test and Fisher’s exact test, respectively. Receiver operating characteristic (ROC) analysis was performed for continuous variables that were significantly correlated with TRG or pCR. The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated and reported for the selected cut-off values. Missing values were not included in the analysis. A p value < 0.05 (two-sided) was considered to indicate a statistically significant difference. All tests were conducted using SPSS 21.0, except for the ROC analysis, which was performed with MedCalc software 10.1.6.0.
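To illustrate what the ROC analysis measures, the Python sketch below computes an AUC through its equivalence with the Mann–Whitney statistic (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case). The text does not state how cut-off values were selected; choosing the cut-off that maximizes the Youden index (sensitivity + specificity − 1) is an assumption made here for illustration. The function names and all data values are hypothetical and do not reproduce the SPSS or MedCalc computations.

```python
from typing import List, Tuple


def auc_mann_whitney(positives: List[float], negatives: List[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


def youden_cutoff(positives: List[float],
                  negatives: List[float]) -> Tuple[float, float]:
    """Return (cut-off, Youden index); a case is test-positive if value >= cut-off."""
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(positives) | set(negatives)):
        sensitivity = sum(p >= cutoff for p in positives) / len(positives)
        specificity = sum(n < cutoff for n in negatives) / len(negatives)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical response-index values for illustration only.
responders = [0.9, 0.8, 0.7, 0.6]
non_responders = [0.75, 0.5, 0.4, 0.3]

auc = auc_mann_whitney(responders, non_responders)
cutoff, youden_j = youden_cutoff(responders, non_responders)
print(f"AUC = {auc}, cut-off = {cutoff}, Youden J = {youden_j}")
```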

Results

Clinicopathologic characteristics are presented in Table 1. Of the 64 included patients, all received pre-nCRT PET/CT, and 28 also received post-nCRT PET/CT. The median interval between the end of nCRT and the second PET/CT was 34 (range 26–44) days, and the median time from nCRT to surgery was 54 (range 42–66) days. The pathological tumor response evaluation classified 19 patients as TRG1 (pCR), 12 as TRG2, 23 as TRG3, 10 as TRG4, and 0 as TRG5. Thus, responders (TRG 1–2) and non-responders (TRG 3–5) accounted for 48.4% and 51.6% of patients, respectively. The pCR rate was 29.7%. PET/CT parameters and serum CEA levels, along with their correlation with tumor regression, are detailed in Online Resource 1. The variables that differed between responders and non-responders included ΔSUVmax (24.88 vs. 15.39 g/ml, p = 0.037), RI-SUVmax (0.76 vs. 0.63, p = 0.025), ΔSUVmean (14.43 vs. 8.65 g/ml, p = 0.029), RI-SUVmean (0.77 vs. 0.63, p = 0.011), CEA-pre (6.30 vs. 27.86 μg/L, p < 0.001), CEA-post (2.22 vs. 5.49 μg/L, p = 0.002), ΔCEA (4.08 vs. 23.13 μg/L, p < 0.001), and RI-CEA (0.25 vs. 0.55, p = 0.002). For discriminating pCR from non-pCR, the predictive factors identified by univariate analysis were as follows: RI-SUVmean (0.77 vs. 0.65, p = 0.043), MTV-pre (9.87 vs. 14.62 cm3, p = 0.045), CEA-pre (5.62 vs. 22.27 μg/L, p = 0.002), CEA-post (1.95 vs. 4.72 μg/L, p = 0.001), and ΔCEA (3.68 vs. 17.99 μg/L, p = 0.013) (Fig. 1).

Fig. 1

Patient with a cT4N1M0 rectal cancer showing a complete pathological response to treatment. Fused PET/CT images before (A; SUVmax-pre = 28.30 g/ml, SUVmean-pre = 18.11 g/ml) and after (B; SUVmax-post = 3.32 g/ml, SUVmean-post = 2.00 g/ml) neoadjuvant chemoradiotherapy are presented, with the volume of interest delineated (inside irregular contours). Pre- and post-treatment images of a partial responder (cT4N2M1, TRG = 2; C: SUVmax-pre = 30.02 g/ml, SUVmean-pre = 17.58 g/ml; D: SUVmax-post = 7.22 g/ml, SUVmean-post = 3.99 g/ml) and a non-responder (cT4N1M1, TRG = 4; E: SUVmax-pre = 18.35 g/ml, SUVmean-pre = 10.81 g/ml; F: SUVmax-post = 12.15 g/ml, SUVmean-post = 7.47 g/ml) are shown for contrast. SUVmax: maximum standardized uptake value; SUVmean: mean standardized uptake value

Table 1 Patient characteristics

ROC analysis was performed to estimate the accuracy of prediction and is presented in Fig. 2 and Table 2. With an area under the curve (AUC) of 0.786, RI-SUVmean was the strongest discriminator of responders from non-responders, exhibiting a sensitivity of 63.64%, a specificity of 88.24%, a PPV of 77.8%, an NPV of 78.9%, and an accuracy of 78.57% at a cut-off value of 0.7644. CEA-post and ΔCEA were identified as the most accurate predictors of pCR, with an AUC of 0.762 at the selected cut-off values of 1.97 μg/L and 1.82 μg/L, respectively. These cut-offs resulted in a sensitivity of 73.68%, a specificity of 75.00%, a PPV of 56.0%, an NPV of 86.8%, and an accuracy of 74.60% for CEA-post, and a sensitivity of 77.42%, a specificity of 74.19%, a PPV of 75.0%, an NPV of 76.7%, and an accuracy of 75.18% for ΔCEA.
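The relation between these performance measures can be made explicit with a small Python sketch. The confusion-matrix counts used below are not reported in the text; they were back-calculated here as the integer counts that reproduce the published percentages for RI-SUVmean (28 patients with paired scans) and CEA-post, and should be read as an assumption. The function name `diagnostic_metrics` is hypothetical.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Return sensitivity, specificity, PPV, NPV, and accuracy in percent."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
        "accuracy": 100 * (tp + tn) / total,
    }


# Inferred counts (assumption): RI-SUVmean >= 0.7644 predicting TRG 1-2.
ri_suvmean = diagnostic_metrics(tp=7, fn=4, tn=15, fp=2)

# Inferred counts (assumption): CEA-post at the 1.97 ug/L cut-off predicting pCR.
cea_post = diagnostic_metrics(tp=14, fn=5, tn=33, fp=11)

for name, metrics in (("RI-SUVmean", ri_suvmean), ("CEA-post", cea_post)):
    summary = ", ".join(f"{key} = {value:.2f}%" for key, value in metrics.items())
    print(f"{name}: {summary}")

# The false-negative and false-positive rates are the complements
# of sensitivity and specificity, respectively.
false_negative_rate = 100 - ri_suvmean["sensitivity"]
false_positive_rate = 100 - ri_suvmean["specificity"]
```

With these counts, the false-negative and false-positive rates for RI-SUVmean come to 36.36% and 11.76%, matching the figures quoted in the Discussion.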

Fig. 2

A: ROC curves for CEA level for the prediction of Mandard’s tumor regression grade 1–2; B: ROC curves for PET/CT parameters for the prediction of Mandard’s tumor regression grade 1–2; C: ROC curves for CEA level for the prediction of pCR; D: ROC curves for PET/CT parameters for the prediction of pCR. ROC: receiver operating characteristic; PET/CT: positron emission tomography-computed tomography; CEA: carcinoembryonic antigen; pCR: pathological complete remission

Table 2 ROC analysis

On univariate analysis, none of the other clinicopathologic characteristics was a statistically significant indicator of tumor response, as detailed in Online Resource 2.

Discussion

The unsatisfactory accuracy of routine imaging methods in predicting pathological response, reported to be only 54%, has prompted researchers to search for alternative markers [13]. 18F-FDG PET/CT has shown a better capacity to differentiate residual tumor from fibrotic tissue, which helps to reduce the chance of over-staging [14]. However, the optimal parameters and cut-off values for predicting response to nCRT in rectal cancer have not yet been clarified.

Most previous studies placed great emphasis on the quantitative analysis of SUVmax obtained before or after nCRT, since it is a direct reflection of tumor aggressiveness. However, SUVmax indicates only the single highest 18F-FDG uptake in the lesion and is not always representative of the gross tumor burden [15]. In comparison, SUVmean, MTV, and TLG reflect the average tumor metabolic activity, the tumor volume, and the total 18F-FDG uptake of the lesion, respectively, which suggests their potential value in prediction. Another concern is that a cross-sectional measurement of a parameter describes the tumor status at only a single time point, whereas differences or RIs can capture dynamic changes. In our study, MTV-pre was the only single-time-point indicator related to pCR, whereas ΔSUVmax, RI-SUVmax, and ΔSUVmean were predictive of responders, and RI-SUVmean was predictive of both pCR and responders. Of all PET/CT parameters, RI-SUVmean was associated with the highest AUC for TRG 1–2, with a recommended cut-off value of 0.7644. The advantage of RI-SUVmean over ΔSUVmean might be explained by the influence of body weight, the interval between 18F-FDG administration and image acquisition, and blood glucose level on ΔSUVmean, in accordance with a previous study [16].

RI-SUVmean presented a sensitivity of 63.64% and a specificity of 88.24% in discriminating responders, corresponding to a false-negative rate of 36.36% and a false-positive rate of 11.76%. False negatives are usually attributed to therapy-induced inflammation, especially within 12 weeks from the end of nCRT. Since peritumoral inflammatory cells display increased 18F-FDG uptake, the decrease in glucose metabolism of tumor cells is underestimated, leading to a false impression of poor response [17]. However, therapy-related elevated 18F-FDG uptake is often characterized by uniform thickening of the rectal wall and a relatively diffuse distribution. Other possible solutions to this issue lie in improved PET/CT procedures and data processing, such as voxel-based dual-time 18F-FDG parametric imaging [18] and the biological target volume overlapping segmentation system method [19]. False positives may arise because, after significant tumor shrinkage, the 18F-FDG uptake of a small number of residual cells scattered in fibrosis may be obscured by physiological uptake. In addition, part of the tumor might be composed of mucinous adenocarcinoma, which is difficult to detect on 18F-FDG PET/CT and is less likely to regress because of radio-resistance. Fortunately, even in patients with clinical complete remission, salvage surgery was successful in most cases of tumor regrowth [8].

The different parameters and cut-off values reported in various studies reflect heterogeneity in study populations, the interval between nCRT and surgery, the criteria for response evaluation, and PET/CT methodology. First, it is known that a prolonged interval between nCRT and surgery often leads to an increased pCR rate [20]. In the current study, most patients received their second PET/CT scan within 4–6 weeks after nCRT, and the interval from nCRT to surgery was mainly between 7 and 9 weeks, following the standard procedure of our institute. Second, although pCR and Mandard TRG were the most commonly used endpoints, down-staging and long-term results were also used frequently as surrogate endpoints [21, 22]. Third, the methods of data collection and analysis affected study results as well. As mentioned before, a threshold of 40% of SUVmax was taken as the boundary of MTV. When SUVmax was <5.5 g/ml [12], the calculated threshold overlapped with physiological tissue uptake, yielding an unreasonable overestimation of MTV. For this reason, 5 patients were excluded from the analysis of MTV in the current study. In brief, although the overall accuracy of 18F-FDG PET/CT reached approximately 80% in these studies [23, 24], the results must be interpreted with caution because of various confounding factors.

Serum CEA is a routinely used and easily measured tumor marker that is helpful in the diagnosis and long-term follow-up of colorectal cancer. The present study demonstrated that CEA and its related parameters were also valuable indicators of tumor response. CEA-pre, CEA-post, ∆CEA, and RI-CEA were indicators of TRG, whereas CEA-pre, CEA-post, and ∆CEA were indicators of pCR. Previous studies have also identified an association between pre-treatment CEA and tumor response [11, 24]. A low post-nCRT CEA level has been reported to be an independent predictor of pCR, with an optimal cut-off value of 2.61 ng/ml, giving a sensitivity and specificity of 76% and 58.4%, respectively [25]. Similarly, in our study, the optimal cut-off value for CEA-post was 1.97 μg/L, yielding a sensitivity and specificity of 73.68% and 75.00%, respectively, for pCR prediction.

The main limitations of our study were the small study population and missing data, which prevented further subgroup analysis. A larger cohort is needed to establish a predictive model combining PET/CT parameters, biomarkers, and other clinicopathologic characteristics. Another concern is that, except for SUVmax, all PET/CT parameters rely on the contouring of the VOI, which is time-consuming and subjective. Thus, in clinical practice, the more complicated PET/CT parameters may be used primarily in patients whose responses are difficult to assess by regular methods.

Conclusions

We demonstrate that 18F-FDG PET/CT parameters and serum CEA levels are useful for predicting tumor response and pCR after nCRT in rectal cancer, although the optimal cut-off values require further validation. These indicators are candidate markers for individualized treatment in rectal cancer.