Colorectal carcinoma is the third most common cause of malignancy in both men and women, and the second most common cause of death in the western world [1]. Rectal cancer is challenging because of its unique anatomical characteristics (i.e., partial covering by peritoneum), critical position, and therapeutic options (aimed at preserving pelvic floor integrity, sphincter continence and genitourinary functions) [2].

Staging procedures include: complete bowel endoscopy (to identify possible synchronous tumors), endoscopic ultrasound (EUS) to assess rectal wall invasion, rigid proctoscopy (to establish the exact level of the lesion), carcinoembryonic antigen test, contrast-enhanced computed tomography (CT), and magnetic resonance (MR) imaging of the pelvis to assess the presence of metastasis and pathologic lymph nodes [2].

The most important treatment approach to rectal cancer is surgery with organ-sparing procedures; preservation of sphincter and genitourinary functions should always be preferred but, unfortunately, is not always possible [2]. The multimodality strategy is the best choice for locally advanced rectal cancer (LARC), corresponding to stages II and III (cT3-4 N0M0 or any TN1M0) [1], to obtain better local control and allow a more conservative surgical treatment. This strategy consists of neoadjuvant concomitant chemoradiotherapy (nCRT) followed by more conservative but radical surgery based on total mesorectal excision. In particular, nCRT leads to a reduction in tumor stage and size, increasing the possibility of sphincter conservation [3, 4].

18F-FDG PET/CT, in addition to conventional techniques, provides new findings in the staging of primary rectal cancer, in detecting recurrences and distant metastases, and in delineating tumor volumes for radiotherapy planning [5, 6]. Several previous studies have observed that 18F-FDG PET/CT could be an indicator of treatment response and, in a neoadjuvant setting, it could have a predictive and/or prognostic value for many tumors including LARC [7–11]. It is now well established that 18F-FDG PET/CT biological information is important and often more predictive of histological response and outcome than anatomical imaging [12]. In cases with different therapeutic options, a cut-off value of a metabolic index that could help in decision making could be extremely useful [2, 7, 10]. The best goal could be a wait-and-see strategy in patients showing a complete pathologic response after nCRT, an approach that could reduce surgery-related mortality and morbidity. However, the level of evidence is still weak and further investigations are needed before these findings can be “translated” into clinical recommendations for everyday practice [2]. Therefore, the need to standardize response assessment in 18F-FDG PET/CT is becoming increasingly important, and the identification of semiquantitative parameters to evaluate the glucose metabolism of lesions before and after therapy is urgently needed.

The primary endpoint of the present study was to evaluate the role of baseline and post-nCRT 18F-FDG PET/CT procedures in predicting nCRT response in patients with LARC, as compared with both histopathology scales of tumor response. The secondary endpoint was to evaluate qualitative 18F-FDG PET/CT assessments and identify standardized parameters before and after nCRT that can differentiate responder from non-responder patients.

Materials and methods

Patients

A total of 68 patients with an initial diagnosis of LARC, located less than 8 cm from the anal verge, were enrolled. All patients underwent conventional diagnostic/staging procedures including rectal examination, EUS, pelvic CT or MR, and colon/rectosigmoidoscopy, to characterize the rectal lesion (location, size, distance from the sphincter apparatus, circumferential resection margin, relationship with neighboring organs, infiltration of the mesorectum, and presence of adenopathies).

All patients had different grades of biopsy-proven rectal adenocarcinoma; no other histotypes were included in the study. The tumor location was defined as the distance between the lower edge of the tumor and the anal verge, and was measured by digital examination and rigid proctoscopy.

The following exclusion criteria were applied: pregnancy, age younger than 18 years, previous rectal treatment (chemotherapy, radiotherapy or surgery), presence of distant metastases at the time of diagnosis, contraindications to neoadjuvant therapy due to comorbidity, and/or the presence of another synchronous tumor. Institutional Review Board approval was obtained before patient enrolment. Prior written informed consent to take part in the study was obtained from all patients.

The same techniques, namely rectal examination, endoscopic ultrasound, pelvic computed tomography or magnetic resonance, colon/rectosigmoidoscopy, were repeated at the end of nCRT.

Characteristics of the enrolled patients and the initial staging are reported in Table 1.

Table 1 Characteristics of patients and tumors

Neoadjuvant treatments

Chemotherapy, consisting of 5-fluorouracil (435 mg/m2/day) and leucovorin (20 mg/m2/day) for 32–34 days, was administered intravenously. The whole pelvic field received 25 fractions of 180 cGy/day over 5 weeks, for a total of 5040 cGy, using a 4-field box technique. Neoadjuvant chemotherapy was started concurrently on the first day of radiotherapy.

Surgery

All patients were scheduled to undergo surgery 8 weeks after completion of the nCRT. The same surgical team operated on all patients; total mesorectal excision was performed.

18F-FDG PET/CT

The first whole-body 18F-FDG PET/CT was performed around 1 week before the beginning of nCRT (baseline scan), to provide confirmation of the primary tumor and to rule out metastatic disease. The second 18F-FDG PET/CT was scheduled at 5–6 weeks from nCRT completion (post-nCRT scan) in order to reduce potential false-positive results related to chemotherapy and radiotherapy.

Images were acquired with a Discovery LSA PET/CT device (GE Healthcare, Waukesha, Wisconsin, USA) that integrates a PET scanner (advance nxI) with a 16-slice CT scanner (light speed plus). Before 18F-FDG administration, all patients fasted for at least 8 h and had a capillary blood glucose level of <160 mg/dL. They were instructed not to do any physical activity before the examination to avoid artifacts caused by muscle uptake. Image acquisition started 50 min after intravenous injection of 4.6 MBq/kg of 18F-FDG.

Patients were hydrated by drinking 500 mL of water and urinated as needed. No muscle relaxant drugs were administered. The CT scan was carried out from the external acoustic meatus to the root of the thigh with patients lying on their back with their hands above their head. The CT acquisition parameters were: 340 mA (auto), 120 kV, slice thickness 3.75 mm, tube rotation time 0.8 s, collimation field of view (FOV) of 50 cm. The CT images were reconstructed with filtered back-projection. The CT data were used for attenuation correction of the PET scan, which was performed immediately after the acquisition of CT images. The CT scans were performed without administration of contrast medium. The PET acquisition was obtained in the caudal-cranial direction; PET images were reconstructed with a matrix of 128 × 128, an ordered subset expectation maximization iterative reconstruction algorithm (two iterations, 28 subsets), an 8 mm Gaussian filter, and a 50 cm field of view.

Image analysis

Two nuclear medicine physicians with at least 8 years of experience blindly and independently analyzed data using a dedicated Advantage™ Workstation (version 3.2; GE Healthcare, Waukesha, Wisconsin, USA).

Images of baseline and post-nCRT 18F-FDG PET/CT scans were analyzed by the program “MultiVol CONF PETCT” that allows the simultaneous observation of both scans.

Qualitative analysis was performed by visual response assessment (VRA) and the response was classified as follows:

  • Complete response (CR): complete absence of disease in the post-nCRT 18F-FDG PET/CT scan;

  • Partial response (PR): reduction of the disease in the post-nCRT 18F-FDG PET/CT compared to the baseline 18F-FDG PET/CT scan;

  • Stable disease (SD): unchanged disease in the post-nCRT 18F-FDG PET/CT compared to the baseline 18F-FDG PET/CT scan;

  • Progressive disease (PD): increased disease in the post-nCRT 18F-FDG PET/CT compared to the baseline 18F-FDG PET/CT scan, or the appearance of nodal or distant metastases.

Volumes of interest (VOIs) were drawn semiautomatically on the rectal area of abnormal 18F-FDG uptake corresponding to the tumor in the baseline scan. The boundaries were drawn large enough to incorporate each target lesion in the axial, coronal, and sagittal 18F-FDG PET images. The program automatically reproduced the same VOI in the identical position on the post-nCRT scan; in cases of misalignment, the VOI was manually repositioned, matching the position with the aid of the anatomical landmarks provided by CT and fused PET/CT images.

Semiquantitative analyses were performed by calculating maximum (SUVmax) and mean (SUVmean) standardized uptake values, using the maximum and mean activity values within each VOI with the highest radioactivity concentration, normalized to the injected dose and patient’s body weight.

The SUVmax and SUVmean values of the baseline scan (SUVbaseline) and the post-nCRT scan (SUVpost-nCRT) were used to assess tumor response to therapy as follows:

  • By calculating the absolute SUVbaseline−SUVpost-nCRT difference (ΔSUV).

  • By calculating a response index (RI%), as RI = [(SUVbaseline−SUVpost-nCRT)/SUVbaseline] × 100.

ΔSUV was calculated for both SUVmax and SUVmean (ΔSUVmax and ΔSUVmean), as was RI (RImax% and RImean%).
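The two definitions above can be written as a minimal Python sketch. The function names are illustrative and not part of the study software; the input values are the SUVmax and SUVmean figures quoted in the caption of Fig. 2.

```python
def delta_suv(suv_baseline: float, suv_post: float) -> float:
    """Absolute difference: SUVbaseline - SUVpost-nCRT."""
    return suv_baseline - suv_post


def response_index(suv_baseline: float, suv_post: float) -> float:
    """RI% = [(SUVbaseline - SUVpost-nCRT) / SUVbaseline] x 100."""
    if suv_baseline <= 0:
        raise ValueError("baseline SUV must be positive")
    return (suv_baseline - suv_post) / suv_baseline * 100.0


# Values quoted in the caption of Fig. 2 (SUVmax 14.4 -> 4.7, SUVmean 7.6 -> 1.9)
delta_max = delta_suv(14.4, 4.7)
ri_max = response_index(14.4, 4.7)
ri_mean = response_index(7.6, 1.9)
print(round(delta_max, 2), round(ri_max, 2), round(ri_mean, 2))
```

Rounded to two decimals, these calls give the ΔSUVmax, RImax%, and RImean% values listed in that caption.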

In order to calculate the baseline and post-nCRT metabolic tumor volume (MTV), a fixed threshold value of 40% of the SUVmax uptake was used to determine tumor margins automatically, according to the method previously described by Larson et al. and Lee et al. [13, 14].

Total lesion glycolysis (TLG; baseline and post-nCRT) was then calculated as: SUVmean × MTV.
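As an illustration only, the fixed-threshold MTV and the TLG product can be sketched as follows. This is not the workstation algorithm. It assumes the VOI is available as a flat list of voxel SUVs, it ignores the spatial contiguity that a real isocontour tool enforces, and it takes SUVmean inside the thresholded volume. The voxel values and voxel volume are hypothetical.

```python
def mtv_and_tlg(voxel_suvs, voxel_volume_cm3, threshold_fraction=0.40):
    """Return (MTV in cm3, TLG) for one VOI.

    Voxels with SUV >= threshold_fraction * SUVmax are counted as tumor.
    Assumption: contiguity is ignored and SUVmean is taken inside the
    thresholded volume.
    """
    if not voxel_suvs:
        raise ValueError("VOI contains no voxels")
    suv_max = max(voxel_suvs)
    cutoff = threshold_fraction * suv_max
    tumor = [s for s in voxel_suvs if s >= cutoff]
    mtv = len(tumor) * voxel_volume_cm3   # metabolic tumor volume
    suv_mean = sum(tumor) / len(tumor)    # mean SUV inside the MTV
    return mtv, suv_mean * mtv            # TLG = SUVmean x MTV


# Hypothetical VOI: six voxels of 0.5 cm3 each
voi = [10.0, 8.0, 6.0, 5.0, 3.0, 1.0]
mtv, tlg = mtv_and_tlg(voi, voxel_volume_cm3=0.5)
print(mtv, tlg)
```

With a 40% threshold the cut-off here is an SUV of 4.0, so four of the six voxels are kept.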

Response evaluation–histopathology

Two gastrointestinal pathologists with at least 5 years of experience examined all resection specimens. Assessment of the tumor response to nCRT was performed according to Mandard’s tumor regression grade (TRG score) [15] and also by evaluating the (y)pTNM categories according to the International Union against Cancer (UICC, 7th edition, 2010) [16]. According to the TRG, the patients were divided into two groups: “responders” (TRG I and II) and “non-responders” (TRG III to V). According to (y)pTNM, patients were divided into “complete responders” (T0N0) and “partial/non-responders” (T1-3, anyN).
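The two groupings can be expressed as a small sketch. The function names are hypothetical, and treating every category other than T0N0 as “partial/non-responder” is an assumption made here for completeness; the text above lists T1-3 with any N.

```python
def trg_group(trg: int) -> str:
    """Mandard TRG 1-2 -> responder, TRG 3-5 -> non-responder."""
    if trg not in (1, 2, 3, 4, 5):
        raise ValueError("Mandard TRG must be an integer from 1 to 5")
    return "responder" if trg <= 2 else "non-responder"


def yptnm_group(t_category: str, n_category: str) -> str:
    """T0N0 -> complete responder; anything else -> partial/non-responder.

    Assumption: categories not listed in the text (e.g. T4) are also
    placed in the partial/non-responder group.
    """
    is_complete = t_category == "T0" and n_category == "N0"
    return "complete responder" if is_complete else "partial/non-responder"


# Patient of Fig. 2: (y)pT2N0, TRG 4
print(trg_group(4), "|", yptnm_group("T2", "N0"))
```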

Statistical analysis

Sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value (NPV) of post-nCRT 18F-FDG PET/CT were evaluated, considering the (y)pTNM response as the gold standard. Concordances of VRA with TRG and with (y)pTNM criteria were evaluated by Cohen’s K. The nCRT response was analyzed by evaluating the results of the baseline and post-nCRT 18F-FDG PET/CT scans (SUVmaxbaseline, SUVmeanbaseline, MTVbaseline, TLGbaseline, SUVmaxpost-nCRT, SUVmeanpost-nCRT, MTVpost-nCRT, TLGpost-nCRT, ΔSUVmax, ΔSUVmean, RImax%, and RImean%) in relation to the TRG and (y)pTNM criteria. Results were compared by Student’s t-test for unpaired groups. To evaluate the capacity of 18F-FDG PET/CT measurements to predict the nCRT response in patients with LARC and to identify hypothetical cut-off values, ROC curve analyses were performed. Statistical evaluations were carried out using SPSS 20.0 for Mac (Chicago, Illinois, USA).
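For readers unfamiliar with ROC analysis, the idea can be sketched in plain Python. This is not the SPSS procedure used in the study. The AUC is computed as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. The text does not state how the best cut-off was selected, so the Youden index used below is an assumption, as is treating non-responders as the positive class. All data values are hypothetical.

```python
def roc_auc(pos_scores, neg_scores):
    """AUC as P(positive score > negative score); ties count as 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def best_cutoff_youden(pos_scores, neg_scores):
    """Cut-off c maximizing sensitivity + specificity - 1 for the rule
    'score > c means positive'. Returns (c, J, sensitivity, specificity)."""
    best = None
    for c in sorted(set(pos_scores) | set(neg_scores)):
        sens = sum(s > c for s in pos_scores) / len(pos_scores)
        spec = sum(s <= c for s in neg_scores) / len(neg_scores)
        j = sens + spec - 1.0
        if best is None or j > best[1]:
            best = (c, j, sens, spec)
    return best


# Hypothetical post-nCRT SUVmax values (positive class = non-responders)
non_responders = [7.1, 5.2, 4.8, 3.9, 6.0]
responders = [2.5, 3.1, 4.4, 5.0]
auc = roc_auc(non_responders, responders)
cutoff, youden_j, sens, spec = best_cutoff_youden(non_responders, responders)
print(auc, cutoff, sens, spec)
```

Other selection criteria exist, for example the point closest to the top-left corner of the ROC plot, and they can give a different cut-off on the same data.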

Results

Pathology—TRG and (y)pTNM response

According to the TRG scale, analysis of surgical specimens of the 68 patients identified 25 responders (36.8%) and 43 non-responders (63.2%). According to the (y)pTNM evaluation, 12 patients (17.6%) were complete responders while 56 (82.4%) were partial/non-responders.

Results of histology are reported in Table 2. Figure 1 depicts a responder according to both criteria, and Fig. 2 a non-responder.

Table 2 Results—pathology analysis
Fig. 1

Patient classified as Responder. 18F-FDG PET/CT of a 51-year-old male with a 30 mm length lesion, localized 3 cm from the anal verge (clinical stage II, cT3N0, G2). Baseline MIP PET (A) and sagittal fused (B) images show the rectal lesion (green arrows). 18F-FDG PET/CT parameters collected are: SUVmaxbaseline = 11.7, SUVmeanbaseline = 6.2, MTVbaseline = 8.78, TLGbaseline = 60.30. The post-nCRT 18F-FDG PET/CT scan (C, D) did not show any lesion. 18F-FDG PET/CT parameters collected are: SUVmaxpost-nCRT = 3.5, SUVmeanpost-nCRT = 1.6, MTVpost-nCRT = 26.19, TLGpost-nCRT = 41.24, ΔSUVmax = 8.20, ΔSUVmean = 4.60, RImax% = 70.08, and RImean% = 74.19. Histological analysis revealed (y)pT3N0M0, TRG1, and R0 and the patient was classified as a responder.

Fig. 2

Patient classified as Non-responder. 18F-FDG PET/CT of a 77-year-old female with an eccentric ulcerated lesion, 50 mm in length, localized 2 cm from the anal verge. Clinical stage was III (cT2N1, G2). Baseline sagittal PET (A) and transaxial PET and fused (B, C) images show the rectal lesion (green arrows). 18F-FDG PET/CT parameters collected are: SUVmaxbaseline = 14.4, SUVmeanbaseline = 7.6, MTVbaseline = 21.47, TLGbaseline = 162.70. The post-nCRT 18F-FDG PET/CT (D, E, F) showed a reduction but no disappearance of 18F-FDG pathological uptake in the rectal area (green arrows). The patient was considered as a PR in the VRA; 18F-FDG PET/CT parameters collected are: SUVmaxpost-nCRT = 4.7, SUVmeanpost-nCRT = 1.9, MTVpost-nCRT = 19.51, TLGpost-nCRT = 37.13, ΔSUVmax = 9.7, ΔSUVmean = 5.7, RImax% = 67.36, and RImean% = 75. Histological analysis revealed (y)pT2N0, TRG4, and R0 and the patient was classified as a non-responder.

Visual response assessment

Post-nCRT 18F-FDG PET/CT scan showed sensitivity, specificity, accuracy, PPV, and NPV of 87.5%, 66.7%, 83.8%, 92.5%, and 53.3%, respectively.
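The five indices follow directly from a 2 × 2 table, as sketched below. The cell counts are not reported in the text. The values used here (49, 7, 4, 8) are an inference: they are one integer table that sums to 68, matches the 56/12 (y)pTNM split, and reproduces the percentages above when residual disease is taken as the positive class.

```python
def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard diagnostic indices from the four cells of a 2 x 2 table."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Inferred counts (assumption, see the paragraph above); positive = residual disease
metrics = diagnostic_metrics(tp=49, fn=7, fp=4, tn=8)
percentages = {name: round(100 * value, 1) for name, value in metrics.items()}
print(percentages)
```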

VRA distinguished 17 patients (25%) in CR, 40 (58.8%) in PR, 9 (13.2%) in SD, and 2 (2.9%) in PD.

Classifying the response as CR and “others” (PR + SD + PD), concordances between VRA and both TRG and (y)pTNM were moderate, with K = 0.458 (95% CI 0.239–0.677) and K = 0.522 (95% CI 0.279–0.756), respectively.
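Cohen’s K for a dichotomized rating can be sketched as follows. The cross-tabulations are not reported in the text. The cell counts below are inferred from the reported margins (17 CR on VRA, 25 TRG responders, 12 (y)pTNM complete responders, 68 patients) together with the reported K values, and should be read as a reconstruction rather than as study data.

```python
def cohen_kappa_2x2(both_yes: int, first_only: int,
                    second_only: int, both_no: int) -> float:
    """Cohen's kappa for two raters and a binary category."""
    n = both_yes + first_only + second_only + both_no
    observed = (both_yes + both_no) / n
    first_yes = both_yes + first_only
    second_yes = both_yes + second_only
    # Agreement expected by chance from the marginal totals
    expected = (first_yes * second_yes
                + (n - first_yes) * (n - second_yes)) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Inferred tables (assumption): first rater = VRA, "yes" = CR / responder
kappa_trg = cohen_kappa_2x2(both_yes=13, first_only=4, second_only=12, both_no=39)
kappa_yptnm = cohen_kappa_2x2(both_yes=9, first_only=8, second_only=3, both_no=48)
print(round(kappa_trg, 3), round(kappa_yptnm, 3))
```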

Response assessment by semiquantitative parameters

Mean values of the 18F-FDG PET/CT parameters in all 68 patients are reported in Table 3.

Table 3 18F-FDG PET/CT parameters in all patients

Mean values of the 18F-FDG PET/CT parameters according to the TRG response assessment are reported in Table 4. Statistical analysis showed a statistically significant difference between responders and non-responders for SUVmaxpost-nCRT (t = −1.999, p = 0.050) and RImean% (t = 2.413, p = 0.019).

Table 4 18F-FDG PET/CT parameters in responder and non-responder patients according to TRG

Mean values of the 18F-FDG PET/CT parameters according to the (y)pTNM response assessment are reported in Table 5. Statistical analysis showed a significant difference between “complete responders” and “partial/non-responders” for MTVbaseline (t = −2.367, p = 0.021), SUVmaxpost-nCRT (t = −4.389, p < 0.001), SUVmeanpost-nCRT (t = −3.937, p = 0.001), MTVpost-nCRT (t = 2.840, p = 0.014), RImax% (t = 2.328, p = 0.023), and RImean% (t = 2.941, p = 0.008).

Table 5 18F-FDG PET/CT parameters in “complete responder” and “partial/non-responder” patients according to (y)pTNM

ROC analysis

ROC curve analyses according to the TRG are reported in Fig. 3A, B. They showed a diagnostic capability for SUVmaxpost-nCRT (AUC = 0.675; SE = 0.065; 95% CI 0.550–0.784), SUVmeanpost-nCRT (AUC = 0.660; SE = 0.067; 95% CI 0.535–0.770), and RImean% (AUC = 0.667; SE = 0.070; 95% CI 0.543–0.777), with no significant difference among them. ROC curve analyses according to the (y)pTNM scale are reported in Fig. 4A, B. They showed a diagnostic capability for MTVbaseline (AUC = 0.747; SE = 0.069; 95% CI 0.627–0.845), SUVmaxpost-nCRT (AUC = 0.794; SE = 0.061; 95% CI 0.678–0.882), SUVmeanpost-nCRT (AUC = 0.774; SE = 0.064; 95% CI 0.656–0.866), MTVpost-nCRT (AUC = 0.784; SE = 0.083; 95% CI 0.668–0.875), RImean% (AUC = 0.751; SE = 0.086; 95% CI 0.632–0.848), and RImax% (AUC = 0.746; SE = 0.087; 95% CI 0.652–0.843), with no significant difference among them. The best cut-off values and their sensitivity and specificity are reported in Table 6.

Fig. 3

ROC analysis for the outcome variable TRG. ROC curves of the significant 18F-FDG PET/CT parameters for the assessment of response to nCRT according to the TRG.

Fig. 4

ROC analysis for the outcome variable (y)pTNM. ROC curves of the significant 18F-FDG PET/CT parameters for the assessment of response to nCRT according to the (y)pTNM.

Table 6 Best cut-off values according to TRG and (y)pTNM

Discussion

At present, the major guidelines recommend nCRT in cT3N0 patients in order to reduce tumor size and stage, improve resectability, and increase the chance of sphincter preservation [2, 11].

In patients with a complete histologically proven response [(y)pT0], a “wait-and-see” strategy may be optimal [17, 18]; however, the level of evidence is still weak and further investigations are needed before “translating” this strategy into clinical recommendations [2].

The tumor response to nCRT varies considerably among patients, ranging from complete disappearance of the tumor in about 15%–20% of cases to no change or even tumor progression during the treatment. In our study, 17.6% of patients achieved a CR (ypT0N0M0), in agreement with the literature [19]. Precise restaging to assess the outcome of the preoperative treatment is essential in these patients, since it may be possible to avoid surgery after nCRT [17, 20].

The most important scales for tumor response are: Mandard’s TRG for its prognostic role in determining modifications after nCRT (we grouped TRG1 and TRG2 together as responders because they have a similar prognosis); and (y)pTNM because it is associated with clinical outcome [(y)pT0 demonstrates the absence of neoplastic cells in the surgical specimen], but does not describe modifications after nCRT [21]. Our study is the first, to our knowledge, to analyze both the qualitative 18F-FDG PET/CT response and 12 semiquantitative PET parameters for both Mandard’s TRG and the (y)pTNM histology assessment.

Owing to the difficulty in differentiating persistent disease from scar tissue, morphological imaging tends to upstage the disease in the work-up after nCRT [22]. Many studies have demonstrated a relatively strong relationship between 18F-FDG uptake and cancer cell numbers [23]. As a result, it is reasonable to expect that the loss of viable cancer cells may be accompanied by a decrease in tumor 18F-FDG uptake. However, the inability of 18F-FDG to discriminate between a minimal tumor burden and no tumor burden has been demonstrated [12].

18F-FDG PET/CT validity for monitoring nCRT effects depends on appropriate timing, because chemotherapy and radiotherapy can produce 18F-FDG-avid inflammatory reactions that last 1 week and 6 months, in the form of fibrosis and oedema, respectively. A long interval can reduce the probability of obtaining nonspecific 18F-FDG uptake, but too long a wait is not clinically justified [10, 11, 24]. These points have been considered by the World Health Organization, which recommends performing the 18F-FDG PET/CT scan 7 weeks after nCRT and early surgery 1 week later [10]. In our study, all patients underwent 18F-FDG PET/CT 5–6 weeks after the end of nCRT and surgery was performed 8 weeks after the end of the combined treatment, consistent with the recommended method.

18F-FDG PET/CT qualitative assessment is always the first step in clinical practice. Several assessment methods have been previously proposed, among which VRA has already been shown to have a significant predictive role in distinguishing TRG responders and non-responders [25]. In our study, the 18F-FDG PET/CT sensitivity, specificity, accuracy, PPV, and NPV, and the moderate concordance between VRA and TRG and VRA and (y)pTNM allow us to consider 18F-FDG PET/CT a good diagnostic tool for the global assessment of LARC patients.

Recently, Murcia et al. achieved better results for 18F-FDG PET/CT diagnostic validity than those obtained in previous studies, regardless of whether the authors used visual analysis or a semiquantitative method [26]. De Geus-Oei et al. and Li et al. recently carried out systematic reviews of monitoring and predicting the response to therapy using 18F-FDG PET/CT in LARC [27, 28]. They identified and analyzed a series of 19 and 31 studies, respectively, although almost all the studies were very small and heterogeneous as regards the methods applied for PET quantification (e.g., VRA, SUVmax, SUVmean, and TLG), the timing of the examination, the metabolic response evaluation criteria, and the clinical endpoints [27, 28].

The most significant semiquantitative parameter for prognosis and treatment evaluation in malignancies is the SUV [29]. For uptake measurements, no difference in accuracy with respect to reproducibility has been reported between SUVmax and SUVmean. At present, SUVmax measurement is mandatory because its value is the most consistent and the least dependent on the ROI size [30].

Previous studies suggested SUVmaxpost-nCRT and RImax% as parameters that should be considered in clinical practice. Martoni et al. reported that while the baseline 18F-FDG PET/CT, expressed as SUV1, was correlated with the pathologic response, this correlation was lost at multivariate analysis [31]. Guerra et al. reported, in a study on 31 patients, no statistical differences in SUV1 between Mandard’s TRG responders and non-responders (p = 0.151). Thus, they concluded that 18F-FDG PET/CT as a baseline did not appear to have any relevance in the standard staging workup as a predictor of the pathologic response [7]. In our study, too, neither max nor mean SUVbaseline seemed to have a predictive role.

ROC analysis by Guerra et al. showed that the SUV of post-nCRT 18F-FDG PET/CT has the best accuracy in predicting therapy response, with a threshold of 4.4, while their result equivalent to RImax% in our study was found inadequate for this purpose; however, it should be considered that 18F-FDG PET/CT was performed after 12 weeks, instead of the 5–6 weeks in our study [7]. Maffione et al. demonstrated the highest AUC for SUVmax-post in the TRG assessment, with a cut-off value of 5.1, close to the SUVmax values of 5.5 and 6.0 reported in the literature, suggesting the use of a cut-off value in clinical practice [25]; results by Niccoli et al. also revealed a predictive role in TRG assessment for SUVmaxpost-nCRT and RImax% [10]. In our study, the difference between responders and non-responders in TRG assessment was statistically significant only for SUVmaxpost-nCRT and RImean%. Furthermore, SUVmaxpost-nCRT (cut-off > 4.3; sensitivity 83.7%; specificity 48%), SUVmeanpost-nCRT (cut-off > 2; sensitivity 74.4%; specificity 56%), and RImean% (cut-off ≤ 74; sensitivity 81.4%; specificity 48%) showed a predictive value in TRG assessment; they had a better AUC, with no significant difference among them. Because of the difference between our present and previous results, and because no parameter was clearly a better predictor than the others, cut-off values should be considered only as a guide in clinical practice. Similar considerations to those for the TRG can be made for the (y)pTNM assessment. Our study did not show any statistical difference between responders and non-responders, nor any predictive value, for SUVmaxbaseline even in the (y)pTNM assessment. Perez et al. found a significant difference in SUVmaxbaseline between TNM complete responders (14 ± 3.7) and non-responders (18 ± 9.3, p = 0.043), and compared SUVmaxpost-nCRT between complete responders (5.6 ± 1.9) and non-responders (6.9 ± 4.9, p = 0.23) [32].

In the literature, RImax% was the strongest predictor of response, with cut-offs ranging from 45.9% to 66%–77% [33–35]. In the analysis by Niccoli et al., SUVmax2 was the best parameter for predicting (y)pTNM, with a cut-off of > 4.3 (sensitivity 79.5% and specificity 66.7%) [10]. In the current study, there were statistically significant differences between responders and non-responders for SUVmaxpost-nCRT (4.17 ± 1.98, 7.30 ± 1.19), SUVmeanpost-nCRT (1.92 ± 0.98, 3.53 ± 1.60), RImax% (70.32 ± 17.84, 55.33 ± 20.68), and RImean% (73.73 ± 17.60, 56.03 ± 24.15).

ROC analysis showed a better AUC for SUVmaxpost-nCRT (cut-off > 4.3; sensitivity 80.4%; specificity 66.7%), SUVmeanpost-nCRT (cut-off > 2; sensitivity 71.4%; specificity 75%), RImax% (cut-off ≤ 65.5; sensitivity 67.9%; specificity 75%), and RImean% (cut-off ≤ 74; sensitivity 78.6%; specificity 66.7%) with no significant differences among them. These cut-off values are strictly dependent on our study population and should be considered with caution, as should those postulated for the TRG response criterion.

Among the parameters discussed above (SUVmaxpost-nCRT, SUVmeanpost-nCRT, RImax%, and RImean%), SUVmaxpost-nCRT seems to be the most valid parameter in predicting the nCRT response for both the pathological assessments, also having the same cut-off value (>4.3); furthermore, SUVmaxpost-nCRT is easy to collect and should always be taken into consideration when evaluating LARC patients.

SUVmax is well established as a strong quantitative measure, but it represents only the measurement of a single pixel with the highest radiotracer concentration within the ROI, and may not reflect the heterogeneous nature of the tumor. Volume-based PET parameters such as MTV and TLG have been developed to measure the metabolic activity in the entire tumor mass. MTV is considered to play a leading role as a prognostic factor in malignancies, due to its representation of the dual characteristics of tumor volume and extent of 18F-FDG uptake by tumor tissues [29]. TLG has been proposed as a more accurate parameter, because it takes into account SUVmean and MTV [13]. Using these parameters in routine clinical practice is not easy because much time and effort need to be spent to measure them. However, thanks to the development of software capable of automated volume-of-interest (VOI) assessments, these parameters have become easily available. Current clinical oncology guidelines do not include MTV measurements or TLG in the assessment of the response to treatment. However, these parameters will potentially be a useful index for assessing therapeutic response by quantifying the global change in tumor burden during or after treatment. A standard method for tumor delineation has not yet been established owing to controversies regarding whether the value of volume-based PET parameters is affected by the method used.

Moon et al. recently reviewed the methods currently applied, which consist of delineating the tumor boundaries using an isocontour threshold in which all contiguous voxels with values above a chosen threshold are included; automatic or semiautomatic tumor volume delineation reduces inter- and intra-observer variation. The authors reported that the most widely used method is the one that applies a fixed threshold based on a percentage of SUVmax within the tumor (typically, 40%–70%). The fixed threshold method for measuring tumor volume can be limited by the amount of noise present and by the choice of too low a fixed threshold, which erroneously includes a significant proportion of the background in the tumor volume [36]. In line with this literature, in our study we fixed the threshold at 40%.

A peculiarity of our method is the use of the VOI with the same dimensions in the baseline and post-nCRT scans, even if the post-nCRT scan showed an evident reduction of the lesion size and 18F-FDG uptake. We considered variations of the MTV and TLG intrinsic to the post-nCRT values and so we did not take into account differences between baseline and post-nCRT.

There is still ongoing debate as to which parameter, MTV or TLG, is superior for predicting outcomes and assessing treatment response. The best supported hypothesis is that TLG may be the ideal metabolic parameter for evaluating the tumor burden because it simultaneously represents the degree of 18F-FDG uptake and the size of the metabolically active tumor mass [36]. By contrast, in our study TLG did not show any predictive role, either for TRG or (y)pTNM assessment.

Another open question is what value of SUVmax should be considered as the cut-off to decide whether the MTV should be drawn or not. In the absence of clear indications in this regard, we decided to collect all MTV values, regardless of the SUV values.

Maffione et al., analyzing MTV results according to TRG, showed that the mean MTV from the baseline PET/CT was 19.0 cm3 (range 2.5–74.3 cm3). Responders showed a mean MTVbaseline of 18.8 cm3 while non-responders showed a mean MTVbaseline of 19.5 cm3, with no statistically significant difference. The mean MTV after nCRT was 5.6 cm3 (range 0.0–50.6 cm3). Responders showed a mean MTVpost-nCRT of 3.5 cm3 while non-responders showed a mean MTVpost-nCRT of 10.5 cm3, with a statistically significant difference [25]. In our study, according to the TRG assessment there was no statistical difference in MTV between responders and non-responders. Nor did ROC analysis show any diagnostic capability for MTVbaseline and MTVpost-nCRT.

In the study by Lee et al., MTV values in responders were 13.92 ± 13.32 vs. 15.92 ± 10.86 in non-responders; TLG in responders was 191.44 ± 91.40 vs. 121.85 ± 115.43 in non-responders. They considered that TLGbaseline of the primary tumor in the initial 18F-FDG PET/CT could be considered as a prognostic factor for complete responders to nCRT in patients with rectal cancer [8]. In our study, according to the (y)pTNM assessment, MTVbaseline and MTVpost-nCRT showed a statistically significant difference between responders and non-responders (16.22 ± 11.07 and 25.82 ± 13.06, 36.76 ± 20.39 and 19.51 ± 11.09, respectively). Among the volume-based parameters, ROC analysis showed a diagnostic capability only for MTVbaseline (cut-off > 16.8; sensitivity 75%; specificity 66.7%) and MTVpost-nCRT (cut-off ≤ 21.8; sensitivity 64.3%; specificity 83.3%), with no significant difference between them. Furthermore, a trend toward increasing MTV values emerged in responder patients and toward decreasing values in non-responder patients. This is due to the method applied, which keeps the same VOI dimensions in the baseline and post-nCRT scans.

As stated above for the cut-off values of the SUV and RI% parameters, neither MTV nor TLG was clearly a better predictor than the others, so their cut-off values should also be considered with caution because they are strictly related to the population we studied.

In the study by Maffione et al. several PET parameters (VRA, SUVmax-post, MTVpost, TLGpost, RI, ΔMTV%, and ΔTLG%), both qualitative and quantitative, were significantly related to the histological response to nCRT according to TRG [25].

Among the clinical-pathologic factors, none of the investigated parameters, including the tumor volume parameters (SUVmax, SUVmean, MTV, and TLG), was shown to be an independent predictive factor for (y)pTNM response [8]. A possible explanation for such different results cannot be limited to the nCRT assessment criterion used, but it is strictly related to the population size and to the percentage of patients that achieved a CR.

Even though literature reports consider PET as a favorable predictive tool in patients with LARC, the latest version of the National Comprehensive Cancer Network (NCCN) Guidelines on Rectal Cancer does not mention this technique in the paragraph on response to neoadjuvant treatment [37].

Our study suggests that 18F-FDG PET/CT is able to accurately stratify patients with LARC, because of the good qualitative assessment and the predictive role in pathological response of up to 6 parameters, even if our results do not identify any of these as advisable for application in clinical practice.

It must be emphasized that the cut-off values identified by all study groups, including ours, are strictly dependent on the patient population analyzed. For this reason, cut-off values should be considered only as a guide and need further validation.

We can conclude that 18F-FDG PET/CT can be a useful tool in predicting response to nCRT in LARC patients, also because it is a whole-body technique. An optimal approach could be to perform both qualitative and semiquantitative evaluations for baseline and post-nCRT scans, even if at the current state of the art none of the semiquantitative parameters revealed an unequivocally discriminating cut-off value. The prediction of response is essential for planning and providing optimal therapy and 18F-FDG PET/CT should be incorporated into future trials. Study criteria need to be harmonized; the optimal 18F-FDG PET/CT parameter for response prediction has still to be established.