Introduction

Breast cancer is the most commonly occurring cancer in women and the second most common cancer overall, making it one of the main causes of mortality and morbidity in females worldwide [1]. The current diffusion of screening programs [2] has increased the detection of early-stage breast cancer and, together with the availability of increasingly effective therapeutic protocols, has improved survival, fostering the need to reduce the late side effects [3]. The breast-conserving therapeutic approach, with limited surgery followed by radiotherapy, is often the procedure of choice for management of early-stage breast cancer [4,5,6]. The main organs at risk subjected to radiation therapy side effects in left breast irradiation are the heart and the cardiac structures, which can be associated with the most serious early or late side effects [7, 8], and the lungs, with radiation-induced pneumonitis or fibrosis. Therefore, avoidance of cardiac structures is a fundamental point in planning radiation therapy of the breast. It can be achieved by adopting a modified patient setup [9, 10] or specific breathing techniques [11,12,13,14], or by using intensity-modulated techniques that maximize the conformity to the target volume [15,16,17,18,19]. Based on the available findings, an optimal heart-sparing technique is deep-inspiration breath-hold (DIBH), where the heart becomes displaced posteriorly by inflation of the lungs and, thus, the amount of irradiated heart is reduced [20]. Simultaneously, DIBH minimizes the overlap of the lung with the treatment fields. Literature data indicate that surface monitoring systems can be considered an appropriate technology for DIBH monitoring [21]. Recently, the AAPM TG 302 provided technical guidelines for clinical indications for use of surface imaging for breast DIBH treatments. 
Optical surface scanning systems have proven their usefulness for initial patient positioning, real-time motion monitoring, and beam gating, so their clinical use has greatly increased [22]. The use of surface imaging as a tool for image-guided radiation therapy is called surface-guided radiation therapy (SGRT). The SGRT DIBH clinical workflow enables radiation delivery when the breathing signal is within a predefined gating window and the displacement between the patient’s real-time surface and the reference one is within the tolerances. The selection of these beam-on thresholds can affect the accuracy of treatment delivery. The choice of these tolerances depends on the adopted reference surface type, immobilization, and setup workflow, and should be institution-specific, analogous to PTV margins. Guidelines recommend that clinical institutions collect data on the reproducibility of SGRT-guided patient positioning for the entire course of treatment (intrafractional motion) and use these values to validate and customize thresholds [22].

In the literature, many studies have evaluated SGRT accuracy for patient setup, and several investigated the possible intrafractional chest motion during free-breathing (FB) or DIBH treatment for left-sided breast cancer [23,24,25,26,27,28]. However, to our knowledge, no institution-specific validation of the tolerance selection for DIBH treatments has been published to date.

The aim of this work is to prove the reliability of the selected beam-hold thresholds and to evaluate their impact on intrafractional motion management and DIBH stability by comparing two left breast cancer patient cohorts (tangential vs. locoregional DIBH treatment) with different predefined beam gating tolerances.

Materials and methods

Patient selection

A total of 24 consecutive patients undergoing DIBH whole-breast radiation therapy were selected for the study: 12 patients received 3D conformal locoregional half-beam treatment and 12 received 3D conformal tangential treatment (referred to in the following as the locoregional and tangential samples, respectively). Only cooperative patients able to hold their breath comfortably for at least 20 s were included. The study was conducted in accordance with the Declaration of Helsinki. All patient data were obtained as part of a retrospective data registration program within the framework of routine clinical practice. Written informed consent was obtained from all patients.

Surface-guided DIBH workflow

All patients underwent FB and DIBH CT scans, as recommended [22]. During FB CT acquisition, a virtual simulation procedure was carried out and skin mark surrogates were generated. The prospective DIBH CT study was performed using a C-RAD Sentinel™ (C-RAD AB, Uppsala, Sweden) laser scanning system coupled with a BrightSpeed CT scanner (GE Healthcare UK, Chalfont St Giles, UK). The Sentinel system was used to reconstruct a breathing signal by tracking a region of interest defined on the surface of the skin above the xiphoid process. The DIBH CT scan was performed when the respiratory signal was within the gating window, whose amplitude was 5 mm and 3 mm for the tangential and locoregional samples, respectively. Visual coaching by video goggles was provided to help the patient follow the optimal breathing pattern in DIBH. The patient was trained until a reproducible and stable deep inspiration was achieved. The baseline level and the levels of the gating window were defined according to the patient's capabilities and recorded in the patient database.

SGRT treatment delivery was performed using C‑RAD Catalyst™ (C-RAD AB, Uppsala, Sweden) connected to an Elekta Synergy linear accelerator (Elekta AB, Stockholm, Sweden) [29]. The Catalyst software treatment workflow was divided into two steps: a positioning module and a treatment module. For patient positioning, the displacement between the live image and the DIBH planning reference image was calculated and the corrections were performed automatically by moving the linac couch remotely. Rotations relative to the treatment isocenter were also calculated by the positioning module and values greater than 3° were corrected. DIBH treatments were supported by the Catalyst treatment module by continuous monitoring of both the patient’s motion and the respiratory signal. The motion monitoring was performed by tracking in real time the displacement between the treatment isocenter on the reference surface and that on the live one (“SGRT shift”) by matching the two surfaces with a nonrigid registration algorithm. After calculation of the baseline respiratory signal, the daily surface reference image was captured the first time the patient's breathing signal crossed the middle of the gating window. Gated treatment delivery was automatically started only when the breathing signal was within the established gating window and the matching between the reference and the live surface was within the tolerance. Generally, about four BHs for tangential and six BHs for locoregional treatments were needed. The thresholds for beam-on triggering selected for the SGRT shift were 5 mm and 4 mm for the tangential and locoregional samples, respectively. Stricter thresholds for beam-on triggering were adopted when including the lymph nodes because of the matching-fields technique used for the treatment.
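The two beam-on conditions described above can be illustrated with a minimal Python sketch. It is not the Catalyst implementation: the function name, the argument names, and the inclusive comparisons at the window and tolerance boundaries are assumptions made for illustration only.

```python
def beam_enabled(breathing_level_mm, window_low_mm, window_high_mm,
                 sgrt_shift_mm, shift_tolerance_mm):
    """Return True when both gating conditions hold (illustrative only).

    Assumptions: boundaries are treated as inclusive, and sgrt_shift_mm is
    the magnitude of the isocenter displacement between the reference and
    live surfaces.
    """
    # Condition 1: the breathing signal lies inside the gating window.
    in_window = window_low_mm <= breathing_level_mm <= window_high_mm
    # Condition 2: the surface match is within the beam-on tolerance.
    within_tolerance = sgrt_shift_mm <= shift_tolerance_mm
    return in_window and within_tolerance
```

With a hypothetical gating window of 12 to 17 mm (5 mm amplitude, as for the tangential sample), `beam_enabled(14.0, 12.0, 17.0, 2.4, 5.0)` evaluates to `True`, whereas a 4.5 mm shift checked against the 4 mm locoregional tolerance would hold the beam.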

During the first three fractions and weekly thereafter, triggered iView™ (Elekta AB, Stockholm, Sweden) electronic portal images (EPID) were acquired for treatment volume and CT reference surface validation. The EPID images were compared with the digitally reconstructed radiographs (DRR) by matching the position of chest wall bones and the breast surface with a maximal acceptable tolerance of 5 mm displacement in each direction and 3° rotation.

Treatment planning

The clinical target volume (CTV), the heart, the ipsilateral lung, the LAD, and the humerus were identified in both the FB and DIBH CT image sets, in accordance with the guidelines [30,31,32,33,34]. When lymph nodes were involved in the treatment, the esophagus, the thyroid, and the spinal cord were also contoured; the planning risk volumes (PRV) of the heart and LAD were created by an expansion of 3 mm to take into account the possible cardiac movement as well as uncertainties due to positioning errors. The PTV was obtained by adding an isotropic margin of 5 mm in all directions, except for 10 mm along the craniocaudal axis. The first 3 mm inside the external contour was excluded from both the CTV and the PTV.

Treatments were planned with the Pinnacle3 v.16.2 TPS (Philips Medical Systems, Fitchburg, WI, USA) on both the FB and DIBH CT image sets, in order to have a backup plan in case patients were unable to complete the treatment in DIBH. The treatment approach was a 3D tangential beam geometry, which involved two opposing wedged fields irradiating the whole breast, or a four-field “matching” beam arrangement covering the whole breast and involved nodes [22]. The dose prescription was 50 Gy in 25 fractions for locoregional cases and 42.56 Gy in 16 fractions for tangential cases. Dose prescriptions were defined to 95% of PTVs. Coverage of a lower volume was accepted in order to respect dose constraints to OARs. Recommendations reported in recent guidelines [20, 35,36,37,38] were used for OAR dose limits.

Data analysis

Evaluation of isocenter shift during radiation beam delivery

The SGRT shifts during beam-on time for each fraction of the selected treatments were investigated by analyzing the three-dimensional motion data of the isocenter obtained from the log files of the optical system. Each of the lateral, longitudinal, and vertical coordinates was considered with its proper sign. The software also calculated the magnitude of the shift vector (hereafter called “deviation” or “Dev”) and tagged each measurement with the beam-on/off status.
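As a minimal illustration of the deviation defined above, the following Python sketch computes the magnitude of a shift vector from its signed components. It assumes that the magnitude is the Euclidean norm of the three components; the function name is hypothetical and does not come from the vendor software.

```python
import math


def deviation(lat_mm, long_mm, vert_mm):
    """Magnitude (Dev) of the isocenter shift vector, in mm.

    Assumption: Dev is the Euclidean norm of the signed lateral,
    longitudinal, and vertical components.
    """
    return math.sqrt(lat_mm ** 2 + long_mm ** 2 + vert_mm ** 2)
```

Because the components are squared, the sign of each coordinate does not affect Dev, which is why the per-axis shifts are analyzed separately with their signs.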

Data analysis was conducted on three different levels:

  1. fraction level;
  2. treatment/patient level; and
  3. population level.

Fraction level

Firstly, the cumulative probability distribution, mean, minimum, and maximum of the SGRT shifts were evaluated for each single treatment fraction (hereafter “fraction” features). The data analysis was performed by means of dedicated Python routines (https://www.python.org). An example of a single fraction of a tangential treatment is shown in Fig. 1: the real-time shifts recorded by the Catalyst system along the lateral (Lat), longitudinal (Long), and vertical (Vert) axes as well as the total deviation (Dev) are reported using green and red points for beam-on and beam-off points, respectively. The histogram of the SGRT shift distribution for the same fraction is also shown in Fig. 1: green bars represent beam-on shifts and red bars represent the total fraction shifts.

Fig. 1 Example of evaluation of surface-guided radiotherapy real-time intrafractional shifts (SGRT shifts) during beam-on time for a single fraction of a particular tangential treatment. a–d Plots show lateral, longitudinal, vertical, and deviation SGRT shift, respectively. Each panel shows the real-time shifts recorded by the Catalyst (C‑RAD AB, Uppsala, Sweden) during the fraction. Green and red points represent beam-on and beam-off points, respectively. The inset of each panel shows the histogram of SGRT shift distribution for that fraction. Green bars represent beam-on shifts. Red bars represent total fraction shifts. Dashed horizontal line in d represents the beam switching threshold (5 mm for this patient)

The distribution of isocenter Dev during beam-on time was analyzed to evaluate the intrafractional DIBH reproducibility, being the parameter used to gate the beam. Isocenter SGRT shifts along the lateral, longitudinal, and vertical directions were also analyzed to evaluate in which direction the maximum deviation occurred. Moreover, the percentage of the total beam-on time with deviations greater than 3 mm, 4 mm, and 5 mm was evaluated for each fraction.
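The “fraction” features described above can be sketched in Python as follows. This is an illustration, not the authors' routine: the function and key names are hypothetical, and the sketch assumes that the log samples are uniformly spaced in time, so that the share of beam-on samples above a threshold equals the share of beam-on time.

```python
from statistics import mean


def fraction_features(dev_mm, beam_on, thresholds_mm=(3.0, 4.0, 5.0)):
    """Beam-on statistics of the deviation for one treatment fraction.

    dev_mm  : sequence of Dev values (mm), one per log sample
    beam_on : sequence of booleans, True where the beam was on
    Assumption: uniform sampling, so sample counts stand in for time.
    """
    # Keep only the samples recorded while the beam was on.
    on = [d for d, is_on in zip(dev_mm, beam_on) if is_on]
    if not on:
        raise ValueError("no beam-on samples in this fraction")
    features = {"mean": mean(on), "min": min(on), "max": max(on)}
    # Percentage of beam-on samples exceeding each threshold.
    for t in thresholds_mm:
        over = sum(1 for d in on if d > t)
        features[f"pct_over_{t:g}mm"] = 100.0 * over / len(on)
    return features
```

The same function can be applied to the signed Lat, Long, and Vert series to obtain their per-fraction mean, minimum, and maximum; the threshold percentages are meaningful only for Dev.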

Treatment level

The averages of the SGRT shifts, obtained at the “fraction” level, were calculated over a complete treatment cycle of each patient (hereafter called “treatment” features). An example of the analysis for a complete tangential treatment is presented in Fig. 2a,b. A 3D representation of the average deviation vector for each treatment fraction is plotted in Fig. 2a. In Fig. 2b, the cumulative distributions of the average fraction shift in lateral, longitudinal, vertical, and deviation are plotted. In Fig. 2c,d, the evaluation of the maximum SGRT shifts for all the fractions of the same treatment is shown.

Fig. 2 Example of the SGRT shifts evaluation for a single patient undergoing tangential treatment. a A single arrow represents the mean deviation shift for each fraction of the treatment: a total of 16 arrows are shown. b Cumulative distributions for lateral, longitudinal, vertical, and deviation shifts for all the treatment fractions (solid, dashed, dotted, and solid bold lines, respectively). c A single arrow represents the maximum deviation shift for each fraction of the treatment: a total of 16 arrows are shown. d Cumulative distributions for lateral, longitudinal, vertical, and deviation maximum shifts for all the treatment fractions (solid, dashed, dotted, and solid bold lines, respectively)

The averages of the percentage of total beam-on time with deviation > 3 mm, > 4 mm, or > 5 mm over a complete treatment cycle were also calculated for each patient. The cumulative distributions obtained for a single patient are shown in Fig. 1SM in the supplementary materials: the amplitude of the SGRT shift was lower than 3 mm for 90% of the time.

Population level

The statistical distributions of the treatment averages over the complete patient sample were studied (hereafter termed “sample” features). The cumulative distributions, averages, standard deviations, and quartiles of all the “treatment” averages were evaluated. The analysis was separately conducted for locoregional and tangential samples, in order to evaluate the influence of different delivery thresholds for the two cohorts of patients and to compare the breath-hold workflows adopted for these two treatment sites.
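A possible way to compute the “sample” features from the per-patient “treatment” averages is sketched below in Python. The function name and the choice of the inclusive quartile method are assumptions; the text does not specify which quantile definition was used.

```python
from statistics import mean, quantiles, stdev


def sample_features(treatment_averages):
    """Summarize per-patient treatment averages over a patient sample.

    Assumption: quartiles use the 'inclusive' method of
    statistics.quantiles; other definitions give slightly different values.
    """
    q1, median, q3 = quantiles(treatment_averages, n=4, method="inclusive")
    return {
        "mean": mean(treatment_averages),
        "sd": stdev(treatment_averages),  # sample standard deviation
        "q1": q1,
        "median": median,
        "q3": q3,
    }
```

Calling the function once per cohort (locoregional and tangential) keeps the two samples separate, as in the analysis described above.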

Evaluation of respiration signal stability during radiation beam delivery

The respiratory signal stability during all the DIBHs of each fraction was evaluated by calculating the standard deviation of the breathing level during treatment delivery. Analysis was conducted by means of dedicated Python routines. The respiratory signal was first exported from the optical system as a text file. Then the histogram of its amplitude was reconstructed. The histogram reported in Fig. 3 shows two peaks: one corresponds approximately to the points belonging to the signal baseline, and the other to the points located inside the gating window. This second peak corresponds to the part of the signal belonging to the DIBHs. The signal around this peak was selected (peak maximum ± 4 mm) and its standard deviation was calculated, with a σ-clipping procedure to remove the effect of outliers. In detail, a first estimate of the mean and standard deviation was used to clip the signal points exceeding ± 2.5σ. The mean and standard deviation were then calculated again. This procedure was iterated twice. An example of the analysis for a single treatment fraction is shown in Fig. 3.

Fig. 3 Example evaluation of the stability of the respiratory signal during an entire fraction. a Respiratory signal amplitude histogram. Green region represents the masked DIBH peaks. b Zoom of the respiratory signal DIBHs. Green points correspond to the masked histogram peak points. Continuous and dashed lines represent sigma-clipped mean and standard deviation of the DIBH peak points

The respiratory signal stability is typically measured with its standard deviation during DIBHs, these being identified with the part of the signal falling inside the patient-specific gating window [39]. A more complex approach to characterize the stable part of the DIBHs consists of fitting a straight-line trend to each DIBH [39]. In the present work, we adopted this sigma-clipping method as an easy alternative to characterize the stable part of DIBHs, excluding the contribution of typical signal peaks or outlier points falling inside the gating window.
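The sigma-clipping procedure can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the position of the DIBH histogram peak is passed in rather than detected, the population standard deviation is used, and the function and parameter names are hypothetical.

```python
from statistics import mean, pstdev


def dibh_stability(signal_mm, peak_mm, half_width_mm=4.0,
                   n_sigma=2.5, iterations=2):
    """Sigma-clipped mean and standard deviation of the DIBH plateau.

    signal_mm : respiratory signal samples (mm)
    peak_mm   : amplitude of the DIBH histogram peak (assumed known)
    Assumption: population standard deviation (pstdev) is used.
    """
    # Select the samples around the DIBH peak (peak maximum +/- 4 mm).
    selected = [s for s in signal_mm if abs(s - peak_mm) <= half_width_mm]
    if not selected:
        raise ValueError("no samples around the DIBH peak")
    mu, sigma = mean(selected), pstdev(selected)
    for _ in range(iterations):
        # Discard points farther than n_sigma standard deviations from the mean.
        kept = [s for s in selected if abs(s - mu) <= n_sigma * sigma]
        if not kept:
            break
        selected = kept
        mu, sigma = mean(selected), pstdev(selected)
    return mu, sigma
```

The returned standard deviation is the per-fraction stability figure; clipping prevents isolated spikes inside the ± 4 mm selection from inflating it.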

Data analysis of respiratory signal stability during beam-on time was performed on the three abovementioned levels: at the “treatment” level, the mean and standard deviation of the “fraction” respiration signal standard deviation were evaluated, and, finally, at the “sample” level, cumulative distributions, averages, standard deviations, and quartiles of the “treatment” averages were evaluated.

The influence of gating window amplitude, set differently for locoregional and tangential populations, on the stability of the DIBH technique was investigated.

Statistical analysis

A Student's t-test with a significance level of 0.05 was performed to evaluate whether the sample statistical distributions of the lateral, longitudinal, vertical, and deviation SGRT shifts were consistent with 0 mm shift. Moreover, by means of a Wilcoxon signed-rank test [40], considering a p-value of < 0.05 as significant, the Lat mean, Long mean, Vert mean, and Dev mean distributions, obtained for locoregional and tangential samples, were compared.

The systematic and random components of the beam-on uncertainties were investigated in the entire sample. We evaluated: (i) the standard deviation of group systematic shift Σ, which is an estimation of the standard deviation of the systematic error calculated as the standard deviation of the means per patient [41]; (ii) the standard deviation of group random shift σ, which is an estimation of the standard deviation of the random error calculated as the root mean square of the treatment shifts standard deviations [41]. The percentage of beam-on time with deviation > 3 mm, > 4 mm, or > 5 mm and the respiration signal stability distributions for locoregional and tangential samples were compared with the Wilcoxon signed-rank test.
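The definitions of Σ and σ given above can be written as a short Python sketch. The function name and the input layout (one list of fraction-mean shifts per patient) are assumptions for illustration, and sample standard deviations are used throughout.

```python
from math import sqrt
from statistics import mean, stdev


def group_errors(shifts_by_patient):
    """Group mean M, systematic SD (Sigma), and random SD (sigma).

    shifts_by_patient : list of lists; each inner list holds the
    fraction-level mean shifts (mm) of one patient along one axis.
    Assumption: sample standard deviations (n - 1 denominator).
    """
    patient_means = [mean(p) for p in shifts_by_patient]
    patient_sds = [stdev(p) for p in shifts_by_patient]
    group_mean = mean(patient_means)
    # Sigma: standard deviation of the per-patient means.
    big_sigma = stdev(patient_means)
    # sigma: root mean square of the per-patient standard deviations.
    small_sigma = sqrt(mean([sd ** 2 for sd in patient_sds]))
    return group_mean, big_sigma, small_sigma
```

Comparing the two returned standard deviations indicates whether the spread of the shifts is dominated by differences between patients (Σ) or by fraction-to-fraction variation within each treatment (σ).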

Results

Patients’ mean age was 54 years (range 30–66) and 57 years (range 50–75) in the locoregional and tangential samples, respectively. A total of 300 fractions for the locoregional sample and 192 fractions for the tangential sample were analyzed, corresponding to 1802 BHs and 768 BHs for the locoregional and tangential samples, respectively. None of the patients required re-simulation during treatment due to an unsatisfactory x‑ray imaging verification.

For the locoregional patient sample, the difference between the low level and the baseline ranged between 10 and 21 mm, with a mean and standard deviation of 14 ± 3 mm, while for the tangential sample, it ranged between 9 and 23 mm, with a mean and standard deviation of 15 ± 4 mm.

Figure 2SM shows two typical dose distributions for the locoregional and tangential samples (left and right panel, respectively).

SGRT shifts

The mean SGRT shift vectors during beam-on time, calculated for each fraction and for each complete treatment, for all the patients of the locoregional and tangential samples, respectively, are shown in a 3D plot in Fig. 3SM in the supplementary materials. The 3D plot represents an overview of how the data samples are distributed in 3D space.

The median values with 5–95% confidence intervals for the minimum, maximum, and mean values of SGRT shifts for both locoregional and tangential samples are summarized in Table 1. The Lat mean distributions for both samples were approximately consistent with a null shift, whereas Long mean shift was typically negative (median value −0.8 mm for both samples). Vert mean shift was also negative (median value −0.4 mm and −0.8 mm for locoregional and tangential samples, respectively). The Dev median shift values were equal to 2.5 mm with a 95% CI of [1.8; 3.1 mm] for the locoregional sample, and to 2.4 mm with a 95% CI of [1.7; 3.1 mm] for the tangential sample, showing that results for the locoregional and tangential samples were consistent. The boxplots for the Dev shifts for both locoregional and tangential samples are shown in Fig. 4, whereas the boxplots for the shifts along each axis (Lat, Long, Vert) are reported in Fig. 4SM of the supplementary materials.

Table 1 Median values with 5–95% confidence intervals for the minimum, maximum, and mean values of SGRT shifts for both locoregional and tangential samples

Fig. 4 Boxplots for Dev SGRT shift distributions for locoregional and tangential samples. The distribution of minimum, maximum, and mean values for the locoregional and tangential samples are shown. Boxes represent the 25–75 percentile interval. Central lines represent the median. Whiskers represent the 5–95 percentile interval. Circles represent outliers of the 5–95 percentile interval

The Student t-test confirmed that the sample statistical distribution for the Lat mean shifts was consistent with 0 mm shift (p-value > 0.05), both for locoregional and tangential samples. The Vert mean shift for the locoregional sample was also consistent with 0 mm shift (p-value > 0.05), while the Vert mean shift for the tangential sample was significantly different from the null shift (p-value 0.01). The Long mean and Dev mean distributions significantly deviated from 0 mm shift for both the samples, showing p-values < 0.05. Details on the t‑test are reported in Table 2.

Table 2 Mean treatment shifts for the locoregional and tangential samples (lateral, longitudinal, vertical and deviation). Mean population values, t‑test p‑values, Wilcoxon signed-rank test p‑values for the comparison of locoregional and tangential samples

The comparison of the Lat mean, Long mean, Vert mean, and Dev mean distributions between locoregional and tangential samples indicates that no statistically significant difference was found. The resulting p-values of the Wilcoxon signed-rank test are reported in the last column of Table 2.

Results for the percentage of beam-on time during which the deviation exceeded 3 mm, 4 mm, or 5 mm are shown in Fig. 5: the amplitude of the SGRT shift was higher than 3 mm for only 30% of the time, both for locoregional and tangential samples, and the shift was higher than 4 mm for only about 10% of the time for both samples.

Fig. 5 Boxplots of distributions of percentage of beam-on time with deviation > 3 mm, 4 mm, or 5 mm for locoregional and tangential samples. Boxes represent the 25–75 percentile interval. Central lines represent the median. Whiskers represent the 5–95 percentile interval. Circles represent outliers of the 5–95 percentile interval

The Wilcoxon signed-rank test confirms that data for locoregional and tangential samples are consistent. However, analyzing the dispersion of data for both samples, the Dev > 3 mm and Dev > 4 mm distributions for the tangential sample were more widespread than the locoregional ones (Dev > 3 mm 95th percentile equal to ~ 51% and ~ 44% for tangential and locoregional samples, respectively, and Dev > 4 mm 95th percentile ~ 24% and ~ 16% for tangential and locoregional samples, respectively). This indicates that it is possible to reduce the occurrence of higher SGRT shifts by selecting a lower threshold value, as done for locoregional treatments (Dev threshold equal to 4 mm) compared to tangential treatments (Dev threshold equal to 5 mm). A short delay in the automatic beam interruption when a deviation over the threshold was recorded led to the presence of a fraction of cases with shifts exceeding the beam-triggering thresholds for both locoregional and tangential samples.

The standard deviation of group systematic shift Σ and the standard deviation of group random shift σ for lateral (Lat), longitudinal (Long), vertical (Vert), and deviation (Dev) shifts are presented in Table 3 for the locoregional and tangential samples. The locoregional and tangential samples show consistent mean treatment shifts if considering the Σ and σ values.

Table 3 Standard deviation of group systematic shift Σ and standard deviation of group random shift σ for lateral, longitudinal, vertical and deviation shifts

Results for σ, which provided an indication of the variability of “fraction” shifts over a treatment, were in every case lower than Σ, which provided an indication of the variability of “treatment” shifts over the patient sample. The comparison of Σ with σ gives the important indication that the shift value dispersion is more likely due to intertreatment variability than to intratreatment variability.

Stability of the respiration signal

The respiratory signal stability was evaluated for a total of 1802 BHs and 768 BHs for locoregional and tangential samples, respectively. The statistical distributions of the treatment averages over the patient sample are reported in Fig. 6. The median value of the DIBH respiratory signal standard deviation was lower for the locoregional sample (0.17 mm) than for tangential patients (0.26 mm), with a minimum–maximum range of 0.10–0.33 mm and 0.12–0.59 mm, respectively. A considerable difference in the distribution of the maximum values for the two anatomical sites was found: higher values were obtained for the tangential sample (~ 0.55 mm 95th percentile) with respect to the locoregional sample (~ 0.24 mm 95th percentile). The Wilcoxon signed-rank test returned a p-value of 0.034, indicating that the difference between the two samples is statistically significant. Taking into account the different respiration signal gating window amplitude selected for tangential treatments (5 mm; locoregional treatments 3 mm), these results suggest an influence of the delivery tolerances on the stability of the respiratory signal during DIBH treatment delivery.

Fig. 6 Boxplots of distributions of DIBH respiration signals standard deviation for locoregional and tangential samples. Boxes represent the 25–75 percentile interval. Central lines represent the median. Whiskers represent the 5–95 percentile interval. Circles represent outliers of the 5–95 percentile interval

Discussion

During treatment planning and delivery phases, many different uncertainties can affect dose delivery, with a risk of minimizing the benefits of advanced techniques such as DIBH treatment. The definition of margins for the target and for organs at risk must take into account the uncertainties due to positioning errors as well as movements during treatment, together with geometrical uncertainties involved in the delineation process [36]. Therefore, not only the accuracy of patient positioning but also the reproducibility of the radiotherapy treatment against intrafractional motion has to be evaluated. In a recent multicentric in vivo dosimetry study, the breast site showed the highest rate of out-of-tolerance fractions [42]. SGRT technology has already shown its suitability as a valuable tool for increasing breast cancer patient setup accuracy in comparison to three-point localization [43]. The advantages of cone beam CT (CBCT) vs. portal images for daily positioning have been suggested, especially for patients with larger breasts and for patient rotation detection [44]. Moreover, the synergy between CBCT and SGRT for patient positioning in DIBH treatments was also evaluated, demonstrating congruence between both methods [28]. To date, only a few data exist on DIBH real-time intrafractional motion evaluation [25,26,27,28].

The reproducibility and stability of DIBH SGRT against the backdrop of potential intrafractional motion for patients undergoing adjuvant radiotherapy for left-sided breast cancer with and without the inclusion of lymph nodes were evaluated in the present study. The measurements of SGRT shifts and of the respiratory signal level during DIBH were performed and subsequently analyzed both at the level of the single fractions of the same patient treatment (intratreatment) and at the level of a sample of treatments (intertreatment). Different thresholds for beam-on triggering were adopted for the two investigated treatment sites: tolerances for the locoregional patients were lower for the SGRT shifts and for the gating window amplitude. In fact, treatment of supraclavicular nodes is achieved using anterior and posterior oblique fields whose divergence is matched to tangential breast fields by using a single shared isocenter located at the junction between the breast and the supraclavicular fields. Thus, stricter tolerances were adopted in locoregional treatments to avoid the potential overlap of the matching fields. Results obtained for the two samples with different beam-hold thresholds were then compared.

Firstly, it is important to remark that the selected thresholds were met for all the patients, in both the locoregional and tangential samples, thus demonstrating their reliability.

Lateral cumulative distributions for locoregional and tangential samples were centered on 0 mm, whereas longitudinal and vertical cumulative distributions were shifted slightly toward negative values (mean values < 1 mm), thus indicating that during the beam-on time, the patient is, on average, shifted toward the cranial and anterior directions, probably due to an increased amount of inhaled air during apnea. The t-test performed on the treatment samples confirmed that only the lateral SGRT shift distributions were consistent with a null shift for both the locoregional and tangential samples. The same behavior was found by Reitz et al. [23], who performed an SGRT shift study during SGRT FB breast radiotherapy in a sample of 104 patients for a total of 2028 treatment fractions using the Catalyst system. The two studies indicate that the SGRT shifts are generally smaller along the lateral direction than on longitudinal or vertical ones, independent of the FB or DIBH treatment modality.

The Wilcoxon test indicated that all the lateral, longitudinal, vertical, and deviation shifts were consistent for the locoregional and tangential samples, even if the locoregional sample showed a higher dispersion of the measured shifts with respect to the tangential sample (as confirmed by the 95% CIs reported in Table 1). This could be due to the longer duration of the locoregional treatment, which generally included at least six DIBH vs. the four DIBH of the tangential treatment.

Nevertheless, the evaluation of the percentage of beam-on time with deviation > 3 mm, > 4 mm, or > 5 mm showed that distributions for the tangential sample were extended toward higher values than for the locoregional sample, thus indicating that the frequency of occurrence of a high SGRT shift is lower for the locoregional sample than for the tangential sample. This is an important indication that it is possible to reduce the frequency of high SGRT shifts by adopting lower threshold values.

Intrafractional reproducibility of voluntary DIBH whole-breast radiotherapy implemented with SGRT was studied by Xiao et al. [25] in a sample of 31 left-sided patients. They adopted a 7-mm threshold for the 3D magnitude vector, achieving a mean shift of 1.52 mm, 1.98 mm, and 2.17 mm for Lat, Long, and Vert directions, respectively. A similar result was found by Gierga et al. [27]: they selected a tolerance for BH matching of 5 mm along each direction, obtaining a mean real-time shift of about 2 mm. All the values achieved in [25] and [27] are higher than our results, which were lower than 1 mm for each direction. This finding suggests that a tighter beam gating threshold, such as the ones adopted in the present study, can reduce the SGRT shifts, thus increasing the accuracy of treatment delivery.

Hamming et al. [28] studied the SGRT shifts in 18 left-sided breast treatments by measuring the DIBH variability (i.e., the maximum variation between single DIBH levels). The SGRT system was used just to measure motion data without gating the treatment delivery. The average variability was within 3 mm in all directions, with maximum values up to 12.4 mm in the longitudinal direction. Moreover, Kügele et al. [26] evaluated the beam-on SGRT shift on a sample of 20 locoregional and 20 tangential SGRT DIBH left breast treatments, obtaining median values similar to ours. Maximum values of 5.4 mm, 5.3 mm, and 3.8 mm in the Lat, Long, and Vert directions, respectively, for the tangential sample and 3.4 mm, 5.6 mm, and 2.7 mm for the locoregional sample were reported in [26]. In contrast, in our study, the maximum values were lower than 3 mm for each direction of both samples. This difference may be due to the fact that in the latter study’s treatment workflow, no threshold on the deviation is used to automatically gate the treatment beam. This confirms the importance of setting a proper delivery threshold to increase the reproducibility of the treatment technique against the backdrop of potential intrafractional motion.

The analysis of the standard deviations of systematic shift Σ and random shift σ showed that σ was lower than Σ, suggesting that the variability of the mean sample shifts was more likely due to variations between the patient-specific intrafractional motion (i.e., intertreatment variability) than to the intrafractional motion during each fraction of a single patient (i.e., intratreatment variability). Betgen et al. evaluated intrafractional SGRT shifts in a sample of 19 patients undergoing conventional 3DCRT left breast DIBH treatment by measuring the difference in isocenter position between just two subsequent DIBHs [24]. They measured group mean shifts of at most ~ 0.5 mm in each direction, with Σ and σ of ~ 0.4 mm and ~ 1.4 mm, respectively, in each direction. In our study, smaller σ values were achieved, indicating highly reduced intrafractional motion during each fraction of a single patient (i.e., intratreatment variability) and suggesting greater reliability of the DIBH gating system.

The DIBH respiration signal variability measured in the present study was significantly higher for the tangential sample, as confirmed by the Wilcoxon test. The difference is particularly evident in the high-value tail of the tangential sample distribution. This could be due to the wider amplitude of the gating window for the tangential treatments than for the locoregional ones, highlighting the importance of setting tight tolerance levels to achieve better performance.

Reitz et al. [39] studied the stability and reproducibility of respiration signal on a sample of 6013 single DIBHs in 103 patients receiving conventional 3DCRT left breast DIBH treatment. They individually adjusted the gating window width for each patient, obtaining a mean amplitude of 3.5 mm (95% CI: [2–4.3] mm) and measuring a median standard deviation of 0.3 mm (~ 0.9 mm 95th percentile), which is slightly higher than our median values of 0.17 mm (~ 0.24 mm 95th percentile) and 0.26 mm (~ 0.55 mm 95th percentile) for locoregional and tangential samples, respectively. This finding confirms the possibility of increasing respiration signal stability via proper threshold selection.

The main limitations of the present work are the small number of patients considered for each treatment site and the retrospective data sampling. Larger samples could improve the statistical analysis. The robustness evaluation to determine the effect of SGRT shifts on the dose to the target and organs at risk volumes will be part of a future study.

Conclusion

In surface-guided DIBH radiotherapy for left breast cancer, different thresholds for beam-on triggering can affect the reproducibility against the backdrop of potential intrafractional motion by reducing the maximum SGRT shift when using lower threshold values. Moreover, it is possible to get higher respiration signal stability by adopting a tighter gating window amplitude. The present experience may be used as a guide for selecting reasonable thresholds for other clinics adopting a similar DIBH workflow.