Introduction

CyberKnife (Accuray Inc., Sunnyvale, CA, USA) delivers a megavoltage (MV) photon beam to the target via a robotically controlled linac, which dynamically tracks prostate motion and corrects the beam position during fraction delivery. The intrafraction imaging frequency of the orthogonally placed cameras can be varied, with shorter imaging intervals resulting in greater treatment accuracy. However, increasing the frequency of imaging and positional corrections increases the total duration of treatment, which increases the cost of treatment and can expose the patient to a greater X-ray radiation dose. The clinical target volume (CTV) to planning target volume (PTV) margin in CyberKnife prostate treatment encompasses the potential displacement between images, which makes this displacement an important quantity to study [1].

Various quantities describe intrafraction motion and its impact on geometric uncertainty. Geometric deviations of the same direction and magnitude for an entire treatment course for one patient represent a systematic error. Random errors are those that differ in magnitude and direction for each fraction and, over sufficient fractions, result in dose blurring [2,3,4,5]. CyberKnife prostate treatment is usually delivered over 5 fractions, which is significantly fewer than conventional fractionation [6,7,8,9]. Because so few fractions are delivered, geometric inaccuracies in CyberKnife treatment can have a greater clinical impact than the dose blurring that occurs in conventional fractionation.

Few prior studies of prostate margins include a detailed analysis of the frequency of imaging and beam repositioning required during CyberKnife treatment to ensure adequate geometric coverage [8, 10, 11]. Studies analysing image acquisition frequency have found a correlation between image acquisition frequency and the necessary margins, indicating that there may exist an optimal combination of the two [8,9,10, 12]. Curtis et al. analysed data from 31 prostate intensity modulated radiation therapy patients with beacon transponders tracked by the Calypso electromagnetic tracking system and found that simulated imaging frequencies of every 15, 60, and 240 s combined with 1, 2, and 3 mm planning margins were required to achieve adequate geometric target coverage for 95% of the time [9]. In patients treated with CyberKnife for prostate cancer, Xie et al. observed prostate displacement over treatment time and found that approximately 95% of the dataset was contained within thresholds of 2 and 3 mm when simulated imaging intervals of 30 and 60 s respectively were applied, for a sample size of 21 patients [8]. Kioke et al. analysed 16 patients treated with CyberKnife and reported prostate motion mean and standard deviation (SD) of 0.03 (0.91), −0.01 (0.35) and −0.02 (1.04) mm between consecutive imaging with an average imaging interval of 70 s [7]. Taillez et al. calculated translations and rotations from 162 patients treated with CyberKnife in order to observe the changes in deviations over the length of the treatment [13].

Several authors have attempted to individualise margins based on patient factors [14,15,16,17]. For example, Thompson et al. found that patients with a higher body mass index (BMI) have less intrafraction displacement of the prostate in the superior-inferior dimension compared with patients with a lower BMI [17]. Maruoka et al. found that rectal volumes and mean area correlated positively with posterior margins and BMI correlated negatively with anterior and right margins [18]. In the current study, tracking information was available for 331 prostate patients treated on a single CyberKnife and imaged at several different imaging frequencies, providing information on intrafraction geometric variations, their time-course and their relationship to applied margins. The aims of this study were to (i) use the tracking information acquired to characterise prostate intrafraction motion, (ii) assess the adequacy of common stereotactic body radiation therapy (SBRT) imaging frequencies and (iii) investigate clinical factors that might predispose a patient to significant intrafraction motion.

Methods

Three hundred and thirty-one patients who underwent CyberKnife SBRT for prostate cancer on a single CyberKnife device, using fiducial tracking methods, between 2014 and 2021 at a single centre in a tertiary hospital were included in this analysis. This study was approved by the institutional human research ethics committee (2014-031) and participants provided written consent for their data to be used. Our department protocol for preparation, dose targets and constraints has been described previously [19]. Prior to simulation, typically four fiducial markers are implanted into the prostate. Patients are required to empty their rectum and to have a comfortably full bladder when presenting for CT simulation, and prior to the delivery of each fraction.

The CyberKnife Treatment Delivery System v 9.0-11.1 (Accuray, Sunnyvale CA) was used to track the position of the implanted fiducials throughout each treatment. During patient set-up, the prostate was imaged with stereoscopic kV imaging in orthogonal planes and the ‘offsets’ from the planned position were detected by image registration with a digitally reconstructed radiograph created during treatment planning. If the fiducials’ position is within a given tolerance of the planned position, the CyberKnife robotic arm will reposition and treatment will commence. If the fiducials are outside of this tolerance, the treatment couch will be moved and the process repeated. Once the treating radiation therapist is satisfied with the patient alignment and treatment has commenced, the fiducial locations relative to the planned positions are recorded with each kV image acquisition and stored in a log file. All patients were treated using an iris collimator. Patients were typically treated with five fractions delivered on separate days. A median fiducial tracking threshold of 10 mm was used for the patient cohort. If, after fraction delivery has commenced, the fiducials are detected outside of the threshold, treatment will be paused. In most cases, movement beyond the threshold is transitory and, once the fiducials move back within the threshold, treatment will continue. If the patient does not move back within the threshold, an adjustment of couch position will be made. Treatment interruptions that led to either adjustments to the couch position or the patient leaving and returning to the treatment couch resulted in the generation of a makeup fraction file containing the post-interruption tracking data. The generation of a new file ensured that such interruptions did not affect the accuracy of the vectors recorded throughout treatment. In some instances, there were several fraction files associated with a single fraction delivery. Therefore, in this study, the term ‘fraction delivery file’ refers to each individual file as a separate dataset.

The fraction delivery files and plan overview reports generated with the Accuray Precision Treatment Planning System v. 2.0 (Accuray, Sunnyvale CA) were available for manual export to several formats. The fraction delivery reports and the plan overviews, including the volumes of the delineated patient structures, were manually extracted in Excel (xlsx) and comma-separated value (csv) formats respectively. Scripts were written to confirm the quality of the data and to check for duplicate or missing records. Patient age, weight and height were queried from the relevant onsite registries and joined to the dataset using unique patient keys. All data loading, transformation, analyses, calculations and plots were performed in Python 3.8 with the Jupyter Notebook computing platform.
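
As an illustration only, the duplicate and missing-record checks and the registry join described above might be sketched with pandas as follows. The column names (`patient_id`, `fraction_file`, `image_number`, `x`, `y`, `z`, `age`) and the toy records are hypothetical and are not taken from the study's scripts.

```python
import pandas as pd


def check_and_join(tracking: pd.DataFrame, registry: pd.DataFrame) -> pd.DataFrame:
    """Report duplicate and incomplete tracking rows, then attach registry fields."""
    # Hypothetical key: one row per image within a fraction delivery file.
    key_cols = ["patient_id", "fraction_file", "image_number"]
    n_dupes = int(tracking.duplicated(subset=key_cols).sum())
    n_missing = int(tracking[["x", "y", "z"]].isna().any(axis=1).sum())
    print(f"duplicate rows: {n_dupes}, rows with missing translations: {n_missing}")

    clean = tracking.drop_duplicates(subset=key_cols)
    # validate="m:1" raises an error if a patient key is repeated in the registry.
    return clean.merge(registry, on="patient_id", how="left", validate="m:1")


# Toy data: one duplicated image record and one record with a missing translation.
tracking = pd.DataFrame({
    "patient_id": [1, 1, 1, 2],
    "fraction_file": ["a", "a", "a", "b"],
    "image_number": [1, 2, 2, 1],
    "x": [0.1, 0.3, 0.3, None],
    "y": [0.0, -0.2, -0.2, 0.4],
    "z": [0.2, 0.1, 0.1, 0.0],
})
registry = pd.DataFrame({"patient_id": [1, 2], "age": [68, 72]})
joined = check_and_join(tracking, registry)
```

In this sketch, rows with missing translations are only counted, not dropped; how such rows were handled in the study is not stated.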

Eight patients from the cohort were treated using the “In Tempo” method whereby CyberKnife automatically adjusts imaging frequencies depending on patient movement, and the rest of the patients were treated at imaging intervals chosen by the treating radiation therapist. The imaging intervals were usually 15 or 20 s at the beginning of treatment and increased to 45 s depending on patient stability. Imaging interval values were present for 90% of the tracking vectors recorded and the missing imaging interval values were estimated based on the image age at the end of the final beam delivery for each vector recorded.

Motion calculations

kV imaging acquired during each fraction delivery provides the position of the target in all six degrees of freedom (three translations and three rotations). For each fraction file, i, at the jth image acquisition occurring at time \(t_j\) (where j is an integer and \(j=1\) represents the image-verified location of the first beam delivery), the position of the fiducials relative to the planned position is

$$\begin{aligned} \mathbf {r_{i,j}} = (x,y,z,roll,pitch,yaw). \end{aligned}$$
(1)

Here, x, y and z are translations in the (−)Superior/(+)Inferior (SI), (+)Left/(−)Right (LR) and (+)Anterior/(−)Posterior (AP) planes, and roll, pitch and yaw are rotations about the x, y and z axes respectively. Under the assumption that the CTV did not accelerate during each time interval, the change in position between subsequent images, \(\mathbf {r_{i,j}} - \mathbf {r_{i,j-1}}\), over the time interval applied at \(\mathbf {r_{i,j}}\) could be used to estimate the CTV velocity between the two acquisitions as

$$\begin{aligned} \mathbf {v_{i,j}} =\frac{ \mathbf {r_{i,j}} - \mathbf {r_{i,j-1}} }{\Delta t_{i,j}} \end{aligned}$$
(2)

where \(\Delta t_{i,j}\) denotes the imaging interval chosen by the treating radiation therapist. Calculating the CTV velocity for each interval allowed the position at each second of treatment, \(t_{j-1} \le t \le t_j\), to be estimated as \(\mathbf {r_{i,j-1}}+\mathbf {v_{i,j}}\,(t-t_{j-1})\), which in turn allowed new simulated imaging intervals to be applied to the full dataset. The lower and upper extreme CTV vectors in each plane occurring between the simulated image acquisitions \(k-1\) and \(k\) were then determined, in order to model the percent of time that a patient’s CTV may remain within a given movement threshold for time intervals not present in the dataset.
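
A minimal sketch of this resampling for a single axis is given below. It is one possible reading of the procedure, not the study's code: it assumes that the worst-case excursion in each simulated window is measured relative to the position at the start of that window, and that all seconds in a window count as within threshold only if that excursion does not exceed the threshold. The function names and the toy positions are hypothetical.

```python
import numpy as np


def per_second_positions(times_s, positions_mm):
    """Interpolate one axis to 1 s resolution, assuming constant velocity between images."""
    t = np.arange(times_s[0], times_s[-1] + 1)
    return np.interp(t, times_s, positions_mm)


def percent_time_within(times_s, positions_mm, interval_s, threshold_mm):
    """Percent of treatment seconds lying in simulated windows that stay within threshold."""
    r = per_second_positions(np.asarray(times_s, float), np.asarray(positions_mm, float))
    within_s = 0
    total_s = 0
    for start in range(0, len(r) - 1, interval_s):
        window = r[start:start + interval_s + 1]
        # Lower and upper extremes in the window, relative to the simulated image position
        # (assumption: the reference is the position at the start of the window).
        worst = max(window.max() - window[0], window[0] - window.min())
        seconds = len(window) - 1
        total_s += seconds
        if worst <= threshold_mm:
            within_s += seconds
    return 100.0 * within_s / total_s


# Toy example: stable for 30 s, then a 3 mm drift over the following 30 s.
times = [0, 15, 30, 60]
positions = [0.0, 0.5, 0.5, 3.5]
pct_30s = percent_time_within(times, positions, interval_s=30, threshold_mm=2.0)
pct_15s = percent_time_within(times, positions, interval_s=15, threshold_mm=2.0)
print(pct_30s, pct_15s)
```

In the toy example the 2 mm threshold is exceeded in the second 30 s window but in none of the 15 s windows, which is the kind of trade-off between imaging interval and threshold that the simulation is intended to expose.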

Statistical definitions

Using the vectors in Eq. 1, the population mean, mean absolute displacement, systematic error and random error were calculated for the shifts occurring between subsequent imaging over the patient cohort. The population mean in this study represents the mean of the patient displacement means. The mean absolute displacement is defined as the mean of the absolute patient displacement means. The population systematic error, \(\Sigma\), is defined as the SD of the patient displacement means. The population random error, \(\sigma\), is defined as the mean of the patient displacement SDs.
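
These definitions can be illustrated with a short sketch for a single axis. It assumes that \(\sigma\) is taken as the mean of the per-patient SDs (a root-mean-square of the per-patient SDs is a common alternative) and that sample SDs are used; the function name and the toy displacements are hypothetical.

```python
from statistics import mean, stdev


def population_errors(displacements_by_patient):
    """Population statistics for one axis from per-patient lists of displacements (mm)."""
    patient_means = [mean(d) for d in displacements_by_patient.values()]
    patient_sds = [stdev(d) for d in displacements_by_patient.values()]  # sample SD (n - 1)
    return {
        "population_mean": mean(patient_means),
        "mean_absolute": mean(abs(m) for m in patient_means),
        "systematic_Sigma": stdev(patient_means),
        # Assumption: random error taken as the mean of the per-patient SDs.
        "random_sigma": mean(patient_sds),
    }


# Toy displacements between consecutive images for three hypothetical patients.
errors = population_errors({
    "p1": [1.0, 3.0],
    "p2": [-1.0, -3.0],
    "p3": [0.0, 0.0],
})
print(errors)
```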

To assess factors that may predispose patients to significant intrafraction prostate motion, the mean CTV speed for each patient was approximated, with a view to creating a standard variable dependent on patient features. If N denotes the total number of intervals summed across all fraction files belonging to patient p, and R the Euclidean distance in x, y and z travelled between consecutive images, then the mean patient CTV speed can be calculated as follows:

$$\begin{aligned} S_p=\frac{1}{N} \sum _{j=1}^{N} \frac{R_{p,j}}{\Delta t_{p,j}} \end{aligned}$$
(3)
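
As a worked illustration of Eq. 3, the sketch below averages the per-interval speeds over all fraction files of one patient. The input layout (a list of `(positions, intervals)` pairs per fraction file) and the toy values are assumptions made for the example.

```python
import math


def mean_ctv_speed(fraction_files):
    """Mean speed (mm/s) over every imaging interval in all of a patient's fraction files.

    fraction_files: list of (positions, intervals_s) pairs, where positions is a list of
    (x, y, z) tuples in mm and intervals_s[k] is the time between positions k and k + 1.
    """
    speeds = []
    for positions, intervals_s in fraction_files:
        for prev, curr, dt in zip(positions, positions[1:], intervals_s):
            speeds.append(math.dist(prev, curr) / dt)  # R / delta t for one interval
    return sum(speeds) / len(speeds)


# Toy patient with two fraction files and three intervals in total.
files = [
    ([(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 0.0)], [20, 45]),
    ([(0.0, 0.0, 0.0), (0.0, 0.0, 1.5)], [15]),
]
speed = mean_ctv_speed(files)
print(f"{speed:.4f} mm/s")
```

Here the three interval speeds are 5/20, 0/45 and 1.5/15 mm/s, so the mean is 0.35/3 mm/s.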

The feature importance and predictive power of patient age, weight and height for patient mean CTV speed were evaluated by using a cut-off speed to create a binary target variable and applying a logistic regression scored with stratified k-fold cross-validation. To investigate the potential impact of patient bladder, rectal and prostate volumes on patient mean fiducial speed, patients were ranked by speed (Eq. 3) and a two-tailed Mann-Whitney U test was performed to complete a paired analysis on subsets from the upper and lower quartiles. Twenty-seven patients from the lower quartile (Group A, lower speeds) and 30 from the upper quartile (Group B, greater speeds) were included in the paired analysis on bladder and rectum volumes. Twenty-one patients from the lower and 20 from the upper quartile were included in the paired analysis on prostate volumes. If x and y are random variables of Group A and Group B respectively, then the null hypothesis \((H_0)\) that the distribution of Group A is equal to the distribution of Group B was tested against the alternative hypothesis \((H_1)\) that y is stochastically different to x [20].
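
The two-tailed Mann-Whitney U comparison might be sketched with SciPy as below. The volumes are invented for illustration and are not study data; the group sizes are also smaller than those used in the analysis.

```python
from scipy.stats import mannwhitneyu

ALPHA = 0.05

# Hypothetical organ volumes (cm^3) for the lower-speed and higher-speed groups.
group_a_volumes = [40, 42, 45, 47, 50, 52, 55]
group_b_volumes = [41, 44, 46, 48, 51, 53, 58]

# Two-sided test of H0: both groups are drawn from the same distribution.
statistic, p_value = mannwhitneyu(group_a_volumes, group_b_volumes, alternative="two-sided")
reject_h0 = p_value < ALPHA
print(f"U = {statistic}, p = {p_value:.3f}, reject H0: {reject_h0}")
```

The test only uses the ranks of the pooled values, so it makes no assumption that the volumes are normally distributed.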

Results

A breakdown of patient and fraction information for the cohort is shown in Table 1. The dataset from this patient cohort included 1711 fraction delivery files for 1661 fraction deliveries, which included a grand total of 293,700 beam deliveries and 84,920 prostate vector positions acquired with onboard imaging. The sample size within the dataset for 15, 20, 30, 45, 60 and 75 s imaging intervals is shown in Fig. 1. These data include interval samples taken from all times during treatment, as imaging intervals may change over the course of fraction delivery.

Fig. 1

The distribution of 15, 20, 30, 45, 60 and 75 s imaging intervals in the dataset from the CyberKnife prostate patient cohort

The mean fiducial speed over the dataset, independent of any patient or fraction aggregation, was 0.0248 mm/s. The translational displacements between consecutive images had maximum values of 13.4 (S), 11.9 (I), 17.1 (R), 17.8 (L), 13.3 (P) and 10.9 (A) mm. The maximum Euclidean distance travelled between any two consecutive images was 20.4 mm over a 45 s imaging interval. The maximum changes in roll, pitch and yaw between recorded positions were 4.0°, 9.7° and 8.1° respectively. Displacements in the dataset occurring during all imaging intervals are shown in the Appendix (Fig. 5). The displacements that occurred during 15–20 s imaging intervals only and 45 s imaging intervals only are also shown in the Appendix (Figs. 6 and 7). The population errors calculated from these displacements are given in Table 2.

The Spearman’s rank correlation coefficients calculated on displacements between consecutive imaging showed strong and moderate correlations between planes (SI and AP: 0.74, SI and LR: 0.54, LR and AP: 0.57, Roll and AP: 0.53, Pitch and Roll: 0.63). Box plots of the simulated percent of time patients were likely to experience CTV displacements of greater than 1, 2, 3, 4, and 5 mm for different chosen imaging intervals are shown in Figs. 2 and 3. The mean prostate position for each minute of delivery across all patients and fraction delivery files up to 25 min (prostate drift), relative to the starting position, is shown in Fig. 4. The ages, weights and heights available for the cohort (Table 1) gave Spearman’s R values of 0.14, > 0.001 and −0.12 respectively. BMI was calculated for the 44 patients with available heights and weights, which gave a Spearman’s R value of −0.16. Figure 8 (Appendix) shows the random scatter of data points, demonstrating the lack of a direct linear correlation between the variables measured and speed. With a speed tolerance of 1 mm per 45 s, the binary target variable of \(0: \le 0.022\frac{mm}{s}\) and \(1: >0.022\frac{mm}{s}\) was created for the logistic regression. Using this target variable, the best performing logistic regression model gave a 5-fold mean (SD) area under the curve (AUC) of the receiver operating characteristic (ROC) curve of 64.0% (14%) and included Age, Height and BMI with respective coefficients 0.053, −0.073 and −0.098, as shown in the Appendix, Table 3.
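
The construction of the binary target and the cross-validated scoring can be sketched with scikit-learn as follows. This is a sketch under stated assumptions: the features and speeds are synthetic and unrelated to each other, so the printed AUC carries no clinical meaning, and the solver settings, shuffling and random seeds are choices made for the example rather than details reported in the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

SPEED_CUTOFF = 1.0 / 45.0  # 1 mm per 45 s, approximately 0.022 mm/s

# Synthetic stand-ins for patient features and mean CTV speed (not study data).
rng = np.random.default_rng(0)
n = 60
age = rng.normal(70, 7, n)
height = rng.normal(175, 7, n)
bmi = rng.normal(27, 4, n)
speed = rng.permutation(np.linspace(0.005, 0.045, n))

X = np.column_stack([age, height, bmi])
y = (speed > SPEED_CUTOFF).astype(int)  # 0: at or below cut-off, 1: above cut-off

model = LogisticRegression(max_iter=1000)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
print(f"5-fold AUC-ROC: mean {auc.mean():.2f}, SD {auc.std():.2f}")
```

Stratification keeps the proportion of patients above and below the cut-off similar in every fold, which matters when one class is much smaller than the other.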

Fig. 2

Cohort simulated percent of treatment time within the given translational thresholds for SI (a), AP (b) and LR (c) for imaging intervals of 15–120 s and 1–5 mm thresholds

Fig. 3

Cohort simulated percent of treatment time within the given rotational thresholds for Roll (a), Pitch (b) and Yaw (c) for imaging intervals of 15–120 s and 1–5° thresholds

Fig. 4

Prostate mean (solid line) and SD (dashed lines) up to 25 min into delivery, for translations (a) and rotations (b)

Table 1 Patient and fraction data summary
Table 2 Population errors for displacements between consecutive imaging

Figures 9 and 10 (Appendix) show the distribution and spread of the volumes for each of the two groups extracted for the paired analysis. Table 4 (Appendix) shows the results of the Mann-Whitney U test, investigating the potential stochastic differences in the rectal, bladder and prostate volumes between the two groups. At \(\alpha\)=0.05, the null hypothesis was not rejected for any of the three volumes.

Discussion

The analysis of potential clinical factors that may predispose patients to more significant intrafraction motion is largely inconclusive. The AUC-ROC (Appendix, Table 3) was highest when Age, Height and BMI were included in the model; however, the model with BMI alone yielded an AUC-ROC only 2% lower. The models in this analysis may yield a higher AUC-ROC if weight and height were attainable for a larger sample of patients from the cohort. A more complex model may also yield a larger AUC-ROC value but has not been attempted in this analysis. At \(\alpha\)=0.05, p-values from the Mann-Whitney U test performed on the subsets of patients from the top and bottom quartiles of patients ranked by average fiducial speed did not indicate statistically significant differences in volumes for bladder, rectum, or prostate (Appendix, Table 4). However, the Mann-Whitney U test performed on prostate volumes between the two groups yielded the lowest p-value (0.06), followed by rectal volumes (0.19), and both may warrant further investigation with larger subsets of patients. No other significant treatment differences were noted between the two groups.

Independent of patient or fraction grouping, the Euclidean prostate displacements between consecutive images were less than 2, 3, 5, and 10 mm for 92.4%, 94.4%, 96.2%, and 97.7% of all consecutive imaging pairs respectively. Displacements were less than 2, 3, 5, and 10 mm for 95.9%, 97.3%, 98.6%, and 99.8% of the imaging pairs taken at 15–20 s intervals, for 94.3%, 96.1%, 98%, and 99.8% of the pairs taken at 30 s intervals, and for 92.9%, 95.5%, 97.8%, and 99.7% of the pairs taken at 45 s intervals respectively. The simulated percent of time within thresholds per patient shown in Figs. 2 and 3 considers the vector extremes as a ’worst case’ displacement for the imaging interval. When motion was aggregated per patient, the mean percent of time the patients’ CTV remained within 3 mm was 96.2% (1.2), 96.2% (1.2) and 96.3% (1.2) at 15 s imaging and 94.3% (2.9), 94.2% (3.1) and 94.4% (2.9) at 30 s imaging in the SI, LR and AP planes respectively. The mean percent of time the patients’ CTV remained within 5 mm was 95.7% (1.9), 95.1% (2.3) and 95.7% (1.9) at 45 s imaging, 95.1% (2.6), 94.3% (3.3) and 95.2% (2.6) at 60 s imaging and 93.5% (4.2), 92.5% (5.2) and 93.7% (4.2) at 90 s imaging in the SI, LR and AP planes respectively. For the simulated ’worst case’ rotations, the mean percent of treatment time with rotations less than \(3^\circ\) was 96.4% (1.2), 96.0% (1.4) and 96.7% (0.87) at 15 s imaging; the corresponding values were 95.2% (2.9), 93.8% (3.7) and 96.3% (1.5) for \(4^\circ\) and 96.2% (1.7), 95.5% (2.3) and 96.8% (0.99) for \(5^\circ\) at 45 s imaging.
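
The percentages of imaging pairs below each threshold are simple proportions, as the following sketch shows. The distances are invented for illustration, and the function name is hypothetical.

```python
def percent_below(distances_mm, thresholds_mm=(2, 3, 5, 10)):
    """Percent of consecutive-image Euclidean displacements below each threshold."""
    n = len(distances_mm)
    return {t: 100.0 * sum(d < t for d in distances_mm) / n for t in thresholds_mm}


# Toy Euclidean displacements (mm) between consecutive images.
distances = [0.2, 0.5, 1.1, 1.9, 2.5, 3.2, 4.8, 6.0, 9.9, 12.0]
coverage = percent_below(distances)
print(coverage)
```

Applying the same function separately to the pairs acquired at each imaging interval gives the per-interval breakdown described above.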

Our results do not closely agree with Curtis et al., where simulated imaging frequencies of every 15, 60 and 120 s were suitable for 1, 2 and 3 mm planning margins to achieve adequate geometric target coverage for 95% of the time [9]. However, our study has significantly longer treatment times (mean of 35 min) than Curtis et al. (mean of 7 min) due to the different treatment methods used in the analyses. Prior research has shown that random prostate motion can increase as treatment progresses [13]. Furthermore, Curtis et al. report that their analysis was completed under the assumption that a perfect correction occurred immediately for each interval simulated from the real-time data; as such, their analysis estimates an upper bound of the gain possible from frequent repositioning, whereas the methods used in the current study are likely to give more conservative results.

The observation of 30 s imaging throughout the dataset, independent of patient grouping, shows similar statistical behaviour to Xie et al., where a rolling average of motion over time was used to estimate that 95.6% of the data would be contained within a threshold of 2 mm with 30 s imaging [8]. However, Xie et al. also indicated that 95.8% of the data would be contained within 3 mm with 60 s imaging and 95.1% within 4 mm with 120 s imaging. Our simulated intervals indicate that shorter intervals are needed for 95% of the data per patient to fall within those thresholds. The difference in results for 60 and 120 s intervals could be due to a large difference in fiducial tracking threshold. The results in Xie et al. are relative to a 5 mm fiducial tracking threshold, whereas this study includes data tracked with a median 10 mm tracking threshold.

End-to-end (E2E) tests were conducted routinely in our clinic to determine the total positional error for a given tracking mode on the CyberKnife treatment unit. The mean (SD) result of the E2E test (averaged over 2014–2021) for the collimator and tracking method combination in this study was 0.37 mm (0.2). To keep patients comfortable during the longer treatment delivery durations associated with CyberKnife delivery, and since patients are carefully observed via a live video feed by the two treating radiation therapists, minimal immobilisation is provided for prostate cancer patients during treatment with CyberKnife at our clinic. As such, recorded intrafraction motion represents both internal and external displacements, which cannot be distinguished. Regular imaging and interventions resulted in most of the displacement data aggregating close to 0; however, the shoulders of the data distribution show some large but rare displacements, particularly above the 95th quantile (Appendix, Fig. 5). An example of one patient with very stable vectors but a single random large LR displacement is shown in Fig. 11 (Appendix), demonstrating how sudden, yet rare, these large displacements can be. We hypothesise that these sudden displacements occurring above the 95th quantile of the data could occur as a result of external patient motion. These are important factors to consider when looking at the population errors shown in Table 2 and are why a linear combination of \(\Sigma\) and \(\sigma\) has not been used in this study to calculate displacement margins.

There are a few limitations in this study. When calculating CTV velocity, it was necessary to assume that there was no acceleration between the acquisition of two consecutive vectors. In a clinical setting, there may be erratic or random movements of the prostate. Oscillations between imaging may result in an underestimation of the true motion. Conversely, sudden movements after a period of stability between imaging may result in an overestimation of the true motion. Given the dichotomous nature of this assumption, however, in addition to the large sample size, it is unlikely to produce a bias either for or against prostate motion in the model.

Displacements were considered under the assumption that there was no fiducial migration within the prostate between the initial planning CT and each fraction delivery, and deformation of the prostate during treatment could not be considered. Furthermore, it is worth noting that when the treating radiation therapist selects an imaging interval, CyberKnife will not always image at exactly the selected frequency and often the duration between consecutive images will be slightly longer. Finally, although patients will typically begin treatment with frequent imaging every 15 or 20 s, which is lengthened later in treatment to 45 s, this is subject to patient stability; the shorter intervals in the data may therefore be biased towards larger amounts of motion than a true characterisation of prostate motion over short time intervals would show. We believe the bias was somewhat mitigated in the simulated intervals, where all intervals were resampled to generate new hypothetical intervals. Any thresholds for the data and imaging frequencies in this study could likely be considered conservative results.

Conclusion

The clinical factors examined in this research were not found to be significant predictors of significant intrafraction motion; however, further investigation into the impact of rectal and prostate volumes as well as patient BMI may be warranted. This paper considered in detail both actual and simulated imaging interval selections with CyberKnife, and the number of shifts that would be contained within different motion thresholds, both over the whole dataset and per patient. The characterisation of translations and rotations likely to occur during various intervals may be useful in treatment planning when selecting imaging intervals and CTV-to-PTV margins for adequate geometric coverage. The impact of imaging on overall treatment time and the dosimetric implications of imaging and margin combinations should be considered when deciding which to use in practice.