Introduction

Modern radiation therapy techniques such as volumetric modulated arc therapy (VMAT) require efficient and accurate pre-treatment quality assurance (QA) verification prior to irradiation of each patient [1, 2]. Verification of VMAT treatment is particularly challenging because, during beam-on time, the gantry rotates, leaves and collimators can move and change speed, and dose rate can be varied [3]. The ideal dosimeter for VMAT treatment should be able to reproduce the entire 3D-planned dose and should be dose rate, energy and gantry angle independent [3,4,5,6,7,8,9].

Several dosimetry systems able to reconstruct 3D dose distributions from dose measured by planar geometry detector arrays [4,5,6,7,8,9] have recently become available. These devices are limited by non-optimal spatial resolution and by the size of the individual detectors in the array [10]. Moreover, the gamma criterion generally used in dosimetric comparison software is only weakly correlated with critical patient DVH errors and may not be sensitive enough to establish clinical acceptance levels [11].

High-resolution planar dosimeters can help find systematic linac commissioning errors for intensity modulated radiotherapy techniques and in QA [12]. However, there are errors that cannot be found with QA. Bojechko et al. [13] analyzed incidents reported over a 2.5-year period that were rated as potentially severe or critical. Of the photon incidents, only 6% could have been detected by QA, whereas 74% could have been detected by in vivo transit electronic portal imaging device (EPID) dose verification of the first fraction. Mans et al. [14] reported 17 serious errors found in 4337 in vivo EPID transit verifications; 9 of these would not have been detected with pre-treatment verification because of their origin.

EPID potentially represents a very efficient method for VMAT verifications due to very rapid set-up time, high-resolution and reproducibility [15]. Moreover, EPID dosimetry offers the possibility to reconstruct 3D distributions in a phantom as QA as well as directly in the patient during treatment [16,17,18,19].

Dosimetry Check™ (Math Resolution, LLC) (DC) is a software package that allows 3D dose reconstruction from transit and through-air EPID images. It is calibrated in fluence, through a deconvolution kernel that relates pixel values to Monitor Units (MU) normalized to the center of a 10 × 10 cm2 field [20, 21]. DC can be used for QA and for in vivo dosimetry.

While DC has been evaluated in other studies [22, 23], in this study we tested the suitability of VMAT transit dosimetry for in vivo applications. In the study by Narayanasamy et al. [22], only in-air IMRT dosimetry was performed; neither VMAT nor transit dosimetry was tested. In the study by Gimeno et al. [23], the suitability of DC for VMAT dosimetry was assessed in only one non-clinical plan. In this study, the system was validated by performing several basic tests, by analyzing DC 3D gamma pass rates for VMAT treatments and by measuring the sensitivity of DC to delivery inaccuracies, set-up errors and patient anatomical variations. The influence of the DC dose calculation algorithm was also assessed.

Materials and methods

DC v. 4.10 was used along with an Elekta Synergy® Linac (Elekta, Crawley, UK) equipped with the EPID system iViewGT. Source to imager distance was fixed at 160 cm. The treatment planning system (TPS) was Elekta Monaco® 5.0 with Monte Carlo dose calculation algorithm. Photon beams of nominal energies 6 MV (6 ×) and 10 MV (10 ×), which are clinically used for VMAT at our institute, were employed. EPID measurements were repeated three times. Mean values and deviations were reported.

DC is composed of two parts: a deconvolution kernel that converts EPID images to fluence, and a pencil-beam algorithm to calculate the dose. The pencil-beam algorithm and the deconvolution kernel were commissioned following vendor specifications [20, 21]. Images were acquired with the iViewGT v3.4 in integrated mode for the static field and in cine mode for VMAT. Each acquisition was normalized to a reference 10 × 10 cm2 field with fixed MU (100) irradiated in air and acquired in the corresponding mode (cine or integrated). The CT calibration curve (HU vs. relative electron density) was entered in the dose calculation module of DC. Calculation grid size was set to 3 mm for the TPS and for DC. We used a synthetic model of the treatment couch (iBEAM® evo Couchtop, Elekta) composed of carbon fiber (relative electron density = 0.7) and foam core (relative electron density = 0).

Phantoms

Two homogeneous phantoms were used to test the performance of DC: an RW3 slab phantom of 30 × 30 × 20 cm3 and the OCTAVIUS® II (PTW, Freiburg, Germany). The RW3 phantom was oriented perpendicular to the beam axis.

Basic performance test

Basic dosimetric functionalities of DC were tested by measuring short-term reproducibility, dose linearity, the influence of couch-EPID distance, as well as angular, field size, and dose rate dependence. Details are reported in the Online Resource.

VMAT plan verification

Twenty clinical VMAT treatment plans were verified. We randomly selected five prostate and five whole pelvic node plans for 10 ×, and five head and neck (H&N) and five lung plans for 6 ×. Measured 3D dose distributions were compared with the TPS by gamma index analysis. The PTW OCTAVIUS® 4D phantom with the Octavius® 1500 detector (OCT) was used as reference for the Gamma Agreement Index (GAI) [8, 9]. OCT was calibrated with a 10 × 10 cm2 field with 293 MU for 6 × and 236 MU for 10 ×, in order to have 2 Gy in the central ionization chamber. Transit EPID images were acquired with the full OCTAVIUS® II phantom, with no compensation cavity so that the phantom was homogeneous, placed at the linac isocenter. In order to evaluate DC performance in comparison with OCT, both at high doses relevant for the target and at low doses relevant for organs at risk, GAIs were computed at 3% local dose difference and 3 mm distance to agreement (L). VeriSoft 6.2 (PTW, Freiburg, Germany) was used for gamma computation; points with a dose lower than 10% of the maximum were not considered in the analysis. GAIs computed at 3% global dose difference (normalized to the maximum) were also reported (G). An ANOVA test was performed to establish the agreement between the DC and OCT GAI distributions; the Pearson coefficient was also computed to test the correlation of the two GAI distributions.
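The difference between the local (L) and global (G) criteria can be illustrated with a minimal 1D sketch of a gamma pass rate. This is not the VeriSoft implementation, which is not described in the text; the function name `gamma_pass_rate`, its parameters, the brute-force search and the choice of which distribution is treated as reference are all assumptions made for illustration.

```python
import math


def gamma_pass_rate(ref_pos, ref_dose, eval_pos, eval_dose,
                    dose_tol=0.03, dta_mm=3.0, local=True, cutoff=0.10):
    """Percentage of reference points with gamma <= 1 (1D, brute force).

    Hypothetical helper: positions in mm, doses in Gy.
    local=True  -> dose tolerance is a fraction of the local reference dose.
    local=False -> dose tolerance is a fraction of the maximum reference dose.
    Points below `cutoff` times the maximum reference dose are skipped.
    """
    d_max = max(ref_dose)
    passed = 0
    counted = 0
    for x_r, d_r in zip(ref_pos, ref_dose):
        if d_r < cutoff * d_max:
            continue  # low-dose points are excluded from the analysis
        norm = d_r if local else d_max
        # Smallest combined distance/dose deviation over all evaluated points
        gamma_sq = min(
            ((x_e - x_r) / dta_mm) ** 2
            + ((d_e - d_r) / (dose_tol * norm)) ** 2
            for x_e, d_e in zip(eval_pos, eval_dose)
        )
        counted += 1
        if math.sqrt(gamma_sq) <= 1.0:
            passed += 1
    return 100.0 * passed / counted if counted else float("nan")


# Illustrative values only: a 0.03 Gy deviation in a 0.5 Gy region fails the
# local 3% criterion but passes the global one (maximum dose 2 Gy).
positions = [0.0, 10.0, 20.0, 30.0]
reference = [2.0, 2.0, 0.5, 0.5]
evaluated = [2.0, 2.0, 0.53, 0.53]
print(gamma_pass_rate(positions, reference, positions, evaluated, local=True))
print(gamma_pass_rate(positions, reference, positions, evaluated, local=False))
```

In this toy profile the same absolute deviation is judged against 3% of 0.5 Gy under the local criterion and against 3% of 2 Gy under the global one, which is why the local criterion is the stricter of the two in low-dose regions.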

Sensitivity analysis

Several errors were introduced in order to measure the sensitivity of DC. We simulated errors related to delivery inaccuracy, incorrect set-up, and anatomical variations (see Table 1). Variations of planned MU and of collimator and gantry angle were simulated by changing the RT Plan DICOM files. Lateral displacements of the isocenter and couch rotations were simulated by moving the treatment couch. Anatomical variations were simulated by adding a 1 cm bolus to the upper part of the phantom (1 cm Mould), and by removing 2 cm of material in the central part of the phantom (2 cm hole). We applied the errors to 4 VMAT plans of different anatomical sites and beam energies (Prostate 10 ×, Whole pelvis 10 ×, H&N 6 ×, Lung 6 ×), and GAIs were evaluated for the plans with errors. The threshold for successful error detection was established by using the concept of “confidence limit,” as suggested by AAPM Task Group 119 [24]. The confidence limit (CL) is defined from the mean and standard deviation of the GAI (\(\overline{{\text{GAI}}}\) and \({\sigma _{{\text{GAI}}}}\)) for VMAT QA of each anatomical site, according to the formula:

$${\text{CL}}=(100 - \overline{{\text{GAI}}})+1.96 \times {\sigma _{{\text{GAI}}}} \tag{1}$$

Table 1 Magnitude and type of errors introduced

Therefore, the detection threshold (DT) corresponds to the difference between the expected value 100% and the confidence limit:

$${\text{DT}}=100 - {\text{CL}} \tag{2}$$

With this definition, fewer than 5% of plans are expected to show a GAI deviating from 100% by more than the CL.
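Substituting Eq. 1 into Eq. 2 shows that the detection threshold is simply the mean GAI minus 1.96 standard deviations. A minimal sketch of the calculation follows; the function names and the GAI values are hypothetical (they are not taken from Table 3), and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def confidence_limit(gai_values):
    """Eq. 1: CL = (100 - mean GAI) + 1.96 * SD of GAI.

    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    return (100.0 - mean(gai_values)) + 1.96 * stdev(gai_values)


def detection_threshold(gai_values):
    """Eq. 2: DT = 100 - CL, equivalent to mean GAI - 1.96 * SD."""
    return 100.0 - confidence_limit(gai_values)


def is_detected(gai_with_error, threshold):
    """An introduced error counts as detected when the GAI falls below DT."""
    return gai_with_error < threshold


# Hypothetical GAIs (%) of error-free plans for one anatomical site
baseline_gai = [96.0, 94.0, 95.0, 93.0, 97.0]
cl = confidence_limit(baseline_gai)
dt = detection_threshold(baseline_gai)
print(f"CL = {cl:.1f}%, DT = {dt:.1f}%")
print(is_detected(90.0, dt), is_detected(93.0, dt))
```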

Evaluation of dose computation algorithm in the presence of density inhomogeneity

The influence of the dose computation algorithm was assessed, for different anatomical sites, by comparing GAIs obtained by DC when the dose was computed in the homogeneous Octavius phantom and in the planning CT. In order to use the same fluence as input for the two dose computations, EPID irradiation for this test was in-air. An ANOVA test was performed to establish agreement between the two distributions. DVH differences between DC and TPS were evaluated in the PTVs. Average and standard deviation (SD) of ΔPTVmin, ΔPTVmax and ΔPTVmean for each anatomical site were reported. In addition, for lung plans, following an in aqua vivo strategy [25], DVH differences were also evaluated by comparing the TPS dose with the DC dose obtained by fixing the relative electron density of the external patient contour to 1 during the dose computation phase.
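The PTV dose differences described above can be sketched as follows. The helper names are hypothetical, the PTV is represented simply as a list of voxel doses in Gy, and the sign convention (DC minus TPS) is an assumption, as the text does not specify it.

```python
from statistics import mean, stdev


def ptv_metrics(doses_gy):
    """Minimum, maximum and mean dose (Gy) over the voxels of a PTV."""
    return {"min": min(doses_gy), "max": max(doses_gy), "mean": mean(doses_gy)}


def ptv_differences(dc_doses_gy, tps_doses_gy):
    """Delta PTVmin, PTVmax and PTVmean, assumed here to be DC minus TPS."""
    dc = ptv_metrics(dc_doses_gy)
    tps = ptv_metrics(tps_doses_gy)
    return {key: dc[key] - tps[key] for key in dc}


def summarize(per_plan_differences):
    """Average and sample SD of each difference over the plans of one site."""
    summary = {}
    for key in ("min", "max", "mean"):
        values = [diff[key] for diff in per_plan_differences]
        summary[key] = (mean(values), stdev(values))
    return summary


# Illustrative voxel doses (Gy) for two hypothetical plans of the same site
plan_a = ptv_differences([1.0, 2.0, 3.0], [1.5, 2.0, 2.5])
plan_b = ptv_differences([1.5, 2.0, 3.5], [1.5, 2.0, 2.5])
print(summarize([plan_a, plan_b]))
```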

Results

Basic performance test

Results of the basic performance tests are shown in the Online Resource.

VMAT plan verification

The distributions of GAIs for the 20 VMAT plans measured by OCT and DC (Table 2) were correlated (p < 0.001, r = 0.74). The ANOVA test showed that the two GAI distributions were not statistically different (p = 0.88). Average GAI of OCT (L) was 95.6% (SD 2.5%, min 90.3%, max 98.8%). Average GAI of DC (L) was 94.2% (SD 2.5%, min 87.9%, max 99.4%). Average GAI of OCT (G) was 99.0% (SD 1.1%, min 95.6%, max 99.9%). Average GAI of DC (G) was 97.8% (SD 2.5%, min 92.5%, max 100%). Comparison of lateral profiles and maps of failing points measured by DC and OCT are shown in Fig. 1.

Table 2 Comparison of GAIs in 20 VMAT plans
Fig. 1 Comparison of lateral profiles and maps of gamma > 1 points (in red and blue) of an H&N treatment measured by DC in homogeneous phantom (a), by OCT (b) and by DC in planning CT (c). Comparison of TPS and DC DVH is shown in d (solid lines are DC DVH, dotted lines are TPS DVH)

Sensitivity analysis

Confidence limits and detection thresholds for each anatomical site are shown in Table 3. GAIs of plans with errors, normalized to the threshold, are shown in Fig. 2. Every simulated error decreased the GAI. However, not all GAIs of dose distributions with induced errors fell below the detection threshold. The 1 mm couch shift error was never detected, and the 5 mm couch shift error was detected in just two plans. The 1° couch rotation error was never detected; the 2° and 5° couch rotation errors were detected only in the prostate plan. The 1° and 2° collimator rotation errors were detected in all plans except Whole Pelvis. All other errors were detected in all plans. In total, 38 of 56 errors were detected.

Table 3 Mean GAI and standard deviation (SD) of VMAT plans measured by DC are shown
Fig. 2 GAI of VMAT plans with the errors, normalized to the confidence limit. Unity represents the threshold for detection of each error introduced

Evaluation of dose computation algorithm in the presence of density inhomogeneity

GAIs of plans computed in the homogeneous phantom and in the planning CT are compared in Table 4. As expected, doses computed in lung by the DC pencil-beam algorithm were not comparable with those produced by the Monaco Monte Carlo algorithm. In all other anatomical sites, the GAIs of plans computed in the homogeneous phantom and in the planning CT were comparable.

Table 4 Comparison of GAIs of DC dose computed in homogeneous phantom and in the planning CT

The ANOVA test showed that the two GAI distributions, including all anatomical sites, were statistically different (p = 0.02), but, excluding lung plans, the GAIs were not statistically different (p = 0.14). The inaccuracy of the DC dose computation algorithm in lung is also visible in the DVH results (Table 5). However, the in aqua vivo strategy used (see results in brackets in Table 5) was able to reduce dose differences in lung plans.

Table 5 Dose differences (in Gy) in the PTV are reported in each anatomical site

Discussion

In this paper, we assessed the suitability of DC for VMAT transit dosimetry. The performance tests carried out in this study, reported in the Online Resource, showed that basic system behavior is accurate.

For VMAT plans, the gamma evaluation method shows good agreement between calculated and measured 3D dose distributions. In homogeneous phantoms, GAIs of DC dose distributions are comparable to those of OCT. Compared to other detectors, OCT with the 729 2D array showed lower sensitivity for VMAT prostate plans [10]. However, when more anatomical sites were considered [26], OCT sensitivity was similar to that of other systems. The 1500 detector, used in this work, showed better sensitivity in error detection than the 729 array [27].

It is important to mention that the GAI criterion applied in our analysis considered a local dose difference of 3%. This criterion is more stringent than the Van Dyk percentage difference [28] of 3% global dose considered in many publications and in the AAPM TG 119 report [24]. The local 3% GAI criterion shows a better correlation with DVH changes than the global 3% GAI [29], which has been shown not to be sensitive enough to identify relevant errors [30].

The sensitivity of the gamma-index method in detecting delivery errors has been widely analyzed in the literature, where various types of delivery inaccuracies have been introduced [29, 30]. In this study, we introduced the errors that were reported in other in vivo EPID dosimetry studies [13, 31].

While DC was able to detect all simulated irradiation inaccuracies and anatomical variations, it was not very sensitive to set-up errors. The poor sensitivity of DC in detecting set-up errors could be due to the use of a priori information from the planning CT for dose calculation, which is, by definition, in the correct set-up. However, when set-up errors are corrected by IGRT, residual intra-fraction displacements and rotations are expected to be minor during fast VMAT treatment [32, 33].

Recently, Kearney et al. [34] found that the Gaussian formalism tends to overestimate the CL and that the gamma distribution better represents the gamma failing rate. From [34] we can deduce that the CL computed in our work could be overestimated and consequently the sensitivity in error detection should decrease.

The GAI quantifies how closely two dose distributions agree, but it does not contain spatial information regarding the position of dose discrepancies relative to patient anatomy. The interpretation of delivery errors within the patient’s anatomy by means of dose-volume-based analysis is useful to detect dose differences in critical anatomical regions-of-interest and to establish, for each patient, whether dose errors are within clinically acceptable tolerance [29, 30]. 3D anatomy-based dose verification software in conjunction with detector array measurements only allows pre-treatment dose reconstruction on the patient’s CT images [29, 35]. Dosimetric control of the actual therapy fraction has been proposed through machine log file-derived methods [36], EPID-measured transit fluence [37] and transmission detector measurements [38]. In vivo dosimetry is the only tool able to verify the actual patient treatment, particularly with regard to patient anatomy and possible obstructions from positioning or immobilization devices. The limitation of a system like DC is that results are available only when the fraction is over, so errors arising during the first RT fractions cannot be corrected. In this work, DC was able to detect many different delivery errors and anatomical variations. The comparison of GAIs in homogeneous phantom and in planning CT reported in Tables 4 and 5 showed that the pencil-beam algorithm of DC produced good results in all anatomical sites except for the lung. Because the pencil-beam algorithm fails to correctly reproduce the dose for lung treatments, an in aqua vivo strategy [25] can be used (see Table 4). Another strategy to eliminate algorithm dependence would be to compare the in vivo dose with the QA dose obtained from in-air EPID acquisitions.

In our department, DC is now routinely used for in vivo verifications of stereotactic (all fractions) and H&N treatments (weekly). The clinical workflow for DC is the following: after plan approval, the CT, RT, structure and dose are entered into the DC database. Prior to in vivo acquisitions, EPID images are acquired in air and the pre-treatment DC dose is reconstructed. Lung plans are recomputed in water, by assigning relative electron density 1 to the external patient contour. The 3D dose is reconstructed just after dose delivery and results are evaluated using gamma agreement and DVH. Global tolerance levels for in vivo dosimetry have not yet been established.

Conclusion

Basic dosimetric performance of DC was accurate. Tests performed in this study showed that DC, using transit EPID images, produced agreement with TPS comparable to that of OCT. The dose computation algorithm of DC is accurate in all anatomical sites except lung. However, in lung cases, the in aqua vivo approach used in this work reduced the algorithm dependence of DC results. DC is able to detect errors due to delivery inaccuracy and anatomical variations.