Introduction

The literature on uncertainties in risk estimation from epidemiology is rich and diverse. However, relatively little attention has been paid to the application of stochastic techniques to estimate and propagate uncertainties. Shlyakhter and Wilson (1995), Shlyakhter et al. (1996) and Simon et al. (2015) described early applications of Monte Carlo methods to quantify uncertainties in risk estimation from epidemiology, for example by showing how differential misclassification of exposure status in case-control studies increases the probability of obtaining a statistically significant false-positive result.

Epidemiological methods are applied in radiation therapy to obtain the dose–response relationship for radiogenic second cancer induction. The treatment of a first primary cancer with ionizing radiation can cause the induction of a second primary cancer, an unwanted, detrimental side-effect of the radiation exposure. Research on secondary cancer risk is becoming more important as radiotherapy cure rates for primary cancers increase. In an individual radiotherapy patient, it is generally impossible to establish with certainty the cause of a second cancer, e.g., exposure to radiation versus other carcinogens, or spontaneous occurrence; indeed, only a few biomarkers for radiation-induced cancers are known (e.g., Behjati et al. 2016; Zitzelsberger et al. 2010). Consequently, epidemiological studies attempt to quantify the probability (risk) of second cancer incidence with exposure to ionizing radiation, along with the corresponding uncertainties in that risk. Typically such studies rely on statistical approaches and large cohort sizes are needed to obtain sufficient statistical power and significance in the findings. Even larger cohorts are necessary when the risk is stratified by absorbed radiation dose at the site of second-tumour occurrence, which is necessary to obtain the dose–response relationship. Yet with large cohort sizes, epidemiological studies must take into account uncertainties that arise from all variables in the study, such as the radiation dose delivered to normal tissues.

It is generally accepted that exposure assessment in epidemiology studies of second cancer is difficult for several physical and anatomical reasons. Most radiotherapy patients receive dose distributions that tightly conform to the primary tumour. Although surrounding normal tissues receive a smaller dose than the tumour, this dose is generally not negligible. Figure 1 shows as an example one transversal slice through the patient’s anatomy and the corresponding dose distribution in the breast for a Hodgkin’s disease radiotherapy treatment plan. The figure clearly illustrates the non-uniformity of the dose to the breast. The quantification of the correlation between radiation dose and second cancer risk is further complicated by the uncertainties involved in relating a secondary tumour, which appears years or even decades after the treatment of the primary disease, to the actual dose distribution at the secondary tumour site; anatomy typically changes substantially during latency periods. Another problem of exposure assessment is that the dose outside the originally treated volume cannot be predicted reliably, if at all, by most clinically used treatment planning systems (Howell et al. 2010; Huang et al. 2013; Schneider et al. 2014; Newhauser et al. 2017). Even with the best available dose reconstruction methods, the dosimetric uncertainty is of the order of 40% (Schneider et al. 2014).

Fig. 1

Dose distribution for a typical treatment of Hodgkin’s disease (mantle field). The dose was re-calculated in an Alderson Rando phantom. The light, medium and dark gray lines show the 20%, 50% and 85% iso-dose (of the prescribed dose). The left breast and the lungs are segmented in white

Exposure assessment is particularly problematic in regions of large dose gradients, as illustrated in Fig. 1. This is important because most second primary cancers appear near the original radiation treatment field (Dörr and Herrmann 2002; Travis et al. 2003; Diallo et al. 2009). In regions of large dose gradients, it is difficult to estimate the dose at the precise anatomical point of origin of the secondary tumour because the size of a second tumour at diagnosis typically exceeds a few centimetres. Other dosimetric uncertainties include anatomical changes over time (including patient movement during treatment as well as changes from aging), the impact of fractionation on the dose distribution, and approximations in dose reconstructions. Thus, it is generally accepted that organ-specific dose–response relationships are subject to large uncertainties associated with the estimated dose to the second tumour. Despite this, stochastic methods have not yet been applied to quantify these uncertainties and the biases they ultimately introduce into dose–response models derived from epidemiologic studies.

The purpose of this paper is to apply Monte Carlo methods to quantify any consequences of dosimetric uncertainties at the secondary tumour site. In particular, the aim is to determine the impact of the size of the second malignancy at time of diagnosis, on the dosimetric uncertainties and on the resulting shapes of the predicted-dose-response relationships for radiation-induced cancer. To achieve this aim, Monte Carlo methods have been applied, assuming the broad characteristics of one particular epidemiological case-control study on second cancers, to simulate many realizations for the positions and sizes of the tumours reported. The predicted doses in this illustrative study were reconstructed from one treatment plan and the realizations for the 10,000 sets of tumour positions were simulated using published theoretical mechanistic models for radiation-induced cancers.

Materials and methods

This study involves simulating some aspects of a published epidemiological study of Travis et al. (2003) on secondary breast cancer after radiotherapy of young women with Hodgkin’s disease. In essence, a simplified, illustrative epidemiological study was performed using in silico methods to explore the potential impact of the size and location of the secondary tumour on the study results. The dosimetric uncertainties from inter-patient variations were ignored in this study by using the three-dimensional dose distribution of one typical treatment. Although the doses to the breast volume are always fixed by the treatment plan, it is possible to simulate the locations and sizes of the breast tumours and, based on these simulated locations and sizes, the doses to the tumours can also be simulated. Second tumour size and location were varied randomly using a standard Monte Carlo sampling technique. Radiation exposures to the healthy breast were calculated with a commercial, clinically commissioned treatment planning system.

The probability distribution of second cancer induction was calculated from established linear- and non-linear mechanistic models (in the following called theoretical dose–response relationship) and applied to simulate dose, and thus location. The results of the simulations were analysed like an epidemiological study providing as a result the dose–response relationship for radiation-induced cancer (in the following called predicted dose–response relationship). In a perfect epidemiological study the predicted dose–response relationship would represent the theoretical one. Here, it is studied how the predicted dose–response relationship can be directly influenced by the tumour size variations. Figure 2 summarizes the methods, including the different steps of the modelling procedure.

Fig. 2

Sketch illustrating the methods used, including the modelling procedure. All dose values in the figure, including the dose in the plot of the probability distribution, represent the dose at tumour origin

Epidemiological study (Travis et al. 2003)

Characteristics of a matched case-control study conducted by Travis et al. (2003) were used as a basis for the Monte-Carlo analyses done here. The Travis et al. study analysed a population-based cohort of 3,817 women who were treated for Hodgkin’s disease between 1965 and 1994. The mean and median age at diagnosis were both 22 years. Point dose reconstruction for the breast cancer was possible for 102 cases and 257 controls. Patients with breast cancer were grouped into seven dose categories (Table 1) such that the number of controls was constant. Travis et al. (2003) determined the relative risk from controls and cases, and the error and confidence levels using maximum likelihood estimates. The number of cases, controls and the excess relative risk (ERR) are listed in Table 1. Travis et al. found no evidence of nonlinearity in the predicted-dose-response for second cancer as observed when comparing the linear model with a categorical model or with a linear-quadratic model.

Table 1 Point dose estimates and (excess) relative risk for breast cancer after radiotherapy of Hodgkin’s disease from Travis et al. (2003)

Treatment planning and dose distribution

Radiotherapy to treat Hodgkin’s disease was highly successful in the past and, therefore, the treatment techniques were not modified between 1965 and 1994. This can be verified, for example, by a comparison of the treatment planning techniques used from 1960 to 1970 (Carmel and Kaplan 1976) with those used from 1980 until 1990 (Hoppe 1990). Additionally, the therapy protocols did not differ very much between the institutions that applied this form of treatment. These factors make it possible to reconstruct a typical three-dimensional theoretical-dose distribution that is representative of a large collective of Hodgkin’s disease patients. The largest source of dose to the female breast during conventional radiotherapy for Hodgkin’s disease is the large mantle field. Thus, the mantle field technique for Hodgkin’s disease was reconstructed in an Alderson Rando Phantom with a 200 mL breast attachment (Radiology Support Devices Inc., Long Beach, CA, USA). The clinical target volume included all of the major lymph node regions above the diaphragm as it would be defined for a classical mantle field irradiation. The clinical target volume extended from the inferior portion of the mandible (C2 vertebra) to the T10 vertebra. It included the whole mediastinum with the dorsal border at the middle of the vertebra including the para-aortic and v. azygos nodes and the axilla up to the sixth rib. The planning target volume was created by expanding the clinical target volume by a margin of between 1.5 and 2.0 cm. Additionally, the left breast was contoured.

Treatment planning was performed on the basis of the review by Hoppe (1990) and the German Hodgkin disease study protocols (http://www.ghsg.org). As per these standards, the prescribed total dose was DT = 40 Gy in dT = 2 Gy daily fractions, with planning target volume coverage by the 95% isodose line. The Eclipse External Beam Planning system version 13.6 (Varian Medical Systems Inc., Palo Alto, CA, USA) using the AAA-algorithm (version 13.6.23) was used for treatment planning. The plan was calculated with 6 MV photons and consisted of two opposed fields. The technique for shaping large fields included divergent lead blocks. Treatment was performed at a distance of 100 cm (SSD). Anterior-posterior (ap/pa) opposed field treatment techniques were applied to ensure dose homogeneity. Dose was calculated on a 0.25 × 0.25 × 0.25 cm³ grid, which had the same resolution as the planning CT. The voxels of the female breast in each dose bin with their corresponding co-ordinates were identified. The maximum dose in any one volume element in the left breast was 42.2 Gy.

Monte Carlo simulation of dose and second cancer locations

The locations and sizes of the breast tumours were simulated based on the dose distribution in the breast described above. To randomly obtain a predicted dose D at the simulated second cancer induction locations, two different theoretical-dose to risk-response relationships were assumed, a linear excess relative risk (ERR) model:

$${\text{ERR~}}\sim ~D,$$
(1)

and a non-linear ERR model taken from a mechanistic model of radiation-induced cancer (Schneider and Kaser-Hotz 2005; Schneider et al. 2005; Schneider 2009):

$${\text{ERR}}~\sim ~\frac{{{e^{ - {\alpha ^\prime } \cdot D}}}}{{{\alpha ^\prime } \cdot R}}\left( {1 - 2 \cdot R+{R^2} \cdot {e^{{\alpha ^\prime } \cdot D}} - {{\left( {1 - R} \right)}^2} \cdot {e^{ - \frac{{{\alpha ^\prime } \cdot R}}{{1 - R}}D}}} \right)$$
(2)

with

$${\alpha ^\prime }=~~\alpha \cdot \left( {1+\frac{{{d_{\text{T}}}}}{{\frac{\alpha }{{\beta ~}}{D_{\text{T}}}}}D} \right).$$
(3)

The nonlinear model was used with the fixed cell sterilization parameters α = 0.07 and α/β = 3 Gy (Schneider et al. 2011a, b) which agree well with independent findings from Qi et al. (2011). The repopulation parameter, R, was 0.62 for the generation of simulated second cancers (Schneider et al. 2011a, b).
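As an illustration, the two theoretical dose–response shapes can be sketched in a few lines of Python. This is a minimal sketch rather than the code used in the study: the function and argument names are hypothetical, the proportionality constants are left at 1, and the non-linear bracket is coded in the form that vanishes at zero dose (prefactor (1 − R)² and exponent −α′RD/(1 − R)), which is how the cited mechanistic model is read here; that reading should be treated as an assumption.

```python
import math

def alpha_prime(dose, alpha=0.07, alpha_beta=3.0, d_t=2.0, d_total=40.0):
    """Eq. 3: dose-dependent sterilization parameter at local dose `dose` (Gy)."""
    return alpha * (1.0 + d_t * dose / (alpha_beta * d_total))

def err_linear(dose):
    """Eq. 1, up to a proportionality constant."""
    return dose

def err_nonlinear(dose, repop=0.62, **kw):
    """Eq. 2, up to a proportionality constant (assumed form, see text)."""
    if dose == 0.0:
        return 0.0  # the bracket vanishes at zero dose
    a = alpha_prime(dose, **kw)
    bracket = (1.0 - 2.0 * repop
               + repop ** 2 * math.exp(a * dose)
               - (1.0 - repop) ** 2
               * math.exp(-a * repop * dose / (1.0 - repop)))
    return math.exp(-a * dose) / (a * repop) * bracket
```

In this form the non-linear curve follows the linear one at low dose and falls below it at high dose, where cell sterilization becomes important.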

The two theoretical-dose–response relationships were applied to 1 Gy dose intervals extending from 0 to 43 Gy, with the largest dose bin (42–43 Gy) covering the maximum dose in the breast. Next, the dose–response relationships were normalized to a value of 1 to form a differential probability distribution and finally integrated to yield a cumulative probability distribution. Thus the proportionality constants (linear slopes) of Eqs. 1 and 2 were defined by the normalization procedure. The cumulative probability distribution was used with uniformly generated random numbers to simulate the dose at tumour origin for 102 second cancers. As the cumulative probability distribution represents ERR, a simulation of the controls was not necessary.
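The sampling step described above is a standard inverse-transform draw. The following sketch shows one way it could be written; it assumes that the dose–response is evaluated at the midpoint of each 1 Gy bin (the text does not state the evaluation point), and the function names are hypothetical.

```python
import bisect
import random

def cumulative_distribution(response, n_bins=43):
    """Normalise a dose-response over 1 Gy bins and integrate it."""
    mids = [i + 0.5 for i in range(n_bins)]  # bin midpoints (assumption)
    weights = [response(d) for d in mids]
    total = sum(weights)
    cdf, running = [], 0.0
    for w in weights:
        running += w / total
        cdf.append(running)
    cdf[-1] = 1.0  # guard against rounding in the last bin
    return cdf

def sample_dose_bins(cdf, n_tumours=102, rng=None):
    """Draw one dose-bin index per simulated second cancer."""
    rng = rng or random.Random()
    last = len(cdf) - 1
    return [min(bisect.bisect_left(cdf, rng.random()), last)
            for _ in range(n_tumours)]

# Example with the linear model of Eq. 1 (ERR proportional to D).
cdf = cumulative_distribution(lambda d: d)
dose_bins = sample_dose_bins(cdf, rng=random.Random(1))
```

Each returned index i stands for the dose interval from i to i + 1 Gy.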

Next, the location of each of these 102 second cancers within the breast was simulated. To do this one breast voxel (co-ordinate) was randomly selected from all the breast voxels which were contained in the dose bin corresponding to the dose at second tumour origin of the previous step. Thus for each second tumour the predicted dose and location were simulated on a 1 Gy and 0.25 cm (voxel size) resolution, respectively.
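The voxel selection can be sketched as follows. The dose grid is represented here as a plain dictionary from voxel indices to dose in Gy, which is a hypothetical data layout chosen for illustration and not the format exported by the treatment planning system.

```python
import random
from collections import defaultdict

def index_voxels_by_bin(dose_grid):
    """Group breast voxels by 1 Gy dose bin.

    dose_grid: dict mapping (i, j, k) voxel indices to dose in Gy.
    """
    by_bin = defaultdict(list)
    for coord, dose in dose_grid.items():
        by_bin[int(dose)].append(coord)
    return by_bin

def pick_origin(by_bin, dose_bin, rng):
    """Pick one voxel at random among those in the sampled dose bin."""
    candidates = by_bin.get(dose_bin, [])
    if not candidates:
        raise ValueError("no breast voxel in dose bin %d" % dose_bin)
    return rng.choice(candidates)

# Toy grid with a dose gradient along the first axis.
grid = {(i, j, 0): 1.5 * i for i in range(8) for j in range(3)}
origin = pick_origin(index_voxels_by_bin(grid), 3, random.Random(0))
```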

In a next step the simulations were repeated for a predefined set of tumour sizes, assuming that the diagnosed second tumour was of finite size and spherical shape. Schwab et al. (2014) found that the median size of a diagnosed breast tumour varies with family history of the patient between 1.8 and 3.1 cm. Carter et al. (1989) obtained the distribution of breast cancer sizes in 24,740 patients and found that the diagnosed tumours were larger than 1 and 2 cm in 94.5% and 66.3% of the patients, respectively. Thus, tumour diameters d of 0.0, 0.8, 1.4, 2.0, 2.6, and 3.2 cm were applied for the simulations. The numerical values of sphere diameters originated from the need to use an integer number of 0.25 cm voxels to constitute a sphere. For each of the 102 simulated second tumours and at each of the six simulated second-tumour diameters, all voxel coordinates were determined which were enclosed in a sphere with diameter d. This sphere represented the finite-sized tumour with its origin in the originally simulated co-ordinates of the infinitely small tumour.
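A sketch of how the voxels enclosed by a tumour sphere might be enumerated is given below. It assumes that a voxel counts as enclosed when its centre lies within the sphere; the text does not specify the inclusion rule, so this criterion and the function name are assumptions.

```python
def sphere_offsets(diameter_cm, voxel_cm=0.25):
    """Index offsets of all voxels whose centre lies inside the sphere."""
    radius = diameter_cm / 2.0
    reach = int(radius / voxel_cm)  # largest offset along one axis
    offsets = []
    for i in range(-reach, reach + 1):
        for j in range(-reach, reach + 1):
            for k in range(-reach, reach + 1):
                dist_sq = (i * i + j * j + k * k) * voxel_cm ** 2
                if dist_sq <= radius ** 2 + 1e-12:
                    offsets.append((i, j, k))
    return offsets

# A diameter of 0.0 cm reduces to the single voxel at the tumour origin.
point_tumour = sphere_offsets(0.0)
small_tumour = sphere_offsets(0.8)
```

Adding these offsets to the simulated origin gives the voxel coordinates of the finite-sized tumour.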

The finite size of second tumours combined with the dose gradients of the treatment plan present in the breast means that each tumour volume could contain a range of dose values. Thus, Monte-Carlo methods were initially applied to determine the point dose which would be used in the simulated epidemiological study (predicted dose) by randomly selecting one voxel in the tumour sphere. Each of the 102 second tumours and six tumour sizes were simulated 10,000 times with co-ordinates based on the location determination described in the previous section, and dose selected here.
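The point-dose selection can be illustrated with the short sketch below. It assumes that sphere voxels falling outside the contoured breast (that is, missing from the dose grid) are simply skipped, which is a simplification introduced here; the dictionary layout and names are hypothetical.

```python
import random

def predicted_doses(origin, offsets, dose_grid, n_runs=10_000, rng=None):
    """Draw one predicted point dose per Monte Carlo run.

    origin:    (i, j, k) voxel index of the simulated tumour origin
    offsets:   index offsets of the voxels forming the tumour sphere
    dose_grid: dict mapping breast voxel indices to dose in Gy
    """
    rng = rng or random.Random()
    inside = []
    for di, dj, dk in offsets:
        coord = (origin[0] + di, origin[1] + dj, origin[2] + dk)
        if coord in dose_grid:  # skip voxels outside the breast (assumption)
            inside.append(dose_grid[coord])
    return [rng.choice(inside) for _ in range(n_runs)]

# Toy example: a 1 Gy per voxel gradient and a three-voxel "tumour".
grid = {(i, 0, 0): float(i) for i in range(5)}
doses = predicted_doses((2, 0, 0), [(-1, 0, 0), (0, 0, 0), (1, 0, 0)],
                        grid, n_runs=100, rng=random.Random(2))
```

In a steep gradient the drawn doses scatter around the dose at the tumour origin, and the scatter grows with the sphere diameter.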

Analysis of the simulated epidemiological study

The 102 simulated second tumours were grouped in seven predicted-dose categories as originally defined by Travis et al. (2003) and listed in Table 1. From these dose categories, the ERR was calculated for this simulated population with the first dose category defined as the reference, similarly to the procedure applied by Travis et al. (2003). The ERR averaged over all dose categories was scaled to the average ERR of the data from Travis et al. (2003) to be consistent with the epidemiologically obtained risk. This procedure also defined the linear slopes of the theoretical dose–response curves (Eqs. 1 and 2). The errors σi of ERR in each dose bin were estimated by scaling the errors of the epidemiologically obtained dose–response relationship.
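The grouping and scaling can be sketched as below. Two assumptions are made loudly here: the number of controls is taken as equal in every category, so that the relative risk reduces to a ratio of case counts, and the category edges in the example are hypothetical placeholders, not the boundaries of Table 1.

```python
import bisect

def err_by_category(doses, upper_edges):
    """ERR per dose category relative to the first (reference) category.

    Assumes equal numbers of controls in all categories, so that
    ERR_i = cases_i / cases_reference - 1.
    """
    counts = [0] * len(upper_edges)
    for d in doses:
        idx = min(bisect.bisect_left(upper_edges, d), len(upper_edges) - 1)
        counts[idx] += 1
    if counts[0] == 0:
        raise ValueError("reference category contains no cases")
    return [c / counts[0] - 1.0 for c in counts[1:]]

def scale_to_mean(err, target_mean):
    """Rescale ERR values so that their average equals target_mean."""
    factor = target_mean / (sum(err) / len(err))
    return [e * factor for e in err]

# Hypothetical upper category edges in Gy (NOT those of Table 1).
example_edges = [4.0, 7.0, 43.0]
example_err = err_by_category([1, 2, 5, 5, 5, 5, 30], example_edges)
```

With seven categories and one of them used as the reference, six independent ERR values remain for the fits.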

Finally, the model parameters of the linear and non-linear dose–response relationships of Eqs. 1 and 2, respectively, were fitted to the data from Travis et al. (2003) and to the Monte-Carlo simulated linear and non-linear dose–response relationships by minimizing the residuals:

$${\text{RSS}}=\mathop \sum \limits_{{i=1}}^{n} {\left( {{\text{ERR}}_{i}^{{{\text{predicted}}}} - {\text{ERR}}_{i}^{{{\text{Travis}}}}} \right)^2}/\sigma _{i}^{2},$$
(4)

where n is the sample size which is here the number of independent dose categories (n = 6). To quantify and compare the goodness of fit of the two models, for each of the 10,000 simulation runs, the Akaike criterion for small data sets was calculated:

$${\text{AIC}}=n \cdot {\text{ln}}\left( {\frac{{{\text{RSS}}}}{n}} \right)+2 \cdot k \cdot \frac{n}{{n - k - 1}},$$
(5)

where k is the number of free parameters (k = 1 for the linear and k = 2 for the nonlinear model) and RSS is the weighted residual sum of squares of the fitted model (Eq. 4). Given the two models for the data, the Akaike criterion estimates the quality of each model relative to the other (the smaller the value, the better the model). Hence, AIC provides a means for model selection. In doing so, it deals with the trade-off between the goodness of fit of the model and the complexity of the model.
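Equations 4 and 5 translate directly into code. The sketch below uses hypothetical function names and is meant only to make the size of the complexity penalty explicit.

```python
import math

def rss(predicted, observed, sigma):
    """Eq. 4: error-weighted residual sum of squares."""
    return sum(((p - o) / s) ** 2
               for p, o, s in zip(predicted, observed, sigma))

def aic_small_sample(rss_value, n, k):
    """Eq. 5: Akaike criterion with the small-sample penalty term."""
    return n * math.log(rss_value / n) + 2.0 * k * n / (n - k - 1)

# Penalty terms for n = 6 dose categories (the log term is zero for RSS = n).
penalty_linear = aic_small_sample(6.0, n=6, k=1)     # 2*1*6/4
penalty_nonlinear = aic_small_sample(6.0, n=6, k=2)  # 2*2*6/3
```

For n = 6 the penalty is 3 for the linear and 8 for the non-linear model, so the non-linear fit is preferred only if it lowers n·ln(RSS/n) by more than 5.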

The AIC was calculated for the linear and non-linear model for each of the 10,000 Monte Carlo simulations of 102 second tumours. The best quality model was stored for each of the 10,000 Monte Carlo samples. After 10,000 such analyses, one has the bootstrap frequency of selection for each of the two models. These are called model selection relative frequencies πi, the relative frequency that model i was found to fit the data best. This procedure was repeated for the six assumed tumour diameters.
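Tallying the selection frequencies amounts to counting, over all runs, which model has the smaller AIC. In the sketch below ties are assigned to the linear model, which is an arbitrary choice made here for illustration.

```python
def selection_frequencies(aic_linear, aic_nonlinear):
    """Return (pi_linear, pi_nonlinear) over paired Monte Carlo runs."""
    n_runs = len(aic_linear)
    wins_linear = sum(1 for a, b in zip(aic_linear, aic_nonlinear) if a <= b)
    pi_linear = wins_linear / n_runs
    return pi_linear, 1.0 - pi_linear

# Four toy runs: the linear model has the smaller AIC in three of them.
pi_lin, pi_nonlin = selection_frequencies([1, 5, 3, 2], [2, 4, 4, 3])
```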

The relative frequencies πi, were also determined for a breast tumour of typical size. The clinical observed distribution of breast tumour sizes is well known and was obtained by Carter et al. (1989). This tumour size distribution was used to weight the selection frequencies of the six simulated tumour diameters accordingly. The resulting distribution-averaged selection frequency is then representative for a typically-sized breast tumour.
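The distribution-averaged selection frequency is a weighted mean over the simulated diameters. The numbers in the sketch below are placeholders chosen only to show the arithmetic; they are neither the results of Table 2 nor the size distribution of Carter et al. (1989).

```python
def size_weighted_frequency(freq_by_diameter, weight_by_diameter):
    """Average selection frequencies with tumour-size weights."""
    total = sum(weight_by_diameter.values())
    return sum(freq_by_diameter[d] * w
               for d, w in weight_by_diameter.items()) / total

# Hypothetical frequencies and weights for two diameters (cm).
pi_typical = size_weighted_frequency({0.8: 0.2, 2.0: 0.6},
                                     {0.8: 1.0, 2.0: 3.0})
```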

Results

Figure 3 shows the original data of Travis et al. (2003) together with the linear and nonlinear fits to that data. The corresponding values of the Akaike information criterion for the original Travis data are 6.8 and 15.4 for the linear and non-linear models, respectively. As the Akaike information criterion is substantially smaller for the linear model, it fits the data better than the non-linear model with a probability of over 95%; this finding is consistent with that of Travis et al. (2003).
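One common way to turn an AIC difference into a relative probability is through Akaike weights. Whether this is how the figure of over 95% was obtained is not stated, so the sketch below is an assumption offered for orientation only.

```python
import math

def akaike_weights(aic_values):
    """Relative likelihood of each model, normalised to sum to 1."""
    best = min(aic_values)
    rel = [math.exp(-(a - best) / 2.0) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]

# AIC values quoted for the fits to the original Travis data.
w_linear, w_nonlinear = akaike_weights([6.8, 15.4])
```

With these two values the weight of the linear model is roughly 0.987, which is compatible with the statement of a probability above 95%.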

Fig. 3

ERR of breast cancer induction as a function of dose to tumour site from the epidemiological study of Travis et al. (2003) marked by the circles with one-standard-deviation error. The dose to tumour site was estimated by Travis et al. (2003) by a combination of calculations from treatment planning and measurements in a water phantom. The dotted and solid lines indicate the linear and non-linear fit of Eqs. 1 and 2 to the data, respectively

Table 2 lists the model selection frequencies π for the assumed theoretical linear dose–response relationship, and in Fig. 4 π is plotted as a function of the tumour diameter. As expected, the model selection frequency is always larger for the linear than for the non-linear model, whatever the tumour size considered. Consequently, the model selection frequency for a realistic breast tumour size distribution is also larger for the linear model than for the non-linear model (Table 2).

Table 2 Model selection frequencies for the linear and non-linear model, for the different tumour sphere diameters and for the size distribution of breast cancers obtained by Carter et al. (1989)
Fig. 4

Model selection frequencies π obtained from 10,000 Monte Carlo simulations as a function of the diagnosed tumour diameter for an underlying linear theoretical-dose-response relationship. The prediction results of the non-linear fit and the linear fit are shown as squares and diamonds, respectively

The model selection frequencies for 10,000 Monte Carlo simulations with the underlying non-linear dose–response relationship are also listed in Table 2 and are plotted in Fig. 5 as a function of the tumour diameter. As expected, the simulated epidemiological study predicts the non-linear model for small tumour diameters (< 1.5 cm), because the model selection frequency of the non-linear model is always larger than that of the linear model. However, as tumour size increases above 1.5 cm, the quality of fit of the non-linear model decreases while that for the linear model increases. If the model selection frequencies are weighted with the observed breast tumour size distribution (Carter et al. 1989) the resulting π is 0.39 for the non-linear model and 0.61 for the linear model. Thus, for a typical tumour size distribution of breast cancers, counterintuitively, a linear model shows a clearly larger model selection frequency than a non-linear model, although the data were produced with an underlying non-linear dose–response relationship.

Fig. 5

Model selection frequencies π obtained from 10,000 Monte Carlo simulations as a function of the diagnosed tumour diameter for an underlying non-linear theoretical-dose-response relationship. The prediction results of the linear fit and the non-linear fit are shown as squares and diamonds, respectively

Additionally, the dose variation relative to the average dose in the tumour spheres was analysed by averaging the 10,000 Monte Carlo runs. The difference of the maximum and the minimum dose relative to the average was 1.0%, 2.1%, 7.6%, 21.3% and 26.1% for tumour diameters of 0.8, 1.4, 2.0, 2.6, and 3.2 cm, respectively.

Discussion

In this work, Monte Carlo simulations of some aspects of a published epidemiological study of second breast cancer after radiotherapy of Hodgkin’s disease patients were performed. Specifically, the impact of the size of a second tumour on the apparent selection of linear versus non-linear dose–response models obtained from a simulated epidemiology study was assessed. The most important finding is that the widely used epidemiological method of assigning a point dose to an organ can, in some cases, lead to a biased, incorrect ranking of the quality of fit for candidate dose–response models, i.e., linear versus non-linear models. In the epidemiologic study simulated in this work, the model selection frequency for the linear model (0.61) was larger than for the non-linear model (0.39), although the underlying theoretical-dose–response relationship used to simulate second tumours was a non-linear relationship. This result, indicative of bias, is almost certainly a consequence of the widely used assumption that the exposure of an organ or tissue can be adequately represented by a single point dose or, stated another way, that one can neglect spatial variations in exposure within the organ of interest. It is therefore possible that the predominance of linear dose–response curves for second cancer incidence is an artefact or bias caused by the use of simplistic, indeed inadequate, dose reconstruction methods. The use of such methods may thus have profound implications for radiation protection and clinical radiotherapy.

Two potential strategies (Schneider and Walsh 2017) are available to avoid the process of dose reconstruction in epidemiological studies on second cancer induction: (1) the method of organ sub-division takes the inevitable inhomogeneous dose distribution into account by applying epidemiological methods to organ sub-divisions which have a nearly homogeneous dose; (2) the method of risk equivalent dose combines dose–response modelling and epidemiological data analysis. Risk models can be optimized using an iterative procedure assuming a variation of organ-specific dose–responses and the dose-volume histograms of the organs, instead of point dose estimates.

Many challenges and uncertainties arise when epidemiological studies on second cancer induction are modelled with Monte Carlo simulations. In this first simple approach, variations in the dose distribution between patients were not accounted for because only one standardized treatment plan was applied. In addition, the predicted-dose calculation was performed only with one treatment energy using a state-of-the-art dose calculation.

Future simulated epidemiological studies should consider the individual patient treatment plans. The uncertainty arising from retrospective dose reconstruction should also be included in future Monte Carlo approaches, in addition to the dose uncertainties originating from the finite size of a diagnosed tumour. However, for the purposes of this study, these are not serious limitations because the aim here was only to explore effects caused by variation in tumour size.

Conclusions

This study showed that dosimetric uncertainty due to tumour size and location was sufficient alone to obscure an underlying non-linear dose–response relationship, particularly for tumours with diameter larger than 1.5 cm. As most diagnosed tumours have a diameter larger than 1 cm, the findings of this simulated population are relevant to real patient cohorts. Furthermore, there are other sources of uncertainty not considered here which could further obfuscate the issue. Thus, the results of this study suggest that it will be very challenging for an epidemiological study on second primary tumours in radiotherapy patients treated with highly inhomogeneous dose distributions to accurately identify a non-linear dose–response (unless it takes into account spatial variations in radiation exposure).