Introduction

In 1948 the Soviet Union started to produce plutonium in the Mayak Production Association in the Southern Urals. Large amounts of radioactive waste were released into the Techa River from 1949 to 1956, with maximal releases in 1950 and 1951. Residents along the Techa River were exposed to significant doses of protracted external and internal ionizing radiation. Since the 1960s, demographic and medical information on the exposed population has been collected and dose reconstruction performed by staff of the Urals Research Center for Radiation Medicine (URCRM) in Chelyabinsk. Over the last decade, major improvements in the follow-up of the study population (Kossenko et al. 2002) and dosimetry (Degteva et al. 2000a, b) have been made. A review of the Techa River Cohort status can be found in ref. (Kossenko et al. 2005).

The Techa River Cohort provides important information on carcinogenic risks that resulted from protracted exposure in the low- and medium-dose range among an unselected population of both sexes and all ages, with a follow-up time of more than 50 years. Based on the Techa River Dosimetry System 2000 (TRDS-2000) dose estimates, radiation risk analyses of cancer mortality and incidence data provided strong evidence of long-term carcinogenic effects in the Techa River Cohort with a relatively large excess risk (Krestinina et al. 2005, 2007; Eidemüller et al. 2008).

The current analysis is performed by means of both the two-stage clonal expansion (TSCE) (Moolgavkar and Venzon 1979; Moolgavkar and Knudson 1981) and the empirical excess relative risk (ERR) model. The TSCE model assumes that the key processes necessary to convert a healthy cell to a cancer cell can be described by two major basic steps. In spite of this simplification, the model has already been applied successfully to various radioepidemiological data sets (Kai et al. 1997; Luebeck et al. 1999; Jacob et al. 2005; Meza et al. 2008). Since the TSCE and ERR models are based on very different descriptions of both the baseline hazard and the implementation of the radiation risk, a comparison of the risk estimates obtained from both approaches indicates which characteristics of the risk are inherent in the data and which depend on the choice of model.

Since models of carcinogenesis are based on biological transition rates, it may be possible to investigate whether consequences of biological mechanisms such as genomic instability, bystander effects or low dose hypersensitivity could be seen in epidemiological data (Little and Li 2007; Jacob et al. 2007, 2008). For example, it is an important question whether radiation-induced genomic instability plays any role in radiation-induced carcinogenesis in humans. Radiation-induced genomic instability is defined by the occurrence of damage in the progeny of cells that were themselves exposed to ionizing radiation without expressing any observable effects (Pampfer and Streffer 1989; Morgan 2003a, b). Observed damage in the daughter cells may include non-clonal chromosome aberrations, increased mutation rates and delayed cell death. In Eidemüller et al. (2008), it was found that a model including radiation-induced genomic instability described the radiation risk in the Techa River Cohort as well as a model with age-dependent radiosensitivity. In the present study, this line of research is continued.

Note that the current work is based on a longer follow-up than previous analyses (Krestinina et al. 2005, 2007; Eidemüller et al. 2008). The simultaneous analysis of cancer mortality and incidence in the Techa River Cohort allows us to explore whether a common radiation effect on processes of carcinogenesis and common traits of excess risk can be identified.

Materials and methods

Study cohort definitions and follow-up

This section summarizes the most important properties of the cancer mortality and incidence cohorts used in this work. For a detailed description of radiation conditions at the Techa River and demographic characteristics of the population exposed in the Techa riverside villages, the reader is referred to Kossenko et al. (2005) and Akleyev and Lyubchansky (1994).

The extended Techa River Cohort (ETRC) includes individuals born prior to January 1, 1950, who lived at least some time during the period 1950–1960 in one of the 41 radioactively contaminated villages along the Techa River, in the Chelyabinsk and Kurgan oblasts (regions).

Solid cancer mortality and incidence studies in the ETRC differ in terms of start of follow-up, cohort size and study catchment area. The solid cancer mortality (incidence) cohort used in the present report includes 29,771 (17,069) individuals.

The mortality follow-up begins on the later of January 1, 1950, and the date when a person first came to live at the Techa riverside. The solid cancer mortality catchment area includes the territory of the Chelyabinsk and Kurgan oblasts, where information on vital status and causes of death for exposed individuals was regularly collected. Cohort members who left the Chelyabinsk and Kurgan oblasts are treated as distant migrants. At the end of the current follow-up (i.e., December 31, 2003), 4,520 cohort members had left the cancer mortality catchment area; among 25,251 non-migrants, 15,884 subjects had died (with cause of death known for 90% of the individuals), 7,233 were alive and 2,134 were lost to follow-up.

Since the most comprehensive data on solid cancer incidence are available from 1956 and for the Techa River cohort members exposed within the Chelyabinsk oblast, the cancer incidence follow-up starts from January 1, 1956, or the date when a person first came to live at the Techa riverside area within the Chelyabinsk oblast. The solid cancer incidence catchment area comprises five raions of Chelyabinsk oblast and Chelyabinsk city where a substantial number of cohort members have moved (Krestinina et al. 2007). As of December 31, 2003, the cancer incidence follow-up status can be summarized as follows: 4,443 subjects were alive, 8,271 had died and 4,355 were lost to follow-up mainly due to migration outside the catchment area.

Women make up 58% (57%) of the solid cancer mortality (incidence) cohort subjects. Most of the cohort members are Slavs, while 20% (33%) have Tartar or Bashkir ethnicity. About 40% of the cohort members were younger than 20 years at the time of initial exposure and only 30% were older than 40 years.

Exposure conditions and dose estimates

Residents of the Techa River villages received external radiation exposure mainly from the contaminated river shore and flood-plain soils, and internal exposure from ingestion of radionuclides (137Cs, 90Sr, 89Sr and short-lived radionuclides) with drinking water and local foodstuffs. The dose estimates were computed by the URCRM dosimetry team using TRDS-2000 (Degteva et al. 2000a, b, 2008; Napier et al. 2001). The TRDS-2000 provides annual dose estimates for each individual in the cohort starting from January 1, 1950, or the date of arrival in the Techa River area until the end of follow-up (i.e., December 31, 2003, date of death, or date of migration from the catchment area). Dose estimates were computed taking into account age-dependent parameters of internal and external exposures, detailed information on residency in the contaminated area and the date of last known vital status. However, neither the precise location of individual residences within the investigated villages nor detailed lifestyle patterns were taken into consideration.

The solid cancer risk analysis is based on stomach dose. This choice was made because stomach dose is similar to absorbed doses in the lung and other soft tissues. In addition, stomach cancer is the most common cause of cancer death in this cohort. On average, about 75% of the dose to the stomach is due to external exposure while the remainder is due to ingestion of 137Cs. Stomach dose estimates range up to 0.48 Gy with a mean of 0.03 (0.04) Gy in the mortality (incidence) cohort. The cumulative stomach doses are essentially unchanged after 1960 since in particular the external exposures drop strongly after that time.

Study endpoints

In this study, we have analyzed solid cancer incident and death cases (ICD-9 codes 140-199) other than bone cancer (ICD-9 code 170). The doses received due to 90Sr incorporation show only a small correlation to those from 137Cs. Since strontium accumulates in the skeleton, 90Sr may lead to bone doses of several Gy with large uncertainties. In order to avoid this additional source of uncertainty and a potential bias from the 90Sr doses, bone cancers were excluded from the analyses. In total, 2,064 solid cancer deaths and 1,889 solid cancer incident cases with bone cancer excluded occurred between 1950 (1956) and 2003.

We have also performed risk analyses for all solid cancer sites excluding lung cancer, since it is known that lung cancer may show a different age dependence than other cancers and since lung cancer risk could be biased by smoking. In the present study, we have not included a site-specific analysis since the number of excess cases is small, so that it would be difficult to obtain definite site-specific risk estimates, and the statistical power is too low to make predictions about any changes of risk with age.

For the ETRC members, places and periods of residence during the study follow-up are known including an indicator whether the residence was inside or outside the catchment area. The cohort members were not considered to be at risk when they were known to reside outside the catchment area, or when their places of residence were unknown because they were not under active follow-up in such periods. The mortality (incidence) cohort has a total of 903,007 (452,010) person years at risk.

TSCE models for immediate radiation effects

The TSCE model is an effective model, i.e., it does not represent a particular biological pathway to cancer, but rather includes a multitude of possible cellular processes in its effective parameters that characterize the time scales of an initiation process, clonal growth, and transformation to cancer. In the TSCE model (Fig. 1), it is assumed that the complex process leading to cancer can be reduced to two basic steps. In the first step, called initiation, a healthy cell may experience several genetic or epigenetic events that will result in an intermediate cell. This process occurs with an effective initiation rate ν(a) where a is the person’s age. The intermediate cells divide with rate α(a) and differentiate or are inactivated at rate β(a). A primary intermediate cell together with its daughter cells forms a clone of intermediate cells. The process of clonal growth of intermediate cells is called promotion. In a second step, these intermediate cells can convert with the transformation rate μ(a) to malignant cells. Once a malignant cell is produced, it is assumed to lead to cancer, which either leads to death (mortality data) or is diagnosed (incidence data) after a given lag time t lag. We have tested different lag times, but the lag time was found not to have a major influence on the risk estimates. So for the mortality cohort t lag = 5 years was chosen, consistent with (Krestinina et al. 2005; Eidemüller et al. 2008). Since the time between the transformation of an initiated cell to a malignant cell and cancer incidence is shorter than the time until mortality, a time lag of t lag = 3 years was used for the incidence cohort. A slightly better description of the data and more consistent parameter values were obtained for this choice compared to equal lag times for mortality and incidence data.

Fig. 1 TSCE model

To describe the spontaneous cancer mortality or incidence risk (i.e., the baseline hazard), constant values of the parameters ν, α, β and μ over lifetime were used. However, since from epidemiological data only the hazard can be extracted—and not, e.g., the size and distribution of intermediate clones—only three parameters are relevant and one parameter can be chosen freely (Portier and Masri 1997; Hazelton et al. 2001). Therefore, we defined three new parameters X, γ and δ, which can be determined in a unique way from the data. Age dependencies, which are also related to radiation exposure, can be determined for all four parameters. Here, for the fourth parameter (the transformation rate μ(a)) only the ratio to an undetermined rate μ0 at birth is used by

$$ \begin{aligned} X(a)=\,& N_s\cdot \nu(a) \cdot \mu_0\\ \gamma(a) =\,& \alpha(a)-\beta(a)-\mu(a)\\ \delta(a) =\,& \alpha(a) \cdot \mu_0\\ m(a) =\,& {{\mu(a)}\over {\mu_0}} , \end{aligned} $$
(1)

where N s is the number of healthy stem cells. Whereas in former analyses with the TSCE model (e.g. in ref. (Eidemüller et al. 2008)) a parameter q instead of δ has been defined, we prefer to use δ in the present work since it allows a more direct relation of radiation effects to the underlying biological parameters, and biologically based models like TSCE models including genomic instability can be more directly transferred to these new parameters. X is proportional to the initiation rate, γ gives the rate of clonal expansion and δ is proportional to the division rate.
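The reduction from four biological rates to the parameters of Eq. (1) can be illustrated with a short sketch. All numerical rates below are hypothetical and chosen only for illustration; the function name `identifiable_parameters` is likewise an assumption. The example shows that two different sets of biological rates can map onto the same X, γ and δ, which is why one parameter can be chosen freely.

```python
def identifiable_parameters(n_s, nu, alpha, beta, mu, mu0):
    """Map biological rates to (X, gamma, delta, m) following Eq. (1)."""
    x = n_s * nu * mu0          # proportional to the initiation rate
    gamma = alpha - beta - mu   # rate of clonal expansion
    delta = alpha * mu0         # proportional to the division rate
    m = mu / mu0                # transformation rate relative to its value at birth
    return x, gamma, delta, m

# Hypothetical rates per year (illustrative values, not fitted ones).
set_a = identifiable_parameters(
    n_s=1e7, nu=1e-7, alpha=10.0, beta=9.9, mu=1e-6, mu0=1e-6
)

# A second, biologically different set: mu doubled, nu and alpha halved,
# and beta adjusted so that alpha - beta - mu stays the same.
c = 2.0
mu_b = c * 1e-6
alpha_b = 10.0 / c
beta_b = alpha_b - mu_b - set_a[1]
set_b = identifiable_parameters(
    n_s=1e7, nu=1e-7 / c, alpha=alpha_b, beta=beta_b, mu=mu_b, mu0=mu_b
)

for name, a, b in zip(("X", "gamma", "delta", "m"), set_a, set_b):
    print(f"{name}: {a:.6g} vs {b:.6g}")
```

Both sets yield the same identifiable parameters although the underlying division, inactivation and transformation rates differ.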

To find a good description of the baseline hazard, i.e. cancer incidence or mortality risk in the absence of radiation, the data were tested for possible baseline confounders. For this purpose, baseline rates were adjusted for potential confounders such as gender, birth year, ethnicity and oblast of initial exposure. Since the solid cancer baseline hazard for males and females differs significantly, a separate set of parameters X, γ and δ is used for each gender in the present analyses. For the mortality data, all baseline confounders were found to be significant and the baseline was parametrized by

$$ \begin{aligned} X_{{\rm base},p} =\,& X_{0,p}\cdot F_{{\rm byr},m}\cdot F_{{\rm byr},f} \cdot(1+f_{\rm tar}\theta_{\rm tar})\cdot (1+f_{\rm kur}\theta_{\rm kur})\\ \gamma_{{\rm base},p} =\,& \gamma_{0,p} \\ \delta_{{\rm base},p} =\,& \delta_{0,p}\cdot F_{{\rm byr},f} \\ m_{{\rm base},p} =\,& 1 \\ F_{{\rm byr},p} =\,& 1+f_{{\rm byr},p}\cdot(\hbox{birth year}-1925) , \end{aligned} $$
(2)

where p = m/f is an indicator for male or female, θkur is 0 for residents of the Chelyabinsk oblast and 1 for Kurgan oblast residents, and θtar is 0 for Slavs and 1 for Tartars or Bashkirs; f byr,m (f byr,f ) is 0 for female (male) persons.
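As a minimal sketch of how the mortality baseline of Eq. (2) could be evaluated for one person, the following function applies the birth year, ethnicity and oblast factors. The dictionary layout, the function names and all parameter values are assumptions made for illustration; they are not the fitted values of this study.

```python
def birth_year_factor(f_byr, birth_year):
    """F_byr = 1 + f_byr * (birth year - 1925), as in Eq. (2)."""
    return 1.0 + f_byr * (birth_year - 1925)

def mortality_baseline(sex, birth_year, theta_tar, theta_kur, params):
    """Return (X_base, gamma_base, delta_base, m_base) for sex 'm' or 'f'."""
    own = params[sex]
    # f_byr,m is 0 for females and f_byr,f is 0 for males.
    f_m = params["m"]["f_byr"] if sex == "m" else 0.0
    f_f = params["f"]["f_byr"] if sex == "f" else 0.0
    big_f_m = birth_year_factor(f_m, birth_year)
    big_f_f = birth_year_factor(f_f, birth_year)
    x_base = (own["X0"] * big_f_m * big_f_f
              * (1.0 + params["f_tar"] * theta_tar)
              * (1.0 + params["f_kur"] * theta_kur))
    gamma_base = own["gamma0"]
    delta_base = own["delta0"] * big_f_f
    return x_base, gamma_base, delta_base, 1.0

# Hypothetical parameter values: 3 per sex plus f_byr per sex, f_tar and f_kur.
PARAMS = {
    "m": {"X0": 1.0e-6, "gamma0": 0.10, "delta0": 1.0e-5, "f_byr": 0.010},
    "f": {"X0": 0.8e-6, "gamma0": 0.09, "delta0": 1.2e-5, "f_byr": -0.005},
    "f_tar": 0.2,
    "f_kur": -0.1,
}

print(mortality_baseline("f", 1935, theta_tar=0, theta_kur=0, params=PARAMS))
print(mortality_baseline("m", 1925, theta_tar=1, theta_kur=1, params=PARAMS))
```

For a person born in 1925 the birth year factors equal 1, so only the ethnicity and oblast indicators modify the initiation parameter.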

The birth year effect was found to differ between the genders: for males it affects the initiation rate, whereas for females it simultaneously changes the initiation rate and δ. As is shown in the Appendix, this corresponds almost exactly to a change in the transformation rate for persons with different birth years. In total there are 10 baseline parameters. Though in general the baseline rates of X base(a), γbase(a), δbase(a) and m base(a) could be age-dependent, in our model they were chosen to be independent of age. It is important to note that even for constant rates the model predicts an increasing hazard function with age: the number and size of intermediate clones grow with time and thus the probability of a malignant transformation will increase.
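The increase of the hazard with age for constant rates can be illustrated with the closed-form TSCE hazard for constant parameters and m = 1. The formula used below is the standard solution for this case and is not written out in the text; the lag time is ignored, and the values of X, γ and δ are hypothetical.

```python
import math

def tsce_hazard(age, x, gamma, delta):
    """Closed-form TSCE hazard for constant X, gamma, delta (with m = 1).

    h(a) = X (e^{(gamma+2q)a} - 1) / (gamma + q (e^{(gamma+2q)a} + 1)),
    where q = (sqrt(gamma^2 + 4 delta) - gamma) / 2.
    """
    root = math.sqrt(gamma * gamma + 4.0 * delta)  # equals gamma + 2q
    q = 0.5 * (root - gamma)
    growth = math.exp(root * age)
    return x * (growth - 1.0) / (gamma + q * (growth + 1.0))

# Hypothetical values: X in year^-2, gamma in year^-1, delta in year^-2.
X, GAMMA, DELTA = 1.0e-6, 0.10, 1.0e-5

for age in (30, 50, 70, 90):
    cases = 1e5 * tsce_hazard(age, X, GAMMA, DELTA)
    print(f"age {age}: {cases:.1f} cases per 10^5 person years")
```

At young ages the hazard grows approximately as X·a, then rises roughly exponentially with rate γ, and finally levels off towards X/q at very old ages.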

The incidence cohort has only about half the size of the mortality cohort since only data from the Chelyabinsk oblast are available. A birth year effect was not statistically significant at the 95% level, so the only baseline confounders in this data set are gender and ethnicity:

$$ \begin{aligned} X_{{\rm base},p} &= X_{0,p}\cdot(1+f_{\rm tar}\theta_{\rm tar})\\ \gamma_{{\rm base},p}& = \gamma_{0,p} \\ \delta_{{\rm base},p} &= \delta_{0,p} \\ m_{{\rm base},p} &= 1 . \end{aligned} $$
(3)

The effects of radiation exposure can be incorporated in the model by allowing for a change of the parameters with dose rate. As shown below, the cancer risk is described best with the radiation acting linearly on the initiation rate and in the most simple model the change of the initiation rate is given by

$$ X_p(a) = X_{{\rm base},p}\cdot (1+r_X \cdot d(a)) , $$
(4)

where d(a) is the received dose rate at attained age a and r X represents the strength of the radiation action. Equation (4) describes a direct effect, i.e., the transition rate is changed only during the radiation exposure, as opposed to the long-term or permanent effects used to model genomic instability as described below. If a direct effect of ionizing radiation on promotion or transformation, instead of initiation, is implemented in the TSCE models, the radiation effect is not significant.

However, as will be shown in the results section, the radiation risk is described significantly better by an age-dependent model:

$$ X_p(a)=\left\{ \begin{array}{ll} X_{{\rm base},p}\cdot (1+r_{X,1} \cdot d(a)) : a < a_{\rm tr} \\ X_{{\rm base},p}\cdot (1+r_{X,2} \cdot d(a)) : a > a_{\rm tr} \end{array}\right. , $$
(5)

where a tr may be interpreted as an age-of-transition between periods of different radiosensitivity. This model with an age-dependent sensitivity (TSCE model with AS) will be compared to TSCE models including effects of genomic instability.
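A minimal sketch of the initiation parameter in Eqs. (4) and (5) is given below. The radiation parameters are the mortality estimates of the AS model quoted in the results section (r X,1 = 5.91 year Gy−1, r X,2 = 190.0 year Gy−1, a tr = 24.3 years); the baseline value, the dose rate and the function name are hypothetical. Equation (5) does not specify the value exactly at a = a tr, so assigning it to the upper branch is an assumption.

```python
def initiation_as(age, dose_rate, x_base, r_x1, r_x2, a_tr):
    """Initiation parameter X_p(a) with age-dependent sensitivity, Eq. (5).

    dose_rate is d(a) in Gy/year; r_x1 and r_x2 are in year/Gy.
    Setting r_x1 == r_x2 recovers the simple linear model of Eq. (4).
    """
    r_x = r_x1 if age < a_tr else r_x2  # a == a_tr assigned to upper branch (assumption)
    return x_base * (1.0 + r_x * dose_rate)

X_BASE = 1.0e-6      # hypothetical baseline value
DOSE_RATE = 0.01     # hypothetical dose rate in Gy/year

for age in (20.0, 30.0):
    ratio = initiation_as(age, DOSE_RATE, X_BASE, 5.91, 190.0, 24.3) / X_BASE
    print(f"age {age}: X / X_base = {ratio:.4f}")
```

With these values the same dose rate enhances initiation far more strongly above the transition age than below it.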

For piecewise constant parameters the TSCE model can be solved stepwise analytically (Heidenreich et al. 1997) and the hazard h(a) can be determined. The total likelihood L tot is then obtained from the product of the likelihoods for all cohort members \(L_{\rm tot}=\prod_i L_i(\Psi_i, a_{1i}, a_{2i})\) (Luebeck et al. 1999), where Ψ i is the survival function for the exposure history of individual i, and a 1i and a 2i are the ages at beginning and end of follow-up. This method does not group data but takes into account the individual exposure history of each person. To determine the best values of the parameters, a maximum likelihood (minimum deviance) fit of all parameters was performed simultaneously using the program MINUIT from the CERN library (James 1994). The best fit of the parameters is obtained by minimizing the deviance Dev = − 2 ln L tot. Once the parameters have been obtained, both the excess relative risk per unit dose (ERRpd) and excess absolute risk per unit dose (EARpd) for each exposed individual i at age a can be computed by

$$ \begin{aligned} \hbox{ERR}_{{\rm pd},i}(a) =\,& \left(h_i(a)/h_{{\rm base},i}(a)-1\right)/D_i(a-t_{\rm lag})\\ \hbox{EAR}_{{\rm pd},i}(a) =\,& \left(h_i(a)-h_{{\rm base},i}(a)\right)/D_i(a-t_{\rm lag}) , \end{aligned} $$
(6)

where D i (a − t lag) is the total accumulated dose at a − t lag; the hazard h i (a) depends on the exposure history of individual i and thus can be different for two individuals with the same age and the same accumulated dose. The ERRpd(a) and EARpd(a) for the total cohort at a certain age a can then be obtained by averaging over the individuals at risk.
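Equation (6) can be sketched as two small functions plus a cohort average. The hazards and doses below are hypothetical, the function names are assumptions, and the simple unweighted mean is only one possible reading of "averaging over the individuals at risk". Individuals without accumulated dose are skipped because Eq. (6) is undefined for D = 0.

```python
def err_per_dose(h, h_base, cum_dose):
    """ERRpd,i(a) = (h_i(a) / h_base,i(a) - 1) / D_i(a - t_lag)."""
    return (h / h_base - 1.0) / cum_dose

def ear_per_dose(h, h_base, cum_dose):
    """EARpd,i(a) = (h_i(a) - h_base,i(a)) / D_i(a - t_lag)."""
    return (h - h_base) / cum_dose

def cohort_average(individuals, risk_function):
    """Unweighted mean over exposed individuals at risk at a given age."""
    values = [risk_function(h, h_base, dose)
              for h, h_base, dose in individuals if dose > 0.0]
    return sum(values) / len(values)

# Hypothetical (hazard, baseline hazard, lagged cumulative dose in Gy) at one age.
at_risk = [
    (0.00520, 0.00500, 0.04),
    (0.00510, 0.00500, 0.03),
    (0.00500, 0.00500, 0.00),  # unexposed person, excluded from the average
]

print("ERRpd:", cohort_average(at_risk, err_per_dose), "per Gy")
print("EARpd:", 1e5 * cohort_average(at_risk, ear_per_dose), "per 10^5 PYR Gy")
```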

For an estimate of the confidence intervals, 10,000 Monte Carlo realizations from the parameter distributions have been simulated. Since the uncertainties of the parameters turn out to be distributed almost symmetrically, a standard Gauss distribution was assumed. In a computer program written by one of the authors (M.E.), these realizations were created taking into account the correlation matrix of the parameters given by MINUIT within a distribution-free approach (Iman and Conover 1982) and using Latin Hypercube Sampling (Cheng and Druzdzel 2000). The program was checked for the correct distributions and correlations and compared to the results from the program Crystal Ball (1998) as a double check. For each realization, the baseline risk, ERRpd(a) and EARpd(a) of the cohort can be calculated for each age a. The values of the percentiles of the full set of realizations then provide the uncertainty bounds.
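The idea of propagating correlated parameter uncertainties to percentile-based bounds can be sketched as follows. This is a simplified stand-in: it draws plain correlated Gaussian pairs rather than using the Iman and Conover approach with Latin Hypercube Sampling described above. The means and standard errors are the mortality LEP estimates quoted in the results section; the correlation of −0.5 and the function `toy_risk` are hypothetical.

```python
import math
import random

def sample_correlated_pair(mean1, sd1, mean2, sd2, rho, n, seed=0):
    """Draw n pairs from a bivariate Gaussian with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x1 = mean1 + sd1 * z1
        x2 = mean2 + sd2 * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        pairs.append((x1, x2))
    return pairs

def percentile(values, pct):
    """Nearest-rank percentile of a list of numbers."""
    ordered = sorted(values)
    return ordered[int(round(pct / 100.0 * (len(ordered) - 1)))]

def toy_risk(r_x, r_gamma):
    """Hypothetical stand-in for a model-derived quantity such as ERRpd(a)."""
    return 0.01 * r_x + 0.5 * r_gamma

# r_X = 24.3 +/- 16.8 and r_gamma,LEP = 1.17 +/- 0.39; rho is an assumption.
realizations = sample_correlated_pair(24.3, 16.8, 1.17, 0.39, rho=-0.5, n=10_000)
risks = [toy_risk(r_x, r_g) for r_x, r_g in realizations]
lower, upper = percentile(risks, 2.5), percentile(risks, 97.5)
print(f"95% interval of the toy quantity: {lower:.2f} to {upper:.2f}")
```

In the actual analysis the quantity evaluated for each realization would be the baseline risk, ERRpd(a) or EARpd(a) of the cohort.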

TSCE models with lifelong radiation effects

Radiation-induced genomic instability (GI) (Pampfer and Streffer 1989) describes an increased rate of new alterations in the genome after exposure to ionizing radiation. GI is observed in the progeny of cells many generations after the initial exposure; a review on the evidence of radiation-induced GI in vitro and in vivo can be found in (Morgan 2003a, b). Ongoing research aims at a better understanding of the underlying biological mechanisms, e.g. (Kadhim et al. 2004; Suzuki et al. 2006; Kaup et al. 2006). In principle, GI could appear at any stage in the carcinogenic process and the question whether GI is an early event is a major question of cancer genetics (Nowak et al. 2002).

Since the TSCE model is described by biological parameters, it is possible to investigate whether potential consequences of long-term or permanent cellular changes are expressed in the data. We assume that the increased rate of (epi-)genetic alterations after radiation exposure will effectively increase the rate at which initiation, promotion or malignant transformation occurs, and that this increase will occur not only during radiation exposure, but also at a later time. Many variations of these radiation effects are possible: for example, mutations could show up directly after exposure or with a certain time lag, doses received more recently could have a stronger effect than doses received a longer time ago, or the mutations may need an activation dose to appear; some of these models are discussed in Eidemüller et al. (2008). Note, however, that the results of such model testing should be interpreted with caution since positive or negative results do not (dis)prove the presence of any long-term or lifelong changes, but can only indicate whether the data are consistent or not with such models.

In the present paper, different variations of the TSCE model are analyzed by using a standard radiation action on initiation as in Eq. (4)—which can also be called a ’direct’ effect, since the transition rate is increased during radiation exposure and returns to its spontaneous value afterwards—and then incorporating the above mentioned effects. Since the mechanisms of genomic instability are not known and it is not clear whether the TSCE model with its two stages could represent such mechanisms, we will refer to these models as models with lifelong enhanced initiation, promotion or transformation rates rather than as models with genomic instability.

As discussed in the results section, among the variety of models tested, the best model for the incidence cohort included a direct radiation effect on the initiation and a permanent radiation effect on the promotion rate γ, such as might be induced by a potential genomic instability. For the mortality cohort, the two best models include a direct effect on the initiation and a permanent change on either the initiation rate, as was observed in ref. (Eidemüller et al. 2009), or the promotion rate with a slight preference for promotion. Therefore we choose as our best TSCE model for both cohorts a model with a lifelong enhanced promotion rate (TSCE model with LEP); similar to Eq. (5) it is found that the introduction of a threshold age a tr improves the fit significantly. Thus, the preferred model is

$$ \begin{aligned} X_p(a) =\,& X_{{\rm base},p}\cdot (1+r_X \cdot d(a))\\ \gamma_p(a) =\,& \gamma_{{\rm base},p}\cdot \left(1+r_{\gamma,{\rm LEP}}\cdot \int\limits_{a_{tr}}^a da' d(a') \right) \\ =\,& \gamma_{{\rm base},p}\cdot\left(1+r_{\gamma,{\rm LEP}}\cdot (D(a)-D(a_{tr})) \right) , \end{aligned} $$
(7)

where the term ∝ r γ,LEP equals zero for all a < a tr. All doses received after a transition age a tr will increase the promotion rate during the whole lifetime, and the increase of the promotion rate is proportional to the accumulated dose between a tr and a.
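The lifelong enhancement of the promotion rate in Eq. (7) can be sketched for a piecewise constant dose history. The values r γ,LEP = 1.17 Gy−1 and a tr = 33.3 years are the mortality estimates quoted in the results section; the baseline promotion rate, the dose history and the function names are hypothetical.

```python
def accumulated_dose(annual_dose_rate, age_from, age_to):
    """Integrate a piecewise constant dose rate (Gy/year) between two ages.

    annual_dose_rate maps the integer age at the start of a year to the
    dose rate received during that year of life.
    """
    total = 0.0
    for year_start, rate in annual_dose_rate.items():
        lo = max(year_start, age_from)
        hi = min(year_start + 1, age_to)
        if hi > lo:
            total += rate * (hi - lo)
    return total

def promotion_lep(age, annual_dose_rate, gamma_base, r_gamma_lep, a_tr):
    """gamma_p(a) of Eq. (7): only doses received after a_tr contribute."""
    if age <= a_tr:
        return gamma_base
    dose_after_tr = accumulated_dose(annual_dose_rate, a_tr, age)
    return gamma_base * (1.0 + r_gamma_lep * dose_after_tr)

# Hypothetical exposure: 0.02 Gy/year between ages 30 and 36.
history = {age: 0.02 for age in range(30, 36)}

for age in (32.0, 40.0, 60.0):
    gamma = promotion_lep(age, history, gamma_base=0.10,
                          r_gamma_lep=1.17, a_tr=33.3)
    print(f"age {age}: gamma = {gamma:.6f} per year")
```

The promotion rate stays at its baseline value up to a tr, rises while dose is accumulated after a tr, and then remains at the enhanced level for the rest of life.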

Excess relative risk (ERR) models

Parametric ERR models are used here to describe the hazard as a function of attained age in accordance with (Krestinina et al. 2005, 2007). As for the TSCE models, the parametrization of the background differs for the mortality and incidence cohorts. The mortality cohort has the same background confounders as the TSCE model with male and female birth year, ethnicity and oblast. In contrast, the incidence cohort, including only people from the Chelyabinsk oblast, does not show a significant birth year effect for males and females separately, but has a significant joint birth year effect. Thus, the models are given by

$$ \begin{aligned} h_p(a) =\,& e^{\psi_p(a)}\cdot (1+\hbox{ERR}_{pd}(a)\cdot D(a-t_{\rm lag}))\\ \psi_p^{\rm mort}(a) =\,& \psi_{0,p} + \psi_{1,p} \cdot\ln{{a}\over {70}} + \psi_{2,p} \cdot \left(\ln{{a}\over {70}}\right)^2\\ & +f_{{\rm byr},p}({\rm birth\ year}-1925) + f_{\rm tar}\theta_{\rm tar} + f_{\rm kur}\theta_{\rm kur} \\ \psi_p^{\rm inc}(a) =\,& \psi_{0,p} + \psi_{1,p} \cdot\ln{{a}\over {70}} + \psi_{2,p} \cdot \left(\ln{{a}\over {70}}\right)^2\\ & +f_{\rm byr}({\rm birth\ year}-1925) + f_{\rm tar}\theta_{\rm tar} , \end{aligned} $$
(8)

where h p=m/f are the male or female hazards in cases per 10^5 person years. The notation is the same as in Eq. (2). As principal risk models, a model with constant excess relative risk, a model with a linear slope in attained age, and a log-linear model,

$$ \begin{aligned} \hbox{ERR}_{\rm pd}(a)=\,&\hbox{ERR}_{\rm pd}\\ \hbox{ERR}_{\rm pd}(a)=\,&\hbox{ERR}_{{\rm pd},60}+\hbox{slope}\cdot(a-60)\\ \hbox{ERR}_{\rm pd}(a)=\,&\hbox{ERR}_{{\rm pd},60}\cdot e^{\alpha\cdot \ln (a/60)} \end{aligned} $$
(9)

were used, but also linear-quadratic models or threshold models were tested. The data were analyzed both with individual maximum likelihood methods as for the TSCE models and with grouped data using EPICURE (Preston et al. 1993) with very similar results. For a better comparison to the results of the TSCE models, the results from the individual likelihood fit are presented. The confidence intervals were estimated in a similar way as for the TSCE models.
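The three age dependences of Eq. (9) and the hazard of Eq. (8) can be written compactly as below. The values ERRpd,60 = 0.56 Gy−1 and slope = 0.074 (year Gy)−1 are the mortality estimates of the linear model quoted in the results section; the baseline coefficients ψ, the exponent of the log-linear model and all function names are hypothetical.

```python
import math

def err_constant(age, err_pd):
    return err_pd

def err_linear(age, err_pd_60, slope):
    return err_pd_60 + slope * (age - 60.0)

def err_loglinear(age, err_pd_60, alpha):
    return err_pd_60 * math.exp(alpha * math.log(age / 60.0))

def log_baseline(age, psi0, psi1, psi2, covariate_terms=0.0):
    """psi_p(a) of Eq. (8); covariate_terms collects the birth year,
    ethnicity and oblast contributions."""
    x = math.log(age / 70.0)
    return psi0 + psi1 * x + psi2 * x * x + covariate_terms

def err_hazard(age, lagged_dose, err_pd_at_age, psi0, psi1, psi2):
    """h_p(a) = exp(psi_p(a)) * (1 + ERRpd(a) * D(a - t_lag))."""
    return math.exp(log_baseline(age, psi0, psi1, psi2)) * (
        1.0 + err_pd_at_age * lagged_dose)

for age in (64.0, 80.0):
    print(f"linear ERRpd({age:.0f}) = {err_linear(age, 0.56, 0.074):.2f} per Gy")

# Hypothetical baseline: 500 cases per 10^5 person years at age 70.
h = err_hazard(70.0, 0.03, err_linear(70.0, 0.56, 0.074),
               psi0=math.log(500.0), psi1=5.0, psi2=0.0)
print(f"hazard at age 70 with 0.03 Gy: {h:.1f} per 10^5 person years")
```

With the rounded parameter values the linear model gives about 2.04 Gy−1 at age 80, which matches the value of 2.03 Gy−1 quoted in the results up to rounding of the inputs.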

Results

Mortality cohort

Table 1 compares different models by deviance and number of parameters. All models include 10 background parameters. The TSCE model of Eq. (4) with a direct radiation effect on initiation has one additional radiation parameter r X . A radiation action on promotion or transformation is not significant. Introducing a quadratic dose term, in addition to the linear one, improves the fit only slightly and is not significant (p = 0.65). Also models with a threshold in the dose response were tested with similar results for the deviance as the linear model without threshold. The next two rows display the results for the TSCE models with age-dependent sensitivity (AS) and lifelong enhanced promotion rate (LEP). Compared to the linear TSCE model, both models describe the data significantly better (p = 0.011 and p = 0.0071). The transition age a tr of 24.3 years of the AS model is smaller than the corresponding age of 33.3 years of the LEP model. In the AS model, the radiation parameters (with 1σ errors) are r X,1 = 5.91 ± 16.9 year Gy−1 and r X,2 = 190.0 ± 68.0 year Gy−1; r X,1 is much smaller than r X,2 and has a large error. For the LEP model, the corresponding values are r X  = 24.3 ± 16.8 year Gy−1 and r γ,LEP = 1.17 ± 0.39 Gy−1.

Table 1 Comparison of different TSCE and ERR models for the mortality data

The ERR model with a constant risk in attained age gives a central risk estimate of ERRpd = 0.85 ± 0.35 Gy−1. A quadratic term in dose, in addition to a linear one, is not significant (p = 0.44). Models with a threshold in the dose response were tested with very similar results as the constant ERR model. However, a significant improvement was found with a dependence of the ERRpd on attained age. With a linear dependence on attained age, the deviance improves by 7.1 points with one parameter more, corresponding to a p-value of 0.0077. A log-linear model is not as good as a linear one, but still the improvement is significant with p = 0.028. Both models predict an increase of ERRpd(a) with attained age. The parameter values of the linear model are ERRpd,60 = 0.56 ± 0.34 Gy−1 and slope = 0.074 ± 0.027 (year Gy)−1, indicating a significant increase of risk with attained age. Using a model with a dependence of ERRpd on age at exposure instead of attained age, the deviance is lower by 2.3 points than the constant ERR model (p = 0.13). This change in deviance is mainly due to correlations between the change of ERRpd with age at exposure and with attained age: the addition of a radiation effect on age at exposure to the ERR model linear in a results in a deviance that is reduced only by 0.4 points (p = 0.53), and the corresponding parameter has a large uncertainty.
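The p-values quoted for nested model comparisons follow from a likelihood ratio test, in which the deviance difference is compared to a chi-squared distribution. A minimal sketch for one additional parameter is shown below; it uses the identity that the chi-squared tail probability with one degree of freedom equals erfc(sqrt(x/2)). The function name is an assumption.

```python
import math

def lrt_p_value_1df(deviance_drop):
    """Tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(deviance_drop / 2.0))

# Deviance differences quoted in the text for one additional parameter.
for label, drop in (("linear in attained age", 7.1),
                    ("age at exposure", 2.3),
                    ("age at exposure added to linear model", 0.4)):
    print(f"{label}: deviance drop {drop} -> p = {lrt_p_value_1df(drop):.4f}")
```

These values reproduce the quoted p-values of 0.0077, 0.13 and 0.53 after rounding.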

The ERR models show a deviance that is about 40–50 points higher than that of the TSCE models. This difference is already present in the background description without radiation effects with a deviance of 26331.3 for the TSCE model and 26382.1 for the ERR model. As commented in (Eidemüller et al. 2008), the better description of the TSCE model of the hazard as a function of attained age is probably responsible for the substantially lower deviance of the TSCE models compared to the ERR models. In both models, the correlations between radiation parameters and background confounders remain below 0.2; therefore, changes in the background confounders only have a minor effect on the radiation risk.

Table 2 shows the predicted and observed distribution of solid cancer deaths by dose categories for the different TSCE and ERR models investigated. Compared by dose categories, all models agree very well in each category, the difference being much smaller than the square root of the total number of cases in each category. Note, however, that there is a small difference in the total number of predicted radiation-induced cases: the TSCE model with LEP predicts 53 cases, which agrees well with the 47–48 and 54 cases from the linear and log-linear ERR models, respectively. This corresponds to about 2.6% of the 2,064 observed solid cancer deaths. In contrast, the TSCE model with AS predicts only about 40 cases. All models agree well with the observed cases in almost every dose category. The exception may be the dose category 200–300 mGy, where more cases have been observed than calculated. However, even in this category the difference is just about one standard deviation.
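The comparison of observed and predicted cases against the square root of the number of cases can be sketched as a simple Poisson check. The category counts below are hypothetical, since Table 2 is not reproduced here, and the function name is an assumption; only the 53 predicted excess cases and the 2,064 observed deaths are taken from the text.

```python
import math

def deviation_in_sigma(observed, predicted):
    """Difference between observed and predicted counts in units of the
    Poisson standard deviation sqrt(predicted)."""
    return (observed - predicted) / math.sqrt(predicted)

# Hypothetical (observed, predicted) counts for two dose categories.
for observed, predicted in ((40, 34.0), (900, 905.0)):
    sigma = deviation_in_sigma(observed, predicted)
    print(f"observed {observed}, predicted {predicted}: {sigma:+.2f} sigma")

# Share of predicted radiation-induced deaths for the TSCE model with LEP.
excess_fraction = 53 / 2064
print(f"excess fraction: {100 * excess_fraction:.1f}%")
```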

Table 2 Solid cancer deaths by dose categories as predicted by the different TSCE and ERR models and compared to the observed cases

Figure 2 shows the ERRpd as a function of attained age for the four models; the central values of the risk are calculated with the best parameter values averaged over all persons at risk including annual dose estimates. For clarity, the error bars are shown only for the TSCE model with LEP (solid line); the uncertainties associated with the other models are of similar size. The error bars show the 68% confidence interval instead of the 95% confidence interval (which would be about twice as large) in order to show more clearly the magnitude of the differences between the models. In terms of the increase of the ERRpd with attained age, all models agree, though they differ in the exact form of this increase. At age 64, the mean age of solid cancer deaths, the TSCE model with LEP and both ERR models agree well on the ERRpd (and EARpd), whereas the TSCE model with AS predicts a lower risk. In the central age range around 65 years, all models agree well within the uncertainty bounds, while for younger and older ages the differences are somewhat larger, especially due to the linear ERR model. This is consistent with the expectation that the models will represent the data well around the age where most of the cases occurred, but their predictions will become more model-dependent outside the central region. Around age 55, a very small drop in the excess relative risk can be seen for the TSCE model with LEP. Although the origin of this drop is not really clear, it might come from migration of people with certain exposure patterns.

Fig. 2 ERRpd for solid cancer mortality as a function of attained age with corresponding 68% CI. For clarity, the error bars are shown only for the TSCE model with LEP (solid line); the uncertainties associated with the other models are of similar size

In Table 3, the ERRpd and EARpd are given for different ages. Because causes of death are unknown for about 11% of the deceased cohort members, the EARpd estimate is biased downward. Assuming that the distribution of causes of death among deceased individuals with unknown cause is similar to that seen in those with known cause, the EARpd estimate is likely to be about 11% too low. At the age of 64, the TSCE model with LEP and the ERR models agree well in terms of their ERRpd and EARpd risk estimates, though the TSCE and ERR models are based on a completely different parametrization of the baseline. The TSCE model with LEP obtains for the cohort central estimates of

$$ \begin{aligned} {\rm ERR}_{\rm pd}(64)=\,&0.85\,{\rm Gy}^{-1}\ (95\% {\rm CI}\ 0.36; 1.38)\\ {\rm EAR}_{\rm pd}(64)=\,&476\ (10^5\ {\rm PYR\ Gy})^{-1}\ (95\% {\rm CI}\ 203; 720) . \end{aligned} $$
(10)
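The size of the downward bias in the EARpd can be illustrated under the assumption stated above, namely that deaths with unknown cause have the same cause distribution as deaths with known cause. This is a sketch only: the paper does not report corrected values, and the function name and the simple rescaling by 1/(1 − 0.11) are assumptions made for illustration.

```python
UNKNOWN_CAUSE_FRACTION = 0.11  # share of deaths with unknown cause (from the text)


def correct_for_unknown_causes(ear, unknown_fraction=UNKNOWN_CAUSE_FRACTION):
    """Rescale an absolute risk as if all causes of death had been known."""
    return ear / (1.0 - unknown_fraction)


# Central estimate and 95% CI bounds of EARpd(64) for the TSCE model with LEP,
# in units of (10^5 PYR Gy)^-1.
for label, value in (("central", 476), ("lower", 203), ("upper", 720)):
    print(f"{label}: {value} -> {correct_for_unknown_causes(value):.0f}")
```

Under this assumption the central estimate of 476 would rise to roughly 535 per 10^5 PYR Gy.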

For ages above 65, the ERR models tend to predict larger excess relative and absolute risk values than the TSCE models; for example, for the TSCE model with LEP the best estimate of the risk at the age of 80 is ERRpd(80) = 1.27 Gy−1, whereas the linear and log-linear ERR models suggest values of 2.03 Gy−1 and 1.80 Gy−1, respectively. For attained age 57–70, the risk estimates of the TSCE model with AS are lower than those of the other models; this is consistent with the lower number of predicted cancer cases. However, all models agree within the uncertainty bounds. Since the deviance of the TSCE model with AS is only slightly lower than that of the TSCE model with LEP, its risk prediction is as valid as the risk estimates of the other models. The differences observed between the risk values of the various models indicate the range of model uncertainty.

Table 3 ERRpd and EARpd values for solid cancer mortality given for different attained ages with the 95% CI in units of Gy−1 and (10^5 PYR Gy)−1, respectively

Since the risk of lung cancer might be different from the risk of other cancers and smoking represents a potential bias factor, the data were re-analyzed for all solid tumors excluding 18 bone and 395 lung cancer cases. The ERRpd remains almost the same with a slightly wider uncertainty range. For example, for the TSCE model with LEP, the risk without lung cancer is ERRpd(64) = 0.83 Gy−1 (95% CI 0.30; 1.44). Because the number of cancer cases is about 20% smaller, the absolute risk is smaller by a similar proportion (EARpd(64) = 360 (10^5 PYR Gy)−1 (95% CI 127; 585)).

Using a model with both direct and permanent radiation effect on the initiation rate—instead of a permanent radiation effect on the promotion rate as in the TSCE model with LEP—the deviance is higher by only 0.5 points. Thus, the mortality data are equally consistent with a permanent radiation effect on the initiation. However, the TSCE model with a permanent effect on the initiation gives a lower risk with 37 radiation-induced cases and is less consistent with the empirical models than the TSCE model with LEP.

Incidence cohort

Table 4 shows different models compared by deviance and number of parameters for the incidence data, the notation being similar to that in Table 1. The TSCE and ERR models include 7 and 8 background parameters, respectively. The first row shows the result for a TSCE model with a direct radiation effect on the initiation and one additional radiation parameter r_X from Eq. (4). A radiation action on the transformation rate is not significant. A radiation action on the promotion rate instead of on the initiation rate reduces the deviance by a further 0.9 points. However, the model with AS or models with a permanent radiation effect show a strong preference for a direct effect on initiation instead of a direct effect on promotion. Adding a quadratic dose term does not result in a significant improvement (p = 0.53). Models with a threshold in the dose response give results similar to those of the linear TSCE model without threshold. Compared to the linear TSCE model, the TSCE model with AS describes radiation effects better, but this is not significant at the 95% confidence level (p = 0.12); however, the TSCE model with LEP describes the data significantly better (p = 0.017). For the TSCE model with AS, the transition age is a_tr = 24.25 years with r_X,1 = 15.3 ± 17.3 year Gy−1 and r_X,2 = 139.7 ± 64.5 year Gy−1. The TSCE model with LEP has a transition age of a_tr = 27.1 years with the parameters r_X = 19.2 ± 16.2 year Gy−1 and r_γ,LEP = 0.94 ± 0.36 Gy−1.
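The p-values for nested models quoted above are of the kind obtained from a likelihood ratio test, where the drop in deviance is compared with a chi-square distribution whose degrees of freedom equal the number of added parameters. The sketch below is a generic illustration under that assumption; it is not taken from the paper, the deviance differences are hypothetical, and the helper `chi2_sf` only covers one and two degrees of freedom, for which closed forms exist.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability of a chi-square distribution (closed forms for df = 1 and df = 2)."""
    if x < 0:
        raise ValueError("deviance difference must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only implements df = 1 and df = 2")


# Hypothetical deviance improvements (NOT values from Table 4).
print(f"delta = 3.84, 1 extra parameter:  p = {chi2_sf(3.84, 1):.3f}")
print(f"delta = 8.15, 2 extra parameters: p = {chi2_sf(8.15, 2):.3f}")
```

If the two additional parameters of the LEP model were tested in this way, a deviance drop of roughly 8 points would correspond to a p-value near the quoted 0.017; whether the paper's p-values were derived exactly like this is an assumption.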

Table 4 Comparison of different TSCE and ERR models for the incidence data

Compared to a pure background model without radiation effect, the radiation risk of an ERR model linear in dose and constant in attained age a is significant (p = 0.011), with a risk estimate of ERRpd = 0.86 ± 0.37 Gy−1. Threshold models give results similar to those of the constant ERR model. A quadratic dose term, in addition to the linear one, leads to an unstable solution with a negative value for the linear coefficient; probably the statistical power is not sufficient to allow estimates of a quadratic contribution. The two ERR models with a linear and log-linear dependence on attained age show almost no improvement over the constant model despite having one more parameter. Thus, the preferred model for the incidence data is the constant ERR model. Models with a dependence of the risk on age at exposure did not show any improvement.
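The estimate ERRpd = 0.86 ± 0.37 Gy−1 can be turned into an approximate 95% interval with the usual normal (Wald) approximation. The sketch below assumes that ± 0.37 is one standard error; the helper names are illustrative. The resulting interval matches the 95% CI (0.13; 1.59) quoted for the constant ERR model in the Discussion. The Wald p-value is not expected to equal the quoted p = 0.011, which compares the model with a pure background model and so presumably rests on the deviance difference.

```python
from statistics import NormalDist


def wald_interval(estimate, std_error, level=0.95):
    """Symmetric normal-approximation confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * std_error, estimate + z * std_error


def wald_p_value(estimate, std_error):
    """Two-sided p-value for the null hypothesis that the true value is zero."""
    z = abs(estimate) / std_error
    return 2.0 * (1.0 - NormalDist().cdf(z))


low, high = wald_interval(0.86, 0.37)
print(f"95% CI: {low:.2f}; {high:.2f} per Gy")
print(f"Wald p-value: {wald_p_value(0.86, 0.37):.3f}")
```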

The deviance of the ERR models is about 13–21 points greater than that of the TSCE models, and the ERR models have one more background parameter. As is the case for the mortality cohort, the difference between the TSCE and ERR models is already present in the background description, with a deviance of 22492.3 for the TSCE models and 22508.9 for the ERR models. In the TSCE and ERR models the correlations between radiation parameters and background confounders remain below 0.2.
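Because the TSCE and ERR background models are not nested, their deviances cannot be compared with a likelihood ratio test; a penalized criterion such as the Akaike information criterion (AIC) is one common alternative. The sketch below applies it to the background deviances and parameter counts quoted in the text. Using the AIC here is an assumption made for illustration, not a method stated in the paper.

```python
def aic(deviance, n_parameters):
    """Akaike information criterion up to a constant shared by models fitted to the same data."""
    return deviance + 2 * n_parameters


# Background-only fits to the incidence data (values from the text).
tsce_deviance, tsce_params = 22492.3, 7
err_deviance, err_params = 22508.9, 8

delta_deviance = err_deviance - tsce_deviance
delta_aic = aic(err_deviance, err_params) - aic(tsce_deviance, tsce_params)

print(f"deviance difference (ERR - TSCE): {delta_deviance:.1f}")
print(f"AIC difference (ERR - TSCE):      {delta_aic:.1f}")
```

The deviance gap of 16.6 points grows to 18.6 once the additional ERR background parameter is penalized.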

Table 5 shows the predicted and observed distribution of solid cancer cases by dose categories of the TSCE model with LEP and the constant ERR model. Compared by dose categories, the models agree very well in each category, the difference being much smaller than the square root of the total number of cases in each category. Also the number of predicted cases agrees well with that of the observed cases. The TSCE model with LEP and the constant ERR model predict 57 and 53 radiation-induced cases, respectively, corresponding to 3.0 and 2.8% of all 1,889 observed cases.

Table 5 Solid cancer incident cases by dose categories as predicted by the TSCE and ERR model and compared to the observed cases

Figure 3 shows the ERRpd as a function of attained age. The models are compatible within the 68% confidence intervals, though the TSCE model predicts an increase in the ERRpd with age, whereas the ERR model prefers a constant risk with age. In fact, the ERR model with a linear slope even predicts a slightly decreasing risk; however, it agrees well with the constant ERR model.

Fig. 3 ERRpd for solid cancer incidence as a function of attained age, and 68% CI. For clarity, the error bars are shown only for the TSCE model with LEP (solid line); the uncertainties of the constant ERR model are of similar size

In Table 6, the ERRpd and EARpd values are given for different attained ages. The models are compatible within the uncertainty bounds; at the age of 63, the TSCE model with LEP predicts for example

$$ \begin{aligned} {\rm ERR}_{\rm pd}(63)=\,&0.91\,{\rm Gy}^{-1} (95\% {\rm CI}\,0.35; 1.52)\\ {\rm EAR}_{\rm pd}(63)=\,&735\,(10^5\ {\rm PYR\ Gy})^{-1}\ (95\% {\rm CI}\ 267; 1133) . \end{aligned} $$
(11)

Analyzing the solid cancer incidence risk without the 310 lung cancer cases, the TSCE model with LEP provides a risk of ERRpd(63) = 0.97 Gy−1 (95% CI 0.36; 1.60), which again is close to the risk including lung cancer. The increase in risk with attained age is slightly larger than the increase in risk including the lung cancer cases: it increases from 0.70 Gy−1 at age 50 to 1.30 Gy−1 at age 80. Because the number of cancer cases is 16% smaller, the absolute risk is smaller by about the same proportion (EARpd(63) = 629 (10^5 PYR Gy)−1 (95% CI 216; 984)).
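The relation between the reduced number of cases and the reduced absolute risk can be verified directly from the numbers in the text. The following is a small arithmetic sketch; the variable names are illustrative.

```python
ALL_CASES = 1889    # solid cancer incident cases (from the text)
LUNG_CASES = 310    # lung cancer cases excluded in the re-analysis

EAR_ALL = 735       # EARpd(63), all solid cancers, per 10^5 PYR Gy
EAR_NO_LUNG = 629   # EARpd(63) without lung cancer, per 10^5 PYR Gy

case_reduction = LUNG_CASES / ALL_CASES
ear_reduction = 1.0 - EAR_NO_LUNG / EAR_ALL

print(f"reduction in number of cases: {100 * case_reduction:.1f}%")
print(f"reduction in EARpd(63):       {100 * ear_reduction:.1f}%")
```

The number of cases drops by about 16% and the central EARpd estimate by about 14%, so the two reductions are indeed of similar size.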

Table 6 ERRpd and EARpd for solid cancer incidence given for different attained ages with 95% CI in units of Gy−1 and (10^5 PYR Gy)−1, respectively

Using a model with both direct and permanent radiation effect on the initiation rate (instead of the permanent radiation effect on the promotion rate as in the TSCE model with LEP), the deviance is greater by 5.9 points with the same number of parameters. This is a strong effect since the increase is only due to the radiation action. This model predicts only 35 excess cancer cases, which is substantially fewer than the number of excess cases predicted by the other models. The analysis of the incidence data thus shows a preference for a lifelong (or long-term) enhanced radiation effect on the promotion rate.

Comparison of mortality and incidence solid cancer risk

It is very interesting to compare directly the risk estimates for the incidence and mortality cohort. The mortality cohort consists of 29,771 persons and 903,007 person-years with 2,064 solid tumor deaths (excluding bone cancer), corresponding to 6.9% of the persons and 0.23% per person-year. The incidence cohort includes 17,069 persons and 452,010 person-years with 1,889 solid cancer incident cases, corresponding to 11.1% of the persons and 0.42% per person-year. These rates are almost twice as large as in the mortality cohort. Regarding radiation risk, the percentage of radiation-induced cancer cases is similar in both cohorts: for example, for the TSCE model with LEP about 2.6% (53 of 2,064) of the solid cancer cases in the mortality cohort are radiation-induced, which compares well with the 3.0% (57 of 1,889) radiation-induced cases in the incidence cohort.
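The crude rates quoted above follow directly from the cohort sizes. The sketch below recomputes them from the numbers in the text; the dictionary layout and function name are illustrative.

```python
cohorts = {
    "mortality": {"persons": 29771, "person_years": 903007, "cases": 2064, "excess": 53},
    "incidence": {"persons": 17069, "person_years": 452010, "cases": 1889, "excess": 57},
}


def crude_rates(cohort):
    """Return cases per person, cases per person-year and the radiation-induced share, in percent."""
    per_person = 100.0 * cohort["cases"] / cohort["persons"]
    per_person_year = 100.0 * cohort["cases"] / cohort["person_years"]
    excess_share = 100.0 * cohort["excess"] / cohort["cases"]
    return per_person, per_person_year, excess_share


for name, cohort in cohorts.items():
    per_person, per_py, excess_share = crude_rates(cohort)
    print(f"{name}: {per_person:.1f}% of persons, {per_py:.2f}% per person-year, "
          f"{excess_share:.1f}% radiation-induced (TSCE with LEP)")

ratio = crude_rates(cohorts["incidence"])[1] / crude_rates(cohorts["mortality"])[1]
print(f"incidence / mortality rate per person-year: {ratio:.2f}")
```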

Figure 4 displays the solid cancer incidence and mortality rates (hazard) for the TSCE model with LEP together with the observed hazard, grouped in 3-year intervals. The standard deviation of the observed hazard is estimated (assuming a Poisson distribution) from the square root of the number of observed cases. For a better comparison, the hazard of the incidence cohort is multiplied by a factor of 0.6, so that the hazards of both cohorts coincide at the age of 63. The shapes of the hazard functions are very similar. However, after scaling the incidence hazard is larger than the mortality hazard below 63 years, while it is lower for older ages. A similar comparison using the ERR models leads to the same conclusion. Note that the mortality cohort includes both the Chelyabinsk and Kurgan oblasts whereas the incidence cohort includes only the Chelyabinsk oblast. For a check we have therefore determined the hazard function of the mortality cohort including only the Chelyabinsk oblast and compared it to the incidence hazard as in Fig. 4. The result is similar though the difference between both cohorts becomes a bit larger.
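The construction of the observed hazard and its error bar can be sketched as follows. The hazard in an age interval is the number of cases divided by the person-years at risk, the Poisson standard deviation is the square root of the case count divided by the same person-years, and the incidence hazard is then multiplied by 0.6 as in the text. The case count and person-years used below are hypothetical, since the binned data are not given here.

```python
import math

INCIDENCE_SCALE = 0.6  # scaling factor applied to the incidence hazard (from the text)


def observed_hazard(cases, person_years, scale=1.0):
    """Return the observed hazard and its Poisson standard deviation, both optionally scaled."""
    hazard = cases / person_years
    std_dev = math.sqrt(cases) / person_years
    return scale * hazard, scale * std_dev


# Hypothetical 3-year age interval: 100 incident cases in 20,000 person-years.
hazard, std_dev = observed_hazard(100, 20000, scale=INCIDENCE_SCALE)
print(f"scaled hazard: {hazard:.4f} per year, standard deviation: {std_dev:.4f}")
```

Scaling the hazard and its standard deviation by the same factor leaves the relative uncertainty of each point unchanged.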

Fig. 4 Comparison of mortality and incidence hazards of the TSCE model with LEP together with the observed cases, grouped in intervals of 3 years. The hazard of the incidence cohort is scaled by a factor of 0.6

Now we investigate to what extent the radiation risks of both cohorts can be compared. In principle one would expect that the ERRpd estimates should be similar in both cohorts, whereas the EARpd should be higher in the incidence than in the mortality cohort, due to the larger background risk. In Fig. 5, the ERRpd from Figs. 2 and 3 (for clarity, the confidence intervals have been omitted) are plotted as a function of attained age using the TSCE model with LEP and the linear and constant ERR model for the mortality and incidence cohort, respectively. All models indeed coincide very well for the risk at age 63, which is about the mean age of solid cancer mortality or incidence. The difference is larger for the dependence of ERRpd on age: the TSCE model with LEP predicts almost the same increase of ERRpd with age for both cohorts; the ERR model, however, predicts a steeper slope for the mortality cohort and a constant risk for the incidence cohort. The similar increase in risk with age predicted by the TSCE model for mortality and incidence is not due to the fact that the same TSCE model with LEP was used: a similar increase both for mortality and incidence excess risk is also observed when the TSCE model with AS is used.

Fig. 5 ERRpd as a function of attained age using the TSCE model with LEP and the linear ERR model

From Fig. 3 it can be seen that for the incidence cohort, the ERRpd of the ERR and TSCE models agree within the 68% CI for the whole age range, whereas for the mortality cohort (Fig. 2) the ERRpd agrees for both models within the 68% CI for central ages, but is outside this interval for younger or older ages. However, even taking into account the large uncertainties, the magnitude of the change of the ERRpd with age remains substantial for the ERR models. Though there is some difference in the distribution of cases by cancer sites in the mortality and incidence data sets, it is too small to explain the difference in the risk estimates. Since the incidence cohort includes only subjects exposed in those Techa riverside villages located within the Chelyabinsk oblast, we have, to check for consistency, re-analyzed the mortality data including only people initially exposed within the Chelyabinsk oblast. The results are very similar; however, for the ERR model the increase in ERRpd with attained age for the mortality data is reduced by 25% and is more compatible with the risk from the TSCE models.

For the EARpd, the same results as for the ERRpd are found after correction for the different baseline rates: multiplying the incidence risk by a factor of 0.6, as was done for the hazard, the EARpd of all models agrees well at age 63. The difference in the EARpd becomes larger for younger and older ages, as was the case for the ERRpd.

Though the likelihood and the cancer risk do not change for different values of the transformation rate μ0 (Portier and Masri 1997; Hazelton et al. 2001), it is interesting to compare the male and female TSCE baseline parameter values as presented in Table 7. For consistency, the parameter values representing effective biological time scales should have the same order of magnitude in the incidence and mortality cohort. Since the mortality hazard is smaller than the incidence hazard, it is assumed that only about 60% of the cells that lead to cancer will be fatal. For this reason, the transformation rate is scaled with the factor 0.6. It can be seen that the incidence and mortality parameters indeed agree very well: for the male parameter set, the initiation N_s · ν0,m is slightly different, but α0,m and β0,m coincide almost perfectly, whereas for the female parameter set, the initiation agrees perfectly with a small difference in α0,f and β0,f. The difference α0 − β0 gives the promotion rate and is equal for different μ0.
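The statement that the hazard does not depend on the choice of μ0 can be illustrated with the closed-form TSCE hazard for constant parameters that is commonly quoted in the TSCE literature, which depends on the biological parameters only through three combinations, written here as X, γ and q. The sketch below is an illustration under stated assumptions, not the fitting code of this study: the parameter values are hypothetical (not those of Table 7), radiation effects and lag time are ignored, and the function names are made up for this example.

```python
import math


def identifiable(n_nu, alpha, beta, mu):
    """Map (N_s*nu0, alpha0, beta0, mu0) to the combinations X, gamma, q that fix the hazard."""
    x = n_nu * mu
    gamma = alpha - beta - mu
    q = 0.5 * (math.sqrt(gamma**2 + 4.0 * alpha * mu) - gamma)
    return x, gamma, q


def hazard(age, n_nu, alpha, beta, mu):
    """Closed-form TSCE hazard for parameters that are constant in age."""
    x, gamma, q = identifiable(n_nu, alpha, beta, mu)
    growth = math.exp((gamma + 2.0 * q) * age)
    return x * (growth - 1.0) / (q * (growth + 1.0) + gamma)


def rescale_mu(n_nu, alpha, beta, mu, factor):
    """Multiply mu by `factor` and adjust the other parameters so that X, gamma, q are unchanged."""
    x, gamma, q = identifiable(n_nu, alpha, beta, mu)
    mu_new = mu * factor
    alpha_new = q * (q + gamma) / mu_new  # follows from q * (q + gamma) = alpha * mu
    beta_new = alpha_new - gamma - mu_new
    return x / mu_new, alpha_new, beta_new, mu_new


# Hypothetical parameter set in units of 1/year (NOT the fitted values of Table 7).
base = (1.0, 10.0, 9.9, 1e-6)
scaled = rescale_mu(*base, factor=0.6)

for age in (40, 63, 80):
    print(age, hazard(age, *base), hazard(age, *scaled))

print("alpha - beta:", base[1] - base[2], "versus", scaled[1] - scaled[2])
```

In this sketch the two parameter sets give the same hazard at every age, while α0 − β0 changes only by the change in μ0 itself, which is negligible for transformation rates of this size.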

Table 7 Baseline parameter comparison of mortality and incidence cohorts by gender, for different values of μ0

The promotion rate for females is lower than that for males. For the same values of the transformation rate, N_s · ν0, α0 and β0 for females are consistently higher than for males. This suggests that either females might have a larger initiation rate than males, or that the transformation rate for females could be higher than for males so that the initiation rate would be of similar size.

Though the hazard and the radiation risk are largely independent of the lag time, this is not true for the parameter values. For the mortality and incidence cohorts, lag times of 5 and 3 years have been used, respectively, since it takes more time for a tumor to turn fatal. Using the same lag time of 5 years both for the incidence and mortality cohort, however, the difference between the incidence and mortality parameter values becomes larger, confirming the consistency of using different lag times.

Discussion

The Techa River cohort has specific features that make it especially valuable for radiation risk assessments and protection standards: the cohort consists of a large, unselected population of men and women of all ages with a long follow-up time of more than 50 years. The doses were received over a period of several years with cumulative doses in the low- to medium range. The ETRC follow-up and dosimetry are still under active investigation.

Compared with a previous analysis of the mortality data of the ETRC (Eidemüller et al. 2008), the current analysis includes four more years of follow-up. By a comparison of Table 3 with Table 5 from ref. (Eidemüller et al. 2008), it can be seen that the ERRpd estimates of the current analysis are about 20% lower for similar models at central ages, though the difference changes depending on model and age. In the present analysis, the increase of the ERRpd with attained age is smaller for the TSCE models, in particular for older ages, whereas the ERR models predict an increase similar to that obtained in the former analysis. In ref. (Krestinina et al. 2007) the incidence data were analyzed with an ERR model with a linear dependence on dose and constant in attained age. Also in that analysis a slightly higher risk of ERRpd = 1.0 Gy−1 (95% CI 0.3; 1.9) was found than in the current analysis with the constant ERR model (ERRpd = 0.86 Gy−1 (95% CI 0.13; 1.59)). Though in (Krestinina et al. 2007) a small increase in ERRpd with attained age has been seen, this was not significant.

It is very interesting to note that the risk estimates deduced from the Techa River cohort are somewhat larger than those obtained from the atomic bomb survivor (LSS) cohort, both for mortality (Preston et al. 2003, 2004) and incidence (Preston et al. 2007), though they are compatible within their 95% CIs. In ref. (Boice and Lubin 1997), the epidemiologic evidence of the relation between occupational and environmental radiation and cancer is reviewed. Recently, in the 15-country nuclear worker study (Cardis et al. 2007) and the Hanford workers study (Wing and Richardson 2005), a higher risk than in the LSS cohort and well in agreement with the Techa River cohort has been observed. For a more detailed discussion of the relation of these risk estimates to the Techa River Cohort and the dependence of risk on attained age and age at exposure see (Eidemüller et al. 2008).

One should be careful in relating the estimated parameter values of the TSCE models to actual biological processes. Biological mechanisms for various cancer sites might be different, with a different number of carcinogenesis stages or a different dose response shape. Thus, the results obtained in the present study should only be interpreted as average values for all solid cancers and cannot be directly transferred to specific cancer sites. Furthermore, as discussed in (Portier and Masri 1997; Hazelton et al. 2001), different models could fit the data equally well. For example, based on the deviance it is not possible to distinguish in the mortality cohort between the TSCE models with LEP and AS, though they represent very different biological mechanisms. On the other hand, there is a good indication for a direct radiation effect on an early stage of carcinogenesis because models with a direct radiation action on promotion or transformation alone fitted the data significantly worse, and adding such mechanisms to a radiation action on the initiation rate does not improve the fits.

All analyses in this study have been performed both with TSCE and ERR models. The good agreement between the models, which differ substantially both in baseline and radiation risk parametrisation, indicates that the baseline confounder and risk estimates are properties of the data rather than artefacts of the models. The direct comparison of incidence and mortality data allows the development of TSCE models that are based on the same mechanistic processes for the radiation effects in both cohorts. In particular, in both cohorts the best description was achieved with a model incorporating a lifelong radiation effect on promotion. Whereas for the mortality data, a lifelong radiation effect on initiation was almost as good as a lifelong radiation effect on promotion, the incidence data showed a strong preference for a lifelong radiation effect on promotion. Since it is not clear if some of these results are specific to the Techa River cohort, it will be important to apply this type of model to other cohorts, to see whether common mechanisms could be identified for a larger number of cohorts. Note that in an analysis of the Swedish hemangioma cohort (Eidemüller et al. 2009) a significant improvement was found for a model with a lifelong radiation effect on initiation. Due to the early exposure, the Swedish hemangioma cohort is especially suited to detect radiation effects at an early stage of carcinogenesis.

Comparison of the mortality and incidence cancer risks obtained in the present work showed that in the Techa River cohort the central ERRpd estimate at the mean age of solid cancer mortality or incidence is very similar in both cohorts, and all models agree well on this value in spite of the large error bounds involved. Although the dependence of the risk on modifiers such as attained age is more uncertain, the preferred TSCE models showed a remarkable agreement in the increase of risk with age. Taking into account all models and both cohorts, the models indicate an increase in ERRpd with attained age, though the ERR model for the incidence data is compatible with a constant risk.