Introduction

For most cancer sites, risk increases exponentially with age. Therefore, the rapid changes in age structure in the European populations towards elderly people in the next decades will cause an increase in the number of cancer patients and this increase is of very high public health interest. Think, for example, of the costs to the health care system caused by cancer patients or think of challenges in planning and/or allocation of resources.

The principles underlying prediction models make it a very challenging task to predict valid numbers for cancer incidence. First of all, cancer prediction calls for the time trend of cancer incidence to be modelled. The key question is which model is the best model for a given picture of cancer incidence in a given country. While general principles exist, modelling requires that the best model be chosen for each particular cancer site. Second, prediction of cancer figures calls for a thorough understanding of the factors influencing time trends. It goes without saying that predictions, at least long-term predictions, are cloaked in uncertainty, because, for example, we cannot predict how factors that are influenced by political or public health decisions will behave in the future. For instance, decisions on screening activities or on stop-smoking campaigns to take place in the future will change cancer incidence. Third, given a number of uncertainties, a cautious interpretation of the results is warranted. These challenges might be even greater in a smaller country, where current time trends are inherently more exposed to random variation. In summary, prediction of cancer incidence is a challenging task, but one that is very important from the public health perspective [1].

The general assumption underlying all prediction modelling is that the current trend can be extrapolated to the future. Thus, one of the key assumptions is that the registration of cancer incidence will remain at a constant high level of completeness during the whole prediction interval. Also, the choice of prediction base, namely the time period from which the observed time trend is extrapolated to the future, is an important issue. If the chosen prediction base is too short, the predicted numbers could rely too strongly on most recent data, which might not characterise future development, or they could rely on random variation. If, on the other hand, the chosen prediction base is too long, the predictions can be too strongly influenced by historical time data, which again might not characterise the future situation.

In Tyrol, cancer incidence data have been registered on a high level of completeness for more than 20 years. Therefore, we aimed to compute well-founded predictions for cancer incidence for Tyrol for the year 2020 for all frequent cancer sites and for all cancer sites combined.

Methods

Cancer Incidence data are collected by the Cancer Registry of Tyrol. The Cancer Registry of Tyrol was established in 1986 and has registered data on a population basis since 1988. Also since 1988, registry data have been published in Cancer Incidence in Five Continents [2]. A detailed analysis of completeness as well as a comprehensive analysis of data quality has been published elsewhere [3]. We aimed to calculate predictions for all cancer sites with at least 20 incident cases per year and sex.

Prediction was computed based on a methodology proposed by Hakulinen and Dyba [46]. The underlying assumptions are: (1) future cancer incidence trends can be modelled by extrapolating the historic trend; (2) the length of the data time series (prediction base range) permits models to be estimated that take into account age- and sex-specific trends; (3) the number of events in each age-sex-time period stratum are Poisson-distributed. Based on these assumptions, Hakulinen and Dyba proposed the following linear model

$${d_{it}} = {n_{it}}({\alpha _i} + {\beta _i}t)$$
(1)

for increasing time trends, and the following log-linear model

$${d_{it}} = {n_{it}}\exp ({\alpha _i} + {\beta _i}t)$$
(2)

for decreasing time trends, with i and t denoting an index for age class and time, respectively, dit and nit denoting the number of cases and the person-years in each age-time stratum, respectively, and αi and bi denoting the unknown parameters to be estimated by the model. We applied age-specific models throughout this analysis in order to allow for different time trends in age groups. Age was categorized in broader groups, namely 0–49, 50–59, 60–69, 70–79 and ≥ 80. For example, for female lung cancer in the age class 50–59 ( i = 2) the estimates for \({\hat \alpha _2}\) and \({\hat \beta _2}\) are calculated on the basis of the data observed in the PBR. Based on the population estimate \({\hat d_{2,2020}}\) for the number of females in age class 50–59 in year 2020 we can finally calculate the estimate \({\hat n_{2,2020}} = {\hat d_{2,2020}}*({\hat \alpha _2} + {\hat \beta _2}*2020)\). After having calculated the estimates for each age class 0–49, 50–59, 60–69, 70–79 and ≥ 80, the total number of female lung cancer cases is calculated as the sum of the predicted number of cases in the individual age classes.

In order to obtain a more stable estimate for some rarer cancer sites, the number of cases per year and age group where smoothed by applying a 3-year window.

Concerning the choice of PBR, namely the time interval defining the time trend that is extrapolated according to models (1) and (2), we carefully discussed the time trend for each particular cancer site. The decision was to define the PBR as the time period from 1990 to 2009, thus omitting the first 2 years of registration, which could reflect a learning curve. We found a few exceptions from this general rule, namely bladder cancer (PBR 2003–2009), prostate cancer (PBR 1993–2009) and all sites except NMSC including prostate cancer (PBR 1993–2009). These decisions will be discussed in detail in the Discussion section.

We report the prediction in absolute number of cases, because this is the most relevant figure for both medical doctors and health policy makers. The prediction interval was estimated as confidence interval for rates. In addition, we add the relative proportion of the increase based on the actual numbers (mean number of incident cases in the years 2007–2009). In order to estimate the proportion of increase attributable to the shift in age structure only, we compared the estimated number of cases with the number of cases obtained when applying the estimated age-specific incidence rates for the year 2020 to the estimated population size for the year 2020, but based on the age structure of 2009.

All computations were performed with Stata, Version 11.0 [7], applying a package written by T. Dyba [8].

Population data, namely for the time period from 1988 to 2009 as well as projections for 2020, were supplied by Statistics Austria [9]. The population of Tyrol in 2009 was 704,792, 51.1 % females. The predicted population size for the year 2020 is 726,847, i.e. an increase of 3.0 %. The age group ≥ 70 made up 10.7 % of the population in 2009 and is predicted to be 13.9 % in 2020.

Results

Throughout the Results section, increase or decrease in the number of incident cases is defined as the absolute difference in the numbers predicted for year 2020 as compared with the average number of cases in the period 2007–2009. Percentages are relative percentages of the increase in absolute numbers as compared with the average number of cases in 2007–2009.

For females, for all cancer sites combined except NMSC, we predicted an increase of 235 (15 %) in the number of incident cases for 2020, see Fig. 1. We predicted, for 2020, an increase of more than 20 cases per year for breast cancer (92 cases, 21 %), lung cancer (64 cases, 52 %), colorectal cancer (40 cases, 24 %), melanoma (38 cases, 30 %) and the haematopoietic system (37 cases, 35 %); details are shown in Table 1. A decrease for the year 2020 was predicted for stomach cancer (− 12 cases, − 24 %), cervical cancer (− 5 cases, − 12 %) and ovarian cancer (− 4 cases, − 6 %). For all cancer sites except NMSC 94 % of the increase is due to the shift towards older age groups, and for most cancer sites with a predicted increase more than half of the increase can be attributed to the shift towards older age groups.

Fig. 1
figure 1

Prediction of the number of female incident cancer cases in Tyrol/Austria (all cancers except non-melanoma skin cancer)

Table 1 Prediction of the number of female cancer incidence cases for the year 2020 with a 95 % prediction interval for Tyrol/Austria; by cancer site

For males, for all cancer sites combined except NMSC, we predicted an increase of 362 (21 %) in the incidence number for the year 2020, see Fig. 2. An increase for 2020 of more than 20 cases per year was predicted for prostate cancer (105 cases, 25 %), colorectal cancer (91 cases, 45 %), the haematopoietic system (71 cases, 55 %), bladder cancer (69 cases, 100 %), melanoma (64 cases, 52 %), head and neck cancer (32 cases, 42 %), pancreatic cancer (27 cases, 55 %), lung cancer (25 cases, 11 %) and liver cancer (23 cases, 62 %). The only cancer site with a predicted decrease of more than 10 cases for 2020 was stomach cancer (− 15 cases, − 22 %); details are shown in Table 2. For all cancer sites except NMSC 98 % of the increase is due to the shift toward older age groups in the next decade. For most cancer sites with a predicted increase more than half of the increase can be attributed to shifts in age structure, the only exception being melanoma, for which 25 % of the increase is attributable to shifts in age structure.

Fig. 2
figure 2

Prediction of the number of male incident cancer cases in Tyrol/Austria (all cancers except non-melanoma skin cancer)

Table 2 Prediction of the number of male cancer incidence cases for the year 2020 with a 95% prediction interval in Tyrol/Austria; by cancer site

Discussion

For the year 2020, as compared with 2007–2009, for all cancer sites combined except NMSC we demonstrated in Tyrol/Austria an increase in the number of incident cancer cases by 15 % for females and 21 % for men. The biggest increase in absolute numbers is seen for females in breast cancer (92), lung cancer (64), colorectal cancer (40), melanoma (38) and the haematopoietic system (37) and for males in prostate cancer (105), colorectal cancer (91), the haematopoietic system (71), bladder cancer (69) and melanoma (64).

Our results are in line with those reported in the literature. A number of authors have demonstrated an increase in cancer incidence and/or mortality worldwide [1, 1013]. The increase in corpus cancer is in line with that found in an investigation conducted in Norway, in which Lindeman et al. concluded there would be a dramatic increase unless effective preventive strategies are implemented [14]. The decline in stomach cancer was also shown by Amiri et al. [15]. Quon et al. showed a dramatic increase in prostate cancer incidence for Canada [16], which corresponds fairly well to the increase in prostate cancer incidence demonstrated by our analysis.

However, when comparing results of cancer incidence prediction between countries, one must bear in mind that the distribution of cancer sites can differ, as is clearly demonstrated by the difference between high-income and low-income countries [13]. Coupland et al. pointed out that population trends are likely to differ between rural areas and entire countries [17].

After long deliberation we decided to also calculate a prediction for all cancer sites combined, despite the fact that several external factors that are expected to influence global cancer incidence were not modelled in our analysis. Some of the strong external factors are smoking habits (we know that in high-income countries about 30 % of cancer sites are attributable to smoking [18]) and screening attitudes or screening methods. With regard to smoking habits, no major steps in reducing smoking were achieved in Austria from 1900 to 2005 and, by comparison with many other countries, Austria took only “minor” steps to ban smoking in public places in 2005. Therefore, the effects of this smoking ban are expected to be small and, due to latency time, the effects are not expected to manifest themselves for another 15 years, meaning not before the year for which we predicted cancer incidence, namely 2020. From an international perspective, Strong et al. dealt with the interesting question of prevention of cancer through tobacco and infection control [19]. Soerjomataram et al. demonstrated that the lung cancer gap between the socioeconomic groups can be reduced to some extent by targeted interventions against smoking [20], Menvielle et al. investigated scenarios of future lung cancer incidence by educational level in Denmark [21], and Didkowska et al. predicted lung cancer incidence on the basis of forecasts of hypothetical changes in smoking habits [22].

More effects can possibly be expected from changes in screening attitudes, in our opinion mostly with regard to colorectal cancer. If colonoscopy screening intensifies in the upcoming years, we would expect a decrease in colorectal cancer incidence. However, no exact figures on colonoscopy screening attendance are available, and therefore there is considerable uncertainty concerning the prediction of colorectal cancer.

One factor that clearly influences the predicted number of cancer cases is the choice of prediction base range. Bearing in mind the small number of cases per year for some cancer sites in our population, we favoured a longer prediction base, namely from 1990 to 2009, in order to somewhat smoothen the rates caused by random variation. For a sensitivity analysis we investigated possible PBR alternatives, namely 1995–2009 and 2000–2009. After careful inspection of the time trend for all sites analysed, we concluded that in most situations, choosing a longer prediction base, as generally done in our analysis, gave very reasonable results (data not shown). The choice of a longer PBR could also be a good way to deal with random variation in historical data, which plays a role in a cancer registry smaller than ours.

Misspecification of the model could be a critical issue. However, in most situations the choice between the linear and the log-linear model was very straightforward. Therefore, errors due to model misspecification are rather unlikely. However, our model relies on the linearity assumption, which means using a linear or log-linear time trend of incidence in the PBR. First of all, one of the advantages of linearity is the principle of parsimony, meaning using the simplest model that fits the data [8]. In addition, linear models, when applicable, have the advantage of narrower prediction intervals [8]. Next, we carefully analysed the time trend in the PBR and saw that for the most influential cancer sites the linearity assumption was fulfilled, see Fig. 3. There are only two exceptions, namely prostate cancer and colorectal cancer, and these are discussed in the paragraph concerning the respective cancer site below.

Fig. 3
figure 3

Temporal trend of most frequent cancer sites in Tyrol/Austria in the prediction base range, with a linear regression line

We performed predictions for all cancer sites with at least 20 cases per year. We felt rather confident with the prediction for most cancer sites, because we were able to refer to a detailed analysis of the time trends in the past 20 years. Exceptions were colorectal cancer, prostate cancer, bladder cancer, thyroid cancer and melanoma, and in the next paragraphs we will discuss the pros and cons of conducting the prediction also for these cancer sites.

For colorectal cancer, we observed a decrease beginning at about 2005. This decrease could be attributable to colonoscopy screening, or random variation (we already observed a similar decrease in colorectal cancer incidence between 1995 and 2000, which in our opinion is most likely due to random variation), or to some other unknown cause. Since we have no data on colonoscopy screening attendance, we felt it would be best to base the prediction on a long-time PBR, which would act to smoothen random variation. However, the prediction for colorectal cancer is somewhat uncertain when we recall the possibility of a decrease of colorectal cancer incidence due to colonoscopy screening attendance and this time trend could continue in the next decade.

Breast cancer (BCA) incidence increased until 1993, after which it was rather stable with some minor fluctuations. The most plausible reason for the increase until 1993 is the introduction of spontaneous mammography screening between 1990 and 1993. Intensification of the screening program around 2000 and the introduction of an organized screening program in mid-2008 caused only minor changes in incidence: the most likely explanation for this is that after more than a decade of spontaneous mammography screening, further changes in the program caused only minor effects in the pre-screened population. Similar time trends are known from subsequent screening rounds in many countries.

The incidence of prostate cancer in Tyrol is severely distorted by PSA testing: after testing started in 1992, prostate cancer incidence approximately doubled already in 1993 [2325]. The past 5–7 years have shown a decrease in PCA incidence (most likely due to less intensive PSA testing). However, we are not able to predict PSA test attendance in the coming decade and, in fact, we do not have sound data on PSA test attendance since 2000 because we lack a screening database. Therefore, we decided to define the PBR for prostate cancer as the period from 1993 to 2009, because it should best reflect the time trend of prostate cancer incidence after introducing PSA testing in our region. However, given the uncertainties surrounding PSA test attendance in future, the prediction can also only be largely uncertain.

For bladder cancer we observed a sharp decrease in incidence, which is most likely due to the shift in the definition of invasive bladder cancer [1]. This decrease started in about 1995 and ended in 2003 (data not shown). The incidence of bladder cancer seems to have flattened since then at a lower level. Therefore, this cancer site was the only one with a rather short PBR, namely from 2003 to 2009, and we felt rather confident with the prediction based on the chosen PBR.

When looking at the time trend for thyroid cancer, we observed peaks in individual years that can be attributed to time-dependent screening campaigns. In this situation choosing a long PBR smoothes away these peaks, which we feel should give a reliable basis for prediction.

Finally, melanoma in Tyrol is characterised by a large shift in melanoma cases from hospitals to dermatologists in private practice. After a long discussion we decided to also predict the number of melanoma cases because of the large number of cases and the high public health relevance of this cancer entity. Our PBR choice was the long 20-year period from 1990 to 2009. However, besides the random variation shown in the prediction interval there is additional uncertainty in the prediction of melanoma because opportunistic melanoma screening seems to have established itself in the past 5 years. Nevertheless, we have no data on screening attendance and therefore are not able to predict the consequences more precisely. In summary, the prediction of melanoma incidence definitely entails some uncertainty.

This study entails strengths and limitations. Among the strengths is the good and constant quality of data in this regional cancer registry [3]. We applied a well-accepted method for prediction; the model we used is rather simple and misspecification is unlikely. Among the limitations are the small number of incident cases for certain cancer sites which can cause a bias due to random variation, the intrinsic problems for some cancer sites which are influenced by several external factors that we cannot reflect in the model (especially for prostate cancer and colorectal cancer) and in general the fact that the model we applied cannot reflect external factors.

In summary, we demonstrate an increase of 15 % in females and 21 % in males in the number of incident cancer cases in the next decade. Most of this increase is due to the shift in age structure towards elderly persons. However, the prediction intervals are still broad and therefore the results must be interpreted with some caution.