Introduction

The incidence of breast cancer (ductal carcinoma in situ (DCIS) and invasive breast cancer) has risen substantially over the past 20 years [1, 2], in parallel with increasing use of screening mammography. While DCIS overdiagnosis is well accepted as a consequence of mammography screening [3], overdiagnosis of invasive breast cancer is more controversial and more counterintuitive. However, it is plausible that overdiagnosis of invasive breast cancer could occur through extension of the “length time” effect of screening, that is, the tendency of screening to detect less aggressive or even inconsequential cancers while more active cancers occur as interval cases [4]. Overdiagnosis is important because it results in overtreatment, including surgery, radiotherapy and endocrine therapy, of women who would not have been diagnosed or treated for breast cancer without screening. In addition, it has psychosocial consequences including anxiety, depression, labelling and impacts on insurance status [5].

Quantifying the extent of overdiagnosis of invasive breast cancer in a screened population is difficult because other factors may also contribute to an observed increase in incidence. First, changes in the prevalence of breast cancer risk factors such as nulliparity, obesity and use of hormone replacement therapy (HRT) may contribute [6, 7]. Secondly, some of an observed increase should be due to lead time [8]. Furthermore, the methodology for estimating overdiagnosis is not well established and susceptible to bias. Previous estimates of overdiagnosis of invasive breast cancer vary widely from negligible [9] to 56% [10] in those aged 50–69 years. However, many estimates are biased by inadequate methods [8], in particular, inappropriate or inadequate adjustment for lead time and concurrent increases in breast cancer risk factors. The best way to adjust for lead time is contested [11].

Our aim was to estimate the frequency of overdiagnosis of invasive breast cancer using a methodologically sound approach that allowed for both the increasing prevalence of important breast cancer risk factors and the lead time that should occur with screening mammography.

Contextual information for mammography screening in NSW, Australia

An indication of the possible overdiagnosis of invasive breast cancer associated with current screening is shown in Fig. 1, where the age-specific breast cancer incidence curve for women in New South Wales (NSW), the most populous state of Australia, is substantially higher in 1999–2001 than it was in 1972–1983. Organised mammography screening was well established in NSW by 1999–2001; this period immediately preceded widespread publicity of the link between hormone replacement therapy (HRT) use and breast cancer and a subsequent sharp drop in HRT use.

Fig. 1

Mean annual age-specific breast cancer incidence in women prior to universal health insurance (Medicare) (1972–1983) and under established organised mammography screening (1999–2001) in NSW (Source: NSW Central Cancer Registry)

Overall, approximately 4,000 new cases of invasive breast cancer are diagnosed annually in NSW, and approximately 50% of these are in women aged 50–69 years. In women aged <40 and ≥80 years, well outside the 50–69-year target age range for screening, incidence rates changed little over 1972–2001 (Fig. 2). Rates in 40–49 and 70–79-year-old women, who are eligible for and partly exposed to screening, rose somewhat over the same period, while incidence rates in target women (50–69 years) closely tracked their participation in mammography screening and proportionately increased the most.

Fig. 2

Annual age-specific breast cancer incidence in NSW women aged 25–40, 40–49, 50–69, 70–79 and 80+ years, and estimated annual percent of NSW women aged 50–69 years having screening mammography. Includes mammograms done by BreastScreen NSW and mammograms reimbursed by Medicare (bilateral mammographic examinations [item number 59300], obtained from the Health Insurance Commission)

In 1984, diagnostic mammography became reimbursable by Medicare, Australia’s universal health insurance scheme. Breast cancer incidence in 50–69-year-old women began to rise from about 1984, then more sharply with the roll-out of the national government-funded mammography screening programme through BreastScreen Australia (Fig. 2). BreastScreen services reached full coverage in NSW in 1995 (with approximately 60% participation in biennial screening among women aged 50–69 years), and incidence rates in the target age groups steadied thereafter to about 250–300 per 100,000 women, compared with 130–180 per 100,000 before 1984.

Methods

Broadly, our approach compares the observed and expected annual incidence of invasive breast cancer for a screened population in which screening had reached a “steady state” situation [8]. Our calculation of the expected incidence in the screened population includes two components: first, we estimated the expected annual incidence of invasive breast cancer in NSW in 1999–2001 without screening, taking into consideration the risk factor profile of the population at that time; and then secondly, we adjusted this incidence for lead time to allow for the increment in incidence due to advancement by screening of the time of diagnosis of cancers that would have presented clinically [12]. These two steps enable us to calculate the expected incidence with screening in NSW in 1999–2001 in the absence of overdiagnosis. Thus, any difference between the expected incidence and the observed incidence must reflect overdiagnosis. The timeframe 1999–2001, 5 years after full geographic coverage was achieved, was chosen because screening was then in a “steady state” situation with prevalent screens constituting only about 10% of screening mammograms, and because it preceded reductions in HRT use following widespread publicity of the link between HRT and breast cancer.
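The comparison of observed and expected incidence can be sketched as follows. This is a minimal illustration only: the text does not spell out the denominator, so the sketch assumes that overdiagnosis is expressed as the excess of observed over expected incidence as a percentage of expected incidence. The function name and the example rates are hypothetical and are not taken from the tables.

```python
def overdiagnosis_percent(observed, expected):
    """Excess of observed over expected incidence, as a percentage of expected.

    Assumption: the denominator is the expected incidence with screening
    (already adjusted for risk factors and lead time).
    """
    if expected <= 0:
        raise ValueError("expected incidence must be positive")
    return 100.0 * (observed - expected) / expected


# Hypothetical rates per 100,000 women, for illustration only.
example = overdiagnosis_percent(observed=300.0, expected=200.0)
```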

All incidence rates, both observed and expected, were rates of invasive cancer. Ductal carcinoma in situ (DCIS) was excluded from all our analyses. Prior to the advent of screening in NSW and up until 1995, DCIS was not reliably recorded on the NSW Central Cancer Registry and, consequently, it was not possible to reliably estimate screening-related changes in DCIS incidence.

Expected incidence

Expected incidence without screening

Linear regression modelling was used to estimate invasive breast cancer incidence for NSW women without screening. We used interpolation and extrapolation approaches to the modelling:

(1) Interpolation approach: incidence in NSW women in unscreened age groups (women 40 years or younger and 80 years or older) for 1972–2001 was modelled by 5-year age group, with incidence for the intermediate screening target age groups (50–69 years) interpolated from the modelled estimate for age when treated as a single ordinal variable (e.g. youngest age group, 25–29 years = 1; oldest age group, 85+ years = 13). That is, the form of the regression model was:

$$ {\text{Incidence}} = \alpha + \beta \times {\text{Age}} + \varepsilon . $$
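A minimal sketch of the interpolation idea, assuming a single cross-section of rates and an ordinary least-squares line fitted with NumPy. The grouping of codes into "unscreened" and "target" sets is our reading of the description above, and the rates are hypothetical, exactly linear values chosen only to make the behaviour easy to check. The analysis in the paper was run in SAS over 1972–2001 data.

```python
import numpy as np

# Ordinal codes for 5-year age groups: 25-29 = 1, ..., 85+ = 13.
UNSCREENED_CODES = np.array([1, 2, 3, 12, 13])  # 25-39 and 80+ years
TARGET_CODES = np.array([6, 7, 8, 9])           # 50-54, 55-59, 60-64, 65-69


def interpolate_incidence(codes, incidence, target_codes):
    """Fit Incidence = alpha + beta * Age and evaluate it at target_codes."""
    beta, alpha = np.polyfit(codes, incidence, deg=1)  # slope first, then intercept
    return alpha + beta * np.asarray(target_codes)


# Hypothetical rates per 100,000 (illustration only, not registry data).
unscreened_rates = 10.0 + 20.0 * UNSCREENED_CODES
interpolated = interpolate_incidence(UNSCREENED_CODES, unscreened_rates, TARGET_CODES)
```

Because the line is fitted only to age groups outside the screening target range, the interpolated values for codes 6 to 9 are not influenced by screening-era incidence in women aged 50–69 years.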

(2) Extrapolation approach: incidence for the period prior to the introduction of screening (1972–1983) was modelled for all 5-year age groups, 25–85+ years as categories, and extrapolated to 1999–2001 by modelling time explicitly in years since 1972. This regression model was specified as:

$$ {\text{Incidence}} = \alpha + \beta_{1} \times {\text{Age}}_{1} + \cdots + \beta_{13} \times {\text{Age}}_{13} + \beta_{14} \times {\text{time}} + \varepsilon . $$
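The extrapolation model can be sketched in the same spirit. This is an illustrative least-squares fit, not the SAS code used for the analysis. It assumes one indicator column per age group plus a linear time term, and it lets the age indicators absorb the intercept so that the design matrix has full rank. Three age groups and made-up rates are used to keep the example short.

```python
import numpy as np


def fit_extrapolation(age_idx, years_since_1972, incidence, n_ages):
    """Least-squares fit of incidence on age-group indicators plus linear time."""
    age_idx = np.asarray(age_idx)
    X = np.zeros((len(age_idx), n_ages + 1))
    X[np.arange(len(age_idx)), age_idx] = 1.0   # one indicator column per age group
    X[:, n_ages] = years_since_1972             # shared linear time trend
    coef, *_ = np.linalg.lstsq(X, np.asarray(incidence, dtype=float), rcond=None)
    return coef


def predict(coef, age_idx, years_since_1972):
    """Predicted incidence for given age groups at a given time point."""
    return coef[np.asarray(age_idx)] + coef[-1] * years_since_1972


# Hypothetical pre-screening data (1972-1983 -> years 0..11), illustration only.
base = np.array([100.0, 150.0, 200.0])   # made-up 1972 rates for three age groups
slope = 1.5                              # made-up annual increase per 100,000
years = np.tile(np.arange(12), 3)
ages = np.repeat(np.arange(3), 12)
rates = base[ages] + slope * years

coef = fit_extrapolation(ages, years, rates, n_ages=3)
extrapolated_2000 = predict(coef, [0, 1, 2], 28)  # 2000 is 28 years after 1972
```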

Adjustment for changes in major breast cancer risk factors

Estimates of the prevalence of two of the main breast cancer risk factors among NSW women, HRT use and obesity, were obtained for the periods before screening was introduced and for 1999–2001 (Table 1) [13–15]. Nulliparity at age 30 in women aged 50–69 years was estimated from historical fertility data for NSW [16] (Table 1) and is assumed to reflect delayed childbirth. We obtained estimates of relative risks for breast cancer for each risk factor [17–24] (Table 1). The combined risk factor contributions were estimated for periods before (1980–1983) and during screening (1999–2001) by calculating and summing the population attributable fractions (PAF) for each risk factor in each period.

Table 1 Relative risks, prevalences and attributable fractions for breast cancer risk factors before and after the introduction of screening

We then used the difference in the combined PAFs between the two periods to adjust upward the expected incidence. We adjusted for HRT use and nulliparity, and for obesity in non-HRT users only, since HRT use by obese women appears to neutralise obesity as a risk factor for breast cancer [13–15]. The PAF adjustment approach is best illustrated by an example: if breast cancer incidence prior to screening was 100 per 100,000, and 30% of that incidence was attributable to the combined effects of HRT use, obesity and nulliparity, then 30 out of 100 breast cancer cases would be attributable to these risk factors. If the prevalence of these risk factors increased to produce an overall attributable fraction of 40% (i.e. 40 breast cancer cases out of 100 were attributable to risk factors), then the total incidence would be expected to increase to around 110 per 100,000 as a result of the increased prevalence of the risk factors.
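The worked example above can be reproduced with a short sketch. Two assumptions are made here that the text does not state explicitly: each risk factor's PAF is computed with Levin's formula, and the upward adjustment multiplies the expected incidence by one plus the difference in combined PAFs, which is what the 100 to 110 example implies. The function names are hypothetical.

```python
import math


def levin_paf(prevalence, relative_risk):
    """Population attributable fraction by Levin's formula (an assumption here)."""
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


def adjust_expected(incidence, paf_before, paf_during):
    """Raise expected incidence by the difference in combined PAFs.

    Assumption: additive adjustment, as implied by the worked example
    (30% -> 40% attributable takes 100 per 100,000 to about 110).
    """
    return incidence * (1.0 + (paf_during - paf_before))


# The worked example from the text: combined PAF rising from 30% to 40%.
adjusted = adjust_expected(100.0, paf_before=0.30, paf_during=0.40)
```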

As a range of estimates of breast cancer risk factor prevalences was available, we chose the lowest estimates of them prior to mammography screening and the highest during the mammography screening period of interest (1999–2001). This approach has the effect of producing the highest estimate for rises in attributable fractions, which in turn produce the highest estimate of expected breast cancer incidence, and consequently a conservative (minimum) estimate of overdiagnosis.

Adjustment for lead time

The estimated expected incidence in the population without screening was shifted forward by 2.5 and 5 years to generate the expected incidence with screening. For example, for a lead time of 5 years, expected incidence in 65-year-old women without screening is regarded as the incidence to be expected in 60-year-old women with screening. When assuming a lead time of 2.5 years, an interpolated value between the two age groups was adopted. That is, expected incidence in 62.5-year-old unscreened women was assumed for women aged 60 years with screening.
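The lead-time shift can be sketched as a linear interpolation along the age axis. This is an illustration under stated assumptions: each 5-year age group is represented by its lower age bound, rates between groups are interpolated linearly, and the rates shown are hypothetical. The 70–74 group is included only so that the 65–69 group has an older neighbour to borrow from.

```python
import numpy as np


def shift_for_lead_time(ages, rates_without_screening, lead_time_years):
    """Expected rate with screening at each age = unscreened rate at age + lead time.

    np.interp holds the last value constant beyond the oldest age supplied,
    so the final element of the result should not be used.
    """
    ages = np.asarray(ages, dtype=float)
    return np.interp(ages + lead_time_years, ages, rates_without_screening)


# Hypothetical unscreened rates per 100,000 for groups starting at these ages.
ages = [50, 55, 60, 65, 70]
rates = [100.0, 120.0, 140.0, 160.0, 180.0]

with_5y = shift_for_lead_time(ages, rates, 5.0)[:4]     # ages 50-69 only
with_2_5y = shift_for_lead_time(ages, rates, 2.5)[:4]   # midpoint of adjacent groups
```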

Observed incidence

The average annual age-specific incidence of breast cancer in women aged 50–69 years in NSW was obtained for the years 1999–2001 from incidence rates published by the NSW Central Cancer Registry (Fig. 1). During 1999–2001, most (89.9%) screened women of this age were having subsequent rather than initial (prevalent round) screens [25].

The cumulated incidence per 100,000 for women aged 50–69 years was calculated as 5 times the sum of the annual age-specific incidence rates for the 5-year age groups (50–54, 55–59, 60–64, 65–69) making up the 50–69-year age bracket [26].
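As a small illustration of the cumulated incidence calculation, the sketch below multiplies the sum of the annual age-specific rates by the width of each age band. The function name is hypothetical and the rates are made up.

```python
def cumulated_incidence(annual_rates_per_100k, band_width_years=5):
    """Cumulated incidence per 100,000 over consecutive equal-width age bands."""
    return band_width_years * sum(annual_rates_per_100k)


# Hypothetical annual rates for 50-54, 55-59, 60-64 and 65-69 years.
cumulated = cumulated_incidence([100.0, 120.0, 140.0, 160.0])
```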

Further details of any component of these methods are available from the authors on request. Regression modelling was conducted using SAS, v.8.2 [27].

Results

The expected invasive breast cancer incidence rates in 1999–2001, without screening and adjusted for risk factor trends, ranged from 138 to 169/100,000 in women 50–54 years of age in the interpolation and extrapolation models, respectively, to 238 to 251/100,000 in women aged 70–74 (Table 2). The regression models explained 95 and 97% of the variance in incidence in the interpolation and extrapolation models, respectively. Adjustment for long-term time trends made little difference to the estimates of expected incidence, and no significant interactions between time and age group were found in the extrapolation model.

Table 2 Estimated expected breast cancer incidence in 1999–2001 without screening from the interpolation and extrapolation models, with and without adjustment for major breast cancer risk factors

The expected incidence rates from the interpolation and extrapolation models, further adjusted for assumed mean lead time, and the observed age-specific incidence rates in 1999–2001 are given in Table 3. Cumulated incidence rates for ages 50–69 years are also given. From these, estimates of the overdiagnosis were calculated, as shown in Table 3, for the two different estimates of expected incidence for each age group. Thus, our estimates of overdiagnosis of invasive breast cancer are 53 and 35% for women aged 50–54 years, 56 and 37% for women 55–59, 43 and 35% for women 60–64, and 21 and 15% for women 65–69, allowing for 5-year lead time. Across the screening age range, 50–69 years, our estimates of overdiagnosis are 42 and 30% from the interpolation and extrapolation models, respectively. Corresponding estimates of overdiagnosis are 51 and 36% if the mean lead time is assumed to be 2.5 years.

Table 3 Estimates of overdiagnosed invasive breast cancer associated with mammography screening in NSW women aged 50–69 years, with adjustment for various mammography screening lead times and for changes in risk factors coinciding with advent of screening

Discussion

This work provides age-specific and cumulated estimates of overdiagnosis of invasive breast cancer associated with mammography screening in 50–69-year-old women in NSW targeted for screening by BreastScreen NSW. Ours is the first study to have adjusted both for changes in breast cancer risk factors coinciding with the advent of screening and for lead time; therefore, we are confident in attributing the observed breast cancer excess to overdiagnosis associated with screening. Our results indicate there is substantial overdiagnosis of invasive breast cancer attributable to screening mammography in NSW. However, we think our estimates of overdiagnosis are probably conservative for three reasons. First, we chose the most extreme estimates of changes in risk factor prevalence between 1980 and 2001; thus, we probably over-adjusted for the risk factor trends contributing to the rise in breast cancer incidence. Second, our estimates were based on a population in which 60% of women were having biennial mammograms [25]. They would almost certainly be greater if screening participation were greater. Third, our results are limited to invasive cancer, as reliable data on DCIS rates prior to screening were not available. About 18% of all breast cancers diagnosed by BreastScreen NSW are DCIS; therefore, our estimates would be substantially higher if we included DCIS.

Our estimates of overdiagnosis were greater in younger women than in older women. Some readers may find this counterintuitive, as one might expect cancers of limited biological significance to be more prevalent in older women. The difference might partly be accounted for by a higher percentage of prevalent screens among younger women; however, other explanations may also be possible and should be explored. We note that we endeavoured to keep any effect of prevalent-round screens to a minimum by using data from a time period when screening had reached full geographic coverage and prevalent screens accounted for around 10% of all screens. In addition, others have estimated the effect of prevalent screens on overdiagnosis estimates as “minimal” [28].

The accuracy of our estimates of overdiagnosis depends on a number of factors, the most important of which are the accuracy of our estimates of incidence without screening, the adequacy of the adjustment for breast cancer risk factors, and the validity of the adjustment for lead time.

Incidence without screening

We estimated incidence without screening from rates during 1972–1983, well prior to any organised screening in NSW. We also examined age groups not offered screening, women <40 and >80 years, during the whole 1972–2001 period (Fig. 2). The differences in estimates of overdiagnosis produced from these two approaches (Table 3) indicate some uncertainty around our results but do not change the overall finding of a substantial degree of overdiagnosis. Estimates from the interpolation method could be biased if 50–69-year-old women experienced increases or decreases in underlying incidence from 1984 to 2001 not present in those <40 and 80+ years of age because the linear assumption underlying the interpolated age-specific estimates would be violated. Estimates from the extrapolation method, which modelled incidence separately for each age category without assuming a linear increase with age, could also be biased if the trend in the underlying incidence from 1984 to 2001 in 50–69-year-old women differed materially from their pre-1984 trend. However, the most likely cause of such effects is a change in the prevalence of breast cancer risk factors, which we have adjusted for.

Adjustment for breast cancer risk factors

Relatively rapid and substantial changes in breast cancer risk factors, particularly HRT use, obesity and delayed reproduction, occurred among women aged 50–69 years during the period in which population screening was introduced [13–15]. Current HRT use in 1999–2001 was approximately 30% among women aged 50–69 years; the lowest estimate we found in the pre-screening period was 2% [13]. HRT use sharply declined following widespread publicity in 2001 of the link between HRT and breast cancer [29]. Accordingly, we avoided studying the post-2001 period as this would complicate estimating screening-related overdiagnosis of breast cancer.

Obesity prevalence also increased substantially during the 1990s: in the period of steady-state screening (1999–2001) it was estimated to be about 29% in women aged 50–69 years compared to 11% in 1980 [15]. The estimated nulliparity rate at age 30 in women aged 50–69 during 1980–2001 fell in the 1980s to about 20% but rose again during the 1990s to reach about 28% in 1999–2001. Use of the maximum estimated changes in the prevalences of these risk factors when adjusting the expected incidence of breast cancer for 1999–2001 will tend to produce underestimates of overdiagnosis if the prevalence changes were, in fact, smaller.

Adjustment for lead time

In the absence of overdiagnosis, lead time should be the only factor that distinguishes expected incidence in a population with screening from that in an identical population without screening. We dealt with lead time by shifting the modelled incidence estimates in women without screening forward by 5 or 2.5 years. While this is a somewhat simplistic approach, dictated in part by ready availability of cancer and screening data by 5-year age group, it is probably conservative, as lead time is likely to lengthen with age and is most likely to be less than 5 years overall, especially in women aged 50–69 years [30]. Estimates of the duration of the pre-clinical stage of breast cancer for standard-risk women range from 2.1 (at age 50) to 4.7 years (at age 70) [31]. By definition, lead time estimates would be less than these estimates and so would be expected to be less than 5 years.

Comparison with other studies

Our estimates of overdiagnosis are higher than the estimates derived from most of the screening trials, which are generally less than 30% [8, 32, 33]. While randomised controlled trials remain the strongest study design for estimating overdiagnosis [8], we can suggest several possible reasons for this difference.

First, screening of the control group and incomplete participation by those offered screening in trials would, by reducing the difference in screening between the two groups, bias overdiagnosis estimates downward [8]. In the case of the Malmö trial, for example, after adjustment for lead time and opportunistic screening occurring in 24% of the control group, overdiagnosis was re-estimated as 25%, up from 10% [11]. Furthermore, some estimates of overdiagnosis have been derived from follow-up of women in trials in which the control group was offered screening at the conclusion of the trial, which would also bias estimates downward [8].

Second, screening may have changed since the trials were conducted. Detection rates in the late 1990s were probably higher than when the mammography screening trials were conducted in the 1980s. In the case of the BreastScreen NSW programme, detection rates increased from about 28 to 42 per 10,000 women screened over the late 1990s [34]. Higher detection rates might lead to greater overdiagnosis [35]. Indeed, increased detection rates in Australia have not been accompanied by decreased interval cancer rates [2], suggesting that the increased detection has simply produced greater overdiagnosis. Alternatively, pathologists may be over-calling screen-detected lesions as cancer, although we know of no evidence that this is occurring.

Our estimates are, however, similar to other estimates that have been derived from recent observational data from screening programmes. These include Swedish data estimating overdiagnosis at 54 and 21% for women aged 50–59 and 60–69, respectively [36]; data from Norway estimating overdiagnosis at 56% for women aged 50–69 years [10]; and a recent international meta-analysis estimating overdiagnosis at 52% [37]. The similarity between our results and these data from other screening programmes lends credence to our estimates, particularly as the methods used by us and others are somewhat different. In our study, we have thoroughly adjusted for changes in the prevalence of important risk factors, particularly HRT and obesity, and we still obtain estimates of overdiagnosis that are substantial. Our estimates are somewhat lower than those recently derived for NSW by Jorgensen et al. [37], not only because of our different approach but because we have adjusted for risk factor prevalence changes and excluded DCIS.

Conclusion

The incidence of invasive breast cancer has increased substantially in women of screening age since the implementation of screening programs. Our estimates of overdiagnosis (42 and 30% from the interpolation and extrapolation models, respectively) are robust, as we have used a methodologically sound approach, which includes thorough adjustment for breast cancer risk factors such as HRT and obesity, and for lead time. Our findings suggest that overdiagnosis of breast cancer associated with screening is greater than is generally acknowledged.

Implications of this research

It is important that practitioners, policy makers, and women attending for, or considering, screening are aware of the potential extent of breast cancer overdiagnosis and consequent overtreatment. We have already published one randomised trial of a decision aid presenting the benefits and harms (including overdiagnosis) of breast cancer screening to women [38]. More research is needed into ways of presenting information, including on overdiagnosis, to help women make an informed choice about whether to participate in screening. In addition, efforts should be made to minimise overdiagnosis and overtreatment, for example by conducting trials of less aggressive treatment for women with screen-detected cancers and by developing methods for predicting which screen-detected cancers would be unlikely to progress during a woman’s lifetime [39–41].