Introduction

Breast cancer is the second most common cancer in women in the USA [1]. In 2015, an estimated 231,670 women in the USA were diagnosed with breast cancer, and an estimated 40,000 died of the disease [2]. Incidence rates over the past 10 years have been relatively stable, with cancer deaths declining by approximately 1.9 % each year between 2002 and 2011. The lifetime risk of developing breast cancer is estimated at 12.3 %, with an approximate lifetime mortality risk of 2.8 % [3]. In 2006, the overall 5-year survival rate for breast cancer was 90.6 %, and an estimated 2.9 million women with breast cancer were living in the USA in 2011 [4].

Although many risk factors have been associated with breast cancer, most relationships are weak or inconsistent [5]. At present, no single external factor, environmental or dietary, has been shown to cause a genetic mutation leading to breast cancer. However, breast cancer has a known asymptomatic initial phase that can be detected with the use of diagnostic imaging. It is well documented in the literature that the earlier breast cancer is diagnosed, the better the prognostic outcome.

Currently, mammography is the gold standard imaging examination for breast cancer screening and has been shown to reduce mortality from breast cancer [6]. The USA has one of the most intensive screening programs, with the Patient Protection and Affordable Care Act mandating insurance coverage for annual mammograms beginning at the age of 40 [7]. For women considered to be in a high-risk category, there are recommendations that allow for a baseline mammogram earlier than age 40 [8].

Although there is consensus on the benefit of mammography (see Fig. 1), debate continues over the balance of benefits and harms of screening, the optimal screening intervals, and the appropriate ages to begin and end routine screening. The failure of experts to agree on a universal policy has led to much variation in practice guidelines.

Fig. 1 Benefit of mammography. Craniocaudal (a) and mediolateral oblique (b) views of the left breast from a screening mammogram in a 41-year-old female. The yellow circle shows a small invasive carcinoma that was detected on the initial screening mammogram.

Recently, the United States Preventive Services Task Force (USPSTF) released its revised guidelines advocating biennial screening for women aged 50 to 74 years [9••]. The American Academy of Family Physicians (AAFP) supports a similar recommendation, and both organizations state that the decision to initiate screening mammography prior to age 50 should be an individualized one [10]. The American College of Radiology (ACR), the American Congress of Obstetricians and Gynecologists (ACOG), and the National Comprehensive Cancer Network (NCCN) recommend yearly mammograms starting at age 40 and continuing for as long as the woman is in reasonably good health [8, 11, 12]. Modified guidelines from the American Cancer Society (ACS), released earlier this year, state that annual screening mammograms can be initiated at age 45 [9••].

Despite variations in clinical practice and guidelines, the volume of screening mammograms has remained high and relatively stable over the past decade [13, 14•]. Conflicting screening recommendations thus remain a major inconvenience to both providers and patients, obliging policymakers to resolve the controversies promptly.

Background

Since the 1960s, multiple screening studies have been conducted to evaluate the role of screening mammography and its influence on breast cancer mortality among women between the ages of 40 and 74 years [15•]. Even today, studies debating the benefits of screening mammography continue to be published. Bleyer et al. compared eight countries in Europe and North America and concluded that there is no support for the hypothesis that mammography screening is a primary reason for the reduction in breast cancer mortality [16•]. However, breast cancer is a heterogeneous disease with multiple confounding variables in both its course and its treatment, so no single causal factor can account for changes in mortality. An attempt to single out one primary reason for the reduction in breast cancer mortality is therefore flawed in its scientific question.

To understand the variability in the screening guidelines, one has to understand the measures, inclusion criteria, limitations, and endpoints of the study designs. The screening studies are categorized as observational or experimental. Experimental studies are classified into randomized clinical trials (RCTs) and systematic reviews/meta-analyses.

RCTs randomly assign participants to an experimental group or a control group. The only expected difference between the experimental group and the control group is the outcome variable being studied [17]. One of the greatest advantages of RCTs is that randomization makes the comparison unbiased. The intention-to-screen analysis used by RCTs is analogous to the intention-to-treat methodology in drug trials and is crucial in determining efficacy. However, critics argue that conclusions from these trials cannot be translated to the general population. Moreover, RCTs are more expensive and time-consuming [18].

Observational studies, on the other hand, draw inferences about the possible outcome where the investigator does not control the assignment of participants. Advocates of observational screening studies emphasize that conclusions from these studies provide information closer to reality. An added advantage of population-based observational studies is that they are often cost-effective, offer easier access to collected data, and can be designed as prospective or retrospective (cohort, case-control, cross-sectional). However, observational studies are subject to biases and lack the comparability between groups that can only be achieved through randomization [19]. For example, in many of the published observational studies, the participants are women who “self-select” into the screened or unscreened population and into the annual or biennial groups. Studies with these models are inherently biased because the women differ in their decision-making. Thus, the results of all of these studies, regardless of design, have limitations that need to be taken into account when considering directives for optimal guidelines.

With the limitations mentioned above acknowledged, a conclusion is considered stronger when studies of different designs produce similar results. There is now an established consensus from both observational studies and RCTs that mammography reduces mortality. The next question concerns the magnitude of benefit for the respective age groups. Given the variation in screening studies, questions have arisen as to whether RCT data have underestimated or overestimated the real benefit of mammography. The key question we are trying to resolve can be summarized as follows:

“What is the effectiveness of screening mammography in reducing breast cancer-specific mortality, and how does it differ by age and screening interval?”

Controversy 1: Debate over Magnitude of Benefit

Magnitude is defined as the relationship of cancer mortality to the penetration of screening. If screening mammography is having an impact, then cancer mortality should decrease as screening penetration increases, and the size of that change is the magnitude of benefit. Listed below are a few examples of major trials and studies and their results for mortality reduction from screening mammography:

  • The Swedish Two-County Trial, initiated in 1977, was the first screening trial to demonstrate a breast cancer mortality reduction by mammography. The first results were published in 1985, showing a significant 30 % breast cancer mortality reduction in women invited to screening [19].

  • The Gothenburg Mammographic Screening Trial, conducted from 1982 to 1996, reported results on breast cancer mortality. The trial concluded that the relative reduction in breast cancer mortality attributable to screening mammography was 23 % [20•].

  • The UK Age Trial was established in 1991 and is unique in that it was the only trial specifically designed to study the effectiveness of initiating screening at the age of 40. After a 10-year follow-up, the trial found no significant mortality benefit [21•].

  • Broeders et al. published a time-trend study to assess the effect of population-based mammographic screening on breast cancer mortality in Europe [22]. A time-trend study compares changes in breast cancer mortality among populations in relation to the introduction of screening mammography [9••]. The estimate of breast cancer mortality reduction by Broeders et al. was 25–31 % for women invited for screening and 38–48 % for women actually screened [22].

  • Coldman et al. combined observational studies from seven Canadian breast screening programs, representing 85 % of the Canadian population, and concluded an average breast cancer mortality reduction of 40 % by screening mammography [23•].

  • Paap et al. investigated six case-control studies showing differences in magnitude of breast cancer mortality reduction, which ranged from 38 to 70 % [24].

  • Nickson et al. conducted a case-control study and a meta-analysis of published case-control studies and estimated a 49 % reduction in breast cancer mortality [25].
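
The gap between the estimates for women invited to screening and women actually screened, as in the Broeders et al. figures above, is what simple dilution would produce when only part of the invited group attends. The sketch below is a minimal illustration and not the method used in any cited study: it assumes that non-attenders have the same breast cancer mortality as an unscreened population (no self-selection), and the attendance fraction and function names are hypothetical.

```python
def reduction_in_invited(screened_effect: float, attendance: float) -> float:
    """Dilute the mortality reduction among attenders across everyone invited.

    Assumes non-attenders carry the same risk as an unscreened population.
    """
    if not 0.0 <= attendance <= 1.0:
        raise ValueError("attendance must be a fraction between 0 and 1")
    return screened_effect * attendance


def reduction_in_screened(invited_effect: float, attendance: float) -> float:
    """Invert the dilution to back out the effect among women actually screened."""
    if not 0.0 < attendance <= 1.0:
        raise ValueError("attendance must be a fraction in (0, 1]")
    return invited_effect / attendance


attendance = 0.65  # hypothetical attendance fraction, not taken from the text
for screened_effect in (0.38, 0.48):  # range quoted for women actually screened
    invited_effect = reduction_in_invited(screened_effect, attendance)
    print(f"screened {screened_effect:.0%} -> invited {invited_effect:.1%}")
```

With this assumed attendance, the diluted values come out at roughly 25 % and 31 %, close to the range quoted for invited women. Real attendance varies between programs, and self-selection would break the proportionality assumed here.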

One of the major sources of disagreement is the differing interpretation of the scientific literature by scholars, institutions, and professional societies. The USPSTF noted that there were no trials that met the criteria for good quality [9••]. To address the gaps in study design, meta-analyses of RCTs and case-control studies are commonly used to increase statistical power. In addition, to circumvent the short case accrual results, the “longest follow-up available” from each trial is used. Adjusting RCT relative risk to the long accrual method diminishes the estimate. Moreover, the follow-up period has been inadequate and sometimes contemporaneous with screening. Another variable to consider among trials is the variation in diagnostic outcomes, as some studies tend to have a greater number of advanced-stage breast cancers than others [9••].

The majority of the RCTs are outdated, and since they were conducted, tremendous advances have improved the technology of mammography and breast cancer treatment [9••]. The absolute benefit has been shown to increase with longer follow-up times. Updated results from large screening trials have attempted to address these issues. In 2011, the Swedish Two-County Trial was updated with a 29-year follow-up, which showed a highly significant reduction in breast cancer mortality with a relative risk (RR) = 0.69, 95 % confidence interval, confirming the original findings and consistent with the most recent meta-analysis [19, 25]. In 2016, updated results of the Gothenburg trial with a longer follow-up were reported and showed a significant 30 % reduction, especially in women younger than 50 years [20•]. In 2015, updated results of the UK Age Trial with a 17-year follow-up demonstrated a statistically significant reduction in breast cancer mortality in the first 10 years when comparing the group that started screening at age 40 with the control group (RR 0.75), but not thereafter (RR 1.02). After 17 years of follow-up, the study found an RR of 0.88 for breast cancer mortality from tumors diagnosed from screening [21•]. The meta-analysis of RCTs used by the USPSTF for women aged 39 to 49 therefore underestimates the benefit, as does its estimate of absolute benefit for this group over 10 years [9••].
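
Because the trials above report relative risks while the guidelines are usually discussed in terms of percentage mortality reduction, it can help to convert one into the other. The following is a minimal sketch of that arithmetic (reduction = 1 − RR); the function name and dictionary labels are illustrative and are not taken from the cited studies.

```python
def mortality_reduction_pct(relative_risk: float) -> float:
    """Relative mortality reduction, in percent, implied by a relative risk."""
    if relative_risk < 0:
        raise ValueError("relative risk cannot be negative")
    return (1.0 - relative_risk) * 100.0


# Relative risks quoted in the paragraph above
reported_rr = {
    "Swedish Two-County, 29-year follow-up": 0.69,
    "UK Age Trial, first 10 years": 0.75,
    "UK Age Trial, after 10 years": 1.02,
    "UK Age Trial, 17-year follow-up": 0.88,
}

for label, rr in reported_rr.items():
    # A negative value means mortality was higher in the screened group.
    print(f"{label}: RR {rr:.2f} -> {mortality_reduction_pct(rr):.0f} % reduction")
```

This conversion says nothing about statistical significance or absolute benefit, which depend on the confidence interval and on the baseline mortality of the age group in question.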

One of the essential rules in science is to assess the quality of a study before accepting its results. Evidence-based medicine demonstrates a clear benefit of mammography screening in all age groups. The differences between the national guidelines stem from the measured endpoints of the screening trials on which they rely. As a result, we see that observational studies demonstrate a higher benefit in mortality reduction than that shown by the RCTs.

Controversy 2: Appropriate Age to Start Screening Mammography and the Designated Intervals

Age to Start

Mammography starting at age 40 has been shown to save the most lives and life-years. However, the USPSTF recommends starting at age 50 and stopping at age 74. Furthermore, the guideline suggests that a patient’s desire to begin screening earlier than age 50 or to continue beyond age 74 should be discussed with her physician [9••].

The recent 2015 adjusted ACS recommendations are slightly different and are as follows:

Women with an average risk of breast cancer should undergo regular screening mammography starting at age 45 years (strong recommendation); women aged 45 to 54 years should be screened annually (qualified recommendation); women 55 years and older should transition to biennial screening or have the opportunity to continue screening annually (qualified recommendation); women should have the opportunity to begin annual screening between the ages of 40 and 44 years (qualified recommendation) and women should continue screening mammography as long as their overall health is good and they have a life expectancy of 10 years or longer (qualified recommendation) [26•].

A strong recommendation was defined as one for which the benefits of adhering to the intervention outweigh the negative effects that may result from screening. Qualified recommendations were defined as those with an obvious benefit but less certainty regarding the balance of benefits and harms and regarding patients’ individual decision-making. The ACS stated that the shift in its recommendations was mainly due to increasing evidence from long-term follow-up of several RCTs and from observational studies of population-based screening programs [26•].

Nevertheless, the only RCT designed to study the age group of 40–49 years, the Age Trial, randomized women at ages 39 to 41 and yielded a mortality reduction of 17 % [21•]. A cohort study in Sweden by Johansson et al. established a 26–30 % mortality reduction in this age group [27]. However, an RCT has yet to evaluate breast cancer mortality or all-cause mortality outcomes on the basis of risk factors in addition to age. Future head-to-head trials of different screening intervals are needed to determine the specific effects of screening intervals.

Screening Interval (Annual Versus Biennial)

An important point to consider is that no trials have been specifically designed to evaluate screening intervals or to compare annual screening with biennial screening. Screening intervals are influenced by estimates of tumor growth, tumor biology, and interval cancer rates. Annual screening has been shown to save more lives, with a 39 % reduction in the mortality rate [19]. Miglioretti et al. conducted a prospective cohort study over a 16-year period to compare prognostic characteristics in women screened annually versus biennially. The study concluded that premenopausal women diagnosed with breast cancer following biennial screening are more likely to have tumors with less favorable prognostic features than women who underwent annual screening [28•]. Despite the evidence supporting an annual screening interval, the current guidelines recommending biennial screening appear to have changed the observed screening rates in the USA since their publication. For example, after the USPSTF published its recommendation of biennial mammography screening starting at age 50, there was a slight decrease in screening among women between ages 40 and 49 [29, 30•].

But are there harms to annual screening? It is suggested that annual screening increases the number of false positives, which are associated with unnecessary imaging, unnecessary biopsies, inconvenience, and anxiety. Radiation exposure and overdiagnosis must also be considered, although these harms are not yet proven.

Controversy 3: Risk Versus Benefit

Overdiagnosis

Overdiagnosis is the subject of ongoing controversy in breast cancer screening. Although there is no established definition of overdiagnosis, it is generally accepted to mean the diagnosis of a condition in an asymptomatic patient that would not have affected the individual’s life if left untreated, so that its detection produces no net benefit for that patient. It is vital that the term “overdiagnosis” be differentiated from, and not confused with, “misdiagnosis.” The concept of overdiagnosis as a negative result of mammographic screening is now being widely discussed. Although mammographic screening has increased the detection of early noninvasive breast cancer and early invasive cancers, the rates of advanced cancer have not changed significantly in the last 30 years. Breast cancer data from the Surveillance, Epidemiology and End Results (SEER) program of the National Cancer Institute demonstrated an increase of 122 early breast cancers per 100,000 women between 1976 and 2008. However, advanced-stage breast cancers decreased by 8 % during that period [31]. This evidence supports the theory that some cancers detected by screening mammography would not necessarily progress to an invasive form.

According to a recent report from the International Agency for Research on Cancer (IARC), the rate of overdiagnosis estimated by the Euroscreen Working Group is 6.5 %, and the benefits of screening outweigh the risks in women 50–69 years of age. Many other published studies have recorded a wide range of overdiagnosis rates [32•]. Beyond the varying rates, there is a lack of consensus among studies on the definition and metrics of overdiagnosis. Until the methodology is standardized and common metrics are adopted, the estimates are conjectural at best.

A potential way to obtain a more accurate rate of overdiagnosis would be to conduct future studies such as the Low-Risk DCIS trial (LORIS). LORIS is a phase III trial randomizing women with low- and intermediate-grade screen-detected DCIS to surgery or to active monitoring. The randomization in this study has the potential to help identify early breast cancers for which surgery can be spared and thereby minimize the rate of overdiagnosis [33•]. As of today, we are still unable to predict which patients diagnosed with DCIS can be followed and which will progress to an invasive cancer requiring surgical treatment. Until there is a dependable, scientific method for predicting the progression of DCIS, or lack thereof, we are obliged to treat all women diagnosed with DCIS to reduce overall mortality.

False Positive

The highest rates of false positives are noted among women aged 40 to 49 years undergoing annual screening who had heterogeneously dense (68.9 %) or extremely dense (65.5 %) breasts [34•]. While false-positive studies may lead to anxiety and inconvenience, these are subjective measures and difficult to quantify in the general population. However, the false-positive rate should be taken into account as a benchmark for improving breast imagers’ performance standards. Multiple studies in the past have shown that higher false-positive rates are found among radiologists with lower screening mammography volumes [35–37].
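
False-positive percentages of this size are easier to interpret as cumulative probabilities over repeated screening rounds. The text does not state the time frame of the figures above, so the sketch below is only an illustration under assumptions: a hypothetical per-screen false-positive probability, ten annual versus five biennial rounds over a decade, and independence between rounds.

```python
def cumulative_false_positive(per_screen_probability: float, rounds: int) -> float:
    """Probability of at least one false positive across independent rounds."""
    if not 0.0 <= per_screen_probability <= 1.0:
        raise ValueError("per_screen_probability must be between 0 and 1")
    if rounds < 0:
        raise ValueError("rounds cannot be negative")
    return 1.0 - (1.0 - per_screen_probability) ** rounds


per_screen = 0.10  # hypothetical per-screen rate, not taken from the text
annual = cumulative_false_positive(per_screen, rounds=10)   # ten annual screens
biennial = cumulative_false_positive(per_screen, rounds=5)  # five biennial screens
print(f"annual over 10 years:   {annual:.1%}")
print(f"biennial over 10 years: {biennial:.1%}")
```

In practice, false-positive results are not independent from round to round (for example, breast density persists), so this simple formula should be read as a rough guide to why cumulative rates grow with the number of screens rather than as a prediction.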

Radiation Exposure

Two-view screening mammography has a radiation dose of 3.7 mGy, which is equivalent to background radiation. It is suggested that the rate of fatal radiation-induced breast cancer ranges from 2 per 100,000 women aged 50 to 59 screened biennially to 11 per 100,000 women aged 40 to 59 screened annually [9••].
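
To put the per-100,000 figures above on a more familiar scale, they can be restated as "one in N women screened." The helper below is a hypothetical convenience function for that arithmetic only; it makes no claim about how the cited estimates were modeled.

```python
def per_100k_to_one_in_n(cases_per_100k: float) -> float:
    """Convert a rate per 100,000 into the N of a 'one in N' statement."""
    if cases_per_100k <= 0:
        raise ValueError("rate must be positive")
    return 100_000 / cases_per_100k


# Rates of fatal radiation-induced breast cancer quoted in the text
quoted_rates = {
    "biennial screening, ages 50 to 59": 2,
    "annual screening, ages 40 to 59": 11,
}

for label, rate in quoted_rates.items():
    print(f"{label}: about 1 in {per_100k_to_one_in_n(rate):,.0f}")
```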

In summary, both false positives and overdiagnosis can be regarded as inconveniences whose weight is best judged by the women who must decide the trajectory of their clinical care. At present, the scientific data in this category are too heterogeneous and unreliable. At best, we can arm our patients with the current clinical data and help them to make an informed decision.

Conclusion

The goal of screening mammography is to detect asymptomatic, non-palpable breast cancer. Like all screening tests, mammography is imperfect. To make matters more challenging, the guidelines are complicated, which leads to weak adherence and reduced patient compliance. Referring clinicians are under pressure to discuss the benefits, risks, limitations, and harms associated with screening during a short patient visit. For the past 30 years, organizations have differed in their recommendations, often influenced by stakeholders and policymakers. While more frequent screening has the potential to save more lives, the challenge is in justifying the expenditure of health-care dollars that could potentially be used for alternative health-care issues serving a larger population. The USPSTF, taking a utilitarian approach, has relied on cost-effectiveness to justify its guidelines. However, our society may reject the cost-effectiveness approach. Policymakers therefore need to face the current evidence on what has been proven to save lives and allocate health-care dollars accordingly. Furthermore, in addition to costs, policymakers and insurance providers will need to consider harms and benefits and quality of life, and to investigate the data on the impact of biennial versus annual screening in women over and under the age of 50. The direct application of clinical trials to breast cancer screening policy and clinical practice remains a challenge. The gaps in the scientific literature can be overcome by combining empirical evidence with modeling. In future modeling, screening programs and their impact on mortality can be evaluated as a function of comorbidity, cognitive and physical functioning, and life expectancy, in addition to the cost-effectiveness of different screening methods.

Screening recommendations vary not only by geography but by institution as well. Physicians, especially radiologists, may be confronted with the controversies surrounding the screening recommendations. Regardless of these obstacles, all women should be made aware of breast changes and encouraged to report them promptly. Additionally, clinicians have a duty to inform their patients of current facts and guidelines regarding breast health. Well-woman examinations are an opportunity for physicians to discuss the most recent data with patients. Although some women in their early 40s may review the benefits and harms and decide that mammography is right for them, many others will decide to wait until they are age 45 or older.

In our era of patient-centric medicine, there is a shift toward a shared decision-making process. The National Cancer Institute has launched a new precision-based cancer screening initiative [38•]. Decision aids will become an essential tool to help summarize the complex evidence that is currently available. The knowledge gained can empower patients to make an informed decision with their physicians.

For clinicians and radiologists, evidence presented to us in the field of breast cancer imaging should be systematically analyzed without omission or misinterpretation. Doing so will allow us to draw conclusions that are evidence-based and accurate. What we can learn from the mammography war is that continuous evaluation and reassessment of new data are needed to keep us up to date on ongoing screening mammography programs and their effect on breast cancer mortality.

In the coming years of precision medicine, providing patients with information about the benefits and harms of screening and about the varied screening guidelines can help them make an informed decision about the timing and frequency of their breast screening.