Introduction

Breast cancer remains the most common type of female malignancy diagnosed in the Western world and the second most common cause of cancer death in women [1, 2].

Known risk factors for breast cancer include aging, early menarche, late age at first live birth, late menopause, proliferative breast disease, and family history. Environmental factors, including diet and lifestyle, have also been implicated, but their association with breast cancer risk is less well established. Because family history is one of the strongest predictors of a woman’s chance of developing breast cancer, researchers turned to cancer-prone families to find the specific inherited genetic alterations responsible. After decades of research, two genes were found to be altered in many families with hereditary breast cancer: BRCA1 (BReast CAncer gene) and BRCA2, discovered in 1994 and 1995, respectively [3,4,5,6]. The search for other genes continued, and several more, such as p53, STK11, CDH1, PALB2, PTEN, and the mismatch repair genes, have since been found to be associated with breast cancer risk [7]. Mutations in these genes can confer a markedly elevated lifetime risk of breast cancer, as high as 80%. However, such mutations are rare and account for only up to 5% of breast cancers [8,9,10].

Research then turned to single nucleotide polymorphisms (SNPs) to identify additional genetic risk. SNPs are the most common type of genetic variation in a population. About 10 million SNPs are found in the human genome, and they make up more than 1% of the human genomic profile. An individual is estimated to carry between 2.8 and 3.9 million single-base-pair variants. Considered individually, each SNP confers only a very small increase in risk. However, the risks from many SNPs can be combined into what has been called a polygenic risk score (PRS) to provide a more meaningful estimate of risk. The PRS is calculated as the sum, across SNPs, of the number of risk alleles carried multiplied by a weight reflecting each SNP’s estimated effect on breast cancer risk, typically the logarithm of its odds ratio [11]. The resulting score is then used to estimate an individual’s predisposition to a specific disease. This approach can be used to build a powerful risk prediction model.
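The calculation described above can be illustrated with a minimal sketch. It assumes the commonly used log-additive formulation, in which each SNP’s risk-allele count (0, 1, or 2) is weighted by the natural logarithm of its per-allele odds ratio. The function name, the SNP labels, the odds ratios, and the genotype are all hypothetical placeholders chosen for illustration; they are not published estimates.

```python
import math

def polygenic_risk_score(allele_counts, odds_ratios):
    """Sum of risk-allele counts weighted by log per-allele odds ratios.

    Assumes a log-additive model; both arguments are dicts keyed by SNP label.
    """
    score = 0.0
    for snp, count in allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: allele count must be 0, 1, or 2")
        score += count * math.log(odds_ratios[snp])
    return score

# Hypothetical per-allele odds ratios (illustrative values only).
odds_ratios = {"snp_a": 1.25, "snp_b": 1.10, "snp_c": 0.95}
# Hypothetical genotype: number of risk alleles carried at each SNP.
genotype = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

prs = polygenic_risk_score(genotype, odds_ratios)
# Under this model, exp(PRS) is the combined odds ratio relative to
# a person carrying no risk alleles at these SNPs.
combined_or = math.exp(prs)
print(f"PRS = {prs:.4f}, combined OR = {combined_or:.4f}")
```

In practice a raw score like this is usually interpreted relative to the score distribution in a reference population rather than on its own.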

There has been renewed interest in using the PRS to predict phenotypic differences and predisposition to common diseases. Major milestones that have contributed to its application include the advent of genome-wide association studies (GWAS) and the availability of large cohort studies with long follow-up. Large GWAS and public datasets that are easily accessible to scientists worldwide, such as the UK Biobank, have made the PRS an attractive clinical tool [12]. The hope is that genomic profiling can ultimately stratify an individual’s risk for any disease and improve methods of screening as well as prevention.

Early applications were in plant, animal, and behavioral genetics. In plants, GWAS and SNP studies identified seven major regions responsible for iron deficiency chlorosis in soybean [13]. Interestingly, PRSs are also widely used in livestock breeding, where the approach is called genomic prediction [14]. Early studies in humans used the PRS to predict educational attainment by the age of 16 and predisposition to psychological and psychiatric diseases such as schizophrenia [14,15,17]. Other applications have been in coronary artery disease [18], type 2 diabetes, and, more recently, other diseases including malignancies.

During the past decade, the focus has shifted to cancer prevention and screening. Prostate cancer studies showed that the PRS may be able to distinguish men at high risk of malignancy from those at low risk, leading to better-individualized care [19]. In a large Swedish population-based study, adding a PRS to prediction models for prostate cancer helped decrease the number of unnecessary prostate biopsies from 12 to 5% without missing cases of aggressive cancer [20].

The PRS is gaining considerable momentum in the breast cancer field. It is shedding light on a large proportion of the familial risk in women with breast cancer who have no pathogenic mutation in high- or moderate-risk breast cancer genes [20,21,23]. This review will discuss the variety of PRSs published to date, review their clinical applications, and highlight the limitations currently associated with their use.

Discovery of SNPs and Association with Breast Cancer Risk

In 2005, two large cooperative groups, the Cancer Genetic Markers of Susceptibility (CGEMS) Breast Cancer Consortium and the Breast Cancer Association Consortium (BCAC), started using GWAS to identify SNPs that increase breast cancer risk [24, 25•]. Soon after, Easton and colleagues identified the first five breast cancer risk loci (FGFR2 rs2981582, 8q24 rs13281615, LSP1 rs3817198, TNRC9 rs3803662, and MAP3K1 rs889312) in a three-stage GWAS of Caucasian women that involved several thousand cases and controls [25•]. To this day, SNPs in FGFR2 remain among the loci most strongly associated with breast cancer.

By 2015, 79 breast cancer susceptibility loci had been published, and 71 of those were confirmed in a 2015 meta-analysis that included the data from BCAC and 11 additional GWAS [26]. That meta-analysis included 62,533 breast cancer cases and 60,976 controls and identified 15 new breast cancer susceptibility loci, which brought the total number of identified loci to 94.

More recently, the largest breast cancer GWAS to date used the Illumina OncoArray BeadChip, which included approximately 570,000 SNPs, to study over 100,000 cases. A total of 61,282 breast cancer cases and 45,494 controls of European ancestry were genotyped on the OncoArray platform, and the results were used in a meta-analysis that led to the discovery of additional SNPs. To date, the identified breast cancer susceptibility loci, or SNPs, account for approximately 18% of the familial risk for breast cancer [27, 28].

More homogeneous populations with specific subtypes of cancer are currently being investigated to generate more informative SNPs. Many variants confer risks that differ by breast cancer subtype, suggesting that a subtype-specific PRS might predict subtype-specific disease [28,29,30,31,32,34]. In January 2019, Mavaddat and colleagues published data from 79 studies conducted by the BCAC. They reported the development and validation of a PRS for breast cancer optimized for prediction of subtype-specific disease. The score was based on the largest available GWAS dataset and used 313 breast cancer-specific SNPs [35]. Prediction was significantly better for estrogen-receptor-positive (ER+) tumors than for estrogen-receptor-negative tumors. This was no surprise, since ER+ tumors are much more common and are represented more frequently in the GWAS datasets, which makes the GWAS less powerful in predicting estrogen-receptor-negative subtypes. As a result, the Triple Negative Breast Cancer Consortium (TNBCC), among other groups, is focusing specifically on identifying SNPs directly linked to triple negative disease [30,31,32,34]. Larger datasets for GWAS analysis are still needed, however, to predict less common disease subtypes [36].

PRS, Breast Cancer Risk Models, and Clinical Application

Identifying specific SNPs and generating a PRS remains only one part of breast cancer risk assessment. Adequate risk stratification requires combining the genetic risk with other well-known risk factors, including age, family history, menstrual history, BMI, prior breast biopsies, and use of HRT.

At present, the PRS is being investigated as an added component in several risk prediction models such as Tyrer-Cuzick, Gail, and BRCAPRO. Both multiplicative and non-multiplicative risk models have been proposed as ways to incorporate all of those risk factors [37]. Recently, Dr. Cuzick and colleagues developed a combined approach to improve breast cancer risk stratification and enable earlier, more targeted preventive strategies. They proposed that risk prediction could be improved by combining the PRS with breast density and the Tyrer-Cuzick model [38]. Studies of the interaction of the PRS with lifestyle and hormonal risk factors are ongoing. The goal is to arrive at a formula that provides an accurate estimate of lifetime risk (SNP × lifestyle/hormonal factors) [39].
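One way to picture the multiplicative approach mentioned above is the following sketch. It assumes that the PRS acts as a relative risk that simply scales the absolute risk produced by a classical model, and that the PRS is independent of the classical risk factors. Both are simplifications, and the function name, inputs, and numbers are hypothetical. Published models handle calibration and competing risks in ways not shown here.

```python
def combined_risk(model_risk, prs_relative_risk):
    """Scale a classical-model absolute risk by a PRS relative risk.

    Assumes the PRS is independent of the classical risk factors
    (a simplifying assumption) and caps the result at 1.0.
    """
    if not 0.0 <= model_risk <= 1.0:
        raise ValueError("model_risk must be a probability between 0 and 1")
    if prs_relative_risk <= 0:
        raise ValueError("prs_relative_risk must be positive")
    return min(1.0, model_risk * prs_relative_risk)

# Hypothetical example: 12% lifetime risk from a classical model and a
# PRS-associated relative risk of 1.5 versus the population average.
risk = combined_risk(0.12, 1.5)
print(f"Combined lifetime risk estimate: {risk:.1%}")
```

A non-multiplicative model would replace the simple product with a different combining function, which is one reason the choice of model needs to be validated rather than assumed.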

Knowing her individualized risk profile can help a patient and her health care provider make better-informed medical decisions. This personalized model can ultimately affect three important areas: (1) screening recommendations, (2) prevention recommendations, and (3) adherence to both screening and preventive recommendations, which is ultimately the most important goal.

Concerning screening recommendations in healthy women, a prediction model for breast cancer could help determine the frequency of mammograms, shortening the screening interval for women at high risk and lengthening it for women at low risk. A woman with a low risk score might need screening only every other year, whereas a woman with a high score would benefit from more intensive screening every 6 months.

Furthermore, it can also guide the choice of screening modality. In the higher-risk population, defined as women who carry more than a 20% lifetime risk and yet have no identified mutation, a PRS may help differentiate women who need the addition of a screening MRI from those who do not.

The PRS may also impact clinical care by informing national screening guidelines, particularly for women aged 40–49, for whom professional societies still disagree on recommendations. The Women Informed to Screen Depending On Measure of risk (WISDOM) population-based study is currently investigating breast cancer screening modalities in the genomic era [40]. It is evaluating recommendation thresholds by combining the traditional models with the PRS. It is also looking at subtype-specific risk, which may result in more frequent screening for women at higher risk of more aggressive breast cancers (e.g., ER-negative disease) [40]. Similar efforts are underway worldwide. Ongoing international studies investigating the use of the PRS to inform targeted breast cancer screening programs include CORDIS, a European randomized study comparing personalized, risk-stratified screening with standard breast cancer screening in women aged 40–70 (https://cordis.europa.eu/project/rcn/212694/factsheet/en).

Another area where a PRS could be useful is in predicting the subtype of breast cancer a woman is likely to develop, in order to guide chemoprevention. If a PRS can predict the risk of developing estrogen-receptor-positive breast cancer, it will provide additional support for a recommendation of preventive therapy. Providers may be more inclined to give tamoxifen or other endocrine therapies to a woman whose PRS predicts endocrine-sensitive disease rather than triple negative or HER2-positive disease.

Finally, recent data show that the more patients are informed of their risk, the more inclined they are to comply with the recommendations made by their providers. The GENRE study, presented at the 2019 American Society of Clinical Oncology (ASCO) meeting by Julian Oliver Kim and colleagues, revealed that when provided with PRS-based risk counseling, 41.9% of women with a high PRS were more inclined to take endocrine therapy and 46.7% of women with a low PRS were less inclined to take it. Another study is currently recruiting in the USA to assess how the PRS affects the breast cancer risk management recommendations that health care providers make to their patients.

Polygenic Risk Score and Cost-effectiveness

Genomic sequencing remains relatively expensive, however, and applying the PRS on a population-wide scale might seem financially draining. In the long term, though, if it ultimately leads to better individualization of care, it might spare people unnecessary testing and invasive procedures. In a recent publication in JNCCN, Dinan et al. showed that genomic testing (Oncotype DX) is associated with lower health care costs, particularly in clinically high-risk patients [41]. A British study in JAMA recently tackled the following question: “Can risk-stratified screening for breast cancer improve the cost-effectiveness and benefit-to-harm ratio of screening programs?” [42]. That study showed that a risk-stratified screening strategy could improve the cost-effectiveness of a breast cancer screening program. It concluded that offering screening to lower-risk women is not cost-effective, whereas targeting the higher-risk population would markedly improve the benefit-to-harm ratio.

Studies of other individualized screening strategies have reported similar findings.

Polygenic Risk Score Limitations

Unfortunately, the PRS has several limitations. Only a handful of clinical trials have validated its predictive power in clinical settings [43••].

Recently, Vachon and colleagues reported on the effect of 75 SNPs on breast cancer risk in women taking SERMs for primary prevention in the NSABP P-1 and P-2 studies. They showed that the intrinsic breast cancer risk predicted by the PRS is maintained regardless of chemopreventive therapy. Compared with women with an average PRS, the risk of breast cancer ranged from OR = 0.59 for those with the lowest PRS to OR = 1.98 for those with the highest PRS [44••].

A more recent study that predicts lifetime breast cancer incidence by risk score is that of Mavaddat and colleagues. They demonstrated that, compared with women in the middle quintile, those in the highest 1% of risk had 4.37- and 2.78-fold risks, and those in the lowest 1% of risk had 0.16- and 0.27-fold risks, of developing ER-positive and ER-negative disease, respectively. Nevertheless, the prediction for ER+ disease remains significantly better [35]. No study, however, has used the PRS to predict who would benefit from chemoprevention.

Furthermore, the PRS has not yet been combined with the pathological risk associated with high-risk lesions (atypical ductal hyperplasia, for example) or with the risk associated with high-risk imaging findings such as dense breasts. However, it is getting closer to being incorporated into prediction models such as BOADICEA [45••].

The main problem with the PRS, however, is that it is currently tailored to the Caucasian population and does not fairly represent different ethnicities. A PRS generated from GWAS that mostly represent cohorts of European descent can ultimately improve health outcomes in European patients but will most likely under- or overestimate the risk in individuals of other descent. This can be further complicated by the fact that in some cultures, having a genetic predisposition remains taboo and is not a topic to be discussed publicly. We have come to realize that precision medicine can lead to more health care disparities. I believe, however, that this should not stop the scientific community from moving forward. It should push us to join forces, broaden our horizons, and capture a wider representation of our real patient population.

Nevertheless, in the field of genetics, PRS information has recently made its way into official genetic test reports without being formally validated. Multiple companies now include it when reporting the results of a genetic panel test. Validation in more studies is still needed before this tool is adopted for widespread use.

The Future

Technology is moving so quickly that we as medical professionals can barely keep up. We have reached an era in which genomic sequencing and gene profiling have become easily accessible and relatively affordable. With the advent of artificial intelligence, it is only a matter of time (and I do not mean decades, I mean a few years) before every single person and every single disease is sequenced and genomically profiled.

However, pressing questions remain: (1) How much can technology alter clinically relevant outcomes? (2) Can the PRS help distinguish clinically aggressive from indolent disease? (3) Can the PRS help predict whether disease onset will occur at a younger or an older age? (4) Can the PRS be combined with imaging, pathology, and environmental factors? (5) Can the PRS ultimately generate a prediction so specific that no biopsy will ever be needed for benign disease or, more specifically, for any lesion that will not become life-threatening?

It remains to be determined, however, whether we can select the SNPs that really matter from our complex DNA makeup. We can also hope that genetic testing companies and researchers will join forces to advance the field in a meaningful and timely manner, because the goal remains the prevention of a potentially deadly disease.

Conclusion

Genetic testing has come a long way; it has become cheaper and more available, and we are all under pressure to use it. I think we can make the case that including a polygenic risk score in risk assessment is essential for personalized medicine. However, validation is critical, as is a better understanding of how to combine a PRS with clinical risk factors.

Unfortunately, we are still far from that goal. The predictive value expected of the PRS has not yet been confirmed, and larger studies with more diverse populations are needed to produce a reliable tool and ultimately lead to better mechanisms of prevention.