Introduction

Breast cancer is a leading cause of cancer death among women worldwide [1] and has been reported as the second leading cause of cancer-related death in Iran [2]. Moreover, a high incidence and a low age at diagnosis have been reported among Iranian women over the past decade [3,4,5], and most patients are still of working age [6].

Heterogeneous risk factors contribute to increased breast cancer (BC) incidence and mortality, including dietary changes, lack of physical activity, late pregnancy, having relatively few children, and the common use of hormone-replacement therapy (HRT) [7, 8]. In addition, the disease may show all the hallmarks of a multistep genetic disease. The recognized genes involved in hereditary breast cancer account for only 16–20% of the familial type. Moreover, 85 to 90% of all cases are non-hereditary; this type is the most common but the least understood in terms of genetic predisposition [9, 10]. Nevertheless, many genetic markers for BC susceptibility have been suggested by genome-wide association studies (GWAS) [11].

The single-nucleotide polymorphism (SNP) rs13387042 at the 2q35 locus has been identified as a hotspot for BC susceptibility in GWAS as well as in replication studies in different populations [12,13,14,15]. It is located near the TNP1 (transition protein 1), IGFBP5 (insulin-like growth factor binding protein 5), IGFBP2 (insulin-like growth factor binding protein 2), and TNS1 (Tensin 1/matrix remodelling-associated protein 6) genes [13, 16]. IGFBP5 is an essential factor in normal mammary epithelial development. Both IGFBP5 and 2q35 have been consistently associated with cancer, although little is known about the nature of their interaction [17]. Moreover, non-coding regions have been shown to play a critical role in the regulation of gene expression. Because rs13387042 lies in such a region, it has been the focus of several investigations [16]. The association of this SNP with breast cancer susceptibility has been explored in European, African, and Asian populations [18,19,20]. In the present study, we investigated the association of this genetic marker with breast cancer risk and survival in a cohort of north-eastern Iranian patients.

Materials and methods

Subjects

A total of 506 women, comprising 184 patients and 322 controls, were enrolled in the study. All controls were evaluated by routine clinical examination and had no sign of breast cancer or history of malignant breast disease. The breast cancer patients (N = 184) were confirmed cases with available pathological information, including HER2, PR, and ER status, stage, and tumour grade.

The study was approved by the local ethical committee of Mashhad University of Medical Sciences (Ethical approval number: IR.MUMS.REC.1394.186). Signed informed consent was obtained from each study participant.

Genotyping method

DNA was extracted from whole blood samples using the saturated salting-out technique. Allele-specific polymerase chain reaction (AS-PCR) was carried out to genotype patients and controls. The final PCR reaction volume was 12 μl, composed of 2 μl of DNA template (150 ng), 3 μl of distilled water, 1 μl of each of the two primers (one common primer and one mutant or normal allele-specific primer, at a concentration of 10 µM), and 5 μl of Taq 2× master mix (AMPLIQON). The 5′ to 3′ sequences of the two allele-specific forward primers and the common reverse primer were as follows: allele A-specific forward, 5′-ACAGAAAGAAGGCAAATGGAA-3′ (band size: 220 bp); allele G-specific forward, 5′-ACAGAAAGAAGGCAAATGTAG-3′ (band size: 184 bp); and common reverse, 5′-GGAGAATCACTTGAACCTGGA-3′.
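The reaction set-up described above can be checked with a short calculation. The following Python sketch sums the listed component volumes and estimates final concentrations by simple dilution. It assumes that the stated 10 µM is the primer stock concentration and that "2×" denotes a two-fold concentrated master mix; neither point is stated explicitly in the protocol, and all names in the code are illustrative.

```python
# Per-reaction volumes in microlitres, as listed in the genotyping protocol.
REACTION_UL = {
    "DNA template": 2.0,
    "distilled water": 3.0,
    "allele-specific forward primer": 1.0,
    "common reverse primer": 1.0,
    "Taq 2x master mix": 5.0,
}

PRIMER_STOCK_UM = 10.0    # assumption: 10 uM is the primer stock concentration
TEMPLATE_NG = 150.0       # DNA template mass per reaction, from the text
MASTER_MIX_STRENGTH = 2.0  # assumption: "2x" means two-fold concentrated


def total_volume(components):
    """Sum of the component volumes in microlitres."""
    return sum(components.values())


def diluted(stock, added_ul, total_ul):
    """Concentration after dilution, C1 * V1 / V2."""
    return stock * added_ul / total_ul


total = total_volume(REACTION_UL)
primer_final_um = diluted(PRIMER_STOCK_UM, 1.0, total)
mix_final_strength = diluted(
    MASTER_MIX_STRENGTH, REACTION_UL["Taq 2x master mix"], total
)
template_ng_per_ul = TEMPLATE_NG / total

print(f"total reaction volume: {total:.1f} ul")
print(f"final primer concentration: {primer_final_um:.2f} uM each")
print(f"final master mix strength: {mix_final_strength:.2f}x")
print(f"template in reaction: {template_ng_per_ul:.1f} ng/ul")
```

Under these assumptions, the listed components add up to the stated 12 μl reaction volume.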

The PCR conditions were an initial denaturation of 10 min at 95 °C, followed by 35 cycles of 15 s denaturation at 95 °C, 15 s annealing (56 °C for rs13387042 G and 58 °C for rs13387042 A), and 15 s extension at 72 °C, and a final extension of 10 min at 72 °C. PCR was performed in a Veriti 96-well PCR Thermal Cycler (Applied Biosystems, Foster City, California, United States), and the products of all samples were then separated by electrophoresis on a 2% agarose gel. To confirm the genotyping results, 10% of the samples were randomly selected and re-genotyped.

Statistical analysis

The statistical evaluation was performed using the SPSS version 16 software package (SPSS Inc., Chicago, IL, USA). Descriptive statistics, including frequencies, means, and standard deviations (SD), were used to describe all variables. Binary logistic regression analysis was performed to assess the association between rs13387042 and the risk of breast cancer. Survival analysis was performed using the Kaplan–Meier and Cox regression methods, and the log-rank test was used to compare survival between groups. A p-value of less than 0.05 was considered significant.
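For readers unfamiliar with the Kaplan–Meier method named above, the following is a minimal Python sketch of the product-limit estimator. It is not the SPSS procedure used in the study: the function name is illustrative, and the follow-up times are hypothetical toy values, not patient data.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  -- follow-up time for each subject
    events -- 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # everyone leaving the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set after t.
        at_risk -= exits[t]
    return curve


# Hypothetical follow-up times in months (toy values, not study data).
toy_times = [6, 12, 12, 20, 24, 30, 36, 36]
toy_events = [1, 0, 1, 1, 0, 1, 0, 0]

km_curve = kaplan_meier(toy_times, toy_events)
for t, s in km_curve:
    print(f"t = {t:>2} months  S(t) = {s:.3f}")
```

Subjects censored at the same time as a death are counted as still at risk at that time, which is the usual convention. Comparing two such curves, for example by genotype, would then call for the log-rank test.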

Results

Characteristics of the study population

Demographic features of the patients and controls are summarised in Table 1, and tumour characteristics of the breast cancer cases are shown in Table 2. The mean age of the patients (46.03 ± 11.99 years) was higher than that of the controls (p = 0.016). Body mass index (BMI) was higher in cases (28.32 ± 4.80 kg/m²) than in controls (25.19 ± 4.17 kg/m²). The age at first pregnancy was 21.36 ± 4.85 and 22.60 ± 4.52 years in patients and healthy women, respectively (p = 0.009). Marital status (p = 0.006), education (p < 0.001), and physical activity (p < 0.001) differed significantly between breast cancer cases and controls. Menopausal status also differed significantly between the two groups (p < 0.001), with a higher frequency of menopausal women among patients (44.7%). The proportion of women with no history of breastfeeding was significantly higher in patients (8.9%) than in healthy individuals (2.1%) (p = 0.003).

Table 1 Demographic characteristics of the patients and controls
Table 2 Frequency of tumour characteristics of breast cancer cases

The histopathological evaluation showed that 51.3% of cases were stage II and 58.3% were grade II. Invasive ductal carcinoma accounted for 86.1% of the breast tumours, and in the largest proportion of patients (43.6%) the cancer had not spread to the lymph nodes. Tumour size was 2 cm or less in 30.1% of cases. Evaluation of oestrogen and progesterone receptors indicated that more than 79.2% of tumour tissues expressed these receptors, and 9.2% of patients had triple-negative breast cancer (TNBC).

Association of rs13387042 with breast cancer risk

Genotype and allele frequencies in breast cancer patients and healthy controls are shown in Table 3. In backward likelihood-ratio (LR) regression, BMI was the only baseline factor that differed significantly between groups. Therefore, all comparisons were adjusted for BMI.

Table 3 Distribution of allele and genotype frequencies of the 2q35-rs13387042 variant in controls compared with all patients and ER-positive patients

The genotype frequencies for AA, AG, and GG were 21.7%, 66.8%, and 11.4% in cases and 11.2%, 75.2%, and 13.7% in controls, respectively. Further analysis showed that the genotype frequencies in both cases and controls deviated from Hardy–Weinberg equilibrium. Risk was assessed under three genetic models: dominant, recessive, and multiplicative. A more than two-fold increase in breast cancer risk was observed for the AA genotype under the recessive model [p = 0.003, OR 2.53, 95% CI (1.31–3.88)]. However, no significant association was found between the two groups under the dominant model [p = 0.791, OR 1.08, 95% CI (0.60–1.97)]. As shown in Table 3, the A allele was more frequent in cases (55.2%) than in controls (48.8%) and was therefore considered the risk allele; however, the difference was not significant [p = 0.107, OR 1.26, 95% CI (0.95–1.66)].
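The calculations behind these figures can be illustrated with a short Python sketch. Genotype counts are not given in the text, so the code reconstructs approximate counts from the reported percentages and group sizes (184 cases, 322 controls); this reconstruction is an assumption. It then computes the A-allele frequency, a Pearson chi-square test for Hardy–Weinberg proportions (the specific test used in the study is not stated), and an unadjusted odds ratio with a Wald 95% confidence interval for AA versus AG + GG. Because the reported estimates were adjusted for BMI, the unadjusted odds ratio is not expected to match them exactly.

```python
import math


def counts_from_percent(n, percents):
    """Approximate genotype counts from a group size and reported percentages."""
    return [round(n * p / 100.0) for p in percents]


def allele_a_frequency(aa, ag, gg):
    """Frequency of the A allele among the 2N chromosomes."""
    return (2 * aa + ag) / (2 * (aa + ag + gg))


def hwe_chi_square(aa, ag, gg):
    """Pearson chi-square (1 df) and p-value for Hardy-Weinberg proportions."""
    n = aa + ag + gg
    p = allele_a_frequency(aa, ag, gg)
    q = 1.0 - p
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    chi2 = sum((o - e) ** 2 / e for o, e in zip([aa, ag, gg], expected))
    # Upper tail of the chi-square distribution with 1 df: erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted OR with Wald 95% CI for a 2x2 table.

    a, b -- cases with / without the genotype
    c, d -- controls with / without the genotype
    """
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, odds_ratio * math.exp(-z * se), odds_ratio * math.exp(z * se)


# Counts (AA, AG, GG) reconstructed from the reported percentages (assumption).
cases = counts_from_percent(184, [21.7, 66.8, 11.4])
controls = counts_from_percent(322, [11.2, 75.2, 13.7])

for label, group in (("cases", cases), ("controls", controls)):
    chi2, p_value = hwe_chi_square(*group)
    freq_a = allele_a_frequency(*group)
    print(f"{label}: counts={group} A-freq={freq_a:.3f} "
          f"HWE chi2={chi2:.1f} p={p_value:.2g}")

# Recessive model for the A allele: AA versus AG + GG.
or_rec, ci_low, ci_high = odds_ratio_wald(
    cases[0], cases[1] + cases[2], controls[0], controls[1] + controls[2]
)
print(f"unadjusted OR (AA vs AG+GG) = {or_rec:.2f} "
      f"(95% CI {ci_low:.2f} to {ci_high:.2f})")
```

The reconstructed counts sum to the reported group sizes and reproduce the reported A-allele frequencies to within rounding, which suggests that the rounding assumption is reasonable.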

The association between the rs13387042 variant and BC risk in ER-positive patients compared with healthy controls is shown in Table 3. The AA genotype was associated with an increased risk of ER-positive breast cancer before adjustment for BMI [p = 0.019, OR 2.53, 95% CI (1.17–5.50)]. A statistically significant association with an increased risk of ER-positive disease was also found under the dominant (AA vs. GG + GA) genetic model [p = 0.015, OR 2.12, 95% CI (1.16–3.88)].

Genotype frequencies were evaluated among the different subtypes of breast cancer. As reported in Table 4, no significant association was observed for any genotype among the breast cancer subtypes. The AA genotype frequency was higher than that of the AG and GG genotypes in both the early-stage (stage I and II) and ER-negative groups, while the AG genotype frequency was the highest in the ER/HER positive and PR-negative groups. Logistic regression did not indicate a significant difference among the different subtypes of breast cancer (p > 0.05).

Table 4 Distribution of allele and genotype frequencies of the 2q35-rs13387042 variant by pathological status

Association of rs13387042 with overall survival

rs13387042 showed no association with overall survival under the genotype (p = 0.529) or allelic (p = 0.480) model before adjustment (Fig. 1A and B, respectively). Survival analysis by pathological factors showed that overall survival differed significantly between the upper and lower stage groups and among the molecular category groups. Adjustment for these two factors did not change the results. Overall survival data are shown in Table 5.

Fig. 1 Kaplan–Meier plots of different genetic models of the 2q35-rs13387042 polymorphism. A Genotype model (GG and AG vs. AA), B multiplicative model (A vs. G)

Table 5 Multivariable overall survival analysis in association with 2q35-rs13387042 variant

Discussion

Recent GWAS have led to the identification of multiple novel genetic variants associated with BC risk, such as 2q35-rs13387042, which has been reported in various studies; however, the results have been inconsistent [13, 16, 21, 22]. With recent advances in genomics, elucidating the molecular basis of disease on a personalised level has become an attainable goal. Genetic polymorphisms in particular have a critical role in disease susceptibility, diagnosis, and therapeutic efficacy in various cancers [13, 23]. Furthermore, recent studies have reported associated SNPs located in non-coding regions, suggesting that the search for functional polymorphisms should extend beyond the gene regions [24, 25]. In the present study, the association between one of the important variants at the 2q35 locus, rs13387042 (located in an intergenic region), and breast cancer was investigated in the north-eastern Iranian population.

Our findings indicated a significant association between homozygosity for the A allele of 2q35-rs13387042 and breast cancer in our population. This genotype increased the risk of the disease 2.53-fold compared with the other genotypes. This result is similar to that of Stacey et al.; however, the conferred risk was lower in their study [13]. Although GWAS have revealed novel genetic markers for BC susceptibility in different populations, little is known regarding the risk factors and molecular events associated with BC in the Iranian population.

In our study, the frequency of the A allele (the risk allele) was higher in cases than in controls (55.2% and 48.8%, respectively). In different studies, the A allele has been associated with an increased risk of breast cancer. According to frequencies from the 1000 Genomes Project, this allele is most common in African populations (77%), less frequent in Europeans (51%) and Mexican Americans (41%), and least common in Asians (12%) [21]. Consistent with HapMap data, the A allele was much more common in Europeans than in the Asian population. Stacey et al. also showed that 25 percent of the European population carry allele A of rs13387042 and are estimated to have a 1.44-fold increased risk in comparison with non-carriers [13]. In a study in the Arab population, the GG genotype of rs13387042 on 2q35 showed a significant association with the risk of developing distant metastasis. Also, this allele was associated with a better prognosis, with a considerably higher overall survival rate [26]. A Taiwanese survey found that the A allele of 2q35 conferred a higher BC risk than the G allele. However, in our study, no significant association of rs13387042 with breast cancer was found under the multiplicative genetic model.

Although published meta-analysis data on the association between 2q35-rs13387042 and breast cancer risk have identified the A allele as a risk factor for the disease [27], results have been inconsistent across studies. Campa et al. confirmed the association of 14 SNPs, including rs13387042, with BC risk [13, 28]. This SNP was first identified as a BC susceptibility SNP in two GWAS conducted among Europeans (4554 cases/17,577 controls) [13]. Later studies in African-American women (810 cases and 1784 controls) and European women (306 cases and 10,393 controls) confirmed a significant association between this SNP and breast cancer [18, 29]. In Asians, the results were inconsistent. For instance, in studies of the Chinese population, the association between this SNP and breast cancer risk ranged from significant to non-significant [30, 31]. Additionally, a study by Hutter et al. showed no significant association between rs13387042 and BC in African-American women [32]. Similar results were reported in the Norwegian series [33]. There are several possible explanations for such inconsistent results. Importantly, ethnic differences might give rise to different results because allele frequencies vary among ethnic populations. It is therefore possible that an association between a genetic marker and one specific subtype would not be replicated in other study populations [34]. In addition, other factors such as lifestyle, environmental exposures, diet, individual health background, tumour ER/PR status, and menopausal status, as well as sample size and study design, can all play a critical role [35].

Histopathological properties may also influence the results, as the original publications on 2q35-rs13387042 reported an association mostly with ER-positive BC [13, 36]. In the same way, we found a significant association between rs13387042 and ER-positive BC. Similarly, in the study by Stacey et al., the risk associated with this variant was observed for ER-positive tumours [13]. In contrast, the association was observed for both ER+ and ER− tumours in another study [37]. Some studies of genetic markers including rs13387042 showed stronger associations with ER-positive than with ER-negative tumours for several loci [13, 38]. It should be mentioned that, among various loci, rs13387042 showed significantly different associations by ER status, although no overall association was found for this polymorphism in our study or in the study by Kim et al. [39].

We did not find an association between overall survival and 2q35-rs13387042 alleles or genotypes. Previous studies have reported differing results. Studies in the UK and Germany did not find prognostic value for rs13387042 [40, 41]. However, the rs13387042-A allele was associated with a better prognosis, indicated by significantly higher overall survival rates, in the Arab population [26].

Identifying the potential biological functions of SNPs such as 2q35-rs13387042 can be a significant step towards further studies. Recent studies have shown that two polymorphisms in strong linkage disequilibrium (LD) with rs13387042 (rs6721996 and rs4442975) are associated with decreasing expression of IGFBP5 (involved in the inhibition of cell proliferation via an insulin-like growth factor (IGF)-dependent mechanism) with an increasing number of A alleles. Additionally, in recent years, IGFBP5 and 2q35 have been consistently implicated in cancer, though little was known about the nature of their interaction [21]. IGFBP5 contributes to the documented involvement of the IGF signalling axis in mammary density, a risk factor for BC [42, 43]. It has been indicated that 2q35 plays a role in chromatin architecture and that its functional variation is correlated with gene expression. A novel intergenic BC risk locus containing an enhancer copy number variation (enCNV; deletion) is located approximately 400 kb upstream of IGFBP5 and overlaps an intergenic ERα-bound enhancer that loops to the IGFBP5 promoter; thus, the 2q35 BC risk loci may mediate their effect through IGFBP5 [17]. Consequently, functional studies may lead to a better understanding of the mechanisms underlying the aetiology of BC.

As a limitation, we could not confirm the association of this variant with specific BC subtypes because our sample size was not large enough to evaluate the less frequent molecular categories. Larger sample sizes would increase statistical power and help to establish whether this SNP is associated with specific BC subtypes. This warrants more collaborative studies to assess the strength of the risk associated with susceptibility variants. It should be noted that the purpose of our study was to evaluate, in the Iranian population, the 2q35-rs13387042 variant, whose association with breast cancer risk had been established by GWAS. In addition, functional studies, as a separate project, are needed to determine the full role of this genetic locus in the pathogenesis of the disease.

Conclusion

Our study demonstrated a significant association of an intergenic SNP, 2q35-rs13387042, with BC risk under the recessive model in an Iranian population. Additional investigation of larger data sets, with categorization by intrinsic subtype, as well as functional studies are required to determine how, and to what degree, this variant influences BC pathogenesis.