Introduction

Breast cancer (BC) is becoming the heaviest disease burden and the leading cause of cancer-related death in women, in China as in many other countries [1,2,3,4,5,6]. In China, an estimated 272,400 new cases of BC were diagnosed and an estimated 70,700 deaths were expected to occur in 2015, with incidence increasing year by year [4]. BC is a complex disease caused by a combination of factors such as age, lifestyle, family history, and serum hormone level [4, 5, 7, 8]. Among these factors, hereditary factors are considered the most important, because 5–10% of cases arise from genetic variation in susceptibility genes [9]. Genome-wide association studies (GWAS) have reported significant effects of some gene polymorphisms on BC risk [10,11,12,13,14], but many associations between single nucleotide polymorphisms (SNPs) and BC risk remain unexplored.

Long non-coding RNAs (lncRNAs) are non-coding RNAs longer than 200 nucleotides. Recently, several studies have shown a link between genetic variants in lncRNA genes and breast cancer risk. Cui et al. found a SNP 2 kb upstream of the H19 transcription start site that was associated with breast cancer risk in estrogen receptor (ER)-positive patients in the Chinese population [15]. Wu et al. studied risk associations among 22,977 cases and 105,974 controls of European ancestry and found several novel risk loci that harbored lncRNA genes [16]. MIR31HG, also known as LNCHIFCAR/LOC554202/hsa-lnc-31, is located on chromosome 9 and produces a lncRNA that acts as the host gene for MIR-31. lncRNAs, which lack protein-coding function, play considerable roles in complex biological activities, including regulating gene expression through chromatin remodeling, controlling gene transcription, participating in post-transcriptional mRNA processing, and mediating protein function or localization [17]. Their dysregulation appears to contribute to the growth and progression of human tumors [18], and MIR31HG has been widely reported to be involved in the development of various cancers, such as colorectal cancer [19], bladder cancer [20], oral cancer [21], lung adenocarcinoma [17], pancreatic carcinoma [22], esophageal squamous cell carcinoma [23] and BC [6, 24, 25]. Studies have shown that MIR31HG acts as a non-coding oncogene: its down-regulation leads to diminished cell proliferation, migration, and invasion and to increased apoptosis in BC [25]. A related study also reported that MIR31HG is regulated by promoter hypermethylation in triple-negative BC and participates in the regulatory mechanism of BC as an important determinant of the invasion-metastasis cascade [24]. These studies show that MIR31HG has an important role in BC; however, there is no report on the relationship between MIR31HG polymorphisms and the risk of BC.

This case–control study aimed to determine the impact of MIR31HG gene polymorphisms on the risk of BC in Chinese women and to explore the association of MIR31HG polymorphisms with the clinical characteristics of BC patients, which may provide a theoretical and experimental basis for further investigating the role of MIR31HG in BC carcinogenesis.

Materials and methods

Study population

The study was approved by the ethics committee of Shaanxi Provincial Cancer Hospital, and all participants signed informed consent forms. A total of 545 patients with BC (mean age 52.00 ± 9.89 years) were recruited in 2017 and 2018. Histopathological diagnosis followed the World Health Organization (WHO, 5th edition) classification of breast tumors, and clinical staging was based on the American Joint Committee on Cancer (AJCC) TNM staging system for breast cancer. Patients who had a family history of cancer or who had received radiotherapy, chemotherapy, or other treatments before the investigation period were excluded. BC cases were categorized by estrogen receptor (ER), progesterone receptor (PR), lymph node (LN) metastasis, clinical stage, human epidermal growth factor receptor 2 (HER2), Ki67, tumor size, tumor location and distant metastasis. ER and PR test results > 1% were defined as ER-positive and PR-positive, and HER2 3+, or 2+ with a positive fluorescence in situ hybridization (FISH) result, was defined as HER2-positive. A Ki67 positive rate of 20% was used as the cutoff point to divide patients with BC into low (< 20%) and high (> 20%) groups. During the same period, 530 cancer-free controls (mean age: 51.66 ± 9.67 years) were enrolled from the healthcare service of the hospital. Controls were eligible only if they had (1) no gynecological neoplasm, (2) no history of other solid cancers, and (3) no immune disorders (Table 1). Approximately 3–5 mL of venous blood was collected from each participant into anticoagulant tubes and stored at − 80 °C until use. Demographic and clinical indicators were recorded from self-administered standardized questionnaires and medical records, respectively.

Table 1 Clinical characteristics in cases and controls

Extraction of genomic DNA and genotyping

Studies have shown that MIR31HG plays an important role in BC, but the correlation between MIR31HG polymorphisms and BC risk has not been reported. The eight candidate SNPs in the MIR31HG gene were selected on the basis of haplotype or genotype data [26] from the 1000 Genomes Project (http://www.internationalgenome.org/), retaining SNPs with a minor allele frequency (MAF) greater than 0.05 in the global population.
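The MAF criterion can be illustrated with a minimal sketch. The function names and genotype counts below are hypothetical and are not taken from the 1000 Genomes Project data; the sketch assumes a biallelic SNP and a strict "greater than 0.05" threshold.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Return the minor allele frequency of a biallelic SNP from genotype counts."""
    n_chromosomes = 2 * (n_aa + n_ab + n_bb)
    if n_chromosomes == 0:
        raise ValueError("at least one genotype is required")
    freq_a = (2 * n_aa + n_ab) / n_chromosomes
    # The minor allele is whichever of the two alleles is rarer.
    return min(freq_a, 1.0 - freq_a)


def passes_maf_filter(n_aa, n_ab, n_bb, threshold=0.05):
    """Keep a SNP only if its MAF is strictly greater than the threshold."""
    return minor_allele_frequency(n_aa, n_ab, n_bb) > threshold


# Hypothetical genotype counts (AA, AB, BB) for two illustrative SNPs.
common_snp = (81, 18, 1)   # MAF about 0.10, retained
rare_snp = (98, 2, 0)      # MAF about 0.01, discarded
print(passes_maf_filter(*common_snp), passes_maf_filter(*rare_snp))
```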

Genomic DNA was extracted from venous blood strictly according to the GoldMag-Mini extraction method (GoldMag Co, Ltd, Xi’an, China). DNA concentration was measured by spectrometry (DU530 UV/VIS spectrophotometer, Beckman Instruments, Fullerton, CA, USA). The Agena Bioscience Assay Design Suite V2.0 software (http://agenacx.com/online-tools) was used to design the extension primers. Genotyping was performed with the MassARRAY Nanodispenser (Agena Bioscience, San Diego, CA, USA) and the MassARRAY iPLEX platform (Agena Bioscience, San Diego, CA, USA), and the data were analyzed with Agena Bioscience TYPER software (version 4.0) [27].

Statistical analyses

Statistical analysis was performed using Microsoft Excel, the SPSS 18.0 statistical package (SPSS, Chicago, IL, USA) and PLINK 1.07 software. Hardy–Weinberg equilibrium (HWE) of the MIR31HG genotype distributions in controls was assessed using Fisher's exact test. Demographic and clinical characteristics of the study participants, as well as genotype and allele frequency distributions, were compared between groups using the chi-squared test. Welch’s t-test was used to compare ages between cases and controls. Logistic regression analysis was used to evaluate the genetic susceptibility to BC under five genetic models (allele, codominant, recessive, dominant, and additive models). Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated from the logistic regression models to estimate the relative risk. All p values were two-sided, and p < 0.05 was considered statistically significant. Multifactor dimensionality reduction (MDR) analysis was performed to assess the impact of SNP-SNP interactions on BC risk [28]. We used G*Power 3.1.9.2 software to calculate the minimum sample size required for this study and the actual power [29].
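To make the OR and 95% CI concrete, the following minimal sketch computes an unadjusted odds ratio with a Wald confidence interval from a 2×2 table. This is the simple analogue of a logistic regression with a single binary predictor; whether the study's models included covariates is not assumed here. The function name and the counts are hypothetical and are not taken from the study's tables.

```python
from math import exp, log, sqrt


def odds_ratio_wald(exposed_cases, unexposed_cases,
                    exposed_controls, unexposed_controls, z=1.96):
    """Return (OR, lower, upper) with a Wald 95% CI on the log-odds scale."""
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = sqrt(sum(1.0 / c for c in cells))
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts: minor/major alleles in cases and in controls.
or_, lo, hi = odds_ratio_wald(40, 160, 60, 140)
print(f"OR {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

An OR below 1 whose interval excludes 1 corresponds to the "reduced risk" wording used for the minor alleles in the Results.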

Results

Demographics of study subjects and SNPs information

The minimum sample sizes of the case and control populations calculated with G*Power software were 107 and 103, respectively, with an actual power of 0.95. Eight SNPs were genotyped in 545 patients with BC and 530 cancer-free controls, so the recruited sample exceeded the calculated minimum. Demographics and clinical characteristics of all study subjects are displayed in Table 1. The cases consisted of 371 (68.0%) ER positive tumors, 320 (58.7%) PR positive tumors, 267 (49.0%) LN metastasis positive tumors, 78 (14.3%) HER2 positive tumors, 355 (65.1%) I/II and 156 (28.6%) III/IV clinical stage, 147 (27.0%) low Ki67 status, and 258 (47.3%) right and 279 (51.2%) left tumor locations. There was no significant difference in age distribution between cases and controls (p = 0.453). Basic information on all eight MIR31HG SNPs, including position, allele, role, and MAF in cases and controls, is shown in Table 2. The genotype distributions of all SNPs were in accordance with HWE (p > 0.05).
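As an illustration of an HWE check on control genotypes, the sketch below implements one standard exact-test formulation: conditional on the allele counts, it sums the probabilities of all heterozygote counts that are no more likely than the observed one. That this matches the exact procedure used in the study is an assumption; the function name and the genotype counts are hypothetical.

```python
from math import exp, lgamma, log


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact HWE p value for a biallelic SNP from genotype counts."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab          # copies of allele A
    n_b = 2 * n_bb + n_ab          # copies of allele B

    def log_prob(het):
        # Probability of observing `het` heterozygotes given n, n_a and n_b.
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (het * log(2) + lgamma(n + 1)
                - lgamma(hom_a + 1) - lgamma(het + 1) - lgamma(hom_b + 1)
                + lgamma(n_a + 1) + lgamma(n_b + 1) - lgamma(2 * n + 1))

    # Heterozygote counts must share the parity of the allele counts.
    probs = {het: exp(log_prob(het))
             for het in range(n_a % 2, min(n_a, n_b) + 1, 2)}
    p_observed = probs[n_ab]
    # Small relative tolerance so ties with the observed value are included.
    p_value = sum(p for p in probs.values() if p <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical control genotype counts (AA, AB, BB).
print(hwe_exact_p(25, 50, 25))   # proportions match HWE expectations
print(hwe_exact_p(50, 0, 50))    # no heterozygotes at all
```

A p value above 0.05, as reported for all eight SNPs, gives no evidence of departure from HWE in the controls.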

Table 2 Allele frequencies in cases and controls among MIR31HG SNPs

Association between MIR31HG polymorphisms and BC risk

Five genetic models (allele, codominant, dominant, recessive, and additive models) were used to analyze the relationship between the candidate SNPs and the risk of BC in Chinese women. The results showed that three candidate SNPs in the MIR31HG gene were significantly associated with BC risk under multiple genetic models (p < 0.05) (see Table 3). Specifically, rs72703442 was significantly associated with a lower risk of BC under both the codominant (OR 0.29, 95% CI 0.10–0.79, p = 0.026) and recessive (OR 0.30, 95% CI 0.11–0.82, p = 0.011) models. Rs55683539 was significantly associated with a lower risk of BC under the allelic (OR 0.76, 95% CI 0.62–0.93, p = 0.007), codominant (OR 0.46, 95% CI 0.26–0.80, p = 0.012), dominant (OR 0.78, 95% CI 0.61–0.99, p = 0.040), recessive (OR 0.49, 95% CI 0.29–0.84, p = 0.008), and additive (OR 0.76, 95% CI 0.62–0.93, p = 0.007) models. Rs2181559 was significantly associated with a lower risk of BC under the allelic (OR 0.82, 95% CI 0.69–0.98, p = 0.028), codominant (OR 0.59, 95% CI 0.40–0.89, p = 0.038), recessive (OR 0.62, 95% CI 0.43–0.91, p = 0.014), and additive (OR 0.82, 95% CI 0.68–0.98, p = 0.026) models.
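The five genetic models differ only in how a genotype is turned into a predictor. The sketch below shows the conventional coding by minor-allele count; the exact coding used by the study's software is an assumption, and the function names are hypothetical.

```python
def encode_genotype(minor_allele_count, model):
    """Code a genotype (0, 1 or 2 minor alleles) under a genetic model."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1 or 2")
    if model == "dominant":
        # Carriers of at least one minor allele vs. major homozygotes.
        return int(minor_allele_count >= 1)
    if model == "recessive":
        # Minor homozygotes vs. everyone else.
        return int(minor_allele_count == 2)
    if model == "additive":
        # Each extra minor allele shifts the log-odds by the same amount.
        return minor_allele_count
    if model == "codominant":
        # Two indicators: heterozygote and minor homozygote, each compared
        # with the major homozygote reference.
        return (int(minor_allele_count == 1), int(minor_allele_count == 2))
    raise ValueError(f"unknown model: {model}")


def allele_counts(n_major_hom, n_het, n_minor_hom):
    """Allele model: count chromosomes, two per individual."""
    major = 2 * n_major_hom + n_het
    minor = 2 * n_minor_hom + n_het
    return major, minor


for model in ("dominant", "recessive", "additive", "codominant"):
    print(model, [encode_genotype(g, model) for g in (0, 1, 2)])
```

Under this coding, a SNP that is significant in the recessive model but not in the dominant one, as reported for rs72703442, suggests that the effect is concentrated in minor homozygotes.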

Table 3 Genotypic model analysis of the relationship between MIR31HG SNPs and BC risk

Age-stratified analysis

Next, using the average age of the recruited subjects, we conducted a stratified analysis with 52 years as the age cutoff to further explore the effect of MIR31HG polymorphisms on the risk of BC in Chinese women (Table 4). In women aged ≥ 52 years, MIR31HG rs72703442 was significantly associated with a reduced risk of BC under both the codominant (OR 0.18, 95% CI 0.04–0.81, p = 0.034) and recessive (OR 0.18, 95% CI 0.04–0.83, p = 0.010) models. Moreover, rs55683539 was associated with a significantly lower risk of BC under the allele (OR 0.68, 95% CI 0.51–0.90, p = 0.007), codominant (OR 0.68, 95% CI 0.48–0.98, p = 0.015), dominant (OR 0.64, 95% CI 0.45–0.91, p = 0.012), and additive (OR 0.65, 95% CI 0.49–0.88, p = 0.004) models. Meanwhile, for rs2181559, women with the A allele (compared with those carrying the T allele) and the AA genotype (compared with those carrying the TT and TA genotypes) had a reduced risk of BC under the allele (OR 0.76, 95% CI 0.56–0.94, p = 0.014), codominant (OR 0.48, 95% CI 0.27–0.84, p = 0.032), recessive (OR 0.54, 95% CI 0.32–0.92, p = 0.021), and additive (OR 0.72, 95% CI 0.55–0.93, p = 0.012) models. However, in women < 52 years old, no candidate SNP was significantly associated with BC risk (p > 0.05).

Table 4 Age-stratified analysis of the association between selected SNPs and BC risk

Stratified analysis of demographic and clinical indicators in case group

A stratified analysis by ER, PR, HER2, age at menarche, number of births and menopausal status was further performed in the case group. Both ER and PR stratified analyses indicated that rs79988146 was associated with ER-positive and PR-positive status in BC patients under the dominant model (ER: OR 2.13, 95% CI 1.04–4.35, p = 0.028; PR: OR 1.94, 95% CI 1.03–3.62, p = 0.033) (Table 5). Moreover, rs1332184 (allele: OR 0.56, 95% CI 0.35–0.89, p = 0.014; and additive: OR 0.54, 95% CI 0.33–0.88, p = 0.009), rs72703442 (allele: OR 0.39, 95% CI 0.20–0.78, p = 0.005; codominant: OR 0.29, 95% CI 0.13–0.64, p = 0.003; dominant: OR 0.32, 95% CI 0.15–0.67, p = 0.001; and additive: OR 0.37, 95% CI 0.18–0.74, p = 0.002), rs55683539 (allele: OR 0.59, 95% CI 0.36–0.96, p = 0.033; dominant: OR 0.51, 95% CI 0.29–0.91, p = 0.019; and additive: OR 0.60, 95% CI 0.37–0.98, p = 0.032) and rs2181559 (allele: OR 0.62, 95% CI 0.41–0.94, p = 0.025; dominant: OR 0.58, 95% CI 0.35–0.97, p = 0.037; and additive: OR 0.64, 95% CI 0.43–0.97, p = 0.029) were associated with negative HER2 status. Stratification by age at menarche showed that rs1332184 had a significant positive correlation with age at menarche under the allelic (OR 1.50, 95% CI 1.08–2.09, p = 0.017), codominant (OR 2.58, 95% CI 1.15–5.79, p = 0.048), dominant (OR 1.55, 95% CI 1.01–2.37, p = 0.044), recessive (OR 2.24, 95% CI 1.03–4.91, p = 0.049), and additive (OR 1.52, 95% CI 1.08–2.12, p = 0.016) models. In addition, stratification by number of births showed that rs55683539 and rs10965064 were significantly negatively correlated with the number of births under the codominant (rs55683539: OR 0.32, 95% CI 0.11–0.93, p = 0.031) and recessive (rs55683539: OR 0.29, 95% CI 0.10–0.86, p = 0.017; and rs10965064: OR 0.52, 95% CI 0.31–0.89, p = 0.015) models (Table 6). We also explored the association of MIR31HG SNPs with the menopausal status of BC patients; however, no significant association was found.

Table 5 Stratified analysis of ER, PR and HER2 in case group
Table 6 Stratified analysis of age at menarche, number of births and menopausal status in case group

Table 7 shows the results of the stratified analysis by LN metastasis, clinical stage, and tumor size in the case group. Stratified analysis by LN metastasis revealed a positive relationship between rs79988146 and LN metastasis under the allelic (OR 1.79, 95% CI 1.03–3.11, p = 0.038), dominant (OR 1.88, 95% CI 1.05–3.36, p = 0.032), and additive (OR 1.76, 95% CI 1.01–3.08, p = 0.042) models. Stratified analysis by clinical stage showed that rs1332184 was associated with higher stage under the allelic (OR 1.43, 95% CI 1.06–1.92, p = 0.020), codominant (OR 2.69, 95% CI 1.27–5.72, p = 0.036), recessive (OR 2.50, 95% CI 1.20–5.22, p = 0.015), and additive (OR 1.42, 95% CI 1.04–1.92, p = 0.026) models. Moreover, rs2181559 might be associated with larger tumor size in BC patients (additive: OR 1.39, 95% CI 1.00–1.94, p = 0.046).

Table 7 Stratified analysis of LN metastasis, clinical stage, and tumor size in case group

MDR analysis for the effect of MIR31HG SNP-SNP interaction on BC risk

The dendrogram and the Fruchterman-Reingold graph describe the interactions between these SNPs (Fig. 1A, B). Short connections among nodes represent stronger redundant interactions (Fig. 1A). A negative value for the two-locus entropy indicates an antagonistic effect, and a positive value indicates a synergistic effect (Fig. 1B). MDR analysis showed that interaction among the candidate SNPs was associated with BC risk (Table 8). The optimal single-locus model for predicting BC risk was rs55683539 [testing accuracy (TA): 0.5038, cross-validation consistency (CVC): 5/10], in which the rs55683539-CC group was the high-risk group and the rs55683539-TT group was the low-risk group. Among the multi-locus models, the best model for predicting BC risk was the eight-locus combination of rs79988146, rs1332184, rs72703442, rs2025327, rs55683539, rs2181559, rs10965059, and rs10965064 [TA: 0.5179, CVC: 10/10]. The combination of all high-risk genotypes was associated with an increased risk of BC compared with that of low-risk genotypes.
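The core step of MDR, pooling multi-locus genotype combinations into high-risk and low-risk groups, can be sketched as follows. This is a simplified illustration under stated assumptions: it scores the model on the same data used to define the groups and omits the 10-fold cross-validation from which TA and CVC are derived. The function names and the toy data are hypothetical.

```python
from collections import Counter


def cell_of(row, loci):
    """Multi-locus genotype combination of one subject at the chosen loci."""
    return tuple(row[i] for i in loci)


def high_risk_cells(genotypes, labels, loci):
    """Cells whose case:control ratio exceeds the overall case:control ratio."""
    cases, controls = Counter(), Counter()
    for row, label in zip(genotypes, labels):
        (cases if label == 1 else controls)[cell_of(row, loci)] += 1
    n_cases, n_controls = sum(cases.values()), sum(controls.values())
    if n_cases == 0 or n_controls == 0:
        raise ValueError("both cases and controls are required")
    # Cross-multiplied comparison avoids dividing by empty control cells.
    return {cell for cell in set(cases) | set(controls)
            if cases[cell] * n_controls > controls[cell] * n_cases}


def balanced_accuracy(genotypes, labels, loci, high_risk):
    """Mean of sensitivity and specificity when high-risk cells predict 'case'."""
    tp = fn = tn = fp = 0
    for row, label in zip(genotypes, labels):
        predicted_case = cell_of(row, loci) in high_risk
        if label == 1:
            tp += predicted_case
            fn += not predicted_case
        else:
            fp += predicted_case
            tn += not predicted_case
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


# Toy data: minor-allele counts at two loci; label 1 = case, 0 = control.
genotypes = [(0, 0), (0, 1), (0, 2), (2, 0), (2, 1), (2, 2)]
labels = [1, 1, 1, 0, 0, 0]
for loci in [(0,), (1,)]:
    cells = high_risk_cells(genotypes, labels, loci)
    print(loci, cells, balanced_accuracy(genotypes, labels, loci, cells))
```

An accuracy near 0.5 means the model separates cases from controls little better than chance, which is useful context when reading TA values such as those in Table 8.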

Fig. 1

The dendrogram (A) and Fruchterman-Reingold graph (B) of MIR31HG SNP-SNP interactions for BC risk. (A) Short connections among nodes represent stronger redundant interactions. (B) Negative percent entropy indicates redundancy

Table 8 SNP–SNP interaction models of the MIR31HG gene in the predisposition to BC

Discussion

In recent years, it has been recognized that most tumors form under the combined effects of environmental and genetic factors. Research suggests that the occurrence of tumors may result from the cumulative effect of multiple minor susceptibility genes [30], which may affect the metabolism of carcinogens, the repair of DNA damage, the regulation of hormone levels, and immune function. Although numerous studies have been published on genetic associations with BC and on the effects of MIR31HG on cancer, few have addressed whether MIR31HG could serve as a candidate gene for BC. To the best of our knowledge, this is the first study to analyze the association between MIR31HG gene polymorphisms and BC risk in Chinese women.

Through functional studies of MIR31HG in a BC mouse model, Augoff et al. confirmed that changes in the expression level of this gene facilitate tumor invasion and eventually metastasis [24]. Shi et al. also showed that knockdown of MIR31HG could inhibit tumor growth in vivo, suggesting that the expression level of MIR31HG could serve as a diagnostic and prognostic marker for BC [25]. In this study, we assessed the relationship between MIR31HG gene polymorphisms and the risk of BC and identified several candidate markers for evaluating BC risk. The results showed that rs72703442, rs55683539 and rs2181559 in MIR31HG were significantly associated with a reduced risk of BC in Chinese women. After age stratification, these three SNPs (rs72703442, rs55683539, and rs2181559) remained significantly associated with a reduced risk of BC in Chinese women ≥ 52 years old. However, no SNP was associated with BC risk in Chinese women < 52 years old.

By analyzing gene polymorphisms, Xia et al. concluded that BC risk varied with ER and PR status [8]. Zhou et al. also considered ER and PR status to remain the key determinant of the type of adjuvant therapy for BC, because estrogen stimulates ER-mediated transcription to increase cell proliferation, thereby increasing the number of DNA replication errors [31]. These findings indicate that clinical indicators, especially ER and PR status, influence the development of BC.

ER and PR stratification showed that rs79988146 in MIR31HG was positively correlated with ER and PR status in Chinese female BC patients. In addition, stratification by age at menarche found that only rs1332184 in MIR31HG was associated with age at menarche in BC patients. Stratification by number of births showed that rs10965064 in MIR31HG was negatively correlated with the number of births in BC patients.

BC is a complex disease affected by the interaction of multiple factors, including heredity. Multi-gene or SNP-SNP interaction studies may help to discover risk factors for BC. Therefore, we performed MDR analysis to identify potential SNP-SNP interactions among the eight SNPs in the MIR31HG gene. The analysis indicated a strong interaction between SNPs in MIR31HG with respect to BC susceptibility. In addition, the combination of rs79988146, rs1332184, rs72703442, rs2025327, rs55683539, rs2181559, rs10965059, and rs10965064 was the best multi-locus model for predicting BC susceptibility.

However, our study has some limitations that cannot be neglected. First, this is a hospital-based case–control study, which may involve unavoidable sample selection bias and missing sample information. Second, our study has limited generalizability because all participants were Han Chinese; further well-designed studies with larger populations or other ethnic groups are needed to confirm our findings. Third, our sample size was insufficient to support stratified analysis of tumor subtypes. Finally, because other information was incomplete, we did not analyze other risk factors for BC, such as lifestyle, family history, and other benign breast lesions. Therefore, population-based studies with larger sample sizes and more complete information will be needed in the future to improve the accuracy of assessments and to explore the interaction between genetic variants and these factors.

Conclusion

In conclusion, this study shows for the first time that MIR31HG gene polymorphisms are associated with a reduced risk of BC in Chinese women, and it provides a theoretical basis for future exploration of the relationship between the MIR31HG gene and BC risk in different populations. These findings can provide new biological insights into the role of MIR31HG in the occurrence of BC.