Introduction

Since 1975, the number of breast cancer patients has consistently increased in Japan, and more than 107,000 new cases of breast cancer were reported in 2016, according to the National Cancer Registry [1]. Therefore, identifying high-risk populations is critical for efficient screening and preventive measures. Estimation of breast cancer risk with the Gail model, which is used in Europe and the United States, has been attempted, but no breast cancer risk model has been developed specifically for the Japanese population, and this remains a clinical challenge.

Many studies have shown associations of breast cancer risk with lifestyle, reproductive factors, breast density, and specific mutations in genes such as BRCA1/2. Single-nucleotide polymorphisms (SNPs) occur at a specific frequency across the genome sequence of a population; specific SNPs contribute to individual genetic and physical characteristics and are related to various diseases, including cancer. In recent years, several genome-wide association studies (GWAS) have identified numerous SNPs associated with breast cancer risk. However, many of these studies have been in European, Chinese, and African American populations, with only a few in Japanese people [2]. The frequency of alleles and the degree of linkage disequilibrium associated with breast cancer risk differ across populations and races, leading to a need to verify whether the reported SNPs affect breast cancer risk in Japanese women.

In a case–control study in Japanese women, we identified rs2046210, rs3757318, and rs3803662 as SNPs associated with breast cancer risk [3]. rs2046210 and rs3757318 are located near the estrogen receptor 1 (ESR1) gene, and their association with breast cancer risk in various populations has been shown, with rs2046210 suggested to be associated with a greater risk of ER-negative breast cancer [4]. This observation indicates that SNPs may affect breast cancer phenotype. However, only a few studies have examined relationships between phenotypic changes in individuals with SNPs and cancer risk, and this remains an important area of study.

Physical characteristics of patients with breast cancer are also associated with the risk of breast cancer. For example, obesity markedly increases the risk of breast cancer in postmenopausal women, but almost certainly reduces the risk in premenopausal women [5], with these results supported by a subsequent large-scale systematic review and meta-analysis [6]. In a study in Japanese women, obesity increased breast cancer risk after menopause, although a large-pool analysis showed that the risk increased even before menopause [7]. Based on European and American research outcomes, taller women have a higher risk of breast cancer, and in the JPHC study in Japan, women of height ≥ 160 cm had a higher risk of breast cancer compared to those ≤ 148 cm, regardless of menopause status [8]. Breast density determined by mammography is also a risk factor, with a high breast density considered to increase breast cancer risk, regardless of race and ethnicity [9].

Numerous SNPs have been linked to breast cancer risk, but the associated mechanisms are poorly understood. Thus, analysis of the association between SNPs and phenotypes may be important for examining the mechanism of action, including in carcinogenesis. Here, we used a dataset from our case–control study to analyze the phenotypes of Japanese women with specific SNPs associated with breast cancer risk. Obesity, height, breast density, and breast cancer characteristics were investigated to understand the biological function of these SNPs and the mechanisms of increased breast cancer risk.

Materials and methods

In a previous case–control study, we investigated relationships among SNPs in Japanese women, breast cancer risk, and environmental factors [3]. We registered patients who had undergone treatment for breast cancer at six facilities in Okayama and Kagawa prefectures from 1987 to 2011 as “cases”. Women who visited Kagawa Prefectural Cancer Detection Center and the screening department of Mizushima Kyodo Hospital in Okayama for screening, and had no history of breast cancer, were included as “controls”. Physical and environmental factors in all subjects were collected using a questionnaire. These included date of birth, height, weight (current, at the time of diagnosis, and at 18 years of age), smoking, drinking, 15 food types eaten, 4 drink types consumed, leisure exercise (current, at the time of diagnosis, and at 18 years of age), menstrual status, age of menarche, age of first birth, birth history, number of births, breastfeeding, hormone replacement therapy, history of mammary gland disease, history of breast cancer in relatives, profession, and educational history. We extracted four physical factors associated with Japanese breast cancer risk: height, weight, BMI, and breast density.

Breast density was determined using digital mammography at breast cancer diagnosis in cases, and at screening in controls [9]. Digital mammography images were exported and interpreted on a monitor display. Two specialists who were qualified in mammographic interpretation and certified by The Japan Central Organization on Quality Assurance of Breast Cancer Screening determined the breast density. Breast density was classified into four categories according to the Breast Imaging Reporting and Data System (BI-RADS): (1) fatty breast (< 25% gland); (2) glandular breast (25–50%); (3) uneven high breast (51–75%); and (4) high breast (> 75%) [10]. The two examiners shared their grading, reevaluated the mammograms if there was a conflict, and made a final decision after discussion. If the density of the two breasts differed, the higher value was used as the final breast density.
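The grading rule described above can be sketched in a few lines of Python. This is only an illustration: the readers in this study graded images visually, so the percent-gland input, the function names, and the handling of values between 50% and 51% are assumptions introduced here, not part of the study protocol.

```python
# Illustrative sketch only; names and the numeric input are hypothetical.

def density_category(percent_gland: float) -> int:
    """Map percent glandular tissue to the four categories listed above."""
    if not 0 <= percent_gland <= 100:
        raise ValueError("percent_gland must be between 0 and 100")
    if percent_gland < 25:
        return 1  # fatty breast (< 25% gland)
    if percent_gland <= 50:
        return 2  # glandular breast (25-50%)
    if percent_gland <= 75:
        return 3  # uneven high breast (51-75%); values in (50, 51) assigned here by assumption
    return 4      # high breast (> 75%)


def final_density(left_category: int, right_category: int) -> int:
    """If the two breasts differ, the higher category is the final density."""
    return max(left_category, right_category)


# Example: left breast graded 2, right breast graded 3 -> final category 3
final = final_density(density_category(40.0), density_category(60.0))
```

The consensus step between the two examiners is a discussion rather than a formula, so it is not modeled here.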

Breast cancer characteristics, such as estrogen receptor (ER), progesterone receptor (PgR), and human epidermal growth factor receptor type-2 (HER2) status, tumor size, regional lymph node metastasis, and distant metastasis were collected from medical records. Blood samples were collected from all subjects, and the relationship between 17 SNPs predicted to be associated with breast cancer from previous GWAS and each survey item was investigated [11]. The risk alleles associated with breast cancer were identified, with reference to the Japanese Single Nucleotide Polymorphism (JSNP) database [12].

Age-adjusted odds ratios (ORs) and multivariate ORs with 95% confidence intervals (CIs) for independent SNPs were investigated in all subjects and after stratification by menopausal status. In all women, three SNPs were significantly associated with breast cancer risk after multivariate adjustment: rs2046210 (per allele OR 1.37 [95% CI 1.11–1.70]), rs3757318 (per allele OR 1.33 [1.05–1.69]), and rs3803662 (per allele OR 1.28 [1.07–1.55]) [3]. To investigate whether the risk allele of each breast cancer risk-related SNP affects a phenotype, relationships of the three SNPs with physical phenotypes (height, body weight, body mass index (BMI), and breast density) and with breast cancer characteristics were investigated using the same dataset.

All procedures involving human participants met the ethical standards of institutional and/or national research committees and the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.

Statistical analysis

Height (cm) was analyzed as a categorical variable with four categories (≤ 150, 151–155, 156–160, and > 160 cm) based on quartiles, weight was used as a continuous variable, and BMI was analyzed as a categorical variable with six categories (< 18.5, 18.5–24.9, 25.0–29.9, 30.0–34.9, 35.0–39.9, and ≥ 40.0 kg/m²) based on the WHO classification. Breast density was analyzed using the four BI-RADS categories, as mentioned above. Expression of ER, PgR, and HER2 in cases was defined as positive or negative: ER and PgR were defined as positive based on immunohistochemical (IHC) expression of ≥ 1%, and HER2 as positive based on an IHC score of 3 or HER2 gene amplification confirmed by FISH.
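As a minimal sketch of how these variable definitions translate into code, the following Python functions derive BMI and assign the categories described above. The function names and category labels are hypothetical, and the treatment of fractional heights (for example, 150.5 cm) is an assumption, since the text gives integer boundaries.

```python
# Illustrative sketch only; function names and labels are hypothetical.

def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2


def height_category(height_cm: float) -> str:
    """Four quartile-based height categories used in the analysis."""
    if height_cm <= 150:
        return "<=150"
    if height_cm <= 155:
        return "151-155"
    if height_cm <= 160:
        return "156-160"
    return ">160"


def bmi_category(value: float) -> str:
    """Six BMI categories following the WHO cut-offs given in the text."""
    if value < 18.5:
        return "<18.5"
    if value < 25.0:
        return "18.5-24.9"
    if value < 30.0:
        return "25.0-29.9"
    if value < 35.0:
        return "30.0-34.9"
    if value < 40.0:
        return "35.0-39.9"
    return ">=40.0"


def hormone_receptor_positive(ihc_percent: float) -> bool:
    """ER or PgR is positive when IHC expression is >= 1%."""
    return ihc_percent >= 1


def her2_positive(ihc_score: int, fish_amplified: bool) -> bool:
    """HER2 is positive with an IHC score of 3 or FISH-confirmed amplification."""
    return ihc_score == 3 or fish_amplified


# Example: a woman weighing 55 kg with a height of 156 cm
example_bmi = bmi(55, 156)
example_labels = (height_category(156), bmi_category(example_bmi))
```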

To compare risk and non-risk allele carriers for each factor, categorical variables were evaluated by Chi-square test, and continuous variables by t test (ANOVA), with the 95% CI calculated. Associations of breast density with possible related factors (number of deliveries, history of breastfeeding, genotype of SNPs) were analyzed using age-adjusted multivariate ordinal logistic regression analysis. A p value < 0.05 was considered significant. JMP ver. 13.2.1 (SAS Institute) was used for all statistical analyses.
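For readers who wish to see the arithmetic behind the comparison of categorical variables, a Pearson Chi-square test for a 2 × 2 table can be sketched with the Python standard library alone. This is not the JMP procedure used in the study; the function name is hypothetical, no continuity correction is applied, and the counts in the example are invented for illustration and are not study data.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom, the upper-tail
    probability of the chi-square distribution equals erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one subject")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts (NOT study data):
# rows = risk / non-risk allele carriers, columns = factor present / absent
statistic, p_value = chi_square_2x2(30, 70, 50, 50)
significant = p_value < 0.05  # threshold stated in the text
```

Tables with more than two categories, such as the four breast density grades, need the general chi-square distribution and are beyond this sketch.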

Results

A total of 515 patients (cases) and 527 controls provided written informed consent to participate in the study. Of the subjects, 476 cases (92.4%) and 464 controls (88.8%) returned self-administered questionnaires. In two cases, blood samples could not be obtained owing to brittle vessels, and in two other cases, SNP genotyping could not be performed due to poor DNA amplification. Thus, the final dataset for analysis included 472 cases and 464 controls with completed questionnaires and SNP genotyping information.

Physical characteristics of people with SNP rs2046210

rs2046210 was evaluated in 931 people (468 cases and 463 controls) in the dataset. The numbers of risk allele (AA + AG) and non-risk allele (GG) carriers were 474 (50.9%) and 457 (49.1%), respectively, including 255 (54.5%) and 213 (45.5%) cases, and 219 (47.3%) and 244 (52.7%) controls (Table 1). Risk allele carriers were significantly taller than non-risk allele carriers in cases (156.0 ± 5.8 vs. 154.3 ± 5.5 cm, p = 0.002), but there was no significant difference in all subjects or in controls. Risk allele carriers also had significantly lower BMI at diagnosis in all subjects (22.3 ± 3.3 vs. 22.8 ± 3.6 kg/m², p = 0.026) and in cases (22.3 ± 3.3 vs. 23.3 ± 3.9 kg/m², p = 0.003), but not in controls. The distribution of breast density differed significantly between risk and non-risk allele carriers in cases (p = 0.040), and breast density tended to be higher in risk allele carriers in both cases and controls (Fig. 1).
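The carrier counts above allow a simple worked example: a crude odds ratio for risk allele carriers versus non-carriers with a Woolf (log-based) 95% CI. This sketch is an assumption-laden illustration, not the analysis performed in the study; it is unadjusted and carrier-based, so it estimates a different quantity from the multivariate per-allele OR reported in Materials and methods, and the function name is hypothetical.

```python
import math


def odds_ratio_ci(exposed_cases: int, exposed_controls: int,
                  unexposed_cases: int, unexposed_controls: int,
                  z: float = 1.96) -> tuple[float, float, float]:
    """Crude odds ratio with a Woolf (log-based) confidence interval.

    Returns (odds_ratio, lower, upper); z = 1.96 gives an approximate 95% CI.
    """
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(1 / exposed_cases + 1 / exposed_controls
                          + 1 / unexposed_cases + 1 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# rs2046210 counts from the text: risk allele carriers (AA + AG) were
# 255 cases and 219 controls; non-risk carriers (GG) were 213 cases and 244 controls.
crude_or, lower, upper = odds_ratio_ci(255, 219, 213, 244)
```

With these counts the crude carrier OR is about 1.33, and the lower confidence limit lies just above 1.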

Table 1 Physical characteristics of people with SNP rs2046210
Fig. 1 Relationship of breast density with genotypes of rs2046210 (a cases, b controls) and rs3757318 (c cases, d controls)

Physical characteristics of people with SNP rs3757318

rs3757318 was evaluated in 927 people (465 cases and 462 controls). The numbers of risk allele (AA + AG) and non-risk allele (GG) carriers were 397 (42.8%) and 530 (57.2%), respectively, including 216 (46.5%) and 249 (53.5%) cases, and 181 (39.2%) and 281 (60.8%) controls (Table 2). Risk allele carriers were significantly taller than non-risk allele carriers in cases (155.8 ± 5.7 vs. 154.7 ± 5.6 cm, p = 0.035), but there was no significant difference in all subjects or in controls. The distribution of breast density differed significantly between risk and non-risk allele carriers in cases (p = 0.044), and breast density tended to be higher in risk allele carriers in both cases and controls (Fig. 1).

Table 2 Physical characteristics of people with SNP rs3757318

Physical characteristics of people with SNP rs3803662

rs3803662 was evaluated in 927 people (464 cases and 462 controls). The numbers of risk allele (AA + AG) and non-risk allele (GG) carriers were 716 (77.2%) and 156 (16.8%), respectively, including 390 (84.1%) and 74 (15.9%) cases, and 369 (79.9%) and 91 (19.9%) controls (Table 3). Height, body weight, BMI, and breast density did not differ significantly between risk and non-risk carriers in all subjects, cases, and controls.

Table 3 Physical characteristics of people with SNP rs3803662

Factors influencing breast density

Age-adjusted multivariate ordinal logistic regression analysis in all subjects showed that the number of deliveries and the rs2046210 genotype were independent significant factors that influenced breast density (Table 4). However, in the stratified analysis of the case group and the control group, the number of deliveries was the only significant factor affecting breast density.

Table 4 Results of analysis of factors with a potential influence on breast density

Breast cancer characteristics and SNPs

Risk and non-risk allele carriers in cases were compared to evaluate the relationship between genotype and breast cancer phenotypes. rs3757318 risk allele carriers were significantly more likely to be ER negative (ER-positive rate: 77% vs. 84%, p = 0.036). There was no difference in breast cancer subtype, tumor size, lymph node metastasis, or distant metastasis between risk and non-risk allele carriers for any of the three SNPs (Table 5).

Table 5 Breast cancer characteristics and SNPs

Discussion

Taller height has been shown to be associated with breast cancer risk [5], and rs2046210 and rs3757318 risk allele carriers were significantly taller among the patients in this study. However, this was not observed in controls, which makes interpretation of this result difficult. An analysis of the association between height and genes using whole-genome sequence data from 1,037 Japanese subjects did not include these SNPs [13], which suggests that rs2046210 and rs3757318 are not universal factors affecting the height of Japanese women. An association between these SNPs and height was found only in breast cancer patients, which suggests that other factors (genetic predispositions and environmental factors) modify the effects of these SNPs. rs2046210 is located on chromosome 6q25.1, 29 kb upstream of the start of the ESR1 gene, 180 kb upstream of the transcriptional exon, and 6 kb downstream of C6orf97, whereas rs3757318 is in an intron of C6orf97, 200 kb upstream of ESR1, and 34,253 bp upstream of rs2046210. However, whether these SNPs are involved in ESR1 expression or ERα function has not been determined [14, 15]. A potential association between ESR1 and height has been described in a study of adult males from two Swedish population cohorts and in a Swedish case–control study of breast cancer [16, 17].

Risk allele carriers of rs2046210 had significantly lower BMI at diagnosis in all subjects and in cases. Obesity is an established risk factor for breast cancer, but this risk allele does not appear to be related to obesity in Japanese women. The significantly lower BMI in rs2046210 risk allele carriers may be due to their taller height.

Interestingly, rs2046210 and rs3757318 are both associated with ESR1 and were also both associated with breast density, with risk allele carriers having a higher breast density. There have been many reports on the association of breast density with genetic polymorphisms associated with breast cancer. A systematic review showed strong evidence for a genetic component to breast density. Candidate gene approaches yielded replicated associations of COMT Val158Met and IGF-I rs6220 A > G with premenopausal breast density and of ESR1 (Xba I and Pvu II) polymorphisms with postmenopausal breast density, which possibly translate to breast cancer risk [18]. However, many polymorphisms have only been examined in a few studies, which limits the conclusions. The current study does not allow for a definitive conclusion due to the small sample size, but our results suggest that rs2046210 and rs3757318 may be related to breast density in Japanese women. In particular, the rs2046210 genotype was a significant factor affecting breast density in multivariate analysis in all subjects, and its association with breast density was strong.

rs3803662 is associated with the TNRC9 gene and is one of the four most influential breast cancer risk-related SNPs found in GWAS across 4,000 breast cancer samples [19]. A meta-analysis by Yan et al. suggested no association of rs3803662 with breast cancer risk in Asians [20], and in the current study, we found no significant association between rs3803662 and physical phenotypes associated with breast cancer risk.

Following identification of breast cancer risk loci in early GWAS, subsequent studies have investigated associations between these loci and breast tumor histopathology [21]. To date, many SNPs associated with overall breast cancer are also associated with ER-positive breast cancer. A large-scale case–control GWAS conducted in Europe that aimed to find susceptibility variants associated with a risk of ER-negative breast cancer found 125 such variants, which explained approximately 16% of the familial risk of this breast cancer subtype [22]. In our case–control study, the sample size was small and risk assessment by subtype-specific polymorphism was not possible. However, rs3757318 risk allele carriers had a significantly higher rate of ER-negative breast cancer, which suggests the need for further study of the association of rs3757318 with ER-negative breast cancer risk.

This study has several limitations. The small sample size limited the conclusions that could be drawn. No adjustment was made for multiple testing, so the possibility of type I (α) error cannot be excluded. In addition, new SNPs associated with breast cancer risk have been identified recently, and thus only a limited set of SNPs was considered. Within these limitations, we conclude that SNPs rs2046210 and rs3757318, which are associated with breast cancer risk in Japanese women, are significantly associated with height and high breast density, and that this association is particularly strong in patients with breast cancer. These findings suggest that SNPs in the ESR1 region affect phenotypes such as height and breast density in breast cancer cases.