Introduction

Approximately 80 % of breast cancers diagnosed in postmenopausal women express estrogen receptor (ER) or progesterone receptor (PR). In this population, the major source of estrogen is the peripheral synthesis of estrone (E1) and estradiol (E2) by the cytochrome P450 enzyme aromatase, which is encoded by the CYP19A1 gene [1]. Third-generation aromatase inhibitors (AIs) (anastrozole, letrozole and exemestane) are potent and specific inhibitors of aromatase and have become established standard therapy for hormone receptor (HR)-positive tumors in postmenopausal patients in all settings of the disease [2]. However, response rates for first-line treatment of patients with metastatic disease are only approximately 40 %, with all initial responders eventually developing resistance [2]. Moreover, progressive disease has been described in patients receiving neoadjuvant treatment with AIs [3, 4] and relapses occur during and after completion of 5 years of adjuvant AIs [5, 6]. Thus, there is a need for biomarkers that would allow prediction of which patients would benefit from aromatase inhibition or from alternative treatment strategies [7].

A number of single nucleotide polymorphisms (SNPs) in CYP19A1 have been related to differential aromatase gene transcription [8, 9] or aromatase enzymatic activity [8]. Most previous studies have considered different CYP19A1 polymorphisms in relation to breast cancer risk [10–13], breast cancer prognosis [14–16] and sex hormone levels [12, 17–19], with conflicting results. More recently, selected CYP19A1 SNPs have been investigated for their association with the therapeutic efficacy and toxicity of AIs [20–24]. In particular, in patients with HR-positive metastatic breast cancer treated with the aromatase inhibitor letrozole, the presence of a SNP in the 3′ untranslated region (3′-UTR), rs4646 [20], and two SNPs in exon I.6 of CYP19A1, rs10459592 and rs4775936 [22], have been related to improved treatment efficacy. Although validation has been lacking, these SNPs present candidate markers to direct appropriate selection of letrozole in patients with metastatic breast cancer (mBC).

We conducted the present study to further evaluate whether the CYP19A1 SNPs rs4646, rs10459592 and rs4775936 and the tetranucleotide repeat (TTTA)n in intron 4 are related to the efficacy of AIs in a large cohort of patients with HR-positive mBC. We also genotyped further SNPs in CYP19A1 and in other candidate genes related to estrogen and AI metabolism as an exploratory, hypothesis-generating dataset.

Materials and methods

Patients

Breast cancer patients with advanced disease were identified from a prospectively maintained database at The Royal Marsden Hospital from 1994 until April 2008. Patients were eligible if they were postmenopausal and had received at least 4 weeks of treatment with a third-generation AI (anastrozole, letrozole or exemestane) for HR-positive advanced breast cancer. Archival formalin-fixed paraffin-embedded (FFPE) tissue also had to be available from the primary tumor or, if blocks from the primary lesion were not obtainable, from recurrent disease.

Clinical data were collected by a comprehensive retrospective case note review to supplement prospectively collected data.

DNA extraction

DNA was obtained from 3 × 8 µm sections from FFPE blocks, including whole sections with representation of the tumor confirmed by H&E staining. Extraction was performed using the Qiagen DNeasy with modifications for FFPE material.

Polymorphism selection and genotyping

Tagging-SNPs were selected to capture the common genetic variation of the CYP19A1 gene based on linkage disequilibrium information available in the International HapMap Project (National Center for Biotechnology Information (NCBI) build 36, release phase I and II March 08, http://www.hapmap.org). SNPs were selected from a region encompassing 10-kb flanking CYP19A1 (Chr15:49278964-49428086) using the tagger algorithm in Haploview with a cut-off of 0.8 for r² and a minor allele frequency (MAF) ≥5 % across numerous ethnic groups. A total of 37 SNPs were chosen to represent genetic variation of 137 alleles. Furthermore, detailed literature and NCBI single nucleotide polymorphism database (dbSNP) searches were performed to identify variants in CYP19A1 that (1) had a functional impact on gene expression or enzyme activity; (2) were associated with AI outcome in mBC or (3) were associated with breast cancer survival. The other SNPs included were selected from genes encoding products that are involved in the estrogen metabolic pathway and metabolic and/or pharmacodynamic pathways of AIs. Five SNPs in the estrogen receptor genes (ESR1 and ESR2) were also tested.
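The r² cut-off used for tagging can be illustrated with a short calculation. The following is a minimal sketch, not the Haploview tagger itself: it assumes phased haplotype frequencies are already known for two biallelic SNPs, and the function name and arguments are hypothetical.

```python
def pairwise_r2(p_ab, p_a, p_b):
    """Squared correlation (r^2) between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    # D measures how far the haplotype frequency departs from independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


def is_tagged(p_ab, p_a, p_b, cutoff=0.8):
    """True if one SNP captures the other at the chosen r^2 cut-off."""
    return pairwise_r2(p_ab, p_a, p_b) >= cutoff
```

Under this definition, a SNP is considered captured when its r² with at least one selected tag SNP reaches the 0.8 cut-off.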

Multiplex SNP genotyping was performed using the Sequenom® MassARRAY technology (Sequenom®, San Diego) (http://www.sequenom.com). Primers for PCR and single base extension were designed using the Assay Designer software package (Sequenom®, San Diego) (Supplementary Table 1). The iPLEX™ assay was followed according to the manufacturer’s instructions using 25 ng of DNA. Genotype calling was performed in real time with MassARRAY RT software version 3.0.0.4 and analyzed using the MassARRAY Typer software version 3.4.

The CYP19A1 variant, rs700519 (p.R264C) was analyzed by Pyrosequencing. Primers were designed using Pyrosequencing Assay Design Software (Qiagen, Crawley, UK) (Supplementary Table 2).

For detection of the tetranucleotide repeat polymorphism (TTTA)n in intron 4 of CYP19A1 (rs60271534), a region containing the polymorphic site was amplified with previously described oligonucleotide primers, of which the forward primer was fluorescently labeled [25] (Supplementary Table 3). Fluorescently labeled PCR products were separated by automated capillary electrophoresis on the ABI PRISM 3100 Genetic Analyzer (Life Technologies, Warrington, UK) and variations in polymorphism length were determined using GeneScan 3.7 software. The alleles ranged in size from 164 to 195 bp, depending on the number of TTTA repeats. The (TTTA)7 repeat polymorphism contained two different alleles depending on a TCT insertion/deletion (rs11575899) 50 bp upstream of the (TTTA)n tract, resulting in products of 168 and 171 bp, respectively. All laboratory analyses were performed blinded to the clinical data. Positive and negative quality control samples were included in the genotyping assays.

Statistical methods

Validation analyses

Analysis of the effects of the alleles for the confirmatory SNPs rs4646, rs10459592 and rs4775936 and the intron 4 (TTTA)n repeat was undertaken by calculating the per allele or feature hazard ratio using a Cox regression model, with a two-sided 5 % significance level. For the (TTTA)n analysis, patients were divided into pre-specified groups based on the presence or absence of a longer allele in CYP19A1, using the 7-repeat as the cut-off as previously described by other groups [15, 26–28].
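A per allele hazard ratio implies that each genotype enters the Cox model as a count of 0, 1 or 2 copies of the allele of interest. The sketch below shows one way such covariates could be coded; the function names and the genotype representations (a two-letter string for SNPs, a pair of repeat counts for the (TTTA)n polymorphism) are assumptions for illustration, not the coding used in the study.

```python
def minor_allele_count(genotype, minor_allele):
    """Additive coding: number of copies (0, 1 or 2) of the minor allele.

    genotype is a two-character string such as "CT" (hypothetical format).
    """
    return sum(1 for allele in genotype if allele == minor_allele)


def long_allele_count(repeat_pair, cutoff=7):
    """Number of (TTTA)n alleles with more than `cutoff` repeats.

    repeat_pair holds the repeat counts of the two alleles, e.g. (7, 11).
    """
    return sum(1 for repeats in repeat_pair if repeats > cutoff)
```

With this coding, the exponentiated Cox coefficient is the hazard ratio associated with each additional copy of the allele.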

Exploratory analyses

For the analysis of exploratory SNPs, patients were randomly divided using a 2:1 allocation rule into an exploratory set and a test set. The STATA programme ‘smileplot’, with the Simes method [29], was used to identify per allele hazard ratios in the exploratory set that satisfied a false discovery rate of 5 %. SNPs identified by this method were then evaluated in the test set. An exploratory analysis was then undertaken in all patients to provide estimates of the effect sizes for each SNP that investigators may attempt to validate in other datasets.
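The false discovery rate step can be sketched as a step-up procedure on the ordered P values. This is a minimal illustration assuming that the Simes method corresponds to the step-up rule commonly attributed to Benjamini and Hochberg; it is not the smileplot implementation, and the function name is hypothetical.

```python
def simes_step_up(p_values, fdr=0.05):
    """Return the indices of hypotheses rejected at the given FDR.

    The P values are ranked in ascending order; the largest rank k whose
    P value satisfies p_(k) <= fdr * k / m is found, and every hypothesis
    ranked at or below k is rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= fdr * rank / m:
            k = rank  # keep the largest rank that passes its threshold
    return sorted(order[:k])
```

Because the rule is step-up, a P value that misses its own threshold can still be rejected when a larger P value further down the ranking passes.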

Time to treatment failure analyses

Cox regression was used to analyze time to failure of AI treatment. Hazards were assumed to be proportional over time; this assumption was tested for rs10459592 and rs4775936 by considering the interactions of these factors with log(time) using the stcox options tvc and texp in STATA 11.1, and no evidence was found to reject it. Multivariable modeling was undertaken using Cox regression to examine the association between genotypes and time to treatment failure (TTF). Adjusting factors included the number of disease sites, disease-free interval from diagnosis to first recurrence, grade at diagnosis and first recurrence type. The basis of this approach is that associations between these factors and genotypes, which may occur due to chance, may mask or overinflate the predictive significance of the genotypes. The latter may be revealed by removing the effect of the potentially confounding prognostic factors.

For quality control, consistency with Hardy–Weinberg equilibrium was assessed for each polymorphism using the Pearson Chi square test.
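The Hardy–Weinberg check can be written out for a single biallelic SNP as follows. This is a minimal sketch using only the Python standard library; the function name and the genotype-count arguments are hypothetical, and the one-degree-of-freedom chi-square tail is obtained from the complementary error function.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square test of Hardy-Weinberg equilibrium (1 df).

    n_aa, n_ab, n_bb are the observed counts of the three genotypes.
    Returns (chi-square statistic, P value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 df, P(X > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

A small P value indicates departure from equilibrium, which in a genotyping study is usually taken as a prompt to inspect the assay for that SNP.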

Results

Patients and treatment characteristics

A total of 308 women were included with a median age at AI treatment of 63 years (38–85). Patients and tumor characteristics are summarized in Table 1. Four percent (11/308) had metastatic disease at presentation while the remaining 96 % (297/308) relapsed from a previous primary tumor. For these latter cases, median time to first relapse was 6 years (range 1–29 years). The median number of metastatic locations for all patients was 1 (range 1–4). Eleven percent (32/297) of the patients had received neoadjuvant chemotherapy and in 48 % (143/297) chemotherapy was administered in the adjuvant setting. Sixty-nine percent (205/297) had been treated with adjuvant radiotherapy. The majority of patients with early breast cancer had received adjuvant endocrine therapy (82 %; 244/297), which was tamoxifen only (87 %), tamoxifen and AIs (in sequence) (9 %) or an AI only (4 %). In two patients, information related to adjuvant treatment was not available.

Table 1 Patients’ characteristics

Treatment administered in the metastatic stage is summarized in Table 2. A total of 193 patients (63 %) received an AI as first-line therapy (22 % of those had also received palliative RT in areas not used for the evaluation of disease response to AIs). The remaining 115 (37 %) received some type of treatment before administration of the AI (18 % chemotherapy only, 12 % endocrine only and 7 % both endocrine and chemotherapy). Letrozole was the most frequently administered AI (54 %; 166/308) followed by anastrozole (35 %; 108/308) and exemestane (11 %; 34/308).

Table 2 Treatments in the metastatic setting

SNP analyses

The first set of analyses consisted of validation analyses for polymorphisms for which there was prior evidence of an association with letrozole efficacy [20, 22]. The minor variant (T) allele of rs4775936 was significantly associated with a prolonged TTF [HR = 0.79 per T allele (95 % CI 0.66–0.95); P = 0.012] (Table 3; Fig. 1). The association of the T allele of rs4775936 with a prolonged TTF was also observed when the analysis was limited to letrozole-treated patients (n = 166); however, it did not reach statistical significance [HR = 0.80 per T allele (95 % CI 0.63–1.02); P = 0.068] (Table 3). The long allele (>7 TTTA repeats) in intron 4 of the CYP19A1 gene was also associated with a prolonged TTF [HR = 0.84 per long allele (95 % CI 0.7–0.99); P = 0.04] (Fig. 2). The minor (T) allele of rs10459592 showed a trend toward a reduced TTF (P = 0.052).
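The relationship between a reported hazard ratio, its confidence interval and the P value can be checked approximately. The sketch below assumes a Wald-type interval that is symmetric on the log scale, which is an assumption about how the intervals were derived; the function name is hypothetical and the result is only a rough consistency check, not a reanalysis.

```python
import math


def wald_p_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate z statistic and two-sided P value from an HR and its 95 % CI.

    The standard error of log(HR) is recovered from the width of the
    interval on the log scale, assuming it was built as log(HR) +/- z_crit * SE.
    """
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# rs4775936 in all patients: HR 0.79 (95 % CI 0.66-0.95)
z, p_value = wald_p_from_hr_ci(0.79, 0.66, 0.95)
```

For rs4775936 this gives a P value of roughly 0.011, in line with the reported P = 0.012 once rounding of the interval limits is taken into account.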

Table 3 Association of CYP19A1 rs10459592, rs4775936 and rs4646 with TTF

Fig. 1 Kaplan–Meier estimates of TTF for the presence of rs4775936 variant allele

Fig. 2 Kaplan–Meier TTF estimates for the presence of >7 TTTA repeats in intron 4 of CYP19A1

Multivariable models examining the association between genotypes and TTF were adjusted for known prognostic factors, including the number of disease sites, disease-free interval from diagnosis to first recurrence, grade at diagnosis and first recurrence type (Table 4). As expected, a greater number of sites of disease (>1) was strongly associated with reduced TTF (P < 0.001); furthermore, metastatic disease at recurrence (P = 0.01) and a shorter disease-free interval (P = 0.01) were also associated with shorter TTF. In these models, neither the rs4775936 (P = 0.62) nor the (TTTA)n (P = 0.69) variants maintained independent predictive value (Table 4).

Table 4 Multivariable Cox regression models for TTF

Of the 71 SNPs examined in this study as an exploratory set, 17 (Supplementary Table 4) were either non-polymorphic or present at a minor allele frequency (MAF) of less than 5 %, leaving a total of 54 SNPs for analysis. However, we included rs700519 (MAF = 0.04) and rs28757184 (MAF = 0.02) in the exploratory analysis based on previous evidence that these two non-synonymous SNPs affect aromatase enzyme activity [8, 30]. Of the 56 SNPs considered in the exploratory analysis, only the rs11636639 and rs8039089 CYP19A1 variants were associated with TTF in the training data set of 194 randomly assigned cases (P = 0.001 and P = 0.0002, respectively) after correction for multiple testing (Supplementary Table 5). However, these variants were not significantly associated with TTF in the test set of 102 cases (P = 0.69 and P = 0.97 for the rs11636639 and rs8039089 variants, respectively). The comparability of the training and test sets in terms of disease-free interval, number of metastatic sites, grade and letrozole use was examined; no notable differences were found.

Additional analyses of the whole dataset were conducted to identify putative associations that others may wish to test. The results in relation to TTF are shown in Supplementary Table 5; rs10046 (HR 1.27; 95 % CI 1.07–1.51), rs1800796 (HR 1.46; 95 % CI 1.10–1.95) and rs8039089 (HR 0.79; 95 % CI 0.67–0.94) are examples of SNPs that may be worthy of further consideration.

Discussion

Since the implementation of third-generation AIs as standard treatment for postmenopausal patients in all stages of hormone-receptor expressing breast cancer, significant efforts have been made to elucidate the mechanisms that could affect their efficacy and other parameters such as breast cancer risk and side effects. The identification of different CYP19A1 polymorphisms has led to several studies to determine any relationship with the efficacy and side effect profiles of AIs [10, 20–22, 24, 31]. These small studies, in different breast cancer treatment settings, with different endpoints and considering different polymorphisms, have shown conflicting results. None of the studies of clinical efficacy approached the size of this report, with most numbering fewer than 100 patients, giving them limited power to detect modest effect sizes. Colomer et al. [20] evaluated the efficacy of treatment with letrozole in 65 patients with advanced HR-positive breast cancer with respect to two polymorphisms located at the 3′-UTR (rs10046 and rs4646) and one in intron 2 (rs727479) of CYP19A1. The authors reported that patients either heterozygous or homozygous for the rs4646 variant allele had a three times greater time to progression (TTP) compared to those with the wild-type genotype (525 vs. 196 days; P = 0.02). These observations differed from those of Garcia-Casado et al. [21], who found the converse relationship, with the minor allele of rs4646 being associated with poor response (P = 0.03) and with worse, but non-significantly different, progression-free survival in 95 postmenopausal women receiving neoadjuvant letrozole. Interestingly, our study detected no significant association between rs4646 and treatment outcome in mBC.

Park et al. [31] genotyped 46 CYP19A1 SNPs in 109 letrozole-treated patients with mBC and identified three SNPs, rs700518, rs10459592, and rs4775936, associated with clinical benefit in an over-dominant model. We observed a significant association of the rs4775936 T allele with TTF in univariate analysis and this SNP has been associated with lumbar spine bone mineral density in a cohort of Spanish postmenopausal women [32]. However, in contrast to the previous study, in our cohort rs4775936 did not retain independent significance in multivariate analysis [22].

The intron 4 tetranucleotide repeat polymorphism (TTTA)n has previously been investigated in studies of cancer risk [10, 13, 28] and estrogen levels [18, 28, 33, 34] with conflicting results. Initial reports found elevated estradiol concentrations in carriers of the 8-repeat allele [18, 28] and the longer (12 and 13) repeat alleles [33] compared with non-carriers, and reduced estradiol levels in carriers of the 7-repeat allele [28]. However, in a series of 1,090 British women no association of this polymorphism with estradiol levels was detected [34]. In our group of patients, carriage of >7 repeats of the TTTA polymorphism was significantly associated with improved AI treatment efficacy in univariate analysis. However, in multivariate analysis this association, similar to that for rs4775936, was not significant. This emphasizes the critical importance of considering pharmacogenetic associations only in the context of all relevant prognostic data.

Our study has some limitations, mainly in relation to its retrospective nature, the lack of adherence data and the heterogeneity of the population, including the use of different types of AIs. Adherence data are an important covariate in fully understanding the relationship between genetic variation and drug response [35]. Adherence may be reduced in 10–20 % of women taking AIs due to the musculoskeletal side effects [36]. A recent genome-wide association study in an adjuvant setting has identified three SNPs near the T-cell leukemia 1A (TCL1A) gene. Functional analysis established an estrogen-responsive element in TCL1A and a link with interleukin 17 connecting musculoskeletal effects and the polymorphism [37]. We did not have access to adverse event data in this cohort, and defining musculoskeletal side effects is more challenging in the metastatic than in the adjuvant setting. However, we found no relationship between the TCL1A variant (rs7158782) and AI treatment response. Furthermore, our study was not powered to detect modest effect sizes, its candidate approach was not designed to detect unexpected associations, and it does not provide information about the response to AIs in the adjuvant setting.

Our study does not support the use of genetic testing of variants in CYP19A1 to direct the use of AIs in patients with mBC. It is likely that as yet undetermined germline variants are relevant to AI metabolism and response, and approaches including genome-wide association studies (GWAS) should be undertaken to identify these relationships.