Introduction

Breast cancer is a commonly diagnosed malignancy and a leading cause of death in women worldwide [1,2,3,4]. Treatment strategies for breast cancer include surgery, radiation therapy, chemotherapy, hormonal therapy and targeted therapy, which are given to shrink tumors and to kill cells that have spread to other organs [5]. The chemotherapeutic drugs given to breast cancer patients include cyclophosphamide, docetaxel, doxorubicin (Adriamycin), epirubicin, 5-fluorouracil and paclitaxel, among others [6]. Adjuvant therapy with doxorubicin and cyclophosphamide (AC) has proved to be an effective treatment for early-stage breast cancer. AC therapy was introduced as a simpler alternative to the adjuvant regimen of cyclophosphamide, methotrexate and 5-fluorouracil (CMF). A meta-analysis carried out by the Early Breast Cancer Trialists' Collaborative Group established that anthracycline-containing combination therapy conferred a significant and relevant advantage in survival and recurrence over CMF [7].

Significant variability in response to chemotherapy has been reported in patients with various cancers [8]. Variation in the genes involved in the pharmacodynamic and pharmacokinetic [9] pathways of these drugs (pharmacogenetics) plays a considerable role in predicting drug response and survival [10]. Cyclophosphamide is a widely used antitumor prodrug that belongs to the class of alkylating agents [11]. The main function of alkylating agents is to alkylate the nitrogen at position 7 of the guanine ring [12]. The prodrug undergoes hepatic metabolism by CYP450 enzymes to both active and inactive compounds. CYP3A4/3A5 mediates the N-dechloroethylation of cyclophosphamide, giving rise to 2-dechloroethylcyclophosphamide and neurotoxic chloroacetaldehyde. The former is believed to have no cytotoxic effect. Oxidation at the C4 position of cyclophosphamide generates 4-hydroxycyclophosphamide; this step is mediated by various cytochrome P450 isoforms, including CYP2A6, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP3A4 and CYP3A5 [13,14,15,16,17]. Previous studies have shown that polymorphisms in genes encoding different CYP450s contribute to inter-individual variation in the cyclophosphamide response (Table 1). The secondary conversion involves the production of the main active metabolite aldophosphamide. Aldophosphamide is converted to phosphoramide mustard and acrolein by fragmentation (chemical decomposition). Acrolein is responsible for urotoxicity and is further detoxified [30]. These toxic metabolites are converted to acrylic acid and chloroaziridine by aldehyde dehydrogenases (ALDH1A1 and ALDH3A1) [31, 32], and detoxification is carried out by conjugation of the toxic metabolites with glutathione by various glutathione S-transferases (GSTs; GSTM1, GSTA1, GSTT1 and GSTP1) (Fig. S1) [33, 34].
Genetic variation in ALDHs and GSTs has also been reported to influence inter-individual response to cyclophosphamide therapy. The results of these studies are summarized in Table 1. The genes involved in the pharmacokinetic and pharmacodynamic pathways of cyclophosphamide are shown in Fig. S1 [31,32,33,34].

Table 1 Effect of genetic variants on drug outcome and adverse drug reactions in patients on chemotherapeutic drugs

Doxorubicin belongs to the anthracycline class [35]. The major role of anthracyclines is to intercalate into DNA [36]. A major portion of the drug is eliminated from the body unmetabolized, via the ATP-binding cassette (ABC) transporters (ABCB1, ABCC1, ABCG2) [37,38,39,40,41,42] and Ral-binding protein 1 (RALBP1). The remaining drug is metabolized by a variety of enzymes, including aldo/keto reductase family 1, member A1 (AKR1A1), carbonyl reductases (CBR1 and CBR3) [38, 43, 44], NAD(P)H dehydrogenase (NQO1) and nitric oxide synthases (NOS1, NOS2, NOS3) [25] (Fig. S2). Polymorphisms in ABC transporters, RALBP1, AKR1A1, CBR3, NQO1 and NOS have been reported to influence inter-individual response to doxorubicin therapy (Table 1).

Punjab is a leading food-producing state in India. The Malwa region, south of the Sutlej river, captured national attention 2 years ago due to steeply rising cancer rates. A hospital-based report on cancer patients revealed that the highest number of cancer patients came from the Bathinda region of Punjab [45]. According to the Indian Department of Health and Family Welfare (DHFW), cancer prevalence in the Malwa region in 2013 was 1089/million/year [46]. This is much higher than in the two other regions of Punjab, Majha (647/million/year) and Doaba (881/million/year). The national average cancer prevalence in India has been reported to be 800/million/year [47]. In Punjab, breast cancer is the second most common cancer after lung cancer [48]. Therefore, the present study was carried out to evaluate variation in all the genes involved in the pharmacokinetics and pharmacodynamics of doxorubicin and cyclophosphamide and to correlate it with the overall clinical outcome and adverse drug reactions (if any) in breast cancer patients from the Malwa region of Punjab.

Materials and methods

Two hundred and fifty breast cancer patients evaluated at Guru Gobind Singh Medical College and Hospital, Faridkot, Punjab, and Max Super Speciality Hospital, Bathinda, from August 2015 to March 2017 were included in the study with their written informed consent. The patients belonged to different districts of the Malwa region of Punjab. The study was approved by the ethics committees of the hospital and of the Central University of Punjab. All methods were performed in accordance with the relevant guidelines and regulations of the institutional ethics committee. All the patients were examined by a qualified oncologist, and the disease was diagnosed by fine-needle aspiration cytology, mammography and histopathology. Patients with other cancers or with major renal, cardiac, hepatic, skeletal or neurological disorders were excluded from the study. Information on demographic features and risk factors was collected using a structured questionnaire.

DNA isolation and genotyping

Blood samples (5 ml) were collected in EDTA Vacutainers with the written informed consent of the patient. Genomic DNA was extracted from blood samples using the standard phenol-chloroform method.

DNA concentration and purity were determined using a NanoDrop ND-1000 UV-Vis spectrophotometer (Thermo Scientific). The integrity of DNA samples was validated by electrophoresis using 0.8% agarose-1X TBE gels stained with ethidium bromide. Genotyping was performed on an Illumina Infinium HD assay platform using a Global Screening Array (GSA) microchip (Illumina Inc.) with 200 ng of genomic DNA according to the manufacturer’s instructions. The GSA microchip contains more than 700,000 markers on a genome-wide backbone and includes clinically relevant content covering all PharmGKB markers. Subsequent sample processing and array hybridization were performed according to the manufacturer’s instructions (Illumina). Genome Studio (Illumina, Inc.) was used for data preprocessing and analysis. Genotypes were called within Genome Studio with the GenCall algorithm of Genotyping Module v1.0. The final sample call rate was 99.99%. The data were subsequently exported to R/Bioconductor to calculate χ2 and odds ratios. Annotation was performed using the ClinVar, 1000 Genomes, ExAC, COSMIC and dbSNP databases. A p value ≤ 5 × 10⁻⁸ was considered statistically significant. The results of GSA analysis were validated by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) for CYP2C19*2 and by PCR for ALDH1A1*2 [29]. To rule out the influence of confounding risk factors, a step-wise multivariate regression analysis was carried out to evaluate the association of genotype with disease and overall clinical outcome. The confounding risk factors included age, obesity, histological tumor type (ductal, lobular or other), tumor size, tumor grade (1, 2 or 3), estrogen receptor (ER) and progesterone receptor (PR) status (positive or negative) and human epidermal growth factor receptor 2 (HER-2) status.
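The cutoff of 5 × 10⁻⁸ is the conventional genome-wide significance level, which equals a Bonferroni correction of α = 0.05 for one million independent tests. The text does not state how the cutoff was chosen, so the Python sketch below is only illustrative: it compares the conventional value with the cutoff that a strict Bonferroni correction over roughly 700,000 markers would give, and converts the threshold to the –log10 scale used on a Manhattan plot. The function names and the choice of α = 0.05 are assumptions and are not part of the authors' R/Bioconductor workflow.

```python
import math

ALPHA = 0.05          # family-wise error rate (assumed conventional value)
GENOME_WIDE = 5e-8    # significance threshold stated in the text


def bonferroni_threshold(n_tests, alpha=ALPHA):
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def neg_log10(p):
    """Transform a p value to the -log10 scale used on a Manhattan plot."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log10(p)


# One million independent tests is the usual justification for 5e-8.
conventional = bonferroni_threshold(1_000_000)
# Approximate marker count of the array (the text says "more than 700,000").
array_specific = bonferroni_threshold(700_000)

print(f"0.05 / 1,000,000 tests  = {conventional:.1e}")
print(f"0.05 / 700,000 markers  = {array_specific:.2e}")
print(f"-log10(5e-8)            = {neg_log10(GENOME_WIDE):.2f}")
```

Under these assumptions the stated threshold is slightly stricter than a Bonferroni correction over the array's marker count, and any SNP above about 7.3 on the –log10 scale passes it.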

Follow-up

All patients had been treated with a combination of doxorubicin and cyclophosphamide. This AC chemotherapy comprises 60 mg/m² doxorubicin and 600 mg/m² cyclophosphamide, administered intravenously on day 1 of each 21-day cycle. The therapy is repeated for a total of four or six cycles. Patients with poor clinical outcome, including recurrence of the disease, metastasis and death, were defined as non-responders, whereas patients with good clinical outcome were defined as responders. Follow-up was carried out by telephone or with the help of the clinicians during the patients' follow-up visits to the hospital at 3, 6, 15, 18, 21, 24 and 27 months.

Statistical analysis

Association of genotype with responder status (univariate analysis) was estimated using the odds ratio with 95% confidence interval (CI) and χ2 analysis in OpenEpi software (version 2.3.1; Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA, USA). Hardy–Weinberg equilibrium was tested for allele and genotypic frequencies. The relationship between gene variants and outcome was also evaluated (multivariate analysis) by multiple logistic regression (MLR method) using open-source R/Bioconductor software. The variables were coded as the following dummy variables: genotype (0 = normal homozygous, 1 = heterozygous or mutant homozygous); age (0 = <50, 1 = >50); menopausal status (0 = premenopausal, 1 = postmenopausal); stage (0 = stage I, 1 = stage II, 2 = stage III, 3 = stage IV); BMI (0 = normal BMI, 1 = obese); outcome (0 = good outcome, 1 = bad outcome); family history of cancer (0 = no family history, 1 = family history of cancer [breast or any other cancer]).
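For readers who want to see how a crude odds ratio, its 95% CI and the χ2 statistic follow from a 2 × 2 table, the following Python sketch may help. It assumes the Woolf (log-based) interval and the uncorrected Pearson χ2 with 1 degree of freedom; the text does not say which of the interval methods offered by OpenEpi was reported. The cell counts are hypothetical and are not taken from Table 3.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf (log-based) confidence interval.

    Assumed 2x2 layout:
                        non-responders   responders
        variant carriers       a             b
        non-carriers           c             d
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("the Woolf interval needs all four cells > 0")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for a 2x2 table."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denominator


def p_value_1df(chi2):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts for illustration only.
a, b, c, d = 30, 40, 40, 140
odds_ratio, lower, upper = odds_ratio_ci(a, b, c, d)
chi2 = chi_square_2x2(a, b, c, d)
print(f"cOR = {odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
print(f"chi-square = {chi2:.2f}, p = {p_value_1df(chi2):.4f}")
```

With a Woolf interval the point estimate is always the geometric mean of the two confidence limits, which gives a quick consistency check on any reported odds ratio and interval computed this way.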

Results

Demographic profile

The study involved 250 female breast cancer patients recruited from the Malwa region of Punjab. The mean age of patients at the time of diagnosis was 53.62 ± 12.31 years. The youngest patient with breast cancer was an unmarried woman aged 23 years. More than half of the patients (55%) resided in urban areas. No patient was pregnant at the time of diagnosis. Obesity was observed in 44.8% of patients, and a family history of cancer (including breast and other forms of cancer) in 12.3%. Punjab is a leading grain producer in India with the highest use of pesticides, and 46.4% of patients had been exposed to pesticides (Table 2).

Table 2 Demographic features of patients

Modified radical mastectomy (MRM) and lumpectomy were performed in 89% and 11% of patients, respectively. All the patients underwent adjuvant chemotherapy (cyclophosphamide and doxorubicin). Trastuzumab was given to patients confirmed as HER-2 positive on histopathological testing. The primary tumor and axilla were removed completely at the time of surgery. Radiation therapy was administered to 17.6% of patients, in whom the cancer was detected at stage III or IV. Most of the patients (64%) were at stage II at the time of diagnosis, while 18.4% were at stage I. Of the patients, 37.6% were positive for ER, PR or both. Patients positive for ER/PR were given hormonal therapy.

Variation in drug-metabolizing, transporter and receptor genes

The patients were screened using the Illumina Infinium HD assay platform and the Global Screening Array chip. Variation in all genes involved in the pharmacokinetics and pharmacodynamics of cyclophosphamide and doxorubicin was determined. Of all the genes involved in the pharmacokinetics (CYPs, ABC transporters, SLC transporters, NOS, CBR, XDH, MTHFR, etc.) and pharmacodynamics (TOP2A) of cyclophosphamide and doxorubicin (Fig. S1 and S2), two gene variants, CYP2C19*2 (G681A; rs4244285; splice variant) and ALDH1A1*2 (a polymorphism comprising a 17-bp deletion from position −416 to −432 relative to the transcriptional site), were found to be significantly associated with the therapeutic outcome (Fig. S3). This finding was validated by PCR-RFLP and PCR methods for the CYP2C19*2 variant (Fig. S4). The genotypic and allelic frequencies for CYP2C19*2 and ALDH1A1*2 were in Hardy–Weinberg equilibrium in responders, non-responders and the overall population.
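A Hardy–Weinberg check of the kind mentioned here can be sketched in Python as a χ2 goodness-of-fit test on the three genotype counts. This is a minimal illustration using the standard 1-degree-of-freedom test; the text does not specify which test the authors applied, and the genotype counts below are hypothetical.

```python
import math


def hwe_chi_square(n_ref_hom, n_het, n_alt_hom):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Returns (chi2, p_value). Degrees of freedom = 3 genotype classes
    - 1 - 1 estimated allele frequency = 1.
    """
    observed = (n_ref_hom, n_het, n_alt_hom)
    n = sum(observed)
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_ref_hom + n_het) / (2 * n)   # reference allele frequency
    q = 1.0 - p                             # variant allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip classes with zero expectation (monomorphic marker).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail, 1 df
    return chi2, p_value


# Hypothetical genotype counts (e.g. GG, AG, AA) for illustration only.
chi2, p_value = hwe_chi_square(150, 90, 10)
print(f"chi-square = {chi2:.3f}, p = {p_value:.3f}")
print("consistent with HWE" if p_value > 0.05 else "deviates from HWE")
```

A p value above 0.05 is conventionally read as no detectable departure from equilibrium. With small expected counts in the rare-homozygote class, an exact test is often preferred over the χ2 approximation.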

There was a significant difference between responders and non-responders for the CYP2C19 (G681A) polymorphism. The genotypic distribution and allele frequencies are given in Table 3. In univariate analysis, a statistically significant difference in genotypic distribution between responders and non-responders was observed for AA vs. GG [chi-square = 10.85, p < 0.001, crude odds ratio (cOR) = 6.27 (95% CI; 1.86–21.09)]. A significant difference was also observed in the frequency of the G and A alleles between responders and non-responders [A vs. G, chi-square = 23.07, p < 0.001, cOR = 2.67 (95% CI; 2.09–5.34)] (Table 3). Death was reported in 26% of patients, whereas recurrence and metastasis were reported in 28% of patients. Poor outcome was observed in 7 and 32 patients harboring bi-allelic and mono-allelic alterations in CYP2C19, respectively.

Table 3 Distribution of CYP2C19*2 (G681A) and ALDH1A1*2 genotypes and allele frequencies in non-responders vs. responders

A multivariate analysis using the MLR method was carried out to confirm these findings. It revealed a significant association of the variant genotypes AG+AA (heterozygotes + mutant homozygotes) with bad outcome [p < 0.001; adjusted odds ratio = 3.21 (95% CI; 1.757–5.928)].
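The AG+AA grouping corresponds to a dominant genetic model, in which carriers of at least one variant allele are coded 1. The Python sketch below shows one way the dummy coding listed under Statistical analysis could be applied to a single patient record before fitting a logistic regression. The `PatientRecord` class, the field names and the `encode` function are hypothetical. The text defines age only as <50 or >50, so coding a patient aged exactly 50 as 1 is an assumption.

```python
from dataclasses import dataclass

STAGE_CODES = {"I": 0, "II": 1, "III": 2, "IV": 3}


@dataclass
class PatientRecord:
    """Hypothetical container for one patient's covariates."""
    genotype: str          # two alleles, e.g. "GG", "AG" or "AA"
    age_years: float
    postmenopausal: bool
    stage: str             # "I", "II", "III" or "IV"
    obese: bool
    family_history: bool
    bad_outcome: bool


def encode(record, ref_allele="G"):
    """Apply the 0/1 (and 0-3 for stage) coding described in the text."""
    if len(record.genotype) != 2:
        raise ValueError("genotype must consist of exactly two alleles")
    # Dominant model: 0 = reference homozygote, 1 = any variant allele.
    is_ref_homozygote = set(record.genotype) == {ref_allele}
    return {
        "genotype": 0 if is_ref_homozygote else 1,
        # Assumption: age exactly 50 is grouped with the older category.
        "age": 1 if record.age_years >= 50 else 0,
        "menopausal_status": int(record.postmenopausal),
        "stage": STAGE_CODES[record.stage],
        "bmi": int(record.obese),
        "outcome": int(record.bad_outcome),
        "family_history": int(record.family_history),
    }


example = PatientRecord("AG", 62, True, "II", True, False, True)
print(encode(example))
```

The same function covers the ALDH1A1 insertion/deletion variant by passing `ref_allele="I"` with genotypes written as "II" or "ID".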

For the ALDH1A1*1(II)/*2(DD) polymorphism (univariate analysis), a significant difference was observed in the distribution of the ID genotype between responders and non-responders [for ID vs. II, chi-square = 18.47, p < 0.001, cOR = 4.743 (95% CI; 2.23–10.07)]. A significant difference was also observed in D allele distribution between responders and non-responders [for D vs. I allele, chi-square = 17.16, p < 0.001, cOR = 4.119 (95% CI; 2.017–8.41)] (Table 3). Poor outcome was observed in 20 patients harboring the ID genotype for ALDH1A1 (rs6151031).

The adjusted odds ratio after controlling for all the confounding factors using MLR was 5.088 (p < 0.001; 95% CI; 2.284–11.71; multivariate analysis). Of the 70 non-responders, the worst prognosis (short-term survival after disease diagnosis) was observed in 13 patients carrying bi/mono-allelic variations in both ALDH1A1*2 and CYP2C19*2. A Manhattan plot of –log10 p values against chromosome number was constructed for all the reported single-nucleotide polymorphisms (SNPs), among which CYP2C19*2 (rs4244285) and ALDH1A1*2 (rs6151031) are located on a risk locus and are strongly associated with the worst prognosis in breast cancer patients (Fig. 1). None of the variants in genes involved in the pharmacokinetics and pharmacodynamics of doxorubicin had a significant impact on the outcome.

Fig. 1 Manhattan plot representing CYP2C19*2 (rs4244285) and ALDH1A1*2 (rs6151031) SNPs plotted between chromosome number (location) and –log10 p values

Discussion

The combination of cyclophosphamide and doxorubicin (AC) has been developed as a potentially curative treatment strategy in several solid tumors, including breast cancer. The purpose of the present study was to assess the association of pharmacogenomic factors involved in this combined therapy with the treatment outcome of breast cancer patients, measured by overall survival, recurrence and death. Previous studies have focused on individual gene variants involved in the metabolism and transport of these drugs. To our knowledge, this is the first study to evaluate all the genes involved in the pharmacokinetics and pharmacodynamics of cyclophosphamide and doxorubicin.

Cyclophosphamide requires the activity of CYP450 enzymes in the liver for its bioactivation. The specific CYPs involved in cyclophosphamide metabolism are polymorphic, and different alleles have been demonstrated to be associated with varying levels of protein expression and/or metabolic activity of the proteins expressed [13, 15, 49]. Selected variants in CYP2B6, 2C19, 3A4 and 3A5 have been studied with regard to their impact on cyclophosphamide clinical response [50,51,52]. In the present study, by contrast, all the genes involved in the metabolism of cyclophosphamide were examined for variation and for their impact on disease outcome in patients on AC therapy. A variant in CYP2C19 (2C19*2; G681A, rs4244285) was found to be associated with clinical outcome. In vitro studies on human liver cells have revealed that CYP2C19 accounts for approximately 12% of cyclophosphamide 4-hydroxylation [53]. The variant CYP2C19*2 has been demonstrated to create an alternate splice site that results in a protein lacking the ability to activate cyclophosphamide [7]. The CYP2C19*2 allele has been associated with a lower elimination rate constant for cyclophosphamide in comparison with the wild-type allele [26, 54].

However, some other studies could not demonstrate the same effect of the CYP2C19*2 allele on cyclophosphamide pharmacokinetics [55, 56]. A previous study from India also found the CYP2C19*2 allele, in interaction with other CYP enzyme variants, to be significantly associated with the treatment response [52]. In the current study, however, we did not find significant variation in other CYP enzymes affecting the disease outcome. This discrepancy might be due to ethnic differences.

The GSA method used in the present study revealed a 17-bp I/D polymorphism in the ALDH1A1 gene to be significantly associated with bad outcome in breast cancer patients [57]. In a study of cancer patients, especially those with breast cancer, treated with a high dose of cyclophosphamide, patients heterozygous for ALDH1A1*2 (D allele) were found to have an increased risk of liver toxicity [37, 39, 58, 59]. Several isoforms of ALDH have been correlated with metastatic behavior in vitro. In the current study, we found a noteworthy association of the ALDH1A1*2 allele with bad clinical outcome, including recurrence, metastasis and death. However, we did not observe the *2/*2 homozygous genotype in patients or in controls. A previous study carried out by Spence et al. also reported the absence of the *2/*2 homozygous genotype in Caucasian, Asian, Jewish and African populations. In vitro expression analysis showed no significant difference in expression between ALDH1A1*2 and the wild type. However, this may not fully reflect the regulatory mechanisms underlying gene expression in vivo [60]. Ekhart et al. have proposed that the ALDH1A1*2 variant might result in decreased activity of the ALDH enzyme, leading to decreased detoxification of 4-hydroxycyclophosphamide and increased liver toxicity [61]. ALDH1A1 has been used as a marker for breast cancer stem cells with high tumor-initiating and self-renewal capabilities [62]. A meta-analysis carried out by Liu et al. identified ALDH1A1 as a biomarker predicting tumor progression and poor survival in breast cancer patients. In the present study, most of the patients bearing the ALDH1A1*2 genotype who were on cyclophosphamide therapy showed a very poor survival rate [62].

The main limitation of the study was that we could not assess drug levels in patients bearing the normal and altered genotypes. Further, GSA assesses around 700,000 known gene variants; therefore, novel variants (if any) could not be detected by this technique. In conclusion, the CYP2C19 (G681A) variant and ALDH1A1*2 emerged as two important biomarkers associated with the worst outcome in breast cancer patients from the Malwa region of Punjab. Therefore, patients should be screened for these two variants before administration of AC therapy.