Introduction

Non-small cell lung cancer (NSCLC), which in the majority of cases presents at an advanced stage (III/IV), accounts for approximately 80 % of lung cancer, which has become the leading cause of cancer death worldwide [1]. Platinum-based chemotherapeutic agents, such as cisplatin and carboplatin in various combinations, are widely used first-line regimens for patients with advanced NSCLC because of their favorable efficacy, with a clinical benefit rate of more than 75 % [2, 3]. However, platinum-based chemotherapy brings not only clinical benefit but also side effects such as nausea/vomiting and hematological toxicity, and treatment outcomes vary greatly among individuals. Accumulating evidence shows that single nucleotide polymorphism (SNP) analysis may help to elucidate the role of genetic variability in the therapeutic effectiveness and toxicity experienced by patients receiving platinum-based combination regimens, and carriers of risk SNPs or haplotypes may respond poorly to therapy [2, 4, 5]. Therefore, molecular markers may help to predict treatment outcomes in patients with advanced NSCLC, and improved treatment options should be rationally designed on the basis of this knowledge to obtain the best efficacy while minimizing side effects for each individual.

Folate-associated one-carbon metabolism (FOCM) provides the one-carbon groups needed for numerous intracellular processes, including DNA methylation, cell proliferation, and the synthesis of nucleic acids and amino acids [6]. Several factors, including insufficient folate and genetic variation, can disrupt normal FOCM function [7, 8]. Folate functions biologically in two main forms, 5,10-methylenetetrahydrofolate (5,10-methyleneTHF) and 5-methyltetrahydrofolate (5-methylTHF). 5,10-MethyleneTHF is the methyl donor for the thymidylate synthase-mediated conversion of deoxyuridylate (dUMP) to thymidylate (dTMP), a precursor for DNA synthesis. Therefore, a possible consequence of 5,10-methyleneTHF deficiency is chromosome breakage and enhanced neoplastic transformation due to misincorporation of uracil instead of thymine into DNA [9–11]. 5-MethylTHF donates a methyl group for the remethylation of homocysteine to methionine, which ultimately supports DNA methylation. Chromosomal instability caused by aberrant global genomic methylation is an important epigenetic mechanism of carcinogenesis: hypermethylation of promoter CpG islands can cause gene silencing and thereby disturb the regulated expression of both oncogenes and repressor genes, whereas hypomethylation or methyl deficiency leads to activation of methylation-silenced proto-oncogenes, which is an etiological factor in cancer [12, 13]. 5,10-Methylenetetrahydrofolate reductase (MTHFR), which is involved in maintaining folate and homocysteine homeostasis, catalyzes the irreversible conversion of 5,10-methyleneTHF to 5-methylTHF.

Genetic variation may affect the enzyme activity of MTHFR and consequently may contribute to cancer development. Two commonly reported MTHFR SNPs, rs1801133 (C677T) and rs1801131 (A1298C), have been associated with enzyme activity and cancer risk [14, 15], and recent clinical studies also suggest that these variants influence response as well as toxicity in NSCLC patients treated with platinum-related induction therapy [16–22]. Results for treatment outcomes in the available studies are contradictory, ranging from a positive effect to no effect or even a negative effect, driven partly by the insufficient size of any single study (Table S1).

In this retrospective study, we selected 10 common genetic variants in the MTHFR gene using a strategy that integrates tagging SNPs and potentially functional SNPs, and genotyped them in a large sample (1,004 individuals) to systematically examine MTHFR polymorphisms and their associations with clinical efficacy and severe toxicities in stage III/IV NSCLC patients receiving first-line, platinum-based chemotherapy. The objective of this work was to explore the potential impact of MTHFR variants on the treatment outcomes of advanced NSCLC.

Materials and methods

Study design and patient recruitment

The study encompassed 1,004 eligible patients diagnosed with stage III–IV NSCLC at six hospitals in Eastern China between March 2003 and February 2010. No statistically significant difference was observed in the distribution of demographic features among the patients from the six hospitals (P for gender = 0.698, P for age = 0.321). Eligible patients met the following criteria: (a) informed consent available and adherence to the treatment schedule; (b) at least 18 years old; (c) presence of a measurable and evaluable lesion; (d) an Eastern Cooperative Oncology Group performance status (ECOG PS) between 0 and 2; (e) an absolute neutrophil count (ANC) ≥1.5 × 10⁹ cells/L, platelets ≥100 × 10⁹ cells/L, serum creatinine ≤1.5 × the upper limit of normal, aspartate aminotransferase (AST) and alanine aminotransferase (ALT) ≤1.5 × the upper limit of normal, and creatinine clearance ≥60 mL/min; (f) no prior history of other malignancy, or a tumor cured more than 5 years earlier; (g) no previous chemotherapy, radiotherapy, surgery, or concurrent chemoradiotherapy for this cancer; (h) no active congestive heart failure or cardiac arrhythmia; and (i) no other critical medical or psychological factors that might influence the treatment schedule.

Demographic characteristics were abstracted by trained investigators to obtain information on gender, age at diagnosis, smoking status, tumor histology, and clinical stage. A complete medical history, health examination, and laboratory tests were obtained before any treatment course was started. Survival data were collected from several sources, including follow-up calls, the Social Security Death Index, and inpatient and outpatient medical records. The study protocol was approved by the Ethical Review Committee of Fudan University and the relevant hospitals. The investigators were blinded to the patients’ genotype status.

Severe toxicities, clinical benefit, progression-free survival (PFS), and overall survival (OS) were monitored to evaluate the outcomes of platinum-based treatment in NSCLC patients. Toxicities were assessed twice weekly and graded according to the National Cancer Institute Common Toxicity Criteria (NCI-CTC) version 3.0. Severe toxicities in our study included grade III or IV gastrointestinal toxicity (nausea and vomiting) and hematologic toxicity (leukocytopenia, neutropenia, anemia, and thrombocytopenia). No grade V toxicity (death) was observed in our study. Patients’ responses to chemotherapy were classified into four categories: complete response (CR), partial response (PR), stable disease (SD), and progressive disease (PD) [4]. The clinical benefit rate was defined as the percentage of patients with CR, PR, or SD. PFS was defined as the time interval from the date of first chemotherapy to the date of disease progression (including death) or the last progression-free follow-up, and OS as the interval from the date of first chemotherapy to the date of death or last follow-up.
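As an illustration of the outcome definitions above, the following minimal Python sketch computes the clinical benefit rate and a PFS interval together with its censoring indicator. The function names, the 30.44-day average month used to convert days to months, and all example inputs are assumptions made for illustration; they are not taken from the study's analysis.

```python
from datetime import date
from typing import Optional, Sequence, Tuple

# Response categories that count as clinical benefit (CR, PR, or SD).
BENEFIT_CATEGORIES = {"CR", "PR", "SD"}

# Assumption: average month length used to convert days to months.
DAYS_PER_MONTH = 30.44


def clinical_benefit_rate(responses: Sequence[str]) -> float:
    """Fraction of evaluable patients whose best response is CR, PR, or SD."""
    if not responses:
        raise ValueError("at least one evaluable patient is required")
    benefit = sum(1 for r in responses if r in BENEFIT_CATEGORIES)
    return benefit / len(responses)


def pfs_months(first_chemo: date,
               progression_or_death: Optional[date],
               last_follow_up: date) -> Tuple[float, bool]:
    """Return (PFS in months, event observed).

    If no progression or death was recorded, the interval ends at the last
    progression-free follow-up and the observation is censored.
    """
    observed = progression_or_death is not None
    end = progression_or_death if observed else last_follow_up
    return (end - first_chemo).days / DAYS_PER_MONTH, observed
```

The second return value of `pfs_months` is the event indicator that time-to-event methods such as the Kaplan-Meier estimator require.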

Chemotherapy regimens

All eligible patients enrolled in this study were inoperable and received first-line, platinum-based chemotherapy. The chemotherapeutic regimens were as follows: cisplatin 75 mg/m² or carboplatin at an area under the curve of 5, both administered on day 1 every 3 weeks, in combination with vinorelbine 25 mg/m² on days 1 and 8 every 3 weeks, or gemcitabine 1,250 mg/m² on days 1 and 8 every 3 weeks, or paclitaxel 175 mg/m² on day 1 every 3 weeks, or docetaxel 75 mg/m² on day 1 every 3 weeks. A few patients received other platinum-combination treatments (n = 49). All chemotherapeutic drugs were administered intravenously, and the eligible patients were treated for two to six cycles.

SNP selection and genotyping

Genomic DNA was isolated from whole blood using the QIAamp DNA Maxi Kit (Qiagen GmbH, Hilden, Germany). Polymorphisms were selected by an approach combining tagging SNPs and potentially functional SNPs located in the region from 2 kb upstream of the 5′ untranslated region to 2 kb downstream of the 3′ untranslated region of the MTHFR gene. The tagging SNPs were identified with a correlation coefficient (r²) > 0.80 and a minor allele frequency (MAF) > 0.05 in the Han Chinese in Beijing (CHB) population from the HapMap Project database (http://www.hapmap.org). In total, 10 tagging and potentially functional SNPs were selected to represent genetic variants of MTHFR. Genotyping of the selected SNPs was performed using the iSelect HD BeadChip (Illumina, San Diego, CA) with the following quality control criteria: SNP genotyping call rate > 95 %, P value for Hardy-Weinberg equilibrium (HWE) > 0.05, and GenCall score > 0.2.
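The per-SNP quality control thresholds above can be sketched in Python as follows. This is a minimal illustration that assumes a one-degree-of-freedom chi-square goodness-of-fit test for HWE (the text does not state which HWE test was used, and exact tests are also common); the function names and genotype counts are hypothetical, and the per-genotype GenCall score filter is not modeled.

```python
import math


def allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Frequency of allele A from genotype counts AA, AB, BB."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped samples")
    return (2 * n_aa + n_ab) / (2 * n)


def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Frequency of the less common of the two alleles."""
    p = allele_frequency(n_aa, n_ab, n_bb)
    return min(p, 1 - p)


def hwe_chi_square_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P value of a 1-df chi-square goodness-of-fit test for HWE."""
    n = n_aa + n_ab + n_bb
    p = allele_frequency(n_aa, n_ab, n_bb)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(call_rate: float, n_aa: int, n_ab: int, n_bb: int) -> bool:
    """Apply the call-rate (> 95 %) and HWE (P > 0.05) thresholds from the text."""
    return call_rate > 0.95 and hwe_chi_square_p(n_aa, n_ab, n_bb) > 0.05
```

A SNP whose genotype counts match Hardy-Weinberg proportions exactly yields a chi-square statistic of 0 and a P value of 1, whereas a complete absence of heterozygotes at intermediate allele frequencies yields a P value near 0.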

Statistical analysis

Chi-square (χ²) tests were used to assess whether there were statistically significant differences in the distribution of demographic variables, clinical features, and genotypes. Benjamini-Hochberg false discovery rates (FDR q values) were computed to control for multiple comparisons [23]. Only the significant SNPs were analyzed further by unconditional logistic regression, with adjustment for significant epidemiologic factors (P < 0.05 in the χ² test). Pairwise linkage disequilibrium (LD) relations among the SNPs were examined using D′ and r², and SNPs were considered to be in nearly complete linkage when 0.95 ≤ D′ ≤ 1 and r² > 0.88 [24].
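For readers unfamiliar with the Benjamini-Hochberg procedure, the following minimal Python sketch shows how FDR q values can be derived from a list of P values. The function name and the example P values are illustrative assumptions and do not correspond to the study's data.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg q values in the same order as the input.

    For the P value of rank k (1 = smallest) among m tests, the raw adjusted
    value is p * m / k; q values are then made monotone by taking the running
    minimum from the largest rank downwards, and are capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values
```

Under this procedure, a SNP is declared significant at an FDR of 10 % when its q value is at most 0.10.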

Survival in patient groups was compared using time-to-event methods, including Kaplan-Meier estimation, the log-rank test, and Cox proportional hazards regression models. Clinical variables with log-rank P < 0.05 in univariate analysis were entered into multivariate analysis. A stepwise variable selection approach was used to select statistically significant classification variables. A P value of 0.05 was considered the threshold of statistical significance, and all tests and reported P values were two-tailed.
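The Kaplan-Meier estimator mentioned above can be sketched in a few lines of Python. This is a minimal illustration rather than the software used in the study; the function names and the toy data are assumptions, and ties between events and censored observations at the same time are handled by keeping the censored patients in the risk set at that time.

```python
from typing import List, Optional, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the event (death or progression) was observed for
    patient i and 0 if the observation was censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Collect all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_removed
    return curve


def median_survival(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which estimated survival is 0.5 or lower, if reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

Median PFS and OS correspond to the output of `median_survival` applied to the respective curves; the function returns `None` when the curve never falls to 0.5.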

Results

Patient characteristics

The main patient characteristics and clinical outcomes are summarized in Table 1. Of the 1,004 advanced NSCLC patients enrolled in this study, clinical benefit was assessed in 976 patients, gastrointestinal toxicity in 964 patients, and hematologic toxicity in 979 patients. A few patients were not included because they were lost to follow-up during first-line chemotherapy. Patient age at treatment ranged from 26 to 82 years, with a median of 58 years; age was therefore dichotomized at the median of 58 years. Several patient characteristics differed significantly in distribution with respect to clinical benefit or toxicity (P < 0.05) and were used as adjustment factors in further analysis. More detailed information is available in Table S2.

Table 1 Patient characteristics and clinical outcomes (n = 1,004)

Genotyping data

In total, 10 tagging and potentially functional SNPs were included in our analysis. All SNPs had a genotyping rate >95 % and were in Hardy-Weinberg equilibrium (P > 0.05) (Table S3). Pairwise LD relations among the 10 SNPs were characterized by D′ and correlation coefficients (r²). We found that rs3737967, rs1537516, rs1537514, and rs13306553 were in nearly complete linkage (0.95 ≤ D′ ≤ 1, r² > 0.88), as were rs1801131 and rs4846049 (D′ = 0.949, r² = 0.98) (Table S4).
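The two LD measures reported here can be computed from haplotype and allele frequencies as in the following minimal Python sketch. It assumes that the haplotype frequency of the two alleles of interest is already known (in practice it must be estimated from unphased genotypes, which is not shown); the function name and the example frequencies are hypothetical.

```python
from typing import Tuple


def ld_measures(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float]:
    """Return (D', r^2) for two biallelic loci.

    p_ab is the frequency of the haplotype carrying allele A at the first
    locus and allele B at the second; p_a and p_b are the allele frequencies.
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b
    if d == 0:
        return 0.0, 0.0
    # D is normalized by its maximum attainable magnitude given the allele
    # frequencies; the bound depends on the sign of D.
    if d > 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2
```

D′ can equal 1 while r² remains well below 1 when the allele frequencies at the two loci differ, which is why a threshold on both measures is used to define nearly complete linkage.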

The genotypes of seven polymorphisms (rs3737967, rs1537516, rs1537514, rs4846049, rs1801131, rs1801133, and rs13306553) were significantly associated with clinical benefit, grade III or IV gastrointestinal toxicity, or thrombocytopenia in the χ² test, and all of these polymorphisms remained significant after correction for multiple comparisons at an FDR of 10 % (Table S3). However, few MTHFR polymorphisms showed statistically significant differences in the distribution of leukopenia, neutropenia, or anemia (Table S5). According to the LD analysis, rs1537514 can represent rs3737967, rs1537516, and rs13306553, and rs1801131 can represent rs4846049. Hence, three significant SNPs (rs1537514, rs1801131, and rs1801133) were retained for further analysis of their correlation with the clinical benefit rate and toxicities in NSCLC patients.

Clinical benefit and severe toxicities

As shown in Table 2, after adjustment for statistically significant covariates, heterozygotes of rs1537514 and rs1801131 were significantly associated with better clinical benefit than wild-type homozygotes among advanced NSCLC patients (P = 0.002 and 0.030, respectively). After covariate adjustment, rs1537514 was associated with the risk of severe (grade III or IV) gastrointestinal toxicity in opposite directions for the heterozygote and the mutant homozygote. Heterozygous carriers showed a decreased risk of severe gastrointestinal toxicity (P = 0.027, odds ratio (OR) = 0.40), while the mutant homozygote was associated with an increased risk (P = 0.031, OR = 5.09). For rs1801131, only heterozygous carriers showed a decreased risk of severe gastrointestinal toxicity (P = 0.004, OR = 0.40). This tendency was more pronounced when heterozygotes were compared with homozygotes (P = 0.003). Compared with wild-type homozygotes, the mutant homozygote of rs1537514 was significantly associated with an increased risk of thrombocytopenia (P = 0.009, OR = 9.34), and the heterozygote of rs1801133 was associated with a decreased risk of thrombocytopenia (P = 0.016, OR = 0.40).
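To make the odds ratios above easier to interpret, the following minimal Python sketch computes an unadjusted OR with a Wald-type 95 % confidence interval from a 2 × 2 table. This is a simplification: the ORs in Table 2 were estimated by covariate-adjusted logistic regression, which is not reproduced here, and the function name and counts are hypothetical rather than study data.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Unadjusted odds ratio with a Wald confidence interval.

    a: carriers with toxicity        b: carriers without toxicity
    c: reference with toxicity       d: reference without toxicity
    z: normal quantile (1.96 gives an approximate 95 % interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

An OR below 1 with an upper confidence limit below 1 indicates a reduced risk in carriers relative to the reference genotype; an OR above 1 with a lower limit above 1 indicates an increased risk.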

Table 2 SNPs significantly associated with clinical benefits, severe gastrointestinal toxicity, or thrombocytopenia

In the stratification analysis, we compared the treatment outcomes of heterozygotes with those of homozygotes. As shown in Table 3, heterozygotes of rs1537514 exhibited much better clinical benefit and a decreased risk of severe gastrointestinal toxicity in several subsets (age ≤58 years, female, ECOG PS ≤1, and never smokers), as did heterozygotes of rs1801131, when compared with patients carrying the homozygous genotypes. rs1801133 was not significantly associated with outcomes in any subgroup in our analysis. Because the sample sizes of the subgroups were insufficient for stratification analysis of thrombocytopenia, these data are not shown.

Table 3 Stratification analysis of association between two SNPs of MTHFR and clinical benefit and grade 3 or 4 gastrointestinal toxicity

Survival

Over a follow-up period of 5 years, 972 patients were included in the OS and PFS analyses, while 32 patients were excluded from survival analysis because they underwent surgery during the observation period. Among the 972 genotyped patients, the median PFS was 9.1 months and the median OS was 19.3 months, similar to values in the literature [25]. Censored observations accounted for 16.3 %, and 731 patients died from NSCLC. Several demographic and clinical covariates strongly influenced OS and PFS. The log-rank test showed significant differences in OS by age (P = 0.003, Fig. 1a), gender (P = 0.002, Fig. 1b), and smoking status (P = 0.012, Fig. 1c), and in PFS by performance status (P = 0.001, Fig. 1d) and chemotherapy regimen (P = 3.9 × 10⁻⁶, Fig. 1e). More details about the patient characteristics and survival are available in Table 4.

Fig. 1 OS and PFS curves of significantly associated clinical characteristics and polymorphism. a OS and age. b OS and gender. c OS and smoking status. d PFS and performance status. e PFS and chemotherapy regimens. f PFS and rs1537514. The P values were calculated by the unadjusted log-rank test. OS overall survival, PFS progression-free survival

Table 4 Comparisons of OS and PFS according to clinical characteristics of NSCLC patients

Apart from SNPs in nearly complete linkage with a retained SNP, the remaining tagging and potentially functional SNPs were included in the survival analysis (Table 5). No significant associations between polymorphisms and OS were observed. The log-rank test showed a significant difference in median PFS between heterozygotes and homozygotes of rs1537514 (CG vs CC/GG, P = 0.022, Fig. 1f), and heterozygous carriers had better PFS after adjustment for epidemiological covariates. However, no significant association was found in further stratification analysis (data not shown).

Table 5 Associations between MTHFR polymorphisms and survival of NSCLC patients treated with platinum-based chemotherapy

Stepwise Cox regression model for NSCLC survival

To determine independent predictors of NSCLC prognosis, we performed multivariate stepwise Cox regression analysis of the effects of significant covariates and SNPs (P < 0.05 in the survival analysis) on NSCLC survival. Five variables that were significant in the univariate analysis (age, gender, smoking status, tumor-node-metastasis (TNM) stage, and histologic type) were entered into the stepwise Cox regression model for OS. As a result, female gender and age ≤58 years were independent significant predictors of better OS (P = 0.004 and 0.007, respectively). Similarly, three variables (ECOG PS, chemotherapy regimen, and rs1537514) were entered into the stepwise model for PFS, and the only independently significant predictors of better PFS were ECOG PS ≤1 and platinum-vinorelbine therapy (P = 0.004 and P = 2.6 × 10⁻⁶, respectively).

Discussion

The main finding of this study was that, compared with wild-type homozygotes, NSCLC patients heterozygous for MTHFR polymorphisms (rs1537514, rs1801133, and rs1801131) had better clinical benefit and PFS and a decreased risk of severe toxicities, whereas mutant homozygotes had an increased risk of severe gastrointestinal toxicity and thrombocytopenia, resulting in a net heterozygote advantage. The trend was more evident in the subgroups of female patients, patients aged ≤58 years, patients with ECOG PS ≤1, and never smokers. Better OS was seen in female patients, patients aged ≤58 years, and never smokers, and patients with better performance status or platinum-vinorelbine treatment had longer PFS.

MTHFR is a key enzyme in FOCM, which is linked to DNA synthesis and methylation, amino acid metabolism, and cell proliferation. One study revealed that rs1537514, located within predicted miRNA target sites (miR-596 and miR-518a-5p/527) in the 3′UTR of the MTHFR gene, was significantly associated with red blood cell folate (P < 0.0001) [26]. The mutation might down-regulate gene expression by altering miRNA binding and might thereby affect mRNA stability [26, 27]. Our results showed that heterozygotes exhibited better clinical benefit and PFS and fewer severe toxicities than homozygotes. Although we found no previous reports of heterozygote advantage for MTHFR polymorphisms in NSCLC, heterozygote advantage has been demonstrated in other human diseases such as cystic fibrosis [28], non-Hodgkin lymphoma [29], and breast cancer [30]. Heterozygotes are thought to have a net fitness advantage over both homozygotes because they may show greater flexibility of gene expression, in terms of either response to specific stimuli or cell type specificity [31, 32]; this is consistent with the balancing selection model and potentially maintains genetic diversity in natural populations [31]. The homozygous recessive genotype might lead to a malfunctioning protein in FOCM, whereas the heterozygous genotype might play a protective role. It should be pointed out that, in our data, rs1537514, rs3737967, and rs1537516, which are all located in the 3′UTR, and rs13306553, which is located in intron 4, were in nearly complete linkage and displayed similar associations with treatment response and toxicities, and the latter three have not been explored in any report to date. Therefore, it is possible that rs1537514 is a disease-underlying variant and the other SNPs are proxies for it in the same region.

Two common exonic polymorphisms in the MTHFR gene, C677T (rs1801133) and A1298C (rs1801131), have been investigated as predictors of outcome in NSCLC patients receiving platinum-related treatment. The C677T polymorphism, which causes an Ala222-to-Val substitution, might enhance the propensity of the dimeric enzyme to dissociate into monomers and release its flavin cofactors [33]. The allosteric regulation of MTHFR might be related to the modulation of enzyme activity through alteration of a subunit interface [34, 35]. Studies have shown that the homozygous TT genotype might have a 30 % reduction in enzyme activity and decreased production of 5-methylTHF when compared with the CT/CC genotypes [7, 36]. Similarly, the A1298C polymorphism, which induces a Glu-to-Ala (Glu429Ala) substitution in a regulatory domain, might reduce enzyme activity by 30–40 % [37]. These two mutations might be associated with higher homocysteine or lower plasma folate concentrations and may then lead to abnormal DNA methylation and genomic instability [7, 37, 38]. However, the available studies of the role of MTHFR polymorphisms have yielded discrepant results on response and survival in NSCLC patients treated with platinum-related chemotherapy [16–22]. Other studies observed a heterozygote advantage of MTHFR C677T with respect to offspring’s neural tube defect (NTD) in patients with NTD and with respect to specific cognitive performance in elderly Chinese males without dementia [39, 40]. In our analyses, heterozygotes of A1298C or C677T had better clinical benefit and/or a lower risk of developing severe drug toxicities when compared with homozygous carriers. However, no significant association between these two polymorphisms and survival was observed in our study. Possible causes of the inconsistencies among studies include the variety of drugs co-administered with platinum agents (e.g., pemetrexed, vinorelbine), the different clinical settings in which they were applied (adjuvant, neoadjuvant, and first- and second-line palliative chemotherapy), and differences in the ethnicity and clinical status of the patients. It is worth noting that some studies have suggested that rs4846049 might modify the binding of hsa-miR-149 to MTHFR and might therefore be associated with myelomeningocele, coronary heart disease, and cerebral palsy in infants [41–43]. In our data, nearly complete linkage (D′ = 0.949, r² = 0.98) was observed between rs1801131 and rs4846049 (located in exon 7 and the 3′UTR, respectively), suggesting that rs1801131 possibly plays the main functional role in MTHFR enzyme activity, while rs4846049 might act in the regulation of gene expression by modifying miRNA binding.

In addition to genetic factors, clinical variables are very important determinants of the treatment response of cancer [44]. Our findings indicated that the chemotherapy regimen was a significant factor influencing the occurrence of drug toxicities but not clinical benefit. Therefore, the choice of therapeutic agents should take the patient’s health status and tolerance into account. Since the platinum-vinorelbine regimen was associated with much longer PFS than the other regimens, this regimen deserves further attention. One study showed that heavy smoking was an adverse prognostic factor for patients with surgically treated adenocarcinoma [45]. Given that males and smokers had worse OS in our analysis, advising patients to give up smoking and adopt healthy habits may be important for improving OS.

Our study design had several advantages. Firstly, we recruited the largest homogeneous cohort of NSCLC patients reported to date (n = 1,004), thus improving statistical power. Secondly, unlike other cancer studies in which surgical resection, radiotherapy, and chemotherapy with multiple anticancer drugs were often used jointly, we focused on advanced NSCLC patients treated only with platinum-based chemotherapy, which reduced the possibility of false discovery caused by heterogeneity of disease and therapeutic regimen. Thirdly, our results suggest a novel mechanism, namely that heterozygote advantage might exist for MTHFR polymorphisms in response to platinum drug therapy, with heterozygotes having the highest fitness. In addition, the combined tagging and functional SNP approach could comprehensively cover the common SNPs in the MTHFR gene, and these tagging SNPs could represent the underlying causative variant in the same region. However, our study had some limitations. It included only a Chinese population, and it is important to replicate these findings in other ethnic groups to assess whether they are population-specific. The mechanism underlying the association is unclear. Our findings need to be validated in an additional study with a larger sample size and confirmed through exploration of the biomolecular mechanism.

In summary, we identified several heterozygous genotypes of MTHFR variants that were potentially associated with better efficacy and a reduced risk of toxicities in advanced NSCLC patients receiving platinum-based chemotherapy, although the variants were not independent predictors of survival. We plan to combine bioinformatics and molecular biology experiments to confirm these findings.