Introduction

Esophageal cancer (EC), which is the sixth most lethal cancer worldwide [1], usually occurs as esophageal squamous cell carcinoma (ESCC) in the middle or upper third of the esophagus, or as esophageal adenocarcinoma (EAC) in the distal third [1]. EAC is the predominant histologic type in western countries [2]. However, in developing countries such as Iran, ESCC is still the most prevalent form, accounting for over 90% of cases [3]. Iran is located in a high-risk region for ESCC, the Asian esophageal cancer belt, which stretches from the Caspian littoral to northern China [4]. One of the highest risks of ESCC in the world has been reported for this region (about 100 per 100,000 per year for Gonbad, Golestan Province, Iran) [3]. Incidence varies across the country, with age-standardized rates (ASRs) ranging from 2–3 per 100,000 population per year in the south to 43–63 in the north [5]. Several risk factors, including family history, drinking hot tea or unhealthy water, a nutritionally poor diet, consumption of opium products and low socioeconomic status, have been suggested to contribute to the risk of ESCC in this region [5, 6]. Aside from these well-studied risk factors, the genetic components of ESCC in Iran are not fully understood.

Increasing evidence suggests that genetic components may contribute to the risk of ESCC [7,8,9]. Genome-wide association studies (GWASs) have yielded enormous progress in illuminating the genetic contributors to complex diseases, including ESCC [10, 11], and have revealed several molecular mechanisms contributing to the pathophysiology of ESCC [12,13,14]. Among these, PLCE1 rs2274223 is a well-known risk variant that has been identified by three large-scale GWASs in Chinese populations [7, 12, 14]. C20orf54 rs13042395 and RUNX1 rs2014300 were also discovered through GWASs in large Chinese cohorts [7, 14]. The association of these variants with the risk of ESCC has been further evaluated in Chinese [15], Caucasian [16] and African [17] populations. However, in spite of the high prevalence of ESCC in Iran, there are currently no data regarding the possible contribution of these variants to the risk of ESCC in this population. In this study, we evaluated the association between three GWAS-identified variants (namely PLCE1 rs2274223, C20orf54 rs13042395 and RUNX1 rs2014300) and the risk of ESCC in an Iranian cohort, and conducted a meta-analysis of the association of rs2274223 and rs13042395 with ESCC.

Materials and Methods

Study Cohort

The study cohort has been described elsewhere [18]. A total of 500 unrelated Iranian subjects, including 200 ESCC patients and 300 age- and sex-matched control participants, were enrolled in this study. ESCC was diagnosed by upper gastrointestinal endoscopy and histopathological evaluation. The mean age of patients at diagnosis was 61.4 years (range 20–87 years); 101 patients (50.5%) were male and 99 (49.5%) were female. The participants in the control group had no personal or family history of cancer and were recruited from individuals who had been referred for routine check-ups. The mean age of the control group was 62.7 years (range 50–86 years); 151 control subjects (50.33%) were male and 149 (49.67%) were female. Written informed consent was obtained from all participants. The study was approved by the ethics committee of Tehran University of Medical Sciences.

DNA Extraction and Genotyping

Genomic DNA was extracted from peripheral blood using the standard salting-out protocol. Genotyping was performed using the polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) procedure. Genomic regions containing the studied SNPs were amplified with the specific primers listed in Table 1. Mismatches were introduced into the primers for C20orf54 and RUNX1 to create the corresponding restriction endonuclease sites (Table 1). Each 25 μl reaction consisted of genomic DNA (~50 ng), 12.5 μl of 2× Taq DNA Polymerase Master Mix Red (Ampliqon, Denmark), 1.5 μl forward primer (5 μM), 1.5 μl reverse primer (5 μM) and 8.5 μl sterilized water. For PLCE1 rs2274223, a 243 bp genomic region was amplified using touchdown PCR. PLCE1 primers were taken from a previous study [19]. An initial denaturation of 5 min at 94 °C was followed by seven cycles of 94 °C for 30 s, 60 °C for 30 s and 72 °C for 30 s, with the annealing temperature decreasing by 1 °C per cycle, then 28 cycles of 94 °C for 30 s, 54 °C for 30 s and 72 °C for 30 s, and a final extension step at 72 °C for 5 min. The PCR product was digested with the restriction enzyme BstUI (Thermo Scientific, United States) according to the manufacturer’s instructions. For C20orf54 rs13042395, a 140 bp genomic region was amplified using touchdown PCR. An initial denaturation of 5 min at 94 °C was followed by seven cycles of 94 °C for 30 s, 66 °C for 30 s and 72 °C for 30 s, with the annealing temperature decreasing by 1 °C per cycle, then 28 cycles of 94 °C for 30 s, 60 °C for 30 s and 72 °C for 30 s, and a final extension step at 72 °C for 5 min. The PCR product was digested with the restriction enzyme Tru1I (Thermo Scientific, United States) according to the manufacturer’s instructions. For RUNX1 rs2014300, a 135 bp genomic region was amplified using touchdown PCR. An initial denaturation of 5 min at 94 °C was followed by six cycles of 94 °C for 30 s, 63 °C for 30 s and 72 °C for 30 s, with the annealing temperature decreasing by 1 °C per cycle, then 29 cycles of 94 °C for 30 s, 58 °C for 30 s and 72 °C for 30 s, and a final extension step at 72 °C for 5 min. The PCR product was digested with the restriction enzyme TaqI (Thermo Scientific, United States) according to the manufacturer’s instructions. The assigned genotypes were confirmed by Sanger sequencing for nine samples (one sample per SNP genotype), and these samples then served as controls for the digestion process. No template control (NTC) PCR and digestion reactions were also performed for each SNP to monitor for possible contamination. Each NTC PCR contained all reagents except template DNA.
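The three touchdown programs described above differ only in their annealing temperatures and cycle counts. The following is a minimal sketch that expands each program into a per-cycle list of annealing temperatures. The function and dictionary names are illustrative, and the sketch assumes that the first touchdown cycle anneals at the stated starting temperature.

```python
from typing import Dict, List, Tuple


def touchdown_annealing(start_temp: int, touchdown_cycles: int,
                        final_temp: int, final_cycles: int,
                        step: int = 1) -> List[int]:
    """Return the annealing temperature (in °C) used in each PCR cycle.

    Assumption: the first touchdown cycle uses start_temp and each later
    touchdown cycle is `step` degrees lower than the one before it.
    """
    touchdown = [start_temp - step * i for i in range(touchdown_cycles)]
    return touchdown + [final_temp] * final_cycles


# (start temp, touchdown cycles, final temp, final cycles), as given in the text
PROGRAMS: Dict[str, Tuple[int, int, int, int]] = {
    "PLCE1 rs2274223": (60, 7, 54, 28),
    "C20orf54 rs13042395": (66, 7, 60, 28),
    "RUNX1 rs2014300": (63, 6, 58, 29),
}

for name, params in PROGRAMS.items():
    temps = touchdown_annealing(*params)
    n_touchdown = params[1]
    print(name, "-", len(temps), "cycles; touchdown phase:", temps[:n_touchdown])
```

Under this reading, the last touchdown cycle of each program already anneals at the temperature used for the remaining cycles, and all three programs run the same total number of cycles.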

Table 1 Primer sequences that were used for PCR-RFLP. Nucleotides that were mismatched to create the corresponding restriction sites are underlined

Restriction Enzyme Digestions

Figure 1a shows the cleavage site of BstUI with regard to the alleles of PLCE1 rs2274223. Digestion of PLCE1 rs2274223 with BstUI resulted in a single 243 bp fragment for the AA genotype, three fragments (243 bp, 155 bp and 88 bp) for the AG genotype, and two fragments (155 bp and 88 bp) for the GG genotype (Fig. 1a and Table 1). Figure 1b shows agarose gel electrophoresis of the PLCE1 digestion products for three samples along with the digestion controls and the NTC. Digestion of C20orf54 rs13042395 with Tru1I resulted in a single 140 bp fragment for the CC genotype, three fragments (140 bp, 116 bp and 24 bp) for the CT genotype and two fragments (116 bp and 24 bp) for the TT genotype (Fig. 2a and Table 1). However, the small 24 bp fragment usually runs off the bottom of the gel and was not visible. Figure 2b shows agarose gel electrophoresis of the C20orf54 digestion products for three samples along with the digestion controls and the NTC. Digestion of the RUNX1 rs2014300 AA, AG and GG genotypes with TaqI resulted in a single 135 bp fragment, three fragments (135 bp, 111 bp and 24 bp), and two fragments (111 bp and 24 bp), respectively (Fig. 3a and Table 1). Again, the small 24 bp fragment was not visible on the gel. Figure 3b shows agarose gel electrophoresis of the RUNX1 digestion products for three samples along with the digestion controls and the NTC. Figures 1c, 2c and 3c show the results of Sanger sequencing for the regions encompassing the studied SNPs. The reverse strand was sequenced in the case of PLCE1 (Fig. 1c).
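The band patterns described above can be summarized as a lookup from genotype to expected fragments. The sketch below is illustrative only: the fragment sizes and cut/uncut alleles are taken from the text, while the function names and the `min_visible_bp` cutoff (used to mimic the 24 bp fragment not being visible on the gel) are assumptions introduced for this example.

```python
from typing import Dict, List, Optional, Sequence

# Fragment sizes (bp) and alleles as described in the text.
ASSAYS: Dict[str, dict] = {
    "PLCE1 rs2274223": {"enzyme": "BstUI", "uncut": 243, "cut": (155, 88),
                        "uncut_allele": "A", "cut_allele": "G"},
    "C20orf54 rs13042395": {"enzyme": "Tru1I", "uncut": 140, "cut": (116, 24),
                            "uncut_allele": "C", "cut_allele": "T"},
    "RUNX1 rs2014300": {"enzyme": "TaqI", "uncut": 135, "cut": (111, 24),
                        "uncut_allele": "A", "cut_allele": "G"},
}


def expected_bands(snp: str, genotype: str, min_visible_bp: int = 0) -> List[int]:
    """Return the expected band sizes for a genotype, largest first.

    min_visible_bp is a hypothetical cutoff below which bands are
    treated as not visible on the gel.
    """
    info = ASSAYS[snp]
    bands = set()
    for allele in genotype:
        if allele == info["cut_allele"]:
            bands.update(info["cut"])
        elif allele == info["uncut_allele"]:
            bands.add(info["uncut"])
        else:
            raise ValueError(f"unexpected allele {allele!r} for {snp}")
    return sorted((b for b in bands if b >= min_visible_bp), reverse=True)


def call_genotype(snp: str, observed: Sequence[int],
                  min_visible_bp: int = 0) -> Optional[str]:
    """Return the genotype whose expected pattern matches the observed bands."""
    info = ASSAYS[snp]
    u, c = info["uncut_allele"], info["cut_allele"]
    seen = sorted(observed, reverse=True)
    for genotype in (u + u, u + c, c + c):
        if expected_bands(snp, genotype, min_visible_bp) == seen:
            return genotype
    return None  # pattern not recognized, e.g. a partial digest


print(expected_bands("PLCE1 rs2274223", "AG"))
print(call_genotype("RUNX1 rs2014300", [135, 111], min_visible_bp=30))
```

In each assay the two cut fragments add up to the size of the uncut amplicon, which is a quick consistency check on the stated fragment sizes.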

Fig. 1

Restriction site and examples of genotyping results for PLCE1 rs2274223. a The recognition and cleavage site for BstUI. The blue nucleotides represent the alleles of the SNP. The vertical red line shows the cleavage site. b The digestion products of three samples analyzed on an agarose gel in parallel with three digestion controls and the NTC. M, size marker; NTC, no template control. c The results of sequencing the genomic region encompassing PLCE1 rs2274223 in three samples. Note that the reverse strand (−) was sequenced in these samples. Therefore, the genotypes of these samples relative to the forward (+) strand are GG, AG and AA (from top to bottom). The SNP position is indicated by an arrow

Fig. 2

Restriction site and examples of genotyping results for C20orf54 rs13042395. a The recognition and cleavage site for Tru1I. The blue and red nucleotides represent, respectively, the SNP alleles and the mismatch that was inserted to create the site. The vertical red line shows the cleavage site. b The digestion products of three samples analyzed on an agarose gel in parallel with three digestion controls and the NTC. M, size marker; NTC, no template control. c The results of sequencing the genomic region encompassing C20orf54 rs13042395 in three samples. The SNP position is indicated by an arrow

Fig. 3

Restriction site and examples of genotyping results for RUNX1 rs2014300. a The recognition and cleavage site for TaqI. The blue and red nucleotides represent, respectively, the SNP alleles and the mismatch that was inserted to create the site. The vertical red line shows the cleavage site. b The digestion products of three samples analyzed on an agarose gel in parallel with three digestion controls and the NTC. M, size marker; NTC, no template control. c The results of sequencing the genomic region encompassing RUNX1 rs2014300 in three samples. The SNP position is indicated by an arrow

Meta-Analysis

Eligible studies were identified by searching the PubMed and Embase databases using the keywords (“PLCE1” OR “Phospholipase C Epsilon 1” OR “10q23” OR “C20orf54” OR “SLC52A3”) AND (“polymorphism” OR “SNP” OR “variant” OR “variation” OR “rs2274223” OR “rs13042395”) AND (“esophageal squamous cell carcinoma” OR “ESCC” OR “oesophageal squamous cell carcinoma”). A manual search was also performed to identify additional relevant studies. The last search was performed on 1 Feb 2018. Original studies were included based on the following criteria: (i) they evaluated the association of PLCE1 rs2274223 or C20orf54 rs13042395 with ESCC using a case-control design (excluding GWASs); (ii) they provided sufficient data for estimating odds ratios (ORs) and their corresponding 95% confidence intervals (95% CIs). Data were extracted from eligible studies by two authors (Z. Nariman-Saleh-Fam and M. Bastami). For each study, the following data were recorded: first author, publication date, ethnicity of subjects, genotyping method and genotype frequencies. In case of discrepancies, consensus was reached by discussion.

Statistical Analyses

All statistical analyses were conducted using R (version 3.1.0) as described elsewhere [20, 21]. SNPs were assessed for significant deviation from the Hardy-Weinberg equilibrium (HWE) among the control group members, patients and all participants using the χ² test implemented in the HardyWeinberg package in R [22]. The association of the PLCE1 rs2274223, C20orf54 rs13042395 and RUNX1 rs2014300 polymorphisms with ESCC was analyzed using logistic regression as implemented in the SNPassoc package (version 1.9–2) [23]. ORs and 95% CIs were calculated assuming codominant, dominant, recessive, overdominant and log-additive genetic models. The best-fitting genetic model was selected based on the lowest Akaike’s information criterion (AIC) value as calculated by the SNPassoc package [23]. The meta package for R was used to perform the meta-analysis [24]. The association of rs2274223 and rs13042395 with ESCC was estimated by calculating pooled ORs and their 95% CIs assuming allelic, homozygote, heterozygote, dominant and recessive models. Heterogeneity was assessed using the chi-squared-based Q test. The random effect model [25] was used to calculate pooled ORs and 95% CIs if significant heterogeneity was present (i.e. P < 0.1). Otherwise, the fixed effect model was used [26]. The significance of the pooled OR was determined by the Z test (P < 0.05 was considered significant). In cases of substantial heterogeneity (i.e. I² > 50%), the potential sources of heterogeneity across studies were explored using univariate meta-regression and stratified analysis. Sensitivity analyses were performed by omitting one study at a time to measure the consistency of the results and the influence of each study on the pooled OR. Publication bias was evaluated by Begg’s rank correlation test of funnel plot asymmetry [27].
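The pooling rule described above (Q test, then a fixed or random effect model) can be sketched as follows. This is not the code used in the study, which relied on the R meta package. The sketch assumes inverse-variance weighting for the fixed effect model and the DerSimonian-Laird estimator of the between-study variance for the random effect model, which may differ from the estimators behind references [25, 26]. All function names and the example inputs are hypothetical.

```python
import math
from typing import Dict, Sequence


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-squared variable with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= x / (2 * r)
            total += term
        return math.exp(-x / 2) * total
    total = math.erfc(math.sqrt(x / 2))
    term = math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total


def pool(log_ors: Sequence[float], ses: Sequence[float],
         het_alpha: float = 0.1) -> Dict[str, float]:
    """Pool study-level log ORs; pick the model from the Q-test P value."""
    w = [1 / se ** 2 for se in ses]                       # inverse-variance weights
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    df = len(log_ors) - 1
    p_het = chi2_sf(q, df)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0   # I² in percent
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0       # DerSimonian-Laird
    w_re = [1 / (se ** 2 + tau2) for se in ses]
    random = sum(wi * y for wi, y in zip(w_re, log_ors)) / sum(w_re)
    use_random = p_het < het_alpha
    est = random if use_random else fixed
    se = math.sqrt(1 / sum(w_re if use_random else w))
    return {
        "Q": q, "P_het": p_het, "I2": i2, "tau2": tau2,
        "random_model": float(use_random),
        "OR": math.exp(est),
        "CI_low": math.exp(est - 1.96 * se),
        "CI_high": math.exp(est + 1.96 * se),
        "Z": est / se,
    }


# Hypothetical study-level log ORs and standard errors, for illustration only.
result = pool([0.10, 0.45, 0.30, 0.62], [0.08, 0.12, 0.10, 0.20])
print({k: round(v, 3) for k, v in result.items()})
```

When the studies agree closely, Q is small, the estimated between-study variance is zero and both models give the same pooled OR; the random effect model widens the confidence interval only when heterogeneity is present.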

Results

PLCE1 rs2274223 and Risk of ESCC in the Iranian Cohort

Genotype frequency distributions of the studied SNPs are shown in Table 2. Genotype frequencies of PLCE1 rs2274223 did not deviate significantly from Hardy-Weinberg equilibrium among members of the control group, patients or all subjects (P values were 0.349, 0.149 and 0.067, respectively). The frequency of the minor allele (i.e. G) of this SNP was 0.188 in the control group, 0.265 in patients and 0.219 in all participants. Logistic regression analysis revealed that rs2274223 was associated with ESCC under the codominant, dominant, recessive and log-additive models. In the codominant model, subjects carrying the GG genotype had a significantly increased risk of ESCC compared to those with the AA genotype [GG vs. AA, OR (95% CI): 2.47 (1.17–5.23), P value: 0.021]. The dominant model showed an approximately 1.5-fold higher risk of ESCC in individuals carrying at least one G allele than in those with the AA genotype [AG + GG vs. AA, OR (95% CI): 1.57 (1.09–2.27), P value: 0.016]. Under the recessive model, individuals carrying the GG genotype had an approximately 2-fold higher risk of ESCC than those carrying at least one A allele [GG vs. AA + AG, OR (95% CI): 2.18 (1.04–4.56), P value: 0.036]. Under the log-additive model, each additional copy of the G allele was associated with a 1.5-fold increased risk of ESCC [OR (95% CI): 1.51 (1.12–2.02), P value: 0.006]. Based on the lowest AIC value, the log-additive model fitted the data best (AIC: 669.5).
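To make the genetic models concrete, the sketch below shows how genotype counts are collapsed into 2×2 tables for several comparisons and how a crude OR with a Wald (Woolf) 95% CI follows from each table. It is only an illustration: the study itself used logistic regression in SNPassoc, the log-additive and overdominant models are not covered here, and the genotype counts are hypothetical because Table 2 is not reproduced in this text.

```python
import math
from typing import Dict, Tuple

Counts = Tuple[int, int, int]  # (AA, AG, GG), with G as the variant allele


def odds_ratio(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Crude OR and Wald 95% CI for a 2x2 table.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(or_) - z * se)
    high = math.exp(math.log(or_) + z * se)
    return or_, low, high


def model_tables(cases: Counts, controls: Counts) -> Dict[str, Tuple[int, int, int, int]]:
    """Collapse genotype counts into one 2x2 table per comparison."""
    ca_aa, ca_ag, ca_gg = cases
    co_aa, co_ag, co_gg = controls
    return {
        "homozygote (GG vs AA)": (ca_gg, ca_aa, co_gg, co_aa),
        "heterozygote (AG vs AA)": (ca_ag, ca_aa, co_ag, co_aa),
        "dominant (AG+GG vs AA)": (ca_ag + ca_gg, ca_aa, co_ag + co_gg, co_aa),
        "recessive (GG vs AA+AG)": (ca_gg, ca_aa + ca_ag, co_gg, co_aa + co_ag),
        "allelic (G vs A)": (2 * ca_gg + ca_ag, 2 * ca_aa + ca_ag,
                             2 * co_gg + co_ag, 2 * co_aa + co_ag),
    }


# Hypothetical genotype counts, NOT the study data.
example_cases: Counts = (100, 80, 20)
example_controls: Counts = (200, 90, 10)

for model, table in model_tables(example_cases, example_controls).items():
    or_, low, high = odds_ratio(*table)
    print(f"{model}: OR {or_:.2f} ({low:.2f}-{high:.2f})")
```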

Table 2 The genotype distributions and association analyses for PLCE1 rs2274223, C20orf54 rs13042395 and RUNX1 rs2014300

C20orf54 rs13042395 and Risk of ESCC in the Iranian Cohort

C20orf54 rs13042395 genotype frequencies did not deviate significantly from Hardy-Weinberg equilibrium among members of the control group, patients or all subjects (P values were 0.354, 0.833 and 0.532, respectively). The frequency of the minor allele (i.e. T) of this SNP was 0.25 in the control group, 0.21 in patients and 0.234 in all participants. Logistic regression analysis showed that C20orf54 rs13042395 was not associated with ESCC under any of the analyzed genetic models (Table 2).
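The Hardy-Weinberg check reported for each SNP can be illustrated with a plain one-degree-of-freedom χ² test. This is a minimal sketch under stated assumptions: it applies no continuity correction, so its P values may differ slightly from those of the HardyWeinberg R package used in the study, and the genotype counts in the example are hypothetical.

```python
import math
from typing import Tuple


def hwe_chisq(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Chi-squared test (1 df, no continuity correction) for HWE.

    Returns the test statistic and its P value.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1 - p                                # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 df, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts, for illustration only.
chi2, p_value = hwe_chisq(170, 110, 20)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A P value above the chosen threshold, as reported for all three SNPs in this cohort, indicates no significant deviation from the genotype proportions expected under random mating.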

RUNX1 rs2014300 and Risk of ESCC in the Iranian Cohort

RUNX1 rs2014300 genotype frequencies did not deviate significantly from Hardy-Weinberg equilibrium among members of the control group, patients or all subjects (P values were 0.341, 1.0 and 0.316, respectively). The frequency of the minor allele (A) of this SNP was 0.187 in the control group, 0.12 in patients and 0.16 in all participants. Logistic regression analysis found that rs2014300 was associated with ESCC under the codominant, dominant and log-additive models (Table 2). Under the codominant model, subjects carrying the AG genotype had a more than 1.5-fold lower risk of ESCC than those carrying the GG genotype [AG vs. GG, OR (95% CI): 0.63 (0.41–0.97), P value: 0.018]. The dominant model revealed that subjects carrying at least one A allele had an approximately 1.7-fold lower risk of ESCC than those with the GG genotype [AG + AA vs. GG, OR (95% CI): 0.59 (0.39–0.89), P value: 0.010]. Furthermore, each additional A allele was associated with a 1.63-fold lower risk of ESCC under the log-additive model [OR (95% CI): 0.61 (0.42–0.87), P value: 0.005]. Based on the lowest AIC value, the log-additive model fitted the data best (AIC: 669.2).

Meta-Analysis of ESCC Risk Associated with PLCE1 rs2274223 and C20orf54 rs13042395

The process of study selection is shown in Fig. 4. A total of 85 articles were identified, of which 34 were duplicate records and were excluded. A further 37 articles did not meet the inclusion criteria and were excluded. In the study by Malik MA [28], patients had either squamous cell carcinoma or adenocarcinoma histology, but the genotype distributions were not reported separately for each histology. Moreover, in the study by Dong Y [29], the histological type of the EC patients was not described. Therefore, these studies were excluded [28, 29]. Finally, a total of 14 eligible articles [15,16,17, 19, 30,31,32,33,34,35,36,37,38,39], which contained 13 studies for PLCE1 rs2274223 and seven studies for C20orf54 rs13042395, were included. With the addition of the present study, 14 studies (involving 9810 cases and 13,128 controls) for PLCE1 rs2274223 and eight studies (involving 2363 cases and 5329 controls) for C20orf54 rs13042395 were included in the final meta-analysis. Table 3 shows the characteristics of the included studies. For PLCE1 rs2274223, ten studies were performed in Asian populations, two in Europeans and two in Africans. For C20orf54 rs13042395, five studies were performed in Asians, one in Europeans and two in Africans.

Fig. 4

The process of study selection

Table 3 Characteristics of the studies included in the present meta-analysis

As shown in Table 4, the meta-analysis revealed significant associations between PLCE1 rs2274223 and ESCC risk under all of the analyzed models [G vs. A: OR (95% CI) 1.270 (1.150–1.403), P < 0.001; GG vs. AA: OR (95% CI) 1.710 (1.534–1.905), P < 0.001; AG vs. AA: OR (95% CI) 1.336 (1.261–1.415), P < 0.001; GG + AG vs. AA: OR (95% CI) 1.393 (1.319–1.472), P < 0.001; GG vs. AG + AA: OR (95% CI) 1.509 (1.360–1.675), P < 0.001]. There was no evidence of publication bias in the overall estimation (all P values of Begg’s tests > 0.05, Table 4). Figure 5 and Fig. 6a show the forest and funnel plots for the homozygote model, respectively. Significant heterogeneity was observed in all models (Table 4). Meta-regression showed that ethnicity (defined as Asian or non-Asian) may be a significant source of heterogeneity in the allelic model (coefficient: 0.296, SE: 0.113, P: 0.023, R²: 45.59%), the homozygote model (coefficient: 0.548, SE: 0.223, P: 0.030, R²: 55.41%), the heterozygote model (coefficient: 0.257, SE: 0.118, P: 0.050, R²: 54.90%), the dominant model (coefficient: 0.317, SE: 0.136, P: 0.038, R²: 42.28%) and the recessive model (coefficient: 0.457, SE: 0.170, P: 0.019, R²: 73.42%). However, the test for residual heterogeneity revealed that a significant part of the heterogeneity in the allelic model (QE: 27.556, P: 0.006) and the dominant model (QE: 22.777, P: 0.029) remained unexplained. Stratified analysis and meta-regression found statistically significant differences between ethnicity subgroups (Table 5), indicating that PLCE1 rs2274223 was associated with ESCC risk in Asian but not in non-Asian populations. Moreover, no significant source of heterogeneity was found among the other study-level moderators (i.e. year of publication and genotyping method) in the analyzed genetic models. Sensitivity analysis under all genetic models showed that no single study influenced the estimated ORs or 95% CIs, indicating the statistical robustness of the analysis (Fig. 6b).

Table 4 Meta-analysis of ESCC risk associated with PLCE1 rs2274223 and C20orf54 rs13042395

Fig. 5

Forest plot of PLCE1 rs2274223 for the homozygote model. Heterogeneity: I² = 61%; τ² = 0.0859; P < 0.01

Fig. 6

Begg’s funnel plot (a) and sensitivity analysis (b) of the association between PLCE1 rs2274223 and ESCC risk under the homozygote model. In the funnel plot, the vertical black line and the dotted line represent the pooled fixed and random effect estimates, respectively. For the sensitivity analysis, each line represents the pooled OR and its corresponding 95% CI estimated by omitting one study at a time

Table 5 Stratified analysis and meta-regression to evaluate the effect of ethnicity (PLCE1 rs2274223)

The meta-analysis found that C20orf54 rs13042395 was not associated with ESCC risk under any genetic model (Table 4). No evidence of publication bias was found (all P values of Begg’s tests > 0.05, Table 4). The fixed effect model was used for rs13042395 as there was no significant heterogeneity among studies. According to the sensitivity analysis, no single study influenced the estimated pooled ORs.

Discussion

Recent large GWASs have yielded enormous progress in understanding the genetic basis of ESCC by exploring the relationship between a large number of variants and disease predisposition, and have discovered several candidate loci in Chinese populations [7, 13, 14]. Although Iran is considered a high-risk region for ESCC [3], little is currently known about the role of genetic components in disease risk in this region. This study evaluated whether similar associations exist in an Iranian cohort for three GWAS-identified variants and showed that PLCE1 rs2274223 and RUNX1 rs2014300 may contribute to ESCC predisposition in the cohort.

The PLCE1 locus at 10q23 was first identified as independently modulating the risk of both ESCC and gastric cancer through a well-powered GWAS in a large Chinese cohort [12]. Among multiple SNPs that reached the genome-wide significance threshold, the most notable signal was PLCE1 rs2274223 [12]. The association with ESCC was subsequently confirmed by two independent GWASs in Chinese cohorts [7, 14], and was also reproduced in Chinese [15, 32] and Korean [31] populations. A recent joint analysis of three GWASs in Chinese populations also found a significant result for this locus [13]. These studies demonstrated that the variant allele (i.e. G) increases the risk of ESCC. However, a similar association was not found in South Africans [17] or Indians [19, 28]. Results of studies in Caucasians are inconsistent. Dura et al. found no association in Dutch Caucasians [30], whereas Palmer et al. reported an association in U.S. Caucasians in the opposite direction to that in Chinese populations [16]. The association reported in Iranians by the current study was in the same direction as in the original studies in Chinese populations (Table 2). The present study combined the results of 14 association studies and indicated that PLCE1 rs2274223 is associated with an increased risk of ESCC. Meta-regression suggested that a proportion of the heterogeneity may be attributed to ethnicity. However, at least under the allelic and dominant genetic models, there was statistically significant between-study heterogeneity that could not be explained by the study-level moderators. In stratified analysis, the association was only significant in Asian populations (Table 5). Quality controls of the meta-analysis (including sensitivity analyses and publication bias evaluations) supported the reliability of our results. However, it should be noted that the number of studies in non-Asian populations was limited in the meta-analysis, and this, in turn, may influence the statistical power of the subgroup analysis and the publication bias assessment. Significant heterogeneity was observed under all genetic models and, therefore, the random effect model was used to estimate pooled ORs.

PLCE1 is a member of the phospholipase C family of proteins and interacts with the proto-oncogene ras, among other proteins [40]. Animal and experimental studies suggest that PLCE1 functions as a tumor suppressor in ras-triggered cancers such as colorectal, lung and skin cancers [41, 42]. On the other hand, PLCE1 has also been linked to oncogenic functions. Knockout studies have shown that PLCE1-knockout APC min/+ mice are resistant to intestinal tumor formation through attenuation of angiogenesis and tumor-associated inflammation [43]. In ESCC, the finding that PLCE1 protein levels were significantly higher in tumors than in normal esophageal tissues suggests an oncogenic role for this gene [44, 45]. Increased levels of PLCE1 were correlated with advanced tumor-node-metastasis stages and lymph node metastasis [45] and with increased expression of NF-κB-related proteins [46]. Its oncogenic function has been further supported by an RNAi approach, which showed that PLCE1 silencing in an EC cell line resulted in increased apoptosis and cell cycle arrest through upregulation of caspase-3 and downregulation of cyclin D [44]. Moreover, knockdown of PLCE1 in an EC cell line markedly increased p53 expression and apoptosis [47]. It has been shown that rs2274223, which is a nonsynonymous variant, may increase the mRNA and protein levels of PLCE1 [48].

The C20orf54 rs13042395-T allele was shown to be associated with a lower risk of ESCC by a GWAS in a Chinese cohort [14]. However, the association of this SNP with ESCC is controversial, as some studies failed to reproduce it in Chinese [15, 29] or Caucasian [16] populations. The current study did not identify a significant association between this SNP and ESCC, suggesting that it may not contribute to the risk of ESCC in Iranians. Meta-analysis of eight studies also revealed no significant association (Table 4). However, it should be noted that the non-significant result of this meta-analysis does not rule out the possibility that detecting the true effect of this SNP may require a larger sample size, as including the results of genome-wide studies may suggest only a subtle effect in the allelic model [49].

Runt-related transcription factor 1 (RUNX1) rs2014300 has also been identified as a modulator of ESCC risk in the Chinese population [7]. It has been shown that the G allele increases the risk of ESCC [7]. This association has recently been confirmed through a joint analysis of three GWASs in the Chinese population [13]. The association has been reported in the South African Mixed Ancestry population in the opposite direction to that in the original GWASs, but not in South African Blacks [17]. RUNX1 belongs to the runt-related transcription factor (RUNX) family and is thought to be involved in the development of normal hematopoiesis [provided by RefSeq]. In ESCC, it has been shown that LincRNA-uc002yug.2 promotes tumor progression by regulating alternative splicing of RUNX1 [50].

In conclusion, we provided the first evidence for the association of PLCE1 rs2274223 and RUNX1 rs2014300 with the risk of ESCC in an Iranian cohort.