Introduction

Lung cancer is the most common cause of cancer death worldwide, with over one million cases annually (Ferlay et al. 2007). The pathogenesis of lung cancer is known to be influenced by smoking, air pollution, and exposure to hazardous substances (e.g., asbestos, chromium, nickel, inorganic arsenic, mustard gas, and tar). Among the known environmental risk factors, smoking is considered the most important cause of lung cancer; it increases susceptibility to lung cancer 20-fold, and 80–90 % of lung cancer cases are related to smoking (Ruano-Ravina et al. 2003). Although cigarette smoking is the major risk factor for lung cancer, genetic factors also affect lung cancer susceptibility (Matakidou et al. 2005). For example, increased lung cancer rates are observed in several genetic syndromes, including Li-Fraumeni syndrome, hereditary retinoblastoma, familial breast cancer, and Bloom syndrome (Strong et al. 1984; Li et al. 1988; Sanders et al. 1989; German 1993; Johannsson et al. 1996). Despite intense efforts in the past decades, researchers have still not identified specific biomarkers for lung cancer risk assessment and prognosis prediction.

Lung cancer is classified into two main histological groups: small cell lung cancer (SC) and non-small cell lung cancer; the latter includes adenocarcinoma (AD) and squamous cell carcinoma (SQ), along with large cell lung cancer (LC). Worldwide, adenocarcinoma is the most frequently identified histological type, and the relative proportion of lung cancer due to this histology has steadily risen. Demographic, etiologic, clinical, and molecular characteristics of the lung cancer subtypes have been reported (Gabrielson 2006). Although family history of lung cancer has been associated with histological subtypes (Ambrosone et al. 1993; Gao et al. 2009), the inherited susceptibility factors that affect specific histologies are unknown.

Recently, three genome-wide association studies (GWAS) of lung cancer and subsequent pooled GWAS analyses identified inherited susceptibility variants on chromosomes 15q25 (Amos et al. 2008; Hung et al. 2008; Thorgeirsson et al. 2008), 5p15 (McKay et al. 2008; Wang et al. 2008; Rafnar et al. 2009), and 6p21.33 (Wang et al. 2008). The 5p15.33 region comprises two candidate susceptibility genes, telomerase reverse transcriptase (TERT) and cleft lip and palate transmembrane 1-like (CLPTM1L). The IARC (McKay et al. 2008) and the Institute of Cancer Research (ICR) and MD Anderson groups (Wang et al. 2008) reported that two different SNPs at 5p15.33 (rs402710 and rs401681, respectively), which are in strong linkage disequilibrium (D′ = 1.00, r² = 0.66), are associated with the risk of lung cancer (IARC study: P = 2 × 10⁻⁷; ICR and MD Anderson groups: P = 8 × 10⁻⁹). CLPTM1L plays a role in apoptosis and has been found to be upregulated in cisplatin-resistant cell lines (Yamamoto et al. 2001).

Replication of initial genome-wide association findings is considered a gold standard for reporting genotype–phenotype associations. Over the past few years, considerable efforts have been devoted to exploring the relationships between the two CLPTM1L SNPs at 5p15.33 (rs402710 and rs401681) and lung cancer risk among various populations. However, existing studies have yielded inconsistent results. These disparate findings may be due partly to insufficient power, false-positive results, and publication bias. The interpretation of these studies has been further complicated by the use of different populations or histological types. To help clarify the inconsistent findings, we conducted a comprehensive meta-analysis to quantify the overall effect of the rs402710 and rs401681 polymorphisms of CLPTM1L on the risk of developing lung cancer.

Materials and methods

Literature search strategy

Genetic association studies published before the end of January 2014 on lung cancer and the two SNPs (rs402710 and rs401681) at 5p15.33 were identified through a search of PubMed, Web of Science, EMBASE, SCOPUS, and Cochrane databases using combinations of the following keywords: ‘CLPTM1L,’ ‘rs402710,’ ‘rs401681,’ ‘5p15,’ ‘5p15.33,’ ‘polymorphism’ or ‘variant,’ and ‘lung cancer’ or ‘lung carcinoma’ or ‘lung neoplasm’ without restriction on language. We substituted one search term at a time until all possible combinations had been searched, to avoid missing any relevant literature. The titles and abstracts of potential articles were screened to determine their relevance, and any clearly irrelevant studies were excluded. The full texts of the remaining articles were read to determine whether they contained information on the topic of interest. Furthermore, the reference lists of primary studies and review articles were searched manually to identify additional relevant publications.
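
The keyword combinations described above can be enumerated programmatically. The following is a minimal sketch and not the procedure actually used for this search: the grouping of the keywords into three sets, the Boolean syntax, and the function name build_queries are assumptions made for illustration, and each database has its own query syntax.

```python
from itertools import product

# Keyword groups taken from the search strategy; the grouping itself is an assumption.
LOCUS_TERMS = ["CLPTM1L", "rs402710", "rs401681", "5p15", "5p15.33"]
VARIANT_TERMS = ["polymorphism", "variant"]
DISEASE_TERMS = ["lung cancer", "lung carcinoma", "lung neoplasm"]


def build_queries():
    """Return one Boolean query string per combination of the three keyword groups."""
    return [
        f'"{locus}" AND "{variant}" AND "{disease}"'
        for locus, variant, disease in product(LOCUS_TERMS, VARIANT_TERMS, DISEASE_TERMS)
    ]
```

Enumerating the combinations in this way makes it easy to confirm that no pairing of locus, variant, and disease terms has been skipped.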

Inclusion and exclusion criteria

We reviewed abstracts of all citations and retrieved studies. The following criteria were used to include published studies: (1) identification of lung cancer cases was confirmed histologically or pathologically; (2) the study used a case–control or cohort design to evaluate the association between the rs402710 or rs401681 polymorphism and lung cancer risk; and (3) the study reported genotype distributions in cases and controls, or an odds ratio (OR) with its 95 % confidence interval (CI) and P value. Major reasons for exclusion of studies were (1) case-only or family-based design; (2) duplicated studies; and (3) insufficient data.

Data abstraction

Data abstraction was performed independently by two reviewers, and differences were resolved by further discussion among all authors. For each study, the following characteristics were collected: the first author, publication year, ethnicity of the study population, number of cases and controls, gender, age of cases and controls, histological types, cigarette smoking status, source of control, genotyping method, and number of genotypes in cases and controls.

Quality assessment

For association studies with inconsistent results on the same polymorphisms, the methodological quality should be assessed by appropriate criteria to limit the risk of introducing bias into meta-analyses or systematic reviews. A procedure known as ‘extended-quality score’ has been developed to assess the quality of association studies. The procedure scores each paper and categorizes it as having ‘high,’ ‘median,’ or ‘poor’ quality. The detailed procedure of the quality assessment has been described previously (Li et al. 2006).

Statistical methods

Deviation from Hardy–Weinberg equilibrium (HWE) was examined by chi-square test with 1 degree of freedom. Crude odds ratios (ORs) with corresponding 95 % confidence intervals (CIs) were used to assess the strength of association between the 5p15.33-rs402710 and 5p15.33-rs401681 polymorphisms and lung cancer risk. Additional pooled estimates were also given under dominant and recessive genetic models. Cochran’s chi-square-based Q statistic test was performed to assess possible heterogeneity between the individual studies and thus to ensure that each group of studies was suitable for meta-analysis (TanTai et al. 2014). The degree of heterogeneity was assessed using the I² metric (I² = (Q − df)/Q), which is independent of the number of studies in the meta-analysis (Chen et al. 2014). ORs were pooled according to the method of DerSimonian and Laird, which takes into account the variation between studies, and 95 % CIs were constructed using Woolf’s method (Woolf 1955; DerSimonian and Laird 1986). The Z test was used to determine the significance of the pooled OR. Ethnicity (Caucasian vs. East Asian), study size (≥1,000 cases or <1,000 cases), histological type (AD, SQ, SC, and LC), and smoking behavior (former smokers, current smokers, or never smokers) were pre-specified as characteristics for assessment of heterogeneity. In addition, ethnicity, histological subtype, source of controls, sample size, smoking behavior, and genotyping method were analyzed as covariates in meta-regression. Sensitivity analyses were performed to assess the stability of the results; namely, a single study in the meta-analysis was deleted each time to reflect the influence of the individual data set on the overall OR. Publication bias was assessed using Egger’s test and Begg’s funnel plots (Begg and Mazumdar 1994; Egger et al. 1997). All P values were two-sided, with significance set at the 0.05 level. All statistical analyses were carried out with Stata software version 10.0 (Stata Corporation, College Station, TX).
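
To make the pooling procedure concrete, the following minimal sketch shows how a study-level log OR with its Woolf standard error, Cochran’s Q, I², the DerSimonian–Laird between-study variance, and the Z test fit together. It is an illustration under stated assumptions, not the Stata routines used for the analyses: the function names woolf_log_or and dersimonian_laird are hypothetical, the 1.96 multiplier assumes a normal approximation for the 95 % CI, and I² is truncated at zero when Q is smaller than its degrees of freedom.

```python
import math


def woolf_log_or(a, b, c, d):
    """Log odds ratio and Woolf standard error for a 2x2 table.

    a, b: risk-allele and other-allele counts in cases
    c, d: risk-allele and other-allele counts in controls
    All four counts are assumed to be non-zero.
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def dersimonian_laird(log_ors, ses):
    """Pool study log ORs with the DerSimonian-Laird random-effects method."""
    w = [1 / se ** 2 for se in ses]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))  # Cochran's Q
    df = len(log_ors) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0  # I^2 = (Q - df)/Q, floored at 0
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_re = [1 / (se ** 2 + tau2) for se in ses]  # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, log_ors)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    z = pooled / se_pooled
    p_z = math.erfc(abs(z) / math.sqrt(2))  # two-sided P value from the normal distribution
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se_pooled), math.exp(pooled + 1.96 * se_pooled)),
        "q": q, "df": df, "i2": i2, "tau2": tau2, "z": z, "p_z": p_z,
    }
```

As a check on the arithmetic, two hypothetical studies with log ORs of 0 and 1 and a standard error of 0.5 each give Q = 2 on 1 degree of freedom, I² = 0.5, and a between-study variance of 0.25.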

Results

Characteristics of studies

In all, we included 27 studies in this meta-analysis (Supplementary figure 1), with a total of 60,828 lung cancer cases and 109,136 controls (McKay et al. 2008; Wang et al. 2008, 2013; Landi et al. 2009; Rafnar et al. 2009; Zienolddiny et al. 2009; Hsiung et al. 2010; Miki et al. 2010; Truong et al. 2010; Yang et al. 2010; Yoon et al. 2010; Hu et al. 2011; Jaworowska et al. 2011; Pande et al. 2011; Young et al. 2011; Bae et al. 2012; Chen et al. 2012; Ito et al. 2012; Lan et al. 2012; Shiraishi et al. 2012; Jiang et al. 2013; Ke et al. 2013; Li et al. 2013; Lu et al. 2013; Myneni et al. 2013; Zhao et al. 2013). For the rs402710 polymorphism, 16 studies were available, including a total of 31,854 cases and 41,073 controls. For the rs401681 polymorphism, 17 studies involved a total of 35,395 cases and 79,399 controls. The genotype frequencies of both polymorphisms were consistent with HWE in the control populations of all the published studies. Fourteen studies were rated as high quality, and thirteen as median quality. No ‘poor quality’ study was found. Characteristics of the studies included in the current meta-analysis are presented in Table 1.

Table 1 Characteristics of the studies included in the meta-analysis
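
The HWE check applied to the control genotypes can be sketched as follows. This is a minimal illustration of the chi-square test with 1 degree of freedom, not the code used in the analysis; the function name hwe_chi_square is hypothetical, and the sketch assumes that both alleles are observed so that no expected count is zero.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb: observed genotype counts in controls.
    Returns the chi-square statistic and its P value.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

Genotype counts of 100, 200, and 100 match the expected proportions exactly and give a statistic of zero, whereas a sample with no heterozygotes (200, 0, 200) gives a statistic of 400.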

Association of CLPTM1L-rs401681 polymorphism with lung cancer

For lung cancer risk and the rs401681 polymorphism, our meta-analysis gave an overall OR of 1.14 [95 % CI 1.11–1.16, P(Z) < 10⁻⁵; P(Q) = 0.62; Fig. 1] without statistically significant between-study heterogeneity (P > 0.05). Significantly increased lung cancer risks were also found using the dominant [OR 1.20, 95 % CI 1.12–1.30, P(Z) < 10⁻⁵; P(Q) = 0.11] and recessive models [OR 1.28, 95 % CI 1.14–1.44, P(Z) < 10⁻⁵; P(Q) = 0.008]. After adjustment for multiple testing using Bonferroni correction, all significant associations for rs401681 under the different genetic models remained. In meta-regression analysis, ethnicity, sample size, source of controls, and genotyping method did not significantly explain the heterogeneity (P > 0.05). By contrast, histological subtype (P = 0.03) was significantly correlated with the magnitude of the genetic effect. Galbraith plot analysis of all included studies was used to assess potential sources of heterogeneity. One study (Pande et al. 2011) was found to be a contributor to heterogeneity for the rs401681 polymorphism (Supplementary figure 2).
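
The logic of a Galbraith plot can be sketched numerically. Each study is placed at x = 1/SE and y = log OR/SE; the regression line through the origin has a slope equal to the fixed-effect pooled log OR, and studies lying more than about two units from that line are candidate sources of heterogeneity. The sketch below is an illustration only: the function name galbraith_outliers and the default band of 2 are assumptions, and the study labels passed to it are placeholders.

```python
def galbraith_outliers(log_ors, ses, labels, band=2.0):
    """Flag studies lying outside the +/- band region of a Galbraith (radial) plot.

    The vertical distance of a study from the regression line through the
    origin equals (log OR - pooled log OR) / SE.
    """
    x = [1 / se for se in ses]
    y = [lo / se for lo, se in zip(log_ors, ses)]
    # Least-squares slope through the origin = fixed-effect pooled log OR
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    flagged = []
    for label, xi, yi in zip(labels, x, y):
        residual = yi - slope * xi
        if abs(residual) > band:
            flagged.append((label, residual))
    return slope, flagged
```

A flagged study is not necessarily flawed; it is simply one whose estimate differs from the pooled estimate by more than its own precision would suggest.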

Fig. 1 Meta-analysis with a random-effects model for the association between lung cancer risk and CLPTM1L-rs401681 polymorphism

When stratifying by ethnicity, per-allele ORs for the G allele of 1.14 (95 % CI 1.11–1.17, P < 10⁻⁵) and 1.13 (95 % CI 1.09–1.17, P < 10⁻⁵) were obtained among Caucasian and East Asian populations, respectively. Similar results were also detected using dominant and recessive genetic models (Table 2). Subsidiary analyses by sample size yielded a per-allele OR of 1.16 (95 % CI 1.11–1.22, P < 10⁻⁵) for small studies and 1.14 (95 % CI 1.10–1.17, P < 10⁻⁵) for large studies.

Table 2 Meta-analysis of the CLPTM1L-rs401681 polymorphism on lung cancer risk

The association between genotypes and the risk of lung cancer was further examined by stratifying the subjects according to histological type and smoking behavior (Table 3). In subgroup analyses by histological type, we found that the rs401681 polymorphism was significantly associated with lung adenocarcinoma (G allele: OR 1.49, 95 % CI 1.19–1.86, P = 0.001) and squamous cell carcinoma (G allele: OR 1.59, 95 % CI 1.23–2.06, P = 0.001), while a marginally significant association was detected for small cell carcinoma (G allele: OR 1.41, 95 % CI 1.03–1.94, P = 0.03). However, no significant association was detected for large cell carcinoma. We also investigated the relationship between the polymorphism and smoking status. The association was present in all strata, with a per-allele OR of 1.20 (95 % CI 1.08–1.33, P = 0.001) in never smokers, compared with 1.43 (95 % CI 1.05–1.95, P = 0.02) in current smokers and 1.23 (95 % CI 1.10–1.37, P < 10⁻⁴) in former smokers.

Table 3 Per-allele ORs and 95 % CIs for the association between CLPTM1L-rs401681 and lung cancer risk by histological type and smoking behavior

Association of CLPTM1L-rs402710 polymorphism with lung cancer

Overall, there was evidence of an association between increased risk of lung cancer and the rs402710 polymorphism in different genetic models when all the eligible studies were pooled into the meta-analysis. Using a random-effects model, the summary per-allele OR of the rs402710 G variant for lung cancer was 1.15 [95 % CI 1.12–1.19, P(Z) < 10⁻⁵, P(Q) = 0.15; Fig. 2], with corresponding results under dominant and recessive genetic models of 1.18 [95 % CI 1.15–1.24, P(Z) < 10⁻⁵, P(Q) = 0.08] and 1.22 [95 % CI 1.18–1.27, P(Z) < 10⁻⁵, P(Q) = 0.002], respectively. In addition, the associations for rs402710 remained statistically significant after Bonferroni correction for multiple genetic models.
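
The relationship between the per-allele, dominant, and recessive comparisons reported above can be illustrated by collapsing genotype counts into 2 × 2 tables. The sketch below is hypothetical throughout: the function names are illustrative, the genotype counts in any example call are invented, and the Bonferroni threshold assumes that the number of tests equals the number of genetic models examined, which is not stated explicitly in the text.

```python
def model_tables(case_gt, control_gt):
    """Collapse genotype counts into 2x2 tables for three genetic models.

    case_gt, control_gt: counts of (risk homozygotes, heterozygotes,
    non-risk homozygotes). Each table is returned as
    (exposed cases, unexposed cases, exposed controls, unexposed controls).
    """
    rr1, rn1, nn1 = case_gt
    rr0, rn0, nn0 = control_gt
    return {
        # Per-allele model: count chromosomes rather than subjects
        "allele": (2 * rr1 + rn1, 2 * nn1 + rn1, 2 * rr0 + rn0, 2 * nn0 + rn0),
        # Dominant model: carriers of at least one risk allele vs. non-carriers
        "dominant": (rr1 + rn1, nn1, rr0 + rn0, nn0),
        # Recessive model: risk homozygotes vs. all other genotypes
        "recessive": (rr1, rn1 + nn1, rr0, rn0 + nn0),
    }


def odds_ratio(a, b, c, d):
    """Crude odds ratio for a 2x2 table laid out as in model_tables."""
    return (a * d) / (b * c)


def bonferroni_significant(p_values, alpha=0.05):
    """Map each test name to True if its P value passes alpha / number of tests."""
    threshold = alpha / len(p_values)
    return {name: p < threshold for name, p in p_values.items()}
```

With three models, the Bonferroni threshold under this assumption is 0.05/3, so P values below 10⁻⁵ remain significant by a wide margin.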

Fig. 2 Meta-analysis with a random-effects model for the association between lung cancer risk and CLPTM1L-rs402710 polymorphism

When studies were stratified by ethnicity (Table 4), significant risks were found among East Asians in all genetic models [G allele: OR 1.12, 95 % CI 1.07–1.18; dominant model: OR 1.15, 95 % CI 1.07–1.23; recessive model: OR 1.18, 95 % CI 1.10–1.26]. Significant results were also found in the Caucasian populations [G allele: OR 1.17, 95 % CI 1.13–1.20; dominant model: OR 1.20, 95 % CI 1.16–1.25; recessive model: OR 1.23, 95 % CI 1.19–1.28]. By sample size, the OR was 1.15 (95 % CI 1.10–1.19, P < 10⁻⁵) in large studies compared with 1.17 (95 % CI 1.11–1.23, P < 10⁻⁵) in small studies. Galbraith plot analysis identified one study (Shiraishi et al. 2012) as a contributor to heterogeneity for the rs402710 polymorphism (Supplementary figure 3).

Table 4 Meta-analysis of the CLPTM1L-rs402710 polymorphism on lung cancer risk

In the subgroup analyses by histological type, significant associations were found for lung adenocarcinoma, squamous cell carcinoma, and small cell carcinoma. However, we failed to detect any association between the polymorphism and large cell carcinoma (Table 5). In addition, associations of the polymorphism with lung cancer risk were observed in both smokers and never smokers (Table 5). Although the formal test for heterogeneity was not significant, we conducted meta-regression because there were grounds for considering ethnicity, sample size, control source, histological type, and genotyping method as potential sources of heterogeneity. The meta-regression showed that none of these covariates except ethnicity (P = 0.008) contributed significantly to the heterogeneity among the individual study results.

Table 5 Per-allele ORs and 95 % CIs for the association between CLPTM1L-rs402710 polymorphism and lung cancer risk by histological type and smoking behavior

Sensitivity analyses and publication bias

A single study involved in the meta-analysis was deleted each time to reflect the influence of the individual data set on the pooled ORs, and the corresponding pooled ORs were not qualitatively altered (Supplementary figures 4 and 5). Begg’s funnel plot and Egger’s test were performed to assess publication bias. The shape of the funnel plots was symmetrical for both polymorphisms (Supplementary figures 6 and 7). The statistical tests likewise did not indicate publication bias for rs401681 (Begg’s test, P = 0.15; Egger’s test, P = 0.14) or rs402710 (Begg’s test, P = 0.36; Egger’s test, P = 0.22).
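
The leave-one-out procedure described above can be sketched as follows. This is a minimal illustration rather than the Stata routine used for the analysis: the function names pooled_or and leave_one_out are hypothetical, the inputs are assumed to be Python lists, and the pooled estimate is recomputed with a DerSimonian–Laird random-effects model each time one study is removed.

```python
import math


def pooled_or(log_ors, ses):
    """DerSimonian-Laird random-effects pooled odds ratio."""
    w = [1 / se ** 2 for se in ses]  # fixed-effect weights
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(w) - 1)) / c) if c > 0 else 0.0  # between-study variance
    w_re = [1 / (se ** 2 + tau2) for se in ses]  # random-effects weights
    return math.exp(sum(wi * y for wi, y in zip(w_re, log_ors)) / sum(w_re))


def leave_one_out(labels, log_ors, ses):
    """Recompute the pooled OR with each study omitted in turn."""
    results = {}
    for i, label in enumerate(labels):
        remaining_log_ors = log_ors[:i] + log_ors[i + 1:]
        remaining_ses = ses[:i] + ses[i + 1:]
        results[label] = pooled_or(remaining_log_ors, remaining_ses)
    return results
```

If every entry in the returned dictionary stays close to the overall pooled OR, no single study is driving the result.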

Discussion

The association between polymorphisms at 5p15.33 and lung cancer risk was originally reported by McKay et al. (2008). Since then, many lung cancer GWAS and replication studies have been conducted in European populations (Wang et al. 2008; Rafnar et al. 2009; Truong et al. 2010) and, to a lesser extent, in East Asians (Miki et al. 2010; Yoon et al. 2010; Hu et al. 2011). However, there are significant differences in allele frequencies and in the prevalence of lung cancer among different populations. It is, therefore, important to quantitatively assess the effects of the GWAS-identified markers in different ethnic populations and to explore potential heterogeneity of published data. This is the most comprehensive meta-analysis to have examined the two SNPs (rs402710 and rs401681) at 5p15.33 and their relationship with susceptibility to lung cancer. Its strength lies in the accumulation of published data, which gives greater power to detect significant differences. In total, the meta-analysis involved 27 studies that provided 60,828 lung cancer cases and 109,135 controls.

Our results demonstrated that the rs402710 and rs401681 polymorphisms are risk factors for developing lung cancer. In the stratified analysis by ethnicity, significant associations were found in East Asians and Caucasians for these polymorphisms in all genetic models, suggesting a similar role of these variants among ethnic groups with different genetic backgrounds. However, we observed that the association between these polymorphisms and risk of lung cancer was stronger in Caucasians than in East Asians. There are several possible reasons for such differences. First, the frequencies of the risk-associated alleles of these polymorphisms vary between populations. For example, the frequency of the G allele of rs401681 differs between Caucasians and East Asians, with a prevalence of 56 and 70 %, respectively (McKay et al. 2008; Wang et al. 2008; Hu et al. 2011; Jaworowska et al. 2011). Second, study design, small sample size, or environmental factors may affect the results. Most of these studies did not consider the most important environmental factors. It is possible that variation at this locus has modest effects on lung cancer, but environmental factors may predominate in the progress of lung cancer and mask the effects of this variation. Specific environmental factors such as lifestyle and smoking have already been well studied in recent decades (Matakidou et al. 2005). In addition, different populations usually have different linkage disequilibrium patterns. A polymorphism may be in close linkage with a nearby causal variant in one ethnic population but not in another, or with different nearby causal variants in different populations.

Stratification of tumors by histological type indicated that the association of the 2 SNPs on 5p15.33 with lung cancer risk appeared to be similar for various histological types of lung cancer. In the stratified analysis by smoking behavior, significant associations were found among never smokers, current smokers, and former smokers for the two polymorphisms. These results were in accordance with those reported by previous genome-wide association studies in a white population. McKay et al. (2008) reported an odds ratio of 1.18 for rs402710, whereas Wang et al. (2008) reported an odds ratio of 1.14 for rs401681 (in strong linkage disequilibrium with rs402710). Neither of these studies reported heterogeneity by histology, smoking status, age at diagnosis, or sex. The magnitudes of these associations are consistent with our findings.

The two SNPs rs401681 and rs402710 lie close to each other in the CLPTM1L gene and are in strong linkage disequilibrium. CLPTM1L, named for its similarity to a gene implicated in susceptibility to cleft lip and palate, was identified through screening for cisplatin (CDDP) resistance-related genes and was found to be upregulated in CDDP-resistant ovarian tumor cell lines and to induce apoptosis in CDDP-sensitive cells (Yamamoto et al. 2001). The CLPTM1L gene is well conserved and expressed in various tissues, including lung tissue.

Survival of cells undergoing DNA damage from genomic instability or genotoxic stress leads to the accumulation of genetic lesions characterizing human tumors. Abrogation of cell-cycle checkpoint and apoptotic safeguards against replication of damaged DNA is a hallmark of cancer. The result of protection against DNA damage-induced apoptosis includes both vulnerability to unchecked somatic mutation and decreased sensitivity to radiotherapy and chemotherapy. The study by Zienolddiny et al. (2009) suggests that CLPTM1L polymorphisms may affect DNA damage accumulation. Thus, it is plausible that protection from apoptosis by CLPTM1L contributes to accumulation of DNA damage, thereby conferring susceptibility to tumorigenesis. An alternative hypothesis is that CLPTM1L plays a role in recognition or repair of DNA damage, affecting the accumulation of such damage. Another is that CLPTM1L directly influences the amount of DNA damage that is incurred by genotoxic agents. Recently, James et al. (2012) suggested that CLPTM1L expression does not directly affect levels of acute DNA damage incurred by cisplatin or the nitrosamine 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK). Moreover, the same study demonstrated that CLPTM1L has an apoptotic role downstream of DNA damage and through regulation of Bcl-xL expression, which is likely to affect accumulation of DNA damage. The observed difference in accumulation of Bcl-xL upon modulation of CLPTM1L expression, along with reconstitution of an apoptosis-resistant phenotype with expression of exogenous Bcl-xL, provides evidence that CLPTM1L acts upstream of Bcl-xL to confer resistance to genotoxic stress-induced apoptosis.

This meta-analysis has several limitations. First, it is based on unadjusted estimates, whereas a more precise analysis could be performed if individual-level data were available, which would allow adjusted estimates to be made. However, this approach requires the authors of all of the published studies to share their data. Second, the subgroup meta-analyses considering interactions between the 2 SNPs and histological type, as well as smoking behavior, were performed on only a fraction of all the data that could potentially be pooled, so selection bias may have occurred and our results may be inflated. Nevertheless, the total number of subjects included in this part of the analysis comprises the largest sample size so far. Finally, only published studies were included in this meta-analysis. Therefore, publication bias may have occurred, even though the statistical tests did not show it.

In conclusion, our results suggest that the rs402710 and rs401681 polymorphisms at 5p15.33 may contribute to increased risk of lung cancer. However, additional large studies are needed to validate our findings. Moreover, gene–gene and gene–environment interactions should also be considered in future studies.