Introduction

Lung cancer (LC) is a major cause of cancer death worldwide with >1 million deaths each year (Jemal et al. 2011). Although >80 % of the population attributable risk of lung cancer can be ascribed to tobacco smoking, several lines of evidence indicate that inherited genetic factors influence the development and progression of lung cancer; in particular, epidemiologic studies have consistently shown an elevated risk of lung cancer in relatives of lung cancer cases after adjustment for smoking (Tokuhata and Lilienfeld 1963). Lung cancer is classified into two main histologic groups: small cell lung cancer (SC) and non-small cell lung cancer; the latter includes adenocarcinoma (AD) and squamous cell carcinoma (SQ), along with rarer subtypes (Sun et al. 2007). Each type has different pathophysiological and clinical features, suggesting that their mechanisms of carcinogenesis differ (Daigo and Nakamura 2008). Despite much investigation, only a few genes identified through the candidate gene approach have been confirmed to be associated with LC (e.g., CYP1A1, XRCC1, GSTs, and TP53) (Ye et al. 2006; Chen et al. 2011; Li et al. 2013; Zhou et al. 2013). However, because the pathogenesis of LC is yet to be elucidated completely, the candidate gene approach is limited in power to detect novel disease-susceptibility genes.

Genome-wide association studies (GWAS) of lung cancer with a full range of histological types have been conducted in European populations, and associations at 15q25.1 (CHRNA5-CHRNA3-CHRNB4), 5p15.33 (TERT-CLPTM1L) and 6p21.33 (BAT3-MSH5) have been identified (Amos et al. 2008; Hung et al. 2008; McKay et al. 2008; Wang et al. 2008; Broderick et al. 2009). Recently, a GWAS conducted among subjects from Japan and South Korea replicated the association between lung adenocarcinoma and the CLPTM1L-TERT locus in East Asians and identified a novel association between lung adenocarcinoma and the TP63 locus on 3q28 (Miki et al. 2010). However, the association was less clear, as it did not achieve genome-wide significance (P ≤ 10−7) when restricted to never-smoking males (P = 0.012) or never-smoking females (P = 3.5 × 10−6). Subsequently, considerable effort has been devoted to exploring the strength of this association with lung cancer risk in the absence of tobacco use among various ethnic populations, but a proportion of these studies have produced inconsistent results. These disparate findings may be due partly to insufficient power, phenotypic heterogeneity, population stratification, the small effect of the polymorphism on LC risk, and publication bias. We therefore performed a meta-analysis of the published studies to clarify this inconsistency and to establish a comprehensive picture of the relationship between two common variants (rs10937405 and rs4488809) on chromosome 3q28 and lung cancer.

Materials and methods

Identification and eligibility of relevant studies

Electronic databases (PubMed, EMBASE, ISI Web of Science, EBSCO, and the Cochrane Library) were searched up to Jan 2014 for all genetic association studies evaluating polymorphisms at 3q28 and lung cancer in humans, in all languages. The search strategy contained both medical subject heading terms and text words as follows: “3q28” or “tp63” or “rs10937405” or “rs4488809”, in combination with “lung cancer” or “lung carcinoma” or “lung neoplasm”, and combined with “genetic” or “polymorphism(s)” or “variation(s)” or “genotype” or “gene(s)”. Articles were first selected on the basis of the abstract, before the full text was examined. In addition, the reference lists of selected articles were hand-searched to identify additional relevant reports. Case reports, case-only studies, editorials, and review articles were excluded. Articles in languages other than English were translated. Included studies had to meet the following criteria: (1) case–control or cohort studies evaluating the association between polymorphisms at 3q28 and LC risk, (2) lung cancer diagnosis confirmed histologically or pathologically, (3) original papers containing independent data published in a peer-reviewed journal, (4) genotype distribution information in cases and controls, or odds ratios (ORs) with their 95 % confidence intervals (CIs) and P values. If multiple published reports from the same study population were available, we included only the one with the largest sample size and the most detailed information. Studies with different ethnic groups were considered as individual studies for our analyses.

Quality assessment and data extraction

Each article was read and assessed according to the score scale for randomized controlled association study proposed by Li and He (2008). In brief, papers were rated according to several items on the scale in relation to two areas: experiment design to minimize potential bias and data analysis. The quality score categorizes studies as of “high”, “median” or “low” quality.

Data extraction was performed independently by two reviewers. Review reports from the two were then compared to identify any inconsistency, and differences were resolved by further discussion among all authors. For each included study, the following information was extracted from each report according to a fixed protocol: first author, year of publication, ethnicity, identification of cancer cases, age, sex, smoking status (never-smoker and smoker), histological types (adenocarcinoma, squamous cell carcinoma, and small cell carcinoma), study design (GWAS or candidate gene study), source of control groups (population-based controls and hospital-based controls), Hardy–Weinberg equilibrium (HWE) status among controls, number of cases and controls, genotype frequency, and genotyping methods. Where essential information was not presented in articles, every effort was made to contact the authors.

Statistical analysis

We first assessed HWE in the controls of each study using a goodness-of-fit test (Chi-square or Fisher’s exact test), and P < 0.05 was considered to indicate significant disequilibrium. The strength of the association between LC and common variations at 3q28 was estimated using ORs with the corresponding 95 % CIs. The meta-analysis examined the association between each polymorphism and the risk of LC under three contrasts: (1) allele contrast, (2) heterozygote versus wild-type homozygote, and (3) variant homozygote versus wild-type homozygote (Pan et al. 2013). Cochran’s Chi-square-based Q statistic test was performed to assess possible heterogeneity between the individual studies, and thus to ensure that each group of studies was suitable for meta-analysis (Cochran et al. 1954). Both fixed-effects (Mantel–Haenszel method) (Mantel and Haenszel 1959) and random-effects (DerSimonian–Laird method) (DerSimonian and Laird 1986) models were used to calculate the pooled ORs. Owing to a priori assumptions about the likelihood of heterogeneity between primary studies, the random-effects model, which is usually more conservative, was reported in the text. Subsidiary analyses included subgroup analyses and random-effects meta-regression with restricted maximum likelihood (Thompson and Sharp 1999). Ethnicity (East Asian vs. Caucasian), study design (GWAS vs. candidate gene studies), source of controls (population-based vs. hospital-based studies), and sample size (≥1,000 cases vs. <1,000 cases) were pre-specified as characteristics for the assessment of heterogeneity. Ethnicity, sample size, study design, source of controls, histological types of LC, smoking behavior, and sex distribution in cases and controls were analyzed as covariates in meta-regression. The Z test was used to determine the significance of the pooled OR (Tang et al. 2010). Sensitivity analyses were performed to assess the stability of the results: a single study in the meta-analysis was deleted each time to reflect the influence of the individual dataset on the overall OR.
Potential publication bias was assessed using Egger’s linear regression test and by visual inspection of the funnel plot (Egger et al. 1997). If publication bias existed, the Duval and Tweedie nonparametric “trim and fill” method was used to adjust for it (Taylor and Tweedie 1998). All P values were two-sided, and P < 0.05 was considered statistically significant. All statistical tests in this meta-analysis were performed with STATA version 10.0 (Stata Corporation, College Station, TX) and SAS (version 9.1; SAS Institute, Cary, NC, USA).
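The random-effects pooling described above can be illustrated with a minimal Python sketch. It is not the STATA or SAS code used for the analysis. It assumes each study contributes a log OR and its standard error, uses inverse-variance weights (rather than Mantel–Haenszel weights) for the fixed-effect step, and the function name and input numbers are hypothetical.

```python
import math


def pool_random_effects(log_ors, ses):
    """DerSimonian-Laird random-effects pooling of per-study log odds ratios.

    log_ors: per-study log(OR); ses: matching standard errors.
    Needs at least two studies. Returns a dict with Q, tau^2, pooled OR,
    95 % CI, Z statistic and two-sided P value.
    """
    k = len(log_ors)
    # Inverse-variance weights for the fixed-effect estimate (assumption:
    # this replaces the Mantel-Haenszel weights named in the text).
    w = [1.0 / se ** 2 for se in ses]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))
    # Method-of-moments estimate of the between-study variance tau^2.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_ors)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    z = pooled / se_pooled
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P value
    return {
        "q": q,
        "tau2": tau2,
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se_pooled),
        "ci_high": math.exp(pooled + 1.96 * se_pooled),
        "z": z,
        "p": p_value,
    }


if __name__ == "__main__":
    # Hypothetical inputs for illustration only, not data from the included studies.
    result = pool_random_effects([0.10, 0.25, 0.18], [0.05, 0.08, 0.06])
    print(result)
```

When Q does not exceed its degrees of freedom, the estimated between-study variance is zero and the random-effects result coincides with the inverse-variance fixed-effect result.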

Results

Characteristics of included studies

The combined search yielded 91 references. In all, we included 10 studies in this meta-analysis (Table 1; Supplementary Fig. 1), with a total of 36,221 cases and 58,108 controls (Miki et al. 2010; Hu et al. 2011; Wang et al. 2011; Hosgood et al. 2012; Lan et al. 2012; Shiraishi et al. 2012; Timofeeva et al. 2012; Yin et al. 2013; Hu et al. 2014). The main study characteristics were summarized in Table 1. For the rs10937405 polymorphism, 24 data sets from 8 studies involved a total of 31,369 cases and 52,317 controls. For the rs4488809 polymorphism, 20 data sets from 9 studies involved a total of 21,511 cases and 28,014 controls. These two polymorphisms were found to occur in frequencies consistent with Hardy–Weinberg equilibrium in the control populations of all included studies. Of the cases, 70 % were East Asian, and 30 % were Caucasian. The studies finally included were of median-to-high quality and included no “poor quality” study.
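As an illustration of the HWE check applied to control genotypes, the following is a minimal Python sketch of the Chi-square goodness-of-fit test with one degree of freedom. The function name and the genotype counts are hypothetical. The Fisher's exact alternative for small counts is not shown.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Arguments are observed genotype counts (AA, AB, BB) in controls.
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical control genotype counts, for illustration only.
    chi2, p_value = hwe_chi_square(480, 420, 100)
    print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A P value below 0.05 would be read as significant departure from equilibrium, in line with the threshold stated in the statistical analysis section.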

Table 1 Studies investigating the association between common variations on 3q28 and lung cancer

rs10937405 polymorphism and lung cancer

There was wide variation in the frequency of the risk C allele of the rs10937405 polymorphism among controls across different ethnicities, ranging from 0.55 to 0.72 (Supplementary Fig. 2). For East Asian controls, the C allele frequency was 0.70 (95 % CI 0.67–0.72), which was higher than that in Caucasian controls (0.57; 95 % CI 0.55–0.59), indicating a significant difference between East Asians and Caucasians (P < 10−5).

Significant associations were found in the pooled analysis between the rs10937405 polymorphism and increased risk of LC (C allele OR = 1.19, 95 % CI = 1.14–1.25, P < 10−5; heterozygous OR = 1.28, 95 % CI = 1.18–1.40, P < 10−5; homozygote OR = 1.25, 95 % CI = 1.19–1.31, P < 10−5) (Fig. 1). Significant heterogeneity was present among the included studies of the 3q28-rs10937405 polymorphism (P < 0.05). In the stratified analysis by ethnicity, significant associations were detected among East Asians in all genetic models (C allele OR = 1.19, 95 % CI = 1.14–1.25; heterozygous OR = 1.32, 95 % CI = 1.19–1.46; homozygote OR = 1.27, 95 % CI = 1.21–1.34). However, we did not detect an association with LC risk among Caucasians in any genetic model (Table 2). When stratifying by sample size, the risk allele yielded an OR of 1.27 (95 % CI 1.18–1.36, P < 10−5) among small studies. Analysis restricted to the 11 studies with at least 1,000 cases, which should be less prone to selective publication than smaller studies, yielded an OR of 1.16 (95 % CI 1.10–1.22, P < 10−5). In the subgroup analyses by control source, the per-allele OR of the C variant was 1.16 (95 % CI 1.08–1.25, P < 10−5) for population-based studies and 1.21 (95 % CI 1.16–1.27, P < 10−5) for hospital-based studies. When further stratified by study design, significant results were found for both GWAS and candidate gene studies in all genetic models (Table 2). After adjusting for multiple testing using Bonferroni correction, all significant associations for rs10937405 under the different genetic models remained.

Fig. 1 Per-allele ORs and 95 % CIs for the association between 3q28-rs10937405 and lung cancer risk stratified by ethnicity

Table 2 Meta-analysis of the 3q28-rs10937405 polymorphism on lung cancer risk

We further performed analyses to test for differences in the associations of the polymorphism with lung cancer risk with respect to different clinical factors (Table 3). Given the biological differences between the histological forms of LC, we examined the association between the polymorphism and risk by histological subtype. A subgroup analysis by histology revealed strong heterogeneity (P < 10−5), with the strongest association for adenocarcinoma (OR = 1.26, 95 % CI = 1.21–1.30, P < 10−5). rs10937405 also showed an association with squamous cell carcinoma (OR = 1.14, 95 % CI = 1.06–1.22, P < 10−4). The association between the polymorphism and the risk of LC was further examined by stratifying the subjects according to smoking behavior. Never-smokers carrying the C allele had an increased LC risk similar to that of smokers, with ORs of 1.27 (95 % CI 1.21–1.33, P < 10−5) and 1.26 (95 % CI 1.18–1.34, P < 10−5), respectively. We next analyzed the effect of rs10937405 according to sex; this SNP tended to have similar ORs in females and males (Table 3).

Table 3 Subgroup analysis of 3q28-rs10937405 polymorphism and lung cancer risk stratified by histologic type, smoking behavior, and sex

As the formal test for heterogeneity may not be powerful enough, we conducted meta-regression as there were also grounds for considering the ethnicity, sample size, histological subtype, study design, and source of controls as potential sources of heterogeneity. The meta-regression showed that none of these covariates significantly contributed to the heterogeneity among the individual study results except for ethnicity (P = 0.001) and histological subtype (P = 0.02). Galbraith plot analyses of all included studies were used to assess the potential sources of heterogeneity. The study by Timofeeva et al. (2012), the study by Shiraishi et al. (2012) and one dataset from the study by Miki et al. (2010) were found to be the contributors of heterogeneity for rs10937405 polymorphism (Supplementary Fig. 3).

Sensitivity analysis was performed by excluding one study at a time (Supplementary Fig. 4). The results confirmed the significant association between the rs10937405 polymorphism and the risk of LC, with pooled ORs ranging from 1.18 (95 % CI 1.13–1.23) to 1.20 (95 % CI 1.14–1.26). A funnel plot of the included studies suggested possible preferential publication of positive findings in smaller studies for rs10937405 (Egger test, P = 0.04, Supplementary Fig. 5). The Duval and Tweedie nonparametric “trim and fill” method was therefore used to adjust for publication bias. Meta-analysis with the “trim and fill” method did not change the conclusion (OR = 1.14, 95 % CI = 1.09–1.19, P < 10−5; Supplementary Fig. 6), indicating that our results were statistically robust.
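The leave-one-out procedure can be sketched in Python as follows. This is an illustrative sketch only: it assumes per-study log ORs and standard errors as inputs, re-pools the remaining studies with a DerSimonian–Laird random-effects estimate after each omission, and uses hypothetical function names, study labels and numbers.

```python
import math


def dl_pooled_or(log_ors, ses):
    """Return (OR, CI low, CI high) from DerSimonian-Laird pooling.

    Requires at least one study; with a single study tau^2 is set to 0.
    """
    w = [1.0 / s ** 2 for s in ses]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(w) - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (s ** 2 + tau2) for s in ses]
    est = sum(wi * yi for wi, yi in zip(w_star, log_ors)) / sum(w_star)
    half_width = 1.96 * math.sqrt(1.0 / sum(w_star))
    return math.exp(est), math.exp(est - half_width), math.exp(est + half_width)


def leave_one_out(labels, log_ors, ses):
    """Re-pool the data k times, omitting one study each time.

    Returns a dict mapping the omitted study's label to the pooled
    (OR, CI low, CI high) of the remaining studies. Needs k >= 2.
    """
    results = {}
    for i, label in enumerate(labels):
        remaining_y = log_ors[:i] + log_ors[i + 1:]
        remaining_se = ses[:i] + ses[i + 1:]
        results[label] = dl_pooled_or(remaining_y, remaining_se)
    return results


if __name__ == "__main__":
    # Hypothetical study labels and effect sizes, for illustration only.
    out = leave_one_out(["S1", "S2", "S3", "S4"],
                        [0.15, 0.22, 0.05, 0.19],
                        [0.05, 0.07, 0.06, 0.04])
    for omitted, (or_, lo, hi) in out.items():
        print(f"omit {omitted}: OR = {or_:.2f} ({lo:.2f}-{hi:.2f})")
```

The spread of the resulting pooled ORs gives the range reported for a sensitivity analysis of this kind. A narrow range suggests that no single dataset drives the overall estimate.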

rs4488809 polymorphism and lung cancer

In the overall analysis, the risk C allele of rs4488809 was significantly associated with elevated LC risk (OR = 1.17, 95 % CI = 1.10–1.23, P < 10−5; Fig. 2). Significant associations were also found for the heterozygous (OR = 1.21, 95 % CI = 1.09–1.35, P < 10−5) and homozygous (OR = 1.22, 95 % CI = 1.12–1.33, P < 10−5) genotypes when compared with the wild genotype. In view of the significant heterogeneity, and to seek its potential sources, we performed a panel of subgroup analyses by ethnicity, sample size, study design and source of controls (Table 4). Significant associations were found in East Asians (OR = 1.18, 95 % CI = 1.12–1.25, P < 10−5), while no significant associations were observed in Caucasians. When the analyses were performed by sample size, increased risk of LC was found only in larger studies in all genetic models (Table 4). Subgroup analysis showed a significant association in the studies using hospital-based controls, with a pooled OR of 1.23 (95 % CI 1.19–1.27, P < 10−5), as compared with population-based studies (OR = 1.12, 95 % CI = 1.00–1.24, P = 0.045). Subsidiary analyses of study design yielded a per-allele OR of 1.12 (95 % CI 0.94–1.35) for GWAS and 1.17 (95 % CI 1.11–1.25) for candidate gene studies. In addition, for rs4488809, the associations remained statistically significant after Bonferroni correction for multiple genetic models. Meta-regression was used to explore the cause of heterogeneity, and it was found that ethnicity (P = 0.05), study design (P = 0.07) and source of controls (P = 0.10) did not significantly correlate with the magnitude of the genetic effect. However, a slight effect on heterogeneity was found for ethnicity (P = 0.04) and histological subtype (P = 0.007). One dataset from the study by Hu et al. (2011), one dataset from the study by Hosgood et al. (2012) and the study by Timofeeva et al. (2012) were found to be contributors to heterogeneity for the rs4488809 polymorphism (Supplementary Fig. 7).

Fig. 2 Per-allele ORs and 95 % CIs for the association between 3q28-rs4488809 and lung cancer risk stratified by ethnicity

Table 4 Meta-analysis of the 3q28-rs4488809 polymorphism on lung cancer risk

Considering histological types, the overall per-allele OR of the C variant for lung adenocarcinoma was 1.19 (95 % CI 1.10–1.29, P < 10−5), with corresponding results for squamous cell carcinoma and small cell carcinoma of 1.08 (95 % CI 0.87–1.34, P = 0.36) and 1.05 (95 % CI 0.95–1.17, P = 0.46), respectively. Because tobacco smoking is the major risk factor for lung cancer, we further tested for differences in the association of the polymorphism with lung cancer risk according to smoking behavior. Among never-smokers, there was a significant association between rs4488809 and the risk of lung cancer (OR = 1.14, 95 % CI = 1.04–1.25, P = 0.003). Among smokers, this SNP tended to have a higher OR of 1.21 (95 % CI 1.14–1.29, P < 10−5). In the subgroup analyses by sex, we observed a sex difference for rs4488809 and LC risk, with a stronger association in males than in females (Table 5).

Table 5 Subgroup analysis of 3q28-rs4488809 polymorphism and lung cancer risk stratified by histologic type, smoking behavior, and sex

For the rs4488809 polymorphism, sensitivity analysis indicated that no single study influenced the pooled OR qualitatively, suggesting that the results of this meta-analysis are stable (Supplementary Fig. 8). Publication bias was evaluated with asymmetry tests. The shape of the funnel plot was symmetrical for this polymorphism (Supplementary Fig. 9), and the Egger test provided no evidence of publication bias among the studies included for rs4488809 (Egger’s test, P = 0.21).
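For readers unfamiliar with the asymmetry test, Egger's regression can be sketched in Python as below. The sketch regresses each study's standard normal deviate (log OR divided by its standard error) on its precision (the inverse of the standard error) by ordinary least squares; an intercept far from zero suggests funnel plot asymmetry. The function name and inputs are hypothetical. Only the t statistic and its degrees of freedom are returned, since converting them to a P value needs a t distribution (for example `scipy.stats.t.sf`), which is left out to keep the sketch dependency-free.

```python
import math


def egger_regression(log_ors, ses):
    """Egger's linear regression test for funnel plot asymmetry.

    Needs at least three studies with differing standard errors.
    Returns a dict with the intercept (bias estimate), slope, the
    intercept's standard error, its t statistic and degrees of freedom.
    """
    x = [1.0 / s for s in ses]                   # precision
    y = [b / s for b, s in zip(log_ors, ses)]    # standard normal deviate
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": se_intercept,
        "t": t_stat,
        "df": n - 2,
    }


if __name__ == "__main__":
    # Hypothetical effect sizes and standard errors, for illustration only.
    print(egger_regression([0.30, 0.22, 0.15, 0.12], [0.20, 0.12, 0.08, 0.05]))
```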

Discussion

Multiple lines of evidence support an important role for genetics in determining risk for LC, and association studies are appropriate for searching susceptibility genes involved in LC (Risch and Merikangas 1996). Nevertheless, small sample sized association studies lack statistical power and have resulted in apparently contradicting findings (Lohmueller et al. 2003). Via a comprehensive meta-analysis, we evaluated the association of two common polymorphisms on 3q28 with the risk of LC. Overall results demonstrated that rs10937405-C allele and rs4488809-C allele might be risk-conferring factors for the development of LC in East Asians, but not in Caucasians. Although potential sources of heterogeneity could not be easily eliminated, the present study, to our knowledge, is the first meta-analysis which involved a total of 36,221 cases and 58,108 controls from 10 studies to date dealing with the association of these two polymorphisms with LC susceptibility.

Genetic heterogeneity is inevitable in any disease identification strategy (Hemminki et al. 2006). We identified ethnicity as a potential source of between-study heterogeneity by subgroup analysis and meta-regression. In the subgroup analysis by ethnicity, significant associations were found in East Asians but not in Caucasians. Several points should be considered in interpreting these inconsistent results. First, ethnic differences may contribute to the different results, since the distributions of these polymorphisms differ between ethnic populations. For instance, the frequency of the risk C allele of rs10937405 ranges from 56 % in Whites (Wang et al. 2011) to 72 % in Chinese populations (Hosgood et al. 2012; Yin et al. 2013). Second, the conflicting association could also be explained by study design or small sample size. This is particularly true for rs4488809, because only one study conducted among Caucasians was included in the present meta-analysis, and it may have had insufficient statistical power to detect a slight effect or a different linkage disequilibrium (LD) pattern of the polymorphism among Caucasians. Furthermore, it is possible that variation at this locus has only modest effects on LC and that combinations of multiple genes and environmental factors ultimately lead to the disease. Such a modest effect might go unobserved where environmental factors predominate in the development of LC, such as air pollution (Zhao et al. 2006) and smoking, which have been well studied worldwide in recent years (Hecht 2002; Vineis et al. 2004). Moreover, clinical heterogeneity in age, sex ratio, diet, years from onset, and disease severity may also explain the discrepancy. Therefore, well-designed studies in different ethnic populations, focused on other loci in LD with these variations, are needed to further validate ethnic differences in their effects on LC.

Stratification of tumors by histological subtype indicated that rs10937405 and rs4488809 confer risk preferentially for lung adenocarcinoma, thereby confirming the recent observation made by Miki and colleagues in an analysis of East Asian populations (Miki et al. 2010). While it will be challenging to identify the precise mechanism by which 3q28 variation affects lung adenocarcinoma development, accumulation of DNA damage and lack of response to genotoxic stress are recognized to contribute to lung carcinogenesis. rs10937405 and rs4488809 are located in the first intron of TP63 at chromosome 3q28. TP63 is a member of the tumor suppressor TP53 gene family, which transcriptionally regulates genes involved in DNA repair (Flores 2007; Lin et al. 2009), and it is important for normal development and differentiation of stratified epithelial tissues as well as for human carcinogenesis (Tomkova et al. 2008; Vousden and Prives 2009). Further, p63 has been found to play an important role in cancer development and progression through its interaction with mutant p53 (Melino 2011). Exposure of cells to DNA damage leads to induction of TP63, and both isoforms have the ability to transactivate TP53 target genes, hence affecting cellular responsiveness to DNA damage (Katoh et al. 2000; Petitjean et al. 2008). TP63 is expressed mainly in two isoforms, the TA and N-terminal-truncated (ΔN) forms. The TAp63 isoforms are transcribed using a promoter located upstream of exon 1 of the gene, whereas expression of the ΔNp63 isoforms is regulated by a promoter within intron 3 of TP63 (Moll and Slade 2004). Miki et al. showed that rs10937405 and rs4488809 appear to define a single risk haplotype to which a functional variant maps (Miki et al. 2010). Although the association tagged by this haplotype may reflect a single risk variant, this does not preclude the possibility that the haplotype captures multiple functional risk alleles.
Although elucidating a functional basis for the SNP associations will be contingent on fine mapping, it is entirely plausible that they affect TP63 expression either directly or through LD.

Cigarette smoking directly causes lung cancer development by several mechanisms (Hecht 1999; Zhou et al. 2006; Jorgensen et al. 2010), including inactivation of tumor suppressor genes (Liu et al. 2005; Lee et al. 2008), induction of oxidative stress and DNA damage, and activation of signaling pathways that underlie apoptosis and autophagy (Maiuri et al. 2009; Essick and Sam 2010). Furthermore, ΔNp63α and tobacco smoking have a synergetic effect on carcinogenesis (Ratovitski 2010). Therefore, it is difficult to determine whether these loci are associated with lung cancer risk, tobacco use, or perhaps both.

An important source of bias in every meta-analysis is publication bias, because the likelihood of publishing a study could be related to the results of that study. However, our meta-analysis included several published studies with negative findings (Timofeeva et al. 2012; Yin et al. 2013). Although the funnel plot for East Asians is not symmetric, the overall results of both ethnic groups are concordant, suggesting that this bias is unlikely to affect the final result. On the other hand, funnel plot asymmetry is not always caused by publication bias. True heterogeneity may also lead to funnel plot asymmetry. For example, a significant difference may be seen only in high-risk individuals, and these high-risk people are usually more likely to be included in small studies. This is particularly relevant to our meta-analysis, because all the significant associations in East Asians were observed among studies from high-risk populations. Language bias or citation bias could also be an important source of bias in this group of studies, meaning that studies without significant findings are preferentially published in languages other than English and are less likely to be cited in other articles. Finally, it is possible that an asymmetrical funnel plot arises simply by chance.

In summary, our meta-analysis showed that rs10937405 and rs4488809 at 3q28 might be risk-conferring factors for the development of non-small cell LC in East Asians, but not in Caucasians. As studies among Caucasian and African populations are currently limited, further studies including a wider spectrum of subjects will be needed to investigate the role of these variants in different populations.