Introduction

Cleft lip with/without cleft palate (CL/P) is a common and etiologically heterogeneous birth defect where both genes and environmental risk factors are known to influence risk (Dixon et al. 2011; Marazita 2012). CL/P includes cleft lip (CL) and cleft lip and palate (CLP) which share some epidemiologic features, such as a gender ratio skewed toward males and substantial differences in birth prevalence rates across racial groups (with rates generally higher in Asian populations compared to European populations, and lowest in populations of African ancestry). CL/P can occur as a phenotypic feature of several hundred malformation syndromes, including some Mendelian syndromes for which etiologic genes have been identified (e.g., Van der Woude syndrome; see Dixon et al. 2011 for a review). About 70 % of CL/P cases are classified as isolated and nonsyndromic, and it is becoming increasingly clear that multiple genes determine risk to CL/P.

Several genome-wide association studies (GWAS) have been conducted using both case–control (Birnbaum et al. 2009; Grant et al. 2009; Mangold et al. 2010) and case–parent trio study designs (Beaty et al. 2010), and different genes or specific chromosomal regions have shown strong evidence of harboring causal genes for CL/P. As a follow-up to our case–parent trio GWAS from an international consortium (Beaty et al. 2010), we conducted an independent replication study on 1,108 case–parent trios. We genotyped a custom panel of 1,269 tagging single nucleotide polymorphism (SNP) markers in 33 selected genes/regions showing some evidence of linkage and association.

Methods

Samples

A total of 1,108 case–parent trios were drawn from several ongoing studies of oral clefts (see Table 1), and these trios included parents of various European ancestries (German and Turkish) plus trios from the multi-site EuroCran study. In addition, case–parent trios from the Philippines were included, representing parents of Southeast Asian ancestry. Each CL/P case was examined by a health care provider, and cases with other congenital anomalies, recognized syndromic forms of CL/P or major developmental delay were excluded. Informed consent was obtained from parents of minor children and from all affected individuals able to give informed consent directly. DNA was obtained from whole blood, saliva or a mouthwash sample. Questionnaire data on maternal smoking during the first trimester of pregnancy were collected by personal interview of the mother.

Table 1 Complete CL/P case–parent trios in replication study by recruitment source

Genotyping

A custom SNP panel was developed to include tagging SNPs in and around genes/regions identified in the GWAS as primary or secondary “hits”, plus genes showing evidence in one of the two major racial sub-groups (European and Asian) represented in Beaty et al. (2010). Additional candidate genes were considered based on prior evidence in the literature, and 33 genes were selected for genotyping using the Illumina GoldenGate platform. A custom panel of 1,536 SNPs was selected based on high Illumina design scores, reported minor allele frequencies (MAF) in European and Asian populations, and their potential for tagging the 33 genes/regions of interest. Among these, 267 SNPs had poor genotype quality and were dropped, leaving 1,269 SNPs. Additional quality control (QC) criteria were used to flag SNPs: (1) four SNPs were flagged for missing genotype calls >1 %; (2) SNPs were flagged for MAF <0.005 in European or Filipino parents separately (54 SNPs in Filipino parents and 1 SNP in European parents; the one SNP flagged in Europeans had a low MAF in both populations); (3) SNPs were flagged for deviation from Hardy–Weinberg equilibrium (HWE) in the European and Filipino parents separately at p < \( 1 \times 10^{-5} \) (3 SNPs in Europeans and 1 SNP in Filipinos; again, this single SNP showed deviation from HWE in both groups). Supplemental Table 1 lists all SNPs in or near the 33 genes/regions examined here, with the physical distance (on build 36) spanned by markers and the number of SNPs typed and not flagged in each gene.
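The three flagging criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, genotype counts are taken from parents in one ancestry group, and a 1 df chi-square goodness-of-fit test stands in for the HWE test (the text does not say which HWE test was actually used).

```python
from math import erfc, sqrt

def maf(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return min(p, 1 - p)

def hwe_chisq_p(n_aa, n_ab, n_bb):
    """p value of a 1 df chi-square test for deviation from HWE.

    Assumption: a chi-square goodness-of-fit test is used here for
    illustration; the study may have used a different (e.g. exact) test.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chisq = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variate with 1 df.
    return erfc(sqrt(chisq / 2))

def qc_flags(n_aa, n_ab, n_bb, n_missing,
             miss_max=0.01, maf_min=0.005, hwe_p_min=1e-5):
    """Return the list of QC flags raised for one SNP in one parent group."""
    n_called = n_aa + n_ab + n_bb
    flags = []
    if n_missing / (n_called + n_missing) > miss_max:
        flags.append("missing")
    if maf(n_aa, n_ab, n_bb) < maf_min:
        flags.append("maf")
    if hwe_chisq_p(n_aa, n_ab, n_bb) < hwe_p_min:
        flags.append("hwe")
    return flags
```

In the study these checks were applied to European and Filipino parents separately, so a sketch like this would be called once per SNP per group.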

Checks for biological relationships among family members (parent–offspring and parent–parent) were run, and SNPs showing excess Mendelian errors were excluded. Three trios were discovered to have sample switches between a parent and child, and correcting the pedigree structure resolved inconsistencies. Seven trios were dropped entirely and 17 individual parents were dropped resulting in incomplete trios, which cannot contribute to the analysis.

Statistical methods

The genotypic transmission disequilibrium test (gTDT) was used to test for evidence of linkage and association (i.e., linkage disequilibrium between the observed marker allele and an unobserved causal gene). This gTDT can be formulated as a conditional logistic regression model where the odds ratio of being the observed case (as opposed to the three ‘pseudo-controls’ possible from any given parental mating type) becomes a function of marker genotype. Within the ith trio, this model is written as:

$$ \ln\left\{ \frac{P(i\text{-th case})}{1 - P(i\text{-th case})} \right\} = \beta_{0i} + \beta_{\text{G}} X_{i} $$

where \( X_{i} \) represents the corresponding risk genotype(s) under an additive, dominant or recessive model (for an additive model, heterozygotes for the risk allele are coded as 1 and homozygotes for the risk allele are coded as 2; for a dominant model both heterozygotes and homozygotes for the risk allele are coded as 1; for a recessive model only homozygotes for the risk allele are coded as 1). The association of a marker with CL/P can be assessed using the estimated odds ratio \( {\text{OR}}\left( {{\text{CL}}/{\text{P}}} \right) = { \exp }(\hat{\beta }_{\text{G}}) \), with 95 %CI calculated from estimated standard errors about \( \hat{\beta }_{\text{G}} \) (\( \hat{\beta }_{\text{G}} \) is an unbiased estimator of the log relative risk; Schaid 1996). The gTDT has an advantage in terms of statistical power over other TDT models and allows for different underlying genetic models (Schaid 1999; Fallin et al. 2002).
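As a rough illustration of the matched-set construction described above, the following Python sketch enumerates the four offspring genotypes possible from a parental mating (the case plus three pseudo-controls), applies the three genotype codings, and maximizes the conditional log-likelihood in which the trio-specific intercept cancels. All function names are hypothetical, the simple ternary search is an assumption chosen for brevity, and this is not the implementation used in the TRIO package.

```python
from itertools import product
from math import exp, log

def offspring_genotypes(father, mother):
    """Four equally likely offspring genotypes (risk-allele counts 0/1/2)."""
    gametes = {0: (0, 0), 1: (0, 1), 2: (1, 1)}
    return [a + b for a, b in product(gametes[father], gametes[mother])]

def code(genotype, model="additive"):
    """Code a genotype under an additive, dominant or recessive model."""
    if model == "additive":
        return genotype
    if model == "dominant":
        return int(genotype >= 1)
    if model == "recessive":
        return int(genotype == 2)
    raise ValueError(f"unknown model: {model}")

def cond_loglik(beta, trios, model="additive"):
    """Conditional log-likelihood; trios are (father, mother, child) tuples."""
    total = 0.0
    for father, mother, child in trios:
        xs = [code(g, model) for g in offspring_genotypes(father, mother)]
        total += beta * code(child, model) - log(sum(exp(beta * x) for x in xs))
    return total

def fit_beta(trios, model="additive", lo=-5.0, hi=5.0, iters=200):
    """Maximize the concave conditional log-likelihood by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if cond_loglik(m1, trios, model) < cond_loglik(m2, trios, model):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2

# Toy data (hypothetical): heterozygous father, non-carrier mother;
# the risk allele is transmitted in 30 trios and not transmitted in 20.
toy_trios = [(1, 0, 1)] * 30 + [(1, 0, 0)] * 20
odds_ratio = exp(fit_beta(toy_trios))
```

For this particular mating type the additive-model estimate reduces to the ratio of transmitted to untransmitted risk alleles (30/20 in the toy data), which gives a quick sanity check on the fit.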

This conditional logistic regression model can also be extended to incorporate gene–environment (G×E) interaction (Cordell 2009a) and even gene–gene (G×G) interaction (Cordell 2002, 2009b; Schwender et al. 2012). Under an additive model, the extension to consider G×E interaction requires two regression coefficients, one for the effect of genotype (\( \beta_{\text{G}} \)) and one for the interaction term itself (\( \beta_{\text{GxE}} \)). The test statistic for G×E interaction alone is a 1 df test of \( \beta_{\text{GxE}} = 0 \). These two regression coefficients can then be used to calculate the odds ratio of having a CL/P for unexposed carriers, \( {\text{OR}}\left( {{\text{CL}}/{\text{P}}|{\text{G no E}}} \right) = { \exp }(\hat{\beta }_{\text{G}} ) \), and the corresponding odds ratio for exposed carriers, \( {\text{OR}}\left( {{\text{CL}}/{\text{P}}|{\text{G and E}}} \right) = { \exp }(\hat{\beta }_{\text{G}} + \hat{\beta }_{\text{GxE}} ) \). We used the computationally efficient method of Schwender et al. (2012) implemented in the R package TRIO (version 1.7.0), available at http://cran.r-project.org, to estimate these odds ratios with and without exposure.
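The two odds ratios follow directly from the fitted coefficients, as the short Python sketch below shows. The function names are hypothetical, and the Wald form of the 1 df test is an assumption made for illustration, since the text does not state whether a Wald or likelihood ratio statistic was used.

```python
from math import erfc, exp, sqrt

def gxe_odds_ratios(beta_g, beta_gxe):
    """Odds ratios for carriers of one risk allele, without and with exposure.

    The linear predictor is beta_g * X + beta_gxe * X * E, with E = 0 for
    unexposed and E = 1 for exposed children.
    """
    return {
        "unexposed": exp(beta_g),
        "exposed": exp(beta_g + beta_gxe),
    }

def wald_p(beta_hat, std_err):
    """Two-sided p value for H0: beta = 0 (assumed Wald form of the 1 df test)."""
    z = abs(beta_hat / std_err)
    return erfc(z / sqrt(2))
```

With \( \hat{\beta }_{\text{G}} \) near zero, as reported later for GRID2 and ELAVL2, the unexposed odds ratio stays close to 1 and any elevation in risk appears only in the exposed group.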

Schwender et al. (2012) show that this conditional logistic regression approach can also be extended to consider gene–gene (G×G) interaction. Following the approach of Cordell (2002), the conditional logistic regression model was thus extended to compare the observed 2-locus genotype in the case to the remaining 15 “pseudo-controls” created when considering all 16 possible genotypes from any parental mating as a single matched set. A 4 df likelihood ratio test (LRT) for two-way interaction was calculated (Cordell 2009b). To minimize false-positive signals, we only considered markers in unlinked genes because even long-distance linkage disequilibrium (LD) between two SNPs can lead to spurious significance in a test for G×G interaction. To minimize the total number of tests, we limited tests for G×G interaction to genes showing some evidence of a marginal SNP effect.
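The 16-member matched set can be made concrete with a small Python sketch. It assumes the two loci are unlinked (as required above), so the four possible offspring genotypes at each locus combine freely; the function names are hypothetical and genotypes are coded as risk-allele counts.

```python
from itertools import product

def one_locus_offspring(father, mother):
    """Four equally likely offspring genotypes (0/1/2) at a single locus."""
    gametes = {0: (0, 0), 1: (0, 1), 2: (1, 1)}
    return [a + b for a, b in product(gametes[father], gametes[mother])]

def two_locus_matched_set(parents_a, parents_b):
    """All 16 two-locus offspring genotypes for two unlinked loci.

    parents_a and parents_b are (father, mother) genotypes at each locus.
    """
    return list(product(one_locus_offspring(*parents_a),
                        one_locus_offspring(*parents_b)))

def pseudo_controls(parents_a, parents_b, case):
    """The 15 pseudo-controls left after removing the observed case genotype.

    Raises ValueError if the case genotype is impossible for these parents
    (i.e. a Mendelian inconsistency).
    """
    matched = two_locus_matched_set(parents_a, parents_b)
    matched.remove(case)  # removes a single occurrence only
    return matched
```

Duplicated genotypes are kept deliberately: they carry the Mendelian transmission probabilities into the conditional likelihood.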

Because this replication study used an independent sample of case–parent trios and only markers in regions identified from a previous GWAS or some other study were examined, significance levels were adjusted for the number of SNPs used in each individual candidate gene. Furthermore, because the original GWAS sample from an international consortium was composed of two distinct racial groups (European and Asian ancestry groups), we conducted additional stratified analyses on trios of European and Southeast Asian ancestry groups separately to check for racial difference in the statistical evidence from this replication study.

To assess the consistency of evidence in terms of both significance and direction of effect between the original GWAS sample and this replication study, we also conducted a meta-analysis in regions of interest, imputing genotypes in the GWAS case–parent trios. Imputed SNPs were generated by the GENEVA Coordinating Center using the IMPUTE2 program (Howie et al. 2009) with ‘pre-phasing’ of haplotypes to improve accuracy. Prior to imputation, the original GWAS data were quality filtered both at the marker level and at the sample level (Laurie et al. 2010). Any GWAS sample where DNA had been generated by whole-genome amplification was excluded. The imputation target was based on the June 2011 release of 1,094 samples from the 1000 Genomes Project (Durbin 2010). For our analysis, imputed genotypes were assigned to all members of the GWAS trios if the most likely genotype had a probability >90 %, and these were used in the gTDT separately on GWAS trios of European and Asian ancestry (which included some Filipino trios along with trios of East Asian ancestry). The resulting p values were generated via a meta-analysis over the GWAS discovery trios and these replication trios using the METAL program (Willer et al. 2010).
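The genotype-calling rule for imputed data can be sketched as follows. The function name is hypothetical, the input is assumed to be the three posterior genotype probabilities for one individual at one SNP, and a probability of exactly 0.90 is treated as missing here, which is an assumption because the text does not specify the boundary case.

```python
def call_genotype(probs, threshold=0.9):
    """Call the most likely genotype from imputed probabilities.

    probs is (P(0 copies), P(1 copy), P(2 copies)) of the coded allele.
    Returns 0, 1 or 2, or None (missing) when the most likely genotype
    does not exceed the threshold.
    """
    best = max(range(3), key=lambda g: probs[g])
    return best if probs[best] > threshold else None
```

For example, `call_genotype((0.02, 0.95, 0.03))` yields a heterozygote call, whereas `call_genotype((0.50, 0.45, 0.05))` is set to missing.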

Results

The four genes/regions identified by Beaty et al. (2010) as reaching genome-wide significance included two with peak signals in recognized genes/regions (IRF6 and 8q24) and two genes that were novel at that time (ABCA4 and MAFB). All four of these genes/regions yielded significant evidence of linkage and association in this independent sample of 1,108 case–parent trios, although SNPs in MAFB gave less dramatic significance in this replication sample (see Fig. 1). Stratified analysis of the European and Filipino trios separately showed much greater significance in the evidence of linkage and association among European trios (see Supplemental Figure 1). European trios provided highly significant results for many 8q24 markers, while the Filipino trios yielded only marginal significance for these same markers. Of the 53 SNPs in the 8q24 region, the mean gene diversity over all markers (equivalent to heterozygosity for biallelic SNPs) among European parents was higher than seen among Filipino parents (0.43 vs. 0.35).

Fig. 1 Evidence for linkage and association from genotypic TDT in 1,108 CL/P case–parent trios for four genes/regions identified as genome-wide significant by Beaty et al. (2010) (note differences in scale of the Y-axis for chr. 8q24)

Ludwig et al. (2012) showed that several genes approaching but not attaining genome-wide significance in the GWAS (typically called ‘second tier hits’) yielded genome-wide significance when combined in a meta-analysis of the German case–control GWAS (Birnbaum et al. 2009; Mangold et al. 2010) and the case–parent trio GWAS of Beaty et al. (2010). The most significant SNP in PAX7 (rs742071) gave p = \( 7 \times 10^{-9} \) with an estimated OR(CL/P) = 1.32 (95 %CI = 1.13–1.54) in the meta-analysis of Ludwig et al. (2012). Several SNPs in PAX7 on chr. 1p36.13 showed significant linkage and association in both European and Filipino trios from the current study (see Fig. 2). This same SNP yielded an estimated odds ratio of 1.43 (95 %CI = 1.24–1.66; p = \( 1.59 \times 10^{-7} \)) when comparing heterozygotes to the wild-type homozygote under an additive model in the total replication sample (see Table 2). When stratified into European and Filipino groups, these estimated odds ratios were 1.55 (95 %CI = 0.96–2.53; p = 0.07; MAF = 0.04) among Filipino trios and 1.44 (95 %CI = 1.23–1.68; p = \( 4 \times 10^{-6} \); MAF = 0.45) among European trios.

Fig. 2 Significance of meta-analyses of p values from genotypic TDT on CL/P case–parent trios from the replication study and the original GWAS for three genes identified as second tier hits: NTN1, PAX7 and COL8A1/FILIP1L. Each of these genes achieved genome-wide significance in this meta-analysis based on called genotypes from imputed genotypic probabilities in the GWAS trios and the observed genotypes in the replication trios (n.b. the scale of the Y-axis varies)

Table 2 Second tier hits from original GWAS showing evidence of linkage and association in genotypic TDT in 1,108 CL/P case–parent trios

Markers in THADA on chr. 2p21 were also identified as reaching genome-wide significance in the meta-analysis of Ludwig et al. (2012), and SNP rs4372955 was nominally significant in this study (although Bonferroni correction for all unflagged SNPs left only marginal significance at p = 0.06). While the most significant SNP (rs7590268) in THADA from Ludwig et al. (2012) was not significant among either Europeans or Filipinos in this replication sample, several nearby SNPs did achieve significance (including rs4372955, located 26 kb away; MAF = 0.13 among European parents and MAF = 0.01 among Filipino parents). Similarly, the region containing DCAF4L2 on chr. 8q21.3 included one SNP achieving genome-wide significance in Ludwig et al. (2012), and our analysis of 12 SNPs spanning this region also showed evidence of linkage and association, although other SNPs yielded stronger significance. Supplemental information from Ludwig et al. (2012) showed rs1880646 in NTN1 on chr. 17p13 approached genome-wide significance in their meta-analysis, and this same SNP was again significant in our data (p = \( 1.20 \times 10^{-3} \) after Bonferroni correction for 54 unflagged SNPs spanning this gene; this SNP had MAF = 0.25 among European parents and MAF = 0.43 among Filipino parents). Table 2 lists eight ‘second tier hits’ from the GWAS showing significance to varying degrees in this independent replication data set, of which four were also identified by Ludwig et al. (2012).

We also conducted a stratified analysis of CL and CLP trios to test for the possibility of heterogeneity between these two anatomically distinct types of oral clefts. Comparing the estimated regression coefficients from these separate models showed no evidence of heterogeneity in SNP effects between CL and CLP trios. The 33 genes examined here, however, did not include the genomic region near SPRY2 on 13q31.1, which yielded evidence of significance only for CLP (not CL) in Ludwig et al. (2012).

Meta-analysis with GWAS trios

We conducted a meta-analysis combining the observed genotypes from European and Filipino case–parent trios in this replication study with called genotypes based on imputed probabilities for SNPs in case–parent trios of European and Asian ancestry from the original GWAS of Beaty et al. (2010). Any genotype in the original GWAS whose most likely call had a probability <90 % was set to missing for this meta-analysis. Figure 2 shows the results for the combined data set for three genes which achieved genome-wide levels of significance: PAX7 on chr. 1p36.13, NTN1 on chr. 17p13 and COL8A1/FILIP1L on chr. 3q12.3. Supplemental Figure 2 shows corresponding results of stratified analysis. SNPs yielding the greatest significance fell in specific regions in and around these three genes. For PAX7, the most significant SNPs were in intron 3; for COL8A1/FILIP1L, the region of signal spanned both genes although almost all SNPs were intronic; for NTN1, intronic SNPs again gave the lowest p values.

G×G interaction

We tested for possible G×G interaction using Cordell’s test implemented in the R package TRIO. All markers in or near 8q24, IRF6, ABCA4, MAFB, PAX7, THADA, COL8A1 and DCAF4L2 were used, but no pairs of SNPs within a gene were considered in this test for two-way interaction.

Cordell’s 4 df test was computed for all pairwise combinations of SNPs across genes. Overall, the resulting QQ plot revealed little evidence of G×G interaction (see Fig. 3). Given the large number of tests (173,644), none of these observed test statistics would retain their significance after strict Bonferroni correction (which would require p < \( 3 \times 10^{-7} \)). While this analysis does not argue for substantial interaction between markers across these genes, Supplemental Table 2 lists the ten most significant SNP pairs showing nominal significance. Interestingly, one SNP in NTN1 (rs11650357) showed evidence of possible G×G interaction with three different SNPs in IRF6, including the most significant individual SNP in this gene (rs6685182).
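The Bonferroni threshold quoted above can be reproduced with a one-line calculation. The sketch assumes a family-wise significance level of 0.05, which is not stated explicitly in the text but is consistent with the reported threshold; the function name is hypothetical.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under strict Bonferroni correction."""
    return alpha / n_tests

# Number of pairwise G×G tests reported in the text.
threshold = bonferroni_threshold(173_644)
```

The result is just under \( 3 \times 10^{-7} \), matching the rounded value given in the text.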

Fig. 3 QQ plots for tests of G×G interaction using 647 SNPs in 12 genes showing some marginal effect of genotype

G×E interaction

Data on maternal smoking during pregnancy were available and were used to test for G×E interaction using the TRIO package. Because the rates of exposure to maternal smoking differed substantially between European and Filipino trios (18 vs. 7 %, respectively, see Table 3), we conducted a stratified analysis. Two genes showed some evidence of G×Smoking interaction: GRID2 on chr. 4 and ELAVL2 on chr. 9. As expected, the stronger evidence came from European trios where exposure rates were higher, and Fig. 4 illustrates the strength of G×Smoking interaction as the −log10(p) for all SNPs in these two genes among 645 European trios (Supplemental Figure 3 shows the corresponding plots for 439 Filipino trios). As apparent in the lower part of this figure, there was little evidence of strong LD across markers in these two genes (where gray indicates effectively no pairwise LD). Evidence for G×Smoking interaction was limited to specific regions of these genes, and only the interaction term itself was significant (i.e., the regression coefficient \( \beta_{\text{G}} \) for SNP effects was not different from zero; see Supplemental Tables 3 and 4 for estimated OR, test statistics, p values and MAF for SNPs showing evidence of G×Smoking interaction, marked by the box in Fig. 4). Of course, stratification into European and Filipino groups resulted in some loss of statistical power to detect interaction. Although the median difference in MAF between European and Filipino parents was only 0.09 for the 25 SNPs in GRID2 and 0.13 for the 26 SNPs in ELAVL2, some SNPs showed larger differences. For example, SNP rs4380540 in GRID2 gave the strongest evidence for G×Smoking interaction among Europeans, but the MAF for this SNP among Filipino parents was too low to even permit estimation of \( \beta_{\text{GxE}} \) (MAF = 0.144 among European parents vs. 0.003 among Filipino parents). Figure 5 shows the estimated odds ratios (with 95 %CI) for European children carrying the risk allele at SNPs in these two genes in the presence and absence of maternal smoking. Neither GRID2 nor ELAVL2 had shown any significant evidence of SNP effects alone when G×Smoking interaction was ignored.

Table 3 Mothers exposed and unexposed to smoking during pregnancy in each group

Fig. 4 Significance of genotypic TDT model including a term for G×Smoking interaction in two genes among 645 European CL/P case–parent trios (where the exposure rate for maternal smoking was 18 %). Blocks of SNPs showing evidence of G×Smoking interaction are highlighted in black

Fig. 5 Predicted OR(CL/P|unexposed carrier) and OR(CL/P|exposed carrier) in two genes showing evidence of G×Smoking interaction among 645 European CL/P case–parent trios. Open circles represent exposed infants with one risk allele (i.e., those whose mothers smoked) and black dots represent unexposed carrier infants. P values from the 1 df test for G×Smoking interaction are shown along the x-axis. (a) 25 SNPs in GRID2, (b) 26 SNPs in ELAVL2

Discussion

Although numerous candidate gene and linkage studies have identified a number of genes as likely to play some role in the etiology of CL/P, it has been difficult to achieve consistency across studies. The GWAS approach has rapidly identified and confirmed several additional regions or genes (Mangold et al. 2011). The current replication study confirmed the influence of the four genes/regions (IRF6, ABCA4 and MAFB, plus the 8q24 region) achieving genome-wide significance in our previous case–parent trio GWAS of CL/P (Beaty et al. 2010). Because this replication study included both trios of European and Asian ancestry (here Filipino), we were also able to document differences in the strength of statistical evidence for linkage and association across these two racial groups, where exceptionally strong significance for markers in chr. 8q24 was only seen among Europeans. As discussed by Murray et al. (2012), the marker and haplotype diversity in 8q24 were greater among European parents compared to Asian parents in our original GWAS data which leads to less power among Asian trios, but there was no evidence of any distinct biological mechanism(s). In this replication sample, the level of gene diversity on 8q24 among Filipino parents (Southeast Asian) was again slightly lower than among European parents, but in general the direction of SNP effects on risk was similar (although their statistical significance was much diminished among Filipino trios; see Supplemental Figure 1). This difference in marker diversity could account for some of the inconsistencies across studies from different populations.

Several mutations in IRF6 are causal for Van der Woude syndrome (Leslie et al. 2012a), the most common autosomal-dominant malformation syndrome regularly including CL/P, but multiple studies have shown strong evidence of association between polymorphic markers in IRF6 and isolated, nonsyndromic CL/P cases (Rutledge et al. 2010; Mangold et al. 2011). Trios of European and Southeast Asian ancestry gave evidence that polymorphic markers in IRF6 are linked to and associated with an unobserved causal element for CL/P.

Beaty et al. (2010) was the first study to find evidence of association between CL/P and markers in ABCA4, a gene known to be causal for inherited retinopathies; this association has since been confirmed among Brazilians in a case–control study (Fontoura et al. 2012). The CL/P trios used here yielded strong evidence of linkage and association for markers within and between ABCA4 and the neighboring ARHGAP29 gene. ARHGAP29 is expressed in the developing craniofacial region (unlike ABCA4 itself), and Leslie et al. (2012b) identified several rare mutations and variants in ARHGAP29, suggesting this gene may play an etiologic role. The peak signal occurred in an intron of ABCA4 which contains a transcriptional enhancer in the homologous mouse region active in craniofacial tissues during embryonic development (A. Visel, personal communication). Conceivably, the human region may also contain a craniofacial enhancer of the neighboring ARHGAP29 gene, if not ABCA4 itself.

SNPs near MAFB on chr. 20 also showed strong evidence in Beaty et al. (2010), especially among Asian case–parent trios. The exact SNPs reported by Beaty et al. (2010) failed QC in this custom marker panel; however, SNP rs6029326 (at position 38845294) yielded p = 0.0003 in the gTDT, and remained significant after Bonferroni correction for the 93 unflagged SNPs typed around this small gene.

Recently, Ludwig et al. (2012) conducted a meta-analysis using the entire genome-wide marker panel from their cases and unrelated controls, and the corresponding genome-wide markers used in the case–parent trio GWAS of Beaty et al. (2010). This meta-analysis identified a total of 12 genes achieving genome-wide significance in tests of association with CL/P across these GWAS (the four mentioned above plus VAX1, NOG, PAX7, THADA, EPHA3, DCAF4L2, SPRY2 and TPM1). While 204 of the affected cases from German trios used here were also in the case–control GWAS of Mangold et al. (2010) (precluding a meta-analysis over both GWAS groups and these replication trios), these additional independent European and Southeast Asian case–parent trios highlighted the potential importance of 4 additional ‘second tier hits’ in influencing risk of CL/P (COL8A1/FILIP1L on chr. 3q12.3; GADD45G on 9q22.1; RBFOX3 on 17q25.3; and to a lesser extent FOXE1 on 9q22.33). Linkage studies using multiplex CL/P families provided strong evidence for a causal gene near FOXE1 (Marazita et al. 2009; Moreno et al. 2009), although GWAS have yielded only limited support for this gene to date and the corrected p value seen here was marginal (see Table 2).

Our formal tests for G×G interaction failed to show compelling evidence for 2-way interaction among SNPs in genes yielding evidence of marginal gene effects, but the number of tests involved is very large so even this replication study of over a thousand case–parent trios may not have sufficient statistical power to detect moderate levels of G×G interaction. This negative finding cannot exclude the possibility of biological interaction between one or more genes in determining the risk of CL/P, and further exploration is warranted.

We did identify two genes (GRID2 and ELAVL2) showing evidence of G×E interaction in the absence of any marginal gene effects among European case–parent trios (where the exposure rate was higher). This finding underscores the need to explore further the effects of genes, both recognized and novel, and their interactions with environmental exposures to the mother to expand our understanding of the genetic etiology of CL/P, which is highly complex and heterogeneous. To summarize, the number of genes contributing to the etiology of CL/P continues to expand (see Table 4).

Table 4 Genes confirmed as influencing risk to isolated, nonsyndromic CL/P over meta-analysis of German case–control GWAS (Birnbaum et al. 2009; Mangold et al. 2010), case–parent trio GWAS of Beaty et al. (2010) plus the present case–parent trio replication study