Introduction

TCF7L2 (T-cell factor 7-like 2, also known as TCF4) is uncontested as the gene that harbors the variant with the strongest effect on type 2 diabetes (T2D) identified to date. In 2006, a common genetic variation in the gene, a microsatellite located within intron 4 (DG10S478), was first associated with increased risk of T2D [1••]. The original finding was based on an Icelandic cohort and confirmed in US and Danish subjects, and has since been replicated in African, Asian, and European study populations [2–6]. The microsatellite is highly correlated with a widely studied single nucleotide C/T polymorphism (rs7903146, also located within intron 4). The risk T allele is common in European and African populations. However, in Asian populations the risk allele frequency of rs7903146 is very low, and other polymorphisms located in the 3′ end of the TCF7L2 gene have been associated with T2D [7].

In this review we focus on what is known about the functional consequences of these genetic variants and the possible molecular mechanism whereby they propagate their effect to influence the risk for developing T2D. We also wish to highlight the importance of attention to technical detail when analyzing TCF7L2 gene expression in different tissues. This is a focused review, and due to limitations imposed by the format many informative and well-constructed studies that deserve special mention have been omitted.

The rs7903146 Risk Allele Phenotype

The phenotypic changes associated with the TCF7L2 risk genotype suggest that T2D arises as a consequence of impaired islet function [6, 8••, 9–11]. The risk T allele is associated with impaired insulin secretion traits, such as decreased insulinogenic index and lower disposition index, which reflects the capacity for insulin secretion in relation to insulin sensitivity. By contrast, the risk allele does not affect body mass index, or parameters indicating impaired insulin sensitivity such as homeostasis model assessment of insulin resistance (HOMA-IR) [6, 9–12]. Studies have also suggested that risk TT and CT genotype carriers exhibit elevated hepatic glucose production, as well as an impaired incretin effect (ie, a less pronounced insulin response to a glucose load when given orally instead of intravenously) [11, 12].
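For readers less familiar with the indices mentioned above, the following is a minimal sketch of how they are commonly computed. The formulas are the widely used textbook definitions (HOMA-IR as fasting glucose in mmol/L times fasting insulin in µU/mL divided by 22.5; insulinogenic index as the 30-minute insulin increment over the 30-minute glucose increment during an oral glucose tolerance test). They are an assumption here and not necessarily the exact definitions used in the cited studies, and all function and parameter names are hypothetical.

```python
def homa_ir(fasting_glucose_mmol_l: float, fasting_insulin_uU_ml: float) -> float:
    """HOMA-IR from fasting glucose (mmol/L) and fasting insulin (uU/mL)."""
    return fasting_glucose_mmol_l * fasting_insulin_uU_ml / 22.5


def insulinogenic_index(ins0: float, ins30: float, glu0: float, glu30: float) -> float:
    """Insulin increment divided by glucose increment over the first 30 min of an OGTT."""
    return (ins30 - ins0) / (glu30 - glu0)


def disposition_index(secretion: float, sensitivity: float) -> float:
    """Insulin secretion adjusted for insulin sensitivity (product form).

    Which secretion and sensitivity measures are combined varies between
    studies; this product form is only one common convention.
    """
    return secretion * sensitivity
```

In this framing, a risk allele that lowers the insulinogenic index while leaving HOMA-IR unchanged would lower the disposition index, which is the pattern described above.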

TCF7L2 Functions

Prior to its discovery as a T2D gene, TCF7L2 was established as a transcription factor playing an important role in the canonical Wnt signaling pathway. Wnt signals affect a wide range of fundamental cellular processes such as embryonic development, stem cell maintenance, cell fate, cell proliferation, cell migration, tumour suppression, and oncogenesis [13]. Humans possess 19 different Wnt isoforms, which are secreted glycoproteins that can act via the canonical or non-canonical pathways. Canonical Wnt signaling is β-catenin/TCF/LEF-dependent and is activated when Wnt proteins bind the G-protein–coupled Frizzled receptors and their coreceptors, lipoprotein receptor-like proteins, LRP5/6 [14]. This leads to a chain of signaling events starting with activation of Dishevelled, which in turn inhibits the multifunctional glycogen synthase kinase-3beta (GSK3β). GSK3β phosphorylates β-catenin, thereby targeting it for ubiquitin-dependent proteasomal degradation. By contrast, unphosphorylated β-catenin is stable and is directed to the nucleus, where it associates with TCF/LEF DNA-binding proteins and, together with coactivators (eg, CBP or p300), activates gene transcription (described in more detail in the section on TCF7L2 Isoforms and Interaction Partners).

A comprehensive overview of the target genes for TCF7L2 in pancreatic islets is still lacking, but it is known to control expression of several genes involved in oncogenesis, such as c-jun [15], c-myc [16], Cyclin D1 [17], and many others (for a more extensive list see http://www.stanford.edu/~rnusse/pathways/targets.html). TCF7L2-dependent signals have been implicated in the oncogenesis of colorectal cancers [18] and breast tumors [19] (reviewed in [20]), and an experimentally generated list of potential TCF7L2 target genes has been reported in colorectal cancer cells using ChIP-on-chip analysis [21]. Interestingly, a recent ChIP-sequencing study in a carcinoma cell line identified an overrepresentation of TCF7L2 binding at cardiovascular disease- and T2D-associated loci, but no significant association with cancer-related loci [22].

When TCF7L2 was identified as an important T2D gene, it was largely unknown among diabetes researchers, but in recent years it has become clear that TCF7L2 plays an important role in several vital functions in the pancreatic islet. First, it is involved in pancreas development, although the exact effects on formation of the endocrine cells remain a matter of controversy [23–26]. In addition, TCF7L2 appears to be required for SDF-1/CXCR4-induced cytoprotection of β cells [27]. Both these TCF7L2-dependent effects will act to increase the number of β cells, suggestive of a scenario in which the transcription factor is a key determinant of β-cell mass. However, TCF7L2 is also essential for maintaining the secretory function in mature β cells. The first evidence pointing in this direction was provided by the demonstration of glucose intolerance in mice with impaired canonical Wnt signaling due to genetic ablation of the Wnt coreceptor LRP5 [28]. Direct support for the involvement of TCF7L2 in this effect was provided by studies using short inhibitory RNA oligonucleotides, which showed decreased β-cell survival, as well as attenuation of glucose-induced insulin secretion [29]. The latter effect has been attributed to downregulation of glucokinase, insulin, genes/proteins in the exocytotic process, as well as the central transcription factor pdx1 [26, 29–31]. Finally, in silico studies have also identified the consensus sequence (WWCAAWG) for TCF binding in the proprotein convertase 1 (PC1) and PC2 genes, which are essential for proinsulin conversion to mature insulin [32]. This is suggestive of a role of TCF7L2 in insulin processing, and that this processing may be disturbed in risk allele carriers.
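To illustrate what an in silico search for the TCF consensus sequence involves, the following is a minimal sketch that scans both strands of a DNA sequence for WWCAAWG, where W is the IUPAC code for A or T. It is not the method used in the cited study; the function names are hypothetical, and a realistic analysis would typically use position weight matrices and account for sequence conservation and chromatin context.

```python
import re

# TCF consensus from the text; W is the IUPAC code for A or T.
CONSENSUS = "WWCAAWG"
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (non-ACGT symbols are left unchanged)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_tcf_sites(seq: str, consensus: str = CONSENSUS):
    """Return sorted (start, strand, matched_sequence) tuples.

    start is the 0-based position of the leftmost base of the site on the
    forward strand; matched_sequence is read 5'->3' on the matching strand.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping sites are also reported.
    pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in consensus) + "))")
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    n, k = len(seq), len(consensus)
    for m in pattern.finditer(reverse_complement(seq)):
        # Convert the reverse-strand coordinate back to the forward strand.
        hits.append((n - m.start() - k, "-", m.group(1)))
    return sorted(hits)
```

A seven-base degenerate motif of this kind is expected to occur frequently by chance in genomic sequence, which is one reason why such matches are suggestive rather than conclusive evidence of regulation.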

TCF7L2 and GLP-1

A central player in glucose homeostasis is the incretin hormone glucagon-like peptide 1 (GLP-1). GLP-1 is produced in the enteroendocrine L cells in the small intestine and has a multitude of effects positive for controlling blood glucose [33, 34]. Acute addition of the peptide stimulates glucose-induced insulin secretion, but suppresses release of the glucose-elevating hormone glucagon from the pancreatic α cells. On a more long-term basis, GLP-1 stimulates proliferation of β cells and suppresses apoptosis. Outside the pancreas, GLP-1 is known to promote satiety signals in the hypothalamus and to slow down gastric emptying, both actions alleviating the glucose load on the body. In summary, GLP-1 is a multifunctional hormone essential for the body to maintain the nondiabetic state.

Interestingly, several lines of evidence suggest an important interplay between the incretin hormone GLP-1 and TCF7L2 in the regulation of pancreatic islet functions. First, TCF7L2 controls transcription of the proglucagon gene [35]. This gene encodes both glucagon and GLP-1, and the production of each hormone is controlled by selective post-translational cleavage of proglucagon in the L cells and α cells, respectively. Second, TCF7L2 is also essential for GLP-1 to exert many of its effects on the pancreatic islet. For example, the capacity of the incretin hormone to stimulate proliferation of rodent and clonal β cells may be related to the fact that GLP-1 increases transcript levels of several genes in the Wnt signaling pathway, including the cell cycle regulators cyclin D1 and c-myc [36]. Silencing of TCF7L2 obliterates the capacity of GLP-1 to induce β-cell proliferation, an effect involving several kinases such as protein kinase A, Akt, and the MEK/ERK pathway [36]. Additionally, reduced expression of the GLP-1 receptor as well as the glucose-dependent insulinotropic polypeptide receptor (GIPR) has been demonstrated in pancreatic islets from diabetic subjects, and when silencing TCF7L2 both receptors display reduced expression [30].

Mode of Action in T2D Pathogenesis

We are still far from a unifying hypothesis as to how common variation in the gene causes T2D. In the initial report by Grant et al. [1••], the large linkage disequilibrium block containing the rs7903146 polymorphism as well as all exons was sequenced, but no variation was found that was more highly associated with T2D. The region surrounding rs7903146 has also recently been shown to be in an open chromatin state in human pancreatic islets by formaldehyde-assisted isolation of DNA analyzed with high-throughput sequencing (FAIRE-seq) [37•]. The chromatin structure at rs7903146 was found to be more open in T allele than in C allele chromosomes. The implications for TCF7L2 expression and/or splicing of this difference in chromosome structure need to be further explored.

Concerning the mode of action of rs7903146 with respect to the pathogenesis of T2D, one possibility is that it could affect GLP-1 expression. This has not been substantiated by empirical data [12, 38], although several reports of an impaired incretin effect exist [11, 12]. There is also ample evidence supporting the view that rs7903146 is associated with impaired conversion of proinsulin to insulin [32, 39–41]. In fact, in one study an increased proinsulin:insulin ratio post–oral glucose tolerance test was the only significantly affected intermediate parameter [39], and in another all phenotypic traits significantly associated with the minor T allele lost the association after correction for the proinsulin:insulin ratio. These findings strongly suggest that impaired insulin processing is a factor to be reckoned with in the development of TCF7L2-dependent diabetes, and this aspect calls for further mechanistic investigations.

The genetic variation rs7903146 is located in a non-coding region and is therefore, like most common variations associated with T2D, assumed to act by altering expression of the gene in which it is located. Data suggestive of both increased expression of the TCF7L2 transcript in rs7903146 risk carriers [11, 42] and reduced TCF7L2 protein expression in donor islets from individuals with T2D [30] have been presented. Other studies have failed to detect any genotype-dependent expression differences [43, 44••]. Contradictory findings of TCF7L2 expression levels have also been reported in other tissues (eg, subcutaneous vs visceral adipose tissue) [44••, 45, 46]. These divergent results may be explained by the relatively small sample populations in studies presented so far, as well as by different methodologic approaches (real-time analysis of mRNA expression vs immunocytochemical detection of protein levels). Furthermore, in samples from diabetic individuals it is difficult to assess the relative contribution of primary genotype-dependent effects, as opposed to changes in expression secondary to the hyperglycemic state.

Yet another factor that complicates interpretation of such data is that most genes in the human genome are expressed as multiple splice variants; as many as 92% to 94% of human multi-exon genes undergo alternative splicing, 86% with a minor variant frequency of 15% or more [47]. Extensive alternative splicing has also been shown for the TCF7L2 gene [48]. The splice variant expression pattern in islets differs from that in other tissues [44••, 49, 50•]. The fact that the great majority of studies point to the pancreatic islet being central for the TCF7L2-dependent increased risk of developing T2D, in combination with the finding that the absolute level of TCF7L2 transcripts is high in islets compared with other tissues [44••], suggests that this is the organ of primary interest for understanding the pathogenesis of T2D in relation to TCF7L2.

The Expression Level of TCF7L2 Transcripts

The total amount of TCF7L2 transcripts varies in different tissues. Comparing the level of expression of the same gene in different tissues is technically challenging. In using quantitative polymerase chain reaction (qPCR), one potential problem is that endogenous controls cannot be assumed to be expressed at the same absolute level in all tissues [51••, 52]. Differences in the quality of the obtained RNA and tissue heterogeneity with respect to cell types add to the difficulty of interpreting the results. The latter was recently highlighted by a report showing the differential expression of TCF7L2 in isolated pancreatic α and β cells [53]. One alternative approach to partly resolve these problems is to assess the absolute mRNA concentration rather than comparing the relative expression levels. This is achieved by including an oligonucleotide standard with known concentration covering the target sequence. We have previously used this approach to measure the total expression level of TCF7L2 in T2D-relevant tissues [44••]. TCF7L2 was found to be most abundantly expressed in pancreatic islets and adipose tissue, less in blood lymphocytes, and the lowest, but clearly detectable, in skeletal muscle [44••]. To draw informative conclusions about the functional consequences of these expression differences, it is necessary that they are followed up by quantitative investigations of TCF7L2 protein levels and also by activity measurements.
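The principle of absolute quantification against a standard of known concentration can be sketched as follows. A dilution series of the standard yields a calibration line relating the threshold cycle (Ct) to the logarithm of the copy number, and the copy number of an unknown sample is then read off that line. This is a generic illustration under the assumption of a log-linear standard curve, not the analysis pipeline of the cited study; all function names are hypothetical.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the copy number of an unknown sample."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Per-cycle efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0
```

Because each assay is calibrated against its own standard, differences in amplification efficiency between assays are absorbed into the respective standard curves, which is what makes comparisons between tissues and between splice variants less dependent on an endogenous control.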

The Splice Pattern of TCF7L2

The genomic structure of TCF7L2 and the first extensive characterization of the splice pattern were published in 2000 by Duval et al. [48]. The TCF7L2 gene comprises 17 exons, five of which have been shown to be alternative (ie, exon 4 in the 5′ end and exons 13–16 in the 3′ end) (Fig. 1). (The numbering used by Duval et al. and others will be used here [48, 54••].) Exons 14 and 15 are highly similar and seem to be mutually exclusive with a strong tissue-specific usage bias, so that exon 15 predominates in islets and lymphocytes, whereas exon 14 predominates in muscle and adipose tissue [44••, 54••]. Alternative splice sites have been identified in exons 7, 9, 16, and 17, of which the first two are commonly used [48, 54••]. Also, several rarely used alternative transcriptional start sites in the promoter region and in exon 1 have been reported [50•]. Altogether, these variations give rise to a complex splice pattern with hundreds of potential protein isoforms. The splice patterns and variable exons reported from different groups are very similar, although the relative expression levels differ between reports [44••, 48–50•, 54••, 55].

Fig. 1

Gene and mRNA structure of TCF7L2. On top the gene structure is outlined displaying the 17 exons with the location of the rs7903146 polymorphism indicated. Below is the exon structure shown schematically with exon numbering. Alternative exons are colored and important alternative splice sites are marked by hatching. Sequences encoding important binding sites are indicated. Stop codon usage for “short,” “medium,” and “long” TCF7L2 protein isoforms is shown by boxed S, M, and L, respectively. At the bottom of the figure, the four predominating splice variants expressed in human pancreatic islets are shown [44••]. CBP CREB-binding protein; CtBP C-terminal binding protein; HMG high mobility group

The Splice Pattern of TCF7L2 in Pancreatic Islets

In 2009, we and collaborators published two studies describing the complex splice pattern of TCF7L2 in human T2D-relevant tissues, including pancreatic islets [44••, 50•]. The dependence of this pattern on risk allele carrier status was also investigated. Sequencing, restriction cleavage analysis, and absolute qPCR revealed a clear tissue-dependent difference in the splice pattern, with four predominant splice variants of TCF7L2 expressed in pancreatic islets: the variant lacking all variable exons, and variants containing either or both of exons 4 and 15 (Fig. 1) [44••]. Pancreatic islets display an unusually high incorporation of exon 4; on average, about 62% of all transcripts contain exon 4 compared to about 30% in the other tissues that were investigated (ie, skeletal muscle, adipose tissue, and blood lymphocytes). In the second study, a similar expression pattern was obtained [50•]. In 2010, Mondal et al. [49] observed the previously described variable exons in pancreatic islets using cloning, sequencing, and relative qPCR in a single individual. These findings were extended to show that when the protein isoforms are heterologously expressed they are stable and have a functional β-catenin binding domain. Mondal et al. [49] described exon 16 as being exclusively expressed in pancreatic tissue, thereby supporting the data of Prokunina-Olsson et al. [50•] to the effect that exon 16 is expressed at low, but detectable levels in pancreatic tissue and colon, but absent in the other tissues examined. The very low expression of exon 16–containing transcripts in pancreatic islets is underpinned by data obtained by reverse transcriptase PCR and restriction cleavage analysis [44••]. In addition to pancreatic tissue and colon, exon 16–containing transcripts have been shown to be expressed in the brain [56, 57]. Given the very low representation of exon 16 in TCF7L2 transcripts in pancreatic islets, this exon is unlikely to play a major physiologic role in islets, although it cannot be completely excluded.

No association has been convincingly shown between T2D risk genotypes and the total expression of TCF7L2, or the expression level of any TCF7L2 splice variant. A genotype-dependent difference in the relative expression of transcripts targeted using assays for exons 14 to 16 and 14 to 17, respectively, has been reported with nominal significance, but it did not withstand correction for multiple testing [50•]. It is possible that a significant genotype effect is revealed only when plasma glucose levels are taken into account. In fact, a significant, positive correlation was found between exon 4 incorporation in pancreatic islets and hemoglobin A1c levels [44••]. Although this correlation does not prove causality, it suggests a link between TCF7L2 splicing and plasma glucose levels. It should also be noted that the rs7903146 polymorphism is located within intron 4 in close proximity to exon 4.

Predicted TCF7L2 Protein Isoforms

As previously described, the differential splicing of TCF7L2 potentially gives rise to a large number of protein isoforms with highly differential functional properties. The splice variants and resulting isoforms have been grouped according to various criteria, mostly after predicted properties of the variable C-terminus. A distinction endorsed by several groups [48, 54••, 58] differentiates between long, short, and medium forms, depending on the predicted stop codon used (Fig. 1). The long forms (“L” [or “E”]), which include the long, 415 nucleotide (nt) open reading frame (ORF) in exon 17, contain either exon 14 or 15 and not exon 16. Medium-length (“M”) forms contain none of these exons, which shifts the ORF to stop 77 nt into exon 17. In the short (“S”) forms, the ORF ends before exon 17. This only occurs if either exon 16 or both exons 14 and 15 are retained in the transcript, both of which are rare events in islets. Neither the alternative splice sites in exons 7 and 9, nor the variable presence of exons 4 and 13, cause a change in the ORF apart from the local codon insertions and deletions.
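The classification rules in the preceding paragraph can be written out as a small decision function. This is only a restatement of the text as a sketch; the function name and the representation of a splice variant as a set of included variable exons are assumptions made for illustration, and the alternative splice sites within exons are ignored because they do not change the reading frame class.

```python
def classify_isoform(variable_exons) -> str:
    """Classify a TCF7L2 splice variant as long (L), medium (M), or short (S).

    variable_exons: iterable of the alternative exons included in the
    transcript, drawn from 4, 13, 14, 15, and 16 (numbering as in the text).
    """
    exons = set(variable_exons)
    has14, has15, has16 = 14 in exons, 15 in exons, 16 in exons
    if has16 or (has14 and has15):
        return "S"  # ORF ends before exon 17
    if has14 or has15:
        return "L"  # long ORF in exon 17 is used
    return "M"      # ORF stops 77 nt into exon 17


# The four predominant islet variants described in the text:
# no variable exons, exon 4 only, exon 15 only, and exons 4 plus 15.
islet_variants = [set(), {4}, {15}, {4, 15}]
islet_classes = [classify_isoform(v) for v in islet_variants]
```

Applied to the four predominant islet variants, the rules give two medium and two long forms, since exon 4 does not affect the C-terminal reading frame.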

TCF7L2 Isoforms and Interaction Partners

The simplistic model for the role of TCF7L2 can be outlined as follows: In the absence of canonical Wnt signaling, TCF7L2 is bound to the promoter through its high mobility group (HMG) box in a repressing state in complex with the TLE/Groucho corepressor. Upon Wnt activation, β-catenin, which is otherwise efficiently degraded in the cytoplasm, enters the nucleus, displaces TLE/Groucho, and induces transcription.

However, this picture represents a severe simplification. First, a number of additional factors are known to modulate the actions of the TCF complexes [59–61]. Second, the differential splicing of TCF7L2 potentially gives rise to a number of isoforms at the protein level competing for promoter binding and possessing highly diverse activation/repression properties [54••].

Most of the known binding sites in TCF7L2 are retained in all isoforms: The DNA-binding HMG box of TCF7L2 is encoded by exons 10 and 11, TLE/Groucho binding has been mapped to exon 9 [62], and β-catenin binding occurs at the N-terminal 50 amino acid residues (Fig. 1) [63, 64]. Notable exceptions are the corepressor C-terminal binding protein (CtBP) and the coactivators CREB-binding protein (CBP [or crebbp]) and p300, which recognize only the long TCF7L2 isoforms [65–67]. CtBP interacts with two sites, both encoded by exon 17, and has been shown to inhibit TCF7L2-mediated expression in a dose-dependent manner for a long isoform, but not for an isoform in which the binding motifs have been corrupted by a frameshift mutation [68]. CBP and the very similar p300 are ubiquitous and multifunctional histone acetyltransferase coactivators with a vast number of interactions [69]. CBP and p300 also require the extended C-terminus of TCF7L2 for binding; consequently, this domain specifically confers both corepressor and coactivator binding capacity on TCF7L2. In addition, transcription-enhancing activity through auxiliary DNA binding has been ascribed to cysteine-containing motifs specific for long isoforms [70]. Because the extended C-terminus is required to obtain efficient TCF7L2/β-catenin activation of the target promoter, at least in the case of Cdx1 [54••, 67], Axin2, and Siamois [54••], the activating properties of the domain seem to prevail under most circumstances.

In Xenopus laevis, differences in transcription properties and the ability to form TCF7L2/DNA/β-catenin complexes have been observed depending on the alternative splice sites in exons 7 and 9, where phosphorylation events of the longer form of exon 9 appear to be involved [71, 72]. These alternative splice sites are highly conserved, but their significance has not been established in mammalian species. The TLE/Groucho recognition site is encoded by exon 9; however, alternative splicing of this exon does not seem to interfere with TCF7L2-TLE/Groucho interaction [71].

Even if specific interactions with TCF7L2 binding partners have not been demonstrated with the exon 4–encoded part of the protein, the inclusion of exon 4 seems to severely impair TCF7L2-dependent promoter activation [54••]. Whether this is due to hitherto uncharacterized interactions or to medium- or long-range protein structural effects remains to be elucidated.

Discussion

Our insight into the implications of TCF7L2 splicing is still limited. However, the existence of a complex, tissue-specific set of transcriptional variants is by now well established, presumably giving rise to an equally specific set of protein isoforms. In the interpretation of expression analyses this diversity should be kept in mind. First, one should realize exactly which splice variants or isoforms are being measured. Second, the overall balance between species with different properties should be considered. In connection with qPCR and small interfering RNA experiments this poses a problem, and the contribution of a splice variant should be established using appropriate methods. A common and seemingly straightforward experiment that is potentially problematic is the comparison of gene or protein expression between different tissues. It is not uncommon that measurements are related to an internal standard without ensuring that this standard is equally expressed in the tissues investigated. Differences in assay performance should also be taken into account when using relative measurements to compare different splice variants in the same tissue.

What, then, is the reason for the organism to boast such a broad, yet strongly tissue-biased spectrum of splice variants arising from the TCF7L2 gene? The need is obvious for a finely tuned regulation of this key factor in a pathway with significant consequences for the cell, yet it seems grossly overcomplicated to have to control the relative expression of anywhere from a dozen to several hundred different transcripts to achieve this goal. One way of regarding this is to consider the TCF7L2 isoforms as a dynamic swarm of factors with a common set of target genes and with a variety of more or less pronounced functional differences, as opposed to regarding each isoform as specifically produced for a specific purpose. The size and properties of this swarm can then be shifted toward activation or repression depending on the sum of cooperative and counteracting properties of individual molecules. Consequently, the up- or downregulation of a single isoform may be immaterial, or even misleading, if not considered in the context of the bulk of competing isoforms present in the nucleus. On a larger scale, the entire pool of Wnt-acting transcription factors and cofactors needs to be taken into account. This general point is particularly relevant for a factor such as TCF7L2, considering its dual role in repression and activation. For example, the ratio between TCF7L2 isoforms with and without exon 4 is in certain respects likely to be of more consequence than the concentration of each type or, for that matter, the total amount of all TCF7L2 isoforms in the cell combined.

Conclusions

The phenotypic changes associated with the TCF7L2 risk genotype suggest that T2D arises as a consequence of reduced islet mass and/or impaired function, and it has become clear that TCF7L2 plays an important role in several vital functions in the pancreatic islet. Resolving whether and how genetic variation in the TCF7L2 gene influences the expression and splice pattern in pancreatic islets, using appropriate methods, should be the focus for future studies. It is also necessary that this information is translated to the protein and functional level and that more target genes important for islet function are identified.