Introduction

In 1919, Thomas Morgan and Calvin Bridges described a spontaneous mutant Drosophila with wings reduced to vestiges (Bridges and Morgan 1919). Following the discovery of the so-called vestigial (vg) mutant, a considerable number of alleles of the gene were isolated (http://flybase.org/ for an update). Vestigial was shown to be required for wing development, and this property is strictly dependent on the association of Vg with the product of scalloped (sd), which belongs to a conserved family of transcription factors initially described in mammalian cells and which possesses a TEA/ATTS DNA-binding domain (TEAD) (Andrianopoulos and Timberlake 1991; Burglin 1991; Halder et al. 1998; Kim et al. 1996; Xiao et al. 1991). While Vg functions as a transactivator, the protein complex Vg-Sd binds to DNA in a sequence-specific manner through the Sd TEA domain, thus activating specific genes within the regulatory network that controls wing development (Halder et al. 1998; Williams et al. 1991). As described by Garcia-Bellido, the properties of Vg and Sd classify them as selector genes, whose function is to govern the fates of groups of cells within embryos (Garcia-Bellido 1975; Mann and Carroll 2002). A human homolog of Vg, named TONDU, was identified and found to substitute for Vg in wing formation in Drosophila (Vaudin et al. 1999). This was the first discovery of a protein belonging to a family of proteins that is homologous with Vg. Designated as vestigial-like (or Vgll), this family comprises four members in vertebrates (Vgll1-4).

Since their characterization in Drosophila, vestigial-like genes have been described in several vertebrates, and some studies have addressed the functional properties of some of them. Because of their interaction with TEADs, vestigial-like proteins are likely to affect numerous functions, with a potential action at nodal points in several cellular processes mediated by TEADs. TEADs are encoded by a gene family comprised of four members in mammals (TEAD1-4). TEAD1 is the founder of this family; initially described as transcriptional enhancer factor 1 (TEF-1), it binds to the simian virus 40 (SV40) enhancer (Xiao et al. 1991). Among other transcription factor families, TEADs form a unique family of proteins with pleiotropic functions that vary depending on the cell type, the developmental context, or the interacting partners. They not only regulate important physiological processes such as cell differentiation, cell proliferation, and stem cell maintenance, but their alteration has also been associated with human cancers (Pobbati and Hong 2013). The latter function mainly concerns the Hippo pathway, where TEADs interact with YAP (the homolog of Drosophila Yorkie), the transcriptional effector that mediates cell growth and oncogenic transformation in this pathway (Harvey et al. 2013).

While extensive knowledge has been accumulated about the functions of TEADs in vertebrates, little is known about vestigial-like proteins, with the exception of recent data suggesting their pleiotropic functions in both development and cancer. Indeed, a recent study has identified a corepressor for Sd in Drosophila, namely Tgi. Tgi is considered the Drosophila ortholog of mammalian Vgll4, which can suppress the YAP-induced overgrowth of tissues in mice (Koontz et al. 2013). Here, we focus on our current knowledge of vestigial-like genes. We present an inventory of all known vertebrate genes and their orthologs in basal animals in order to highlight the early evolution of this gene family. We take a closer look at the structural organization of selected members and provide an overview of current knowledge about their functions, including the most recent findings on the subject.

Two distinct gene subfamilies

Gene subfamily encoding proteins with a single Tondu domain

Vestigial (Vg) was originally described in Drosophila as a nuclear factor involved in wing and haltere development (Williams et al. 1991). It was then shown that this function was performed through its interaction with scalloped (Sd), a protein of the TEAD transcription factor family (Halder et al. 1998). The interaction of Vg with mammalian TEAD1 protein and the fact that TEAD1 could be used as a substitute for Sd during wing development led authors to postulate that a mammalian ortholog of Vg existed, and subsequently to identify it as the protein TONDU (Deshpande et al. 1997; Simmonds et al. 1998; Vaudin et al. 1999). TONDU contains a 24 amino acid domain (Tondu domain or TDU) that is homologous with the Vg domain and necessary for its interaction with Sd (Vaudin et al. 1999). TONDU was the first mammalian homolog of Vg to be described, and has now been renamed “Vestigial-like 1” (VGLL1) to meet the requirements of the HUGO Gene Nomenclature Committee. Since then, several genes encoding proteins with a Tondu domain have been identified in different species, helping to form the vestigial-like gene family comprised of two genes in Drosophila and four to five genes in vertebrates, depending on the species (Table 1). Human VGLL2 (or VITO-1) and VGLL3 (or VITO-2) are paralogs of VGLL1 and were characterized independently by several groups (Maeda et al. 2002; Mielcarek et al. 2002; Mielcarek et al. 2009). Like VGLL1, they encode proteins with a single Tondu domain, constituting with the Drosophila Vg the subfamily Vgll1-3/Vg. The presence of three paralogous genes in vertebrates can be attributed to the genome duplication events that occurred in vertebrate evolution, as exemplified by several gene families that were found to have expanded from a single member in invertebrates to three to four members in vertebrates (Panopoulou et al. 2003). This is the case for the vestigial-like gene family, where there is a single protein in early deuterostomes such as sea urchin or amphioxus.

Table 1 Inventory of vestigial-like gene family members in vertebrates and Drosophila. Standardized names are displayed according to nomenclature guidelines

Gene subfamily encoding proteins with two Tondu domains

The vestigial-like gene family showed an additional level of complexity following the identification of proteins possessing two Tondu domains. VGLL4, initially identified in human and mouse through a search for TDU motifs in protein databases, is the archetype of these proteins and constitutes a second subgroup in the gene family (Chen et al. 2004b). VGLL4 has two TDU motifs that diverge from the consensus sequence, although they still mediate interaction with TEAD1 (Chen et al. 2004b). It is now established that there are four vestigial-like genes in mammals (Vgll1-4), but the situation may differ in other vertebrate taxa, depending on their evolutionary history. For instance, two Vgll2 genes (Vgll2a and Vgll2b) have been described in zebrafish (Hamade et al. 2013; Johnson et al. 2011; Mann et al. 2007). They correspond to duplicated genes, consistent with the current view that a whole-genome duplication occurred in the teleost lineage following its divergence from the tetrapod lineage (Hoegg et al. 2004; Jaillon et al. 2004). In the amphibian Xenopus, four vestigial-like genes have been described that are true orthologs of their mammalian counterparts (Faucheux et al. 2010). However, a fifth gene with two Tondu domains, vestigial-like 4-like (or vgll4l), was identified in this amphibian and has no equivalent in birds or mammals (Barrionuevo et al. 2014) (Table 1). Vgll4l shows poor overall identity but strong conservation in the Tondu domains, even with the mammalian proteins. Moreover, Vgll4l proteins are also characterized by a highly conserved 10 amino acid motif called the novel conserved sequence (NCS) (Barrionuevo et al. 2014). A Vgll4l gene has also been identified in zebrafish, indicating that the Vgll4 and Vgll4l genes resulted from a duplication event limited to amphibians and fishes and can therefore be considered as homeologs (Melvin et al. 2013).

A hypothetical protein with two Tondu domains was also identified in Drosophila as early as 2004 (Chen et al. 2004b). The protein was then “rediscovered” and functionally characterized during screens seeking components of the Hippo pathway or partners of Sd. This protein, named Tgi (for Tondu-domain-containing Growth inhibitor) or SdBP (for Sd-Binding-Protein), contains two Tondu domains (TDU1 and TDU2) and can interact with Sd (Guo et al. 2013; Koontz et al. 2013). Tgi was found to be homologous with the mammalian Vgll4 and was shown to act as a cofactor in the repression of Hippo target genes (Guo et al. 2013; Koontz et al. 2013). A phylogenetic analysis confirmed that Tgi clusters with VGLL4 and that Vg clusters with VGLL1, 2, and 3 (Koontz et al. 2013). VGLL4/Tgi therefore constitute a second subfamily, whose members contain two Tondu domains.

Evolution of the vestigial-like gene family

A growing number of genomic sequences are now available, facilitating the study of the evolutionary history of gene families. We have screened databases and identified proteins containing a Tondu domain in representative species in order to establish the presence of vestigial-like proteins at different levels of a phylogenetic tree (Fig. 1). We have characterized a total of 15 genomic clones containing a Tondu domain from different species. All genes analyzed can be grouped into two subfamilies based on the presence of one or two Tondu domains. The Vgll1-3/Vg subfamily contains genes that encode proteins with a single Tondu domain. These genes are only found in metazoa and are, moreover, restricted to Bilateria. They are not found in sea anemones (Cnidaria) or in sponges (Porifera). During the diversification of Bilateria, the number of genes in this subfamily increased from one member in basal species (Vgll) to three members in vertebrates (Vgll1-3) (Fig. 1). This increase can be explained by the genome duplication events that occurred in vertebrate evolution and is further supported by the presence of a single protein in basal bilaterians such as ascidians, lancelets, or sea urchins (Dehal and Boore 2005).
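The grouping rule described above (one Tondu domain versus two) can be sketched in a few lines of Python. This is only an illustration of the classification criterion, not the procedure used to build Fig. 1: the domain counts are those stated in the text, whereas the names `TONDU_DOMAIN_COUNTS`, `subfamily`, and `group_by_subfamily` are hypothetical and introduced here for the example.

```python
# Number of Tondu domains per protein, as stated in the text.
# The dictionary and function names are illustrative assumptions.
TONDU_DOMAIN_COUNTS = {
    "human VGLL1": 1,
    "human VGLL2": 1,
    "human VGLL3": 1,
    "human VGLL4": 2,
    "Drosophila Vg": 1,
    "Drosophila Tgi": 2,
    "Xenopus Vgll4l": 2,
}


def subfamily(n_tondu_domains):
    """Map a Tondu domain count to the subfamily name used in this review."""
    if n_tondu_domains == 1:
        return "Vgll1-3/Vg"
    if n_tondu_domains == 2:
        return "Vgll4/Tgi"
    raise ValueError("expected one or two Tondu domains")


def group_by_subfamily(counts):
    """Group protein names by subfamily, preserving input order."""
    groups = {}
    for name, n in counts.items():
        groups.setdefault(subfamily(n), []).append(name)
    return groups


if __name__ == "__main__":
    for family, members in group_by_subfamily(TONDU_DOMAIN_COUNTS).items():
        print(family, "->", ", ".join(members))
```

In practice, the domain count would come from a sequence search rather than from a hand-written table; the table is used here only to keep the sketch self-contained.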

Fig. 1

Schematic representation of the eukaryotic tree of life showing the distribution of vestigial-like and TEAD gene family members. The tree is based on the current view of phylogeny (Adoutte et al. 2000; Delsuc et al. 2006). The presence of vestigial-like and TEAD genes is indicated for each species according to published data and genomic sequences retrieved from databases. Asterisk indicates only partial sequences. Vg vestigial, Vgll vestigial-like, Vgll-4l vestigial-like-4-like, Sd scalloped, TEAD TEA domain-containing protein, TDU Tondu domain, Tgi Tondu-domain-containing Growth inhibitor, D deuterostomes, O opisthokonts, P protostomes, V vertebrates. Representative species are frog (Xenopus laevis), fish (Danio rerio), ascidian (Ciona intestinalis), amphioxus (Branchiostoma floridae), sea urchin (Strongylocentrotus purpuratus), fly (Drosophila melanogaster), worm (Trichinella spiralis), annelid (Capitella capitata), oyster (Crassostrea gigas), sea anemone (Nematostella vectensis), sponge (Amphimedon queenslandica), protist (Capsaspora owczarzaki), yeast (Saccharomyces cerevisiae), amoeba (Acanthamoeba castellanii)

The subfamily Vgll4/Tgi comprises Vgll genes that encode proteins with two Tondu domains and a novel conserved sequence (NCS) (Barrionuevo et al. 2014). These genes have been identified not only in Bilateria but also in Cnidaria, Porifera, and the unicellular filasterean species Capsaspora owczarzaki, indicating a premetazoan origin for this gene family (Koontz et al. 2013) (Fig. 1). In contrast to the Vgll1-3/vg subfamily, a single member of the Vgll4/tgi subfamily has been found in each of the species analyzed, with the exception of frog and fishes, where there are two members, Vgll4 and Vgll4l (Barrionuevo et al. 2014). This suggests that Vgll4 has not been subjected to duplication events, except in the frog and fish lineages, or that any duplicated genes have been lost, although this is unlikely. For comparison, TEAD genes have a longer evolutionary history than vestigial-like genes and are present in the unicellular free-living amoeba Acanthamoeba castellanii (Sebe-Pedros et al. 2012) (Fig. 1).

The Tondu domain as an evolutionary unit

The Tondu domain mediates the interaction with TEAD proteins and is specific to vestigial-like proteins. Outside it, the proteins show little or no similarity. In the subgroup Vgll1-3/Vg, the Tondu domain (TDU) is 24 amino acids long and is found close to the N-terminal end in vertebrate proteins, while it is located in a more central position in Drosophila Vg (Fig. 2a). In terms of sequence conservation, the Tondu domains in vertebrate genes can be ranked as VGLL2 > VGLL3 > VGLL1, ranging from total conservation in VGLL2 genes to greater divergence in VGLL1 genes. The Tondu domain is more divergent between invertebrate species but is at least 50 % identical to vertebrate domains (Fig. 2b). In the Vgll4/Tgi subfamily, the two domains TDU1 and TDU2 are each 10 amino acids long and are highly conserved from sponges to mammals and in Vgll4l proteins (Fig. 2c) (Barrionuevo et al. 2014). A striking feature is that the sequences of the two domains are very similar to the last 10 amino acids of the Tondu domain found in the Vgll1-3/Vg subfamily. Moreover, the core domain VD/ED/EHF (VxxHF), which has been found to be crucial for interaction with TEAD proteins, is highly conserved between all Vgll proteins (Fig. 2b, c, gray shading) (Pobbati et al. 2012). Because TDU1 and TDU2 are found early in evolution, before the appearance of TDU in bilaterians, they may be considered as the ancestral Tondu domain and constitute the basic evolutionary unit of Vgll proteins. This unit is present in two copies in the ancestral Vgll protein and was subjected to selection pressure to acquire the functional ability to interact with TEAD. One may speculate that the 24 amino acid TDU domain present in the Vgll1-3/Vg subfamily appeared during the transition that gave rise to bilaterians. The great conservation of the VxxHF motif between TDU1/TDU2 and TDU (Fig. 2b, c) leads us to the hypothesis that after duplication, one duplicated TDU domain (either TDU1 or TDU2) gave rise to TDU by recruiting an extra 14 amino acid contiguous region, and this novel TDU domain was then subjected to evolutionary constraint in bilaterians to maintain interaction with TEAD.
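As an illustration of how the conserved core could be located in a protein sequence, the following minimal Python sketch scans for the pattern V-[D/E]-[D/E]-H-F with a regular expression. The two sequences are hypothetical toy strings, not real Vgll proteins, and the function name `find_core_motifs` is an assumption made for this example. A match to the five-residue core alone is not sufficient to call a Tondu domain; it only flags candidate positions.

```python
import re

# Core of the Tondu domain as written in the text: V-D/E-D/E-H-F.
CORE_MOTIF = re.compile(r"V[DE][DE]HF")


def find_core_motifs(protein_seq):
    """Return the 0-based start position of every core-motif match."""
    return [m.start() for m in CORE_MOTIF.finditer(protein_seq.upper())]


# Hypothetical toy sequences used only to exercise the function.
toy_single = "MSAAKVDEHFSRALGG"          # one candidate core
toy_double = "MKTVEDHFAAAGGSSPLVDDHFQR"  # two candidate cores

if __name__ == "__main__":
    print(find_core_motifs(toy_single))  # [5]
    print(find_core_motifs(toy_double))  # [3, 17]
```

The number of hits gives a first indication of whether a sequence resembles a single-domain (Vgll1-3/Vg) or a two-domain (Vgll4/Tgi) protein, which would then need to be confirmed by alignment.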

Fig. 2

Schematic representation and alignment of representative members of the vestigial-like protein family. a Schematic representation of human (VGLL1, VGLL2, VGLL3, VGLL4A, and VGLL4B) and Drosophila (Vg, Tgi RA, Tgi RB) proteins. VGLL4A, VGLL4B, Tgi RA, and Tgi RB are alternative forms of proteins generated from alternative promoters. The size of the proteins, in amino acids, is indicated above, and the position and size of the Tondu domains (indicated in red) are drawn to scale. The position of the PPxY motif is indicated by a blue dot. b Sequence comparison of the TDU domain from the Vgll1-3/Vg subfamily from different species. Comparison has been made with the human VGLL1 TDU domain. c Sequence comparison between TDU1 and TDU2 from the Vgll4/Tgi subfamily from different species. The conserved V-D/E-D/E-HF sequence at the core of the TDU domains has been highlighted in gray. Comparison has been made with the TDU1 and TDU2 of human VGLL4. Dots indicate identical amino acid residues. d Ribbon diagram representation of the Vgll1/TEAD4 structure. The interaction between the two proteins is divided into two interfaces (gray shaded). TEAD4 is colored red, and Vgll1 is colored green (Pobbati et al. 2012)
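The dot notation used in panels b and c, and percent identity figures such as the 50 % quoted above, can be reproduced for any pair of pre-aligned domains of equal length. The sketch below is a generic illustration; the two 10-residue strings are hypothetical and do not correspond to any sequence in Fig. 2, and the function names are assumptions made for this example.

```python
def percent_identity(ref, query):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(ref) != len(query) or not ref:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ref, query))
    return 100.0 * matches / len(ref)


def dot_format(ref, query):
    """Show the query against the reference, with '.' for identical residues."""
    if len(ref) != len(query):
        raise ValueError("sequences must be of equal length")
    return "".join("." if a == b else b for a, b in zip(ref, query))


# Hypothetical 10-residue strings, not taken from Fig. 2.
ref_toy = "ACDEFGHIKL"
query_toy = "ACDQFGHVKL"

if __name__ == "__main__":
    print(dot_format(ref_toy, query_toy))        # ...Q...V..
    print(percent_identity(ref_toy, query_toy))  # 80.0
```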

The contribution of the Tondu domains in mediating interaction with TEAD has been demonstrated for both Drosophila Vg and human VGLL1, and mutational analysis showed that both N-terminal and C-terminal portions of TDU are involved in the binding of Vestigial to TEAD (Simmonds et al. 1998; Vaudin et al. 1999). Human TEAD1 is a 426 amino acid protein whose region 205–329 has been shown to be required for interaction with Vgll1 and is well conserved with the Drosophila Sd protein (Vaudin et al. 1999). The structure of the complex between the transcription cofactor Vgll1 and the transcription factor TEAD4 was recently determined, and the interaction between the proteins has been found to be mediated by an interface region of Vgll1 where the motif 41VxxHF45 is crucial (Pobbati et al. 2012) (Fig. 2d). Regarding Vgll4/Tgi, the mutation of the motif VxxHF in one of the TDU domains attenuates the interaction of Tgi with Sd, whereas mutation of the motif in the two TDU domains abolishes it (Guo et al. 2013; Koontz et al. 2013). A Tgi protein mutated on its two domains and therefore defective for Sd interaction antagonizes Sd-Yki mediated cell growth demonstrating the relevance of the Tgi-Sd interaction (Koontz et al. 2013). However, one cannot exclude that a Tgi mutated in only one TDU domain could be partially or fully functional, as this has not been tested. Surprisingly, a similar interface mediates protein interaction between TEAD (or Sd in Drosophila) and YAP (or Yorkie, Yki in Drosophila), suggesting competition between proteins for the formation of a complex with TEAD (Cagliero et al. 2013; Pobbati et al. 2012).

Gene structure and organization of the vestigial-like gene family

Vgll1-3/Vg subfamily

In most of the species on the phylogenetic tree in Fig. 1, we identified the genomic regions containing sequences related to the Tondu domain, determined the corresponding gene structures, and compared them between species. At the genomic level, vestigial-like genes display some heterogeneity but also show features that have been conserved during evolution (Fig. 3). When considering vertebrate genes that encode a protein with a single Tondu domain, VGLL1 genes display exon numbers that vary from three in fish to six in mouse, with the first exon being noncoding. Surprisingly, mouse Vgll1 is distinct from its human ortholog, with the presence of an additional intron (Fig. 3). VGLL2 genes have a more conserved structure comprising four exons and a conserved splicing pattern with an identical class of intron (not shown). However, mouse Vgll2 does not share the same characteristics and has lost one intron, comprising three exons instead of four. VGLL3 genes display the most conserved structural pattern, with identical splicing (not shown), although the size of exons differs slightly between species (Fig. 3). VGLL2 genes are of comparable size across species but are smaller than VGLL1 or VGLL3 genes; in the last two cases, the mammalian genes are larger than their orthologs due to longer intronic regions. The location of the Tondu domain in exon 2 is common to all VGLL1-3 genes. It can be noted that VGLL2 and VGLL3 both contain a conserved histidine/proline-rich motif that has similarities to the one present in the Drosophila paired protein (Frigerio et al. 1986). The motif is located at exactly the same 5′ position in exon 3 for each gene. This motif is 11 to 13 amino acids long in VGLL2, while it is 14 to 22 amino acids long in VGLL3. This paired-related domain is a transactivation domain in paired, but its function in VGLL proteins remains uncharacterized.

Fig. 3

Schematic diagram showing the genomic structure of vestigial-like gene family members in representative species. Exons and introns are shown using boxes and solid lines, respectively. Coding sequences are represented by colored boxes and untranslated regions by white boxes. The size of the coding regions, in nucleotides, is shown to scale above each exon. Intronic regions and untranslated regions are not drawn to scale. The size of genes is indicated on the right. The Tondu domains are shown in red and exons containing Tondu domains are in yellow. Paired repeat domains in VGLL2 and VGLL3 have been boxed in blue. The VGLL4 family novel conserved sequence (NCS) domain is indicated by a black box. PPxY motifs in fly, oyster, and sponge tgi are indicated by blue dots. The transcription initiation site is indicated by a black arrow

In invertebrates, the genes encoding proteins with a single Tondu domain display greater structural heterogeneity with respect to exon numbers, which range from two in worm to eight in Drosophila, whose gene is so far the most complex in the family. The Tondu domain is found in the second exon of all the species considered, with the exception of Drosophila (where it is located in the fourth exon), amphioxus, and worm (incomplete sequences). Surprisingly, the genomic sequence encoding the Tondu domain in oysters is split between two consecutive exons, with the first 20 amino acids on one exon and the last 4 amino acids on the following exon (Fig. 3).

Vgll4/tgi subfamily

When considering the Vgll4/tgi subfamily, all vertebrates except zebrafish (excluded due to an incomplete sequence) show a highly conserved structural organization with six exons and two alternative promoters (Fig. 3). Although they differ slightly with respect to exon size, vertebrate Vgll4 genes display the same splicing pattern and all have the two Tondu domains located in the last exon, strictly spaced by an 18 amino acid linker sequence (Fig. 3). Invertebrates display greater variation in terms of exon numbers, which range from three in the sea anemone to six in Drosophila. Again, the Drosophila gene is the most complex of the family and contains two alternative promoters, like its vertebrate orthologs. To date, there is no evidence of two promoters in any other invertebrate gene. Invertebrate Vgll4/Tgi genes are much smaller than their vertebrate orthologs and display a highly variable spacing between the two TDU domains, ranging from 20 amino acids in the sea anemone to 219 amino acids in Drosophila (Fig. 3). More surprisingly, the two Tondu domains are even split between two successive exons in annelid and oyster genes, suggesting a complex evolution of gene structure from the ancestral gene. Given the high conservation of the two Tondu domain sequences, one may hypothesize that they were split across two exons during the course of evolution in some lineages.

Although the vgll4l genes found in amphibians and fishes encode proteins with two Tondu domains, these proteins have limited similarity with the vgll4 proteins (less than 40 % for amphibian proteins) (Barrionuevo et al. 2014). However, vgll4l and vgll4 can be considered to be homeologous genes resulting from duplication events. They show a remarkably conserved structural organization, with five exons displaying only slight size variations (Fig. 3). In both vgll4l and vgll4, the two Tondu domains are encoded in the last exon, but they are separated by a 15 (zebrafish) or 20 (Xenopus) amino acid linker sequence in vgll4l, compared to the 18 amino acid linker region in Vgll4. The novel conserved sequence (NCS) identified by Barrionuevo et al. is found in all VGLL4/tgi genes from sponge to mammals (Fig. 3). The NCS is split between two exons in vertebrate genes, while it is contained within a single exon in other species.
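The linker lengths quoted above follow directly from the residue coordinates of the two Tondu domains. The short Python sketch below shows the arithmetic, assuming 1-based inclusive coordinates; the coordinates in the example are hypothetical values chosen only so that the result matches the 18 and 15 amino acid linkers mentioned in the text, and `linker_length` is a name introduced here.

```python
def linker_length(tdu1, tdu2):
    """Number of residues between two domains.

    Each domain is a (start, end) tuple in 1-based inclusive coordinates,
    with tdu1 located N-terminal to tdu2.
    """
    _, end1 = tdu1
    start2, _ = tdu2
    if start2 <= end1:
        raise ValueError("domains overlap or are given in the wrong order")
    return start2 - end1 - 1


if __name__ == "__main__":
    # Hypothetical coordinates for two 10-residue domains.
    print(linker_length((100, 109), (128, 137)))  # 18
    print(linker_length((50, 59), (75, 84)))      # 15
```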

Vertebrate vestigial-like synteny

Synteny analysis confirms the relationship between vertebrate vestigial-like genes (Fig. 4). Indeed, there is a remarkable conservation between mammalian (mouse and human), avian (chicken), amphibian (Xenopus), and fish (zebrafish) genes, confirming the phylogenetic analysis. Except for the amphibian vgll4 gene, which cannot be analyzed in detail due to its location on a small genomic scaffold, the order and orientation of genes that flank the different vertebrate Vgll genes are highly conserved. In zebrafish, vgll1, vgll2, and vgll4 are duplicated and correspond to homeologs (identified as a and b in Fig. 4). This is related to the ancient fish-specific genome duplication that occurred in the teleost lineage after divergence from tetrapods (Taylor et al. 2003). The conserved synteny of zebrafish vgll1a, vgll2a, and vgll4b with their vertebrate cognates indicates that they are true orthologs of mammalian genes. Vgll4l from amphibians and fishes also displays conserved synteny (Barrionuevo et al. 2014).

Fig. 4

Syntenic organization of vertebrate vestigial-like genes. Human, mouse, chicken, Xenopus, and zebrafish chromosome regions containing vestigial-like gene family members are depicted. Genes are represented as colored boxes with the arrow indicating the orientation of the transcription unit. Boxes of the same color correspond to orthologous genes. To avoid complexity, no scale is used here. For Xenopus genes, synteny was deduced from Xenopus tropicalis version 7.1 and Xenopus laevis version 6.0 or 7.2 as follows: Vgll1, sca_8 and sca_12933; Vgll2, sca_5 and sca_275342; Vgll3, Sca_2 and sca_1709; Vgll4, sca_467 and sca_62870

Expression and function of vestigial-like family members

Although a wealth of data has been generated about the function of Vestigial in Drosophila since its description and identification, the roles played by vertebrate orthologs have been little explored to date. The expression of Vgll genes in vertebrates has been characterized (Table 2), but their function remains to be described, and knockout mouse models or data from knockdown strategies are pending. Clinical studies have associated altered expression of human VGLL genes with disease, namely cancer.

Table 2 Comparison of expression patterns of vestigial-like genes in vertebrates.

Drosophila melanogaster Vg

Cloning

The molecular cloning of the vestigial gene was first carried out via a P-element tagging strategy, allowing the identification of a 46 kb DNA region containing the locus. Molecular mapping narrowed down this region to 20 kb of DNA that is necessary for Vg function (Williams and Bell 1988). Complementary DNA (cDNA) analysis allowed the characterization of a 3.8 kb messenger RNA (mRNA) encoded in the vg locus and established the structure of the gene, which is composed of eight exons, the first one being a leader exon containing no coding sequences (Williams et al. 1990, 1991) (Fig. 3). The Vg protein is 453 amino acids long and presents no significant homology with known proteins, except for two regions that contain alternating histidine residues and are similar to the histidine-proline repeat present in the paired protein (Williams et al. 1991).

Expression and regulation

Vestigial is first expressed during embryonic development in groups of cells that will form the wing and haltere discs (Williams et al. 1991). In the late third instar wing disc, vg is expressed throughout the presumptive wing region in a graded fashion and becomes more concentrated at the dorso/ventral boundary (Williams et al. 1993). Vg is also expressed in clusters of cells of the ventral nerve cord that will form interneurons and motor neurons (Williams et al. 1991; Guss et al. 2008). Adepithelial cells of the wing disc, which are precursors of thoracic muscles, express vg and form a distinct group of myoblasts (Ng et al. 1996; Sudarsan et al. 2001). Indeed, Vg is expressed in embryonic muscles (Baylies et al. 1998; Deng et al. 2009). Later, the gene is expressed in the developing indirect flight muscles (IFM), which represent the majority of the thoracic muscles of the adult and contribute to flight by deformation of the thorax.

It has been shown that the spatially restricted expression of vg in D. melanogaster is mediated by an intronic regulatory element that is strongly conserved in D. virilis (Williams et al. 1991). The wing-specific expression of vg is mediated through two enhancers (vgBE and vgQE) that are sequentially activated during development under the control of both dorsoventral and anteroposterior signaling systems that pattern the wing (Kim et al. 1996; Klein and Martinez Arias 1999; Williams et al. 1994). The overall expression of the gene is the sum of inputs from Notch, wingless, and dpp signals that are integrated in distinct cis-regulatory modules (Carroll et al. 2001). Vg expression has also been shown to be repressed by ladybird, which has a key role in leg myogenesis (Maqbool et al. 2006).

Molecular interaction

Several lines of evidence have suggested that Vg could act with a partner to perform its function, and this led to the assumption that scalloped (Sd) was this partner. Indeed, vg and sd are both required for wing development, and they present similar patterns of expression in wing discs and similar mutant phenotypes (Campbell et al. 1991, 1992; Williams et al. 1993). Vg does not display any known nucleic acid-binding motif, but is located within the nuclei of cells that are destined to form wing structures (Williams and Bell 1988; Williams et al. 1991). The necessity of interaction between Vg and Sd to promote wing development and the regulation of wing-specific gene expression has been established in independent studies (Halder et al. 1998; Paumard-Rigal et al. 1998; Simmonds et al. 1998). The Vg-Sd interaction domain has been mapped to a 56 amino acid domain of Vg, while the Vg-binding region of Sd maps to the carboxy-terminal half of the protein (Simmonds et al. 1998). Sd is part of a highly conserved family of transcription factors containing a TEA domain, and the human ortholog TEAD1 can bind to Vg with a similar affinity (Campbell et al. 1992; Jacquemin and Davidson 1997; Simmonds et al. 1998). Genetic and biochemical studies support a model in which Vg and Sd form a heterodimer or a heterotetramer that binds DNA through the TEA domain of Sd (Halder and Carroll 2001; Halder et al. 1998).

Vg has also been shown to interact with the myogenic factor mef2 through two independent domains (Deng et al. 2009). Surprisingly, even when the Sd interacting region is removed, Vg can still interact with mef2 suggesting that Vg, mef2, and Sd can form a tripartite complex that regulates late-stage Drosophila embryonic muscle development (Deng et al. 2009). This property seems to be an evolutionarily conserved trait, as mammalian Vgll2 has been found interacting with Mef2 to activate Mef2-dependent promoters and stimulate muscle differentiation induced by MyoD (Maeda et al. 2002).

Functions

Vestigial has been shown to play an essential role in the regulation of cell proliferation and differentiation within the developing imaginal disc. Vg has been categorized as a selector protein owing to its necessary role in wing formation and its ability to reprogram cells in the leg and other imaginal discs to adopt a wing fate (Kim et al. 1996; Mann and Carroll 2002; Williams et al. 1991). In the absence of a functional wild-type gene, extensive cell death occurs in the wing imaginal disc, resulting in a complete loss of wing margin structures (Fristrom 1969). The initial description that Vestigial was required for cell proliferation in the wing imaginal disc was later extended to the complex Vg/Sd, which is also required for cell survival (Delanoue et al. 2004; Kim et al. 1996; Legent et al. 2006). Furthermore, flies homozygous for a null vg allele have reduced viability, are female sterile, and show a degeneration of indirect flight muscles (Bernard et al. 2009). Indeed, Vg has been shown to be required in determining indirect flight muscle identity and in the differentiation of these muscles (Bernard et al. 2003, 2006; Deng et al. 2010; Sudarsan et al. 2001). In order for indirect flight muscle differentiation to proceed, the anti-differentiation role of the Notch pathway must be repressed by Vg, and this is mediated by fringe, a member of the Notch pathway whose expression is lost in vestigial mutants (Bernard et al. 2006; Caine et al. 2014).

The activities of Sd/Vg complexes depend on their subcellular localization. Indeed, Vg is mainly cytoplasmic but becomes nuclear in the presence of Sd (Halder et al. 1998; Simmonds et al. 1998; Wu et al. 2008; Zhang et al. 2008; Goulev et al. 2008). This effect depends on a nuclear localization signal present in Sd (Magico and Bell 2011). Interestingly, it has been shown that the Hippo signaling pathway can promote Sd and Sd/Vg cytoplasmic localization, thus affecting their transcriptional activity regardless of Yki (Cagliero et al. 2013). There is in vitro evidence that Drosophila Vg can change the DNA target selectivity of Sd (Halder and Carroll 2001). Several observations corroborate these conclusions in vivo. Indeed, vg expression in the wing imaginal disc is under the control of the Notch pathway. It has been shown that in the central region of the wing pouch where Notch is active, Notch prevents the expression of Sd/Yki target genes (expanded, Diap1, dmyc) by inducing high levels of vg expression, promoting Sd/Vg complexes (Djiane et al. 2014). This indicates that the ratio between Sd, Yki, and Vg is important in determining whether Sd forms a complex with Yki or with Vg, allowing the induction of different sets of target genes. The same situation probably exists in mammals, where TEAD4/VGLL1 activates target genes that are not activated by TEAD4/YAP, and vice versa (Pobbati et al. 2012). For example, insulin-like growth factor binding protein 5 (IGFBP-5) is induced by TEAD/VGLL1 and not by TEAD/YAP. This conclusion is supported by the observation of competition between VGLL1 and YAP or Vg and Yki for Sd binding, in which VGLL1 and Vg successfully bind to Sd. This suggests that Sd binds to Vg or Yki in a mutually exclusive manner, excluding the formation of a trimeric complex between Sd, Vg, and Yki (Cagliero et al. 2013; Pobbati et al. 2012).

Vgll1 (TONDU)

Human VGLL1 was first described during a search for a human homolog of Drosophila Vg (Vaudin et al. 1999). The human cDNA identified was named TONDU (or TDU) and encodes a 258 amino acid protein whose homology with Drosophila Vg is limited to a 24 amino acid domain that is essential for its interaction with Sd. Among the four vertebrate VGLL proteins, VGLL1 is the least well conserved between species, showing less than 30 % sequence identity outside the Tondu domain. TONDU was shown to be a transcriptional activator through its interaction with TEAD and was able to rescue the loss of Vg function in Drosophila (Vaudin et al. 1999).

In human fetal tissues, VGLL1 is expressed in the lung and kidney, while its expression in adult tissues is enriched in the placenta (Table 2) (Maeda et al. 2002; Vaudin et al. 1999). In Xenopus, vgll1 is expressed in the prospective epidermis of the embryo and solely in the skin of adults (Faucheux et al. 2010). Although the role played by Vgll1 in development and disease in mammals is not known, a recent report identified miR-934, which is markedly overexpressed in breast cancer, in intron 4 of the human VGLL1 gene (Castilla et al. 2014). A correlation was found between VGLL1 and miR-934 expression, suggesting a potential oncogenic role in breast cancer (Castilla et al. 2014). These observations may be linked to studies showing that Vgll1 can form a complex with TEAD4, activating several proliferation-promoting genes and thus facilitating anchorage-independent cell proliferation (Pobbati et al. 2012). However, a specific function of VGLL1 in human cancer inferred from in vivo studies remains to be established.

Vgll2 (VITO-1)

The human VGLL2 gene was identified in a BLAST search for genes containing sequences homologous to Drosophila vg, and a cDNA was then amplified from human skeletal muscle (Maeda et al. 2002). Concomitantly, a cDNA encoding VGLL2 (or VITO-1) was isolated using a subtractive hybridization approach to identify genes expressed in human skeletal muscle (Mielcarek et al. 2002). Human VGLL2 is 317 amino acids long, with a Tondu domain that is identical in all vertebrate proteins, which display 60 % identity over the whole protein. Vgll2 has also been characterized in zebrafish, chicken, and Xenopus (Bonnet et al. 2010; Faucheux et al. 2010; Hamade et al. 2013; Mann et al. 2007).

In all the vertebrate species analyzed, Vgll2 expression was mainly localized to the somitic myotomes of the early embryo and, later, to the skeletal muscle (Fig. 5a (a, c, d, e) and Table 2). This is in agreement with the upregulation of Vgll2 expression observed during muscle differentiation of C2C12 myoblasts (Maeda et al. 2002; Mielcarek et al. 2002). In chicken and Xenopus embryos, Vgll2 is expressed downstream of myogenic factors (Bonnet et al. 2010; Faucheux et al. 2010). Vgll2 has no myogenic activity per se, but it can enhance the MyoD-mediated myogenic conversion of fibroblast cells through its binding to TEAD, and is a prerequisite for skeletal muscle differentiation, indicating a crucial role as a cofactor in the muscular regulatory program (Fig. 5a (b)) (Chen et al. 2004a; Gunther et al. 2004; Maeda et al. 2002). Vgll2 not only interacts with TEAD but is also able to interact with the muscle regulatory factor Mef2 to activate Mef2-dependent promoters in cell culture (Maeda et al. 2002). The analysis of six1/six4 mutant mice suggests that Vgll2 function could be related to the activation of slow-type muscle genes during muscle development. Indeed, Vgll2 expression is higher in slow-type myofibers than in fast-type myofibers and is not affected in six1/six4 mouse mutants, in which the fast-type muscle program is disrupted (Niro et al. 2010). Remarkably, the function of Vgll2 in the muscle differentiation program seems to be a conserved feature in evolution. In the Xenopus embryo, the knockdown of vgll2 through a morpholino-based strategy impairs hypaxial muscle development (Fig. 5a (d)) (Faucheux et al. 2010 and our unpublished data). Besides its major expression pattern in muscle, Vgll2 is also expressed in the vertebrate embryo in pharyngeal arches, in the pituitary anlagen, and in a discrete region of the ventral forebrain that corresponds to the future ventromedial hypothalamus (Fig. 5) (Table 2) (Bonnet et al. 2010; Faucheux et al. 2010; Johnson et al. 2011; Kurrasch et al. 2007; Maeda et al. 2002; Mielcarek et al. 2002). A morpholino-based knockdown analysis in zebrafish embryo pharyngeal arches showed that vgll2a is required for the survival of the pharyngeal endoderm and plays a role in the development of the neural crest cell-derived craniofacial skeleton (Fig. 5a (e)) (Johnson et al. 2011). Surprisingly, a genotype-phenotype correlation has been made in humans, with an emphasis on VGLL2 as part of a chromosome 6 deletion in individuals who display, among other phenotypic traits, dysmorphic features of the face, obesity, and hyperphagia (Rosenfeld et al. 2012). These features could indeed be linked to altered VGLL2 expression in both the pharyngeal arch and the ventromedial hypothalamus.

Fig. 5

Comparative expression and function of vertebrate VGLL2 and VGLL3. a VGLL2 expression and function in vertebrates. (a) Vgll2 expression in E9.5 and E11.5 mouse embryo, analyzed by whole mount in situ hybridization, is detected in branchial arches, somites, and forelimb. (b) The expression of myosin heavy chain in myotubes (arrow) detected by immunofluorescence in 10T1/2 fibroblasts transfected by MyoD (ctrl) is enhanced when Vgll2 is co-expressed with MyoD (+Vgll2). Knockdown of Vgll2 expression (-Vgll2) attenuates C2C12 myoblast differentiation and myotube (arrow) formation when compared to control cells (ctrl). (c) Vgll2 expression in chicken embryo at 20 somites (20so) and HH22 stage, analyzed by whole mount in situ hybridization, is detected in somites. High magnification shows Vgll2 expression in HH24 stage embryo somites. Transverse section of HH24 chick embryo at forelimb level shows Vgll2 expression in muscle mass (arrows). (d) Vgll2 expression in stage 30 (st30) and stage 38 (st38) Xenopus embryo marks somites, branchial arches, hypaxial, and head muscle. The hypaxial and head muscles of stage 42 (st42) Xenopus embryo are affected in vgll2a morphant embryos (-vgll2) when compared to control embryo (Ctrl). (e) Vgll2b expression in 25 somites stage (25so) zebrafish embryo and vgll2a expression in 22 hpf stage embryo mark somites and branchial arches. The pharyngeal cartilage of 5dpf zebrafish embryo is affected in vgll2a morphant embryos (-vgll2) when compared to control embryo (Ctrl). b VGLL3 expression and function in vertebrates. (a) Vgll3 expression in E8.5 and E9.5 mouse embryo, analyzed by whole mount in situ hybridization, is detected in brain and somites (left panel). Transverse section at the position indicated in E8.5 (asterisk) and E9.5 embryo (double asterisks) is shown in the right panel. (b) Vgll3 expression in Xenopus embryo is detected in hindbrain rhombomere 2 (r2) in blue staining at stage 15 (transverse section) and stage 24 (dorsal view). In stage 16 (anterior view), vgll3 is detected in red and en2 is detected in the midbrain-hindbrain boundary (blue). (c) Human sarcoma cells in which VGLL3 expression has been knocked down by shRNA (-VGLL3) have a reduced proliferation rate compared to control cells (Ctrl). (d) The migration of human sarcoma cells in wound healing assay is impaired when VGLL3 has been knocked down by shRNA when compared to control cells (Ctrl). Ba branchial arches, br brain, ch ceratohyal, en2 engrailed 2, fl forelimb, HH Hamburger Hamilton stage, hm head muscle, hym hypaxial muscle, m Meckel’s cartilage, r rhombomere, s somites. Adapted from (Bonnet et al. 2010; Faucheux et al. 2010; Gunther et al. 2004; Helias-Rodzewicz et al. 2010; Johnson et al. 2011; Maeda et al. 2002; Mann et al. 2007; Mielcarek et al. 2009)

Vgll3 (VITO-2)

Human VGLL3 was first identified on chromosome 3 as a partial sequence corresponding to the TEAD interaction domain (Maeda et al. 2002). Mouse Vgll3 cDNA was characterized using in silico screening of EST databases with the Tondu domain, and named VITO-2 owing to 96 % similarity with VITO-1 (Vgll2) over the TEAD interaction domain (Mielcarek et al. 2009). Vgll3, like Vgll1 and Vgll2, has been found to interact functionally with TEAD1 in a mammalian two-hybrid assay (Kitagawa 2007). Vgll3 has also been cloned and characterized in the amphibian Xenopus (Faucheux et al. 2010). The vertebrate gene is the best conserved of the gene family in terms of structure, displaying four exons and a conserved splicing pattern (Fig. 3). Only slight differences in exon size can be observed between the different Vgll3 orthologs, giving rise to proteins that range from 310 amino acids in the frog to 330 in zebrafish. One peculiarity of the vertebrate proteins is the presence of a histidine repeat of 6 or more residues, a relatively uncommon feature found in only 86 human proteins (Salichs et al. 2009). While human VGLL3 contains a single repeat of seven histidines, the mouse protein contains a repeat of six histidines and the amphibian protein contains two histidine repeats (of 7 and 8 amino acids). Proteins containing histidine repeats are mostly involved in DNA- and RNA-related functions, and these repeats cause them to accumulate in nuclear speckles (Salichs et al. 2009). Surprisingly, most of these proteins are expressed in the brain and/or during nervous system development, as observed for Vgll3 (Table 2).

In the mouse embryo, Vgll3 is transiently expressed in the neuroepithelium at the location of the future midbrain before its expression becomes apparent in somites at E9.5, followed by expression in dorsal root ganglia at E13.5 (Fig. 5b (a)) (Mielcarek et al. 2009). In the Xenopus embryo, vgll3 has a unique expression pattern and is a specific marker of hindbrain rhombomere 2 (Fig. 5b (b)) (Table 2) (Faucheux et al. 2010). Expression of Vgll3 has also been found in the chicken embryo hindbrain (Duprez D, personal communication). Because its expression is restricted to rhombomere 2, Vgll3 constitutes a good marker for the analysis of the regulatory network that controls the development of this hindbrain domain. During mouse development, Vgll3 also marks the myogenic lineage (Fig. 5b (a)) (Mielcarek et al. 2009). Surprisingly, Vgll3 was the most strongly upregulated gene observed in a screen designed to identify targets of the myogenic regulator Pax3, suggesting that it could act downstream of Pax3 in the regulatory muscle program (Lagha et al. 2010). In adult human tissues, VGLL3 is mainly expressed in the placenta, while it is more widely expressed in the adult mouse and amphibian, where it is found in the brain, kidney, liver, stomach, or heart (Table 2) (Faucheux et al. 2010; Mielcarek et al. 2009).

A recent report using an in vitro cell differentiation system has highlighted a role for Vgll3 as a negative regulator of the adipocyte differentiation program (Halperin et al. 2013). Indeed, Vgll3 expression is downregulated during adipogenesis, and adipocyte differentiation can be either inhibited by overexpression of Vgll3 or promoted by its knockdown. The Vgll3 locus has been identified in a Genome-Wide Association Study (GWAS) as linked to age at puberty in humans (Cousminer et al. 2013). Surprisingly, two recent GWAS revealed that Vgll3 is also involved in age at maturity in salmon, suggesting a conserved mechanism for timing puberty in vertebrates (Ayllon et al. 2015; Barson et al. 2015).

D. melanogaster Tgi (or SdBP)

Tgi was originally identified at the same time as human VGLL4 but was functionally characterized in a screen seeking genes that regulate the Hippo pathway in Drosophila (Chen et al. 2004b; Koontz et al. 2013). Tgi was also identified (under the name SdBP for Sd-Binding-Protein) in a yeast two-hybrid screen aiming to identify new Sd partners (Guo et al. 2013). Overexpression of Tgi was found to decrease eye and wing size (Guo et al. 2013; Koontz et al. 2013). Tgi is ubiquitously expressed in wing and eye discs and in the adult fly, and can produce two transcripts (RA and RB in Fig. 2a) through alternative promoters; the transcripts differ only in the first coding exon. The two proteins produced by tgi (RA and RB) are 536 and 383 amino acids long, respectively, but show no distinguishable difference in function or subcellular localization (Guo et al. 2013). In contrast to Vg, Tgi contains two Tondu domains (TDU1 and TDU2) that are 10 amino acids long and are highly conserved with the two domains found in vertebrate VGLL4. In addition to the Tondu domains that mediate interaction with Sd, Tgi has three PPxY motifs enabling it to interact with WW domain-containing proteins such as Yorkie (Yki) and members of the Hippo pathway (Fig. 2a). Tgi stands for Tondu-domain-containing Growth inhibitor and, accordingly, it was demonstrated to act as a repressor of the Hippo pathway by direct binding to Sd, thus inhibiting transcriptional activation by the Sd-Yki complex (Fig. 6a (a)) (Guo et al. 2013; Koontz et al. 2013).

Fig. 6

Conserved function of Tgi/VGLL4 in Hippo signaling and organ size control. a VGLL4/Tgi inhibits Yki/YAP-induced overgrowth and acts like a tumor suppressor in cancer. (a) Overgrowth of Drosophila compound eye induced by Yki, analyzed by scanning electron microscope, is reduced by Tgi co-expression. (b) Overgrowth of Drosophila eyes that overexpress YAP is inhibited by human VGLL4 overexpression. (c) Overexpression of human VGLL4, but not a protein mutated on Tondu domains (VGLL4 T1/2), induces Drosophila wing size reduction. (d) Yap induces mouse liver overgrowth and hepatocellular carcinoma that is suppressed by VGLL4. (e) VGLL4 inhibits A549 lung cancer cell growth in vitro. (f) VGLL4 reduces the size of induced lung tumors in mouse. (g) Expression of VGLL4 but not VGLL4 deleted of two Tondu domains (VGLL4T1/2) downregulates the expression of TEAD-induced CTGF gene. Adapted from Guo et al. 2013; Koontz et al. 2013; Zhang et al. 2014. b Models for the function of Tgi/VGLL4. Yorkie (Yki) in Drosophila or YAP in vertebrates are the effectors of the Hippo signaling (Hpo) with Scalloped (Sd) in Drosophila or TEAD in vertebrates, respectively. (a) When Hpo signaling is on, Yki/YAP is phosphorylated and cytoplasmic. The nuclear Tgi/VGLL4 can bind to Sd/TEAD and repress gene targets. According to data in Drosophila, additional Sd corepressors (protein X) likely exist. In the absence of Hpo signaling, Yki/YAP is nuclear and activates gene targets through Sd/TEAD binding. Model adapted from Koontz et al. 2013. (b) Tgi forms a trimeric complex with Yki and Sd in which Tgi-Yki interaction is as important as Tgi-Sd interaction. When Hpo signaling is on, Tgi competes with Yki for Sd binding and leads to an inactivation of Sd/Yki transcriptional activity while Tgi simultaneously interacts with Yki resulting in its nuclear retention and dysfunction (dashed red arrow). In the absence of Hpo signaling or when Yki is overexpressed, Yki nuclear concentration increases and can compete efficiently in a concentration-dependent manner with Tgi to activate gene targets through Sd binding. Model adapted from Guo et al. 2013

Phylogenetic analysis combined with experimental data established that VGLL4 is the mammalian ortholog of Tgi. Indeed, human VGLL4 can suppress eye overgrowth induced by YAP and can phenocopy Tgi-dependent suppression of tissue overgrowth when overexpressed in Drosophila (Fig. 6a (b)) (Koontz et al. 2013). Moreover, overexpression of VGLL4, like that of Tgi/SdBP, results in a decrease in wing size in a Tondu-dependent manner (Fig. 6a (c)). Furthermore, a recent study demonstrated a potentially conserved role for VGLL4 in growth control by antagonizing the Hippo pathway in a transgenic mouse model of hepatocellular carcinoma induced by overexpression of YAP (the mammalian Yorkie ortholog). In this model, VGLL4 can block YAP-induced overgrowth and tumorigenesis (Fig. 6a (d)) (Koontz et al. 2013). Since these findings, several reports have confirmed the anti-oncogenic functions of VGLL4 in various models (Fig. 6a (e–g)) (Jiao et al. 2014; Li et al. 2015a; Zhang et al. 2014).

From these observations, a model has been proposed in which Tgi/VGLL4 is a cofactor involved in Sd/TEAD’s default repressor function (Fig. 6b (a)) (Koontz et al. 2013). The link between this Sd default repressor function and Tgi illustrates that while Yki needs to interact with Sd to regulate tissue growth, Sd is not essential for tissue growth and cell survival. Indeed, in the absence of Yki, Sd/Tgi complexes repress target genes involved in tissue growth and cell survival, while in the absence of Sd, these genes remain expressed at basal levels. In this model, physiological levels of Yki do not promote normal tissue growth by “activating” transcription, but rather by relieving the repressor activity of Sd. The possible presence of additional Sd corepressors (protein X in Fig. 6b (a)) has been suggested, since Tgi accounts for only part of Sd-mediated default repression (Koontz et al. 2013). Another model has been proposed in which Sd, Yki, and Tgi form a trimer and where the Tgi-Yki interaction is as important as the Tgi-Sd interaction (Fig. 6b (b)) (Guo et al. 2013). In this model, Tgi and Yki compete with each other for Sd binding in a concentration-dependent manner. When Tgi is overexpressed, it forms a repressive complex with Sd-Yki in the nucleus and inhibits their activity. In the absence of Hpo signaling or in the case of Yki overexpression, abundant Yki proteins translocate into the nucleus and bind to Sd, whether or not Tgi is present (Fig. 6b (b)).

Tgi and Vg, the two Tondu-domain-containing proteins in Drosophila, differ not only in their structure (one versus two Tondu domains) but also in their function as either an activator (Vg) or repressor (Tgi) through Sd binding. Another remarkable difference between the two proteins is the presence of three PPxY sites in Tgi. The presence of these motifs very early in the evolution of the Vgll family suggests that the function conferred by the TDU ancestral unit is related to the interaction with YAP family members and, like Hippo signaling, most probably predates metazoan origins (Sebe-Pedros et al. 2012).

Vgll4 and Vgll4l

Human VGLL4 was identified in databases as a human protein containing two TDU domains (Chen et al. 2004b). The two domains are 10 amino acids long, separated by an 18 amino acid linker region, and totally conserved among vertebrate proteins. Initially, a single type of Vgll4 protein was found in databases, but other proteins with distinct N-terminal sequences were subsequently identified. A comparison between mRNA and genomic sequences reveals that the different Vgll4 proteins are produced from two alternative promoters (Fig. 3). A close inspection of vertebrate Vgll4 genes indicates that mammalian, avian, and amphibian genes have a totally conserved structure comprising two promoters and six exons, and can encode proteins of 282 to 296 amino acids. Surprisingly, the Drosophila Vgll4 ortholog Tgi also has two promoters, but the two TDU domains are separated by a 219 amino acid linker region instead of the 18 amino acid linker found in vertebrate proteins. Another difference between Tgi and Vgll4 proteins is the presence of three PPxY motifs in the Drosophila protein, allowing interaction with WW domain-containing proteins such as YAP, while such motifs are absent in the vertebrate protein (Koontz et al. 2013). Oyster and sponge Tgi proteins also contain a PPxY motif (Fig. 3), suggesting that Vgll4/Tgi was already able to interact with YAP/Yorkie in early evolution, but lost this ability following the separation of protostomes and deuterostomes.

The two TDU domains of Vgll4 have been shown to be involved in the interaction with TEAD proteins in both Drosophila and mammalian cells (Chen et al. 2004b; Guo et al. 2013; Koontz et al. 2013). Like VGLL2, VGLL4 has also been shown to be capable of interacting with Mef2 (Chen et al. 2004b). Interferon regulatory factor 2 binding protein 2 (IRF2BP2) was also found to interact with VGLL4 in a yeast two-hybrid screen (Teng et al. 2010). In contrast to other Vgll genes, Vgll4 is ubiquitously expressed, although its expression level varies between tissues (Table 2) (Chen et al. 2004b; Faucheux et al. 2010).

The function of VGLL4 was first characterized in cardiac myocytes, where it regulates α-1 adrenergic dependent gene expression (Chen et al. 2004b). In these experiments, VGLL4 was found to be a negative regulator of both α-1 adrenergic and skeletal actin promoter transcription mediated by TEAD1 (Chen et al. 2010). VGLL4 was also shown to be a positive regulator of survival in human embryonic stem cells (hESCs), where it can decrease cell death induced by stress (Tajonar et al. 2013). This is in contrast with the finding that VGLL4 is able to relocalize IAP proteins (inhibitor of apoptosis) to the nucleus, thus counteracting their anti-apoptotic activity (Jin et al. 2011).

In addition to Vgll4, a second gene named vgll4l has been identified in zebrafish and Xenopus (Barrionuevo et al. 2014; Melvin et al. 2013) (Fig. 3). The amphibian and fish vgll4l genes encode proteins of 252 and 266 amino acids that have only 30 to 40 % identity with vgll4, except over the TDU domains. The absence of these genes in the chicken, mouse, or human suggests that they are homeologs that have evolved from a duplication event in the lineage shared by fishes and amphibians (Barrionuevo et al. 2014). In Xenopus, vgll4 and vgll4l show some differences in their embryonic expression but are both expressed in migrating neural crest cells, as in the zebrafish embryo, which could explain why vgll4l knockdown affects craniofacial development in zebrafish (Table 2) (Melvin et al. 2013).

Vestigial and cancer

Several studies have pointed to a potential role of Vgll3 in the regulation of tumorigenesis. This stems from a study revealing that the transfer of a fragment of human chromosome 3 could suppress the tumorigenicity of an ovarian cancer cell line (Cody et al. 2007). Among the genes present on the transferred chromosome is VGLL3, whose expression was upregulated in transformed cells and underexpressed in malignant ovarian tumor samples. This suggests that VGLL3 plays a role in the tumor suppression pathway (Cody et al. 2009; Gambaro et al. 2013). More recently, studies seeking to characterize biomarkers for the estimation of prostate cancer survival identified VGLL3 as part of a signature to categorize patient survival (Peng et al. 2014). However, opposite conclusions were reached regarding soft tissue sarcomas, where VGLL3 overexpression was found to be correlated with gene amplification (Antonescu et al. 2011; Hallor et al. 2009; Helias-Rodzewicz et al. 2010). In such sarcomas, the TEAD cofactor YAP1 was also found to be amplified with VGLL3, and the inhibition of expression of both genes in derived cell lines led to anti-oncogenic features such as a decrease in both cell proliferation and cell migration (Fig. 5b (c, d)) (Helias-Rodzewicz et al. 2010). VGLL3 has been found to be overexpressed in the cartilage of adult patients with endemic osteoarthritis, compared to normal joint cartilage (Wang et al. 2009). A recent case report describes a patient showing a growth pattern problem and carrying a microdeletion of the chromosome 3 region that contains VGLL3 (Gat-Yablonski et al. 2011).

More recently, findings from several studies have shed new light on the function of VGLL4. Initially, Vgll4 was identified as a candidate tumor suppressor gene in pancreatic cancer (Mann et al. 2012). Studies carried out in Drosophila and mammalian cells extended these data, independently demonstrating that VGLL4 regulates the Hippo pathway by directly competing with the oncoprotein YAP for binding to TEADs, thus blocking the function of the core component of the pathway (see above and Fig. 6b) (Guo et al. 2013; Koontz et al. 2013). VGLL4 has since been shown to act as a tumor suppressor in lung and liver cancer (Fig. 6a (d–g)) (Koontz et al. 2013; Zhang et al. 2014). Interestingly, the two TDU domains of VGLL4 are sufficient to inhibit YAP activity, and a peptide-based YAP inhibitor was found to exhibit potent antitumor activity against gastric cancer in mice (Jiao et al. 2014). The downregulation of VGLL4 in gastric cancer cells is correlated with the expression of miR-222, whose level can be enhanced by TEAD1 (Li et al. 2015b). This suggests a positive regulatory loop that keeps VGLL4 tumor suppressor activity low. A recent report on an esophageal carcinoma model further reinforces the argument that VGLL4 acts as a tumor suppressor in a variety of human cancers (Jiang et al. 2014). These observations suggest that the tumor suppressor property of VGLL4 could be the direct consequence of its action in the Hippo pathway at the level of the core complex TEAD/YAP (or Scalloped/Yorkie).

Conclusions and future directions

Vestigial-like proteins constitute a family of cofactors that share a highly conserved domain (Tondu domain) that is crucial to the formation of a complex with members of the TEAD family of transcription factors. Although the Tondu domain was first described in Drosophila (Vestigial), it has a more ancient evolutionary history and was already present in a single premetazoan gene, which then evolved to give four genes in mammals (Vgll1-4). Two subfamilies of proteins have appeared over the course of evolution. The first subfamily, corresponding to Vgll4/Tgi, encompasses proteins with two Tondu domains encoded by a single gene present from premetazoans to mammals. Vgll1-3/Vg forms a second subfamily comprising proteins with a single Tondu domain, found only in bilaterians. The exact functions of these different proteins are still largely unknown, and, until now, the wealth of data on Drosophila Vg has not shed any light on this matter, as the formation of the wing appendage cannot be readily extrapolated to vertebrates. However, the recent discovery of Tgi has opened new avenues to our understanding and indicates that the two Drosophila Tondu-domain-containing proteins fulfill distinct functions. Vestigial has a restricted expression and is instrumental in wing formation as a transcriptional coactivator of Scalloped, while Tgi is widely expressed and functions as a transcriptional corepressor of Scalloped at the end of the Hippo pathway. Vgll1 and Vgll4 can replace Drosophila Vg and Tgi, respectively, suggesting a functional conservation that is mainly mediated by the Tondu domain. However, several questions remain to be answered before a comprehensive picture can emerge. What determines the target selectivity of Tgi/Vgll4 versus Vgll1-3/Vg, and what are their target genes? These issues could be addressed through the use of knockout lines combined with microarray analyses. What makes Tgi/Vgll4 a repressor, and Vgll1-3/Vg an activator?
Vgll4/Tgi and YAP/Yki can compete for TEAD/Sd binding, but is there any competition between Vgll1-3/Vg and Vgll4/Tgi for TEAD/Sd binding, and if so, what are the consequences? Until now, TEAD has been found to be the major transcription factor bound by Vgll. However, we cannot exclude other proteins that could form a multiprotein complex with vestigial-like proteins, either in combination with TEAD or not, and this complicates the issue further. Such a situation has been described in Drosophila, where Vg is able to interact with Sd and Mef2 through different domains to form a multiprotein complex. Since Vgll1 and Vgll4 have been shown to be able to compete with YAP for binding to TEAD, a functional distinction between the two subfamilies is now less obvious. A potential implication of Vgll1-3 in cancer progression, in a similar way to Vgll4, should be considered. This is particularly relevant in the case of human VGLL3, whose expression has been found to be either anti-oncogenic or pro-oncogenic, depending on the type of cancer. How can these opposite functions be integrated? Are they connected to the Hippo pathway, and if so how? We clearly need to know more about structure/function relationships for both Vgll4/Tgi and Vgll1-3/Vg. Recent findings have revealed novel cross talk between the TGFβ and Hippo signaling pathways, in which SMAD interacts with TEAD and TEAD mediates TGFβ-dependent gene activation (Beyer et al. 2013; Hiemer et al. 2014). Given that Vgll4 interacts with TEAD, one interesting issue to address is where Vgll4 fits into the TGFβ signaling cascade.

Very few functional knockdown studies have been performed to date, and loss-of-function studies in mouse could provide useful information about not only the effect of vestigial-like proteins on growth and differentiation but also their potential redundancy. Vgll2 knockdown in zebrafish has revealed its implication in craniofacial formation and neural crest survival. However, because the major site of Vgll2 expression in vertebrates is the skeletal muscle lineage, we may expect dramatic effects as a result of its knockdown in this lineage. In this respect, it would be interesting to know how Vgll2 can be integrated into the regulatory gene network that governs muscle differentiation. The use of vertebrate models such as Xenopus or zebrafish in conjunction with the morpholino antisense knockdown strategy will also provide relevant information about the functions of the different Vgll genes. Studies in basal bilaterian models such as sea urchin, ascidian, or amphioxus might shed new light on the early functions of the vestigial-like subfamilies.

One striking observation is the strong evolutionary conservation of the Tondu domain. This suggests that this unusually short domain performs unique functions that have been maintained under selective pressure for over 500 million years. Indeed, one can hypothesize that this domain constitutes an evolutionary unit that has been selected to maintain the formation of the Vgll/TEAD complex. One intriguing feature is the presence of four distinct Vgll and four distinct TEAD proteins in vertebrates, whereas there are only two Vgll proteins and one TEAD protein in other bilaterians. Whether Vgll proteins coevolved with TEAD proteins in a duplication/specialization process is not known but deserves attention. Another related issue is whether some Vgll proteins interact preferentially with some TEADs and, if so, in what physiological context.

As there are no other obvious domains in vestigial-like proteins, one may wonder what roles, if any, could be played by other regions of the proteins and how these regions affect the global function of the complex. Because TEAD proteins are widely expressed and involved in many cellular processes, vestigial-like proteins can be considered to act at the core of multiple signaling pathways, ensuring fine tuning by either activating or repressing transcription of target genes. We are only at the beginning of the story, and vestigial-like proteins have many more functions to reveal in the future. And thanks to the vestigial wings of Drosophila, the story has already taken off.