Introduction

The Zic gene family encodes a group of C2H2 zinc-finger transcription factors, which are important regulators of early vertebrate development. They are part of a larger Gli/Zic/NKL gene superfamily and, together with the Gli genes, are thought to provide positional information within the developing embryo (Brewster et al. 1998). The Zic genes are typically expressed in ectodermal tissues contributing to the nervous system and neural crest, as well as somitic mesoderm (Grinblat and Sive 2001; Toyama et al. 2004). There is strong experimental support for a combined role of the Zic genes in neurulation, neurogenesis, neural crest specification, and establishment of left–right asymmetry (reviewed by Aruga 2004). Deficits in Zic gene family members have been linked to developmental defects such as spina bifida, holoprosencephaly, and X-linked heterotaxia (reviewed by Grinberg and Millen 2005).

Understanding the significance of Zic gene function during embryonic development is confounded by their broadly overlapping expression, with the potential for competition for DNA-binding sites as well as cross-regulatory and physical interactions among orthologues (Grinblat and Sive 2001; Koyabu et al. 2001; Mizugishi et al. 2001; Nakata et al. 2000; Toyama et al. 2004). It is therefore essential to define the combined expression of the Zic gene family members and understand their evolutionary relationships. Although there is significant conservation in the structure of the Zic protein DNA-binding domain, consisting of five zinc-fingers, there is also considerable divergence in other parts of the protein that may be correlated with altered post-translational regulation, protein–protein interactions and repressor/activator activities (Aruga et al. 2006). The evolutionary diversification among family members may, however, be constrained by their physical arrangement as paired genes (bigenes) of divergent orientation in the genome that share a limited amount of “upstream” DNA. Four known vertebrate homologues occur as zic1/zic4 and zic2/zic5 bigenes, with the exception being zic3, which is a single-gene locus, located on the X-chromosome in mammals (Aruga et al. 2006).

We describe in this paper the structure, genomic context, and embryonic expression of zebrafish zic6 and use those pieces of evidence to infer the evolutionary relationships of the Zic family members in the Euteleostomi. The zic6 gene was found to be teleost-specific, occurring among a broad range of fishes, but absent from the genomes of frogs, birds, and mammals. Genomic analysis established that zic6 is paired with zic3, in opposite orientation, as is the case with the zic1/zic4 and zic2a/zic5 gene pairs. Synteny of flanking genes confirmed that the zic3 loci of fish and the other vertebrate taxa are true homologues, supporting the conclusion that zic6 was the product of a chromosomal duplication before the divergence of fishes and tetrapods and was subsequently lost in the tetrapod lineage. The expression of zic6 in the neural plate lacked the lateral and rostral domains typical of the other Zic gene orthologues, indicating it has evolved a highly derived if not entirely new regulatory role during early embryonic development of fish.

Materials and methods

Fish maintenance

Zebrafish, Danio rerio (strain AB*), were maintained at 28.5°C on a 14:10 light/dark cycle. All fish were housed at a density of ≤15 individuals per 2-l tank on a recirculating water system and fed daily with dry flake/krill and live brine shrimp. Embryos were fixed overnight at 4°C in 4% paraformaldehyde/phosphate-buffered saline (PBS), manually dechorionated and stored in absolute methanol at −20°C. Procedures were approved by the NIH ACUC.

Sequence analysis

Predicted protein sequences were obtained for previously described loci by TBLASTN query of the GenBank database (Tables 1 and 2), and undescribed loci were predicted from the Ensembl genomic DNA assemblies (Table 3). Amino acid sequences were aligned with Multalin (http://bioinfo.genopole-toulouse.prd.fr/multalin/) using the BLOSUM62 matrix and formatted with Se-Al v2.0a11 (http://evolve.zoo.ox.ac.uk/). Genetic distance, identity, neighbor-joining, and maximum likelihood calculations were performed with PHYLIP 3.66 (http://evolution.genetics.washington.edu/phylip/phylipweb.html). Synteny analysis utilized genome assemblies from NCBI Map Viewer and the Ensembl Genome Browser.
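
To make the identity comparison concrete, the following is a minimal Python sketch of how pairwise percent identity can be computed from sequences that are already aligned. It is an illustration only and does not reproduce the Phylip calculations used in this study. The function names, the choice to skip any column containing a gap, and the toy sequences are all assumptions made for the example.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: gapped columns are excluded from the denominator. Other
    conventions (e.g. dividing by the full alignment length) give other values.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(seq_a.upper(), seq_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def identity_matrix(aligned):
    """Return {(name_a, name_b): percent identity} for every unordered pair."""
    return {
        (name_a, name_b): percent_identity(aligned[name_a], aligned[name_b])
        for name_a, name_b in combinations(aligned, 2)
    }


if __name__ == "__main__":
    # Hypothetical toy alignment, not real Zic sequences.
    toy = {"toyA": "CKWEGC-KEF", "toyB": "CKWDGCDKEF", "toyC": "CRW-GCDREF"}
    for pair, value in identity_matrix(toy).items():
        print(pair, round(value, 1))
```

The resulting dictionary can be arranged as a matrix of the kind shown for full-length proteins in Fig. 1b; the denominator convention should be stated whenever such values are reported.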

Table 1 Comparisons of conserved domains among Zic protein family members, exclusive of the zinc-finger domains
Table 2 Accession numbers for amino acid sequences based on cDNA
Table 3 Ensembl-predicted genes from fish genome projects used to determine amino acid sequences of homologues used in the maximum likelihood analysis

cDNA cloning and in situ hybridization

The zebrafish zic6 coding region was amplified from total RNA of 24-h post-fertilization embryos by polymerase chain reaction (PCR) with the Qiagen One-Step enzyme mix, gel-purified with Qiagen Qiaquick columns and TA cloned into pCRII-TOPO vector (Invitrogen). Primers were designed from assembled genomic sequences (Fwd, 5′-cctcagccaagcttgcaacaaaac-3′; Rev, 5′-atgggaagcaactcacgactgtc-3′). The cloned complementary DNA (cDNA) was sequenced from plasmid by MWG (Gaithersburg, MD). DIG- and FITC-labeled riboprobes for zic6 and deltaA (Haddon et al. 1998) were in vitro synthesized and used for in situ hybridization essentially as described by Thisse and Thisse (1998) and Liang et al. (2000), using 1% Roche blocking reagent in PBS + 0.1% Tween-20 with Roche alkaline phosphatase-labeled antibodies and either Roche BM Purple or Fast Red substrates. In situ hybridization results were imaged with a ProgRes C14 camera mounted on a Leica MZ12 binocular microscope and post-processed with Adobe Photoshop CS.
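
As a simple check on the primer pair listed above, the short Python sketch below reports the length and GC fraction of each primer and the reverse complement of the reverse primer, i.e. its sequence read on the opposite strand. The helper names are hypothetical and the script is not part of the original protocol; only the two primer sequences are taken from the text.

```python
# Translation table for complementing unambiguous DNA bases (case preserved).
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Return the fraction of G and C bases in the sequence."""
    lowered = seq.lower()
    return (lowered.count("g") + lowered.count("c")) / len(lowered)


# Primer sequences as given in the text (5' to 3').
FWD = "cctcagccaagcttgcaacaaaac"
REV = "atgggaagcaactcacgactgtc"

if __name__ == "__main__":
    for label, primer in (("Fwd", FWD), ("Rev", REV)):
        print(label, len(primer), "nt", "GC fraction", round(gc_fraction(primer), 2))
    print("Rev, opposite strand:", reverse_complement(REV))
```

Melting temperature is deliberately omitted, because the appropriate formula depends on salt and primer concentrations that are not specified here.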

Results and discussion

Zebrafish zic6 is a novel member of the Zic gene family

A thorough analysis of the zebrafish expressed sequence tag (EST) and genomic DNA sequence databases predicted a novel member of the Zic gene family in addition to homologues of zic1–5 from frog, chick, and mammals. The predicted 525 amino acid sequence of the coding region from cDNA matched that from the genomic sequences. This novel locus was also cloned independently by Parinov et al. (2004) from a Tol2 transposon insertion screen for developmental enhancer traps and designated as zic6.

The Zic proteins have a characteristic five C2H2 zinc-finger domain that was highly conserved in the Zic6 protein as well (Fig. 1a). In addition, the Zic family proteins have two N-terminal conserved domains with suspected functional significance: (1) the ZOC domain, which is conserved between invertebrate Opa and vertebrate Zic1–3, and (2) the ZF–NC domain adjacent to the first zinc-finger (Aruga et al. 2006). There are also serine-rich and conserved N/SEWYV motifs in the C termini of the proteins, although their possible functions are not clear. The Zic6 protein had the highest, although moderate, overall identity with mouse Zic4 (Fig. 1b) and was similar to zebrafish Zic4 and Zic5 in having relaxed conservation of the ZOC and C-terminal domains relative to Zic1–3 (Table 1). Zic6 exhibited a striking lack of conservation in the ZF–NC domain, which has very high identity among the other Zic orthologues (Table 1). Cluster analysis of the ZF–NC/zinc-finger region from the mouse, frog (Xenopus laevis), and zebrafish Zic gene family members supported the status of zic6 as a novel orthologue rather than a paralogue of another family member, which frequently is the case in zebrafish, as with zic2a and zic2b (Fig. 1c). This was significant because evidence of a sixth orthologue has not been reported from any other vertebrate, even in a recent comprehensive phylogenetic analysis of the gene family by Aruga et al. (2006).

Fig. 1

Comparative analyses of predicted protein sequences from cDNA clones of the zebrafish Zic gene family members. a Amino acid alignments with conserved residues shaded. Previously defined conserved domains (Aruga et al. 2006) are indicated: ZOC, Zic-Opa conserved domain; ZF–NC, N-terminal to zinc-fingers conserved domain; ZF1–ZF5, five conserved zinc-finger domains in tandem. b Matrix of amino-acid identity among full-length Zic protein orthologues from zebrafish and mouse. The white boxes indicate cross-species pairs that share highest identity. c Phenogram for cluster analysis of zebrafish, Xenopus, and mouse Zic proteins constructed by neighbor-joining using pair-wise genetic distances for the ZF–NC and ZF1–ZF5 containing region. Amphioxus (Branchiostoma floridae) was used as an out-group to anchor the phenogram

The zic6 gene is specific to teleosts

A survey of available genome databases yielded open reading frames (ORFs) for predicted proteins with 78–93% identity to the zebrafish zic6 gene in other teleosts, including medaka (Oryzias latipes), stickleback (Gasterosteus aculeatus), fugu (Takifugu rubripes), and pufferfish (Tetraodon nigroviridis). In each of the fishes, the zic6 gene structure consisted of two exons with conserved exon–intron boundaries. This organization is shared with zic4 and zic5 and contrasts with the conserved three-exon architecture of zic1–3 (Aruga et al. 2006). No homologue of zic6 was found in frogs (X. laevis and X. tropicalis), birds (Gallus gallus), or mammals (Bos taurus, Canis familiaris, Homo sapiens, and Mus musculus). Unfortunately, the limited data available for the Chondrichthyan fishes and agnathans precluded analyses of gene complements in vertebrates basal to the Euteleostomi. However, only a single Zic gene homologue is known from the basal chordate, amphioxus (Branchiostoma floridae), indicating that the radiation of zic1–6 occurred within the vertebrate lineage. The next closest living chordate group, the tunicates, experienced a separate radiation of the Zic genes (Aruga et al. 2006).

The teleost zic3/zic6 locus is syntenic to the tetrapod zic3 locus

Maximum likelihood analysis of the ZF–NC/zinc-finger region from teleost Zic protein homologues suggested that the Zic genes belonged to two symmetric clades, with common ancestors shared by zic1–3 and zic4–6 (Fig. 2a). This evolutionary pattern is consistent with the occurrence of vertebrate Zic orthologues as bigene pairs. The genomic architecture of vertebrate Zic genes is highly conserved, having loci consisting of paired orthologues in the case of zic1/zic4 and zic2/zic5, with the exception being zic3. In all taxa, the bigene loci include one member from each clade, which we term the (+)-strand and (−)-strand orthologues, corresponding respectively to the three-exon (AB intron) and two-exon (A intron) groups of Aruga et al. (2006). An analysis of the genomic context of the zic6 gene demonstrated that it is paired with zic3 in divergent orientation, similar to the zic1/zic4 and zic2a/zic5 gene pairs; however, the zic3/zic6 gene pair is only represented in teleosts (Fig. 2b). A comparison of the fish zic3/zic6 bigene with the zic3 locus in other vertebrates revealed synteny between flanking loci, although with an inversion in the orientation of zic3 (Fig. 2b). The evolutionary position of zic6 within the (−)-strand clade together with its genomic relationship to zic3 and syntenic position among taxa suggest that it is an ancestral Euteleostomian gene that was lost in the Tetrapoda. These results support the duplication–loss model proposed by Aruga et al. (2006) to explain the unitary zic3 locus, rather than an equally likely single-gene duplication event, as appears to be the case for zebrafish zic2b. In the absence of data on basal vertebrates such as sharks, hagfish, and lampreys, it is not possible to confirm if the duplication generating the zic3/zic6 bigene occurred in a common ancestor to the Vertebrata, as suggested by Aruga et al. (2006), or at a subsequent stage basal to either the Gnathostomata or Euteleostomi.
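
The notion of a bigene in divergent orientation can be stated operationally. The Python sketch below is one possible way to test whether two annotated genes form such a pair: they lie on the same chromosome, on opposite strands, do not overlap, and are arranged head-to-head so that their 5′ ends face a shared intergenic region. The Gene record, the function name, the max_gap threshold, and the coordinates in the example are hypothetical and are not taken from any genome assembly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int   # lower genomic coordinate
    end: int     # higher genomic coordinate
    strand: str  # "+" or "-"


def is_divergent_pair(gene_a, gene_b, max_gap=50_000):
    """Return True if the two genes are arranged head-to-head (divergently).

    Assumption: max_gap is an arbitrary illustrative limit on the size of the
    shared intergenic region, not a value derived from the study.
    """
    if gene_a.chrom != gene_b.chrom or gene_a.strand == gene_b.strand:
        return False
    left, right = sorted((gene_a, gene_b), key=lambda gene: gene.start)
    if left.end >= right.start:
        return False  # overlapping genes are not treated as a bigene here
    # Head-to-head: the left gene is transcribed leftward ("-") and the right
    # gene rightward ("+"), so both 5' ends border the intergenic region.
    head_to_head = left.strand == "-" and right.strand == "+"
    return head_to_head and (right.start - left.end) <= max_gap


if __name__ == "__main__":
    # Hypothetical coordinates for illustration only.
    zic6 = Gene("zic6", "chr14", 1_000, 5_000, "-")
    zic3 = Gene("zic3", "chr14", 8_000, 14_000, "+")
    print(is_divergent_pair(zic3, zic6))
```

A convergent (tail-to-tail) arrangement, with the left gene on the "+" strand and the right gene on the "-" strand, returns False under this definition.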

Fig. 2

Evolutionary and genomic analyses of the zic6 locus. a Maximum-likelihood phylogeny of the teleost Zic protein orthologues using the ZF–NC and ZF1–ZF5 containing region, with amphioxus as the outgroup. Numbers indicate reliability of critical branches among 100 bootstrap replicates (%); parenthetical values are associated with branch lengths not significantly different from zero (p ≥ 0.05). Brafl, Branchiostoma floridae (amphioxus); Danre, Danio rerio (zebrafish); Gasac, Gasterosteus aculeatus (stickleback); Oryla, Oryzias latipes (medaka); Takru, Takifugu rubripes (fugu); Tetni, Tetraodon nigroviridis (pufferfish). b Genomic analysis of synteny of genes flanking zic3 and zic6 in fish (same species as above), frogs (Xenopus tropicalis), birds (chick), and mammals (human, mouse)

The early expression of zic6 is limited to the intermediate neurogenic domain

An analysis of zebrafish zic6 messenger RNA distribution by in situ hybridization first detected weak expression at mid-gastrulation (70–80% epiboly) in the ectoderm that developed into a pair of strongly expressing patches by the end of gastrulation (Fig. 3a). The paired patches became elongated within the medial neural plate during early somitogenesis (Fig. 3b). Double in situ hybridization with the neurogenic domain marker deltaA (Fig. 3c,d) unequivocally demonstrated that the early expression of zic6 was limited to the intermediate neurogenic domain of the prospective hindbrain (Fig. 3e,f).

Fig. 3

Early embryonic expression of zic6 in the zebrafish. Embryos are oriented with anterior to the right and dorsal up, with hindbrain (hb) and spinal cord (sc) divisions visible. a, b zic6 mRNA appears during late gastrulation (a, 90% epiboly) in two spots in the anterior dorsal ectoderm that are maintained during early somitogenesis (b, three somites) as elongated patches in the anterior-medial neural plate. c, d The expression of the Notch ligand deltaA marks the diencephalic (d), trigeminal (t), and lateral (l), intermediate (i) and medial (m) spinal neurogenic domains during late gastrulation (c, tailbud) and early somitogenesis (d, three somites). e, f Two-color in situ hybridization with deltaA (red) demonstrates that the expression of zic6 (blue) is specific to the intermediate neurogenic domain in the prospective hindbrain at tailbud and three-somite stages

The expression patterns of zebrafish Zic genes in general exhibit considerable conservation, especially between bigene pairs that may share common cis-regulatory elements. The expressions of the zebrafish zic1/zic4 and zic2a/zic5 bigenes are correlated during early development, such that the expression of zic2a and zic5 is similar (Toyama et al. 2004), while that of zic1 and zic4 is spatially identical (Fig. 4a–d). Similarly, the unpaired paralogue zic2b has expanded expression but overlaps completely with zic2a (Fig. 4g). In comparison, the expression of zic6 is more restricted than the other orthologues and is distinct from that of zic3, although the two bigene partners overlap where zic6 is expressed.

Fig. 4

Comparison of the expression of the zebrafish Zic gene orthologues in three-somite stage embryos. The Zic genes are organized as antipodal bigenes in the genome: a, b zic1/zic4 on chr. 24; c, d zic2a/zic5 on chr. 9; e, f zic3/zic6 on chr. 14. g The exception is the paralogue zic2b on chr. 1, which is the result of a single-gene duplication within the zebrafish lineage. h The relative expression of the Zic genes in the neural plate was assessed by two-color in situ hybridization using deltaA as a reference transcript (shown: zic2b in red with deltaA in purple). The red lines indicate the lateral edges of the neural plate, the red arrows indicate ectodermal expression in the neural plate, and the black arrows indicate mesodermal expression

A range of critical developmental processes occur during the open neural plate stage after gastrulation but before formation of the neural tube, including progenitor cell maintenance, neurogenesis, neural crest differentiation, and somitogenesis. In zebrafish, the common vertebrate Zic genes are all expressed to varying degrees in the lateral plate, forebrain/midbrain of the neural plate, and to a lesser extent in the hindbrain (Fig. 4a–e,g). The expression of zebrafish Zic family members in these domains is consistent with the patterns of their homologues in other vertebrates. The overlap between zic1–5 in the neural plate border region underscores their critical function in the differentiation of neural crest (reviewed by Aruga 2004; Fujimi et al. 2006). In comparison, zic6 is absent from the lateral and forebrain/midbrain regions and has very restricted hindbrain expression (Fig. 4f). The zic3 gene is also expressed broadly in the hindbrain; therefore, the restricted expression of zic6 could involve elements shared with zic3 in the bigene promoter region. In general, however, the expression of zic6 is highly derived and may reflect reduced- or neo-functionalization of the locus. After 1 day of development, zic6 is nevertheless expressed in the dorsal neural tube in a pattern similar to the other orthologues (Parinov et al. 2004).

Given the similarities among the various Zic gene family members, both among orthologues and across species, the highly restricted and very specific expression of zic6 in the intermediate neurogenic domain of the zebrafish embryonic hindbrain is exceptional. This suggests that the expression of zic6 may be a derived feature within the teleost fish lineage, possibly associated with a teleost-specific developmental feature. The fact that the other bigene pairs (zic1/zic4 and zic2a/zic5) tend to exhibit similar expression patterns leads us to predict that the ancestral zic3/zic6 bigene may also have shared regulatory elements. In that case, the loss of zic6 in the tetrapods may not have been of great consequence if there was substantial redundancy of zic3. Alternatively, the novel expression pattern of zic6 may have arisen in a common ancestor of the Euteleostomi and reflects a regulatory program that was subsequently lost in the Tetrapoda. However, the development of the intermediate neurogenic domain is very similar between zebrafish, which have zic6, and Xenopus, which lack zic6. This indicates that the function of zic6 in that domain is not critical to its specification or development in vertebrates in general.

The synteny between the tetrapod Zic3 locus and the teleost zic3/zic6 locus together with the novel expression pattern of zic6 in the zebrafish provides valuable insights into the evolutionary history of the Zic gene family (Fig. 5). The zic3/zic6 bigene common to teleost genomes is one of three extant loci that resulted from repeated rounds of duplication early in the evolution of the vertebrates (Aruga et al. 2006; Meyer and Schartl 1999). The teleost zic6 gene is a result of one such duplication that occurred in a common ancestor to the teleost fishes and the terrestrial tetrapods. A subsequent inversion/deletion event in the precursor to the mammalian X-chromosome resulted in a reversal of orientation of the zic3 gene and a loss of the zic6 gene. It is likely that the original function of zic6 was redundant with other family members with overlapping ancestral expression and therefore not critical. This conclusion would also be consistent with extensive changes to the cis-regulatory program and amino acid sequence of zic6 in comparison to other family members. However, the conserved zinc-finger DNA-binding domain provides evidence that zic6 acts upon promoter elements common to the other Zic family members. Thus, zic6 can be expected to contribute to a combinatorial code of Zic gene activity determining the transcriptional repression and/or activation of target genes during teleost development, which is not strictly analogous to that of the tetrapods. This is an important caveat when using zebrafish as a model system for human functional genomics, but it also provides a window on the ancestral functions of the Zic genes that can yield insights not available with tetrapod model organisms.

Fig. 5

The evolutionary sequence of chromosomal duplications leading to the expansion of the Zic gene family in the Euteleostomi. The chromosomal duplications involving the Zic bigenes could have occurred during whole genome duplications before the divergence of the Chondrichthyes (green arrows) or in a common ancestor of the Euteleostomi (blue arrows). The red arrow indicates possible points at which the loss of the zic6 gene may have occurred relative to extant Sarcopterygian taxa. The taxa for which data on Zic gene complements were lacking are indicated in gray text