Introduction

Fragile X mental retardation is the most frequent form of syndromic X-linked mental retardation with an estimated prevalence of one in 4,000 males and one in 8,000 females. The syndrome is caused by expansion of a CGG trinucleotide repeat in the 5′ untranslated region (5′UTR) of the FMR1 gene, from under 50 CGG repeats to the “full mutation” of over 200 CGG repeats. The full mutation results in reduction or complete loss of gene transcription and, therefore, loss of the activity of the protein it encodes, FMRP (Brown 2002; Verkerk et al. 1991). The known range of FMR1 CGG repeat expansion-related phenotypes has recently broadened, since moderate expansions to a “premutation” range of 50–200 repeats have been found to contribute to premature ovarian failure among female carriers (Hundscheid et al. 2000; Machado-Ferreira et al. 2004), a tremor/ataxia syndrome among aged male carriers, and other neurological phenotypes (Hagerman et al. 2001). The neurological premutation phenotypes appear to depend upon the expansion of the mRNA rather than changes in protein structure (Jin et al. 2003).

In humans, a family of FMR1-related genes exists, consisting of FMR1 and its autosomal paralogs, FXR1 and FXR2, which code for FXR1P and FXR2P, respectively (Coy et al. 1995; Siomi et al. 1995; Zhang et al. 1995). The protein products of the FMR1-related gene family share a high level of amino acid identity and contain two ribonucleoprotein K homology domains (KH domains) and a cluster of arginine and glycine residues (RGG box), motifs characteristic of RNA-binding proteins (Ashley et al. 1993; Siomi et al. 1993, 1995; Zhang et al. 1995). FMRP has been found to bind around 4% of human fetal brain mRNA in vitro, including its own transcript and those of FXR1 and FXR2 (Ashley et al. 1993; Siomi et al. 1993; Sung et al. 2000). In addition, members of the FMR1-related gene family also possess a nuclear localisation signal (NLS) and a nuclear export signal (NES), implying that they have a role in nucleocytoplasmic shuttling (Eberhart et al. 1996). FMRP and the other members of the family have also been shown to be associated with messenger ribonucleoparticles (mRNPs) within actively translating ribosomes, suggesting that they have functions in translation, RNA transport and/or mRNA stability (Khandjian et al. 1996; Li et al. 2001; Schaeffer et al. 2001; Zhang et al. 1995).

Consistent with the primary features of the fragile X syndrome phenotype, FMR1 mRNA and FMRP are normally widely expressed in testis and in fetal and adult brain (Abitbol et al. 1993; Agulhon et al. 1999; Devys et al. 1993). The majority of FMRP, FXR1P and FXR2P protein is localized to the cytoplasm in neural cells (Abitbol et al. 1993; Devys et al. 1993). FXR1P and FXR2P are expressed in a slightly different spatio-temporal pattern and a small amount of FXR1P is found in the nucleolus of particular differentiated neurons. While FMRP, FXR1P and FXR2P are all expressed in early spermatogonia, FXR1P and FXR2P, but not FMRP, are also expressed in more mature spermatogonia (Bontekoe et al. 2002; Khandjian et al. 1998; Sittler et al. 1996; Tamanini et al. 1997).

Murine homologs of the FMR1-related gene family have been identified. Murine Fmr1, Fxr1h and Fxr2h are orthologous to human FMR1, FXR1 and FXR2 respectively. The expression pattern of Fmr1 has been shown to be similar to the human ortholog in both tissue specificity and time of expression. In the mouse, FMRP expression starts during embryonic development, where it is expressed ubiquitously. During late embryonic stages FMRP shows a specific pattern of expression, and is mainly found in tissues of ectodermal origin (de Diego Otero et al. 2000). This pattern of protein expression is consistent with that of human FMRP during early stages of development (Hergersberg et al. 1995; Hinds et al. 1993). Human FMRP is found predominantly in the cytoplasm with occasional nuclear staining (Devys et al. 1993; Verheij et al. 1993).

Drosophila contains only a single gene of the FMR1-related gene family, dfxr, which has higher overall similarity to human FXR2 than to FMR1 or FXR1. dFXR protein distribution recapitulates that of the human FMR1-related gene family, with highest expression of the protein in muscles, central nervous system and gonads. The dfxr gene is also expressed in the mushroom bodies of the larval Drosophila brain that mediate learning and memory (Morales et al. 2002; Schenck et al. 2002; Wan et al. 2000).

Orthologs of all the members of the human FMR1-related gene family have previously been identified in mice, birds (chicken) and amphibians (Xenopus). Zebrafish genes similar to those of the FMR1-related gene family have been cloned and reported (Wan et al. 2000). However, a phylogeny relating zebrafish FMR1-related genes to their human orthologs has not yet been established nor have the expression patterns of the zebrafish orthologs been investigated. Zebrafish are vertebrates and their relatively simple early nervous system with its identifiable, stereotyped neurons makes this organism an outstanding model system in which to establish study of FMR1-related gene function and, in particular, the consequences of loss of protein expression from these genes. Here we describe the sequences and embryonic expression patterns of the FMR1-related genes in zebrafish and their evolutionary relationship to the human FMR1-related genes. Expression of these genes in various adult zebrafish tissues is also analysed using Real-Time PCR.

Materials and methods

Complete coding sequences for the zebrafish FMR1-related genes were assembled from data in the GenBank database (http://www.ncbi.nlm.nih.gov/Genbank/index.html). Phylogenetic analyses were conducted using existing sequence data for human (FMR1, FXR1 and FXR2) and Drosophila (dfxr) FMR1-related genes available in GenBank. Zebrafish fmr1 and fxr2 have been renamed from zfmr1 and Similar to fragile X mental retardation homolog 2 respectively (Wan et al. 2000). The coding sequence data for zebrafish fxr1 has been assembled from a cDNA (accession number AF169146.1) and an overlapping EST (AI722010) identified by BLASTN searches of GenBank (using NCBI BLAST with default search parameters; http://www.ncbi.nlm.nih.gov/BLAST/). Complete accession numbers and coding regions used in evolutionary analyses are reported in Table 1. The zebrafish genome was scanned for additional genes with homology to FMR1, FXR1 or FXR2 by searching the Zv2 assembly of shotgun sequence data at the Sanger Institute (http://www.ensembl.org/Danio_rerio/) with TBLASTN.

Table 1 fmr1 and FMR1-related gene sequences used in the phylogenetic analysis shown in Fig. 1. The GenBank accession number of each sequence and the nucleotide regions used for the analysis are indicated (full coding sequences)

Phylogenetic analysis was performed using the BioManager facility provided by the Australian National Genomic Information Service (ANGIS, http://www.angis.org.au/). The protein multiple sequence alignment shown in Fig. 1a was constructed using ClustalW (accurate; Thompson et al. 1994) with default parameters, and was prepared for publication using the GeneDoc program (Nicholas et al. 1997). Sequences were added to the analysis in the descending order shown in Fig. 1a. The phylogenetic tree was constructed using Protdist (Felsenstein 1989). The Protdist distance matrix was generated under the Dayhoff PAM matrix method of amino acid substitution, and a phylogenetic tree subsequently constructed using the Neighbor Joining Method (Felsenstein 1989; Fig. 1b). Bootstrap analysis was conducted using Seqboot (Felsenstein 1989) to produce 1,000 resampled datasets that were then analysed with Protdist. Bootstrap values were finally generated using the Consense program (Felsenstein 1989), and are shown in Fig. 1b.
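The resampling step of the bootstrap analysis can be illustrated with a minimal Python sketch. It is not the Seqboot implementation; it only shows the underlying idea of drawing alignment columns with replacement to build each resampled dataset. The function name, the fixed random seed and the toy alignment are assumptions made for illustration and do not come from the analysis described here.

```python
import random

def bootstrap_columns(alignment, n_replicates, seed=0):
    """Resample alignment columns with replacement, one dict per replicate."""
    names = list(alignment)
    lengths = {len(alignment[name]) for name in names}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    n_cols = lengths.pop()
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicates = []
    for _ in range(n_replicates):
        # Draw n_cols column indices with replacement.
        picks = [rng.randrange(n_cols) for _ in range(n_cols)]
        replicates.append(
            {name: "".join(alignment[name][i] for i in picks) for name in names}
        )
    return replicates

# Hypothetical toy alignment; these are NOT the sequences from Table 1.
toy_alignment = {
    "seq_A": "MKTLVEAR",
    "seq_B": "MKSLVEAR",
    "seq_C": "MRT-VDAR",
}
replicates = bootstrap_columns(toy_alignment, n_replicates=1000)
```

Each replicate keeps whole columns intact, so residues that are aligned in the original remain aligned in every resampled dataset. A distance matrix and tree would then be computed for each replicate and the trees summarised by a consensus step.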

Fig. 1

a ClustalW alignment of translated FMR1-family gene sequences. The degree of conservation at each amino acid residue position is indicated by shading (100% black, 80% dark grey, 60% light grey, 0% white). Motifs common to all members of the family in all analysed species are underlined and labelled (NLS nuclear localisation sequence, NES nuclear export sequence, KH 1 and 2 K homology domains 1 and 2, RGG box). The number of the right-hand most amino acid residue in each alignment line is indicated to the right. b Phylogenetic tree of protein sequences from FMR1-related genes constructed using the Neighbor Joining Method. Nodal bootstrap values are shown as a percentage. Sequences are labelled with gene names

For whole-mount in situ transcript hybridisation, zebrafish embryos were raised at 28.5°C and staged as previously described (Kimmel et al. 1995). In situ transcript hybridisation was performed as described by Jowett (1997) using single-stranded RNA probes labelled with digoxigenin-UTP (Roche, Basel, Switzerland). Probes against fmr1, fxr1 and fxr2 transcripts were synthesised from cDNA clones in the Bluescript SK vector (Stratagene) using T7 or T3 RNA polymerase as appropriate. The overall identity of each of the fmr1, fxr1 and fxr2 transcript probes to the other two sequences was, in all cases, not greater than 67%. For the fxr1 in situs, cDNA corresponding to the long splice variant (see Table 1) was used to generate the probe (Khandjian et al. 1998; Kirkpatrick et al. 1999).

Expression of FMR1-related gene-specific messages in adult zebrafish was assessed by Real-Time PCR. Zebrafish tissues were rapidly dissected, ground to powder under liquid N2 using a mortar and pestle, and homogenised using QIAshredder (QIAGEN). The Oligotex poly(A)+ mRNA isolation system (QIAGEN) was used to isolate poly-A mRNA. Reverse transcription of mRNA was performed using the Omniscript first strand cDNA synthesis system (QIAGEN). Real-Time quantitative PCR analysis was performed using an ABI 7000 Sequence Detection System (Applied Biosystems, Foster City, Calif., USA). Brain, heart, skeletal muscle and liver tissue expression of FMR1-related genes in the zebrafish was quantified and normalised against control reactions for zebrafish elongation factor 1-alpha (ef-1α) RNA. Each tissue analysis was performed in triplicate and included primer pairs for simultaneous RT-PCR against both ef-1α and FMR1-related gene transcripts. cDNA was synthesised from 2 μg total RNA from each tissue sample according to the manufacturer's instructions. Real-Time PCR reactions were run in 96-well format with 25 μl reaction mixture in each well containing: SYBRmix (Molecular Probes) (12.5 μl), cDNA from a reverse transcriptase reaction (1 μl), and 0.05 μM each of specific 5′ and 3′ primers for each gene (shown below). Calculation of relative mRNA levels was performed using methods outlined in the Sequence Detection Systems Chemistry Guide (Applied Biosystems).
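As an illustration of how normalisation against ef-1α might be computed, the following Python sketch applies a comparative Ct calculation and then rescales each tissue to the highest-expressing tissue, as in the presentation described in the legend to Fig. 3. Which calculation from the Chemistry Guide was actually used is not stated, so the comparative Ct approach, the assumption of approximately 100% amplification efficiency for both primer pairs, the function names and all Ct values below are assumptions made purely for illustration.

```python
from statistics import mean

def relative_expression(ct_target, ct_reference):
    """Target level relative to the reference gene, as 2 ** -(delta Ct).

    Assumes both primer pairs amplify with ~100% efficiency (an assumption).
    """
    delta_ct = mean(ct_target) - mean(ct_reference)
    return 2.0 ** (-delta_ct)

def percent_of_highest(levels):
    """Rescale per-tissue levels so the highest tissue equals 100%."""
    top = max(levels.values())
    return {tissue: 100.0 * value / top for tissue, value in levels.items()}

# Hypothetical triplicate Ct values; NOT measured data from this study.
ct_values = {
    "brain": {"fmr1": [24.0, 24.5, 23.5], "ef1a": [18.0, 18.5, 17.5]},
    "heart": {"fmr1": [27.0, 27.5, 26.5], "ef1a": [18.0, 18.5, 17.5]},
}
levels = {
    tissue: relative_expression(ct["fmr1"], ct["ef1a"])
    for tissue, ct in ct_values.items()
}
scaled = percent_of_highest(levels)
```

A difference of three cycles in delta Ct between two tissues corresponds to an eightfold difference in relative transcript level under the equal-efficiency assumption.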

Primer pairs used for Real-Time PCR:

ef-1α
Forward: 5′-CCAACTTCAACGCTCAGGTCA-3′
Reverse: 5′-CAAACTTGCAGGCGATGTGA-3′

fmr1
Forward: 5′-ATGATTCCCGCTCCCGTAGTT-3′
Reverse: 5′-GGTGTGGATGCTCCTCTCGTTA-3′

fxr1 (short splice variant specific)
Forward: 5′-CCTCGTTACTGTGGCCGATTA-3′
Reverse: 5′-CGTTGCTCACAGATTCAGCAG-3′

fxr1 (long splice variant specific)
Forward: 5′-TCGATGGAGCTGAAGCCAAA-3′
Reverse: 5′-AGCTCGCGATATGTAATCGGC-3′

fxr2
Forward: 5′-AAGCGAAAATGGACTGGAAGAG-3′
Reverse: 5′-AACAGTAACTGGCTGTCGGTCA-3′
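A simple consistency check on the primer sequences listed above can be sketched in Python. The sequences are copied from the list; the helper function and the decision to report only length and GC fraction are illustrative assumptions and were not part of the primer design procedure described here.

```python
# Primer sequences as listed above (5' to 3').
PRIMERS = {
    "ef-1a": ("CCAACTTCAACGCTCAGGTCA", "CAAACTTGCAGGCGATGTGA"),
    "fmr1": ("ATGATTCCCGCTCCCGTAGTT", "GGTGTGGATGCTCCTCTCGTTA"),
    "fxr1_short": ("CCTCGTTACTGTGGCCGATTA", "CGTTGCTCACAGATTCAGCAG"),
    "fxr1_long": ("TCGATGGAGCTGAAGCCAAA", "AGCTCGCGATATGTAATCGGC"),
    "fxr2": ("AAGCGAAAATGGACTGGAAGAG", "AACAGTAACTGGCTGTCGGTCA"),
}

def gc_fraction(seq):
    """Fraction of G and C bases; rejects anything other than A, C, G, T."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError(f"not a plain DNA sequence: {seq!r}")
    return (seq.count("G") + seq.count("C")) / len(seq)

# Tabulate length and GC fraction for each forward/reverse primer.
summary = {
    gene: [(len(p), round(gc_fraction(p), 2)) for p in pair]
    for gene, pair in PRIMERS.items()
}
for gene, stats in summary.items():
    print(gene, stats)
```

Running the check confirms that every listed primer is a plain A/C/G/T sequence of 20 to 22 nucleotides, which guards against transcription errors when the sequences are copied into an ordering form.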

Results and discussion

Initial investigations of GenBank data identified sequences coding for complete open reading frames for a number of zebrafish FMR1-related genes. Complete coding sequences with a high level of similarity to human FMR1 and FXR2 have previously been identified (Wan et al. 2000). The complete coding sequence of a gene with similarity to human FXR1 was assembled using a cDNA sequence from GenBank and an overlapping EST, identified by a BLASTN search of Danio rerio EST sequences. Accession numbers of the identified zebrafish FMR1-related gene family sequences are reported in Table 1. Note that we failed to identify any CGG repeats in available 5′UTR sequences from that zebrafish gene shown (below) to be orthologous to human FMR1.

Human FMR1-related genes and identified similar zebrafish genes were used to search the Sanger Institute zebrafish whole genome shotgun assembly Zv2 to identify any additional related genes or pseudogenes that may have arisen by gene duplication. A number of contigs were identified, containing short regions of high similarity to functional domains found in the FMR1-related gene family. However, these regions of similarity are short and are not located syntenically, suggesting that they do not represent additional members of the zebrafish FMR1-related gene family. Instead, these regions of similarity appear to be indicative of genes distantly related to FMR1 that share one of the well-conserved functional domains coded by this gene (described below and in Fig. 1a).

To establish the orthology relationships of the zebrafish and human FMR1-related genes we performed a phylogenetic analysis comparing peptide sequences known or predicted from cDNA sequences from human, zebrafish and Drosophila FMR1-related genes indicated in Table 1. With the exception of the RGG-box, these sequences align very closely in evolutionarily conserved domains. Overall human FMRP and zebrafish Fmr1 protein identity is 74%, while FXR1P and Fxr1 protein are 79% identical and FXR2P and Fxr2 protein are 68% identical (sequences described in Fig. 1a). The RGG-box domain is present in all FMR1-related genes. However, the extent of this domain varies somewhat between genes. The ClustalW alignment shown in Fig. 1a was used to produce the unrooted phylogenetic tree shown in Fig. 1b. Strong evidence was found to support the existence of a zebrafish FMR1-related gene family consisting of fmr1, fxr1 and fxr2 genes orthologous to those found in humans.
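Percentage identity figures of this kind can be derived from a pair of aligned sequences, as in the following minimal Python sketch. Several conventions exist for the denominator, and the one used for the values quoted above is not stated; the sketch assumes that columns gapped in both sequences are ignored and that all remaining columns are counted. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared alignment columns holding identical residues.

    Columns gapped in both sequences are skipped; a column with a gap in
    only one sequence counts as a mismatch (one convention among several).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared

# Hypothetical aligned fragments; NOT taken from Fig. 1a.
identity = percent_identity("MKTL-VEAR", "MKSLKVEAR")
```

Choosing a different denominator, for example the length of the shorter ungapped sequence, would shift the reported percentage slightly, so the convention should be stated alongside any identity value.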

Studies in human, mouse and Drosophila have described, in part, spatio-temporal expression of the FMR1-related genes during development and in adults. Here we describe FMR1-related gene expression in zebrafish embryos and adults. In situ transcript hybridisation of whole zebrafish embryos was conducted at a variety of developmental stages between 0 hours post fertilisation (hpf) and 24 hpf (Fig. 2a–o). Real-Time PCR was used to compare levels of FMR1-related gene expression between various adult tissues.

Fig. 2

Whole-mount in situ hybridisation for fmr1, fxr1 and fxr2 gene transcription in zebrafish embryos from 0 to 24 hpf. a–c Lateral view, animal pole to the top. d–f, g1, h1, i1, j–o Lateral view; anterior is to the top and dorsal is to the right. g2, h2, i2 Axial view; anterior to top. n1–3, o1–3 Lateral view; anterior to left, dorsal to top. a, d, g1–2, j, m fmr1 expression. b, e, h1–2, k, n, n1–3 fxr1 expression. c, f, i1–2, l, o, o1–3 fxr2 expression. Stages: a–c 1 cell stage, 0 hpf; d–f shield stage, 6 hpf; g1–2, h1–2, i1–2 six somite stage, 12 hpf; j–l 18 somite stage, 18 hpf; m, n, n1–3, o, o1–3 prim-6 stage, 24 hpf, yolk removed. The expression of the three genes appears to be ubiquitous at every stage tested but with region- and tissue-specific variation in expression levels. a–f fmr1, fxr1 and fxr2 transcript is present at 0 hpf (presumably maternal transcript) and 6 hpf. g1, g2, h1, h2, i1, i2 These genes are expressed throughout the developing brain at 12 hpf. fxr1 expression is detected at low levels in adaxial cells at 12 hpf, but is expressed at particularly high levels in the anterior of the embryo (h2, h3). In contrast, fxr2 expression appears strong in adaxial cells by 12 hpf. In embryos subjected to extended staining, localized expression of fxr2 was also observed in early developing somites (bracket). Expression of fmr1 is higher in the anterior of the embryo. j–l Expression at 18 hpf. fmr1 expression is generally at a low level at this stage, while adaxial and somitic expression of fxr1 and, particularly, fxr2 intensifies. m–o Expression at 24 hpf. Expression of fmr1 is higher in the brain, while fxr1 and fxr2 share a complex pattern of higher expression in somites. In middle somites, expression of fxr1 and fxr2 is concentrated centrally (n2, o2). Expression of fxr1 and fxr2 appears less restricted in anterior (older) or posterior (younger) somites (n1–3, o1–3). Scale bars: a–o 250 μm; g2, h2, i2 60 μm; n1–3, o1–3 100 μm

Low level and variable ubiquitous expression at all stages of development is a feature of the human and murine FMR1-related gene families (Abitbol et al. 1993; Agulhon et al. 1999; Bakker et al. 2000; de Diego Otero et al. 2000). Ubiquitous expression is also observed for this gene family throughout the first 24 h of zebrafish embryo development. The early embryonic expression of the zebrafish FMR1-related genes is similar until 10 hpf (Fig. 2a–f). From this stage onward, fmr1, fxr1 and fxr2 display similar spatio-temporal patterns of expression, with individual variations. fmr1, fxr1 and fxr2 expression is found in the developing brain in embryos from 10 hpf onward (Fig. 2g–o) and in adults (Fig. 3).

Fig. 3

Quantitative analysis of fmr1, fxr2 and fxr1 long/short splice form transcripts in various adult zebrafish tissues (Khandjian et al. 1998; Kirkpatrick et al. 1999). A graphical representation of Real-Time quantitative RT-PCR results shows fmr1 and fxr1 long and short splice forms and fxr2 transcript levels in zebrafish brain, skeletal muscle, liver tissue and heart tissue. For each gene/splice form the results from the tissue showing highest levels of expression are used as a reference against which to compare expression levels in other tissues (shown as a percentage)

At 12 hpf, fmr1 expression, while ubiquitous, is distinctly higher in the anterior of the embryo (Fig. 2g1, g2). fxr1 (Fig. 2h1, h2) and fxr2 (Fig. 2i1, i2) share this pattern of expression. The higher expression of fxr1 in this area of, primarily, brain development is particularly pronounced (Fig. 2h2). fxr1 and fxr2 also show higher relative expression in adaxial and somitic cells. fxr1 is expressed ubiquitously throughout the tailbud, whilst fxr2 apparently shows discrete, higher level expression surrounding the node (Fig. 2h2, i2). Extended staining reveals discrete expression of fxr2 in the region of the developing somites (Fig. 2i2).

The raised expression of fxr1 and fxr2 in adaxial and somitic cells continues in 18-hpf embryos (Fig. 2j–l). Higher levels of fmr1 expression are observed in the brain relative to other tissues from 18 hpf (Fig. 2j). fxr1 expression is found in the brain and at a relatively high level marking somites and adaxial cells (Fig. 2k). fxr2 expression is found in the brain and at a relatively very high level throughout somites and adaxial cells. The somitic expression of fxr2 shows some spatial variation in mRNA concentration (Fig. 2l). In 11- to 14-day-old mouse embryos FXR1P is highly expressed in muscle, as zebrafish fxr1 appears to be. However, unlike in mouse, zebrafish fxr2 is also expressed at relatively high levels in these same tissues (de Diego Otero et al. 2000).

fmr1 expression remains higher in the brain at 24 hpf (Fig. 2m). Expression of fxr1 and fxr2 is still found throughout the brain at 24 hpf (Fig. 2n, o) but, by this stage, has refined to different and identifiable patterns within somites of differing ages. One interesting development in the expression patterns of fxr1 and fxr2 is that, despite a large difference in their expression in somites earlier in development, by 24 hpf somitic expression of the two genes appears broadly similar in distribution and level. The somitic expression of fxr1 and fxr2 at 24 hpf appears quite complex. Each somite develops in a stereotypic manner in an anterior to posterior progression. In somites midway along this progression (“middle somites”) expression of fxr1 and fxr2 is most concentrated in the areas midway between somite borders (Fig. 2n2, o2). In anterior (older) or posterior (younger) somites, fxr1 and fxr2 expression is not as clearly defined but appears consistent with the expression in the middle somites (Fig. 2n1–3, o1–3). The somitic expression of fxr1 and fxr2 in presumptive muscle tissue is consistent with the expression of their mammalian orthologs in fetal and adult skeletal muscle (Coy et al. 1995; Khandjian et al. 1998; Kirkpatrick et al. 1999). Furthermore, ubiquitous embryonic expression of the zebrafish FMR1-related genes, particularly in the brain, is consistent with human and mouse embryonic and fetal FMR1-related gene expression (Abitbol et al. 1993; Agulhon et al. 1999; Bakker et al. 2000; Coy et al. 1995; de Diego Otero et al. 2000; Khandjian et al. 1998).

In adult mice, alternative mRNA splice forms and their levels of expression in various tissues have been described for the three FMR1-related genes (Fmr1, Fxr1h and Fxr2h). Mouse FMR1-related gene transcripts are found ubiquitously at low levels, with higher expression in specific tissues (Bakker et al. 2000; Coy et al. 1995; de Diego Otero et al. 2000; Khandjian et al. 1998). Notably, different Fxr1h mRNA splice forms and their protein products have been identified as being preferentially expressed in certain tissues (Khandjian et al. 1998; Kirkpatrick et al. 1999; Tamanini et al. 2000). In particular, the protein products of the long Fxr1h mRNA splice forms are relatively highly expressed in heart and skeletal muscle with much lower expression in brain, testes and a variety of other tissues. In contrast, the protein products of the short Fxr1h mRNAs are expressed with generally less extreme variation between tissues (Kirkpatrick et al. 1999; Tamanini et al. 2000). To observe whether patterns of FMR1-related gene (and transcript splice form) expression are conserved in adult zebrafish, we used Real-Time PCR to assess transcript levels in selected zebrafish tissues (brain, skeletal muscle, heart and liver). Primer pairs specific for the zebrafish gene transcripts were designed across exon boundaries flanking exon 15 in fmr1, and equivalent regions of fxr1 and fxr2. To examine the relative expression of the fxr1 long and short splice variants in zebrafish we designed primer pairs specific for the zebrafish equivalent of mouse Fxr1h exon 15. The presence of this exon identifies long fxr1 splice forms. Note that we identified long and short forms of zebrafish fxr1 mRNA from the GenBank EST database (see Table 1) but that we have not established how many variants of these forms exist in zebrafish.

Consistent with mice, zebrafish fmr1 is expressed at highest levels in the brain, with relatively low but significant levels in all other tissues tested (Bakker et al. 2000; de Diego Otero et al. 2000). fxr2 was found to be expressed at similar levels in brain, skeletal muscle and liver but at relatively lower levels in heart muscle. In fetal and adult mouse, Fxr2h is expressed at relatively high levels in the brain, but at lower levels in liver, heart and skeletal muscle (Bakker et al. 2000; de Diego Otero et al. 2000). Significantly, the tissue-specific expression levels of the long splice form(s) of zebrafish fxr1 mRNA are broadly consistent with observations in mice. In zebrafish, this splice form is found predominantly in skeletal muscle, with considerably lower levels in liver and relatively very low levels in brain and heart (Bakker et al. 2000; Coy et al. 1995; de Diego Otero et al. 2000; Khandjian et al. 1998; Kirkpatrick et al. 1999). However, the relatively very low level of the fxr1 long transcript in zebrafish heart is not equivalent to the expression of the corresponding transcript in fetal and adult mice. Similar to the mouse, the lowest relative level of the short transcript was observed in the zebrafish heart. However, unlike in mice, the fxr1 short transcript was expressed at relatively higher levels in skeletal muscle compared to the other tissues tested (Bakker et al. 2000; de Diego Otero et al. 2000; Khandjian et al. 1998). Unlike the long fxr1 splice form, the short fxr1 transcript is expressed at medium levels in brain and liver.

As described above, the patterns of expression of the FMR1-related gene family in zebrafish are broadly consistent with the expression of orthologs found in mouse and human, implying that functions and interactions within this gene family may be comparable between species. As such, zebrafish should be an excellent model organism in which to study interactions within the vertebrate FMR1-related gene family and the consequences of loss of activity of individual gene family members through genetic manipulation.