Abstract
We characterized four eEF1A genes in the alternative rhabditid nematode model organism Oscheius tipulae. This is twice the copy number of eEF1A genes in C. elegans, C. briggsae, and, probably, many other free-living and parasitic nematodes. The introns show features remarkably different from those of other metazoan eEF1A genes. Most of the introns in the eEF1A genes are specific to O. tipulae and are not shared with any of the other genes described in metazoans. Most of the introns are phase 0 (inserted between two codons), and few are inserted in protosplice sites (introns inserted between the nucleotide sequence A/CAG and G/A). Two of these phase 0 introns are conserved in sequence in two or more of the four eEF1A gene copies, and are inserted in the same position in the genes. Neither of these characteristics has been detected in any of the nematode eEF1A genes characterized to date. The coding sequences were also compared with other eEF1A cDNAs from 11 different nematodes to determine the variability of these genes within the phylum Nematoda. Parsimony and distance trees yielded similar topologies, which were similar to those created using other molecular markers. The presence of more than one copy of the eEF1A gene with nearly identical coding regions makes it difficult to define the orthologous cDNAs. As shown by our data on O. tipulae, careful and extensive examination of intron positions in the eEF1A gene across the phylum is necessary to define their potential for use as valid phylogenetic markers.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
Introduction
The A subunit of Eukaryotic Elongation Factor 1 is one of the most abundant cytoplasmic proteins (Slobin 1980). In addition to its key function in the elongation step of protein biosynthesis, several additional functions have been attributed to the elongation factor (Yang et al. 1990). Because of its abundance, cDNAs transcribed from eEF1A genes have been isolated from nearly every organism for which an expressed sequence tag (EST) project has been established. Nucleotide and amino acid sequences have been used for phylogenetic studies (Hashimoto et al. 1995), although attempts to use these sequences to resolve deep evolutionary relationships between extant taxa have not been particularly successful (Hashimoto et al. 1995). Nonetheless, eEF1A sequences provide relatively consistent results in resolving more recent evolutionary branches (Cho et al. 1995). The high expression level of the eEF1A gene has prompted several groups to use its promoter region as a driver in chimeric expression vectors (Linney et al. 1999). These constructs have only been applied to mammalian cell cultures, perhaps because very few eEF1A genes have been functionally characterized in eukaryotes other than the few standard models. Although the Caenorhabditis elegans genes for eEF1A were completely sequenced and partially characterized (Bieri et al. 2007), the function of their promoter regions has not been described.
Because of the crucial functions of eEF1A, the genes that code for this protein are usually present at more than one copy per haploid genome (Liu et al. 1996). Indeed, the parasitic protozoan Trypanosoma brucei contains 8 to 10 eEF1A genes per haploid genome (Ridgley et al. 1996).
In the metazoans, there are two or three copies of the eEF1A gene, each of them differentially expressed during development or expressed in distinct tissues (Shepherd et al. 1989; Deschamps et al. 1991; Abdallah et al. 1991; Lee et al. 1992, 1993). A number of diverse biological processes are known to involve eEF1A protein, including actin bundling (Yang et al. 1990) and aging (Shepherd et al. 1989; Negrutskii and El’skaya 1998). Non-expressed pseudogenes resembling eEF1A have been described in the human genome (Madsen et al. 1990) but not in the genomes of other metazoans.
Although partial cDNAs exist for several parasitic nematodes (Wylie et al. 2004), descriptions of eEF1A genes are rare in Nematoda and are restricted to the genus Caenorhabditis. In order to better understand its phylogenetic position among the rhabditid nematodes we characterized the eEF1A genes of O. tipulae. O. tipulae (strain CEW1) has been used as a comparative genetic model in nematodes (Félix 2006) and was the second nematode (after C. elegans) in which operons were shown to be present (Evans et al. 1997). Although morphologically similar to C. elegans, the vulval precursors of O. tipulae arise by double induction of the anchor cell (Dichtel-Danjoy and Félix 2004). In addition, one of the vitellogenin genes of O. tipulae uses different codon combinations from those of C. elegans (see codon usage table, Winter et al. 1996). Ahn and Winter (2005, 2006) showed that although their genomes are the same size, the GC content of O. tipulae genomic DNA is significantly higher than that of C. elegans. Recent results from Baïlle et al. (2008) showed that O. tipulae populations harbor fivefold higher genetic diversity than Caenorhabditis spp. populations.
These results indicate that some introns of O. tipulae eEF1A genes are absolutely conserved in sequence and position, a feature not yet detected in any other metazoan eEF1A gene. Compared to other nematode species, the O. tipulae genome contains twice the number of genes coding for eEF1A. A GenBank search revealed eEF1A ESTs from other species, and these were assembled into complete coding sequences (when possible) and used for phylogenetic analysis of nematodes.
Materials and Methods
Worm Cultures and DNA/RNA Purification
Caenorhabditis elegans, strain N2, and Oscheius tipulae, strain CEW1, were maintained in NGM plates (22-mm diameter) with a lawn of E. coli (Strain NA22), as previously described (Stiernagle 1999). Larger quantities of worms were obtained from 90-mm NGM plates or 1500-ml liquid cultures according to Sulston and Hodgkin (1988). The isolated worms were thoroughly washed in S buffer, floated once in 30% sucrose, kept in S buffer at room temperature for 2 h for intestine cleansing, and washed again before use.
DNA extractions were performed by incubating the cleansed worms in 1 vol of lysis buffer (100 mM Tris–HCl, 50 mM EDTA, 100 μg/ml proteinase K, and 1.0% SDS) at 65°C for 30 min, followed by extractions with 1:1 Phenol:chloroform and chloroform. The genomic DNA was precipitated with ammonium acetate and ethanol (Winter et al. 1996).
Bacteriophage Clones and Libraries
The genomic clone λ-EMB-I3-1 was isolated from an O. tipulae genomic library after screening with a cDNA probe (Winter et al. 1996). Screening of a λ-ZAP II (Promega) cDNA library (obtained from Dr. Marie-Anne Félix, Université de Paris IV) was performed in six 140-mm NZCYM plates, using two Hybond-N replicas for each. Alkali-treatment, washing, hybridization, and phage clone purification were performed as described by Sambrook et al. (1989). Phagemid excision was performed with the Ex-Assist helper phage (Stratagene) according to the manufacturer’s instructions.
DNA and RNA manipulations were performed as described by Sambrook et al. (1989) except when noted in the text. Restriction fragments were subcloned into pTZ-19U (Fermentas) or Litmus 29 (New England BioLabs). The PCR fragments were cloned into pGEM-T easy vector (Promega) or Topo-TA (Invitrogen) according to the manufacturer’s instructions. Chemically competent HB101, DH-5α, and XL1-Blue E. coli strains were used for transformation with these plasmids.
Probes for screening and hybridization were radioactively labeled by random priming (Feinberg and Vogelstein 1983) using the RediPrime kit (Amersham) according to the manufacturer’s instructions. Hybridization took place overnight at 37°C in 50% formamide, 5× SSC, 1.0× Denhardt, 50 mM sodium phosphate buffer, pH 6.5, 200 μg/ml ssDNA, and 0.1% SDS after a 2-h prehybridization at 42°C in 50% formamide, 5× SSC, 2.5× Denhardt, 50 mM sodium phosphate buffer, pH 6.5, 25 μg/ml ssDNA, and 0.1% SDS. After the hybridization incubation, blots were washed three times for 5 min in 2× SSC, 0.1% SDS at 22°C, followed by one 30-min wash in 0.1× SSC, 0.1% SDS at 55°C and 5 min in 0.1× SSC at 22°C prior to film exposure.
PCR amplifications and DNA sequencing reactions (Thermo Sequenase, Amersham; Big Dye automatic sequencing kit, Perkin Elmer) were performed according to the manufacturer’s instructions. The radioactive sequencing reactions were resolved in 0.4-mm-thick polyacrylamide denaturing gels (8% polyacrylamide, 8 M urea) with glycerol and glycerol-tolerant buffer gradients whose values varied, respectively, from 0% glycerol and 0.5× buffer at the top to 10% glycerol and 2.5× buffer at the bottom. Automatic sequencing was performed in an ABI 310 platform (Perkin-Elmer) using the enzymatic method (Sanger et al. 1977) with fluorescence-labeled ddNTPs.
Sequence Alignment and Phylogenetic Analysis
Sequence alignments were performed with ClustalX or with the MegAlign module from the Lasergene software suite; contig assembly was performed with SeqMan (DNAstar Inc.). Phylogenetic trees were constructed using Mr Bayes version 3.1.2 (Huelsenbeck and Ronquist 2001; Ronquist and Huelsenbeck 2003) and PAUP version 4.0b10 (Sinauer Associates, Sunderland, MA). Individual ESTs, obtained from GenBank, were assembled into complete eEF1A sequences for each nematode species. We assessed species cited on the Parasitic Nematode Sequencing Project (http://www.nematode.net/) at the Genome Sequencing Center, Washington University School of Medicine. Several GenBank sequences of eEF1A cDNA could not be fully resolved. All robust EST sequences were aligned and grouped according to coincident single-nucleotide polymorphisms (SNPs) and/or variable untranslated regions (UTRs).
From the GenBank EST database, we constructed complete eEF1A cDNA sequences for Ancylostoma caninum (BK006068 and BK006069), Ancylostoma ceylanicum (BK006070 and BK006060), Ascaris suum (BK006061), Globodera pallida (BK006062), Heterodera glycines (BK006063), Pristionchus pacificus (BK006064), Strongyloides stercoralis (BK006065 and BK006066), and Xifinema index (BK006059 and BK006065) and deposited them at the NCBI Third Party Annotation (TPA) database. The complete cDNA sequence of Onchocerca volvulus eEF1A (M64333) was extracted from the GenBank database. The alignment of all nematode eEF1A cDNA coding sequences used for phylogenetic analysis can be found under accession number ALIGN_001131 at the EMBL-Align databank (ftp://ftp.ebi.ac.uk/pub/databases/embl/align/).
The sequence alignments of O. tipulae eEF1A genes described here are databased at GenBank (accession numbers AY928342 [Oti-eft-1], AY928338 [Oti -eft-2], AY928340 [Oti -eft-3] and AY928337 [Oti -eft-4]), as are the sequences of their cDNAs (AY928339, AY928341).
Results
Oscheius tipulae Has Four eEF1A Genes
The first eEF1A gene was identified in a genomic clone (λ-EMB-I3-1; see Materials and Methods) by hybridization with a total cDNA probe (see Materials and Methods). A 1380-bp-long region of eEF1A coding sequence was isolated, along with the regions ~2.0 kbp upstream and ~1.0 kbp downstream of the coding sequence. This entire region was referred to as Oti-eft-1,Footnote 1 and it is the only one of the four regions described here which was characterized directly from a cloned genomic DNA fragment. The coding sequence was completely sequenced, together with 731 bp upstream and 187 bp downstream.
Using a large portion (~70%) of the Oti-eft-1 coding region as a probe, an exhaustive screening identified 10 eEF1A clones from a λ-ZAP II O. tipulae cDNA library. These clones were purified, excised, and sequenced, and revealed similar coding sequences (though the 5′ and 3′ UTRs were predictably divergent). These coding sequences were grouped into four classes, with one class composed entirely of Oti-eft-1 transcripts. The genomic sequences corresponding to the cDNAs were isolated by PCR with primer pairs designed to anneal at the 5′ and 3′ UTR sequences (see Supplementary Table S1). These genomic sequences were named Oti-eft-2, Oti-eft-3, and Oti-eft-4, respectively (Fig. 1).
O. tipulae eEF1A Gene Introns
All the isolated genes contain introns of various sizes. Oti-eft-1 contains four introns within the coding region, along with a large intron (516 bp) in the 5′ UTR, seven bases upstream from the start codon (cf. Fig. 1). Unfortunately, this feature could not be confirmed in the other genes because, unlike Oti-eft-1, they were not isolated from a genomic library. Introns located within the 5′ UTR region are not an uncommon feature of eEF1A genes, having been described in Drosophila, Xenopus, Danio rerio, and human eEF1A genes, where 5′ introns appear to play an enhancer role in gene expression (Kim et al. 1990). These 5′ UTR introns have not been identified in the Caenorhabditis spp. eEF1A genes annotated to date (Bieri et al. 2007).
Examination of the coding regions of the four O. tipulae eEF1A genes allowed for identification of conserved and unique introns (Fig. 2). The position-conserved introns were designated by letters (A to E) and ranged from 100% sequence identity to completely dissimilar and unalignable sequences (Fig. 3, Table 1).
Oti-eft-1 has the most uniformly sized introns, ranging from 41 to 43 nucleotides (nt; all phase 0). The least uniform is Oti-eft-3, which has two short, 39-bp introns, three medium introns (ranging from 42 to 72 bp), and a 328-bp intron, the longest intron within any of the coding regions.
Oti-eft-2 and Oti-eft-4 share the greatest number of conserved introns (A, B, C, and D), some of which have highly conserved sequences (Fig. 3). Intron C, with four substitutions and one indel in position 194, is found only in these two genes, and is almost four times the size of an average C. elegans intron (Blumenthal and Steward 1997). Considering that large introns in C. elegans tend to be novel (Coghlan and Wolfe 2004), and given the remarkable sequence similarity found between intron C in Oti-eft-2 and intron C in Oti-eft-4 (Table 1), a recent origin for these introns is plausible, and the homology possibly arose due to an insertion followed by a recombination event.
The Oti-eft-3 gene contains the largest number of gene-specific introns (four). The first intron of Oti-eft-3 is of particular interest, as it is bounded by GC-AG ends instead of typical GU-AG ends. The former are found in 0.7% of human pre-mRNA introns and 0.6% of the cDNA-confirmed introns of C. elegans, where they play an important role in the control of the alternative splicing of some genes (Farrer et al. 2002).
The coding sequence of Oti-eft-3 is the most divergent (see phylogenetic analysis below) and includes a three-base insertion close to the translation start codon (Fig. 1a). These three new bases present in the gene insert a glycine residue in a highly conserved region possibly involved in the GTP-binding site (see Fig. 5). This raises the question whether this insertion is a footprint due to the loss of an intron, as occurred with the jingwei gene in Drosophila teissieri (Llopart et al. 2002).
The intron phase distribution is remarkably biased toward phase 0, and indeed, 16 of 20 introns (80%) show phase 0 characteristics. All the position-conserved introns in O. tipulae eEF1A genes are phase 0. Some authors contend that this is an indication of a single ancient origin (Zhaxybayeva and Gogarten 2003), although other authors (Qiu et al. 2004; Coghlan and Wolfe 2004) disagree. One of the phase 0, gene-copy-specific introns, as well as three of the position-conserved introns (which are also phase 0), is inserted within a consensus sequence thought to be a protosplice site, MAG↓R (Dibb and Newman 1989; Fedorov et al. 2003), where introns would preferably insert (Fig. 3). According to Coghlan and Wolfe (2004), these features could be interpreted as evidence of a recent origin for these introns.
Nevertheless, O. tipulae eEF1A genes share introns between genes, but not with other nematodes (Fig. 4). Coghlan and Wolfe (2004) used this characteristic to identify novel introns in C. briggsae and/or C. elegans. In our study, we were unable to distinguish recent insertions from ancient insertions (older than 100 MY) because of the rapid sequence divergence of nematode genomes (Kiontke et al. 2004; Mitreva et al. 2005).
With regard to the gene-specific introns, the intron phase is distributed evenly among the three phases, with three phase 0 introns and two of each other phase.
The 5′ and 3′ ends of the O. tipulae eEF1A introns fit into previously established consensus for C. elegans (Blumenthal and Steward 1997) (Table 2), suggesting a similar splicing apparatus between these two rhabditid nematodes (Winter et al. 1996; Evans et al. 1997; see also Guiliano and Blaxter 2006). O. tipulae eEF1A genes do not contain any introns homologous to C. elegans or C. briggsae homologues. Intron A has a somewhat similar position, three bases upstream of its putative corresponding Cel-eft-3 intron. This is particularly interesting since this C. elegans intron is position conserved between protostomes (nematodes and insects) and deuterostomes (vertebrates) (Fig. 4). A similar conservation of position is observed in O. tipulae Intron D, which is homologous to introns in arthropods and vertebrates, but not in C. elegans or C. briggsae.
O. tipulae eEF1A Coding Sequences and Their GC Content
Compared with their C. elegans paralogous counterparts, the genes described here contain a 12-nt-long deletion (Fig. 1B), which corresponds to an indel of four amino acids in the C-terminal portion of the translated protein. Aside from this difference, the coding sequences are largely similar, ranging from 80.2% (Oti-eft-3:Cel-eft-4) to 85.1% (Oti-eft-4:Cel-eft-4) sequence identity and from 90.3% (Oti-eft-3) to 91.4% (Oti-eft-4) identity for amino acid sequences (see Supplementary Table S5). This contrasts with the two C. elegans eEF1A genes, in which the translated amino acid sequences are identical.
The G/C content of the O. tipulae eEF1A genes is slightly higher than that of the C. elegans genes. Although the codon usage table of the O. tipulae coding sequences (Supplementary Table S2) resembles that of C. elegans, the former shows some preference for the codons with C or G in the third position. Comparing the eEF1A genes reveals that the average G/C value of the codons is similar: ~55% in O. tipulae and ~53% in C. elegans. However, considering only the third codon positions, O. tipulae shows an average value of 70%, compared to only 62% in C. elegans. A similar preference is observed in proline codon preference (100% CCA in C. elegans, compared to 49.5% CCC in O. tipulae), histidine (59.1% CAC in C. elegans and 80% in O. tipulae), and glutamine (53.8% CAA in C. elegans and 85.7% CAG in O. tipulae). Similar results have been described for the vit-6 gene in O. tipulae (Winter et al. 1996). Furthermore, the G/C content of eEF1A introns in O. tipulae genes is higher, varying from 35% to 50%, compared to an average value of 30% for eEF1A introns in C. elegans (Blumenthal and Steward 1997). These facts reflect the higher overall G/C content in the O. tipulae genome (42%) (Ahn and Winter 2005) compared to the C. elegans genome (36%) (Waterston et al. 1997).
Evolution and Comparative Analysis of Nematode eEF1A cDNA Sequences
In order to evaluate the reliability of eEF1A sequences as a phylogenetic molecular marker for nematodes, we made contigs of complete eEF1A-coding mRNA sequences using EST sequences from various nematode species available in GenBank. Although eEF1A mRNAs tend to be abundant in cells, some nematode EST projects are probably based on normalized cDNA libraries, making complete eEF1A mRNA coding sequences relatively rare in the databanks. The contigs corresponding to complete eEF1A coding sequences were aligned and compared. In some cases, coincident SNPs allowed detection of sequences that had probably been transcribed from different gene copies. Unfortunately, these transcripts often did not contain complete EST coding-region sequences. When entire ESTs were present, the assembled sequences were treated independently, although there is no experimental evidence that they are actually transcribed from different genomic regions.
The inferred amino acid sequences proved to be quite similar, particularly in the N-terminal region, where they were almost identical to the 83rd residue of the C. elegans eEF1A sequence. The other sequences showed a number of differences, some of which were species specific. None of the GTP- or aminoacyl-tRNA binding sites depicted in Fig. 5 seem to be affected by these differences. As in C. briggsae, C. elegans, and O. tipulae, most of the indels are concentrated in the carboxyl end of the protein (cf. Fig. 1).
Most of the differences between nucleotide sequences are in the third position of the codons. The G/C content of the third position varied remarkably, from 26.9% in S. stercoralis to 75.9% in P. pacificus (Fig. 3). The other codon positions showed more uniform values (data not shown). A recent extensive analysis of 25 million codons from ESTs of 32 different nematode species suggests that the frequency of G/C in the third codon position varies with the G/C of the entire genome (Mitreva et al. 2006). The G/C content observed in the third codon position for all the analyzed cDNAs was always higher than the G/C content for the other genes analyzed in these species (Cutter et al. 2006).
Bayesian, maximum-likelihood, and maximum-parsimony trees constructed with the coding DNA from nematode eEF1A genes showed very similar topologies supported by either bootstrapping values or Bayesian posterior probability (Fig. 7). Though some details differ, our OTU groupings superficially resemble those suggested by rooted trees constructed by other authors (Blaxter 2003; De Ley 2006). All the genes used to construct our trees originated from nematode species of the order Rhabditida, with the exception of Xiphinema index, which belongs to the order Dorylaimida. Within rhabditids, sequences obtained from species of suborder Rhabditina cluster together, though the resolution between orders is not reflected in branch topology. Pristionchus pacificus eEF1A cDNA (Ppa-eft-1) clusters within Rhabditomorpha sequences, though this may be caused by the low sampling of Diplogasteromorpha species.
The Ancylostoma genes grouped together but were not consistent between species: A. ceylanicum genes grouped with A. caninum. Xiphinema index was positioned closer to the plant-parasitic nematode branch and the Ascaris suum/Onchocerca volvulus group showed rather low bootstrap values. In the distance trees, these branches are quite long, probably because they connect single OTUs belonging to distinct nematode orders and suborders. The fact that all three methods yielded relatively similar topologies suggests that the inconsistent groupings are not inherent to the methods used to construct the trees, but may be due to low sampling or may indicate that eEF1A nucleotide sequences are not good taxonomic markers for this group.
These puzzling grouping patterns are consistent, even when considering only the third position of each codon and/or discarding the indel-rich portions on the 3′ ends of the sequences (results not shown). A pairwise distance analysis derived from RSCU (Relative Synonymous Codon Usage [McInerney 1998]; see Supplementary Table S4 and Fig. S1) shows a highly distorted tree, probably because of the GC3 differences. Indeed, Ascaris suum groups with members of the suborder Tylenchina (Heterodera glycines and Globodera pallida), and separately from Onchocerca volvulus, a member of the same suborder as A. suum (Spirurina). We conclude that eEF1A gene sequences, although useful for phylogenetic studies of closely related nematodes, should be used with caution if the number of paralogous sequences in the species analyzed is unknown.
Discussion
The Number of eEF1A Genes in the O. tipulae Genome
The four eEF1A genes present in the O. tipulae genome support the Winter et al. (1996) hypothesis that nematode genomes diverge more rapidly than nematode morphology (Kiontke et al. 2004; Mitreva et al. 2005). This large number of eEF1A genes is uncommon in described metazoan genomes, and indeed, most genomes contain only two eEF1A copies. The greatest number of eEF1A genes is found in Xenopus laevis (Deschamps et al. 1991), which appears to have at least three copies, expressed in a developmentally controlled manner (Abdallah et al. 1991). In Homo sapiens, a large number of eEF1A genes (nearly 20) have also been described, although only 2 of them have been confirmed as expressed copies, and all other copies were shown to be pseudogenes (Madsen et al. 1990). The O. tipulae eEF1A genes described here are unlikely to be pseudogenes, as cDNAs were isolated and sequenced for all of them.
Results from Southern blots of restriction enzyme-digested O. tipulae genomic DNA using C. elegans actin probes suggested a greater number of actin genes in the O. tipulae genome than in the C. elegans genome (Winter, unpublished results). Interestingly, although these results could be taken as evidence of partial genome duplications, studies of genomic DNA reassociation kinetics showed that the genome size is the same in O. tipulae and C. elegans (Ahn and Winter 2006). Intrachromosomal rearrangements are fairly common in nematodes (Mitreva et al. 2005) and can lead to gene duplications.
Comparative Analysis of O. tipulae eEF1A Genes: Introns
Oti-eft-3 has the highest number of gene-specific introns. These introns do not appear to show a preference for a particular insertion phase, as expected in a pattern of random intron insertion. The largest intron (328 nt; see Figs. 2 and 3) found interrupting the coding region of an O. tipulae eEF1A gene is found in Oti-eft-3. It is larger than any other intron within a coding sequence described in O. tipulae.
Introns A, D, and E of O. tipulae genes are inserted within a motif that may be a protosplice site (MAG↓R) (Dibb and Newman 1989; Long et al. 1998) (see Fig. 3). This motif appears to be common in regions immediately adjacent to nematode introns (Blumenthal and Steward 1997). Some studies have shown that recently acquired introns in C. elegans tend to be phase 0 introns located within an AG↓G motif (Coghlan and Wolfe 2004). In addition to its similarity to the protosplice site, this motif is also commonly found in the vicinity of introns specific to certain groups of organisms or protein families, supporting the hypothesis of a recent origin (Kent and Zahler 2000; Sverdlov et al. 2003; Qiu et al. 2004). Of all the introns of the O. tipulae genes available in databanks (including those described here), the protosplice site appears in 13 of 34 (35.3%) (results not shown). If only the introns of the eEF1A genes of O. tipulae are considered, the protosplice site appears in one of the seven nonconserved introns and, interestingly, in the position-conserved introns A, D (the introns with weaker sequence conservation), and E (see Fig. 3 and Table 1). This could be interpreted as support for an insertional origin of these introns.
The other conserved introns (B and C) have strong sequence conservation, which is unusual between introns. Even when the introns of the eEF1A genes within the same genome in other organisms are compared, they are conserved only in their position, and not in their sequence.
If we assume that the eEF1A introns of O. tipulae and C. elegans have different origins (i.e., they were not present in an ancestral genome), two hypotheses could potentially explain our finding: (a) intron insertion in one of the copies was followed by spreading through recombination events to other locations, or (b) there has been some selective pressure maintaining the intron sequences. Hypothesis (a) could explain the insertion of intron A within a target sequence with poor sequence conservation, and could also explain its absence in Oti-eft-3. However, the sequence-conserved introns (B and C) are not located within any of the target motifs for intron insertion. Certainly, intron insertion could occur in sites other than the protosplice sites, but these insertion events would have to be followed by repetitive recombination events. As such, the characteristics of introns C and D could be explained by this hypothesis. However, in the case of introns B and E, the insertion event would have to have been followed by no fewer than three (intron B) or two (intron E) recombination events. Hypothesis (b) would presuppose sequence conservation of some introns vital for gene function from a pool of introns that may have existed in an ancestral gene. All other introns would have arisen by secondary insertion events.
The high intron turnover rate described in nematodes (Kiontke et al. 2004), along with the different transcriptional control strategy hypothesized above, could account for the differences in intron positions of the eEF1A genes in O. tipulae and C. elegans. Different transcriptional control strategies would require elements located in different positions, making introns important for one species less important for another.
The first intron of Oti-eft-3 is of particular interest because it has a rare GC-AG intron boundary, which is a unique type of donor site that appears in less than 1% of C. elegans introns. This motif is sometimes implicated in the regulation of alternative splicing (Farrer et al. 2002). However, we failed to detect any Oti-eft-3 cDNA clones that displayed evidence of alternative splicing (results not shown).
Can eEF1A Be Used as a Tool to Study Nematode Evolution?
Unlike the trees in other studies, which were based on the primary structure of eEF1A proteins, the trees described here (see Fig. 7) were based on nucleotide sequences. Despite the small number of sampled taxa, the grouping pattern obtained was relatively similar to that established by rRNA phylogenies. The Pristionchus pacificus eEF1A cDNA sequence is remarkably distinct in its composition from those of the other nematodes (cf. Fig. 6). This might account for its persistent positioning among the infraorder Rhabditomorpha, which includes, among others, the genera Caenorhabditis, Oscheius, and Ancylostoma (see Fig. 7).
Oscheius tipulae as an Example of eEF1A Gene Diversity in Nematodes
The data described in this paper reinforce previous findings (Winter et al. 1996) and reveal striking molecular differences between Oscheius tipulae and C. elegans. Analysis of the large subunit of RNA polymerase I showed that this protein has an amino acid substitution rate in Rhabditid nematodes comparable to that of different orders of the Mammalia (Kiontke et al. 2004). Therefore, although Oscheius tipulae and C. elegans are very different at the molecular level, this does not imply great evolutionary distance. O. tipulae has a strong preference for G or C at the third position of its codons (see Fig. 6) and has a higher copy number of the highly expressed eEF1A gene. The introns show strong sequence conservation, suggesting some selective pressure on them and possibly a consequentially different strategy of gene expression.
The use of eEF1A sequences as a tool for evolutionary studies has been highly controversial. The criticism of this strategy cites the inconsistency of basal branching (Hashimoto et al. 1995; Cho et al. 1995). The results described here show some unexpected branching, including that of Ancylostoma and Caenorhabditis, as well as the splitting of the suborder Tylenchina, suggest that special attention is required when using eEF1A as a phylogenetic marker. Indeed, eEF1A is frequently coded by more than one gene copy, each with different expression patterns. This itself poses a phylogenetic problem, as comparisons between paralogous copies could give rise to false organism phylogenies. Differences in expression patterns may also be confounding, as these can lead to different codon-usage preferences (Duret and Mouchiroud 1999).
The use of eEF1A as a phylogenetic marker in nematodes would require a major effort to increase the number of sampled taxa and to describe a more comprehensive set of genes from each organism. In beetles (Curculionidae), coincident intron positioning appears to be a good criterion for supporting gene orthology inferences (Jordal 2002). A similar approach in nematodes would likely be confounded, however, due to the limited number of closely related taxa that have been sampled. However, the number of eEF1A genes found in O. tipulae and the number of introns in these genes suggest that the genus Oscheius would be a good candidate for this type of analysis. In conclusion, the high diversity of Nematodes at the molecular level (Mitreva et al. 2005) and the results we describe here indicate that nematode eEF1A genes are good candidates for use in the study of intron diversity and evolution. Due to the high genetic diversity of different geographic isolates of O. tipulae (Baïlle et al. 2008), an analysis of eEF1A genes from different populations could yield important data about intron evolution in rhabditid nematodes.
Notes
About gene nomenclature: although we use the same three-letter code for the O. tipulae eEF1A genes as that used for C. elegans, the numbering does not reflect orthology, but simply the order in which the genes were isolated and/or characterized.
References
Abdallah B, Hourdry J, Krieg PA, Denis H, Mazabraud A (1991) Germ cell-specific expression of a gene encoding eukaryotic translation elongation factor 1 alpha (eEF-1 alpha) and generation of eEF-1 alpha retropseudogenes in Xenopus laevis. Proc Natl Acad Sci USA 88:9277–9281
Ahn IY, Winter CE (2005) Determination of DNA base composition by small scale acrylamide-CsCl gradient centrifugation. J Biochem Biophys Methods 63:155–160
Ahn IY, Winter CE (2006) The genome of Oscheius tipulae: determination of size, complexity and structure by DNA reassociation using fluorescent dye. Genome 49:1007–1015
Andersen GR, Pedersen L, Valente L, Chatterjee I, Kinzy TG, Kjeldgaard M, Nyborg J (2000) Structural basis for nucleotide exchange and competition with tRNA in the yeast elongation factor complex eEF1A:eEF1Balpha. Mol Cell 6:1261–1266
Bieri T, Blasiar D, Ozersky P et al (2007) WormBase: new content and better access. Nucleic Acids Res (Database Issue):D506–D510
Blaxter ML (2003) Nematoda: genes, genomes and the evolution of parasitism. Adv Parasitol 54:101–195
Blumenthal T, Steward K (1997) RNA processing and gene structure. In: Riddle D, Blumenthal T, Meyer BJ, Priess JR (eds) C. elegans II. Cold Spring Harbor Laboratory Press, Woodbury, NY, pp 117–146
Carr-Schmid A, Durko N, Cavallius J, Merrick WC, Kinzy TG (1999) Mutations in a GTP-binding motif of eukaryotic elongation factor 1A reduce both translational fidelity and the requirement for nucleotide exchange. J Biol Chem 274:30297–30302
Cho S, Mitchell A, Regier JC, Mitter C, Poole RW, Friedlander TP, Zhao S (1995) A highly conserved nuclear gene for low-level phylogenetics: elongation factor-1 alpha recovers morphology-based tree for heliothine moths. Mol Biol Evol 12:650–656
Coghlan A, Wolfe KH (2004) Origins of recently gained introns in Caenorhabditis. Proc Natl Acad Sci USA 101:11362–11367
Cutter AD, Wasmuth JD, Blaxter ML (2006) The evolution of biased codon and amino acid usage in Nematode genomes. Mol Biol Evol 23:2303–2315
De Ley P (2006) A quick tour of nematode diversity and the backbone of nematode phylogeny, The C. elegans Research Community, editor. WormBook, doi/10.1895/wormbook.1.41.1/ Available at: http://www.wormbook.org
Deschamps S, Morales J, Mazabraud A, le Maire M, Denis H, Brown DD (1991) Two forms of elongation factor 1 alpha (EF-1 alpha O and 42Sp50), present in oocytes, but absent in somatic cells of Xenopus laevis. J Cell Biol 114:1109–1111
Dever TE, Glynias MJ, Merrick WC (1987) GTP-binding domain: three consensus sequence elements with distinct spacing. Proc Natl Acad Sci USA 84:1814–1818
Dibb NJ, Newman AJ (1989) Evidence that introns arose at protosplice sites. EMBO J 8:2015–2021
Dichtel-Danjoy ML, Félix MA (2004) Phenotypic neighborhood and micro-evolvability. Trends Genet 20:268–276
Duret L, Mouchiroud D (1999) Expression pattern and, surprisingly, gene length shape codon usage in Caenorhabditis, Drosophila and Arabidopsis. Proc Natl Acad Sci USA 96:4482–4487
Evans D, Zorio D, MacMorris M, Winter CE, Lea K, Blumenthal T (1997) Operons and SL2 trans-splicing exist in nematodes outside the genus Caenorhabditis. Proc Natl Acad Sci USA 94:9751–9756
Farrer T, Roller AB, Kent WJ, Zahler AM (2002) Analysis of the role of Caenorhabditis elegans GC-AG introns in regulated splicing. Nucleic Acids Res 30:3360–3367
Fedorov A, Roy S, Fedorova L, Gilbert W (2003) Mystery of intron gain. Genome Res 13:2236–2241
Feinberg AP, Vogelstein B (1983) A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal Biochem 132:6–13
Félix MA (2006) Oscheius tipulae. In: The C. elegans Research Community, (ed) WormBook, doi/10.1895/wormbook.1.119.1. Available at: http://www.wormbook.org
Gouet P, Courcelle E, Stuart DI, Metoz F (1999) ESPript: multiple sequence alignments in PostScript. Bioinformatics 15:305–308
Guiliano DB, Blaxter ML (2006) Operon conservation and the evolution of trans-splicing in the phylum Nematoda. PLoS Biol 2:1871–1882
Hashimoto T, Nakamura Y, Kamaishi T, Adachi J, Nakamura F, Okamoto K, Hasegawa M (1995) Phylogenetic place of kinetoplastid protozoa inferred from a protein phylogeny of elongation factor 1 alpha. Mol Biochem Parasitol 70:181–185
Huelsenbeck JP, Ronquist F (2001) MRBAYES: Bayesian inference of phylogeny. Bioinformatics 17:754–755
Jordal BH (2002) Elongation factor 1 α resolves the monophyly of the haplodiploid ambrosia beetles Xyleborini (Coleoptera: Curculionidae). Insect Mol Biol 11:453–465
Kent WJ, Zahler AM (2000) Conservation, regulation, synteny, and introns in a large-scale C. briggsae-C. elegans genomic alignment. Genome Res 10:1115–1125
Kim DW, Uetsuki T, Kaziro Y, Yamaguchi N, Sugano S (1990) Use of the human elongation factor 1 alpha promoter as a versatile and efficient expression system. Gene 91:217–223
Kiontke K, Gavin NP, Raynes Y, Roehrig C, Piano F, Fitch DH (2004) Caenorhabditis phylogeny predicts convergence of hermaphroditism and extensive intron loss. Proc Natl Acad Sci USA 101:9003–9008
Lee S, Francoeur AM, Liu S, Wang E (1992) Tissue-specific expression in mammalian brain, heart, and muscle of S1, a member of the elongation factor–1 alpha gene family. J Biol Chem 267:24064–24068
Lee S, Wolfraim LA, Wang E (1993) Differential expression of S1 and elongation factor-1 alpha during rat development. J Biol Chem 268:24453–24459
Linney E, Hardison NL, Lonze BE, Lyons S, DiNapoli L (1999) Transgene expression in zebrafish: a comparison of retroviral-vector and DNA-injection approaches. Dev Biol 213:207–216
Liu G, Edmonds BT, Condeelis J (1996) pH, EF-1alpha and the cytoskeleton. Trends Cell Biol 6:168–171
Llopart A, Comeron JM, Brunet FG, Lachaise D, Long M (2002) Intron presence-absence polymorphism in Drosophila driven by positive Darwinian selection. Proc Natl Acad Sci USA 99:8121–8126
Long M, de Souza SJ, Rosenberg C, Gilbert W (1998) Relationship between “protosplice sites” and intron phases: evidence from dicodon analysis. Proc Natl Acad Sci USA 95:219–223
Madsen HO, Poulsen K, Dahl O, Clark BF, Hjorth JP (1990) Retropseudogenes constitute the major part of the human elongation factor 1 alpha gene family. Nucleic Acids Res 18:1513–1516
McInerney JO (1998) GCUA (General Codon Usage Analysis). Bioinformatics 14:372–373
Mitreva M, Blaxter ML, Bird DM, McCarter JP (2005) Comparative genomics of nematodes. Trends Genet 21:573–581
Mitreva M, Wendl MC, Martin J, Wylie T, Yin Y, Larson A, Parkinson J, Waterston RH, McCarter JP (2006) Codon usage patterns in Nematoda: analysis based on over 25 million codons in thirty-two species. Genome Biol 7:R75
Negrutskii BS, El’skaya N (1998) Eukaryotic translation elongation factor 1 alpha: structure, expression, functions, and possible role in aminoacyl-tRNA channeling. Prog Nucleic Acid Res Mol Biol 60:47–78
Peden JF (1999) Analysis of codon usage. PhD thesis. University of Nottingham, UK
Qiu WG, Schisler N, Stoltzfus A (2004) The evolutionary gain of spliceosomal introns: sequence and phase preferences. Mol Biol Evol 21:1252–1263
Ridgley EL, Xiong ZH, Kaur KJ, Ruben L (1996) Genomic organization and expression of elongation factor-1 alpha genes in Trypanosoma brucei. Mol Biochem Parasitol 79:119–123
Ronquist F, Huelsenbeck JP (2003) MRBAYES 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572–1574
Sambrook J, Fritsch EF, Maniatis T (1989) Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY
Sanger F, Nicklen S, Coulson AR (1977) DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA 74:5463–5467
Shepherd JC, Walldorf U, Hug P, Gehring W (1989) Fruit flies with additional expression of the elongation factor EF-1 alpha live longer. Proc Natl Acad Sci USA 86:7520–7521
Slobin LI (1980) The role of eucaryotic factor Tu in protein synthesis. The measurement of the elongation factor Tu content of rabbit reticulocytes and other mammalian cells by a sensitive radioimmunoassay. Eur J Biochem 110:555–563
Stiernagle T (1999) Maintenance of C. elegans. In: Hope IA (ed) C. elegans—a practical approach. Oxford University Press, Oxford, UK, pp 51–67
Sulston J, Hodgkin J (1988) Methods. In: Wood WB (ed) The nematode Caenorhabditis elegans. Cold Spring Harbor Laboratory Press, Woodbury, NY, pp 587–606
Sverdlov AV, Rogozin IB, Babenko VN, Koonin EV (2003) Evidence of splice signal migration from exon to intron during intron evolution. Curr Biol 13:2170–2174
Waterston RH, Sulston JE, Coulson AR (1997) The genome. In: Riddle DL, Blumenthal T, Meyer BJ, Priess JR (eds) C. elegans II. Cold Spring Harbor Laboratory Press, Woodbury, NY, pp 23–45
Winter CE, Penha C, Blumenthal T (1996) Comparison of a vitellogenin gene between two distantly related rhabditid nematode species. Mol Biol Evol 13:674–684
Wylie T, Martin JC, Dante M, Mitreva MD, Clifton SW, Chinwalla A, Waterston RH, Wilson RK, McCarter JP (2004) Nematode.net: a tool for navigating sequences from parasitic and free-living nematodes. Nucleic Acids Res (Database Issue):D423–D426
Yang F, Demma M, Warren V, Dharmawardhane S, Condeelis J (1990) Identification of an actin-binding protein from Dictyostelium as elongation factor 1a. Nature 347:494–496
Zhaxybayeva O, Gogarten JP (2003) Spliceosomal introns: new insights into their evolution. Curr Biol 13:R764–R766
Acknowledgments
We would like to thank Dr. Marie Anne Félix for the kind gift of the cDNA library of Oscheius tipulae (strain CEW1). We would also like to thank Drs. Jeffrey J. Shaw (Department of Parasitology, University of São Paulo) and Thomas Blumenthal (Department of Molecular, Cellular, and Developmental Biology, University of Colorado at Boulder) for commentary on the first draft of the manuscript. This work was supported by grants from FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo, The State of São Paulo Research Foundation). R.N.A. was a FAPESP graduate fellow.
Author information
Authors and Affiliations
Corresponding author
Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
About this article
Cite this article
Akamine, R.N., Winter, C.E. Oscheius tipulae as an Example of eEF1A Gene Diversity in Nematodes. J Mol Evol 67, 278–290 (2008). https://doi.org/10.1007/s00239-008-9147-8
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00239-008-9147-8