Introduction

Proteomic approaches demonstrated that the venom of poisonous animals is composed of dozens or even hundreds of toxic peptides (e.g. Favreau et al. 2006; Liao et al. 2007; Fox and Serrano 2008). Still, little is known about the genes encoding this vast arsenal of toxins and their evolution. It has been shown in snakes, cone snails, and scorpions that toxins are often encoded by multiple gene families subjected to strong positive Darwinian selection, also known as diversifying selection (Nakashima et al. 1993; Duda and Palumbi 1999; Zhu et al. 2004). Positive selection promotes the fixation of non-synonymous substitutions and “accelerates” diversification of related sequences. This high substitution rate is typical of the region encoding the mature toxin. In contrast, the regions encoding the signal peptide and propart, which are involved in secretion, are usually highly conserved (Nakashima et al. 1993; Duda and Palumbi 1999).

Sea anemones are poisonous animals that spend most of their life in a sessile state and thus heavily depend on toxin-producing stinging cells (nematocytes) for prey capture and defense. Whereas a large number of neurotoxins from various sea anemones have been isolated (Honma and Shiomi 2006), the evolution and genomic organization of their corresponding genes have rarely been investigated. The recent genome sequencing project of the sea anemone Nematostella vectensis (Sullivan et al. 2006; Putnam et al. 2007) provided an opportunity to analyze gene families encoding toxins. It was found that this genome contains at least 12 genes all encoding an identical peptide that belongs to the Type I sea anemone class of neurotoxins (Moran and Gurevitz 2006). Their comparison with Type I neurotoxin genes from two Mediterranean anemone species, Anemonia viridis (previously called Anemonia sulcata) and Actinia equina, revealed that this gene family evolved under concerted evolution (Moran et al. 2008a). In this rare phenomenon, the sequence of genes is homogenized through unequal crossing over and gene conversion, resulting in a pattern where two gene family members from one species are more similar to one another than to their corresponding homologues in other species (Nei and Rooney 2005).

Unlike N. vectensis, which produces a small number of Type I neurotoxins, A. viridis and A. equina possess a larger arsenal of this toxin class. For example, in addition to multiple copies of genes encoding the Type I neurotoxin Av2, the genome of A. viridis encodes highly diverged Av2 homologues (Av6, Av8, Av9), which seem to have ‘escaped’ the concerted evolution process and evolve under positive Darwinian selection (Moran et al. 2008a).

While many sea anemone species produce Type I neurotoxins, only four species have been shown to produce Type III neurotoxins. Among them, A. viridis is the only species that has been found thus far to produce both toxin types (Beress et al. 1975; Honma and Shiomi 2006). Type I neurotoxins are peptides of 47–51 amino acids structured as an anti-parallel β-sheet and a long flexible loop, whereas Type III neurotoxins are 27–32 amino acid-long peptides structured merely of rigid β and γ turns (Fig. 1a; Manoleras and Norton 1994). Although the two neurotoxin types are unrelated in sequence and three-dimensional structure, they exhibit similar inhibitory effect on inactivation of voltage-gated sodium channels (Navs) (Fig. 1b; Hartung and Rathmayer 1985; Moran et al. 2007). This similarity in function raised the question whether the two toxin types exhibit a similar pattern of evolution, a question unresolved thus far in the absence of any nucleotide data regarding Type III neurotoxins. To address this question we have amplified and compared genes encoding representatives of Types I and III toxins from A. viridis. This comparison revealed a region of high sequence similarity shared between these genes, including the 5′-untranslated region (5′-UTR), the first intron, and part of the sequence encoding the signal peptide. This surprising similarity strongly suggests a fusion incident between genes encoding the two toxin types. In addition, this analysis identified processed pseudogenes of Types I and III neurotoxins in the genome of A. viridis.

Fig. 1

a Comparison of sea anemone neurotoxins. Representative Type III sea anemone neurotoxins (Av3, Av7, and Av10 of A. viridis, Er I of Entacmaea ramsayi, PaTx of Entacmaea quadricolor, Da I of Dofleinia armata) and representative Type I sea anemone neurotoxins (Av2 and Av6 of A. viridis, Hk2 of Anthopleura sp., Sg2 of Stichodactyla gigantea, Am3 of Antheopsis maculata, and Nv1 of N. vectensis) were aligned according to their cysteine residues which form disulfide bonds (dotted lines). Conservative amino acid substitutions are in gray and non-conservative in black (GenBank accessions appear in the legend of Fig. 4 and in Honma and Shiomi 2006). The three-dimensional modeled structure of Av2 (Moran et al. 2006) and the NMR-based structure of Av3 (Manoleras and Norton 1994) are shown next to their respective groups. b The effects of sea anemone neurotoxins on currents mediated by DmNav1 expressed in X. laevis oocytes. The oocytes were clamped at −80 mV and currents were elicited by step depolarizations to −10 mV. Upon application of 1 μM of Av2, Av3, or Av7, or of 5 μM of Av10, inactivation gradually ceased until the toxin effect reached saturation.

Materials and Methods

Strains and Sample Collection

All DNA manipulations and plasmid preparations were performed using the Escherichia coli strain DH5α. A. viridis specimens were collected at the Mediterranean beaches of Atlit and Michmoret, Israel, and kept alive in sea water. When required they were swiftly dried in paper towels and frozen at −70°C until used. Ovaries were isolated from mesenteries of adult A. viridis females during the breeding season (March till May) by inserting a plastic pipette into the pharynx. The ovary samples were monitored by light microscopy (YS100 microscope, Nikon, USA, equipped with a DP 200 digital camera, Deltapix, Denmark; Supplementary Fig. 1) and contained mostly oocytes as expected (Wedi and Dunn 1983). Ejaculation of sperm was stimulated by exposing A. viridis males to direct sunlight and high temperatures (~30°C). Sperm samples were collected from the water, centrifuged (3,000×g for 5 min), and water was discarded.

Extraction of Nucleic Acids

A. viridis samples were flash frozen in liquid nitrogen and ground to fine powder using mortar and pestle. For DNA extraction, the powder was dissolved in extraction buffer (Tris 10 mM, pH 8.0, EDTA 100 mM, SDS 0.5%) with 0.5 mg/ml proteinase K (Sigma, USA) and incubated at 65°C for two and a half hours before the addition of 10% CTAB (hexadecyltrimethyl ammonium bromide; Sigma, USA) in 0.7 M NaCl to a final concentration of 0.3% of the sample volume. The samples were then incubated for 20 additional minutes at the same conditions, before an equal volume of chloroform/isoamyl alcohol (24:1) was added and after mixing the sample was centrifuged at 12,000×g for 20 min. From this point the procedure was continued as previously described (Dellacorte 1994). To avoid the slightest possibility of RNA contamination, the DNA samples were treated with RNase A (Fermentas, Lithuania). For RNA extraction, the powder was dissolved in Trizol® reagent (Invitrogen, USA), and further RNA purification steps were carried out according to the manufacturer’s instructions. The RNA was treated with a ‘DNA Free kit’ (Ambion, USA) in order to eliminate any residual DNA. Each aliquot of nucleic acids was produced from a single individual and the genomic DNA (gDNA) used in this study was from the same individual collected at Atlit beach near Haifa, which was used in a previous study (Moran et al. 2008a).

3′- and 5′-Rapid Amplification of cDNA Ends

3′-Rapid amplifications of cDNA ends (RACE) were performed using the 5′/3′-RACE kit, 2nd generation (Roche Applied Sciences, Germany), according to the manufacturer’s instructions in the presence of Protector RNase inhibitor (Roche). 5′ RACE was performed by the rapid ligase-mediated RACE (RLM-RACE) approach. The FirstChoice® RLM-RACE kit (Ambion, USA) was used with the following necessary modifications: instead of the supplied M-MLV reverse transcriptase the Omniscript enzyme (Qiagen, USA) was used at 50°C, and PCR conditions after first strand synthesis were modified to “touchdown PCR” [(1) 94°C for 25 s, (2) 72°C for 2 min, (3) back to step 1 six times, (4) 94°C for 25 s, (5) 67°C for 2 min, (6) back to step 4 31 times, (7) 67°C for 7 min]. The PCR products were cloned into pBluescript KS (Stratagene, USA) predigested with EcoRV.
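For illustration only, the modified cycling program listed above can be written down as data, which makes the number of passes per stage explicit. The Python sketch below assumes that “back to step N, k times” means k additional passes (so k + 1 passes in total), which is a common thermocycler convention but is not stated in the text. The names Stage, PROGRAM, cycles, and total_hold_seconds are hypothetical, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One block of the cycling program (hypothetical helper type)."""
    steps: tuple            # ((temperature in deg C, hold time in seconds), ...)
    extra_passes: int = 0   # "back to step N, k times" -> k additional passes


# Program as listed in the text; the repeat interpretation is an assumption.
PROGRAM = (
    Stage(steps=((94, 25), (72, 120)), extra_passes=6),   # steps 1-3
    Stage(steps=((94, 25), (67, 120)), extra_passes=31),  # steps 4-6
    Stage(steps=((67, 420),)),                            # step 7, final extension
)


def cycles(stage: Stage) -> int:
    """Total number of passes through a stage under the stated assumption."""
    return stage.extra_passes + 1


def total_hold_seconds(program) -> int:
    """Sum of all hold times, excluding temperature ramping."""
    return sum(cycles(s) * sum(sec for _, sec in s.steps) for s in program)


if __name__ == "__main__":
    for i, stage in enumerate(PROGRAM, start=1):
        print(f"stage {i}: {cycles(stage)} pass(es)")
    print(f"total hold time: {total_hold_seconds(PROGRAM)} s")
```

Under the alternative reading, in which k is the total number of passes, extra_passes would be k − 1 for the two cycling stages.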

Genome Walking

Genome walking was performed as previously described (Siebert et al. 1995). In brief, 2.5 μg of A. viridis gDNA was digested with DraI, EcoRV, ScaI, or StuI (New England Biolabs, USA), purified and ligated by Mighty mix ligation kit (Takara Bio, Japan) to DNA adaptors. Touchdown PCR was performed with a primer corresponding to the adaptor and a primer corresponding to the sequence of the gene of interest using the digested gDNA as template. This stage was followed by nested PCR, using the diluted product of the previous PCR as template. The genome walking was carried out using primers specific for either the neurotoxin genes or the processed pseudogenes (Supplementary Fig. 2 and Supplementary Table 1) in order to distinguish them from one another and map their flanking regions. Primers specific for the pseudogenes were designed to match the ends of two exons, whereas those for the parental genes were designed to match the intronic sequence. All genome walking PCRs were performed with Phusion, hot-start version (Finnzymes, Finland) and the products were cloned to pBluescript KS (Stratagene) predigested with EcoRV.

Multiple Alignments, Phylogeny, and Alignment Search

Multiple alignments were created using MUSCLE 3.6 (Edgar 2004). All local alignment searches were performed using BlastN, BlastX, or BlastP v. 2.2.17 (Altschul et al. 1997) based on the BLOSUM62 matrix. The databases used were the nucleotide collection (nr/nt) and the non-redundant protein sequences collection (nr) accessible via the Blast section of the NCBI website (http://blast.ncbi.nlm.nih.gov/Blast.cgi). Phylogenetic trees were constructed with the MEGA4 software (Tamura et al. 2007) using the neighbor-joining method with 1,000 bootstrap replicates. Retroelements were identified using the protein-based RepeatMasking tool of the RepeatMasker website and software (The Institute for Systems Biology, USA; www.repeatmasker.org).
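The legend of Fig. 4 states that the trees were built with the p-distance model and the complete-deletion option. As a minimal sketch of what these two settings compute (not a reimplementation of MEGA4), the Python code below removes every alignment column containing a gap in any sequence and then takes the proportion of differing sites for each pair. The function names and the toy alignment are hypothetical and are not taken from the data of this study.

```python
def complete_deletion(aligned):
    """Drop every column that contains a gap ('-') in at least one sequence."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    keep = [i for i in range(len(aligned[0]))
            if all(s[i] != "-" for s in aligned)]
    return ["".join(s[i] for i in keep) for s in aligned]


def p_distance(a, b):
    """Proportion of sites at which two gap-free, equal-length sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(aligned):
    """Pairwise p-distances after complete deletion of gapped columns."""
    stripped = complete_deletion(aligned)
    return [[p_distance(a, b) for b in stripped] for a in stripped]


# Toy alignment for illustration only (hypothetical sequences).
TOY = ["ACGT-A", "ACGTTA", "AC-TTT"]

if __name__ == "__main__":
    for row in distance_matrix(TOY):
        print(row)
```

A matrix of this kind is the input to neighbor-joining; bootstrap support is obtained by resampling alignment columns with replacement and repeating the procedure.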

Detection of Recombination Events

Alignments were analyzed with the RDP2 (Recombination Detection Program 2) suite (Martin et al. 2005) using the RDP, GENECONV, Maximum χ2, Bootscan, Chimera, and SiScan models. All models except GENECONV use a sliding-window approach in order to detect sequence incongruity (Martin et al. 2005). We used all models with their RDP2 default parameters, except that the sequences were treated as linear and the significance threshold was set to P < 0.01.
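The sliding-window idea mentioned above can be illustrated with a toy example. The Python sketch below is not any of the RDP2 models and performs no statistical test; it only shows how pairwise identity computed in successive windows along an alignment can expose a segment whose similarity pattern differs from the rest. The function name, window size, step, and sequences are all hypothetical.

```python
def window_identity(a, b, window, step=1):
    """Fraction of identical sites in successive windows of two aligned sequences.

    Returns a list of (window_start, identity) tuples.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    profile = []
    for start in range(0, len(a) - window + 1, step):
        wa = a[start:start + window]
        wb = b[start:start + window]
        matches = sum(x == y for x, y in zip(wa, wb))
        profile.append((start, matches / window))
    return profile


if __name__ == "__main__":
    # Hypothetical pair: identical in the first half, different in the second.
    for start, identity in window_identity("AAAAAAAA", "AAAATTTT", window=4, step=2):
        print(start, identity)
```

An abrupt change in such a profile is the kind of incongruity that the window-based models evaluate statistically against a third sequence.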

Expression and Purification of Recombinant Neurotoxins

The expression and purification of Type III neurotoxins were carried out as described previously for Av3 (Moran et al. 2007). In summary, the sequence encoding the mature toxin was amplified via PCR, cloned into the NcoI and BamHI sites of a pET-32b derivative, and the resulting plasmid was transformed into Rosetta-gami E. coli cells (Novagen, USA) used for protein expression. The expressed neurotoxins, fused to thioredoxin and His6 tags, were purified on a HisTrap® (GE Life Sciences, Sweden) affinity column and the tags were then cleaved overnight with thrombin. The toxins were purified after cleavage using a Resource® 3 ml reverse phase HPLC column (GE Life Sciences, Sweden) and eluted with a linear 25–30% acetonitrile gradient.

Electrophysiological Assays

cRNAs encoding the α-subunit of DmNav1 (Drosophila melanogaster sodium channel) and the auxiliary TipE subunit were transcribed in vitro using T7 RNA-polymerase and the mMESSAGE mMACHINE™ system (Ambion, USA) and injected into Xenopus laevis oocytes as described (Shichor et al. 2002). Currents were measured 1–4 days after injection and data were acquired as previously described (Moran et al. 2006). Currents were elicited by depolarization to −10 mV from a holding potential of −80 mV in the presence of toxin and were allowed to reach a steady-state level prior to the final measurement.

Results

Transcripts Encoding Type III Neurotoxins in A. viridis

PCR amplification of A. viridis cDNA using a primer corresponding to the first seven amino acids of the conserved signal peptide of the Type I neurotoxin, Av2, and a poly-dT primer yielded sequences encoding Av2 and a number of its homologues (Moran et al. 2008a). Surprisingly, the same PCR also yielded a sequence encoding a putative Type III neurotoxin, which we named Av7 (Fig. 1a). The first 69 nucleotides of the Av7 sequence, encoding the first part of the signal peptide, exhibited high identity with those of Av2 (Fig. 2) and its homologue Av6 (Moran et al. 2008a). However, no sequence similarity of Av2 and Av7 was detected downstream of this signal peptide region. The signal and propart region of Av7 ends in a Lys-Arg tandem, proposed to be a cleavage site of signal peptides in many nematocyte proteins (Anderluh et al. 2000; Kozlov and Grishin 2007; Moran et al. 2008a). The mature region of Av7 is 78% identical and 85% similar at the amino acid level to the Type III toxin Av3 (Fig. 1). Three transcripts encoding Av7 were identified and distinguished by length variations at their 3′-UTRs, whereas no Av3 transcripts were obtained (Supplementary Fig. 3). As Av3 was isolated from an A. viridis population near Naples, Italy (Beress et al. 1975), it is possible that the population in Israel does not produce this neurotoxin. The 5′-UTRs of Av2 and Av7 amplified by 5′ RACE exhibited ~90% sequence identity.

Fig. 2

Similarity of Av2 and Av7 loci. a Sequence alignment of the Av2 and Av7 signal peptides. Identical amino acids are in capital letters and non-identical are in lower case. b Alignment scheme of the homologous regions in the Av2 and Av7 loci. Only the homologous regions are drawn in scale. The size of the non-homologous sequences is in brackets. The site of the L2 retrotransposon insertion that appears in Av2 but not in Av7 is indicated by an arrow. Detailed sequence alignment is provided in Supplementary Fig. 5. The transcription start site is indicated by TSS

The Genes Encoding Type III Neurotoxins in A. viridis

Primers corresponding to the 5′- and 3′-UTRs as well as to the regions encoding the signal peptide and mature Av7 (Supplementary Table 1) were used to amplify neurotoxin genes from A. viridis gDNA. Three ~2.5-kb long sequences encoding Av7 and containing five exons separated by four introns were obtained (Fig. 3; Supplementary Fig. 2). The first exon corresponds to part of the 5′-UTR, the second to the remaining 5′-UTR and the beginning of the signal peptide, the third and the fourth correspond to the remainder of the signal peptide, and finally the fifth exon perfectly matches the region encoding the mature toxin. This gene structure is more complex than that of the Type I neurotoxin Av2 and its homologue, Av6 (Fig. 3a; Moran et al. 2008a). The Av7 and Av2 genes demonstrate a “patchy” homology at their 5′ end: the exons encoding the 5′-UTR, the first intron, and the exon encoding the N-terminal part of the signal peptide share highly similar stretches of 50–200 bp separated by unrelated sequences (Fig. 2b; Supplementary Fig. 5). Unexpectedly, BlastX and RepeatMasker analyses revealed that the unrelated intronic sequence in Av2 separating the first two stretches (Fig. 2b) exhibited noticeable homology to Line/L2 retrotransposons. Traces of this retroelement also appear in the Av6 gene, but in a highly degenerated form, thus preventing direct identification by RepeatMasker. In addition to the Av7 genomic sequences, another putative Type III neurotoxin sequence, Av10, was amplified. The region encoding the mature Av10 neurotoxin differs in three nucleotides from Av7, resulting in two amino acid substitutions (Fig. 1a). Other than these differences the Av10 sequence is quite similar to the Av7 sequences, with minor differences at the non-coding regions (Supplementary Fig. 2).

Fig. 3

Av2 and Av7 genes and their processed pseudogenes. a A graphic scheme describing the structure of Av2 and Av7 genes and their processed pseudogenes. Introns are illustrated as thin lines and exons as filled boxes. Striped boxes represent exons corresponding to the 5′-UTR; gray boxes represent exons encoding the signal peptide; black boxes represent exons encoding the mature toxin. b PCR amplification products of Av2 and Av7 specific primers. The large and small bands in the two lanes of the gDNA-based amplifications are the Av2 and Av7 genes and their processed pseudogenes, respectively.

Analysis of the alignment of the Av7 and Av10 sequences using the RDP2 software (Martin et al. 2005) suggested several recombination events (data not shown). All models included in RDP2 supported these events (P < 0.01) except SiScan. Recombinant sequences can be generated either by gene conversion or reciprocal exchange (Pâques and Haber 1999). The GENECONV model, which was specifically developed to identify gene conversion (Sawyer 1989), detected seven such putative events in the Av7 and Av10 sequences. Notably, gene conversion may facilitate concerted evolution as it homogenizes sequences (Nei and Rooney 2005).

Activity of Av7 and Av10

To evaluate whether Av7 and Av10 are functional toxins, the sequences encoding their mature toxin regions were amplified by PCR, cloned into the expression vector pET-32b, and produced in recombinant form in a similar fashion to the expression of Av3 (Moran et al. 2007). The activity of the recombinant Av7 and Av10 was examined on the Drosophila sodium channel DmNav1 expressed in Xenopus oocytes. Both toxins completely inhibited channel inactivation at 1 and 5 μM concentrations, respectively (Fig. 1b). This effect resembled the activity of the highly potent Av2 and Av3 (Moran et al. 2006, 2007) and suggested that Av7 and Av10 are components of the toxic arsenal of A. viridis.

Identification of Neurotoxin Processed Pseudogenes in A. viridis

In addition to the multiple Av7 and the single Av10 sequences that were amplified using primers specific for Av7, two smaller fragments of ~300 bp were obtained as well (Fig. 3). Sequence analysis revealed that these fragments are intronless copies of the Av7 gene. While one fragment perfectly corresponds to an Av7 transcript, the other contains an insertion of a single nucleotide at the beginning of the region encoding the mature toxin causing a frameshift. This single nucleotide polymorphism was detected in several PCRs using as template different DNA preparations from the same A. viridis specimen, and was also revealed in independent genome walking reactions. The simplest explanation for the absence of introns is the retrotransposition of a spliced Av7 transcript by an L1 retrotransposon or another unknown retroelement. Such genes, called ‘processed pseudogenes,’ are common in mammalian genomes and to a lesser degree in invertebrate genomes (D’errico et al. 2004; Vinckebosch et al. 2006; Sakai et al. 2007). It is commonly accepted that processed pseudogenes should arise in gametes, gonads, or at an early embryonic stage to be inherited (Ding et al. 2006). Therefore, we produced cDNA from sperm and oocyte-rich ovaries and used the primers specific for Av7 in PCR. While no product was obtained with the sperm cDNA (data not shown), a sequence encoding Av7 was obtained using the ovaries cDNA as template, which confirmed the transcription of this neurotoxin in gonads. Unexpectedly, the sequence encoding Av2 was also amplified from the ovaries cDNA (Fig. 3b). This result suggested that Av2 processed pseudogenes could be present in the A. viridis genome as was indeed found using PCR with short polymerization time (12 s) (Fig. 3).

Mapping the Flanking Regions of Neurotoxin Genes

Specific primers designed for genome walking were used for the amplification of the flanking regions of Av2 and Av7 genes, and of their processed pseudogenes. The Av7 genes and their processed derivatives exhibited high sequence similarity much further downstream of the polyA stretch originally found at the cDNA level (~100 bp of the stop codon; Supplementary Fig. 4). Such conservation suggests that the Av7 processed pseudogenes have been derived from an Av7 transcript with a much longer 3′-UTR. Similar to a number of processed pseudogenes from various organisms, the Av7 processed pseudogenes have no polyA tail (Vinckebosch et al. 2006; Sakai et al. 2007; Betran et al. 2002). Experiments with Ty1 retrotransposons in yeast demonstrated that in some retrotransposition incidents a polyA tail is missing (Schacherer et al. 2004).

Analysis of the regions upstream of the Av2 and Av7 genes revealed high similarity (>90%) over the 150 bp preceding their transcription start sites (Fig. 2b, Supplementary Fig. 5). Whereas no further similarity was detected at the 5′ flanking regions of the Av2 and Av7 genes, two distinct highly homologous sequences (280 bp long with 97% identity) in inverted orientations were found further upstream of the transcription initiation site of two different Av2 genes. No similarity was detected further upstream of these inverted sequences.

Discussion

Fusion of Types I and III Neurotoxin Genes in A. viridis

Gene fusion is regarded as a mechanism that enables acquisition of a new function while retaining at least some of the ancestral role of the fused genes (Long 2000). Nevertheless, in the case of the unrelated Types I and III neurotoxin genes of A. viridis one gene family had likely recruited via gene fusion the 5′-UTR and the signal peptide encoding sequence of the genes of the other family. It is clear that the fusion occurred before a retrotransposon invaded the intron in the 5′-UTR of the A. viridis Type I neurotoxin genes as it is present in Av2 and Av6 genes but not in the Av7 gene. Recruitment of the 5′-UTR and signal peptide-coding region of another gene may be advantageous in improving transcript stability and intracellular trafficking to the nematocyst, respectively. Commonality of key amino acid residues in signal peptides of a variety of cnidarian toxins was previously demonstrated (Anderluh et al. 2000), but the transmission of nucleotide sequences between unrelated toxin genes has not been reported. Kozlov and Grishin (2007) suggest that most propart and signal peptides of animal toxins are cleaved at several sites during maturation, and that each site is targeted by a different protease. Moreover, the signal peptide is responsible for the recognition of the precursor protein by one or more of the various units of the endoplasmic reticulum (ER) translocation machinery (Sakaguchi 1997). Therefore, Av2 and Av7 signal peptides may be recognized by the same components of the ER translocation machinery and share most of the proteases that participate in this complex cleaving process.

Genomic structural variations (SVs) caused by translocation, inversion, deletion, or duplication may lead to gene fusion (Korbel et al. 2007). Until recently, SVs were considered to be events involving much longer sequences than the <1 kb fragment shared between Av2 and Av7. Nevertheless, it recently became evident that many human SVs span genomic fragments as small as 2 kb, and that smaller SVs were not identified due to technical limitations. In a study encompassing the genomes of two humans, ~25% of the detected SVs were insertions, some involving coding sequences (Korbel et al. 2007). The most common mechanism that forms SVs is non-homologous end joining (NHEJ). The hallmark of this mechanism is the absence of long repeats that are typical of homology-based mechanisms. Yet, in many SVs created by NHEJ, micro-homologies (>5 bp) are found at the breakpoint junctions of the indels (Korbel et al. 2007). Indeed, close inspection of the breakpoint junctions flanking the homologous region shared between the Av2 and Av7 genes revealed a 4-bp repeat (TTGA) at both ends of this region in the Av2 gene. Although the sequence characteristics of NHEJ are somewhat equivocal, this finding suggests that the observed homology between the Av2 and Av7 genes was generated via NHEJ. Another mechanism enabling gene fusion is retrotransposition (Vinckebosch et al. 2006). However, as part of the shared sequence between Av2 and Av7 is intronic and the homology extends 150 bp upstream of the 5′-UTRs, it is unlikely that a retroelement-based mechanism has retrotransposed this sequence.
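The breakpoint inspection described above can be expressed as a simple string comparison. The Python sketch below assumes that “a repeat at both ends of the region” means that the region begins and ends with the same short motif; this is one possible reading and not necessarily the procedure used in this study. The function name, the coordinates, and the toy sequence are hypothetical, and only the TTGA motif is taken from the text.

```python
def shared_end_repeat(seq, start, end, max_k=10):
    """Longest motif (up to max_k bases) found at both ends of seq[start:end].

    Coordinates are 0-based with an exclusive end. The motif length is capped
    at half the region length so that the two copies cannot overlap.
    Returns an empty string if no motif is shared.
    """
    region = seq[start:end]
    limit = min(max_k, len(region) // 2)
    for k in range(limit, 0, -1):        # test the longest candidate first
        if region[:k] == region[-k:]:
            return region[:k]
    return ""


# Toy sequence for illustration only: a region bounded by TTGA on both sides,
# embedded in unrelated flanks.
TOY_SEQ = "GGC" + "TTGA" + "ACGTACCGT" + "TTGA" + "CCA"

if __name__ == "__main__":
    print(shared_end_repeat(TOY_SEQ, 3, 20))
```

Very short motifs are expected to recur by chance, so a match of this kind is suggestive rather than conclusive, in line with the caution expressed above about the sequence characteristics of NHEJ.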

Multiple lines of evidence imply that a Type III neurotoxin gene was the donor of the sequence common to the two types of toxin genes. The finding of the transcript encoding Hk2 revealed a close homologue of Av2 from Anthopleura sp. (91% amino acid identity; Fig. 1a) that lacks a signal peptide. This may reflect either a secondary loss or the status of the ancestral gene (Liu et al. 2003). In a very recent study (Richier et al. 2008), sequencing of various Anthopleura elegantissima ESTs unexpectedly revealed the transcript encoding Anthopleurin A (ApA), a well-studied Type I neurotoxin previously known from Anthopleura xanthogrammica (reviewed in Honma and Shiomi 2006). Although the mature Av2 and ApA are highly similar (85% identity; Fig. 1a), their signal peptides differ (Fig. 4). The signal peptide of ApA is very similar to those of other Type I toxins, whereas that of Av2 is closely related to the signal peptide of Av7 (Fig. 4). Based on this, we propose the following evolutionary scenario: the ancestor gene of the close homologues, Av2, ApA, and Hk2, contained a signal peptide similar to that of ApA and other Type I toxins. At a later stage, the ancestor gene of Av2 and Hk2 lost the signal peptide sequence and then the ancestor of the Av2 gene acquired the signal peptide of Av7 in a lineage-specific manner. The fact that Av7 has a more complex gene structure than that of Av2 provides further support to the suggestion that a Type III neurotoxin gene might have been the donor of the 5′-UTR and the region encoding the signal peptide of Type I neurotoxins in A. viridis.

Fig. 4

Phylogenetic trees of sea anemone mature neurotoxins and of signal peptides. The consensus neighbor-joining trees were constructed using MEGA4 (Tamura et al. 2007). Complete-deletion option and the p-distance model were used. Bootstrap values are based on 1,000 replications, and only those >50% are shown. The scale bar applies for both trees. Accession numbers are Av2 (ABW97331), Av6 (ABW97349), Ae1 (Q9NJQ2), Ae4 (ABW97362), Sg2 (Q76CA3), Am3 (P69928), ApA (P01530 and FG392551), Hk2 (P0C5F4), Av7 (ACL12302), Av10 (ACL12305). According to its cDNA sequence Hk2 lacks a signal peptide (Liu et al. 2003)

It was recently shown that waprins and kunitz-type serine protease inhibitors, two distinct classes of venom proteins from Australian snakes, share highly similar signal peptides (St. Pierre et al. 2008). This finding was explained as the result of extreme accelerated evolution of the sequences encoding the mature toxin regions that left no clue as to their common ancestry. As the Av2 and Av7 genes share high similarity at their 5′-UTRs, introns, and untranscribed sequences (Fig. 2b and Supplementary Fig. 5), the gene fusion explanation is much more plausible for the case of the A. viridis neurotoxins. It is noteworthy that in a previous study performed on another snake species, a chimeric sequence encoding fused domains of mature waprin and kunitz-type inhibitor was explained as a gene fusion event (Pahari et al. 2007).

Neurotoxin Processed Pseudogenes and Neurotoxin Evolution

The finding of Av2 and Av7 processed pseudogenes in the genome of A. viridis was puzzling as it implies that their parental genes were transcribed either in gametes, gonads, or an early embryonic stage (Ding et al. 2006). The finding of Av2 and Av7 transcripts in ovaries of A. viridis supports this assumption (Fig. 3b). We have also reported the transcription of Nv1, a Type I toxin of N. vectensis, in embryos (Moran et al. 2008b). These findings together with a report on the transcription of a pore-forming toxin gene in the embryo of Hydra (Genikhovich et al. 2006) raise the question as to the putative role of these toxin transcripts in germ-line cells and early life stages of cnidarians. The fact that nematocytes and germ-line cells in cnidarians share a common progenitor (interstitial stem cell; Bosch and David 1987; Miljkovic-Licina et al. 2004) may explain the presence of neurotoxin transcripts in ovaries. Yet, it should be noted that interstitial cells were not found in several cnidarians (Yuan et al. 2008) and therefore the developmental pathway of nematocytes in some species is not necessarily shared with germ-line cells.

The processed pseudogenes of Av2 and Av7 have probably arisen quite recently as they are identical to the corresponding exons of their ancestor genes. While one of the Av7 processed pseudogenes has a single nucleotide insertion that causes a critical frameshift, the others are perfectly intact (data not shown). Due to this remarkable sequence conservation we cannot determine whether these genes are transcriptionally active (retrogenes) as their transcripts would be identical to those of the ancestral genes. In recent years, the emergence of retrogenes is recognized as a path for expanding gene families and acquiring novel gene functions in addition to classic gene duplications (Vinckebosch et al. 2006; Ding et al. 2006).

The need for fast production of large amounts of Type I neurotoxins was suggested as one of the reasons for maintaining, via concerted evolution, a collection of multiple gene copies encoding a single neurotoxin such as Av2 (Moran et al. 2008a), and the same reasoning may apply for the maintenance of several nearly identical Av7 genes, which may also have evolved in concert. The emergence of neurotoxin processed pseudogenes may enable the ‘escape’ of neurotoxin encoding sequences from the control of concerted evolution, thus driving further the divergence of these toxins. This possibility is highly hypothetical as diverged retrogenes have not been found. Moreover, sequencing of Type III neurotoxin genes from additional species will be essential in order to estimate more accurately whether the Av7 and Av10 genes evolved in concert rather than by multiple recent duplications. It cannot be ruled out that the various Av7 and Av10 sequences are the alleles of at least two loci since A. viridis is most probably a diploid like other anemones (Putnam et al. 2007). This puzzle may be solved when the genome sequence of A. viridis is assayed by high throughput methods.

In summary, although Types I and III neurotoxins are structurally unrelated, their genes share a number of common characteristics. Both gene families possess multiple loci encoding a single toxin and both gave rise to processed pseudogenes in the genome of A. viridis. The two gene families share a common sequence that encompasses the 5′-UTR, the first intron, and an exon that encodes part of the signal peptide, suggesting a common intracellular fate. The common genetic features of these two gene families may reflect similar evolutionary patterns and selective pressures, as is also indicated by the finding that both gene families have likely evolved under concerted evolution.