Introduction

The Mormyroidea and Their Electric Sense

Within the basal teleostean superorder Osteoglossomorpha, the Mormyroidea or ‘African weakly electric fish’ account for the vast majority of species (~90 %, i.e., about 200 of 220 species). The members of the superfamily Mormyroidea form the single largest group of electric fish (Alves-Gomes and Hopkins 1997). They comprise two families, i.e., the speciose Mormyridae and the monospecific Gymnarchidae (only Gymnarchus niloticus) (Sullivan et al. 2000).

The weak electricity in Mormyroidea is generated by an electric organ (EO) composed of specialized cells, so-called electrocytes (Bennett and Grundfest 1961; Bennett 1971). This EO arises from the deep lateral muscle—however, the ability to contract is lost in electrocytes (Szabo 1960). The electrocytes are electrically excitable cells and fire action potentials in response to a depolarizing impulse (Bennett 1971). Each electrocyte of an EO receives innervation from spinal electromotor neurons, and the joint excitation of all electrocytes leads to summation of their action potentials and the generation of the so-called electric organ discharge (EOD) (Carlson 2002 and references therein).

Within Mormyroidea, two major types of EODs have evolved: pulse-type EODs as produced by all species of the family Mormyridae and a wave-type EOD as produced by their sister taxon G. niloticus (Hopkins 1980; Caputi et al. 2005; Von der Emde 2013). Due to their nocturnal lifestyle and the often considerably limited visual range in African stream waters, Mormyroidea strongly rely on their electric sense (Moller 1995; Kramer 1996). They use their self-generated weak electric fields for electrolocation (Lissmann and Machin 1958; Heiligenberg 1975; Von der Emde and Ringer 1992; Moller 1995; Von der Emde 1999, 2006), foraging (Von der Emde and Bleckmann 1998; Von der Emde and Schwarz 2002), and social communication (Moller 1970; Hopkins and Bass 1981; Kramer 1996; Carlson and Gallant 2013). The pulse-type EOD of Mormyridae can be considered a key innovation, as these fish are able to recognize subtle differences in EOD waveform and pattern (Kramer and Kuhn 1994; Crawford et al. 1997) and, in a social context, they use this ability to distinguish between con- and heterospecific individuals. The EOD plays a major role in driving the adaptive radiation in the Mormyridae by enhancing assortative mating (Feulner et al. 2008a, b). In our study, we focused on mormyrid fish with an adult EOD of the pulse type.

The EO and Voltage-Gated Sodium Channels

Being mainly responsible for the upstroke (i.e., the rising phase) of an action potential, voltage-gated sodium channels (Nav1) are of crucial importance for excitable cells such as electrocytes. The alpha subunit of these membrane proteins consists of approximately 1700 to 2000 amino acids and forms the major pore of the channel. An alpha subunit is made up of 24 alpha-helical transmembrane segments, which are organized into four homologous and covalently linked domains (referred to as domain I–IV; Catterall 2000; Hanck and Fozzard 2007). Each domain consists of six transmembrane segments (S1–S6) being connected by large intracellular linkers.

In teleost fish, eight genes encode the alpha subunit of Nav1 channels; together they form the sodium channel gene family (SCN; Novak et al. 2006; Zakon et al. 2009; Widmark et al. 2011). Two of these gene copies, which have arisen from a common ancestor by the fish-specific genome duplication (FSGD; Amores et al. 1998; Wittbrodt et al. 1998; Meyer and Schartl 1999; Taylor et al. 2001, 2003; Meyer and Van de Peer 2005), are of special interest here, since these two paralogs are known to be differentially expressed in electric fish (Zakon et al. 2006; Arnegard et al. 2010): SCN4ab is mainly expressed in muscle tissue, while its paralog SCN4aa is solely expressed in the adult EO, which is derived from muscle tissue. This is an intriguing example of neofunctionalization of one gene copy after gene duplication. The increased nonsynonymous substitution rate (i.e., a dN/dS ratio greater than 1, as estimated in a maximum likelihood framework) in several SCN4aa exons in electric fish (Zakon et al. 2006; Arnegard et al. 2010) indicates strong positive selection and hence supports adaptation of this gene with regard to the active electric sense.

Here, we investigate sequence variation in the entire SCN4aa gene (all 25 exons and most of the introns, encompassing over 20,000 bp) in weakly electric fish of the genus Campylomormyrus. We are specifically interested in evaluating (1) whether an accelerated rate of evolutionary divergence (as already identified generally for electric fish; Arnegard et al. 2010) can be also observed in a more recent radiation, i.e., among the closely related species of the genus Campylomormyrus and (2) whether nonsynonymous substitutions in the SCN4aa gene relate to the strikingly divergent pulse-type EODs among these species. Among our six focus species, C. compressirostris, C. curvirostris, and C. tamandua produce brief EODs (~160 to 500 µs), C. tshokwe an EOD of intermediate length (~4 ms), while C. numenius and C. rhynchophorus exhibit comparatively long EODs (~25 to 30 ms), such that the difference between shortest and longest EOD encompasses almost two orders of magnitude (Feulner et al. 2007, 2008a, Fig. 1).

Fig. 1

(adapted from Feulner et al. 2008a)

Campylomormyrus species for which we analyzed the SCN4aa gene and their adult electric organ discharges (EODs). The number next to the EOD indicates the duration of the displayed discharge record. Note the variability in lengths and shape of the EODs

Since the properties of the electrocyte’s action potentials are reflected in EOD characteristics, differences in EOD duration might be explained by variable inactivation kinetics of the voltage-gated sodium channel encoded by SCN4aa. These kinetics are in turn determined by several distinct inactivation mechanisms (Catterall 2000; Goldin 2003; Ulbricht 2005). The process of fast inactivation is nowadays very well understood and described as a hinged-lid mechanism (West et al. 1992; Kellenberger et al. 1997; Goldin 2003), in which four hydrophobic amino acids [i.e., the isoleucine-phenylalanine-methionine-tryptophan (IFMT) motif] interact with their docking sites located in the linkers connecting transmembrane segment 4 and 5 of domain III and IV, respectively, and the cytoplasmic end of transmembrane segment 6 in domain IV, to block the inner mouth of the channel pore and to prevent further ion flow (Goldin 2003). Hence, substitutions of amino acids involved in this process can change the inactivation kinetics and might cause differences in the length of the EOD by altering the duration of action potentials.

Materials and Methods

Sequencing

The samples used in this study were collected during a field trip to Brazzaville, Republic of the Congo, in 2006. We chose six representatives of the genus Campylomormyrus (i.e., C. compressirostris, C. tamandua, C. curvirostris, C. tshokwe, C. numenius, and C. rhynchophorus; Feulner et al. 2007, 2008a), for which we sequenced the entire SCN4aa gene from genomic DNA. An initial PCR was performed with primers based on the SCN4aa gene of Gnathonemus petersii. To this end, we identified two conserved regions within the gene by inspection of aligned teleost sequences from GenBank corresponding to exons 13 (forward primer: 5′ gac ccc tac tac tac ttc cag g 3′) and 18 (reverse primer: 5′ aag tcc agc cag cac cag gc 3′). Later, with the availability of a transcriptome for C. compressirostris (Lamanna et al. 2014), we completed our SCN4aa sequences by designing additional primers from transcriptome reads (read sequences from the electric organ are available at http://www.ncbi.nlm.nih.gov/sra under the Accession no: SRR786741). The farthest upstream primer is situated in the 5′UTR (forward primer: 5′ gca aaa tcc tgg agt ggt gt 3′; from read with Accession no: SRR786741.3936907) and the farthest downstream primer binds in the 3′UTR shortly downstream of exon 25 (reverse primer: 5′ gtt agt gtt agt ctt gct cca g 3′; from read with Accession no: SRR786741.34049175; species-specific internal primers for primer walking are available upon request). Due to the length of fragments, amplifications were performed with LA Taq™ DNA Polymerase (Takara Bio Inc.) in a total volume of 15 µl containing 1.5 µl 10 × LA PCR™ Buffer II, 1.5 µl MgCl2 (25 mM), 2.4 µl dNTP mix (2.5 mM each), 0.7 µl of each primer (0.5 µM each), 0.14 µl TaKaRa LA Taq™ (5 Units/µl), and 1 µl of genomic DNA solution. Sequencing was performed with the BigDye Terminator v3.1 cycle sequencing kit (Applied Biosystems) on an ABI3130xl automated sequencer (Applied Biosystems).
We obtained the sequences of exons 1 to 25 and part of the 3′ UTR. Sequencing of SCN4aa from genomic DNA additionally enabled us to obtain most of the introns for the six species, which can be expected to be phylogenetically informative (see below).
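As a quick consistency check on the reaction setup described above, the listed component volumes can be summed and compared with the stated 15 µl total. The sketch below assumes that the remaining volume is made up with water, which the text does not state explicitly; the dictionary labels are illustrative only.

```python
# Component volumes (in microlitres) as listed in the text for one 15 ul reaction.
TOTAL_VOLUME_UL = 15.0

components_ul = {
    "10x LA PCR Buffer II": 1.5,
    "MgCl2 (25 mM)": 1.5,
    "dNTP mix (2.5 mM each)": 2.4,
    "forward primer": 0.7,
    "reverse primer": 0.7,
    "TaKaRa LA Taq (5 U/ul)": 0.14,
    "genomic DNA solution": 1.0,
}

# Sum of everything that is explicitly listed.
listed_ul = sum(components_ul.values())

# Assumption (not stated in the text): the remainder is water.
water_ul = TOTAL_VOLUME_UL - listed_ul

print(f"listed components: {listed_ul:.2f} ul")
print(f"remainder (assumed water): {water_ul:.2f} ul")
```

Under this assumption, the listed components account for roughly half of the reaction volume.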

Sequences for SCN4ab were amplified with a degenerate forward primer (sequence: 5′ tca gcm ctg agg acv tty cgt gtg ctv 3′) derived from teleost sequences and a reverse primer (sequence: 5′ gat gtt gca gac gca gtc ctt gta gt 3′) obtained via RACE-PCRs from extracted RNA of C. compressirostris. The amplified region of genomic DNA of the SCN4ab gene had a length of ~4 kb and covered exons 6–14.

Data Sets, Sequence Alignment, and Database Retrieval

For all six species of Campylomormyrus, we obtained the SCN4aa gene comprising the entire coding region (exonic sequences: 5461 bp; intronic sequences: 15,348 bp). For four of these six species (i.e., C. compressirostris, C. curvirostris, C. numenius, and C. rhynchophorus), we additionally obtained 1206 bp of exonic sequences for SCN4ab. For comparative analyses, we downloaded available sequences of SCN4aa and SCN4ab for additional Mormyroidea, other nonelectric Osteoglossomorpha (closely related to the Mormyroidea), a gymnotiform, and other nonelectric Elopocephala (see Table 1 for species names and accession numbers).

Table 1 Species included in the analysis, their taxonomic position, and accession numbers for the paralogous genes SCN4aa and SCN4ab

Since most of the publicly available sequences are only partial, we decided to generate two distinct SCN4aa alignments for subsequent analyses. First, we built one alignment comprising a wide taxon sampling (32 taxa, ~2800 bp; hereafter referred to as “short alignment”; see Table 1 for species names) with 18 Mormyridae representing eight distinct genera, the only wave-discharging mormyroid fish (i.e., G. niloticus) and 13 nonelectric Teleostei (three other Osteoglossomorpha and 10 Elopocephala). Second, we used our six whole gene sequences of Campylomormyrus together with those of nine model teleostean fish, all nonelectric, to create the second alignment (15 taxa, ~5300 bp; hereafter referred to as “long alignment”; see Table 1, those taxa are indicated by an asterisk). All alignments were constructed using the TranslatorX software (available online: http://translatorx.co.uk/; Abascal et al. 2010) with the MAFFT algorithm, and gap positions were refined manually.

In order to perform a comparative analysis of gene evolution among SCN4aa and SCN4ab with focus on Mormyridae, an alignment with both paralogous genes was generated. Here, we included nonelectric fish species as well as a representative of the wave-type discharging Gymnotiformes (i.e., Sternopygus macrurus), for which whole gene sequences were available, and four species of our focus genus Campylomormyrus. Sequences for S. macrurus were downloaded from NCBI GenBank (accession numbers: SCN4aa [AF378144.2], SCN4ab [AF378139.2]). The SCN gene family member SCN12aa of Petromyzon marinus (Ensembl accession number: ENSPMAG00000007100) was included as outgroup for rooting the phylogenetic tree. The alignment was constructed using the TranslatorX software with the MAFFT algorithm and ambiguous parts were removed manually. This alignment had a final length of 1119 nucleotides.

Phylogenetic Analysis

In order to unravel the still poorly resolved phylogenetic relationships among the species of Campylomormyrus, we explored the utility of our SCN4aa data for phylogeny reconstruction. To this end, we analyzed the intron and the exon alignments separately. G. petersii, the sister taxon to Campylomormyrus, was used as outgroup. Note that only exon data were available for G. petersii. Therefore, the intron phylogeny was rooted with C. tamandua, as this species is considered basal within the genus Campylomormyrus (Feulner et al. 2008a; Lamanna et al. 2016). The evolutionary model for both alignments was determined with ModelGenerator v0.85 (Keane et al. 2006). According to the Akaike Information Criterion (AIC), the TrN model (Tamura and Nei 1993) was the best-fitting model for the intron alignment and the HKY model (Hasegawa et al. 1985) for the exon alignment (both models with a proportion of invariable sites). The maximum likelihood (ML) analyses for both data sets were run in RAxML v.7.2.6 (Stamatakis 2006) with 100,000 bootstrap replicates. Here we applied the GTR model (the only nucleotide substitution model implemented in RAxML) with a proportion of invariable sites. Further, a maximum parsimony (MP) analysis was run for the intron data set using PAUP* v.4.0b10 (Swofford 2002) to test whether treating gaps as a fifth character state adds phylogenetic information. The MP analysis was also conducted with 100,000 bootstrap replicates.

For the combined alignment of the paralogs SCN4aa and SCN4ab, ModelGenerator found the general time reversible model (GTR; plus a proportion of invariable sites and a gamma distribution) to be the best-fitting evolutionary model, according to the AIC. Considering the suggested model, a Bayesian analysis was run using MrBayes v.3.2 (Ronquist and Huelsenbeck 2003). Two independent runs were processed in parallel for 3 million generations and a consensus tree was built after discarding the first 25 % of the collected trees as burn-in. Additionally, an ML analysis was run using RAxML v.7.2.6 with the GTRGAMMAI model specified and support was assessed via 100,000 bootstrap replicates.

All alignments are provided in the supplement (supplementary files S1–S3).

Relation Between Variable Sites in SCN4aa and EOD Characteristics

We evaluated the inferred amino acid sequences of the SCN4aa exon alignment from the six species of Campylomormyrus to identify sites with nonsynonymous substitutions. Subsequently, amino acid substitutions were checked for their likely impact on protein function using the SNAP method (Bromberg and Rost 2007) and the PolyPhen-2 prediction tool (Adzhubei et al. 2010), taking the human Swiss-Prot sequence for the SCN4a gene (protein identifier: P35499) as reference. For both methods, available online tools were used. The algorithm behind PolyPhen-2 takes known variability/invariability of a codon into account, as well as structural information and Swiss-Prot annotation for the respective protein. As output, PolyPhen-2 provides scores for amino acid substitutions, ranging from 0 (benign/neutral) to 1 (damaging/altering protein function). SNAP is a neural network-based method that also considers sequence conservation and secondary structure when evaluating a substitution’s impact. The SNAP scores for substitutions are translated into a binary prediction of effect (i.e., neutral or non-neutral). The prediction can then be evaluated by calculation of a reliability index or percentage of expected accuracy.

We looked for shared amino acids among species with a similar EOD length. Further, we mapped all amino acid sites variable among the six Campylomormyrus species onto the Nav1.4a channel structure in order to visualize the affected functional region. The channel was modeled based on the well-annotated human homolog SCN4a (from SwissProt database), using the TOPO2 software as online-tool (http://www.sacs.ucsf.edu/TOPO2/). Finally, we surveyed the SCN4aa gene of Campylomormyrus for substitutions, where mutations in human are known to cause diseases (channelopathies) as well as sites known to be involved in channel activation or fast inactivation. We also evaluated whether our substitutions had been detected in other mormyrid fish.
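The screening for variable sites and for residues shared only among species with a similar EOD length can be sketched as follows. This is a minimal illustration, not the procedure actually used: the aligned fragments are hypothetical toy residues (not SCN4aa data), and the function names are introduced here for illustration only.

```python
# Toy aligned protein fragments (hypothetical residues, NOT the real SCN4aa data).
alignment = {
    "C. compressirostris": "MDLTK",
    "C. tamandua":         "MDLTK",
    "C. curvirostris":     "MDLSK",
    "C. tshokwe":          "MDLTK",
    "C. rhynchophorus":    "MELTK",
    "C. numenius":         "MEMTK",
}
# Species grouped by EOD duration (the two long-EOD species named in the text).
long_eod = {"C. rhynchophorus", "C. numenius"}


def variable_sites(aln):
    """Return 1-based alignment columns that contain more than one residue."""
    columns = zip(*aln.values())
    return [i for i, col in enumerate(columns, start=1) if len(set(col)) > 1]


def group_specific_sites(aln, group):
    """Return 1-based columns where the group shares one residue found in no other species."""
    length = len(next(iter(aln.values())))
    sites = []
    for i in range(length):
        inside = {seq[i] for name, seq in aln.items() if name in group}
        outside = {seq[i] for name, seq in aln.items() if name not in group}
        if len(inside) == 1 and not inside & outside:
            sites.append(i + 1)
    return sites


print(variable_sites(alignment))
print(group_specific_sites(alignment, long_eod))
```

In this toy alignment, three columns vary, but only one of them carries a residue shared exclusively by the two long-EOD species.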

Positive Selection and Lineage-Specific Substitutions

We tested for signs of selection in the African weakly electric fish using both the short alignment (with a wider taxon sampling) and the long alignment (with complete SCN4aa gene sequences; restricted to our focus taxa and teleost model species).

As gene conversion among duplicated genes—a unidirectional transfer of sequence information from one locus to another locus of the gene family—can lead to false positives in selection analyses (Casola and Hahn 2009 and references therein), we checked first for gene conversion among SCN4aa and SCN4ab using GENECONV v1.81 (Sawyer 1989). The test was applied to 13 mormyrid species for which both paralogous genes were available (Table 1). Default settings were applied, except for the option ‘gscale,’ which was set to 3 to allow for internal mismatches in fragments. This option enables the detection of less recent conversion events.

Codeml as implemented in the PAML package version 4.4 (Yang 2007, 2011) was utilized for identification of sites under selection. By calculating the dN/dS ratio (nonsynonymous to synonymous substitution rate, also denoted as ω) per codon site and sorting the values into different categories, according to the chosen model, this test evaluates site-specific selection pressures. A value of ω = 1 indicates neutral evolution, whereas ω > 1 or ω < 1 indicate positive or purifying selection, respectively. Different nested pairs of site models were compared to test for dN/dS heterogeneity (M0 vs. M3) and for positive selection (M1a vs. M2a and M7 vs. M8), using a likelihood ratio test (LRT). The LRT statistic is calculated as twice the difference between the log likelihood values of the alternative model and the corresponding null model (which is nested within it) and is assumed to follow a χ² distribution. Under models M2a and M8, the Bayes empirical Bayes (BEB; Deely and Lindley 1981; Yang et al. 2005) method is utilized to calculate posterior probabilities for sites of a certain codon class with ω > 1. We only refer to those sites with posterior probabilities >95 % as positively selected sites (PSSs).
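The likelihood ratio test described above can be illustrated with a short sketch. The log-likelihood values below are hypothetical and serve only to show the arithmetic; the function names are introduced here for illustration. The sketch assumes 2 degrees of freedom, as commonly used for the M1a vs. M2a and M7 vs. M8 comparisons, and uses the closed-form upper tail of the χ² distribution that holds for even degrees of freedom, so that no external library is needed.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # sf(x; 2k) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (LRT statistic, p-value) for a null model nested in an alternative model."""
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)  # guard against tiny negative values
    return stat, chi2_sf_even_df(stat, df)


# Hypothetical log-likelihoods, for illustration only (not results from this study).
stat, p = likelihood_ratio_test(lnl_null=-5230.4, lnl_alt=-5222.1, df=2)
print(f"2*deltaLnL = {stat:.2f}, p = {p:.5f}")
```

Comparisons with an odd number of degrees of freedom would need a general χ² routine instead of this closed form, for example `scipy.stats.chi2.sf`.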

Since we are interested in selection among pulse-type discharging mormyrid fish, only such species (i.e., Mormyridae) were analyzed in the site tests (18 mormyrid species in the short and our six Campylomormyrus species for the long SCN4aa alignment; Table 1).

To further examine whether the selective regime changed over time for different clades of Teleostei, we also conducted branch-site tests, where the dN/dS ratio is allowed to vary both among sites and among branches (Yang and Nielsen 2002; Yang et al. 2005; Zhang et al. 2005; Yang 2007). These tests were each conducted with the short and the long alignment, and in each case all included Mormyridae were labeled as the foreground branch. For both data sets, two branch-site tests were performed: one stringent test for positive selection (hereafter called positive selection branch-site test; Yang 2011) and another, which cannot distinguish between positive selection and relaxation of selection pressure (hereafter called relaxed constraints branch-site test; Zhang 2004; Zhang et al. 2005). For the positive selection branch-site test, two variants of model A are compared. Model A0 forces all sites to have evolved neutrally (ω2 fixed to 1), and the alternative model A1 additionally allows sites in the foreground branch to have ω > 1. In the relaxed constraints branch-site test, the alternative model A1 is compared to the null model M1a, which allows ω to vary from 0 to 1 for all sites and branches. Only sites with ω > 1 and posterior probability >95 % by BEB were considered as PSSs.

All substitutions inferred to constitute PSSs were examined for their potential impact on protein function by the SNAP and the PolyPhen-2 algorithm.

Functional Divergence of SCN4aa Among Electric and Nonelectric Fish

We identified amino acid sites that differ between nonelectric fish and mormyrid fish with a pulse-type EOD, but are monomorphic (i.e., highly conserved) among all nonelectric fish, both in the short and the long alignment. For these sites, we checked the effect of the substitutions in pulse-type discharging fish using the SNAP algorithm and the PolyPhen-2 prediction method. Sites, which were predicted to alter protein function, were mapped onto the Nav1.4a channel. For this purpose, we modeled a visualization of the membrane protein using the human-annotated SCN4a sequence from the UniProt database (protein identifier: P35499) and the online tool TOPO2. Additionally, all PSSs (probability >95 % by BEB) from the positive selection analysis with Codeml were marked in this channel visualization.
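The site screen described above can be sketched as a simple column-wise comparison of two groups of aligned sequences. The sequences below are hypothetical toy fragments, and the function name is introduced for illustration. As a simplifying assumption, the sketch requires that no foreground sequence carries the background residue, which may be stricter than the criterion actually applied.

```python
def fixed_group_differences(background, foreground):
    """Return 1-based columns that are monomorphic in the background group
    and whose background residue never occurs in the foreground group."""
    sites = []
    for i, bg_col in enumerate(zip(*background), start=1):
        bg_residues = set(bg_col)
        fg_residues = {seq[i - 1] for seq in foreground}
        if len(bg_residues) == 1 and not bg_residues & fg_residues:
            sites.append(i)
    return sites


# Hypothetical aligned fragments (not real SCN4aa sequences).
nonelectric = ["MILFA", "MILFA", "MVLFA"]   # background: nonelectric fish
mormyrid = ["MICFA", "MVCFS", "MICFA"]      # foreground: pulse-type mormyrids

print(fixed_group_differences(nonelectric, mormyrid))
```

Only the third column qualifies here: it is invariant among the background sequences and carries a different residue in every foreground sequence.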

Results

Phylogenetic Analyses

The maximum parsimony (MP) and maximum likelihood (ML) analysis of the SCN4aa intron data yielded the same highly supported tree topology for the six Campylomormyrus species (Fig. 2a). The topology of the ML tree based on the SCN4aa exon data was mostly identical to the one generated from the intron data, although the topology is less well supported (Fig. 2b). In both the intron and exon topology, the two taxa with long discharge (i.e., C. rhynchophorus and C. numenius) are closely related, suggesting the elongated EOD to be an apomorphy of this clade.

Fig. 2

Phylogenetic trees proposing the relationship among the analyzed six species of the genus Campylomormyrus. a Best supported tree of the maximum likelihood (ML) analysis based on the SCN4aa intron data. Statistical support values from the ML and maximum parsimony analyses (which yielded an identical tree topology) are presented at the nodes. b Best supported ML tree based on the SCN4aa exon data. Bootstrap support values from 100,000 replicates are given at the nodes. Gnathonemus petersii—the sister taxon to Campylomormyrus—was used to root this tree

Evolutionary Rates in SCN4aa Versus SCN4ab

Among our Campylomormyrus species, the percentage of fixed replacement differences at nucleotide positions was more than 4 times higher in SCN4aa (1.14 %; 62 out of 5461 bp) than in SCN4ab (0.25 %; 3 out of 1206 bp). Further, both the ML and the Bayesian analyses of the combined SCN4aa/SCN4ab data set confirmed previous findings (Zakon et al. 2006; Arnegard et al. 2010) of an increased evolutionary rate in SCN4aa compared to SCN4ab in mormyrid weakly electric fish, as evidenced by longer branch lengths leading to the genus Campylomormyrus (Fig. 3). The same can be observed for S. macrurus, a representative of the wave-discharging weakly electric Gymnotiformes, an evolutionary lineage which independently evolved an active electric sense and whose members also reveal a differential expression pattern among SCN4aa and SCN4ab (i.e., SCN4aa is not expressed in ordinary skeletal muscle; Zakon et al. 2006; Arnegard et al. 2010).
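The percentages and the fold difference quoted above follow directly from the reported counts, as the short calculation below shows. Only the four counts are taken from the text; the variable names are illustrative.

```python
# Counts reported in the text: fixed replacement differences / exonic length (bp).
scn4aa_diff, scn4aa_len = 62, 5461
scn4ab_diff, scn4ab_len = 3, 1206

pct_aa = 100 * scn4aa_diff / scn4aa_len   # percentage for SCN4aa
pct_ab = 100 * scn4ab_diff / scn4ab_len   # percentage for SCN4ab
fold = pct_aa / pct_ab                    # fold difference between the paralogs

print(f"SCN4aa: {pct_aa:.2f} %  SCN4ab: {pct_ab:.2f} %  fold: {fold:.1f}")
```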

Fig. 3

Bayesian phylogeny based on nucleotide sequences of the two paralogous genes SCN4aa and SCN4ab. Numbers at nodes represent bootstrap support values and posterior probabilities. For lineages leading to electrogenic species (i.e., the mormyrid Campylomormyrus and the unrelated gymnotiform Sternopygus) branch length values are provided (red for SCN4aa and blue for SCN4ab)

Relation Between SCN4aa-Fixed Differences and EOD in Campylomormyrus

To check for a potential relationship between fixed differences in the SCN4aa gene and the characteristics of the EOD in Campylomormyrus, we compared the inferred amino acid sequences of the SCN4aa gene among six species of this genus (i.e., C. compressirostris, C. tamandua, C. curvirostris, C. tshokwe, C. rhynchophorus, and C. numenius). Thirty-one amino acid sites in total differed among these six species (Fig. 4 and Supplementary Table S1 for details). In the analyzed specimen of C. curvirostris, we further observed three heterozygous sites, each resulting in different amino acids (verified by sequences from independent PCR runs). The 31 sites were mapped onto a model of the sodium channel’s protein structure, and the predicted effect of the substitutions in Campylomormyrus compared to the human homolog SCN4a is indicated (Fig. 5a; see Supplementary Table S1 for detailed output of prediction methods). In C. rhynchophorus and C. numenius, the two species with the longest EOD, the SCN4aa sequence differs at two amino acid sites from that of all other species (Fig. 4; alignment positions 4 and 10; Table S1). The first, D211E, is located in the linker connecting transmembrane segments 3 and 4 (S3–S4) of domain I (DI) and is rated to possibly affect protein function by PolyPhen-2. The second is located in the intracellular DI–DII linker, but it was not possible to estimate the effect of this substitution to glutamate (E), since no homologous amino acid exists in human, mouse, or rat. C. rhynchophorus has in addition one substitution (R1001W), located in the DII–DIII linker, which is rated to probably affect protein function. Note that at this site, arginine (R) is a conserved amino acid, which is also shared by H. sapiens. These amino acid substitutions have, however, also occurred in other mormyrid lineages (Table S1). Private substitutions can be found in the inferred amino acid sequence of C. numenius at three sites (Fig. 4; alignment positions 5, 15, and 17; Table S1). L212M is evaluated as impacting protein function by PolyPhen-2; leucine (L) is generally conserved at this site and shared also by H. sapiens. T1019K is predicted to be benign, while L1117T is rated as impacting protein function. In essence, the two species with the longest EODs exhibit several amino acid substitutions predicted to substantially alter protein function.

Fig. 4

Sites with amino acid sequence variation in SCN4aa among the six analyzed species of Campylomormyrus. Note the heterozygous condition at 3 sites in C. curvirostris (C. cur). Sites are numbered from 1 to 31 and the exact position of these sites is given below the alignment relative to the human SCN4a gene (H. sap.). The species-specific adult electric organ discharges (EODs) are also shown. C. com. = C. compressirostris, C. tam. = C. tamandua, C. tsh. = C. tshokwe, C. rhy. = C. rhynchophorus, C. num. = C. numenius

Fig. 5

Visualization of the Nav1.4a channel based on annotation information for the human SCN4a gene. a SCN4aa sites variable in their amino acids among the six species of Campylomormyrus are indicated (green and red stars). Position numbers refer to the human SCN4a gene. Red stars indicate sites at which substitutions are predicted to alter protein function. Green stars indicate substitutions predicted to be neutral. b Substitutions specific to pulse-type discharging fish (i.e., Mormyridae) that were predicted to alter protein function (red diamonds). For sites inferred to be under positive selection (PSS, labeled with an asterisk), position numbers are provided relative to the human SCN4a gene. Orange circles indicate the four amino acids of the IFMT motif within the inactivation loop

SCN4aa Frameshift Mutation in C. tshokwe

The SCN4aa coding sequence for the voltage-gated sodium channel Nav1.4a in C. tshokwe exhibits a single base insertion within exon 25. This mutation results in a premature stop codon at the sixth codon after the insertion and leads to a truncation of the distal C-terminus of 37 amino acids.
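How a single-base insertion can shift the reading frame and introduce a premature stop codon is illustrated by the toy example below. The sequence is hypothetical and much shorter than exon 25, and the stop codon appears at a different distance from the insertion than in C. tshokwe; the sketch only demonstrates the mechanism, using the standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence in frame 1 up to the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[pos:pos + 3]]
        if aa == "*":          # stop codon: terminate translation
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy coding sequence (not the C. tshokwe exon 25 sequence).
wild_type = "ATGGCCGATCTTGACTGGAAATAA"
# Insert a single base after the second codon, shifting the reading frame.
mutant = wild_type[:6] + "A" + wild_type[6:]

print(translate(wild_type))
print(translate(mutant))
```

In the toy sequence, the insertion changes the downstream codons and creates an in-frame TGA stop, so the mutant product is shorter than the wild-type product.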

Inference of Gene Conversion and Selection

GENECONV did not reveal any evidence for recombination or gene conversion among the analyzed mormyrid SCN4aa and SCN4ab sequences.

Within Mormyridae (data set with 18 taxa, representing eight different genera), 12 PSSs with a posterior probability >95 % were identified by the BEB method implemented in Codeml (Table 2; see Supplementary Table S2 for details). Five of these 12 PSSs are in transmembrane segments and seven in linker regions; of these seven, five are located in the linker between DII and DIII. None of the substitutions at these PSSs in mormyrids matches any reported mutation in the human SwissProt sequence P35499 for SCN4a. Two substitutions were predicted to be non-neutral by SNAP and PolyPhen-2: D949F in the DII–DIII linker and A1354P in S1 of DIV. The latter substitution is only six amino acids apart from a site where the substitution M1360V in humans is reported to cause hyperkalemic periodic paralysis (HyperPP; Jurkat-Rott et al. 2010).

Table 2 Inference of positively selected sites (PSSs) using site and branch-site models

We detected signs of lineage-specific selection (i.e., sites which are under positive selection in Mormyridae compared to the nonelectric background taxa). Model A1 revealed six sites in the foreground branch (Mormyridae), which are under very strong positive selection (ω = 5.22611). They do not coincide with the sites found to be PSS within Mormyridae by the site test (Table 2; see Supplementary Table S3 for details). All but one substitution are in linker regions and two of them, I1310C and T1374S, are ranked to potentially alter the protein function by both prediction tools (PolyPhen-2 and SNAP). The I1310C substitution is located in the DIII–DIV linker, the so-called inactivation loop, and affects the first amino acid of the highly conserved IFMT motif (this substitution was already described by Zakon et al. 2006). In particular, the first three strongly hydrophobic amino acids play a crucial role in fast inactivation. T1374S is located in the S1–S2 linker of DIV and is separated by three amino acids from a site that is involved in causing a sodium channelopathy in humans (i.e., paramyotonia congenita; Rüdel et al. 2000; Jurkat-Rott et al. 2010).

When analyzing the SCN4aa gene variation among the species of the genus Campylomormyrus separately, none of the three LRTs yielded significance for any site. This outcome could result from reduced power of the LRT due to low sample sizes (Anisimova et al. 2001). We performed two branch-site tests of selection with the long alignment using nonelectric Teleostei as background taxa and the six Campylomormyrus as foreground branch. The test statistic of the positive selection branch-site test was not significant, but the comparison of the two nested branch-site models within the relaxed constraints branch-site test revealed a significantly better fit of model A1, which allows for sites under positive selection (Table 2). In this case, ~13.6 % of the variable sites had a dN/dS ratio indicative of positive selection (ω > 1) along the foreground branch and 24 sites with posterior probabilities >95 % were found by BEB (Supplementary Table S4). The majority of them are in linker regions. Six of them were predicted to alter protein function by the PolyPhen-2 and SNAP prediction methods. Among them are the substitutions I1310C and T1374E, which were also identified with the branch-site tests along the lineage ancestral to Mormyridae (Supplementary Table S3 and S4).

Fixed Amino Acid Substitutions in SCN4aa Among Mormyrid Versus Nonelectric Fish

In the SCN4aa exons, we identified 112 sites conserved in all nonelectric fish, but different in mormyrid electric fish (see Supplementary Table S5). Eleven occur in the N-terminus, 36 in transmembrane segments, 54 in linkers, and 11 in the C-terminus. 24 of these 112 sites coincide with inferred positively selected sites (PSSs in Table 2; see Supplementary Table S5 for summary). Mormyrid-specific substitutions at almost half of the sites (56) are predicted to have an effect on protein function by at least one of the utilized prediction methods, i.e., PolyPhen-2 and/or SNAP (red diamonds in Fig. 5b indicate positions of these sites in the modeled channel structure).

Among these 112 amino acid substitutions specific to mormyrid electric fish, one exactly matches a site known to be involved in a sodium channelopathy of skeletal muscle in humans, and a further nine sites are in close proximity to sites known to cause channelopathies: The substitution L1433V in S3 of DIV, occurring in most mormyrids (except for Petrocephalus soudanensis and Pollimyrus adspersus), is located at a site where a change from leucine (L) to arginine (R) in humans causes paramyotonia congenita (PMC; Hayward et al. 1996; Jurkat-Rott et al. 2010). An analysis by PolyPhen-2 ranked the substitution L1433V as probably altering protein function. G268A, a substitution also predicted to alter protein function by PolyPhen-2, is separated by one amino acid from a site involved in causing PMC (Jurkat-Rott et al. 2010). A131P, L142I, and S1159A are each close to sites that are reported to cause potassium-aggravated myotonia (PAM; Jurkat-Rott et al. 2010). In DII, there are three amino acid substitutions in close proximity to each other, each near known deleterious Nav1.4 mutations that cause hyperkalemic periodic paralysis in humans (HyperPP; Jurkat-Rott et al. 2010). The substitution I1157V in the S4-S5 linker of DIII is also next to a mutation leading to HyperPP. Of the 36 pulse-type fish-specific substitutions located in transmembrane segments, three occur in the voltage-sensing parts of the channel (S4 of DI, DII, DIII, and DIV). All three are situated in S4 of DI. T220A is separated by one amino acid from a site that, when mutated, causes hypokalemic periodic paralysis in humans (HypoPP; Jurkat-Rott et al. 2010).

The two substitutions I1157V and S1159A in the S4-S5 linker of DIII are close to one docking site (N1152) for the inactivation particle, which controls the process of fast channel inactivation. These substitutions were predicted to alter protein function.
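Proximity statements of this kind can be checked programmatically from the substitution labels. The sketch below parses labels such as I1157V and reports which ones lie within a chosen window of the docking site N1152. The window of 10 residues is an arbitrary assumption made for illustration (the text does not define "close" numerically), and the function names are hypothetical.

```python
import re

# Label format: reference residue, position, replacement residue (e.g. "I1157V")
SUBSTITUTION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")

def parse_substitution(label):
    """Split a substitution label into (reference, position, replacement)."""
    match = SUBSTITUTION_RE.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt

def is_near(label, site_position, window):
    """True if the substituted position lies within `window` residues of the site."""
    _, pos, _ = parse_substitution(label)
    return abs(pos - site_position) <= window

DOCKING_SITE = 1152  # N1152, as named in the text
WINDOW = 10          # assumption: illustrative cut-off, not defined in the text

substitutions = ["I1157V", "S1159A", "L1433V", "T220A"]
near_docking_site = [s for s in substitutions if is_near(s, DOCKING_SITE, WINDOW)]
print(near_docking_site)
```

With this window, only the two substitutions in the S4-S5 linker of DIII are flagged; the same helper could be applied to any list of known disease-associated positions.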

Discussion

The present study comprises the first complete assessment of the entire SCN4aa gene in electric fish, i.e., in six closely related Campylomormyrus species with pulse-type EODs of different durations. SCN4aa encodes the voltage-gated sodium channel Nav1.4a. Since the action potentials generated by the voltage-gated sodium channels in the EO underlie EOD production, these channels are probably directly involved in the generation of the diverse EODs within Mormyridae (Bennett 1971; Zakon et al. 2008, 2009). Our study is a first attempt to relate SCN4aa sequence variability to a particular EOD trait, i.e., EOD duration.

Further, we used exon and intron data of SCN4aa to construct a robust phylogeny for Campylomormyrus, which is a prerequisite for any inference of EOD evolution in this genus. Our phylogeny is mostly consistent with a recent assessment of Campylomormyrus phylogenetic relationships based on sequence data from several loci (Lamanna et al. 2016), except for the position of C. curvirostris.

Character Polarity in the EOD Evolution in Campylomormyrus

Our phylogenetic analyses of the genus Campylomormyrus, based on over 20,000 bp of sequence information from the sodium channel gene SCN4aa, comprise by far the largest phylogenetic data set available for this genus and yield robust phylogenetic trees with high bootstrap support. Our inferred phylogeny is generally in agreement with earlier studies (Feulner et al. 2007), but provides a new hypothesis for some hitherto uncertain relationships, such as that between C. compressirostris and C. curvirostris, which both produce short EODs, but are phenotypically dissimilar. While a recent phylogeny based on less sequence data, but originating from several loci (Lamanna et al. 2016), suggests a sister group relationship between C. compressirostris and C. curvirostris, according to our analyses, C. curvirostris is clearly separated from C. compressirostris and is inferred to share a common ancestor with C. numenius and C. rhynchophorus, the two species with very long EODs. Our data also suggest the placement of C. tshokwe as the sister taxon to C. compressirostris. Our phylogeny and the one provided by Lamanna et al. (2016), albeit not identical in all aspects, are both best explained by a scenario in which the short EODs shared by C. compressirostris, C. curvirostris, and the more distantly related C. tamandua represent the ancestral character state within the genus Campylomormyrus. From this ancestral type, different EODs have likely evolved at least twice, i.e., once in C. tshokwe (towards an elongated and almost only head-positive EOD; cf. Fig. 1) and once in the common ancestor of C. rhynchophorus and C. numenius (towards an elongated EOD, retaining head-positive and head-negative phases). This is compatible with the hypothesis of Feulner et al. (2008b, 2009) that EODs should be most divergent among closely related taxa, such that assortative mating with regard to EOD promotes reproductive isolation.

Functional Divergence of Sodium Channel Genes SCN4aa and SCN4ab

The two copies of the SCN4a gene have divergent expression patterns in mormyrid electric fish, i.e., SCN4aa is solely expressed in the muscle-derived adult electric organ (EO), while SCN4ab is predominantly expressed in ordinary muscle (Zakon et al. 2006; Arnegard et al. 2010). This expression pattern has also been verified for Campylomormyrus (Lamanna et al. 2015).

Our gene tree of the paralogs SCN4aa and SCN4ab and our nucleotide diversity measures confirm an elevated evolutionary rate of the EO-specific SCN4aa gene in mormyrids (cf. Fig. 3) and reveal such a pattern also within the genus Campylomormyrus. A comparison of the inferred amino acid sequences of SCN4aa and SCN4ab in Campylomormyrus to SCN4a in humans clearly revealed the divergent evolution of the EO-specific gene SCN4aa: Among 475 amino acids of homologous sequences, SCN4aa differed from the human SCN4a at 65 sites, whereas SCN4ab differed only at 37 sites. These findings, together with inferred positive selection, confirm a strong directional selection pressure acting on SCN4aa in mormyrid electric fish, presumably in the course of neofunctionalization of this gene in the electric organ, as previously detected (Zakon et al. 2006; Arnegard et al. 2010).
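The comparison with human SCN4a amounts to counting differing positions in aligned sequences. The sketch below shows one way to do this for two pre-aligned amino acid sequences and then converts the counts reported above into proportions. The two short sequences are made-up toy strings, not real SCN4a data, and the function name is hypothetical.

```python
def count_differences(seq_a, seq_b, gap="-"):
    """Count differing residues between two aligned sequences, skipping gapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # ignore alignment columns containing a gap
        compared += 1
        if a != b:
            differences += 1
    return differences, compared

# Toy aligned fragments (illustration only, not real sequence data)
toy_reference = "MKPQIFMT"
toy_paralog = "MKPACFMT"
differences, compared = count_differences(toy_reference, toy_paralog)
print(f"toy example: {differences} of {compared} sites differ")

# Counts reported in the text, out of 475 homologous amino acids
HOMOLOGOUS_SITES = 475
for gene, n_diff in (("SCN4aa", 65), ("SCN4ab", 37)):
    print(f"{gene}: {100 * n_diff / HOMOLOGOUS_SITES:.1f} % of sites differ from human SCN4a")
```

Expressed this way, the EO-specific paralog differs from the human sequence at roughly 14 % of the compared sites, against roughly 8 % for the muscle-expressed paralog.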

Positive Selection During SCN4aa Gene Evolution in Mormyrid Electric Fish

Most of the 24 inferred PSSs occurring along the lineage ancestral to Campylomormyrus can be found in linkers and regions of the transmembrane segments close to the membrane surface (mainly on the extracellular side). One of our inferred PSSs is the substitution I1310C already identified by Zakon et al. (2006) and shared by all Mormyridae including Campylomormyrus. This substitution affects the first position of the 4-amino acid IFMT motif in the linker between DIII and DIV. The four hydrophobic amino acids of the IFMT motif function as a lid that closes the pore of the channel during fast inactivation (Goldin 2003). Deschênes et al. (1999) have demonstrated an accelerating effect on recovery (=shortened refractory period) for the substitution I1310C in human cardiac sodium channels. Zakon et al. (2006) suggested that such a shortened refractory period increases the firing rate of action potentials and could hence enable the generation of series of very short EODs (pulse-type), as they are typical for most mormyrid fish.

Seven PSSs are inferred for the DII–DIII linker of the Nav1.4a channel (cf. Fig. 5). In neuronal sodium channels, this linker is involved in compartmentalization of the channels along the axon by binding of a 27-amino acid motif to ankyrin G, which in turn anchors the sodium channels to the cell membrane (Garrido et al. 2003). Ankyrin G has additionally been shown to modify gating properties of a human neuronal sodium channel (Nav1.6; Shirahata et al. 2006), but the effect of ankyrin G on sodium channels of electric fish, if any, is so far unknown.

SCN4aa Sequence Motifs Involved in Fast Inactivation Evolve Differently Among Electric and Nonelectric Fish

Delayed repolarization caused by incomplete or disrupted inactivation is responsible for many well-studied channelopathies in humans (Catterall 2000; Rüdel et al. 2000; Goldin 2003; Motoike et al. 2004; Jurkat-Rott et al. 2010). In addition to the above-mentioned IFMT motif, numerous other sites and motifs involved in fast inactivation have been identified. For example, the docking site for the inactivation particle includes parts of the S4-S5 linkers of DIII and DIV and the cytoplasmic end of S6 in DIV (Goldin 2003). Here, G1306 and G1307, a pair of glycines that functions as the hinge of the inactivation gate, are conserved in Mormyridae, but not among nonelectric teleosts. G. niloticus, the mormyroid with a wave-type EOD, exhibits the substitution G1307E, and several nonelectric fish show the substitution G1307D. Kellenberger et al. (1997) have shown that substitutions at these sites slow inactivation; their conservation in SCN4aa of Mormyridae indicates a functional constraint, putatively in enabling the short pulse-type EOD typical for members of this family of weakly electric fish.

The C-terminus also has a profound influence on fast inactivation, as demonstrated by examining chimeric voltage-gated sodium channels with interchanged C-termini (Mantegazza et al. 2001). During fast inactivation, the C-terminus comes into contact with two docking sites within the inactivation loop. Both docking sites are located shortly downstream of the IFMT motif: the KPQ motif starts at site 1330 and the PIPR motif at site 1334 (Motoike et al. 2004). Bennett et al. (1995) specifically studied the effect of an absent KPQ motif and discovered a prolonged duration of action potentials for this mutation. The KPQ motif is present in all examined teleost fish (including electric fish), but there is a Mormyroidea (including G. niloticus)-specific substitution from glutamine (Q) to alanine (A) (Q1332A), while this position is variable in gymnotiform electric fish (Arnegard et al. 2010). This substitution may have an effect, since glutamine (Q) is a polar amino acid with a reactive side chain and is often involved in binding sites, whereas alanine (A) is a nonpolar amino acid with a nonreactive side chain (Betts and Russell 2003). Without mutagenesis experiments, however, the functional implications of this substitution cannot be evaluated.
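The motif coordinates given in the text make it straightforward to see how a named substitution changes a motif. The following sketch applies a substitution label to a motif with a known start position and checks that the reference residue matches. The helper name is hypothetical, and the coordinates are those stated in the text (IFMT with I at position 1310, KPQ starting at 1330).

```python
def apply_substitution(motif, motif_start, label):
    """Return the motif after applying a substitution such as 'Q1332A'.

    motif_start is the sequence position of the first residue of the motif.
    """
    ref, position, alt = label[0], int(label[1:-1]), label[-1]
    index = position - motif_start
    if not 0 <= index < len(motif):
        raise ValueError(f"{label} lies outside the motif")
    if motif[index] != ref:
        raise ValueError(f"{label} expects {ref}, but the motif has {motif[index]}")
    return motif[:index] + alt + motif[index + 1:]

# Inactivation lid motif: I1310C replaces the first residue
ifmt_variant = apply_substitution("IFMT", 1310, "I1310C")
# C-terminus docking motif: Q1332A replaces the third residue
kpq_variant = apply_substitution("KPQ", 1330, "Q1332A")
print(ifmt_variant, kpq_variant)
```

The check on the reference residue guards against off-by-one errors when positions are taken from alignments that use different numbering schemes.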

SCN4aa Fixed Differences in Campylomormyrus

We also detected 31 amino acid sites in the SCN4aa gene showing fixed differences among the six species of Campylomormyrus, which might contribute to the diversity of pulse-type EODs within this genus, in particular to the elongated EOD of C. numenius and C. rhynchophorus. These two species share two substitutions (Fig. 4). As of now, we are unable to discern whether these substitutions simply reflect shared phylogenetic ancestry or have a functional impact on EOD duration. As data for these sites from other mormyrid fish are lacking so far (cf. Supplementary Table S1), we also do not know whether these substitutions are private to these two species. In any case, these substitutions and the private substitutions occurring in C. numenius should be further analyzed for their functional implications, as they could contribute not only to EOD duration, but also to the peculiar shape of the EOD's positive phase with its characteristic inflection point. Also, the frameshift mutation in C. tshokwe, the other lineage with an elongated EOD, warrants further investigation. To this end, we are currently aiming at the production of cross-species hybrids, as subsequent generations/backcrosses could be evaluated for co-segregation of specific alleles and distinct EOD characteristics.

SCN4aa Gene Evolution in G. niloticus

SCN4aa evolution is different in the mormyroid G. niloticus (the species producing a wave-type EOD), when compared to species of its sister taxon (i.e., the pulse-type discharging Mormyridae). The adult EO of this species is homologous to the larval EO of Mormyridae (Kirschbaum 1977, 1980, 1995). The EO of G. niloticus is also derived from muscle tissue (Dahlgren 1914), but the electrocytes differ in many aspects from those in the EO of Mormyridae (Schwartz et al. 1975; Hopkins 1999). Therefore, a different selection pressure might be expected to act on SCN4aa in G. niloticus. Above all, however, the difference in discharge type between G. niloticus and Mormyridae (wave-type vs. pulse-type EOD) is most likely the driving force for the divergent SCN4aa evolution among the groups, as both types have different requirements with regard to channel kinetics and inactivation timing. Our analyses focused mainly on regions involved in sodium channel fast inactivation; therefore, most differences were discovered in those regions. It remains to be evaluated which other properties of the Nav1.4a channel specifically diverged between wave-type and pulse-type electric organs.

Conclusion

In summary, we confirmed previously known positions (Arnegard et al. 2010) and found numerous additional positions at which the SCN4aa gene in mormyrids differs from that of nonelectric Teleostei and mammalian model organisms, indicating strong divergent and adaptive evolution in electric fish. This is corroborated by our inference of strong positive (directional) selection at various sites of this gene in mormyrids. We identify a number of substitutions as candidates for further evaluation in functional studies, in order to unravel the contribution of specific substitutions in the SCN4aa gene to the characteristics of the adult EOD in mormyrid fish.

While this study focused on the identification of nonsynonymous genetic variation in the SCN4aa gene among different Campylomormyrus species displaying particularly diverse EODs, one should not disregard other factors which may account for differences in EOD duration and shape: (1) Innervation and/or electrocyte anatomy, which can profoundly impact EOD characteristics (Paul et al. 2015 and references therein). (2) Alternative splicing: For the gymnotiform S. macrurus, a splice variant of a voltage-gated sodium channel has been shown to substantially alter EOD characteristics (Liu et al. 2008). (3) Differential gene expression. Especially among closely related species with strikingly distinct EODs (as in our example of Campylomormyrus), it would be interesting to examine whether there are differences in expression and/or splice variants of SCN4aa correlating with variable EOD features. Here, the impact of the frameshift mutation found for SCN4aa in C. tshokwe would be interesting to evaluate, as it is predicted to cause a truncated transcript (because of a premature stop codon within exon 25).