Introduction

Venom is an adaptation that is central to the ecology and evolution of the front-fanged snakes from the families Elapidae, Viperidae, and Atractaspidae (Mitchell 1861; Boquet 1979; Jackson 2003; Fry et al. 2015). Snakes lacking these prominent fangs were thought, until recently, to be non-venomous because, with a few notable exceptions, they posed little medical threat to humans (Minton 1990; Fry et al. 2003a, b, 2006). A combination of factors, including the recognition that many non-front-fanged snakes (NFFS) possess enlarged, grooved teeth in the back part of the mouth (Fritts et al. 1994; Jackson 2003; Deufel and Cundall 2006), studies showing the homology between the venom glands of colubroid snakes (Fry et al. 2008; Jackson et al. 2016), adoption of a more inclusive definition of venom among the research community (Fry and Wüster 2004; Fry et al. 2009a, b, 2015; Nelsen et al. 2014; Jackson and Fry 2016), and actual extraction and analysis of secretions from the oral glands of these snakes (Fry et al. 2003a, b; Lumsden et al. 2005; Pawlak et al. 2006), has led to the realization that many NFFS should be considered venomous. The species that were the subjects of these latter studies (Coelognathus radiatus and Boiga dendrophila) belong to the Colubridae subfamily Colubrinae, which is thought to have originated in Africa and radiated into a wide diversity of species across Africa, Europe, Asia, Oceania, and the Americas (Burbrink and Lawson 2007; Pyron et al. 2013a; Figueroa et al. 2016). The venoms of these snakes likely influenced their evolution as they migrated across continents and exploited new prey populations.

Venom genes are found to be subject to positive selection more frequently than housekeeping genes or the physiological homologs of toxin proteins (Fry 2005; Jackson et al. 2013; Casewell et al. 2013; Sunagar et al. 2013; Vonk et al. 2013; Fry et al. 2015; Dashevsky and Fry 2018). This positive selection drives toxins to evolve rapidly through mechanisms such as the rapid accumulation of variation at exposed residues or extensive duplication and neofunctionalization (Sunagar et al. 2013; Margres et al. 2017). One of the most significant sources of these selective pressures is the highly varied niches that snakes occupy, especially the correspondingly broad range of prey species (Sunagar et al. 2013, 2014; Fry et al. 2013). While snake venom may be secondarily used in defence, as in spitting cobras (Cascardi et al. 1999; Hayes et al. 2008; Panagides et al. 2017), its primary function is to subdue prey (Fry et al. 2008, 2009b, 2013; Jackson et al. 2013; Debono et al. 2017). Coupled with the rapid rate of venom evolution, this leads to many snakes evolving toxins that are particularly potent against their specific prey items (Daltry et al. 1996; Jackson et al. 2013, 2016; Cipriani et al. 2017). Studies of prey specificity across a broad selection of snakes and prey taxa have led to the rule of thumb that small non-enzymatic neurotoxins are particularly useful against diapsid prey (birds and reptiles), while synapsids (mammals) tend to be more susceptible to enzymatic coagulotoxins (Li et al. 2005; Pawlak et al. 2006; Fry et al. 2008; Jackson et al. 2016; Cipriani et al. 2017). In fact, boigatoxin-A, denmotoxin, and irditoxin, the earliest characterized Boiga toxins, were found to be particularly potent in diapsid (avian) tissue assays (Lumsden et al. 2005; Pawlak et al. 2006, 2008).

Boigatoxin-A, denmotoxin, and irditoxin are all three-finger toxins (3FTx), which is one of the most widespread families of snake toxins (Fry 2005; Utkin et al. 2015). The 3FTx family originated early in snake venom evolution: over 100 million years ago (Fry et al. 2003a, b, 2006, 2013; Utkin et al. 2015). While present in the venoms of multiple snake families, this toxin type is at its highest relative abundance in the families Colubridae and Elapidae (Fry et al. 2003b, 2008). This broad trend holds true in the genus Boiga, with various studies finding that 3FTx are the predominant toxins in their venoms (Fry et al. 2003b; McGivern et al. 2014; Modahl and Mackessy 2016; Pla et al. 2018). The 3FTx derive their name from their shared structure: three β-stranded loops, or ‘fingers’, emerging from a stable, cysteine-bonded core (Utkin et al. 2015). Despite the conservation of this structure, 3FTx have evolved numerous toxic activities including α-neurotoxicity, κ-neurotoxicity, cytotoxicity, and platelet inhibition (Fry et al. 2003b; Fry and Wüster 2004; Sunagar et al. 2013; Kessler et al. 2017). Plesiotypic 3FTx are characterized by 10 conserved cysteine residues and α-neurotoxicity, and display greater affinity towards the post-synaptic nicotinic acetylcholine receptors (nAChRs) of birds and reptiles than those of rodents (Fry et al. 2003a, b; Pawlak et al. 2006, 2008; Heyborne and Mackessy 2013). Virtually all 3FTx from non-elapid snakes retain these 10 cysteines (Fry et al. 2003a, 2008; Utkin et al. 2015). Because early studies of the activity of these plesiotypic toxins used synapsid animal models (rodents) rather than diapsid models, which would have been more similar to the avian and reptilian natural prey of the snakes being studied, the 10-cysteine 3FTx were initially mistakenly referred to as “weak neurotoxins” (Utkin et al. 2001). The prevalence of such anthropocentric experimental design in investigations of the toxicity of the venoms of NFFS led some authors to conclude that Boiga species were non-venomous (Rochelle and Kardong 1993).

Irditoxin, from the venom of B. irregularis, is particularly notable among 3FTx, because the mature toxin is a covalently linked dimer of two distinct 3FTx which exhibits a tenfold higher potency than the monomeric denmotoxin (Pawlak et al. 2006, 2008). Due in part to its effective venom, B. irregularis has become a noxious pest that is responsible for the extirpation of bird populations on several Pacific islands that were previously snake-free (Savidge 1986; Rodda and Fritts 1992). Since bird populations crashed after the snake's introduction, lizards have formed the majority of the diet for B. irregularis on Guam, especially for smaller individuals (Savidge 1988). Irditoxin may have played a key role in the invasion, its diapsid-specific toxicity enabling the snakes to efficiently exploit the local bird populations and transition to eating lizards once birds became scarce. However, it remains unknown whether irditoxin is unique to B. irregularis or whether closely related Asian or African species also produce irditoxin-like dimers (Pyron et al. 2013a; Figueroa et al. 2016). Modahl and Mackessy (2016) suggested, based on sequence similarity, that the presence of such dimers in B. cynodon venom is likely. However, that study did not explore whether the newly evolved cysteines characteristic of irditoxin were present in the sequences of species other than B. irregularis.

To investigate these questions, we sequenced the transcriptomes of seven populations from five species of Boiga ranging from the semi-terrestrial B. trigonata, which is found in the Middle East and South Asia (Baig et al. 2011), to highly arboreal Southeast Asian species including two subspecies of B. dendrophila (B. d. dendrophila and B. d. gemmicincta) as well as B. cynodon and B. nigriceps (Das 2012), and populations of B. irregularis from Sulawesi in western Indonesia and Brisbane in eastern Australia. On separate occasions, we collected venom samples from B. d. dendrophila and Toxicodryas blandingii for proteomic investigation. These species encompass almost all the genetic, geographical, ecological, and morphological diversification within this clade of colubrine snakes. T. blandingii is a sister taxon to the genus Boiga and is prevalent within the tree canopies of sub-Saharan Africa, where this clade is suspected to have originated (Pyron et al. 2013a; Figueroa et al. 2016). T. blandingii is one of the largest species in this niche, exceeding 3 m in length, and preys largely upon amphibians, birds, and lizards (Broaders and Ryan 1997). B. d. dendrophila is widespread in Southeast Asia and also feeds on amphibians, birds, and lizards (Pawlak et al. 2006).

Materials and Methods

Venom and Tissue Supplies

Tissue samples in the form of venom glands and venom samples were collected from localities, as outlined in Table 1.

Table 1 Species collection locations and analyses performed

Transcriptomics

A multi-step process was used to sequence and align Boiga 3FTx from the SE Asian clade.

Species Examined

To determine the sequence variability of Boiga 3FTx, venom glands were obtained from Boiga cynodon, B. d. dendrophila, B. d. gemmicincta, B. irregularis (Sulawesi), B. irregularis (Brisbane), B. nigriceps, and B. trigonata. All glands were dissected from freshly euthanized specimens 4–5 days post venom extraction, as this is the time when toxin mRNAs are being transcribed at the highest rate (Pawlak et al. 2006, 2008; Rokyta et al. 2012). Toxicodryas blandingii venom glands were not available and, therefore, were not included (Table 1).

RNA Extraction and mRNA Purification

Venom-gland tissue (20 mg) was homogenized using a rotor homogenizer and total ribonucleic acid (RNA) was extracted using the standard TRIzol Plus methodology (Invitrogen). RNA quality was assessed using a Nanodrop (Nanodrop 2000/2000c, v1.4.2 Spectrophotometer, Thermo Scientific, USA). mRNA was isolated following the standard Dynabeads mRNA DIRECT Kit protocol (Life Technologies Ambion, 1443431, Thermo Fisher Scientific, USA).

Sequencing

Total RNA was extracted from venom glands using the standard TRIzol Plus method (Invitrogen). Extracts were enriched for mRNA using standard RNeasy mRNA mini kit (Qiagen) protocol. mRNA was reverse transcribed, fragmented, and ligated to a unique 10-base multiplex identifier (MID) tag prepared using standard protocols and sequenced on a MiSeq platform (Australian Genome Research Facility). MID reads informatically separated sequences from the other transcriptomes on the plates, which were then post-processed to remove low-quality sequences before de novo assembly into contiguous sequences (contigs).

Assembly

Illumina reads that were likely to be cross-contamination between multiplexed samples were removed from our read files by identifying 57-nucleotide k-mers in our focal read set that were present in another read set from the same lane at a 1000-fold or higher level. Reads with 25% or more of their sequence represented by such k-mers were filtered from the data set. This was accomplished using Jellyfish 2.2.6 (Marçais and Kingsford 2011) and K-mer Analysis Toolkit (KAT) 2.3.4 (Mapleson et al. 2017). We then removed adaptors and low-quality bases from the reads and discarded any reads shorter than 75 base pairs using Trim Galore version 0.4.3 (Krueger 2015). Next, we used PEAR 0.9.10 (Zhang et al. 2013) to combine pairs of reads whose ends overlapped into one longer, merged read. Finally, we carried out several independent de novo assemblies of these reads using the programs Extender version 1.04 (Rokyta et al. 2012), Trinity version 2.4.0 (Grabherr et al. 2011), and SOAPdenovo version 2.04 (Xie et al. 2014). SOAPdenovo was run repeatedly with k-mer sizes of 31, 75, 97, and 127. The raw reads may be found in the NCBI Sequence Read Archive under the accession number SRP155444.
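The cross-contamination filter described above can be illustrated with a minimal Python sketch. It is not the Jellyfish/KAT implementation used in this study; it only mirrors the stated logic (flag k-mers that are at least 1000-fold more abundant in another read set from the same lane, then drop reads with 25% or more of their bases covered by flagged k-mers). The function names are hypothetical, and for simplicity the sketch ignores reverse complements and measures the covered fraction per base, both of which are assumptions.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (forward strand only; a simplifying assumption)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def flag_contaminant_kmers(focal_counts, other_counts, fold=1000):
    """Return focal k-mers that are >= `fold` times more abundant elsewhere."""
    return {
        kmer
        for kmer, n in focal_counts.items()
        if other_counts.get(kmer, 0) >= fold * n
    }


def contaminant_fraction(read, flagged, k):
    """Fraction of bases in `read` covered by at least one flagged k-mer."""
    if not read:
        return 0.0
    covered = [False] * len(read)
    for i in range(len(read) - k + 1):
        if read[i:i + k] in flagged:
            for j in range(i, i + k):
                covered[j] = True
    return sum(covered) / len(read)


def filter_reads(reads, flagged, k=57, max_fraction=0.25):
    """Keep reads with less than `max_fraction` of bases in flagged k-mers."""
    return [r for r in reads if contaminant_fraction(r, flagged, k) < max_fraction]
```

In practice the k-mer counts would come from a dedicated counter such as Jellyfish, because counting 57-mers for a whole sequencing lane in a Python dictionary would be far too slow and memory-hungry.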

Annotation

The de novo assemblies were concatenated and searched against reference toxin sequences obtained from UniProt using BLAST version 2.7.1 (Altschul et al. 1990; Consortium 2017). We then removed all remaining contigs that did not contain complete coding sequences. Those that did were screened by visualizing the read coverage for all contigs whose highest and lowest coverage differed by at least a factor of 10, using the Burrows–Wheeler Aligner (BWA) 0.7.16a (Li and Durbin 2009), the Genome Analysis ToolKit (GATK) 3.8 (McKenna et al. 2010), and BEDTools 2.26.0 (Quinlan and Hall 2010). Those that showed sharp discontinuities indicative of chimeric assembly were removed from the data set. In cases where two samples contained transcripts for identical amino acid sequences, we aligned the raw reads to all the 3FTx from each sample using BWA to check whether either of them was expressed at unusually low levels, which might indicate contamination. In every case, the number of reads aligned to these contigs was within an order of magnitude of the highest number aligned to any toxin from that sample. Finally, we used CD-HIT version 4.7 to cluster the remaining sequences and remove duplicates (Li and Godzik 2006; Fu et al. 2012). The sequences of these contigs are available in Supplementary File 1 or at the NCBI Transcriptome Shotgun Assembly Sequence Database under the following accession numbers: GGUA00000001 (B. cynodon), GGUB00000001–GGUB00000043 (B. d. dendrophila), GGUC00000001–GGUC00000012 (B. d. gemmicincta), GGUD00000001–GGUD00000019 (B. irregularis, Brisbane), GGUE00000001–GGUE00000014 (B. irregularis, Sulawesi), GGUF00000001–GGUF00000007 (B. nigriceps), and GGUG00000001–GGUG00000007 (B. trigonata).
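The two numerical screens in this paragraph can be sketched in Python as follows. This is only an illustration of the stated thresholds, not the BWA/GATK/BEDTools workflow itself: the per-base depth lists and read counts are assumed to have been computed already, the function names are hypothetical, and the handling of zero-coverage positions is an assumption that the text does not specify.

```python
def needs_coverage_inspection(depths, min_ratio=10.0):
    """Flag a contig whose highest and lowest per-base depth differ >= min_ratio.

    Assumption: a contig with any zero-depth position (and some coverage
    elsewhere) is always flagged, since the ratio is undefined there.
    """
    if not depths:
        return False
    lowest, highest = min(depths), max(depths)
    if lowest == 0:
        return highest > 0
    return highest / lowest >= min_ratio


def within_order_of_magnitude(read_count, max_toxin_read_count):
    """True if a contig's read count is within 10-fold of the sample's maximum."""
    return read_count * 10 >= max_toxin_read_count
```

Flagged contigs would still need to be examined by eye for the sharp discontinuities that indicate chimeric assembly; the ratio only decides which contigs are worth plotting.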

Proteomics

Our proteomic investigations used a combined approach of 1D and 2D SDS-PAGE, excision of gel bands and spots, and LC–MS/MS identification of the proteins therein. Proteomic methods were performed as previously described by us (Ali et al. 2013; Debono et al. 2016, 2017). Figure 5a shows the bands picked for each species for MS/MS processing.

Analysis

Phylogenetic Reconstruction

Protein sequences for all 10-cysteine 3FTx that were available from the UniProt database were combined with the translations of our 3FTx transcripts (Consortium 2017). The sequences were aligned using a combination of manual alignment of the conserved cysteine positions and alignment using the MUltiple Sequence Comparison by Log-Expectation (MUSCLE) algorithm implemented in AliView for the blocks of sequence in between these sites (Edgar 2004; Larsson 2014). This alignment contained 282 total sequences, of which 159 were from Boiga. We reconstructed the phylogeny of these sequences using MrBayes 3.2 for 15,000,000 generations and 1,000,000 generations of burnin with lset rates = invgamma (allows rates to vary, with some sites invariant and others drawn from a γ distribution) and prset aamodelpr = mixed (allows MrBayes to generate an appropriate amino acid substitution model by sampling from 10 predefined models) (Ronquist et al. 2012). The run was stopped when convergence values stabilized at approximately 0.013. Two replicate runs recovered virtually identical topologies. A nexus file containing the full alignment and MrBayes settings as well as the output tree can be found in Supplementary File 1.

Similarity Network

An all-vs-all Basic Local Alignment Search Tool (BLAST) search was conducted on the same data set of protein sequences as was used for the phylogeny with -outfmt "10 qacc sacc qcovs evalue" (Altschul et al. 1990). The results of this search were filtered using a custom R script (see Supplementary File 2) to remove self-to-self results, collapse bidirectional results into one entry, and create a similarity score defined as −log10(e value) for each entry. Edges with coverage < 70% or e value > 1 × 10^−17 were excluded from the analysis and the network was created in Cytoscape 3.5.1 using the Prefuse Force Directed OpenCL Layout on the similarity scores (Shannon et al. 2003).
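The filtering steps applied to the BLAST output can be sketched as below. The actual analysis used the custom R script in Supplementary File 2, which is not reproduced here; this Python version is an independent illustration of the described steps. Several details are assumptions rather than the authors' choices: the function name, keeping the smaller e value when the two directions of a pair disagree, applying the coverage and e value thresholds before collapsing, and replacing e values of zero with a floor so that the logarithm is defined.

```python
import csv
import io
import math


def build_edges(blast_csv, min_cov=70.0, max_evalue=1e-17, evalue_floor=1e-180):
    """Turn BLAST '-outfmt "10 qacc sacc qcovs evalue"' text into network edges.

    Returns {(seq_a, seq_b): similarity}, with similarity = -log10(e value).
    """
    best = {}
    for row in csv.reader(io.StringIO(blast_csv)):
        if not row:
            continue  # skip blank lines
        qacc, sacc, qcovs, evalue = row
        if qacc == sacc:
            continue  # remove self-to-self hits
        cov, ev = float(qcovs), float(evalue)
        if cov < min_cov or ev > max_evalue:
            continue  # exclude weak or low-coverage hits
        key = tuple(sorted((qacc, sacc)))  # collapse A->B and B->A
        # Assumption: keep the smaller e value of the two directions.
        if key not in best or ev < best[key]:
            best[key] = ev
    # Assumption: e values reported as 0.0 are floored before taking the log.
    return {key: -math.log10(max(ev, evalue_floor)) for key, ev in best.items()}
```

The resulting dictionary maps each unordered sequence pair to one weight, which is the form an edge table for a force-directed layout would need.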

Protein Clustering

Clustering was carried out using the CD-HIT 4.7 algorithm with options -c 0.45 -n 2 -d 0 -sc 1 -g 1 (Li and Godzik 2006; Fu et al. 2012). This sets the similarity threshold of the clusters to 45% and sorts the clusters by the number of sequences they contain.

Tests for Selection

Coding DNA sequences for denmotoxin-like sequences were compiled from GenBank (Benson et al. 2013). The sequences were trimmed to only include those codons which translate to the mature protein, translated, aligned, and reverse translated using AliView and the MUSCLE algorithm (Edgar 2004; Larsson 2014). The resulting codon alignments can be found in Supplementary Files 03–09.

Phylogenetic trees for each clade were generated from the resulting codon alignments using the same methods as described above. These tree topologies were used for all subsequent analyses.

We used several of the tests for selection implemented in HyPhy version 2.220150316beta due to their different emphases (Pond et al. 2005). The Analyze Codon Data analysis generates overall ω values for an alignment, while the Fast Unconstrained Bayesian AppRoximation (FUBAR) method gauges the strength of consistent positive or negative selection on individual amino acids (Murrell et al. 2013). In contrast, the Mixed Effects Model of Evolution (MEME) method identifies individual sites that were subject to episodes of positive selection in the past (Murrell et al. 2012).

Protein Modelling

Custom models for each clade of 3FTx were generated by inputting representative sequences (Clade A—Boiga_irregularis_Brisbane_3FTx_00; Clade B—Boiga_dendrophila_dendrophila_3FTx_14; Clade C—Boiga_irregularis_A0A0B8RS39; Clade D—Trimorphodon_biscutatus_A7X3S0; Clade E—Boiga_dendrophila_dendrophila_3FTx_28; Clade F—Boiga_dendrophila_dendrophila_3FTx_38; Clade G—Boiga_dendrophila_dendrophila_3FTx_29) to the Phyre2 webserver using the Intensive option (Kelley et al. 2015).

Alignments of each clade were trimmed to match these structures and attribute files were created from FUBAR and MEME results. Conservation scores were calculated using the default settings of AL2CO (Pei and Grishin 2001). The structures were rendered and colored according to these attributes in UCSF Chimera version 1.10.2 (Pettersen et al. 2004).

Results

Transcriptomics

The summary statistics describing the concatenated transcriptomes for the various species are very similar, yet they contained highly variable numbers of unique 3FTx isoforms (Table 2). The most unusual transcriptome in terms of these statistics is that of the B. irregularis from Sulawesi, which had a much greater portion of the reads left unpaired, more contigs, and a greater N50 length; despite these differences, the number of 3FTx isoforms in this sample was roughly similar to the number from our B. irregularis from Brisbane. Similarly, the B. cynodon assembly contained only one full-length 3FTx (and fragmentary contigs indicating the likely presence of a second isoform at low expression levels) despite containing the second highest number of contigs. Our transcriptomics also confirm the presence of sequences with one or both of the additional cysteines that characterize the irditoxin dimer complex in a range of taxa beyond B. irregularis, where it was originally discovered (Table 3). Interestingly, we did not recover toxins with either irditoxin-like cysteine from the B. trigonata transcriptome, even though several taxa more distantly related to B. irregularis contain one or both. It remains unclear if this is a genuine result or an artefact of the assembly process, especially given that B. trigonata’s assembly had the lowest number of contigs and the lowest N50 length (Table 2).

Table 2 Assembly statistics for concatenated transcriptomes
Table 3 Presence and absence of irditoxin-like cysteines

The presence of single toxins which contained the additional cysteines that characterize both the A and B subunits of irditoxin (Fig. 1) was an unexpected finding. While it is possible that these sequences are chimeric or otherwise represent assembly error, we found no evidence of such error when we examined the coverage of the contigs and the sequences of the individual reads from which they were assembled. The potential impacts of a novel cysteine pair within the same sequence on the mature toxins’ structures and activity are entirely unknown.

Fig. 1

Mature peptide sequences of notable denmotoxin-like sequences and additional representatives of each subclade with the sites of the ten canonical cysteines and irditoxin A and B cysteines highlighted in color. These sequences include denmotoxin (a previously characterized monomeric toxin), irditoxin A and B subunits (a previously characterized dimeric toxin), sequences with cysteines in the characteristic position of irditoxin A or B from Trimorphodon and Telescopus, and the two sequences with irditoxin-like cysteines at both positions. Phylogeny and subclades are derived from Fig. 2. Corresponding IDs for the selected sequences: Clade A: Boiga_irregularis_Brisbane_3FTx_00; Clade B: Boiga_dendrophila_dendrophila_3FTx_14; Trimorphodon B: Trimorphodon_lambda_A0A193CHM1; Clade C: Coelognathus_radiatus_P83490; Trimorphodon A: Trimorphodon_biscutatus_A7X3S2; Irditoxin B: Boiga_irregularis_A0S865; Clade D: Boiga_cynodon_A0A193CHL2; Telescopus B: Telescopus_dhara_A7X3V0; Clade E: Boiga_dendrophila_gemmicincta_3FTx_09; B. d. gemmicincta A & B: Boiga_dendrophila_gemmicincta_3FTx_05; Denmotoxin: Boiga_dendrophila_Q06ZW0; Clade F: Boiga_dendrophila_dendrophila_3FTx_08; B. d. dendrophila A & B: Boiga_dendrophila_dendrophila_3FTx_41; Irditoxin A: Boiga_irregularis_A0S864; Clade G: Boiga_nigriceps_3FTx_03. (Color figure online)

Phylogenetics and Protein Similarity Network

We found that 56.4% of published 10-cysteine 3FTx come from the genus Boiga. This is likely due in large part to publication bias: Boiga have long been recognized as venomous, and B. irregularis is one of the most notorious invasive species in the world (Minton 1990; Rodda and Savidge 2007). The other well-studied NFFS venom is that of Dispholidus typus, the dangerously toxic boomslang, which is primarily composed of P-III snake venom metalloproteinases and so contributes relatively few sequences to our data (Debono et al. 2017; Pla et al. 2017). Among the front-fanged snakes, the stereotypical toxins for each lineage are something other than 10-cysteine 3FTx: viperid venoms are primarily enzymatic (Mackessy 2010), elapid venoms are often dominated by derived 8-cysteine 3FTx (Fry 1999), and atractaspid venoms by blood-pressure-acting sarafotoxin peptides and procoagulant enzymes (Kochva et al. 1982; Terrat et al. 2013; Oulion et al. 2018). Together, these factors mean that far more 10-cysteine 3FTx are known from Boiga than from any other snake lineage. However, the sheer number of sequences from Boiga does indicate that the 3FTx toxin family has undergone extensive duplication during the evolution of this genus.

Of these Boiga toxins, 92.5% belong to a clade that we refer to as denmotoxin-like (Fig. 2). Independent protein clustering and protein similarity network analyses confirmed this subdivision within our data set (Fig. 3). Within the denmotoxin-like sequences, we designated monophyletic clades for further analysis (Figs. 1, 2): Clade A is composed exclusively of Boiga sequences, which retain the ancestral state of only 2 amino acids before cysteine #1 (as opposed to 9 in all other denmotoxin-like clades) and none of which possess either of the derived irditoxin-like cysteines; Clade B contains Boiga and Telescopus sequences, none of which possess either of the irditoxin-like cysteines; Clade C contains sequences from Coelognathus, Oxybelis, and Trimorphodon, the latter of which includes toxins with the irditoxin-like A and B cysteines separately (though the T. biscutatus sequence also lacks the canonical cysteines #2 and #3); Clades D and E both contain only Boiga sequences with the irditoxin-like B cysteine; and Clades F and G are also composed entirely of Boiga sequences, but contain a mix of those with the irditoxin-like A cysteine, those with neither, and one sequence each with both the irditoxin-like A and B cysteines. Clades E, F, and G are part of a polytomy that is separated from Clade D by two monotypic branches including a Telescopus sequence with the irditoxin-like B cysteine.

Fig. 2

Phylogenetic tree of all publicly available 10-cysteine 3FTx and those from our transcriptome assemblies. Denmotoxin-like sequences are divided into monophyletic clades for further analyses and colored according to their irditoxin-like cysteines. (Color figure online)

Fig. 3

Protein similarity network based on BLAST e values. Labels correspond to CD-HIT clusters. Both the network and the clusters confirm the distinction between the denmotoxin-like sequences and others. Denmotoxin-like sequences are colored according to their irditoxin-like cysteines. (Color figure online)

Signals of Selection

We examined the signals of selection on the various clades of denmotoxin-like sequences using whole-gene as well as site-specific algorithms (Table 4). We found Clade D to be subject to negative selection, Clade C to be evolving neutrally, Clades A, E, F, and G to be under positive selection, and Clade B to be under extreme positive selection. However, Clades B, C, and E included few sequences, which increases the potential error and decreases the statistical power of these estimates. This is illustrated by the fact that, despite Clade B’s extreme ω value across the whole gene, our site-specific analyses found relatively few sites that met their statistical significance thresholds. A possible reason for the high ω value in Clade B is that the taxa within it span a broad range of the colubrines. Though many sites failed to reach statistical significance, the pattern of estimated selection across the proteins for each clade can be seen in Fig. 4 (full-size images for each clade can be found in Supplementary Figs. 1–7).

Table 4 Tests of selection
Fig. 4

Schematic phylogeny of the clades within the denmotoxin-like 3FTx. Branches for each clade are colored and tips are labeled according to ω values. Protein models show front and back views colored according to the estimated strength of selection (β − α) from FUBAR (left) and MEME (right). (Color figure online)

Proteomics

The combined 1D and 2D gel approach revealed that the venoms of both T. blandingii and B. d. dendrophila were dominated by 3FTx (Fig. 5). Comparison of reduced and non-reduced B. d. dendrophila 1D gels revealed the presence of a band in the non-reduced gel between the 15 and 20 kDa markers (Fig. 5a), while in the reduced gel, the monomeric 3FTx band sitting just below the 10 kDa marker was correspondingly thicker. LC–MS/MS identified the additional band as 3FTx (Fig. 5a, Supplementary Table 2). This confirmed the presence of a dimeric complex made up of two 3FTx. Homologous bands were either absent entirely from the T. blandingii venom or were too faint to detect in both the reduced and non-reduced gels (Fig. 5a). The 2D gels (Fig. 5b, c) further demonstrate the diversity of 3FTx isoforms found in both venoms.

Fig. 5

a 1D SDS-PAGE glycine produced under reducing and non-reducing conditions stained with colloidal coomassie brilliant blue G-250. Toxins of interest were identified by LC–MS/MS of excised bands (green numbers correspond to Supplementary Tables 1 and 2 of LC–MS/MS results). From left: Toxicodryas blandingii (Tbl) reduced (R), non-reduced (NR), ladder (250–10 kDa), Boiga dendrophila (Bdd) reduced (R), non-reduced (NR). b 2D SDS-PAGE reduced electrophoresis mini gel. Toxicodryas blandingii crude venom stained with colloidal coomassie brilliant blue G-250. First dimension: isoelectric focusing (pH 3–10 non-linear gradient); second dimension: molecular weight 250–10 kDa, 12% SDS-PAGE. The pH gradient and the molecular weight marker positions are shown. 3FTx were identified by weight and comparison to 1D results. c 2D SDS-PAGE reduced electrophoresis mini gel. Boiga dendrophila crude venom stained with colloidal coomassie brilliant blue G-250. First dimension: isoelectric focusing (pH 3–10 non-linear gradient); second dimension: molecular weight 250–10 kDa, 12% SDS-PAGE. The pH gradient and the molecular weight marker positions are shown. 3FTx were identified by weight and comparison to 1D results. (Color figure online)

Discussion

NFFS venoms remain a neglected area of research. This is due in part to the comparative difficulty of obtaining venom samples (Hill and Mackessy 1997), the fact that bites from most of these snakes have negligible impacts on human health (Minton 1990), and the fact that the research community has only relatively recently begun to regard many of these snakes as venomous at all (Fry et al. 2003a, 2015; Fry and Wüster 2004; Nelsen et al. 2014). This study focuses on the evolutionary dynamics of one toxin family and one genus of colubrines; there remains much work to be done on other toxin types and other NFFS.

Proteomic and transcriptomic studies of Boiga venoms revealed that diapsid-specific neurotoxic 3FTx are the predominant toxins secreted by this amphibian-, bird-, and reptile-feeding genus. Previous investigation into this genus revealed a unique disulphide-bond-linked 3FTx dimer called irditoxin, isolated from B. irregularis, an invasive species responsible for devastating decreases in bird populations on the island of Guam (Savidge 1986, 1988; Pawlak et al. 2008). This was especially significant, because it was the first known covalently linked 3FTx–3FTx dimer toxin. Irditoxin was shown to have a particularly strong diapsid-specific (bird and reptile) neurotoxicity, tenfold that of monomeric toxins (Pawlak et al. 2008). Despite this toxin being isolated 8 years ago, the presence or absence of homologous toxins in other Boiga species has not been thoroughly investigated until now.

Our transcriptomic analyses revealed for the first time that species besides B. irregularis produce transcripts with the necessary additional cysteines to form both the A and B chains of irditoxin homologues. Such pairs of transcripts were recovered from B. dendrophila dendrophila, B. d. gemmicincta, and B. nigriceps (Table 3). We also found, from the alignment of our data and previously published toxin sequences, that two papers had published toxins with cysteines at the characteristic irditoxin sites prior to this paper. Fry et al. (2008) published a toxin sequence from Trimorphodon biscutatus (Trimorphodon_biscutatus_A7X3S2) with the A cysteine and one with the B cysteine from Telescopus dhara (Telescopus_dhara_A7X3V0) the year prior to the discovery of irditoxin (Pawlak et al. 2009). Thus, the earlier study did not recognize the significance of these cysteines. Modahl and Mackessy (2016) discussed the similarity of their two B. cynodon sequences to irditoxin A (Boiga_cynodon_A0A193CHL1, 83% identical) and B (Boiga_cynodon_A0A193CHL2, 93% identical); however, only the sequence similar to the B subunit actually contained the characteristic cysteine and the presence of this cysteine was not commented upon. Modahl and Mackessy (2016) also published—but did not discuss—a B. nigriceps sequence that contained the A cysteine and a Trimorphodon lambda sequence that contained the B cysteine; nor did they discuss the presence or absence of the cysteines necessary for forming covalent bonds between the subunits when they speculated on the possible occurrence of dimeric complexes homologous to irditoxin in species besides B. irregularis.

Conspicuously, our phylogenetic, clustering, and network analyses (Figs. 2, 3) demonstrate that sequence similarity does not reliably distinguish between toxins with or without the A and B cysteines. Our transcriptomic analyses confirm the finding of toxins with the A cysteine from B. nigriceps, but they did not recover any sequence with the B cysteine from B. cynodon. Because their technique of sequencing mRNA from venom rather than gland tissue resulted in very low yields, Modahl and Mackessy (2016) did not recover toxins with both the A and B cysteines from any species including B. irregularis. Thus, our analyses are the first evidence that species besides B. irregularis transcribe both the A and B subunits.

Despite the fact that our data include Trimorphodon toxins with cysteines at the characteristic site of both the A and B chains of irditoxin, they are found in different species with the A chain in T. biscutatus and the B chain in T. lambda. The T. biscutatus toxin with the A cysteine is particularly unusual, because it has additionally mutated the third and fourth canonical cysteines to phenylalanine and threonine, respectively. Whether either species actually secretes both chains and whether they form a dimeric toxin will require a thorough proteomic investigation of their venoms. The function of the putative toxin is also of interest; while it seems somewhat unlikely these terrestrial snakes would possess a bird-specific toxin, these toxins are also potent against other diapsids such as lizards. However, B. trigonata, which is morphologically and ecologically convergent with Telescopus and Trimorphodon, did not yield contigs with either of the characteristic irditoxin cysteines. Specific biochemical investigation of B. trigonata venom would be necessary to conclusively confirm or refute this finding.

The distribution of these toxins, with the Trimorphodon A subunit more closely related to the Trimorphodon B subunit than to the Boiga A subunit and the Telescopus B subunit nested within the Boiga B subunits (Fig. 2), suggests that the B subunit may have arisen in a common ancestor before much of the colubrine radiation and that toxins similar to the A subunit have evolved independently in Boiga and Trimorphodon. However, given how closely related the A and B subunits are in our phylogeny, it must also be considered that this denmotoxin-like clade of 3FTx may have been exapted for forming dimeric complexes and that both subunits could have evolved independently on multiple occasions. The latter scenario is less parsimonious, but settling the issue will require isolating and characterizing a greater diversity of colubrine sequences to elucidate the likely ancestral forms of these proteins, as has been done with dimeric phospholipase A2 toxins from crotaline venoms (Whittington et al. 2018).

Somewhat surprisingly, we did not find evidence for widespread positive selection across the denmotoxin-like 3FTx. In snake venom proteins, extensive duplication is typically accompanied by high ω values (e.g., Margres et al. 2013; Sunagar et al. 2013; Dashevsky and Fry 2018). The only extreme positive selection we measured was in Clade B (ω = 3.02), and as discussed earlier (see “Signals of Selection”), this value may be an artefact of low sample size and wide taxonomic range that would decrease with further sampling. On the other hand, we found Clade D, which includes the canonical irditoxin B chain, to be under fairly strong negative selection (ω = 0.67). This may indicate that the structural constraints of dimerization strengthen the negative selection acting on a toxin, as has been shown for non-covalently linked dimeric 3FTx (Sunagar et al. 2013). The overall low ω values may help explain how these denmotoxin-like 3FTx have remained relatively similar across much of the colubrine radiation and the extensive diversification within Boiga.
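As a reading aid for the ω values quoted above, the following minimal Python sketch applies the conventional interpretation of ω (the ratio of non-synonymous to synonymous substitution rates, dN/dS): values above 1 suggest positive selection, values below 1 suggest negative (purifying) selection, and values near 1 are consistent with neutral evolution. The function name `classify_omega` and the `tolerance` band around 1 are illustrative assumptions, not part of the analysis pipeline used in this study, and the sketch ignores statistical significance entirely.

```python
def classify_omega(omega, tolerance=0.05):
    """Return a coarse label for a dN/dS ratio (omega).

    `tolerance` is a hypothetical band around 1.0 treated as neutral;
    it is an assumption for illustration, not a published threshold.
    """
    if omega < 0:
        raise ValueError("omega (dN/dS) cannot be negative")
    if omega > 1 + tolerance:
        return "positive"
    if omega < 1 - tolerance:
        return "negative"
    return "neutral"


# Clade-level omega values as reported in the text above.
clade_omega = {"Clade B": 3.02, "Clade D": 0.67}

for clade, omega in clade_omega.items():
    print(f"{clade}: omega = {omega:.2f} -> {classify_omega(omega)} selection")
```

A label of this kind only summarizes the direction of selection; as noted above, an estimate such as that for Clade B can still be inflated by low sample size.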

We also expected to see a pattern among site-specific signals of selection similar to that found in the 8-cysteine 3FTx, where the loops of the toxins (the domains that interact with the toxins’ targets) tended to be enriched in sites experiencing positive selection, and the cores, which are essential to the overall structure of the protein, were enriched for those experiencing negative selection (Sunagar et al. 2013; Dashevsky and Fry 2018). However, no such pattern is apparent in the clades of denmotoxin-like 3FTx (Fig. 4). This may be partly due to the overall strength of positive selection being lower. We also see little variation in the predicted protein structures of the various clades. This is certainly due in part to the fact that only three structures of 10-cysteine 3FTx have ever been published: irditoxin B chain (Clades A, B, C, and D), denmotoxin (Clades E and F), and irditoxin A chain (Clade G). However, the relatively high similarity between the clades and the low ω values also play a role. For these reasons, we suspect that the main source of functional diversity within the denmotoxin-like 3FTx is the dimerization itself, i.e., all the monomeric toxins are likely to exhibit activity similar to that of other monomeric toxins such as denmotoxin, while all the Boiga dimeric toxins are likely to be very similar to irditoxin. As discussed earlier, the function of the putative dimeric Trimorphodon toxin remains speculative, but diapsid-specific toxicity seems likely. Given the apparent conservation of function within the denmotoxin-like 3FTx, it seems likely that the extensive duplication serves the proximate evolutionary purpose of increasing venom yield through gene dosage effects, a pattern of venom evolution that has been demonstrated in rattlesnakes (Margres et al. 2017).

Toxicodryas blandingii and B. d. dendrophila were used for our venom proteomics, because they bracket much of the genetic, morphological, geographical, and ecological diversity found within this clade of colubrine snakes: T. blandingii is restricted to Africa, where this radiation is thought to have started, while B. d. dendrophila is representative of the recently derived Southeast Asian lineage within Boiga (Pyron et al. 2013a; Figueroa et al. 2016). Proteomic investigations via 1D and 2D SDS-PAGE confirmed that both venoms were dominated by 3FTx (Fig. 5). It was hypothesised that B. d. dendrophila would produce 3FTx homologous to those of its close relative, B. irregularis, since they occupy similar ecological niches and we had assembled transcripts of such homologues. A series of analyses including 1D (reduced and non-reduced) SDS-PAGE, 2D (reduced) SDS-PAGE, LC–MS/MS peptide sequencing, and transcriptomics supported this hypothesis (Fig. 5, Supplementary Tables 1 and 2). The protein band between 15 and 20 kDa that can be seen in the B. d. dendrophila venom, but not in that of T. blandingii, under non-reducing conditions matches what we would expect from irditoxin: the combined mass of its A and B chains is approximately 17 kDa (Fig. 5a, b; Pawlak et al. 2008). It is apparent that this band contains dimeric complexes, because under reducing conditions (which break disulphide bonds), this heavier band disappears and a new monomeric 3FTx band appears at a lower molecular weight (Fig. 5a). The double band pattern is also seen in the 2D reduced gel for B. d. dendrophila, but not T. blandingii (Fig. 5b, c). This suggests that B. d. dendrophila produces toxins of similar molecular mass and biochemical structure to irditoxin, while T. blandingii does not. These toxins were confirmed to be 3FTx by LC–MS/MS sequencing of proteins isolated from gel bands (Supplementary Tables 1 and 2). This indicates that the evolution of dimeric toxins with both subunits homologous to those of irditoxin occurred after the genus Boiga separated from its sister genera and may help explain the patterns of migration and speciation we see in the genus.

Our results suggest that the idea that the evolution of dimeric toxins was the key evolutionary innovation exclusive to B. irregularis, which facilitated it being a unique and highly successful invasive species, is likely too simplistic, because similar toxins can be found in several other species. Perhaps all Boiga species, or at least those expressing irditoxin-like dimers, pose similar risks and it is merely chance that it was B. irregularis that invaded Guam rather than another species. If this is the case, then irditoxin may have underpinned the rapid evolution and migration of the modern Boiga species. The havoc B. irregularis wreaked on Guamanian bird populations demonstrates how effective these toxins and the snakes employing them might have been as they encountered naïve prey populations during their invasion of Southeast Asia and Oceania. A positive feedback between their arboreal lifestyle and this potent, diapsid-specific toxin could explain why there are so many species of Boiga occupying highly arboreal niches throughout the Asian region. However, this would raise the question of why none of the other species have invaded other locations: perhaps Guam was unique in its ecology, in the opportunity Boiga was given to invade in the first place, in its lack of native snake species, or in some other factor. While it is possible that there is something else distinctive about B. irregularis that makes it more noxious than its close relatives, no aspect of its ecology is an obvious culprit. For instance, B. irregularis is known to be highly food-motivated and somewhat generalist in its feeding habits (Chiszar et al. 1993), which are traits often found in invasive species, but many other Boiga species have similarly broad diets (Stuebing et al. 1999). Thus, the invasion of Guam and surrounding islands by B. irregularis may simply have been a chance event based on which species was first transported to these islands, and other large species such as B. cynodon or B. dendrophila may have been just as invasive had either been given the opportunity. Irditoxin itself could still be the answer if the B. irregularis toxins happen to be more potent than their homologs in closely related species. However, neurotoxicity testing has shown that B. cynodon venom, for example, is of comparable toxicity (Lumsden et al. 2004). Whatever the reason, the fact remains that B. irregularis is a proven threat as an invasive species, while other Boiga species are not. The currently available evidence neither disproves nor supports the idea that this is due to differences in their biology rather than a contingent historical fact.