Introduction

Interest in the study of flaviviruses (Flaviviridae) has been mostly fueled by their impact on human and animal health. These enveloped viruses with ssRNA (+) genomes are usually vectored by hematophagous invertebrates, such as mosquitoes and ticks, which transmit them to vertebrates while taking a blood meal. For some flaviviruses, no invertebrate vector (No Known Vector, NKV viruses) has yet been identified [1].

Phylogenetic inference analyses tend to separate mosquito-borne from tick-borne flaviviruses, while most NKV viruses segregate in another independent monophyletic cluster in phylogenetic trees [1, 2]. A small number of particular NKV viruses fall within the mosquito-borne clade and this seems to reflect the secondary loss of transmission by an arthropod [1, 2]. However, representatives of a fourth genetic lineage of flaviviruses have been discovered and characterized in growing numbers. Even though they have been mainly found in mosquitoes, sequences related to these so-called insect-specific flaviviruses (ISFs) have also been found in phlebotomine sandflies [3, 4]. Although ISFs share a genetic organization, polyprotein hydropathy profiles, and cleavage sites with the “classical” flaviviruses, unlike the latter they do not seem to replicate in vertebrate cells, either in vitro [5–11] or by inoculation in the brains of suckling mice [12]. Furthermore, and in contrast to most flaviviruses, replication of ISFs has been shown to give rise to DNA forms of their genomic RNA [13, 14], some of which have been found integrated in mosquito genomes [14–16].

Whilst the first ISF to be discovered, designated cell fusing agent virus (CFAV), has been known for over three decades [17], genetically diverse ISFs have more recently been isolated from mosquitoes collected all over the world and, in some instances, at a high prevalence [6–9, 12, 13, 18–24]. Among others, three recent studies have reported ISF nucleotide sequences in mosquitoes collected in the Iberian Peninsula [11, 16, 25]. One of these works, carried out as a result of an entomological survey that amounted to the collection of over 36,000 adult mosquitoes in 2009–2010 [26], led to the isolation and genetic characterization of CTFV, an ISF from Culex theileri [11].

In this report, we describe a new ISF isolated from Ochlerotatus caspius (Pallas, 1771) (Diptera: Culicidae), henceforth designated O. caspius [27, 28], which is found at high densities in field collections using CO2-baited CDC traps, especially when carried out in the coastal/estuarine areas of southern Portugal [29, 30]. Viral strains related to this isolate have been found in the Iberian Peninsula and northern Europe, and the latter, designated Hanko virus (HANKV) [9], was characterized as being non-cytopathic to C6/36 cells. Curiously, despite obvious genetic relatedness between the ISF here described (OCFVPT) and HANKV, infection of C6/36 cells with OCFVPT was initially associated with extensive cytopathic effects (CPE). Notably, however, the latter are most probably due to the replication of a Negev-like virus [31] co-isolated with OCFVPT.

Materials and methods

Mosquito collection and homogenate preparation

Adult mosquito collections, all comprising unfed O. caspius (Pallas, 1771) (Diptera: Culicidae) females, were carried out using CDC light-traps baited with CO2, in the wetlands of the Algarve, Portugal’s southernmost province (district of Faro): (i) pools 174 (n = 30 mosquitoes) and 220 (n = 25 mosquitoes) resulted from collections in May and June 2009, respectively, close to a horse riding school (Mato Santo Espírito, close to the city of Tavira, 37°8′25.43″N, 7°37′47.75″W); (ii) pool 207 included 56 mosquitoes collected in June 2009 in the vicinity of marshlands (Beiradas, Odiáxere, 37°9′0.38″N, 8°38′42.24″W); (iii) pool 350 included 50 mosquitoes collected in May 2010 close to a seaside lagoon and an urban waste water treatment plant (Quinta das Salinas, near Almancil, 37°2′11.90″N, 8°1′59.86″W); and (iv) pool 595 consisted of 50 mosquitoes collected in August 2010 close to a marshland natural park (Quinta do Cavalo, Monte Francisco, Castro Marim, 37°14′12.32″N, 7°26′34.44″W). Mosquito identifications were initially carried out using keys by Ribeiro and Ramos [32], considering as recognized European taxa those listed by Ramsdale and Snow [33].

Mosquito homogenates were prepared by mechanical disruption of adult specimens using glass beads as previously described [34]. After clarification by centrifugation at 13,000×g (4 °C for 10 min), the macerates were sterilized through 0.22-μm disposable PVDF filters (Millex-GV, Millipore Corp., Bedford, USA), and kept at −80 °C until further use.

Cell culture and virus isolation

The Stegomyia albopicta C6/36 cell line was used for virus isolation. Cells were maintained at 28 °C (in the absence of CO2) in L-15 Leibovitz medium (Lonza, Walkersville, MD, USA) supplemented with 10 % heat-inactivated fetal bovine serum (FBS) (Lonza, Walkersville, MD, USA), 2 mM L-glutamine (Gibco BRL, Gaithersburg, MD, USA), 100 U/ml penicillin, 100 μg/ml streptomycin (Gibco BRL, Gaithersburg, MD, USA), and 1× tryptose phosphate broth (AppliChem GmbH, Darmstadt, Germany). Viral replication in vertebrate cells was tested using the Vero E6 cell line (ATCC CRL-1586) maintained at 37 °C with 5 % CO2 in Dulbecco’s modified Eagle medium (Lonza, Walkersville, MD, USA) supplemented with 10 % FBS.

Approximately 500 μl of filter-sterilized mosquito homogenate was diluted in an equal volume of phosphate buffered saline and inoculated onto semi-confluent layers of C6/36 cells grown in T25 culture flasks (Nunc, Roskilde, Denmark). The viral inoculum was removed after 1 h at room temperature (for viral adsorption) and 5 ml of L-15 Leibovitz medium (5 % FBS) was then added to each flask. The cell cultures were incubated at 28 °C for a week. Culture supernatants collected after the third blind passage were used as viral stocks and stored at −80 °C. CPE was assessed by light microscopy of the inoculated cell cultures.

Transmission electron microscopy (TEM)

C6/36 cell cultures were infected with 1 ml of viral stocks (only the 174 isolate was used). When CPE became evident (48 h postinfection), the cells were scraped from the culture flask and prepared for TEM examination. Briefly, infected cells were fixed sequentially in 3 % glutaraldehyde (in cacodylate buffer), osmium tetroxide (in the same buffer), and uranyl acetate (in bi-distilled water). Dehydration was carried out in increasing concentrations of ethanol. After passage through propylene oxide, the samples were embedded in Epon-Araldite, using SPI-Pon as an Epon 812 substitute. Thin sections were made with glass or diamond knives and stained with 2 % aqueous uranyl acetate and Reynolds’ lead citrate. The stained sections were examined and photographed in a JEOL 100-SX electron microscope.

Nucleotide sequence amplification and DNA sequencing

Viral RNA was extracted from 150 μl of culture supernatant using the ZR Viral RNA Kit™ (Zymo Research, Irvine, CA, USA) according to the manufacturer’s recommendations. Total RNA was also extracted from infected and non-infected C6/36 cells using the INSTANT Virus RNA kit (Analytik Jena AG, Jena, Germany). Reverse transcription of viral RNA was carried out with the RevertAid™ H Minus First Strand cDNA Synthesis kit and random hexamer primers (Fermentas, Thermo Fisher Scientific, Waltham, USA), using 5–11 μl of the RNA extract. The obtained cDNA served as template for the amplification of viral sequences using Phusion™ High-Fidelity DNA Polymerase (Finnzymes, Thermo Fisher Scientific, Waltham, USA), and the oligonucleotides listed in Supplementary Table 1. Negevirus-like sequences were amplified using first-round PCR primers NegOF/NegOR (5′-CAYGTRAARATYTTCTGCGAYATGTC-3′, and 5′-GAGTGACAGAMAACRGTYTCYTGMCCG-3′, respectively) and second-round primers NegIF/NegIR (5′-AGTGCTTCAACGTGACATTCCCCCGTCC-3′, and 5′-TAATCGTTTGTGCGGTARACATTGAGGC-3′, respectively). Detection of densovirus genomes was carried out using the generic primers DNV4F and DNV1U, as described [35].
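The first-round Negevirus primers contain IUPAC ambiguity codes (Y, R, M), so each oligonucleotide stands for a small pool of sequences. The sketch below is an illustration only, not part of the original protocol: it counts and enumerates the variants encoded by NegOF and NegOR. The function names `degeneracy` and `expand` are hypothetical helpers introduced here.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# First-round primer sequences as given in the text (5' to 3').
NEG_OF = "CAYGTRAARATYTTCTGCGAYATGTC"
NEG_OR = "GAGTGACAGAMAACRGTYTCYTGMCCG"


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def expand(primer: str) -> list[str]:
    """Enumerate every non-degenerate sequence encoded by the primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for name, seq in (("NegOF", NEG_OF), ("NegOR", NEG_OR)):
        print(f"{name}: {len(seq)} nt, {degeneracy(seq)} variants")
```

Each primer carries five two-fold ambiguity positions, so the pools stay small enough to enumerate directly.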

When necessary, 5′ and 3′ rapid amplification of cDNA ends (RACE) was carried out essentially as previously described [36, 37]. DNA amplicons were purified with the DNA Clean & Concentrator™-5 (Zymo Research, Irvine, CA, USA) and either sequenced directly or first cloned in CloneJET™ (Fermentas, Thermo Fisher Scientific, Waltham, USA), using Escherichia coli NovaBlue (Merck KGaA, Darmstadt, Germany) as host, and then sequenced.

Partial mitochondrial cytochrome c oxidase subunit I (COI) sequences were amplified from total DNA extracted from mosquito homogenates with the ZymoBead™ Genomic DNA kit (Zymo Research, Irvine, CA, USA), using Phusion™ High-Fidelity DNA Polymerase (Finnzymes, Thermo Fisher Scientific, Waltham, USA) and previously described primers and reaction conditions [13, 38]. The amplicons, obtained directly from mosquito pools, were purified as stated above and directly sequenced.

Nucleotide and amino acid sequence analyses

The DNA sequences of OCFVPT were assembled to generate a near full-length genomic sequence using the CAP Contig Manager tool available in BioEdit 7.0.2. [39]. Nucleotide and protein similarity searches were carried out through the NCBI web server using BLASTn and BLASTx (http://blast.ncbi.nlm.nih.gov/Blast.cgi).

Phylogenetic relationships were inferred from nucleotide sequences aligned (codon alignment was maintained) with MAFFT v. 6 [40], using the evolutionary model (GTR+I+Γ) indicated by jModeltest [41] under the Akaike information criterion. The list of viral reference sequences used, downloaded from public databases, can be found in Supplementary Table 2. Phylogenetic trees were constructed using MrBayes v3.0b4 [42]. The Bayesian analyses consisted of 20 × 10^6 generations starting from a random tree and four Markov chains with default heating values, sampled every 100th generation. The first 10 % of sampled trees were discarded (burn-in). To prevent reaching only apparent stationarity, two separate runs were conducted for each analysis. Phylogenetic relationships were also inferred from amino acid sequence alignments produced with the MUSCLE multiple alignment algorithm [43]. Both nucleotide and amino acid sequence alignments were treated with GBlocks [44] to remove highly variable regions of the alignment with dubious homology. Final trees were manipulated for display using FigTree v.1.2.2 (available at http://tree.bio.ed.ac.uk/software/figtree/).
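The sampling scheme described above implies a fixed number of trees per run. As a quick arithmetic check, the following sketch (a hypothetical helper, not a MrBayes command) derives how many trees are sampled, discarded as burn-in, and retained, using the values stated in the text. It ignores the initial tree at generation 0, which some programs also record.

```python
def mcmc_sample_counts(generations: int, sample_every: int, burnin_percent: int):
    """Return (sampled, discarded, kept) tree counts for a single run.

    Integer arithmetic is used throughout to avoid floating-point rounding.
    The starting tree at generation 0 is not counted (an assumption).
    """
    sampled = generations // sample_every
    discarded = sampled * burnin_percent // 100
    return sampled, discarded, sampled - discarded


if __name__ == "__main__":
    # Values taken from the text: 20 x 10^6 generations, every 100th, 10 % burn-in.
    sampled, discarded, kept = mcmc_sample_counts(20_000_000, 100, 10)
    print(f"sampled={sampled}, burn-in={discarded}, kept={kept}")
```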

For the detection of possible recombination events, near full-length genome sequences were aligned using MUSCLE and the resulting alignment was treated via GBlocks using the least stringent options available. A sliding window approach was then used to cut the alignment in fragments of 800 bp moving in steps of 100 bp. Separate phylogenetic trees, built for each of these segments, were constructed using the maximum-likelihood method as implemented in PhyML [45]. The HKY85 evolutionary model was used, with transition/transversion ratio, number of invariable sites, and across site rate variation estimated from the alignment. The maximum-likelihood tree search was conducted with the nearest neighbor interchange and subtree pruning and regrafting search algorithms. Finally, clustering reliability was tested with the LRT method implemented in PhyML.
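The sliding-window step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: the alignment is represented as a dictionary of equal-length strings, window coordinates are zero-based and half-open, and trailing columns that do not fill a complete 800-column window are dropped. The function names are hypothetical.

```python
from typing import Dict, Iterator, Tuple


def sliding_windows(length: int, window: int = 800, step: int = 100) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) column ranges for full-size windows only."""
    for start in range(0, length - window + 1, step):
        yield start, start + window


def slice_alignment(
    alignment: Dict[str, str], window: int = 800, step: int = 100
) -> Iterator[Tuple[int, int, Dict[str, str]]]:
    """Cut an alignment into overlapping sub-alignments, one per window."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    (length,) = lengths
    for start, end in sliding_windows(length, window, step):
        yield start, end, {name: seq[start:end] for name, seq in alignment.items()}


if __name__ == "__main__":
    # Toy alignment of 1,000 columns; real input would be the GBlocks-treated alignment.
    toy = {"virus_A": "A" * 1000, "virus_B": "C" * 1000}
    for start, end, sub in slice_alignment(toy):
        print(start, end, len(sub["virus_A"]))
```

Each sub-alignment would then be passed to the tree-building program separately, and the resulting topologies compared across windows.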

Mosquito taxonomic classification based on a molecular approach was carried out by BLAST searches and phylogenetic analysis of the obtained COI sequences using Barcode of Life Data Systems Identification engine (BOLD-IDS) available at http://www.boldsystems.org/views/login.php.

Protein identity percentages were calculated with BioEdit 7.0.2. from alignments obtained using MAFFT. Protein hydropathy plots were constructed with the Gene Runner 3.05 software (available for download at http://www.generunner.net/) using the Kyte–Doolittle hydropathy scale and a window of ten amino acid residues. Protein motifs were identified by running Pfam and Prosite protein profile (using the Motif Search tool at: http://www.genome.jp/tools/motif/) and conserved domain searches (using the Web CD-search tool, at http://www.ncbi.nlm.nih.gov/Structure/bwrpsb/bwrpsb.cgi). Transmembrane helices in protein sequences were predicted with SOSUI (http://bp.nuap.nagoya-u.ac.jp/sosui/). Potential kinase-specific eukaryotic protein phosphorylation sites were predicted with NetPhosK 1.0 (http://www.cbs.dtu.dk/services/). Putative sumoylation sites were predicted with the SUMOplot™ Analysis Program (http://www.abgent.com/tools/). Molecular weight and isoelectric point of proteins were predicted with the Compute pI/Mw tool (http://web.expasy.org/compute_pi/). Potential nuclear localization signals were tentatively identified with PredictProtein (available at http://www.predictprotein.org), NucPred (http://www.sbc.su.se), and NLSmapper (http://nls-mapper.iab.keio.ac.jp/cgi-bin/NLS_Mapper_form.cgi), as well as by visual inspection of protein sequences using the previously defined consensus [46] as a reference. Subcellular localization was predicted with PSORTII (http://www.psort.org/), while prediction of cellular role, enzyme class, and gene ontology category was carried out with ProtFun 2.2 (available at http://www.cbs.dtu.dk/services/ProtFun/). Protein secondary structure analyses were carried out using the consensus secondary structure prediction tool available at http://pbil.univ-lyon1.fr. Sequence-based protein disordered region prediction was carried out with PreDisorder (http://casp.rnet.missouri.edu/predisorder.html).
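A hydropathy plot of the kind described above is a moving average of per-residue Kyte–Doolittle values. The sketch below illustrates the calculation with a ten-residue window; it is not the Gene Runner implementation, and details such as how that program handles sequence ends are not reproduced here. The function name `hydropathy_profile` is hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 10) -> list[float]:
    """Mean hydropathy of every full window along the sequence.

    Returns an empty list if the sequence is shorter than the window.
    Raises KeyError on non-standard residue codes.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Short peptide used purely as an example input.
    peptide = "ETQILGTISLVLFVVSAVWCTH"
    for position, value in enumerate(hydropathy_profile(peptide), start=1):
        print(position, round(value, 2))
```

Windows with a strongly positive mean indicate hydrophobic stretches, which is how candidate membrane-spanning segments stand out in such plots.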

Nucleotide sequence accession numbers

The nucleotide sequence of the OCFVPT near full-length genome reported in this work has been deposited in the GenBank/EMBL/DDBJ databases under accession number HF548540. The five partial OCFVPT NS5 sequences have been assigned the accession numbers HE997070-HE997074. The partial ORF1–ORF2 Negevirus-like coding sequence described has been assigned accession number HF913429, while O. caspius COI sequences have been deposited under accession numbers HE997063-HE997065.

Results

Viral isolation and electron microscopy

Sequences encompassing a small region (≈ 200 nucleotides) of the flavivirus NS5-coding region (RNA-dependent RNA polymerase) were initially detected by nested RT-PCR, as previously described [11], using RNA purified from five macerates (laboratory code numbers 174, 207, 220, 350, and 595) of pools of female mosquitoes, identified morphologically as O. caspius. A molecular confirmation of three of these identifications (174, 207, and 350) was obtained by analysis of part of the region coding for the “barcoding” section of the mitochondrial COI gene (encoding the mitochondrial cytochrome c oxidase subunit I). Species identification was based on BOLD-IDS that confirmed the initial taxonomic assignments based on morphology by assigning the analyzed sequences as O. caspius (98.2 % probability) (Supplementary Data 1).

For viral isolation, filter-sterilized aliquots of one of the mosquito pool macerates (174 was selected at random) were used to inoculate semi-confluent monolayers of C6/36 cells, which were then regularly checked for CPE. After the third weekly blind passage (i.e., virus subculture to new cells regardless of the observation of CPE, in order to dilute out possible inhibitors of virus replication and/or allow virus titer to increase), and when compared with the negative controls (Fig. 1a), C6/36 infected cultures revealed evident CPE, characterized by cell growth retardation, rounding, and detachment from the solid surface. This could already be observed as soon as 24 h after viral infection (Fig. 1b), increasing in severity until 120 h after infection (Fig. 1c). No CPE was ever observed in inoculated Vero cell cultures (not shown).

Fig. 1

Microscopic observation of C6/36 cells: mock-infected cells (×200) (a), at 48 h (b) or 120 h (c) after infection with OCFVPT. d Transmission electron micrograph of a thin section of C6/36 cells infected (day 2 postinfection) with CTFV strain 153 (thin arrows indicate viral particles) or OCFVPT (e) showing nuclear hypertrophy. Nuclear enlargement is accompanied by a separation of the two leaflets of the nuclear envelope (indicated by the arrows in f), enlarging the intercisternal space of the nuclear envelope, which was found filled with vesicles (e, f) and what seem to be membrane trabeculae (g). Viral particles were seen in the intercisternal space of the nuclear envelope, from which some seem to bud (see top right panel in h), as well as in cytoplasmic vacuoles that seem to carry them to the cell surface for exocytosis (h). Cytoplasmic cisternae (most probably corresponding to endoplasmic reticulum) were seen to hold either single or multiple viral particles. Viral particles were also seen in close association with the cytoplasmic membrane (arrows in j), from which buds seem to form (dotted squares in h). Ves. vesicles, Chrom. chromatin

Electron microscopy analysis of thin sections of C6/36 cells at 48 h postinfection with a culture supernatant containing the flavivirus tentatively named OCFVPT (for O. caspius flavivirus from Portugal) revealed nuclear hypertrophy (compare Fig. 1d and e) and an evident enlargement of the intercisternal space of the nuclear envelope. This space was filled by vesicles of multiple sizes (Fig. 1e, f) and a complex network of tubular structures with an apparent diameter of 20–30 nm (Fig. 1g). Virions with a diameter of 40–50 nm and a dense core (Fig. 1h–j), surrounded by an apparent envelope (Fig. 1j), were observed inside cisternae of the endoplasmic reticulum, from which they seem to reach the cell surface by vesicular transport, compatible with their observation in cytoplasmic vesicles/vacuoles of different sizes (Fig. 1d, h, i) containing different numbers of viral particles. Structures compatible with viral capsids were also observed to bud, apparently from the outer leaflet of the nuclear envelope (Fig. 1e), and viral particles were also seen in the intercisternal space of the nuclear envelope (Fig. 1h). Buds growing directly from the plasma membrane, with which viral particles (some presenting a dense core) were occasionally associated, were also detected (Fig. 1j). As far as our analysis showed, and apart from nuclear hypertrophy, no other signs of apoptosis, including cytoplasmic vacuolization, chromatin condensation, cytoplasmic membrane blebbing, or the formation of apoptotic bodies, were observed at 48 h postinfection.

Nuclear hypertrophy and the proliferation of membrane-bound vesicles are frequently observed in densovirus-infected insect cells [47, 48]. Although no viruses morphologically distinct from the description above were observed in C6/36-infected cells, and especially not inside the nucleus, the presence of densoviruses in both the supernatant and the sediment of infected C6/36 cultures was also investigated, using previously described primers and reaction conditions (see “Materials and methods” section). No densovirus-specific sequences could be amplified from the supernatant of infected cultures. Nevertheless, a parvovirus that persistently infects the C6/36 cells used, designated AaPV (accession number X74945), was detected in the DNA extracted from the cell pellet. The presence of this virus was not related, in any way, to the replication of OCFVPT (data not shown) nor to the observation of any CPE in noninfected C6/36 cells.

Identification of a putative Negevirus

Although some of the observed CPEs are compatible with morphological changes caused in infected cells during flavivirus replication [49], the effects seemed notably excessive, considering that the genetically related virus HANKV was described as being noncytopathic [9]. Fortunately, the very recent description by Vasilakis et al. [31] of CPE similar to that observed in our work, associated with the replication of highly cytopathic putative “Negeviruses” (including Negev, Piura, Loreto, Dezidougou, Santana, and Ngewotan viruses), called our attention to the possibility that one such virus might have been co-isolated along with OCFVPT. To investigate this hypothesis, two sets of outer and inner primers complementary to the Negevirus-like sequences deposited in the databanks (see “Nucleotide sequence amplification and DNA sequencing” section) were used. These allowed the specific amplification by RT-PCR of a fragment encoding the C-terminus of ORF1 and the N-terminus of ORF2 of a Negev-like virus in the culture supernatants containing the OCFVPT strain. Despite clustering with high probability (p = 0.96) with putative Negev-like viruses, phylogenetic analysis of the obtained sequence (Fig. 2a) clearly showed it to be distinct from those from Anopheles coustani (strain EO239), C. quinquefasciatus (strain M33056), and C. coronator (strain M30957) [31]. The presence of Negev-like viruses was also confirmed in most (174, 207, 220, 350), but not all (595), of the other O. caspius pool macerates in which the OCFVPT NS5 gene region sequence had been detected (Fig. 2b; also see “OCFV replication in C6/36 infected cells” section). Since the putative Negeviruses seem to be highly cytopathic to C6/36 cells [31], we suggest that they may be responsible for the CPE described above. The genetic characterization of this newly described virus is presently underway and will be the subject of a future publication.

Fig. 2

a Bayesian phylogenetic analysis of partial ORF1–ORF2 nucleotide sequences of putative Negeviruses [31]. Posterior probability values ≥0.96 are indicated at specific branches. The sequences used are denoted by viral name, strain (in brackets) and accession number (subscript). The size bar indicates 30 % of genetic distance. b RT-PCR amplification of Negev-like viral sequences (≈ 1.1 kb) from RNA directly extracted from O. caspius macerates 174 (1), 207 (2), 220 (3), 350 (4), and 595 (5). Lane 6 corresponds to the negative control, while M denotes the NZYTech (Lisbon, Portugal) DNA Ladder VI

OCFV replication in C6/36 infected cells

PCR amplification was unsuccessful when RNA extracted from OCFVPT-infected C6/36 culture supernatants was used directly as template, without prior cDNA synthesis, while specific amplicons were produced when the RT-PCR progressed to completion (e.g., see Fig. 3a). This indicated that, as expected, the OCFVPT genome is an RNA molecule. On the other hand, no amplification of OCFVPT sequences was achieved using the AcFV11F/12R (partial NS1-NS2A fragment, ≈1.3 kb), AcFV13F/21R (NS2b-NS3 fragment, ≈1.4 kb), AcFV18F/15R (NS4a-NS4b fragment, ≈1.9 kb), and AcFV19F/20R (NS5 fragment, ≈800 nt) primer pairs with genomic DNA extracted either from Vero or C6/36 cultures infected with OCFVPT (compare Fig. 3a and b) or from any of the mosquito macerates used (not shown), suggesting that the virus has not integrated its genome into that of its host cells. Moreover, no amplification was obtained when these primers were used in combination with cDNA prepared from RNA extracted from the supernatants of Vero cells inoculated with OCFVPT. In combination with the absence of CPE, as mentioned before, these results suggest that OCFVPT does not replicate in the Vero cell line, as expected for an ISF. Nevertheless, detection of OCFVPT in either RNA or DNA extracts prepared from these cells was not attempted.

Fig. 3

a Detection of OCFVPT sequences by RT-PCR using the AcFV11F/12R (lane 1, NS2b-NS3 fragment), AcFV13F/21R (lane 2, partial NS1-NS2A fragment), AcFV18F/15R (lane 3, NS4a-NS4b fragment), and AcFV19F/20R (lane 4, NS5 fragment) primer pairs. Lanes 6–9 correspond to attempted amplifications of viral sequences from DNA extracted from C6/36 cells infected with OCFVPT, from which a COI-specific amplicon was amplified (lane 10). b Kinetics of OCFVPT RNA detection in C6/36 infected and mock-infected (−) cells. At different times (hours) after infection, total RNA was extracted from the culture supernatant (S, lanes 1–7) and cell sediment (C, lanes 9–12). Viral sequences were amplified with the AcFV11F/12R pair of primers. In lanes 9–12, asterisk indicates a DNA fragment originating from the retrotranscription and subsequent amplification of a cellular mRNA sequence (see text). The GeneRuler 1 kb Plus DNA ladder (Fermentas, Thermo Fisher Scientific, Waltham, USA) was used as a molecular weight marker (lanes 5 and 8 in a and b, respectively)

Under the experimental conditions used, viral replication in C6/36 cells was rapid, as OCFVPT RNA could already be detected in culture supernatants and cell sediments 24 h after infection (Fig. 3b). However, as indicated in Fig. 3b (asterisk), the primers used bound nonspecifically to RNAs constitutively expressed in C6/36 cells, which could be detected in cellular RNA extracts whether or not the cells had been exposed to an OCFVPT-containing supernatant. Sequence analysis of the obtained 900-nt amplicon indicated that it corresponded to a segment encoding part of a putative protein with considerable similarity to an endonuclease-RT from Bombyx mori (ADI61823) or an RT-like protein (XP_001866757) from C. quinquefasciatus (64 and 71 %, respectively).

Amplification of the near full-length genomes of OCFVPT

The analysis of a small NS5 sequence fragment (see above) revealed over 98 % identity (BLASTn) with several short (≈200 bp) sequences amplified from O. caspius (HQ441842-5 and GQ476991-4, EU716417-24), and over 83 % identity with the corresponding NS5 sequences of several putative flaviviruses isolated from different mosquito species (classified as Aedes, Ochlerotatus sp., or Culex sp.) via BLASTx.

Given the taxonomic position of the mosquitoes from which OCFVPT was isolated, and in the absence of any closely related viral sequence at the time this study was initiated, a multiple sequence alignment was constructed including the full-length genomes of three Aedes-associated viral sequences, downloaded from public databases. This enabled the design of several PCR primers complementary to sequences scattered across the whole of the viral genome (Supplementary Table 1), and allowed the amplification of the near full-length genome of the OCFVPT #174 viral strain. The amplification of approximately 1.3 kb of genomic sequence at the viral 5′-end was accomplished by RACE (see “Materials and methods” section). Finally, a small fragment corresponding to the 5′-end of the viral genome was obtained with primers designed only after the very recent publication of the genomic sequence of HANKV [9]. Unfortunately, not even primers designed taking into account the 3′-UTR sequence of HANKV enabled the amplification of the 3′-UTR of OCFVPT. The near full-length sequence of OCFVPT was found to be 89 % identical to that of HANKV and includes a single open reading frame (ORF).

Phylogenetic analyses of OCFVPT sequences

A full analysis of the phylogenetic relationships between OCFVPT and other flaviviruses was carried out using alignments of the predicted amino acid sequences for the ORF (Fig. 4) and E/NS3/NS5 (Supplementary Data 2). While inspection of ORF and NS5 (Fig. 4; Supplementary Data 2C) sequences placed the distinct OCFVPT/HANKV lineage as the first segregating from the ISF radiation, analysis of the E gene region tended to place them between Aedes-related viruses (KRV, Kamiti River virus; AeFV, Aedes flavivirus) and a larger group that included NAKV (Nakiwogo virus) and Culex-associated viral sequences, such as CTFV (C. theileri flavivirus), CxFV (Culex flavivirus), and QBV (Quang Binh virus) as well as CFAV (Supplementary Data 2A), as expected [50]. The NS3 gene tree again revealed a consistent clustering of OCFVPT and HANKV, but their precise placement in the ISF radiation was not statistically supported (Supplementary Data 2B).

Fig. 4

Bayesian phylogenetic analysis of flavivirus ORF amino acid sequences. Viral sequences are denoted by abbreviation and accession number. Posterior probability values ≥0.80 are indicated at specific branches. The list of viral sequences used, and corresponding abbreviated name, can be found in Supplementary Table 2. The size bar indicates 30 % of genetic distance. ISF insect-specific flaviviruses, MBV mosquito-borne flaviviruses, TBV tick-borne flaviviruses, NKV flaviviruses with no known vector. In the MBV clade, the two viral sequences indicated with “asterisk” (NC_005039 and NC_008718) indicate viruses found only in bats, suggesting that they have lost vector transmission secondarily [2]

A sliding window approach (see “Materials and methods” section) was used to investigate the potential evidence of recombination in the OCFVPT genome, but this was not found, with or without HANKV in the multiple alignments used. In any case, in all regions of the genome, OCFVPT and HANKV cluster together (not shown).

In view of the relatively limited number of (near) full-length ISF genomic sequences available to date, we decided to extend the representation of our phylogenetic analyses by comparing the partial NS5 sequences previously described [16] with nucleotide sequences obtained from each of the 174, 207, 220, 350, and 595 mosquito pools, in which OCFVPT viral sequences had initially been detected (Fig. 5). With this larger assemblage of ISF sequences, OCFVPT again segregated away from most other known ISFs in a monophyletic cluster that also includes the Spanish O. caspius flavivirus (SOcFV) reported by Vázquez et al. [16], and in which HANKV falls as the basal lineage. Curiously, and despite their low genetic diversity, the NS5 viral sequences amplified from O. caspius mosquitoes collected in the Iberian Peninsula were not absolutely homogeneous, and were found distributed in two distinct clusters supported by high (p > 0.99) posterior probability. This topology is somewhat different from that displayed by the NS5 sequences mostly associated with C. theileri mosquitoes [11, 16], which formed an internally unstructured CTFV/SCxFV cluster (Fig. 4). In any case, these results show that different viral strains with relatively similar NS5 sequences are widely distributed in Portugal and Spain.

Fig. 5

Bayesian phylogenetic analysis of partial flavivirus NS5 nucleotide sequences. Posterior probability values ≥0.80 are indicated at specific branches. The list of sequences used, denoted by viral abbreviated name, can be found in Supplementary Table 2. The sequences solely indicated by their accession numbers correspond to the SCxFV (Spanish Culex flavivirus) and SOcFV (Spanish Ochlerotatus caspius flavivirus) clusters, previously described by Vázquez et al. [16]. In the SCxFV cluster, CTFV indicates the two viral sequences from ISFs isolated from C. theileri, described by Parreira et al. [11]. In the SOcFV group, the NS5 sequences amplified from the O. caspius pools analysed (174, 207, 220, 350, and 595) are indicated in boldface. The labels DNA forms (group 1) and DNA forms (group 2) indicate sequences that were directly obtained from the amplification of mosquito DNA (described by Vázquez et al. [16]). The size bar indicates 30 % of genetic distance

Analysis of OCFVPT encoded proteins

The genome of OCFVPT encodes a single polyprotein of at least 3,279 amino acid residues. It displays a hydropathy profile, structural organization (C/anchored C, prM/M, E, NS1, NS2a, NS2b, NS3, NS4a, NS4b, NS5), and functional domains expected for a flavivirus ORF. These include flavi_glycoprotein superfamily (PSSMID 201480) and flavi_NS1 superfamily (PSSMID 189781) domains; peptidase_S7 superfamily (PSSMID 201522), DEXDc/DEAD-like RNA helicase (PSSMID 28927), and HELICc helicase superfamily C-terminal (PSSMID 197757) domains in NS3; and FtsJ-like/AdoMet methyltransferase (PSSMID 201939) and flavi_NS5 superfamily (PSSMID 110005) domains in NS5. The viral serine protease seems to be involved in the cleavage of C/anchored C, NS2a/NS2b, NS2b/NS3, NS3/NS4a, and NS4b/NS5, while a furin-like protease seems to be involved in the processing of the anchored C/prM, M/E, E/NS1, and possibly NS4a/NS4b protein junctions. Several putative (p > 0.80) serine, threonine, and tyrosine phosphorylation sites were predicted to be scattered along the viral ORF. Computer-assisted searches for nuclear localization signals in viral proteins were negative, but visual inspection of the C, NS1, NS3, and NS5 sequences disclosed several protein sections particularly rich in basic amino acid residues (R and K, in particular). Finally, several highly probable SUMO (small ubiquitin-related modifier) target sequences were predicted in the envelope, NS1, NS3, and NS5 coding regions of the viral ORF, with the most likely ones located in the first two regions.

Pairwise sequence identity analysis of the OCFVPT ORF and some of its encoded proteins (E, NS3, and NS5) against those of other flaviviruses revealed values that fell to a minimum of 35 % in E (AEFV), but ranged from 53.1 % (CFAV) to 97.8 % (HANKV) in NS5, the most conserved of the viral proteins compared. Interestingly, while the NS3 and NS5 proteins of HANKV, CFAV, AEFV, and KRV were more similar to those encoded by OCFVPT (Table 1) than their respective E proteins, OCFVPT E was more similar to those encoded by CxFV and CTFV than their respective NS3 amino acid sequences. The average diversity for the aligned NS5 amino acid sequences of OCFVPT and SOcFV was 2.9 %, while inclusion of HANKV in this cluster raised genetic diversity to 3.7 %. Within the two smaller subgroups that comprise the OCFVPT/SOcFV radiation (G1 and G2 in Fig. 5), mean genetic diversity fell to 0.5 % (G1) or 1.7 % (G2), suggesting that this cluster includes two closely related strains of the same virus, encoding two types of very similar NS5 proteins that clearly differ from the one encoded by HANKV.
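For readers unfamiliar with these measures, the following minimal sketch shows one common way to compute percent identity between two aligned sequences, and a mean pairwise diversity for a group of aligned sequences. It assumes that gapped columns are ignored and that diversity is the uncorrected p-distance (100 minus identity); the text does not state which conventions or software settings produced the reported values, so both are assumptions. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity over aligned columns in which neither sequence
    has a gap ('-'). Both inputs must already be aligned."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are skipped
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def mean_diversity(aligned):
    """Mean uncorrected p-distance (in %) over all sequence pairs."""
    pairs = list(combinations(aligned, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    return sum(100.0 - percent_identity(a, b) for a, b in pairs) / len(pairs)


# Toy alignment (invented; not real NS5 data).
toy_alignment = ["AAAA", "AAAT", "AATT"]
print(percent_identity(toy_alignment[0], toy_alignment[1]))
print(mean_diversity(toy_alignment))
```

Model-corrected distances would give somewhat different numbers, particularly for divergent pairs such as the E protein comparisons.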

Table 1 Comparison of the putative OCFVPT amino acid sequences with those of other ISF

In common with other characterized ISFs, OCFVPT also seems to encode a FIFO protein [51], if a −1 frameshift occurs during the translation of the second codon of the N-terminus of NS2A. This frameshift event is expected to take place at a slippery heptanucleotide (Fig. 6a) that conforms to the GGAUUUY consensus [51] and is located within a region of the viral genome that tends to fold into a long stem-loop structure (Fig. 6b). Similar to all other predicted FIFO proteins, the one encoded by OCFVPT presents a 22 amino acid transmembrane domain (TMD; 41ETQILGTISLVLFVVSAVWCTH63), with the remainder of the protein expected to locate mostly in the cytoplasm, and has no N-glycosylation sites. It was predicted to correspond to an acidic polypeptide (pI = 5.7) of approximately 29 kDa, 92.2 % identical to that encoded by HANKV. The scores with the highest information content in the ProtFun analysis suggested that OCFVPT-FIFO may be an enzyme (as opposed to a non-enzyme) and/or that it may be involved in translation [52]. The C-terminus of OCFVPT-FIFO was predicted to display regions of random coil and extended strand structure, while the central part of the protein seems to display a helical structure. Apart from the helical TMD, most of the N-terminus should be in a random coil state. Therefore, OCFVPT-FIFO was predicted to hold intrinsically disordered N- and C-termini (Fig. 6c).
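Locating candidate slippery sites of this kind amounts to a simple motif search. The sketch below scans a nucleotide sequence for the GGAUUUY consensus, where Y stands for a pyrimidine (C or U). The function name and the toy sequence are hypothetical, and a motif match alone does not show that frameshifting occurs; the reading frame and the downstream RNA structure would still need to be examined.

```python
import re

# GGAUUUY consensus: Y is a pyrimidine, i.e. C or U.
SLIPPERY_MOTIF = re.compile(r"GGAUUU[CU]")


def find_slippery_sites(sequence):
    """Return 0-based start positions of GGAUUUY matches.
    DNA input is accepted and converted to RNA (T -> U)."""
    rna = sequence.upper().replace("T", "U")
    return [match.start() for match in SLIPPERY_MOTIF.finditer(rna)]


# Toy sequence (invented; not the OCFVPT genome).
toy_rna = "AUGCCGGAUUUCAAGGAUUUUGG"
print(find_slippery_sites(toy_rna))
```

Reporting the position relative to the start of the ORF (position modulo 3) would be a natural extension for checking that a match lies in the expected frame.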

Fig. 6

a While translating the OCFVPT ORF, in the region encoding the NS2A protein, ribosomes may slip out-of-frame at a conserved heptanucleotide consensus (boldface). As a result of this −1 frameshift, some of the translating ribosomes may continue protein synthesis, giving rise to a putative product designated FIFO. b Predicted secondary structure of the viral RNA in the region where the −1 frameshift (−1FS) is supposed to occur. c PreDisorder graphic showing the probability of disorder for each residue in OCFVPT-FIFO. A cut-off value of 0.5, representing the threshold probability of disorder for a residue, is indicated by the dotted line

Discussion

The genus Flavivirus is, to date, the most diverse genus of the Flaviviridae family. The majority of its members are known human and animal pathogens, and for that reason, some of them have been extensively studied. Although some flaviviruses seem to be exclusively found in vertebrates, such as bats and rodents [53], most of them are transmitted to vertebrates by hematophagous invertebrate vectors (primarily mosquitoes and ticks) during a blood meal.

Phylogenetic analyses of the genus tend to place flaviviruses in three major lineages, associating them with their vectors and hosts [1, 50]. However, especially over the last decade [50], an increasing number of studies have reported the identification of a diverse group of flaviviruses, commonly found in insects and apparently unable to replicate in vertebrate cells [5–11], justifying their designation as ISFs [18, 20, 50]. Although their virion structural features, genome type and structure, and the putative biochemical properties of their encoded proteins seem to place ISFs among classical flaviviruses, they do form a distinct monophyletic group in phylogenetic trees, and have previously been suggested to represent an ancestral lineage of the genus [1]. However, the finding of sequences related to ISFs integrated in the genome of mosquitoes [14–16], which has no parallel among classical flaviviruses, was unexpected and complicates the analysis of their evolutionary history. Furthermore, unlike many flaviviruses, their genome also seems to code for a protein known as FIFO, encoded out-of-frame relative to all the other structural and nonstructural viral proteins, the translation of which seems to depend on the occurrence of a frameshift event that displaces the ribosomes from the 0 frame to the −1 frame [51].

The viruses identified as ISFs have been described from all over the world [6–9, 11–13, 18–24]. Not surprisingly, they have also been recently described in mosquitoes collected in the Iberian Peninsula [11, 16, 25]. In Portugal, in recent years, mosquito collections made with CO2-baited traps, especially those conducted in southern estuarine and coastal areas, have been particularly abundant in C. theileri and O. caspius adult specimens [29, 30]. In a previous report, we characterized an ISF isolated from C. theileri [11]. In order to further describe the genetic diversity of ISFs in Portugal, in the present work we describe the isolation of another such virus, this time from pools of mosquitoes classified as O. caspius.

The virus under study, tentatively named OCFVPT (from O. caspius flavivirus from Portugal), does not seem to replicate in vertebrate cells, but replicates rapidly in the S. albopicta C6/36 cell line. The genomic sequence of OCFVPT strain 174 was obtained to near-full completion and characterized genetically. The viral genome comprised an RNA molecule encoding a single ORF with a hydropathy profile, putative protease cleavage sites, and conserved protein domains similar to those of other flaviviruses. Multiple putative phosphorylation and sumoylation sites were predicted in the OCFVPT ORF. Although the role played by these possible post-translational modifications is only speculative, proteins modified by SUMO have been shown to display altered sub-cellular localization, activity or stability, and sumoylation has been shown to control many aspects of cell physiology, such as cell cycle regulation, transcription, nucleocytoplasmic transport, DNA replication and repair, chromosome dynamics, apoptosis and ribosome biogenesis [54]. Its implications in the ISF replication cycle remain to be established.

An additional aspect of ISF replication that remains uncharacterized is the possible role played by FIFO in the viral replication cycle. In the case of OCFVPT, FIFO was characterized as an acidic protein of approximately 29 kDa that could be encoded as a result of a −1 ribosomal frameshift, predicted to occur at a slippery heptanucleotide included in a stem-loop secondary structure. Although in vitro frameshifting assays have previously allowed the detection of a product compatible with FIFO, this protein was not consistently observed in CxFV-infected cells [55]. Our analysis also suggested that OCFVPT-FIFO has intrinsically disordered termini. Whether these facilitate its interaction with other partners or contribute to regulating its expression, as suggested [55], is not yet known and will be investigated in the near future.

Phylogenetic analyses based on amino acid sequence alignments unambiguously placed OCFVPT in the ISF radiation, where it formed a monophyletic cluster that included HANKV, an ISF recently isolated from O. caspius mosquitoes collected in Finland [9]. Genetic similarities between the two viruses were evident, with the two viral sequences forming a divergent basal lineage amongst ISFs, as revealed by phylogenetic analysis of ORF and NS5 sequences. Although the virus–mosquito co-divergence hypothesis has been recently questioned [50], analysis of the E, NS3, and NS5 region sequences did show an apparent segregation of viral sequences into distinct host-associated lineages, suggesting that this issue deserves further clarification.

In contrast to what has been reported for other ISFs [14–16], we found no evidence for the integration of OCFVPT nucleotide sequences in either the genome of the mosquitoes from which the virus was isolated or that of the C6/36 cells in which it replicated. However, our analysis was limited to the attempted amplification of partial NS1-NS2A, NS2b-NS3, NS4a-NS4b, and NS5 sequences, and therefore does not formally exclude the possibility that other sections of the viral genome might be found in the host's DNA.

The geographical and climatic features of the Iberian Peninsula are such that OCFVPT is not expected to be restricted to the Portuguese territory. In fact, viral strains with phylogenetically similar NS5 sequences have been previously identified [16] from mosquitoes collected in Spain. Together with SOcFV (Spanish O. caspius flaviviruses), OCFVPT forms a monophyletic cluster that represents a sister group to the HANKV lineage, with an apparent separation into two genetic sub-lineages (G1 and G2 in Fig. 5). All these sequences (OCFVPT, SOcFV, and HANKV) are evidently related and seem to represent different strains of the same virus, forming a distinct lineage that evolved within the same host (O. caspius) and adding to the high genetic diversity of ISFs identified in mosquitoes collected around the world.

Although part of the cellular morphological changes apparently associated with the replication of OCFVPT are akin to those accompanying the replication of flaviviruses [49], the profuse CPE observed when C6/36 cells were exposed to culture supernatants containing OCFVPT was, nevertheless, uncommon. Furthermore, the high genetic relatedness between OCFVPT and HANKV, a virus described as non-cytopathic [9], seemed to contradict the possibility that the observed CPE were exclusively due to the replication of OCFVPT. The CPE included nuclear hyperplasia, with a clear separation of the two leaflets of the nuclear envelope and the consequent enlargement of its intracisternal space, which was locally filled with numerous, heterogeneously sized vesicles, as well as tubular structures. While such CPE had never before been associated with the replication of flaviviruses, viral particles with a morphology and size compatible with flaviviruses (40–50 nm) were seen mostly in cytoplasmic cisternae and vacuoles/vesicles, where they seemed to be in transit to the cell surface. In some infected cells, viral particles were also seen in close association with the cytoplasmic membrane, some displaying a dense core. Although a continuum between a putative viral envelope and the cellular membrane could not be shown unequivocally, the identification of what seemed to be gemulae growing from the membrane plane suggested that the cytoplasmic membrane had been directly engaged for viral assembly. Though this is infrequent amongst flaviviruses in general, it is not altogether unexpected and has been described previously [56, 57].

Fortunately, the very recent description of a group of viruses that are suggested to form a putative new viral taxon, tentatively designated Negevirus [31], may help to resolve the apparent contradiction between the probable non-cytopathic nature of OCFVPT (as far as its phylogenetic similarity with HANKV seems to suggest) and the extensive CPE observed in C6/36 cells. These Negeviruses give rise to round enveloped virions, 50 nm in diameter, with polyadenylated ssRNA genomes. They are highly cytopathic to insect cells in culture, producing CPE similar to those described in this work [31]. Unexpectedly, RT-PCR analysis of RNA extracts containing OCFVPT genomes revealed the presence of a Negev-like viral sequence, as demonstrated by phylogenetic analysis. Despite its distinctiveness, this sequence clustered with high probability with Negeviruses formerly identified in mosquitoes collected in Israel (A. coustani) and the USA (C. quinquefasciatus and C. coronator). We believe that most of the CPE initially associated with OCFVPT infection may, in fact, be attributable to the replication of this potential new virus. Nevertheless, future analyses of insect cells infected with each one of these viruses will be needed to reveal their true impact on cellular metabolism and physiology. Described here for the first time in association with O. caspius mosquitoes collected in Europe, this sequence extends the already large host and geographic ranges of Negev-like viruses. The characterization of this virus and of how it interacts with OCFVPT in co-infected cells is ongoing, and will be reported in a future publication.