Introduction

Scuticociliatosis, a parasitic disease of fish and crustaceans, is caused by invasive scuticociliates, which are free-living marine protozoa belonging to the subclass Scuticociliatida of the phylum Ciliophora (Gao et al. 2016). In the last decade, scuticociliate infection has become a prominent parasitological problem in mariculture worldwide. There have been many reports of severe disease outbreaks caused by several scuticociliate species, including Uronema marinum, Philasterides dicentrarchi, Pseudocohnilembus persalinus, Anophryoides haemophila, and Miamiensis avidus, which have led to serious economic losses (Ragan et al. 1996; Kim et al. 2004a, b; Harikrishnan et al. 2012; Stidworthy et al. 2014; Iglesias et al. 2018; Li et al. 2018). Scuticociliates are characterized by their high potential for systemic invasion, infecting not only the surface of the body but also internal organs, including the brain, kidney, spleen, and spinal cord, leading to high host mortality rates (Puig et al. 2007; Harikrishnan et al. 2010). The major clinicopathological manifestations of scuticociliatosis in infected fish are dark coloration, loss of scales, excessive body mucus, hemorrhagic and bleached spots on the skin, and severe dermal necrotic lesions that ultimately destroy tissues, leading to mortality (Jung et al. 2007; Jin et al. 2009; Moustafa et al. 2010).

Previous reports of scuticociliate infection, including pathological and chemotherapeutic studies, have focused on the prevention and control of scuticociliatosis, species identification for taxonomy, investigation of diversity, and diagnosis of scuticociliatosis. Little attention has been paid to the molecular mechanisms of pathogenicity, mainly due to a lack of basic research in areas such as the life cycle, genetics, and genome of scuticociliates. Although morphological analysis of scuticociliates by methods such as scanning electron microscopy, protein silver staining, and silver impregnation is a useful approach for identification, it is time-consuming and laborious and is therefore not suitable for identifying closely related species or for guiding scuticociliatosis treatment (Jung et al. 2011; Huang et al. 2021).

Molecular identification of scuticociliate species, such as Pseudocohnilembus persalinus (Jones et al. 2010), Philasterides dicentrarchi (de Felipe et al. 2017), Uronema marinum (Smith et al. 2009), and Miamiensis avidus (Jung et al. 2005), has been performed using mitochondrial cytochrome c oxidase subunit I (COI) sequences, mitochondrial small subunit ribosomal DNA (mtSSU-rDNA) sequences, and nuclear small subunit ribosomal DNA (nSSU-rDNA) sequences (Whang et al. 2013; Zhang et al. 2019), and through phylogenetic analyses based on SSU rDNA or ITS1-5.8S-ITS2 region sequences of related parasite species. However, these methods are limited in their ability to distinguish closely related species (Jung et al. 2011). In addition, due to the presence of different serotypes of scuticociliate species (Piazzón et al. 2008; Song et al. 2009), molecular detection in hosts with low parasite numbers and vaccine-mediated control are difficult (Jung et al. 2011). Mitochondrial (mt) genomes are increasingly used to study the evolution and molecular epidemiology of scuticociliates such as Pseudocohnilembus persalinus isolated from turbot (Scophthalmus maximus L.) (Gao et al. 2018) and Uronema marinum isolated from Takifugu rubripes (Li et al. 2018). The mt-genome of Miamiensis avidus, which causes flatfish scuticociliatosis, remains largely uninvestigated.

Here, we sequenced and annotated the first complete mt-genome of Miamiensis avidus, in order to determine the nucleotide and amino acid similarities between the mt-genomes of Miamiensis avidus and other related ciliates.

Materials and methods

Ciliate isolation and cultivation

The M. avidus strain was obtained from the Pathology Research Division of the National Institute of Fisheries Science. The ciliates were isolated from the ascitic fluid of olive flounder at a local farm (Pohang-si, Gyeongsangbuk-do, 2008) and identified as M. avidus using species-specific oligonucleotide primers (Seo et al. 2013). In a recent publication, we isolated 32 clones harboring peptidase gene sequences from 1,265 EST clones of an M. avidus cDNA library; these were related to M. avidus infection, as determined by comparing expression levels between cell-fed and starved ciliates. Chinook salmon epithelia-214 cells, incubated at 20 °C in Eagle’s minimum essential medium (Sigma) supplemented with 10% heat-inactivated fetal bovine serum, were used to grow the ciliates. The ciliates were then harvested by centrifugation at 200 × g for 5 min and washed three times by centrifugation at 150 × g for 5 min in 10 ml of phosphate-buffered saline (Sigma). The washed ciliates were counted using a hemocytometer.

DNA isolation, library preparation, and sequencing

The M. avidus ciliates were harvested by centrifugation at 200 × g for 5 min and washed more than three times by centrifugation at 150 × g for 5 min in Hanks’ balanced salt solution (Sigma) or filtered seawater. The genomic DNA of M. avidus was extracted using a HiGene™ Genomic DNA Prep Kit (Biofact, Daejeon, Korea). Using a Covaris G-tube, fragments of 20 kb were generated by shearing the genomic DNA according to the manufacturer’s protocol. The AMPure XP bead purification system was used to remove small fragments. A total of 5 µg of each sample was used as the input for preparing the library for PacBio sequencing (Pacific Biosciences). The SMRTbell library was constructed using a SMRTbell™ Template Prep Kit 1.0 (PN 100-259-100). Using the BluePippin size-selection system, small fragments were removed from the large-insert library. After a sequencing primer was annealed to the SMRTbell template, DNA polymerase was bound to the complex (Sequel Binding Kit 2.0). After polymerase binding, the complex was purified using SMRTbell clean-up columns (SMRTbell® Clean Up Columns v2 Kit-Mag: PN 01-303-600) to remove unbound polymerases and polymerase molecules bound to small DNA inserts. A MagBead Kit was used to bind the library complex to MagBeads before sequencing, since MagBead-bound complexes provide more reads per single-molecule real-time (SMRT) cell. This polymerase-SMRTbell-adaptor complex was then loaded into zero-mode waveguides. The SMRTbell library was sequenced on three SMRT cells (Pacific Biosciences, Sequel™ SMRT® Cell 1M v2) with a sequencing kit (Sequel Sequencing Kit 2.1), and a 1 × 600 min movie was captured for each SMRT cell using the PacBio sequencing platform (Pacific Biosciences).

Genomic DNA (gDNA) (200 ng) for MiSeq sequencing was sheared using an S220 Ultra sonicator (Covaris) (peak incident power 175 W, duty factor 5%, 200 cycles per burst, treatment time 35 s). Library preparation was performed using an Illumina TruSeq Nano DNA sample prep kit (Illumina) according to the manufacturer’s instructions. Briefly, after clean-up of the fragmented gDNA using sample purification beads, the fragmented gDNA was end-repaired at 30 °C for 30 min, followed by size selection for a 500 bp insert size using sample purification beads. A single ‘A’ nucleotide was added to the 3’ ends of the blunt fragments using the A-tailing mix reagent by incubation at 37 °C for 30 min followed by incubation at 70 °C for 5 min. Indexing adapters were ligated to the ends of the DNA fragments using ligation mix 2 reagents at 30 °C for 10 min. After washing twice with sample purification beads, PCR was performed to enrich the DNA fragments with adapter molecules on both ends. The thermocycler conditions were as follows: 95 °C for 3 min, 8 cycles of 98 °C for 20 s, 60 °C for 15 s, and 72 °C for 30 min, with a final extension at 72 °C for 5 min. Finally, the quality and the band size of the libraries were assessed using an Agilent 2100 Bioanalyzer (Agilent). The libraries were quantified by qPCR using a CFX96 Real-Time System (Bio-Rad). After normalization, the prepared library was sequenced on the MiSeq system (Illumina) with 300 bp paired-end reads.

Illumina MiSeq sequencing, PacBio SMRT sequencing, and hybrid assembly

A total of 16.1 Gb of paired-end fastq files were produced using Illumina MiSeq paired-end sequencing, and the raw sequence data were deposited in the Sequence Read Archive (SRA) database (https://www.ncbi.nlm.nih.gov/sra/) under accession no. PRJNA763762. Using these data, the genome size of M. avidus was estimated from the k-mer distribution using the KmerGenie program (Chikhi and Medvedev 2014). Pre-assembled short-read next-generation sequencing contigs were used to correct and derive a compact representation of the 7.8 Gb of SMRT (Pacific Biosciences) long reads using the DBG2OLC hybrid assembler (Ye et al. 2016). The DBG2OLC assembler was used to assemble the raw PacBio SMRT sequence data, with the Illumina MiSeq contig assembly used as an anchor for error correction. The overlap and consensus steps were executed with the following parameters: k-mer value: 19, adaptive k-mer matching threshold: 0.0001, fixed k-mer matching threshold: 2, minimum overlap score between a pair of long reads: 8, removal of chimeric reads: allow. The quality of the resulting assembly was assessed using a local Perl script (https://github.com/aubombarely/GenoToolBox/blob/master/SeqTools/FastaSeqStats). The assembled contigs were scanned for M. avidus mitochondrial sequence using a local Basic Local Alignment Search Tool (BLAST) search (Altschul et al. 1997) with the Tetrahymena thermophila mtDNA sequence (NC_003029) as the query. The complete M. avidus mitochondrial sequence with terminal repeat sequences at both ends was selected.
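The assembly quality assessment mentioned above typically reports summary statistics such as the number of contigs, total length, and N50. The following is a minimal Python sketch of that calculation, not the FastaSeqStats script itself; the function name `assembly_stats` and the example contig lengths are hypothetical and are used only for illustration.

```python
def assembly_stats(contig_lengths):
    """Summarize an assembly from a list of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    n50 = 0
    for length in lengths:
        running += length
        if running * 2 >= total:  # reached half of the assembly
            n50 = length
            break
    return {
        "contigs": len(lengths),
        "total_bp": total,
        "longest": lengths[0] if lengths else 0,
        "n50": n50,
    }


if __name__ == "__main__":
    # Hypothetical contig lengths, for illustration only.
    print(assembly_stats([100, 80, 60, 40, 20]))
```

In practice the contig lengths would be read from the assembled FASTA file; the statistic itself does not depend on how the lengths are obtained.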

Annotation of the M. avidus mtDNA genome and phylogenetic analysis

Open reading frame (ORF) finding and gene prediction were performed using BLAST (BLASTn and BLASTx), GeneMarkS (Besemer et al. 2001), and Geneious (ver. R11). The large and small subunits of ribosomal RNA (rRNA) were identified using BLASTn with published ciliate rRNAs as queries, and the transfer RNAs (tRNAs) were identified using the tRNAscan-SE search server (Lowe and Eddy 1997) (http://lowelab.ucsc.edu/tRNAscan-SE/). All protein-coding, rRNA, and tRNA genes, and genes in the novel mt-genomes, were confirmed by multiple sequence alignment with published ciliate data using the T-Coffee package (Di Tommaso et al. 2011) (http://www.tcoffee.org). A graphical linear map of the complete mt-genome was produced using Circos v0.67 (Krzywinski et al. 2009). The whole mt-genome of M. avidus was aligned and compared with those of other related ciliates (Table 1) using the MAUVE program (Mauve version 20150226 build 10) (Darling et al. 2010). To compare the M. avidus mt-genome with other ciliate mt-genomes, the respective amino acid sequences of the protein-encoding gene rpl14 were aligned using ClustalX, and neighbor-joining (NJ) analysis was conducted using MEGA 7.0 with 1000 bootstrap replicates (Kumar et al. 2016). In addition, the distance tree of the mitochondrial cytochrome c oxidase subunit 1 (cox1) genes was constructed using MEGA 7.0 with 1000 bootstrap replicates. The taxa used in the cox1 alignments are listed in Table 10. The additional sequences used for comparison and their GenBank accession numbers were as follows: Philasterides dicentrarchi cox1_a (accession number: MN531306.1), P. dicentrarchi cox1_b (GQ342957.1), Pseudocohnilembus persalinus cox1 from freshwater-reared rainbow trout, Oncorhynchus mykiss (GU584095), P. persalinus cox1 from turbot (Scophthalmus maximus) (MH608212.2), P. persalinus cox1 (GQ500579), Uronema marinum isolate ZZF20170302 cox1 from turbot (MG001901.1), and Uronema heteromarinum isolate FXP08082901 cox1 (MG001901.1).
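As a rough illustration of what ORF finding involves, the sketch below scans the three forward reading frames of a sequence for stop-free stretches ending in a stop codon. It is not the BLAST, GeneMarkS, or Geneious procedure used here. It assumes that only TAA and TAG act as stop codons, which matches the termination codons reported for M. avidus in this study; the function name `find_orfs`, the `min_codons` threshold, and the test sequence are hypothetical.

```python
# Assumption: TAA and TAG are the only stop codons, as reported for the
# M. avidus protein-coding genes in this study.
STOP_CODONS = {"TAA", "TAG"}


def find_orfs(seq, min_codons=50):
    """Return (start, end, frame) tuples for stop-terminated open frames.

    Coordinates are 0-based and end-exclusive, and the stop codon is
    included in the interval. Only the forward strand is scanned, and a
    trailing stretch that lacks a stop codon is ignored.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = frame
        for pos in range(frame, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                # Number of sense codons between the previous stop and this one.
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = pos + 3
    return orfs


if __name__ == "__main__":
    # Hypothetical toy sequence: ATG, five AAA codons, then a TAA stop.
    toy = "ATG" + "AAA" * 5 + "TAA"
    print(find_orfs(toy, min_codons=3))
```

A real annotation would also scan the reverse strand, consider the alternative start codons, and confirm candidates by homology, which is why dedicated tools are used.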

Table 1 Mitochondrial genome sequences of ciliates sequenced completely prior to the present study and used for sequence analyses
Table 2 Mitochondrial genome organization of M. avidus
Table 3 Nucleotide composition of mitochondrial genome of M. avidus

Results

Complete mt-genome organization and composition

The mt-genome of M. avidus was observed to be linear, with terminal repeat sequences (TRS; 30 bp) located at both ends, and 38,695 bp in length with 47 genes, including 40 protein-coding genes (29 coding sequences, four large subunit ribosomal proteins [rpl], seven small subunit ribosomal proteins [rps]), two ribosomal RNA (rRNA) genes, and five transfer RNA (tRNA) genes (Fig. 1). The mt-genome organization of M. avidus, including gene order, gene length, alternative start codons, and intergenic spacer regions, is given in Table 2. The nucleotide composition of the entire M. avidus mtDNA sequence was found to be 40.8% thymine, 39.1% adenine, 10.4% guanine, and 9.7% cytosine, giving a high AT content of 79.87% (Table 3). Clusters of orthologous protein-coding genes (COGs) located in the M. avidus mt-genome included genes encoding 13 energy pathway proteins, including ATP synthase subunit 9 (atp9), two subunits of cytochrome c oxidase (cox1 and cox2), apocytochrome b (cob), and nine NADH dehydrogenase subunits (nad1a, nad1b, nad2, nad3, nad4, nad5, nad7, nad9, and nad10); genes for translation, ribosomal structure, and biogenesis, including six ribosomal proteins (rpl2, rps12, rps13, rpl14, rpl16, and rps19); and genes encoding proteins involved in defense mechanisms, such as putative mitochondrial protein orf386 (orf386) (Table 4).
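Nucleotide composition and AT content of the kind reported in Table 3 can be computed directly from a sequence. The following is a minimal Python sketch; the function names `base_composition` and `at_content` and the short test sequence are hypothetical, and ambiguous bases are simply ignored here as a simplifying assumption.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G, and T in a DNA sequence.

    Characters other than A, C, G, and T (for example N) are ignored.
    """
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def at_content(seq):
    """Return the combined A + T percentage of a DNA sequence."""
    comp = base_composition(seq)
    return comp["A"] + comp["T"]


if __name__ == "__main__":
    # Hypothetical 10-nt sequence, for illustration only.
    print(base_composition("AATTAATTGC"))
    print(at_content("AATTAATTGC"))
```

Applied to the full 38,695 bp sequence, the A and T percentages should sum to the reported AT content, up to rounding of the individual values.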

Fig. 1

Visual representation of the linear mitochondrial genome of Miamiensis avidus. Protein-coding genes (40) are purple, tRNAs (5) are dark blue, and rRNAs (2) are red; the Terminal Repeat Sequences (TRS; 30 bp) at both ends are not shown. The gene composition of the whole M. avidus mitochondrial genome is given in the table below the mtDNA map

Table 4 Clusters of orthologous protein-coding genes (COGs) located in M. avidus mt-genome
Table 5 Protein-coding genes in Ciliate mitochondrial genomes
Table 6 Ciliate mitochondrially encoded rRNAs

Codon usage

Seven start codons, comprising ATG and six alternatives, were found to be used in the M. avidus mt-genome. Of the 40 protein-coding genes, 37 use TAA as the termination codon and three (yejR, rps13, and rps3_a) use TAG. Overall, 27.5% of the genes were found to have an ATG start codon; 27.5%, ATT (rpl2, rps12, rpl6/orf176, orf73, rpl16, nad4L/orf119, nad9, cob, nad5, nad1_a, and rps13); 20%, TTA (rps3_b/orf149, rps19, atp9, nad2_a/orf371, orf159, rpl14, orf143, and rps3_a/orf320); 12.5%, ATA (orf437, rps14/orf107, rps7/orf229, cox2, and nad6/orf246); 5%, ATC (nad4 and yejR/orf571); 5%, TTG (orf195 and orf492); and only 2.5%, GTG (nad2_b) (Table 2).
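The percentages above follow from simple counts over the 40 protein-coding genes. The short Python sketch below reproduces the arithmetic; the counts for ATT, TTA, ATA, ATC, TTG, and GTG are taken from the gene lists in the text, while the ATG count of 11 is inferred from the stated 27.5% and is an assumption of this example. The function name `codon_percentages` is hypothetical.

```python
# Start codon counts; ATG = 11 is inferred from 27.5% of 40 genes.
START_CODON_COUNTS = {
    "ATG": 11, "ATT": 11, "TTA": 8, "ATA": 5,
    "ATC": 2, "TTG": 2, "GTG": 1,
}
# Stop codon counts as stated in the text.
STOP_CODON_COUNTS = {"TAA": 37, "TAG": 3}


def codon_percentages(counts):
    """Convert per-codon gene counts into percentages of all genes."""
    total = sum(counts.values())
    return {codon: 100.0 * n / total for codon, n in counts.items()}


if __name__ == "__main__":
    for codon, pct in codon_percentages(START_CODON_COUNTS).items():
        print(f"start {codon}: {pct:.1f}%")
    for codon, pct in codon_percentages(STOP_CODON_COUNTS).items():
        print(f"stop  {codon}: {pct:.1f}%")
```

The start codon counts sum to 40, which is consistent with the number of protein-coding genes reported for the mt-genome.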

Comparative analysis of the complete mitogenomes of ciliates

The genome structure and organization, the mitochondrially encoded gene content, and the TRS of the M. avidus mt-genome (mitogenome) determined in the present study were compared with those of other ciliate mt-genomes reported previously (Table 1). As shown in Table 5, with the exception of a few gene losses, ciliate mitochondrial genomes share largely the same complement of known protein-coding genes. The content of the identified protein-coding genes was almost identical in M. avidus and U. marinum mtDNAs, both organisms belonging to the order Philasterida. Genes for the large and small subunit ribosomal RNAs (rnl_b_1 and rns_b, respectively) were found in M. avidus mtDNA, in contrast to the split large and small subunit ribosomal RNA genes and the duplicated large subunit rRNA genes in the genus Tetrahymena (Table 6). The rnl and rns genes in M. avidus were found not to be duplicated, similar to those in T. pyriformis and I. multifiliis.

Table 7 Ciliate mitochondrially encoded tRNAs
Table 8 M. avidus-specific ORFs in mitochondrial genome

The M. avidus mitogenome was also found to contain 12 M. avidus- and ciliate-specific ORFs, called Ymf genes, for which a function cannot be assigned because sequence similarity is insufficient to strongly indicate homology (Table 8). Three of these ORFs (orf386, orf437, and orf492) were also observed in Tetrahymena, Ichthyophthirius, and Paramecium mtDNAs (Ymf67, Ymf66, and Ymf68 in I. multifiliis). The linear mitogenomes of ciliates have repeat sequences of between 18 and 35 bp and have symmetrical ends forming TRS (Table 9). The terminal repeat regions in M. avidus and U. marinum were found to be 30 and 32 nt long, respectively. In six ciliate species, the terminal repeats at both ends of the mitochondrial DNA were observed to be completely different.

Table 9 Comparison of Terminal Repeat Sequences (TRS) with other ciliates

In general, Miamiensis, Uronema, Tetrahymena, and Ichthyophthirius mtDNAs had almost the same preference for relative synonymous codons, with only a slight difference (Fig. 2). The most frequent amino acids in the M. avidus mitogenome were leucine (Leu), phenylalanine (Phe), and isoleucine (Ile), with percentages of occurrence of 13.4%, 11.37%, and 10.51%, respectively. The least frequent amino acids were cysteine (Cys), histidine (His), and proline (Pro), with percentages of occurrence of 0.95%, 1.47%, and 1.72%, respectively.

Fig. 2

Relative synonymous codon usage (RSCU) of 8 ciliate representatives. Codon families are labelled on the x-axis. Values on the top of the bars denote amino acid usage
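Relative synonymous codon usage is commonly defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The following Python sketch implements that standard definition; it is not necessarily the exact procedure used to produce Fig. 2, and the function name `rscu` and the phenylalanine counts are hypothetical.

```python
def rscu(codon_counts, family):
    """Compute RSCU values for one synonymous codon family.

    codon_counts maps codons to observed counts; family lists the
    synonymous codons for one amino acid. RSCU = observed count divided
    by the mean count across the family, so a value of 1.0 means no bias.
    """
    counts = [codon_counts.get(codon, 0) for codon in family]
    total = sum(counts)
    if total == 0:
        # Amino acid absent from the data set: no usage to report.
        return {codon: 0.0 for codon in family}
    mean = total / len(family)
    return {codon: n / mean for codon, n in zip(family, counts)}


if __name__ == "__main__":
    # Hypothetical counts for the two phenylalanine codons.
    observed = {"TTT": 90, "TTC": 10}
    print(rscu(observed, ["TTT", "TTC"]))
```

With these illustrative counts, the AT-ending codon TTT has an RSCU well above 1 and TTC well below 1, which is the kind of pattern expected in an AT-rich genome.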

Mitochondrial genome organization and phylogenetic relationships

To compare ciliate mitochondrial genome arrangements more closely, the mt-genomes were aligned using the MAUVE program. The mt-genomes of Miamiensis and Uronema were found to be largely collinear. Those of Tetrahymena and Ichthyophthirius were largely collinear as well. In the large subunit ribosomal protein (rpl14)-based phylogeny, class Oligohymenophorea was monophyletic, with the orders Philasterida, Hymenostomatida, and Peniculida clustered together, supported by significant statistical values (Fig. 3). Taking the order Philasterida, to which M. avidus and U. marinum belong, as a benchmark, the order Hymenostomatida, which is most closely related to Philasterida, showed only slight differences in the order of common genes (Table 5).

Fig. 3

Whole mitochondrial genome alignment of other ciliate genomes by the MAUVE program (Mauve version 20150226 build 10). Blocks of the same color represent homologous regions between different mitogenomes. GenBank accession numbers are provided (in parentheses) for all reference sequences

Two major clusters were observed in the phylogenetic tree of the cox1 gene in the M. avidus isolates (Fig. 4; Table 10). Cluster I included 11 strains (GJ01, Mie0301, WDB-0708, SJF-06A, YS2, WD4, JJ4, JJ3, SJF-03A, WS1, and Nakajima) and cluster II contained 8 strains (Iyo1, xiapu1, SJF-03B, YK1, YK2, JF05To, RF05To, and SK05Kyo). The two major cox1 clusters, which include the M. avidus SCUTICA2 strain, had the highest bootstrap values (100%), whereas low bootstrap values were observed for the Philasterides dicentrarchi strains (33–52%).

Fig. 4

Phylogenetic tree of Miamiensis avidus strains based on the Cox 1 gene. Abbreviation: Cox 1, Cytochrome c oxidase I

Table 10 Miamiensis avidus isolates used in this study

Discussion

The complete mt-genome of M. avidus was observed to be remarkably compact when compared to other ciliate mitogenomes in similar taxonomic positions (Burger et al. 2000; Brunk et al. 2003; de Graaf et al. 2009; Harikrishnan et al. 2010; Gao et al. 2018; Power et al. 2019; Retallack et al. 2019) and had the shortest genome length and the fewest coding sequences (CDS). The most remarkable observation in the mt-genomes of the orders Philasterida and Hymenostomatida is the very low GC content observed in certain species, for instance, in the five Tetrahymena species (18.5–21.3%), U. marinum (19.0%), and I. multifiliis (16.7%), while species such as N. ovalis and P. aurelia had relatively high GC contents of 41.5% and 41.2%, respectively (Johri et al. 2019). The low GC content of Paramecium mitogenomes is marked by a highly biased codon usage, with most synonymous positions exhibiting a strong bias for A or T nucleotides (Barth and Berendonk 2011). In P. caudatum and Tetrahymena, only TAA was found to terminate protein-coding genes, and TAG never occurred. In the M. avidus mitogenome, 37 of the 40 protein-coding genes were shown to use TAA as the termination codon, and three genes (yejR, rps13, and rps3_a) were shown to use TAG as the termination codon.

COGs located in the M. avidus mitogenome showed that most of the functional genes were related to secondary metabolite biosynthesis, transport, and catabolism, as has been observed in the mitochondrial genomes of 14 related ciliates (Huang et al. 2021). The M. avidus mitogenome shares a number of structural features with existing ciliate mt-genomes, showing molecular affinity and similarities in encoded gene content with members of orders such as Philasterida, Hymenostomatida, and Peniculida. Indeed, M. avidus had extensive gene loss, especially for ribosomal proteins, compared to species within Hymenostomatida, with only two ribosomal genes in its mitochondrial genome and only five tRNAs.

In all ciliate mitogenomes, including that of M. avidus, there is either a central region, as in T. pyriformis (Burger et al. 2000), E. minuta (de Graaf et al. 2009), and the hydrogenosome of Nyctotherus ovalis (de Graaf et al. 2011), or a terminal region, as in P. aurelia (Pritchard et al. 1990) and M. avidus, that bears low sequence complexity repeats. In the P. aurelia mitogenome, pure AT repeats were observed in the terminal region (AAATATTAATATATTTATTTTTTTATTTTAAT). P. aurelia terminal repeat sequences are considerably more GC-rich (51.4% AT) than all other ciliate mitochondrial genome repeats.

A previous report on the analysis of cox1 genes from the serotypes of 21 strains of M. avidus isolated from diseased olive flounder (Paralichthys olivaceus), ridged-eye flounder (Pleuronichthys cornutus), and spotted knifejaw (Oplegnathus fasciatus) indicated that the 21 strains can be classified into five cox1 types (two heterogeneous clusters and three individual branches). Among the species-specific regions flanked by conserved sequences, such as ribosomal RNA (SSU and LSU) and cox1 genes, cox1, which has been used previously in investigations of a wide range of human and nonhuman infectious diseases, was similar to SSU and better than LSU in discriminating between M. avidus and related scuticociliates. These cox1 types have not been found to reflect geographical origin or host species (Jung et al. 2011). In the present study, we performed a comparative and phylogenetic analysis of cox1 genes from the mitochondrial genomes of M. avidus and the serotypes of M. avidus strains. The mitochondrial cox1 gene of M. avidus, which belongs to the order Philasterida of the subclass Scuticociliatida, showed promise as a valuable genetic marker for species detection. Based on previous studies, intraspecific variations in cox1 do not account for the infectivity and virulence demonstrated in some strains (Nakajima, WS1, YK1, Iyo1, and Mie0301) of M. avidus that are highly pathogenic to olive flounder (Song et al. 2009). However, the cox1 cluster I and II types in serotypes A and B of the M. avidus strains showed cross-immobilization/agglutination activities with the anti-sera against serotypes A and B, respectively. The present study can be utilized in epidemiological studies by informing detection, taxonomic research, strain identification, tracking of geographical spread, and disease control.

Miamiensis avidus is a dangerous parasitic pathogen that causes scuticociliatosis in fish and high mortality rates in mariculture worldwide. Methods to distinguish this species from closely related species have been limited. Mitochondrial DNA sequences can be valuable genetic markers for species detection and diagnostics and are increasingly used as molecular epidemiology and surveillance tools. This is the first report in which the mitogenome of M. avidus was thoroughly compared to seven related ciliate mitogenomes by analyzing the nucleotide composition, codon usage, genome organization, protein-coding genes, and terminal repeat sequences. The results of this study could facilitate a better understanding of scuticociliate infection, aid in the development of control measures against scuticociliatosis, and provide insights into the molecular epidemiology of scuticociliates.