Introduction

Babesiosis is caused by intracellular Babesia species and can infect a wide range of animals and even humans, causing enormous economic losses for farmers (El-Dakhly et al. 2015; Schnittger et al. 2012; Uilenberg 2006; Wickramasekara Rajapakshage et al. 2012). Canine babesiosis, a common tick-borne protozoan disease distributed mainly in Asia, Africa, Australia, Europe, and North America, may result in fever, anemia, hemoglobinuria, hyperthermia, pallor, anorexia, jaundice, splenomegaly, and even death in serious cases (Goo and Xuan 2014). The clinical symptoms vary with the health of the host, vector specificity, and parasite species. Moreover, infected canines are chronic carriers and the major source of infection in most cases. This prevalent disease is caused by three large forms of Babesia species (Babesia canis (B. canis), Babesia rossi (B. rossi), and Babesia vogeli (B. vogeli)) and three small forms (Babesia gibsoni (B. gibsoni), Babesia conradae (B. conradae), and Babesia microti (B. microti)–like, also regarded as “B. vulpes”) (El-Dakhly et al. 2015; Goo and Xuan 2014; Solano-Gallego et al. 2016). Among them, B. gibsoni, first characterized in India in 1910, is well known to cause more severe disease than the other species (Solano-Gallego et al. 2016). It is commonly transmitted by tick vectors such as Rhipicephalus sanguineus and Haemaphysalis longicornis and has also been reported as probably transmissible by blood transfusion, dog bites, and the transplacental route (Solano-Gallego et al. 2016).

The sequencing of the whole genome of B. gibsoni has not been completed, and its properties remain poorly understood. In 2009, the mitochondrial (mt) genome of a B. gibsoni isolate was reported (Hikosaka et al. 2010). However, information about the B. gibsoni isolates prevalent in China is very limited. In 2017, the B. gibsoni isolate (WH58) endemic to Wuhan, China, was identified and reported in our previous work (He et al. 2017). However, its mt genome was not sequenced or annotated, and its structure was not determined or analyzed. For Babesia and other intracellular protozoa, the mt organelle plays a significant role in energy metabolism and calcium homeostasis (Cornillot et al. 2012; Frederick and Shaw 2007; Hikosaka et al. 2010; Mogi and Kita 2010). Under most circumstances, the mt genome of intracellular protozoa contains three protein-coding genes (cytochrome c oxidase subunit I (cox1), cytochrome c oxidase subunit III (cox3), and cytochrome b (cob)), large subunit (LSU) and small subunit (SSU) ribosomal RNAs (rRNAs), and terminal inverted repeats (TIRs) (Lin et al. 2011; Wickramasekara Rajapakshage et al. 2012; Yang et al. 2015). However, the mt genomes of apicomplexan parasites vary among species in length, form, and the number of protein-coding genes (Cornillot et al. 2013; Hikosaka et al. 2010). In this study, the mt genome of B. gibsoni (WH58) was sequenced and annotated, the mt genomes of apicomplexan parasites were compared in structure and organization, and the cox1 and cob genes were used for phylogenetic and evolutionary analyses. The results reported in this article may facilitate a basic understanding of the mt genome of B. gibsoni endemic to Wuhan, China, and provide new insights into the genetic relationships among the apicomplexan protozoa.

Materials and methods

Mitochondrial DNA cloning and sequencing

The genomic DNA (gDNA) of B. gibsoni was extracted and stored at −80 °C as previously reported (He et al. 2017). The mt genome sequence was determined by polymerase chain reaction (PCR) using specific primers (Table 1). Five pairs of primers were designed from an alignment of the reported mt genome sequences of B. gibsoni (GenBank accession number AB499087), B. canis (KC207822), B. vogeli (KC207825), and B. rossi (KC207823). PCR was performed in a 50 μl reaction mixture containing 10 mM Tris–HCl (pH 8.4), 50 mM KCl, 4 mM MgCl2, 0.2 mM dNTP, 0.2 mM of each primer, 2 U Taq polymerase (Takara Biotechnology, Beijing, China), and 2 μl gDNA. The primer pairs used were F1 and R1, F2 and R2, F3 and R3, F4 and R4, and F5 and R5. PCR conditions were as follows: initial denaturation at 95 °C for 5 min; 33 cycles of denaturation at 94 °C for 30 s, annealing at 55–68 °C (depending on the primers used) for 30 s, and extension at 72 °C for 1–6 min (depending on amplicon size, 1 min/kb); and a final extension at 72 °C for 10 min. Amplicons were cloned into the pMD19-T vector (Takara) and sequenced on an ABI PRISM 377 DNA sequencer according to the manufacturer’s instructions. The vector primers M13F and M13R as well as the five specific pairs of PCR primers were used for sequencing the mt genome.
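The extension rule quoted above (1 min/kb) can be sketched as a small helper. This is only an illustration: the function names are hypothetical, the amplicon sizes are the five reported in the Results, and rounding up to whole minutes with a 1 min floor is an assumption, not something the protocol states.

```python
import math

# Amplicon sizes (bp) reported in the Results for primer pairs F1/R1 to F5/R5.
AMPLICON_SIZES_BP = [419, 456, 2304, 1476, 1460]


def extension_minutes(amplicon_bp: int, rate_min_per_kb: float = 1.0) -> int:
    """Extension time in whole minutes at the given rate.

    Rounding up and the 1 min minimum are assumptions made for this sketch.
    """
    return max(1, math.ceil(amplicon_bp / 1000 * rate_min_per_kb))


def cycling_program(amplicon_bp: int, annealing_c: float) -> list[tuple[str, float, str]]:
    """Return the cycling steps as (step, temperature in deg C, duration)."""
    ext = extension_minutes(amplicon_bp)
    return [
        ("initial denaturation", 95, "5 min"),
        ("denaturation (x33)", 94, "30 s"),
        ("annealing (x33)", annealing_c, "30 s"),
        ("extension (x33)", 72, f"{ext} min"),
        ("final extension", 72, "10 min"),
    ]


if __name__ == "__main__":
    for size in AMPLICON_SIZES_BP:
        print(size, "bp ->", extension_minutes(size), "min extension")
```

The annealing temperature is left as a parameter because it depends on the primer pair (55–68 °C in the text).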

Table 1 Primers used for cloning B. gibsoni (WH58) mt genome

Gene annotation and sequence analysis

The obtained mt genome sequences of B. gibsoni (WH58) were assembled and aligned with the reported mt genome sequences of B. gibsoni (GenBank accession number AB499087), B. canis (KC207822), B. vogeli (KC207825), and B. rossi (KC207823) by MAFFT 7.0 (https://mafft.cbrc.jp/alignment/server/), followed by manual correction (Katoh et al. 2017). Protein-coding genes were deduced from the previously annotated sequences of B. gibsoni, B. canis, B. vogeli, and B. rossi. The amino acid sequences of the protein-coding genes were generated using the ExPASy online tool (http://www.expasy.org/translate/), and the open reading frames (ORFs) were analyzed by ORF finder (https://www.ncbi.nlm.nih.gov/orffinder/). To determine the putative rRNA genes, the mt sequences were queried against previously reported rRNA sequences from the four related species using BLASTn with default parameters (NCBI, BLAST). Transfer RNA (tRNA) genes were searched for by subjecting the entire mt genome of B. gibsoni (WH58) to tRNAscan-SE (http://lowelab.ucsc.edu/tRNAscan-SE/) using the Mito/Chloroplast model and the Nematode Mito model; the results from the two models were compared and annotated according to the B. gibsoni (AB499087) mt genome annotation.
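The translation and ORF-scanning step can be illustrated with a minimal pure-Python sketch. It is not the ExPASy or ORF finder implementation. The text does not state which genetic code was selected; the sketch assumes NCBI translation table 4 (the protozoan mitochondrial code, in which TGA encodes tryptophan rather than a stop), and all function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]  # NCBI codon order

# Amino acids in NCBI codon order; table 4 differs from table 1 only at TGA.
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
MITO_AA = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD = dict(zip(CODONS, STANDARD_AA))
MITO = dict(zip(CODONS, MITO_AA))  # assumed code for the mt genes


def translate(seq: str, table: dict[str, str]) -> str:
    """Translate from position 0 until the first stop codon or the end."""
    seq = seq.upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = table.get(seq[i:i + 3], "X")  # X for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_orf(seq: str, table: dict[str, str]) -> str:
    """Longest ATG-initiated translation on either strand.

    ORFs that run off the end without a stop codon are also counted,
    which is a simplification made for this sketch.
    """
    best = ""
    for strand in (seq.upper(), reverse_complement(seq)):
        for start in range(len(strand) - 2):
            if strand[start:start + 3] == "ATG":
                protein = translate(strand[start:], table)
                if len(protein) > len(best):
                    best = protein
    return best


if __name__ == "__main__":
    toy = "CCATGTGAAAATAACC"  # placeholder sequence, not real data
    print(longest_orf(toy, MITO), longest_orf(toy, STANDARD))
```

With the toy sequence, the choice of table changes the result because the TGA codon is read through under table 4 but terminates translation under the standard code, which is why the genetic code has to be set explicitly when translating mt genes.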

Phylogenetic analysis

The amino acid sequences encoded by the mt genome of B. gibsoni (WH58) were aligned with those of the related species by MAFFT version 7 (https://mafft.cbrc.jp/alignment/server/) (Katoh et al. 2017), including B. gibsoni (AB499087), B. canis (KC207822), B. vogeli (KC207825), B. conradae (KC207826), B. rossi (KC207823), B. orientalis (KF218819), B. bigemina (AB499085), B. caballi (AB499086), B. microti (FO082868), B. microti (AB624353), B. bovis (AB499088), B. bovis (EU075182), B. rodhaini (AB624357), T. orientalis (AB499090), T. equi (AB499091), T. annulata (NT167255), T. annulata (NW_001091933), T. parva (AB499089), T. parva (Z23263), and Plasmodium spp. The amino acid sequences of cox1 and cob, and the concatenated cox1+cob sequences, were used for phylogenetic analysis. The cox3 gene was excluded because it is highly divergent in Babesia and Theileria spp. and is located in the nuclear genome rather than the mt genome in some species, such as Tetrahymena thermophila (Hikosaka et al. 2010). The nucleotide sequences of the cox1 and cob genes from 21 apicomplexan parasite species were also aligned by MAFFT v7. Alignments were edited and adjusted manually using BioEdit v7.0.5.2 software (Hall 1999). Moreover, the nucleotide sequence identities of apicomplexan parasites were determined from the cox1 and cob gene sequences by DNAstar software (Burland 2000). All phylogenetic trees were inferred by the maximum likelihood and neighbor-joining methods (1000 bootstrap replications) using MEGA v6.0 software (Tamura et al. 2013).
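Two steps in this workflow, concatenating the cox1 and cob alignments and resampling alignment columns for bootstrap support, can be sketched in a few lines. MEGA performs the bootstrap internally, so this is only a conceptual illustration; the function names and the toy sequences are hypothetical placeholders, not data from this study.

```python
import random


def concatenate(cox1: dict[str, str], cob: dict[str, str]) -> dict[str, str]:
    """Join two per-taxon alignments; only taxa present in both are kept."""
    return {taxon: cox1[taxon] + cob[taxon] for taxon in cox1 if taxon in cob}


def bootstrap_replicate(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Resample alignment columns with replacement (one bootstrap replicate).

    The same sampled columns are used for every taxon, so each replicate
    has the original length and stays a valid alignment.
    """
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        taxon: "".join(seq[c] for c in columns)
        for taxon, seq in alignment.items()
    }


if __name__ == "__main__":
    # Toy aligned amino acid fragments (placeholders only).
    cox1 = {"taxon_A": "MKLF-", "taxon_B": "MKIFS", "taxon_C": "MRLFS"}
    cob = {"taxon_A": "WGAT", "taxon_B": "WGST", "taxon_C": "WGAT"}
    combined = concatenate(cox1, cob)
    rng = random.Random(0)  # fixed seed so the run is repeatable
    replicates = [bootstrap_replicate(combined, rng) for _ in range(1000)]
    print(len(replicates), "replicates of length", len(replicates[0]["taxon_A"]))
```

In a full analysis a tree would be built from each replicate, and the bootstrap value of a branch is the percentage of replicate trees that contain it.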

Availability of data and materials

All data are included as tables and figures within the article.

Results and discussion

Mitochondrial genome map of B. gibsoni (WH58)

The whole mt genome was cloned and sequenced using five pairs of primers (Table 1). The five amplicons overlapped so as to cover the entire mt genome. The full length of the mt genome was 5865 bp, and the amplicons were 419 bp, 456 bp, 2304 bp, 1476 bp, and 1460 bp in size. The mt genome of B. gibsoni (WH58) was annotated and deposited in GenBank (accession number KP666169). The mt genome was linear and contained three protein-coding genes (cox1, cox3, and cob), a TIR at each end, and six LSU rRNA fragments. As in other apicomplexan parasites, no tRNA genes were found in the mt genome of B. gibsoni (WH58). The three protein-coding genes cox1, cox3, and cob were cloned with specific primers and were 1428 bp, 642 bp, and 1092 bp in length, respectively. The six rRNA fragments were 305 bp, 34 bp, 110 bp, 81 bp, 69 bp, and 42 bp in size. LSU1–3 and LSU6 were located mainly between the cox3 and cob genes, within the region from 3114 to 5516 bp. LSU4 and LSU5 were located adjacent to the cob gene and the TIRs. The two TIRs were 77 bp and 74 bp in length, far shorter than those of other apicomplexan parasites.
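The sizes reported above can be cross-checked with simple arithmetic. The sketch below uses only the numbers given in this section. The interpretation of the surplus amplicon length as total overlap is an assumption: it holds only if the five amplicons tile the linear genome end to end and only neighboring amplicons overlap.

```python
GENOME_BP = 5865
AMPLICONS_BP = [419, 456, 2304, 1476, 1460]
CDS_BP = {"cox1": 1428, "cox3": 642, "cob": 1092}
LSU_BP = [305, 34, 110, 81, 69, 42]
TIR_BP = [77, 74]

# Surplus amplicon length; equals total overlap under the tiling assumption.
total_overlap_bp = sum(AMPLICONS_BP) - GENOME_BP

# Protein-coding content and its share of the genome.
coding_bp = sum(CDS_BP.values())
coding_fraction = coding_bp / GENOME_BP

# Each coding sequence should be a whole number of codons (stop included).
codons = {gene: size // 3 for gene, size in CDS_BP.items()}
whole_codons = all(size % 3 == 0 for size in CDS_BP.values())

# Sequence accounted for by all annotated elements.
annotated_bp = coding_bp + sum(LSU_BP) + sum(TIR_BP)

if __name__ == "__main__":
    print("total amplicon overlap:", total_overlap_bp, "bp")
    print("protein-coding:", coding_bp, "bp", f"({coding_fraction:.1%})")
    print("codons per gene:", codons, "whole codons:", whole_codons)
    print("annotated elements:", annotated_bp, "bp of", GENOME_BP)
```

The amplicons sum to 6115 bp, 250 bp more than the genome, and the three protein-coding genes together occupy 3162 bp, a little over half of the genome.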

Twenty-six mt genomes from apicomplexan parasites, including Babesia, Theileria, and Plasmodium spp. and Toxoplasma gondii (RH), were compared in size, host, mt genome form, and number of protein-coding genes (Table 2) (Carlton et al. 2002; Ke et al. 2018; Lloyd et al. 2018; Preiser et al. 1996). The mt genomes of B. gibsoni (WH58) and most Babesia, Theileria, and Plasmodium spp. were similar in size, at about 6000 bp (Hikosaka et al. 2011). Interestingly, the mt genomes of the parasites that cause canine babesiosis, including B. gibsoni (WH58), B. canis, B. vogeli, B. rossi, and B. conradae (5603–5865 bp in length), were slightly smaller than those of the other Babesia and Theileria species (5847–11,149 bp in length). The mt genome of B. microti (R1 strain) was about 11,100 bp, roughly twice the size of the others. The mt genomes of B. gibsoni (WH58) and Plasmodium spp. were more than twice the size of that of Toxoplasma gondii (RH strain), which was only 2607 bp. The mt genome was linear in piroplasms, including B. gibsoni (WH58) and the other Babesia and Theileria species. However, P. falciparum and P. knowlesi contain a circular mt genome (Gardner et al. 2002; Hikosaka et al. 2010; Lau 2009). Additionally, most of the 26 apicomplexan parasites contain the three genes cox1, cox3, and cob, with the exception of Toxoplasma gondii (RH strain), which contained only the cox1 and cob genes (Gjerde 2013; Schreeg et al. 2016). Although a previous study reported that B. conradae lacked the cox3 gene, a BLAST search of the NCBI database showed that a section of the B. conradae mt sequence was highly similar to the cox3 gene of other Babesia and Plasmodium spp. (Schreeg et al. 2016). Therefore, a cox3-like gene may exist in the mt genome of B. conradae. Overall, the mt genome of B. gibsoni (WH58) was similar to those of other Babesia spp., but differed from that of P. falciparum in form and from that of Toxoplasma gondii (RH strain) in size and number of protein-coding genes.

Table 2 Comparative analysis of the mt genome of apicomplexan parasites

The distribution and direction of the protein-coding genes, LSU fragments, and TIRs were compared among B. gibsoni (WH58), B. bovis, B. rodhaini, Theileria equi (T. equi), and P. falciparum, which were chosen because they infect different hosts (Fig. 1). Although its TIRs were half the length of those of B. bovis, B. gibsoni (WH58) was closest to B. bovis among the five species in the location and size of all elements. B. gibsoni (WH58) was remarkably divergent from B. rodhaini and T. equi, and especially from P. falciparum. Unlike B. gibsoni (WH58) and the other apicomplexan parasites, P. falciparum had a circular mt genome, which contained three protein-coding genes (cox1, cox3, cob), 12 LSU rRNAs, seven small subunit (SSU) rRNAs, and seven miscellaneous (misc) RNAs. However, no TIR was present in P. falciparum (Lau 2009). Despite the obvious divergence among the five species, the sizes of cox1, cox3, and cob were practically the same. The directions of cox1 and cob were also compared. For the cob gene, the direction in P. falciparum (3D7), B. rodhaini, and B. microti was from 3′ to 5′, opposite to that in the other species. For the cox1 gene, the direction in T. equi was from 3′ to 5′, opposite to that in the other species.

Fig. 1

Mitochondrial genome structures of B. gibsoni (WH58) (a), B. bovis (b), B. rodhaini (c), T. equi (d), and P. falciparum (e). The protein-coding genes (cox1, cox3, and cob) are indicated by white boxes. Large subunit (L1–L12) and small subunit (S1–S7) rRNA fragments are indicated by dark and gray boxes. Terminal inverted repeats (TIRs) are indicated by arrows; they are absent in P. falciparum

Phylogenetic analysis

The nucleotide sequence identities of selected apicomplexan parasites were analyzed based on the sequences of the cox1 and cob genes, and the results are shown in Table 3. B. gibsoni (WH58) was most similar to B. gibsoni (AB499087), B. canis, and B. rossi, with an average identity of over 80%, while B. conradae was clearly divergent from the other Babesia spp.
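For readers who want to reproduce a pairwise identity of this kind, a minimal sketch is given below. It is not the DNAstar algorithm: the function name is hypothetical, the inputs must already be aligned to the same length, and skipping every column that contains a gap is an assumption that other tools may handle differently.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two aligned sequences.

    Columns in which either sequence has a gap are ignored (an assumption
    of this sketch), and identity is computed over the remaining columns.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments (placeholders, not sequences from this study).
    print(percent_identity("ACGT-A", "ACCTTA"))
```

In the toy pair, five columns are compared after the gapped column is dropped and four of them match, giving 80% identity.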

Table 3 Nucleotide sequence identities of apicomplexan parasites based on cox1 and cob genes

Although its nucleotide sequence was most similar to that of the reported B. gibsoni (AB499087), B. gibsoni (WH58) differed from B. gibsoni (AB499087) by 21 bp, including differences of 5 bp, 4 bp, and 6 bp in the cox1, cox3, and cob genes, respectively, which corresponded to differences in their amino acid sequences. Because mitochondria are closely associated with the metabolism of Babesia spp., differences in the nucleotide and amino acid sequences of different isolates may lead to divergence in properties such as virulence to the host and adaptability to in vitro culture. For example, the B. gibsoni isolate from Japan (AB499087) adapted more readily to in vitro culture than the WH58 isolate. Therefore, it is necessary and important to sequence, annotate, and compare the mt genomes of different isolates for a better understanding of the mechanism of B. gibsoni infection.

The genetic relationships of apicomplexan species were analyzed based on the amino acid sequences of cox1, cob, and cox1+cob (Fig. 2). The phylogenetic analysis included the mt sequences of Babesia spp., Theileria spp., Plasmodium spp., and Toxoplasma gondii. Among them, the mt genomes of B. canis, B. rossi, B. vogeli, and B. conradae had been cloned, sequenced, and annotated in previous studies and were included in the present study for comparison with the mt genomes of B. gibsoni and other parasites (Schreeg et al. 2016). In the phylogenetic trees, species that infect the same host generally fell into one group. One exception was B. conradae, which infects dogs but was distant from B. gibsoni and the other canine parasites and closer to B. microti. This relationship was also reflected by 18S phylogenetic analysis (He et al. 2017; Schreeg et al. 2016). Moreover, B. microti, which infects humans, was more distant from the other Babesia spp. Plasmodium spp., Toxoplasma gondii, and Eimeria tenella fell into one group owing to their close relationship and their divergence from Babesia and Theileria spp. Furthermore, the bootstrap values in the tree based on the cox1+cob amino acid sequences were notably higher than those in the trees based on either cox1 or cob alone, supporting the credibility and applicability of the cox1+cob-based evolutionary relationships.

Fig. 2

Molecular phylogenetic analysis of apicomplexan parasites according to the amino acid sequences of cox1 (a), cob (b), and cox1+cob (c). All positions containing gaps and missing data were eliminated. The numbers on branches show the percentage of 1000 bootstrap replications. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. GenBank accession numbers are indicated on the left of each species name. The mt sequence obtained in this study (WH58) is indicated in bold

Conclusions

This article reports for the first time the mt genome of B. gibsoni endemic to Wuhan, China. The mt genomes of apicomplexan parasites were compared for a basic understanding of their evolutionary relationships. The results indicated that the mt genome of B. gibsoni (WH58) was closest to those of B. gibsoni (AB499087), B. canis (KC207822), and B. rossi (KC207823) in structure and phylogeny. This study contributes to a more comprehensive understanding of apicomplexan protozoan phylogeny and facilitates further related research.