Introduction

Gastrointestinal parasites of livestock cause substantial economic losses worldwide (Gasser et al. 2008). Marshallagia spp. are among the most common gastrointestinal nematodes of ruminants, and more than 10 species have been recognized in the genus. Of these, Marshallagia marshalli is widely distributed and is particularly important in tropical and subtropical regions, where it is highly prevalent in sheep, goats, and wild ruminants and causes loss of appetite, weight loss, constipation, diarrhea, and even death (Moradpour et al. 2014; Shirvan et al. 2017). It is also one of the most prevalent helminths of ruminants in China, where it causes significant economic losses to the livestock industry.

Mitochondrial (mt) genome sequences have been widely used as genetic markers for studying molecular epidemiology, population genetics, and phylogenetics at various taxonomic levels in different organisms (Herd et al. 2015; Song et al. 2016, 2017; Wang et al. 2016; Li et al. 2017) because of their unique characteristics (Wolstenholme 1992; Boore 1999). The current hypothesis of Trichostrongylidae phylogeny is based on morphological and ecological characters and on analyses of SSU rRNA gene sequences (Hoberg and Lichtenfels 1994; Gouÿ de Bellocq et al. 2001). Recently, mt genome sequences have also been used to reconstruct the phylogenetic relationships among Trichostrongylidae nematodes (Jex et al. 2010). Despite these advances, the phylogenetic relationships among Trichostrongylidae nematodes remain difficult to resolve. For example, although some studies (Durette-Desset 1985; Durette-Desset and Chabaud 1993) indicated that Trichostrongylidae is monophyletic, others (Gouÿ de Bellocq et al. 2001; Lin et al. 2012; Zhao et al. 2014) have argued the opposite and have suggested that Trichostrongylidae is sister to Haemonchidae and Cooperiidae. These inconsistent hypotheses may result from inadequate resolution at higher taxonomic levels, from the use of different DNA datasets, and from the use of different inference methods. In addition, although Trichostrongylidae is a large nematode family, the complete mt genomes of only a limited number of species have been sequenced to date (Jex et al. 2010), and no mt genome has been reported for any member of the genus Marshallagia. This shortage of mt genomes is a major limitation for phylogenetic studies of Trichostrongylidae.

Given this background and the significance of M. marshalli, the present study aimed to determine the gene content, arrangement, and composition of the mt genome of M. marshalli and to reconstruct phylogenetic relationships of the superfamily Trichostrongyloidea using mtDNA sequences.

Materials and methods

Parasites and DNA extraction

Two adult specimens of Marshallagia were collected from the abomasum of a naturally infected Tianzhu white yak in Gansu Province, China. The specimens were washed in phosphate-buffered saline (PBS), fixed in 70% ethanol, and stored at −20 °C until further use. Because it was difficult to obtain accurate morphological data from specimens preserved in 70% ethanol, molecular identification was carried out to determine their identity. Total genomic DNA was extracted from the Marshallagia specimens by sodium dodecyl sulfate (SDS)/proteinase K treatment followed by spin column purification (Wizard® SV Genomic DNA Purification System, Promega).

Long-range PCR, sequencing, and annotation

Primers were designed against relatively conserved regions of the mt genome sequences of closely related species, namely Cooperia oncophora (GQ888713) and Teladorsagia circumcincta (GQ888720) (Table 1). The entire mt genome of M. marshalli was amplified by long-range PCR as five overlapping amplicons located between cox1 and rrnL (~3.5 kb), rrnL and rrnS (~4.3 kb), rrnS and cytb (~4.1 kb), cytb and cox3 (~1.5 kb), and cox3 and cox1 (~2.4 kb) (Table 1).

Table 1 Sequences of primers used to amplify PCR fragments of Marshallagia marshalli

Each long-range PCR was conducted in a total volume of 50 μl, which included 25 μl PrimeStar Max DNA polymerase premix (Takara, Dalian, China), 25 pmol of each primer (synthesized by Sangon Biotech Company, Shanghai, China), 0.5 μl DNA template, and approximately 24 μl H2O, in a thermocycler (Biometra, Göttingen, Germany). The cycling conditions were an initial denaturation at 98 °C for 1.8 min; 22 cycles of denaturation at 98 °C for 18 s, annealing at 50–58 °C for 10 s, and extension at 60 °C for 1.8–5 min; denaturation at 98 °C for 2 min; an additional 28 cycles of denaturation at 98 °C for 18 s, annealing at 50–58 °C for 10 s, and extension at 60 °C for 1.8–5 min, with the extension time set according to fragment length (approximately 1 min per 1000 bp); and a final extension at 66 °C for 10 min. A negative control without genomic DNA was included in each amplification run. An aliquot (3 μl) of each amplicon was examined by electrophoresis in a 0.8% agarose gel stained with ethidium bromide (Sangon Biotech Company, Shanghai, China); each amplicon gave a single, clear, bright band. After column purification (Wizard-SV Genomic DNA Purification System, Promega), the products were sequenced by Sangon Biotechnology Company (Shanghai, China) using a primer walking strategy.

The mt genome was annotated using an approach similar to that used for ascaridomorph nematodes and Gongylonema pulchrum (Liu et al. 2015; Liu et al. 2016). Briefly, each mt protein-coding gene was identified by comparison with the corresponding gene of the mt genome of a reference species (i.e., T. circumcincta, accession number GQ888720) (Jex et al. 2010). The tRNA genes were identified using the program tRNAscan-SE (Lowe and Eddy 1997) or by visual inspection (Hu et al. 2002); rRNA genes were predicted by comparison with those of T. circumcincta (Jex et al. 2010).

Phylogenetic analysis based on concatenated amino acid sequence data

The amino acid sequences conceptually translated from the individual protein-coding genes of the mt genome of M. marshalli were concatenated. For comparison, concatenated amino acid sequences predicted from published mt genomes of representative nematodes of the superfamily Trichostrongyloidea were selected, including the family Haemonchidae (Haemonchus contortus, NC_010383 (Jex et al. 2008); Haemonchus placei, NC_029736; and Mecistocirrus digitatus, NC_013848 (Jex et al. 2010)), the family Trichostrongylidae (Trichostrongylus axei, NC_013824; Trichostrongylus vitrinus, NC_013807; Teladorsagia circumcincta, NC_013827) (Jex et al. 2010), the family Cooperiidae (Cooperia oncophora, NC_004806 (Van der Veer and de Vries 2004)), the family Molineidae (Nematodirus oiratianus, NC_024639, and Nematodirus spathiger, NC_024638) (Zhao et al. 2014), and the family Dictyocaulidae (Dictyocaulus viviparus, NC_019810; Dictyocaulus eckerti, NC_019809) (Gasser et al. 2012), with Oesophagostomum quadrispinulatum (GenBank accession number NC_014181) (Lin et al. 2012) as the outgroup. The amino acid sequences inferred from the 12 mt protein-coding genes were first aligned individually using MAFFT 7.122 (Katoh and Standley 2013) and then concatenated into a single dataset; ambiguously aligned regions were excluded using Gblocks 0.91b (doc) (http://molevol.cmima.csic.es/castresana/Gblocks_server.html) (Talavera and Castresana 2007) with the default parameters (allow smaller final blocks, allow gap positions within the final blocks, and allow less strict flanking positions). Phylogenetic analysis was conducted using Bayesian inference (BI) as described previously (Liu et al. 2016). Phylograms were drawn using the program FigTree v.1.4 (http://tree.bio.ed.ac.uk/software/figtree).
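
The concatenation step described above can be sketched in Python. This is only an illustration under stated assumptions: the function name, the dictionary layout, and the toy sequences are hypothetical, and the MAFFT alignment and Gblocks trimming steps are not reproduced here.

```python
def concatenate_alignments(gene_alignments, gene_order):
    """Join per-gene aligned sequences into one concatenated row per taxon.

    gene_alignments maps gene name -> {taxon name: aligned sequence}.
    (Hypothetical layout chosen for this sketch.)
    """
    taxa = sorted(set().union(*(aln.keys() for aln in gene_alignments.values())))
    parts = {taxon: [] for taxon in taxa}
    for gene in gene_order:
        aln = gene_alignments[gene]
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: aligned sequences differ in length")
        width = lengths.pop()
        for taxon in taxa:
            # A taxon lacking this gene is padded with gaps so that
            # alignment columns stay in register across taxa.
            parts[taxon].append(aln.get(taxon, "-" * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy, made-up alignments (not real data from any genome).
toy_alignments = {
    "cox1": {"M_marshalli": "MKL-F", "T_circumcincta": "MKLIF"},
    "nad1": {"M_marshalli": "MSV", "T_circumcincta": "MTV"},
}
concatenated = concatenate_alignments(toy_alignments, ["cox1", "nad1"])
for taxon, seq in concatenated.items():
    print(taxon, seq)
```

Keeping an explicit gene order makes the column ranges of each gene in the concatenated dataset reproducible, which is useful if the dataset is later partitioned by gene.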

Results and discussion

Acquisition of ITS rDNA

A specimen was identified as M. marshalli by PCR-based sequencing of the first and second internal transcribed spacer (ITS-1 and ITS-2) regions of rDNA (Newton et al. 1998; Chilton et al. 2001; Nabavi et al. 2014). Both the ITS-1 and ITS-2 regions (MG011724) had 99% identity to previously published sequences of M. marshalli from Uzbekistan and Iran (GenBank accession nos. KT428384 and HQ389231, respectively).
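
A percent-identity figure of this kind can be computed from a pairwise alignment as in the following minimal sketch. The text does not state how the 99% value was calculated, so the convention used here (matches divided by the number of columns in which neither sequence has a gap) is an assumption, and the sequences are toy examples rather than ITS data.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns (one of several possible conventions)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences (hypothetical): one mismatch in ten ungapped columns.
identity = percent_identity("ACGTACGTAC", "ACGTACGTTC")
print(identity)
```

Other tools count gapped columns in the denominator, so values from different programs are not always directly comparable.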

Content, organization, and annotation of mt genome

The complete mt genome of M. marshalli (GenBank accession no. MG011723) is 13,891 bp in length (Fig. 1) and contains 12 protein-coding genes (cox1–3, nad1–6, nad4L, atp6, and cytb), 22 tRNA genes, 2 rRNA genes, and 2 non-coding regions (NCRs) (Table 2). The nucleotide composition of the M. marshalli mt genome is A = 4131 (29.7%), T = 6458 (46.5%), G = 2308 (16.6%), and C = 994 (7.2%). The gene content and arrangement are the same as those of H. contortus (Jex et al. 2008), T. axei (Jex et al. 2010), N. oiratianus (Zhao et al. 2014), and D. viviparus (Gasser et al. 2012).
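
As a quick consistency check, the reported base counts can be converted to percentages in a few lines of Python. The counts are those given in the text; the variable names are illustrative only.

```python
# Base counts as reported for the M. marshalli mt genome.
base_counts = {"A": 4131, "T": 6458, "G": 2308, "C": 994}

genome_length = sum(base_counts.values())

# Percentage of each base, rounded to one decimal place as in the text.
base_percent = {
    base: round(100 * count / genome_length, 1)
    for base, count in base_counts.items()
}

# Overall A+T content of the genome.
at_percent = round(100 * (base_counts["A"] + base_counts["T"]) / genome_length, 1)

print(genome_length)
print(base_percent)
print(at_percent)
```

The counts sum to the stated genome length of 13,891 bp, and the rounded percentages agree with the values given above.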

Fig. 1 Arrangement of the mitochondrial genome of M. marshalli. The scales are similar. All genes are transcribed in the clockwise direction and are labeled using standard nomenclature. The 22 tRNA genes are represented by the one-letter code for the corresponding amino acid, with numerals differentiating each of the two leucine-specifying and serine-specifying tRNAs (L1 and L2 for codon families CUN and UUR, respectively; S1 and S2 for codon families AGN and UCN, respectively)

Table 2 The features of the mitochondrial genome of Marshallagia marshalli

The M. marshalli mt genome contains 12 protein-coding genes that encode 3414 amino acids in total. Four initiation codons (ATT, ATA, TTG, and ATG) and four termination codons (TAA, TAG, TA, and T) are used. ATT is the most frequently used initiation codon (six genes: cox1, cox2, nad5, nad1, nad4, and atp6), followed by ATA (three genes: nad3, cytb, and cox3), whereas TTG is used by nad6 and nad2. Among the stop codons, TAA is the most frequently used (seven times in total, by cox2, nad3, nad6, nad4L, nad1, atp6, and cox3). The genes nad5, cox3, and cox1 use the abbreviated termination codon T. The complete stop codon TAG is used by nad2, and the other incomplete stop codon, TA, is used by cytb. This pattern of codon usage is consistent with that in the mt genomes of other Trichostrongyloidea nematodes (Trichostrongylus axei, Trichostrongylus vitrinus, and Teladorsagia circumcincta) (Jex et al. 2010).
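
A tally of this kind can be sketched in Python as follows. The sketch assumes that each annotated coding sequence is given from its first start-codon base through its last stop-codon base, so that an incomplete stop codon (T or TA) shows up as a length that is not a multiple of three. The function name and the three toy sequences are hypothetical and are not taken from the M. marshalli genome.

```python
from collections import Counter

COMPLETE_STOPS = {"TAA", "TAG"}
INCOMPLETE_STOPS = {"T", "TA"}


def start_and_stop(cds):
    """Return (start codon, stop codon) for one annotated coding sequence.

    The stop is None if the sequence does not end in a recognised
    complete or abbreviated stop codon.
    """
    cds = cds.upper()
    start = cds[:3]
    remainder = len(cds) % 3
    if remainder == 0:
        tail = cds[-3:]
        stop = tail if tail in COMPLETE_STOPS else None
    else:
        # One or two trailing bases: candidate abbreviated stop codon.
        tail = cds[-remainder:]
        stop = tail if tail in INCOMPLETE_STOPS else None
    return start, stop


# Toy coding sequences (made up for illustration).
toy_cds = {
    "geneA": "ATTGGTTTATAA",  # ATT start, complete TAA stop
    "geneB": "ATAGGTTTAT",    # ATA start, abbreviated T stop
    "geneC": "TTGGGTTTATA",   # TTG start, abbreviated TA stop
}

codons = {gene: start_and_stop(seq) for gene, seq in toy_cds.items()}
start_usage = Counter(start for start, _ in codons.values())
stop_usage = Counter(stop for _, stop in codons.values())
print(codons)
```

Abbreviated stop codons are generally assumed to be completed to TAA by polyadenylation of the transcript, which is why they are counted separately from complete stop codons.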

The mt genome of M. marshalli contains 22 tRNA genes, ranging from 51 to 65 nucleotides in length. The rrnS gene is located between the trnE and trnS genes and is 696 bp in length. The rrnL gene is located between the trnH/T and nad3 genes and is 959 bp in length. Both rrnS and rrnL have high A+T contents (78.3% and 82.6%, respectively) (Table 3). The longer non-coding region (LNCR, 345 bp) is located between trnA and trnP, and the shorter one (SNCR, 78 bp) is located between the nad4 and cox1 genes (Table 2). The A+T content is 89.8% for the LNCR and 78.2% for the SNCR (Table 3). These non-coding regions may be important for replication and transcription, although the actual processes are still unknown (Shadel and Clayton 1997).
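
A+T content and strand skew for any region can be computed as in the sketch below. The skew formulas used, AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C), are the commonly used convention; the text does not state which formula underlies Table 3, so this is an assumption. The toy sequence is made up, and the whole-genome values are derived from the base counts reported in the text.

```python
def composition_stats(a, t, g, c):
    """Return (A+T percent, AT skew, GC skew) from base counts."""
    total = a + t + g + c
    at_percent = 100 * (a + t) / total
    at_skew = (a - t) / (a + t)   # negative when T outnumbers A
    gc_skew = (g - c) / (g + c)   # positive when G outnumbers C
    return at_percent, at_skew, gc_skew


def stats_from_sequence(seq):
    """Count bases in a sequence string and compute the same statistics."""
    seq = seq.upper()
    return composition_stats(
        seq.count("A"), seq.count("T"), seq.count("G"), seq.count("C")
    )


# Toy sequence (hypothetical): A=2, T=3, G=2, C=1.
toy_at, toy_at_skew, toy_gc_skew = stats_from_sequence("AATTTGGC")

# Whole-genome base counts as reported in the text.
genome_at, genome_at_skew, genome_gc_skew = composition_stats(4131, 6458, 2308, 994)
print(round(genome_at, 1), round(genome_at_skew, 3), round(genome_gc_skew, 3))
```

Under this convention, the reported genome-wide counts give a negative AT skew and a positive GC skew, meaning that T and G are overrepresented on the coding strand.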

Table 3 Nucleotide composition and skew of Marshallagia marshalli mitochondrial protein-coding genes

Phylogenetic analysis

The phylogenetic tree was inferred from the concatenated amino acid sequences of 12 nematode species representing the superfamily Trichostrongyloidea (Fig. 2). Our results supported the monophyly of each of the families Haemonchidae, Molineidae, and Dictyocaulidae with strong statistical support (Bayesian posterior probabilities = 1.0, Fig. 2) but rejected the monophyly of the family Trichostrongylidae, which is consistent with previous studies (Gouÿ de Bellocq et al. 2001; Lin et al. 2012; Zhao et al. 2014). Two species of the family Trichostrongylidae (Trichostrongylus axei and Trichostrongylus vitrinus) were more closely related to H. contortus, H. placei, and M. digitatus (Haemonchidae) than they were to the other two species of the family (T. circumcincta and M. marshalli). This close relationship between the two Trichostrongylus species and the Haemonchidae was moderately supported in BI (Bayesian posterior probability = 0.88, Fig. 2). N. oiratianus and N. spathiger (Molineidae) were more closely related to D. viviparus and D. eckerti (Dictyocaulidae) than to C. oncophora (Cooperiidae), Trichostrongylus spp., T. circumcincta and M. marshalli (Trichostrongylidae), or Haemonchus spp. and M. digitatus (Haemonchidae). These results are consistent with a previous study (Zhao et al. 2014).

Fig. 2 Phylogenetic relationships of M. marshalli and other Trichostrongyloidea nematodes. The tree was inferred by Bayesian inference (BI) from the concatenated amino acid sequence dataset for 12 protein-coding genes from 12 Trichostrongyloidea nematodes. Oesophagostomum quadrispinulatum (GenBank accession number NC_014181) was chosen as the outgroup

Significance and implications

Infections of animals with gastrointestinal nematodes (including marshallagiasis) can sometimes be diagnosed based on clinical signs such as diarrhea, anemia, mortality, or decreased fertility (Gasser et al. 2008). However, this approach is usually unreliable, because these clinical signs can be caused by one or more species of gastrointestinal nematodes or by other nematodes. In addition, larval stages of M. marshalli cannot be identified reliably by morphology. Fortunately, DNA-based approaches have been used as diagnostic methods for many nematodes (Fernández-Soto et al. 2016; Lodh et al. 2016; Roeber et al. 2017; Solórzanogarcía and Pérezponce 2017). Molecular markers, such as the first internal transcribed spacer (ITS-1) region of nuclear rDNA, have been used as alternative tools for clinical diagnosis and molecular epidemiological investigations of M. marshalli (Dallas et al. 2000; Dallas et al. 2001). The characterization of the mt genome of M. marshalli now provides novel genetic markers for developing new analytical and diagnostic tools.

Mt genome sequences, in particular the protein-coding gene sequences, have been used successfully to examine the systematic status of nematodes (Aghazadeh et al. 2015; Blouin 2002; Hawash et al. 2015; Liu et al. 2015; Sun et al. 2016, 2017; Wang et al. 2016; Kim et al. 2017). In this study, we therefore determined the mt genome of M. marshalli, which prompts a reassessment of the systematic relationships of Trichostrongyloidea nematodes using mt genomic datasets. The systematics of members of the Trichostrongyloidea (including Haemonchidae, Molineidae, Cooperiidae, Trichostrongylidae, and Dictyocaulidae) has been controversial. To date, many species of the family Trichostrongylidae remain underrepresented or unrepresented among sequenced mt genomes. Expanded taxon sampling is therefore necessary for future phylogenetic studies of Trichostrongylidae species using mt genomic datasets.

Conclusion

The present study determined the complete mt genome sequence of M. marshalli. Phylogenetic analyses rejected the monophyly of the family Trichostrongylidae. The availability of the M. marshalli mt genome sequences provides novel genetic markers for studying the systematics, population genetics, and molecular epidemiology of M. marshalli and its congeners.