Introduction

Saturnia japonica (Lepidoptera) is one of the most precious wild silkworms. It is mainly distributed in China, Japan, and Korea and typically feeds on walnut or chestnut leaves (Chen et al. 2020). S. japonica spins fluorescent, scintillating silk, which is often used as a raw material for anti-counterfeiting signs and high-grade clothing (Li et al. 2014). S. japonica is also a resource insect: its pupae are often eaten as a protein supplement.

The insect mitochondrial genome (mitogenome; mtDNA) is a closed-circular molecule of 14 to 19 kilobases (kb) that typically carries 37 conserved genes with very short or no intervals between them (Jiang et al. 2009): 13 protein-coding genes (PCGs), 22 transfer RNAs (tRNAs), and 2 ribosomal RNAs (rrnL and rrnS) (Boore 1999; Wolstenholme 1992). In addition, there is a poorly conserved adenine (A) + thymine (T)-rich region, also called the control region, which contains the initiation sites of genome transcription and replication (Cameron 2014; Kim et al. 2006; Shadel and Clayton 1993; Lu et al. 2013; Wolstenholme 1992). MtDNA is a powerful tool for studies of species identification (Dowton et al. 2002), genomic evolution (Miya et al. 2001), phylogeography (Avise et al. 1987), and molecular evolution (Forstén 1991) because of its simple genomic organization, high evolutionary rate, and nearly unambiguous homology (Zhang et al. 2015).

Lepidoptera is a very large order with more than 160,000 species, which can be divided into 45–48 superfamilies (Hao et al. 2012). Saturniidae contains 800 species distributed worldwide, more than 44 of which have been recorded in China. In addition to being important indicators for plant pest control in agriculture and forestry, they are also valuable for plant pollination (Chen et al. 2020). However, the mitogenomes of only a few species have been completely sequenced and made publicly available in GenBank (Table 1) (Kawaguchi and Nishida 2001; Sun et al. 2017).

Table 1 Details of the Lepidopteran mitogenomes used in this study

Understanding the genetic characteristics and phylogenetic status of S. japonica is important for improving its comprehensive utilization. In this study, the complete mitogenome of S. japonica was sequenced, annotated, and compared with those of other Lepidoptera species to provide data for studies of molecular evolution, comparative and evolutionary genomics, phylogenetics, and population genetics (Jiang et al. 2009).

Materials and Methods

Experimental Insects and DNA Extraction

Saturnia japonica pupae were collected from chestnut trees in the Dabie Mountains scenic area, Anhui Province, China. Fresh specimens were preserved in 100% ethanol and stored at −80 °C. Total DNA was extracted using the Genomic DNA Extraction Kit (Sangon Biotech Co. Ltd., Shanghai, China) according to the manufacturer’s instructions. The quality of the extracted DNA was examined by 1% (w/v) agarose gel electrophoresis, and the DNA was then used for PCR amplification.

Primer Design, PCR Amplification, and Sequencing

Fourteen pairs of primers amplifying overlapping fragments were designed based on conserved nucleotide sequences of previously reported Lepidopteran mitogenomes and synthesized by General Biosystems Co. Ltd. (Chuzhou, China). The complete list of successful primers is given in Table 2. All PCRs were performed in a 50 μL reaction volume comprising 35 μL sterilized distilled water, 5 μL 10× Taq buffer (Mg2+ plus), 4 μL dNTP (2.5 mM), 1.5 μL DNA template, 2 μL of each primer (10 μM), and 0.5 μL (1 unit) Taq DNA polymerase (TaKaRa Biotechnology Co., Dalian, China). The PCR amplification conditions were as follows: 4 min at 94 °C; 30 cycles of 30 s at 94 °C, 40 s at 46–56 °C, and 2–3 min (depending on the expected fragment length) at 72 °C; and a final extension at 72 °C for 10 min. The PCR products were checked by 1% (w/v) agarose gel electrophoresis and then purified using a DNA gel extraction kit (Sangon Biotech Co. Ltd., Shanghai, China). The purified fragments were ligated into the T-vector (TaKaRa Biotechnology Co., Dalian, China) and transformed into Escherichia coli TOP10. Recombinant colonies positive for the insert were sequenced bidirectionally at least three times by General Biosystems Co. Ltd. (Chuzhou, China).
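As a quick consistency check on the reaction recipe, the short Python sketch below sums the listed components and derives the final primer and dNTP concentrations. The dictionary keys and function names are illustrative only, and "each primer" is assumed to mean one forward and one reverse primer.

```python
# Reaction components in microlitres, as listed in the text.
# Assumption: "2 uL of each primer" means one forward and one reverse primer.
REACTION_UL = {
    "sterilized distilled water": 35.0,
    "10x Taq buffer (Mg2+ plus)": 5.0,
    "dNTP (2.5 mM)": 4.0,
    "DNA template": 1.5,
    "forward primer (10 uM)": 2.0,
    "reverse primer (10 uM)": 2.0,
    "Taq DNA polymerase": 0.5,
}


def total_volume(components):
    """Sum of all component volumes in microlitres."""
    return sum(components.values())


def final_concentration(stock_conc, stock_volume, reaction_volume):
    """Dilution formula C1 * V1 = C2 * V2, solved for C2."""
    return stock_conc * stock_volume / reaction_volume


if __name__ == "__main__":
    total = total_volume(REACTION_UL)
    print("total volume (uL):", total)
    print("final primer conc. (uM):", final_concentration(10.0, 2.0, total))
    print("final dNTP conc. (mM):", final_concentration(2.5, 4.0, total))
```

The components add up to the stated 50 μL, which corresponds to 0.4 μM of each primer and 0.2 mM dNTP in the final reaction.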

Table 2 Details of the primers used to amplify the mitogenome of S. japonica

Sequence Assembly and Gene Annotation

Sequence assembly and annotation were performed by comparison with previously sequenced Lepidoptera species using the BLAST tools available from the NCBI (https://blast.ncbi.nlm.nih.gov/Blast.cgi) and the SeqMan II program from the Lasergene software package (DNASTAR Inc., Madison, USA) (Guo et al. 2019). The protein-coding sequences were translated into putative proteins on the basis of the Invertebrate Mitochondrial Genetic Code. The skewness of the base composition was measured following Junqueira et al. (2004): AT skew = [A − T]/[A + T] and GC skew = [G − C]/[G + C]. The relative synonymous codon usage (RSCU) values were calculated using MEGA 7.0 (Tamura et al. 2011).
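The skew formulas can be illustrated with a minimal Python sketch. The function names are hypothetical and the example sequence is made up; this is not the software used in the study.

```python
from collections import Counter


def skew(x, y):
    """Return (x - y) / (x + y); x and y may be base counts or percentages."""
    if x + y == 0:
        raise ValueError("both values are zero, skew is undefined")
    return (x - y) / (x + y)


def sequence_skews(seq):
    """Return (AT skew, GC skew) for a nucleotide sequence."""
    counts = Counter(seq.upper())
    return skew(counts["A"], counts["T"]), skew(counts["G"], counts["C"])


if __name__ == "__main__":
    # Made-up toy sequence, for illustration only.
    at, gc = sequence_skews("AATTTGCC")
    print(round(at, 3), round(gc, 3))
```

Applied to the whole-genome composition reported in the Results (A 39.37%, T 41.30%, G 7.56%, C 11.77%), `skew(39.37, 41.30)` and `skew(7.56, 11.77)` round to −0.024 and −0.218, matching the reported AT and GC skewness.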

The tRNA genes were verified either with the tRNAscan-SE Search Server using the default settings (http://lowelab.ucsc.edu/tRNAscan-SE/) (Tamura et al. 2011) or by manually identifying sequences with the appropriate anticodon that can fold into the typical cloverleaf secondary structure (Lowe and Eddy 1997). Tandem repeats in the A + T-rich region were identified with the Tandem Repeats Finder program (http://tandem.bu.edu/trf/trf.html) (Dai et al. 2015).

Phylogenetic Analysis

To illustrate the phylogenetic relationships among Lepidoptera insects, 38 complete mitogenomes were downloaded from the GenBank database (Table 1). The mitogenomes of Cucujus clavipes (GU176341.1) (Song et al. 2010) and Cucujus haematodes (KX087268.1) were downloaded as outgroups. Multiple alignment of the concatenated nucleotide sequences of the 13 PCGs was conducted using MEGA version 7.0. The concatenated alignment was then used for phylogenetic analysis with the Maximum Likelihood (ML) and Neighbor Joining (NJ) methods based on the Kimura 2-parameter (K2P) model (Kimura et al. 1980). Evolutionary analysis was conducted in MEGA version 7.0 (Tamura et al. 2011). The ML tree was inferred with 1000 bootstrap replicates, and the NJ distance analysis was performed using PAUP* 4.0b10 (Thompson et al. 1997), also with 1000 bootstrap replicates. The consensus tree was visualized using FigTree v1.4.0 (http://tree.bio.ed.ac.uk/software/figtree/).
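For readers unfamiliar with the K2P model, the pairwise distance it defines can be sketched in a few lines of Python. This is only an illustration of the standard formula d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q), where P and Q are the proportions of transitions and transversions. The study itself relied on MEGA and PAUP, and the function name and gap handling here are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an assumption of this sketch, similar to pairwise deletion).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    w1 = 1.0 - 2.0 * p - q
    w2 = 1.0 - 2.0 * q
    if w1 <= 0 or w2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(w1) - 0.25 * math.log(w2)


if __name__ == "__main__":
    # Made-up toy alignment: one transition (A->G) and one transversion (A->C).
    print(k2p_distance("AAAAAAAAAA", "GCAAAAAAAA"))
```

A matrix of such distances across all 39 concatenated PCG sequences is the input that the NJ algorithm clusters into a tree.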

Results and Discussions

Genome Structure, Organization and Composition

The mitogenome of S. japonica is a circular molecule of 15,376 bp (Fig. 1), which is within the range observed for completely sequenced Lepidopteran species, from 15,113 bp in Lamproptera meges (Papilionidae) to 16,179 bp in Plutella xylostella (Plutellidae) (Table 1) (Swofford 2003). The sequence has been annotated and deposited in GenBank under accession number MT614593. The mitogenome structure of S. japonica conforms to the classic 38 regions of the Lepidopteran mitogenome: 13 protein-coding regions, 22 tRNA-coding regions, two rRNA-coding regions, and a large non-coding region with a high A + T content (Table 3) (Wei et al. 2013; Rand 1993). The gene order in the S. japonica mitogenome is trnM-trnI-trnQ, which differs from the ancestral gene order trnI-trnQ-trnM (Boore 1999). Most genes (23 of 37) are transcribed on the majority-coding strand (H-strand) and the rest on the minority-coding strand (L-strand). The composition and skewness of the S. japonica mitogenome are compared with those of other sequenced Lepidoptera species in Table 4. The genome composition of S. japonica is A: 39.37%, T: 41.30%, G: 7.56%, and C: 11.77%, with a total A + T content of 80.66%. This falls within the range reported for other Lepidoptera (A + T content from 78.26% in Tecia solanivora to 82.56% in Leucoptera malifoliella) (Table 4) (Mcknight and Shaffer 1997; Ramirez-Rios et al. 2016). Additionally, the genome exhibits negative AT skewness (−0.024) and negative GC skewness (−0.218). The AT skewness of other Lepidopteran mitogenomes sequenced to date ranges from 0.057 (A. cinerarium) to −0.036 (T. maculata) (Wu et al. 2012), while the GC skewness ranges from −0.247 (L. dispar) to −0.177 (A. ipsilon) (Table 4) (Liu et al. 2014). Similarly, the values for the 13 PCGs, tRNAs, rRNAs, and A + T-rich region of S. japonica all fall within the ranges observed in other Lepidoptera.

Fig. 1 Map of the mitogenome of S. japonica

Table 3 List of the annotated mitochondrial genes of S. japonica
Table 4 Composition and skewness in different Lepidopteran mitogenomes

Protein-Coding Genes and Codon Usage

The 13 protein-coding genes of the S. japonica mitogenome are 11,235 bp long in total (Appendix, Table 5) and account for 73.07% of the total nucleotides. Their negative AT skewness (−0.019) indicates that As occur less often than Ts. Nine of these PCGs (nad2, cox1, cox2, atp8, atp6, cox3, nad3, nad6, and cob) are encoded on the H-strand, while the remaining four (nad5, nad4, nad4L, and nad1) are encoded on the L-strand. Twelve PCGs begin with ATN codons (four with ATA, two with ATT, four with ATG, and two with ATC), while cox1 starts with CGA, as previously documented in Leucoma salicis (Wu et al. 2015). cox2 terminates with a single T and nad4 with TA, whereas the remaining 11 PCGs use the typical TAA termination codon (Table 3). Incomplete termination codons of this kind have been reported in most of the sequenced Lepidopteran mitogenomes (Sun et al. 2016). They are thought to be completed to functional stop codons by polyadenylation after endonucleolytic cleavage of the polycistronic pre-mRNA transcript (Liu et al. 2013).
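The start- and stop-codon categories described above can be expressed as a small Python sketch. It assumes that the annotated coding sequence is supplied in the 5'→3' coding orientation with the annotated boundaries, so that a length not divisible by three implies an incomplete stop codon. The function name and the toy sequences are hypothetical.

```python
def classify_ends(cds):
    """Classify the start codon and the termination signal of an annotated CDS.

    Assumption: a CDS whose length leaves a remainder of 1 or 2 after
    division by 3 ends in an incomplete stop codon (T or TA) that is
    completed to TAA by polyadenylation of the transcript.
    """
    cds = cds.upper()
    start = cds[:3]
    start_type = "ATN" if len(start) == 3 and start.startswith("AT") else "non-ATN"
    remainder = len(cds) % 3
    if remainder == 0:
        stop = cds[-3:]
        stop_type = "complete" if stop in {"TAA", "TAG"} else "none"
    else:
        stop = cds[-remainder:]
        stop_type = "incomplete" if stop in {"T", "TA"} else "none"
    return {"start": start, "start_type": start_type,
            "stop": stop, "stop_type": stop_type}


if __name__ == "__main__":
    # Made-up toy sequences, not real S. japonica genes.
    for toy in ("ATGAAATTTTAA", "CGAAAATTTT", "ATTAAATTTTA"):
        print(toy, classify_ends(toy))
```

In this scheme, cox1 with its CGA start would be reported as "non-ATN", and cox2 and nad4 would be reported as ending in the incomplete stops T and TA, respectively.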

Mitochondrial codon usage was analyzed in ten Lepidopteran insects from eight superfamilies: three species belonging to Bombycoidea and one each belonging to Pyraloidea, Noctuoidea, Papilionoidea, Geometroidea, Yponomeutoidea, Gelechioidea, and Hepialoidea (Fig. 2). The results revealed that Asn, Ile, Leu2, Lys, Met, Phe, and Tyr were the most frequently used amino acids, whereas the Arg codon family was the rarest. The codon distributions of the three Bombycoidea species are consistent with one another, with each amino acid present at comparable levels across the species (Appendix, Fig. 7).

Fig. 2 Comparison of codon usage in the mitogenome of Lepidoptera

The 13 PCGs of the S. japonica mitogenome contain all codons (Appendix, Fig. 8). This is similar to A. cinerarium (Wu et al. 2012), L. dispar, T. renzhiensis (Wich et al. 1986), T. solanivora (Mcknight and Shaffer 1997), and G. timur (Cao et al. 2012). In some Lepidopteran insects, however, codons with high GC content are absent, which is regarded as a mitochondrial feature (Lu et al. 2013; Chen et al. 2016); examples include B. mori (lack of GCG), C. suppressalis (lack of GCG & CGT), P. colligata (lack of GCG, GCG & CGT), and P. xylostella (lack of GCG) (Swofford 2003).
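To make the codon-usage terms concrete, the following Python sketch computes RSCU values and lists absent sense codons under the invertebrate mitochondrial genetic code (NCBI translation table 5). It is an illustration rather than the MEGA implementation used in the study: function names are hypothetical, and synonymous codons are grouped by amino acid without splitting Leu or Ser into two families.

```python
from collections import Counter, defaultdict
from itertools import product

# Invertebrate mitochondrial code (NCBI translation table 5), codons in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def count_codons(cds_list):
    """Count in-frame codons over a list of coding sequences."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3   # ignore an incomplete final codon
        for i in range(0, usable, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:       # skips codons with ambiguous bases
                counts[codon] += 1
    return counts


def rscu(cds_list):
    """RSCU = observed count / mean count of the codons for that amino acid."""
    counts = count_codons(cds_list)
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        for c in codons:
            values[c] = counts[c] * len(codons) / total if total else 0.0
    return values


def absent_codons(cds_list):
    """Sense codons that never occur in the given coding sequences."""
    counts = count_codons(cds_list)
    return sorted(c for c, aa in CODON_TABLE.items()
                  if aa != "*" and counts[c] == 0)


if __name__ == "__main__":
    toy = ["TTTTTTTTC"]                    # made-up CDS: TTT TTT TTC
    print(rscu(toy)["TTT"], rscu(toy)["TTC"])
```

An RSCU above 1 means a codon is used more often than expected under equal usage of its synonyms. Running `absent_codons` on the 13 PCGs of a species such as B. mori would be expected to report GCG, according to the text above.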

Ribosomal and Transfer RNA Genes

The S. japonica mitogenome contains the two rRNA genes typical of Lepidoptera. The large ribosomal gene (rrnL) is 1407 bp in length and resides between tRNALeu (CUN) and tRNAVal, whereas the small ribosomal gene (rrnS) is only 772 bp and is located between tRNAVal and the A + T-rich region (Table 3). The A + T content of the two rRNAs is 84.72%, which falls within the range reported for Lepidopterans, from 83.60% (T. solanivora) to 86.09% (L. malifoliella). In addition, the AT skewness (−0.037) and GC skewness (−0.351) of the two rRNAs are both negative (Appendix, Table 5) and lie within the range of reported Lepidopteran insects, similar to A. assama, A. ipsilon, and P. flavescens, among others.

The S. japonica mitochondrial genome harbors an entire set of 22 tRNA genes ranging from 63 bp (tRNACys) to 73 bp (tRNAAsp). Fourteen of the 22 tRNA genes are encoded on the H-strand and the remaining eight on the L-strand. The A + T content of the 22 tRNA genes is 81.79%, and both the AT skewness (−0.013) and the GC skewness (−0.138) of the tRNAs are negative (Appendix, Table 5). All tRNAs exhibit the typical cloverleaf secondary structure except trnS1, which lacks the DHU stem (Fig. 3), as in several other previously sequenced Lepidopteran insects (Dai et al. 2016; Liao et al. 2010).

Fig. 3 Putative secondary structures of the 22 tRNAs of the S. japonica mitogenome

A total of 12 mismatched base pairs were identified in the S. japonica tRNAs. Seven of the 12 are G–U wobble pairs scattered across seven tRNAs (two in the acceptor stem, four in the DHU stem, and one in the anticodon stem). The remainder are one A–A mismatch in the DHU stem of trnG, three U–U mismatches in the acceptor stems of trnA, trnL2, and trnN, and one U–U mismatch in the anticodon stem of trnV (Fig. 3).
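The distinction between canonical pairs, G–U wobble pairs, and true mismatches can be sketched as follows. The function names and the toy stem are hypothetical, and the stem strands are assumed to be given 5'→3' so that the first base of one strand pairs with the last base of the other.

```python
WATSON_CRICK = {frozenset("AU"), frozenset("GC")}
WOBBLE = {frozenset("GU")}


def classify_pair(base1, base2):
    """Classify one RNA base pair; DNA-style T is treated as U."""
    pair = frozenset((base1.upper().replace("T", "U"),
                      base2.upper().replace("T", "U")))
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "G-U wobble"
    return "mismatch"


def classify_stem(five_prime_strand, three_prime_strand):
    """Classify every pair in a stem; both strands are written 5'->3'."""
    if len(five_prime_strand) != len(three_prime_strand):
        raise ValueError("stem strands must have equal length")
    return [classify_pair(a, b)
            for a, b in zip(five_prime_strand, reversed(three_prime_strand))]


if __name__ == "__main__":
    # Made-up 4 bp stem whose last pair is a U-U mismatch.
    print(classify_stem("GGAU", "UUCC"))
```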

Overlapping and Intergenic Spacer Regions

The mitogenome of S. japonica contains 42 bp of overlapping nucleotides, located in 12 pairs of adjacent genes, with overlaps of 1 to 17 bp; the longest overlap (17 bp) lies between trnF and nad5. In addition, there are 194 bp of intergenic nucleotides (IGN) distributed among 26 pairs of adjacent genes, with spacers of 1 to 53 bp; the longest spacer lies between trnQ and nad2, as is usual in Lepidopteran mitogenomes (Wu et al. 2010). Notably, the seven-nucleotide sequence ‘ATGATAA’ is present in the overlapping region between atp8 and atp6 in all ten Lepidopteran insects examined (Fig. 4), a common feature across Lepidopteran mitogenomes (Zhu et al. 2013). The 22 bp spacer between trnS2 (UCN) and nad1 contains the motif ‘ATACTAA,’ which is highly conserved and found in most insect mtDNAs, whereas Hepialoidea contains only the motif ‘ATACTA’ (Fig. 5A). The fragment ‘ATACTA’ is therefore the highly conserved core, which may be the binding site of the mitochondrial transcription termination peptide (mtTERM protein) (Taanman 1999).
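Locating such conserved motifs in a spacer or overlap region amounts to a simple substring search, as in the Python sketch below. The function name is hypothetical and the example sequences are made up rather than taken from any mitogenome.

```python
def find_motif(seq, motif):
    """Return the 0-based start positions of every (possibly overlapping)
    occurrence of motif in seq, ignoring case."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    # Made-up spacer sequences, for illustration only.
    full_motif_spacer = "TTATACTAAGG"
    short_motif_spacer = "GGATACTAGG"
    for spacer in (full_motif_spacer, short_motif_spacer):
        print(spacer,
              find_motif(spacer, "ATACTAA"),
              find_motif(spacer, "ATACTA"))
```

A spacer carrying only the shorter ‘ATACTA’ core, as described for Hepialoidea, returns no hit for the seven-nucleotide motif but one hit for the six-nucleotide core.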

Fig. 4 Alignment of overlapping region between atp8 and atp6 across Lepidoptera and other insects

Fig. 5 (A) Alignment of the intergenic spacer region between trnS2 (UCN) and nad1 of several Lepidopteran insects. The shaded ‘ATACTAA’ motif is conserved across the Lepidoptera order. (B) Features present in the A + T-rich region of S. japonica. The sequence is transcribed in the reverse strand. The ATATG motif is shaded. The poly-T stretch is underlined while the poly-A stretch is boldly underlined. The single-microsatellite T/A repeats sequence is indicated by dotted underlining.

Fig. 6 Phylogenetic relationships among species. The tree, constructed using Maximum Likelihood (ML) and Neighbor Joining (NJ) with 1000 bootstrap replicates, shows the phylogenetic relationships among 39 species. Cucujus clavipes (GU176341.1) and Cucujus haematodes (KX087268.1) were used as outgroups. The ML tree is shown, with bootstrap values from ML and NJ given above and below the nodes, respectively.

The A + T-Rich Region

The A + T-rich region of the S. japonica mitogenome is located between rrnS and trnM and is 332 bp long, remarkably shorter than that of T. renzhiensis (1367 bp) and longer than that of L. botrana (286 bp) (Appendix, Table 5). This region has the highest A + T content (91.87%) in the mtDNA, and its AT skewness (−0.082) and GC skewness (−0.481) are both negative (Appendix, Table 5). Several short repeated sequences are scattered throughout the region, including the motif ‘ATAGA’ followed by a 17 bp poly-T stretch, a microsatellite-like (AT)9 element, and a poly-A element upstream of the trnM gene, as in other Lepidopteran mitogenomes (Fig. 5B). The length of the poly-T stretch is variable (Lu et al. 2013), while the ‘ATAGA’ region is highly conserved among Lepidoptera species (Cameron and Whiting 2008).
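The features listed above (A + T content, a poly-T stretch, and an (AT)n microsatellite-like element) can be located with a few lines of Python using regular expressions. This is a minimal sketch with hypothetical function names, run on a made-up toy sequence that merely mimics the described layout; the study itself used the Tandem Repeats Finder program.

```python
import re


def at_content(seq):
    """Percentage of A and T bases in a sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def find_runs(seq, unit, min_repeats):
    """Return (start, matched_text) for every run of at least min_repeats
    consecutive copies of unit, e.g. unit='T' for poly-T or 'AT' for (AT)n."""
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit.upper()), min_repeats))
    return [(m.start(), m.group(0)) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Made-up toy sequence: ATAGA motif, 17 bp poly-T, spacer, (AT)9, poly-A.
    toy = "ATAGA" + "T" * 17 + "GC" + "AT" * 9 + "A" * 5
    print("A+T content (%):", round(at_content(toy), 2))
    print("poly-T runs (>= 10 T):", find_runs(toy, "T", 10))
    print("(AT)n runs (n >= 5):", find_runs(toy, "AT", 5))
```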

Phylogenetic Analysis

In this study, the nucleotide sequences of the 13 PCGs were concatenated and aligned to reconstruct the phylogenetic relationships among 39 Lepidoptera insects using the Maximum Likelihood (ML) and Neighbor Joining (NJ) methods (Chai et al. 2012; Hassanin 2006). Species clustered by family (Fig. 6). S. japonica fell within the family Saturniidae (Bombycoidea) and was most closely related to S. boisduvalii and S. jonasii, which is consistent with the conclusion of Kim et al. (2015). Bombycoidea clustered with the other superfamilies in the following order: Geometroidea, Noctuoidea, Pyraloidea, Gelechioidea, Papilionoidea, Tortricoidea, and Yponomeutoidea. In the trees constructed by both the ML and NJ methods, Saturnia japonica, Saturnia boisduvalii, Saturnia jonasii, Eriogyna pyretorum, Antheraea assama, and Samia cynthia ricini grouped within Saturniidae, in agreement with traditional taxonomy. The other superfamilies showed a similar pattern, and the results of the phylogenetic analysis are completely consistent with traditional entomological taxonomy.