Introduction

Alternative synonymous codons do not occur with equal frequencies in genes; instead, most organisms follow a particular codon usage pattern (Babbitt et al. 2014). For each amino acid, one synonymous codon tends to be preferred over the others. This phenomenon is termed codon usage bias. The majority of genes in a genome show similar choices among synonymous codons (Grantham et al. 1980). Proteins with the same amino acid sequence but encoded by different synonymous codons may differ in structure and function (Buhr et al. 2016). According to the literature, the factors that shape codon usage preference include mutational bias, translational and transcriptional selection, tRNA abundance, RNA stability, GC content, gene length and protein structure (Hunt et al. 2014; Pop et al. 2014). Analysis of codon usage bias can provide information on the molecular evolution of species and on the mechanism of action of proteins, and can help to optimize expression systems for the production of heterologous proteins (Plotkin and Kudla 2010; Zhou et al. 2016).

Members of the family Baculoviridae have rod-shaped, enveloped virions that carry a circular double-stranded DNA genome ranging in size from 80 to 180 kbp. Traditionally, baculoviruses were classified morphologically into two subgroups: nuclear polyhedrosis viruses (NPVs) and granulosis viruses (GVs). The current classification divides baculoviruses into four genera: alphabaculovirus (Lepidoptera NPV), betabaculovirus (Lepidoptera GV), gammabaculovirus (Hymenoptera NPV) and deltabaculovirus (Diptera NPV) (Miele et al. 2011; Miller 1997). The alphabaculoviruses can be further subdivided into group I and group II on the basis of phylogenetic studies (Wang et al. 2014; Zhu et al. 2014). Based on the effective number of codons (ENc), a previous study found that most sequenced baculovirus species do not show strong codon usage bias (Jiang et al. 2008). Autographa californica multiple nucleopolyhedrovirus (AcMNPV), which has a wide host range of 33 lepidopteran species in ten families, has been studied most comprehensively at the molecular level because of its use as a foreign protein expression vector, gene therapy tool and arthropod-specific non-chemical biopesticide.

The helicase gene (also termed p143) of AcMNPV encodes a protein with a predicted size of 143 kDa that contains helicase motifs and can unwind double-stranded DNA to promote replication of the viral genome (McDougal and Guarino 2001). A recent study showed that the coding region of AcMNPV helicase contains a replication origin (ori) that supports replication in both insect and mammalian cells (Wu et al. 2014). The ori can also promote gene expression in mammalian cells. Moreover, AcMNPV helicase is related to host range and shows species specificity (Croizier et al. 1994; Hamajima et al. 2015). Although the functions of helicase in AcMNPV have been well studied, its synonymous codon usage bias has not yet been comprehensively reported.

In this study, the synonymous codon usage bias of the helicase gene of the AcMNPV C6 strain was analyzed and compared with that of 41 other baculoviruses. We then explored the factors that may contribute to the codon usage bias. Moreover, the codon usage preference of AcMNPV helicase was compared with those of E. coli, yeast, mouse and human. These results may provide clues to the genetic evolution of baculoviruses and the possible functional mechanism of helicase.

Materials and methods

Sequence data

The nucleotide sequences of the helicase genes of AcMNPV clone C6 (accession number: L22858.1) and 41 other baculoviruses (Table 1) were downloaded from the NCBI GenBank database (http://www.ncbi.nlm.nih.gov/).

Table 1 Summary analysis of the helicase gene of AcMNPV C6 strain and 41 reference baculoviruses

Codon usage analysis of helicase genes of AcMNPV C6 strain and 41 reference baculoviruses

To analyze the codon usage bias in the helicase genes of AcMNPV and other baculoviruses, the following indices were calculated.

CAI is the codon adaptation index. It is a widely used measure of synonymous codon usage in the genes of many organisms (Carbone et al. 2003). In addition, this index can be used to predict the expression level of a gene and to assess the adaptation of viral genes to their hosts (Sharp and Li 1987). CAI values range from 0 to 1, and higher values denote stronger codon usage bias.

ENc is the effective number of codons. ENc values range from 20 to 61, and smaller values indicate stronger codon usage bias. ENc = 20 means that only one codon is used for each amino acid, whereas ENc = 61 means that all synonymous codons are used with equal frequency. ENc is therefore a useful indicator of codon bias.
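As a rough illustration of how ENc can be obtained from a single coding sequence, the following Python sketch uses the commonly cited formulation of Wright (1990), ENc = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, where each F is the mean codon homozygosity of the amino acids in one degeneracy class. It assumes the standard genetic code and omits the corrections proposed for missing amino acids; the function and variable names are illustrative and are not taken from CodonW.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built in TCAG order (assumption: no alternative code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def effective_number_of_codons(seq):
    """Simplified ENc; raises ValueError if a degeneracy class is unusable."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))

    # Group sense codons by the amino acid they encode.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    # Homozygosity F = (n * sum(p^2) - 1) / (n - 1) for each amino acid,
    # collected by degeneracy class (2, 3, 4 or 6 synonymous codons).
    f_by_class = defaultdict(list)
    for fam in families.values():
        k = len(fam)
        n = sum(counts[c] for c in fam)
        if k == 1 or n < 2:
            continue  # Met, Trp, or too few observations
        sum_p2 = sum((counts[c] / n) ** 2 for c in fam)
        f_by_class[k].append((n * sum_p2 - 1) / (n - 1))

    enc = 2.0  # Met and Trp contribute one codon each
    for k, n_amino_acids in ((2, 9), (3, 1), (4, 5), (6, 3)):
        values = f_by_class[k]
        mean_f = sum(values) / len(values) if values else 0.0
        if mean_f <= 0:
            raise ValueError(f"cannot estimate F for the {k}-fold class")
        enc += n_amino_acids / mean_f
    return min(enc, 61.0)
```

For a sequence in which every amino acid is always encoded by the same codon, each F equals 1 and the function returns 20, the lower bound described above.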

GC is the frequency of the nucleotides G + C in the coding sequence, and GC3s is the frequency of G + C at the third position of synonymous codons (that is, excluding Met, Trp and termination codons). Both are effective indicators of base composition bias.
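The two composition measures can be sketched in a few lines of Python. This is a minimal illustration that assumes an in-frame coding sequence and the standard genetic code; the helper names are hypothetical and the values reported in this study were obtained with CodonW, not with this code.

```python
# Codons without synonymous alternatives (Met, Trp) and termination codons.
NON_SYNONYMOUS = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def split_codons(seq):
    """Split an in-frame sequence into complete codons."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def gc_content(seq):
    """Fraction of G + C over all nucleotides."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc3s(seq):
    """Fraction of G + C at the third position of synonymous codons."""
    third = [c[2] for c in split_codons(seq) if c not in NON_SYNONYMOUS]
    return sum(base in "GC" for base in third) / len(third)
```

For the toy sequence ATGGCCGCTTGGAAA, only GCC, GCT and AAA count toward GC3s, so one of three third positions is G or C.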

RSCU is the relative synonymous codon usage. It is used to investigate the overall synonymous codon usage bias among genes, and is calculated as the ratio of the observed frequency of a codon to the frequency expected if all synonymous codons for that amino acid were used equally (Sharp and Li 1986). An RSCU value > 1.0 means that the codon is used more frequently than expected, a value of 1.0 indicates no codon usage bias, and a value < 1.0 means that the codon is used less frequently than expected.
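The RSCU definition can be made concrete with a short Python sketch. It assumes the standard genetic code and an in-frame sequence; the function name is illustrative, and amino acids that do not occur in the sequence are simply skipped, which is a simplification.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built in TCAG order (assumption: no alternative code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def rscu(seq):
    """Return {codon: RSCU} for the 59 codons that have synonymous alternatives."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))

    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue  # termination codons, Met and Trp
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            # observed count divided by the count expected under equal usage
            values[codon] = counts[codon] * len(family) / total
    return values
```

For example, in the toy sequence GCTGCTGCCGCA the four alanine codons are expected once each, so GCT (observed twice) has an RSCU of 2.0 and GCG (not observed) has an RSCU of 0.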

The fraction is the proportion of each codon among the synonymous codons encoding the same amino acid. The frequency of each codon is the number of its occurrences per 1000 bp of coding sequence.

The relative abundance of dinucleotides (RAD) in the helicase genes of baculoviruses was calculated as an odds ratio, as previously described (Burge et al. 1992). The odds ratio is defined as

$$\rho_{xy}=\frac{f_{xy}}{f_x \times f_y},$$

where $f_x$ is the frequency of nucleotide X, $f_y$ is the frequency of nucleotide Y, and $f_{xy}$ is the frequency of the dinucleotide XY. Relative to a random association of mononucleotides, the dinucleotide XY is considered over-represented if $\rho_{xy} > 1.23$ and under-represented if $\rho_{xy} < 0.78$.
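The odds ratio above translates directly into code. The following Python sketch is an illustration only: it assumes that dinucleotides are counted in overlapping windows along a single strand, and the function names are hypothetical.

```python
from collections import Counter


def dinucleotide_odds_ratios(seq):
    """Return {XY: f_xy / (f_x * f_y)} for nucleotides present in seq."""
    seq = seq.upper()
    n = len(seq)
    mono = Counter(seq)
    # Overlapping dinucleotides: positions (0,1), (1,2), ...
    di = Counter(seq[i:i + 2] for i in range(n - 1))

    ratios = {}
    for x in "ACGT":
        for y in "ACGT":
            if mono[x] == 0 or mono[y] == 0:
                continue  # ratio undefined when a nucleotide is absent
            f_x, f_y = mono[x] / n, mono[y] / n
            f_xy = di[x + y] / (n - 1)
            ratios[x + y] = f_xy / (f_x * f_y)
    return ratios


def classify(rho, high=1.23, low=0.78):
    """Apply the over- and under-representation thresholds from the text."""
    if rho > high:
        return "over-represented"
    if rho < low:
        return "under-represented"
    return "as expected"
```

For the toy sequence AACC, the dinucleotide AA has a frequency of 1/3 and each mononucleotide a frequency of 0.5, giving an odds ratio of 4/3, which falls above the 1.23 threshold.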

Neutrality plot (GC12 vs. GC3). Correlation analysis between the mean GC content at the first and second codon positions (GC12) and the GC content at the third position (GC3) was used to assess the influence of mutation pressure and translational selection on base composition (Wang et al. 2016).
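A neutrality plot needs one (GC3, GC12) point per gene and, usually, the slope of the regression of GC12 on GC3; a slope close to 1 is generally read as mutation pressure dominating, and a slope close to 0 as selection dominating. The sketch below is a minimal Python illustration under the assumption that all codons of an in-frame sequence are counted; the function names are hypothetical and the values in this study came from EMBOSS CUSP.

```python
def gc12_gc3(seq):
    """Return (GC12, GC3) for one in-frame coding sequence."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]

    def gc_at(position):
        return sum(c[position] in "GC" for c in codons) / len(codons)

    gc1, gc2, gc3 = gc_at(0), gc_at(1), gc_at(2)
    return (gc1 + gc2) / 2, gc3


def regression_slope(x, y):
    """Least-squares slope of y on x (x = GC3, y = GC12 in a neutrality plot)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var
```

In use, `gc12_gc3` would be applied to each helicase gene and the resulting lists passed to `regression_slope`.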

Parity rule 2 (PR2) plot. A PR2 plot was constructed to examine the effect of mutation pressure and translational selection on the codon usage pattern (Sueoka 1999). The plot shows the AU-bias [A3/(A3 + T3)] as the ordinate and the GC-bias [G3/(G3 + C3)] as the abscissa at the third position of the four-codon amino acids (alanine, arginine, glycine, leucine, proline, serine, threonine and valine). The centre of the plot, where both coordinates are 0.5, represents the absence of bias between the influence of mutation and that of translational selection.
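The coordinates of one gene in a PR2 plot can be sketched as follows. This Python illustration assumes the standard genetic code, in which the fourfold degenerate codons are those beginning with GC, GG, CC, AC, GT, CG, CT or TC (the fourfold parts of Arg, Leu and Ser included); the function name is hypothetical.

```python
# First two bases of the fourfold degenerate codon families
# (Ala, Gly, Pro, Thr, Val, and the fourfold parts of Arg, Leu, Ser).
FOURFOLD_PREFIXES = {"GC", "GG", "CC", "AC", "GT", "CG", "CT", "TC"}


def pr2_point(seq):
    """Return (G3/(G3+C3), A3/(A3+T3)) over fourfold degenerate codons."""
    seq = seq.upper().replace("U", "T")
    third = {"A": 0, "T": 0, "G": 0, "C": 0}
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in third:
            third[codon[2]] += 1

    at_total = third["A"] + third["T"]
    gc_total = third["G"] + third["C"]
    # None signals that the bias is undefined for this sequence.
    at_bias = third["A"] / at_total if at_total else None
    gc_bias = third["G"] / gc_total if gc_total else None
    return gc_bias, at_bias
```

A gene that obeys PR2 exactly would return (0.5, 0.5), the centre of the plot.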

CAI, ENc, GC, GC3s, A3s, T3s, C3s, G3s, hydrophobicity (GRAVY), aromaticity (AROMO) and RSCU were calculated with the CodonW 1.4.2 program (Peden 2005; http://codonw.sourceforge.net/). Correspondence analysis (CA) based on RSCU values was also performed with CodonW 1.4.2. GC12, GC3, frequency and fraction were calculated with the EMBOSS CUSP program (Rice et al. 2000). The heatmap based on RSCU values was produced with HemI software (Deng et al. 2014). The mutation analysis of baculovirus helicases was carried out with Multalin (Corpet 1988).

Comparison between the codon preference of helicase of AcMNPV and those of E. coli, yeast, mouse and human

To examine whether different species have similar codon usage patterns and to select a suitable expression system, the codon usage bias of AcMNPV helicase was compared with those of E. coli, yeast, mouse and human. The codon usage frequency data for E. coli, yeast, mouse and human were obtained from http://www.kazusa.or.jp/codon.

Statistical analysis

Correlation analysis was performed using Spearman’s rank correlation test. The values of CAI, ENc, GC content and GC3s content were compared between NPVs and GVs by one-way ANOVA. All statistical analyses were conducted using the SPSS 13.0 software package (SPSS Inc. 2003). P values < 0.05 were considered statistically significant.
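For readers who want to reproduce the correlation coefficients without SPSS, Spearman's coefficient can be sketched in plain Python as the Pearson correlation of the ranks, with tied values given their average rank. The function names are illustrative, the input values in any example are made up, and the calculation of the P value is not shown.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation coefficient of two equal-length lists."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)
```

Here `x` and `y` would be, for example, the per-gene ENc and GC3s values.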

Results

Analysis of the codon usage of the helicase genes in 42 baculoviruses

The values of CAI, ENc, GC content, GC3s content and mononucleotide frequency are shown in Table 1. The results indicate that the codon usage of helicases is generally random.

The CAI values of helicases in the 42 baculoviruses range from 0.190 to 0.288, with a mean of 0.246 and a standard deviation (SD) of 0.022 (Table 1). The mean CAI value of helicases in NPVs is 0.248 (0.190–0.288; SD, 0.024), whereas that in GVs is 0.240 (0.214–0.267; SD, 0.016). The mean CAI values of NPV and GV helicases do not differ significantly (one-way ANOVA, P > 0.05).

The ENc values of helicases in the 42 baculoviruses vary from 36.030 to 56.560, with a mean of 50.539 and an SD of 3.766, indicating that the codon usage bias of helicases is weak. The ENc values of helicases of NPVs range from 36.030 to 54.360, with a mean of 49.739 and an SD of 3.700, whereas those of GVs range from 47.750 to 56.560, with a mean of 52.792 and an SD of 3.082. The ENc values of NPVs are significantly lower than those of GVs (one-way ANOVA, P < 0.05). In particular, the ENc value of LdMNPV helicase is 36.03, the lowest among the baculoviruses examined, indicating a somewhat stronger codon usage bias.

The mean GC content of helicases in the 42 baculoviruses is 40.112% (31.80–57.30%; SD, 5.264), and the mean GC3s content is 49.319% (29.00–85.20%; SD, 11.334). In NPVs, the mean GC content is 40.80% (31.80–57.30%; SD, 5.672) and the mean GC3s content is 50.80% (29.00–85.20%; SD, 11.869). In GVs, the mean GC content is 38.18% (32.20–44.80%; SD, 3.402) and the mean GC3s content is 45.14% (29.00–63.00%; SD, 8.835). The GC and GC3s contents of NPVs do not differ significantly from those of GVs (one-way ANOVA, P > 0.05).

Correlation analysis among ENc, GC3s, CAI and gene length

The ENc value is known to correlate with the GC3s content, and a plot of ENc against GC3s can be used to detect heterogeneity of codon usage (Wright 1990). If GC3s is the only factor shaping the codon usage pattern, the ENc values fall on a continuous curve that represents random codon usage (Jiang et al. 2007; Wright 1990). Here, the ENc value of each helicase was plotted against its GC3s (Fig. 1a). Although only a few points lie on the expected curve, the bulk of the points lie near it, indicating that these helicases are subject to GC compositional constraints (r = −0.437, P < 0.01).
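The expected curve is usually drawn from the approximation given by Wright (1990), ENc = 2 + s + 29/(s² + (1 − s)²), where s is the GC3s expressed as a fraction. The Python sketch below assumes that this standard form is the one behind the curve discussed here; the function name is illustrative.

```python
def expected_enc(gc3s):
    """Expected ENc when codon usage is shaped only by GC3s (0 <= gc3s <= 1)."""
    s = gc3s
    return 2 + s + 29 / (s ** 2 + (1 - s) ** 2)


# Points for drawing the expected curve from GC3s = 0.00 to 1.00.
curve = [(i / 100, expected_enc(i / 100)) for i in range(101)]
```

Genes whose observed ENc lies well below the value returned for their GC3s are the ones for which factors other than GC composition are likely to contribute to codon bias.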

Fig. 1 Correlation analysis among ENc, GC3s and gene length of helicase genes in 42 baculoviruses. a Plot of GC3s versus ENc for the helicase genes. The solid curve shows the expected ENc value if the codon usage is only determined by the variation in GC3s. b Plot of gene length versus ENc for the helicase genes. c Plot of gene length versus GC3s for the helicase genes

In addition, plots of gene length against ENc (Fig. 1b) and GC3s (Fig. 1c) were used to show the distribution of the helicases. The helicase genes of NPVs are significantly longer than those of GVs (one-way ANOVA, P < 0.01). Fig. 1b, c suggests that longer genes have a relatively higher variance of ENc and GC3s values than shorter genes. Correlation analysis showed that gene length is significantly correlated with GC3s (r = 0.376, P < 0.05) but not with ENc (r = −0.114, P > 0.05).

The effect of mutation pressure on the codon usage pattern of helicases of baculoviruses

The codon usage of genes from species with A/T-rich genomes may differ from that of genes from species with G/C-rich genomes. The correlation between CAI and ENc has been used to assess the effects of mutation pressure and translational selection on codon usage bias (Vicario et al. 2007). If the correlation coefficient (r) between the two indices approaches −1, translational selection is the dominant force; if r approaches 0 (no correlation), mutation pressure may be more influential. The CAI values of baculovirus helicases show no significant correlation with the ENc values (r = −0.243, P > 0.05), which points to an influence of mutation pressure on the codon usage pattern of baculovirus helicases.

To further determine the effects of mutation pressure and translational selection on the codon usage bias of helicases of baculoviruses, neutrality plot analysis was performed. There is a significant correlation between GC12 and GC3 (r = 0.615, P < 0.01), indicating that mutation pressure affects the codon usage bias of helicases of baculoviruses.

To determine the relationship between pyrimidine (C and T) and purine (A and G) contents in fourfold degenerate codon families, PR2 plot analysis was carried out. A and G are used more frequently than T and C in fourfold degenerate codon families (Fig. 2). This result indicates that mutation pressure and other factors may predominantly shape the codon usage bias of helicases of baculoviruses.

Fig. 2 Parity rule 2 (PR2) plot of [A3/(A3 + T3)] against [G3/(G3 + C3)]. The PR2 plot was calculated for each helicase gene among the 42 baculoviruses

Variation in gene codon usage and amino acid composition of AcMNPV helicase

The overall codon preference of the helicase gene in the AcMNPV C6 strain is shown in Table 2. RSCU values were calculated for 59 codons (excluding those for Met and Trp and the termination codons). The RSCU values of 25 codons are greater than 1, indicating that these codons are used more frequently than expected; this is consistent with the corresponding fractions and frequencies. Of these 25 codons, 16 end in G or C and nine end in A or T. The amino acids Ala, Gly, Leu, Pro, Arg, Ser, Thr and Val show a higher diversity of codon usage owing to their sixfold or fourfold coding degeneracy. The amino acids Cys, Asp, Glu, Phe, Ile, His, Lys, Asn and Gln also show a high diversity of codon usage, although they have only twofold or threefold degeneracy. Thus, the most and least frequently used codons differ for each amino acid in AcMNPV helicase.

Table 2 The result of codon preferences in AcMNPV C6 strain helicase gene analyzed using the CUSP program

Relationship between dinucleotide biases and codon usage in helicases of baculoviruses

Dinucleotide bias analysis indicated that dinucleotides are not randomly distributed in helicases (Table 3). In particular, the dinucleotides CG, TG and TT are over-represented (ρxy > 1.23), whereas AG, CT and TA are under-represented (ρxy < 0.78). Moreover, the RSCU values of AcMNPV helicase showed that most codons containing AG, CT or TA are not preferentially used, including GCT, GAG, ATA, AAG, CTA, CTC, CTG, CTT, TTA, CCT, CAG, AGG, AGT, TCT, ACT, and TAT. Overall, dinucleotide composition plays a role in shaping the synonymous codon usage pattern of helicases of baculoviruses.

Table 3 Relative abundance of the 16 dinucleotides in helicases of 42 baculoviruses

Effect of other factors on codon usage of helicases of baculoviruses

GRAVY and AROMO can also affect the codon usage bias of viruses (Wang et al. 2016). Correlation analysis indicated that AROMO is positively correlated with Axis2 and ENc, whereas GRAVY is negatively correlated with Axis2 and ENc. However, neither AROMO nor GRAVY is correlated with Axis1, GC3s or GC (Table 4). This result indicates that aromaticity and protein hydrophobicity do not play an essential role in the codon usage pattern of helicases of baculoviruses.

Table 4 The correlation analysis between the AROMO, GRAVY, the first two axes, GC3s, ENc, and GC contents in helicase genes of baculoviruses

Phylogenetic persistence in codon usage bias of the AcMNPV helicase gene and amino acid mutation analysis

To visualize the variation in codon usage bias, a cluster analysis was performed on the RSCU values of the helicase genes of the AcMNPV C6 strain and 41 other baculoviruses (Table 2). In the heatmap (Fig. 3), AcMNPV, PlxyMNPV, BomaNPV, BmNPV, RoMNPV and MaviMNPV of group I alphabaculovirus first cluster together to form a separate branch, then cluster with other members of group I alphabaculovirus, such as ThorNPV, AngeNPV, ChfuNPV and HycuNPV, and subsequently cluster with some strains of group II alphabaculovirus. These results show the relationship in codon usage between AcMNPV and other alphabaculoviruses, and indicate that the more distant the genetic relationship, the wider the expected variation in codon usage bias. Overall, the codon usage preference of AcMNPV is fairly close to that of the group I alphabaculoviruses.

Fig. 3 Heatmap of RSCU values for helicase genes of 42 baculoviruses (clustered by RSCU values). Each column represents a codon, and each row represents a different species. The cluster is shown on the right

Three amino acids at positions 556, 564 and 577 of AcMNPV helicase have been found to be involved in host range extension of AcMNPV (Croizier et al. 1994). Therefore, an amino acid alignment of the helicases of the 42 baculoviruses was carried out. The most common residue is Val at position 556 (n = 23, where n is the number of baculovirus species whose helicase contains this amino acid), Ser at position 564 (n = 16) and Phe at position 577 (n = 17) (Table 5).

Table 5 Three mutation sites and amino acid mutations of helicases of baculoviruses

Correspondence analysis and correlation analysis: compositional characteristics of helicases of baculoviruses

The A, T, C, G and GC contents were compared with the A3s, T3s, C3s, G3s and GC3s contents (Table 6). Both the A and T contents show a significant positive correlation with the A3s or T3s contents and a significant negative correlation with the C3s, G3s and GC3s contents. Conversely, the C, G and GC contents show a significant negative correlation with the A3s and T3s contents and a significant positive correlation with the C3s, G3s and GC3s contents. These data indicate that nucleotide composition constraints can influence the codon usage pattern of helicases of baculoviruses.

Table 6 The correlation analysis between the A, T, C, G, GC contents and A3s, T3s, C3s, G3s, GC3s contents in helicase genes of baculoviruses

Correspondence analysis was performed to identify the main trends in codon usage variation and the distribution of the helicase genes along the continuous axes. The positions of the helicase genes on the first (Axis1) and second (Axis2) axes are shown in Fig. 4. The first axis accounts for 43.79% of the total variation and the second for 8.49%. Hence, the first axis explains a substantial part of the total variation in synonymous codon usage of helicase. Correlation analysis indicated that Axis1 has a significant positive correlation with ENc and with the A, T, A3s and T3s contents, and a significant negative correlation with the C, G, C3s, G3s, GC and GC3s contents, whereas Axis2 is positively correlated only with ENc (Table 7). Overall, the results indicate that mutation pressure arising from base composition contributes to shaping the codon usage of helicases of baculoviruses.

Fig. 4 Scatter plot of the values of the first and second axes of each helicase gene in CA. The first axis accounts for 43.79% of total variation, and the second axis accounts for 8.49% of total variation

Table 7 Correlation analysis between the first two axes and the nucleotide contents

Comparison of codon usage of AcMNPV helicase gene with those of E. coli, yeast, mouse and human

The codon usage bias of the helicase of the AcMNPV C6 strain was compared with that of E. coli, yeast, mouse and human in order to select the optimal host for helicase expression. As shown in Table 8, 29, 27, 27 and 30 codons have an AcMNPV-to-E. coli, AcMNPV-to-yeast, AcMNPV-to-mouse and AcMNPV-to-human ratio, respectively, that is higher than 2 or lower than 0.5, demonstrating that codon preference differs markedly between AcMNPV helicase and each of these hosts. The codon usage pattern of AcMNPV helicase is closest to those of yeast and mouse. Yeast, a familiar eukaryotic expression system, may therefore be more suitable for expressing AcMNPV helicase.
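The counting step behind this comparison can be sketched in Python. This is an illustration only: it assumes that both inputs are dictionaries of codon frequencies on the same scale, the function name is hypothetical, and the numbers in the test data are invented rather than taken from Table 8.

```python
def divergent_codons(gene_freq, host_freq, high=2.0, low=0.5):
    """Codons whose gene-to-host frequency ratio is > high or < low."""
    divergent = []
    for codon, gene_value in gene_freq.items():
        host_value = host_freq.get(codon)
        if not host_value:
            continue  # ratio undefined if the host value is missing or zero
        ratio = gene_value / host_value
        if ratio > high or ratio < low:
            divergent.append(codon)
    return sorted(divergent)


# Invented toy frequencies, for illustration only.
toy_gene = {"AAA": 30.0, "AAG": 10.0, "GCG": 5.0}
toy_host = {"AAA": 10.0, "AAG": 12.0, "GCG": 20.0}
flagged = divergent_codons(toy_gene, toy_host)
```

The host with the fewest flagged codons is the one whose codon usage is closest to that of the gene by this criterion.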

Table 8 Comparison of codon preferences between AcMNPV C6 strain helicase gene and E. coli, yeast, mouse and human

Discussion

This study analyzed the synonymous codon usage bias of helicases of AcMNPV and 41 other baculoviruses. We found that baculovirus helicases show weak codon usage bias.

Analysis of gene sequences reveals that synonymous codon usage bias exists widely in different species. In mammals and birds, codon usage preference may arise from genomic compositional constraints. In some unicellular organisms, the codon usage of highly expressed genes is subject to translational selection and tends to match the more abundant tRNA species, whereas that of weakly expressed genes is subject to mutational bias without translational selection (Gouy and Gautier 1982; Moriyama and Powell 1997). Codon usage bias is regarded as the result of a balance between mutation bias and translational selection. Mutation bias seems to have a greater effect than translational selection on the codon usage bias of some viruses (Auewarakul 2005; Nasrullah et al. 2015). In this study, correlation analyses between ENc and GC3s, GC12 and GC3, CAI and ENc, and the first two axes and the nucleotide contents indicated that baculovirus helicases are subject to GC compositional constraints and that mutation pressure plays an essential role in shaping their codon usage pattern. PR2 plot analysis suggested that mutation pressure and other factors may shape the codon usage bias of helicases of baculoviruses. Although GRAVY and AROMO are related to the codon usage bias of some viruses, the results presented here show that they do not play a critical role in shaping the codon usage bias of helicases.

To assess the synonymous codon usage pattern independently of amino acid composition, RSCU was calculated for the codons of the AcMNPV helicase gene. A clear bias toward G- or C-ending codons was observed, and a higher diversity of codon usage was found for the amino acids Ala, Gly, Leu, Pro, Arg, Ser, Thr and Val. These results indicate that AcMNPV helicase shows a marked preference for one or more codons for each amino acid. The usage of these preferred codons may reflect mutational pressure. A recent study found that codon usage regulates protein structure and activity as well as mRNA stability by affecting the speed of translation elongation: optimal codons increase the elongation rate, whereas non-optimal codons reduce it (Zhao et al. 2017). Therefore, it is possible that these preferred codons are required for high translation efficiency and activity in host cells, which might benefit baculovirus DNA replication.

Increasing evidence suggests that the relative abundance of dinucleotides can affect codon usage bias in some RNA viruses (Greenbaum et al. 2008; Wang et al. 2016). A previous study showed that TA and CG are under-represented in many sequence sets (Burge et al. 1992). In baculovirus helicases, however, CG is over-represented while TA is under-represented. Low abundances of CpG and UpA are thought to be required for viral immune escape and efficient transcription, respectively (Kumar et al. 2016). As the insect immune system differs from that of mammals in some respects, it is possible that the high abundance of CG in the baculovirus genome is not recognized by host cells and therefore does not activate the immune system.

NPVs can infect larvae of the orders Lepidoptera, Hymenoptera and Diptera, whereas GVs infect only larvae of the order Lepidoptera. In particular, AcMNPV can infect a wide range of lepidopteran larval hosts. For example, AcMNPV can infect Plutella xylostella, which is more permissive to PlxyMNPV (Harrison and Lynn 2007; Kadir et al. 1999). The mechanism underlying the wide host range of AcMNPV is still unclear. Helicase has emerged as a candidate determinant of the host range of AcMNPV. When AcMNPV helicase is replaced by its counterpart from BmNPV, a member of group I alphabaculovirus, AcMNPV can kill Bm cells and larvae (Croizier et al. 1994). However, the helicase gene of TnGV, which belongs to betabaculovirus, cannot substitute for AcMNPV helicase (Bideshi and Federici 2000). We hypothesized that the codon usage pattern of helicases may vary greatly among baculovirus species and thereby contribute to differences in host range. However, the codon usage analysis indicated that baculovirus helicases are only weakly biased. Therefore, it is possible that, in addition to helicase, other genes or factors also help to determine the host range of AcMNPV. This proposal is supported by experimental evidence. For example, ChfuMNPV p143 alone cannot substitute for AcMNPV p143, whereas ChfuMNPV p143 and lef3 together can substitute for the corresponding homologous genes of AcMNPV and initiate virus replication (Yu and Carstens 2012). Xu et al. found that GP64 plays an important role in determining the host range of BomaNPV S2, which covers the host ranges of both AcMNPV and BmNPV (Xu et al. 2012). A previous study also suggested co-evolutionary adaptation of baculoviruses to their insect hosts, which may play a role in the specialization of baculoviruses for their hosts (Herniou et al. 2004).

A previous study found that replacing the three amino acids at positions 556, 564 and 577 of AcMNPV helicase with the corresponding amino acids of BmNPV can extend the AcMNPV host range to B. mori (Croizier et al. 1994). We attempted to replace the helicase of AcMNPV with its counterpart from CpGV; however, the recombinant AcMNPV could not produce progeny virus (unpublished data). The amino acid at position 564 in AcMNPV differs from that in CpGV, which may be one reason why the CpGV helicase cannot replace its AcMNPV homologue. Interestingly, the three amino acids are identical in some alphabaculoviruses, including AcMNPV, PlxyMNPV, RoMNPV, MaviMNPV, HearNPV, HzSNPV, LeseNPV, and SpliNPV. It is unknown whether the helicases of these alphabaculoviruses can substitute for the AcMNPV homologue.

This study showed that the CAI values and the GC and GC3s contents of helicases of NPVs are similar to those of GVs (one-way ANOVA, P > 0.05), whereas the ENc values of helicases of NPVs are significantly lower than those of GVs (one-way ANOVA, P < 0.05), suggesting their co-evolution. A previous study proposed that evolution enables alphabaculoviruses and betabaculoviruses hosted by the same host species to share more similar patterns (Shi et al. 2016). Indeed, some NPVs and GVs evolve in the same hosts, even at the species level, such as Plutella xylostella, Helicoverpa armigera and Agrotis segetum.

To compare codon usage among diverse species, the codon usage bias of AcMNPV helicase was compared with those of E. coli, yeast, mouse and human. The codon usage bias of AcMNPV helicase resembles those of yeast and mouse. Hence, the yeast expression system may be more suitable for heterologous expression of AcMNPV helicase. This information would be beneficial for optimizing the expression of AcMNPV helicase in vitro.

Taken together, the data presented here show that helicases from AcMNPV and 41 other baculovirus species have a largely random codon usage pattern. The main factor shaping the codon usage bias is GC compositional constraint. This study may provide some clues for understanding the functional mechanism of AcMNPV helicase and the evolution of baculoviruses.