Introduction

Influenza A virus is widely distributed throughout the world [1] and is characterized by a high degree of genetic diversity. It is divided into several subtypes based on the combination of the two major viral glycoproteins: hemagglutinin (HA) and neuraminidase (NA). Avian influenza virus (AIV) has a segmented RNA genome that evolves rapidly and continuously. Point mutations and reassortment are the main mechanisms of AIV evolution, which promote viral escape from the immune system, resistance to antivirals, and crossing of species barriers [2].

Among the identified subtypes, AIV H9N2 is the most common subtype infecting poultry, especially in Asian and North African countries [3]. The natural reservoir of this virus is aquatic birds, which migrate long distances all over the world and spread the virus to other avian species [4].

H9N2 virus was first detected in 1966 in turkeys in the USA [5]. During the 1970s, the first Asian cases of H9N2 virus infection in poultry were reported in Hong Kong [6]. From 1994 to 1999, many H9N2 virus outbreaks affected domestic poultry in South Korea, China, Germany, Italy, Ireland, South Africa, the USA, Pakistan, and Iran [7,8,9]. This period also witnessed the appearance of three major Eurasian H9N2 lineages, notably, the Korean lineage, the G1 lineage, and the Y280 lineage [7, 9, 10]. Subsequently, the H9N2 viruses became enzootic in many areas of Asia, the Middle East, and some African countries [11, 12]. The history of H9N2 outbreaks was also marked by interspecies transmissions between birds and mammals. In the late 1990s, the first cases of mammalian infection with H9N2 viruses were recorded in swine and humans in China and Hong Kong [9, 13, 14]. Furthermore, the susceptibility of other mammals such as mice [15], ferrets [16], minks, foxes, raccoon dogs [17], cats and dogs to the H9N2 subtype was demonstrated [18].

Several studies highlighted a number of molecular determinants required for H9N2 viruses to break the host species barrier. The hemagglutinin human-like marker 226L (H3 numbering) is one of the most important mammalian host restriction factors [19]. H9N2 viruses containing this marker are able to bind α2,6 sialic acid receptors on human epithelial cells [20] and have the potential to become the next pandemic avian influenza virus [21]. Continuous epidemiological surveillance of this subtype is highly recommended to limit its zoonotic threat.

Tunisia is one of the North African countries affected by H9N2 outbreaks. Since 2006, a national program of AIV surveillance has been in place to monitor virus circulation on commercial poultry farms in northern, central, and southern Tunisia [22]. At the end of 2009, the H9N2 subtype was identified as the major cause of economic losses in poultry production in Tunisia. H9N2 strains were first isolated in 2010 from chickens, turkeys and migratory birds [22, 23]. Other H9N2 outbreaks were reported in 2011 and 2012 in chickens [22, 24]. Molecular analysis showed that the Tunisian H9N2 strains circulating in poultry flocks belong to the Eurasian G1-like lineage and that their hemagglutinin has the human-like marker 226L, which could represent a potential public-health risk in Tunisia.

The evolutionary history of the Tunisian H9N2 viruses is still not fully understood. Therefore, investigating the geographic and epidemiological origins of Tunisian H9N2 viruses could contribute to better preparation for and control of future outbreaks. Moreover, no Tunisian H9N2 strains have been molecularly characterized since January 2012. Continuous monitoring of the molecular evolution of such zoonotic viruses is extremely important. The present study provides insights into the historical origins of the Tunisian H9N2 viruses through Bayesian analysis. This approach allows the geographic source of the virus, the spatiotemporal dynamics of viral spread, hosts involved in viral transmission, and reassortant subtypes of Tunisian viruses to be determined. By analyzing viral protein sequences, this study also highlights the genetic and zoonotic evolution of five recent H9N2 strains isolated in Tunisia from April 2012 to January 2016.

Materials and methods

Virus isolation

As part of a national program of AIV surveillance, tracheal/cloacal swabs and respiratory organs (trachea and lungs) were collected (n = 660) during 2012-2016 from commercial poultry flocks raised in 23 governorates situated in the northern, central and southern regions of Tunisia. To detect AIV, all samples were analyzed using an ELISA IDScreen® Influenza A Antigen Capture Kit. A sample was considered positive when its optical density was higher than the cutoff calculated according to the manufacturer’s instructions. We selected five positive samples that originated from different regions of Tunisia and were collected from April 2012 to January 2016 (Table 1 and Supplementary File 1). These avian influenza viruses were isolated and propagated in specific-pathogen-free embryonated chicken eggs. After incubation of the eggs for 96 h at 37°C, allantoic fluid was harvested, and the viral titer was determined by hemagglutination assay [25]. The virus stock was aliquoted and stored at -80°C until use.

Table 1 Tunisian H9N2 strains used in this study

Virus subtyping

Viral RNA was extracted from allantoic fluid using TRIzol® Reagent (Invitrogen, USA). In order to confirm the presence of AIV and identify the hemagglutinin subtype, the M, H5 and H9 genes [22, 26] were targeted for real-time RT-PCR (rRT-PCR) using AgPath-ID™ One-Step RT-PCR (Applied Biosystems, UK). The rRT-PCR program was as follows: a single cycle of 45°C for 10 min and one of 95°C for 15 min, followed by 40 cycles of 95°C for 15 s and 60°C for 1 min. The M and H9 genes of the five viruses were detected by rRT-PCR. The N2 subtype was determined by conventional RT-PCR using a SuperScript III One-Step RT-PCR System Kit (Invitrogen, USA) and using the following temperature profile: a single cycle for 30 min at 55°C and one for 2 min at 94°C, followed by 40 cycles of 15 s at 94°C, 1 min at 58°C, and 1 min at 68°C and one final cycle of 5 min at 68°C.

Genome amplification and sequencing

The whole genome was amplified by multisegment RT-PCR (MRT-PCR) [27] using a SuperScript III One-Step RT-PCR System Kit (Invitrogen, USA). The MRT-PCR products were purified using the eCube PCR Purification kit (Phile Korea, Korea). The quality and quantity of genomic dsDNA were determined using a Qubit® 3.0 Fluorometer (Invitrogen, USA). A sequencing library was prepared from 1 ng of dsDNA using a Nextera XT DNA Library Preparation Kit (Illumina, USA) according to the manufacturer’s instructions. The quality and length of the library were evaluated using an Agilent 2100 Bioanalyzer System (Agilent, USA) employing a High Sensitivity DNA Kit (Agilent, USA). Paired-end sequencing was performed using an Illumina MiSeq sequencer (Illumina, USA) and a MiSeq Reagent Kit V2 (Illumina, USA). Read quality was assessed using the FastQC program [28]. De novo assembly and mapping of reads were carried out using the IVA program [29]. The mapping was performed using the H9N2 reference strain A/chicken/Jinan/3925/2013 (GenBank accession numbers KP415257 to KP415264). The genomic nucleotide sequences of the Tunisian H9N2 strains were submitted to GenBank, and the accession numbers are provided in Supplementary File 1.

Dataset preparation

In order to select nucleotide sequences of other influenza virus strains to be included in this study, we used the genome sequences of the Tunisian H9N2 viruses to perform a similarity search in the online BLAST suite at NCBI (https://blast.ncbi.nlm.nih.gov). We compiled a subset of AIV nucleotide sequences that varied in their isolation year, geographic origin, host, and subtype. The prototype strains of different Eurasian H9N2 lineages (G1-like, Y280-like and Korean-like) and the nucleotide sequences of the H9N2 strains previously isolated in Tunisia, during the 2010-2012 outbreaks (Table 1), were also included. Supplementary File 1 contains information about the 106 to 219 selected sequences that were compiled per genome segment.

Multiple nucleotide sequence alignment and nucleotide sequence identity calculations were performed using the BioEdit v7.0.9 program [30]. Maximum-likelihood (ML) phylogenetic trees were constructed for each segment in PhyML v3.0 [31], using the evolution model GTR + Γ4 + I, which was selected as the best-fit model by the online program SMS (http://www.atgc-montpellier.fr/sms/). The ML trees were analyzed using TempEst v1.5.4 to discard outlier sequences that did not fit the clock-like signal [32].
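The clock-likeness check can be illustrated with a root-to-tip regression: genetic distance from the root is regressed on sampling date, and sequences with unusually large residuals are candidates for removal. The sketch below is not TempEst itself (where inspection is interactive); the function name, the residual cutoff and the toy data are assumptions made only for illustration.

```python
from typing import List, Tuple


def root_to_tip_outliers(
    dates: List[float], distances: List[float], z_cutoff: float = 3.0
) -> Tuple[float, float, List[int]]:
    """Fit distance = slope * date + intercept by least squares and
    return (slope, intercept, indices of tips with large residuals).

    z_cutoff is a hypothetical threshold, not a value from the study.
    """
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (slope * x + intercept) for x, y in zip(dates, distances)]
    # Residual standard deviation with n - 2 degrees of freedom.
    sd = (sum(r * r for r in residuals) / (n - 2)) ** 0.5
    outliers = [
        i for i, r in enumerate(residuals) if sd > 0 and abs(r) > z_cutoff * sd
    ]
    return slope, intercept, outliers


# Toy data: twelve tips on a perfect clock, with one tip made too divergent.
toy_dates = [2000.0 + i for i in range(12)]
toy_distances = [0.002 * (d - 1990.0) for d in toy_dates]
toy_distances[5] += 0.05
slope, intercept, flagged = root_to_tip_outliers(toy_dates, toy_distances, z_cutoff=2.0)
```

A positive slope is the minimum requirement for temporal signal; the slope itself is only a rough rate estimate and does not replace the relaxed-clock analysis in BEAST.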

Bayesian analysis

The BEAST v1.8.4 package [33] was employed to construct Bayesian maximum-clade-credibility (MCC) phylogenetic trees for each genomic segment. To investigate the geographic and ecological origins of Tunisian H9N2 viruses, location, host and subtype traits were analyzed as discrete trait diffusion models. The locations were organized by country, and the hosts were categorized into 10 groups: chicken, quail, turkey, duck, swan, goose, other avian species, unspecified avian species, environment, and mammalian hosts. To estimate the transition rates between different locations, hosts and subtypes, a Bayesian stochastic search variable selection (BSSVS) was applied for each model, employing a symmetrical discrete trait substitution model with a strict clock assumption.
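Discrete traits of this kind are usually derived from the strain name. The following sketch shows one way to do so, assuming the conventional naming pattern type/host/place/isolate/year; the keyword rules that map a raw host label to one of the ten categories are hypothetical and do not reproduce the assignments actually used in this study.

```python
from typing import Dict, List, Tuple

# Hypothetical keyword -> category rules, checked in order.
HOST_RULES: List[Tuple[str, str]] = [
    ("chicken", "chicken"),
    ("quail", "quail"),
    ("turkey", "turkey"),
    ("duck", "duck"),
    ("swan", "swan"),
    ("goose", "goose"),
    ("environment", "environment"),
    ("human", "mammalian hosts"),
    ("swine", "mammalian hosts"),
    ("mink", "mammalian hosts"),
    ("bird", "unspecified avian species"),
    ("avian", "unspecified avian species"),
]


def host_category(raw_host: str) -> str:
    """Map a raw host label to one of the ten host groups."""
    text = raw_host.lower()
    for keyword, category in HOST_RULES:
        if keyword in text:
            return category
    # Any named bird that matches no rule falls into the catch-all group.
    return "other avian species"


def parse_strain(name: str) -> Dict[str, object]:
    """Extract host, place and year traits from a strain name."""
    fields = [f.strip() for f in name.split("/")]
    if len(fields) == 5:
        _, raw_host, place, _, year = fields
    elif len(fields) == 4:
        # Human isolates conventionally omit the host field.
        _, place, _, year = fields
        raw_host = "human"
    else:
        raise ValueError(f"unexpected strain name: {name!r}")
    year = year.split("(")[0]  # drop a trailing subtype such as "(H9N2)"
    # Two-digit years are returned unchanged and need separate handling.
    return {"host": host_category(raw_host), "place": place, "year": int(year)}


traits = parse_strain("A/chicken/Tunisia/40/2016")
```

Because locations were organized by country, a place field that holds a city or province (for example, Jinan in A/chicken/Jinan/3925/2013) would still need to be mapped to its country in a separate step.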

To select the best-fit nucleotide substitution and clock models, we used path sampling (PS) and stepping-stone sampling (SS) [34] to compare the following four model combinations: GTR (Γ4) model/strict clock model, GTR (Γ4) model/uncorrelated lognormal relaxed clock model, HKY model/strict clock model, and HKY model/uncorrelated lognormal relaxed clock model. According to the Bayes factor (BF), the GTR (Γ4) model and the uncorrelated lognormal relaxed clock model were selected as the best-fit models for all genomic segments. A Bayesian skyline tree model was used in this analysis because it is considered the demographic model that best describes the complexity of AIV population dynamics [35]. Each Markov chain Monte Carlo (MCMC) chain was run for 100 million generations, sampling every 10,000 states, to achieve an adequate effective sample size (ESS > 200) for the important parameter estimates. The ESS values were evaluated in Tracer v1.8.4 [36]. The MCC trees were generated using TreeAnnotator v1.8.4 after removing 10% burn-in. The MCC trees were visualized using the FigTree v1.4.3 program, which allowed the estimation of the time of the most recent common ancestor (tMRCA) and the 95% highest posterior density (HPD) for the Tunisian H9N2 viruses. A posterior probability (pp) > 0.95 indicates good statistical support for the tree node.
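Model comparison by PS or SS reduces to comparing log marginal likelihoods: the log Bayes factor of one model over another is the difference between their log marginal likelihood estimates. The sketch below ranks the four model combinations on that basis. The numerical values are hypothetical placeholders, chosen only so that the ordering matches the outcome reported here; they are not estimates from this study.

```python
from typing import Dict, List, Tuple


def rank_models(log_marginal_likelihoods: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (model, log BF of the best model over this model), best first."""
    best = max(log_marginal_likelihoods.values())
    ranked = [(name, best - value) for name, value in log_marginal_likelihoods.items()]
    return sorted(ranked, key=lambda item: item[1])


# Hypothetical log marginal likelihoods for one genomic segment.
example = {
    "GTR+G4 / strict clock": -12050.0,
    "GTR+G4 / UCLN relaxed clock": -12010.0,
    "HKY / strict clock": -12120.0,
    "HKY / UCLN relaxed clock": -12085.0,
}
ranking = rank_models(example)
```

The first entry of the ranking is the preferred model (log BF of 0 against itself); the remaining entries show how strongly each alternative is disfavored.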

To study the spatiotemporal dynamics of each genomic segment, KML files (Supplementary File 2) were created from the MCC tree files using the SpreaD3 program [37] and visualized using Google Earth Pro. We employed a 2D map (https://geojson-maps.ash.ms/) to present only the transitions involved in the evolutionary history of the Tunisian H9N2 viruses.

Bayes factors (BF) and posterior probabilities (pp) were calculated for the transition rates determined for each model (location, host and subtype) using SpreaD3. A transition rate is considered statistically well supported when BF > 3 and pp > 0.5. The BF was evaluated using the following support levels: BF > 3, substantial support; BF > 10, strong support; BF > 30, very strong support; BF > 100, decisive support [38].
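The support criteria above can be written compactly as follows. The Bayes factor function is a sketch of the calculation commonly used for BSSVS under a symmetric model, in which the posterior odds that a rate is included are divided by the prior odds; the prior inclusion probability (a truncated Poisson prior with mean ln 2 and an offset of K - 1 for K states) is an assumption about the default settings and was not stated explicitly in this study. Function names are illustrative.

```python
import math


def bssvs_bayes_factor(posterior_prob: float, n_states: int) -> float:
    """Bayes factor for one rate under a symmetric BSSVS model.

    Assumes a prior expectation of ln(2) + (K - 1) non-zero rates
    among the K * (K - 1) / 2 possible symmetric rates.
    """
    n_rates = n_states * (n_states - 1) / 2
    prior_prob = (math.log(2) + n_states - 1) / n_rates
    if not 0 < prior_prob < 1:
        raise ValueError("prior inclusion probability must lie in (0, 1)")
    if posterior_prob >= 1:
        return math.inf
    posterior_odds = posterior_prob / (1 - posterior_prob)
    prior_odds = prior_prob / (1 - prior_prob)
    return posterior_odds / prior_odds


def support_level(bf: float) -> str:
    """Translate a Bayes factor into the support levels used in the text."""
    if bf > 100:
        return "decisive"
    if bf > 30:
        return "very strong"
    if bf > 10:
        return "strong"
    if bf > 3:
        return "substantial"
    return "not supported"


def well_supported(bf: float, pp: float) -> bool:
    """A transition is well supported when BF > 3 and pp > 0.5."""
    return bf > 3 and pp > 0.5
```

For example, `support_level(7.71)` returns "substantial", whereas `well_supported(7.71, 0.44)` is false because the posterior probability is below 0.5.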

Sampling bias

Sensitivity analysis was carried out using a ‘randomized tip swap’ approach to assess the influence of sampling bias for each discrete trait diffusion model (location, host and subtype) [38]. After being generated with BEAST 1.8.4, the XML files were modified by adding the tip swap operator in the block of operators, and each MCMC was run for 100 million iterations, sampling every 10,000 states. This approach allows the randomization of location, host and subtype tip assignments throughout the MCMC simulation process. The sample is considered unbiased when the distribution of root state probabilities is different between the main and ‘randomized tip swap’ analyses. In addition, the transition rates should be similar in both types of analysis.
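The two comparisons described above can be sketched as simple checks. No numerical threshold for "similar" rates was defined in this study, so the two-fold cutoff below is a hypothetical choice, as are the function names.

```python
from typing import Dict


def fold_difference(main_rate: float, swap_rate: float) -> float:
    """Ratio of the larger to the smaller of two transition rates."""
    low, high = sorted((main_rate, swap_rate))
    if low <= 0:
        raise ValueError("rates must be positive")
    return high / low


def rate_is_sensitive(main_rate: float, swap_rate: float, max_fold: float = 2.0) -> bool:
    """Flag a transition whose rate differs markedly between the main and
    tip-swap analyses. max_fold is an assumed, illustrative cutoff."""
    return fold_difference(main_rate, swap_rate) > max_fold


def same_top_root_state(main_probs: Dict[str, float], swap_probs: Dict[str, float]) -> bool:
    """True when both analyses recover the same most probable root state,
    which is treated here as a sign of possible sampling bias."""
    top_main = max(main_probs, key=main_probs.get)
    top_swap = max(swap_probs, key=swap_probs.get)
    return top_main == top_swap
```

With median rates of 2.10 and 0.83 transitions per year, `rate_is_sensitive` returns true under this cutoff, whereas rates of 1.02 and 1.16 are treated as similar.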

Protein sequence analysis

From the aligned nucleotide sequences of HA, NA, PB1, PB2, PA, M, NP and NS segments, the protein sequences were deduced using BioEdit v7.0.9.0 [30]. A 3D structure model of neuraminidase NA was generated online using Swiss model (https://swissmodel.expasy.org) and visualized using the PyMOL program [39].
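As an illustration of how protein sequences are deduced and amino acid substitutions are named, the sketch below translates an in-frame nucleotide sequence with the standard genetic code and compares two aligned protein sequences. It is not the BioEdit procedure; the function names and toy sequences are hypothetical.

```python
from typing import List

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nucleotides: str) -> str:
    """Translate an in-frame coding sequence; ambiguous or gapped codons give 'X'."""
    seq = nucleotides.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def substitutions(reference: str, query: str, first_position: int = 1) -> List[str]:
    """List differences between two aligned proteins, e.g. 'A198T'.

    first_position is the residue number of the first aligned column.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{ref}{first_position + i}{alt}"
        for i, (ref, alt) in enumerate(zip(reference, query))
        if ref != alt
    ]


# Toy fragments: one codon change, GCT (A) -> ACT (T), at the second residue.
changes = substitutions(translate("ATGGCTAAA"), translate("ATGACTAAA"), first_position=197)
```

Residue numbers depend on the numbering scheme (for example, H9 versus H3 numbering for HA), so `first_position` must be set according to the scheme in use.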

Results

Phylogenetic analysis

The HA, NA, M and NP Bayesian clock trees showed that the Tunisian H9N2 viruses (Table 1) were derived from the G1 lineage (Figs. 1-2 and S1). However, the PB1, PB2, PA and NS segments were unrelated to the Eurasian lineages: G1-like, Y280-like and Korean-like (Fig. 4 and Fig. S1). All Tunisian H9N2 viruses clustered in a well-supported clade (pp = 0.99-1) including strains from the United Arab Emirates (UAE), Saudi Arabia, Libya, Morocco and the United States of America (USA) (Fig. 1 and Fig. S1). The HA segment of recent Tunisian H9N2 viruses (isolated in 2012-2016) is very similar to those of the Libyan H9N2 strains, with up to 97% sequence identity. The other segments have high similarity (93 to 97% identity) to those of the Middle Eastern H9N2 strains isolated in the UAE and Saudi Arabia.

Fig. 1

Bayesian MCC trees of the location trait for the (A) HA and (B) NA segments. Branches are colored according to the location of the most probable ancestor of the branch. Groups composed of Tunisian H9N2 strains are highlighted in grey. Circles indicate pp > 0.95. The MCC trees were generated using FigTree v1.4.3

Fig. 2

Spatiotemporal dynamics for all AIV genomic segments, describing the geographic origins of the Tunisian H9N2 viruses. The eight segments are indicated by different colors. Supported transitions are indicated by continuous lines (BF > 3), while unsupported transitions are indicated by discontinuous lines (BF < 3)

An estimation of tMRCA for each segment of the Tunisian H9N2 viruses gave different values. This could mean that multiple reassortment events have occurred between the Tunisian strains and others of different origin. The mean tMRCA values indicated divergence dates ranging from 2000 to 2008, suggesting that the Tunisian H9N2 strains emerged in 2008 (Table 2).
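For readers unfamiliar with these summaries, a tMRCA in calendar time is obtained by subtracting the node height from the date of the most recent sample, and the 95% HPD is the narrowest interval containing 95% of the posterior samples. The sketch below illustrates both calculations; it is not the FigTree or TreeAnnotator implementation, and the toy values are hypothetical.

```python
import math
from typing import List, Tuple


def hpd_interval(samples: List[float], mass: float = 0.95) -> Tuple[float, float]:
    """Narrowest interval containing at least `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


def height_to_year(latest_tip_year: float, node_height: float) -> float:
    """Convert a node height (years before the latest tip) to a calendar year."""
    return latest_tip_year - node_height


# Toy posterior sample of node heights with one extreme value.
toy_heights = [float(h) for h in range(19)] + [100.0]
lower, upper = hpd_interval(toy_heights)
```

Because heights count backwards from the most recent tip, the upper HPD bound on the height corresponds to the earliest plausible calendar date.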

Table 2 Estimated tMRCA for the genomic segments of all the Tunisian H9N2 viruses isolated from 2010 to 2016

MCC tree analysis of location traits and spatiotemporal dynamics

As indicated by the MCC trees expressing location traits, all genomic segments shared a recent common ancestor with H9N2 strains from the UAE, which could be the origin of H9N2 virus introduction into Tunisia.

To provide more details about the evolutionary history of the Tunisian H9N2 viruses, the origins of their eight genomic segments were investigated through spatiotemporal dynamics analysis. As illustrated in Fig. 2, PB2 is the oldest segment. Indeed, the avian influenza strains carrying this segment circulated in Russia before being transferred to Germany during the 1960s. The analysis also showed that the PA segment originated from AIVs circulating in China and Sweden during the 1970s (Fig. 2). The 1980s witnessed the transmission of PA, NS and NP segments from their ancestral locations in Hong Kong and Russia to Sweden and China (Fig. 2). Since the 1990s, the virus has spread rapidly from China to many other regions of East Asia (Hong Kong and Japan) and the Middle East (Iran, Pakistan and the UAE) via multiple segment reassortments (Fig. 2). Over the same period, the UAE became an accumulation region for many avian influenza strains, and their eight segments have been transferred to Tunisia since the mid-2000s. Furthermore, the analysis highlighted that, in this period, H9N2 viruses originating in Tunisia spread to Libya (in 2006 and 2013) and the UAE (in 2011) through HA and NP reassortments (Fig. 2).

When the viral diffusion pathways identified by spatiotemporal dynamics analysis were quantified, a large number of transitions were found to be statistically well supported (BF > 3; pp > 0.5) (Fig. 2), with BF = 8.68-868367.16 and pp = 0.54-1 (Table S1). The highest BF values providing decisive support for transition rates were recorded in HA, PB1, PA, NS, M and NP phylodynamics (Table S1).

We also found that the transition rates of three pathways in HA, PB1 and M phylodynamics were not statistically supported by posterior probability (pp < 0.5), but were substantially supported by the Bayes factor (BF > 3) (Table S1). These included Japan – Pakistan (median = 0.70 transitions per year; 95% HPD = 0-2.65; BF = 7.71; pp = 0.44 for HA), China – Sweden (median = 0.99 transitions per year; 95% HPD = 0-2.75; BF = 11.36; pp = 0.48 for PB1), and the UAE – Tunisia (median = 0.68 transitions per year; 95% HPD = 0-2.57; BF = 4.83; pp = 0.42 for M).

Statistically insignificant transition rates (BF < 3; pp < 0.5) were identified in the phylodynamics of the HA, NA, PB1 and NS segments corresponding to the routes connecting Hong Kong to Japan, China to the UAE, Sweden to Saudi Arabia, Hong Kong to Sweden, and Sweden to the UAE (Table S1).

Host and subtype phylodynamics

We also investigated the contribution of ancestral hosts and subtypes in the evolutionary history reconstruction of the Tunisian H9N2 viruses. The host phylodynamics analysis showed that the chicken category dominated the MCC tree nodes of the HA, NA and M segments, suggesting that these viral segments were transmitted to Tunisia after long-term adaptation in poultry (Fig. S2). The first tree nodes of the other segments (PB1, PB2, PA, NP and NS) were labeled by duck category (Fig. 3A and Fig. S2). This could indicate that these aquatic birds contributed to the evolutionary history of these internal genes at the early stages of viral spread, before the virus transmission to poultry in Tunisia. For these segments, all transitions between ducks and chickens were statistically valid (BF = 24.30-327737.17; pp = 0.86-1) (Table S2). The MCC tree of the NP segment showed that other avian species (terrestrial birds) were the hosts of the most recent common ancestor of the viruses of the chicken category that defines the clade containing the Tunisian H9N2 strains (Fig. 3B). This transition was highly supported by Bayes factor and posterior probability values (BF = 3109.34; pp = 0.99).

Fig. 3

Bayesian MCC trees of the host trait for the (A) NS and (B) NP segments. Branches are colored according to the host of the most probable ancestor of the branch. Groups composed of Tunisian H9N2 strains are highlighted in grey. Circles indicate pp > 0.95. The MCC trees were generated using FigTree v1.4.3

The subtype phylodynamics analysis revealed that the HA, NA, M, NP, NS and PA segments of Tunisian H9N2 viruses originated through intra-subtype reassortments with other H9N2 strains (Fig. S3). The MCC tree of the PB2 segment provided evidence of intersubtype reassortments between the H2N2 and H10N4 subtypes and the H9N2 strains that belong to the clade containing the Tunisian viruses (Fig. 4A). However, the transitions between these subtypes were not statistically supported, as they had low Bayes factors (BF < 3) (Table S3). We also observed that the PB1 segment of the Tunisian H9N2 viruses appears to have originated from reassortment with H4N6 (Fig. 4B). This transition had high statistical support (BF = 20386.07; pp = 1) (Table S3).

Fig. 4

Bayesian MCC trees of the subtype trait for the (A) PB2 and (B) PB1 segments. Branches are colored according to the subtype of the most probable ancestor of the branch. Groups composed of Tunisian H9N2 strains are highlighted in grey. Circles indicate pp > 0.95. The MCC trees were generated using FigTree v1.4.3

Sampling bias

Concerning the location model, the comparison of root state probabilities between the main and randomization analyses showed that only the ancestral state reconstruction of the HA segment was influenced by sample selection bias. Indeed, China was the same ancestral root state for this segment (main, pp = 0.66; tip swap, pp = 0.18) (Table S4). Only eight transitions identified in the spatiotemporal dynamics of the HA, PA, NS and M segments appeared to be affected by the sampling scheme, as they were characterized by a large variation of rates between the observed analysis and the tip randomization (Table S5). These biased transitions included the virus diffusion event between the UAE and Tunisia via PA reassortment (main, 2.10 transitions per year, 95% HPD 0.06-5.28; tip randomization, 0.83 transitions per year, 95% HPD 0.06-2.12) (Table S5).

In host phylodynamics, chicken and duck categories had the same ancestral root state in both types of analysis for HA, NA, PB2, PB1 and NS segments, providing evidence for sampling bias (Table S6). Except for PA and PB2 segments, none of the transitions from ducks or other avian species to chickens were biased by sample selection. Indeed, the rates of main analysis were very similar to those estimated by the ‘randomized tip swap’ approach (Table S7).

In subtype phylodynamics, the ancestral reconstruction for the HA, NA and M segments was affected by sampling bias, because the root state probability distribution was the same in the main and the tip randomization analyses (Table S8). For the PB2 segment, the sensitivity analysis showed that the statistically unsupported intersubtype reassortments (BF < 3) between H2N2, H10N4 and H9N2 did not exhibit sampling bias, as their transition rates did not differ greatly from those of the tip swap (Table S9). This suggests that these subtypes may not be directly connected, and that their relationships could be better explained by reassortments with other unsampled virus subtypes. For the PB1 segment, the transition rates of the reassortment between H4N6 and H9N2 subtypes were similar in both types of analysis, also indicating the absence of sampling bias for this transition event (median = 1.02 transitions per year, 95% HPD 0.06-2.97 in main analysis; median = 1.16 transitions per year, 95% HPD 0.26-3.02 in randomization analysis) (Table S9).

Molecular analysis

Hemagglutinin (HA)

Molecular analysis showed that the hemagglutinins of the recent Tunisian H9N2 viruses (isolated from April 2012 to 2016) carry a monobasic cleavage site at position 333-341 (H9 numbering), indicating their low-pathogenic nature. The strains isolated in 2012 and 2016 have a serine (S) residue at position 334, making their cleavage site pattern different from those found among strains from 2010-2011 and 2013-2015 (Table 3).

Table 3 Molecular characterization of the Tunisian H9N2 viruses collected from April 2012 to 2016 by comparison with those previously isolated from 2010 to January 2012

The receptor-binding site (RBS) of the five recent H9N2 strains contains the marker 234L (226L in H3 numbering), which gives the virus the ability to bind α2,6 receptors on human epithelial cells [20]. Since 2010, this molecular marker has been conserved among all the Tunisian H9N2 viruses, with the exception of the strain A/migratory bird/Tunisia/51/2010, which has the avian-like marker 234Q (Table 3). The only variation detected in the RBS was observed in a 2016 isolate (A/chicken/Tunisia/40/2016), where the amino acid substitution A198T was found (Table 3).

Neuraminidase (NA)

The neuraminidase sequences of the five Tunisian strains contain the residues 72T, 370L, 392I and 427I, which give the H9N2 viruses the ability to replicate in mouse cells [40]. These amino acids were also observed in the neuraminidase sequences of the other Tunisian strains isolated from January 2010 to January 2012 (Table 3). Moreover, the recent Tunisian H9N2 strains have two potential glycosylation sites that are situated near and inside the hemadsorbing site (HBS) and create variation between the Tunisian strains. The strains from 2010 have only one potential glycosylation site, 402 NWS 404, inside the HBS. In 2011, this site disappeared but was restored again among the viruses of 2012-2016, with the creation of the new glycosylation site 331 NSS 333 near the HBS (Fig. S4).
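Potential N-linked glycosylation sites such as 402 NWS 404 and 331 NSS 333 follow the sequon pattern N-X-S/T, where X is any residue except proline. A minimal sketch of such a search is given below; the function name and the toy sequence are hypothetical, and the pattern identifies potential sites only, not sites shown to be glycosylated.

```python
import re
from typing import List, Tuple

# N, then any residue except P, then S or T. The lookahead allows
# overlapping sequons to be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str, first_position: int = 1) -> List[Tuple[int, str]]:
    """Return (position of N, tripeptide) for each potential N-glycosylation site.

    first_position is the residue number of the first character of `protein`.
    """
    return [
        (match.start() + first_position, match.group(1))
        for match in SEQUON.finditer(protein.upper())
    ]


# Toy fragment containing NWS, NPS (excluded because X is proline) and NSS.
sites = find_sequons("ANWSKNPSNSSA")
```

Comparing the site lists of strains from different years then shows directly which sequons were lost or gained.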

Subunits of the polymerase complex (PB1, PB2 and PA)

The PB1 subunit of the recent Tunisian H9N2 strains contains the conserved residues 13P, 207K and 436Y, which are associated with adaptation of AIV to mammals [41, 42] (Table 3). Molecular analysis showed that the cap-binding region of the PB2 protein had undergone point mutations: E249A (strains isolated from April 2012 to 2016) and K339Q (isolates of 2013-2015). These amino acid substitutions have never been studied before (Table 3). The PA protein of the recent Tunisian viruses was characterized by the appearance of the substitution S409N, which has been found in several highly pathogenic AIVs, such as H5N1, H7N9, H1N1, H2N2 and H3N2, suggesting its ability to enhance the fitness of the virus in humans (Table 3) [43, 44].

M1, M2 and NP proteins

Many conserved molecular markers associated with mammalian host restriction were present in the M1, M2 and NP proteins of the Tunisian H9N2 strains, including: 15I (M1), 30D (M1), 215A (M1), 55F (M2) and 136M (NP) (Table 3). In addition, the human-like marker E372D was acquired by the NP protein of the Tunisian viruses collected between April 2012 and 2015 [45, 46] (Table 3). Interestingly, we also observed that the ion channel M2 in the Tunisian H9N2 strain from 2016 (A/chicken/Tunisia/40/2016) carries the amino acid substitution A30S, which provides viral resistance to amantadine [47] (Table 3).

Nonstructural protein NS1

Sequence analysis of the NS1 protein revealed that all Tunisian H9N2 strains have the residues 42S, 103L and 184G, which increase virulence and replication efficiency in mice by affecting interferon induction [48,49,50] (Table 3). In the C-terminal part of the protein, the Tunisian viruses possessed the motif 227GSEV230, which represents the PDZ ligand-binding domain. This motif is associated with low pathogenicity in mice [51, 52].

Accessory proteins (PB1-F2 and PA-X)

The genome of the Tunisian H9N2 viruses encodes a truncated protein, PB1-F2 (52 amino acids). This accessory protein is considered one of the virulence factors of AIVs [53]. Furthermore, a full-length PA-X protein (252 amino acids) was expressed in the Tunisian strains. PA-X expression has been reported to increase the pathogenicity of H9N2 virus in mice [54].

Discussion

This is the first comprehensive study that investigates the historical origins and evolutionary traits of Tunisian H9N2 viruses based on Bayesian phylodynamic analysis and full genome characterization.

In this study, phylogenetic analysis showed that the HA, NA, M and NP segments belong to the G1 lineage, whose members are able to infect humans, suggesting that the Tunisian strains have zoonotic potential [55]. The other segments were distantly related to those of the Eurasian lineages. These segments were probably obtained from Middle Eastern strains that underwent multiple intersubtype reassortments, making the determination of their internal gene origins difficult, as discussed by Fusaro et al. [25].

Bayesian analysis provided evidence of an exchange of H9N2 strains between Tunisia and the UAE, which could indicate that the virus was carried between these countries by migratory birds. Furthermore, similarities have already been found between an H9N2 isolate from a migratory bird in Tunisia and isolates from the Middle East, highlighting the contribution of wild-bird migration to intercontinental viral transmission [23]. Spatiotemporal dynamics analysis revealed that the H9N2 virus was transmitted from Tunisia to Libya in 2006 and 2013. The spread of virus between these two neighboring countries could be explained by active poultry exportation at the border [56]. Both tMRCA estimation and spatiotemporal phylodynamics showed that the H9N2 virus emerged in Tunisia after the mid-2000s. At the same time, the infection zone extended from Tunisia to other countries, notably the UAE and Libya. Interestingly, this period witnessed great progress in poultry production in Tunisia [56], suggesting that the growing activities of this sector contributed to the spread of the Tunisian H9N2 viruses due to neglect of biosecurity measures on poultry farms.

The host phylodynamics highlighted the contribution of ducks and terrestrial birds to the evolutionary history of the Tunisian H9N2 viruses. Waterfowl were found to be the ancestral source of global H9N2 viruses, suggesting their important role in virus spread and persistence in poultry [57]. In addition, wild terrestrial birds showed the ability to transmit H9N2 virus efficiently to domestic poultry, which makes them a risk factor for interspecies virus spread [58]. The subtype phylodynamics showed that the Tunisian H9N2 viruses were characterized by low susceptibility to intersubtype reassortment. These findings could be explained by the high prevalence of the H9N2 subtype in domestic poultry populations, as suggested by Lu et al. [59]. However, the Tunisian H9N2 viruses underwent an intersubtype reassortment event with the H4N6 subtype, the evidence for which was supported by statistical analysis. This reassortment could increase the virulence of H9N2 viruses in mammals, as demonstrated by Li et al. [60].

As indicated by root state probability analysis, although there was possible sample selection bias in our analysis, this did not influence most of the transition events identified in the evolutionary history of the Tunisian H9N2 viruses. Most of the uncertain transitions were described in the first episodes of virus spread history, which could be explained by the lack of H9N2 isolates before 1995, as suggested by Yang et al. [57].

Molecular characterization of recent Tunisian H9N2 viruses has demonstrated the conservation of many mammalian-adaptive mutations, including the hemagglutinin human-like marker 234L (226L in H3 numbering), which allows virus binding to the sialic acid receptors of human cells [20]. In this context, Jegede et al. [61] provided evidence for positive selection of the 234L marker during serial passage of the virus in chickens, suggesting that the H9N2 viruses acquired the mammalian adaptation while circulating in poultry.

The recent Tunisian H9N2 viruses isolated from poultry in 2012-2016 were characterized by the accumulation of new adaptive mutations in M2, PA and NP, which were previously shown to contribute to viral fitness and adaptation to mammalian hosts. Interestingly, the HA evolved considerably over this period by acquiring two mutations, A198T and A334S, in the RBS and the cleavage site, respectively. Position 198 is known to be a modulator of the affinity of HA for α2,6 sialic acid receptors. The presence of A, T or V at this position indicates low, moderate or high avidity, respectively [62,63,64], suggesting that the Tunisian H9N2 virus increased its affinity for the α2,6 receptors of human epithelial cells. Furthermore, the A334S substitution has been shown to cause an increase in the virulence of H9N2 viruses in chickens and mice [65]. The polymerase PB2 of the recent Tunisian H9N2 viruses has two new mutations, E249A and K339Q, in its cap-binding site. Although the effects of these mutations on the properties of the protein are still unknown, their positions are considered important for viral replication in human cells [66].

The new mutations in neuraminidase created two potential glycosylation sites near and inside the HB site, which could prevent antibodies from recognizing this important antigenic site [67]. On the other hand, glycosylation in the HB site could weaken the affinity of NA for cell receptors [68]. In this context, Mitnaul et al. reported that reduced NA activity could be compensated by mutations near the RBS of H9N2 viruses [69]. Thus, we speculate that the new mutations in the RBS and the cleavage site of HA (A198T and A334S, respectively) may have arisen to compensate for the altered affinity of the NA and to maintain the balance between the HA and NA activities.

This study provides a comprehensive overview of the historical origins and zoonotic potential of Tunisian H9N2 avian influenza viruses. Given the growing potential threat of the H9N2 virus to public health, surveillance of viral molecular evolution should be continued, and biosecurity should be enhanced on the poultry farms in Tunisia. For a better understanding of the ecological aspects of the H9N2 virus, its circulation in migratory birds and mammals, including wild mice and humans, should be investigated. The present study also highlights the evolution of hemagglutinin in the H9N2 strains in Tunisia. This protein could be a perfect target for developing new vaccines or drugs against circulating H9N2 viruses in Tunisia.