Currently, H5 subtype highly pathogenic avian influenza (HPAI) imposes a heavy burden on the global poultry industry and poses a severe threat to public health worldwide. Although arduous and expensive culling, with or without immunization, has been implemented in most affected countries since 2003 to control the HPAI epidemics, the epidemics have continued to expand, from several spots in Asia in 2003 to most regions of Eurasia in 2005 and even to Africa in 2006. Meanwhile, the toll of highly fatal human infections with the HPAI virus, reported from more and more countries, also continues to rise [1, 2] (see http://www.who.int and http://www.oie.int).

The diversity of H5 subtype influenza viruses has been explored from regional perspectives with limited numbers of sequences [3–17], but not from a global and dynamic perspective. In this study, we try to present such a panorama and explore its significance through analyses of the haemagglutinin (HA) sequences of H5 influenza viruses available online.

As of April 1, 2006, complete sequences of the HA1 domain of the HA gene had been reported to the influenza sequence database (http://www.flu.lanl.gov) for 716 H5 influenza isolates. These 716 isolates covered 7 neuraminidase (NA) subtypes, namely N1, N2, N3, N6, N7, N8 and N9, with 538, 123, 25, 3, 3, 3 and 12 isolates, respectively; the NA subtype of the remaining 9 isolates was unknown. More sequences were reported for H5N1 and H5N2 than for the other NA subtypes, possibly because only the H5N1 and H5N2 subtypes have circulated widely in domestic fowl in the past decades.

Sequence analyses with the online alignment and BLAST software in the influenza sequence database (http://www.flu.lanl.gov) showed that the HA1 sequences of 533 H5N1 strains isolated after 1996 were more similar to that of A/Goose/Guangdong/1/1996 than to those of any strains isolated before 1997. Therefore, only 19 of these 533 H5N1 isolates were randomly selected (A/Chicken/Hong Kong/1203/97, A/Hong Kong/156/97, A/duck/Zhejiang/15/2000, A/duck/Fujian/17/2001, A/duck/Shanghai/37/2002, A/chicken/Korea/ES/03, A/wild duck/Guangdong/314/2004, A/chicken/Jilin/9/2004, A/swan/Guangxi/307/2004, A/cat/Thailand/KU-02/04, A/whooper swan/Mongolia/4/05, A/grebe/Novosibirsk/29/2005, A/quail/Guangxi/575/2005, A/turkey/Turkey/1/2005, A/Chicken/Vietnam/NCVD12/2005, A/quail/Thailand/Nakhon Pathom/QA-161/2005, A/chicken/Nigeria/641/2006, A/swan/Iran/754/2006, A/chicken/Egypt/960N3-004/2006). Similarly, some H5N2 strains that were isolated in the same place and year as a selected strain and had a similar HA1 sequence were also excluded from further analyses. After this preliminary selection, the HA1 sequences of 170 isolates (Footnote 1) were included in this study: 24 H5N1, 97 H5N2, 25 H5N3, 3 H5N6, 3 H5N7, 3 H5N8, 12 H5N9 and 3 NA-unknown strains. Among these 170 isolates, 62 were from chickens, 36 from domestic ducks, 11 from turkeys, 2 from geese, 2 from mammals (A/Hong Kong/156/97 and A/cat/Thailand/KU-02/04) and the rest from wild birds. All H5 influenza viruses previously isolated from mammals were assumed to have been transmitted from infected fowl and to be unable to spread among mammals [1]. In addition, 53 of these 170 isolates were isolated in the 1990s, 76 in the 2000s, and the remaining 41 in 1959–1989.
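The grouping of isolates by isolation period can be illustrated with a short Python sketch. It is not the procedure used in this study; it only assumes that the last field of a strain designation is the isolation year, written with either two or four digits. The function names and the two-digit pivot of 50 (so that "59" maps to 1959 and "03" to 2003) are illustrative assumptions that fit the 1959–2006 range of this dataset.

```python
from collections import Counter


def isolation_year(name: str, pivot: int = 50) -> int:
    """Return the isolation year encoded in the last field of a strain name.

    Assumption: two-digit years below `pivot` belong to the 2000s and the
    others to the 1900s. This is a convention chosen for this sketch only.
    """
    year_text = name.strip().split("/")[-1].strip()
    year = int(year_text)
    if len(year_text) <= 2:
        year += 2000 if year < pivot else 1900
    return year


def period(year: int) -> str:
    """Map a year onto the three isolation periods used in the text."""
    if year < 1990:
        return "1959-1989"
    if year < 2000:
        return "1990s"
    return "2000s"


# A few designations quoted in the text, used here only as examples.
names = [
    "A/Goose/Guangdong/1/1996",
    "A/Hong Kong/156/97",
    "A/chicken/Korea/ES/03",
    "A/cat/Thailand/KU-02/04",
    "A/chicken/Nigeria/641/2006",
    "A/chicken/Scotland/59",
    "A/Turkey/Ramon/73",
]

counts = Counter(period(isolation_year(n)) for n in names)
for label, n in sorted(counts.items()):
    print(label, n)
```

Only the year is parsed because the other fields are ambiguous: older designations such as A/chicken/Scotland/59 have no isolate number, and human isolates such as A/Hong Kong/156/97 have no host field, so both have the same number of fields.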

Sequence alignment, performed with the online software in the influenza sequence database (http://www.flu.lanl.gov), demonstrated that there were no insertions or deletions among the HA1 sequences except at the HA0 cleavage site, which is located at the end of the HA1 sequence. The phylogenetic tree of the sequences (Fig. 1) was constructed using PHYLIP 3.62 with the maximum likelihood method [18, 19]. Although many H5N1 sequences similar to that of A/Goose/Guangdong/1/1996 and some H5N2 strains were excluded from the tree as stated above, tests showed that the main topology of the tree changed little when these sequences were included (data not shown). Therefore, the phylogenetic tree (Fig. 1) was taken as a panorama of the diversity of H5 influenza viruses.

Fig. 1

The phylogenetic tree of 170 H5 influenza isolates based on their HA1 nucleic acid sequences, estimated with PHYLIP 3.62 using the maximum likelihood method [18]. The transition/transversion ratio was set to 8.0, and the substitution rates were assumed to follow a gamma distribution. The earliest isolate, A/chicken/Scotland/59, was selected as the outgroup. Bootstrap values out of 100 replicates are given near some of the corresponding nodes, and the isolation time is given on the right. Most isolate designations are omitted because of space limitations. “*” indicates that the isolates were highly pathogenic, with >3 basic amino acid residues at the HA0 cleavage site, and “Δ” indicates that there were 2 or 3 basic amino acid residues at the HA0 cleavage site and the isolates remained low pathogenic. Words in rectangles give the place, serotype and outbreak time of the related severe avian influenza epidemics. The vertical lines are only for spacing, while the horizontal lines represent genetic distances, whose scale is given in the lower right corner.

The phylogenetic tree (Fig. 1) suggested that all the H5 strains isolated after 1961, except A/Turkey/Ramon/73, could be assigned to one of two lineages. The two lineages were designated the Eastern lineage and the Western lineage, respectively, because strains in the Eastern lineage were all isolated in the Eastern Hemisphere (Asia, Europe or Africa) and strains in the Western lineage were all isolated in the Western Hemisphere (America), except A/chicken/Taiwan/1209/03 and A/duck/Hokkaido/84/02, which were assumed to have been unusually introduced from the Western Hemisphere (see http://www.recombinomics.com/News/09040501/H5N2_WBF_Japan.html). This result was consistent with some previous regional studies, which suggested that avian influenza viruses of some HA subtypes could be sorted into Eurasian and North American lineages [2, 5, 10]. The geographical distribution of the two lineages was possibly due to the separate migration routes and habitats of birds in the two hemispheres [20]. Migratory birds, which may carry the viruses to faraway places, are considered the main reservoir of influenza viruses [3]. According to this hypothesis, H5 influenza viruses isolated in South America in the future will probably belong to the Western lineage, because many birds migrate between North and South America [20].

Each of the two distinct lineages harbored several sublineages (Fig. 1). For example, the isolates causing HPAI in the USA in the 1980s and those causing HPAI in Mexico in the 1990s were located in different sublineages. Figure 1 suggested that isolates of the same NA subtype, of similar virulence (deduced from the amino acid residues at the HA0 cleavage site), from the same region, from the same host species, or from the same year could be situated in different HA minor lineages. Conversely, a single HA minor lineage could contain isolates of different NA subtypes, of different virulence, from different regions or host species, or from different years. In addition, both of the two distinct lineages covered all 7 of the aforementioned NA subtypes (Table 1).

Table 1 The neuraminidase subtype distributions of the two lineages in Fig. 1

Figure 1 demonstrated that the two distinct lineages have existed for at least decades. The high bootstrap values at nodes A, B, C and D in Fig. 1 suggested that A/chicken/Scotland/59, A/tern/SouthAfrica/61 and A/Turkey/Ramon/73 could be considered, with confidence, as intermediate strains between the two lineages [19, 21]. However, because of inadequate sequences of strains isolated before 1970, the exact divergence time remains unclear. Although there was no obvious correlation between minor HA lineages and their isolation time, which is different from human influenza virus [22], the distance between the two distinct lineages has increased in the past decades.

To estimate when the two distinct lineages began to diverge, the divergence rate of the two lineages was calculated. First, the distances between the two lineages were estimated using PHYLIP 3.62 under the Kimura 2-parameter model, with gamma-distributed substitution rates and a transition/transversion ratio of 8.0 [19]. Because of inadequate sequences before 1990, only the mean distances between the two lineages in the 1990s (0.2996 substitutions/site) and in the 2000s (0.3129 substitutions/site) were calculated, using the nucleotide sequences of the aforementioned 53 selected isolates from the 1990s (mean isolation time = 1996.09) and 76 selected isolates from the 2000s (mean isolation time = 2002.45). The divergence rate was therefore approximately (0.3129 − 0.2996)/(2002.45 − 1996.09) ≈ 0.00209 substitutions/site/year. Extrapolating this rate back to zero distance suggests that the two lineages possibly began to diverge in the 1850s. However, this is only a preliminary estimate, based on the assumption that the two lineages diverged at a constant rate under the Kimura 2-parameter model.
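The arithmetic behind this estimate can be reproduced with a minimal Python sketch. It uses only the four values reported above and assumes, as the text does, that the inter-lineage distance grew linearly with time; the function and variable names are illustrative and are not part of PHYLIP.

```python
def divergence_rate(d_early: float, t_early: float,
                    d_late: float, t_late: float) -> float:
    """Slope of inter-lineage distance over time (substitutions/site/year)."""
    return (d_late - d_early) / (t_late - t_early)


def divergence_start(distance: float, time: float, rate: float) -> float:
    """Year at which the distance extrapolates back to zero at a constant rate."""
    return time - distance / rate


# Mean inter-lineage distances and mean isolation times reported in the text.
D_1990S, T_1990S = 0.2996, 1996.09
D_2000S, T_2000S = 0.3129, 2002.45

rate = divergence_rate(D_1990S, T_1990S, D_2000S, T_2000S)
start = divergence_start(D_1990S, T_1990S, rate)

print(f"rate = {rate:.5f} substitutions/site/year")
print(f"estimated start of divergence = {start:.0f}")
```

Because the line passes through both data points, extrapolating from the 2000s point instead of the 1990s point gives the same year. The result is very sensitive to the small difference between the two mean distances, which is one reason the estimate should be regarded as preliminary.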

Among isolates from the 2000s, 17 amino acid residues were found to differ between the two lineages (Table 2), but the significance of these residues needs to be investigated experimentally. Despite these differences in amino acid sequence between the two lineages, inactivated vaccine viruses from the Western lineage could provide complete protection from the clinical signs and deaths caused by the current H5N1 HPAI viruses of the Eastern lineage and reduce the number of chickens infected and shedding virus [2]. This was possible because these different residues were not situated on the HA antigenic epitopes [23].

Table 2 The amino acid residues different between the two lineages in 2000s

Figure 1 integrated all the important H5 epidemics of the past decades. For example, it indicated that the H5N1 HPAI viruses currently rampant in the Eastern Hemisphere were phylogenetically closer to the H5N1 HPAI viruses circulating in England in 1959 than to the H5N2 HPAI viruses circulating in Mexico in the 1990s. Figure 1 also provided a framework for studies on the evolution and epidemiology of H5 influenza viruses. For example, it indicated that H5 HPAI viruses possibly originated from multiple minor lineages in each of the two distinct lineages and that their NA subtypes could be N1, N2, N3, N8 or N9.

In conclusion, this study described a whole map of the diversity of H5 influenza viruses, which provided some information about the distribution of NA subtypes, isolation places, isolation years, host species and virulence of the viruses. It also provided a framework for studies on the evolution and epidemiology of H5 influenza viruses, including investigations of the further development of the two distinct lineages. For example, will one of them disappear? Will they circulate in overlapping areas and further evolve into two separate subtypes?